
Abstracts of TRR 391 Conference 2026

Please find below all abstracts of this year's TRR Conference, or download the Book of Abstracts (available soon).

Wednesday, June 10

Department of Quantitative Economics, Maastricht University

Many forecasting tasks involve multiple, interrelated time series that must satisfy linear aggregation constraints, where the components collectively sum to the total. Ensuring such coherence across all aggregation levels is the goal of forecast reconciliation, which is essential for consistent and aligned decision-making. In cross-temporal frameworks, the focus of this talk, these aggregation constraints extend across both cross-sectional and temporal dimensions. Existing literature primarily relies on linear reconciliation methods, which adjust base forecasts through linear transformations within a least-squares framework to satisfy aggregation constraints. In this work, we move beyond this paradigm and introduce a non-linear forecast reconciliation approach for cross-temporal frameworks. Our method directly and automatically produces cross-temporal coherent forecasts by leveraging popular machine learning techniques. We empirically validate our framework on large-scale streaming datasets from a leading European on-demand delivery platform and a bicycle-sharing system in New York City.

Faculty of Business Administration and Economics, University of Duisburg-Essen

Joint work with Simon Wood (University of Edinburgh) and Florian Ziel (University of Duisburg-Essen).

We address the computational bottleneck arising from formation and factorization of the penalized Hessian in large generalized additive models (GAMs) with high-dimensional parameter vectors. We combine the generalized Fellner–Schall smoothing parameter update with stochastic trace estimation (Hutch++) and preconditioned conjugate gradients (PCG), thereby avoiding formation or Cholesky factorization of the GAM penalized Hessian. The resulting procedure relies only on matrix–vector products, enables low memory-bandwidth parallelization and allows exploitation of model term sparsity with minimal infill. The performance of the approach is demonstrated on the NMMAPS respiratory mortality data with over one million observations and more than 20,000 coefficients, fitted in just over half an hour.
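
As a rough illustration of the matrix-free ingredient, the sketch below estimates the trace of a symmetric matrix using only matrix–vector products in the spirit of Hutch++; the function name, query split, and the generic matvec callable are illustrative choices, not the authors' implementation.

```python
import numpy as np

def hutchpp_trace(matvec, dim, num_queries=60, rng=None):
    """Estimate trace(A) using only matrix-vector products (Hutch++-style sketch).

    matvec : callable mapping a (dim, k) block of vectors to A @ block.
    """
    rng = np.random.default_rng(rng)
    k = num_queries // 3
    # Rademacher test vectors for the low-rank sketch and the residual estimate
    S = rng.choice([-1.0, 1.0], size=(dim, k))
    G = rng.choice([-1.0, 1.0], size=(dim, k))
    Q, _ = np.linalg.qr(matvec(S))            # orthonormal basis of the sketch A S
    # Exact trace on the captured subspace ...
    t_low = np.trace(Q.T @ matvec(Q))
    # ... plus a Hutchinson estimate on the deflated part (I - QQ^T) A (I - QQ^T)
    G_def = G - Q @ (Q.T @ G)
    t_res = np.trace(G_def.T @ matvec(G_def)) / k
    return t_low + t_res
```

For a dense matrix H one would pass matvec = lambda V: H @ V; in the GAM setting the same callable would instead apply the penalized Hessian implicitly, which is the point of the approach.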

Department of Mathematics, University of Hamburg

Score matching is an alternative to maximum likelihood when the normalizing constant is unknown or too costly to evaluate. However, vanilla score matching has been shown to be inefficient relative to maximum likelihood for the estimation of multimodal distributions with well-separated modes. We compare a novel diffusion-based denoising score matching estimator (DDSME) to the vanilla score matching estimator (SME) in this scenario. In particular, we prove statistical guarantees for both estimators showing that the error bound for the SME worsens when the separation between the modes increases, which can be avoided in case of the DDSME with suitable hyperparameter tuning. This provides a novel theoretical explanation for the superior behavior of diffusion-based score matching over the vanilla version.
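
For orientation, the two objectives being contrasted can be written in their standard textbook forms (generic notation, not the authors' exact setup; the diffusion-based estimator in the talk builds on the denoising idea):

```latex
% Vanilla (explicit) score matching objective:
J_{\mathrm{SM}}(\theta) = \mathbb{E}_{x \sim p}\!\left[\tfrac{1}{2}\,\lVert s_\theta(x)\rVert^2 + \operatorname{tr}\,\nabla_x s_\theta(x)\right],
% Denoising score matching with Gaussian smoothing kernel q_\sigma and \tilde{x} = x + \sigma\varepsilon:
J_{\mathrm{DSM}}(\theta) = \mathbb{E}_{x,\tilde{x}}\!\left[\tfrac{1}{2}\,\bigl\lVert s_\theta(\tilde{x}) - \nabla_{\tilde{x}} \log q_\sigma(\tilde{x}\mid x)\bigr\rVert^2\right],
\qquad \nabla_{\tilde{x}} \log q_\sigma(\tilde{x}\mid x) = \frac{x - \tilde{x}}{\sigma^2}.
```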

Faculty of Mathematics, Ruhr University Bochum

Joint work with Alexis Boulin (Ruhr University Bochum).

We propose a new and interpretable class of high-dimensional tail dependence models based on latent linear factor structures. The number of factors K is much smaller than the dimension of the observable vector, thereby inducing an explicit form of dimension reduction. The loading structure may additionally exhibit sparsity, meaning that each component is influenced by only a small number of latent factors. Under mild structural assumptions, we establish identifiability of the model parameters and provide a constructive recovery procedure based on a margin-free tail pairwise dependence matrix, which also yields practical rank-based estimation methods. The framework combines naturally with marginal tail models and is particularly well suited to high-dimensional settings. We illustrate its applicability in a spatial wind energy application, where the latent factor structure enables tractable estimation of the risk that a large proportion of turbines simultaneously fall below their cut-in wind speed thresholds.

Department of Statistics, TU Dortmund University

Structural identification in Vector Autoregression models faces a fundamental challenge when extending to multi-country settings: the number of required restrictions proliferates rapidly. Existing approaches either require an impractical number of restrictions or impose assumptions that limit applicability to large-scale problems. We propose Bayesian Structural Matrix Autoregressions (BSMAR), a novel framework that exploits a bilinear matrix structure to jointly identify country-specific shocks while dramatically reducing the required restrictions. We demonstrate the method by identifying supply and demand shocks across a large panel of countries, revealing strong connectedness and international spillover effects. The approach recovers sensible responses and provides a parsimonious yet interpretable way to capture shock transmission across countries.

Faculty of Business Administration and Economics, University of Duisburg-Essen

Distributional forecasts play an increasingly important role across diverse fields, including meteorology, economics, and finance. In this paper, we propose "online" tests to monitor the adequacy of density forecasts. In contrast to one-shot tests, this allows forecast failure to be detected quickly, which is important so that the forecast methodology can be reviewed in a timely manner. Our monitoring procedure allows the evaluation not only of univariate density forecasts, but also of multivariate ones. Over some pre-specified horizon, our tests hold size exactly even in finite samples. Hence, under the null of correctly specified distributional forecasts, our monitoring scheme only rejects with a given probability of, say, 5%. Monte Carlo simulations demonstrate that the power of our tests to detect misspecified forecasts is much higher compared to the only other extant proposal.

Faculty of Business Administration and Economics, University of Duisburg-Essen

Large-scale streaming data are increasingly common in modern forecasting applications, particularly in the energy and finance sectors, and have motivated the development of online learning algorithms. In many empirical settings, jointly modeling two or more response variables conditional on covariates is of substantial interest. To achieve maximal flexibility, we adopt a generalized linear model framework that models the marginal distributions and the copula separately, rather than assuming a multivariate distribution for the responses. We extend existing approaches by incorporating an efficient online learning algorithm with exponential forgetting, based on online coordinate descent and elastic-net regularization. We validate our approach in simulations and conduct a forecasting study focused on the joint prediction of net load in the UK as in Gioia et al. (JASA 2025). The proposed algorithms are implemented in the computationally efficient Python package ondil.
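
The flavour of the online update can be conveyed by a deliberately simplified sketch: sufficient statistics are downweighted with a forgetting factor and the coefficients are refreshed by a few coordinate descent sweeps with elastic-net shrinkage. The class and parameter names below are made up for illustration and do not reflect the ondil API.

```python
import numpy as np

def soft_threshold(z, gamma):
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

class OnlineElasticNet:
    """Toy online coordinate descent with exponential forgetting (illustrative only)."""

    def __init__(self, n_features, forget=0.99, alpha=0.1, l1_ratio=0.5):
        self.G = np.zeros((n_features, n_features))   # forgetting-weighted X'X
        self.b = np.zeros(n_features)                  # forgetting-weighted X'y
        self.beta = np.zeros(n_features)
        self.forget, self.alpha, self.l1_ratio = forget, alpha, l1_ratio

    def update(self, x, y, sweeps=5):
        # exponentially forget old information, then absorb the new observation
        self.G = self.forget * self.G + np.outer(x, x)
        self.b = self.forget * self.b + y * x
        lam1 = self.alpha * self.l1_ratio
        lam2 = self.alpha * (1.0 - self.l1_ratio)
        for _ in range(sweeps):                        # coordinate descent sweeps
            for j in range(len(self.beta)):
                r_j = self.b[j] - self.G[j] @ self.beta + self.G[j, j] * self.beta[j]
                self.beta[j] = soft_threshold(r_j, lam1) / (self.G[j, j] + lam2 + 1e-12)
        return self.beta
```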

Faculty of Business Administration and Economics, University of Duisburg-Essen

This paper studies what we call a “robust” Diebold-Mariano (DM) type test. The unique feature of our test is that, even in the absence of any knowledge of the forecasting method, it is robust to estimation noise in the forecasts, i.e., size is kept irrespective of estimation effects induced by model fitting. This feature is obtained by a test statistic that is based on rolling-window means whose length is a vanishing fraction of the total evaluation sample. This leads to non-standard Gumbel limit laws. A second desirable feature of our test is that it is easily robustified against time-varying volatility. Simulations demonstrate the benefits of our multiply robust implementation vis-à-vis several competitors. An empirical application to forecasts for several variables, horizons, vintages and methods from the Survey of Professional Forecasters illustrates the relevance of the new approach, allowing us to identify forecasters with superior models. Such conclusions are in fact impossible to infer by extant tests, since information on the models and estimation procedures behind the forecasts is typically proprietary and, hence, estimation effects cannot be factored out.
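
A stylized version of the basic building block, rolling-window means of loss differentials whose maximum is compared against extreme-value (Gumbel) critical values, might look as follows; the scaling here is naive and omits the paper's exact centering and long-run variance corrections.

```python
import numpy as np

def rolling_window_stat(loss_a, loss_b, window):
    """Max absolute standardized rolling-window mean of loss differentials.

    Stylized illustration of a rolling-window DM-type statistic; the exact
    normalization leading to the Gumbel limit law is not reproduced here.
    """
    d = np.asarray(loss_a) - np.asarray(loss_b)           # loss differentials
    kernel = np.ones(window) / window
    roll_means = np.convolve(d, kernel, mode="valid")     # rolling-window means
    sigma = d.std(ddof=1) / np.sqrt(window)               # naive scale of a window mean
    return np.max(np.abs(roll_means)) / sigma
```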

Thursday, June 11

Department of Mathematics, King's College London

In many modern applications, experiments are conducted on units connected through a network structure, such as social, biological, or technological systems. In these settings, classical assumptions are violated because outcomes may be influenced not only by a unit's own treatment, but also by treatments on connected units, leading to interference and spillover effects. In this talk, I will present recent work on experimental design in networked settings, aiming to make better use of the network structure to improve the estimation of treatment effects in large or complex systems. I will illustrate the approach through examples, highlighting improvements over standard methods and discussing some of the challenges.

Department of Statistics, TU Dortmund University

We consider the problem of optimally allocating measurement devices in complex time-dependent networks to enable precise state estimation of the network at a future time point. For that purpose, we formulate a linear random-effects model in which the network structure and time-dependence can be separated.

We then focus on the problem of optimally allocating measurement devices of different precision by reformulating the discrete optimization problem as a continuous one. Using the A-optimality criterion, we formulate an optimality criterion for the optimal state estimation of the network at a future time point, and we provide an analytical solution to the corresponding A-optimal designs. However, calculating the A-optimal design becomes computationally demanding when the network structure is large and complex. In this situation, the network has to be reduced before calculating the A-optimal design. For that purpose, we propose two methods: one uses the quotient graph, and the other cuts the network into subgraphs, which are then treated separately. We show in a simulation study that designs based on the cutting approach perform well, compared with analytically determined designs, in terms of A-efficiencies.
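
To fix ideas, a brute-force toy version of the allocation problem for a static linear model could look like the sketch below; the function names and the two-precision setup are illustrative, whereas the talk works with a continuous relaxation, a random-effects structure, and graph reductions instead.

```python
import itertools
import numpy as np

def a_criterion(F, weights):
    """A-optimality value trace(M^{-1}) for the information matrix M = F' diag(w) F."""
    M = F.T @ (weights[:, None] * F)
    return np.trace(np.linalg.inv(M))

def best_allocation(F, precisions, budget):
    """Assign `budget` high-precision devices (weight precisions[1]) to nodes,
    low-precision devices elsewhere, minimizing the A-criterion by enumeration.
    A toy stand-in for the continuous-relaxation approach discussed in the talk."""
    n = F.shape[0]
    low, high = precisions
    best, best_val = None, np.inf
    for idx in itertools.combinations(range(n), budget):
        w = np.full(n, low)
        w[list(idx)] = high
        val = a_criterion(F, w)
        if val < best_val:
            best, best_val = idx, val
    return best, best_val
```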

Faculty of Mathematics, Ruhr University Bochum

Spatially and temporally varying data can often be understood as a realisation of a spatio-temporal stochastic process in order to take the dependence structure into account. This general point of view includes different models as special cases. The common problem is to choose an optimal spatio-temporal product design, meaning that one measures at each location at all chosen time points. A natural approach is to globally minimise the mean squared prediction error (MSPE) of the best linear predictor, where the latter only depends on the mean and covariance structure of the process and the measurement error. The talk discusses two ways to achieve this, first by minimising the integrated MSPE and second by minimising the maximum MSPE, which in other words optimises the worst case. Naturally, this is more complicated when the process's mean and covariance function are unknown. In this case, a parametric model is assumed, and different approaches to account for the uncertainty introduced by the parameter estimation are discussed.

Institute for Operations Research, Karlsruhe Institute of Technology

In this talk, we will explore the concept of a more data-driven approach to location problems. We will address key questions that arise in the application of location models, algorithm design, and solution strategies. These questions are particularly important when dealing with real-world location problems that have certain temporal contexts and are subject to uncertainty and additional side constraints:

  • Which problem aspects should be represented in the model?
  • What are the key cost drivers in the model to identify the right objective function(s)?
  • Which parts of the solution are performance-critical?
  • Which decisions are driven by data structures and remain stable across model variants?
  • Which algorithms perform best for a certain location problem, and why?
  • How can the algorithm be adapted to the data?

Whenever possible, we will suggest quantitative measures to evaluate the aforementioned aspects.

Several papers have addressed some of these questions, but to our knowledge, there has been limited systematic work on the data-driven aspects of location science (DDLS).

In addition to providing a general overview and review of DDLS, we will present new results for some specific aspects: Data-Driven Interpretation of Solutions for Location Problems, Data-Driven Algorithm Design and Data-Informed Model Choice.

Faculty of Business Administration and Economics, European University Frankfurt (Oder)

Joint work with Sven Knoth (Helmut Schmidt University) and Yarema Okhrin (University of Augsburg).

In many practical situations, data are collected in the form of matrices. Essentially, anywhere more complex multivariate relationships are present, matrices can come into play to capture them all at once. For example, in environmetrics various air pollutants such as particulate matter, ozone, or nitrogen oxides are measured at different measurement stations. In engineering, completely new production methods have been introduced recently, such as 3D printing, where data are likewise measured over a certain grid. Nowadays, images are also taken to supervise production, and an image process is nothing other than a high-dimensional matrix process. In finance, we are interested in comparing various companies with respect to certain characteristics. These examples show that matrix-valued processes arise in many fields of science.

The aim of this paper is to detect changes in the behavior of a matrix-valued process over time. The data are usually spatially correlated and may also be correlated over time. In this presentation, we focus on detecting changes in the mean behavior of the underlying process. In order to detect the change point, we make use of control charts. Control charts are a basic tool for statistical process control. Originally, they were introduced to monitor a production process assuming that the underlying samples are independent. In recent years, these important tools have been extended to more complex data sets such as multivariate time series, network data, functional data, and image processes. In order to extend control charts to more complex processes, it is necessary to take into account the probability structure of the underlying target process. Here, we focus on control charts based on exponentially weighted moving averages (EWMA). Such types of control charts have been intensively studied for univariate and multivariate processes. To monitor the mean behavior of an independent multivariate sample, Lowry et al. (1992) introduced a multivariate EWMA chart. It was extended to multivariate time series by Kramer and Schmid (1997).

The most common way to deal with matrix-valued data is to use vectorization (cf. Okhrin et al. (2020, 2025), Li and Li (2026)). This means that the columns of the matrix are written below each other as a vector. Now it is possible to apply control charts to detect changes in a multivariate random process. Although this procedure is the most obvious one, it has several disadvantages that have been described in detail by Knoth et al. (2026a). First, the dimension of the random vector is soon very large. Second, the neighboring values within the matrix are no longer neighboring values within the vector. Third, the number of smoothing parameters is growing.

Assuming that the random matrices are independent over time and matrix-valued normally distributed in the in-control state, Knoth et al. (2026a) compared the vectorization approach with a matrix-valued EWMA approach. They show that the proposed matrix EWMA chart is much more flexible than the vectorized EWMA approach, since the number of smoothing parameters is much smaller and the structure of the original process is preserved. In Knoth et al. (2026b), these results are generalized to matrix-valued time series processes (e.g., Chen et al. (2021), Wu and Bi (2023)).
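
For reference, the classical multivariate EWMA statistic of Lowry et al. (1992), applied here to vectorized observations, can be sketched as below; monitoring a matrix-valued process by first vectorizing it in this way is exactly the baseline that the matrix EWMA chart is compared against.

```python
import numpy as np

def mewma_chart(X, lam=0.1):
    """Multivariate EWMA statistics T_t^2 = Z_t' Sigma_Z^{-1} Z_t (Lowry et al., 1992).

    X : (T, p) array of observation vectors, e.g. column-wise vectorized matrices.
    """
    X = np.asarray(X, dtype=float)
    T, p = X.shape
    Sigma = np.cov(X, rowvar=False)     # in practice: the known in-control covariance
    Z = np.zeros(p)
    stats = np.empty(T)
    for t in range(T):
        Z = lam * X[t] + (1.0 - lam) * Z
        # exact EWMA covariance factor: lam/(2-lam) * (1 - (1-lam)^(2(t+1)))
        c = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * (t + 1)))
        stats[t] = Z @ np.linalg.solve(c * Sigma, Z)
    return stats                        # a signal is given when stats exceed a calibrated limit
```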

Department of Statistics, TU Dortmund University

State estimation in electrical power distribution grids is a central task for monitoring and control, but poses several statistical challenges due to limited measurement availability, nonlinear system behavior, and increasing variability from distributed energy resources. Its reliability strongly depends on measurement quality, model accuracy, and the presence of disturbances or faulty data. Therefore, monitoring mechanisms are required to assess the consistency and reliability of the estimation process. This talk presents a Kalman filter-based approach for dynamic state estimation in power distribution grids and investigates monitoring indicators derived from the filtering process. In particular, indicators such as measurement residuals and the normalized innovation squared (NIS) are evaluated to detect measurement inconsistencies and abnormal conditions that may compromise the reliability of the state estimation.
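
As an illustration of one of the monitoring indicators, the normalized innovation squared for a single filter step and a simple chi-square consistency check can be written as follows (function names are placeholders, not part of a specific library):

```python
import numpy as np
from scipy.stats import chi2

def nis(innovation, S):
    """Normalized innovation squared e' S^{-1} e for one Kalman filter update,
    where S is the predicted innovation covariance."""
    return float(innovation @ np.linalg.solve(S, innovation))

def nis_consistent(innovation, S, alpha=0.05):
    """Flag whether the NIS lies inside the central chi-square acceptance region,
    as expected for a consistent filter."""
    m = len(innovation)
    lo, hi = chi2.ppf(alpha / 2, m), chi2.ppf(1 - alpha / 2, m)
    return lo <= nis(innovation, S) <= hi
```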

Faculty of Business Administration and Economics, University of Duisburg-Essen

We propose a regularized distributional recurrent neural network with a linear expert skip connection for day-ahead electricity price forecasting. The model combines a linear expert model with a recurrent neural network to capture both structured linear effects and non-linear temporal dependencies, and produces probabilistic forecasts through a distributional output layer. We apply the framework to the German-Luxembourg day-ahead electricity market using electricity prices, load, renewable generation forecasts, fuel prices, carbon allowances, and calendar effects as inputs. Two parametric specifications are evaluated, namely a Normal version and a Johnson's SU version. Empirically, both neural specifications substantially improve upon the naive model in point and probabilistic forecasting accuracy. The Normal specification achieves the best overall results. These findings show that the proposed hybrid linear-recurrent architecture is effective for probabilistic electricity price forecasting, while also indicating that the simpler Normal assumption provides a better trade-off between flexibility, stability, and out-of-sample performance than the more flexible Johnson's SU alternative.

Karlsruhe Institute of Technology

This talk uses flight operations data to illustrate how discrepancies between planned and actual trajectories create a visibility gap in logistics processes. Focusing on timing, level of detail, and contextual information in heterogeneous operational data, it highlights observable deviations that shape arrival-related decisions. These observations show how richer operational information can support efforts towards closing the visibility gap and enable more adaptive behaviour in complex logistics networks.

Department of Statistics, TU Dortmund University

Reliably identifying sudden changes in the structure of data is crucial in many applications. Our focus is on detecting changes in spatial data. Schmidt (2024) introduced a method to detect any number of structural breaks in the mean of a time series by partitioning the data into blocks and calculating Gini’s mean difference of the block means. In our work, we extend this approach to spatial data. We do not limit ourselves to Gini's mean difference, but also consider other choices of variability measures that are applied to the block means. We examine the asymptotic behavior of such test statistics under the hypothesis of a constant mean and prove the asymptotic consistency of the resulting test for a large set of non-constant mean functions. Simulation studies demonstrate that these methods perform well for both independent data and spatial autoregressive and moving-average random fields. Applications to Amazonian rainforest satellite images and estimates of Mittag-Leffler distribution parameters over the recurrence of vorticity extremes illustrate their relevance to real-world data. In ongoing work, we extend this approach to online monitoring of spatio-temporal data.
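
A minimal sketch of the block-mean construction for a two-dimensional field, using Gini's mean difference as the variability measure applied to the block means, is shown below (the names and the simple non-overlapping blocking are illustrative):

```python
import numpy as np

def gini_mean_difference(x):
    """Gini's mean difference: average absolute difference over all pairs."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.abs(x[:, None] - x[None, :]).sum() / (n * (n - 1))

def block_mean_statistic(field, block_shape):
    """Apply a variability measure to the means of non-overlapping blocks
    of a 2-D spatial field (illustrative version of the test statistic)."""
    h, w = block_shape
    H, W = field.shape
    means = [field[i:i + h, j:j + w].mean()
             for i in range(0, H - h + 1, h)
             for j in range(0, W - w + 1, w)]
    return gini_mean_difference(np.array(means))
```

Under a constant mean the block means fluctuate only through noise, so a large value of the statistic points towards structural breaks in the mean surface.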

Department of Statistics, TU Dortmund University

Joint work with Matei Demetrescu (TU Dortmund University) and Robert Taylor (University of Essex).

Stock return predictability, should it exist, is likely to be episodic in nature. In order to exploit such pockets of predictability it is essential that they are rapidly detected, in real-time, as the nascent predictive regime emerges. This will typically entail the repeated (sequential) application of one-shot end-of-sample predictability statistics, updated as new data become available. Consequently, in addition to dealing with the usual inference problems caused by unknown regressor persistence and predictive regression endogeneity, one must also account for the multiple testing issues inherent in such monitoring procedures. In addition, stock returns and/or the predictors commonly used typically exhibit time-varying volatility, and it is known that ignoring such data features can result in the spurious detection of predictability. We propose real-time monitoring procedures which take account of these issues. Our preferred monitoring strategy uses a CUSUM-type procedure based on a specific moment condition related to the predictive power of the putative predictor. We implement nonparametric adjustment methods to allow for the possibility of time-varying volatility which do not require the practitioner to assume any particular parametric model for volatility. Monte Carlo simulations confirm that our proposed detection procedures display a well-controlled false positive rate across a range of feasible volatility paths, coupled with good power to rapidly detect an emerging predictive episode. The empirical relevance of our proposed monitoring strategy is illustrated in a pseudo real-time monitoring exercise using a well-known dataset of S&P 500 returns.
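
A heavily stylized monitoring loop conveying the idea, cumulating products of returns and the lagged demeaned predictor after the training sample and signalling once a standardized CUSUM crosses a boundary, is sketched below; it ignores the paper's nonparametric volatility adjustments and boundary-crossing theory, and the function and argument names are made up for illustration.

```python
import numpy as np

def cusum_detector(returns, predictor, train_end, threshold):
    """Stylized CUSUM monitor of the moment condition cov(x_{t-1}, y_t) = 0."""
    y = np.asarray(returns, dtype=float)
    x = np.asarray(predictor, dtype=float)
    m = x[:train_end].mean()                                  # demean on the training sample
    scale = np.std(y[:train_end] * (x[:train_end] - m), ddof=1)
    s = 0.0
    for t in range(train_end + 1, len(y)):
        s += y[t] * (x[t - 1] - m)                            # cumulate the moment condition
        if abs(s) / (scale * np.sqrt(t - train_end)) > threshold:
            return t                                          # first detection time
    return None                                               # no predictive episode detected
```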

Faculty of Management and Economics, Ruhr University Bochum

Joint work with Miriam Isabel Seifert (Ruhr University Bochum), Sven Soukup (Ruhr University Bochum), and Jan Vogler (Ruhr University Bochum).

Availability of realized covariances enables estimation of multivariate realized volatility models, which are modern extensions of multivariate GARCH models. Once a realized model is estimated, it is of interest to monitor the validity of its covariance matrix forecasts period by period, i.e., online. The tools for such online monitoring (known as control charts) are elaborated in the current project. A signal from a control chart indicates that the model under consideration fails to provide valid forecasts, possibly due to structural changes. The detection performance of the control charts is illustrated both in a Monte Carlo study and in an empirical application.

Institute of Energy Systems, Energy Efficiency and Energy Economics, TU Dortmund University

Joint work with Marvin Napps (TU Dortmund University), Christian Rehtanz (TU Dortmund University), and Christine Müller (TU Dortmund University).

Measurement infrastructure in power system distribution grids is often poorly developed. Additionally, pseudo measurements are necessary to guarantee solvability of the estimation problem. These pseudo measurements are much noisier than the real-time measurements, which makes state estimation more difficult. In our work, we introduce an adapted Iterated Extended Kalman Filter (IEKF) in combination with sigma points. The parameters of the IEKF are estimated with the Expectation-Maximization (EM) algorithm, which performs well under low measurement noise conditions. We consider a novel approximate maximum likelihood estimation of parameters using an EM algorithm inherent in a sigma-point-supported IEKF framework. In a simulation study, the novel versions of the IEKF are applied to voltage estimation and compared using different quality measures.
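
One standard ingredient, the generation of unscented sigma points and weights from a state mean and covariance, is sketched below; this is the textbook construction, not the specific sigma-point IEKF variant developed in the talk.

```python
import numpy as np

def sigma_points(mean, cov, alpha=1e-3, beta=2.0, kappa=0.0):
    """Standard unscented sigma points and weights for a mean/covariance pair."""
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * cov)            # matrix square root of (n + lam) * P
    pts = np.vstack([mean, mean + L.T, mean - L.T])    # 2n + 1 points
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))     # mean weights
    wc = wm.copy()                                     # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha ** 2 + beta)
    return pts, wm, wc
```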

Department of Statistics, TU Dortmund University

Joint work with Holger Dette (Ruhr University Bochum), Sonja Kuhnt (University of Applied Sciences and Arts Dortmund), Larissa Sander (University of Applied Sciences and Arts Dortmund), and Kirsten Schorning (TU Dortmund University).

Computer experiments play a key role in studying the input-output relationships of logistics systems, especially when real-world experiments are too expensive, time-consuming, or impossible. To maximize information gain from these simulations while maintaining low computational cost, we derive optimal experimental designs specifying the input settings and the number of replicates for each input configuration. Specifically, we study a computer simulation of a production system involving the transport, processing, and quality control of goods. The system can be configured through continuous, discrete as well as categorical input variables such as batch size or right-of-way strategy. We model the outputs of the logistic system using Generalized Additive Models for Location, Scale and Shape (GAMLSS), combining parametric regression and smooth functions. We aim to select the system’s inputs such that the parameters of the GAMLSS model can be estimated as precisely as possible. This results in an optimal design problem that has not been thoroughly studied for this type of setting. To address this problem, we derive information matrices for a GAMLSS model tailored to our use case and present optimal design criteria based on these matrices.

Faculty of Business Administration and Economics, University of Duisburg-Essen

Joint work with Florian Ziel (University of Duisburg-Essen).

As European bidding zones are highly interconnected by physical transmission lines, spatial influences propagate across neighboring nodes through a network. This is duly reflected in the day-ahead electricity prices across European bidding zones, as the auction algorithm also uses information beyond each bidding zone’s geographic boundary. To capture how this interconnection affects neighboring bidding zones’ electricity prices, we use a metric graph to map the spatial coverage of information using a well-defined neighborhood measure. We propose the Networked Spatio-Temporal Model (NSTM), which maps irregular spatial nodes into an ordered network, enabling the systematic incorporation of neighborhood information. We implement the NSTM across 39 bidding zones covering the majority of European electricity markets in a high-resolution, streaming-forecasting setup. The model uses autoregressive, cross-hour, and seasonal effects, along with fuel and emission prices and day-ahead forecasts of fundamentals, as interconnected information to predict the day-ahead prices for each bidding zone. A Europe-wide study presented in this paper shows that the NSTM consistently outperforms traditional island-based pure local models. This paper provides a framework that demonstrates the critical role the networked structure plays in propagating information across interconnected markets and its implications for day-ahead electricity price forecasting.

Faculty of Economics and Business, University of Groningen

Joint work with Anna Tort-Carrera (Netherlands Interdisciplinary Demographic Institute The Hague).

Empirical models with social or spatial interactions often yield parameter estimates that satisfy formal identification conditions but are difficult to interpret in terms of socially or economically meaningful patterns of interdependence. This paper develops a simple and general benchmark for assessing the plausibility of parameter estimates in linear-in-means models of peer and contextual effects, as well as in spatial econometric models with endogenous and exogenous interaction effects. This benchmark is based on Tobler's First Law of Geography and the ratio of indirect to direct effects and used to characterize the socially or economically admissible parameter space. A key advantage of the approach is that it is applicable to any interaction matrix used in practice and can be implemented without explicitly computing direct and indirect effects. We show that commonly used models may produce parameter estimates that fall outside this admissible region, even when identification conditions such as those in Bramoullé et al. (2009) are satisfied, thereby signaling potential model misspecification. Using their empirical setting, we illustrate how the proposed benchmark provides a transparent diagnostic tool that complements existing identification analyses.

Department of Mathematics, University of Hamburg

Joint work with Francesco Iafrate (University of Hamburg), Johannes Lederer (University of Hamburg), and Charlotte Dion-Blanc (Sorbonne Université).

Event sequence data arise in many domains, including financial markets, social media, and healthcare. In such settings, event occurrences are influenced not only by past events and their timing, but also by external covariates. These processes are commonly modeled using multivariate Hawkes processes, for which several machine learning approaches have been proposed. However, existing models for temporal point processes do not adequately capture the joint influence of covariates and past events.

To address this limitation, we propose a novel approach for learning the intensity function of the underlying point process. Our model leverages a transformer-based architecture with a new attention mechanism that explicitly captures the time-invariant structure of multivariate Hawkes processes, while modeling covariate effects separately. In addition to accurately representing event dynamics, our framework is designed to ensure interpretability by capturing how different event types influence one another, as well as how covariates affect event occurrences.
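
For context, the conditional intensity of a plain multivariate Hawkes process without covariate effects takes the familiar form

```latex
\lambda_k(t) = \mu_k + \sum_{m=1}^{d} \sum_{t_i^{m} < t} \phi_{km}\bigl(t - t_i^{m}\bigr), \qquad k = 1, \dots, d,
```

where the baseline rates and excitation kernels determine how past events of type m raise the intensity of type k; the proposed transformer architecture learns these cross-type interactions while handling covariate effects through a separate pathway.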

Department of Statistics, TU Dortmund University

Joint work with Andreas Groll (TU Dortmund University), Anne Dallmeyer (Max Planck Institute for Meteorology), and Nils Weitzel (TU Dortmund University).

Future projections of the Earth system rely on complex numerical models, which are validated by assessing their ability to reproduce past climate changes. Here, we develop a statistical model to emulate the climate – carbon dioxide – vegetation relationship in an Earth system model (ESM), with the goal of identifying drivers of environmental change and quantifying uncertainties. To facilitate interpretability of the emulated relationships, we employ semi-parametric distributional regression, specifically Generalized Additive Models for Location, Scale, and Shape (GAMLSS2), trained with spatio-temporal fields from an ESM. We focus on the ability to select predictors from a set of highly-correlated variables, for example through random forest pre-selection and penalization. The ESM implementation facilitates conditional independence between training data such that residual spatio-temporal structure can be used as a criterion for missing covariates. Preliminary results show a high accuracy for spatio-temporal vegetation patterns and an ecologically meaningful hierarchy of predictor importance.

Department of Statistics, TU Dortmund University

Our aim is to model and deliver probabilistic forecasts of electricity consumption at the household level. To this end, we explore hidden Markov models and periodic autoregressive models, and, in particular, seek to model the influence of various covariates on the model parameters. Among other things, the first approach leads to an extension of the classical Markov model, as previous observations can influence the emission parameters as covariates (autoregressive hidden Markov model). Concerning the second approach, such models typically assume normally distributed innovations, but empirical residuals often exhibit skewness and heavy tails. We therefore consider alternative innovation distributions in the model estimation, such as the skew-normal and skew-t distributions. This approach enables the construction of more realistic prediction intervals while maintaining the simple model structure of periodic autoregressive models.
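
As a small illustration of the second ingredient, the sketch below fits a skew-normal distribution to (hypothetical) periodic-AR residuals with scipy and derives an innovation prediction interval; a skew-t fit would proceed analogously with a suitable implementation, and the function name is a placeholder.

```python
import numpy as np
from scipy import stats

def fit_residual_distribution(residuals, level=0.95):
    """Fit a skew-normal distribution to model residuals and return its parameters
    together with a central prediction interval for a future innovation."""
    a, loc, scale = stats.skewnorm.fit(residuals)
    tail = (1.0 - level) / 2.0
    lower, upper = stats.skewnorm.ppf([tail, 1.0 - tail], a, loc=loc, scale=scale)
    return (a, loc, scale), (lower, upper)

# Example with synthetic right-skewed residuals
rng = np.random.default_rng(1)
res = stats.skewnorm.rvs(4.0, size=2000, random_state=rng)
params, interval = fit_residual_distribution(res)
```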

Faculty of Computer Science, University of Applied Sciences and Arts Dortmund

Joint work with Sonja Kuhnt (University of Applied Sciences and Arts Dortmund).

Transport logistics facilities such as less-than-truckload terminals require fast and robust tactical decisions under uncertainty, for example regarding task scheduling and resource allocation. Since simulation experiments for such systems are computationally expensive, statistical surrogate models and simulation-based optimization methods are an important basis for decision support. At the same time, global sensitivity analysis can provide valuable information on which decision variables are most influential for relevant logistic key performance indicators.

This poster presents ongoing methodological work on combining global sensitivity analysis and Bayesian optimization more closely for simulation-based decision support in transport logistics. The central idea is to use sensitivity information not only for interpretation, but to actively guide the optimization process. The proposed framework includes sensitivity-based variable screening, sensitivity-guided candidate generation, soft search space reduction, and sensitivity-weighted acquisition strategies.

The poster outlines the methodological motivation of this framework and highlights key design questions concerning the integration of sensitivity information into sequential surrogate-assisted optimization. The overall goal is to link uncertainty quantification, sensitivity analysis, surrogate modeling, and global optimization more effectively for tactical decision support in transport logistics systems.

Department of Statistics, TU Dortmund University

Joint work with Konstantinos Fokianos (University of Cyprus) and Roland Fried (TU Dortmund University).

We present an R package that implements a flexible modeling framework for spatio-temporal data that efficiently captures spatial and temporal dependencies. Our approach is based on double generalized linear models and allows for the integration of autoregressive dependence structures, the inclusion of covariates, as well as different distributions to account for specific data properties. The R package allows the estimation of popular spatio-temporal INGARCH and log-linear models for count data, as well as other models found in the literature. The use of double generalized linear models as a basis allows for spatio-temporally varying dispersion parameters as an extension of the aforementioned models. In addition to methods for parameter estimation, the framework also includes functions for statistical inference. A particular focus is given to a user-friendly implementation.
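
As one canonical example from this model class, a linear spatio-temporal INGARCH(1,1)-type specification for counts at location i could read as follows (illustrative notation only, not the package's exact parameterization):

```latex
Y_{i,t} \mid \mathcal{F}_{t-1} \sim \operatorname{Poisson}(\lambda_{i,t}), \qquad
\lambda_{i,t} = \beta_0 + \beta_1\, Y_{i,t-1} + \beta_2\, \frac{1}{|N(i)|} \sum_{j \in N(i)} Y_{j,t-1} + \beta_3\, \lambda_{i,t-1},
```

where N(i) denotes the spatial neighbourhood of location i.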

Department of Statistics, TU Dortmund University

Joint work with Gabin Agbalé (TU Dortmund University) and Stefan Harmeling (TU Dortmund University).

Instrumental Variable (IV) regression estimates causal effects in the presence of unobserved confounders by exploiting external variables, so-called instruments, that influence the treatment but not the outcome directly. We study IV regression when treatments are high‑dimensional yet governed by lower‑dimensional latent features through which they affect outcomes. Our main assumption constrains the conditional distribution of these features given the instrument to follow a conditional exponential factorial form. Building on Independently Modulated Component Analysis (IMCA), we introduce InfoIV, a contrastive learning framework that recovers latent representations up to an indeterminacy consistent with the IV assumptions. We prove that these representations are compatible with standard IV estimators such as two‑stage least squares and control functions, and further enable extrapolation across unseen values of the instruments. Empirical results on tabular and image data confirm InfoIV’s effectiveness for causal effect estimation and robustness under distribution shifts.

Faculty of Business Administration and Economics, University of Duisburg-Essen

Joint work with Eva-Maria Oeß (University of Cologne).

We introduce a super learner for estimating heterogeneous net treatment effects under unit-varying outcome and cost effects. Our approach is designed for optimal assignment of a binary treatment that induces a cost–benefit trade-off. The net effect and its underlying outcome and cost components are characterized by unknown functional complexity, which our ensemble explores in a data-driven manner.

Directly targeting the net effect performs well when the estimand is simpler than the outcome and cost effects individually. In contrast, separately learning both effects and subsequently aggregating them into the net effect is advantageous when the components are structurally dissimilar or when their aggregation yields a target estimand that is more complex than the components themselves. A hybrid learning strategy succeeds in intermediate settings. Our ensemble nests all approaches and selects the winner by minimizing empirical risk.

In a simulation study, we consider multiple scenarios in which each individual approach dominates the others and show that the ensemble further improves precision across most settings. We additionally present an empirical application using data from a large nonprofit organization, where we analyze the net effect of a fundraising campaign aimed at increasing pledge payments while mitigating donor attrition.

Delft Institute of Applied Mathematics, TU Delft

Joint work with Archi Roy (Indian Institute of Management Kozhikode).

Considering a nonparametric regression model Y_t = m_t(X_t) + ε_t, we want to test for temporal break points in the regression relationship. This formulation is suitable for detecting faults in wind turbines via their power curve, or for monitoring the primary frequency control in a power grid. For the classical change-in-mean problem, multiscale procedures have been shown to yield minimax optimal detection rates for both short-lived and small changepoints. For the nonparametric problem, we use the marked empirical process to construct a test statistic with multiscale weights in both time t and space X_t, i.e., along the covariate axis. We derive the limit distribution of the test statistic and establish its minimax optimality.
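
One standard form of the marked empirical process underlying such statistics is (generic notation, before the multiscale weighting):

```latex
R_n(s, x) = \frac{1}{\sqrt{n}} \sum_{t=1}^{\lfloor ns \rfloor} \bigl(Y_t - \hat m(X_t)\bigr)\, \mathbf{1}\{X_t \le x\}, \qquad s \in [0,1],
```

so that breaks in the regression function show up as excessive fluctuation of the process along the time direction s and/or the covariate direction x.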

Institute of Transport Logistics, TU Dortmund University

Joint work with Lara Kuhlmann de Canaviri (University of Applied Sciences and Arts Dortmund), Sonja Kuhnt (University of Applied Sciences and Arts Dortmund), and Uwe Clausen (TU Dortmund University).

Due to the high complexity of internal processes and the numerous stochastic influences (e.g., truck delays, fluctuations in shipment volume) affecting logistical terminals, simulation has proven to be an effective problem-solving method for decision support at these terminals (Chmielewski, 2007). However, performing simulation studies comes with various requirements, such as ensuring a sufficient level of model accuracy, making a clearly defined process model crucial for a targeted modeling process (Rabe et al., 2008). For conducting these studies in the field of production and logistics, the process model defined in VDI 3633 (2014) has proven to be the most common approach. It provides a structured basis for planning and implementing simulation studies, taking into account five important phases (task analysis, model formulation, model implementation, model verification, and application). However, there is still no standardized approach for the development of experimental plans and the determination of suitable parameter combinations. To overcome this, the objective of this research is to extend the current process model using a Latin hypercube design to support the development of an experimental design. This aims to limit the number of simulation runs to a manageable size while covering the entire experimental space to ensure comprehensive results.

To demonstrate the procedure, it is applied to the development of a simulation model for a less-than-truckload terminal. In this context, a conceptual model is first developed, including a total of eight input parameters for the simulation model and their corresponding possible value ranges. These ranges outline the experimental space and thus serve as the basis for developing the experimental plan. After further modeling steps are completed, resulting in an executable simulation model, the Latin hypercube design is applied to develop the experimental plan. Taking into account the possible value ranges of the input parameters, this yields an experimental plan of 64 parameter combinations that cover the entire experimental space. This enables further investigations using the simulation model, such as optimizing the selection of input parameters with regard to use-case-specific key performance indicators.
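
A minimal sketch of how such a plan can be generated with a Latin hypercube in Python is shown below; the eight parameter ranges are placeholders, and only the dimensions (eight inputs, 64 runs) follow the study.

```python
import numpy as np
from scipy.stats import qmc

# Illustrative lower/upper bounds for eight simulation input parameters
# (placeholders, not the actual terminal parameters from the study)
lower = np.array([1, 1, 0.0, 10, 1, 0.0, 5, 1])
upper = np.array([10, 8, 1.0, 60, 4, 0.5, 30, 6])

sampler = qmc.LatinHypercube(d=8, seed=42)
design_unit = sampler.random(n=64)             # 64 runs on the unit hypercube
design = qmc.scale(design_unit, lower, upper)  # rescale to the parameter value ranges
```

Each of the 64 rows of `design` is one parameter combination for a simulation run, spreading the runs evenly over the experimental space.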

Department of Statistics, TU Dortmund University

The climate system is a high-dimensional, chaotic, and interacting system with variations on timescales from minutes to millions of years. Assessments of renewable energy potential and energy demand for the next century require accurate representation of these variations in climate simulators. Here, we compare the amplitude of timescale-dependent temperature variations in observations and reconstructions from natural climate archives with climate simulations in the presence of multiple uncertainty sources. For global mean temperature, we find consistency between simulators and reconstructions across all tested timescales. In contrast, the regional variability is significantly underestimated in climate simulators on multidecadal and longer timescales. This implies that the spatial covariance structure of climate simulators is misspecified on these timescales. To better understand the mechanisms responsible for the model-data discrepancies, we suggest a Bayesian hierarchical framework for reconstructing spatio-temporal fields from sparse and noisy temperature reconstructions. The framework combines physically motivated structures for modeling spatio-temporal fields and observational operators that link climate changes to quantities measured in natural climate archives. Future work aims at using statistical postprocessing methods to integrate the insights from these reconstructions into refined projections of regional temperatures for the next centuries. These postprocessed projections could in turn serve as input to energy system models.

Department of Mathematics and Statistics, Helmut Schmidt University Hamburg

Joint work with Malte Jahn (Helmut Schmidt University) and Hee-Young Kim (Korea University Sejong).

Existing integer-valued generalised autoregressive conditional heteroskedasticity (INGARCH) models for spatio-temporal counts do not allow for negative parameter and autocorrelation values. Using approximately linear INGARCH models, we propose a unified and flexible spatio-temporal INGARCH framework for modelling unbounded or bounded counts. These models combine negative dependencies with a form of long memory. Driven by real-world data applications, we show that our spatio-temporal INGARCH models are easily adapted to missing-data problems, to special marginal features such as zero-inflation, or to pronounced cross-dependencies. In the latter case, we propose to use a copula related to the spatial error model together with an approximate approach for likelihood computation. Finally, we adapt the developed approaches to the task of modelling spatio-temporal ordinal data.

Friday, June 12

Department of Statistics, Texas A&M University

Graphical models are ubiquitous for summarizing conditional relations in multivariate data. In many applications involving multivariate time series, it is of interest to learn an interaction graph that treats each individual time series as nodes of the graph, with the presence of an edge between two nodes signifying conditional dependence given the others. Typically, the partial covariance is used as a measure of conditional dependence. However, in many applications, the outcomes may not be Gaussian and/or could be a mixture of different outcomes.

In this talk, we propose a broad class of time series models for multivariate mixed-type time series that includes the classical VAR model as a special case. For each node in the time series, we model its conditional distribution with a distribution from the exponential family. We call this construction a conditionally exponential stationary graphical model (CEStGM). This way of modelling has several advantages. The first is that, due to the versatility of the exponential class, it allows one to stitch together several different variable "types". The second is that the univariate conditional specification allows for simple estimation procedures using standard GLM tools. Finally, the conditional specification is specifically designed to succinctly encode process-wide conditional independence in its parameters.

We derive conditions that ensure the model leads to a well-defined, strictly stationary time series and show that the model is geometrically beta-mixing. We propose an approximate Gibbs sampler for simulating sample paths from CEStGM. Finally, we conclude the talk with some numerical experiments and real data examples.

Department of Statistics, TU Dortmund University

We consider a structural matrix-autoregressive (SMAR) model to conduct impulse response analysis for structural shocks to matrix-valued time series. The MAR model of order p offers a parsimonious and interpretable framework for these time series, thus addressing issues of high dimensionality in corresponding vector-autoregressive (VAR) models. To interpret the dynamics, we resort to impulse response analysis as a popular tool from the SVAR context. Its conclusions rely on the valid identification of structural shocks that are mutually contemporaneously uncorrelated and interpretable. In contrast to the existing literature, the proposed SMAR model enables the identification of multiple structural shocks. To address the restrictive nature of the single-term MAR(p) model, we discuss the extension to a multi-term SMAR(p) model as a compromise between the single-term SMAR and the (unrestricted) SVAR model, trading off parsimony against flexibility. We discuss its identification, focusing in particular on issues that arise due to the typical Kronecker-product structure of the coefficient matrices in the MAR framework. Further, we discuss estimation and inference in the general multi-term SMAR(p) model. Specifically, we consider a suitable residual-based bootstrap method to compute confidence intervals for the impulse response curves and provide corresponding results. In this context, a key point concerns model misspecification and the use of MAR models to approximate more general SVAR data generating processes. Finally, we demonstrate the performance and practical use of our approach by Monte Carlo simulations and a real data application.
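
For reference, the single-term MAR(p) recursion for a matrix-valued series and its vectorized Kronecker form read as follows (standard notation, structural rotation omitted); the multi-term SMAR discussed in the talk allows several such Kronecker terms per lag.

```latex
X_t = \sum_{i=1}^{p} A_i\, X_{t-i}\, B_i^{\top} + E_t,
\qquad
\operatorname{vec}(X_t) = \sum_{i=1}^{p} \bigl(B_i \otimes A_i\bigr)\operatorname{vec}(X_{t-i}) + \operatorname{vec}(E_t).
```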

Faculty of Mathematics, Ruhr University Bochum

A crucial assumption to reduce computational complexity in spatio-temporal data analysis is separability, which factors the covariance structure into a purely spatial and a purely temporal component. In this paper, we develop statistical inference tools for validating this assumption for a second-order stationary process under both domain-expanding-infill asymptotics and domain-expanding asymptotics. In contrast to previous work on this subject, the methodology neither requires the assumption of normally distributed data, nor uses spectral methods. Our approach is based on nonparametric estimates of measures for the deviation between the covariance matrix and separable approximations, which vanish if and only if the assumption of separability is satisfied. We derive the asymptotic distributions of appropriate estimators for these measures with non-standard limiting distributions and use these results to develop inference tools for validating the assumption of separability. More specifically, we derive confidence intervals for the deviation measures, tests for the hypothesis of exact separability, and tests for the hypothesis that the deviation from separability is smaller than a prespecified threshold.
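
For reference, a space-time covariance kernel is called separable if it factorizes as

```latex
\operatorname{Cov}\bigl(X(s,t),\, X(s',t')\bigr) = C_S(s, s')\, C_T(t, t') \quad \text{for all locations } s, s' \text{ and times } t, t',
```

and the deviation measures studied in the talk quantify the distance of the covariance from its closest separable approximation, vanishing exactly under this factorization.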

Department of Statistics, TU Dortmund University

The accurate estimation of covariances from spatial data is crucial in many applications. While parametric approaches have been widely used for modeling spatial covariances, the parametric assumption may not be correctly specified, which may lead to wrong conclusions. In contrast, non-parametric approaches are often neglected in practice, because their finite sample estimation accuracy suffers from the large number of covariance parameters to be estimated and they are often prone to severe bias issues. In this paper, we develop a novel framework for bias correction of non-parametric sample covariance estimators of stationary lattice processes. We derive the exact finite sample biases of different versions of sample covariance estimators for spatial data. Our results show that the estimators' expectations can be expressed as linear combinations of the population covariances that only depend on the spatial lag of the sample covariance and on the sample size of the observed lattice data. Exploiting this characterization enables the construction of (jointly) bias-corrected covariance estimators that are nearly unbiased, depending on the strength of spatial dependence in the lattice process. Additionally, we derive exact formulas for the (joint) mean-squared error (MSE) and provide asymptotic normality results. In particular, we show that the bias-corrected estimators are asymptotically equivalent to their uncorrected versions. As an application, we use the bias-corrected covariance estimates for bootstrap inference and propose a novel spatial bootstrap procedure. Simulation studies for (Gaussian) lattice processes demonstrate that our proposed bias correction can achieve substantial bias reduction, while often also reducing the MSE, particularly when the underlying process exhibits strong spatial persistence. In these cases in particular, the proposed bootstrap procedure benefits from the bias correction and regularization of the covariance estimation in finite samples.
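
For concreteness, a plain (uncorrected) non-parametric covariance estimator at a given spatial lag of a lattice field, the kind of estimator whose finite-sample bias is characterized and removed in the paper, can be sketched as:

```python
import numpy as np

def lattice_sample_covariance(field, lag):
    """Uncorrected sample covariance of a 2-D lattice field at non-negative
    spatial lag (h1, h2), centring with the overall sample mean."""
    h1, h2 = lag
    x = np.asarray(field, dtype=float)
    xbar = x.mean()
    a = x[: x.shape[0] - h1, : x.shape[1] - h2] - xbar   # values at sites s
    b = x[h1:, h2:] - xbar                               # values at sites s + (h1, h2)
    return (a * b).mean()
```

Using the overall mean for centring is one of the sources of the finite-sample bias that the proposed correction accounts for.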

Department of Statistics, Iowa State University

Long-range dependent time series are often conceptualized by an unknown transformation of an underlying long-memory Gaussian process. The so-called Hermite rank of this transformation is a process parameter that critically impacts statistical inference, as sampling distributions change with the unknown rank.  A compounding issue is that ranks can further vary between statistics computed from the same time series. Over the past 50 years, no approach has existed to generally approximate Hermite ranks from data.  This talk describes a method for approximating both the Hermite rank as well as dependence parameter of the underlying Gaussian process, without knowledge of the underlying transformation that defines the observed long-memory time series.  As a result, the estimation approach can then be coupled with a bootstrap method for approximating the sampling distribution of statistics in practice. The inference method is illustrated through numerical studies and examples.

Department of Statistics, London School of Economics and Political Science

We introduce the coverage correlation coefficient, a novel nonparametric measure of statistical association designed to quantify the extent to which two random variables have a joint distribution concentrated on a singular subset with respect to the product of the marginals. Our correlation statistic consistently estimates an f-divergence between the joint distribution and the product of the marginals, which is 0 if and only if the variables are independent and 1 if and only if the copula is singular. Using Monge–Kantorovich ranks, the coverage correlation naturally extends to measure association between random vectors. It is distribution-free, admits an analytically tractable asymptotic null distribution, and can be computed efficiently, making it well-suited for detecting complex, potentially nonlinear associations in large-scale pairwise testing.

Department of Informatics, Karlsruhe Institute of Technology

We propose a new probabilistic regression model for a multivariate response vector based on a copula process over the covariate space. It uses the implicit copula of a Gaussian multivariate regression, which we call a “regression copula process”. To allow for large covariate vectors, the regression coefficients are regularized using a novel multivariate extension of the horseshoe prior. Bayesian inference and distributional predictions are evaluated using efficient variational inference methods, allowing application to large datasets. An advantage of the approach is that the marginal distributions of the response vector can be estimated separately and accurately, resulting in predictive distributions that are marginally calibrated. Two applications of the methodology illustrate its efficacy. The first is the econometric modeling and prediction of half-hourly regional Australian electricity prices. In a second application, we extend our approach to deep time series models using recurrent neural networks, and show that it provides accurate short-term probabilistic price forecasts, with the copula model dominating existing benchmarks. Moreover, the model provides a flexible framework for incorporating probabilistic forecasts of electricity demand, which increases upper tail forecast accuracy from the copula model significantly.

Faculty of Mathematics, Ruhr University Bochum

Within the rich variety of statistical regression models (ranging from classical conditional mean regression to fully distributional regression models), we are interested in using copulas as driving forces behind model specifications and validating those by means of goodness-of-fit tests. These tests are desirable since such parametric or semiparametric specifications are typically imposed to overcome the curse of dimensionality, but wrong model choices often deteriorate the performance of statistical inference procedures, depending on their deviation from the true regression. Since model assumptions in applications are never entirely correct, we focus on new measures of deviation and relevant hypothesis testing. We present two goodness-of-fit testing approaches, one based on a weighted L2-distance, and the other based on marked empirical processes. These inference tools are illustrated via simulated and empirical data.

Department of Statistical Sciences, University of Toronto

Inference of the precision matrix is fundamental to understanding conditional dependence structures in multivariate time series. Classical methods typically rely on stationarity, an assumption frequently violated in practice when dependence patterns evolve over time. In this paper, we investigate the estimation and inference of time-varying precision matrices for high-dimensional, non-stationary time series. We propose a flexible framework in which the precision matrix is modeled as a smooth function of time, allowing conditional dependencies to evolve dynamically. Under a mild cross-sectional weak dependence condition, which induces sparsity in the precision matrices, we develop an estimation procedure that combines sieve approximations with group lasso regularization to ensure stability in high dimensions. We establish theoretical guarantees, including consistency and convergence rates, under mild assumptions on temporal smoothness and dependence. Furthermore, we introduce a refined estimator that enables simultaneous inference on both the time-varying precision matrices and the associated partial correlation functions. Simulation studies demonstrate that the proposed method accurately recovers evolving network structures. An application to brain imaging data illustrates its practical effectiveness in uncovering time-dependent conditional dependencies in complex systems.