A07

Distributional copula regression for space-time data

A07 develops novel models for multivariate spatio-temporal data using distributional copula regression. Of particular interest are tests for the significance of predictors and automatic variable selection using Bayesian selection priors. In the long run, the project will consider computationally efficient modeling of non-stationary dependencies using stochastic partial differential equations.

Project Leaders

Prof. Dr. Holger Dette
Faculty of Mathematics - Chair of Stochastics
Ruhr University Bochum

Prof. Dr. Nadja Klein
Department of Informatics - Scientific Computing Center
Karlsruhe Institute of Technology

Summary

Modeling dependencies in space-time data is of interest for several projects of TRR 391, and copulas are an important mathematical tool for capturing such potentially complex associations. In this project, we will develop novel models for multivariate spatio-temporal data based on copulas and distributional regression. In particular, we leverage statistical testing and Bayesian shrinkage priors to induce sparse yet flexibly varying dependence structures between multiple outcomes that are observed over space and time. Distributional regression makes it possible to describe the entire conditional distribution, including the dependence structure, as a function of space, time and potentially further covariates. To find a reasonable model, we will construct statistical tests to determine the copula specification and complement these with automatic variable selection based on Bayesian variable selection priors. The latter are particularly appealing because they allow for hierarchical model specifications and modular estimation in potentially high-dimensional spatio-temporal copula regression models. Estimation is planned to be carried out by variational inference and generalized Bayesian methods. In the long term, we will consider modeling non-stationary dependence structures, handling irregularly observed and missing space-time data, and using deep learning methods to capture high-dimensional interactions across the joint covariate, space and time domains more thoroughly.
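To make the idea of a dependence structure that varies with space and time concrete, the following minimal sketch simulates two outcomes linked by a Gaussian copula whose correlation depends on a location s and a time t. It is an illustration only, not the project's model: the choice of a Gaussian copula, the tanh link, the coefficient values in `beta`, the two marginal distributions, and all function names are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def copula_correlation(s, t, beta=(0.2, 1.0, -0.5)):
    """Hypothetical predictor eta = b0 + b1*s + b2*t, mapped into (-1, 1) by tanh."""
    eta = beta[0] + beta[1] * s + beta[2] * t
    return math.tanh(eta)


def sample_pair(s, t, rng):
    """Draw (y1, y2) at location s and time t from a Gaussian copula whose
    correlation and marginal parameters both depend on (s, t)."""
    rho = copula_correlation(s, t)
    # Correlated standard normals with correlation rho.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
    # Probability integral transform; clip to keep the inverse CDFs finite.
    eps = 1e-12
    u1 = min(max(STD_NORMAL.cdf(z1), eps), 1.0 - eps)
    u2 = min(max(STD_NORMAL.cdf(z2), eps), 1.0 - eps)
    # Assumed margins: y1 is normal with mean 1 + s and scale exp(0.3 * t),
    # y2 is exponential with rate exp(0.5 * s).
    y1 = NormalDist(mu=1.0 + s, sigma=math.exp(0.3 * t)).inv_cdf(u1)
    y2 = -math.log(1.0 - u2) / math.exp(0.5 * s)
    return y1, y2


def kendall_tau(pairs):
    """Empirical Kendall's tau; it depends only on the copula, not on the margins."""
    n = len(pairs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


rng = random.Random(0)
tau_weak = kendall_tau([sample_pair(0.0, 0.0, rng) for _ in range(400)])
tau_strong = kendall_tau([sample_pair(1.0, 0.0, rng) for _ in range(400)])
print(f"Kendall's tau at (s, t) = (0, 0): {tau_weak:.2f}")
print(f"Kendall's tau at (s, t) = (1, 0): {tau_strong:.2f}")
```

Because Kendall's tau is invariant under strictly increasing transformations of each outcome, the difference between the two printed values reflects the change in the copula parameter alone, even though the margins change with s as well. In a regression setting, coefficients such as those in `beta` would be estimated, and a selection prior or a test would decide which of them are needed.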
