Erfan Hoque - profile

Research Overview

In an increasingly data-driven world, understanding and developing robust methodologies for complex and correlated data is crucial for advancing scientific and technological innovation across many fields, including health and environmental science. My research program revolves around complex and correlated data, which present unique statistical and biostatistical challenges and opportunities. My group's main research interest is methodological development in statistical dependence modeling for multivariate data. Primarily, we are interested in heterogeneous dependence models that aim to provide unified and flexible representations of complex dependencies in univariate and multivariate correlated data (such as longitudinal and time series data). Currently, our research focuses on developing and applying complex and correlated data models and data science methods to solve real-world problems in the life, health, and environmental sciences, and on advancing state-of-the-art statistical, biostatistical, and epidemiological methods to generate reliable real-world evidence.

Our current research themes include:

Incorporating heterogeneity in the dependence structure
Modeling interdependencies among a large number of variables, and incorporating heterogeneity in their dependence structure, is a difficult task, partly because few statistical models remain flexible in higher dimensions. Here, the goal is to develop novel models and methods that account for potential heterogeneity in the dependence structure of complex longitudinal data, with applications in medical studies (tracking the evolution of disease in subjects over time, identifying risk factors) and environmental studies (evaluating the long-term effects of environmental exposures on human and animal health).
Models for random effects covariance structure
In many applications, researchers are interested in modeling how the covariance of random effects depends on covariates, which is a longstanding open problem. The objective here is to develop models and inferential procedures for the random effects covariance structure of longitudinal data with various complex features (missing responses and/or covariates, measurement errors, etc.) and of functional longitudinal data, as well as Bayesian inference for random effects models.
Statistical machine learning for complex and correlated data
Statistical machine learning provides powerful tools for predicting disease outcomes and treatment responses using the complex and correlated data that commonly arise in health and biomedical research. Our goal is to integrate statistical modeling with machine learning methods to improve prediction when data exhibit dependence, such as correlation over time or within units. By explicitly accounting for complex dependence structures arising from repeated measurements, irregular observation times, or correlated errors, we aim to better understand how model assumptions affect predictive performance and to develop robust, data-driven approaches for real-world applications.
Dynamic data science models and applications
Dynamic data science combines advanced statistical modeling, machine learning, and computational techniques to analyze complex data that evolve over time. Here, we focus on developing and applying dynamic models for smoothing, filtering, forecasting, and pattern discovery in areas such as health data, financial markets, and industrial time series, with applications ranging from disease progression and treatment response to risk management and algorithmic decision-making.