Novel statistical methods for the optimal use of real-world patient data to improve clinical decision-making

Evidence-based medicine requires investigators to incorporate the best available evidence into their clinical decision-making process. The best evidence regarding the causal effect of a treatment comes from properly conducted randomized trials. However, randomized trials can be costly, infeasible, or unethical.

The goal of this project is to develop novel biostatistical methods that optimally leverage trials’ results together with data from electronic medical registries, which are more representative of real-world clinical practice.

Optimally estimating treatment effects from observational data

Controlling extreme weights

Traditional biostatistical methods for estimating the causal effect of a treatment from electronic medical registry data, such as inverse probability weighting, may lead to erroneous results under practical positivity violations (lack of overlap). I therefore developed a novel biostatistical technique, based on a constrained nonlinear optimization problem, that provides optimal weights for inference with constrained precision. I found that the proposed method outperformed traditional approaches, such as truncated inverse probability weights, with respect to bias and mean squared error, and I applied it to evaluate the impact of CD4 cell count at treatment initiation among HIV-infected patients.

Santacatterina, M., Bottai, M., Optimal probability weights for inference with constrained precision, JASA, 2018
Santacatterina, M., et al., Optimal probability weights for estimating causal effects of time-varying treatments with marginal structural Cox models, Statistics in Medicine, 2018
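
To make the extreme-weights problem concrete, the following is a minimal sketch of inverse probability weighting with a simple cap on the weights. It is not the optimal-weights method described above: the fixed cap is a stand-in for percentile truncation, the propensity scores are assumed known rather than estimated, and all function names and toy values are hypothetical.

```python
from typing import List, Sequence


def ipw_weights(treated: Sequence[int], propensity: Sequence[float]) -> List[float]:
    """Inverse probability weights: 1/e(x) if treated, 1/(1 - e(x)) otherwise."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]


def truncate(weights: Sequence[float], cap: float) -> List[float]:
    """Cap every weight at a fixed value (assumed stand-in for percentile truncation)."""
    return [min(w, cap) for w in weights]


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish's effective sample size, (sum w)^2 / sum w^2; small values signal low precision."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_ate(outcome: Sequence[float], treated: Sequence[int],
                 weights: Sequence[float]) -> float:
    """Normalized (Hajek) weighted difference in mean outcomes, treated minus control."""
    num1 = sum(w * y for y, t, w in zip(outcome, treated, weights) if t == 1)
    den1 = sum(w for t, w in zip(treated, weights) if t == 1)
    num0 = sum(w * y for y, t, w in zip(outcome, treated, weights) if t == 0)
    den0 = sum(w for t, w in zip(treated, weights) if t == 0)
    return num1 / den1 - num0 / den0


# Toy data (hypothetical): the third unit is treated despite a propensity of 0.02,
# a near-violation of positivity that produces a weight of about 50.
treated = [1, 1, 1, 0, 0, 0]
propensity = [0.8, 0.5, 0.02, 0.5, 0.2, 0.9]
outcome = [3.0, 2.5, 6.0, 1.0, 1.5, 2.0]

raw = ipw_weights(treated, propensity)
capped = truncate(raw, cap=5.0)
print("ESS raw:", effective_sample_size(raw))
print("ESS capped:", effective_sample_size(capped))
print("ATE raw:", weighted_ate(outcome, treated, raw))
print("ATE capped:", weighted_ate(outcome, treated, capped))
```

Capping raises the effective sample size but changes the estimand in an ad hoc way, which is the trade-off between bias and precision that motivates choosing weights through an explicit optimization problem instead.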

Mitigating model misspecification while controlling extreme weights

To mitigate possible model misspecification while simultaneously controlling precision, I worked on a method that finds weights that directly maximize covariate balance. I found that the proposed methodology outperformed state-of-the-art techniques, and I applied it to evaluate the effect of a spine surgical intervention on patient-reported outcomes.

Kallus, N., Pennicooke, B., Santacatterina, M., More robust estimation of sample average treatment effects using Kernel Optimal Matching in an observational study of spine surgical interventions, under review at Statistics in Medicine, 2019
Kallus, N., Santacatterina, M., Optimal Estimation of Generalized Average Treatment Effects via Kernel Optimal Matching, 2019
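
As a rough illustration of what "weights that directly balance covariates" can mean, the sketch below finds the minimum-variance weights on control units that sum to one and exactly match a target covariate mean. This is a deliberately simplified, assumed setup (one scalar covariate, mean balance only, no nonnegativity constraint) and not Kernel Optimal Matching; the function name and toy values are hypothetical.

```python
from typing import List, Sequence


def min_variance_balancing_weights(control_x: Sequence[float],
                                   target_mean: float) -> List[float]:
    """Minimize sum(w_i^2) subject to sum(w_i) = 1 and sum(w_i * x_i) = target_mean.

    The Lagrangian solution is linear in x: w_i = alpha + beta * x_i, where
    alpha and beta solve the 2x2 system given by the two constraints.
    Weights may be negative because nonnegativity is not enforced here.
    """
    n = len(control_x)
    sx = sum(control_x)
    sxx = sum(x * x for x in control_x)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("covariate has no variation; the mean cannot be balanced")
    alpha = (sxx - target_mean * sx) / det
    beta = (n * target_mean - sx) / det
    return [alpha + beta * x for x in control_x]


# Toy data (hypothetical): reweight controls so their covariate mean
# matches a treated-group mean of 2.0.
control_x = [0.0, 1.0, 2.0, 3.0]
weights = min_variance_balancing_weights(control_x, target_mean=2.0)
balanced_mean = sum(w * x for w, x in zip(weights, control_x))
print("weights:", weights)
print("weighted control mean:", balanced_mean)
```

Minimizing the sum of squared weights is one simple way to control precision while the constraints enforce balance; a practical method would balance richer functions of the covariates and restrict the weights to be nonnegative.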

I extended the proposed methodology to control for time-dependent confounders, which are confounders that are affected by previous treatments and affect future ones.

Kallus, N., Santacatterina, M., Optimal balancing of time-dependent confounders for marginal structural models, 2019

I extended the proposed methodology to optimally balance confounders when estimating causal effects of continuous treatments.

Kallus, N., Santacatterina, M., Kernel Optimal Orthogonality Weighting: A Balancing Approach to Estimating Effects of Continuous Treatment, Work in progress, 2019

I proposed new robust weights that balance confounders for both binary and continuous treatments with time-to-event data. I applied the proposed weights to evaluate the effect of hormone therapy on time to coronary heart disease and the effect of red meat consumption on time to colon cancer among 24,069 postmenopausal women enrolled in the Women’s Health Initiative observational study.

Santacatterina, M., Robust weights that optimally balance confounders for estimating the effect of binary and continuous treatments with time-to-event data, 2020

Generalizing trials’ results

Because trial participants may not be representative of the real-world population, I developed a biostatistical methodology that provides robust weights to generalize trials’ results to target populations. I found that the method outperformed existing state-of-the-art methods, and I applied it to generalize the effect of peer support on viral failure among HIV-infected patients from Vietnam to the United States.

Kallus, N., Santacatterina, M., Optimal Estimation of Generalized Average Treatment Effects via Kernel Optimal Matching, 2019

Precision medicine

To evaluate and learn optimal individualized treatment regimes, I proposed the use of a doubly robust estimator and presented a method that optimally combines weighting and outcome modeling techniques. I found that the proposed methodology performed well in terms of mean squared error and bias.

Su, Y., Wang, L., Santacatterina, M., Joachims, T., CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning, Accepted at NeurIPS 2018 Workshop on Causal Learning and ICML 2019, 2019
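
The idea of combining weighting with outcome modeling can be sketched with the standard doubly robust estimator of a policy's value from logged data. This is a generic textbook form, not the CAB estimator itself; the logged data, the deterministic policy, and the reward model below are all hypothetical.

```python
from typing import Callable, Sequence


def doubly_robust_value(contexts: Sequence[int],
                        actions: Sequence[int],
                        rewards: Sequence[float],
                        propensities: Sequence[float],
                        policy: Callable[[int], int],
                        reward_model: Callable[[int, int], float]) -> float:
    """Doubly robust estimate of the value of a deterministic target policy.

    Each term starts from the outcome-model prediction for the action the
    policy would take, then adds an importance-weighted residual whenever
    the logged action agrees with the policy.
    """
    total = 0.0
    for x, a, r, p in zip(contexts, actions, rewards, propensities):
        target_action = policy(x)
        estimate = reward_model(x, target_action)   # outcome-modeling term
        if a == target_action:                      # weighting (correction) term
            estimate += (r - reward_model(x, a)) / p
        total += estimate
    return total / len(contexts)


# Toy logged data (hypothetical): actions were assigned with probability 0.5,
# and the reward is 1 when action 1 is taken and 0 otherwise.
contexts = [0, 1, 0, 1]
actions = [1, 0, 0, 1]
rewards = [1.0, 0.0, 0.0, 1.0]
propensities = [0.5, 0.5, 0.5, 0.5]


def always_treat(x: int) -> int:
    return 1


def correct_model(x: int, a: int) -> float:
    return 1.0 if a == 1 else 0.0


def zero_model(x: int, a: int) -> float:
    return 0.0


print(doubly_robust_value(contexts, actions, rewards, propensities,
                          always_treat, correct_model))
print(doubly_robust_value(contexts, actions, rewards, propensities,
                          always_treat, zero_model))
```

With a correct reward model the correction term vanishes, and with an all-zero reward model the estimator reduces to inverse propensity scoring; the estimator is consistent if either the model or the propensities are correct, which is the sense in which it is doubly robust.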

Prediction

To accurately predict COVID-19-related outcomes, such as death and ICU admission, colleagues at Cornell and Google and I developed a novel deep-learning architecture that improves predictions by using longitudinal X-rays.

Shu, M., Bowen, R. S., Herrmann, C., Qi, G., Santacatterina, M., Zabih, R., Deep survival analysis with longitudinal X-rays for COVID-19, ICCV, 2021