Research in Methodology and Biostatistics
I am broadly interested in problems of causal inference and prediction. In particular, I enjoy working on the development of novel methodologies for addressing scientific questions using complex observational data subject to sampling biases.
Methodological / Theoretical Interests
My statistical research generally lies at the intersection of causal inference and statistical epidemiology. Below, I list the four topics I have been most involved in to date. Further details can be found by clicking on each topic.
Propensity Score Analyses
Much of my work addresses observational data, where the goal is to establish a causal relationship between an exposure and an outcome. I have applied several widely used propensity score approaches and evaluated the performance of these methods in different settings. However, several questions arise when using these methods in practice, chiefly regarding the form of the propensity score model and the choice of the variables to include in it. Hence, one of my goals is to take advantage of the great advances in machine learning to improve the performance of propensity-based estimators. For instance, I am currently working on using the Super Learner, a stacking-based method proposed by A. Hubbard and M. van der Laan, to model the propensity score. Another approach is to improve the selection of the variables to include in the propensity score model, by proposing ad hoc balance summary measures and/or supervised clustering methods.
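The idea of choosing the propensity score model in a data-driven way can be illustrated with a minimal sketch. It runs on simulated data and is not the analysis code behind the work described above. It implements only the simplest variant of the approach, the cross-validation selector sometimes called the discrete Super Learner, which picks the single candidate with the lowest cross-validated loss instead of forming a weighted combination of candidates. The data-generating process, the two candidate models, and all function names are assumptions made for illustration.

```python
import math
import random


def simulate(n, seed=0):
    """Simulate rows (w, a, y); the true average treatment effect is 1.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0          # binary confounder
        p_a = 0.7 if w == 1 else 0.3                # exposure depends on w
        a = 1 if rng.random() < p_a else 0
        y = 1.0 * a + 2.0 * w + rng.gauss(0.0, 1.0)  # outcome depends on a and w
        rows.append((w, a, y))
    return rows


# Hypothetical candidate propensity models: each returns g(w) = P(A=1 | W=w).
def fit_marginal(train):
    p = sum(a for _, a, _ in train) / len(train)    # ignores the confounder
    return lambda w: p


def fit_stratified(train):
    probs = {}
    for level in (0, 1):
        exposures = [a for w, a, _ in train if w == level]
        probs[level] = (sum(exposures) + 1) / (len(exposures) + 2)  # smoothed
    return lambda w: probs[w]


def cv_log_loss(fit, rows, k=5):
    """K-fold cross-validated negative log-likelihood of a propensity model."""
    total = 0.0
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        valid = [r for i, r in enumerate(rows) if i % k == fold]
        g = fit(train)
        for w, a, _ in valid:
            p = min(max(g(w), 1e-6), 1 - 1e-6)      # avoid log(0)
            total -= a * math.log(p) + (1 - a) * math.log(1 - p)
    return total / len(rows)


def iptw_ate(rows, g):
    """Normalized inverse-probability-of-treatment-weighted effect estimate."""
    num1 = den1 = num0 = den0 = 0.0
    for w, a, y in rows:
        p = g(w)
        if a == 1:
            num1 += y / p
            den1 += 1 / p
        else:
            num0 += y / (1 - p)
            den0 += 1 / (1 - p)
    return num1 / den1 - num0 / den0


def naive_difference(rows):
    """Unadjusted difference in mean outcome between exposed and unexposed."""
    y1 = [y for _, a, y in rows if a == 1]
    y0 = [y for _, a, y in rows if a == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


data = simulate(20000, seed=1)
candidates = {"marginal": fit_marginal, "stratified": fit_stratified}
cv_risk = {name: cv_log_loss(fit, data) for name, fit in candidates.items()}
best = min(cv_risk, key=cv_risk.get)                # cross-validation selector
ate = iptw_ate(data, candidates[best](data))
naive = naive_difference(data)
print(best, round(ate, 2), round(naive, 2))
```

In this simulation the unadjusted difference is biased upward by the confounder, while weighting by the selected propensity model brings the estimate back close to the true effect of 1.0. A full Super Learner would replace the selection step with a cross-validated weighted combination of a richer library of learners.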
Causal Inference for Longitudinal Data Structures
I started working in Biostatistics on the clinical question of estimating, from observational data, the benefit of ICU admission in patients who undergo multiple successive triages. This is a typical problem of causal inference for multiple time-point data. My contribution has been to evaluate different statistical approaches to this problem, based on marginal structural models on the one hand and on instrumental variables on the other.
Targeted Maximum Likelihood Estimation for Variable Importance Measure
Measuring variable importance is crucial in Anesthesiology and Critical Care, because it tells clinicians which variables they should pay attention to for their patients. I am currently working on applying methods from the causal inference world to this question. Specifically, collaborative targeted maximum likelihood estimation (cTMLE) seems to offer great advantages in the context of high-dimensional, sparse observational data.
Machine Learning, Prediction and Personalized Medicine in the ICU
Predicting the risk of death is crucial in Critical Care, as it allows patients to be stratified for both clinical and research purposes. I realized that the poor performance of the scores currently in use could be due to the fact that nearly all of them rely on parametric models. I am currently working on optimizing ICU mortality prediction tools by using innovative machine learning methods, such as the Super Learner. Based on such methods, I have been able to propose a new prediction algorithm that clearly outperforms the SAPS II and SOFA scores.
By clicking on the link below, you will be able to use this new prediction algorithm and obtain a probability of death from the same set of variables as that used for the SAPS II score.
I am also interested in harnessing the Big Data generated by continuous ICU monitoring, leveraging upcoming innovations in online machine learning to ultimately provide personalized, real-time prediction and decision tools that help the clinician in the ICU. Together with Prof. Antoine Chambaz, I recently created a research team dedicated to this goal (ACTERREA), which was officially labelled by the French National Centre for Scientific Research (CNRS).