11/17/23

Identifying Genetics Signals Adaptively and Reproducibly

We consider problems where many somewhat redundant hypotheses are tested and the goal is to report the most precise rejections while controlling the false discovery rate (FDR). For example, a common goal in genetics is to identify DNA variants that carry distinct information on a trait of interest. However, strong local dependencies between nearby variants make it challenging to distinguish which of the many correlated features most directly influence the phenotype. A common solution is to identify sets of variants that cover the truly important ones. Depending on the signal strength, the contributions of individual variants can be resolved with more or less precision. Ensuring FDR control on the reported findings with these adaptive searches is, however, often impossible. To design a multiple comparison procedure that allows for an adaptive choice of resolution with FDR control, we leverage e-values and linear programming. We adapt this approach to problems where knockoffs and group knockoffs have been successfully applied to test conditional independence hypotheses. We demonstrate its efficacy by analyzing data from the UK Biobank. Joint work with Paula Gablenz.

Chiara grew up in Italy, where she attended Liceo Classico Arnaldo in Brescia and obtained a master's degree in "Economics and Social Sciences" (DES) from the Luigi Bocconi University in Milan in 1993. She came to Stanford in 1994 to pursue a PhD in Statistics, and in 2000 she joined the faculty at UCLA in the departments of Human Genetics and Statistics. After 9 years, she came back north with her family and currently lives and works at Stanford. Chiara's research is centered on the development of statistical methods that enable the exploration of high-dimensional data. This entails both reducing computational barriers and ensuring that the results obtained by sifting through a large number of variables are reliable, reproducible, and robust. Her work is by nature interdisciplinary: she has enjoyed collaborating with neuroscientists, engineers, chemists, psychiatrists, oncologists, and more in her home institutions and around the globe.
