Deep Neural Networks Really Are the Right Way for a Biostatistician to Analyze Biological Data or: How I Learned to Stop Worrying and Love the DNN

September 16, 2024

12:00 pm to 1:00 pm

French Family Science Center 4233

Event sponsored by

Computational Biology and Bioinformatics (CBB)

Biostatistics and Bioinformatics

Duke Center for Genomic and Computational Biology (GCB)

Precision Genomics Collaboratory

School of Medicine (SOM)

Contact

Franklin, Monica

Speaker

David Page

I admit the title intentionally overstates the case. But many (most?) high-throughput biology datasets are based on aggregates, where aggregation occurs either during the experiment or during data post-processing. As a result, any node in a graphical model of the data (e.g., a Bayes net, dynamic Bayes net, Markov net, point process, or conditional random field (CRF)) really is an aggregate of many idealized single-measurement nodes, so the real model can be viewed as a high-dimensional tree-structured graphical model. We prove that such models correspond to neural networks, and also that every neural network can be viewed as such a model. Based on this theoretical result, we discuss potential applications, including causal neural networks and the potential for a future "foundation model" for health. We also use examples from clinical data (such as electronic health records, EHRs) in addition to biological data.
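As a rough illustration of what a correspondence between a tree-structured graphical model and a neural network can look like, the sketch below uses a classic special case and is not the speaker's proof. It takes a naive Bayes model (a star-shaped tree with one binary class node and three binary measurement nodes) and checks that its exact posterior equals a single sigmoid neuron. All probabilities and function names are hypothetical, chosen only for this example.

```python
import math
from itertools import product

# Hypothetical parameters, chosen only for illustration.
PRIOR = 0.3                      # P(Y = 1)
P_X_GIVEN_Y1 = [0.9, 0.2, 0.6]   # P(X_i = 1 | Y = 1)
P_X_GIVEN_Y0 = [0.4, 0.5, 0.1]   # P(X_i = 1 | Y = 0)


def posterior_bayes(x):
    """P(Y = 1 | x), computed directly from the graphical model with Bayes' rule."""
    joint1, joint0 = PRIOR, 1.0 - PRIOR
    for xi, p1, p0 in zip(x, P_X_GIVEN_Y1, P_X_GIVEN_Y0):
        joint1 *= p1 if xi else 1.0 - p1
        joint0 *= p0 if xi else 1.0 - p0
    return joint1 / (joint1 + joint0)


def neuron_parameters():
    """Weights and bias of the sigmoid neuron implied by the model's log-odds."""
    weights = [
        math.log(p1 * (1.0 - p0) / (p0 * (1.0 - p1)))
        for p1, p0 in zip(P_X_GIVEN_Y1, P_X_GIVEN_Y0)
    ]
    bias = math.log(PRIOR / (1.0 - PRIOR)) + sum(
        math.log((1.0 - p1) / (1.0 - p0))
        for p1, p0 in zip(P_X_GIVEN_Y1, P_X_GIVEN_Y0)
    )
    return weights, bias


def posterior_neuron(x):
    """P(Y = 1 | x), computed as sigmoid(bias + weights . x)."""
    weights, bias = neuron_parameters()
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def max_gap():
    """Largest disagreement between the two computations over all 8 inputs."""
    return max(
        abs(posterior_bayes(x) - posterior_neuron(x))
        for x in product([0, 1], repeat=3)
    )


if __name__ == "__main__":
    print(f"max |bayes - neuron| = {max_gap():.2e}")
```

The two computations agree up to floating-point error because the log-odds of a naive Bayes model are linear in the binary inputs. The talk's result concerns a much broader class of models, which this sketch does not attempt to cover.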

Event Series

CBB Monday Seminar Series