Our research focuses on the development of machine learning techniques for application to problems in molecular biology. We approach these problems using Bayesian techniques such as hidden Markov models, as well as support vector machines and related, non-Bayesian methods. Much of our work addresses two core problems in machine learning: incorporating domain-specific prior knowledge and learning from heterogeneous data. We apply our techniques to problems such as automatic gene finding, microarray expression analysis, gene functional classification, and protein remote homology detection.

Selected Publications:

Ferhat Ay, Evelien M. Bunnik, Nelle Varoquaux, Sebastian M. Bol, Jacques Prudhomme, Jean-Philippe Vert, William Stafford Noble and Karine G. Le Roch. "Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression." Genome Research.

Ferhat Ay, Timothy L. Bailey and William Stafford Noble. "Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts." Genome Research. 

Michael Hoffman, Jason Ernst, Steve P. Wilder, Anshul Kundaje, Robert S. Harris, Max Libbrecht, Belinda Giardine, Paul Ellenbogen, Jeff A. Bilmes, Ewan Birney, Ross C. Hardison, Ian Dunham, Manolis Kellis and William Stafford Noble. "Integrative annotation of chromatin elements from ENCODE data." Nucleic Acids Research.

Ajit P. Singh, John Halloran, Jeff Bilmes and William Stafford Noble. "Spectrum identification using a dynamic Bayesian network model of tandem mass spectra." Uncertainty in Artificial Intelligence: Proceedings of the Twenty-Eighth Conference. Aug. 15-17, 2012. pp. 775-784.

Yanjun Qi, Merja Oja, Jason Weston and William Stafford Noble. "A unified multitask architecture for predicting local protein properties." PLoS One. 7(3):e32235, 2012.

Marina Spivak, Jason Weston, Michael J. MacCoss and William Stafford Noble. "Direct maximization of protein identifications from tandem mass spectra." Molecular and Cellular Proteomics. 11(2):M111.012161, 2012.

Gabriel Cuellar-Partida, Fabian A. Buske, R. C. McLeay, Tom Whitington, William Stafford Noble and Timothy L. Bailey. "Epigenetic priors for identifying active transcription factor binding sites." Bioinformatics. 28:56-62, 2012.

Additional publication listings are available via PubMed.