Past Combi Seminars

Winter 2017

January 18 - Sergey Ovchinnikov
Baker Lab, University of Washington
"Protein structure determination using metagenome sequence data"

January 11 - Dr. James Thomas
University of Washington
"Genome sequence resources: where to find them and some of the things you can do with them"

Autumn 2016

December 7 - Dr. James Mullins
University of Washington
"Understanding and Eliminating HIV Reservoirs"

November 30 - Dr. Orr Ashenberg
Bloom Lab, Fred Hutchinson Cancer Research Center
"Mapping how innate immunity can shape influenza evolution"

November 16 - Dr. Heidi Dvinge
Bradley Lab, Fred Hutchinson Cancer Research Center
"RNA splicing as a driver of cancer"

November 9 - Dr. David Baker
University of Washington
"Post-Evolutionary Biology: Design of novel protein structures, functions and assemblies"

October 26 - Dr. Jesse Bloom
Fred Hutchinson Cancer Research Center
"Statistical tests of the adequacy of high-throughput functional experiments for describing actual biological selection in nature"

October 19 - Dr. Gytis Dudas
Bedford Lab, Fred Hutchinson Cancer Research Center
"Virus genomes reveal the factors that spread and sustained the West African Ebola epidemic"

October 12 - Dr. Peter Skene
Henikoff Lab, Fred Hutchinson Cancer Research Center
"CUT&RUN: An efficient alternative strategy for high-resolution mapping of DNA binding sites"

October 5 - Dr. Hong Qian
University of Washington
"The landscape of cellular biochemical dynamics - a mathematical theory"

September 28 -

CMB Student Research Symposium

Safiye Celik (thesis advisor: Su-In Lee)
"From Big Data to Mechanisms: Inferring module-based drivers of disease pathology and progression"

Ryan McGee (thesis advisor: Carl Bergstrom)
"No Pain, No Gain: Information Acquisition in the Genome"

Cecilia Noecker (thesis advisor: Elhanan Borenstein)
"MIMOSA: An integrative modeling framework for linking ecological and metabolic microbiome variation"


Spring 2016

June 1, postdoc research talk:

Dr. Cristina Valensisi

"Epigenomic landscape of hESC-derived neural rosettes: modeling neural tube formation and diseases"

May 18, postdoc research talk:

Dr. Josh Schraiber
“Archaic Metagenomics: Learning about archaic hominids by the traces they left in modern humans”

Dr. Damien Wilburn
"The Rapid Evolution of Sexy Protein Structures: Insights from NMR Analysis of Abalone Sperm Lysin"

May 11, Combi Seminar: Dr. Rachel Karchin
Johns Hopkins University
"Subclonal Hierarchy Inference from Somatic Mutations: Automatic Reconstruction of Cancer Evolutionary Trees From Multi-Region Next Generation Sequencing"

May 4, postdoc research talk: Dr. Mark Chaisson
"Resolving segmental duplication content de novo through assembly by phasing"

April 27, postdoc research talk: Dr. Adrian Verster
"Bacterial warfare in the gut microbiome"

April 20, postdoc research talk: Dr. Emma Timmins-Schiffman
"Critical decisions in metaproteomics: Achieving high confidence protein annotations in a sea of unknowns"

April 13, Combi Seminar: Dr. Jeff Vierstra
Altius Institute for Biomedical Sciences
“New tools for determining transcription factor occupancy and function”

March 30, Combi Seminar: Dr. Evandro Ferrada
Fowler Lab
University of Washington
"Conserved patterns of first-order intragenic epistasis shape the local adaptive landscape of proteins"


Winter 2016

March 9 - Dr. Erick Matsen
Fred Hutchinson Cancer Research Center
"Looking back in evolutionary time to guide parameter estimates for within-host antibody affinity maturation"

March 2 - Dr. Elhanan Borenstein
University of Washington
“Challenges and opportunities in cross-meta-omic analysis of the human microbiome”

February 17 - Dr. Arvind Subramaniam
Fred Hutchinson Cancer Research Center
"Quantitative Principles Underlying the Regulation of Protein Synthesis"

February 10 - Dr. Adam Phillippy
"How bioinformatics helped crack the Amerithrax case"


In the fall of 2001, at least five letters laced with deadly anthrax spores were mailed across the United States. Five died, many others were sickened, and a nation already reeling from 9-11 was further terrified. The resulting investigation was the most complex and expensive in U.S. history. I will retell the story of how The Institute for Genomic Research (TIGR) sequenced multiple whole genomes of Bacillus anthracis to determine the source of these attacks--an incredible feat at the time. This work pioneered the field of microbial forensics and foretold the future of outbreak genomics, for which this type of analysis is now routine.

February 3 - Dr. Tandy Warnow
University of Illinois
"Grand Challenges in Phylogenomics"

Estimating the Tree of Life will likely involve a two-step procedure, where in the first step trees are estimated on many genes, and then the gene trees are combined into a tree on all the taxa. However, the true gene trees may not agree with the species tree due to biological processes such as deep coalescence, gene duplication and loss, and horizontal gene transfer. Statistically consistent methods based on the multi-species coalescent model have been developed to estimate species trees in the presence of incomplete lineage sorting (ILS); however, the relative accuracy of these methods compared to the usual "concatenation" approach is a matter of substantial debate within the research community.

I will present results showing that coalescent-based estimation methods are impacted by gene tree estimation error, so that they can be less accurate than concatenation in many cases. I will also present two new methods (ASTRAL and statistical binning) for estimating species trees in the presence of gene tree conflict due to ILS. Statistical binning and weighted statistical binning are used to improve gene tree estimation, while ASTRAL is a coalescent-based method that is provably statistically consistent and can construct species trees with 1000 species. Key to these methods is addressing gene tree estimation error more effectively. Finally, I present theoretical results investigating whether statistically consistent, accurate species tree estimation is possible when gene trees have estimation error.

Relevant papers:
ASTRAL (Mirarab et al., Bioinformatics 2014)
ASTRAL-2 (Mirarab and Warnow, Bioinformatics 2015)
Statistical Binning (Mirarab et al., Science 2014)
Weighted Statistical Binning (Bayzid et al., PLOS One 2015)
Comment by Liu and Edwards (Science 2015)
Response to Comment by Mirarab et al. (Science 2015)
Theory given bounded number of sites (Roch and Warnow, Systematic Biology 2015)
Thousand Plant Transcriptome Project (Wickett, Mirarab et al., PNAS 2014)
Avian Phylogenomics Project (Jarvis, Mirarab et al., Science 2014)

January 27 - Dr. Trevor Bedford
Fred Hutchinson Cancer Research Center
"Real-time tracking of virus evolution"

January 20 - Dr. William Noble
University of Washington
"Two applications of submodular optimization: choosing genomics assays and selecting representative protein sequences"

January 13 - Brendan MacLean
MacCoss Lab, University of Washington
"Skyline: Lessons in creating broadly used software for research"

Autumn 2015

December 9 - Dr. Christian Landry
"Speciation driven by hybridization and chromosomal plasticity in a wild yeast"
Hybridization is recognized as a powerful mechanism of speciation and a driving force in generating biodiversity. However, only a few multicellular species, limited to a handful of plants and animals, have been shown to fulfill all the criteria of homoploid hybrid speciation. This lack of evidence could lead to the misconception that speciation by hybridization has a limited role in eukaryotes, particularly in single-celled organisms. Laboratory experiments have revealed that fungi such as budding yeasts can rapidly develop reproductive isolation and novel phenotypes through hybridization, showing that in principle homoploid speciation could occur in nature. Here we report a case of homoploid hybrid speciation in natural populations of the budding yeast Saccharomyces paradoxus inhabiting North American forests. We show that the rapid evolution of chromosome architecture and an ecological context that led to secondary contact between nascent species drove the formation of an incipient hybrid species with a potentially unique ecological niche.

December 2 - Siva Kasinathan
graduate student, Henikoff Lab
Fred Hutchinson Cancer Research Center
“Insights into evolution of primate centromeres from single molecule sequencing”

November 18 - Dr. Su-In Lee
University of Washington
"Unsupervised learning of features in gene expression data reveals potential novel markers in cancer"

November 4 - Dr. Jesse Bloom
Fred Hutchinson Cancer Research Center
"Leveraging high-throughput experimental data to better detect the sites of biologically interesting selection in protein-coding genes"

October 28 - Dr. Evan Eichler
University of Washington
"De novo assembly and resolving the complexity of human genetic variation using long reads"

October 14 - Dr. Noah Simon
University of Washington
"Adjusting for Selection Bias in High Throughput Experiments"

Abstract: With recent advances in high-throughput technology, researchers often find themselves running a large number of hypothesis tests (thousands+) and estimating a large number of effect sizes. Generally there is particular interest in those effects estimated to be most extreme. Unfortunately, naive estimates of these effect sizes (even after potentially accounting for multiplicity in a testing procedure) can be severely biased. In this talk we explore this bias from a frequentist perspective. We show that were the bias known a priori, one could build estimates that (potentially significantly) dominate our usual estimators, and bias-corrected confidence intervals. In practice the bias will be unknown, so we discuss a bootstrap procedure to estimate it. Unlike other proposals for debiasing estimates, our procedure implicitly adjusts for unknown dependence between the features.

Summer 2015

July 1 - Dr. Tim Bailey
"Mapping the regulation of transcription"

Winter 2015

March 11 - Dr. Erick Matsen
Fred Hutchinson Cancer Research Center
"Learning how antibodies are drafted and revised"

March 4 - Dr. Cole Trapnell
University of Washington
“Differential analysis of single-cell gene expression trajectories?”

February 25 - Dr. Ryan Emerson
Adaptive Biotechnologies
"Inferring CMV serostatus via T cell receptor sequencing"

February 18 - Dr. Tony Chiang
Ruzzo Lab, University of Washington
"Whole Genome Comparisons Reveal a Clonal Global Expansion of a Marine Eukaryotic Microbe"

Friday, February 13 - Dr. Ravi Pandya
Microsoft Research
"SNAP: Fast, accurate sequence alignment"

February 11 - Dr. Justin Guinney
Sage Bionetworks
"The Consensus Molecular Subtypes of Colorectal Cancer"

February 4 - Dr. David Baker
University of Washington
"Post evolutionary biology"

January 28 - Dr. Gabriel Zentner
Henikoff Lab, Fred Hutchinson Cancer Research Center
"Epigenomic insights into chromatin remodeling and transcriptional regulation"

January 21 - Dr. William Noble
University of Washington
"Mass spectrometrists should only search for peptides they care about"

January 14 - Dr. Jonathan Carlson
Microsoft Research
"HIV adaptation as a window into complex host-pathogen interactions"

January 7 - Daniel Jones
Ruzzo Lab, University of Washington
"A Hierarchical Model for RNA-Seq Experiments"

Autumn 2014

December 3 - Dr. Jesse Bloom
Fred Hutchinson Cancer Research Center

November 19 - Dr. Suleyman Gulsuner
King Lab
University of Washington
"Spatial and temporal mapping of de novo mutations in schizophrenia"

November 12 - Dr. David Hendrickson
Rinn Lab
Department of Stem Cell and Regenerative Biology
Harvard University

"Widespread binding of RNA by DNA Binding Proteins"


Recent evidence suggests that the activity and localization of DNA binding proteins, specifically chromatin-associated proteins, are regulated in part via association with RNA. However, it is unknown if this observation is a bespoke mechanism for a few key RNAs and chromatin factors in specific contexts, or a general mechanism underlying the establishment of chromatin state and regulation of gene expression. To determine which is the prevalent model, we introduce formaldehyde RNA ImmunoPrecipitation (fRIP-seq), a sensitive method for cataloging protein-RNA interactions, to survey the RNA associated with a panel of 26 chromatin modifiers and RNA binding proteins. For each protein that reproducibly bound measurable quantities of bulk RNA (90% of the panel), we detected enrichment for hundreds to thousands of both non-coding and mRNA transcripts. We found that the enriched sets of RNA share biochemical, functional, and epigenetic properties. Thus, these data provide strong evidence that non-random RNA association is a common feature across diverse classes of chromatin modifying complexes.

November 5 - Dr. Robert Bradley
Fred Hutchinson Cancer Research Center
"RNA splicing as a new mechanism of leukemogenesis"

October 29 - Alex Rosenberg
Seelig Lab
University of Washington
"Combining Synthetic Biology with Lessons from Big Data"

Understanding how gene expression is programmed into the DNA sequences in our genomes is a central objective in human genetics. While challenging, the task of unraveling a 3 billion base code is not completely unprecedented. Over the past decade, computer scientists working in natural language processing have made immense progress using algorithms that learn from enormous data sets. Inspired by this success of “big data” in traditional machine learning areas, we have applied synthetic biology to generate massive datasets profiling the biological function of many different DNA sequences. As a proof of principle, we have measured the alternative RNA splicing patterns of over 250,000 fully synthetic sequences, an order of magnitude more than exist in the natural genome. From these data, we have built a predictive sequence model of alternative splicing that outperforms state-of-the-art algorithms.

Alex Rosenberg is a 5th year graduate student in Georg Seelig's lab. His research combines machine learning and the development of new high throughput methods to understand gene regulation.

October 22 - Dr. Phil Green
University of Washington
"Efficient alignment and assembly of next-gen sequencing reads"

October 15 - Dr. Brigham Mecham
"Small business as an alternative career path for young scientists"

Brig Mecham graduated from the Genome Sciences PhD program in 2010. He then spent 2 years working at Sage Bionetworks before founding Trialomics, a company that specializes in the design, construction and management of web platforms for life science organizations. During his talk Dr. Mecham will discuss his alternative career path outside academia, as well as highlight ongoing projects that involve collaborations with research groups at the University of Washington.

October 8 - Dr. James Thomas
University of Washington
"Evolution of Fungal Genomes"

October 1 - Dr. Leonid Chindelevitch
"Modeling tuberculosis, from cells to populations"

Tuberculosis continues to afflict millions of people and causes over a million deaths a year worldwide. Multi-drug resistance is also on the rise, causing concern among public-health experts. This talk will give an overview of my work on modeling tuberculosis at various scales. On the cellular side I will describe models of the metabolism of M. tuberculosis, where insights from duality led to a consistent analysis of existing models, a systematic method for reconciling discrepant models, and the identification of putative drug targets. On the population side I will describe models of strain evolution, where a new metric combined with an optimization-based approach resulted in an accurate classification of complex infections as originating from mutation or mixed infection, as well as the identification of the strains composing these complex infections.

Winter 2014

April 23 - Dr. Remo Rohs
University of Southern California
"Efficient modeling of transcription factor binding specificities using DNA shape"

April 16 - Terry Farrah
Institute for Systems Biology
"The human proteome: using big proteomics data to confirm protein coding genes and discover new protein variants"

March 12 - Dr. William Noble
University of Washington
"On the importance of well calibrated scores for identifying shotgun proteomics spectra"

March 5 - Dr. Ali Shojaie
University of Washington

February 26 - Dr. Jesse Bloom
Fred Hutchinson Cancer Research Center
"An experimentally determined evolutionary model dramatically improves phylogenetic fit"

February 19 - Dr. Srinivas Ramachandran
Henikoff Lab, Fred Hutchinson Cancer Research Center
"Asymmetric Nucleosomes Poise Yeast Promoters for Activation"

February 12 - Dr. Larry Ruzzo
University of Washington
"Widespread conserved RNA structures implicated in genetic cis-regulation"

February 5 - Dr. Elhanan Borenstein
University of Washington
"Studying the human microbiome: From systems biology to meta-omic analysis"

January 29 - Dr. David Baker
University of Washington
"Design of protein structures, functions and assemblies"

Friday, January 24 - Dr. Gustavo Stolovitzky
IBM Computational Biology Center
"Seeking the Wisdom of the Crowds Through Challenge-Based Competitions in Biomedical Research"
2:00, Foege Auditorium

January 22 - Dr. Cole Trapnell
Harvard University
"Mapping Regulatory Networks with Single-Cell Transcriptomics in Cell Differentiation and Disease"
Foege Auditorium

January 15 - Dr. Dengke Ma
"Understanding the Genome for the Control of Animal Physiology and Behavior"
Foege Auditorium

January 8 - Dr. Roger Bumgarner
University of Washington
"Comparative Genomics of Propionibacteria from the skin and those associated with prosthetic implant failures"

Autumn 2013

December 4 - Dr. Hao Xiong
Katze Lab, University of Washington
"Turmoil in transcriptome: Using NGS and the CC Founders to unravel the host response to infection"

November 20 - Dr. Trevor Bedford
Fred Hutchinson Cancer Research Center
"Antigenic drift and geographic circulation in human influenza viruses"

November 6 - Dr. Michael Hawrylycz
The Allen Institute for Brain Science
"Large Scale Atlases of the Mouse and Human: Tools and Analyses"

October 30 - Dr. James Thomas
University of Washington
"The Vertebrate Tree of Life"

October 23 - Dr. Ferhat Ay
University of Washington
Noble Lab
"The dynamic three-dimensional model of the P. falciparum genome reveals the role of genome architecture in regulating gene expression"

October 16 - Dr. Jean-Philippe Vert
Curie Institute of Paris
Director of the Center for Computational Biology
"On finding breakpoints in DNA copy number profiles"

DNA reorganization, including amplification and deletion of particular genomic loci, is a hallmark of most cancers. Microarray- or sequencing-based technologies now make it possible to capture genome-wide profiles of DNA copy numbers, and in particular give information about the locations of DNA breakpoints. In this talk, I will discuss several methods to identify breakpoints in noisy signals, and highlight in particular a method involving partial expert annotation to boost the performance of existing techniques.

October 2 - Dr. Alexander Ratushny
Seattle Biomed and Institute for Systems Biology
"Mathematical modeling of dynamical biological systems"

Summer 2013

Wednesday, August 14 - Dr. Michal Linial
"The short proteins - A playground of evolution"
1:30, Foege S-110

Friday, July 26 - Dr. Anna Goldenberg
University of Toronto
3:30, Foege S-110

Title: Patient Network Fusion: a fast and effective method to aggregate multiple data types

Abstract: Recent technological advances have made large-scale collection of genomic, transcriptomic and epigenetic data rapid, cost-effective and widely available. Combining these multiple data types in a way that maximizes our knowledge of the disease or biological process in question is currently a major bottleneck. In this talk I will discuss our new approach that integrates multiple types of patient data by building similarity networks for each available data type and then combining them into a single network using Patient Network Fusion (PNF). PNF takes advantage of complementary information in different types of data, does not require gene pre-selection, and is fast and robust to different types of noise. We apply our method to a variety of biological datasets derived from Glioblastoma Multiforme (GBM) patients and show that our method both recapitulates prior knowledge and produces new biological insights. I will illustrate PNF's superior performance compared to single data type analysis and established integrative approaches with regard to survival subtype profiles and cluster coherence in a total of five cancers. If time permits, I will discuss the potential of this new technology for new applications and the impact on precision medicine.

Biosketch: Anna Goldenberg is a Scientist in the Genetics and Genome Biology program at the SickKids Research Institute and an Assistant Professor at the University of Toronto specializing in Computational Biology. She obtained her PhD in Machine Learning from Carnegie Mellon University in 2007 and since then has been exploring the field of Computational Biology, first as a postdoc at UPenn and then at UofT's Donnelly Center for Cellular and Molecular Biology. Her research focuses on developing methods to help address heterogeneity and identify mechanisms driving complex human diseases.

Tuesday, July 16 - Dr. Mary Goldman
1:00 - 2:30, Foege S-110

Winter 2013

March 13 - Dr. Nathan Price
Institute for Systems Biology
"Systems approaches to molecular diagnostics and biomolecular network analysis"

March 6 - Dr. Ka Yee Yeung
"Predicting relapse prior to transplantation in chronic myeloid leukemia by integrating expert knowledge and expression data"

February 27 - Seminar: The More-or-Less 10th Anniversary of the Human Genome Sequence
with a Discussion and Q & A among
Mary-Claire King
Maynard Olson
Robert Waterston

February 13 - Dr. William Noble
UW Genome Sciences
"Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts"

January 30 - Dr. Steven Henikoff
Fred Hutchinson Cancer Research Center
"High-resolution mapping of epigenome dynamics"

January 23 - Dr. Jesse Bloom
Fred Hutchinson Cancer Research Center
"Stability-mediated epistasis constrains the evolution of an influenza protein"

January 16 - Dr. Eugene Kolker
Seattle Children's Hospital
"Proteomes in profile"

January 9 - Dr. Anshul Kundaje
"Heterogeneity and dynamics of regulatory elements in the human genome"

Autumn 2012

December 5 - Dr. Hamid Bolouri
"Network analysis of Leukemia expression, whole-genome, and clinical data"

November 14 - Women in Genome Sciences seminar:
Dr. Anne Stone
Arizona State University
"TB and leprosy: origins and exchanges among humans and other primates"

November 7 - Dr. Uri Keich
University of Sydney
“Estimating the statistical significance of sequence motifs”

October 31 - Dr. Stephen Ramsey
Seattle Biomedical Research Institute
"An integrative systems approach to understanding cell type-specific gene regulatory networks: application to macrophages"

October 24 - Dr. Phil Green
University of Washington
"How much of the human genome is functional? -- The Sequel"

October 10 - Dr. Ling-Hong Hung
University of Washington
"Fast superposition and comparison of protein structures using OpenCL and GPUs"

October 3 - Dr. Seth Ament
Institute for Systems Biology
"Gene Networks Underlying Honey Bee Social Behavior and Human Bipolar Disorder"

Summer 2012

Thursday, September 6 - Dr. Matthew Rasmussen
Cornell University

"Bayesian Models for Genome-wide Analysis of Phylogenies and Populations"

Gene phylogenies provide a rich source of information about the way evolution shapes genomes, populations, and phenotypes. In addition to substitutions, evolutionary events such as gene duplication and loss play a major role in gene evolution, and many phylogenetic models have been developed in order to reconstruct and study these events. However, these models typically make the simplifying assumption that population-related effects such as incomplete lineage sorting (ILS) are negligible. While this assumption may have been reasonable in some settings, it has become increasingly problematic as increased genome sequencing has led to denser phylogenies, where effects such as ILS are more prominent. To address this challenge, we have developed a new probabilistic model, DLCoal, that defines gene duplication and loss in a population setting, such that coalescence and ILS can be directly addressed. Interestingly, this model implies that in addition to the usual gene tree and species tree there exists a third tree, the locus tree, which will likely have many applications. Using this model, we have developed the first general reconciliation method that accurately infers gene duplications and losses in the presence of ILS, and we show its improved inference of orthologs, paralogs, duplications, and losses for a variety of clades, including flies, fungi, and primates. Also, our simulations show that gene duplications increase the frequency of ILS, further illustrating the importance of a joint model.

We have also developed a new method for reconstructing the history of genomic regions across several individuals within a single species. Such histories, called Ancestral Recombination Graphs (ARGs), if efficiently inferred can have many applications within population genetics, ranging from detecting selection to characterizing demography. Although many methods exist for sampling ARGs, they are often computationally intensive. Our approach is to build upon the Sequentially Markov Coalescent model to develop a new Gibbs sampling method that can work on genomic scales. Our sampling method is based on the idea of sampling an ARG of n chromosomes (haplotypes) by conditioning on a given ARG of n-1 chromosomes. We call this "threading" a chromosome into an ARG, and this new threading method will likely have many applications.

August 15 - Dr. Timothy Bailey
University of Queensland

Wednesday, July 11
Dr. Paul Horton
1:30, Foege Auditorium

Excavating human nuclear mtDNAs

NUMTs (Nuclear mtDNA) are partial copies of the mitochondrial genome found in the nuclear genome. They are sometimes referred to as molecular fossils, and, due to the higher mutation rate of mtDNA, can in some cases be more similar to parts of our ancestral mtDNA than our extant mtDNA genome is.

The existence of NUMTs has been known for decades and many informatics studies on NUMTs have attempted to elucidate the characteristics of their insertion sites. By showing that NUMTs are typically very clean insertions with only minimal deletion or duplication of the surrounding nuclear DNA, these studies have led to a consensus opinion that most NUMTs are likely inserted as filler DNA via NHEJ (Non-Homologous End Joining).

Previous informatics studies have not shed much light upon the preferred insertion sites of NUMTs. Most of them conclude that NUMT insertion is random -- except for contradictory reports that NUMTs correlate positively, or negatively, with retrotransposons. Fortunately, by employing more careful methodology, we were able to uncover several previously unknown aspects of this phenomenon.

We found that inferred NUMT insertion sites strongly correlate with predicted physical properties of DNA (curvature and bendability) and A+T rich oligomers. Moreover, recently inserted NUMTs correlate strongly with nucleosome-free regions as measured by DNase-seq and FAIRE-seq. We also firmly establish that NUMTs do indeed tend to co-occur with retrotransposons. As for the source mtDNA which is copied to create NUMTs, we find that part of the mtDNA D-loop region is very seldom copied.

Relating these facts to concrete hypotheses regarding the mechanism of NUMT insertion proved very challenging, but also fascinating, as it touched upon diverse topics in molecular biology: from retrotransposon activity and DNA repair to evolutionary conservation of chromatin structure and the packaging of mtDNA.

Tsuji et al., in press, NAR.


Spring 2012

Thursday, May 10 -

Dr. Rui Kuang
University of Minnesota
"Network-based Phenome-Genome Association Analysis"
12:00, Foege Auditorium

In the past decade, genome-wide studies of disease-gene associations enabled by high-throughput genomic technologies have revealed many disease-causing genes. Building on these associations, a promising next step is to perform large-scale association analysis between all genes and the complete collection of phenotypes (phenome). In this talk, I will introduce our research work on three aspects of phenome-genome association analysis: 1) disease gene prioritization, which determines candidate disease genes for a single query disease phenotype; 2) phenome-genome association prediction, which reconstructs the complete phenotype-gene associations; and 3) module-based phenome-genome analysis, which co-clusters phenotypes and genes, and simultaneously identifies associations between the detected phenotype clusters and gene clusters. We designed subnetwork-based label propagation, bi-random walk and regularized non-negative matrix tri-factorization for the three problems, respectively. Promising results were observed in the cross-validation experiments on the disease phenotype-gene associations in OMIM. Examples of interesting findings in neurological, psychiatric and gastrointestinal disease phenotypes will also be discussed.

Bio: Dr. Rui Kuang is an Assistant Professor in computer science and engineering at the University of Minnesota Twin Cities. His research interests span computational biology, machine learning and biomedical/health informatics. He specializes in machine learning algorithms and network analysis methods for understanding the molecular characteristics of disease phenotypes from high-throughput genomic sequence data and array-based expression data. His current projects center around cancer genomics, disease phenome-genome association, and analysis of protein structures and functions. Dr. Kuang is a recipient of an NSF CAREER award. He received his PhD from Columbia University, MS from Temple University and BS from Nankai University, all in computer science.

April 4 -

11:30 - Dr. Brendan Frey
University of Toronto
"The DNA programs controlling splicing"
Foege Auditorium

Viewing the genome as the software that controls the development of cells, tissues and organisms, my group uses high-throughput expression data and matched genome data to infer the genetic code that governs gene expression. We are currently focussing on alternative splicing, which cells use to dynamically modify over 90% of human genes, such as genes critical for synaptic plasticity. Unknowns such as trans-factor expression levels, protein binding sites, the effects of RNA secondary structures, and transcription rate-dependent effects are computationally inferred. We find that the resulting code can be used to reveal novel regulatory mechanisms, predict expression patterns under previously unexamined conditions, and simulate the effects of disease mutations with high accuracy.

March 28 - Dr. Tomer Hertz
Fred Hutchinson Cancer Research Center
"A computational approach for identifying host specificity determinants in zoonotic viruses"

Winter 2012

March 14 - Dr. William Noble
University of Washington

March 7 - Dr. Daniela Witten
University of Washington

February 15 - Dr. Ilya Shmulevich
Institute for Systems Biology
“Integrative Analysis and Interactive Exploration of Data from The Cancer Genome Atlas”

February 8 - Dr. Erick Matsen
Fred Hutchinson Cancer Research Center
“A phylogenetic approach to the analysis of metagenomes”

February 1 - Dr. Peter Myler
Seattle BioMed
"Insights into Leishmania biology from transcriptome analysis using RNA-seq"

January 11 - Dr. Jesse Bloom
Fred Hutchinson Cancer Research Center
"Mapping epistasis in evolution"

January 4 - Dr. Roger Bumgarner
"Comparative genomics of human bacterial pathogens"

Autumn 2011

December 7 - Noah Iliinsky
"Effective Data Visualization"

November 16 - Dr. Michael Eisen
UC Berkeley
sponsored jointly with the Genome Sciences seminar series

November 9 - Dr. Phil Green
University of Washington
"How much of the human genome is functional?"

October 12 - Dr. David Baker
University of Washington
"Scientific discovery by protein folding game players"

October 5 - Dr. Colin Dewey
University of Wisconsin
"Enabling transcript quantification in non-model organisms with RNA-Seq and generative probabilistic models"

Spring 2011

Combi Seminar, June 1 - Dr. Ram Samudrala
University of Washington

April 27 - Dr. Cenk Sahinalp
Simon Fraser University
“Population scale detection of common and rare genomic rearrangements and transcriptomic aberrations”

April 13 - Dr. Ka Yee Yeung
University of Washington
"Construction of regulatory networks using expression time series data of a genotyped population"

Winter 2011

March 2 - Dr. Valerie Daggett
University of Washington

February 23 - Dr. Daniela Witten
University of Washington
"Sparse methods for the unsupervised analysis of genomic data"

February 16 - Michal Galdzicki
Graduate Student, Biomedical and Health Informatics, Sauro Lab
"Synthetic Biology Open Language (SBOL) a new information sharing framework to encourage re-use of DNA components for biological engineering"

February 9 - Dr. William Noble
University of Washington
"Modeling transcription factor binding from high resolution data and modeling the three-dimensional architecture of the yeast genome."

February 2 - Dr. Kyung Hyuk Kim 
University of Washington, Sauro Lab
“Fan-out consideration and noise control of synthetic gene circuits”

January 26 - Dr. James Bassingthwaighte
University of Washington
"Adenosine Supersensitivity in Coronary Flow Regulation: Integrative Multiscale Analysis of Cardiac Purine Metabolism"

January 19 - Dr. Noam Shental
CS Dept, The Open University of Israel
"Identification of rare alleles and their carriers using compressed se(que)nsing"

Abstract: Identification of rare variants by resequencing is important both for detecting novel variations and for screening individuals for known disease alleles. New technologies enable low-cost resequencing of target regions, although it is still prohibitive to test more than a few individuals. We propose a novel pooling design that enables the recovery of novel or known rare alleles and their carriers in groups of individuals. The method is based on a Compressed Sensing (CS) approach, which is general, simple and efficient. CS allows the use of generic algorithmic tools for simultaneous identification of multiple variants and their carriers. We model the experimental procedure and show via computer simulations that it enables the recovery of rare alleles and their carriers in larger groups than were possible before. Our approach can also be combined with barcoding techniques to provide a feasible solution based on current resequencing costs. For example, when targeting a small enough genomic region (~100 bp) and using only ~10 sequencing lanes and ~10 distinct barcodes per lane, one recovers the identity of 4 rare allele carriers out of a population of over 4000 individuals. We demonstrate the performance of our approach over several publicly available experimental data sets, including the 1000 Genomes Pilot 3 study.
We believe our approach may significantly improve cost effectiveness in future Genome Wide Association Studies, and in screening large DNA cohorts for specific risk alleles.

Joint work with Amnon Amir from the Weizmann Institute of Science, and Or Zuk from the Broad Institute of MIT and Harvard

January 12 - Dr. Georg Seelig
University of Washington
"DNA strand displacement as a mechanism for programming chemistry"

Autumn 2010

December 8 - Dr. Roger Bumgarner
University of Washington
“Methods for Network Inference”

December 1 - Dr. Joshua Herbeck
University of Washington
"HIV-1 Evolution Within and Across Hosts"

November 17 - Dr. Sohrab Shah
British Columbia Cancer Agency
"Defining mutational landscapes of tumors with sequencing and statistical models"

November 10 - Dr. David Baker
University of Washington
"Computational design of high affinity influenza H1N1 binding proteins"

November 3 - Dr. Wyeth Wasserman
University of British Columbia
"Identification and Design of Mammalian Transcriptional Regulatory Sequences"

October 27 - Dr. James Thomas
University of Washington
"Coevolution of retroelements and tandem ZF Genes"

October 20 - Dr. Brian Browning
University of Washington
“Fast detection of identity-by-descent in ‘unrelated’ individuals”
sponsored jointly with Genome Sciences seminars

October 6 - Dr. Olga Troyanskaya
Princeton University
“From Data to Networks to Understanding Complexity of Human Disease”
sponsored jointly with Genome Sciences seminars

Summer 2010

Thursday, June 24 - Dr. Quaid Morris
Assistant Professor of Cellular and Biomedical Research, University of Toronto
"Predicting the targets of mRNA-binding proteins"
3:30, Foege Auditorium

Spring 2010

May 19 - Genome Sciences & Pathology Seminar
Dr. Jenny Graves
1:30, Foege Auditorium

April 14 - Dr. A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro
"Origin and evolution of Y chromosomes: Drosophila tales"

Winter 2010

March 10 - Dr. Sean Eddy
HHMI Janelia Farm Research Campus
3:30, Foege Auditorium
sponsored jointly with Genome Sciences Seminar 

March 3 - Dr. Ken Wolfe
Smurfit Institute of Genetics, Trinity College
3:30, Foege Auditorium
sponsored jointly with Genome Sciences Seminar 

February 24 - Dr. Andrew Clark
Cornell University
3:30, Foege Auditorium
sponsored jointly with Genome Sciences Seminar

February 17 -  Jeremy Horst
Ph.D. student, Dept of Oral Biology

February 3 - Dr. David Reich
Harvard University 
“Learning about Population History from Genomic Data”
sponsored jointly with Genome Sciences Seminar

January 27 - Dr. Eric Green
Director, NHGRI
sponsored jointly with Combi Seminar

January 20 - Dr. Chris Burge
“Global Analysis of RNA Processing in Health and Disease”
sponsored jointly with Genome Sciences Seminar

January 13 - Dr. Harlan Robins
Fred Hutchinson Cancer Research Center

Autumn 2009

December 2 -  Dr. Aviv Regev
Broad Institute / MIT
“Modular Biology: The Function and Evolution of Regulatory Networks”
sponsored jointly with Genome Sciences Seminar

November 18 - Dr. James Thomas
University of Washington
"Gene Duplication and Divergence in Mammals"

November 4 -  Dr. Elizabeth Hauser
Duke University
"New Directions for Genetic Analysis of Coronary Artery Disease"
sponsored jointly with Women in Genome Sciences

October 28 - Dr. David Baker
University of Washington
“Heresy in Computational Biology”

October 21 - Dr. Dmitri Petrov 
Stanford University
"Adaptation in Drosophila"

October 14 - Dr. William Noble
University of Washington 

October 7 - Dr. Martin Tompa
University of Washington
"Comparing genome-size multiple sequence alignments for accuracy"

Summer 2009

August 19 - Dr. Greg Cooper
Acting Assistant Professor of Genome Sciences
"High-throughput analysis of large copy-number variants and hotspots of human genetic disease"

Thursday, July 2 - Combi / WiGS Seminar: Dr. Sharon Browning
"Localized haplotype clustering with the BEAGLE model."

Spring 2009

April 8 - Dr. Harmen Bussemaker
"Learning mechanistic models of gene expression regulation from natural sequence variation"

Winter 2009

March 11 - Dr. Trudy Mackay
William Neal Reynolds and Distinguished University Professor of Genetics, North Carolina State University
"Systems Genetics of Complex Traits in Drosophila"
sponsored jointly with Genome Sciences Seminar

March 4 - Dr. Phil Bradley
Fred Hutchinson Cancer Research Center
"Toward Structure-based Prediction of Protein-DNA Interactions"

February 25 - Dr. Tim Bailey
Senior Research Fellow, Institute for Molecular Bioscience
University of Queensland
“Tissue-specific prediction of transcription factor binding sites using chromatin modification data”

February 18 - Dr. Joshua Akey
UW Genome Sciences
"Unnatural Selection in Dogs: A Genome-Wide Scan for Substrates of Human Tinkering"

February 11 - Dr. Robert Gentleman
Fred Hutchinson Cancer Research Center
"Comparative ChIP-seq"

February 4 - Jeremy Horst
UW Depts of Oral Biology & Microbiology, graduate student
"Modeling protein structure, function, and interactions to characterize mechanisms of mammalian mineralization"

January 28 - Dr. Ping Ao
UW Mechanical Engineering
"Endogenous network dynamical view on cancer genesis and progression: Its construction and initial predictions"

January 21 - Dr. Wendy Thomas
UW Bioengineering
"Mechanical Regulation of Receptor-Ligand Bonds"

January 14 - Dr. Roger Bumgarner
UW Microbiology
“Comparative Genomics of clinical isolates of Aggregatibacter actinomycetemcomitans”

Autumn 2008

November 19 - Dr. Conrad Nieduszynski
Institute of Genetics, University of Nottingham, UK
"Developing a mathematical model for chromosome replication"

November 12 - Dr. David Baker
UW Biochemistry
"Prediction of structure and design of function"

October 29 - Dr. Michael Katze
UW Microbiology
“Can Systems and Computational Biology Save the World From The Next Pandemic?”

October 22 - Dr. James Thomas
UW Genome Sciences
"Molecular evolution of mammalian transcription factors"

October 13 - Dr. Paul Horton
Computational Biology Research Center, AIST, Tokyo, Japan
"Mitochondrial ß-Signal; The End of the Story?"

October 8 - Dr. William Noble
UW Genome Sciences
"Computational analyses of human chromatin"

October 1 - Dr. Harmit Malik
Fred Hutchinson Cancer Research Center
sponsored jointly with Genome Sciences Seminar

Summer 2008

Tuesday, June 24 - Dr. Rebecca Doerge
“Whole Genome Expression Quantitative Trait Loci (eQTL) Analysis of Arabidopsis”
3:30, Foege Auditorium


There is increasing interest in understanding the molecular basis of complex traits. Initially, the genetic dissection of quantitative traits involved measurements of gross phenotypes. Most recently, the underlying mechanisms of inheritance have been studied through various approaches that are supported by modern technological and methodological advances, namely quantitative trait locus/loci (QTL) analysis and mutant analysis in genetics; genome sequencing and gene expression analysis in genomics; and protein structure analysis and protein assay in proteomics. Since each technology and approach focuses on specific pieces of the larger, poorly understood systems biology, the challenge is to integrate these different types of information to elucidate the genetic architecture of complex traits. To address one of these challenges we have combined QTL analysis with microarray analysis to characterize the genomic architecture that controls quantitative traits. Using Affymetrix technology and 211 individuals from a segregating Arabidopsis population, the transcript variation (i.e., expression level polymorphisms, ELPs) of 22,810 genes, in both control and treatment conditions, provides data for mapping expression QTL (eQTL). Results from our statistical analysis of the entire genome reveal both cis- and trans-eQTL under both control and treatment conditions. The statistical methodology developed for this type of experiment will be presented for a directed analysis of SA-inducible secretory genes controlled by NPR1.

***This work is funded by NSF Arabidopsis 2010 in collaboration with Drs. Marilyn West, Hans van Leeuwen, Richard Michelmore, and Dina St. Clair, University of California, Plant Sciences Dept, Davis, CA, and Dr. Kyunga Kim, Purdue University, West Lafayette, IN, and Seoul National University, Seoul, Korea.


Spring 2008

Thursday, May 15 - Dr. Harmen Bussemaker
Columbia University
“Predicting expression from sequence: Data-driven biophysical models of (post-) transcriptional networks”
1:00 - 2:00, Foege Auditorium

Winter 2008

March 10 - Dr. James Bruce
“Chemistry and Mass Spectrometry: New Tools for Protein Interaction Network Identification”

March 5 - Dr. Adam Siepel
Cornell University
“Comparative genomics of animals and plants”

I will describe three recent comparative genomics projects, including two of mammals and one of plants. The first is a large-scale effort, carried out as part of the Mammalian Gene Collection (MGC) project, to identify human genes not yet in the gene catalogs. Our approach was to produce gene predictions by algorithms that rely on comparative sequence data but do not require direct cDNA evidence, then to test predicted novel genes by RT-PCR. This work led to the identification of more than 2000 novel exons corresponding to an estimated ~500 genes, including >160 that were completely absent from the gene catalogs. The second project is a comprehensive analysis of positively selected genes (PSGs) in mammals based on the six high-coverage eutherian mammalian genome assemblies now available. Compared with previous genome-wide scans for PSGs, the increased phylogenetic depth of this data set results in substantially improved statistical power, and permits several new lineage- and clade-specific tests to be applied. Several hundred apparent PSGs were identified, and a detailed analysis was performed of their selection histories (evolutionary patterns of selection and nonselection), the functional categories and pathways to which they belong, and their expression patterns. The third project is an attempt to apply tools and techniques developed for comparative genomics of mammals to a group of plants showing similar levels of genomic divergence. This project involved the generation and analysis of sequences for a ~100kb conserved syntenic segment (CSS) in the genomes of five members of the agriculturally important plant family Solanaceae. Our analysis suggests that, as in mammals, a large fraction of noncoding bases in these genes is under selection, and it sheds light on the dates of divergence of these species. Together, these three projects illustrate the power of comparative genomics in characterizing evolutionary dynamics, selection pressures, and genomic function.

February 27 - Dr. John Stamatoyannopoulos
University of Washington
“Multi-lineage programming of human regulatory DNA”
sponsored jointly with Genome Sciences seminar

February 20 - Dr. Chung-I Wu
“The birth and death of miRNAs in Drosophila and their evolving relationships with targets”

February 13 - Dr. Steven Henikoff
Fred Hutchinson Cancer Research Center
“Histone variants and epigenetic inheritance”
sponsored jointly with Genome Sciences seminar

February 6 - Dr. Goncalo Abecasis
University of Michigan
“Adventures in Genome Scanning: Meta-Analysis and Genotype Imputation Identify New Loci Influencing Lipid Levels and Coronary Artery Disease”
sponsored jointly with Genome Sciences seminar

January 23 - Dr. Joel Hirschhorn
Broad Institute of MIT and Harvard
“Genetics of body size and other complex traits”
sponsored jointly with Genome Sciences seminar

January 16 - Dr. William Noble
University of Washington
"Consistent probabilistic outputs for protein function prediction"

January 9 - Dr. Carlos Bustamante
Cornell University
"Whole Genome Association Mapping and Population Genomics of Domesticated Species: Promises, Potential Pitfalls, and Preliminary results"
sponsored jointly with Genome Sciences seminar

Autumn 2007

December 5 - Dr. David Baker
University of Washington
“Rapid structure determination, novel enzymes, and a multiplayer computer game”

November 28 - Dr. Robert Gentleman
Fred Hutchinson Cancer Research Center
“Modeling Interactions”

November 14 - Zizhen Yao
Ruzzo Lab, University of Washington
"Genome scale search of noncoding RNAs, Bacteria to Vertebrates"

November 7 - Dr. Wenying Shou
Fred Hutchinson Cancer Research Center
"Collapse or Collaborate: Experimental and Mathematical Analyses of a Synthetic Cooperative System"

October 31 - Dr. Michael Hawrylycz
Allen Institute for Brain Science
"AGEA: An Anatomic Gene Expression Atlas for the C57BL/6J mouse brain"

October 24 - Dr. Hong Qian
University of Washington
"From Biochemical Reaction Networks to Cellular States: A Computational Approach"

October 17 - Dr. James Mullins
University of Washington
"The evolution of HIV and its battle with host cellular immune responses during asymptomatic infection"

October 10 - Dr. Serafim Batzoglou
Stanford University
“Algorithms for Sequences, Networks, and Populations”

October 3 - Dr. Bernhard Palsson
University of California, San Diego
“Reconstruction of the genome-scale transcriptional regulatory network in E. coli”

October 1 - Dr. Chris Ponting
University of Oxford
“Recombination and rapid evolution of genes and chromosomes”
1:30, Foege Auditorium
sponsored jointly with Genome Sciences seminar

September 26 - Dr. Michael Nachman
University of Arizona
“Population genetics of wild and inbred house mice: insights into speciation and the use of mice as models for biomedical research”
sponsored jointly with Genome Sciences seminar

Summer 2007

July 25 - Dr. Gil Ast
"Alternative splicing and human genomic complexity"
1:30, Foege Auditorium

June 27 - Dr. Dahlia Nielsen
North Carolina State University
"Examining the dirt underneath the association mapping carpet: what kind of genes do we expect these methods to find?"
sponsored by WiGS

Winter 2007

March 7 - Dr. Joshua Akey
University of Washington
“Gene expression variation within and among human populations”

February 28 - Dr. Eric Siggia
The Rockefeller University
sponsored jointly with the Genome Sciences Seminar series

February 21 - Dr. Paul Pavlidis
University of British Columbia
"Large-scale mining of expression patterns in public microarray datasets"

February 14 - Tobias Mann
University of Washington
"A thermodynamic approach to PCR primer design"

February 7 - Dr. John Mittler
University of Washington
“Dynamical modeling of HIV-1 drug resistance”

January 31 - Dr. Martin Tompa
University of Washington
"Which Portions of Whole-Genome Multiple Alignments Are Reliable?"

January 24 - Dr. Martha Bulyk
Harvard University
sponsored jointly with the Genome Sciences Seminar series

January 17 - Dr. William Noble
University of Washington
“Machine Learning Analyses of Tandem Mass Spectra”

January 10 - Genome Sciences / Combi Seminar
Dr. Jonathan Pritchard
University of Chicago
“Genetic Variation and Natural Selection in the Human Genome”
sponsored jointly with the Genome Sciences Seminar series

Autumn 2006

December 6 - Genomic Medicine / Combi Seminar
Dr. Eric Schadt
Rosetta Inpharmatics
sponsored jointly with the Genome Sciences Seminar series

November 15 - Dr. Bradley Till
University of Washington
“TILLING and Ecotilling from Arabidopsis to Humans”

November 8 - Dr. Harlan Robins
Fred Hutchinson Cancer Research Center
“Isochores and symmetry breaking in the human genome”

November 1 - Dr. Jared Roach
Institute for Systems Biology
"Genetic Mapping at 3-kb Resolution"

October 18 - Dr. Daniel Zilberman
Fred Hutchinson Cancer Research Center
"Genome-wide analysis of DNA methylation and demethylation in Arabidopsis"

October 11 - Dr. Lon Cardon
Fred Hutchinson Cancer Research Center
"Finding Human Disease Genes by Whole Genome Association"
sponsored jointly with the Genome Sciences Seminar series

Wednesday, August 16

Dr. Chip Lawrence
Center for Computational Molecular Biology, Division of Applied Mathematics, Brown University
“Abuse of the Mode and an Ensemble Alternative”

Advances in data collection technologies have rendered increasingly large data sets available for analysis. While the emergence of such large data sets would seem to lead to increasingly more precise estimates of parameters, paradoxically just the opposite seems to be becoming increasingly common. This paradoxical circumstance has emerged because these technologies have simultaneously opened opportunities to draw inferences on previously unanswerable high dimensional questions. For decades optimization procedures such as maximum likelihood, minimum free energy, and MAP estimators have been employed as the major tool of most inference procedures. It has been clearly recognized for some time now that the favorable properties of optimization based inferences rest on an asymptotic foundation that requires the data to grow in comparison with the number of unknowns. Nevertheless, optimization very often continues as the method of choice even when these supporting conditions are not present and even when the most probable solution has a probability of 10^-10 or less. Genomics and computational molecular biology are among the more prominent fields experiencing the duality of the growth in data resources and inference expectations. In fact, prediction and inference of high dimensional objects are now arguably the most important activities in these allied new biological fields, and the inspiration for this paper. RNA secondary structure prediction offers a very special lens to examine the untoward consequences of the reliance on the mode in high dimensional inferences because polynomial time algorithms are available to comprehensively characterize the space of solutions, and a reference set of structures is available for the comparison of alternative prediction methods. Through this lens we will examine these untoward effects, consider their broader implications, and present alternative "ensemble based" centroid estimators.
Using a model identical to the well-known Mfold model, we find that centroid estimators both better represent the posterior space and improve positive predictive values of the reference set predictions by 30% while improving sensitivity by 3% compared to Mfold predictions that minimize free energy.

Dr. Eleazar Eskin
University of California, San Diego
“Whole genome association in inbred mouse strains”

Spring 2006:

4/19 - Dr. Sagi Snir
UC Berkeley
"Graphs, Colorings and Beyond in Comparative Genomics"

Comparative genomics seeks to explore characteristic patterns of a set of
organisms by comparing common features of the given organisms. Computational
methods play a significant part in this discipline. In this talk I will
describe the use of colored graphs to solve two problems in comparative
genomics:

1. Micro-indels are small insertion or deletion events (indels) that occur
during genome evolution. The study of micro-indels is important, both in order
to better understand the underlying biological mechanisms, and also for
improving the evolutionary models used in sequence alignment and phylogenetic
analysis. The inference of micro-indels poses a difficult computational
problem, and is far more complicated than the related task of inferring the
history of point mutations. We introduce the concept of indel history, a tree
alignment based approach that is suitable for working with multiple genomes,
which enables us to arrive at some interesting computational and biological

2. Horizontal gene transfer (HGT) is the transfer of genetic material from one
lineage to another. HGT plays a major role in bacteria's genome diversification
as well as their ability to develop resistance to antibiotics. This mechanism
cannot be represented by traditional tree-like evolution, but rather as a
network. We show new results, both combinatorial and statistical, that improve
significantly over current approaches and enable us to analyze real biological
data sets.

4/12 - Dr. William Atchley
North Carolina State University
"Computational Biology, Molecular Architecture and Transcription"

Elucidating the underlying causes of sequence variability in proteins is an important goal in modern biology. Sophisticated statistical analyses geared toward partitioning sequence variation into its underlying causal components are difficult because of the so-called “sequence metric problem.” That is, biosequences are comprised of alphabetic codes that have no underlying natural metric. Recently, we offered a solution to this problem (PNAS 102:6395-6400, 2005) that permits these alphabetic codes to be transformed into highly interpretable numerical values. These new values were derived from multivariate statistical analysis of a large suite of amino acid physicochemical attributes. I will briefly describe this new approach and then introduce some applications using the basic helix-loop-helix (bHLH) proteins as exemplars. As time permits, I will discuss applying this approach to partitioning amino acid covariation, elucidating the molecular architecture of DNA binding and dimerization, and understanding those amino acids involved in enhanced binding specificity.

Winter 2006:

3/1 - Bert Tanner
Graduate Student, Department of Bioengineering, University of Washington
"Force production in muscle modeled as a coupled network of motor proteins"

2/22 - Dr. Willie Swanson
University of Washington
“Adaptive evolution and co-evolution of sperm - egg recognition molecules”

2/15 - Dr. Joshua Akey
University of Washington

2/8 - Dr. John Storey
University of Washington
“PCA for Variance Decomposition and Linkage Analysis of Genome-wide Expression”

2/1 - Dr. James Thomas
University of Washington
"Evolution of gene families in Caenorhabditis"

1/25 - Dr. Michael Hawrylycz
The Allen Institute for Brain Science
"Data Mining of in situ Hybridization Expression Data in the Adult Mouse Brain"

1/18 - Dr. Larry Ruzzo & Dr. Martin Tompa
University of Washington
“Computational prediction of non-coding RNA motifs in bacteria”

1/11 - Dr. Pavel Pevzner
UC San Diego
“Identification of post-translational modifications by blind search of mass spectra”

1/4 - Dr. Diane Genereux
University of Washington
"Using empirical and mathematical approaches to understand the dynamics of DNA methylation in the human genome"

Autumn 2005:

12/7 - Dr. Hong Hung
University of Washington
"Clean structures from dirty data"

11/30 - Dr. John Mittler
University of Washington
"Modeling evolution of HIV-1 in vivo."

11/16 - Dr. William Noble
University of Washington
"Identifying remote protein homologs by network propagation"

11/9 - Dr. Zhirong Bao
University of Washington
"Mobilomics: RECON in the Genome's Junk Yard"

11/2 - Dr. Matthew Stephens
University of Washington
"Automatically detecting and genotyping SNPs by sequencing of diploid

10/26 - Amol Prakash
Graduate Student, Department of Computer Science & Engineering, University of Washington
"Comparative Genomics in Vertebrates and Multiple Alignments"

10/19 - Dr. David Baker
University of Washington
"Progress in high resolution modeling of protein structures and

10/5 - Dr. Daniel Miranker
University of Texas
“MoBIoS: A Specialized Database Management System for Biological Discovery”

9/28 - Dr. Hamid Bolouri
Institute For Systems Biology
"Pointillist: an open source tool for high throughput data integration"

Winter 2005:

Wednesday, March 9 - Dr. Ram Samudrala, University of Washington
"Modelling proteomes"

Wednesday, March 2 - Dr. Evan Eichler, University of Washington
“Segmental Duplications and Human Genome Evolution”

Wednesday, February 23 - Dr. William Noble, University of Washington
"Predicting the in vivo signature of human gene regulatory sequences"

Wednesday, February 16 - Dr. John Storey, University of Washington
“Multiple Locus Linkage Analysis of High-throughput Phenotypes Applied to Genome-wide Expression in Yeast”

Wednesday, February 9 - Dr. Jon McAuliffe
"Statistical Methods for Genome Comparison"

Wednesday, February 2 - Dr. Michael MacCoss, University of Washington
"Computational Analysis of Shotgun Proteomics Data"

Wednesday, January 26 - Dr. Adam Siepel, University of California, Santa Cruz
"Comparative mammalian genomics: models of evolution and detection of functional elements"

Having the complete genomes of multiple species is causing sweeping changes in biology. Comparative sequence analysis is leading to new insights about the evolutionary forces that have shaped present-day genomes and is enabling previously unknown functional sequences to be identified and characterized. Comparative methods hold particular promise for mammalian and other vertebrate genomes, which--because of their size and complexity, and because of other obstacles to experimental study--have been more difficult to approach experimentally than the genomes of simpler organisms such as flies and nematodes.

In this talk, I will discuss both recent methodological advances in comparative sequence analysis and scientific insights gained from genome-wide surveys conducted with these methods. The main theme of the talk will be using evolutionary models to help shed light on sequence function. Three particular problems will be discussed: the identification of evolutionarily conserved elements, modeling context- or neighbor-dependent substitution, and the identification of evolutionarily conserved protein-coding exons. These problems have been addressed using phylogenetic hidden Markov models (phylo-HMMs), statistical models that describe both the process of nucleotide substitution at individual sites in a genome and how this process changes from one site to the next.

Using a phylo-HMM-based program called phastCons, we have conducted a comprehensive search for conserved elements in vertebrate genomes. I will discuss the results of this search and of parallel searches in Drosophila, Caenorhabditis, and Saccharomyces genomes. Particular attention will be devoted to the most highly conserved of the elements identified in vertebrates, which appear to be associated with both transcriptional and post-transcriptional regulation and which show significant statistical evidence of an enrichment for RNA secondary structure. In separate work, another phylo-HMM-based program called ExoniPhy has been used to predict about 170,000 protein-coding exons conserved in the human, mouse, and rat genomes, corresponding to an expected 20,400 genes. Of these, about 23,000 predicted exons (2,800 genes) are not represented in sets of known genes. Preliminary experimental (RT-PCR) results indicate that the false positive rate of these predictions is quite low (<30%).

Wednesday, January 19 - Dr. Michael Katze, University of Washington
“Virology Meets Computational Biology: Is This Enough To Stop The Next Pandemic?”

Wednesday, January 12 - Dr. Michael Lynch, Indiana University
“The Origins of Gene and Genome Complexity”
sponsored jointly with Genome Sciences

Wednesday, January 5 - Dr. Martin Kreitman, University of Chicago
"Deciphering rules governing enhancer functional evolution"

Lack of knowledge about how regulatory regions evolve in relation to their structure-function may limit the utility of comparative sequence analysis in deciphering cis-regulatory sequences. To address this we applied reverse genetics to carry out the first functional genetic complementation analysis of a eukaryotic cis-regulatory module - the even-skipped stripe 2 enhancer - from four Drosophila species. The functional evolution of this enhancer is non-clocklike: important functional differences have evolved between closely related species that are not found between distantly related species. We can attribute the functional conservation between distantly related species to evolutionary convergence rather than evolutionary stasis. Functional divergence of the stripe2 enhancer between closely related species is attributable to differences in activation levels rather than spatio-temporal control of gene expression. Our findings have implications for understanding enhancer structure-function, mechanisms of speciation, and computational identification of regulatory modules.

sponsored jointly with Genome Sciences
3:30, Hitchcock 132

Autumn 2004:

Wed, December 8 - Dr. Emily Rocke, postdoctoral researcher, Thomas Lab, University of Washington
"Evidence for chromatin-interacting function of a GAGA motif in C. elegans"

Wednesday, December 1 - Dr. Rick Myers
"Genome-wide analysis of human transcriptional regulatory elements"
sponsored jointly with Genome Sciences
3:30, Hitchcock 132

Wed, November 17 - Dr. James Thomas, University of Washington
"Gene Clusters in C. elegans"

Wed, November 10 - Dr. Steven Henikoff, Fred Hutchinson Cancer Research Center
"Profiling DNA Methylation in the Arabidopsis Genome"

Wed, November 3 - Dr. David Baker, University of Washington
"Prediction and design of macromolecular structures and interactions"

Wed, October 27 - Zasha Weinberg, graduate student, UW Computer Science & Engineering
"Accurate annotation of non-coding RNA in practical time"

Wednesday, October 20 - Dr. Thomas Gingeras, Affymetrix Inc.
"Empirical Analysis of Sites of RNA Transcription for 30% of the Human Genome: The Changing Landscape of the Human Genome Annotations"
sponsored jointly with Genome Sciences

Wed, October 13 - Dr. Phil Green, University of Washington
"Signal and Noise in Genome Sequences"

Wed, October 6 - Dr. Martin Tompa, University of Washington
"Tools for Prediction of Regulatory Elements in Microbial Genes"

I will describe and demonstrate the course project tackled by undergraduate students in last spring's inaugural Computational Biology Capstone Course, CSE 490MT. The goal of the project was to write software that starts from a single microbial gene of interest, finds a large collection of orthologous genes from multiple microbes, and uses this collection to identify evolutionarily conserved motifs in their regulatory regions. These motifs are good candidates to be functional regulatory elements. Such a tool, if done well, could be very useful for working microbiologists.

(For more detail on the project, see .)

Wednesday, September 29 - Dr. Charles Aquadro, Cornell University
sponsored jointly with Genome Sciences
3:30, Hitchcock 132

Summer 2004:

Wed, September 1 - Dr. Daniel Barker, School of Animal and Microbial Sciences, University of Reading, U.K.
"Phylogeny, pathways and protein complexes"

Spring 2004:

Wed, June 2 - Dr. Willie Swanson, University of Washington
"Adaptive Evolution of Reproductive Proteins"

Wed, May 19 - Genome Sciences Symposium

Wed, May 12 - Dr. Ping Ao, Institute for Systems Biology, UW Mechanical Engineering
"Calculating Biological Behaviors in Phage Lambda Life Cycle"

Wed, May 5 - Dr. Charles Langley, Professor of Genetics, UC Davis
sponsored jointly with Genome Sciences

Wed, April 28 - Dr. Harmit Malik, Assistant Member, Basic Sciences, Fred Hutchinson Cancer Research Center & Affiliate Assistant Professor, University of Washington
"Molecular Investigations of Genetic Conflict"

Wed, April 21 - Dr. Ira Kalet, University of Washington
Associate Professor, Radiation Oncology and Medical Education and Biomedical Informatics (joint)

"Anatomy, Biomedicine and Computing: the ABC's of Informatics in Cancer Treatment"

The explosive growth in the application of computing to medical and health care research and practice has been most visible in computerized medical record systems, web based medical knowledge resources and biomolecular data such as GENBANK. By contrast, the field of radiation therapy for cancer has had computing as an integral part of research and practice since the mid-1960's. Designing radiation treatments is typically done with an interactive graphic software system called a "radiation therapy planning", or RTP, system. The design of RTP systems requires solving difficult graphic visualization problems, careful attention to user interface design, efficient numerical computation, and advanced networking. In addition, as these systems can grow to large size (from 75,000 up to a million lines of code), they provide an industrial strength trial opportunity for new ideas in software engineering. Finally, the design process itself, largely done by human experts, may be automated by creating computational models of human anatomy, tumor biology and radiation machinery. The talk will include an interactive demonstration of the Prism RTP system developed at the University of Washington, and will conclude with a short summary of the UW Biomedical and Health Informatics program, a larger context for such biocomputing research.


Wed, April 14 - Dr. Carl Bergstrom, University of Washington
"Mathematical Models of RNA Silencing: How an Intracellular Immune System Avoids Autoimmune Reactions"

Wed, April 7 - Dr. Martin Tompa, University of Washington
"An Assessment of Algorithms for the Discovery of Transcription Factor Binding Sites"

Winter 2004:

Thursday, March 4 - Dr. Mary Kuhner, Research Associate Professor of Genome Sciences, University of Washington
"Coalescent Likelihood Estimators in Theory and Practice"

Wed, March 3 - Dr. Joshua Akey, Affiliate Postdoctoral Fellow, Kruglyak Lab, Human Biology, Fred Hutchinson Cancer Research Center
"Computational Studies of Genetic Variation: Searching For Signatures of Selection in Humans and Mapping Gene Expression QTL in Yeast"
sponsored jointly with Genome Sciences

Wed, February 25 - Dr. Noah Rosenberg, Research Associate, Program in Molecular and Computational Biology, University of Southern California
"Genome-wide Analysis of Human Variation and Population Structure"
sponsored jointly with Genome Sciences

Wed, February 18 - Dr. Liqing Zhang, Department of Ecology and Evolution, University of Chicago
"The Distribution and Evolution of Duplications in the Genomes of Arabidopsis thaliana and Human"

Wed, February 11 - Dr. Sean Eddy, Associate Professor of Genetics, Washington University
"Computational Analysis of Noncoding RNA Genes"
sponsored jointly with Genome Sciences

Wed, February 4 - Dr. Eran Segal, Computer Science Department, Stanford University
"Rich Probabilistic Models for Genomic Data"

Wed, January 28 - Dr. Marcus Feldman, Professor, Stanford University
"Some Perspectives on the Genetic Structure of Human Populations"
sponsored jointly with Genome Sciences

Wed, January 21 - Dr. Elizabeth Thompson, Professor of Statistics and of Biostatistics, University of Washington
"Inferring Relationships Among Individuals and Populations"

Wed, January 14 - Dr. Terry Speed, Professor of Statistics, UC Berkeley
"Incorporating Dependence Into Models for Biomolecular Motifs"
sponsored jointly with Genome Sciences

Wed, January 7 - Dr. Eric Green, Senior Investigator and Chief, Genome Technology Branch, NHGRI
"Multi Species Comparative Sequencing: Using Evolution to Decode the Human Genome"

Autumn 2003:

Wed, December 10 - Dr. Jonathan Pritchard, Assistant Professor of Human Genetics, University of Chicago
"Linkage Disequilibrium in the Human Genome, and Implications for Complex Trait Mapping"

Wed, December 3 - Dr. Larry Ruzzo, Professor of Computer Science and Engineering, University of Washington
"Improved Gene Selection For Classification Using Microarrays"

Wed, November 19 - Dr. Thomas Daniel, Joan and Richard Komen Professor of Zoology, Dept. of Biology, UW
"Modeling Molecular Motors: Monte-Carlo Meets Mechanics"

Wed, November 12 - Dr. Simon Tavare, Professor of Biological Sciences, University of Southern California

Wed, November 5 - Dr. James Kent, UC Santa Cruz
"The Gene Family Browser and other Recent Research at"

Wed, October 29 - Dr. James Thomas, Professor of Genome Sciences, University of Washington
"Rapidly Evolving Domains in the C. elegans Genome"

Wed, October 22 - Dr. Terry Hwa, Professor of Physics, UC San Diego
"Complex Transcriptional Logics From Simple Molecular Interactions"

Wed, October 15 - Dr. William Noble, Assistant Professor of Genome Sciences, University of Washington
"A Statistical Framework for Genomic Data Fusion"

Wed, October 8 - Dr. Chao Tang, Sr. Research Staff Member, NEC Research Institute, Princeton, NJ
"Finding Transcriptional Modules From Large Scale Gene Expression Data"

Wed, October 1 - Dr. Phil Green, Professor of Genome Sciences, University of Washington
"Finishing the Gene-ome: Computationally Directed Gene Structure Verification in C. elegans"

Summer 2003:

Wednesday, August 20 - Dr. Michal Linial, Department of Biological Chemistry, Hebrew University
"Constructing the Protein Space: From Sequence to Functional Inference"
1:30 - 2:30, Health Sciences K-069

Spring 2003:

Wed, June 4 - Dr. Deirdre Meldrum, Prof., Dept. of Electrical Engineering, University of Washington
"Microsystems and Applications for Life-on-a-Chip"
3:30, Hitchcock 132
sponsored jointly with Genome Sciences

Wed, May 28 - Dr. Chris Carlson, Dept of Genome Sciences, University of Washington
"Building a maximally informative SNP map using linkage disequilibrium"

Wed, May 21 - Dr. Barbara Trask, Dept of Genome Sciences, University of Washington, Fred Hutchinson Cancer Research Center
"Dynamic Duplications in the Human Genome"

Wed, May 14 - Genome Sciences Symposium: "Human - Mouse Comparative Biology"

Wed, May 7 - Dr. Scott Edwards, Dept of Zoology, University of Washington
"Genome and Transcriptome Evolution in Reptilia, Including Birds"

Wed, April 30 - Dr. Subramani Mani, Center for Biomedical Information, University of Pittsburgh
"Discovering Causal Relationships from Biomedical Data"

Wed, April 23 - Dr. Steve Henikoff, Fred Hutchinson Cancer Research Center
"Traditional Mutagenesis in the Post-Genomic Era"

Wed, April 16 - Dr. Wyeth Wasserman, University of British Columbia
"Discovery of Regulatory Sequences Directing Transcription of Co-expressed Genes"

Wed, April 9 - Dr. Gane Ka-Shu Wong, Genome Center, University of Washington
"Genome Structure in Plants and Animals"

Wed, April 2 - Dr. Terry Gaasterland, Rockefeller University, Laboratory of Computational Genomics
"Computational Analysis of Splicing in Mouse and Trypanosomes"
3:30, Hitchcock 132
Jointly sponsored with Genome Sciences.

Winter 2003:

Wed, March 12 - Dr. Evan Eichler, Department of Genetics, Case Western Reserve University
"Recent Duplication, Disease and the Evolution of the Human Genome"
Jointly sponsored with Genome Sciences

Wed, March 5 - Dr. Len Pennacchio, Department of Genome Sciences, Lawrence Berkeley National Laboratory
"Exploiting Vertebrate Sequence for Insights into Human Biology"
Jointly sponsored with Genome Sciences

Wed, February 26 - Dr. Wei Wu, Wadsworth Center, NYS Dept. of Health, David Axelrod Inst., SUNY-Albany
"Intein function and manipulation for protein purification."

Wed, February 19 - Dr. Matthew Stephens, Dept of Statistics, University of Washington
"Haplotypes, Hotspots, and a Multilocus Model for Linkage Disequilibrium"

Wed, February 12 - Dr. David Baker, Dept of Biochemistry, University of Washington
"Prediction and Design of Protein Structures and Protein-Protein Interactions"

Wed, January 29 - Dr. Ellen Wijsman, Dept of Medical Genetics, University of Washington
"Genetic Analysis of Complex Traits: From Case-Control to Large Pedigree Designs"

Wed, January 22 - Dr. John Storey, Dept of Statistics, University of California, Berkeley
"Exploratory Detection of Differential Gene Expression in DNA Microarray Experiments"
Jointly sponsored with Genome Sciences. 3:30, Hitchcock 132

Wed, January 15 - Jared Roach, M.D., Ph.D., Institute for Systems Biology
"Evolutionary Algorithms for Multiobjective Optimization: Application to the SNP Selection Problem"

Wed, January 8 - Dr. Joe Felsenstein, University of Washington
"Using Wright's Quantitative Genetic Threshold Model to Analyze Discrete Traits"

Autumn 2002:

Wed, December 11 - Dr. Andrew Clark, Cornell University
"Comparative Genomics and Molecular Population Genetics of the Drosophila Y Chromosome"
Jointly sponsored with Genome Sciences. 3:30, Hitchcock 132

Wed, December 4 - Dr. Lincoln Stein, Cold Spring Harbor Laboratory
"How to Build a Model Organism System Database"
Jointly sponsored with Genome Sciences. 3:30, Hitchcock 132

Wed, November 20 - Dr. Willie Swanson, University of Washington
"Functional Inferences From Rapidly Evolving Reproductive Proteins"

Wed, November 13 - Dr. Martin Tompa, University of Washington
"Interdisciplinary Collaborations on Discovering Regulatory Elements."