University of Washington School of Medicine, Genome Sciences

Data Science Options

Genome Sciences offers both a Data Science option and an Advanced Data Science option. The two options have very similar structures. However, the Advanced Data Science option, as the name implies, is designed for students with considerable background in computer science, whereas the courses associated with the Data Science option are less demanding. Advanced Data Science is oriented toward tool builders, whereas Data Science is oriented toward tool users. Both are official UW degree options that appear on the student’s transcript.

Introduction

The Data Science options aim to educate the next generation of thought leaders who will both build and apply new methods for data science. These options will help to educate and recognize PhD students whose thesis work focuses on building and using data science tools. The goal of these options is not to educate all students in the foundations of data science but rather to provide advanced education to the students who will push the state of the art in data science methods in their respective domains.

Students enrolled in either option can expect to interact with students enrolled in similar Data Science PhD options in Computer Science, Statistics, Oceanography, Chemical Engineering and Astronomy. In addition, the options are designed to complement the activities of the eScience Institute and to leverage ongoing activities associated with the Moore/Sloan Foundation Data Driven Discovery Initiative, involving the University of Washington, New York University and the University of California, Berkeley.

Advising

Students with an interest in the Advanced Data Science option but only limited experience in this area should take preparatory coursework before attempting the ADS courses. Please contact Bill Noble for suggested courses.

Admission

Genome Sciences students who choose to enroll in either Data Science option must have the approval of their thesis advisor and should then let Brian Giebel (bgiebel [ a t ] uw.edu) know that they are planning to follow this option. There is no additional admission procedure. Once you have completed all requirements for either option, please contact Brian so that he can have the option added to your transcript.

Faculty

Any Genome Sciences faculty member may serve as advisor to students enrolled in either Data Science option, although the student’s committee must include at least one of the following faculty members: David Baker, Trevor Bedford, Brian Beliveau, Jesse Bloom, Gavin Ha, Kelley Harris, Gail Jarvik, Su-In Lee, Erick Matsen, Sara Mostafavi, Bill Noble, or Cole Trapnell.


Course Sequence – Data Science option

The structure of the Data Science option is similar to that of the Advanced Data Science option, except that students can select from a wider variety of courses, including introductory courses in each topic area with no prerequisites. Note that all courses that count toward the Advanced Data Science option may instead be applied to the Data Science option. Also, only two quarters of the eScience Community Seminar are required, rather than four.

Requirements:

  1. One statistics course (or pair of courses) from the list below. Note that several of the courses are two-quarter series, which cover topics similar to those in GENOME 560 but in greater depth. Students who opt to take one of these series must complete both quarters to satisfy the Data Science option requirement.
  2. Genome 540: Computational Molecular Biology I
  3. One course from two of the following three areas, selected from the table below:
    • Data Management
    • Machine Learning
    • Data Visualization
  4. At least two quarters in the weekly UW Data Science Seminar series.

Area | Course number | Course title | Pre-requisites | Adv
Statistics | GENOME 560 | Introduction to statistical genomics | None |
Statistics | BIOSTAT 511-512 | Medical biometry I & II | None |
Statistics | BIOSTAT 517-518 | Applied biostatistics I & II | None |
Statistics | STAT 509 | Introduction to mathematical statistics (resources for review) | STAT 311 and (MATH 308 or 309) | X
Statistics | STAT 512-513 | Statistical inference | STAT 395 and (STAT 421, 423, 504, or BIOST 512) | X
Data management | BIOSTAT 544 | Introduction to data science | BIOSTAT 511 or equivalent |
Data management | CSE 583 | Software development for data scientists | None |
Data management | CHEME 546 | Software engineering for molecular data scientists | None |
Data management | BIOSTAT 545 | Biostatistical methods for big omics data | BIOST 511-12 or equivalent |
Data management | CSE 414 | Introduction to database systems | CSE 143 or CSE 163 |
Data management | CSE 544 | Principles of database management systems | None |
Data management | Genome 569 | Bioinformatics Workflows for High-Throughput Sequencing Experiments | None |
Machine learning | BIOSTAT 546 | Machine learning for biomedical and public health data | BIOST 511-12 or equivalent |
Machine learning | CSE 416 / STAT 416 | Introduction to machine learning | (CSE 143 or CSE 160) and (STAT 311 or STAT 390) |
Machine learning | STAT 435 | Introduction to statistical machine learning | STAT 341, 390, or 391 | X
Machine learning | CSE 546 | Machine learning | CSE 312, STAT 341, or STAT 391 | X
Visualization | CSE 442 | Data visualization | CSE 332 |
Visualization | CSE 412 | Introduction to data visualization | CSE 143 or CSE 163 |
Visualization | CSE 512 | Data visualization | None | X
Visualization | IMT 561 | Visualization design | None |
Visualization | IMT 562 | Interactive information visualization | None |
Visualization | HCDE 511 | Information visualization / data visualization and exploratory analytics | None |
Visualization | HCDE 411 | Information visualization | HCDE 308 and 310 |

Advanced Data Science Option

Students who choose to follow the Advanced Data Science option of the Genome Sciences Ph.D. program should follow the regular Genome Sciences course sequence but also include the following course requirements:

1. Instead of Genome 560: Statistics for Genome Sciences (typically offered Spring Quarter), students enrolled in the Advanced Data Science option should take Statistics 509: Introduction to Mathematical Statistics. Statistics 509 was most recently offered during Autumn Quarter, but you should check the Department of Statistics website or the UW Time Schedule to see when it will next be offered. Please note that this course requires significant use of calculus. If you have not taken calculus in several years, consider taking a refresher course beforehand, and be sure to look at the resources for review: https://www.stat.washington.edu/tsr/509review/

Alternatively, for a more advanced approach, students may choose to take Statistics 512: Statistical Inference. In this case, students may wish to consider also taking Statistics 513, the second course in this sequence.

2. Genome 540: Computational Molecular Biology (typically offered Winter Quarter each year)

3. Electives:
Students must take courses in two of the following three areas:

Data Management: CSE 544
Machine Learning: CSE 546 or STAT 535
Data Visualization: CSE 512

4. Additionally, to further expand students’ education and create a campus-wide community, students will register for at least 4 quarters in the weekly eScience Community Seminar.

Please check the UW Time Schedule or the Department of Statistics and Department of Computer Science & Engineering websites for information on when these electives are offered.

Frequently Asked Questions:

Do I need to complete this coursework during my first year?

No. You are welcome to enroll in and complete the course sequence at any time during your graduate studies. A good time to enroll might be at the end of year one, once you have selected a thesis lab, although you may end up completing some of the required courses (for example, Genome 540) during your first year.

How do I apply?

Simply obtain your thesis advisor’s permission and then contact Brian Giebel (bgiebel [ a t ] uw.edu) to let him know you are planning to follow this option. Once you have completed all coursework, contact Brian to let him know which courses you have taken to fulfill requirements, so that he may get this option added to your transcript.

Which is the right option for me – Data Science or Advanced Data Science?

Please contact Bill Noble for advice on which option might be best for you.

Which courses should I take as prereqs in preparation for enrolling in this program?

Please contact Bill Noble for suggested courses.