Curriculum

Overview

Six courses are required of trainees: three statistics/computing courses and three biology courses.

Neuroscience-focused trainees will fulfill the biology requirement using neuroscience courses. Genomics-focused trainees will fulfill the biology requirement using genomics courses.

Trainees are also expected to participate in journal clubs and seminars and to receive ethics training.

Trainees Receiving One Year of Funding

Trainees who receive one year of funding, without the option of applying for continued funding, should complete one course in each of two of the following three categories: mathematical statistics, machine learning, and data science. GENOME 541 is optional.

Statistics & Computing Coursework

One course in mathematical statistics:

All students are required to take a course in mathematical statistics. They can choose between the less technical STAT 509 and the more technical STAT 512:

STAT 509 (Introduction to Mathematical Statistics) Examines methods, tools, and theory of mathematical
statistics. Covers probability densities, transformations, moment-generating functions, conditional expectation.
Additional topics include Bayesian analysis with conjugate priors, hypothesis tests, the Neyman-Pearson
Lemma, likelihood ratio tests, confidence intervals, maximum likelihood estimation, Central Limit Theorem,
Slutsky's Theorem, and the delta method.

STAT 512 (Statistical Inference) Random variables, transformations, conditional expectation, moment-generating
functions, convergence, limit theorems, estimation, Cramér-Rao lower bound, maximum likelihood
estimation, sufficiency, ancillarity, completeness, Rao-Blackwell theorem, hypothesis testing: Neyman-Pearson
lemma, monotone likelihood ratio, likelihood-ratio tests, large-sample theory.

One course in statistical machine learning:

In addition, each student is required to take a course in statistical machine learning. A student can choose between a machine learning course under development by the Department of Biostatistics and a more technical option cross-listed between Statistics and CSE.

BIOST 546 (Machine Learning for Biomedical and Public Health Big Data)

CSE 546 / STAT 535 (Foundational Machine Learning) Methods for identifying valid, novel, useful, and
understandable patterns in data. Induction of predictive models from data: classification, regression, and
probability estimation. Discovery of clusters and association rules.

One course in data science:

Each trainee is required to take a course on "data science". A student can choose between a data science course under development by the Department of Biostatistics and a data visualization or data management course taught in CSE.

BIOST 544 (Introduction to Biomedical Data Science) Formerly BIOST 578 B, Introduction to Data Science.

CSE 512 (Data Visualization) Covers techniques and algorithms for creating effective visualizations based on
principles from graphic design, visual art, perceptual psychology, and cognitive science. Topics include data
and image models; visual encoding; graphical perception; color; animation; interaction techniques; graph
layout; and automated design.

CSE 544 (Data Management) Data models and query languages (SQL, datalog, OQL). Relational databases,
enforcement of integrity constraints. Object-oriented databases and object-relational databases. Principles
of data storage and indexing. Query-execution methods and query optimization algorithms. Static analysis of
queries and rewriting of queries using views. Data integration, data mining, and principles of transaction
processing.

Coursework in Genomics and Neuroscience

Each trainee will take either three quarter-long courses in Neuroscience (one computational and two basic neuroscience) or three quarters' worth of courses in Genome Sciences (one quarter-long computational course and four half-quarter courses in genetics, genomics, or proteomics).

Genomics coursework:

All trainees specializing in genomics will take the following course:

GENOME 541 (Introduction to Computational Molecular Biology) This course provides a survey of topics
within the field of computational molecular biology. The course is divided into five two-week blocks, each
devoted to a single topic. The topics include areas such as genomics, metagenomics, epigenomics,
phylogenetics, proteomics, and protein structure. The course involves weekly, substantial programming
assignments.

Furthermore, all trainees specializing in genomics will take four of the following courses. Each of these courses lasts 5 weeks (one-half of one academic quarter). Therefore, four of these courses will amount to two quarters of coursework.

GENOME 551 (Principles of Gene Regulation) A detailed examination of the mechanisms of transcription
and translation as determined by experimental genetics, molecular biology, and biochemistry.

GENOME 552 (Technologies for Genome Analysis) Discussion of current and newly-emerging technologies
in genome analysis with regard to applications in biology and medicine and to potential advantages and
limitations.

GENOME 553 (Advanced Genetic Analysis) Classical genetic analysis is a powerful approach to dissect
complex biological processes. Selective removal, addition, or alteration of specific proteins to identify and
order genes in a pathway, define protein function, determine tissue and temporal requirements for gene
function, and distinguish among competing hypotheses to explain biological phenomena.

GENOME 555 (Protein Technology) Focuses on current and emerging technologies and approaches in
protein analysis, and considers applications of these technologies in biology, biotechnology, and medicine.

GENOME 561 (Molecular Population Genetics and Evolution) Surveys recent literature to gain an understanding
of the basic principles of molecular population genetics and evolution as applied to analysis of
genome data.

Neuroscience coursework:

All trainees specializing in neuroscience will take the following course:

CSE 528 (Computational Neuroscience) Introduction to computational methods for understanding nervous
systems and the principles governing their operation. Topics include representation of information by spiking
neurons, information processing in neural circuits, and algorithms for adaptation and learning.

Furthermore, all trainees specializing in neuroscience will take two of the following three courses:

NEURO 501 (Introduction to Neurobiology: Molecular & Cellular Neurobiology) Concepts and techniques of
molecular and cell biology as applied to understanding development and function of the nervous system.

NEURO 502 (Introduction to Neurobiology: Sensory & Motor Systems) Introduction to neuroanatomy and
modules on sensory and motor systems, examination of macroscopic and microscopic neural tissues.

NEURO 503 (Cognitive and Integrative Neuroscience) A discussion of higher neural processes like learning,
memory, and neuroendocrinology. Lecture and laboratory discussion of original literature, observation of
demonstrations and simulations.

Training in Ethics

Trainees must complete at least one of the following three options to meet the responsible conduct of research requirement: (1) BIOST 532: Ethical Issues for Biostatisticians, (2) GENOME 580: Ethics, or (3) the Biomedical Research Integrity Series.

Forums For Intellectual Exchanges

Each year, trainees will be required to register for four course credits of seminars or journal clubs related to genomics, neuroscience, statistics, or computing. These four course credits must include (1) a computing seminar, (2) a statistics seminar, and (3) either a genomics or a neuroscience seminar.

For example, a trainee might register for two credits of the Biostatistics Department Seminar, one credit of the Machine Learning Lunch, and one credit of the Genome Sciences Seminar.

Acceptable seminars and journal clubs include:

Statistics

Biostatistics Department Seminar
Statistics Department Seminar
Statistical Genetics Seminar
Boeing Distinguished Colloquia (Applied Math)
Chem E 599, Topics in Data Science

Computing

Machine Learning Lunch
Computer Science & Engineering Colloquia
Electrical Engineering Research Colloquium
Trends in Optimization Seminar
Reduced Order Modeling Seminar
Chem E 591, Robotics and Control Systems Colloquium

Genomics

Genome Sciences Seminar
Reading and Research in Computational Biology
Combi Seminar
Medical Genetics Journal Club (Genome 525)

Neuroscience

Theoretical Neuroscience Journal Club
Computational Neuroscience Seminar
Physiology and Biophysics Seminar

Rotations

Students who join the BDGN TG in their first year of graduate study will perform three quarter-long rotations, including one in genomics or neuroscience and one in statistics or computing. During each rotation, each BDGN trainee will be paired with a “rotation collaborator”: a senior PhD student or postdoc in the host lab who will help guide the trainee over the course of the rotation.

Students who have already selected a thesis lab at the time they join the BDGN TG are exempt from the rotation requirement. They are encouraged, however, to form supervisory committees that include faculty with expertise in both genomics/neuroscience and statistics/computing.

External Internships

Trainees are encouraged to spend the summer after the first year of their BDGN traineeship in an external internship to enhance their training in biomedical big data. The BDGN Steering Committee members will use their relationships with local companies to help place trainees in summer internships.