Current Students

Greg Finney

Joined Program: 2003
Previous Degree: B.S. Biochemistry & Molecular Biology, UC Santa Cruz
MacCoss Lab
gfinney (at) u.washington.edu

Research:

Thesis Topic: Relative Quantitative Proteomics by LC/MS without Stable Isotope Labeling

Comparing protein or peptide levels between control and experimental biological samples can serve purposes such as discovering biomarkers that indicate the presence or progression of a disease, or investigating biological questions in a manner that complements gene expression experiments. An alternative to differential quantification techniques such as isotopic labeling is to find meaningful differences across samples by exploiting the statistical power of multiple technical replicates of controls and samples.

The main focus to date has been on chromatogram alignment. Chromatography on nanospray liquid chromatography/mass spectrometry (LC/MS) systems is not highly reproducible, with retention times varying by more than a peak width. Aligning LC/MS chromatograms in the time dimension will allow us to compare runs with statistical techniques that may detect changes missed by strategies that rely on finding peaks and comparing their areas.

I am borrowing techniques from Dynamic Time Warping (DTW) for chromatogram alignment; DTW is analogous to the dynamic programming techniques used to align nucleotide sequences. I have extended published work on aligning chromatograms with only one value per time point to a method that uses a dot-product-based scoring scheme to find similar time points, considering all the mass-to-charge (m/z) values measured by the mass spectrometer. Current results look promising: the method can align chromatograms whose retention times vary by up to 10%.
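As an illustration only, the following minimal sketch shows how a dot-product score can drive a DTW-style alignment. It is not the implementation used in this project. It assumes each run is a matrix with one row per time point and one column per m/z bin, uses 1 minus the normalized dot product (cosine similarity) as the local distance, and allows only the three basic DTW moves. The function names `spectral_distance` and `dtw_align` are hypothetical.

```python
import numpy as np

def spectral_distance(a, b):
    """Return 1 - cosine similarity for every pair of time points.

    a has shape (n_a, n_mz) and b has shape (n_b, n_mz); each row holds
    the intensities of all m/z bins at one time point (an assumed layout).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_norm = np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = np.linalg.norm(b, axis=1, keepdims=True)
    # Rows with zero total intensity are left as zero vectors.
    a_unit = np.divide(a, a_norm, out=np.zeros_like(a), where=a_norm > 0)
    b_unit = np.divide(b, b_norm, out=np.zeros_like(b), where=b_norm > 0)
    return 1.0 - a_unit @ b_unit.T

def dtw_align(a, b):
    """Align two runs in time; return (path, total_cost).

    path is a list of (index_in_a, index_in_b) pairs from the first
    time points to the last ones.
    """
    dist = spectral_distance(a, b)
    n, m = dist.shape
    # cum[i, j] = cheapest cost of aligning the first i rows of a
    # with the first j rows of b.
    cum = np.full((n + 1, m + 1), np.inf)
    cum[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cum[i - 1, j - 1], cum[i - 1, j], cum[i, j - 1])
            cum[i, j] = dist[i - 1, j - 1] + best_prev
    # Trace back from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda ij: cum[ij])
    path.reverse()
    return path, cum[n, m]
```

For example, if run `b` is a copy of run `a` in which one time point is repeated, the returned path maps both copies in `b` to the single time point in `a`. A practical implementation would probably also constrain the path to a band around the diagonal, since retention times are expected to shift by a limited amount.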

A high degree of variation has been observed in the digestion efficiency between different samples. The contribution of this and other factors to the noise is being investigated.

Further work involves:

  • Extending the pairwise alignments (arbitrarily choosing a master template) to a system that either performs multiple simultaneous alignments or builds a consensus alignment from pairwise alignments.
  • Evaluating the effect of noise reduction approaches on alignment and on identifying differences
  • Evaluating precursor ion scanning as an approach to perform fractionation in the mass spectrometer, which should be more reproducible than biochemical fractionations

Side Project: False Discovery Rate Estimation of Peptide Identifications:

Peptides and proteins from large-scale proteomics experiments are commonly identified by algorithms that score each peptide match, but these scores do not have a probabilistic interpretation. An empirical approach can use the False Discovery Rate (FDR) as a metric of the proportion of putative identifications that are false. In prior work, the FDR has been estimated by searching a database of reversed proteins, which provides a set of random match candidates for the experimental spectra.
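A minimal sketch of one common way to turn decoy matches into an FDR estimate is given below. It assumes the target and decoy databases are searched separately and are of comparable size, so that the number of decoy matches above a score threshold approximates the number of false target matches above it. The function name `estimate_fdr` and the example scores are hypothetical, and this is not necessarily the estimator used in the work described here.

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target matches scoring at or above threshold.

    Assumption: decoy matches above the threshold occur about as often
    as false target matches, so FDR is approximated by decoys / targets.
    """
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    if n_target == 0:
        # No identifications were accepted, so there is nothing to be wrong.
        return 0.0
    # The ratio can exceed 1 by chance; cap it so it stays a proportion.
    return min(1.0, n_decoy / n_target)

# Made-up scores for illustration only.
target_scores = [4.1, 3.8, 3.5, 2.9, 2.2, 1.5]
decoy_scores = [2.5, 1.9, 1.2, 0.8]
fdr_at_2 = estimate_fdr(target_scores, decoy_scores, threshold=2.0)
```

With these made-up scores, five target matches and one decoy match pass a threshold of 2.0, giving an estimated FDR of 0.2.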

We are exploring the effects of three variant approaches to FDR estimation: i) the use of decoy databases generated by various techniques (reversed, shuffled, generated from Markov models); ii) the use of SEQUEST Xcorr scores normalized by an autocorrelation of the observed spectrum [5]; iii) variations in the DTA-Select parameters used to filter peptide results when identifying proteins.
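The three decoy-generation techniques named in (i) can be sketched as follows. This is an illustrative assumption about how such decoys might be built, not the code used in this study: in particular, the Markov model here is first-order (each residue depends only on the previous one), and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def reversed_decoy(sequence):
    """Reverse the protein sequence."""
    return sequence[::-1]

def shuffled_decoy(sequence, rng):
    """Randomly permute the residues, keeping the amino acid composition."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)

def train_markov(sequences):
    """Count residue-to-residue transitions (first-order model, assumed)."""
    transitions = defaultdict(Counter)
    for sequence in sequences:
        for prev, nxt in zip(sequence, sequence[1:]):
            transitions[prev][nxt] += 1
    return transitions

def markov_decoy(transitions, start, length, rng):
    """Generate a sequence of the given length (at least 1) from the model."""
    residues = [start]
    while len(residues) < length:
        counts = transitions.get(residues[-1])
        if not counts:
            # The last residue was never followed by anything in training;
            # restart from any residue that has outgoing transitions.
            residues.append(rng.choice(sorted(transitions)))
            continue
        choices, weights = zip(*sorted(counts.items()))
        residues.append(rng.choices(choices, weights=weights, k=1)[0])
    return "".join(residues)

# Made-up sequences for illustration only.
rng = random.Random(0)
proteins = ["MKTAYIAKQR", "MKVLAAGIVG"]
model = train_markov(proteins)
decoys = {
    "reversed": reversed_decoy(proteins[0]),
    "shuffled": shuffled_decoy(proteins[0], rng),
    "markov": markov_decoy(model, start="M", length=10, rng=rng),
}
```

Reversed and shuffled decoys preserve the length and composition of each protein, whereas Markov decoys preserve only the statistics the model was trained on.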

Results from this research were presented orally at the 53rd ASMS Conference on Mass Spectrometry. We found that decoy spectra from the reversed model are slightly but significantly more similar to the forward spectra than those from the Markov or shuffled models, but little effect was seen on the rate of protein identifications. I am working on a way to use the rate of decoy hits for a protein to estimate a p-value for its identification being correct.