Genome Sciences Hackathon
Do you love innovation and challenge? Do you like problem solving in collaborative environments? Do you want to learn new technical skills outside of your current projects? Most importantly, do you want to make tools that will improve the field of genomics and make it more accessible to the public?
If so, the Department of Genome Sciences proudly invites you to participate in our second annual Hackathon! The Hackathon will span five days, from Monday, September 18th through Friday, September 22nd. All GS trainees, staff, and faculty (including graduate students and postdoctoral fellows) are invited to participate. There will be four teams led by your amazing colleagues, and each team will build a deliverable by the end of the five days. This is an opportunity to build something cool, learn from colleagues outside of your lab, and have fun!
If you’re interested, please read the project details below and fill out the interest form:
https://forms.gle/eDxhcwNoa34wwkSm8
If you have any questions about the event, please contact Sayeh Gorjifard <sgorji [ a t ] uw.edu>
EVENT DETAILS:
Location: Each team will have a conference room in Foege
Dates: September 18-22, 2023
Times: 9 am - 5 pm (with afternoon review sessions in the Foege auditorium at 3 pm on Monday and Wednesday, and a final presentation on Friday).
Food: Coffee/snacks are provided daily
Final celebration (beer hour + food) will be Friday at 5 pm in the 3rd floor common area.
Project #1:
Leading lab: Dunham Lab
Stakeholders: Maitreya Dunham and Leah Anderson
Hackathon project title: Developing bioinformatics and data visualization resources for yEvo (https://yevo.org/), which teaches high schoolers about evolution and genetics using yeast.
Desired deliverable: Website tools for high school students to see and interact with whole genome sequencing data, along with curricular materials to guide students through a lab about whole genome sequencing and bioinformatics. The long-term goal is to offer a bioinformatics module as a standalone product that teachers can use in their classes without direct intervention from us. If this is successful, other yeast labs that have less access to computational resources would probably use it, and we hope that SGD will host the tool.
Expected coding experience level: Beginner-Intermediate-Advanced
Abstract:
Whole-genome sequencing is not accessible to users who lack expert training or coding expertise, such as biologists at non-elite schools or high schoolers. To address this problem, the desired deliverable is a web-based tool, with an associated protocol, that users can use to analyze and visualize their whole genome sequencing data. We aim to have a simple user interface that allows students to understand each step in the analysis pipeline. We were thrilled that yEvo was one of the Hackathon teams last year, as we were able to move our sequencing data analysis pipeline into Snakemake and build a Shiny app, Mutation Browser: https://yevo.org/mutation-browser/. To expand the tool further, we aim to create a website where students can see pile-ups and mutation calls and interact with their “raw” data. Further improvements to the Mutation Browser Shiny app will allow students to compare mutations across and between classes, and to link out to SGD and other resources to learn about their mutations. The stretch goal is to build this as a website to help all yeast geneticists.
Project #2:
Leading lab: Beliveau Lab
Stakeholder: Conor Camplisson <concamp [ a t ] uw.edu>
Hackathon project title: Genome-wide simulation of in situ hybridization
Desired deliverable: A web app that allows users to input oligonucleotide probe sequence(s) and visualize their predicted binding genome-wide.
Expected coding experience level: Intermediate-Advanced.
Abstract:
Several tools exist for designing oligonucleotide probes for use in fluorescence in situ hybridization (FISH) experiments. However, there is an unmet need for a tool that can simulate the genome-wide binding profile of candidate probes under experimental hybridization conditions. Existing tools vary in the sophistication of their approaches to analyzing probe specificity as well as in the level of coding expertise needed to use them. This project will produce a tool that is both sophisticated and easy to use, leveraging a state-of-the-art specificity analysis pipeline based on short-read alignment and nearest-neighbor thermodynamics to simulate the genome-wide binding of candidate probes and visualize the predicted binding profile under experimental conditions. The app will include a web interface that can be used without any coding experience, allowing users to easily and accurately predict the in situ performance of their candidate probes. The Beliveau lab has several related web tools (PaintSHOP, TigerFISH) as well as existing algorithms that will form the back-end bioinformatics component of the web app. For this project we will need to create a containerized app that can be deployed to a cloud server, design and implement the web interface, functionalize the app so that the bioinformatics algorithms can be integrated into the back-end, create infrastructure that allows analysis jobs to run in the background (e.g., using a task queue framework), and expose front-end parameters so that users can configure the simulations that run on the back-end.
Project #3:
Leading lab: Noble Lab
Stakeholders: Gang Li (gangliuw [ a t ] uw.edu), Hyeon-Jin Kim (khj3017 [ a t ] uw.edu), Borislav Hristov (borislav [ a t ] uw.edu)
Hackathon project title: Integrating single-cell Hi-C and single-cell RNA sequencing data
Desired deliverable: Benchmarking results comparing multiple existing software tools for integrating scHiC and scRNA data.
Expected coding experience level: Intermediate-Advanced.
Abstract:
Recently, Liu et al. developed HiRES, a multi-omics sequencing approach to simultaneously profile 3D chromatin contacts and gene expression in single cells. The HiRES assay thus directly links chromatin conformation and gene expression profiles within single cells. Previous studies that aimed to investigate the interplay between chromatin interactions and transcriptomic profiles necessarily generated unpaired single-cell RNA-sequencing and chromatin conformation data. For the hackathon, we will use the newly available HiRES data to benchmark existing software tools developed for matching cells across modalities. In particular, we will evaluate GLUE, LS-MMDMA (Meng 2023), Pomona (Cao 2021), SCOT (Demetci 2022), Synmatch (Hristov 2022), and CMOT (Alatkar 2023). In addition, we will consider four different techniques for representing the scHi-C contact matrices: the contact decay profile, the HiCRep (Lin 2021) similarity score, a latent Dirichlet allocation model (Kim 2020), and the scGAD method (Shen 2022). Our results will compare and contrast the performance of these tools and Hi-C representation techniques in the context of integration with scRNA-seq data.
For the hackathon, we will use co-assay data to validate integration of scRNA-seq and scHi-C.
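As a rough illustration of the simplest of the four representations, the contact decay profile summarizes a contact matrix by the average contact count at each genomic distance (each diagonal offset). The sketch below is only a minimal example under assumptions: the function name, the use of a plain mean, and the toy 4-bin matrix are hypothetical and are not taken from the project or from any of the tools listed above.

```python
def contact_decay_profile(matrix):
    """Return the mean contact count at each bin distance of a square matrix.

    matrix is a list of lists; entry [i][j] is the contact count between
    genomic bins i and j. Distance d corresponds to the d-th upper diagonal.
    """
    n = len(matrix)
    profile = []
    for distance in range(n):
        # Collect all entries whose two bins are exactly `distance` apart.
        diagonal = [matrix[i][i + distance] for i in range(n - distance)]
        profile.append(sum(diagonal) / len(diagonal))
    return profile


# Hypothetical toy contact matrix for one cell (4 bins, symmetric).
toy_matrix = [
    [8, 4, 2, 1],
    [4, 8, 4, 2],
    [2, 4, 8, 4],
    [1, 2, 4, 8],
]

profile = contact_decay_profile(toy_matrix)
print(profile)
```

Each cell is then described by one short vector, which is easy to compare across cells but discards where along the genome the contacts occur.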
Project #4:
Leading lab: Nunn Lab
Stakeholder: Brook L. Nunn <brookh [ a t ] uw.edu>
Hackathon project title: Metagenomic time series portal to interrogate bacterial taxonomic groups and their functions as they relate to temporal chemical and physical data
Expected coding experience level: Beginner (HTML/CSS) - Advanced (experience with back-end web development)
Abstract:
The Nunn Lab envisions a transformative metagenomic time series portal to dissect bacterial taxonomic groups and functions in relation to temporal chemical and physical data. The portal's deliverable is a user-friendly website granting access to 133 metagenomes from ocean water samples, each collected at a distinct time point. Currently, the Joint Genome Institute (JGI) does not link time points together, so you cannot visualize how a gene is changing through time, even though JGI has rich gene annotations. Our goal is to consolidate all of this data in usable formats. If users can interrogate the data through a website portal, asking how a specific gene, taxonomic group, or keyword located within a gene set is changing through time, then the data set will be utilized more frequently. The website should allow users to query using keywords like protein names, KEGG annotations, and PFam annotations, with results displayed in a comprehensive web page showcasing gene abundance, taxonomic profiles, and more across the 133 time points. We anticipate the portal will be utilized in middle school and high school classrooms. As agreed with JGI when the funding was received, we will publish a scientific data report within the year that provides all metadata to the public. We aim to include this portal as part of the publication.
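To make the kind of query described above concrete, here is a minimal sketch of a keyword search that sums gene abundance per time point. Everything in it is an assumption for illustration: the record layout, the gene names, the abundance values, and the function name are hypothetical and do not reflect the actual JGI data format or the portal's design.

```python
# Hypothetical records: (time_point, gene_id, annotation_text, abundance).
records = [
    (1, "gene_a", "nitrogenase iron protein", 12.0),
    (1, "gene_b", "ABC transporter permease", 5.0),
    (2, "gene_a", "nitrogenase iron protein", 20.0),
    (2, "gene_c", "nitrogenase molybdenum-iron protein", 3.0),
    (3, "gene_b", "ABC transporter permease", 7.0),
]


def abundance_over_time(records, keyword):
    """Sum the abundance of genes whose annotation contains the keyword.

    Returns a dict mapping every time point seen in the records to the
    summed abundance, so time points with no match are reported as 0.0.
    """
    totals = {}
    needle = keyword.lower()
    for time_point, _gene_id, annotation, abundance in records:
        totals.setdefault(time_point, 0.0)
        if needle in annotation.lower():
            totals[time_point] += abundance
    return dict(sorted(totals.items()))


series = abundance_over_time(records, "nitrogenase")
print(series)
```

Reporting zero for time points without a match keeps the series aligned across all sampled time points, which matters when plotting a gene's change through time.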
Project #5:
Leading lab: Stergachis Lab
Stakeholder: Morgan Hamm <mhamm [ a t ] uw.edu>
Hackathon project title: Improving m6A methylation calling from Oxford Nanopore data
Expected coding experience level: Intermediate-Advanced
Abstract:
Fiber-seq is a method to assess chromatin accessibility using a nonspecific adenine methyltransferase that selectively labels accessible, but not occluded, adenine bases. Major benefits of this method over other accessibility techniques are that the DNA sequence is preserved and that it is amenable to long-read sequencing. PacBio sequencing is typically used as the sequencing technology in Fiber-seq assays. Oxford Nanopore sequencing technology is also capable of calling adenine methylation; however, methylation calling is less accurate than with PacBio data, and false positive calls disrupt downstream analysis steps such as labeling nucleosomes. For this hackathon project, we will train a model to improve m6A methylation calling for Oxford Nanopore sequencing data. The model will use the existing methylation calls, along with quality metrics and the sequence context of the methylation base calls, to reduce false positive calls. The Stergachis lab has sequenced methyltransferase-treated samples for use in this project. The Stergachis lab Hackathon project last year performed a similar task, improving methylation calling for PacBio data; that project led to a manuscript currently in review at Nature Methods.