MIT – Computational Systems Biology: Deep Learning in the Life Sciences
Course description
This course introduces foundations and state-of-the-art machine learning challenges in genomics and the life sciences more broadly. We introduce both deep learning and classical machine learning approaches to key problems, comparing and contrasting their power and limitations. We seek to enable students to evaluate a wide variety of solutions to key problems we face in this rapidly developing field, and to execute on new enabling solutions that can have a large impact. As part of the subject, students will implement solutions to challenging problems, first in problem sets that span a carefully chosen set of tasks, and then in an independent project. Students will program using Python 3 and TensorFlow 2 in Jupyter Notebooks, a nod to the importance of carefully documenting your work so it can be precisely reproduced by others.
Syllabus and schedule
Session | When | Topic | Course materials | Reference | |
---|---|---|---|---|---|
Lecture 1 | Feb 16 1pm | Course Intro + Overview Foundations | Read Goodfellow Chapter 1; Lecture slides; Lecture video | DL in Bioinformatics; DL for computational biology; The Roots of Bioinformatics; ML in Genomic Medicine; Awesome DeepBio; Visual Information Theory | |
Lecture 2 | Feb 18 1pm | ML Foundations | Read Goodfellow Chapter 6; Feed Forward Backprop; Lecture slides; Lecture video | Damien; Nick Locascio; TF site tutorials | |
Recitation 1 | Feb 19 3pm | ML Review | Recitation slides | | |
Lecture 3 | Feb 23 1pm | Convolutional Neural Networks | Read Goodfellow Chapter 9; Lecture slides; Lecture video | | |
Lecture 4 | Feb 25 1pm | Recurrent Neural Networks, Graph Neural Networks | Read Goodfellow Chapter 10; Lecture slides; Lecture video | | |
Recitation 2 | Feb 26 3pm | Neural Networks Review | Recitation notes | | |
Lecture 5 | Mar 2 1pm | Interpretability, Dimensionality Reduction | Lecture slides; Lecture video | Binder et al. (Relevance Propagation); Dumoulin and Visin (Convolution Arithmetic); Finnegan and Song (Maximum entropy methods); Lundberg and Lee (SHAP); Ribeiro (LIME); Selvaraju et al. (Grad-CAM); Shrikumar et al. (Learning Important Features); Shrikumar et al. (DeepLIFT); Simonyan et al. (Saliency Maps); Springenberg et al. (CNN); Sundararajan et al. (Axiomatic Attribution); Yosinski et al. (Deep Visualization); Zeiler et al. (Deconvolutional Networks); Zeiler and Fergus (Understanding Convolutional Networks); Zhou et al. (Discriminative Localization) | |
Lecture 6 | Mar 4 1pm | Generative Models, GANs, VAE | Lecture slides; Lecture video | | |
Recitation 3 | Mar 5 3pm | Interpreting ML Models | Recitation slides | | |
No class | Mar 9 | Monday Class Schedule | | | |
Deadline | Mar 10 11:59pm | PS1 due | | | |
Lecture 7 | Mar 11 1pm | DNA Accessibility, Promoters and Enhancers | Lecture slides; Lecture video | | |
Recitation 4 | Mar 12 3pm | Chromatin and gene regulation | Recitation slides | | |
Lecture 8 | Mar 16 1pm | Transcription Factors, DNA methylation | Lecture slides; Lecture video | | |
Lecture 9 | Mar 18 1pm | Gene Expression, Splicing | Lecture slides; Lecture video | | |
Recitation 5 | Mar 19 3pm | RNA-seq, Splicing | Recitation slides | | |
No class | Mar 23 | Class Holiday | | | |
Lecture 10 | Mar 25 1pm | Single cell RNA-sequencing | Lecture slides; Lecture video | | |
Recitation 6 | Mar 26 3pm | scRNA-seq, dimensionality reduction | Recitation slides | | |
Lecture 11 | Mar 30 1pm | Dimensionality Reduction, Genetics, and Variation | Lecture slides A; Lecture slides B; Lecture video | | |
Lecture 12 | Apr 1 1pm | GWAS and Rare variants | Lecture slides; Lecture video | | |
Deadline | Apr 1 11:59pm | PS2 due | | | |
Recitation 7 | Apr 2 3pm | Genetics | Recitation slides A; Recitation slides B | | |
Lecture 13 | Apr 6 1pm | eQTLs | Lecture slides; Lecture video | | |
Lecture 14 | Apr 8 1pm | Electronic health records and patient data | Lecture slides; Lecture video | | |
Recitation 8 | Apr 9 3pm | ML for health data | Recitation slides | | |
Lecture 15 | Apr 13 1pm | Graph analysis | Lecture slides Part A; Lecture slides Part B; Lecture video | | |
Lecture 16 | Apr 15 1pm | Drug discovery | Lecture slides; Lecture video | | |
Recitation 9 | Apr 16 3pm | Protein structure prediction | Recitation slides | | |
No class | Apr 20 | Class Holiday | | | |
Lecture 17 | Apr 22 1pm | Protein folding | Lecture slides Part A; Lecture slides Part B; Lecture slides Part C; Lecture video | | |
Deadline | Apr 23 11:59pm | PS3 due | | | |
Recitation 10 | Apr 23 3pm | Exam prep session | Recitation slides | | |
Exam | Apr 27 11:59pm | In-class exam | | | |
Lecture 19 | Apr 29 1pm | No lecture | | | |
Deadline | Apr 29 11:59pm | PS4 due | | | |
Recitation 11 | Apr 30 3pm | Structural biology and protein folding | Recitation slides | | |
Lecture 20 | May 4 1pm | Imaging applications in healthcare | Lecture slides Part A; Lecture slides Part B; Lecture video | | |
Lecture 21 | May 6 1pm | Video processing, structure determination | Lecture slides; Lecture video | | |
No class | May 7 | Class Holiday | | | |
Lecture 22 | May 11 1pm | Imaging and Cancer | Lecture slides Part A; Lecture slides Part B; Lecture video | | |
Lecture 23 | May 13 1pm | EHRs and data mining | Lecture slides; Lecture video | | |
Recitation 12 | May 14 3pm | How to present | Recitation video | | |
Deadline | May 17 11:59pm | Final project reports due | | | |
Lecture 24 | May 18 1pm | Neuroscience | Lecture slides Part A; Lecture slides Part B; Lecture video | | |
Deadline | May 19 11:59pm | Final presentations due | | | |
Deadline | May 20 | In-class final presentations | | | |
Tutorials for TensorFlow, NumPy, Google Cloud, and Jupyter notebooks
We collected a series of pointers to tutorials on NumPy, TensorFlow, Google Cloud, and Conda here. We also provide a Quickstart tutorial to set up the essential environment and tools you need to work on problem set 0 and problem set 1.
Prerequisites
You should be comfortable with calculus, linear algebra, (Python) programming, probability, and introductory molecular biology. This will be a fast-paced course, and it is targeted towards students who are both mathematically and computationally capable. There are many other subjects at MIT that teach overviews of computational biology and are less demanding; we would be happy to recommend other options if you find this subject is not what you are looking for.
Class meeting times
- Lecture: TR1-2.30
- Recitation: F3-4
- Mentoring Session: F4-5
Contact
You should feel free to contact the lecturer and the TAs about any questions through 6.874staff@mit.edu. The best way to get detailed questions answered is to attend TA office hours and recitation or post them on Piazza.
Office hours
- Manolis Kellis (manoli@mit.edu): M 5-6pm
- Zheng Dai, Dylan Cable: Tues 4-5pm
- Jackie Valeri, Tessa Gustafson: Wed 7-8pm
Grading
Grading will be based upon five programming-intensive problem sets (30%), a quiz (25%), a project (35%), and participation plus one day of lecture scribing (10%). Attendance in lecture is important as the class moves quickly and you will need to be present. For students enrolled in one of the graduate versions of this class (6.874, 20.490, and HST.506) there will be an extra section on some problem sets. You can use three late days for problem set deadlines (or email the course staff).
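As a rough illustration of how the weights above combine, here is a minimal sketch. Only the four percentages come from the grading policy; the function name `weighted_grade`, the 0-100 score scale, and the example scores are assumptions made for illustration and do not describe how the course staff actually compute grades.

```python
# Weights taken from the grading policy above; everything else is illustrative.
WEIGHTS = {
    "problem_sets": 0.30,
    "quiz": 0.25,
    "project": 0.35,
    "participation_and_scribing": 0.10,
}


def weighted_grade(scores):
    """Combine per-component scores (assumed 0-100 each) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, not real data.
example = {
    "problem_sets": 90,
    "quiz": 80,
    "project": 85,
    "participation_and_scribing": 100,
}
total = weighted_grade(example)
print(round(total, 2))
```

The project carries the largest single weight, so in this sketch a change in the project score moves the total more than the same change in any other component.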
Lecture Scribing
If you are enrolled in this course for credit, you are required to scribe for one lecture.
The requirements for lecture scribing are as follows:
- On the day of lecture you may take notes however you like. Lectures will be recorded, so asynchronous participation is fine.
- During the week after lecture, we ask that you work with everyone assigned to scribe your lecture to compile a finalized set of notes that summarize the key points of the lecture, explain important equations, images, and plots, illustrate or describe relevant things written on the board, and describe any important questions and answers exchanged between students and the professor.
The end goal is for you to generate a compact resource which you and your classmates can use to glean the important material from your lecture. The finalized notes should generally adhere to and extend from the structure outlined by the headings at the beginning of the notes template.
- The notes template and finished scribed notes may be found here.
- Let the course staff know you are finished compiling the notes by sending an email to 6.874staff@mit.edu. The deadline for completing the notes will be end-of-day one week after your lecture (e.g. notes from a lecture on 2/18 will be due on 2/25 @ 11:59 PM).
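The scribing deadline rule above (end of day, one week after the lecture) can be sketched with Python's standard `datetime` module. The function name `scribe_notes_deadline` is hypothetical, and the year in the example is only a placeholder, since the rule is stated with month and day alone.

```python
from datetime import date, datetime, time, timedelta


def scribe_notes_deadline(lecture_day: date) -> datetime:
    """Return 11:59 PM on the day one week after the lecture."""
    return datetime.combine(lecture_day + timedelta(days=7), time(23, 59))


# Placeholder year; the policy's own example is a lecture on 2/18.
deadline = scribe_notes_deadline(date(2019, 2, 18))
print(deadline.strftime("%m/%d %I:%M %p"))
```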
Project
This subject has a substantial project component. We strongly recommend working on projects in teams of 2-3 students, but if there’s a strong justification, we can consider exceptions. You are free to choose any problem in the life sciences related to the lectures of the course, and develop a deep learning solution using the subject’s methodologies or cloud resources. We will have extensive mentoring resources to help provide students with guidance, access to datasets, and biological insights. We will hold mentoring sessions during which you will have a chance to refine your ideas in consultation with the teaching staff and research mentors for each research area.
Textbook
We will be using the book “Deep Learning” by Goodfellow, Bengio, and Courville. You can find the book online here and here. You can purchase a hard copy at MIT Press or on Amazon.
Another useful book is the Matrix Cookbook, an extensive collection of facts about matrices.
MIT – 6.802 / 6.874 / 20.390 / 20.490 / HST.506 Computational Systems Biology: Deep Learning in the Life Sciences – Spring 2019