MIT – Computational Systems Biology: Deep Learning in the Life Sciences

Course description

This course introduces foundations and state-of-the-art machine learning challenges in genomics and the life sciences more broadly. We introduce both deep learning and classical machine learning approaches to key problems, comparing and contrasting their power and limitations. We seek to enable students to evaluate a wide variety of solutions to key problems we face in this rapidly developing field, and to execute on new enabling solutions that can have large impact. As part of the subject, students will implement solutions to challenging problems, first in problem sets that span a carefully chosen set of tasks, and then in an independent project. Students will program using Python 3 and TensorFlow 2 in Jupyter Notebooks, a nod to the importance of carefully documenting your work so it can be precisely reproduced by others.

Syllabus and schedule

Session | When | Description | Course materials | Reference
Lecture 1 | Feb 16 1pm | Course Intro + Overview Foundations | Read Goodfellow Chapter 1; Lecture slides; Lecture video | DL in Bioinformatics; DL for computational biology; The Roots of Bioinformatics; ML in Genomic Medicine; Awesome DeepBio; Visual Information Theory
Lecture 2 | Feb 18 1pm | ML Foundations | Read Goodfellow Chapter 6; Feed Forward Backprop; Lecture slides; Lecture video | Damien; Nick Locascio; TF site tutorials
Recitation 1 | Feb 19 3pm | ML Review | Recitation slides
Lecture 3 | Feb 23 1pm | Convolutional Neural Networks | Read Goodfellow Chapter 9; Lecture slides; Lecture video
Lecture 4 | Feb 25 1pm | Recurrent Neural Networks, Graph Neural Networks | Read Goodfellow Chapter 10; Lecture slides; Lecture video
Recitation 2 | Feb 26 3pm | Neural Networks Review | Recitation notes
Lecture 5 | Mar 2 1pm | Interpretability, Dimensionality Reduction | Lecture slides; Lecture video | Binder et al. (Relevance Propagation); Dumoulin and Visin (Convolution Arithmetic); Finnegan and Song (Maximum entropy methods); Lundberg and Lee (SHAP); Ribeiro (LIME); Selvaraju et al. (Grad-CAM); Shrikumar et al. (Learning Important Features); Shrikumar et al. (DeepLIFT); Simonyan et al. (Saliency Maps); Springenberg et al. (CNN); Sundararajan et al. (Axiomatic Attribution); Yosinski et al. (Deep Visualization); Zeiler et al. (Deconvolutional Networks); Zeiler and Fergus (Understanding Convolutional Networks); Zhou et al. (Discriminative Localization)
Lecture 6 | Mar 4 1pm | Generative Models, GANs, VAE | Lecture slides; Lecture video
Recitation 3 | Mar 5 3pm | Interpreting ML Models | Recitation slides
No class | Mar 9 | Monday Class Schedule
Deadline | Mar 10 11:59pm | PS1 due
Lecture 7 | Mar 11 1pm | DNA Accessibility, Promoters and Enhancers | Lecture slides; Lecture video
Recitation 4 | Mar 12 3pm | Chromatin and gene regulation | Recitation slides
Lecture 8 | Mar 16 1pm | Transcription Factors, DNA methylation | Lecture slides; Lecture video
Lecture 9 | Mar 18 1pm | Gene Expression, Splicing | Lecture slides; Lecture video
Recitation 5 | Mar 19 3pm | RNA-seq, Splicing | Recitation slides
No class | Mar 23 | Class Holiday
Lecture 10 | Mar 25 1pm | Single cell RNA-sequencing | Lecture slides; Lecture video
Recitation 6 | Mar 26 3pm | scRNA-seq, dimensionality reduction | Recitation slides
Lecture 11 | Mar 30 1pm | Dimensionality Reduction, Genetics, and Variation | Lecture slides A; Lecture slides B; Lecture video
Lecture 12 | Apr 1 1pm | GWAS and Rare variants | Lecture slides; Lecture video
Deadline | Apr 1 11:59pm | PS2 due
Recitation 7 | Apr 2 3pm | Genetics | Recitation slides A; Recitation slides B
Lecture 13 | Apr 6 1pm | eQTLs | Lecture slides; Lecture video
Lecture 14 | Apr 8 1pm | Electronic health records and patient data | Lecture slides; Lecture video
Recitation 8 | Apr 9 3pm | ML for health data | Recitation slides
Lecture 15 | Apr 13 1pm | Graph analysis | Lecture slides Part A; Lecture slides Part B; Lecture video
Lecture 16 | Apr 15 1pm | Drug discovery | Lecture slides; Lecture video
Recitation 9 | Apr 16 3pm | Protein structure prediction | Recitation slides
No class | Apr 20 | Class Holiday
Lecture 17 | Apr 22 1pm | Protein folding | Lecture slides Part A; Lecture slides Part B; Lecture slides Part C; Lecture video
Deadline | Apr 23 11:59pm | PS3 due
Recitation 10 | Apr 23 3pm | Exam prep session | Recitation slides
Exam | Apr 27 11:59pm | In-class exam
Lecture 19 | Apr 29 1pm | No lecture
Deadline | Apr 29 11:59pm | PS4 due
Recitation 11 | Apr 30 3pm | Structural biology and protein folding | Recitation slides
Lecture 20 | May 4 1pm | Imaging applications in healthcare | Lecture slides Part A; Lecture slides Part B; Lecture video
Lecture 21 | May 6 1pm | Video processing, structure determination | Lecture slides; Lecture video
No class | May 7 | Class Holiday
Lecture 22 | May 11 1pm | Imaging and Cancer | Lecture slides Part A; Lecture slides Part B; Lecture video
Lecture 23 | May 13 1pm | EHRs and data mining | Lecture slides; Lecture video
Recitation 12 | May 14 3pm | How to present | Recitation video
Deadline | May 17 11:59pm | Final project reports due
Lecture 24 | May 18 1pm | Neuroscience | Lecture slides Part A; Lecture slides Part B; Lecture video
Deadline | May 19 11:59pm | Final presentations due
Deadline | May 20 | In-class final presentations

Tutorials for TensorFlow, NumPy, Google Cloud, and Jupyter notebooks

We collected a series of pointers to tutorials on NumPy, TensorFlow, Google Cloud and Conda here. We also provide a Quickstart tutorial for setting up the essential environment and tools you will need for problem set 0 and problem set 1.

Prerequisites

You should be comfortable with calculus, linear algebra, (Python) programming, probability, and introductory molecular biology. This will be a fast-paced course, targeted towards students who are both mathematically and computationally capable. Many other subjects at MIT teach less demanding overviews of computational biology, and we would be happy to recommend other options if you find this subject is not what you are looking for.

Class meeting times

  • Lecture: TR1-2.30
  • Recitation: F3-4
  • Mentoring Session: F4-5

Contact

You should feel free to contact the lecturer and the TAs about any questions through 6.874staff@mit.edu. The best way to get detailed questions answered is to attend TA office hours and recitation or post them on Piazza.

Office hours

  • Manolis Kellis (manoli@mit.edu): M 5-6pm
  • Zheng Dai, Dylan Cable: Tues 4-5pm
  • Jackie Valeri, Tessa Gustafson: Wed 7-8pm

Grading

Grading will be based upon five programming-intensive problem sets (30%), a quiz (25%), a project (35%), and participation plus one day of lecture scribing (10%). Attendance in lecture is important as the class moves quickly and you will need to be present. For students enrolled in one of the graduate versions of this class (6.874, 20.490, and HST.506) there will be an extra section on some problem sets. You can use three late days for problem set deadlines (or email the course staff).

Lecture Scribing

If you are enrolled in this course for credit, you are required to scribe for one lecture.

The requirements for lecture scribing are as follows:

  1. On the day of lecture you may take notes however you like. Lectures will be recorded, so asynchronous participation is fine.
  2. During the week after lecture, we ask that you work with everyone assigned to scribe your lecture to compile a finalized set of notes that summarize the key points of the lecture, explain important equations, images and plots, illustrate or describe relevant things that were written on the board, and describe any important questions and answers exchanged between students and the professor.
    The end goal is for you to generate a compact resource which you and your classmates can use to glean the important material from your lecture. The finalized notes should generally adhere to and extend from the structure outlined by the headings at the beginning of the notes template.
  3. The notes template and finished scribed notes may be found here.
  4. Let the course staff know you are finished compiling the notes by sending an email to 6.874staff@mit.edu. The deadline for completing the notes will be end-of-day one week after your lecture (e.g. notes from a lecture on 2/18 will be due on 2/25 @ 11:59 PM).

Project

This subject has a substantial project component. We strongly recommend working on projects in teams of 2-3 students, but if there is a strong justification, we can consider exceptions. You are free to choose any problem in the life sciences related to the lectures of the course, and develop a deep learning solution using the subject’s methodologies or cloud resources. We will have extensive mentoring resources to help provide guidance, access to datasets, and biological insights. We will hold mentoring sessions during which you will have a chance to refine your ideas in consultation with the teaching staff and research mentors for each research area.

Textbook

We will be using the book “Deep Learning” by Goodfellow, Bengio, and Courville. You can find the book online here and here. You can purchase a hard copy at MIT Press or on Amazon.

Another useful book is the Matrix Cookbook, an extensive collection of facts about matrices.

MIT – 6.802 / 6.874 / 20.390 / 20.490 / HST.506 Computational Systems Biology: Deep Learning in the Life Sciences – Spring 2019