AS.020.617 Fall 2024
CMDB Quantitative Biology Lab

Logistics & Contact Information

Instructors

Instructor        Email
Rajiv McCoy       rajiv.mccoy@jhu.edu
Michael Sauria    msauria1@jhu.edu
Frederick Tan     ftan2@jh.edu

Teaching Assistants

Teaching Assistant    Email               Office Hours
Nicolas Moya          nmoya1@jh.edu       Thursday 5-6:30 PM, UTL379
Jonathan Fischer      jfisch27@jhu.edu    Wednesday 4-6 PM, Mergenthaler 121
Siqi Ma               sma46@jhu.edu       Wednesday 4-6 PM, Mergenthaler 121

Class Meetings

Fridays, Sept 13, 2024 - Dec 6, 2024
10:00 A.M. ET - 12:30 P.M. ET, Rose Auditorium, Carnegie Institution for Science

Course Website

http://bxlab.github.io/cmdb-quantbio/

Course Description

This course builds upon the foundations of Quantitative Biology Bootcamp, reinforcing and expanding upon mathematical and computational methods for analysis of biological data. Weekly meetings of this 2.5-hour course are organized as active-learning modules focused on diverse areas of genomics. Students perform guided research with real genomic data, uploading their results and code to repositories where they receive feedback.

Topics are integrated with the curricula for concurrent core courses in molecular biology and cell biology, with datasets and analysis goals aimed at diverse topics in these fields. Examples of such topics include:

  • genome assembly, variant discovery, and genotyping
  • analysis of functional genomic data, such as bulk and single-cell RNA-seq
  • analysis of imaging data, including common tasks such as segmentation and classification

Upon completing the course, students should have the background to develop reproducible bioinformatic workflows tailored to their research questions, as well as familiarity with resources for expanding upon these skills.

Learning Goals

Upon completion of this course, students will be able to:

  • Design workflows for common genomic data types, including developing tractable questions, organizing and preprocessing data, developing analytic pipelines, and implementing statistical models.
  • Describe and implement common probability distributions, statistical tests, and correction for multiple hypothesis testing. Students will be able to identify confounding factors, as well as methods for their diagnosis and mitigation.
  • Maintain a GitHub repository containing analysis code, output, and documentation.
  • Describe key algorithms and data structures that are commonly encountered in computational biology, along with when best to use them.

Study Materials

This course does not have a required text. TAs have created a short online textbook covering the fundamentals of Python and Git. Any lecture notes or slides will be made available on the course website.

Course Format

Attendance

Due to the interactive nature of this course, students should not participate in any other meetings, courses, or lab work during class meeting time. Please be on time for course activities, and budget extra time in anticipation of technical delays. Your attendance is expected during all synchronous sessions, and your effort is also expected during asynchronous sessions. Please let us know about any emergencies, family responsibilities, illness, etc. that may prevent attendance, and we will work to accommodate reasonable requests.

Asking for Help

We both expect and encourage you to ask questions and request help throughout this course. Please ask about whatever you need, whenever you need it.

Googling

Googling is always an acceptable way to find answers or help, and we encourage you to utilize it extensively. If you adopt a solution following a Google search, make sure you understand what you incorporate, rather than just copy/pasting without comprehension of the logic or code. Please see the Academic Integrity & Ethics syllabus section for more on this.

You may be familiar with ChatGPT and other large language models. After trying each problem/assignment/task on your own, if you’re still running into issues, feel free to use ChatGPT as you would any other online resource (Google, Stack Overflow, etc.). Learning how to succinctly describe exactly what you want to accomplish is a skill in itself, so this can be good practice. If you find code that seems to work (e.g., from Google) but you’re not sure exactly how it works, you can also paste it into ChatGPT and ask it to explain what’s happening. As always, please do not submit any code unless you fully understand how it works; instead, flag it and ask a TA for assistance. Be aware that ChatGPT may confidently offer an answer that is not correct, so always check the output on your own.

Grading

Weekly exercise solutions (Jupyter notebooks, scripts, results, etc.) should be pushed to the student’s qbb2024-answers GitHub repository within one week after the assignment is posted. Grading for this course is based on reasonable completion. For each weekly assignment, students will be advised which exercises, if any, are advanced and therefore not required for submission.

TAs will verify that the student’s submitted work each week shows a reasonable level of individual effort. Letter grades will be assigned in line with the level of completion.

Resubmissions must occur within one week of receiving feedback and grades (please let us know if you need extra time and we will try to accommodate).

General guidelines for letter grade assignments:

  • A (excellent): 90-100% completion
  • B (passing): 80-89% completion
  • C or lower (not passing): <80% completion

Academic Integrity & Ethics

Academic and scientific institutions and research depend on honesty and integrity. You should complete your own lunch and homework exercises, and your work should not plagiarize others – including group partners, presenters, and strangers posting to online forums or blogs. You and your partners should be working together, but each person should write and turn in unique, individualized code. Note that your scripts and analysis may follow the same logical steps and even share tidbits of the same code, but no one person should write the solution that the whole group uses character for character. Additionally, you should understand every line of code you write, are given and use, or find online and incorporate. If asked to, you should be able to explain exactly what your code is doing. Another aspect is properly acknowledging the source of borrowed code. Understanding can be cultivated and acknowledgement implemented by writing both inline and multiline comments (which is a terrific practice in general). Relatedly, don’t give someone code to copy and paste. Make sure any recipient can explain back to you any code you share. See the University’s guidelines on plagiarism for details and contact a TA with any questions.

Code of Conduct

We are committed to maintaining a welcoming, inclusive, and harassment-free environment for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, religion (or lack thereof), political beliefs/leanings, or technology choices. We do not tolerate harassment in any form. This code of conduct applies to all course participants, including instructors and TAs, and to all modes of interaction and communication.

All class participants agree to:

  • Be considerate and polite. Quant Lab can be a daunting and sometimes frustrating experience - we're showing you a lot in a short amount of time! In addition, the collaborative nature of this course means it is essential that students are respectful of each other’s questions and points of confusion. Remember that you all are coming into this program with a wide range of coding backgrounds. A community where people feel comfortable is most conducive for learning.
  • Work together cooperatively. Be kind when giving feedback on your classmates’ code, and make an effort to help if they are stuck. You will gain much more from explaining concepts to someone else than you will from rushing through the assignments on your own.
  • Refrain from demeaning, discriminatory, or harassing behavior and speech. Harassment includes, but is not limited to: deliberate intimidation; stalking; unwanted photography or recording; sustained or willful disruption of talks or other events; inappropriate physical contact; use of sexual or discriminatory imagery, comments, or jokes; and unwelcome sexual attention. If you feel that someone has harassed you or otherwise treated you inappropriately, please alert one of the instructors or TAs.
  • Refrain from advocating for, or encouraging, negative behavior. And, if someone asks you to stop, then stop. Alert an instructor or TA if you notice a dangerous situation, someone in distress, or violations of this code of conduct, no matter how inconsequential it seems.

Weekly Schedule

Week  Date   Instructor     Topic
1     9/13   Mike Schatz    Genome Assembly
2     9/20   Mike Sauria    Genome Contents
3     9/27   Rajiv McCoy    Variant Discovery and Genotyping
4     10/4   Frederick Tan  Project Work + Demo
5     10/11  Mike Sauria    RNA-seq: Quantitation, Dimension Reduction, Clustering
-     10/18  No Class       Retreat
6     10/25  Rajiv McCoy    RNA-seq: Differential Expression <-- BLC5015 (basement!)
7     11/1   Frederick Tan  Single-cell RNA-seq
8     11/8   Rajiv McCoy    Project Work + Demo
9     11/15  Mike Sauria    TBD
10    11/22  Frederick Tan  TBD
-     11/29  No Class       Thanksgiving
11    12/6                  Final Presentations