AS.020.617 Fall 2023
CMDB Quantitative Biology Lab

Logistics & Contact Information

Instructors

Instructor       Email
Rajiv McCoy      rajiv.mccoy@jhu.edu
Michael Sauria   msauria1@jhu.edu

Teaching Assistants

Teaching Assistant    Email
Andrew Bortvin        abortvi2@jhu.edu
Sadhana Chidambaran   sadhana.chidambaran@jhu.edu
Dylan Taylor          dtaylo95@jhu.edu

Class Meetings

Fridays, Sept 15, 2023 - Dec 8, 2023
10:00 A.M. ET - 12:30 P.M. ET, BLC 5015

Course Website

http://bxlab.github.io/cmdb-quantbio/

Course Description

This course builds upon the foundations of Quantitative Biology Bootcamp, reinforcing and expanding upon mathematical and computational methods for analysis of biological data. The weekly 2.5-hour meetings are organized as active-learning modules focused on diverse areas of genomics. Students perform guided research with real genomic data, uploading their results and code to repositories where they receive feedback.

Topics are integrated with the curricula for concurrent core courses in molecular biology and cell biology, with datasets and analysis goals aimed at diverse topics in these fields. Examples of such topics include:

  • genome assembly and variant discovery, problems necessary for a wide array of downstream genomic analyses
  • genetic association studies, aimed at uncovering the genetic basis of phenotypic variation for complex traits
  • sequence alignment, a long-standing computational challenge that is foundational to evolutionary studies
  • interpretation of functional genomic data, such as ChIP-seq and bisulfite-seq, which provide insight into mechanisms of gene regulation

Upon completing the course, students should have the background to develop reproducible bioinformatic workflows tailored to their research questions, as well as familiarity with resources for expanding upon these skills.

Learning Goals

Upon completion of this course, students will be able to:

  • Design workflows for common genomic data types, including developing tractable questions, organizing and preprocessing data, developing analytic pipelines, and implementing statistical models.
  • Describe and implement common probability distributions, statistical tests, and correction for multiple hypothesis testing. Students will be able to identify confounding factors, as well as methods for their diagnosis and mitigation.
  • Maintain a GitHub repository with analysis code, output, and documentation.
  • Describe key algorithms and data structures that are commonly encountered in computational biology, along with when best to use them.

Study Materials

This course does not have a required text. TAs have created a short online textbook covering the fundamentals of Python and Git. Any lecture notes or slides will be made available on the course website.

Course Format

Attendance

Due to the interactive nature of this course, students should not participate in any other meetings, courses, or lab work during the course meeting time. Please be on time for course activities, and budget extra time in anticipation of technical delays. Your attendance is expected during all synchronous sessions, and your effort is also expected during asynchronous sessions. Please let us know about any emergencies, family responsibilities, illness, etc. that may prevent attendance, and we will work to accommodate reasonable requests.

Asking for Help

We both expect and encourage you to ask questions and request help throughout this course. Please ask about anything, whenever you need to.

Googling

Googling is always an acceptable way to find answers or help, and we encourage you to utilize it extensively. If you adopt a solution following a Google search, make sure you understand what you incorporate, rather than just copy/pasting without comprehension of the logic or code. Please see the Academic Integrity & Ethics syllabus section for more on this.

You may be familiar with ChatGPT and other large language models. After trying each problem/assignment/task on your own, if you’re still running into issues, feel free to use ChatGPT as you would any other online resource (Google, Stack Overflow, etc.). Learning how to succinctly describe exactly what you want to accomplish is a skill in itself, so this can be good practice. If you find code that seems to work (e.g., from Google) but you’re not sure exactly how it works, you can also paste it into ChatGPT and ask it to explain what’s happening. As always, please do not submit any code unless you fully understand how it works; instead, flag it and ask a TA for assistance. Be aware that ChatGPT might confidently offer an answer that is not correct, so always check the output on your own.

Grading

Weekly exercise solutions (Jupyter notebooks, scripts, results, etc.) should be pushed to the student’s qbb2023-answers GitHub repository within a week after the assignment is posted. The grading for this course is based on reasonable completion. For each weekly assignment, students will be advised which exercises, if any, are advanced and therefore not required for submission.

TAs will verify that the student’s submitted work each week shows a reasonable level of individual effort. Letter grades will be assigned in line with the level of completion.

General guidelines for letter grade assignments:

  • A (excellent): 90-100% completion
  • B (passing): 80-90% completion
  • C or lower (not passing): <80% completion

Academic Integrity & Ethics

Academic and scientific institutions and research depend on honesty and integrity. You should complete your own lunch and homework exercises, and your work should not plagiarize others – including group partners, presenters, and strangers posting to online forums or blogs. You and your partners should be working together, but each person should be writing and turning in unique, individualized code. Note that your scripts and analysis may follow the same logical steps and even share tidbits of the same code, but no one person should be writing the solution that the whole group uses character for character. Additionally, you should understand every line of code you write, are given and use, or find online and incorporate. If asked to, you should be able to explain exactly what your code is doing. Another aspect is properly acknowledging the source of borrowed code. You can cultivate understanding and acknowledge sources by writing both inline and multiline comments (which is a terrific practice in general). Relatedly, don’t give someone code to copy and paste. Make sure any recipient can explain back to you any code you share. See the University’s guidelines on plagiarism for details and contact a TA with any questions.

Code of Conduct

We are committed to maintaining a welcoming, inclusive, and harassment-free environment for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, religion (or lack thereof), political beliefs/leanings, or technology choices. We do not tolerate harassment in any form. This code of conduct applies to all course participants, including instructors and TAs, and to all modes of interaction and communication.

All class participants agree to:

  • Be considerate and polite. Quant Lab can be a daunting and sometimes frustrating experience - we’re showing you a lot in a short amount of time! In addition, the collaborative nature of this course means it is essential that students are respectful of each other’s questions and points of confusion. Remember that you are all coming into this program with a wide range of coding backgrounds. A community where people feel comfortable is most conducive to learning.
  • Work together cooperatively. Be kind when giving feedback on your classmates’ code, and make an effort to help if they are stuck. You will gain much more from explaining concepts to someone else than you will from rushing through the assignments on your own.
  • Refrain from demeaning, discriminatory, or harassing behavior and speech. Harassment includes, but is not limited to: deliberate intimidation; stalking; unwanted photography or recording; sustained or willful disruption of talks or other events; inappropriate physical contact; use of sexual or discriminatory imagery, comments, or jokes; and unwelcome sexual attention. If you feel that someone has harassed you or otherwise treated you inappropriately, please alert one of the instructors or TAs.
  • Refrain from advocating for, or encouraging, negative behavior. And, if someone asks you to stop, then stop. Alert an instructor or TA if you notice a dangerous situation, someone in distress, or violations of this code of conduct, no matter how inconsequential it seems.

Weekly Schedule

Week  Date   Instructor    Topic
1     9/15   Rajiv McCoy   Linear regression
2     9/22   Mike Schatz   Genome assembly
3     9/29   Dylan Taylor  Sequence alignment
4     10/6   Mike Sauria   ChIP-seq
5     10/13  Rajiv McCoy   Variant calling
-     10/20  No Class      Retreat
6     10/27  Rajiv McCoy   GWAS
7     11/3   Mike Sauria   Single-molecule sequencing and DNA modifications
8     11/10  Mike Sauria   3D Genome
9     11/17  Rajiv McCoy   Bulk RNA-seq
-     11/24  No Class      Thanksgiving Break
10    12/1   Rajiv McCoy   Metagenomics
11    12/8   Mike Sauria   Single-cell RNA-seq