WELCOME TO CS 181, COMPUTATIONAL MOLECULAR BIOLOGY!

Course Information

The aim of this course is to provide an introduction to computational molecular biology. The course is organized into six chapters:

  1. Sequence Alignment
  2. Combinatorial Pattern Matching
  3. Phylogenetic Trees
  4. Hidden Markov Models
  5. Genome Assembly
  6. Genomic Privacy

Each chapter is devoted to a class of basic computational problems related to the analysis of DNA, RNA, and protein sequences and their molecular function. Our journey in each chapter is driven by a set of beautiful algorithms. A “beautiful” algorithm is one that is rigorous, practical, elegantly simple, and easy to implement. In addition to these beautiful algorithms, each chapter contains a Foundations section that gives a detailed presentation of the biological problems discussed as well as the theoretical computer science and statistical results that led to the invention of the algorithms. This class provides a serious introduction to the field of computational biology both for potential concentrators and for those who may take only a single course in the subject.

Historical note: CS181 was first taught at Brown 23 years ago by Professor Franco Preparata, before the completion of the Human Genome Project. This year’s offering is the 24th incarnation of this foundational course in computational biology. See the Resources page for a biology primer written by Prof. Preparata.

FAQ

Who takes the course? As an interdisciplinary course, CS181 attracts a diverse group of students. Past students have ranged from sophomores concentrating in Computer Science and Computational Biology through Ph.D. students in Computer Science, Applied Mathematics, and Biology. The course staff will do its best to ensure that all students have a chance to succeed. Please do not hesitate to talk to a member of the course staff if you have trouble deciding whether CS181 is a good fit for you.
What biology background is needed? There are no biology prerequisites, and no prior biology knowledge is assumed; the material that you need to know will be covered in class. Students whose backgrounds are in the life sciences, however, will be expected to dig deeper into the biology.
What computer science and mathematics background is needed? Officially, one of CS16, CS18, CS200, or CS19 (i.e. a yearlong introduction to computer science). This can be waived by the instructor (especially for life science students). Students in the course generally have some prior exposure to basic concepts of discrete math (graphs, recurrence relations), discrete probability (random variables, independence), and algorithms (big-O notation, pseudocode).
What programming background is needed? This is not a programming-heavy course, although there will be programming assignments. The goal of these assignments is to gain a deeper understanding of the algorithms by implementing them and testing them on real data. Thus, some rudimentary programming skills (arrays, loops, functions, etc.) are required. Any language can be used, but common languages like Python will make it easier for the TAs to help you.
I am experienced in molecular biology, but do not have any formal mathematical or computational training. Can I take the course? We attempt to make the course genuinely accessible for students without a computer science background. At the same time, all students in the class should be prepared to complete medium-scale programming assignments, learn some new mathematical concepts, and reason about algorithms in a rigorous manner. Please reach out to a member of the course staff if you are unsure of your background.
I am interested in learning how to analyze *-Seq data from my (advisor's) lab. Will this course help me? Possibly, but perhaps not in the way that you expect. The goal of CS181 is to teach the algorithmic concepts that underlie a wide variety of software used to analyze biological data, particularly in genetics, genomics, and proteomics. The course will not teach you how to use any particular biological software package. Rather, you will learn how this software works and, more importantly for the long term, how to think about biological problems in a computational way. Thus, when the latest and greatest technology for measuring DNA/RNA/protein is released in 5 or 10 years' time, you will have the algorithmic skills to work with its data without waiting for the rest of the community to develop tools. If your interests are more narrowly focused on a particular, near-term application, another course might be more appropriate.
Can I get graduate credit for this course? Yes! To get it, you will need to do all undergraduate coursework in the class plus a final research project defined in discussions with the professor. Work for the final project consists of (1) a piece of code implementing a new algorithm, analysis, or simulation, (2) a short written paper about your project and its algorithms and code, and (3) a comprehensive PowerPoint and a final project presentation to the class. Please email the professor for more information.