Bioinformatics Lectures

(b) indicates slides that contain primarily background information. (a) indicates “advanced” material. All slides (and errors) by Carl Kingsford unless noted. Often the material for a lecture was derived from some source material that is cited in each PDF file.

Introduction to Computers and Biology

  1. Introduction and Computational Successes
  2. Quick Biology Introduction (b)

Exact String Search

  1. Z-Algorithm
  2. Knuth-Morris-Pratt and Boyer-Moore (a)
  3. Seminumerical String Matching
  4. Multiple pattern search (a)

Dynamic Programming & Sequence Alignment

  1. Introduction to Dynamic Programming (b)
  2. More Dynamic Programming Examples: Subset Sum & Knapsack (b)
  3. Global Sequence Alignment
  4. Local Sequence Alignment
  5. General & Affine Gap Penalties [examples]
  6. Multiple Sequence Alignment
  7. Linear-space Sequence Alignment (a)
  8. “4 Russian’s Speedup” (a)

Genome Sequencing & Assembly

  1. DNA (Sanger) Sequencing & Lander-Waterman Statistics
  2. Introduction to Graphs (b)
  3. Genome Assembly Paradigms
  4. Comparing Whole Genomes with MUMmer

Data Structures for String Queries

  1. Introduction to Trees (b)
  2. (Optimal) Binary Search Trees (b)
  3. Suffix Tries & Trees [paper]
  4. Suffix Arrays (including Skew Algorithm) [examples]
  5. Applications of suffix trees
  6. Burrows-Wheeler Transform [examplespaper]
  7. Wavelet Trees (a)

Structural Biology

  1. RNA Folding [examples]
  2. Protein Folding
  3. Linear Programming (a)
  4. Side-chain Positioning (a)

Pattern Discovery

  1. Motif Finding with Gibbs Sampling [paper]
  2. Bacterial Gene Finding
  3. Hidden Markov Models and Eukaryotic Gene Finding [examples]
  4. Profile Hidden Markov Models


  1. Phylogenetics

Clustering Biological Networks

  1. Yeast 2-hybrid
  2. Other Interaction Experiments
  3. Function Prediction
  4. Network Modularity (a)
  5. Clustering via Minimum Spanning Trees
  6. Kernighan-Lin Graph Partitioning & Clustering via Distances (a)
  7. MCL and Other Graph Clustering (MCODE, RNSC) (a)

Aligning Biological Networks

  1. PathBLAST (a)
  2. Alignment via Color Coding (a)
  3. IsoRank (a)

Random Models of Network Evolution

  1. Models of Network Evolution (a)
  2. Attack vs. Failure Response (a)
  3. Network Motifs & Symmetry Breaking (a)

Computer Science Background

All slides by Carl Kingsford unless noted. Often the material for a lecture was derived from some source material that is cited in each PDF file.

Basic Data Structures

  1. Lists, Stacks, and Queues
  2. AVL Trees
  3. Splay Trees
  4. B-Trees
  5. Skip Lists
  6. Heaps

Other Data Structures

  1. RRR Compressed Bitvectors (a)

Interval & Spatial Data Structures

  1. Quadtrees (PR, MX, Point)
  2. kd-Trees
  3. Searching in kd-Trees
  4. Range Trees
  5. Interval Trees
  6. Finding Closest Pair of Points

Programming Languages

  1. Python
  2. C++

Network Flow Algorithms

  1. Network Flows
  2. Bipartite Matching
  3. Edge-disjoint paths
  4. Flows with Demands, etc.
  5. Shortest paths with negative weights

The Theory of NP-Completeness

  1. The P and NP Complexity Classes
  2. NP-completeness
  3. SAT, Coloring, TSP Problems
  4. 3-dimensional Matching, Subset Sum Problems