Modern molecular biology generates large amounts of data, such as sequences, structures and expression data, that need different forms of statistical analysis and modelling to be properly interpreted. The fields of Bioinformatics and Computational Biology take this as their subject matter, and there is no sharp boundary between them. Bioinformatics has a more applied flavour, while Computational Biology is viewed as the study of the models, statistical methodology and algorithms needed to do bioinformatics analysis. This course aims to present core topics of these fields with an emphasis on modelling and computation.
Knowledge of elementary probability would be very helpful.
· Fundamental Data Structures in Biology: Sequences, Genes, Networks and RNA secondary structure.
· Stochastic Models of Sequence and Genome Evolution including models of single nucleotide/amino acid/codon evolution.
· Phylogenies: enumerating phylogenies, the probability of sequences related by a specified phylogeny, the minimal number of events needed to explain a data set (Parsimony).
· Likelihood and algorithms (Markov Chain Monte Carlo) for inference based on the likelihood.
· Software packages for sample-based inference.
· Alignment Algorithms: comparing two strings, comparing an arbitrary number of strings, and finding segments of high similarity in two strings.
· Common Patterns in a set of Sequences.
· Network Inference and Network Evolution.
· Detection of Recombinations in Sequences.
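As a flavour of the computational side of the topics above, the comparison of two strings listed under Alignment Algorithms can be sketched as a small dynamic programme. This is only an illustrative sketch, not course material: the function name `global_alignment_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example, and the recursion is the standard Needleman-Wunsch scheme for global alignment.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b.

    Sketch of the Needleman-Wunsch recursion with a linear gap penalty.
    The default scores are illustrative assumptions, not prescribed values.
    """
    # Row 0: aligning the empty prefix of a with the first j characters of b
    # costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning the first i characters of a with nothing costs
        # i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Either align a[i-1] with b[j-1], or put a gap in one string.
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # match or mismatch
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACT"))
```

Only two rows of the table are kept, so memory grows with the length of the second string while time grows with the product of the two lengths. Recovering the alignment itself, rather than just its score, would require keeping the full table for a traceback.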
Method of Assessment
It is proposed to assess this course by mini-project.
Teaching material and exercises can be found at:
http://www.stats.ox.ac.uk/research/genome/compbiol08