Introduction to computational proteomics / Golan Yona. - Boca Raton : CRC Press, Taylor & Francis Group, 2011. - (58.17421/Y55)
Contents
I The Basics
1 What Is Computational Proteomics?
1.1 The complexity of living organisms
1.2 Proteomics in the modern era
1.3 The main challenges in computational proteomics
2 Basic Notions in Molecular Biology
2.1 The cell structure of organisms
2.2 It all starts from the DNA
2.3 Proteins
2.4 From DNA to proteins
2.5 Protein folding - from sequence to structure
2.6 Evolution and relational classes in the protein space
2.7 Problems
3 Sequence Comparison
3.1 Introduction
3.2 Alignment of sequences
3.3 Heuristic algorithms for sequence comparison
3.4 Probability and statistics of sequence alignments
3.5 Scoring matrices and gap penalties
3.6 Distance and pseudo-distance functions for proteins
3.7 Further reading
3.8 Conclusions
3.9 Appendix - non-linear gap penalty functions
3.10 Appendix - implementation of BLAST and FASTA
3.11 Appendix - performance evaluation
3.12 Appendix - basic concepts in probability
3.13 Appendix - metrics and real normed spaces
3.14 Problems
4 Multiple Sequence Alignment, Profiles and Partial Order Graphs
4.1 Dynamic programming in N dimensions
4.2 Classical heuristic methods
4.3 MSA representation and scoring
4.4 Iterative and progressive alignment
4.5 Transitive alignment
4.6 Partial order alignment
4.7 Further reading
4.8 Conclusions
4.9 Problems
5 Motif Discovery
5.1 Introduction
5.2 Model-based algorithms
5.3 Searching for good models
5.4 Combinatorial approaches
5.5 Further reading
5.6 Conclusions
5.7 Appendix - the Expectation-Maximization algorithm
5.8 Problems
6 Markov Models of Protein Families
6.1 Introduction
6.2 Markov models
6.3 Main applications of hidden Markov models
6.4 Higher order models, codes and compression
6.5 Further reading
6.6 Conclusions
6.7 Problems
7 Classifiers and Kernels
7.1 Generative models vs. discriminative models
7.2 Classifiers and discriminant functions
7.3 Applying SVMs to protein classification
7.4 Decision trees
7.5 Further reading
7.6 Conclusions
7.7 Appendix - estimating the significance of a split
7.8 Problems
8 Protein Structure Analysis
8.1 Introduction
8.2 Structure prediction - the protein folding problem
8.3 Structure comparison
8.4 Generalized sequence profiles - integrating secondary structure with sequence information
8.5 Further reading
8.6 Conclusions
8.7 Appendix - minimizing RMSd
8.8 Problems
9 Protein Domains
9.1 Introduction
9.2 Domain detection
9.3 Learning domain boundaries from multiple features
9.4 Testing domain predictions
9.5 Multi-domain architectures
9.6 Further reading
9.7 Conclusions
9.8 Appendix - domain databases
9.9 Problems
II Putting All the Pieces Together
10 Clustering and Classification
10.1 Introduction
10.2 Clustering methods
10.3 Vector-space clustering algorithms
10.4 Graph-based clustering algorithms
10.5 Cluster validation and assessment
10.6 Clustering proteins
10.7 Further reading
10.8 Conclusions
10.9 Appendix - cross-validation tests
10.10 Problems
11 Embedding Algorithms and Vectorial Representations
11.1 Introduction
11.2 Structure preserving embedding
11.3 Setting the dimension of the host space
11.4 Vectorial representations
11.5 Further reading
11.6 Conclusions
11.7 Problems
12 Analysis of Gene Expression Data
12.1 Introduction
12.2 Microarrays
12.3 Analysis of individual genes
12.4 Pairwise analysis
12.5 Cluster analysis and class discovery
12.6 Protein arrays
12.7 Further reading
12.8 Conclusions
12.9 Problems
13 Protein-Protein Interactions
13.1 Introduction
13.2 Experimental detection of protein interactions
13.3 Prediction of protein-protein interactions
13.4 Interaction networks
13.5 Further reading
13.6 Conclusions
13.7 Appendix - DNA amplification and protein expression
13.8 Appendix - the Pearson correlation
13.9 Problems
14 Cellular Pathways
14.1 Introduction
14.2 Metabolic pathways
14.3 Pathway prediction
14.4 Regulatory networks: modules and regulation programs
14.5 Pathway networks and the minimal cell
14.6 Further reading
14.7 Conclusions
14.8 Problems
15 Learning Gene Networks with Bayesian Networks
15.1 Introduction
15.2 Computing the likelihood of observations
15.3 Probabilistic inference
15.4 Learning the parameters of a Bayesian network
15.5 Learning the structure of a Bayesian network
15.6 Learning Bayesian networks from microarray data
15.7 Further reading
15.8 Conclusions
15.9 Problems
References
Conference Abbreviations
Acronyms
Index