Bioconductor case studies / Florian Hahne ... [et al.]. -- New York : Springer, 2008. – (58.17115/H148)
Contents
Preface
List of Contributors
1 The ALL Dataset
1.1 Introduction
1.2 The ALL data
1.3 Data subsetting
1.4 Nonspecific filtering
1.5 BCR/ABL ALL1/AF4 subset
2 R and Bioconductor Introduction
2.1 Finding help in R
2.2 Working with packages
2.3 Some basic R
2.4 Structures for genomic data
2.5 Graphics
3 Processing Affymetrix Expression Data
3.1 The input data: CEL files
3.2 Quality assessment
3.3 Preprocessing
3.4 Ranking and filtering probe sets
3.5 Advanced preprocessing
4 Two-Color Arrays
4.1 Introduction
4.2 Data import
4.3 Image plots
4.4 Normalization
4.5 Differential expression
5 Fold-Changes, Log-Ratios, Background Correction, Shrinkage Estimation, and Variance Stabilization
5.1 Fold-changes and (log-)ratios
5.2 Background-correction and generalized logarithm
5.3 Calling VSN
5.4 How does VSN work?
5.5 Robust fitting and the "most genes not differentially expressed" assumption
5.6 Single-color normalization
5.7 The interpretation of glog-ratios
5.8 Reference normalization
6 Easy Differential Expression
6.1 Example data
6.2 Nonspecific filtering
6.3 Differential expression
6.4 Multiple testing correction
7 Differential Expression
7.1 Motivation
7.2 Nonspecific filtering
7.3 Differential expression
7.4 Multiple testing
7.5 Moderated test statistics and the limma package
7.6 Gene selection by Receiver Operator Characteristic (ROC)
7.7 When power increases
8 Annotation and Metadata
8.1 Our data
8.2 Multiple probe sets per gene
8.3 Categories and overrepresentation
8.4 Working with GO
8.5 Other annotations available
8.6 biomaRt
8.7 Database versions of annotation packages
9 Supervised Machine Learning
9.1 Introduction
9.2 The example dataset
9.3 Feature selection and standardization
9.4 Selecting a distance
9.5 Machine learning
9.6 Cross-validation
9.7 Random forests
9.8 Multigroup classification
10 Unsupervised Machine Learning
10.1 Preliminaries
10.2 Distances
10.3 How many clusters?
10.4 Hierarchical clustering
10.5 Partitioning methods
10.6 Self-organizing maps
10.7 Hopach
10.8 Silhouette plots
10.9 Exploring transformations
10.10 Remarks
11 Using Graphs for Interactome Data
11.1 Introduction
11.2 Exploring the protein interaction graph
11.3 The co-expression graph
11.4 Testing the association between physical interaction and coexpression
11.5 Some harder problems
11.6 Reading PSI-25 XML files from IntAct with the Rintact package
12 Graph Layout
12.1 Introduction
12.2 Layout and rendering using Rgraphviz
12.3 Directed graphs
12.4 Subgraphs
12.5 Tooltips and hyperlinks on graphs
13 Gene Set Enrichment Analysis
13.1 Introduction
13.2 Data analysis
13.3 Identifying and assessing the effects of overlapping gene sets
14 Hypergeometric Testing Used for Gene Set Enrichment Analysis
14.1 Introduction
14.2 The basic problem
14.3 Preprocessing and inputs
14.4 Outputs and result summarization
14.5 The conditional hypergeometric test
14.6 Other collections of gene sets
15 Solutions to Exercises
2 R and Bioconductor Introduction
3 Processing Affymetrix Expression Data
4 Two-Color Arrays
5 Fold-Changes, Log-Ratios, Background Correction, Shrinkage Estimation, and Variance Stabilization
6 Easy Differential Expression
7 Differential Expression
8 Annotation and Metadata
9 Supervised Machine Learning
10 Unsupervised Machine Learning
11 Using Graphs for Interactome Data
12 Graph Layout
13 Gene Set Enrichment Analysis
14 Hypergeometric Testing Used for Gene Set Enrichment Analysis
References
Index