Dziuda, Darius M.
Data Mining for Genomics and Proteomics
Analysis of Gene and Protein Expression Data
Wiley Series on Methods and Applications

1st edition - July 2010
77.90 Euro
2010. 328 pages, hardcover
- Practitioner's guide -
ISBN-10: 0-470-16373-9
ISBN-13: 978-0-470-16373-3 - John Wiley & Sons

Price includes VAT, plus shipping costs.

Short description
Data Mining for Genomics and Proteomics uses pragmatic examples and a complete case study to demonstrate step-by-step how biomedical studies can be used to maximize the chance of extracting new and useful biomedical knowledge from data. It is an excellent resource for students and professionals involved with gene or protein expression data in a variety of settings.

From the contents
Preface.

Acknowledgments.

1 Introduction.

1.1 Basic Terminology.

1.1.1 The Central Dogma of Molecular Biology.

1.1.2 Genome.

1.1.3 Proteome.

1.1.4 DNA (Deoxyribonucleic Acid).

1.1.5 RNA (Ribonucleic Acid).

1.1.6 mRNA (messenger RNA).

1.1.7 Genetic Code.

1.1.8 Gene.

1.1.9 Gene Expression and the Gene Expression Level.

1.1.10 Protein.

1.2 Overlapping Areas of Research.

1.2.1 Genomics.

1.2.2 Proteomics.

1.2.3 Bioinformatics.

1.2.4 Transcriptomics and Other -omics.

1.2.5 Data Mining.

2 Basic Analysis of Gene Expression Microarray Data.

2.1 Introduction.

2.2 Microarray Technology.

2.2.1 Spotted Microarrays.

2.2.2 Affymetrix GeneChip® Microarrays.

2.2.3 Bead-Based Microarrays.

2.3 Low-Level Preprocessing of Affymetrix Microarrays.

2.3.1 MAS5.

2.3.2 RMA.

2.3.3 GCRMA.

2.3.4 PLIER.

2.4 Public Repositories of Microarray Data.

2.4.1 Microarray Gene Expression Data Society (MGED) Standards.

2.4.2 Public Databases.

2.4.2.1 Gene Expression Omnibus (GEO).

2.4.2.2 ArrayExpress.

2.5 Gene Expression Matrix.

2.5.1 Elements of Gene Expression Microarray Data Analysis.

2.6 Additional Preprocessing, Quality Assessment, and Filtering.

2.6.1 Quality Assessment.

2.6.2 Filtering.

2.7 Basic Exploratory Data Analysis.

2.7.1 t Test.

2.7.1.1 t Test for Equal Variances.

2.7.1.2 t Test for Unequal Variances.

2.7.2 ANOVA F Test.

2.7.3 SAM t Statistic.

2.7.4 Limma.

2.7.5 Adjustment for Multiple Comparisons.

2.7.5.1 Single-Step Bonferroni Procedure.

2.7.5.2 Single-Step Sidak Procedure.

2.7.5.3 Step-Down Holm Procedure.

2.7.5.4 Step-Up Benjamini and Hochberg Procedure.

2.7.5.5 Permutation Based Multiplicity Adjustment.

2.8 Unsupervised Learning (Taxonomy-Related Analysis).

2.8.1 Cluster Analysis.

2.8.1.1 Measures of Similarity or Distance.

2.8.1.2 K-Means Clustering.

2.8.1.3 Hierarchical Clustering.

2.8.1.4 Two-Way Clustering and Related Methods.

2.8.2 Principal Component Analysis.

2.8.3 Self-Organizing Maps.

Exercises.

3 Biomarker Discovery and Classification.

3.1 Overview.

3.1.1 Gene Expression Matrix . . . Again.

3.1.2 Biomarker Discovery.

3.1.3 Classification Systems.

3.1.3.1 Parametric and Nonparametric Learning Algorithms.

3.1.3.2 Terms Associated with Common Assumptions Underlying Parametric Learning Algorithms.

3.1.3.3 Visualization of Classification Results.

3.1.4 Validation of the Classification Model.

3.1.4.1 Reclassification.

3.1.4.2 Leave-One-Out and K-Fold Cross-Validation.

3.1.4.3 External and Internal Cross-Validation.

3.1.4.4 Holdout Method of Validation.

3.1.4.5 Ensemble-Based Validation (Using Out-of-Bag Samples).

3.1.4.6 Validation on an Independent Data Set.

3.1.5 Reporting Validation Results.

3.1.5.1 Binary Classifiers.

3.1.5.2 Multiclass Classifiers.

3.1.6 Identifying Biological Processes Underlying the Class Differentiation.

3.2 Feature Selection.

3.2.1 Introduction.

3.2.2 Univariate Versus Multivariate Approaches.

3.2.3 Supervised Versus Unsupervised Methods.

3.2.4 Taxonomy of Feature Selection Methods.

3.2.4.1 Filters, Wrappers, Hybrid, and Embedded Models.

3.2.4.2 Strategy: Exhaustive, Complete, Sequential, Random, and Hybrid Searches.

3.2.4.3 Subset Evaluation Criteria.

3.2.4.4 Search-Stopping Criteria.

3.2.5 Feature Selection for Multiclass Discrimination.

3.2.6 Regularization and Feature Selection.

3.2.7 Stability of Biomarkers.

3.3 Discriminant Analysis.

3.3.1 Introduction.

3.3.2 Learning Algorithm.

3.3.3 A Stepwise Hybrid Feature Selection with T².

3.4 Support Vector Machines.

3.4.1 Hard-Margin Support Vector Machines.

3.4.2 Soft-Margin Support Vector Machines.

3.4.3 Kernels.

3.4.4 SVMs and Multiclass Discrimination.

3.4.4.1 One-Versus-the-Rest Approach.

3.4.4.2 Pairwise Approach.

3.4.4.3 All-Classes-Simultaneously Approach.

3.4.5 SVMs and Feature Selection: Recursive Feature Elimination.

3.4.6 Summary.

3.5 Random Forests.

3.5.1 Introduction.

3.5.2 Random Forests Learning Algorithm.

3.5.3 Random Forests and Feature Selection.

3.5.4 Summary.

3.6 Ensemble Classifiers, Bootstrap Methods, and The Modified Bagging Schema.

3.6.1 Ensemble Classifiers.

3.6.1.1 Parallel Approach.

3.6.1.2 Serial Approach.

3.6.1.3 Ensemble Classifiers and Biomarker Discovery.

3.6.2 Bootstrap Methods.

3.6.3 Bootstrap and Linear Discriminant Analysis.

3.6.4 The Modified Bagging Schema.

3.7 Other Learning Algorithms.

3.7.1 k-Nearest Neighbor Classifiers.

3.7.2 Artificial Neural Networks.

3.7.2.1 Perceptron.

3.7.2.2 Multilayer Feedforward Neural Networks.

3.7.2.3 Training the Network (Supervised Learning).

3.8 Eight Commandments of Gene Expression Analysis (for Biomarker Discovery).

4 The Informative Set of Genes.

4.1 Introduction.

4.2 Definitions.

4.3 The Method.

4.3.1 Identification of the Informative Set of Genes.

4.3.2 Primary Expression Patterns of the Informative Set of Genes.

4.3.3 The Most Frequently Used Genes of the Primary Expression Patterns.

4.4 Using the Informative Set of Genes to Identify Robust Multivariate Biomarkers.

4.5 Summary.

5 Analysis of Protein Expression Data.

5.1 Introduction.

5.2 Protein Chip Technology.

5.2.1 Antibody Microarrays.

5.2.2 Peptide Microarrays.

5.2.3 Protein Microarrays.

5.2.4 Reverse Phase Microarrays.

5.3 Two-Dimensional Gel Electrophoresis.

5.4 MALDI-TOF and SELDI-TOF Mass Spectrometry.

5.4.1 MALDI-TOF Mass Spectrometry.

5.4.2 SELDI-TOF Mass Spectrometry.

5.5 Preprocessing of Mass Spectrometry Data.

5.5.1 Introduction.

5.5.2 Elements of Preprocessing of SELDI-TOF Mass Spectrometry Data.

5.5.2.1 Quality Assessment.

5.5.2.2 Calibration.

5.5.2.3 Baseline Correction.

5.5.2.4 Noise Reduction and Smoothing.

5.5.2.5 Peak Detection.

5.5.2.6 Intensity Normalization.

5.5.2.7 Peak Alignment Across Spectra.

5.6 Analysis of Protein Expression Data.

5.6.1 Additional Preprocessing.

5.6.2 Basic Exploratory Data Analysis.

5.6.3 Unsupervised Learning.

5.6.4 Supervised Learning--Feature Selection and Biomarker Discovery.

5.6.5 Supervised Learning--Classification Systems.

5.7 Associating Biomarker Peaks with Proteins.

5.7.1 Introduction.

5.7.2 The Universal Protein Resource (UniProt).

5.7.3 Search Programs.

5.7.4 Tandem Mass Spectrometry.

5.8 Summary.

6 Sketches for Selected Exercises.

6.1 Introduction.

6.2 Multiclass Discrimination (Exercise 3.2).

6.2.1 Data Set Selection, Downloading, and Consolidation.

6.2.2 Filtering Probe Sets.

6.2.3 Designing a Multistage Classification Schema.

6.3 Identifying the Informative Set of Genes (Exercises 4.2-4.6).

6.3.1 The Informative Set of Genes.

6.3.2 Primary Expression Patterns of the Informative Set.

6.3.3 The Most Frequently Used Genes of the Primary Expression Patterns.

6.4 Using the Informative Set of Genes to Identify Robust Multivariate Markers (Exercise 4.8).

6.5 Validating Biomarkers on an Independent Test Data Set (Exercise 4.8).

6.6 Using a Training Set that Combines More than One Data Set (Exercises 3.5 and 4.1-4.8).

6.6.1 Combining the Two Data Sets into a Single Training Set.

6.6.2 Filtering Probe Sets of the Combined Data.

6.6.3 Assessing the Discriminatory Power of the Biomarkers and Their Generalization.

6.6.4 Identifying the Informative Set of Genes.

6.6.5 Primary Expression Patterns of the Informative Set of Genes.

6.6.6 The Most Frequently Used Genes of the Primary Expression Patterns.

6.6.7 Using the Informative Set of Genes to Identify Robust Multivariate Markers.

6.6.8 Validating Biomarkers on an Independent Test Data Set.

References.

Index.


 
©2012 Wiley-VCH Verlag GmbH & Co. KGaA - Site operator
http://www.wiley-vch.de - mailto: info@wiley-vch.de