Wiley-VCH


Regular price: $139.25, In Stock

Pattern Recognition in Computational Molecular Biology

Techniques and Approaches

Elloumi, Mourad / Iliopoulos, Costas / Wang, Jason T. L. / Zomaya, Albert Y.

Wiley Series in Bioinformatics


1st Edition, February 2016
656 Pages, Hardcover
Wiley & Sons Ltd

ISBN: 978-1-118-89368-5
John Wiley & Sons


A comprehensive overview of high-performance pattern recognition techniques and approaches to Computational Molecular Biology

This book surveys the development of pattern recognition techniques and approaches in Computational Molecular Biology. Offering broad coverage of the field, the authors present fundamental and technical information on these techniques and approaches and discuss the problems associated with them. The text consists of twenty-nine chapters organized into seven parts: Pattern Recognition in Sequences, Pattern Recognition in Secondary Structures, Pattern Recognition in Tertiary Structures, Pattern Recognition in Quaternary Structures, Pattern Recognition in Microarrays, Pattern Recognition in Phylogenetic Trees, and Pattern Recognition in Biological Networks.
* Surveys the development of techniques and approaches for pattern recognition in biomolecular data
* Discusses pattern recognition in primary, secondary, tertiary and quaternary structures, as well as microarrays, phylogenetic trees and biological networks
* Includes case studies and examples to further illustrate the concepts discussed in the book
Pattern Recognition in Computational Molecular Biology: Techniques and Approaches is a reference for practitioners and professional researchers in Computer Science, Life Science, and Mathematics. This book also serves as supplementary reading for graduate students and young researchers interested in Computational Molecular Biology.

LIST OF CONTRIBUTORS xxi

PREFACE xxvii

I PATTERN RECOGNITION IN SEQUENCES 1

1 COMBINATORIAL HAPLOTYPING PROBLEMS 3
Giuseppe Lancia

1.1 Introduction / 3

1.2 Single Individual Haplotyping / 5

1.3 Population Haplotyping / 12

References / 23

2 ALGORITHMIC PERSPECTIVES OF THE STRING BARCODING PROBLEMS 28
Sima Behpour and Bhaskar DasGupta

2.1 Introduction / 28

2.2 Summary of Algorithmic Complexity Results for Barcoding Problems / 32

2.3 Entropy-Based Information Content Technique for Designing Approximation Algorithms for String Barcoding Problems / 34

2.4 Techniques for Proving Inapproximability Results for String Barcoding Problems / 36

2.5 Heuristic Algorithms for String Barcoding Problems / 39

2.6 Conclusion / 40

Acknowledgments / 41

References / 41

3 ALIGNMENT-FREE MEASURES FOR WHOLE-GENOME COMPARISON 43
Matteo Comin and Davide Verzotto

3.1 Introduction / 43

3.2 Whole-Genome Sequence Analysis / 44

3.3 Underlying Approach / 47

3.4 Experimental Results / 54

3.5 Conclusion / 61

Author's Contributions / 62

Acknowledgments / 62

References / 62

4 A MAXIMUM LIKELIHOOD FRAMEWORK FOR MULTIPLE SEQUENCE LOCAL ALIGNMENT 65
Chengpeng Bi

4.1 Introduction / 65

4.2 Multiple Sequence Local Alignment / 67

4.3 Motif Finding Algorithms / 70

4.4 Time Complexity / 75

4.5 Case Studies / 75

4.6 Conclusion / 80

References / 81

5 GLOBAL SEQUENCE ALIGNMENT WITH A BOUNDED NUMBER OF GAPS 83
Carl Barton, Tomás Flouri, Costas S. Iliopoulos, and Solon P. Pissis

5.1 Introduction / 83

5.2 Definitions and Notation / 85

5.3 Problem Definition / 87

5.4 Algorithms / 88

5.5 Conclusion / 94

References / 95

II PATTERN RECOGNITION IN SECONDARY STRUCTURES 97

6 A SHORT REVIEW ON PROTEIN SECONDARY STRUCTURE PREDICTION METHODS 99
Renxiang Yan, Jiangning Song, Weiwen Cai, and Ziding Zhang

6.1 Introduction / 99

6.2 Representative Protein Secondary Structure Prediction Methods / 102

6.3 Evaluation of Protein Secondary Structure Prediction Methods / 106

6.4 Conclusion / 110

Acknowledgments / 110

References / 111

7 A GENERIC APPROACH TO BIOLOGICAL SEQUENCE SEGMENTATION PROBLEMS: APPLICATION TO PROTEIN SECONDARY STRUCTURE PREDICTION 114
Yann Guermeur and Fabien Lauer

7.1 Introduction / 114

7.2 Biological Sequence Segmentation / 115

7.3 MSVMpred / 117

7.4 Postprocessing with a Generative Model / 119

7.5 Dedication to Protein Secondary Structure Prediction / 120

7.6 Conclusions and Ongoing Research / 125

Acknowledgments / 126

References / 126

8 STRUCTURAL MOTIF IDENTIFICATION AND RETRIEVAL: A GEOMETRICAL APPROACH 129
Virginio Cantoni, Marco Ferretti, Mirto Musci, and Nahumi Nugrahaningsih

8.1 Introduction / 129

8.2 A Few Basic Concepts / 130

8.3 State of the Art / 135

8.4 A Novel Geometrical Approach to Motif Retrieval / 138

8.5 Implementation Notes / 149

8.6 Conclusions and Future Work / 151

Acknowledgment / 152

References / 152

9 GENOME-WIDE SEARCH FOR PSEUDOKNOTTED NONCODING RNAs: A COMPARATIVE STUDY 155
Meghana Vasavada, Kevin Byron, Yang Song, and Jason T.L. Wang

9.1 Introduction / 155

9.2 Background / 156

9.3 Methodology / 157

9.4 Results and Interpretation / 161

9.5 Conclusion / 162

References / 163

III PATTERN RECOGNITION IN TERTIARY STRUCTURES 165

10 MOTIF DISCOVERY IN PROTEIN 3D-STRUCTURES USING GRAPH MINING TECHNIQUES 167
Wajdi Dhifli and Engelbert Mephu Nguifo

10.1 Introduction / 167

10.2 From Protein 3D-Structures to Protein Graphs / 169

10.3 Graph Mining / 172

10.4 Subgraph Mining / 173

10.5 Frequent Subgraph Discovery / 173

10.6 Feature Selection / 179

10.7 Feature Selection for Subgraphs / 180

10.8 Discussion / 183

10.9 Conclusion / 185

Acknowledgments / 185

References / 186

11 FUZZY AND UNCERTAIN LEARNING TECHNIQUES FOR THE ANALYSIS AND PREDICTION OF PROTEIN TERTIARY STRUCTURES 190
Chinua Umoja, Xiaxia Yu, and Robert Harrison

11.1 Introduction / 190

11.2 Genetic Algorithms / 192

11.3 Supervised Machine Learning Algorithm / 201

11.4 Fuzzy Application / 204

11.5 Conclusion / 207

References / 208

12 PROTEIN INTER-DOMAIN LINKER PREDICTION 212
Maad Shatnawi, Paul D. Yoo, and Sami Muhaidat

12.1 Introduction / 212

12.2 Protein Structure Overview / 213

12.3 Technical Challenges and Open Issues / 214

12.4 Prediction Assessment / 215

12.5 Current Approaches / 216

12.6 Domain Boundary Prediction Using Enhanced General Regression Network / 220

12.7 Inter-Domain Linkers Prediction Using Compositional Index and Simulated Annealing / 227

12.8 Conclusion / 232

References / 233

13 PREDICTION OF PROLINE CIS-TRANS ISOMERIZATION 236
Paul D. Yoo, Maad Shatnawi, Sami Muhaidat, Kamal Taha, and Albert Y. Zomaya

13.1 Introduction / 236

13.2 Methods / 238

13.3 Model Evaluation and Analysis / 243

13.4 Conclusion / 245

References / 245

IV PATTERN RECOGNITION IN QUATERNARY STRUCTURES 249

14 PREDICTION OF PROTEIN QUATERNARY STRUCTURES 251
Akbar Vaseghi, Maryam Faridounnia, Soheila Shokrollahzade, Samad Jahandideh, and Kuo-Chen Chou

14.1 Introduction / 251

14.2 Protein Structure Prediction / 255

14.3 Template-Based Predictions / 257

14.4 Critical Assessment of Protein Structure Prediction / 258

14.5 Quaternary Structure Prediction / 258

14.6 Conclusion / 261

Acknowledgments / 261

References / 261

15 COMPARISON OF PROTEIN QUATERNARY STRUCTURES BY GRAPH APPROACHES 266
Sheng-Lung Peng and Yu-Wei Tsay

15.1 Introduction / 266

15.2 Similarity in the Graph Model / 268

15.3 Measuring Structural Similarity via MCES / 272

15.4 Protein Comparison via Graph Spectra / 279

15.5 Conclusion / 287

References / 287

16 STRUCTURAL DOMAINS IN PREDICTION OF BIOLOGICAL PROTEIN-PROTEIN INTERACTIONS 291
Mina Maleki, Michael Hall, and Luis Rueda

16.1 Introduction / 291

16.2 Structural Domains / 293

16.3 The Prediction Framework / 293

16.4 Feature Extraction and Prediction Properties / 294

16.5 Feature Selection / 299

16.6 Classification / 301

16.7 Evaluation and Analysis / 304

16.8 Results and Discussion / 304

16.9 Conclusion / 309

References / 310

V PATTERN RECOGNITION IN MICROARRAYS 315

17 CONTENT-BASED RETRIEVAL OF MICROARRAY EXPERIMENTS 317
Hasan Oğul

17.1 Introduction / 317

17.2 Information Retrieval: Terminology and Background / 318

17.3 Content-Based Retrieval / 320

17.4 Microarray Data and Databases / 322

17.5 Methods for Retrieving Microarray Experiments / 324

17.6 Similarity Metrics / 327

17.7 Evaluating Retrieval Performance / 329

17.8 Software Tools / 330

17.9 Conclusion and Future Directions / 331

Acknowledgment / 332

References / 332

18 EXTRACTION OF DIFFERENTIALLY EXPRESSED GENES IN MICROARRAY DATA 335
Tiratha Raj Singh, Brigitte Vannier, and Ahmed Moussa

18.1 Introduction / 335

18.2 From Microarray Image to Signal / 336

18.3 Microarray Signal Analysis / 337

18.4 Algorithms for DE Gene Selection / 339

18.5 Gene Ontology Enrichment and Gene Set Enrichment Analysis / 343

18.6 Conclusion / 345

References / 345

19 CLUSTERING AND CLASSIFICATION TECHNIQUES FOR GENE EXPRESSION PROFILE PATTERN ANALYSIS 347
Emanuel Weitschek, Giulia Fiscon, Valentina Fustaino, Giovanni Felici, and Paola Bertolazzi

19.1 Introduction / 347

19.2 Transcriptome Analysis / 348

19.3 Microarrays / 349

19.4 RNA-Seq / 351

19.5 Benefits and Drawbacks of RNA-Seq and Microarray Technologies / 353

19.6 Gene Expression Profile Analysis / 356

19.7 Real Case Studies / 364

19.8 Conclusions / 367

References / 368

20 MINING INFORMATIVE PATTERNS IN MICROARRAY DATA 371
Li Teng

20.1 Introduction / 371

20.2 Patterns with Similarity / 373

20.3 Conclusion / 391

References / 391

21 ARROW PLOT AND CORRESPONDENCE ANALYSIS MAPS FOR VISUALIZING THE EFFECTS OF BACKGROUND CORRECTION AND NORMALIZATION METHODS ON MICROARRAY DATA 394
Carina Silva, Adelaide Freitas, Sara Roque, and Lisete Sousa

21.1 Overview / 394

21.2 Arrow Plot / 399

21.3 Significance Analysis of Microarrays / 404

21.4 Correspondence Analysis / 405

21.5 Impact of the Preprocessing Methods / 407

21.6 Conclusions / 412

Acknowledgments / 413

References / 413

VI PATTERN RECOGNITION IN PHYLOGENETIC TREES 417

22 PATTERN RECOGNITION IN PHYLOGENETICS: TREES AND NETWORKS 419
David A. Morrison

22.1 Introduction / 419

22.2 Networks and Trees / 420

22.3 Patterns and Their Processes / 424

22.4 The Types of Patterns / 427

22.5 Fingerprints / 431

22.6 Constructing Networks / 433

22.7 Multi-Labeled Trees / 435

22.8 Conclusion / 436

References / 437

23 DIVERSE CONSIDERATIONS FOR SUCCESSFUL PHYLOGENETIC TREE RECONSTRUCTION: IMPACTS FROM MODEL MISSPECIFICATION, RECOMBINATION, HOMOPLASY, AND PATTERN RECOGNITION 439
Diego Mallo, Agustín Sánchez-Cobos, and Miguel Arenas

23.1 Introduction / 440

23.2 Overview on Methods and Frameworks for Phylogenetic Tree Reconstruction / 440

23.3 Influence of Substitution Model Misspecification on Phylogenetic Tree Reconstruction / 445

23.4 Influence of Recombination on Phylogenetic Tree Reconstruction / 446

23.5 Influence of Diverse Evolutionary Processes on Species Tree Reconstruction / 447

23.6 Influence of Homoplasy on Phylogenetic Tree Reconstruction: The Goals of Pattern Recognition / 449

23.7 Concluding Remarks / 449

Acknowledgments / 450

References / 450

24 AUTOMATED PLAUSIBILITY ANALYSIS OF LARGE PHYLOGENIES 457
David Dao, Tomás Flouri, and Alexandros Stamatakis

24.1 Introduction / 457

24.2 Preliminaries / 459

24.3 A Naïve Approach / 462

24.4 Toward a Faster Method / 463

24.5 Improved Algorithm / 467

24.6 Implementation / 473

24.7 Evaluation / 474

24.8 Conclusion / 479

Acknowledgment / 481

References / 481

25 A NEW FAST METHOD FOR DETECTING AND VALIDATING HORIZONTAL GENE TRANSFER EVENTS USING PHYLOGENETIC TREES AND AGGREGATION FUNCTIONS 483
Dunarel Badescu, Nadia Tahiri, and Vladimir Makarenkov

25.1 Introduction / 483

25.2 Methods / 485

25.3 Experimental Study / 491

25.4 Results and Discussion / 501

25.5 Conclusion / 502

References / 503

VII PATTERN RECOGNITION IN BIOLOGICAL NETWORKS 505

26 COMPUTATIONAL METHODS FOR MODELING BIOLOGICAL INTERACTION NETWORKS 507
Christos Makris and Evangelos Theodoridis

26.1 Introduction / 507

26.2 Measures/Metrics / 508

26.3 Models of Biological Networks / 511

26.4 Reconstructing and Partitioning Biological Networks / 511

26.5 PPI Networks / 513

26.6 Mining PPI Networks--Interaction Prediction / 517

26.7 Conclusions / 519

References / 519

27 BIOLOGICAL NETWORK INFERENCE AT MULTIPLE SCALES: FROM GENE REGULATION TO SPECIES INTERACTIONS 525
Andrej Aderhold, V. Anne Smith, and Dirk Husmeier

27.1 Introduction / 525

27.2 Molecular Systems / 528

27.3 Ecological Systems / 528

27.4 Models and Evaluation / 529

27.5 Learning Gene Regulation Networks / 532

27.6 Learning Species Interaction Networks / 540

27.7 Conclusion / 550

References / 550

28 DISCOVERING CAUSAL PATTERNS WITH STRUCTURAL EQUATION MODELING: APPLICATION TO TOLL-LIKE RECEPTOR SIGNALING PATHWAY IN CHRONIC LYMPHOCYTIC LEUKEMIA 555
Athina Tsanousa, Stavroula Ntoufa, Nikos Papakonstantinou, Kostas Stamatopoulos, and Lefteris Angelis

28.1 Introduction / 555

28.2 Toll-Like Receptors / 557

28.3 Structural Equation Modeling / 560

28.4 Application / 566

28.5 Conclusion / 580

References / 581

29 ANNOTATING PROTEINS WITH INCOMPLETE LABEL INFORMATION 585
Guoxian Yu, Huzefa Rangwala, and Carlotta Domeniconi

29.1 Introduction / 585

29.2 Related Work / 587

29.3 Problem Formulation / 589

29.4 Experimental Setup / 592

29.5 Experimental Analysis / 596

29.6 Conclusions / 605

Acknowledgments / 606

References / 606

INDEX 609