| | Contents | |
|---|---|---|
| | A Personal Foreword | XV |
| | Preface | XVII |
| | List of Contributors | XIX |
| 1 | Introduction to Chemoinformatics in Drug Discovery – A Personal View Garland R. Marshall | 1 |
| 1.1 | Introduction | 1 |
| 1.2 | Historical Evolution | 4 |
| 1.3 | Known versus Unknown Targets | 5 |
| 1.4 | Graph Theory and Molecular Numerology | 6 |
| 1.5 | Pharmacophore | 7 |
| 1.6 | Active-Analog Approach | 8 |
| 1.7 | Active-Site Modeling | 9 |
| 1.8 | Validation of the Active-Analog Approach and Active-Site Modeling | 10 |
| 1.9 | PLS/CoMFA | 11 |
| 1.10 | Prediction of Affinity | 12 |
| 1.11 | Protein Structure Prediction | 13 |
| 1.12 | Structure-Based Drug Design | 15 |
| 1.13 | Real World Pharmaceutical Issues | 15 |
| 1.14 | Combinatorial Chemistry and High-throughput Screens | 16 |
| 1.15 | Diversity and Similarity | 16 |
| 1.16 | Prediction of ADME | 17 |
| 1.17 | Failures to Accurately Predict | 17 |
| 1.18 | Summary | 18 |
| | References | 19 |
| Part I | Virtual Screening | 23 |
| 2 | Chemoinformatics in Lead Discovery Tudor I. Oprea | 25 |
| 2.1 | Chemoinformatics in the Context of Pharmaceutical Research | 25 |
| 2.2 | Leads in the Drug Discovery Paradigm | 27 |
| 2.3 | Is There a Trend for High Activity Molecules? | 29 |
| 2.4 | The Concept of Leadlikeness | 32 |
| 2.5 | Conclusions | 37 |
| | References | 38 |
| 3 | Computational Chemistry, Molecular Complexity and Screening Set Design Michael M. Hann, Andrew R. Leach, and Darren V.S. Green | 43 |
| 3.1 | Introduction | 43 |
| 3.2 | Background Concepts: the Virtual, Tangible and Real Worlds of Compounds, the “Knowledge Plot” and Target Tractability | 44 |
| 3.3 | The Construction of High Throughput Screening Sets | 45 |
| 3.4 | Compound Filters | 47 |
| 3.5 | “Leadlike” Screening Sets | 48 |
| 3.6 | Focused and Biased Set Design | 54 |
| 3.7 | Conclusion | 55 |
| | References | 56 |
| 4 | Algorithmic Engines in Virtual Screening Matthias Rarey, Christian Lemmen, and Hans Matter | 59 |
| 4.1 | Introduction | 59 |
| 4.2 | Software Tools for Virtual Screening | 61 |
| 4.3 | Physicochemical Models in Virtual Screening | 62 |
| 4.3.1 | Intermolecular Forces in Protein–Ligand Interactions | 63 |
| 4.3.2 | Scoring Functions for Protein–Ligand Recognition | 66 |
| 4.3.3 | Covering Conformational Space | 67 |
| 4.3.4 | Scoring Structural Alignments | 68 |
| 4.4 | Algorithmic Engines in Virtual Screening | 69 |
| 4.4.1 | Mathematical Concepts | 69 |
| 4.4.2 | Algorithmic Concepts | 76 |
| 4.4.3 | Descriptor Technology | 81 |
| 4.4.4 | Global Search Algorithms | 85 |
| 4.5 | Entering the Real World: Virtual Screening Applications | 89 |
| 4.5.1 | Practical Considerations on Virtual Screening | 89 |
| 4.5.2 | Successful Applications of Virtual Screening | 91 |
| 4.6 | Practical Virtual Screening: Some Final Remarks | 99 |
| | References | 101 |
| 5 | Strengths and Limitations of Pharmacophore-Based Virtual Screening Dragos Horvath, Boryeu Mao, Rafael Gozalbes, Frédérique Barbosa, and Sherry L. Rogalski | 117 |
| 5.1 | Introduction | 117 |
| 5.2 | The “Pharmacophore” Concept: Pharmacophore Features | 117 |
| 5.3 | Pharmacophore Models: Managing Pharmacophore-related Information | 118 |
| 5.4 | The Main Topic of This Paper | 119 |
| 5.5 | The Cox2 Data Set | 119 |
| 5.6 | Pharmacophore Fingerprints and Similarity Searches | 120 |
| 5.7 | Molecular Field Analysis (MFA)-Based Pharmacophore Information | 123 |
| 5.8 | QSAR Models | 125 |
| 5.9 | Hypothesis Models | 125 |
| 5.10 | The Minimalist Overlay-Independent QSAR Model | 126 |
| 5.11 | Minimalist and Consensus Overlay-Based QSAR Models | 128 |
| 5.12 | Diversity Analysis of the Cox2 Compound Set | 131 |
| 5.13 | Do Hypothesis Models Actually Tell Us More Than Similarity Models About the Structural Reasons of Activity? | 131 |
| 5.14 | Why Did Hypothesis Models Fail to Unveil the Key Cox2 Site–Ligand Interactions? | 134 |
| 5.15 | Conclusions | 136 |
| | References | 137 |
| Part II | Hit and Lead Discovery | 141 |
| 6 | Enhancing Hit Quality and Diversity Within Assay Throughput Constraints Iain McFadyen, Gary Walker, and Juan Alvarez | 143 |
| 6.1 | Introduction | 143 |
| 6.1.1 | What Makes a Good Lead Molecule? | 144 |
| 6.1.2 | Compound Collections – Suitability as Leads | 144 |
| 6.1.3 | Compound Collections – Diversity | 145 |
| 6.1.4 | Data Reliability | 146 |
| 6.1.5 | Selection Methods | 149 |
| 6.1.6 | Enhancing Quality and Diversity of Actives | 153 |
| 6.2 | Methods | 154 |
| 6.2.1 | Screening Library | 155 |
| 6.2.2 | Determination of Activity Threshold | 156 |
| 6.2.3 | Filtering | 156 |
| 6.2.4 | High-Throughput Screen Clustering Algorithm (HTSCA) | 157 |
| 6.2.5 | Diversity Analysis | 160 |
| 6.2.6 | Data Visualization | 161 |
| 6.3 | Results | 162 |
| 6.3.1 | Peptide Hydrolase | 162 |
| 6.3.2 | Protein Kinase | 167 |
| 6.3.3 | Protein–Protein Interaction | 168 |
| 6.4 | Discussion and Conclusion | 169 |
| | References | 172 |
| 7 | Molecular Diversity in Lead Discovery: From Quantity to Quality Cullen L. Cavallaro, Dora M. Schnur, and Andrew J. Tebben | 175 |
| 7.1 | Introduction | 175 |
| 7.2 | Large Libraries and Collections | 176 |
| 7.2.1 | Methods and Examples for Large Library Diversity Calculations | 177 |
| 7.3 | Medium-sized/Target-class Libraries and Collections | 181 |
| 7.3.1 | Computational Methods for Medium- and Target-class Libraries and Collections | 183 |
| 7.4 | Small Focused Libraries | 189 |
| 7.4.1 | Computational Methods for Small and Focused Libraries | 190 |
| 7.5 | Summary/Conclusion | 191 |
| | References | 192 |
| 8 | In Silico Lead Optimization Chris M.W. Ho | 199 |
| 8.1 | Introduction | 199 |
| 8.2 | The Rise of Computer-aided Drug Refinement | 200 |
| 8.3 | RACHEL Software Package | 201 |
| 8.4 | Extraction of Building Blocks from Corporate Databases | 201 |
| 8.5 | Intelligent Component Selection System | 203 |
| 8.6 | Development of a Component Specification Language | 205 |
| 8.7 | Filtration of Components Using Constraints | 207 |
| 8.8 | Template-driven Structure Generation | 208 |
| 8.9 | Scoring Functions – Methods to Estimate Ligand–Receptor Binding | 209 |
| 8.10 | Target Functions | 212 |
| 8.11 | Ligand Optimization Example | 214 |
| | References | 219 |
| Part III | Databases and Libraries | 221 |
| 9 | WOMBAT: World of Molecular Bioactivity Marius Olah, Maria Mracec, Liliana Ostopovici, Ramona Rad, Alina Bora, Nicoleta Hadaruga, Ionela Olah, Magdalena Banda, Zeno Simon, Mircea Mracec, and Tudor I. Oprea | 223 |
| 9.1 | Introduction – Brief History of the WOMBAT Project | 223 |
| 9.2 | WOMBAT 2004.1 Overview | 224 |
| 9.3 | WOMBAT Database Structure | 227 |
| 9.4 | WOMBAT Quality Control | 228 |
| 9.5 | Uncovering Errors from Literature | 231 |
| 9.6 | Data Mining with WOMBAT | 234 |
| 9.7 | Conclusions and Future Challenges | 235 |
| | References | 237 |
| 10 | Cabinet – Chemical and Biological Informatics Network Vera Povolna, Scott Dixon, and David Weininger | 241 |
| 10.1 | Introduction | 241 |
| 10.1.1 | Integration Efforts, WWW as Information Resource and Limitations | 241 |
| 10.1.2 | Goals | 243 |
| 10.2 | Merits of Federation Rather than Unification | 243 |
| 10.2.1 | The Merits of Unification | 244 |
| 10.2.2 | The Merits of Federation | 244 |
| 10.2.3 | Unifying Disparate Data Models is Difficult, Federating them is Easy | 245 |
| 10.2.4 | Language is a Natural Key | 246 |
| 10.3 | HTTP is Appropriate Communication Technology | 248 |
| 10.3.1 | HTTP is Specifically Designed for Collaborative Computing | 248 |
| 10.3.2 | HTTP is the Dominant Communication Protocol Today | 248 |
| 10.3.3 | HTML Provides a Universally Accessible GUI | 249 |
| 10.3.4 | MIME “Text/Plain” and “Application/Octet-Stream” are Important Catch-alls | 249 |
| 10.3.5 | Other MIME Types are Useful | 250 |
| 10.3.6 | One Significant HTTP Work-around is Required | 250 |
| 10.4 | Implementation | 251 |
| 10.4.1 | Daylight HTTP Toolkit | 251 |
| 10.4.2 | Metaphorics’ Cabinet Library | 252 |
| 10.5 | Specific Examples of Federated Services | 252 |
| 10.5.1 | Empath – Metabolic Pathway Chart | 253 |
| 10.5.2 | Planet – Protein–ligand Association Network | 254 |
| 10.5.3 | EC Book – Enzyme Commission Codebook | 254 |
| 10.5.4 | WDI – World Drug Index | 254 |
| 10.5.5 | WOMBAT – World of Molecular Bioactivity | 255 |
| 10.5.6 | TCM (Traditional Chinese Medicines), DCM (Dictionary of Chinese Medicine), PARK (Photo ARKive) and zi4 | 255 |
| 10.5.7 | Cabinet “Download” Service | 256 |
| 10.5.8 | Cabinet Usage Example | 256 |
| 10.6 | Deployment and Refinement | 262 |
| 10.6.1 | Local Deployment | 264 |
| 10.6.2 | Intranet Deployment | 264 |
| 10.6.3 | Internet Deployment | 265 |
| 10.6.4 | Online Deployment | 266 |
| 10.7 | Conclusions | 266 |
| | References | 268 |
| 11 | Structure Modification in Chemical Databases Peter W. Kenny and Jens Sadowski | 271 |
| 11.1 | Introduction | 271 |
| 11.2 | Permute | 274 |
| 11.2.1 | Protonation and Formal Charges | 274 |
| 11.2.2 | Tautomerism | 275 |
| 11.2.3 | Nitrogen Configurations | 276 |
| 11.2.4 | Duplicate Removal | 276 |
| 11.2.5 | Nested Loop | 276 |
| 11.2.6 | Application Statistics | 277 |
| 11.2.7 | Impact on Docking | 277 |
| 11.3 | Leatherface | 279 |
| 11.3.1 | Protonation and Formal Charges | 279 |
| 11.3.2 | Tautomerism | 280 |
| 11.3.3 | Ionization and Tautomer Model | 281 |
| 11.3.4 | Relationships between Structures | 282 |
| 11.3.5 | Substructural Searching and Analysis | 283 |
| 11.4 | Concluding Remarks | 283 |
| | References | 284 |
| 12 | Rational Design of GPCR-specific Combinational Libraries Based on the Concept of Privileged Substructures Nikolay P. Savchuk, Sergey E. Tkachenko, and Konstantin V. Balakin | 287 |
| 12.1 | Introduction – Combinatorial Chemistry and Rational Drug Design | 287 |
| 12.2 | Rational Selection of Building Blocks Based on Privileged Structural Motifs | 288 |
| 12.2.1 | Privileged Structures and Substructures in the Design of Pharmacologically Relevant Combinatorial Libraries | 288 |
| 12.2.2 | Analysis of Privileged Structural Motifs: Structure Dissection Rules | 291 |
| 12.2.3 | Knowledge Database | 293 |
| 12.2.4 | Target-specific Differences in Distribution of Molecular Fragments | 295 |
| 12.2.5 | Privileged versus Peripheral Retrosynthetic Fragments | 296 |
| 12.2.6 | Peripheral Retrosynthetic Fragments: How to Measure the Target-specific Differences? | 297 |
| 12.2.7 | Selection of Building Blocks | 300 |
| 12.2.8 | Product-based Approach: Limiting the Space of Virtual Libraries | 305 |
| 12.2.9 | Alternative Strategy: Property-based Approach | 306 |
| 12.2.10 | Kohonen Self-organizing Maps | 307 |
| 12.3 | Conclusions | 309 |
| | References | 311 |
| Part IV | Chemoinformatics Applications | 315 |
| 13 | A Practical Strategy for Directed Compound Acquisition Gerald M. Maggiora, Veerabahu Shanmugasundaram, Michael S. Lajiness, Tom N. Doman, and Martin W. Schultz | 317 |
| 13.1 | Introduction | 317 |
| 13.2 | A Historical Perspective | 319 |
| 13.3 | Practical Issues | 320 |
| 13.4 | Compound Acquisition Scheme | 322 |
| 13.4.1 | Preprocessing Compound Files | 322 |
| 13.4.2 | Initial Compound Selection and Diversity Assessment | 325 |
| 13.4.3 | Compound Reviews | 327 |
| 13.4.4 | Final Selection and Compound Purchase | 328 |
| 13.5 | Conclusions | 328 |
| 13.6 | Methodologies | 329 |
| 13.6.1 | Preprocessing Filters | 329 |
| 13.6.2 | Diverse Solutions (DVS) | 330 |
| 13.6.3 | Dfragall | 330 |
| 13.6.4 | Ring Analysis | 331 |
| | References | 331 |
| 14 | Efficient Strategies for Lead Optimization by Simultaneously Addressing Affinity, Selectivity and Pharmacokinetic Parameters Karl-Heinz Baringhaus and Hans Matter | 333 |
| 14.1 | Introduction | 333 |
| 14.2 | The Origin of Lead Structures | 336 |
| 14.3 | Optimization for Affinity and Selectivity | 338 |
| 14.3.1 | Lead Optimization as a Challenge in Drug Discovery | 338 |
| 14.3.2 | Use and Limitation of Structure-based Design Approaches | 339 |
| 14.3.3 | Integration of Ligand- and Structure-based Design Concepts | 340 |
| 14.3.4 | The Selectivity Challenge from the Ligands’ Perspective | 342 |
| 14.3.5 | Selectivity Approaches Considering Binding Site Topologies | 344 |
| 14.4 | Addressing Pharmacokinetic Problems | 347 |
| 14.4.1 | Prediction of Physicochemical Properties | 347 |
| 14.4.2 | Prediction of ADME Properties | 348 |
| 14.4.3 | Prediction of Toxicity | 349 |
| 14.4.4 | Physicochemical and ADMET Property-based Design | 350 |
| 14.5 | ADME/Antitarget Models for Lead Optimization | 350 |
| 14.5.1 | Global ADME Models for Intestinal Absorption and Protein Binding | 350 |
| 14.5.2 | Selected Examples to Address ADME/Toxicology Antitargets | 354 |
| 14.6 | Integrated Approach | 357 |
| 14.6.1 | Strategy and Risk Assessment | 357 |
| 14.6.2 | Integration | 359 |
| 14.6.3 | Literature and Aventis Examples on Aspects of Multidimensional Optimization | 360 |
| 14.7 | Conclusions | 366 |
| | References | 367 |
| 15 | Chemoinformatic Tools for Library Design and the Hit-to-Lead Process: A User’s Perspective Robert Alan Goodnow, Jr., Paul Gillespie, and Konrad Bleicher | 381 |
| 15.1 | The Need for Leads: The Sources of Leads and the Challenge to Find Them | 381 |
| 15.2 | Property Predictions | 383 |
| 15.3 | Prediction of Solubility | 384 |
| 15.4 | Druglikeness | 390 |
| 15.4.1 | Are There Differences between Drugs and Nondrugs? | 390 |
| 15.4.2 | Is the Problem Tractable within a Single Program? | 391 |
| 15.4.3 | Do We Have a Training Set that Will Allow Us to Address the Issue? | 392 |
| 15.4.4 | Approaches to the Prediction of Druglikeness | 392 |
| 15.5 | Frequent Hitters | 394 |
| 15.6 | Identification of a Lead Series | 395 |
| 15.7 | The Hit-to-lead Process | 397 |
| 15.7.1 | Prioritization of Hits | 397 |
| 15.7.2 | Identification of Analogs | 402 |
| 15.7.3 | Additional Assays | 403 |
| 15.8 | Leads from Libraries: General Principles, Practical Considerations | 404 |
| 15.9 | Druglikeness in Small-molecule Libraries | 406 |
| 15.10 | Data Reduction and Viewing for Virtual Library Design | 407 |
| 15.11 | Druglikeness | 408 |
| 15.12 | Complexity and Andrews’ Binding Energy | 408 |
| 15.13 | Solubility | 411 |
| 15.14 | Polar Surface Area | 411 |
| 15.15 | Number of Rotatable Bonds | 412 |
| 15.16 | hERG Channel Binding | 413 |
| 15.17 | Chemoinformatic Analysis of the Predicted Hansch Substituent Constants of the Diversity Reagents for Design of Vector Exploration Libraries | 415 |
| 15.18 | Targeting Libraries by Virtual Screening | 416 |
| 15.19 | Combinatorial Design Based on Biostructural Information | 418 |
| 15.20 | Ligand-based Combinatorial Design: The RADDAR Approach | 419 |
| 15.21 | Virtual Screening of Small-molecule Library with Peptide-derived Pharmacophores | 421 |
| 15.22 | Chemoinformatic Tools and Strategies to Visualize Active Libraries | 423 |
| 15.23 | Visualization of Library Designs during Hit-to-lead Efforts | 423 |
| 15.24 | Summary and Outlook for Chemoinformatically Driven Lead Generation | 425 |
| | References | 426 |
| 16 | Application of Predictive QSAR Models to Database Mining Alexander Tropsha | 437 |
| 16.1 | Introduction | 437 |
| 16.2 | Building Predictive QSAR Models: The Importance of Validation | 438 |
| 16.3 | Defining Model Applicability Domain | 441 |
| 16.4 | Validated QSAR Modeling as an Empirical Data-modeling Approach: Combinatorial QSAR | 443 |
| 16.5 | Validated QSAR Models as Virtual Screening Tools | 445 |
| 16.6 | Conclusions and Outlook | 452 |
| | References | 453 |
| 17 | Drug Discovery in Academia – A Case Study Donald J. Abraham | 457 |
| 17.1 | Introduction | 457 |
| 17.2 | Linking the University with Business and Drug Discovery | 457 |
| 17.2.1 | Start-up Companies | 457 |
| 17.2.2 | Licensing | 458 |
| 17.3 | Research Parks | 459 |
| 17.4 | Conflict of Interest Issues for Academicians | 459 |
| 17.5 | Drug Discovery in Academia | 461 |
| 17.5.1 | Clinical Trials in Academia | 461 |
| 17.6 | Case Study: The Discovery and Development of Allosteric Effectors of Hemoglobin | 462 |
| 17.6.1 | Geduld (Patience) | 463 |
| 17.6.2 | Glück (Luck) | 463 |
| 17.6.3 | Geschick (Skill) | 464 |
| 17.6.4 | Geld (Money) | 471 |
| | References | 481 |
| | Subject Index | 485 |