
Strategies in Biomedical Data Science

Driving Force for Innovation

Etchings, Jay A.

SAS Institute Inc


1st edition, March 2017
464 pages, hardcover
Wiley & Sons Ltd

ISBN: 978-1-119-23219-3
John Wiley & Sons


Price: €61.90

Price includes VAT, plus shipping

Other formats: EPUB, PDF

An essential guide to healthcare data problems, sources, and solutions

Strategies in Biomedical Data Science provides medical professionals with much-needed guidance toward managing the increasing deluge of healthcare data. Beginning with a look at our current top-down methodologies, this book demonstrates the ways in which both technological development and more effective use of current resources can better serve both patient and payer. The discussion explores the aggregation of disparate data sources, current analytics and toolsets, the growing necessity of smart bioinformatics, and more as data science and biomedical science grow increasingly intertwined. You'll dig into the unknown challenges that come along with every advance, and explore the ways in which healthcare data management and technology will inform medicine, politics, and research in the not-so-distant future. Real-world use cases and clear examples are featured throughout, and coverage of data sources, problems, and potential mitigations provides necessary insight for forward-looking healthcare professionals.

Big Data has been a topic of discussion for some time, with much attention focused on problems and management issues surrounding truly staggering amounts of data. This book offers a lifeline through the tsunami of healthcare data, to help the medical community turn their data management problem into a solution.

* Consider the data challenges personalized medicine entails

* Explore the available advanced analytic resources and tools

* Learn how bioinformatics as a service is quickly becoming reality

* Examine the future of the IoT and the deluge of personal device data

The sheer amount of healthcare data being generated will only increase as both biomedical research and clinical practice trend toward individualized, patient-specific care. Strategies in Biomedical Data Science provides expert insight into the kind of robust data management that is becoming increasingly critical as healthcare evolves.

Foreword

Acknowledgments

Introduction

Who Should Read This Book?

What's in This Book?

How to Contact Us

Chapter 1 Healthcare, History, and Heartbreak

Top Issues in Healthcare

Data Management

Biosimilars, Drug Pricing, and Pharmaceutical Compounding

Promising Areas of Innovation

Conclusion

Notes

Chapter 2 Genome Sequencing: Know Thyself, One Base Pair at a Time

Content contributed by Sheetal Shetty and Jacob Brill

Challenges of Genomic Analysis

The Language of Life

A Brief History of DNA Sequencing

DNA Sequencing and the Human Genome Project

Select Tools for Genomic Analysis

Conclusion

Notes

Chapter 3 Data Management

Content contributed by Joe Arnold

Bits about Data

Data Types

Data Security and Compliance

Data Storage

SwiftStack

OpenStack Swift Architecture

Conclusion

Notes

Chapter 4 Designing a Data-Ready Network Infrastructure

Research Networks: A Primer

ESnet at 30: Evolving toward Exascale and Raising Expectations

Internet2 Innovation Platform

Advances in Networking

InfiniBand and Microsecond Latency

The Future of High-Performance Fabrics

Network Function Virtualization

Software-Defined Networking

OpenDaylight

Conclusion

Notes

Chapter 5 Data-Intensive Compute Infrastructures

Content contributed by Dijiang Huang, Yuli Deng, Jay Etchings, Zhiyuan Ma, and Guangchun Luo

Big Data Applications in Health Informatics

Sources of Big Data in Health Informatics

Infrastructure for Big Data Analytics

Fundamental System Properties

GPU-Accelerated Computing and Biomedical Informatics

Conclusion

Notes

Chapter 6 Cloud Computing and Emerging Architectures

Cloud Basics

Challenges Facing Cloud Computing Applications in Biomedicine

Hybrid Campus Clouds

Research as a Service

Federated Access Web Portals

Cluster Homogeneity

Emerging Architectures (Zeta Architecture)

Conclusion

Notes

Chapter 7 Data Science

NoSQL Approaches to Biomedical Data Science

Using Splunk for Data Analytics

Statistical Analysis of Genomic Data with Hadoop

Extracting and Transforming Genomic Data

Processing eQTL Data

Generating Master SNP Files for Cases and Controls

Generating Gene Expression Files for Cases and Controls

Cleaning Raw Data Using MapReduce

Transpose Data Using Python

Statistical Analysis Using Spark

Hive Tables with Partitions

Conclusion

Notes

Appendix: A Brief Statistics Primer

Content contributed by Daniel Peñaherrera

Chapter 8 Next-Generation Cyberinfrastructures

Next-Generation Cyber Capability

NGCC Design and Infrastructure

Conclusion

Note

Conclusion

Appendix A The Research Data Management Survey: From Concepts to Practice

Brandon Mikkelsen and Jay Etchings

Appendix B Central IT and Research Support

Gregory D. Palmer

Appendix C HPC Working Example: Using Parallelization Programs Such as GNU Parallel and OpenMP with Serial Tools

Appendix D HPC and Hadoop: Bridging HPC to Hadoop

Appendix E Bioinformatics + Docker: Simplifying Bioinformatics Tools Delivery with Docker Containers

Glossary

About the Author

About the Contributors

Index
JAY A. ETCHINGS is the director of operations at Arizona State University's Research Computing program, where he is responsible for developing innovative architectures to progress fluid technical environments supporting highly computational workloads, peta-scale data analysis, next-generation cyber capabilities, and emerging network innovations.