Mathematical Statistics with Resampling and R

Chihara, Laura M. / Hesterberg, Tim C.

2nd edition, November 2018
560 pages, hardcover
Wiley & Sons Ltd

ISBN: 978-1-119-41654-8
John Wiley & Sons

Price: €135.00

Approximate price

Price includes VAT; shipping is extra

Other versions

EPUB, MOBI, PDF

This thoroughly updated second edition combines the latest software applications with the benefits of modern resampling techniques

Resampling helps students understand the meaning of sampling distributions, sampling variability, P-values, hypothesis tests, and confidence intervals. The second edition of Mathematical Statistics with Resampling and R combines modern resampling techniques and mathematical statistics. The book has been classroom-tested to ensure an accessible presentation, uses the powerful and flexible language R for data analysis, and explores the benefits of modern resampling techniques.

This book offers an introduction to permutation tests and bootstrap methods that can serve to motivate classical inference methods. The book strikes a balance between theory, computing, and applications, and the new edition explores additional topics including consulting, the paired t test, ANOVA, and Google Interview Questions. Throughout the book, new and updated case studies cover a diverse range of subjects such as flight delays, birth weights of babies, and telephone company repair times. These case studies illustrate the material's relevance to real-world applications. This new edition:

* Puts the focus on statistical consulting that emphasizes giving a client an understanding of data and goes beyond typical expectations

* Presents new material on topics such as the paired t test, Fisher's Exact Test and the EM algorithm

* Offers a new section on "Google Interview Questions" that illustrates statistical thinking

* Provides a new chapter on ANOVA

* Contains more exercises and updated case studies, data sets, and R code

Written for undergraduate students in a mathematical statistics course as well as practitioners and researchers, the second edition of Mathematical Statistics with Resampling and R presents a revised and updated guide for applying the most current resampling techniques to mathematical statistics.

Preface xiii

1 Data and Case Studies 1

1.1 Case Study: Flight Delays 1

1.2 Case Study: Birth Weights of Babies 2

1.3 Case Study: Verizon Repair Times 3

1.4 Case Study: Iowa Recidivism 4

1.5 Sampling 5

1.6 Parameters and Statistics 6

1.7 Case Study: General Social Survey 7

1.8 Sample Surveys 8

1.9 Case Study: Beer and Hot Wings 9

1.10 Case Study: Black Spruce Seedlings 10

1.11 Studies 10

1.12 Google Interview Question: Mobile Ads Optimization 12

Exercises 16

2 Exploratory Data Analysis 21

2.1 Basic Plots 21

2.2 Numeric Summaries 25

2.2.1 Center 25

2.2.2 Spread 26

2.2.3 Shape 27

2.3 Boxplots 28

2.4 Quantiles and Normal Quantile Plots 29

2.5 Empirical Cumulative Distribution Functions 35

2.6 Scatter Plots 38

2.7 Skewness and Kurtosis 40

Exercises 42

3 Introduction to Hypothesis Testing: Permutation Tests 47

3.1 Introduction to Hypothesis Testing 47

3.2 Hypotheses 48

3.3 Permutation Tests 50

3.3.1 Implementation Issues 55

3.3.2 One-sided and Two-sided Tests 61

3.3.3 Other Statistics 62

3.3.4 Assumptions 64

3.3.5 Remark on Terminology 68

3.4 Matched Pairs 68

Exercises 70

4 Sampling Distributions 75

4.1 Sampling Distributions 75

4.2 Calculating Sampling Distributions 80

4.3 The Central Limit Theorem 84

4.3.1 CLT for Binomial Data 86

4.3.2 Continuity Correction for Discrete Random Variables 89

4.3.3 Accuracy of the Central Limit Theorem* 91

4.3.4 CLT for Sampling Without Replacement 92

Exercises 93

5 Introduction to Confidence Intervals: The Bootstrap 103

5.1 Introduction to the Bootstrap 103

5.2 The Plug-in Principle 110

5.2.1 Estimating the Population Distribution 112

5.2.2 How Useful Is the Bootstrap Distribution? 113

5.3 Bootstrap Percentile Intervals 118

5.4 Two-Sample Bootstrap 119

5.4.1 Matched Pairs 124

5.5 Other Statistics 128

5.6 Bias 131

5.7 Monte Carlo Sampling: The "Second Bootstrap Principle" 134

5.8 Accuracy of Bootstrap Distributions 135

5.8.1 Sample Mean: Large Sample Size 135

5.8.2 Sample Mean: Small Sample Size 137

5.8.3 Sample Median 138

5.8.4 Mean-Variance Relationship 138

5.9 How Many Bootstrap Samples Are Needed? 140

Exercises 141

6 Estimation 149

6.1 Maximum Likelihood Estimation 149

6.1.1 Maximum Likelihood for Discrete Distributions 150

6.1.2 Maximum Likelihood for Continuous Distributions 153

6.1.3 Maximum Likelihood for Multiple Parameters 157

6.2 Method of Moments 161

6.3 Properties of Estimators 163

6.3.1 Unbiasedness 164

6.3.2 Efficiency 167

6.3.3 Mean Square Error 171

6.3.4 Consistency 173

6.3.5 Transformation Invariance* 175

6.3.6 Asymptotic Normality of MLE* 177

6.4 Statistical Practice 178

6.4.1 Are You Asking the Right Question? 179

6.4.2 Weights 179

Exercises 180

7 More Confidence Intervals 187

7.1 Confidence Intervals for Means 187

7.1.1 Confidence Intervals for a Mean, Variance Known 187

7.1.2 Confidence Intervals for a Mean, Variance Unknown 192

7.1.3 Confidence Intervals for a Difference in Means 198

7.1.4 Matched Pairs, Revisited 204

7.2 Confidence Intervals in General 204

7.2.1 Location and Scale Parameters* 208

7.3 One-sided Confidence Intervals 212

7.4 Confidence Intervals for Proportions 214

7.4.1 Agresti-Coull Intervals for a Proportion 217

7.4.2 Confidence Intervals for a Difference of Proportions 218

7.5 Bootstrap Confidence Intervals 219

7.5.1 t Confidence Intervals Using Bootstrap Standard Errors 219

7.5.2 Bootstrap t Confidence Intervals 220

7.5.3 Comparing Bootstrap t and Formula t Confidence Intervals 224

7.6 Confidence Interval Properties 226

7.6.1 Confidence Interval Accuracy 226

7.6.2 Confidence Interval Length 227

7.6.3 Transformation Invariance 227

7.6.4 Ease of Use and Interpretation 227

7.6.5 Research Needed 228

Exercises 228

8 More Hypothesis Testing 241

8.1 Hypothesis Tests for Means and Proportions: One Population 241

8.1.1 A Single Mean 241

8.1.2 One Proportion 244

8.2 Bootstrap t-Tests 246

8.3 Hypothesis Tests for Means and Proportions: Two Populations 248

8.3.1 Comparing Two Means 248

8.3.2 Comparing Two Proportions 251

8.3.3 Matched Pairs for Proportions 254

8.4 Type I and Type II Errors 255

8.4.1 Type I Errors 257

8.4.2 Type II Errors and Power 261

8.4.3 P-Values Versus Critical Regions 266

8.5 Interpreting Test Results 267

8.5.1 P-Values 267

8.5.2 On Significance 268

8.5.3 Adjustments for Multiple Testing 269

8.6 Likelihood Ratio Tests 271

8.6.1 Simple Hypotheses and the Neyman-Pearson Lemma 271

8.6.2 Likelihood Ratio Tests for Composite Hypotheses 275

8.7 Statistical Practice 279

8.7.1 More Campaigns with No Clicks and No Conversions 284

Exercises 285

9 Regression 297

9.1 Covariance 297

9.2 Correlation 301

9.3 Least-Squares Regression 304

9.3.1 Regression toward the Mean 308

9.3.2 Variation 310

9.3.3 Diagnostics 311

9.3.4 Multiple Regression 317

9.4 The Simple Linear Model 317

9.4.1 Inference for alpha and beta 322

9.4.2 Inference for the Response 326

9.4.3 Comments about Assumptions for the Linear Model 330

9.5 Resampling Correlation and Regression 332

9.5.1 Permutation Tests 335

9.5.2 Bootstrap Case Study: Bushmeat 336

9.6 Logistic Regression 340

9.6.1 Inference for Logistic Regression 346

Exercises 350

10 Categorical Data 359

10.1 Independence in Contingency Tables 359

10.2 Permutation Test of Independence 361

10.3 Chi-square Test of Independence 365

10.3.1 Model for Chi-square Test of Independence 366

10.3.2 2 × 2 Tables 368

10.3.3 Fisher's Exact Test 370

10.3.4 Conditioning 371

10.4 Chi-square Test of Homogeneity 372

10.5 Goodness-of-fit Tests 374

10.5.1 All Parameters Known 374

10.5.2 Some Parameters Estimated 377

10.6 Chi-square and the Likelihood Ratio* 379

Exercises 380

11 Bayesian Methods 391

11.1 Bayes Theorem 392

11.2 Binomial Data: Discrete Prior Distributions 392

11.3 Binomial Data: Continuous Prior Distributions 400

11.4 Continuous Data 406

11.5 Sequential Data 409

Exercises 414

12 One-way ANOVA 419

12.1 Comparing Three or More Populations 419

12.1.1 The ANOVA F-test 419

12.1.2 A Permutation Test Approach 428

Exercises 429

13 Additional Topics 433

13.1 Smoothed Bootstrap 433

13.1.1 Kernel Density Estimate 435

13.2 Parametric Bootstrap 437

13.3 The Delta Method 441

13.4 Stratified Sampling 445

13.5 Computational Issues in Bayesian Analysis 446

13.6 Monte Carlo Integration 448

13.7 Importance Sampling 452

13.7.1 Ratio Estimate for Importance Sampling 458

13.7.2 Importance Sampling in Bayesian Applications 461

13.8 The EM Algorithm 467

13.8.1 General Background 469

Exercises 472

Appendix A Review of Probability 477

A.1 Basic Probability 477

A.2 Mean and Variance 478

A.3 The Normal Distribution 480

A.4 The Mean of a Sample of Random Variables 481

A.5 Sums of Normal Random Variables 482

A.6 The Law of Averages 483

A.7 Higher Moments and the Moment-generating Function 484

Appendix B Probability Distributions 487

B.1 The Bernoulli and Binomial Distributions 487

B.2 The Multinomial Distribution 488

B.3 The Geometric Distribution 490

B.4 The Negative Binomial Distribution 491

B.5 The Hypergeometric Distribution 492

B.6 The Poisson Distribution 493

B.7 The Uniform Distribution 495

B.8 The Exponential Distribution 495

B.9 The Gamma Distribution 497

B.10 The Chi-square Distribution 499

B.11 The Student's t Distribution 502

B.12 The Beta Distribution 504

B.13 The F Distribution 505

Exercises 507

Appendix C Distributions Quick Reference 509

Solutions to Selected Exercises 513

References 525

Index 531
LAURA M. CHIHARA, PHD, is Professor of Mathematics and Statistics at Carleton College. She has extensive experience teaching mathematics and statistics and has worked as Educational Services Supervisor at Insightful Corporation.

TIM C. HESTERBERG, PHD, is Senior Data Scientist at Google. He was a senior research scientist for Insightful Corporation and led the development of S+Resample and other S+ and R software.

L. M. Chihara, Carleton College, USA; T. C. Hesterberg, Google