# Statistics at Square Two

3rd Edition, March 2023

208 Pages, Softcover, *Practical Approach Book*

**978-1-119-40136-0**

An easy-to-follow exploration of intermediate statistical techniques used in medical research

In the newly revised third edition of Statistics at Square Two: Understanding Modern Statistical Applications in Medicine, a team of distinguished statisticians delivers an accessible and intuitive discussion of advanced statistical methods for readers and users of scientific medical literature. This will allow readers to engage critically with modern research as the authors explain the correct interpretation of results in the medical literature.

The book includes two brand new chapters covering meta-analysis and time-series analysis as well as new references to the many checklists that have appeared in recent years to enable better reporting of contemporary research. Most examples have been updated as well, and each chapter contains practice exercises and answers. Readers will also find sample code (in R) for many of the analyses, in addition to:

* A thorough introduction to models and data, including the different types of data, statistical models, and computer-intensive methods

* Comprehensive explorations of multiple linear regression, including the interpretation of computer output, diagnostic statistics such as influential points, and many uses of multiple regression

* Practical discussions of multiple logistic regression, survival analysis, Poisson regression and random effects models including their uses, examples in the medical literature, and strategies for interpreting computer output

Perfect for anyone hoping to better understand the statistics presented in contemporary medical research, Statistics at Square Two: Understanding Modern Statistical Applications in Medicine will also benefit postgraduate students studying statistics and medicine.

1 Models, Tests and Data

1.1 Types of Data

1.2 Confounding, Mediation and Effect Modification

1.3 Causal Inference

1.4 Statistical Models

1.5 Results of Fitting Models

1.6 Significance Tests

1.7 Confidence Intervals

1.8 Statistical Tests Using Models

1.9 Many Variables

1.10 Model Fitting and Analysis: Exploratory and Confirmatory Analyses

1.11 Computer-intensive Methods

1.12 Missing Values

1.13 Bayesian Methods

1.14 Causal Modelling

1.15 Reporting Statistical Results in the Medical Literature

1.16 Reading Statistics in the Medical Literature

2 Multiple Linear Regression

2.1 The Model

2.2 Uses of Multiple Regression

2.3 Two Independent Variables

2.3.1 One Continuous and One Binary Independent Variable

2.3.2 Two Continuous Independent Variables

2.3.3 Categorical Independent Variables

2.4 Interpreting a Computer Output

2.4.1 One Continuous Variable

2.4.2 One Continuous Variable and One Binary Independent Variable

2.4.3 One Continuous Variable and One Binary Independent Variable with Their Interaction

2.4.4 Two Independent Variables: Both Continuous

2.4.5 Categorical Independent Variables

2.5 Examples in the Medical Literature

2.5.1 Analysis of Covariance: One Binary and One Continuous Independent Variable

2.5.2 Two Continuous Independent Variables

2.6 Assumptions Underlying the Models

2.7 Model Sensitivity

2.7.1 Residuals, Leverage and Influence

2.7.2 Computer Analysis: Model Checking and Sensitivity

2.8 Stepwise Regression

2.9 Reporting the Results of a Multiple Regression

2.10 Reading about the Results of a Multiple Regression

2.11 Frequently Asked Questions

2.12 Exercises: Reading the Literature

3 Multiple Logistic Regression

3.1 Quick Revision

3.2 The Model

3.2.1 Categorical Covariates

3.3 Model Checking

3.3.1 Lack of Fit

3.3.2 "Extra-binomial" Variation or "Over Dispersion"

3.3.3 The Logistic Transform is Inappropriate

3.4 Uses of Logistic Regression

3.5 Interpreting a Computer Output

3.5.1 One Binary Independent Variable

3.5.2 Two Binary Independent Variables

3.5.3 Two Continuous Independent Variables

3.6 Examples in the Medical Literature

3.6.1 Comment

3.7 Case-control Studies

3.8 Interpreting Computer Output: Unmatched Case-control Study

3.9 Matched Case-control Studies

3.10 Interpreting Computer Output: Matched Case-control Study

3.11 Example of Conditional Logistic Regression in the Medical Literature

3.11.1 Comment

3.12 Alternatives to Logistic Regression

3.13 Reporting the Results of Logistic Regression

3.14 Reading about the Results of Logistic Regression

3.15 Frequently Asked Questions

3.16 Exercise

4 Survival Analysis

4.1 Introduction

4.2 The Model

4.3 Uses of Cox Regression

4.4 Interpreting a Computer Output

4.5 Interpretation of the Model

4.6 Generalisations of the Model

4.6.1 Stratified Models

4.6.2 Time Dependent Covariates

4.6.3 Parametric Survival Models

4.6.4 Competing Risks

4.7 Model Checking

4.8 Reporting the Results of a Survival Analysis

4.9 Reading about the Results of a Survival Analysis

4.10 Example in the Medical Literature

4.10.1 Comment

4.11 Frequently Asked Questions

4.12 Exercises

5 Random Effects Models

5.1 Introduction

5.2 Models for Random Effects

5.3 Random vs Fixed Effects

5.4 Use of Random Effects Models

5.4.1 Cluster Randomised Trials

5.4.2 Repeated Measures

5.4.3 Sample Surveys

5.4.4 Multi-centre Trials

5.5 Ordinary Least Squares at the Group Level

5.6 Interpreting a Computer Output

5.6.1 Different Methods of Analysis

5.6.2 Likelihood and gee

5.6.3 Interpreting Computer Output

5.7 Model Checking

5.8 Reporting the Results of Random Effects Analysis

5.9 Reading about the Results of Random Effects Analysis

5.10 Examples of Random Effects Models in the Medical Literature

5.10.1 Cluster Trials

5.10.2 Repeated Measures

5.10.3 Comment

5.10.4 Clustering in a Cohort Study

5.10.5 Comment

5.11 Frequently Asked Questions

5.12 Exercises

6 Poisson and Ordinal Regression

6.1 Poisson Regression

6.2 The Poisson Model

6.3 Interpreting a Computer Output: Poisson Regression

6.4 Model Checking for Poisson Regression

6.5 Extensions to Poisson Regression

6.6 Poisson Regression Used to Estimate Relative Risks from a 2 × 2 Table

6.7 Poisson Regression in the Medical Literature

6.8 Ordinal Regression

6.9 Interpreting a Computer Output: Ordinal Regression

6.10 Model Checking for Ordinal Regression

6.11 Ordinal Regression in the Medical Literature

6.12 Reporting the Results of Poisson or Ordinal Regression

6.13 Reading about the Results of Poisson or Ordinal Regression

6.14 Frequently Asked Question

6.15 Exercises

7 Meta-analysis

7.1 Introduction

7.2 Models for Meta-analysis

7.3 Missing Values

7.4 Displaying the Results of a Meta-analysis

7.5 Interpreting a Computer Output

7.6 Examples from the Medical Literature

7.6.1 Example of a Meta-analysis of Clinical Trials

7.6.2 Example of a Meta-analysis of Case-control Studies

7.7 Reporting the Results of a Meta-analysis

7.8 Reading about the Results of a Meta-analysis

7.9 Frequently Asked Questions

7.10 Exercise

8 Time Series Regression

8.1 Introduction

8.2 The Model

8.3 Estimation Using Correlated Residuals

8.4 Interpreting a Computer Output: Time Series Regression

8.5 Example of Time Series Regression in the Medical Literature

8.6 Reporting the Results of Time Series Regression

8.7 Reading about the Results of Time Series Regression

8.8 Frequently Asked Questions

8.9 Exercise

Appendix 1 Exponentials and Logarithms

Appendix 2 Maximum Likelihood and Significance Tests

A2.1 Binomial Models and Likelihood

A2.2 The Poisson Model

A2.3 The Normal Model

A2.4 Hypothesis Testing: the Likelihood Ratio Test

A2.5 The Wald Test

A2.6 The Score Test

A2.7 Which Method to Choose?

A2.8 Confidence Intervals

A2.9 Deviance Residuals for Binary Data

A2.10 Example: Derivation of the Deviances and Deviance Residuals Given in Table 3.3

A2.10.1 Grouped Data

A2.10.2 Ungrouped Data

Appendix 3 Bootstrapping and Variance Robust Standard Errors

A3.1 The Bootstrap

A3.2 Example of the Bootstrap

A3.3 Interpreting a Computer Output: The Bootstrap

A3.3.1 Two-sample T-test with Unequal Variances

A3.4 The Bootstrap in the Medical Literature

A3.5 Robust or Sandwich Estimate SEs

A3.6 Interpreting a Computer Output: Robust SEs for Unequal Variances

A3.7 Other Uses of Robust Regression

A3.8 Reporting the Bootstrap and Robust SEs in the Literature

A3.9 Frequently Asked Question

Appendix 4 Bayesian Methods

A4.1 Bayes' Theorem

A4.2 Uses of Bayesian Methods

A4.3 Computing in Bayes

A4.4 Reading and Reporting Bayesian Methods in the Literature

A4.5 Reading about the Results of Bayesian Methods in the Medical Literature

Appendix 5 R codes

A5.1 R Code for Chapter 2

A5.3 R Code for Chapter 3

A5.4 R Code for Chapter 4

A5.5 R Code for Chapter 5

A5.6 R Code for Chapter 6

A5.7 R Code for Chapter 7

A5.8 R Code for Chapter 8

A5.9 R Code for Appendix 1

A5.10 R Code for Appendix 2

A5.11 R Code for Appendix 3

Answers to Exercises

Glossary

Index

Richard M. Jacques is a Senior Lecturer in Medical Statistics at the University of Sheffield in the United Kingdom.