Methods for Reliability Improvement and Risk Reduction

Todinov, Michael

1st Edition, November 2018
288 Pages, Hardcover
Wiley & Sons Ltd

ISBN: 978-1-119-47758-7

John Wiley & Sons

Reliability is one of the most important attributes of the products and processes of any company or organisation. This important work provides a powerful framework of domain-independent reliability improvement and risk-reducing methods that can greatly lower risk in any area of human activity. It reviews existing methods for risk reduction that can be classified as domain-independent and introduces the following new domain-independent reliability improvement and risk reduction methods:
* Separation
* Stochastic separation
* Introducing deliberate weaknesses
* Segmentation
* Self-reinforcement
* Inversion
* Reducing the rate of accumulation of damage
* Permutation
* Substitution
* Limiting the space and time exposure
* Comparative reliability models

The domain-independent methods for reliability improvement and risk reduction do not depend on the availability of past failure data, domain-specific expertise or knowledge of the failure mechanisms underlying the failure modes. Through numerous examples and case studies, this invaluable guide shows that many of the new domain-independent methods improve reliability at no extra cost or at a low cost.

Using the proven methods in this book, any company or organisation can greatly enhance the reliability of its products and operations.

Preface xv

1 Domain-Independent Methods for Reliability Improvement and Risk Reduction 1

1.1 The Domain-Specific Methods for Risk Reduction 1

1.2 The Statistical, Data-Driven Approach 3

1.3 The Physics-of-Failure Approach 4

1.4 Reliability Improvement and TRIZ 6

1.5 The Domain-Independent Methods for Reliability Improvement and Risk Reduction 6

2 Basic Concepts 9

2.1 Likelihood of Failure, Consequences from Failure, Potential Loss, and Risk of Failure 9

2.2 Drawbacks of the Expected Loss as a Measure of the Potential Loss from Failure 14

2.3 Potential Loss, Conditional Loss, and Risk of Failure 15

2.4 Improving Reliability and Reducing Risk 19

2.5 Resilience 21

3 Overview of Methods and Principles for Improving Reliability and Reducing Risk That Can Be Classified as Domain-Independent 23

3.1 Improving Reliability and Reducing Risk by Preventing Failure Modes 23

3.1.1 Techniques for Identifying and Assessing Failure Modes 23

3.1.2 Effective Risk Reduction Procedure Related to Preventing Failure Modes from Occurring 27

3.1.3 Reliability Improvement and Risk Reduction by Root Cause Analysis 28

3.1.3.1 Case Study: Improving the Reliability of Automotive Suspension Springs by Root Cause Analysis 28

3.1.4 Preventing Failure Modes by Removing Latent Faults 29

3.2 Improving Reliability and Reducing Risk by a Fault-Tolerant System Design and Fail-Safe Design 31

3.2.1 Building in Redundancy 31

3.2.1.1 Case Study: Improving Reliability by k-out-of-n Redundancy 34

3.2.2 Fault-Tolerant Design 34

3.2.3 Fail-Safe Principle and Fail-Safe Design 35

3.2.4 Reducing Risk by Eliminating Vulnerabilities 36

3.2.4.1 Eliminating Design Vulnerabilities 36

3.2.4.2 Reducing the Negative Impact of Weak Links 37

3.2.4.3 Reducing the Likelihood of Unfavourable Combinations of Risk-Critical Random Factors 38

3.2.4.4 Reducing the Vulnerability of Computational Models 39

3.3 Improving Reliability and Reducing Risk by Protecting Against Common Cause 40

3.4 Improving Reliability and Reducing Risk by Simplifying at a System and Component Level 42

3.5 Improving Reliability and Reducing Risk by Reducing the Variability of Risk-Critical Parameters 44

3.5.1 Case Study: Interaction Between the Upper Tail of the Load Distribution and the Lower Tail of the Strength Distribution 46

3.6 Improving Reliability and Reducing Risk by Making the Design Robust 48

3.6.1 Case Study: Increasing the Robustness of a Spring Assembly with Constant Clamping Force 50

3.7 Improving Reliability and Reducing Risk by Built-in Reinforcement 51

3.7.1 Built-In Prevention Reinforcement 51

3.7.2 Built-In Protection Reinforcement 51

3.8 Improving Reliability and Reducing Risk by Condition Monitoring 52

3.9 Reducing the Risk of Failure by Improving Maintainability 56

3.10 Reducing Risk by Eliminating Factors Promoting Human Errors 57

3.11 Reducing Risk by Reducing the Hazard Potential 58

3.12 Reducing Risk by Using Protective Barriers 59

3.13 Reducing Risk by Efficient Troubleshooting Procedures and Systems 60

3.14 Risk Planning and Training 60

4 Improving Reliability and Reducing Risk by Separation 61

4.1 The Method of Separation 61

4.2 Separation of Risk-Critical Factors 62

4.2.1 Time Separation by Scheduling 62

4.2.1.1 Case Study: Full Time Separation with Random Starts of the Events 62

4.2.2 Time and Space Separation by Using Interlocks 63

4.2.2.1 Case Study: A Time Separation by Using an Interlock 63

4.2.3 Time Separation in Distributed Systems by Using Logical Clocks 64

4.2.4 Space Separation of Information 65

4.2.5 Separation of Duties to Reduce the Risk of Compromised Safety, Errors, and Fraud 65

4.2.6 Logical Separation by Using a Shared Unique Key 66

4.2.6.1 Case Study: Logical Separation of X-ray Equipment by a Shared Unique Key 66

4.2.7 Separation by Providing Conditions for Independent Operation 67

4.3 Separation of Functions, Properties, or Behaviour 68

4.3.1 Separation of Functions 68

4.3.1.1 Separation of Functions to Optimise for Maximum Reliability 68

4.3.1.2 Separation of Functions to Reduce Load Magnitudes 70

4.3.1.3 Separation of a Single Function into Multiple Components to Reduce Vulnerability to a Single Failure 71

4.3.1.4 Separation of Functions to Compensate Deficiencies 71

4.3.1.5 Separation of Functions to Prevent Unwanted Interactions 71

4.3.1.6 Separation of Methods to Reduce the Risk Associated with Incorrect Mathematical Models 72

4.4 Separation of Properties to Counter Poor Performance Caused by Inhomogeneity 72

4.4.1 Separation of Strength Across Components and Zones According to the Intensity of the Stresses from Loading 72

4.4.2 Separation of Properties to Satisfy Conflicting Requirements 74

4.4.3 Separation in Geometry 75

4.4.3.1 Case Study: Separation in Geometry for a Cantilever Beam 75

4.5 Separation on a Parameter, Conditions, or Scale 76

4.5.1 Separation at Distinct Values of a Risk-Critical Parameter Through Deliberate Weaknesses and Stress Limiters 76

4.5.2 Separation by Using Phase Changes 77

4.5.3 Separation of Reliability Across Components and Assemblies According to Their Cost of Failure 77

4.5.3.1 Case Study: Separation of the Reliability of Components Based on the Cost of Failure 78

5 Reducing Risk by Deliberate Weaknesses 81

5.1 Reducing the Consequences from Failure Through Deliberate Weaknesses 81

5.2 Separation from Excessive Levels of Stress 82

5.2.1 Deliberate Weaknesses Disconnecting Excessive Load 82

5.2.2 Energy-Absorbing Deliberate Weaknesses 85

5.2.2.1 Case Study: Reducing the Maximum Stress from Dynamic Loading by Energy-Absorbing Elastic Components 85

5.2.3 Designing Frangible Objects or Weakly Fixed Objects 86

5.3 Separation from Excessive Levels of Damage 87

5.3.1 Deliberate Weaknesses Decoupling Damaged Regions and Limiting the Spread of Damage 87

5.3.2 Deliberate Weaknesses Providing Stress and Strain Relaxation 88

5.3.3 Deliberate Weaknesses Separating from Excessive Levels of Damage Accumulation 90

5.4 Deliberate Weaknesses Deflecting the Failure Location or Damage Propagation 91

5.4.1 Deflecting the Failure Location from Places Where the Cost of Failure is High 91

5.4.2 Deflecting the Failure Location from Places Where the Cost of Intervention for Repair is High 92

5.4.3 Deliberate Weaknesses Deflecting the Propagation of Damage 92

5.5 Deliberate Weaknesses Designed to Provide Warning 92

5.6 Deliberate Weaknesses Designed to Provide Quick Access or Activate Protection 94

5.7 Deliberate Weaknesses and Stress Limiters 94

6 Improving Reliability and Reducing Risk by Stochastic Separation 97

6.1 Stochastic Separation of Risk-Critical Factors 97

6.1.1 Real-Life Applications that Require Stochastic Separation 97

6.1.2 Stochastic Separation of a Fixed Number of Random Events with Different Duration Times 99

6.1.2.1 Case Study: Stochastic Separation of Consumers by Proportionally Reducing Their Demand Times 102

6.1.3 Stochastic Separation of Random Events Following a Homogeneous Poisson Process 105

6.1.3.1 Case Study: Stochastic Separation of Random Demands Following a Homogeneous Poisson Process 106

6.1.4 Stochastic Separation Based on the Probability of Overlapping of Random Events for More than a Single Source Servicing the Random Demands 106

6.1.5 Computer Simulation Algorithm Determining the Probability of Overlapping for More than a Single Source Servicing the Demands 108

6.2 Expected Time Fraction of Simultaneous Presence of Critical Events 110

6.2.1 Case Study: Expected Fraction of Unsatisfied Demand at a Constant Sum of the Time Fractions of User Demands 112

6.2.2 Case Study: Servicing Random Demands from Ten Different Users, Each Characterised by a Distinct Demand Time Fraction 114

6.3 Analytical Method for Determining the Expected Fraction of Unsatisfied Demand for Repair 114

6.3.1 Case Study: Servicing Random Repairs from a System Including Components of Three Different Types, Each Characterised by a Distinct Repair Time 115

6.4 Expected Time Fraction of Simultaneous Presence of Critical Events that have been Initiated with Specified Probabilities 116

6.4.1 Case Study: Servicing Random Demands from Patients in a Hospital 117

6.4.2 Case Study: Servicing Random Demands from Four Different Types of Users, Each Issuing a Demand with Certain Probability 118

6.5 Stochastic Separation Based on the Expected Fraction of Unsatisfied Demand 119

6.5.1 Fixed Number of Random Demands on a Time Interval 119

6.5.2 Random Demands Following a Poisson Process on a Time Interval 120

6.5.2.1 Case Study: Servicing Random Failures from Circular Knitting Machines by an Optimal Number of Repairmen 122

7 Improving Reliability and Reducing Risk by Segmentation 125

7.1 Segmentation as a Problem-Solving Strategy 125

7.2 Creating a Modular System by Segmentation 127

7.3 Preventing Damage Accumulation and Limiting Damage Propagation by Segmentation 129

7.3.1 Creating Barriers Containing Damage 129

7.3.2 Creating Weak Interfaces Dissipating or Deflecting Damage 131

7.3.3 Reducing Deformations and Stresses by Segmentation 131

7.3.4 Reducing Hazard Potential by Segmentation 131

7.3.5 Reducing the Likelihood of Errors by Segmenting Operations 132

7.3.6 Limiting the Presence of Flaws by Segmentation 132

7.4 Improving Fault Tolerance and Reducing Vulnerability to a Single Failure by Segmentation 133

7.4.1 Case Study: Improving Fault Tolerance of a Column Loaded in Compression by Segmentation 133

7.4.2 Reducing the Vulnerability to a Single Failure by Segmentation 136

7.5 Reducing Loading Stresses by Segmentation 138

7.5.1 Improving Load Distribution by Segmentation 138

7.5.2 Improving Heat Dissipation by Segmentation 139

7.5.3 Case Study: Reducing Stress by Increasing the Perimeter to Cross-Sectional Area Ratio Through Segmentation 140

7.6 Reducing the Probability of a Loss/Error by Segmentation 142

7.6.1 Reducing the Likelihood of a Loss by Segmenting Opportunity Bets 142

7.6.1.1 Case Study: Reducing the Risk of a Loss from a Risky Prospect Involving a Single Opportunity Bet 143

7.6.2 Reducing the Likelihood of a Loss by Segmenting an Investment Portfolio 144

7.6.3 Reducing the Likelihood of Erroneous Conclusion from Imperfect Tests by Segmentation 145

7.7 Decreasing the Variation of Properties by Segmentation 146

7.8 Improved Control and Condition Monitoring by Time Segmentation 148

8 Improving Reliability and Reducing Risk by Inversion 149

8.1 The Method of Inversion 149

8.2 Improving Reliability by Inverting Functions, Relative Position, and Motion 150

8.2.1 Case Study: Eliminating Failure Modes of an Alarm Circuit by Inversion of Functions 151

8.2.2 Improving Reliability by Inverting the Relative Position of Objects 152

8.2.2.1 Case Study: Inverting the Position of an Object with Respect to its Support to Improve Reliability 153

8.3 Improving Reliability by Inverting Properties and Geometry 155

8.3.1 Case Study: Improving Reliability by Inverting Mechanical Properties Whilst Maintaining an Invariant 155

8.3.2 Case Study: Improving Reliability by Inverting Geometry Whilst Maintaining an Invariant 156

8.4 Improving Reliability and Reducing Risk by Introducing Inverse States 158

8.4.1 Inverse States Cancelling Anticipated Undesirable Effects 158

8.4.2 Inverse States Buffering Anticipated Undesirable Effects 159

8.4.3 Inverse States Reducing the Likelihood of an Erroneous Action 160

8.5 Improving Reliability and Reducing Risk by Inverse Thinking 161

8.5.1 Inverting the Problem Related to Reliability Improvement and Risk Reduction 161

8.5.1.1 Case Study: Reducing the Risk of High Employee Turnover 162

8.5.2 Improving Reliability and Reducing Risk by Inverting the Focus 163

8.5.2.1 Shifting the Focus from the Components to the System 163

8.5.2.2 Starting from the Desired Ideal End Result 163

8.5.2.3 Focusing on Events that are Missing 164

8.5.3 Improving Reliability and Reducing Risk by Moving Backwards to Contributing Factors 164

8.5.3.1 Case Study: Identifying Failure Modes of a Lubrication System by Moving Backwards to Contributing Factors 165

8.5.4 Inverse Thinking in Mathematical Models Evaluating or Reducing Risk 166

8.5.4.1 Case Study: Using the Method of Inversion for Fast Evaluation of the Production Availability of a Complex System 167

8.5.4.2 Case Study: Repeated Inversion for Evaluating the Risk of Collision of Ships 170

9 Reliability Improvement and Risk Reduction Through Self-Reinforcement 177

9.1 Self-Reinforcement Mechanisms 177

9.2 Self-Reinforcement Relying on a Proportional Compensating Factor 179

9.2.1 Transforming Forces and Pressure into a Self-Reinforcing Response 179

9.2.1.1 Capturing a Self-Reinforcing Proportional Response from Friction Forces 179

9.2.1.2 Case Study: Transforming Friction Forces into a Proportional Response in the Design of a Friction Grip 180

9.2.1.3 Transforming Pressure into a Self-Reinforcing Response 182

9.2.1.4 Transforming Weight into a Self-Reinforcing Response 182

9.2.1.5 Transforming Moments into a Self-Reinforcing Response 182

9.2.1.6 Self-Reinforcement by Self-Balancing 183

9.2.1.7 Self-Reinforcement by Self-Anchoring 184

9.2.2 Transforming Motion into a Self-Reinforcing Response 186

9.2.3 Self-Reinforcement by Self-Alignment 186

9.2.3.1 Case Study: Self-Reinforcement by Self-Alignment of a Rectangular Panel Under Wind Pressure 187

9.2.4 Self-Reinforcement Through Modified Geometry and Strains 188

9.3 Self-Reinforcement by Feedback Loops 188

9.3.1 Self-Reinforcement by Creating Negative Feedback Loops 188

9.3.2 Positive Feedback Loops 189

9.3.3 Reducing Risk by Eliminating or Inhibiting Positive Feedback Loops with Negative Impact 190

9.3.3.1 Case Study: Growth of Damage Sustained by a Positive Feedback Loop with Negative Impact 192

9.3.4 Self-Reinforcement by Creating Positive Feedback Loops with Positive Impact 194

9.3.4.1 Case Study: Positive Feedback Loop Providing Self-Reinforcement by Self-Energising 195

10 Improving Reliability and Reducing Risk by Minimising the Rate of Damage Accumulation and by a Substitution 197

10.1 Improving Reliability and Reducing Risk by Minimising the Rate of Damage Accumulation 197

10.1.1 Classification of Failures Caused by Accumulation of Damage 197

10.1.2 Minimising the Rate of Damage Accumulation by Optimal Replacement 198

10.1.3 Minimising the Rate of Damage Accumulation by Selecting the Optimal Variation of the Damage-Inducing Factors 203

10.1.3.1 A Case Related to a Single Damage-Inducing Factor 203

10.1.3.2 A Case Related to Multiple Damage-Inducing Factors 206

10.1.3.3 Reducing the Rate of Damage Accumulation by Derating 209

10.1.4 Reducing the Rate of Damage Accumulation by Deliberate Weaknesses 210

10.1.5 Reducing the Rate of Damage Accumulation by Reducing Exposure to Acceleration Stresses 211

10.1.5.1 Reducing Exposure to Acceleration Stresses by Reducing the Magnitude of the Acceleration Stresses 211

10.1.5.2 Reducing Exposure to Acceleration Stresses by Modifying or Replacing the Working Environment 211

10.1.6 Reducing the Rate of Damage Accumulation by Appropriate Materials Selection, Design, and Manufacturing 212

10.2 Improving Reliability and Reducing Risk by Substitution with Assemblies Working on Different Physical Principles 213

10.2.1 Increasing Reliability by a Substitution with Magnetic Assemblies 215

10.2.2 Increasing Reliability by a Substitution with Electrical Systems 215

10.2.3 Increasing Reliability by a Substitution with Optical Assemblies 216

10.2.4 Increasing Reliability and Reducing Risk by a Substitution with Software 217

11 Improving Reliability by Comparative Models, Permutations, and by Reducing the Time/Space Exposure 219

11.1 A Comparative Method for Improving System Reliability 219

11.1.1 Comparative Method for Improving System Reliability Based on Proving an Inequality 220

11.1.2 The Method of Biased Coins for Proving System Reliability Inequalities 221

11.1.2.1 Case Study: Comparative Method for Improving System Reliability by the Method of Biased Coins 223

11.1.3 A Comparative Method Based on Computer Simulation for Production Networks 225

11.2 Improving Reliability and Reducing Risk by Permutations of Interchangeable Components and Processes 226

11.3 Improving Reliability and Availability by Appropriate Placement of the Condition Monitoring Equipment 229

11.4 Improving Reliability and Reducing Risk by Reducing Time/Space Exposure 231

11.4.1 Reducing the Time of Exposure 231

11.4.2 Reducing the Space of Exposure 232

11.4.2.1 Case Study: Reducing the Risk of Failure of Wires by Simultaneously Reducing the Cost 232

11.4.2.2 Case Study: Evaluating the Risk of Failure of Components with Complex Shape 233

12 Reducing Risk by Determining the Exact Upper Bound of Uncertainty 235

12.1 Uncertainty Associated with Properties from Multiple Sources 235

12.2 Quantifying Uncertainty in the Case of Known Mixing Proportions 237

12.2.1 Variance of a Property from Multiple Sources in the Case Where the Mixing Proportions are Known 239

12.2.1.1 Case Study: Estimating the Uncertainty in Setting Positioning Distance 239

12.3 A Tight Upper Bound for the Uncertainty in the Case of Unknown Mixing Proportions 242

12.3.1 Variance Upper Bound Theorem 242

12.3.2 An Algorithm for Determining the Exact Upper Bound of the Variance of Properties from Multiple Sources 243

12.3.3 Determining the Source Whose Removal Results in the Largest Decrease of the Exact Variance Upper Bound 244

12.4 Applications of the Variance Upper Bound 245

12.4.1 Using the Variance Upper Bound for Increasing the Robustness of Products and Processes 245

12.4.2 Using the Variance Upper Bound for Increasing the Robustness of Electronic Devices 246

12.4.2.1 Case Study: Calculating the Worst-Case Variation by the Variance Upper Bound Theorem 246

12.4.3 Using the Variance Upper Bound Theorem for Delivering Conservative Designs 247

12.4.3.1 Case Study: Identifying the Distributions Associated with the Worst-Case Variation During Virtual Testing 247

12.5 Using Standard Inequalities to Obtain a Tight Upper Bound for the Uncertainty in Mechanical Properties 248

References 251

Index 261

MICHAEL TODINOV has a background in mechanical engineering, applied mathematics and computer science. He received his PhD and his higher doctorate (DEng) from the University of Birmingham and is currently a professor of mechanical engineering at Oxford Brookes University, UK. Professor Todinov is an internationally acclaimed expert in reliability and risk. In 2017, he received the prestigious IMechE award in the area of risk reduction in mechanical engineering. He has published four research monographs and a large number of research papers in the area of reliability and risk. His name is associated with the development of theoretical and computational frameworks for the analysis and optimisation of repairable flow networks and for reliability analysis based on the cost of failure.

M. Todinov, Cranfield University, UK