# Methods for Reliability Improvement and Risk Reduction

Reliability is one of the most important attributes for the products and processes of any company or organization. This important work provides a powerful framework of domain-independent reliability improvement and risk-reducing methods that can greatly lower risk in any area of human activity. It reviews existing methods for risk reduction that can be classified as domain-independent and introduces the following new domain-independent reliability improvement and risk reduction methods:

* Separation

* Stochastic separation

* Introducing deliberate weaknesses

* Segmentation

* Self-reinforcement

* Inversion

* Reducing the rate of accumulation of damage

* Permutation

* Substitution

* Limiting the space and time exposure

* Comparative reliability models

The domain-independent methods for reliability improvement and risk reduction do not depend on the availability of past failure data, domain-specific expertise or knowledge of the failure mechanisms underlying the failure modes. Through numerous examples and case studies, this invaluable guide shows that many of the new domain-independent methods improve reliability at no extra cost or at a low cost.

Using the proven methods in this book, any company or organisation can greatly enhance the reliability of its products and operations.

1.1. The domain-specific methods for risk reduction

1.2. The statistical, data-driven approach

1.3. The physics-of-failure approach

1.4. Reliability improvement and TRIZ

1.5. The domain-independent methods for reliability improvement and risk reduction

2. Basic Concepts

2.1. Likelihood of failure, consequences from failure, potential loss and risk of failure

2.2. Drawbacks of the expected loss as a measure of the potential loss from failure

2.3. Potential loss, conditional loss and risk of failure

2.4. Improving reliability and reducing risk

2.5. Resilience

3. Overview of methods and principles for improving reliability and reducing risk that can be classified as domain-independent

3.1. Improving reliability and reducing risk by preventing failure modes from occurring

3.1.1. Techniques for identifying and assessing failure modes

3.1.2. Effective risk reduction procedure related to preventing failure modes from occurring

3.1.3. Reliability improvement and risk reduction by root-cause analysis

3.1.3.1. Case study: Improving the reliability of automotive suspension springs by root-cause analysis

3.1.4. Preventing failure modes by removing latent faults

3.2. Improving reliability and reducing risk by a fault-tolerant system design and 'fail-safe' design

3.2.1. Building in redundancy

3.2.1.1. Case study: Improving reliability by active k-out-of-n redundancy

3.2.2. Fault-tolerant design

3.2.3. Fail-safe principle and fail-safe design

3.2.4. Reducing risk by eliminating vulnerabilities

3.2.4.1. Eliminating design vulnerabilities

3.2.4.2. Reducing the negative impact of weak links

3.2.4.3. Reducing the likelihood of unfavourable combinations of risk-critical random variables

3.2.4.4. Reducing the vulnerability of computational models

3.2.4.4.1. Case study: Revealing a vulnerability in the Newton-Raphson method for solving non-linear equations

3.3. Improving reliability and reducing risk by protecting against common cause

3.4. Improving reliability and reducing risk by simplifying at a system and component level

3.5. Improving reliability and reducing risk by reducing the variability of risk-critical parameters

3.5.1. Case study: Interaction between the upper tail of the load distribution and the lower tail of the strength distribution

3.6. Improving reliability and reducing risk by making the design robust

3.6.1. Case study: Increasing the robustness of a spring assembly with constant clamping force

3.7. Improving reliability and reducing risk by built-in reinforcement

3.7.1. Built-in prevention reinforcement

3.7.2. Built-in protection reinforcement

3.8. Improving reliability and reducing risk by condition monitoring

3.9. Reducing the risk of failure by improving maintainability

3.10. Reducing risk by eliminating factors promoting human errors

3.11. Reducing risk by reducing the hazard potential

3.12. Using barriers to prevent damage escalation and reduce the rate of deterioration

3.13. Reducing risk by efficient troubleshooting procedures and systems

3.14. Risk planning and training

4. Improving reliability and reducing risk by separation

4.1. The method of separation

4.2. Separation of risk-critical factors

4.2.1. Time separation by scheduling

4.2.1.1. Case study: Full time separation with random starts of the events

4.2.2. Time and space separation by using interlocks

4.2.2.1. Case study: A time separation by using an interlock

4.2.3. Time separation in distributed systems by using logical clocks

4.2.4. Space separation of information

4.2.5. Separation of duties to reduce the risk of compromised safety, errors and fraud

4.2.6. Logical separation by using a shared unique key

4.2.6.1. Case study: Logical separation of X-ray equipment by a shared unique key

4.2.7. Separation by providing conditions for independent operation

4.3. Separation of functions, properties or behaviour for distinct components or parts

4.3.1. Separation of functions

4.3.1.1. Separation of functions to optimise for maximum reliability

4.3.1.2. Separation of functions to reduce load magnitudes

4.3.1.3. Separation of a single function into multiple components to reduce vulnerability to a single failure

4.3.1.4. Separation of functions to compensate deficiencies

4.3.1.5. Separation of functions to prevent unwanted interactions

4.3.1.6. Separation of methods to reduce the risk associated with errors in mathematical models

4.4. Separation of properties to counter poor performance caused by homogeneity

4.4.1. Separation of strength across components and zones according to the intensity of the stresses from loading

4.4.2. Separation of properties to satisfy conflicting requirements

4.4.3. Separation in geometry

4.4.3.1. Case study: Separation in geometry for a cantilever beam

4.5. Separation on a parameter, conditions or scale

4.5.1. Separation at distinct values of a risk-critical parameter through deliberate weaknesses and stress limiters

4.5.2. Separation by using phase changes

4.5.3. Separation of reliability across components and assemblies according to their cost of failure

4.5.3.1. Case study: Separation of the reliability of components based on the cost of failure

5. Reducing Risk by Deliberate Weaknesses

5.1. Reducing the consequences from failure through deliberate weaknesses

5.2. Separation from excessive levels of stress

5.2.1. Deliberate weaknesses disconnecting excessive load

5.2.2. Energy absorbing deliberate weaknesses

5.2.2.1. A case study: Reducing the maximum stress from dynamic loading by energy absorbing elastic components

5.2.3. Designing frangible objects or weakly fixed objects

5.3. Separation from excessive levels of damage

5.3.1. Deliberate weaknesses decoupling damaged regions and limiting the spread of damage

5.3.2. Deliberate weaknesses providing stress and strain relaxation

5.3.3. Deliberate weaknesses separating from excessive levels of damage accumulation

5.4. Deliberate weaknesses deflecting the failure location or damage propagation

5.4.1. Deflecting the failure location from places where the cost of failure is high

5.4.2. Deflecting the failure location from places where the cost of intervention for repair is high

5.4.3. Deliberate weaknesses deflecting the propagation of damage

5.5. Deliberate weaknesses designed to provide warning

5.6. Deliberate weaknesses designed to provide quick access or activate protection

5.7. Deliberate weaknesses and stress limiters

6. Improving reliability and reducing risk by stochastic separation

6.1. Stochastic separation of risk-critical factors/events

6.1.1. Real-life applications that require stochastic separation

6.1.2. Stochastic separation of a fixed number of random events with different duration times

6.1.2.1. Case study: Stochastic separation of consumers by proportionally reducing their demand times

6.1.3. Stochastic separation of random events following a homogeneous Poisson process

6.1.3.1. Case study: Stochastic separation of random demands following a homogeneous Poisson process

6.1.4. Stochastic separation based on the probability of overlapping of random events, for more than a single source servicing the random demands

6.1.5. Computer simulation algorithm for determining the probability of overlapping of a given order, for more than a single source servicing the demands

6.2. Expected time fraction of simultaneous presence of critical events

6.2.1. Case study: Expected fraction of unsatisfied demand at a constant sum of the time fractions of the user demands

6.2.2. Case study: Servicing random demands from 10 different users each characterised by a distinct demand time fraction

6.3. Analytical method for determining the expected fraction of unsatisfied demand for repair

6.3.1. Case study: Servicing random repairs from a system including 42 components of 3 different types, each characterised by a distinct repair time

6.4. Expected time fraction of simultaneous presence of critical events that have been initiated with specified probabilities

6.4.1. Case study: Servicing random demands from patients in a hospital

6.4.2. Case study: Servicing random demands from 4 different types of users each issuing a demand with certain probability

6.5. Stochastic separation based on the expected fraction of unsatisfied demand

6.5.1. Fixed number of random demands on a time interval

6.5.2. Random demands following a homogeneous Poisson process on a time interval

6.5.2.1. Case study: Servicing random failures from circular knitting machines by the optimal number of repairmen

7. Improving reliability and reducing risk by segmentation

7.1. Segmentation as a problem-solving strategy

7.2. Creating a modular system by segmentation

7.3. Preventing damage accumulation and limiting damage propagation by segmentation

7.3.1. Creating barriers containing damage

7.3.2. Creating weak interfaces dissipating or deflecting damage

7.3.3. Reducing deformations and stresses by segmentation

7.3.4. Reducing hazard potential by segmentation

7.3.5. Reducing the likelihood of errors by segmenting operations

7.3.6. Limiting the presence of flaws by segmentation

7.4. Improving fault tolerance and reducing vulnerability to a single failure by segmentation

7.4.1. Case study: Improving fault tolerance of a column loaded in compression by segmentation

7.4.2. Reducing the vulnerability to a single failure by segmentation

7.5. Reducing loading stresses by segmentation

7.5.1. Improving the load distribution by segmentation

7.5.2. Improving heat dissipation by segmentation

7.5.3. Case study: Reducing stress by increasing the perimeter to cross-section area ratio through segmentation

7.6. Reducing the probability of a loss/error by segmentation

7.6.1. Reducing the likelihood of a loss by segmenting opportunity bets

7.6.1.1. Case study: Reducing the risk of a loss from a risky prospect involving a single opportunity bet

7.6.2. Reducing the likelihood of a loss by segmenting an investment portfolio

7.6.3. Reducing the likelihood of erroneous conclusion from imperfect tests, by segmentation

7.7. Decreasing the variation of properties by segmentation

7.8. Improved control and condition monitoring by time segmentation

8. Improving reliability and reducing risk by inversion

8.1. The method of inversion

8.2. Improving reliability by inverting functions, relative position and motion

8.2.1. Case study: Eliminating failure modes of an alarm circuit by inversion of functions

8.2.2. Improving reliability by inverting the relative position of objects

8.2.2.1. Case study: Inverting the position of an object with respect to its support to achieve a self-reinforcing response

8.3. Improving reliability by inverting properties and geometry

8.3.1. Case study: Improving reliability by inverting mechanical properties whilst maintaining an invariant

8.3.2. Case study: Improving reliability by inverting geometry whilst maintaining an invariant

8.4. Improving reliability and reducing risk by introducing inverse states

8.4.1. Inverse states cancelling anticipated undesirable effects

8.4.2. Inverse states buffering anticipated undesirable effects

8.4.3. Inverse states limiting the likelihood of an erroneous action

8.5. Improving reliability and reducing risk by inverse thinking

8.5.1. Inverting the problem related to reliability improvement and risk reduction

8.5.1.1. Case study: Reducing the risk of high employee turnover due to bad management

8.5.2. Improving reliability and reducing risk by inverting the focus

8.5.2.1. Shifting the focus from the components to the system

8.5.2.2. Starting from the desired ideal end result

8.5.2.3. Focusing on events that are missing

8.5.3. Improving reliability and reducing risk by moving backwards to contributing factors

8.5.3.1. Case study: Identifying failure modes of a lubrication system by moving backwards to contributing factors

8.5.4. Inverse thinking in mathematical models evaluating or reducing risk

8.5.4.1. Case study: Using the method of inversion for fast evaluation of the production availability of a complex system

8.5.4.2. Case study: Repeated inversion for evaluating the risk of collision of ships

9. Reliability improvement and risk reduction through self-reinforcement

9.1. Self-reinforcement mechanisms

9.2. Self-reinforcement relying on a proportional compensating factor

9.2.1. Transforming forces and pressure into a self-reinforcing response

9.2.1.1. Capturing a self-reinforcing proportional response from friction forces

9.2.1.2. Case study: Transforming friction forces into a proportional response in the design of a friction grip

9.2.1.3. Transforming pressure into a self-reinforcing response

9.2.1.4. Transforming weight into a self-reinforcing response

9.2.1.5. Transforming moments into a self-reinforcing response

9.2.1.6. Self-reinforcement by self-balancing

9.2.1.7. Self-reinforcement by self-anchoring

9.2.2. Transforming motion into a self-reinforcing response

9.2.3. Self-reinforcement by self-alignment

9.2.3.1. Case study: Self-reinforcement by self-alignment of a rectangular panel under wind pressure

9.2.4. Self-reinforcement through modified geometry and strains

9.3. Self-reinforcement by feedback loops

9.3.1. Self-reinforcement by creating negative feedback loops

9.3.2. Positive feedback loops

9.3.3. Reducing risk by eliminating or inhibiting self-reinforcing positive feedback loops with negative impact

9.3.3.1. Case study: Growth of damage sustained by a positive feedback loop with negative impact

9.3.4. Self-reinforcement by creating positive feedback loops with positive impact

9.3.4.1. Case study: Positive feedback loop providing self-reinforcement by self-energizing

10. Improving reliability and reducing risk by minimizing the rate of damage accumulation and by a substitution

10.1. Improving reliability and reducing risk by minimizing the rate of damage accumulation

10.1.1. Classification of failures caused by accumulation of damage

10.1.2. Minimizing the rate of damage accumulation by optimal replacement

10.1.3. Minimizing the rate of damage accumulation by selecting the optimal variation of the damage-inducing factors

10.1.3.1. A case related to a single damage-inducing factor

10.1.3.2. A case related to multiple damage-inducing factors

10.1.3.3. Reducing the rate of damage accumulation by derating

10.1.4. Reducing the rate of damage accumulation by deliberate weaknesses

10.1.5. Reducing the rate of damage accumulation by reducing exposure to acceleration stresses

10.1.5.1. Reducing exposure to acceleration stresses by reducing the magnitude of the acceleration stresses

10.1.5.2. Reducing exposure to acceleration stresses by modifying or replacing working environment

10.1.6. Reducing the rate of damage accumulation by appropriate materials selection, design and manufacturing

10.2. Improving reliability and reducing risk by substitution with assemblies working on a different physical principle

10.2.1. Increasing reliability by a substitution with magnetic assemblies/devices

10.2.2. Increasing reliability by a substitution with electrical systems

10.2.3. Increasing reliability by a substitution with optical assemblies

10.2.4. Increasing reliability and reducing risk by a substitution with software

11. Improving Reliability by Comparative Models, Permutations and by Reducing the Time/Space Exposure

11.1. A comparative method for improving system reliability

11.1.1. Comparative method for improving system reliability based on proving an inequality

11.1.2. The method of biased coins for proving system reliability inequalities

11.1.2.1. Case study: Comparative method for improving system reliability by the method of biased coins

11.1.3. A comparative method based on computer simulation for production networks

11.2. Improving reliability and reducing risk by permutations of interchangeable components and processes

11.3. Improving availability by appropriate placement of the monitoring equipment

11.4. Improving reliability and reducing risk by reducing time/space exposure

11.4.1. Reducing the time of exposure

11.4.2. Reducing the space of exposure

11.4.2.1. Case study: Reducing the risk of failure of wires by simultaneously reducing the cost

11.4.2.2. Case study: Evaluating the risk of failure of components with complex shape

12. Reducing risk by determining the exact upper bound of uncertainty

12.1. Uncertainty associated with properties from multiple sources

12.2. Quantifying uncertainty in the case of known mixing proportions

12.2.1. Variance of a property from multiple sources in the case where the mixing proportions are known

12.2.1.1. Case study: Bounding uncertainty in setting positioning distance

12.3. A tight upper bound for the uncertainty in the case of unknown mixing proportions

12.3.1. Variance upper bound theorem

12.3.2. An algorithm for determining the exact upper bound of the variance of properties from multiple sources

12.3.3. Determining the source whose removal results in the largest decrease of the exact variance upper bound

12.4. Applications of the variance upper bound

12.4.1. Using the variance upper bound for increasing the robustness of products and processes

12.4.2. Using the variance upper bound for increasing the robustness of electronic devices

12.4.2.1. Case study: Calculating the worst-case variation by the variance upper bound theorem

12.4.3. Using the variance upper bound theorem for delivering conservative designs

12.4.3.1. Case study: Identifying the distributions associated with the worst-case variation during virtual testing by using the variance upper bound theorem

12.5. Using standard inequalities to obtain a tight upper bound for the uncertainty in mechanical properties

12.6. Using standard inequalities to obtain a tight upper bound for the probability of an unfavourable combination of design parameters