Introduction
Welcome to the tutorial on Reliability Block Diagrams (RBDs) and System Reliability! In the RAM module, you learned to calculate reliability metrics for individual components. This tutorial extends those concepts to systems: collections of components whose arrangement determines whether the system succeeds or fails.
RBDs are a graphical tool for modeling how component reliabilities combine to produce system-level reliability. They are widely used in reliability engineering to analyze designs, identify vulnerabilities, and evaluate the benefit of redundancy.
Learning Objectives
By the end of this module, learners will be able to:
- Define reliability block diagrams and relate them to physical systems.
- Compute system reliability for series, parallel, and mixed configurations.
- Interpret k-out-of-n (voting) systems and calculate their reliability.
- Describe the relationship between RBDs and Fault Tree Analysis.
- Apply RBD calculations to a realistic multi-component system.
What is a Reliability Block Diagram?
A Reliability Block Diagram represents each component in a system as a block with a known reliability value. The blocks are connected by lines that show how the components relate to each other functionally:
- A path from left (input) to right (output) through functioning blocks represents a working system.
- The RBD is not a wiring diagram. It shows logical dependencies, not physical connections.
For example, a pump system that requires a motor, a pump, and a valve all to function would be drawn as three blocks in series. A backup generator that can substitute for a primary generator would appear as two blocks in parallel.
RBDs directly connect to the RAM metrics you already know:
- Each block has a reliability \(R_i(t) = e^{-\lambda_i t}\) (exponential model) or a Weibull \(R_i(t)\) (see the Life Data Analysis module).
- The system reliability is derived from the block reliabilities using the formulas in this tutorial.
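As a minimal sketch of how a block's reliability value can be obtained under the exponential model, the snippet below evaluates \(R_i(t) = e^{-\lambda_i t}\) at a single mission time. The failure rates and the 1,000-hour mission time are hypothetical values chosen only for illustration.

```r
# Hypothetical constant failure rates (failures per hour) for three blocks
lambda_i <- c(motor = 2e-5, pump = 5e-5, valve = 1e-5)

# Hypothetical mission time in hours
t_mission <- 1000

# Exponential model: R_i(t) = exp(-lambda_i * t), evaluated for every block at once
R_blocks <- exp(-lambda_i * t_mission)
R_blocks
```

Each element of R_blocks is the value that would be written inside the corresponding block of the diagram.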
Series Systems
In a series system, all components must function for the system to function. This is the most common configuration. Think of a chain in which every link must hold.
\[R_{sys} = R_1 \times R_2 \times \cdots \times R_n = \prod_{i=1}^{n} R_i\]
Key insight: A series system is never more reliable than its weakest component, and it is strictly less reliable whenever any other component has reliability below 1. Adding more components in series can only reduce system reliability.
Example
A water pumping system requires three components to all be operational: a motor (R = 0.95), a pump (R = 0.90), and a control valve (R = 0.98).
R_motor <- 0.95
R_pump <- 0.90
R_valve <- 0.98
R_series <- R_motor * R_pump * R_valve
R_series
## [1] 0.8379
The system reliability is approximately 83.8%, lower than any individual component.
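To see how each additional series component pulls the system reliability down, one option is a running product. This is a small sketch that reuses the three reliabilities from the example and assumes the components fail independently.

```r
# Component reliabilities from the pumping example
R_components <- c(motor = 0.95, pump = 0.90, valve = 0.98)

# Running product: system reliability after each component is added in series
R_running <- cumprod(R_components)
R_running

# The full series system is never more reliable than its weakest component
unname(tail(R_running, 1)) <= min(R_components)
```

The running product never increases from one step to the next, which is the "chain" behaviour described above.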
Calculate series reliability yourself.
# A conveyor belt system has 5 components in series.
# Component reliabilities: 0.98, 0.96, 0.99, 0.94, 0.97
# Calculate the system reliability.
# R_sys <- prod(c(0.98, 0.96, 0.99, 0.94, 0.97))
R_components <- c(0.98, 0.96, 0.99, 0.94, 0.97)
R_series <- prod(R_components)
R_series # ~0.849
Parallel Systems
In a parallel system, only one component needs to function for the system to succeed. The system fails only if all components fail simultaneously. This is called active redundancy.
\[R_{sys} = 1 - \prod_{i=1}^{n}(1 - R_i)\]
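Key insight: Adding a parallel component never decreases system reliability, and it increases it whenever the added component has nonzero reliability and the system is not already perfectly reliable. Redundancy is a powerful tool for critical systems.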
Example
A backup power system has a primary generator (R = 0.90) and a standby generator (R = 0.85). Either one alone keeps the system running.
R_primary <- 0.90
R_standby <- 0.85
R_parallel <- 1 - (1 - R_primary) * (1 - R_standby)
R_parallel
## [1] 0.985
The parallel system reliability is 98.5%, much higher than either generator alone.
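The effect of adding more redundant units can be tabulated directly. The sketch below assumes n identical, independent units, each with a hypothetical reliability of 0.90, and evaluates the parallel formula for n = 1 to 5.

```r
# Hypothetical reliability of each identical unit
R_unit <- 0.90

# Number of units placed in parallel
n_units <- 1:5

# Parallel formula for identical units: 1 - (1 - R)^n
R_parallel_n <- 1 - (1 - R_unit)^n_units

data.frame(n = n_units, R_sys = R_parallel_n)
```

Each added unit removes 90% of the remaining unreliability, so the absolute gain shrinks with every step: redundancy has diminishing returns.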
Calculate parallel reliability yourself.
# A critical pump station has 3 pumps in parallel.
# Each pump has reliability 0.88.
# What is the system reliability?
# R_sys <- 1 - prod(1 - c(0.88, 0.88, 0.88))
R_components <- c(0.88, 0.88, 0.88)
R_parallel <- 1 - prod(1 - R_components)
R_parallel # ~0.9983
Mixed Systems
Most real systems combine series and parallel blocks. To analyze a mixed system, decompose it into subsystems and apply series and parallel rules step by step, working from the innermost blocks outward.
Example
A safety system has:
- Subsystem A: a sensor (R = 0.95) and a transmitter (R = 0.97) in series.
- Subsystem B: two redundant actuators (each R = 0.90) in parallel.
- Subsystems A and B are in series (both must work).
# Subsystem A (series)
R_A <- 0.95 * 0.97
R_A
## [1] 0.9215
# Subsystem B (parallel)
R_B <- 1 - (1 - 0.90) * (1 - 0.90)
R_B
## [1] 0.99
# Overall system (A and B in series)
R_system <- R_A * R_B
R_system
## [1] 0.912285
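The inside-out decomposition can also be written with two small helper functions. The names series_rel and parallel_rel are hypothetical (they do not come from any package), and the sketch assumes independent component failures.

```r
# Hypothetical helpers: reliability of independent blocks in series / in parallel
series_rel   <- function(R) prod(R)
parallel_rel <- function(R) 1 - prod(1 - R)

# Innermost blocks first, then combine the subsystem results
R_A <- series_rel(c(0.95, 0.97))     # sensor and transmitter in series
R_B <- parallel_rel(c(0.90, 0.90))   # two redundant actuators in parallel
R_system <- series_rel(c(R_A, R_B))  # subsystems A and B in series
R_system
```

Nesting the calls in this way mirrors the structure of the diagram, which can make larger mixed systems easier to audit.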
k-out-of-n Systems
A k-out-of-n system succeeds if at least k of its n identical components function. This generalizes series (k = n) and parallel (k = 1):
- k = n: all must work → series.
- k = 1: at least one works → parallel.
- 1 < k < n: voting or load-sharing systems.
For identical, independent components, each with reliability p, the number of working components follows a binomial distribution, so the system reliability is the probability that at least k of the n components work:
\[R_{k/n} = \sum_{i=k}^{n} \binom{n}{i} p^i (1-p)^{n-i} = 1 - \text{pbinom}(k-1, n, p)\]
Example
A flight control system uses 3 redundant computers. At least 2 of the 3 must agree for the system to function safely (a 2-out-of-3 voter). Each computer has reliability R = 0.99.
n <- 3 # total components
k <- 2 # minimum required
p <- 0.99 # individual reliability
R_voting <- 1 - pbinom(k - 1, n, p) # P(at least k of the n computers work)
R_voting
## [1] 0.999702
The 2-out-of-3 system has a reliability of about 0.9997, higher than the 0.99 of a single computer.
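As a cross-check, the binomial sum can be evaluated term by term and compared with the cumulative distribution function. In this sketch, k_out_of_n is a hypothetical helper name, X denotes the number of working components, and components are assumed identical and independent.

```r
# Hypothetical helper: explicit binomial sum for a k-out-of-n system
k_out_of_n <- function(k, n, p) {
  i <- k:n
  sum(choose(n, i) * p^i * (1 - p)^(n - i))
}

# 2-out-of-3 voter with p = 0.99
R_sum <- k_out_of_n(2, 3, 0.99)

# Same quantity from the binomial CDF: P(X > k - 1) = P(X >= k)
R_cdf <- pbinom(2 - 1, 3, 0.99, lower.tail = FALSE)

all.equal(R_sum, R_cdf)

# Limiting cases
all.equal(k_out_of_n(3, 3, 0.99), 0.99^3)            # k = n reduces to series
all.equal(k_out_of_n(1, 3, 0.99), 1 - (1 - 0.99)^3)  # k = 1 reduces to parallel
```

Note that the success probability passed to pbinom is p, the component reliability, because X counts working components rather than failed ones.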
Calculate k-out-of-n reliability in R.
# A 3-out-of-5 redundant sensor array.
# Each sensor has reliability p = 0.95.
# Calculate the system reliability.
n <- 5
k <- 3
p <- 0.95
# R_sys <- 1 - pbinom(k - 1, n, p)
n <- 5
k <- 3
p <- 0.95
R_sys <- 1 - pbinom(k - 1, n, p)
R_sys # ~0.9988
System MTTF
For a series system of components with constant failure rates (exponential distribution), the system failure rate is the sum of component failure rates:
\[\lambda_{sys} = \sum_{i=1}^{n} \lambda_i \qquad MTTF_{sys} = \frac{1}{\lambda_{sys}}\]
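A short sketch of the series calculation follows. The three failure rates are hypothetical values chosen for illustration, and constant (exponential) failure rates are assumed.

```r
# Hypothetical constant failure rates (failures per hour) of three series components
lambda_i <- c(2e-4, 5e-4, 1e-4)

# Series system: failure rates add
lambda_sys <- sum(lambda_i)

# System MTTF and, for comparison, the shortest component MTTF
MTTF_sys <- 1 / lambda_sys
MTTF_sys
min(1 / lambda_i)
```

The system MTTF is shorter than that of even the least reliable component, which matches the series key insight stated earlier.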
For \(n\) identical parallel components each with failure rate \(\lambda\):
\[MTTF_{parallel} = \frac{1}{\lambda}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right)\]
Example
Two pumps in parallel, each with \(\lambda = 0.01\) failures/hour:
lambda <- 0.01
MTTF_single <- 1 / lambda
MTTF_parallel <- (1 / lambda) * (1 + 1/2)
MTTF_single # 100 hours
## [1] 100
MTTF_parallel # 150 hours — 50% longer with one spare
## [1] 150
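The harmonic-sum formula can be checked against the general definition of MTTF as the integral of the reliability function from 0 to infinity. The sketch below does this numerically with integrate() for the two-pump example, assuming independent, identical units with exponential lifetimes.

```r
lambda <- 0.01  # failures per hour, as in the example
n <- 2          # number of identical pumps in parallel

# Reliability function of the active-parallel system at time t
R_par <- function(t) 1 - (1 - exp(-lambda * t))^n

# MTTF = integral of R(t) over [0, Inf), evaluated numerically
MTTF_numeric <- integrate(R_par, lower = 0, upper = Inf)$value

# Closed form: (1 / lambda) * (1 + 1/2 + ... + 1/n)
MTTF_formula <- (1 / lambda) * sum(1 / seq_len(n))

c(numeric = MTTF_numeric, formula = MTTF_formula)
```

The two values should agree to within the numerical tolerance of integrate(). Changing n shows how slowly the harmonic sum grows as more units are added.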
Introduction to Fault Tree Analysis
Fault Tree Analysis (FTA) is a top-down approach to reliability analysis that starts with an undesirable system event (the top event) and works backward to identify the combinations of component failures that could cause it.
While an RBD asks “What must work for the system to succeed?”, a fault tree asks “What can cause the system to fail?”
Gates
A fault tree uses logic gates to combine failure events:
- AND gate (\(\land\)): the output event (the top event, in a single-gate tree) occurs only if all input events occur. This corresponds to parallel components in an RBD: all must fail to cause system failure.
- OR gate (\(\lor\)): the output event occurs if any input event occurs. This corresponds to series components in an RBD: any single failure causes system failure.
RBD — Fault Tree Duality
Every RBD has a corresponding fault tree and vice versa:
| RBD configuration | Fault tree gate for system failure |
|---|---|
| Series (all must work) | OR gate (any failure causes system failure) |
| Parallel (any can work) | AND gate (all must fail for system failure) |
This duality means the two tools provide complementary views of the same system. RBDs are better for computing reliability; fault trees are better for tracing failure causes and identifying critical combinations.
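The duality can be verified numerically by working in failure probabilities \(q_i = 1 - R_i\). The sketch below assumes independent basic events and uses two hypothetical component reliabilities.

```r
# Hypothetical component reliabilities and the matching failure probabilities
R <- c(0.95, 0.90)
q <- 1 - R

# OR gate: the output occurs if any input event occurs
P_or <- 1 - prod(1 - q)

# AND gate: the output occurs only if all input events occur
P_and <- prod(q)

# OR gate probability equals the unreliability of the series RBD
all.equal(P_or, 1 - prod(R))

# AND gate probability equals the unreliability of the parallel RBD
all.equal(P_and, 1 - (1 - prod(1 - R)))
```

Both comparisons hold because a fault tree computes the probability of failure, which is one minus the reliability computed from the corresponding RBD.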
The fault tree below illustrates a simple two-component system: the top event (system failure) occurs if either component fails — an OR gate, which corresponds to a series RBD.
For complex fault trees, the R package FaultTree on CRAN provides tools for building and analyzing fault tree models programmatically.
Case Study: Industrial Cooling System
Let’s apply everything to a realistic example. An industrial cooling system has the following architecture:
- Water supply subsystem: two pumps in parallel (each R = 0.92), followed by a filter in series (R = 0.99).
- Control subsystem: a primary controller (R = 0.97) in parallel with a backup controller (R = 0.95).
- The water supply and control subsystems must both work (series at system level).
# Step 1 — Water supply subsystem
R_pump_parallel <- 1 - (1 - 0.92)^2
R_water <- R_pump_parallel * 0.99
R_water
## [1] 0.983664
# Step 2 — Control subsystem
R_control <- 1 - (1 - 0.97) * (1 - 0.95)
R_control
## [1] 0.9985
# Step 3 — Overall system (both subsystems in series)
R_system <- R_water * R_control
R_system
## [1] 0.9821885
Now explore what happens if the filter is upgraded.
# Modify the case study: the filter is upgraded to R = 0.999.
# Recalculate the system reliability with the improved filter.
R_pump1 <- 0.92
R_pump2 <- 0.92
R_filter <- 0.999 # upgraded from 0.99
R_ctrl1 <- 0.97
R_ctrl2 <- 0.95
# R_pump_parallel <- 1 - (1 - R_pump1) * (1 - R_pump2)
# R_water <- R_pump_parallel * R_filter
# R_control <- 1 - (1 - R_ctrl1) * (1 - R_ctrl2)
# R_system <- R_water * R_control
R_pump1 <- 0.92
R_pump2 <- 0.92
R_filter <- 0.999
R_ctrl1 <- 0.97
R_ctrl2 <- 0.95
R_pump_parallel <- 1 - (1 - R_pump1) * (1 - R_pump2)
R_water <- R_pump_parallel * R_filter
R_control <- 1 - (1 - R_ctrl1) * (1 - R_ctrl2)
R_system <- R_water * R_control
R_system # ~0.991
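To compare design options side by side, the case-study calculation can be wrapped in a function. The name cooling_system and its arguments are hypothetical, and the sketch assumes independent component failures and two identical pumps.

```r
# Hypothetical wrapper around the case-study calculation
cooling_system <- function(R_filter, R_pump = 0.92, R_ctrl1 = 0.97, R_ctrl2 = 0.95) {
  R_water   <- (1 - (1 - R_pump)^2) * R_filter      # two pumps in parallel, then filter in series
  R_control <- 1 - (1 - R_ctrl1) * (1 - R_ctrl2)    # primary and backup controllers in parallel
  R_water * R_control                               # both subsystems in series
}

R_base     <- cooling_system(R_filter = 0.99)
R_upgraded <- cooling_system(R_filter = 0.999)

c(baseline = R_base, upgraded = R_upgraded, gain = R_upgraded - R_base)
```

The baseline is about 0.9822 and the upgraded system about 0.9911. The filter is a single block in series with everything else, so improving it has a comparatively large effect on the whole system.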
Summary
Congratulations on completing the Reliability Block Diagrams and System Reliability tutorial!
Key takeaways:
- Series: \(R_{sys} = \prod R_i\) — every component must work; reliability decreases with more components.
- Parallel: \(R_{sys} = 1 - \prod(1-R_i)\) — redundancy; only one component needs to work.
- Mixed: decompose into subsystems, apply formulas step by step.
- k-out-of-n: 1 - pbinom(k-1, n, p) for voting / load-sharing systems.
- Series MTTF: \(1/\sum\lambda_i\); Parallel MTTF of \(n\) identical units: \((1/\lambda)\sum_{i=1}^{n}(1/i)\).
- FTA duality: series RBD ↔︎ OR gate; parallel RBD ↔︎ AND gate.
References
Abernethy R (2004). The New Weibull Handbook. Fifth Edition.
Aden-Buie G, Schloerke B, Allaire J, Rossell Hayes A (2023). learnr: Interactive Tutorials for R. https://rstudio.github.io/learnr/, https://github.com/rstudio/learnr.
Billinton R, Allan R (1992). Reliability Evaluation of Engineering Systems. Second Edition. Plenum Press, New York.
Vesely W, Goldberg F, Roberts N, Haasl D (1981). Fault Tree Handbook. US Nuclear Regulatory Commission, NUREG-0492.