ECE/BaS June 2014 - page 19

Electrotechnical Standardization (CENELEC)
has issued relevant documents. These include
EN 50126 (Reliability, Availability, Maintain-
ability and Safety (RAMS) in rail systems),
EN 50128 (Software for railway control and
protection systems) and EN 50129 (Safety re-
lated electronic systems for signaling). EN
50129 also contains the exact definition of the
Safety Integrity Levels (SIL) specifically for
railway applications. It derives the matching SIL
from the Tolerable Hazard Rate (THR) per hour
and per function. SIL 4 allows a THR range of
10^-9 to 10^-8, down to SIL 1 with 10^-6 to 10^-5.
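The mapping from THR to SIL can be sketched as a simple band lookup. This is a minimal illustration, not text from the standard: the SIL 4 and SIL 1 bands are the ones quoted above, the SIL 3 and SIL 2 bands are assumed to fill the two decades in between, and the function name `sil_for_thr` and the treatment of the band edges (lower bound inclusive, upper bound exclusive) are assumptions made for this example.

```python
from typing import Optional

# (SIL, lower THR bound, upper THR bound); THR is per hour and per function.
# SIL 4 and SIL 1 are quoted in the article; SIL 3 and SIL 2 are assumed
# to cover the decades in between.
SIL_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_for_thr(thr_per_hour: float) -> Optional[int]:
    """Return the SIL whose band contains the THR, or None if no band matches."""
    for sil, low, high in SIL_BANDS:
        # Assumption: lower bound inclusive, upper bound exclusive.
        if low <= thr_per_hour < high:
            return sil
    return None


if __name__ == "__main__":
    for thr in (5e-9, 3e-8, 2e-7, 5e-6, 1e-3):
        print(f"THR {thr:.0e}/h -> SIL {sil_for_thr(thr)}")
```

A THR outside all four bands returns `None` here; how such a case is handled in a real project depends on the applicable standard and safety case.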
A safety-critical application demands a safe
system with defined failure behavior. The
most logical starting point for incorporating
system-level safety is the initial planning
phase. It is during this phase that the all-important
step of fault recognition has the highest
chance of success, and starting here helps to
ensure that fault recognition is built into
the basic structure. If a potential fault is not
identified during the initial phases, the system
cannot be prepared to react to that fault.
Failure Modes and Effects Analysis (FMEA), an
inductive analysis technique, can be conducted
at the initial planning phase to determine whether
the system will satisfy the required Mean Time
Between Failures (MTBF), the predicted elapsed
time between inherent failures of a system
during operation.
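As a rough sketch of how per-component failure rates from an FMEA-style table might be checked against an MTBF requirement: the example below assumes a series system (any single component failure causes a system failure) with constant, independent failure rates, so the rates simply add. The component names, failure rates, and the required MTBF are made-up illustrative values, not data from the article.

```python
def series_mtbf_hours(failure_rates_per_hour):
    """MTBF of a system that fails as soon as any one component fails.

    Assumes constant, independent failure rates, so the system failure
    rate is the sum of the component rates and MTBF is its reciprocal.
    """
    total_rate = sum(failure_rates_per_hour)
    return 1.0 / total_rate


# Hypothetical FMEA rows: component -> failure rate per hour (illustrative only).
fmea_rates = {
    "cpu_board": 2e-6,
    "power_supply": 5e-6,
    "io_module": 3e-6,
}

required_mtbf_hours = 50_000  # hypothetical requirement

predicted = series_mtbf_hours(fmea_rates.values())
meets_requirement = predicted >= required_mtbf_hours

if __name__ == "__main__":
    print(f"Predicted MTBF: {predicted:.0f} h, requirement met: {meets_requirement}")
```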
It is also important to consider which type of
failure behavior a subsystem must have. A
fail-operational system is capable of completing
an operation, or at least phases of the operation,
following the failure of one or more components.
Since the operation may not be completed as
designed, these systems can result in unsafe
conditions. In contrast, a fail-safe system is
designed to mitigate unsafe consequences in the
event of failure. When the system fails, the
application ends up in a state that is, at a
minimum, no less safe than when it is operating
correctly.
An important strategy used to lessen risk in
safety-critical systems is the use of redundant
subsystems. Redundancy can reduce risk by
increasing the MTBF value, which is
achieved by incorporating multiple instances
of the same component in a system. Each of the
many possible approaches to designing redundancy
into a system carries its own capabilities
and limitations. A duplicated computer
increases availability: if one CPU fails, the
system can stay online using the remaining
CPU. To increase safety as well, both systems
must be operating in order for the application
to run. However, if full system function depends
on two systems working correctly, this actually
reduces availability. The most common configuration
for increasing both safety and availability is a
triple-redundant system. Three CPUs or systems
are running with a voter that constantly
Figure 2. Triple redundancy on the A602
VMEbus computer from MEN
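The availability trade-off between the duplicated and triple-redundant configurations discussed in the text can be illustrated numerically. This is a simplified sketch, not a model of any specific product: it assumes identical units that fail independently, each with availability `a`, and it ignores the failure probability of the voter or switchover logic. The function names are chosen for this example only.

```python
def availability_1oo2(a: float) -> float:
    """Two units, either one is enough to keep the system online."""
    return 1 - (1 - a) ** 2


def availability_2oo2(a: float) -> float:
    """Two units, both must be working for the application to run."""
    return a ** 2


def availability_2oo3(a: float) -> float:
    """Three units, any two must be working (majority of three)."""
    return 3 * a ** 2 - 2 * a ** 3


if __name__ == "__main__":
    a = 0.99  # illustrative single-unit availability
    print(f"single unit: {a:.6f}")
    print(f"1 out of 2:  {availability_1oo2(a):.6f}")
    print(f"2 out of 2:  {availability_2oo2(a):.6f}")
    print(f"2 out of 3:  {availability_2oo3(a):.6f}")
```

Under these assumptions, requiring both of two units lowers availability below that of a single unit, while two-out-of-three raises it above a single unit, which matches the qualitative argument in the text.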