Root Cause Failure Analysis: A Case Study

Tutorial

Finding the causes of failures in complex systems is much more challenging than examining failed components and determining why they failed. In a systems failure analysis, the failure analysis team must consider complex system interactions, identify all possible failure causes, and then systematically evaluate each one to rule it in or out. The cause may be a nonconforming component, an engineering error, a process change, or some other factor. The analysis becomes much harder if the failure occurs intermittently, if the failed hardware is not available for examination, or if sub-tier suppliers can induce the hypothesized failure causes. This article describes a cluster bomb failure and outlines the steps required to find and correct the failure’s root cause.

figure1.png

Figure 1. CBU-87/B Cluster Bomb Operational Sequence. The aircraft releases the dispenser and it spins up in the airstream. At a pre-selected dispenser rotational velocity, linear charges fire to open the dispenser, releasing the submunitions into the airstream. The submunition pattern on the ground is controlled by selecting the aircraft altitude, the time at which the dispenser starts to rotate, and the dispenser rotational velocity.

The CBU-87/B is a cluster bomb that has been used with great effect by the U.S. Air Force in Afghanistan and Iraq. Developed in the 1980s, the system consists of a dispenser, a dispenser fuse, and 202 submunitions. Combat aircraft deliver and release the cluster bomb at altitudes and speeds designed to control the submunition pattern on the ground. After release from the aircraft, dispenser fins deploy immediately to stabilize the device. The cluster bomb free falls to a specified height, at which point an explosive bolt fires in the dispenser tail section. When this occurs, the dispenser fins tilt and the dispenser spins up to a pre-selected rotational speed. As the dispenser attains the preset rotational velocity, an inertial sensor sends a signal to fire linear charges. The linear charges open the dispenser by peeling away its skin, releasing the submunitions into the airstream. Figure 1 shows the operational sequence.

A dispenser fuse controls the firing of the linear charges through an explosive train, as Figure 2 shows.

figure2.png

Figure 2. Detonator Fuse Explosive Train. The electric detonator ignites an explosive transfer mechanism, which fires Detonator A. This fires across a 0.30-inch gap to ignite Detonator B, which then ignites the linear charges.

The signal to open the dispenser discharges an electric detonator, which fires into an explosive transfer mechanism. The explosive transfer mechanism ignites Detonator A, which fires across a 0.30-inch gap to ignite Detonator B. Detonator B then ignites the secondary explosive (the linear charges that open the dispenser). When the dispenser fuse is not armed, a 0.030-inch steel barrier is inserted between Detonator A and Detonator B. If the electric detonator, the explosive transfer mechanism, or Detonator A fires inadvertently, the explosive event will stop at the steel barrier and the dispenser will not open. When the device is armed, the steel barrier is removed to allow the explosive event to continue across the gap.

The company that manufactured the dispenser fuse built it to a U.S. government design. For lot acceptance testing, two dispensers were dropped from an aircraft each month. The manufacturer experienced no failures during the first several months of production. Then, during one of the monthly lot acceptance flight tests, one of the dispensers failed to open.

After preparing a fault tree analysis for the dispenser not opening, the failure analysis team examined components from the recovered (and unexploded) cluster bomb dispenser. The returned components had been damaged too severely to determine the failure cause, so the team tested several additional dispenser fuses and linear charges at the cluster bomb manufacturing facility. These tests isolated the point of failure to the interface between Detonator A and Detonator B: Detonator A always fired, but the explosive event usually (but not always) transferred across the 0.30-inch gap to ignite Detonator B. Across the fuses tested, the failure occurred intermittently.

The failure analysis team checked all dispenser fuse components against drawing requirements, but they found no nonconformances. The failure analysis team checked the pedigree of the detonators and found that everything was in order. The individual detonators had been acceptance tested by selecting 32 detonators from each lot and firing them into a steel witness block, with the detonator held directly against the witness block. The detonators had always worked, leaving a significant dent in the witness block.

The failure analysis team looked for differences related to the failure causes hypothesized by the fault tree analysis and listed in the failure mode assessment and assignment (FMA&A) matrix. The same explosive component supplier had been providing the detonators for years, and in fact, the supplier had designed both Detonator A and Detonator B. The only difference the team could find was that the recent detonators were from a new production lot. The team could find nothing about the new lot, however, that was different from prior lots.
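
The bookkeeping behind this step can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual FMA&A: the class, field, and function names are hypothetical, and the three entries only paraphrase the checks described so far in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"            # not yet evaluated, or evidence inconclusive
    RULED_OUT = "ruled out"  # evidence shows this cause did not contribute
    RULED_IN = "ruled in"    # evidence shows this cause contributed


@dataclass
class Hypothesis:
    """One hypothesized failure cause and the evidence gathered on it."""
    cause: str
    status: Status = Status.OPEN
    evidence: str = ""


def open_items(hypotheses):
    """Return the causes that still need evaluation."""
    return [h.cause for h in hypotheses if h.status is Status.OPEN]


# Illustrative entries only, paraphrasing the checks described in the text.
hypotheses = [
    Hypothesis("Fuse component does not conform to drawing",
               Status.RULED_OUT, "All components checked against drawings"),
    Hypothesis("Detonator pedigree problem",
               Status.RULED_OUT, "Pedigree reviewed; lot acceptance tests passed"),
    Hypothesis("New detonator production lot differs from prior lots"),
]

print(open_items(hypotheses))
```

Keeping the unresolved entry visible is the point of such a list: the new lot could be neither ruled in nor ruled out from the documentation alone, which is what sent the team to the supplier.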

The failure analysis team’s purchasing member called the detonator supplier. The detonator supplier suggested that the team visit the detonator manufacturing facility. The cluster bomb manufacturer sent two engineers. These engineers met with the supplier’s technical staff and toured the factory, where they learned that the detonator design had an internal steel sleeve, an explosive mix, a thin-walled aluminum cup, and a cap. The thin-walled aluminum cup was crimped over the cap, as Figure 3 shows.

figure3.png

Figure 3. Detonator A Design Details. The detonator consisted of four components. A steel sleeve formed a barrel inside the aluminum cup. An aluminum cover crimped inside the cup sealed the explosive mix inside the steel barrel.

The engineers from both companies reviewed all documentation associated with the design, the inspection data, and the process. No anomalies or changes appeared. During the discussion, however, the detonator supplier asked the cluster bomb engineers what physical mechanism they were relying upon to carry the explosive event across the 0.30-inch gap. Neither cluster bomb engineer knew the answer to this question. Both assumed heat and explosive shock completed the transfer.

The detonator supplier explained that this particular design relied upon a “hurled plate” explosive transfer mechanism. When Detonator A fired, it was supposed to create a flat plate at its output end, which would then fly through space and slap into the adjacent detonator. The supplier explained that the kinetic energy associated with this impact was what continued the explosive transfer, as Figure 4 shows.

figure4.png

Figure 4. Hurled Plate Formation. When the detonator fires, it shears the aluminum cup at its output end, creating a plate that is hurled across a gap. When this plate strikes the next explosive train component, kinetic energy ignites it.

The cluster bomb engineers found this interesting, but they explained to the detonator supplier that it didn’t explain why failures occurred. The detonator supplier recommended talking to the production operators who actually manufactured the detonator. The visiting engineers agreed, and all three people visited the production area.

In discussing the problem with the production technician, the engineers learned that the technician crimped the aluminum cup with the detonator output end resting against a rubber block. The engineers asked if the technician had made any changes to the process, and the technician explained that approximately two months earlier he had increased the crimping pressure to get a more uniform crimp.

The detonator engineer immediately recognized the significance of this change, and he examined the ends of recently crimped detonators. The output ends displayed varying degrees of concavity induced by the rubber stop against which the detonators were crimped. The detonator engineer noted that a concave surface would not form a flat plate but would instead deform into a molten aluminum jet during detonation. In effect, the detonator output end became a small shaped charge instead of a flat plate.

figure5.png

Figure 5. Detonator Crimping Operation. The aluminum cover at the top of the detonator was crimped in place with the output end of the detonator held against a rubber stop.

The three engineers examined the detonator drawing, and found that it had no flatness requirement (in other words, the drawing did not prohibit the concave condition). While still at the detonator supplier, the engineers tested detonator outputs, but instead of placing the detonator directly against a witness block when it fired (as had been the practice during detonator lot acceptance testing at the cluster bomb manufacturer), the engineers left a 0.30-inch gap to duplicate the dispenser fuse design. The engineers found that detonators with less than 0.05-inch concavity formed a clean plate, which left a circular impression on the witness block. Detonators with more than 0.05-inch concavity left jagged holes in the witness block, clearly indicating that the detonators had not formed the requisite hurled plate.
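
The gap test result suggests a simple screening rule, sketched below in Python. The sketch is illustrative only: the function name and the sample measurements are hypothetical, and the 0.05-inch limit comes from the test observation above, not from the revised drawing, whose actual limit this article does not state. The article also does not say how a detonator at exactly 0.05 inch behaved, so the sketch conservatively rejects that boundary case.

```python
CONCAVITY_LIMIT_IN = 0.05  # inches; observed in the gap test, assumed here as the limit


def forms_hurled_plate(concavity_in: float,
                       limit_in: float = CONCAVITY_LIMIT_IN) -> bool:
    """Return True if the output end is flat enough to be expected to form a plate."""
    if concavity_in < 0:
        raise ValueError("concavity must be non-negative")
    # Strictly below the limit; the boundary case is rejected (an assumption).
    return concavity_in < limit_in


# Hypothetical measurements in inches, for illustration only.
measurements = {"unit-1": 0.01, "unit-2": 0.05, "unit-3": 0.08}
accepted = [unit for unit, c in measurements.items() if forms_hurled_plate(c)]
print(accepted)
```

A dimensional screen like this complements, but does not replace, the functional test across the 0.30-inch gap, since only the gap test exercises the hurled plate mechanism itself.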

Based on the above, the failure analysis team implemented several corrective actions. The team modified the detonator drawing to specify maximum allowed concavity at the output end. The detonator supplier replaced the rubber stop used during the crimping operation with a steel stop. The failure analysis team modified the detonator acceptance test approach. Instead of placing the detonator against the witness block, the acceptance test was modified to fire the detonator across a 0.30-inch gap. The acceptance criteria included a clear, circular impact on the witness block. The failure analysis team inspected all detonators in stock, keeping only those with acceptably low concavity. Finally, the failure analysis team inspected delivered dispensers by visiting munitions storage depots, and replaced any detonators having unacceptable concavity.

The CBU-87/B cluster bomb dispenser failure analysis offers several teaching points:

  • The failure was induced by an undocumented change to the process. Note that there had been no requirement to document process changes in this area, as the detonator supplier had not identified this to be a critical performance characteristic in its work instructions.

  • The process change was extremely subtle, and would not have been discovered had the failure analysis team not worked closely with the detonator supplier.

  • The detonators that induced the failures conformed to their drawing requirements. The drawing was inadequate, as it did not control a feature critical to the device’s successful function.

  • The cluster bomb manufacturer’s detonator acceptance test was inadequate. It did not consider the hurled plate mechanism upon which the design relied for successful function.

  • The failure analysis team did not initially understand how the system worked (the cluster bomb engineers did not know about the hurled plate transfer mechanism). It was not until this knowledge became available that the team recognized this as another potential failure cause.

  • The failure analysis team would not have discovered the failure cause without visiting the supplier, and without seeking input from the production technicians manufacturing the component.

Sometimes failures are induced by changes to the process, the design, the environment, supplier actions, or other factors. Sometimes failures are caused by process deficiencies. Sometimes failures are caused by not understanding and controlling critical systems characteristics. Sometimes failures are caused by acceptance test approaches that fail to duplicate actual operating conditions. In this example, nearly all of the above occurred. The failure analysis team resolved the failure by systematically identifying all potential failure causes, involving the supplier, and working to evaluate each. In the process, the failure analysis team uncovered previously unknown design characteristics and identified new failure causes. After the failure analysis team implemented the corrective actions described above, there were no recurrences of this failure.

Editor’s Note: This article is excerpted from Systems Failure Analysis, a book that addresses finding and correcting the causes of failures in complex systems. The book is written by Joe Berk, a member of the Eogogics Systems Engineering faculty, and will be published by Elsevier in 2008. You will see the Eogogics courses on RCFA described in our curriculum on Root Cause Failure Analysis.