What Causes Semiconductor Devices to Fail?

Failure analysis provides insight into the mechanisms and causes of failure, and leads to improvements in the design and test of components.

V. Lakshminarayanan, Centre for Development of Telematics, Bangalore, India -- Test & Measurement World, 11/1/1999

Over the years, users have demanded more reliable electronic equipment. At the same time, electronic equipment has grown ever more complex. The combination of these two factors has put great emphasis on the need to ensure trouble-free operation over a long period. Failure analysis can provide valuable insight into the mechanisms and causes of failure, which, in turn, lead to improvements in the design of components and products and thus help improve the reliability of electronic systems. 

See also:
-Models Predict Failure Rates
-Using Models to Predict Semiconductor Failures
-The Effect of Temperature on Failure Rate

Engineers and scientists have studied failures so often that they now have models, or equations, that we can use to predict when failures will occur. These models don’t predict when a specific device will fail, but they can predict with reasonable certainty the rate of failure under specific conditions.
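
As a minimal sketch of what such a model looks like, the Python below assumes the simplest case, a constant failure rate (the exponential model). The article does not specify this model here; the function name, the FIT figure, and the population size are hypothetical and chosen only for illustration.

```python
import math

HOURS_PER_YEAR = 8760.0


def expected_failures(n_devices, fit, hours):
    """Expected number of failed devices after `hours` of operation.

    Assumes a constant failure rate (exponential model). `fit` is the
    failure rate in failures per 1e9 device-hours.
    """
    rate_per_hour = fit * 1e-9
    prob_fail = 1.0 - math.exp(-rate_per_hour * hours)
    return n_devices * prob_fail


# Hypothetical population: 10,000 devices rated at 100 FIT, run for five years.
failures = expected_failures(10_000, 100.0, 5 * HOURS_PER_YEAR)
print(f"Expected failures in the population: {failures:.1f}")
```

The result is a count for the whole population. It says nothing about which individual devices will fail, which matches the distinction drawn above.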

A device usually fails because it experiences conditions that stress it beyond its maximum ratings. The way a device fails is called a failure mechanism. Typically, electricity, heat, chemicals, radiation, mechanical stresses, and other factors cause the failures. It’s important to draw a distinction between mechanisms and causes. For example, a device might fail due to an electrical mechanism caused by mechanical stress.

The common failure mechanisms for semiconductors fall into several main categories. Knowing how these mechanisms operate will help you pinpoint failures in devices given to you for analysis:

1. Encapsulation failure. These failures occur when the encapsulation used to package a device develops a fault—usually a crack. Mechanical or thermal stress and differences in the coefficient of thermal expansion between the encapsulation material and metal used for leads can cause the cracks to form. These openings let moisture enter the package when humidity is high or when the device gets exposed to flux, cleaning agents, and so on. Chemical action can degrade the device and cause it to fail.

2. Die-attach failure. Improper contact between the die and substrate decreases thermal conductivity between the two. As a result, the die can overheat, which leads to stressing and cracking, and thus device failure. 

3. Wire-bond failure. Thermal overstress due to high current flow, mechanical stress in the bond wire due to improper bonding, cracks at the interface between bond wire and die, electromigration of silicon, and excessive bonding pressure can each cause wire-bond failures. When a bond fails, so does the device, because one of its conductors no longer exists.

4. Bulk-silicon defects. Sometimes, faults caused by crystal defects or the presence of impurities and contaminants in the silicon bulk material will cause a device to fail. Process defects caused by diffusion problems during device-fabrication can also cause devices to fail. 

5. Oxide-layer faults. Electrostatic discharge and high-voltage transients propagating through device leads can cause thin oxide layers—insulators—to break down and cause a device to malfunction. Cracks or scratches in the oxide layer or the presence of impurities in the oxide can also lead to failures. 

6. Aluminum-metal faults. These faults arise from:

  • electromigration of aluminum in the direction of electron flow due to high current densities,
  • breaking aluminum conductors due to electrical overstress caused by high currents,
  • corrosion of aluminum,
  • wear out of metal caused by soldering,
  • improper metal deposition at contact windows, and
  • formation of hillocks and cracks. 

Usually, it takes a specific event or set of conditions to precipitate a failure. The rest of this article describes the most common events or conditions that can cause failures. By understanding these causes, you can perform thorough failure analyses and help designers and test engineers produce more reliable products. Keep in mind, though, that design faults—both in the device and in the PCB or final products—can produce conditions that also cause devices to fail. But even those design problems generally lead to one or more of the conditions described below.

Thermal Overstress

Thermal overstress—excess heat—can cause semiconductors to fail. Excess heat melts materials, chars plastics, warps and breaks semiconductor dies, and causes other types of damage. In general, devices should not operate with a junction temperature above 125–150°C. 

Military applications aim to limit junction temperature to 110°C. By applying the Arrhenius equation (see “Models Predict Failure Rates”), you can show that reducing a device’s junction temperature from 160°C to 135°C cuts the failure rate in half. (For a further explanation of the equation and how to solve it, see “The Effect of Temperature on Failure Rate.”)
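
The halving figure can be checked with a short calculation. The sketch below uses the usual Arrhenius form, in which the failure rate is proportional to exp(-Ea/kT). The activation energy is not given in this section, so the code solves for the value that produces a factor of two and then evaluates the ratio at an assumed 0.42 eV. The function names and the 0.42 eV figure are assumptions, not values from the article.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def _inverse_temp_gap(t_low_c, t_high_c):
    """1/T_low - 1/T_high with both temperatures converted to kelvin."""
    return 1.0 / (t_low_c + 273.15) - 1.0 / (t_high_c + 273.15)


def arrhenius_ratio(ea_ev, t_low_c, t_high_c):
    """Failure rate at t_high_c divided by failure rate at t_low_c."""
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * _inverse_temp_gap(t_low_c, t_high_c))


def activation_energy_for_ratio(ratio, t_low_c, t_high_c):
    """Activation energy (eV) that yields the given failure-rate ratio."""
    return math.log(ratio) * BOLTZMANN_EV_PER_K / _inverse_temp_gap(t_low_c, t_high_c)


# Activation energy implied by "160 C to 135 C halves the failure rate".
ea_for_halving = activation_energy_for_ratio(2.0, 135.0, 160.0)

# Ratio obtained with an assumed activation energy of 0.42 eV.
ratio_at_assumed_ea = arrhenius_ratio(0.42, 135.0, 160.0)

print(f"Ea implied by a 2x change: {ea_for_halving:.3f} eV")
print(f"Failure-rate ratio at 0.42 eV: {ratio_at_assumed_ea:.2f}")
```

Under these assumptions, the factor of two corresponds to an activation energy of a little over 0.4 eV. A mechanism with a higher activation energy would show a larger improvement for the same 25°C reduction.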

If high temperatures have caused a failure, notify the product’s designers. They must take into account product packaging and operating specs to ensure that fans, heat sinks, and other cooling devices keep temperatures within spec. Although high-power devices require heat sinks and fans, low-power devices can simply distribute heat into the surrounding air. Also, test engineers must ensure that heat sinks and fans remain in place during testing or that other heat-removal devices are available to sufficiently cool a device undergoing testing. 

Test engineers may have to monitor the temperature of power semiconductors and power assemblies to ensure they do not operate the devices at unsafe temperatures—and thus reduce their useful life—during testing.
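
When direct measurement is impractical, a rough estimate of junction temperature can flag a risky test setup. The sketch below assumes the common steady-state approximation Tj = Ta + P × θJA, which the article does not give. The power, ambient temperature, and thermal-resistance values are hypothetical, and the 125°C limit is the lower end of the range quoted earlier.

```python
def junction_temperature_c(ambient_c, power_w, theta_ja_c_per_w):
    """Steady-state junction temperature: Tj = Ta + P * theta_JA."""
    return ambient_c + power_w * theta_ja_c_per_w


JUNCTION_LIMIT_C = 125.0  # lower end of the 125-150 C range cited in the article

# Hypothetical device dissipating 2 W in a 50 C ambient.
without_heat_sink = junction_temperature_c(50.0, 2.0, 40.0)  # 40 C/W assumed
with_heat_sink = junction_temperature_c(50.0, 2.0, 20.0)     # 20 C/W assumed

for label, tj in (("no heat sink", without_heat_sink), ("heat sink", with_heat_sink)):
    status = "over limit" if tj > JUNCTION_LIMIT_C else "within limit"
    print(f"{label}: Tj = {tj:.0f} C ({status})")
```

The comparison illustrates why removing a heat sink for test access can push an otherwise safe device past its limit.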

Electrical Overstress

Semiconductor devices should operate within the range of voltage, current, and power limits established by the devices’ manufacturers. These limits exist for both power and I/O connections to a device. When a device operates outside this “safe operating area” (SOA), electrical overstress (EOS) can cause internal voltage breakdown that can, in turn, cause internal damage that ruins the device. If the EOS produces a higher current flow, the device can also overheat, adding thermal overstress to the causes of failure. The added thermal stress leads to a secondary-mode failure, so named because the thermal stress arose from the primary EOS.

Figure 1. This bipolar transistor failed because of EOS caused by a high-voltage transient. This microscopic view shows breakdown at the base-emitter junction. The two pincer-like structures at one of the wire-bond contacts are the breakdown regions.

The internal view of a bipolar transistor (Fig. 1) shows how a high-voltage transient (an EOS) applied to the base of the transistor damaged the device by creating a base-emitter short circuit. The device failed.

Circuit designers can minimize EOS failures by properly specifying components and derating their specs. They can also include in their designs necessary protection devices such as zener diodes, ferrite beads, filters, varistors, and so on, which can prevent electrical stresses from reaching critical devices. Chip designers can incorporate protective devices on their ICs. Also, test engineers must ensure that testers do not electrically overstress devices as products go through test steps or through burn in. Test circuits should provide the same types of protection as do the product’s internal circuits.
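
A derating check of this kind can be sketched in a few lines. The code below treats the safe operating area as three simple limits (voltage, current, and power) and applies one derating factor to all of them. Real SOA curves also include pulse-duration and second-breakdown limits, which this sketch ignores. The `Ratings` class, the 80% derating factor, and the example ratings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ratings:
    """Absolute maximum ratings taken from a (hypothetical) data sheet."""
    max_voltage_v: float
    max_current_a: float
    max_power_w: float


def overstressed_limits(ratings, voltage_v, current_a, derating=0.8):
    """Return the names of the derated limits that the operating point exceeds."""
    violations = []
    if voltage_v > derating * ratings.max_voltage_v:
        violations.append("voltage")
    if current_a > derating * ratings.max_current_a:
        violations.append("current")
    if voltage_v * current_a > derating * ratings.max_power_w:
        violations.append("power")
    return violations


# Hypothetical transistor: 60 V, 3 A, 40 W absolute maximum ratings.
transistor = Ratings(max_voltage_v=60.0, max_current_a=3.0, max_power_w=40.0)

normal_point = overstressed_limits(transistor, voltage_v=30.0, current_a=1.0)
transient_point = overstressed_limits(transistor, voltage_v=50.0, current_a=1.0)

print("Normal operation violations:", normal_point)
print("Transient violations:", transient_point)
```

The same check can be run against the voltages and currents a tester applies, so that test fixtures are held to the same derated limits as the product.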

Electrostatic discharge (ESD) forms a subset of the EOS category. ESD can affect the functioning of an electronic device at any stage: during device fabrication, testing, handling, assembly, production, or field use (footnotes 1, 2, and 3). ESD occurs when charge accumulates on a surface for any reason. A person walking on a carpet can generate a static voltage as high as 20 kV due to triboelectric charging. In addition, machines that use plastic parts can generate electrostatic charge. Obviously, dissipating this charge into a semiconductor can destroy the device. 
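
To give a sense of scale for a 20-kV discharge, the sketch below uses the human-body-model network commonly applied in ESD testing: a 100-pF capacitor discharged through 1.5 kΩ. These component values are an assumption brought in for illustration and do not come from the article. The peak current is an upper bound that ignores the impedance of the device and the discharge path.

```python
def hbm_discharge(voltage_v, capacitance_f=100e-12, resistance_ohm=1500.0):
    """Stored energy, peak current, and RC time constant for an
    idealized human-body-model discharge (assumed 100 pF, 1.5 kohm)."""
    energy_j = 0.5 * capacitance_f * voltage_v ** 2
    peak_current_a = voltage_v / resistance_ohm
    time_constant_s = resistance_ohm * capacitance_f
    return energy_j, peak_current_a, time_constant_s


energy, peak, tau = hbm_discharge(20e3)  # 20 kV, the figure quoted in the text

print(f"Stored energy: {energy * 1e3:.1f} mJ")
print(f"Peak current:  {peak:.1f} A")
print(f"Time constant: {tau * 1e9:.0f} ns")
```

Even a small stored energy matters because it is delivered within a fraction of a microsecond into structures only micrometers across.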

ESD need not always lead to immediate component failure; it can cause a latent defect in the component that will go undetected during routine testing. Such “weakened” components are more likely to fail in the field when systems operate under realistic—non-lab—conditions.

Typically, ESD damage manifests itself in the following ways:

  • A discharge or electrical overstress damages a device. The damage leads to higher-than-normal current flow, which leads to thermal overstress. The thermal overstress melts metal interconnections and damages junctions.
  • An intense electric field causes breakdown of junctions and thin oxide layers.
  • ESD-induced fields can couple to PCB traces and produce a high current that can melt semiconductor junctions.
  • A discharge causes “latch-up” in a CMOS device due to triggering of silicon-controlled rectifiers (SCRs). 

Figure 2. The splattered look in the photomicrograph of a damaged RS-232C transceiver shows how electrical overstress can lead to charring of the device.
Figure 3. The crack in this RS-232C transceiver arose from an electrostatic discharge at one of the device’s pins.
Figure 4. Closer examination of the damage to the RS-232C transceiver shown in Figure 3 reveals a microscopic hole punched through an insulating oxide layer by the electrostatic discharge.

You can see the effects of ESD damage in photomicrographs of CMOS RS-232C transceiver ICs (Figs. 2, 3, and 4). Analysis revealed latch-up caused by an ESD transient as the cause of device failure for the IC in Figure 2. CMOS integrated circuits are particularly prone to ESD and latch-up problems due to parasitic pnpn structures in the device. 

These structures form SCRs, which, when triggered by a discharge at any pin, continue to conduct like a short circuit between the device’s VDD and VSS pins. Inevitably, latch-up leads to overheating and device failure. In severe cases, the device becomes charred (Fig. 2) by the heat it produces before it destroys itself. In another failed device, energy from an ESD burned a microscopic hole through an oxide layer (Figs. 3 and 4), which led to subsequent failure. 

When you detect failures caused by ESD, remind your company’s engineers, technicians, and other staff to take a few simple precautions during storage, handling, assembly, and test to minimize ESD damage. Standard precautions include:

  • store electronic components in tubes and bins that dissipate static;
  • wear grounded wrist straps during assembly;
  • when soldering by hand, use grounded-tip soldering equipment;
  • use antistatic work tables and floors; and
  • package assemblies in bags that dissipate static.

MIL-STD-883 and MIL-STD-1686 provide component standards, and IEC 1000-4-2 and EIA 1361 provide equipment standards. The ESD Association (Rome, NY, www.eosesd.org) sets standards for handling devices.

Component failures fall into two subgroups: problems in a batch of components that slip through initial testing, and problems that arise from component designs.

Occasionally, more than a normal number of failures occur in a batch of components. These components pass initial production testing, and then mishandling, poor packaging techniques, and problems during assembly, test, and shipment “weaken” the components by causing latent defects that surface only after final assembly. Based on your analyses, you may have to reevaluate your incoming testing and your handling processes. You may need to work with suppliers to “tighten up” product specifications.

A small but nonfatal failure in one component may cascade and lead to destructive failures in other components. Consider the case of a transformer in a switching power supply. Parasitic effects such as leakage inductance can induce voltage spikes that then cause breakdown of switching transistors. You may have to isolate the failures and backtrack to find the main cause. Then you need to work with manufacturers and QA people to ensure you get products made to the tight tolerances you require.
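
The size of such a spike can be estimated from V = L × di/dt. The sketch below assumes the switch current falls linearly to zero and that no snubber or clamp limits the voltage. The inductance, current, fall time, bus voltage, and transistor rating are hypothetical values chosen for illustration.

```python
def leakage_spike_v(leakage_inductance_h, current_a, fall_time_s):
    """Voltage across the leakage inductance, V = L * di/dt,
    assuming the current falls linearly to zero in fall_time_s."""
    return leakage_inductance_h * current_a / fall_time_s


# Hypothetical values: 2 uH leakage inductance, 3 A switched off in 100 ns.
spike = leakage_spike_v(2e-6, 3.0, 100e-9)

bus_voltage_v = 160.0        # assumed voltage already across the switch
transistor_rating_v = 200.0  # assumed breakdown rating of the transistor
peak = bus_voltage_v + spike

print(f"Spike: {spike:.0f} V, peak at the switch: {peak:.0f} V")
print("Exceeds rating" if peak > transistor_rating_v else "Within rating")
```

A transformer whose leakage inductance is only slightly out of tolerance can therefore move a transistor from a safe margin to breakdown, which is why tracing the failure back to the transformer matters.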

FOOTNOTES

  1. Bipolar Power Transistor Databook, AN-1040, Motorola Semiconductor, Phoenix, AZ, 1995, pp. 6-6 to 6-25. 
  2. CMOS Logic Databook, AN-248, National Semiconductor, Santa Clara, CA, 1988, pp. 2-48 to 2-49.
  3. Duffy, Carl, “Test CMOS Devices for ESD Immunity,” Test & Measurement World, Newton, MA, August 1997, pp. 59–70.

FOR FURTHER READING

Boxleitner, Warren, Electrostatic Discharge and Electronic Equipment: A Practical Guide for Designing to Prevent ESD Problems, IEEE Press, Piscataway, NJ, 1989. 

Doyle, Edgar A., Jr., “How Parts Fail,” IEEE Spectrum, IEEE, New York, NY, October 1981, pp. 36–43. 

“Engineer’s Factfile,” Electronic Packaging and Production, Des Plaines, IL, December 1997, p. 31. 

Jensen, Finn, Electronic Component Reliability: Fundamentals, Modelling, Evaluation, and Assurance, John Wiley & Sons, New York, NY, 1996. 

Mardiguian, Michel, Understand, Simulate and Fix ESD Problems, Interference Control Technologies, Gainesville, VA, 1986.

Pollino, Emiliano, Microelectronic Reliability: Integrity Assessment and Assurance, Artech House, Norwood, MA, 1989. 

Ramakumar, R., Engineering Reliability: Fundamentals and Applications, Prentice Hall, Englewood Cliffs, NJ, 1992. 

ACKNOWLEDGEMENT

The author thanks Y. K. Pandey, Director (Systems), and A. K. Manoj Kumar, Sr. Program Manager, Centre for Development of Telematics, for their encouragement to carry out the studies that led to this article.


V. Lakshminarayanan graduated from the Indian Institute of Science (Bangalore, India) in 1983 with an M.E. degree in Electrical Communication Engineering. He has designed and developed electronic systems and now coordinates failure analysis and reliability activities.

©2008 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.