
The challenge of multisite test

Translating the economic benefits of parallel memory test to non-memory devices.

Greg Smith, Consultant -- Test & Measurement World, 2/1/2006

It seems simple. If you want to cut the cost of test for an IC, you should double or even quadruple the number of devices you test in parallel. After all, memory manufacturers have proven the value of this technique beyond a shadow of doubt—it's becoming standard practice to test 128 DRAMs in parallel (Ref. 1). Multisite testing has reduced the capital cost of a DRAM test site from $400k in 1997 to about $27k in 2004, allowing test costs for memories to stay constant or even decrease, even though densities have increased from 64 Mbits to 1 Gbit over the same period.

Why shouldn't the same math apply to all devices? To some extent it does, but most test engineers know that things are never as simple as they seem. Memories and SOCs are very different, and analyzing those differences can help you see why increasing the number of test sites may not necessarily result in cost savings on some testers.

In previous generations of non-memory testers, the combination of pin count, BIST/DFT features, and mixed-signal cores in non-memory devices conspired to keep most production test solutions to a maximum of dual site. Most testers did not provide the specialized features that allow the tester to independently synchronize to multiple devices in parallel, robbing efficiency from multisite solutions. Only recently have some ATE manufacturers delivered testers with sufficiently high-density digital and mixed-signal instruments, and with architectures capable of supporting massive multisite for non-memory devices.

The two key differences between memory and non-memory devices are test times and total production volumes. Big production runs and long test times make memories ideally suited for massively parallel testing. A memory tester equipped with 128 sites testing a 1-Gbit memory with a 128-s test time will have an output of about 3600 units per hour (UPH). A non-memory tester with four sites testing a device with a 4-s test time has the same throughput. An attempt to reduce test cost for this device by going to a 16-site test would theoretically produce 14,400 UPH, but other facets of production are likely to limit the payback from creating a massive multisite test solution.
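The throughput figures above follow from a simple ratio of site count to test time. The short sketch below reproduces them, assuming perfectly parallel testing with no handler or setup overhead; the function name ideal_uph is hypothetical and used only for illustration.

```python
def ideal_uph(sites: int, test_time_s: float) -> float:
    """Units per hour when every site tests in parallel with no overhead."""
    return sites * 3600.0 / test_time_s


# Cases quoted in the text.
print(ideal_uph(128, 128))  # memory tester: 128 sites, 128-s test time
print(ideal_uph(4, 4))      # non-memory tester: 4 sites, 4-s test time
print(ideal_uph(16, 4))     # theoretical 16-site non-memory case
```

The first two cases give the same 3600 UPH, and the third gives the theoretical 14,400 UPH that the rest of the article shows is hard to reach in practice.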

Multisite efficiency

The relative efficiency of testing memories and SOCs can be very different (Ref. 2). The efficiency of memory testers is largely irrelevant due to the algorithmic nature of memory test and the ability to generate test stimulus and process test results in per-site hardware. The test list for memories consists of a small number of tests that have long execution times, so relatively little time is spent setting up the tester compared with the actual test time. Non-memory devices, conversely, have test lists that can be thousands of tests long, and each test may require only a few milliseconds to perform.

Bottlenecks in the tester architecture become more and more noticeable as the number of sites increases. Every element of the tester design must be optimized for multisite efficiency. The efficiency of DC tests depends upon the ability to quickly sequence these tests under pattern control, eliminating any serial programming of tester hardware. The efficiency of mixed-signal tests depends upon the ability to move and analyze captured data quickly while testing continues in the foreground. The efficiency of many mixed-signal and digital tests depends upon the ability of the tester to independently synchronize (or match) on each site in parallel; otherwise, these tests must be done serially. In other words, the tester must be designed from the ground up with a fully parallel architecture.

Figure 1.  High efficiency is crucial to low cost of test. A tester must be more than 75% efficient to provide any real benefit beyond quad site. To be cost-effective beyond eight sites, a tester must be more than 90% efficient.

Figure 1, adapted from Ref. 2, shows that a tester must be more than 75% efficient to provide any real benefit beyond quad site. To be cost-effective beyond eight sites, a tester must be more than 90% efficient. Only a parallel architecture tester will be able to achieve this level of efficiency in production.
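The article does not define multisite efficiency, so the sketch below assumes one commonly used definition: efficiency E = 1 - (T_N - T_1) / ((N - 1) x T_1), where T_1 is the single-site test time and T_N is the time to test N sites together. Under that assumption, the throughput gain over single site flattens toward 1 / (1 - E), which is consistent with the thresholds quoted above. The function names are hypothetical.

```python
def multisite_test_time(t1_s: float, sites: int, efficiency: float) -> float:
    """Time to test `sites` devices together, under the assumed definition
    E = 1 - (T_N - T_1) / ((N - 1) * T_1)."""
    return t1_s * (1.0 + (sites - 1) * (1.0 - efficiency))


def speedup(sites: int, efficiency: float) -> float:
    """Throughput relative to single-site testing (test time only)."""
    return sites / (1.0 + (sites - 1) * (1.0 - efficiency))


for eff in (0.75, 0.90, 0.95):
    row = ", ".join(f"{n} sites: {speedup(n, eff):.2f}x" for n in (2, 4, 8, 16))
    print(f"efficiency {eff:.0%} -> {row}")
```

At 75% efficiency the gain can never exceed 4x no matter how many sites are added, which is why extra sites beyond quad buy so little.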

How much can a handler handle?

A crucial element of the test cell is the device handler. Pick-and-place handlers (P&P handlers) that can handle a large number of package types and a wide range of device pin counts are often used for non-memory devices. These handlers can be easily changed from one package style to another, and many support testing at ambient, cold, and hot temperatures. Inside the handler, the device goes through four basic stages:

  • waiting in an input tray to be tested;
  • being loaded into a carrier in the handler and brought to the correct temperature for testing;
  • being placed into the test socket, tested, and then placed back into the carrier; and
  • being sorted, with good devices and bad devices placed into separate output trays.

P&P handlers are able to perform all four of these processes in parallel. The handler takes into account anticipated thermal soak times and presumed test times to determine how many devices to queue in the soak chamber and how many devices to sort in parallel. Like the tester, the handler represents a reasonable tradeoff between throughput and expense.

Two main factors determine the throughput of a handler:

  • Index time, the time required to remove tested devices from the test sockets and install fresh, untested devices. For P&P handlers, index times range from 0.4 to 0.8 s. Index time must be added to the test time of the device when calculating throughput. On some handlers, index times increase with the number of parallel sites.
  • Maximum throughput, the maximum number of devices that the P&P handler can process in a given time period, if the actual test time is zero. The maximum throughput gives an indication of how many devices can be accommodated in the thermal soak chamber and how quickly devices can be sorted after testing is completed. Current handlers offer throughput of 5000 to 8000 UPH.

Unfortunately, the index time and the maximum throughput for most handlers differ depending on factors such as the number of parallel test sites, the size and type of the device package, and the dimensions of the device trays. Test temperature also can have a marked effect on the throughput.
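The two handler factors combine into a simple throughput bound: index time adds to every test cycle, and the handler's maximum throughput caps the result. The sketch below is an illustration only; the function name cell_uph is hypothetical, and the example values (4-s test time, 0.5-s index time, 7000 UPH maximum) are borrowed from figures used elsewhere in the article.

```python
def cell_uph(sites: int, test_time_s: float, index_time_s: float,
             handler_max_uph: float) -> float:
    """Test-cell throughput: tester rate including index time,
    capped by the handler's maximum throughput."""
    tester_limited = sites * 3600.0 / (test_time_s + index_time_s)
    return min(tester_limited, handler_max_uph)


print(cell_uph(4, 4.0, 0.5, 7000))   # quad site: limited by test plus index time
print(cell_uph(16, 4.0, 0.5, 7000))  # 16 sites: limited by the handler
```

In the 16-site case the tester could deliver 12,800 UPH, but the handler holds the cell to 7000 UPH, so the extra sites are partly wasted.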

Handler manufacturers will generally provide throughput curves specific to the handler model and change kit for each device type to help customers calculate expected performance. These curves represent the peak performance of the handler under ideal conditions. On a real production floor, variations in package dimensions and misalignments in carrier trays result in handler jams. Usually, an operator quickly clears these jams with a few keystrokes on the handler control panel, but while the jam condition exists, no material moves through the handler. Also, if the jam occurs in the thermal chamber or in the mechanism that presents the devices to the test sockets, the operator may need to open the handler to clear the jammed devices. Handler jams are specified using two parameters:

  • Jam rate, the average number of devices processed between handler jams. For P&P handlers, jam rates will range from 1 in 10,000 to nearly 1 in 5000. How tightly device package dimensions are maintained can affect the jam rate, as can the weight of the device, because heavier devices are less likely to be mishandled. Test temperature also has an effect, with cold temperature testing tending to have the highest jam rates.
  • Mean time to assist (MTTA), the amount of time required to clear a jam. Most of the time, a quick key press takes care of the jam in less than a minute, but a few jams require the operator to open the handler or break the setup and can bring down a test cell for as long as an hour. Also, the MTTA assumes that an operator is immediately available to service the test cell, instead of doing other work. For an operator working a few test cells, a reasonable MTTA is 2 to 5 min.

For multisite testing, it is crucial to remember that the jam rate is related purely to the number of devices being handled, so a test cell that processes more devices per hour will also jam more often per hour. Therefore, the selection of a handler with the lowest possible jam rate is critical to maximizing throughput. The constraints will be very different for wafer probe test, where throughput can be much higher and jams are not an issue.
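Because jams occur per device rather than per hour, a faster cell spends a larger share of its time waiting for assists. The sketch below estimates that share from the jam rate and MTTA; the function name and the steady-state averaging are assumptions made for illustration, and the example values come from the figures quoted in this article.

```python
def jam_downtime_fraction(uph: float, devices_per_jam: float,
                          mtta_min: float) -> float:
    """Fraction of wall-clock time lost to jams.

    uph is the jam-free throughput; devices_per_jam is the average
    number of devices handled between jams.
    """
    run_hours_per_jam = devices_per_jam / uph   # productive time between jams
    assist_hours = mtta_min / 60.0              # time to clear one jam
    return assist_hours / (run_hours_per_jam + assist_hours)


for uph in (1750, 3500, 7000):
    lost = jam_downtime_fraction(uph, devices_per_jam=5000, mtta_min=2.0)
    print(f"{uph} UPH: {lost:.1%} of time lost to jams")
```

With the same handler and operator, the share of time lost grows roughly in step with throughput, which is why a low jam rate matters more as sites are added.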

Testing device lots

Testing ICs is a batch process. A batch, or "lot," of devices is loaded into a handler, tested, and unloaded. Then, a fresh lot is loaded, and so on. While the loading and unloading takes place, the test cell is idle.

When a lot completes, an operator summarizes the test results, unloads and labels the trays of good and bad devices, and loads fresh material into the handler. The amount of time needed for this end-of-lot (EOL) processing is almost independent of lot size but will vary depending on the level of automation. It also depends on the number of test cells each operator covers. If the operator is doing something else when a lot completes, a test cell will stand idle until loaded with fresh material. Informal manufacturer surveys indicate that a reasonable estimate for EOL processing is 5 to 10 min, mainly depending upon the number of test cells an operator manages.

The impact of this idle time on test cost is a function of the size of the lot and the amount of time required for EOL processing. The larger the lot, the longer it will take to run through the test cell, meaning that the test cell is idled less frequently and is therefore more efficient. If efficiency were the only thing driving lot size, then large lots would be the best choice. Unfortunately, lot size depends on factors that often push manufacturers to make lots smaller, not larger. Customers want to keep work-in-progress inventories low and are unwilling to accept large lots, and semiconductor manufacturers are reluctant to build large quantities and hold them in finished-goods inventory. In general, lot sizes are usually between 1000 and 10,000 devices. At the beginning of a production run, lots tend to be smaller, increasing in size as yields improve.

Consider a case where the throughput of a quad-site solution is 8000 devices per hour. A 2000-device lot could be tested in 15 min. If EOL processing takes an additional 10 min, during which the test cell is idle, then the test cell is idle 40% of the time, driving up the real cost of test dramatically. In contrast, if a single-site solution is implemented, testing the same 2000-unit lot may take 120 min. In this case, with the same 10-min EOL processing time, the test cell is idle only 8% of the time.
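The idle percentages in this example are the EOL time divided by the total time per lot. The sketch below works through both cases; the function name is hypothetical, and the 1000-UPH single-site rate is simply what the stated 120-min run for 2000 devices implies.

```python
def eol_idle_fraction(lot_size: int, uph: float, eol_min: float) -> float:
    """Fraction of each lot cycle during which the test cell sits idle."""
    test_min = lot_size / uph * 60.0
    return eol_min / (test_min + eol_min)


quad = eol_idle_fraction(2000, 8000, 10)     # 15 min of testing per lot
single = eol_idle_fraction(2000, 1000, 10)   # 120 min of testing per lot
print(f"quad site idle: {quad:.1%}")
print(f"single site idle: {single:.1%}")
```

The quad-site case comes out at 10 of every 25 min idle, and the single-site case at 10 of every 130 min, matching the 40% and roughly 8% figures above.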

Figure 2.  Device 1, a wireless base-band device with embedded memory, shows steadily decreasing cost of test up to an octal site implementation, but device 2, a similar device but without embedded memory and produced in lower volumes, is 30% more expensive to test at octal site than at dual site.

The faster the tester, the more important it will be to ensure that idle time during lot processing is minimized.

Your test cost may vary

The challenge is to understand all of these effects and determine what type of setup will be the most cost-effective. Two specific cases help tie the effects together:

  • Device 1 is a wireless base-band device with a big embedded memory. It has 80 active pins, DACs, ADCs, and multiple processor cores. Because of the embedded memory, the test time is extremely long at 15 s. Demand for this device is high, and the expected production rate is approximately 1 million devices per month for the next year. Lot sizes are 5000 devices.
  • Device 2 is a wireless-networking base-band device with the same pin count. It also has DACs, ADCs, and processor cores, but no embedded memory. Without the memory, and because of some effective DFT, test time is a blistering 5 s. Production is ramping up, and 10,000 devices will be shipped per month for the next year. Volumes are moderate—lot size is 1000 devices.

Our fictional test engineer has a tester that can be configured to test any number of sites from 1 to 16, and the multisite efficiency is a respectable 95%. She has selected a P&P handler with an index time of 0.5 s and a maximum throughput of 7000 devices/hr for quad, octal, and hex configurations. For dual site, this handler has a throughput of 3500 devices/hr. For single site, throughput is 1750. The handler jam rate is 1 in 5000, and MTTA is 2 min.

A model that includes all of these factors provides estimates of the cost of test for these two devices (Figure 2). Even though the devices are similar in many respects, the test time and production rates have a critical effect on the cost of test. While device 1 shows a steadily decreasing cost of test up to an octal site implementation, device 2 is 30% more expensive to test at octal site than at dual site.

Figure 3.  Lot size can have an effect on cost of test. Here, increasing lot size from 1000 to 10,000 cuts test cost 20%.

The same model can be used to understand the potential impact of making changes to the production testing. For example, the test engineer could examine the effect of making the lots larger for device 2 (Figure 3). If she can increase the lot size to 10,000, she can cut the cost of test almost 20%, a much better result than she would obtain by adding sites.
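The article does not publish its cost model, so the sketch below is not that model. It is a simplified throughput-only illustration that combines the effects discussed here: multisite efficiency, index time, the handler limit, jams, and end-of-lot idle time. It omits capital and operating costs, so it cannot reproduce the cost curves in Figures 2 and 3. All names are hypothetical, the efficiency definition is an assumed common one, and the 10-min EOL time is an assumption taken from the upper end of the range quoted earlier.

```python
from dataclasses import dataclass, replace

# Handler limits by site count, from the example handler in the text.
HANDLER_MAX_UPH = {1: 1750, 2: 3500, 4: 7000, 8: 7000, 16: 7000}


@dataclass(frozen=True)
class CellConfig:
    sites: int
    test_time_s: float            # single-site test time
    lot_size: int
    efficiency: float = 0.95      # multisite efficiency from the text
    index_time_s: float = 0.5     # handler index time from the text
    devices_per_jam: int = 5000   # jam rate of 1 in 5000
    mtta_min: float = 2.0         # mean time to assist
    eol_min: float = 10.0         # ASSUMPTION: upper end of the 5 to 10 min estimate


def effective_uph(cfg: CellConfig) -> float:
    """Average units per hour over a whole lot, including jams and EOL idle."""
    # Assumed definition: T_N = T_1 * (1 + (N - 1) * (1 - E)).
    t_n = cfg.test_time_s * (1 + (cfg.sites - 1) * (1 - cfg.efficiency))
    tester_uph = cfg.sites * 3600.0 / (t_n + cfg.index_time_s)
    raw_uph = min(tester_uph, HANDLER_MAX_UPH[cfg.sites])
    run_h = cfg.lot_size / raw_uph
    jam_h = cfg.lot_size / cfg.devices_per_jam * cfg.mtta_min / 60.0
    eol_h = cfg.eol_min / 60.0
    return cfg.lot_size / (run_h + jam_h + eol_h)


# Device 2 from the text: 5-s test time, 1000-device lots.
device_2 = CellConfig(sites=2, test_time_s=5.0, lot_size=1000)

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} sites: {effective_uph(replace(device_2, sites=n)):.0f} UPH")

for lot in (1000, 10000):
    print(f"lot of {lot}: {effective_uph(replace(device_2, lot_size=lot)):.0f} UPH")
```

Turning these throughput figures into cost of test would require the tester, handler, and labor costs per hour for each configuration, which the article does not give.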

Minimizing the cost of test for complex non-memory devices requires more thought than just doubling the number of devices tested in parallel. Even though memory devices have shown that massive multisite is a valuable strategy to minimize cost of test, non-memory devices present a different challenge to test cell throughput. Table 2 includes some of the major factors that can influence the economics of multisite solutions.

The one constant in semiconductor test is change. Tester and handler manufacturers are constantly refining technology and working with device manufacturers to explore new technologies to break these barriers. Other types of handlers, including strip-test handlers and matrix handlers, have been developed to handle devices in groups rather than individually. Also, P&P handlers are constantly improving to offer higher throughputs, lower jam rates, and advanced features to minimize MTTA. As these technologies come on line, the economics of multisite testing will evolve, and reductions in cost of test will continue.

Table 1. Memory vs. non-memory devices

Parameter                | 1-Gbit DRAM               | Non-memory device
Number of active pins    | 45                        | <16 to >300
Incorporates BIST/DFT    | No                        | Yes
Mixed signal             | No                        | Yes
Single device test time  | ~120 s                    | 1 to 10 s
Total production volumes | 10 million to 100 million | 100,000 to 10 million


Author Information
Greg Smith was a consultant specializing in semiconductor ATE and handling systems when he wrote this article. He has now joined Teradyne in a technical marketing role. He previously held leadership roles in product development, marketing, and applications at LTX.


REFERENCES
  1. Rajsuman, Rochit, "Open Architecture Supports Parallel Test," Test & Measurement World, December 2003, www.tmworld.com/archives.
  2. LaBonte, Harold, "It's all about the Architecture," The Final Test Report, Vol. 16, No. 2, February 2005, www.finaltestreport.com.

Table 2. Factors affecting test cost

Massive multisite may reduce cost of test if...

  • Tester has a parallel architecture with multisite efficiency >90%
  • Total production volume exceeds 1 million
  • Test time is greater than 10 s (usually implies device includes embedded memory)
  • UPH of handler does not limit throughput
  • Jam rate is low (more than 10,000 devices between jams)
  • Lot size is large (>5000)
  • Device pin count is small, or reduced pin-count testing is used

Massive multisite may increase cost of test if...

  • Tester has bottlenecks that limit multisite efficiency to <90%
  • Total production volumes are less than 1 million
  • Test time is less than 10 s
  • UPH of handler limits the throughput
  • Handler jam rate is high (fewer than 5000 devices between jams)
  • Lot size is small (<2000)
  • Device pin count is large
