Tests ride the PCI Express

Physical-layer and protocol-layer tests can minimize interoperability problems.

Eugene Sushansky and John Gudmundson, PLX Technology, and Chuck Trefts, Catalyst Enterprises -- Test & Measurement World, 9/1/2005

Riders on city buses must follow certain rules: move to the rear, signal the driver to stop, and give up your seat to someone who needs it more than you do. Likewise, devices that "ride" the PCI Express—bridges, endpoints, motherboards, add-in cards, root complexes, and switches—must follow rules as well. You can ensure your PCIe devices follow these rules—and work with other PCIe devices—by performing a series of compliance tests.

Because it's a serial bus, PCIe requires testing at the physical layer (PHY) and at the protocol layers above it (Figure 1). Each layer must comply with many pages of specifications published by the PCI Special Interest Group (PCI SIG, www.pci-sig.com). The PCI SIG also publishes a series of compliance checklists and organizes workshops where you can run interoperability tests. Before attending a workshop, you can perform tests in your lab using the checklists as a guide. (See the sidebar "Checklists and workshops.")


Figure 1.  PCI Express uses two protocol layers on top of a physical layer.

Start at the PHY

Successful PCIe interoperability starts at the PHY, which consists of logical and electrical sub-blocks that manage data encoding and decoding as well as lane configurations and variations. These sub-blocks also cover a myriad of electrical signal specifications. PCIe products must link at the PHY layer before higher-layer connections, and data transfer, can occur.

PHY testing begins with the differential signal lanes on the PCIe bus. Transmitted (TX) bits come in two versions: transition bits (Figure 2) and nontransition bits (Figure 3). For each, you need to analyze data streams by using eye diagrams to measure their amplitude and jitter.

You can test the signals on TX lanes for electrical compliance by using an oscilloscope with a differential probe. You can generate a compliance-test pattern from any PCIe device through a device's state machine. To invoke the test pattern, terminate the TX lane with a 50-Ω resistor or 50-Ω oscilloscope probe. If the device's receiving (RX) lane has no activity on power up, the state machine will transmit a compliance test pattern.

Figure 2.  Transition bits, which occur when a bit changes state, form a complete eye opening.
 
Figure 3.  Nontransition bits must comply with eye-diagram specifications such as eye opening and amplitude. 

To make eye-diagram measurements, you need to generate 3500 unit intervals (UIs). Use an oscilloscope to make the measurements on the 250 UIs that occur in the center of the 3500-UI pattern. A compliant eye opening will show transition bits with a minimum differential amplitude of 800 mVp-p. Nontransition bits should have a minimum differential amplitude of 505 mVp-p. But higher amplitude values aren't necessarily better. Compare the signal's eye opening and amplitude to specified limits.
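The window selection and amplitude floors described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the code checks the minimum amplitudes quoted here, not the upper limits, which you would take from the specification.

```python
# Sketch: pick the center 250 UIs of a 3500-UI capture and apply the
# minimum amplitudes quoted in the article. All names are hypothetical.
MIN_TRANSITION_MV = 800      # transition bits, differential peak-to-peak
MIN_NONTRANSITION_MV = 505   # nontransition bits, differential peak-to-peak


def center_window(total_ui=3500, window_ui=250):
    """Return (start, stop) UI indices of the window centered in the capture."""
    start = (total_ui - window_ui) // 2
    return start, start + window_ui


def meets_amplitude_floor(amplitude_mv, is_transition):
    """Check a measured amplitude against the minimum only (no upper limit)."""
    floor = MIN_TRANSITION_MV if is_transition else MIN_NONTRANSITION_MV
    return amplitude_mv >= floor


if __name__ == "__main__":
    print(center_window())                    # (1625, 1875)
    print(meets_amplitude_floor(790, True))   # False
```

Because higher amplitude isn't necessarily better, a real check would also compare each measurement against the specified maximum.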

A perfect eye, if one existed, would have an opening of 1.0 UI. PCIe specifications allow no more than a 30% loss from 1.0 UI at the transmitter (an opening of 0.7 UI or greater), and they require an opening of 0.4 UI (40%) or larger at a receiver's input. The 0.3-UI (30%) difference between those two limits is the jitter budget allocated for board layout and other possible loss contributors between a transmitter and a receiver.
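To put those fractions into time units, the short calculation below assumes the 2.5-Gbps rate cited in the "PHY encoding" sidebar, which gives a 400-ps unit interval. The helper names are hypothetical.

```python
# Sketch: convert eye-opening fractions of a UI into picoseconds,
# assuming the 2.5-Gbps rate mentioned in the "PHY encoding" sidebar.
BIT_RATE_BPS = 2.5e9


def ui_ps(bit_rate_bps=BIT_RATE_BPS):
    """Length of one unit interval in picoseconds."""
    return 1e12 / bit_rate_bps


def eye_width_ps(fraction_of_ui, bit_rate_bps=BIT_RATE_BPS):
    """Eye opening in picoseconds for a given fraction of a UI."""
    return fraction_of_ui * ui_ps(bit_rate_bps)


if __name__ == "__main__":
    tx_min = eye_width_ps(0.7)   # minimum opening at the transmitter
    rx_min = eye_width_ps(0.4)   # minimum opening at the receiver input
    print(round(ui_ps()), round(tx_min), round(rx_min), round(tx_min - rx_min))
```

Under that assumption, the transmitter must deliver an eye at least 280 ps wide, the receiver must see at least 160 ps, and the interconnect can consume the 120 ps between them.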

PCI SIG specifications require that you test a receiver's ability to extract a clock that's embedded in transmitted data. Like several other serial buses, PCIe uses 8b/10b encoding to minimize long runs of the same bit, thus making transition bits frequent enough for a receiver to extract the clock. The sidebar "PHY encoding" explains why 8b/10b character encoding improves a receiver's ability to recover a clock from a data stream.

PHY problems can appear as errors at the protocol layers. For example, problems in the electrical portion of the PHY will often result in invalid characters or disparity errors. You can use a protocol analyzer to track such errors over a specified time interval.

A protocol analyzer can simulate a PCIe device by sending two transaction layer packets (TLPs) to a device under test (DUT). The packets should generate several responses from the DUT across all three layers, and the protocol analyzer can check these responses for errors. You should perform this test before diving into other protocol tests to make sure that the link's physical integrity is intact.

Data-link layer

Once you complete the electrical and logical PHY tests, you can move to protocols. Testing progresses up the protocol stack, through the data-link and transaction layers, where you must verify the operability of devices working under control of their drivers and applications.

The PCIe data-link layer ensures the integrity of communications between the two directly communicating devices. The data-link layer serves many functions, including:

  • flow control initialization,
  • flow control updates,
  • ACK/NAK protocol,
  • power management control,
  • cyclic redundancy checks (CRCs),
  • transaction sequencing, and
  • replay of failed transactions.

Communications between PCIe devices occur in data-link-layer packets (DLLPs). In addition to originating and terminating various DLLP communications, the data-link layer adds or removes information and checks for proper transaction sequencing and data integrity. When a PCIe transaction passes from the transaction layer to the data-link layer, the data-link layer adds a sequence number and a 32-bit link CRC (LCRC) to it.

A PCIe receiver uses the sequence number to verify correct transaction ordering, and it checks the LCRC to verify error-free data. If a receiver detects sequencing or packet-integrity errors, it will instruct the transmitter to resend the failed transaction. If a device receives packets without error and in the proper order, the receiver strips away the sequence number and LCRC fields before it passes the packets to the transaction layer.
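The sequence-number and LCRC handling just described can be modeled roughly as follows. This is a simplified sketch, not a bit-exact model: `zlib.crc32` stands in for the LCRC calculation, the 2-byte sequence field and the function names are illustrative, and real devices handle duplicates and replay timers with more nuance.

```python
import zlib

SEQ_MODULO = 4096  # PCIe sequence numbers are 12 bits wide


def frame_tlp(seq, tlp):
    """Prefix a sequence field and append a 32-bit CRC (stand-in for the LCRC)."""
    body = (seq % SEQ_MODULO).to_bytes(2, "big") + tlp
    return body + zlib.crc32(body).to_bytes(4, "big")


def receive(frame, expected_seq):
    """Return ("ACK", tlp) for a good frame, or ("NAK", None) to request a replay."""
    body, crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    seq = int.from_bytes(body[:2], "big")
    if zlib.crc32(body) != crc or seq != expected_seq:
        return "NAK", None
    return "ACK", body[2:]  # sequence number and CRC stripped before hand-off


if __name__ == "__main__":
    frame = frame_tlp(5, b"\x01\x02\x03\x04")
    print(receive(frame, 5)[0])   # ACK
    print(receive(frame, 6)[0])   # NAK
```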

As part of a PCI SIG compliance test, you need to test a device's ability to react to malformed packets, including whether the device properly reports errors to system registers through the root complex. See Figure 4 for the compliance test setup. A PCIe bus exerciser, acting as a bus endpoint and driven either manually or by an automated compliance test script, emulates a transmitting device. With the exerciser, you can force noncompliant data-link-layer responses, transaction-layer responses, or no response.

Figure 4.  A typical test setup for a PCI Express device requires setup, stimulus, and response packets. An oscilloscope lets you measure jitter and eye openings.

The exerciser can also simulate a transmitter to test another device's error-handling characteristics, its device driver, or its application software. Operating as an endpoint, the exerciser can send a pre-queued DLLP, such as an ACK or NAK packet with a corrupt 16-bit DLLP CRC, in response to a received transaction. Or it can send an incorrect sequence number embedded within a complete transaction. Logging and evaluation of such responses can help you identify interoperability problems and their root causes.
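As a rough illustration of that kind of error injection, the sketch below builds a 4-byte DLLP body with a 16-bit CRC and then flips one CRC bit. The CRC routine (`binascii.crc_hqx`, a CRC-CCITT variant) is only a stand-in for the DLLP CRC defined in the specification, and the body bytes and function names are hypothetical.

```python
import binascii


def build_dllp(body):
    """Append a 16-bit CRC to a 4-byte DLLP body (CRC-CCITT as a stand-in)."""
    if len(body) != 4:
        raise ValueError("this sketch expects a 4-byte DLLP body")
    return body + binascii.crc_hqx(body, 0xFFFF).to_bytes(2, "big")


def corrupt_crc(dllp, bit=0):
    """Flip one bit in the trailing CRC field, as an exerciser might."""
    crc = int.from_bytes(dllp[-2:], "big") ^ (1 << bit)
    return dllp[:-2] + crc.to_bytes(2, "big")


def crc_ok(dllp):
    """Receiver-side check of the trailing 16-bit CRC."""
    return binascii.crc_hqx(dllp[:-2], 0xFFFF) == int.from_bytes(dllp[-2:], "big")


if __name__ == "__main__":
    ack = build_dllp(bytes([0x00, 0x00, 0x00, 0x05]))  # illustrative body
    print(crc_ok(ack), crc_ok(corrupt_crc(ack)))       # True False
```

A compliant DUT should discard the corrupted packet and log the error, which is what the analyzer trace should confirm.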

Transaction layer

The PCIe transaction layer encapsulates data-link-layer information and control characters into TLPs that move information from point to point and throughout the PCIe fabric. These transactions echo PCI and PCI-X transactions, with a message transaction added for PCIe communications. Message transactions, used for power management, interrupt signaling, hot-plug support, error reporting, and vendor-specific purposes, eliminate the need for sideband signaling.

The transaction layer performs various error checks, including:

  • TLP packet format,
  • timeout errors on completion packets,
  • flow control operation,
  • notification of unsupported requests,
  • data corruption (through data poisoning),
  • end-to-end CRCs (ECRCs), and
  • unexpected completions.

Transactions at this layer come in two variations: posted or nonposted. Memory writes and messages are posted transactions, which don't require transaction-layer completion packets. Memory reads, I/O reads and writes, and configuration reads and writes are nonposted, and thus require a completion packet. Nonposted transactions are modeled similarly to the PCI-X split-transaction protocol (request/completion).
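A test script can encode that split as a simple lookup, so it knows when to wait for a completion packet. The transaction names below are hypothetical labels, not identifiers from any tool.

```python
# Sketch: classify transactions as posted or nonposted, per the article.
POSTED = {"memory_write", "message"}
NONPOSTED = {"memory_read", "io_read", "io_write", "config_read", "config_write"}


def needs_completion(kind):
    """Return True when the transaction requires a completion packet."""
    if kind in POSTED:
        return False
    if kind in NONPOSTED:
        return True
    raise ValueError(f"unknown transaction type: {kind}")


if __name__ == "__main__":
    print(needs_completion("memory_write"))  # False
    print(needs_completion("config_read"))   # True
```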

Compliance testing at the transaction layer requires that you simulate a device and verify that the DUT responds properly to errors. A protocol analyzer, for example, must intentionally transmit a packet with a header bit set to indicate data poisoning, with the analyzer programmed to provide a pass/fail indication based on the receiving device's response.

The error-poisoned (EP) bit in a TLP's header indicates corruption of payload data. Devices may send a corrupt payload in various ways. A requester can fetch data from memory that may include a parity error or internal data corruption. Device responses to a poisoned packet might include a returned completion TLP with a status field set to unsupported request (UR), appropriate updates to error registers, or the disposal of the received payload data.
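Setting and detecting the EP bit might look like the sketch below. It assumes the EP bit sits at bit 6 of header byte 2, as in the PCIe 1.x header layout; verify the position against the specification revision you are testing. The function names are hypothetical.

```python
# Sketch: set and test the EP (poisoned) bit in a raw TLP header.
# Assumption: EP is bit 6 of header byte 2 (PCIe 1.x layout).
EP_BYTE = 2
EP_MASK = 0x40


def set_poisoned(header):
    """Return a copy of the header with the EP bit set."""
    h = bytearray(header)
    h[EP_BYTE] |= EP_MASK
    return bytes(h)


def is_poisoned(header):
    """Return True when the EP bit is set in the header."""
    return bool(header[EP_BYTE] & EP_MASK)


if __name__ == "__main__":
    clean = bytes(12)                        # placeholder 3-DW header
    print(is_poisoned(clean))                # False
    print(is_poisoned(set_poisoned(clean)))  # True
```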

A formatting error associated with the packet will produce a malformed TLP. Examples include mismatches between the payload size and the header's payload length field, byte-enable violations, incorrect type field values, and improper transaction routing. Each possible malformation requires an individual test case to confirm compliance.
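One of those test cases, a mismatch between payload size and the header's length field, can be checked along these lines. The sketch assumes the 10-bit Length field occupies the low two bits of header byte 2 plus all of byte 3 and counts doublewords (DW), with 0 encoding 1024 DW; the function names and the sample header are illustrative.

```python
# Sketch: detect a payload-size/length-field mismatch in a TLP.
# Assumption: Length is a 10-bit DW count in byte 2 bits [1:0] and byte 3.


def declared_length_dw(header):
    """Decode the Length field; a raw value of 0 encodes 1024 DW."""
    raw = ((header[2] & 0x03) << 8) | header[3]
    return raw or 1024


def payload_matches_header(header, payload):
    """True when the payload is whole DWs and matches the declared length."""
    return len(payload) % 4 == 0 and len(payload) // 4 == declared_length_dw(header)


if __name__ == "__main__":
    header = bytes([0x40, 0x00, 0x00, 0x02]) + bytes(8)  # declares 2 DW
    print(payload_matches_header(header, bytes(8)))      # True
    print(payload_matches_header(header, bytes(12)))     # False
```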

The transaction layer also handles proper flow control, which ensures that a device won't send data across the link until a receiver is ready to receive. The transaction layer must ensure that a receiver has enough buffer memory to handle the next transaction. Protocol errors can cause a device to refuse to send transactions or to overrun buffers on the receiving device. You need to test flow control for proper passing of data as part of your compliance tests.
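PCIe implements this gating with credits that the receiver advertises to the transmitter. The sketch below is a deliberately simplified model: it tracks one header-credit pool and one data-credit pool (one data credit per 16 bytes), whereas real devices keep separate cumulative counters per packet type and virtual channel. The class and method names are hypothetical.

```python
# Sketch: credit-based gating of transmissions (simplified model).
class CreditGate:
    def __init__(self, header_credits, data_credits):
        self.header_credits = header_credits
        self.data_credits = data_credits

    def try_send(self, payload_bytes):
        """Consume credits and return True, or return False if short."""
        needed = -(-payload_bytes // 16)  # ceiling: one data credit per 16 bytes
        if self.header_credits < 1 or self.data_credits < needed:
            return False
        self.header_credits -= 1
        self.data_credits -= needed
        return True

    def update(self, header_credits, data_credits):
        """Model a flow-control update that returns credits to the sender."""
        self.header_credits += header_credits
        self.data_credits += data_credits


if __name__ == "__main__":
    gate = CreditGate(header_credits=1, data_credits=4)
    print(gate.try_send(64))   # True: uses 1 header and 4 data credits
    print(gate.try_send(16))   # False: no credits left
    gate.update(1, 1)
    print(gate.try_send(16))   # True
```

A transmitter that sends when `try_send` would return False models the buffer-overrun failure; one that never sends despite available credits models the stalled-link failure.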

Flow control is important because different types of packets need different priorities. A time-critical application such as video needs a higher priority than data retrieved from a hard drive. PCIe devices use the virtual channel/traffic class (VC/TC) concept to prioritize data, which further complicates compliance testing. Application programs and device drivers make the TC assignments and allocate memory to VC buffers (hardware on switches, endpoints, and root complexes). PCIe switches route traffic based on the VC/TC mapping.

You can test prioritization once a PCIe bus exerciser or application software classifies data to a given TC. Use a protocol analyzer to verify the packet header has the proper 3-bit TC tag. Then, once a PCIe device maps data to the VC configured for that TC, use the analyzer to view the data.

Compare the TC characteristics of this egress data to the source data and verify the expected output. The egress TC packets should match each type of known ingress TC packet and reflect the relative weighting of TC to VC.
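A post-processing script for analyzer captures could perform that comparison roughly as shown here. It assumes the 3-bit TC field sits in bits [6:4] of header byte 1 (PCIe 1.x layout), and the TC-to-VC map is a made-up configuration, not a default from the specification.

```python
# Sketch: extract the TC tag and flag packets seen on the wrong VC.
# Assumption: TC occupies bits [6:4] of header byte 1.
TC_TO_VC = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}  # hypothetical map


def traffic_class(header):
    """Return the 3-bit traffic class from a raw TLP header."""
    return (header[1] >> 4) & 0x07


def mismatches(observed):
    """observed: (header, vc) pairs seen on egress; return (tc, vc) errors."""
    return [(traffic_class(h), vc) for h, vc in observed
            if TC_TO_VC[traffic_class(h)] != vc]


if __name__ == "__main__":
    video = bytes([0x40, 0x50, 0x00, 0x01])  # TC5 in this illustrative header
    print(traffic_class(video))              # 5
    print(mismatches([(video, 1)]))          # []
    print(mismatches([(video, 0)]))          # [(5, 0)]
```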

A variety of physical and logical factors can produce data-transmission problems. Examples include marginal quality of the transmitted data, corrupt physical media, marginality at the receiving device, or a noncompliant optional reference clock. But with thorough testing of all layers of the PCIe protocol stack and their interactions with drivers and application layer software, you can minimize interoperability problems.


Author Information
Eugene Sushansky is senior engineering project manager for PLX Technology, Sunnyvale, CA. He was responsible for the emulation of PLX's first PCI Express products, and he actively participates at interoperability workshops for PCI Express products. He holds a BSEE from San Francisco State University. E-mail: eugene@plxtech.com.
John Gudmundson is senior product marketing manager for PLX Technology. Prior to PLX, he held marketing positions at Sunrise Telecom, Nortel Networks, Integrated Telecom, and Ascend Communications. He is an active member of the IEEE and American Marketing Association, and he holds positions on several industry standards-bodies. Gudmundson holds a BSEE from the University of California, Los Angeles, a BSBA from the University of California, Berkeley, and an MBA from the University of Southern California. E-mail: jgudmundson@plxtech.com.
Chuck Trefts is the technical marketing manager for Catalyst Enterprises, San Jose, CA. Prior to joining Catalyst, he held various positions in engineering and technical marketing within the aerospace industry, and he is a member of various trade associations and standards bodies. He holds a BSBA from the University of LaVerne (CA).

 

Checklists and workshops

The PCI Special Interest Group (PCI SIG) workshops divide testing into two distinct areas: interoperability testing among devices and systems from the various attending companies and Gold Suite testing performed by the SIG. Your product must successfully pass both types of testing for it to comply with the PCIe specification. To help you prepare for a compliance workshop, the SIG has developed several checklists that provide detailed requirements for PCIe bridges, endpoints, motherboards, BIOS, add-in cards, root complexes, and switches.

Interoperability testing at the workshops lets you confirm proper and reliable operation between PCIe systems and devices. You may uncover interoperability issues between systems and devices that neither the SIG's Gold Suite testing nor the checklists reveal.

PCI SIG tests range from low-level physical testing to protocol-oriented tests. But they aren't exhaustive, because the PCIe specification contains thousands of rules across all layers, and testing time is limited at workshops.

If your product meets compliance specifications, you can get it on the PCI SIG's integrator's list. To get on the list, a company must submit self-completed checklists for its products after they pass the tests at a workshop. Each checklist contains a common cross-section of test areas, including electrical, link and transaction layer, system architecture, power management, and configuration testing. Plug-in cards and motherboards also need an electromechanical section. The motherboard checklist also includes a test section on BIOS operation.

Eugene Sushansky, John Gudmundson, and Chuck Trefts

PHY encoding

The PCIe bus uses the popular 8b/10b encoding scheme, which formats every eight bits into 10-bit characters. The 8b/10b format provides sufficient bit-level transitions for a receiver to recover an embedded clock, which eliminates the need for a separate clock. The 8b/10b scheme also produces a balanced DC link because it eliminates long streams of 0's or 1's. Most of these 8b/10b characters have both negative and positive variants, and some provide a neutral disparity. Some examples include:

Positive disparity (more 1's than 0's): 0 0 1 1 1 1 0 1 0 1

Negative disparity (more 0's than 1's): 1 1 0 0 0 0 1 0 1 0

Neutral disparity (equal number of 0's and 1's): 0 1 0 1 0 1 1 1 0 0

To maintain the DC balance, a running-disparity function in the transmitting device makes on-the-fly decisions on the polarity of each transmitted character. That is, it determines whether the number of 1's transmitted approximates the number of 0's transmitted. This encoding mechanism eases link synchronization, simplifies receiver and transmitter designs, and improves error detection based on the characterized data at the 2.5-Gbps data transfer rate.
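The disparity bookkeeping can be illustrated with the three example characters above. This is a simplified model with hypothetical function names: it counts 1's minus 0's and picks whichever variant pulls the running total back toward zero, whereas a real 8b/10b encoder tracks a two-state running disparity and uses code tables.

```python
# Sketch: disparity of a 10-bit character and a simplified variant choice.
def disparity(char10):
    """Ones minus zeros for a character written as a string of 0s and 1s."""
    bits = char10.replace(" ", "")
    return bits.count("1") - bits.count("0")


def choose_variant(running, positive, negative):
    """Pick the variant that pulls the running disparity toward zero."""
    return negative if running > 0 else positive


if __name__ == "__main__":
    pos = "0 0 1 1 1 1 0 1 0 1"
    neg = "1 1 0 0 0 0 1 0 1 0"
    neutral = "0 1 0 1 0 1 1 1 0 0"
    print(disparity(pos), disparity(neg), disparity(neutral))  # 2 -2 0

    running = 0
    for _ in range(4):
        running += disparity(choose_variant(running, pos, neg))
    print(running)  # 0
```

Alternating variants this way keeps the running total bounded, which is what keeps the link DC-balanced.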

PCIe also scrambles the data just prior to the 10b encoding in the transmit device's PHY logic. Scrambling removes repetitive patterns in the bit stream, which otherwise could concentrate energy at discrete frequencies. Scrambling also spreads energy over a frequency range, resulting in a more EMI-friendly architecture. Not all information moving across the link is scrambled; control characters and physical layer ordered sets are not scrambled.

Eugene Sushansky, John Gudmundson, and Chuck Trefts
