Protocol Analyzers Test Fibre Channel Systems

A protocol analyzer can help you isolate and analyze problems in Fibre Channel networks.

Paul Levin and Greg Beutler, Xyratex International, Irvine, CA -- Test & Measurement World, 8/1/1999

Key Fibre Channel Features

• Transfer rates of up to 100 Mbytes/s. By comparison, Ultra SCSI is rated at 40 Mbytes/s.

• As many as 125 drives (devices) in a single loop.

• Installation that requires no option settings, no jumpers, and no bus-terminator networks.

• Operation with device separations of up to 30 m for copper-wire connections, up to 10 km using single-mode long-wavelength lasers, and 2 km for normal fiber-optic connections.

• An 8B/10B encoding scheme that detects most transmission errors, plus a 32-bit CRC value at the end of every transmission frame so a receiver can verify the accuracy of each frame’s contents.

T&MW

Fibre Channel systems are used in computer networks that transfer large quantities of information from place to place. Such networks might transfer video information from workstation to workstation, or between workstations and arrays of disks.

Fibre Channel provides plenty of bandwidth, operates over long distances, and offers all the “hooks” necessary to permit users to mix packets of video, audio, graphics, and control information using a variety of protocols, such as Internet Protocol (IP) or SCSI. (See “Key Fibre Channel Features.”) Thus, it provides an excellent medium for many types of computer networks. But someone must ensure that a Fibre Channel network operates properly.

The day will come when a Fibre Channel network doesn’t work as well as expected. Response times may start to climb, or occasional video or audio packets may arrive at a destination late or fail to arrive at all. You must be ready to find out why a network isn’t working properly and how to get it back on track.

The most basic tool you can use to test a Fibre Channel system is a protocol analyzer, which may include performance-monitoring software. When you insert a protocol analyzer in a Fibre Channel loop or network, it acts only as an observer. It neither initiates nor terminates any “traffic,” nor does it alter the network’s traffic in any way.

A Fibre Channel network employs unidirectional links that go in a daisy chain—often called a loop—from device to device. To use a protocol analyzer to monitor the performance or behavior of a device in a network, you monitor the information that goes into the device and the information that comes out of it. Consequently, all Fibre Channel protocol analyzers and performance monitors are two-port devices (Fig. 1). In the example shown in the figure, the protocol analyzer uses Port A to monitor the traffic flowing to the storage system and uses Port B to monitor the traffic the storage system transmits back to the workstation.

Figure 1. A protocol analyzer requires two connections to a Fibre Channel network, one on each side of a section of the network undergoing testing. The analyzer does not modify network traffic; it only monitors it.

Stamp Your Traffic
The protocol analyzer records all of, or a subset of, the Fibre Channel traffic going past it, and the records include time stamps that identify when the traffic was present. Time stamps let you know the exact order in which the analyzer acquired the traffic and how much time separated the captured events. The analyzer stores captured data in high-speed RAM or, when time allows during capture of small portions of a network’s traffic, the analyzer can save information to an internal hard disk. After the analyzer captures the data, it can display it or process it to produce detailed analyses.
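As a rough illustration of how time stamps recover both the order of events and the time between them, the following Python sketch sorts a few captured records and reports the gap before each one. The record layout (`timestamp_ns`, `port`, `description`) and the sample values are assumptions made for this example; they are not the format of any particular analyzer.

```python
from dataclasses import dataclass


@dataclass
class CapturedEvent:
    timestamp_ns: int   # hypothetical field: time stamp in nanoseconds
    port: str           # analyzer port that saw the traffic ("A" or "B")
    description: str


def gaps_in_order(events):
    """Sort events by time stamp and pair each with the gap since the previous one."""
    ordered = sorted(events, key=lambda e: e.timestamp_ns)
    result = []
    previous = None
    for event in ordered:
        gap = 0 if previous is None else event.timestamp_ns - previous.timestamp_ns
        result.append((event, gap))
        previous = event
    return result


# Made-up records, deliberately out of order.
events = [
    CapturedEvent(2_500, "B", "response frame"),
    CapturedEvent(1_000, "A", "command frame"),
    CapturedEvent(9_000, "B", "status frame"),
]

for event, gap in gaps_in_order(events):
    print(f"{event.timestamp_ns:>6} ns  port {event.port}  {event.description}  (+{gap} ns)")
```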

A performance monitor—usually software that runs on a protocol analyzer—indicates traffic levels, traffic statistics, and basic error conditions for the information going past the analyzer (Fig. 2). Specifically, Fibre Channel performance measurements include

• data rates in bytes/s and frames/s;

• link utilization;

• traffic characteristics;

• error conditions;

• code violations (CV) or “illegal” 10-bit codes;

• cyclic redundancy check (CRC) failures; and

• loop-initialization-procedure events.

Figure 2. The display for performance-analysis software shows bus-use history (upper left) and frame-size characteristics (upper right). The bottom display shows data rates and link use for the two input channels.
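The first two measurements in the list above reduce to simple arithmetic on the frames seen during a sampling interval. The Python sketch below shows one way to compute them. The function name, the sample frame sizes, and the use of the nominal 100 Mbytes/s as the capacity figure are assumptions for illustration, not a description of any vendor's software.

```python
LINK_CAPACITY_BYTES_PER_S = 100_000_000  # nominal 100 Mbytes/s, used here as an assumed capacity


def summarize(frame_sizes, interval_s, capacity=LINK_CAPACITY_BYTES_PER_S):
    """Return frames/s, bytes/s, and link utilization for one sampling interval."""
    total_bytes = sum(frame_sizes)
    bytes_per_s = total_bytes / interval_s
    return {
        "frames_per_s": len(frame_sizes) / interval_s,
        "bytes_per_s": bytes_per_s,
        "utilization": bytes_per_s / capacity,
    }


# Made-up sample: 10,000 frames of 2,048 bytes seen in half a second.
stats = summarize([2048] * 10_000, interval_s=0.5)
print(stats)
```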

Because Fibre Channel data rates approach 100 Mbytes/s, a protocol analyzer cannot save every piece of information it reads—at least not for long. So, instrument suppliers provide selective triggering that lets users choose the specific pieces of traffic to save. At its simplest, this type of trigger is analogous to a trigger on an oscilloscope. Unlike a scope, though, the protocol analyzer’s trigger lets it start and stop acquiring traffic many times so it can collect separate but similar sections of traffic for later analysis.

Usually, acquisition of traffic information starts on a match of trigger conditions with either particular fields within frames or with protocol signals that indicate specific events. If your protocol analyzer provides performance-monitor functions (Fig. 2), you can have the analyzer trigger acquisitions based on the occurrence of specific throughput values or error conditions.
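A minimal Python sketch of this kind of repeated triggering follows. Each time a frame matches the trigger conditions, the sketch saves that frame and a fixed number of following frames, then waits for the next match. Representing frames as dictionaries, the field names `s_id` and `type`, and the fixed window length are all assumptions for this example.

```python
def matches(frame, conditions):
    """True when every field named in the trigger conditions has the required value."""
    return all(frame.get(field) == value for field, value in conditions.items())


def capture_windows(frames, trigger, window):
    """Start a new capture on each trigger match and keep `window` frames per capture."""
    saved = []
    current = None
    remaining = 0
    for frame in frames:
        if remaining == 0 and matches(frame, trigger):
            current = []
            saved.append(current)
            remaining = window
        if remaining > 0:
            current.append(frame)
            remaining -= 1
    return saved


# Made-up traffic; field names are hypothetical.
traffic = [
    {"s_id": 1, "type": "cmd"},
    {"s_id": 2, "type": "data"},
    {"s_id": 2, "type": "data"},
    {"s_id": 1, "type": "cmd"},
    {"s_id": 3, "type": "status"},
]
windows = capture_windows(traffic, trigger={"type": "cmd"}, window=2)
print(len(windows), "separate captures")
```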

Filters Select Traffic
If you need to capture traffic based on one or more events that occur infrequently, you usually can save such information directly to a disk that can accumulate information for many hours or even for several days. Clearly, you must set up the triggers on your instrument so it captures only the information you need and doesn’t overload the disk with extraneous data.

To aid in capturing the proper traffic, instruments provide data filters. A filter might limit the instrument to capturing traffic only from a particular source. The filter also can detect particular types of commands or responses, or it can filter just the first n bytes in each frame.
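The two filter styles mentioned here, selecting by source and keeping only the first n bytes of each frame, can be sketched in a few lines of Python. The `(source, payload)` tuple layout and the device names are assumptions for illustration only.

```python
def apply_filter(frames, source_id=None, first_n_bytes=None):
    """Keep frames from one source (if given) and truncate each to its first n bytes (if given)."""
    kept = []
    for source, payload in frames:
        if source_id is not None and source != source_id:
            continue  # frame comes from a source we are not interested in
        if first_n_bytes is not None:
            payload = payload[:first_n_bytes]
        kept.append((source, payload))
    return kept


# Made-up frames as (source, payload) pairs.
captured = [
    ("ws1", b"ABCDEFGH"),
    ("ws2", b"12345678"),
    ("ws1", b"abcdefgh"),
]
filtered = apply_filter(captured, source_id="ws1", first_n_bytes=4)
print(filtered)
```

Truncating frames this way trades detail for capture time: the same storage holds more frames when each one takes less room.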

When a Fibre Channel system includes several sets of computers, workstations, and arrays of disks that operate over a fabric, you’ll need several sets of test gear that must operate in concert. (Fabric is a Fibre Channel term that simply means a system of routers and switches.) In many cases, you won’t have access to the internal fabric connections, so you will have to synchronize your test instruments—usually with a direct cable connection—so you can identify and correlate what happens on one part of a network with what happens on another part (Fig. 3).

Figure 3. The link between two networks takes place through a “fabric” of routers and switches. To make measurements across such a fabric requires synchronizing two protocol analyzers by using a direct connection.
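Once two analyzers share a time reference, correlating their captures amounts to merging two time-ordered lists. The Python sketch below assumes each capture is already sorted by time stamp and that any fixed offset between the two instruments is known. The record layout, the offset parameter, and the sample values are hypothetical.

```python
import heapq


def merge_captures(capture_a, capture_b, offset_b_ns=0):
    """Merge two time-sorted captures into one timeline, shifting B by a known offset."""
    shifted_b = [(t + offset_b_ns, name, what) for t, name, what in capture_b]
    return list(heapq.merge(capture_a, shifted_b, key=lambda record: record[0]))


# Made-up records: (timestamp_ns, analyzer, description).
side_a = [(100, "A", "command enters fabric"), (400, "A", "status returns")]
side_b = [(50, "B", "command leaves fabric"), (150, "B", "data enters fabric")]

merged = merge_captures(side_a, side_b, offset_b_ns=100)
for record in merged:
    print(record)
```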

Now that you know more about the tools available to help solve Fibre Channel problems, how can you apply them? First, you need to check the integrity of the Fibre Channel network itself. Error logs maintained by workstations and smart hubs in the fabric can indicate problems.

If your Fibre Channel system already includes a protocol analyzer, the analyzer will provide log information, too. (Some users leave an analyzer constantly connected to a network.) The logged information helps you assess whether or not the network’s electrical or optical links are operating properly. Fibre Channel’s stated objective is to operate with a bit error rate of less than 1 in 10^12, or roughly 3 errors/hr. Most Fibre Channel users report error rates considerably lower than that.
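The errors-per-hour figure follows from multiplying the bit rate by the number of seconds in an hour and by the bit error rate. The short Python sketch below works the arithmetic for two bit rates: the 100 Mbytes/s data rate quoted in this article (800 Mbits/s), which gives just under 3 errors/hr, and the 1.0625-Gbaud line rate of a 1-Gbit Fibre Channel link, which gives just under 4. The line-rate figure is not stated in the article and is included here as an assumption for comparison.

```python
SECONDS_PER_HOUR = 3600
TARGET_BER = 1e-12  # 1 error in 10^12 bits


def errors_per_hour(bits_per_second, ber=TARGET_BER):
    """Expected bit errors in one hour of continuous traffic at the given bit rate."""
    return bits_per_second * SECONDS_PER_HOUR * ber


payload_rate = 100e6 * 8   # 100 Mbytes/s expressed in bits/s
line_rate = 1.0625e9       # assumed encoded line rate, not taken from the article

print(f"payload rate: {errors_per_hour(payload_rate):.2f} errors/hr")
print(f"line rate:    {errors_per_hour(line_rate):.1f} errors/hr")
```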

If the error log reports unexpected loop-initialization procedures or more than one or two code-violation or CRC errors in an hour, you’ll need to examine the integrity of the entire network. Loop-initialization procedures (LIPs)—required steps that ensure a network is restored and functioning properly—generally don’t occur unless the network contains a defective device or the network has a break. LIPs also occur when someone adds or removes a device from a Fibre Channel network.
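The rule of thumb above can be applied to a log mechanically. The Python sketch below flags any hour with more than two code-violation or CRC errors combined, or with any unexpected LIP. The log layout and the event labels (`"CV"`, `"CRC"`, `"LIP_UNEXPECTED"`, `"LIP_EXPECTED"`) are hypothetical; real logs will differ.

```python
from collections import Counter


def hours_needing_attention(log, max_errors_per_hour=2):
    """Return the sorted hours that exceed the error threshold or contain an unexpected LIP."""
    errors = Counter()
    flagged = set()
    for hour, kind in log:
        if kind in ("CV", "CRC"):
            errors[hour] += 1
            if errors[hour] > max_errors_per_hour:
                flagged.add(hour)
        elif kind == "LIP_UNEXPECTED":
            flagged.add(hour)
    return sorted(flagged)


# Made-up log entries as (hour, event kind).
log = [
    (0, "CRC"), (0, "CV"),
    (1, "CRC"), (1, "CRC"), (1, "CV"),
    (3, "LIP_UNEXPECTED"),
    (4, "LIP_EXPECTED"),   # for example, a device was added on purpose
]
print(hours_needing_attention(log))
```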

Check Network Integrity First
When errors appear in a network, testing poses a challenge because a Fibre Channel network provides no easy way to determine where errors arise between the source of information and the device that monitors link activity. The monitoring device might be a performance monitor, or it could be a workstation or hub.

The Accredited Standards Committee ANSI-X3T11 is now working to resolve the problem of identifying sources of errors in a network. The original standard provided for error reporting using a Link Error Status function, but due to ambiguities in the standard, no manufacturer implemented error reporting in a Fibre Channel product.

Until manufacturers provide error-reporting functions, though, you have no easy way to poll the network to determine which port is the first in the network to detect errors. So, as a first step in testing a network, measure the signal power at the receiving end of the network to determine if it’s below the expected value. If it is, the network probably suffers from a link-integrity problem.

If network integrity is fine, yet the network still doesn’t function properly, a protocol analyzer and performance monitor will help locate the source of errors. Start at the network’s origin and use the protocol analyzer to bridge each device on the network, one at a time, until the protocol analyzer captures errors worth analyzing (more than the few-per-hour rate). You’ll see a sharp increase in the error rate when you bridge the offending part of the network.
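The device-by-device search can be summarized as: walk the network from its origin, measure the error rate while bridging each device, and stop at the first measurement that jumps well above the few-per-hour background. The Python sketch below assumes the measurements have already been collected in network order. The threshold of 50 errors/hr and the device names are arbitrary values chosen for illustration.

```python
def first_suspect(measurements, sharp_increase_per_hour=50):
    """Return the first device, in network order, whose error rate exceeds the threshold."""
    for device, errors_per_hour in measurements:
        if errors_per_hour > sharp_increase_per_hour:
            return device
    return None


# Made-up measurements as (device, errors/hr), listed from the network's origin onward.
measurements = [
    ("hub", 1),
    ("workstation", 2),
    ("disk array 1", 250),
    ("disk array 2", 240),
]
print(first_suspect(measurements))
```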

Look for an Overload
Mechanical and electronic problems such as a broken connector or a bad driver IC can cause obvious problems. But errors also can occur when a network suffers from an overload condition, which you can measure using a performance monitor. The monitor’s peak bus-utilization indicator, or the plot of bus use vs. time, may show a sharp increase in use from well within Fibre Channel’s sustained capacity of 60–70 Mbytes/s to nearly 100% capacity, between 80 and 93 Mbytes/s (SCSI protocol). Full use of the Fibre Channel’s capacity may result in delayed (missed) transfers or slow responses.

To help analyze the cause of such a situation, you should set a performance monitor’s threshold to trigger the protocol analyzer at, say, 90% network capacity. Analysis of events just before and through a period of peak bus use may indicate why so much traffic was trying to get onto the network at once.
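A threshold trigger of this kind can be sketched in Python as a scan over utilization samples that stops at the first sample at or above 90%, followed by a slice that keeps the samples just before and after the trigger point. The sample values and the function names are assumptions for this example.

```python
def first_trigger_index(utilization_samples, threshold=0.90):
    """Index of the first sample at or above the threshold, or None if it is never reached."""
    for index, utilization in enumerate(utilization_samples):
        if utilization >= threshold:
            return index
    return None


def window_around(samples, index, before, after):
    """Samples from `before` positions ahead of the trigger through `after` positions past it."""
    start = max(0, index - before)
    return samples[start:index + after + 1]


# Made-up link-utilization samples (fraction of capacity).
samples = [0.62, 0.65, 0.70, 0.88, 0.93, 0.97, 0.71]
trigger_at = first_trigger_index(samples)
print(trigger_at, window_around(samples, trigger_at, before=2, after=1))
```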

Even if the Fibre Channel network doesn’t reach full capacity overall, an individual device on the network may overload. Overloading can occur when one device gets burdened with so many I/O requests that it cannot properly process them. At the same time, other arrays of disks may remain idle.

A protocol analyzer can collect long sequences of only frame headers that you can then analyze offline with software. Analysis will show whether or not the loading is balanced in the short and long term. Balancing refers to the sharing of activity between devices so that one device is not busy most of the time while others see little use.
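One simple offline analysis is to total the traffic addressed to each device and express it as a share of all traffic. The Python sketch below does this for a handful of header records. The `(destination, payload_bytes)` layout and the device names are assumptions, and a real analysis would also look at how the shares change over time.

```python
from collections import Counter


def load_share(headers):
    """Fraction of total bytes addressed to each destination device."""
    totals = Counter()
    for destination, payload_bytes in headers:
        totals[destination] += payload_bytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {device: count / grand_total for device, count in totals.items()}


# Made-up header records as (destination, payload bytes).
headers = [("disk1", 3000), ("disk2", 500), ("disk1", 500)]
shares = load_share(headers)
print(shares)
```

A share that stays far above the others over long periods suggests the unbalanced loading described above.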

If one device seems to be particularly busy, isolate its traffic and study it in more detail. The study may show that the network administrator needs to replace a disk storage system with a faster unit, or that a data structure should be spread between several storage systems to apportion network traffic more evenly among them. As a preventive measure, you should monitor the capacity of the network for all the devices at least weekly to start. As you gain confidence in your measurements, you can stretch monitoring periods. Data from monitoring a network’s activity can reveal increased overall Fibre Channel use or increased use of a particular device in the network.

If you notice an increasing response time of storage units (the time it takes to respond to a request for data), or if you notice a missing-data effect (frames discarded during periods of congested traffic), you should capture detailed information continuously. Protocol analyzers provide a wrap mode that captures data continuously by writing the newest information over the oldest. The analyzer stops capturing traffic when a preset condition triggers it. In this way, you can analyze the traffic on the network up to and including the trigger condition. Remember that you can set up a protocol analyzer or performance monitor to trigger on many different conditions or faults to help you determine the cause of problems.
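Wrap mode behaves like a fixed-size ring buffer. The Python sketch below models it with `collections.deque` and a `maxlen`, which discards the oldest entry automatically when a new one arrives, and stops as soon as a trigger condition is met. The buffer size, the integer stand-ins for frames, and the trigger function are assumptions for illustration.

```python
from collections import deque


def capture_until_trigger(frames, buffer_size, is_trigger):
    """Keep only the newest `buffer_size` frames and stop at the first trigger frame."""
    buffer = deque(maxlen=buffer_size)  # oldest entries are overwritten automatically
    for frame in frames:
        buffer.append(frame)
        if is_trigger(frame):
            break
    return list(buffer)


# Made-up traffic: frames numbered 0..9, with frame 6 standing in for the fault.
history = capture_until_trigger(range(10), buffer_size=4, is_trigger=lambda f: f == 6)
print(history)
```

The returned list holds the traffic leading up to and including the trigger, which is the portion you need for diagnosis.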

Home in on Errors
Always check the time stamps on captured information to be sure the analyzer is capturing enough information to adequately cover times when problems occur. Determining the proper coverage may take some “cut and try” work. If necessary, readjust trigger and filter characteristics, such as the amount of data captured per frame, to lengthen the capture period.
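The relationship between bytes captured per frame and capture period is a simple division: capture time equals buffer size divided by the product of frame rate and bytes saved per frame. The Python sketch below works one example. The buffer size, frame rate, and per-frame byte counts are made-up values chosen to give round numbers.

```python
def capture_seconds(buffer_bytes, frames_per_s, saved_bytes_per_frame):
    """How long a capture buffer lasts at a given frame rate and bytes saved per frame."""
    return buffer_bytes / (frames_per_s * saved_bytes_per_frame)


BUFFER_BYTES = 96_000_000   # hypothetical capture-buffer size
FRAMES_PER_S = 40_000       # hypothetical traffic level

whole_frames = capture_seconds(BUFFER_BYTES, FRAMES_PER_S, saved_bytes_per_frame=2_400)
headers_only = capture_seconds(BUFFER_BYTES, FRAMES_PER_S, saved_bytes_per_frame=24)
print(whole_frames, headers_only)
```

Under these assumed numbers, saving 24 bytes per frame instead of 2,400 stretches the capture period by a factor of 100.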

If you need still more detailed information, run additional tests, but save only frames corresponding to a particular device ID, thereby capturing more frames over a longer period for the device that you suspect of causing problems. Then, when you find the frames that seem to cause problems, adjust triggering to capture more traffic in the vicinity of those frames. T&MW

FOR FURTHER READING
Dedek, Jan, with Gary Stephens, What Is Fibre Channel? 4th ed., Ancot, Menlo Park, CA (650-322-5322). ISBN: 0963743996. 1997.
Kembel, Robert, The Fibre Channel Consultant: Arbitrated Loop, Connectivity Solutions, Tucson, AZ (520-881-0877). ISBN: 0-931-836-82-4. 1997.

More information about Fibre Channel is available from the Fibre Channel Loop Community (Saratoga, CA), an organization that supports Fibre Channel education and standards, www.fibrechannel.com.

Paul Levin is Senior Principal Engineer and Greg Beutler is Product Planning Manager at Xyratex International (Irvine, CA), a company that manufactures Fibre Channel test instruments. 949-476-1016. paul_levin@us.xyratex.com ; gregory_beutler@us.xyratex.com.   
