Get a grip on performance limits
Measure your embedded system's timing to learn how it responds to interrupts.
Andrew Girson, InHand Electronics, Rockville, MD -- Test & Measurement World, 1/1/2001
Embedded systems rely on interrupts to tell a processor when something needs service. Often, the system's response time to an interrupt is critical to proper operations. To learn how well your embedded system responds to interrupts, you'll have to make your own timing measurements.
You may need to find the upper boundary of how fast your software can keep servicing a series of interrupts and then decide if that upper boundary is acceptable for your application. You have several options for making these measurements, and each requires just one piece of test equipment: an oscilloscope, a PC with a counter/timer card, or a function generator.
Operating systems (OSs) typically contain interrupt service routines (ISRs) that notify the OS that a specific interrupt occurred. If the interrupt has a high priority, the OS will often store information about the currently executing thread's status and switch to a software thread called an interrupt service thread (IST). An IST services the device causing the interrupt. If the interrupt has a low priority, then the OS will wait until it completes its current thread before running the IST. The measurement I'll describe applies to high-priority interrupts only because you generally won't care how quickly an OS services low-priority interrupts.
An embedded system's interrupt response time typically consists of three components. The sum of the three components is the overall response time of your embedded system from hardware interrupt to IST completion (assuming it has only one high-priority interrupt and thread). The components are:
1. The time between when a physical interrupt (such as a rising or falling edge on an interrupt pin) occurs and when the ISR begins to run. This period is generally known as the interrupt latency or hardware-interrupt latency.
2. The time between when the ISR begins to run and when the operating system switches tasks to the IST that services the interrupt. This period includes the time that the CPU spends inside the ISR plus the OS's task-switch latency, also known as scheduling latency or IST latency.
3. The time required for the high-priority IST to perform its tasks. This period is the one that is most under your control.
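As a rough illustration of how the three components add up, the sketch below sums a hypothetical timing budget. The class name, field names, and numbers are placeholders chosen for the example, not measurements from this article.

```python
from dataclasses import dataclass


@dataclass
class ResponseBudget:
    """Three components of interrupt response time, in microseconds."""
    interrupt_latency_us: float  # component 1: pin edge to ISR start
    ist_latency_us: float        # component 2: ISR start to IST start
    ist_run_time_us: float       # component 3: IST start to IST completion

    def total_us(self) -> float:
        # Overall response time is the plain sum of the three parts.
        return (self.interrupt_latency_us
                + self.ist_latency_us
                + self.ist_run_time_us)


# Hypothetical values, used only to show the arithmetic.
budget = ResponseBudget(interrupt_latency_us=4.0,
                        ist_latency_us=12.0,
                        ist_run_time_us=10.0)
print(f"Overall response time: {budget.total_us():.1f} us")
```

Only the third field reflects code you write yourself, which is why it is the easiest one to shrink.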
You have several options for measuring response time. You can use software in your embedded system to simulate interrupts, or you can generate hardware interrupts from an external source. Then, you can use hardware or software to measure the time between an interrupt's generation and the IST's completion. You can also count the number of completed interrupts and compare the count to the number of generated interrupts.
Start with one
You can start by measuring a system's response time with a single interrupt instance. You need a digital storage oscilloscope (DSO) with cursors, and you need to write ISR and IST code for your embedded-system unit under test (UUT). The ISR and IST code must process the interrupt and generate pulses that indicate the start and end of the interrupt service.
If you're simulating interrupts with your embedded-system software, you'll also need to make the pulse available to a DSO. Write code that generates the simulated interrupt pulse and sends that pulse to an unused processor I/O pin. Next, use a wire to loop the pulse back into an interrupt input pin. Connect a scope probe to that wire to view the pulse.
To complete the measurement, you need to generate a signal indicating that the IST has completed its work. Add code to your IST that generates a pulse on another of the processor's I/O pins after the IST has serviced the interrupt. Connect that I/O pin to the oscilloscope's second channel.
Now, you'll have two scope traces, one indicating the start of the interrupt and the other indicating the completion of the IST. Measure the time between these two events with the scope's cursors; you'll have the system's response time.
Figure 1 Time T2, about 40 µs, is the overall response time from a single interrupt in an embedded system containing an Intel StrongARM processor running Microsoft Windows CE.
Figure 1 shows a measurement I took on an embedded system containing an Intel StrongARM processor running Microsoft Windows CE. The overall response (T2) is the time between the falling-edge interrupt in the top waveform and the rising edge (IST completion) in the bottom waveform. At 200 µs per division, T2 measures about 40 µs. T1 indicates the time between one interrupt and the next, approximately 500 µs.
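The two cursor readings from Figure 1 can be turned into an interrupt rate and a share of the interrupt period. This is a minimal sketch of that arithmetic. The function names are illustrative, and the only inputs are the approximate T1 and T2 values quoted above.

```python
def interrupt_rate_hz(period_us: float) -> float:
    """Interrupt repetition rate for a given spacing between interrupts."""
    return 1_000_000.0 / period_us


def response_fraction(response_us: float, period_us: float) -> float:
    """Share of each interrupt period that elapses before the IST completes."""
    return response_us / period_us


T1_US = 500.0  # spacing between interrupts, read from Figure 1
T2_US = 40.0   # overall response time, read from Figure 1

print(f"Interrupt rate: {interrupt_rate_hz(T1_US):.0f} Hz")
print(f"Response takes {response_fraction(T2_US, T1_US):.0%} of each period")
```

A small fraction suggests headroom at this interrupt rate, but a single reading says nothing about the worst case.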
The method I just described gives you just one measurement. You get a far more reliable picture, though, when you take thousands of response-time measurements and analyze them statistically.
Some bench DSOs let you take multiple measurements, store them, and perform statistical analysis right in the box. These instruments can generate histograms of the repeated measurements. By analyzing the histograms, you can get average and upper-bound overall latency measurements that let you better determine how well your embedded system services interrupts.
Add a card
If you don’t have such a sophisticated DSO, you can still take thousands of measurements. A counter/timer card in a PC can measure time periods, and the PC can analyze the results.
To get the data, you must connect your microprocessor’s interrupt input line and the I/O pin with the “IST completion” pulse to the counter/timer card’s I/O connector. A counter/timer card can measure the time between the two events (T2 in Figure 1). You can even use the card to generate the interrupt pulses instead of using software in your embedded system. You can then store the measurements in a PC and perform the statistical analysis later.
Although less expensive than the DSO approach, the counter/timer card approach requires a fair amount of PC programming. You must configure the card to generate pulses, start counting, stop counting, and transfer its results to the PC. Then, you need a software package capable of statistical analysis, or you have to write your own analysis code.
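If you write your own analysis code, it can be quite short. The sketch below converts raw counter readings to microseconds and reports the mean, the worst case, and a simple histogram. The 20-MHz timebase, the counter readings, and the function names are assumptions made for illustration. A real card's driver determines how you actually read the counts.

```python
import statistics
from collections import Counter
from typing import Dict, List


def ticks_to_us(ticks: int, timebase_hz: float) -> float:
    """Convert a counter reading to microseconds for a given timebase."""
    return ticks * 1_000_000.0 / timebase_hz


def summarize(samples_us: List[float], bin_width_us: float) -> Dict[str, object]:
    """Return count, mean, worst case, and a histogram keyed by bin start."""
    bins = Counter(int(s // bin_width_us) * bin_width_us for s in samples_us)
    return {
        "count": len(samples_us),
        "mean_us": statistics.fmean(samples_us),
        "max_us": max(samples_us),
        "histogram": dict(sorted(bins.items())),
    }


# Made-up counter readings and an assumed 20 MHz timebase, for illustration.
TIMEBASE_HZ = 20_000_000.0
readings = [760, 800, 820, 790, 1500, 810]
samples = [ticks_to_us(t, TIMEBASE_HZ) for t in readings]

summary = summarize(samples, bin_width_us=10.0)
print(f"mean {summary['mean_us']:.1f} us, worst case {summary['max_us']:.1f} us")
for bin_start, count in summary["histogram"].items():
    print(f"{bin_start:5.0f} us: {'#' * count}")
```

The worst case matters more than the mean here, because a single slow response (the 1500-tick reading in this made-up data) is what breaks a real-time requirement.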
Use a function generator
The scope and counter/timer methods produce measurements of the overall latency from which you can calculate how quickly your embedded system services interrupts. You can, though, set up a test to directly measure how fast external sources can generate interrupts that your embedded system can reliably process. In this case, you can use a function generator to create the interrupt pulses. This method gives you control over how fast the interrupts repeat. With a function generator, you can increase the interrupt frequency until the UUT can no longer reliably service the interrupts.
Figure 2 A function generator can initiate interrupts in an embedded system so you can measure the system’s overall response time.
Figure 2 shows how to set up this test. Connect the processor’s interrupt line to the function generator’s output only. (Otherwise, you might place an extra load on the line that will interfere with the test.) Set the function generator to generate a square wave at voltage levels compatible with the processor’s interrupt input pin. Set the square wave’s frequency to the rate of interrupts you expect to get when the UUT is in a real application.
Instead of generating a pulse after each completed IST, you should add code to your UUT that increments a software counter each time the IST ends. You need just one line of source code to increment the counter. You’ll also need to add code to your UUT to send the count data to a PC after the test runs. (If your UUT has its own display, you can just print the test results to the display.)
Set the square wave’s frequency to mimic the interrupts of a real-world application. When I ran these tests, I set the square wave to simulate an ADC’s sample and conversion time because my embedded system takes measurements. Set the function generator to its burst mode, which typically lets you push a trigger button on the front panel and have the function generator generate a specific number of cycles of the waveform.
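To pick the starting frequency and estimate how long a burst will last, you can work from the timing of the device you are simulating. This is a minimal sketch, assuming one interrupt per ADC result. The sample time, conversion time, and cycle count are hypothetical values, not the ones I used.

```python
def interrupt_frequency_hz(sample_time_us: float, conversion_time_us: float) -> float:
    """Square-wave frequency that yields one interrupt per ADC result."""
    return 1_000_000.0 / (sample_time_us + conversion_time_us)


def burst_duration_s(cycles: int, frequency_hz: float) -> float:
    """Time the function generator needs to emit the whole burst."""
    return cycles / frequency_hz


# Hypothetical ADC timing and burst length, for illustration only.
frequency = interrupt_frequency_hz(sample_time_us=10.0, conversion_time_us=90.0)
duration = burst_duration_s(cycles=10_000, frequency_hz=frequency)

print(f"Set the generator to {frequency:.0f} Hz")
print(f"A 10,000-cycle burst lasts {duration:.2f} s")
```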
Now that you have a mechanism for generating interrupts and counting the completion of ISTs, you’re ready to take measurements. Start your UUT’s software and then push the function generator’s trigger button. After the burst stops, retrieve the value of the software counter. If the UUT services all of the interrupts, then the value in the counter will equal the number of pulses in the burst. If the test result doesn’t match the number of pulses, then reduce the function generator’s frequency until all interrupts complete their tasks successfully.
When your embedded system is able to service all interrupts, then increase the frequency of the square wave, thereby reducing the time between interrupts. You can repeatedly perform this test over long bursting intervals, increasing the frequency each time until the software counter no longer equals the number of cycles in the burst. At this point, you will have a good estimate of the upper boundary on the overall latency for your embedded system. “A Test of Windows CE” shows results of a test I performed in my lab.
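The step-up procedure can be written down as a short loop. In the sketch below, `run_burst` is a hypothetical hook: in a real test it would stand for triggering the burst and reading back the UUT's software counter, which may well be a manual step. The `simulated_uut` function is only a stand-in model that drops any interrupt arriving while the previous one is still being serviced, so that the loop can run without hardware.

```python
from typing import Callable, Optional


def find_max_reliable_frequency(
    run_burst: Callable[[float, int], int],
    start_hz: float,
    step_hz: float,
    limit_hz: float,
    cycles: int,
) -> Optional[float]:
    """Raise the frequency until the IST count stops matching the burst.

    Returns the highest frequency that passed, or None if even the
    starting frequency failed.
    """
    last_good = None
    frequency = start_hz
    while frequency <= limit_hz:
        if run_burst(frequency, cycles) != cycles:
            break  # at least one interrupt was missed
        last_good = frequency
        frequency += step_hz
    return last_good


def simulated_uut(service_time_us: float) -> Callable[[float, int], int]:
    """Stand-in for hardware: interrupts that arrive while busy are lost."""
    def run_burst(frequency_hz: float, cycles: int) -> int:
        period_us = 1_000_000.0 / frequency_hz
        serviced = 0
        busy_until_us = 0.0
        for n in range(cycles):
            arrival_us = n * period_us
            if arrival_us >= busy_until_us:
                serviced += 1
                busy_until_us = arrival_us + service_time_us
        return serviced
    return run_burst


# Model a UUT that needs 40 us per interrupt (an assumed figure).
best = find_max_reliable_frequency(
    simulated_uut(service_time_us=40.0),
    start_hz=5_000.0, step_hz=5_000.0, limit_hz=100_000.0, cycles=1_000,
)
if best is not None:
    print(f"Highest passing frequency: {best:.0f} Hz")
    print(f"Implied upper bound on response time: {1_000_000.0 / best:.0f} us")
```

The period of the highest passing frequency is the estimate of the upper bound. A smaller step size narrows that estimate at the cost of more bursts.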
You can also analyze the performance of multiple interrupts. In this case, you can set up a second channel on the function generator (if it has two channels), or feed the sync pulse from one function generator to another function generator to get a synchronized second channel.
A simpler approach, however, is to set up two input pins for two interrupts: one that triggers on the square wave’s rising edge and the other on its falling edge. You can feed the square wave from the function generator into both interrupt inputs. By varying the duty cycle of the square wave, you can find the time spacing between the two interrupts at which servicing starts to fail. By varying the frequency of the square wave, you can analyze both the time-spacing effects and the overall latency effects simultaneously.
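The spacing between the two interrupts follows directly from the frequency and duty cycle. The sketch below shows the arithmetic, assuming that duty cycle means the fraction of each period the square wave spends high. The function name and the example settings are illustrative.

```python
from typing import Tuple


def edge_spacings_us(frequency_hz: float, duty_cycle: float) -> Tuple[float, float]:
    """Return (rising-to-falling, falling-to-next-rising) spacing in microseconds."""
    if not 0.0 < duty_cycle < 1.0:
        raise ValueError("duty_cycle must be between 0 and 1 (exclusive)")
    period_us = 1_000_000.0 / frequency_hz
    high_us = duty_cycle * period_us
    return high_us, period_us - high_us


# Example settings: a 2 kHz square wave with a 25% duty cycle.
rise_to_fall, fall_to_rise = edge_spacings_us(frequency_hz=2_000.0, duty_cycle=0.25)
print(f"Rising-edge interrupt to falling-edge interrupt: {rise_to_fall:.0f} us")
print(f"Falling-edge interrupt to next rising-edge interrupt: {fall_to_rise:.0f} us")
```

Lowering the duty cycle at a fixed frequency shrinks the first spacing while leaving the overall interrupt rate on each pin unchanged.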
Andrew Girson is CEO of InHand Electronics, a developer of handheld instrumentation platforms for OEMs. He received an MSEE from the University of Virginia. E-mail: agirson@inhandelectronics.com.
A test of Windows CE
An embedded operating system’s quoted or perceived performance usually differs from its actual performance, which is often unknown. A perfect example of this confusion surrounds Microsoft’s Windows CE operating system. Microsoft trumpets Windows CE as bringing the power of Windows to embedded, real-time devices, yet competitors deride it as neither stable nor fast enough and as lacking in critical real-time features. I’ve had many conversations with people who assumed that such an OS, because of its Windows legacy, was simply not appropriate for certain embedded and portable applications.
To better understand the limitations of the OS, I decided to quantify the performance of one of my company’s Elf handheld single-board computer platforms. This platform uses Intel’s StrongARM CPU and runs Windows CE 3.0. I created a simple software application with one high-priority interrupt and an associated IST that simply incremented a software counter. I connected my function generator and then tested the software under several scenarios, using the burst approach outlined in the main article.
The results in Table 1 demonstrate the application’s performance under two scenarios. In one case, the OS resides in non-burst-mode flash memory; in the other case, the OS resides in faster DRAM. For each scenario, I also varied the IST priority from 0 (highest priority) to 251 (a “normal” priority used by many generic OS threads). Overall upper-bound response-time measurements vary dramatically as a function of OS location and IST priority.
Windows CE proved itself capable of handling my embedded applications. It can achieve overall response times much faster than 1 ms, a result that has proved surprising to some engineers. Still more interesting is that a simple thing like executing out of DRAM can have such an effect on performance of high-priority interrupts and ISTs. Software location is just one of many variables that can affect real-time performance.
-- Andrew Girson


















