How to debug embedded systems
Ilias Alexopoulos - December 11, 2012

Embedded systems design is a challenging field. Each project is unique, with diverse needs and constraints. This tutorial discusses methods, tips, and tricks for debugging embedded systems firmware using logic analyzers and digital oscilloscopes. There are many ways to debug embedded systems. Tools that help include simulators, in-circuit emulators, JTAG/BDM debuggers, custom hardware, LEDs and switches, and serial or other communication ports. Depending on budget and complexity, designers choose the tools that best fit their needs. Although digital or analog oscilloscopes and logic analyzers are primarily targeted at hardware design, you can use them for firmware development as well. The equivalent firmware tool for these is the trace buffer.
Embedded firmware development differs from traditional software development in many ways. Apart from limited hardware resources (memory, speed, and tools), embedded systems always have to account for time. We often use the term real-time system to describe embedded systems. If the system misses its response timing, it is not considered working even if it is functionally perfect. For example, if you drive external hardware such as motors, it is not easy to stop your program and execute code step by step to find the problem. The problem may also be intermittent and occur only when running at full speed. One way to overcome these problems is to use a trace buffer [1, page 33].
A trace buffer records actions in a memory buffer. After the event, you can stop the debugger or gather the data for analysis through a communication port (e.g. UART, SPI, Ethernet). This method has advantages. For instance, since it is a firmware solution, it offers flexibility. In addition, you can access any internal variable. However, trace buffers need memory and some management code, and their contents may be difficult to correlate with external hardware events. Although newer processors offer more memory, the application often consumes it.
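As a rough illustration of the idea, a trace buffer can be as simple as a ring buffer of small records. The sketch below is only one possible layout, not the one used in the projects described here: the names `trace_log`, `trace_get`, and `trace_entry_t`, the 64-entry depth, and the 16-bit code/data fields are all assumptions made for the example.

```c
#include <stdint.h>
#include <stddef.h>

#define TRACE_DEPTH 64u  /* assumed depth; must be a power of two */

typedef struct {
    uint16_t code;  /* hypothetical event identifier */
    uint16_t data;  /* optional payload, e.g. the value of a variable */
} trace_entry_t;

static trace_entry_t trace_buf[TRACE_DEPTH];
static volatile uint32_t trace_head;  /* total number of entries written */

/* Record one event. When the buffer is full, the oldest entry is
 * overwritten. Not interrupt-safe as written: guard the call if it is
 * used from both main code and interrupt handlers. */
static void trace_log(uint16_t code, uint16_t data)
{
    uint32_t i = trace_head & (TRACE_DEPTH - 1u);
    trace_buf[i].code = code;
    trace_buf[i].data = data;
    trace_head++;
}

/* Number of valid entries currently held. */
static uint32_t trace_count(void)
{
    return (trace_head < TRACE_DEPTH) ? trace_head : TRACE_DEPTH;
}

/* Copy the n-th oldest entry (0 = oldest) into *out.
 * Returns 1 on success, 0 if n is out of range or out is NULL. */
static int trace_get(uint32_t n, trace_entry_t *out)
{
    uint32_t count = trace_count();
    if (out == NULL || n >= count) {
        return 0;
    }
    uint32_t start = trace_head - count;
    *out = trace_buf[(start + n) & (TRACE_DEPTH - 1u)];
    return 1;
}
```

After the event of interest, the entries can be read out oldest-first with `trace_get`, either from a debugger or by sending them over a communication port. The memory cost mentioned above is visible directly: 64 entries of 4 bytes take 256 bytes of RAM.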
Another method is to use action codes [1, page 43]. Instead of capturing data in the system’s RAM, you send the codes to an external register or to pins and capture the outputs with a logic analyzer. This overcomes the memory problem, and the required code is very small and fast. In this case, the limitations are the number of available output pins and the instrument’s memory capacity. Fortunately, synchronizing with external events is easy, and you have very strong triggering capability if you use a relatively modern logic analyzer. This article includes techniques and examples for this approach.
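A minimal sketch of the firmware side of action codes might look like the following. Everything named here is hypothetical: on real hardware `ACTION_PORT` would be a memory-mapped output port or latch wired to the analyzer probes, and the code values are invented for illustration. The toggling strobe bit is an added design choice, not something taken from the reference, so that two identical codes in a row still produce a visible edge.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped 8-bit output port so that the sketch
 * compiles on a host. On a target this would be a register address. */
static volatile uint8_t action_port_sim;
#define ACTION_PORT (action_port_sim)

/* Hypothetical action codes; 7 bits are available (0x00 to 0x7F). */
enum {
    ACT_IDLE         = 0x00,
    ACT_KEY_SCAN     = 0x10,
    ACT_IRQ_REQUEST  = 0x20,
    ACT_ACK_RECEIVED = 0x21
};

/* Emit one action code on bits 0..6. Bit 7 toggles on every call and
 * can serve as a clock or qualifier for the logic analyzer. */
static inline void action_code(uint8_t code)
{
    static uint8_t strobe;
    strobe ^= 0x80u;
    ACTION_PORT = (uint8_t)(strobe | (code & 0x7Fu));
}
```

A call such as `action_code(ACT_IRQ_REQUEST);` placed at the point of interest costs only a few instructions, which is why the timing impact is much smaller than that of printing over a serial port.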
Example Case 1
On a project I was involved with in the mid-’90s, we had a system with two devices: a microcontroller doing the I/O work of the board, and a microprocessor running the application code. We were a team of two hardware engineers and three software engineers developing the product. The two processors communicated through an 8-bit latched register port. The master processor (application) polled the slave (I/O processor) for events. The communication was based on interrupts (request) and acknowledgement from the slave processor side. The hardware engineers (including me) were responsible for the hardware and for providing a hardware abstraction layer for the application. The I/O controller, for which I was writing the firmware, ran standalone firmware that communicated with the application processor.
The software team occasionally complained that keyboard events went missing during their testing. As you might expect, we (the smart hardware engineers) never investigated the issue because we believed the error was probably on their side. After a few months and repeated complaints, we decided to investigate. We used in-circuit emulators, and we implemented a trace buffer on the application processor. We could not track down the problem from the software side, though. Each of us concluded that the software worked functionally as expected. The suspect was the inter-processor interface, so we arranged to have a logic analyzer.
The specific instrument had 4K bytes of storage memory. We attached the probes quickly and started looking at the signals. Soon we realized that the memory was not enough to investigate this issue. Even worse, there was no other analyzer with much more memory available at the time. While the senior engineer was disappointed, I started to work with the trigger mechanism. I was motivated to accomplish the task with what we had. It was only the second time I had used a logic analyzer. I had never imagined that I would use this kind of instrument at all, especially for low-end embedded systems. I was so wrong! I managed to set up multilevel triggering with about 10 stages. And then it happened. The instrument triggered, and we saw the problem (despite the small memory). The I/O controller responded too fast for the application processor. The problem was resolved with a few ‘no operations’ (NOPs).
Lessons learned: Firmware usually has many states (memory) and actions that are correlated. You need as much capture memory as you can get to debug it. Also, a capable multistage triggering mechanism is essential for more complex problems.
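To make the idea of multistage triggering concrete, the sketch below models a simple sequential trigger in software: each stage waits for a masked pattern on the probed bus, and the trigger fires only after all stages have matched in order. This is a simplified model under assumed names (`trig_stage_t`, `trig_seq_t`, `trig_feed`), not the setup used on the instrument in this story. Real analyzers typically add counters, timers, and branch or reset conditions to each stage.

```c
#include <stdint.h>
#include <stddef.h>

/* One trigger stage: matches when (sample & mask) == value. */
typedef struct {
    uint8_t mask;
    uint8_t value;
} trig_stage_t;

typedef struct {
    const trig_stage_t *stages;
    size_t n_stages;
    size_t current;  /* index of the stage currently waiting to match */
} trig_seq_t;

/* Feed one bus sample to the sequencer.
 * Returns 1 once the final stage has matched, otherwise 0. */
static int trig_feed(trig_seq_t *t, uint8_t sample)
{
    if (t->current >= t->n_stages) {
        return 1;  /* already triggered */
    }
    const trig_stage_t *s = &t->stages[t->current];
    if ((sample & s->mask) == s->value) {
        t->current++;  /* advance to the next stage */
    }
    return t->current >= t->n_stages;
}
```

Because samples that do not match the current stage are simply ignored, only the moment the last stage matches needs to land in capture memory, which is how a deep trigger sequence can compensate for a shallow buffer.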