Introduction to Interrupts

Normal execution of a given software application is contained within the bounds of one program, or instruction stream. Such execution is provable, as well as traceable. However, system designers and implementers also have to understand how breaks in program flow occur, and how they may affect the running program. Flow breaks fall into two general classes:

- Exceptions and traps are predictable, synchronous breaks in program flow. They are synchronous because they are caused by the execution of certain instructions (divide by zero; illegal memory access; software interrupt). Exceptions and traps trigger execution of instructions that are not part of the program, but perform work on behalf of it.
- Interrupts are asynchronous breaks in program flow that occur as a result of events outside the running program. They are usually hardware related, stemming from events such as a button press, timer expiration, or completion of a data transfer. We can see from these examples that interrupt conditions are independent of particular instructions; they can happen at any time. Interrupts trigger execution of instructions that perform work on behalf of the system, but not necessarily the current program.

This article explains how an interrupt is handled by the processor and software.

Why interrupt?

As a system's functional requirements and the size of the software grow, it becomes more difficult to ensure that time-critical items (such as capturing incoming data before it is overwritten by the hardware) are performed properly. We can approach this dilemma with a faster processor (more cost, more heat, and more radiated noise), or we can separate the time-critical functions from the others and execute them in a prioritized manner. Interrupts form the basis for this separation. The non-time-critical functions continue to execute as quickly as they can (within the main loop), but time-critical functions are executed on demand, in response to interrupts from the hardware.

Hardware

When a device asserts its interrupt request signal, the request must be processed in an orderly fashion. All CPUs, and many devices, have some mechanism for enabling/disabling interrupt recognition and processing:

- At the device level, there is usually an interrupt control register with bits to enable or disable the interrupts that device can generate.
- At the CPU level, a global mechanism (often called the global interrupt enable) inhibits or enables recognition of interrupts.
- Systems with multiple interrupt inputs provide the ability to mask (inhibit) interrupt requests individually and/or on a priority basis. This capability may be built into the CPU or provided by an external interrupt controller. Typically, there are one or more interrupt mask registers, with individual bits allowing or inhibiting individual interrupt sources.
- There is often also one non-maskable interrupt input to the CPU that is used to signal important conditions such as pending power fail, reset button pressed, or watchdog timer expiration.
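The levels of enabling described above can be sketched in C. This is a minimal model, not real hardware access: the structure names, fields, and bit layouts are assumptions made for illustration, and the non-maskable input is left out because it bypasses these checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register model; names and bit layouts are illustrative only. */
typedef struct {
    uint8_t int_enable;   /* device interrupt control register: 1 = enabled */
    uint8_t int_pending;  /* events the device has raised                   */
} device_t;

typedef struct {
    uint8_t mask;           /* interrupt mask register: 1 = source allowed */
    bool    global_enable;  /* global interrupt enable flag                */
} cpu_t;

/* A request on input `line` (0..7) is recognized only if the device has the
 * event enabled, the input is not masked, and interrupts are globally on. */
bool interrupt_recognized(const device_t *dev, const cpu_t *cpu,
                          unsigned line, uint8_t event_bit)
{
    bool device_raises = (dev->int_pending & dev->int_enable & event_bit) != 0;
    bool unmasked      = (cpu->mask & (1u << line)) != 0;
    return device_raises && unmasked && cpu->global_enable;
}
```

Clearing any one of the three conditions is enough to keep the request from reaching the ISR, which is why a missing interrupt can be so tedious to track down.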

Figure 1 shows an interrupt controller, two devices capable of producing interrupts, a processor, and the interrupt-related paths among them. The interrupt controller multiplexes multiple input requests into one output. It shows which inputs are active and allows individual inputs to be masked. Alternatively, it prioritizes the inputs, shows the highest active input, and provides a mask for inputs below a given level. The processor status register has a global interrupt enable flag bit. In addition, a watchdog timer is connected to the non-maskable interrupt input.

The interrupt software associated with a specific device is known as its interrupt service routine (ISR), or handler.

Software

Some older CPUs routed all interrupts to a single ISR. Upon recognizing an interrupt, the CPU saved some state information and started execution at a fixed location. The ISR at that location had to poll the devices in priority order to determine which one required service. However, the basic process of interrupt handling is the same as in the more complex case.
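A single ISR that polls devices in priority order might look roughly like the sketch below. The `device_t` structure and its `pending` flag are assumptions standing in for reads of real device status registers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model: `pending` stands in for a status-register bit. */
typedef struct {
    bool pending;   /* device is requesting service    */
    int  serviced;  /* number of times it was serviced */
} device_t;

/* Single ISR shared by all devices. The array is ordered highest priority
 * first, so the first pending device found is the one that gets serviced.
 * Returns the index of the serviced device, or -1 if none was pending. */
int single_isr(device_t *devices, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (devices[i].pending) {
            devices[i].pending = false;  /* service the device, clear request */
            devices[i].serviced++;
            return (int)i;
        }
    }
    return -1;  /* spurious interrupt: nothing needed service */
}
```

If a lower-priority device is still pending when the ISR returns, its request line stays asserted and the ISR is simply entered again.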

Most modern CPUs use the same general mechanism for processing exceptions, traps, and interrupts: an interrupt vector table. In some CPUs, the vector table contains only the addresses of the code to be executed. In most cases, a specific ISR is responsible for servicing each interrupting device and acknowledging, clearing, and rearming its interrupt; in some cases, servicing the device (for example, reading data from a serial port) automatically clears and rearms the interrupt.
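For the case where the table holds only addresses, a vector table can be modeled in C as an array of function pointers. This is a sketch under assumptions: the vector numbers, the handler names, and the `dispatch` function (which stands in for what the hardware does on recognition) are all hypothetical.

```c
#include <stddef.h>

#define NUM_VECTORS 4  /* hypothetical number of interrupt sources */

typedef void (*isr_t)(void);

int timer_ticks;  /* work done by the example handlers */
int uart_bytes;

static void timer_isr(void) { timer_ticks++; }
static void uart_isr(void)  { uart_bytes++; }

/* The table holds only handler addresses, indexed by vector number. */
static isr_t vector_table[NUM_VECTORS] = {
    [0] = timer_isr,
    [1] = uart_isr,
    /* remaining entries are NULL: no handler installed */
};

/* Stand-in for the hardware lookup: fetch the handler for `vector` and
 * call it. Returns 1 if a handler ran, 0 if the vector has no handler. */
int dispatch(unsigned vector)
{
    if (vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return 0;
    vector_table[vector]();
    return 1;
}
```

On a real part the table usually lives at a location fixed or configured by the CPU, and the lookup costs no software instructions at all.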

Interrupts may occur at any time, but the CPU does not recognize and process them instantly. First, the CPU will not recognize a new interrupt while interrupts are disabled. Second, the CPU must, upon recognition, stop fetching new instructions and complete those still in progress. Because the interrupt is totally unrelated to the running program it interrupts, the CPU and ISR work together to save and restore the full state of the interrupted program (stack, flags, registers, and so on). The running program is not affected by the interruption, although it takes longer to execute. The hardware and software flow for a timer interrupt is shown in Figure 2.

Many interrupt controllers provide a means of prioritizing interrupt sources, so that, in the event of multiple interrupts occurring at (approximately) the same time, the more time-critical ones are processed first. These same systems usually also provide for prioritized interrupt handling, a means by which a higher-priority interrupt can interrupt the processing of a lower-priority interrupt. This is called interrupt nesting. In general, the ISR should take care of only the time-critical portion of the processing. Then, depending on the complexity of the system, it may set a flag for the main loop or use an operating system call to awaken a task that performs the non-time-critical portion.
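The "set a flag for the main loop" approach can be sketched as follows. The names are hypothetical, and the `hw_byte` parameter stands in for a read of a device data register; a real ISR would take no arguments.

```c
#include <stdbool.h>
#include <stdint.h>

volatile bool    data_ready;       /* set by the ISR, cleared by the main loop */
volatile uint8_t latest_byte;      /* captured in the ISR                      */
int              processed_count;  /* bytes handled by the main loop           */

/* Time-critical part only: capture the byte before the hardware
 * overwrites it, then tell the main loop there is work to do. */
void rx_isr(uint8_t hw_byte)
{
    latest_byte = hw_byte;
    data_ready  = true;
}

/* One pass of the main loop: do the slower work only when flagged. */
void main_loop_iteration(void)
{
    if (data_ready) {
        data_ready = false;
        uint8_t b = latest_byte;
        (void)b;  /* non-time-critical processing would go here */
        processed_count++;
    }
}
```

This one-byte handoff assumes the main loop always runs before the next byte arrives. A design that cannot guarantee this would more likely use a buffer between the ISR and the main loop.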

Latency

The interrupt latency is the interval of time measured from the instant an interrupt is asserted until the corresponding ISR begins to execute. The worst-case latency for any given interrupt is a sum of many things, from longest to shortest:

- The longest period during which global interrupt recognition is inhibited
- The time it would take to execute all higher-priority interrupts if they occurred simultaneously
- The time it takes the specific ISR to service all of its interrupt requests (if multiple are possible)
- The time it takes to finish the program instructions in progress and save the current program state and begin the ISR

We can see how higher-priority interrupts can have much lower latencies. In simple cases, latency can be calculated from instruction times, but many modern systems, with 32-bit CPUs, caches, and multiple interrupt sources, are far too complex for exact latency calculations. Interrupt latency must be considered at design time whenever responsiveness matters.
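For the simple case, the worst-case sum can be written down directly. The structure below mirrors the four contributions listed above; the field names and any cycle counts plugged into it are assumptions for illustration, not measurements from a real system.

```c
#include <stdint.h>

/* All times in CPU cycles. Field names are hypothetical. */
typedef struct {
    uint32_t max_interrupts_disabled;  /* longest period recognition is inhibited */
    uint32_t higher_priority_isrs;     /* all higher-priority ISRs back to back   */
    uint32_t own_isr_backlog;          /* this ISR servicing its earlier requests */
    uint32_t finish_and_context_save;  /* finish instructions, save state, enter  */
} latency_budget_t;

/* Worst-case latency is the sum of the four contributions. */
uint32_t worst_case_latency(const latency_budget_t *b)
{
    return b->max_interrupts_disabled
         + b->higher_priority_isrs
         + b->own_isr_backlog
         + b->finish_and_context_save;
}
```

With assumed figures of 200, 350, 120, and 30 cycles, the budget comes to 700 cycles. For the highest-priority interrupt the second term is zero, which is one way to see why higher-priority interrupts have lower latencies.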

Introduction to Interrupt Debugging

Interrupt-related problems are among the hardest to debug. Here's a primer on some common pitfalls to avoid.

Interrupts are, in many cases, the key to real-time embedded systems. There's often no other way to make sure that a particular piece of code executes in a timely manner. Unfortunately, interrupts can increase a system's complexity and make overall operation less predictable.

An interrupt signals an event to the microprocessor. It could indicate that a particular amount of time has elapsed, that a user has pressed a button, or that a motor has moved a certain distance.

When an interrupt occurs, the microprocessor hardware saves the return address on the stack and transfers control to the interrupt service routine (ISR).[1]

The ISR saves the CPU context (unless the hardware does so automatically) and any registers that it will use. The context includes the contents of special registers, such as CPU status registers, and any other information needed to return the CPU to the state it was in just before the interrupt occurred.

After saving the context, the ISR does whatever the interrupt has prompted it to do, restores the CPU context, and returns to normal processing. The microprocessor resumes executing where it left off before the interrupt.

Because software cannot predict when interrupts will occur and ISRs briefly pause execution of the mainline software, we must always remember that an interrupt can occur at any time, between any pair of instructions. In my experience, most of the really difficult interrupt problems occur when that reality interacts with the rest of the software. In this article, we will take a look at two of the more common ones. We'll also discuss ways to avoid them.

Race conditions

A race condition is probably the most common interrupt-related problem. Take a look at Figure 1 along with the two pseudo-code fragments below:

Figure 1: A race condition

Mainline code:

1. Read variable X into register

2. Decrement register contents

3. Store result back at variable X

ISR code:

A. Read variable X into register

B. Increment register contents

C. Store result back at variable X

Let's say that the shared variable X is tracking the number of bytes in a buffer. The ISR puts a byte into the buffer and increments X. The mainline code reads a byte from the buffer and decrements X. Say that X starts out with a value of 4. The ISR puts a byte into the buffer and increments it to 5. The mainline code then reads a byte and decrements the count back to 4.

But if an interrupt occurs between lines 1 and 3 in the mainline code, the value of X will be corrupted. First, the mainline code reads X, which is 4, into a register. Then the ISR occurs, also reads 4, and increments X to 5. After the ISR completes, the mainline code finishes, storing the improper value 3 in X. This sequence is illustrated in Figure 1.
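The corruption can be reproduced deterministically in C by calling the "ISR" by hand in the middle of the mainline sequence. This is a simulation sketch with hypothetical function names; on real hardware the interruption point is not under the program's control.

```c
int x;  /* shared count of bytes in the buffer */

/* ISR code: steps A, B, C. */
void isr_increment(void)
{
    int reg = x;  /* A. read      */
    reg++;        /* B. increment */
    x = reg;      /* C. store     */
}

/* Mainline code: steps 1, 2, 3, with an optional interrupt after step 1. */
void mainline_decrement(int interrupt_mid_sequence)
{
    int reg = x;             /* 1. read                  */
    if (interrupt_mid_sequence)
        isr_increment();     /* ISR runs between 1 and 3 */
    reg--;                   /* 2. decrement             */
    x = reg;                 /* 3. store (stale value)   */
}

/* One byte arrives and one byte is consumed, starting from X = 4.
 * The correct final count is 4 either way. */
int run_scenario(int interrupt_mid_sequence)
{
    x = 4;
    mainline_decrement(interrupt_mid_sequence);
    if (!interrupt_mid_sequence)
        isr_increment();     /* interrupt arrives after the sequence instead */
    return x;
}
```

`run_scenario(0)` leaves X at 4, while `run_scenario(1)` leaves it at 3: the ISR's increment is lost because the mainline code stores a value computed from its stale copy.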

Any shared resource can be involved in a race condition. The issue also arises with shared hardware registers. It even applies to shared subroutines and functions, unless they are reentrant.[2]

This problem has several solutions. Some processors have atomic read-modify-write instructions that can read the memory location, modify the value, and write the new value back to memory without interruption. If you are using a high-level language such as C, it may be difficult to force the compiler to generate these special instructions. Some assembly may be required.

A second way to prevent race conditions is to disable interrupts around the read/decrement/write sequence in the mainline code, as illustrated below:

Protected mainline code:

0. Save interrupt state and disable

1. Read variable X into register

2. Decrement register contents

3. Store result back at variable X

4. Restore prior interrupt state
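The protected sequence can be simulated the same way. Here `irq_save_and_disable` and `irq_restore` are hypothetical stand-ins for CPU-specific instructions or compiler intrinsics, and a global flag models the interrupt enable bit; an interrupt raised while disabled is held pending and runs on restore.

```c
#include <stdbool.h>

bool interrupts_enabled = true;  /* models the global interrupt enable flag */
bool isr_pending;                /* request held while interrupts are off   */
int  x = 4;                      /* shared count of bytes in the buffer     */

void isr_increment(void) { int reg = x; reg++; x = reg; }

/* Hypothetical stand-ins for CPU-specific intrinsics. */
bool irq_save_and_disable(void)
{
    bool prev = interrupts_enabled;
    interrupts_enabled = false;
    return prev;
}

void irq_restore(bool prev)
{
    interrupts_enabled = prev;
    if (interrupts_enabled && isr_pending) {
        isr_pending = false;
        isr_increment();          /* deferred ISR runs now */
    }
}

/* Models the device asserting its request line. */
void raise_interrupt(void)
{
    if (interrupts_enabled) isr_increment();
    else                    isr_pending = true;
}

void protected_decrement(void)
{
    bool prev = irq_save_and_disable();  /* 0. save state and disable   */
    int reg = x;                         /* 1. read                     */
    raise_interrupt();                   /*    request arrives: held    */
    reg--;                               /* 2. decrement                */
    x = reg;                             /* 3. store                    */
    irq_restore(prev);                   /* 4. restore prior state      */
}
```

Restoring the saved state, rather than unconditionally enabling, matters when the sequence is called from code that already had interrupts disabled: such a caller gets them back disabled, as it expects.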

By far, the best solution is to avoid sharing variables and hardware registers between ISR and mainline code. In the example of the counter, this could be accomplished by using two counters. One counter is incremented by the mainline code, and the other counter by the ISR code. The number of bytes in the buffer is the difference between the two counts. Ideally, variables that are written only by the ISR code are only read by the mainline code, and vice versa.
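A minimal sketch of the two-counter scheme follows, with hypothetical names. It assumes that the target CPU reads and writes a 32-bit variable in a single access; on an 8-bit or 16-bit part that assumption does not hold and the read of the other side's counter would itself need protection.

```c
#include <stdint.h>

volatile uint32_t bytes_put;    /* written only by the ISR       */
volatile uint32_t bytes_taken;  /* written only by mainline code */

/* ISR: a byte was placed in the buffer. */
void isr_put_byte(void)
{
    bytes_put++;
}

/* Mainline: a byte was removed from the buffer. */
void mainline_take_byte(void)
{
    bytes_taken++;
}

/* Number of bytes currently in the buffer. Unsigned subtraction gives
 * the right answer even after either counter wraps around. */
uint32_t bytes_in_buffer(void)
{
    return bytes_put - bytes_taken;
}
```

Each counter has exactly one writer, so an interrupt in the middle of `mainline_take_byte` cannot cause a lost update: the ISR never stores to `bytes_taken`.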

Hardware complications

Some peripherals have more internal registers than externally addressable locations. Registers in such devices are manipulated by first writing a value to an address register, and then reading or writing data at a different address to access the selected register's contents.

The sequence to access a register in these devices is something like this:

- Write identifier for desired internal register to "address register."
- Read or write the "data register" to access the selected internal register.

A problem occurs if an interrupt fires between the two operations and the ISR also must manipulate that peripheral device's registers. The mainline code will write the address register to select whatever data register it needs to access. Then the ISR gets control, writes a different value to the address register, and accesses some other register. When the ISR returns and the mainline code completes its access, the address register has changed so the mainline code reads (or writes) the wrong register.

Devices that have this characteristic are often high-integration parts with a number of functions, and there may be no way to avoid having both ISR and mainline code access the device.

One way to prevent such a race condition is to have the ISR read the contents of the peripheral address register, save it as part of its context, and write it back before returning. If the address register is write-only, the only viable solution is to bracket the mainline access with an interrupt disable/enable pair.
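The save-in-the-ISR approach can be sketched against a simulated peripheral. The `periph_t` model, the register count, and the choice of internal register 5 as a status register are assumptions for illustration; the sketch also assumes the address register is readable.

```c
#include <stdint.h>

#define NUM_INTERNAL_REGS 8  /* hypothetical number of internal registers */

/* Simulated peripheral with one address register and one data window. */
typedef struct {
    uint8_t address;                      /* selects an internal register */
    uint8_t internal[NUM_INTERNAL_REGS];  /* the internal registers       */
} periph_t;

void periph_select(periph_t *p, uint8_t reg)  { p->address = reg % NUM_INTERNAL_REGS; }
uint8_t periph_read_data(const periph_t *p)   { return p->internal[p->address]; }
void periph_write_data(periph_t *p, uint8_t v) { p->internal[p->address] = v; }

/* ISR that preserves whatever register the mainline code had selected. */
void periph_isr(periph_t *p)
{
    uint8_t saved = p->address;  /* save as part of the ISR context       */
    periph_select(p, 5);         /* ISR needs internal register 5         */
    periph_write_data(p, 0);     /* for example, clear a status register  */
    periph_select(p, saved);     /* restore before returning              */
}

/* Mainline access with the ISR firing between the two operations. */
uint8_t mainline_read_interrupted(periph_t *p, uint8_t reg)
{
    periph_select(p, reg);       /* write the address register       */
    periph_isr(p);               /* interrupt fires here             */
    return periph_read_data(p);  /* still reads the intended register */
}
```

Without the save and restore in `periph_isr`, the final read would return the contents of register 5 instead of the register the mainline code selected.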

Stack overflow

Another potential problem with interrupts is stack overflow. Since the return address and any additional context information are always added to the stack, an interrupt uses up stack space. If you add more information to the stack than it can hold, you get stack overflow.

When the stack overflows, one of three things may happen: the new (overflow) data may overwrite another memory area; the stack pointer may wrap around causing another part of the stack to be overwritten; or, in a system with hardware memory management, an exception may occur.

Stack overflow is more likely on microcontrollers and other systems with limited memory. A larger stack is the best way to prevent an overflow. Some microcontrollers have a fixed, hardware stack that requires careful attention from the programmer to prevent overflow. In some designs, you may even have to save some information in fixed memory locations, rather than on the stack, to prevent overflow.

The addition of interrupts to a system opens the window to dangers such as race conditions and stack overflow. Careful attention to design in these areas can save enormous amounts of debug time when you perform integration.

Stuart Ball is an electrical engineer with twenty years of experience in the area of embedded systems. He is the author of three books on the subject, all published by Butterworth-Heinemann. He holds a BSEE degree from the University of Missouri-Columbia. E-mail him at stuart@stuartball.com.

1. Massey, Russel. "Understanding Interrupts," Embedded Systems Programming, June 2001, p. 95.

2. Ganssle, Jack. "Reentrancy," Embedded Systems Programming, April 2001, p. 183.