This is the story of an interesting project issue involving compiler optimizations, inline assembly, bootloaders, and all other manner of mayhem. Understanding how optimizations change the generated code, especially low-level code, is critical for improving your embedded firmware game.
Consider the following code from the Nordic bootloader:
__STATIC_INLINE void jump_to_addr(uint32_t new_msp, uint32_t new_lr, uint32_t addr)
{
    __ASM volatile ("MSR MSP, %[arg]" : : [arg] "r" (new_msp));
    __ASM volatile ("MOV LR, %[arg]" : : [arg] "r" (new_lr) : "lr");
    __ASM volatile ("BX %[arg]" : : [arg] "r" (addr));
}
This code implements the very last steps of the bootloader before it jumps to the application.
Step 1: Set up the MSP register (“Main Stack Pointer”) with the application's initial top-of-stack address.
Step 2: Set up the LR register (“Link Register”) with the return address (in this case a dummy value, since we never expect the main app to return to the bootloader).
Step 3: JUMP! Branch to the address of our application code and be on our way!
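The post does not show where new_msp and addr come from. On Cortex-M parts, the first two words of a vector table are the initial MSP value and the reset handler address, so a caller would typically read them from the start of the application image. The sketch below assumes exactly that; app_entry_t and read_app_entry are hypothetical names for illustration, not part of the Nordic SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the two values a Cortex-M bootloader needs
 * before it can jump into an application image. */
typedef struct {
    uint32_t initial_msp;   /* vector table word 0: initial stack pointer */
    uint32_t reset_handler; /* vector table word 1: reset handler address */
} app_entry_t;

/* Read the stack pointer and entry point from the start of a vector table.
 * Assumption: vector_table points at the application's vector table. */
static app_entry_t read_app_entry(const uint32_t *vector_table)
{
    assert(vector_table != NULL);

    app_entry_t entry;
    entry.initial_msp   = vector_table[0];
    entry.reset_handler = vector_table[1];
    return entry;
}
```

On a target, the result would feed the jump function, for example jump_to_addr(entry.initial_msp, 0xFFFFFFFF, entry.reset_handler), with the table pointer cast from wherever the application is linked (an assumption that depends on your memory layout).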
We had compiled this code into a new project and were experiencing random reboots and flaky behavior.
Here is the assembly generated from this code when it is operating correctly, with compiler optimizations enabled:
B500 push {lr}
F3808808 msr msp, r0
468E mov lr, r1
4710 bx r2
F85DFB04 pop.w {pc}
According to the ARM ABI (see section 6.1.1), the first parameters are passed in registers, so new_msp, new_lr, and addr arrive in r0, r1, and r2. The operation proceeds in a fairly straightforward fashion:
Step 1: Move r0 into msp.
Step 2: Move r1 into lr.
Step 3: JUMP! Branch to r2.
Everything looks great, no issues. This code works fine.
Now – the surprising part. Let’s say you want to debug the bootloader. You turn the compiler optimizations OFF and add debug symbols so you can follow the code. Now take a look at the assembly:
B500 push {lr}
B085 sub sp, sp, #20
9003 str r0, [sp, #12]
9102 str r1, [sp, #8]
9201 str r2, [sp, #4]
9B03 ldr r3, [sp, #12]
F3838808 msr msp, r3
9B02 ldr r3, [sp, #8]
469E mov lr, r3
9B01 ldr r3, [sp, #4]
4718 bx r3
BF00 nop
B005 add sp, sp, #20
F85DFB04 pop.w {pc}
See the problem yet? With optimizations turned off, the compiler no longer keeps the parameters in registers. Instead it spills them to the stack and reloads each one just before it is used.
The three str instructions at the top store the parameters relative to the current stack pointer. Then the following is supposed to happen:
Step 1: Load the new stack pointer into MSP.
Step 2: Load the new link register into LR.
Step 3: Load the jump address and execute the jump with r3.
Except that’s not really what happens…
Here’s what really happens:
Step 1: Load the new stack pointer into MSP.
Step 2: Load garbage into the LR.
Step 3: Load garbage and execute a jump to a garbage address.
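Step 1 overwrites the stack pointer, and the remaining parameters were stored relative to the old one. The ldr instructions that follow use the new stack pointer, so they read whatever happens to sit at [sp, #8] and [sp, #4] on the application's stack rather than the saved values. Ugh…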
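To make the failure concrete, here is a small host-side model of the two instruction sequences. It is only a sketch: toy_cpu_t and its helpers are invented for illustration, RAM is addressed in words instead of bytes (so #12, #8, #4 become offsets 3, 2, 1), the push {lr} is left out, and it assumes the core is using MSP as its active stack pointer when the bootloader runs.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RAM_WORDS 64u
#define TOY_GARBAGE   0xDEADBEEFu  /* stands in for stale stack contents */

/* Hypothetical toy model: word-addressed RAM plus the registers we care about. */
typedef struct {
    uint32_t ram[TOY_RAM_WORDS];
    uint32_t sp;  /* word index into ram */
    uint32_t lr;
    uint32_t pc;
} toy_cpu_t;

static void toy_cpu_init(toy_cpu_t *cpu, uint32_t sp)
{
    for (uint32_t i = 0; i < TOY_RAM_WORDS; i++) {
        cpu->ram[i] = TOY_GARBAGE;
    }
    cpu->sp = sp;
    cpu->lr = 0;
    cpu->pc = 0;
}

/* SP-relative store and load, like "str rX, [sp, #n]" and "ldr rX, [sp, #n]". */
static void toy_store(toy_cpu_t *cpu, uint32_t word_off, uint32_t value)
{
    assert(cpu->sp + word_off < TOY_RAM_WORDS);
    cpu->ram[cpu->sp + word_off] = value;
}

static uint32_t toy_load(const toy_cpu_t *cpu, uint32_t word_off)
{
    assert(cpu->sp + word_off < TOY_RAM_WORDS);
    return cpu->ram[cpu->sp + word_off];
}

/* Models the unoptimized listing: spill all arguments, then reload one by one. */
static void jump_unoptimized(toy_cpu_t *cpu, uint32_t new_msp,
                             uint32_t new_lr, uint32_t addr)
{
    assert(cpu->sp >= 5u);
    cpu->sp -= 5u;                  /* sub sp, sp, #20                      */
    toy_store(cpu, 3u, new_msp);    /* str r0, [sp, #12]                    */
    toy_store(cpu, 2u, new_lr);     /* str r1, [sp, #8]                     */
    toy_store(cpu, 1u, addr);       /* str r2, [sp, #4]                     */
    cpu->sp = toy_load(cpu, 3u);    /* ldr r3, [sp, #12] ; msr msp, r3      */
    cpu->lr = toy_load(cpu, 2u);    /* ldr r3, [sp, #8]  ; reads NEW stack  */
    cpu->pc = toy_load(cpu, 1u);    /* ldr r3, [sp, #4]  ; bx r3 to garbage */
}

/* Models the optimized listing: arguments never leave their registers. */
static void jump_optimized(toy_cpu_t *cpu, uint32_t new_msp,
                           uint32_t new_lr, uint32_t addr)
{
    cpu->sp = new_msp;              /* msr msp, r0 */
    cpu->lr = new_lr;               /* mov lr, r1  */
    cpu->pc = addr;                 /* bx r2       */
}
```

Starting both models from the same state shows the difference: the optimized path ends with the requested LR and PC, while the unoptimized path ends with whatever was already sitting at the new stack location.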
There are likely some directives you could give gcc to help it avoid this issue. Our approach for now is to simply always build the Nordic bootloader with optimizations turned on.
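One possible way to make the jump independent of the optimization level is to put all three instructions in a single asm statement. Because every input uses an "r" constraint, the compiler has to load all three values into registers before the first instruction of the block runs, so nothing is reloaded from the stack after MSP changes. This is a sketch we have not validated on hardware, not the Nordic SDK's code: jump_to_addr_single_asm is a hypothetical name, and the host-side stub in the #else branch exists only so the snippet compiles off-target.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')

/* Cortex-M build: one asm statement, three register inputs.
 * "lr" is listed as clobbered so the compiler does not place an input there. */
static inline void jump_to_addr_single_asm(uint32_t new_msp, uint32_t new_lr,
                                           uint32_t addr)
{
    __asm volatile (
        "msr msp, %[sp_val]  \n\t"   /* switch to the application's stack */
        "mov lr,  %[lr_val]  \n\t"   /* dummy return address              */
        "bx  %[target]       \n\t"   /* jump, never returns               */
        :
        : [sp_val] "r" (new_msp), [lr_val] "r" (new_lr), [target] "r" (addr)
        : "lr", "memory"
    );
    __builtin_unreachable();
}

#else

/* Host build: hypothetical stub that records the arguments, for testing only. */
typedef struct {
    uint32_t msp;
    uint32_t lr;
    uint32_t pc;
    int      called;
} fake_jump_t;

static fake_jump_t g_fake_jump;

static inline void jump_to_addr_single_asm(uint32_t new_msp, uint32_t new_lr,
                                           uint32_t addr)
{
    /* Cortex-M only executes Thumb code, so a BX target needs bit 0 set. */
    assert((addr & 1u) == 1u);

    g_fake_jump.msp    = new_msp;
    g_fake_jump.lr     = new_lr;
    g_fake_jump.pc     = addr;
    g_fake_jump.called = 1;
}

#endif
```

If you try this, it is worth inspecting the disassembly at every optimization level you build with and confirming that no sp-relative load appears after the msr instruction.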
And if you have questions about an embedded project you’re working on, Dojo Five can help you with all aspects of your EmbedOps journey! We are always happy to hear about cool projects or interesting problems to solve, so don’t hesitate to reach out and chat with us on LinkedIn or through email!