G++ generating redundant code
(This is using the 2016q2 release of the toolchain, GCC 5.4.1).
Compile the following first with arm-none-eabi-gcc and then with arm-none-eabi-g++, without any optimisation (i.e. -O0):
void doStuff(void);
int getNum(void);

void compare1(int n)
{
    if(n > 33)
    {
        doStuff();
    }
    if(getNum() > 44)
    {
        doStuff();
    }
}
GCC generates pretty much what I'd expect:
27 0010 08301BE5 ldr r3, [fp, #-8]
28 0014 210053E3 cmp r3, #33
29 0018 000000DA ble .L2
30 001c FEFFFFEB bl doStuff
31 .L2:
32 0020 FEFFFFEB bl getNum
33 0024 0030A0E1 mov r3, r0
34 0028 2C0053E3 cmp r3, #44
35 002c 000000DA ble .L4
36 0030 FEFFFFEB bl doStuff
37 .L4:
But from G++:
32 0010 08301BE5 ldr r3, [fp, #-8]
33 0014 210053E3 cmp r3, #33
34 0018 000000DA ble .L2
35 001c FEFFFFEB bl _Z8doStuffv
36 .L2:
37 0020 FEFFFFEB bl _Z6getNumv
38 0024 0030A0E1 mov r3, r0
39 0028 2C0053E3 cmp r3, #44
40 002c 0130A0C3 movgt r3, #1 <<< why is
41 0030 0030A0D3 movle r3, #0 <<< all this
42 0034 FF3003E2 and r3, r3, #255 <<< extra code
43 0038 000053E3 cmp r3, #0 <<< generated?
44 003c 0000000A beq .L4
45 0040 FEFFFFEB bl _Z8doStuffv
46 .L4:
It gets worse if G++ is given the -mthumb switch:
40 0012 FFF7FEFF bl _Z6getNumv
41 0016 0200 movs r2, r0
42 0018 0123 movs r3, #1 <<< again
43 001a 2C2A cmp r2, #44
44 001c 00DC bgt .L3
45 001e 0023 movs r3, #0 <<< that
46 .L3:
47 0020 1B06 lsls r3, r3, #24 <<< redundant
48 0022 1B0E lsrs r3, r3, #24 <<< code
49 0024 01D0 beq .L5 <<< and even an extra branch instruction!
50 0026 FFF7FEFF bl _Z8doStuffv
51 .L5:
And with the -mcpu=cortex-m3 switch:
40 0016 0346 mov r3, r0
41 0018 2C2B cmp r3, #44
42 001a CCBF ite gt <<< good
43 001c 0123 movgt r3, #1 <<< use
44 001e 0023 movle r3, #0 <<< of
45 0020 DBB2 uxtb r3, r3 <<< Thumb-2
46 0022 002B cmp r3, #0 <<< conditional-
47 0024 01D0 beq .L4
48 0026 FFF7FEFF bl _Z8doStuffv
49 .L4:
It seems to be using a uint8_t as a flag: it initially sets the flag true, changes it to false if the condition is not met, then zero-extends the uint8_t to 32 bits before branching on its value. In the Thumb listing that is 7 instructions instead of 2, and it uses an extra register. So two questions: why is it using a flag at all, and why is the flag handled as a byte rather than a word?
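To make the pattern concrete, here is a rough sketch of what the unoptimised G++ output looks like when written back as source. It is only one reading of the listings, not a confirmed explanation of what the compiler does internally. The bodies of doStuff() and getNum(), the counters g_num and g_calls, and the helpers compareDirect(), compareViaFlag(), callsFor() and selfCheck() are hypothetical stand-ins added so the example is self-contained. The one fact it relies on is that in C++ a relational expression has type bool, whereas in C it has type int.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins: the original post only declares these functions.
static int g_num = 0;    // value returned by getNum()
static int g_calls = 0;  // number of doStuff() invocations
int getNum() { return g_num; }
void doStuff() { ++g_calls; }

// In C++ the result of a comparison is a bool (in C it would be an int).
static_assert(std::is_same<decltype(getNum() > 44), bool>::value,
              "comparison result is bool in C++");

// What the source says: branch directly on the comparison.
void compareDirect()
{
    if(getNum() > 44)
    {
        doStuff();
    }
}

// Roughly what the -O0 G++ listings correspond to (an assumption):
// the result is first stored as a flag, widened, and only then tested.
void compareViaFlag()
{
    bool flag = getNum() > 44;  // movgt/movle, or ite gt on Cortex-M3
    unsigned widened = flag;    // and #255, uxtb, or lsls/lsrs by 24
    if(widened != 0)            // cmp #0 followed by beq
    {
        doStuff();
    }
}

// Runs one variant with getNum() returning `value`; returns the call count.
int callsFor(void (*variant)(), int value)
{
    g_num = value;
    g_calls = 0;
    variant();
    return g_calls;
}

// Both variants should call doStuff() equally often on either side of 44.
void selfCheck()
{
    assert(callsFor(compareDirect, 45) == callsFor(compareViaFlag, 45));
    assert(callsFor(compareDirect, 44) == callsFor(compareViaFlag, 44));
}
```

Both variants behave identically, which is why the extra instructions look redundant: the flag and the widening step carry no information that the original compare did not already leave in the condition codes.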
Admittedly, the redundant instructions disappear if I enable -O1 optimisation, but that's not ideal for debugging.
(Why am I bothered? Because I'm looking at whether the Keil product justifies its cost, and one of the differences I've found is the efficiency of code it generates without optimisations enabled. It would really be better for us devs if we could keep on using Atmel Studio and feel confident that arm-g++ wasn't eating our precious code-space and CPU cycles with stuff like this! Friday rant over...)
Thanks,
Mike H
Question information
- Language: English
- Status: Solved
- Assignee: No assignee
- Solved by: Michael Haben