LTO with 4.8: a lot of (frequent) issues, why?

Asked by emblocks

Hi,

I'm playing around with the 4.8 toolchain and I am also checking how good LTO is with this version. In my experience LTO fails most of the time and only works for rather small programs (where little is gained with LTO).

Is there something I'm doing wrong?

E.g. errors from an ST demo with LTO:

||=== usr_prj, usr_prj ===|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2182|Error: cannot honor width suffix -- `mov r4,#0'|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2184|Error: cannot honor width suffix -- `mov r1,#41'|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2201|Error: cannot honor width suffix -- `mov r1,#40'|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2278|Error: lo register required -- `add r0,r0,#8'|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2288|Error: cannot honor width suffix -- `mov r0,#0'|
.....
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2744|Error: cannot honor width suffix -- `and r5,r0'|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2750|Error: lo register required -- `sub r0,r5,#1'|
S:\Users\EmBlocks\AppData\Local\Temp\ccmHsIwf.s|2751|Error: cannot honor width suffix -- `sbc r5,r5,r0'|
||More errors follow but not being shown.|
||Edit the max errors limit in compiler options...|
||=== Build finished: 50 errors, 0 warnings (0 minutes, 4 seconds) ===|

I also notice that, coming from the 4.7.3 trunk (not embedded) and now using the 4.8 embedded branch, I still don't see much difference in image size between the two. On a 100K+ project it's less than 800 bytes.

Question information

Language: English
Status: Answered
For: GNU Arm Embedded Toolchain
Assignee: No assignee

Terry Guo (terry.guo) said :
#1

Thanks for reporting. About the error message, would you please provide a reduced case and associated compiler options? A test case that can be compiled to reproduce this error message is OK.

emblocks (gnugcc) said :
#2

Well, that is not so difficult. It's harder to find an example that works.

Take, for instance, one of your own exported Mbed.org projects and try it with LTO. Or try to build the ST demo of the F429, the one with the graphical library.

If it were just one failing project I would make an example project for you, but there are just too many.

Uwe Bonnes (bon) said :
#3

Do those examples come with a Makefile?

emblocks (gnugcc) said :
#4

No, I build them from EmBlocks (dev version 3.0 with LTO) and/or from CoIDE (LTO enabled).

They are projects that build normally, with -flto added in a second build for LTO optimization.
I can't believe that I'm the only one seeing this, because LTO is quite sensitive.

emblocks (gnugcc) said :
#5

For instance:

`vTaskSwitchContext' referenced in section `.text.xPortPendSVHandler' of S:\Users\EmBlocks\AppData\Local\Temp\cc54eg5o.ltrans22.ltrans.o: defined in discarded section `.text' of .\obj\~#\~#\~#\Utilities\Third_Party\FreeRTOS\Source\tasks.o (symbol from plugin)
collect2.exe: error: ld returned 1 exit status

Also a bit annoying is that these errors are difficult to catch in an IDE, because they have a totally different structure than the normal warning and error messages.

emblocks (gnugcc) said :
#6

Ok, the one above is solved by declaring:

void vTaskSwitchContext( void ) PRIVILEGED_FUNCTION __attribute__((used));
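
As a minimal hosted sketch of the same idea (the names `tick_count`, `vTaskSwitchContext_sketch` and `tick_count_get` are hypothetical stand-ins, not FreeRTOS code): GCC's `used` attribute asks the compiler to emit the function even when it sees no C-level reference to it, which is the situation for a function that is only called from assembly. Whether other symbols in a given project need the same treatment has to be checked case by case.

```c
/* Hypothetical counter standing in for scheduler state. */
static int tick_count = 0;

/*
 * 'used' tells GCC that the function must be emitted even if no C code
 * appears to reference it, for example when the only caller is an
 * assembly handler that the optimizer cannot see into.
 */
void vTaskSwitchContext_sketch(void) __attribute__((used));

void vTaskSwitchContext_sketch(void)
{
    tick_count++;
}

/* Small accessor so the effect can be observed from ordinary C code. */
int tick_count_get(void)
{
    return tick_count;
}
```

Calling `vTaskSwitchContext_sketch()` once and then reading `tick_count_get()` gives 1; the attribute does not change behavior, it only keeps the definition from being discarded.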

Is it possible to report this error in a more convenient format, like all the other "normal" errors?

emblocks (gnugcc) said :
#7

To answer my own question:

The source code is not suitable for LTO as-is. You need some additional attributes to make it work. Because the first attempt gave errors like "Cannot honor width suffix -- `mov r4,#0' " I assumed that all the other errors were low-level bugs too.

I will see if I can make a small project with the first errors. The project comes from Mbed.org and is a mixed C/C++ project.

I managed to make a RegEx which catches the discarded section error. Example of IDE output:

.\obj\~#\~#\~#\Utilities\Third_Party\FreeRTOS\Source\tasks.o | vTaskSwitchContext | defined in discarded section
||=== Build finished: 1 errors, 4 warnings (0 minutes, 18 seconds) ===|
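
The thread does not show the actual expression, so the following is only a sketch of how such a match could look, using the POSIX `<regex.h>` API. The function names `parse_discarded_section` and `copy_match`, and the pattern itself, are assumptions made for illustration; they are not the expression used in the IDE.

```c
#include <regex.h>
#include <string.h>

/* Copies sub-match m of line into dst, truncating to dst_len - 1 chars. */
static void copy_match(const char *line, const regmatch_t *m,
                       char *dst, size_t dst_len)
{
    size_t len = (size_t)(m->rm_eo - m->rm_so);
    if (dst_len == 0)
        return;
    if (len >= dst_len)
        len = dst_len - 1;
    memcpy(dst, line + m->rm_so, len);
    dst[len] = '\0';
}

/*
 * Returns 1 and fills symbol/object when line is a "defined in discarded
 * section" linker error; returns 0 otherwise (also if the pattern fails
 * to compile).
 */
int parse_discarded_section(const char *line,
                            char *symbol, size_t symbol_len,
                            char *object, size_t object_len)
{
    /* Group 1: symbol name. Group 2: object file that defines it. */
    static const char pattern[] =
        "^`([^']+)' referenced in section `[^']+' of .*: "
        "defined in discarded section `[^']+' of (.*[.]o)";
    regex_t re;
    regmatch_t m[3];
    int found = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, line, 3, m, 0) == 0) {
        copy_match(line, &m[1], symbol, symbol_len);
        copy_match(line, &m[2], object, object_len);
        found = 1;
    }
    regfree(&re);
    return found;
}
```

An IDE could then print the object file, the symbol and a fixed "defined in discarded section" text as three columns, similar to the output shown above.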

Last question: is it normal that, with LTO enabled, the mixed-mode view of the disassembly is lost?

Terry Guo (terry.guo) said :
#8

Unfortunately I can't reproduce the "Cannot honor width suffix" issue with my mbed.org project or my stm32f429 project. I don't need your full project; a preprocessed C file is good enough for me. You can use the gcc option -save-temps to get a preprocessed C file.

emblocks (gnugcc) said :
#9

If you take the F051 project then you get those errors.

I don't know how to attach files here, but if I build it with -save-temps, EmBlocks opens usr_prj.elf.ltrans0.s with all the errors, like:

usr_prj.elf.ltrans0.s|2182|Error: cannot honor width suffix -- `mov r4,#0'|
usr_prj.elf.ltrans0.s|2184|Error: cannot honor width suffix -- `mov r1,#41'|
usr_prj.elf.ltrans0.s|2201|Error: cannot honor width suffix -- `mov r1,#40'|
usr_prj.elf.ltrans0.s|2278|Error: lo register required -- `add r0,r0,#8'|
usr_prj.elf.ltrans0.s|2288|Error: cannot honor width suffix -- `mov r0,#0'|
usr_prj.elf.ltrans0.s|2291|Error: cannot honor width suffix -- `mov r0,#128'|
usr_prj.elf.ltrans0.s|2292|Error: cannot honor width suffix -- `lsl r0,r0,#4'|
usr_prj.elf.ltrans0.s|2305|Error: cannot honor width suffix -- `mov r0,#1'|
usr_prj.elf.ltrans0.s|2308|Error: cannot honor width suffix -- `mov r0,#72'|
usr_prj.elf.ltrans0.s|2340|Error: cannot honor width suffix -- `and r0,r6'|
usr_prj.elf.ltrans0.s|2359|Error: lo register required -- `add r5,r5,#3'|
usr_prj.elf.ltrans0.s|2391|Error: lo register required -- `add r6,r6,#4'|
usr_prj.elf.ltrans0.s|2406|Error: cannot honor width suffix -- `mov r6,#0'|
usr_prj.elf.ltrans0.s|2422|Error: cannot honor width suffix -- `mov r0,#56'|
usr_prj.elf.ltrans0.s|2429|Error: cannot honor width suffix -- `lsr r0,r6,#2'|
usr_prj.elf.ltrans0.s|2431|Error: cannot honor width suffix -- `lsl r0,r0,#2'|
usr_prj.elf.ltrans0.s|2454|Error: cannot honor width suffix -- `lsl r5,r0,#29'|
usr_prj.elf.ltrans0.s|2457|Error: lo register required -- `sub r0,r0,#4'|
usr_prj.elf.ltrans0.s|2461|Error: cannot honor width suffix -- `mov r6,#128'|
usr_prj.elf.ltrans0.s|2463|Error: lo register required -- `sub r5,r0,#4'|
usr_prj.elf.ltrans0.s|2464|Error: cannot honor width suffix -- `lsl r6,r6,#17'|
usr_prj.elf.ltrans0.s|2467|Error: cannot honor width suffix -- `mov r5,#8'|
usr_prj.elf.ltrans0.s|2468|Error: cannot honor width suffix -- `neg r5,r5'|
usr_prj.elf.ltrans0.s|2476|Error: cannot honor width suffix -- `mov r5,#64'|
usr_prj.elf.ltrans0.s|2477|Error: cannot honor width suffix -- `neg r5,r5'|
usr_prj.elf.ltrans0.s|2484|Error: cannot honor width suffix -- `mov r5,#0'|
usr_prj.elf.ltrans0.s|2494|Error: lo register required -- `sub r0,r0,#32'|
usr_prj.elf.ltrans0.s|2496|Error: cannot honor width suffix -- `mov r6,#0'|
usr_prj.elf.ltrans0.s|2518|Error: cannot honor width suffix -- `mov r0,#1'|
usr_prj.elf.ltrans0.s|2530|Error: lo register required -- `add r0,r0,#1'|
usr_prj.elf.ltrans0.s|2536|Error: lo register required -- `sub r5,r0,#1'|
usr_prj.elf.ltrans0.s|2537|Error: cannot honor width suffix -- `lsl r6,r0,#2'|
usr_prj.elf.ltrans0.s|2540|Error: cannot honor width suffix -- `mov r5,#4'|
usr_prj.elf.ltrans0.s|2541|Error: cannot honor width suffix -- `neg r5,r5'|
usr_prj.elf.ltrans0.s|2556|Error: cannot honor width suffix -- `lsl r5,r5,#2'|
usr_prj.elf.ltrans0.s|2585|Error: cannot honor width suffix -- `mov r0,#1'|
usr_prj.elf.ltrans0.s|2621|Error: cannot honor width suffix -- `mov r1,#0'|
usr_prj.elf.ltrans0.s|2650|Error: cannot honor width suffix -- `mov r5,#0'|
usr_prj.elf.ltrans0.s|2652|Error: lo register required -- `add r5,r5,#1'|
usr_prj.elf.ltrans0.s|2659|Error: cannot honor width suffix -- `and r5,r0'|
usr_prj.elf.ltrans0.s|2665|Error: lo register required -- `sub r0,r5,#1'|
usr_prj.elf.ltrans0.s|2666|Error: cannot honor width suffix -- `sbc r5,r5,r0'|
usr_prj.elf.ltrans0.s|2709|Error: cannot honor width suffix -- `mov r0,#250'|
usr_prj.elf.ltrans0.s|2710|Error: cannot honor width suffix -- `lsl r0,r0,#1'|
usr_prj.elf.ltrans0.s|2728|Error: cannot honor width suffix -- `mov r5,#0'|
usr_prj.elf.ltrans0.s|2731|Error: lo register required -- `add r5,r5,#1'|
usr_prj.elf.ltrans0.s|2737|Error: cannot honor width suffix -- `mov r2,#61'|
usr_prj.elf.ltrans0.s|2744|Error: cannot honor width suffix -- `and r5,r0'|
usr_prj.elf.ltrans0.s|2750|Error: lo register required -- `sub r0,r5,#1'|
usr_prj.elf.ltrans0.s|2751|Error: cannot honor width suffix -- `sbc r5,r5,r0'|
||More errors follow but not being shown.|
||Edit the max errors limit in compiler options...|
||=== Build finished: 50 errors, 0 warnings (0 minutes, 4 seconds) ===|

The number after the file name is the line with the error. For instance, the first error corresponds to this assembler snippet:

2170 .cfi_startproc
2171 .LVL175:
2172 push {r4, lr}
2173 .cfi_def_cfa_offset 8
2174 .cfi_offset 4, -8
2175 .cfi_offset 14, -4
2176 .LBB374:
2177 .LBB375:
2178 .LBB376:
2179 .file 18 "mbed/DigitalOut.h"
2180 .loc 18 48 0
2181 ldr r0, .L181
2182 mov r4, #0 <---- Error
2183 .loc 18 49 0
2184 mov r1, #41
2185 .loc 18 48 0
2186 strb r4, [r0]
2187 str r4, [r0, #4]
2188 str r4, [r0, #8]
2189 str r4, [r0, #12]
2190 str r4, [r0, #16]
2191 .loc 18 49 0
2192 bl gpio_init_out
2193 .LVL176:
2194 .LBE376:
2195 .LBE375:

emblocks (gnugcc) said :
#10

Consideration: Mbed.org exports the project with object files of the Mbed library rather than with source files. These object files are not generated with -flto. I don't know if this matters, because library linking is also done with objects that were not generated with -flto.

emblocks (gnugcc) said :
#11

Also strange, the intermediate assembler file contains the source file info like:

..
..
 .file 18 "mbed/DigitalOut.h"
 .loc 18 48 0
 ldr r0, .L181
 mov r4, #0
..
..

but the actual output file can't show the mixed assembler view in GDB. Somehow this information is gone. As soon as I turn off LTO, I have the mixed view back in my disassembly.

Terry Guo (terry.guo) said :
#12

From the ST website, I can get the two files below:

stm32cubef0.zip stm32cubef4.zip

But the example projects in those packages are not makefile-based projects, and I don't have the corresponding IDE at hand right now, so I have trouble building them.

Meanwhile, with the -save-temps option, you should get some .i files, which are preprocessed C source files. Would you please email me one of them along with the command line options? My email is <email address hidden>. Thanks.

emblocks (gnugcc) said :
#13

An exported Mbed.org project with makefile:

http://www.emblocks.org/downloads/export_gcc_arm_DISCO_F051R8.zip

Perhaps you can reproduce the error with this. I used this project, but as an EmBlocks export, to generate the errors.
Note that -flto is not added yet!

I don't have .i files because they are provided as objects (see project in zip file).

Terry Guo (terry.guo) said :
#14

Thanks very much.

This exported project works fine with option -flto. Here is one of the command lines:

arm-none-eabi-g++ -mcpu=cortex-m0 -mthumb -c -g -fno-common -fmessage-length=0 -Wall -fno-exceptions -ffunction-sections -fdata-sections -fomit-frame-pointer -flto -MMD -MP -DNDEBUG -Os -DTARGET_DISCO_F051R8 -DTARGET_M0 -DTARGET_STM -DTARGET_STM32F0 -DTARGET_STM32F051 -DTARGET_STM32F051R8 -DTOOLCHAIN_GCC_ARM -DTOOLCHAIN_GCC -D__CORTEX_M0 -DARM_MATH_CM0 -DMBED_BUILD_TIMESTAMP=1412064947.18 -D__MBED__=1 -std=gnu++98 -I. -I./mbed -I./mbed/TARGET_DISCO_F051R8 -I./mbed/TARGET_DISCO_F051R8/TARGET_STM -I./mbed/TARGET_DISCO_F051R8/TARGET_STM/TARGET_DISCO_F051R8 -I./mbed/TARGET_DISCO_F051R8/TOOLCHAIN_GCC_ARM -I./rtos -I./rtos/rtos -I./rtos/rtx -I./rtos/rtx/TARGET_M0 -I./rtos/rtx/TARGET_M0/TOOLCHAIN_GCC -o rtos/rtos/Mutex.o rtos/rtos/Mutex.cpp

Am I missing something?

emblocks (gnugcc) said :
#15

Well, I'm not familiar with the makefile-exported projects of Mbed, but it seems that you are not linking the whole project with the Mbed-provided objects. I think that instead of building the project you are only compiling one file, aren't you?

The project fails to build in both EmBlocks and CoIDE.

Uwe Bonnes (bon) said :
#16

I added -flto to CC_FLAGS and LD_FLAGS in the Makefile and when compiling, this error is thrown:

rtos/rtx/rt_CMSIS.c: In function 'osThreadCreate':
rtos/rtx/rt_CMSIS.c:689:1: error: r7 cannot be used in asm here
  }
  ^
I use
gcc version 4.8.4 20140725 (release) [ARM/embedded-4_8-branch revision 213147]
from launchpad. Compiling without -flto works fine.

Terry Guo (terry.guo) said :
#17

Essentially the makefile-based project works in the same way as your IDE-based project. Both have two stages to produce the final image file (or ELF file): a compile stage and a link stage. In the compile stage, we use the compiler to build the C source files one by one and get object files. Then in the link stage, we link all object files together to get the final image file.

The error "usr_prj.elf.ltrans0.s|2182|Error: cannot honor width suffix -- `mov r4,#0'|" is a compile stage error. It has nothing to do with the link stage. So to reproduce this error, I just need a preprocessed C file and then compile it. That's why I keep asking for the C file and the command line options. I don't need anything else for now.

Let us just focus on the compile stage. I can successfully build rtos/rtx/rt_CMSIS.c with the LTO-enabled command below:

arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -c -g -fno-common -fmessage-length=0 -Wall -fno-exceptions -ffunction-sections -fdata-sections -fomit-frame-pointer -flto -MMD -MP -DNDEBUG -Os -DTARGET_DISCO_F051R8 -DTARGET_M0 -DTARGET_STM -DTARGET_STM32F0 -DTARGET_STM32F051 -DTARGET_STM32F051R8 -DTOOLCHAIN_GCC_ARM -DTOOLCHAIN_GCC -D__CORTEX_M0 -DARM_MATH_CM0 -DMBED_BUILD_TIMESTAMP=1412064947.18 -D__MBED__=1 -std=gnu99 -I. -I./mbed -I./mbed/TARGET_DISCO_F051R8 -I./mbed/TARGET_DISCO_F051R8/TARGET_STM -I./mbed/TARGET_DISCO_F051R8/TARGET_STM/TARGET_DISCO_F051R8 -I./mbed/TARGET_DISCO_F051R8/TOOLCHAIN_GCC_ARM -I./rtos -I./rtos/rtos -I./rtos/rtx -I./rtos/rtx/TARGET_M0 -I./rtos/rtx/TARGET_M0/TOOLCHAIN_GCC -o rtos/rtx/rt_CMSIS.o rtos/rtx/rt_CMSIS.c

I am using the same toolchain as you.

If you can reproduce this error in CoIDE, then please share your CoIDE-based project with me. I have now installed CoIDE on my machine.

Terry Guo (terry.guo) said :
#18

Hi Uwe,

I can reproduce the error you mentioned. I am looking into it.

Terry Guo (terry.guo) said :
#19

Hi emblocks,

I still can't reproduce the error:

usr_prj.elf.ltrans0.s|2182|Error: cannot honor width suffix -- `mov r4,#0'|

please help.

emblocks (gnugcc) said :
#20

Hi, I will PM you a link to a complete environment: an EB 3.0 beta (portable, without installer) with your 4.8 toolchain and the project included. Just launch EB, load the project EBP file and hit build.

emblocks (gnugcc) said :
#21

If you need the -save-temps switch, the easiest way is to set it tool-wide instead of per project.

From the menu:

settings->tools->compiler settings[other options]
settings->tools->linker settings[other options]

Terry Guo (terry.guo) said :
#22

Hi emblocks,

I am using your exported project and am able to reproduce the error after adding the extra option -fomit-frame-pointer to LD_FLAGS in the project Makefile. I got a lot of errors like the ones below:

usr_prj.elf.ltrans0.s: Assembler messages:
usr_prj.elf.ltrans0.s:250: Error: lo register required -- `add r1,r1,#3'
usr_prj.elf.ltrans0.s:264: Error: cannot honor width suffix -- `mov r0,#128'
usr_prj.elf.ltrans0.s:266: Error: cannot honor width suffix -- `mov r3,#0'
usr_prj.elf.ltrans0.s:271: Error: lo register required -- `add r3,r3,#23'

So please disregard my question about how to reproduce it.

Thanks again, Uwe and emblocks, for reporting. I will look into them. It looks to me that we have two issues here: one is the error above and the other is about the r7 register. I will create two bug entries for them. I will use your project as the test case; if you are not comfortable with this, please let me know.

emblocks (gnugcc) said :
#23

Well, actually it's your own (ARM) project. It comes from the Mbed.org Python export test suite.

You're welcome.

emblocks (gnugcc) said :
#24

Terry, what about the loss of the mixed view assembly in GDB? Is this also a bug?

I think that this should work, but it doesn't. As soon as -flto is turned on, you lose the ability to view the assembly in mixed mode in GDB. If -flto is turned off, the mixed mode works again.

Terry Guo (terry.guo) said :
#25

You are right. I just figured out how to reproduce this mixed view issue. I will count it as the third issue and open a bug entry for it.

Terry Guo (terry.guo) said :
#26

All those issues are caused by the two pieces of inline assembly code below, which are not good enough:

#define SVC_Call(f) \
  __asm volatile \
  ( \
    "ldr r7,="#f"\n\t" \
    "mov r12,r7\n\t" \
    "svc 0" \
    : "=r" (__r0), "=r" (__r1), "=r" (__r2), "=r" (__r3) \
    : "r" (__r0), "r" (__r1), "r" (__r2), "r" (__r3) \
    : "r7", "r12", "lr", "cc" \
  );

Since r7 is the frame pointer register for Thumb targets and is used to access stack variables in a form like [r7, #4], any change to r7 triggers the error "error: r7 cannot be used in asm here". But if the function using SVC_Call is very small and doesn't use any stack variables, there is no such problem. I think we need to rewrite the above code to avoid using r7.

The problem "Error: cannot honor width suffix -- `mov r0,#128'" is caused by the inline assembly code below:

__attribute__((naked)) void software_init_hook (void) {
  __asm (
    ".syntax unified\n" <<<<------this is offending code
    ".thumb\n"
    "movs r0,#0\n"
    "movs r1,#0\n"
    "mov r4,r0\n"

Remember that our target is Cortex-M0, which currently uses non-unified syntax assembly code. With LTO and this assembly code, we end up with a case like:

    .syntax unified
    .thumb
    movs r0, #0 <----OK, this is unified.
   .thumb
 main:
    mov r0, #0 <---This is non-unified but will be wrongly regarded as unified.

All in all, we need to improve that CMSIS RTOS code.
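
One possible way to contain the `.syntax unified` directive, sketched here under assumptions and not taken from the actual CMSIS RTOS fix: switch the assembler back with `.syntax divided` at the end of the inline assembly block, so that the non-unified code the 4.8 compiler emits afterwards is parsed as intended. The names `init_hook_sketch` and `SKETCH_HAS_THUMB1_ASM` are hypothetical, the function body is a placeholder rather than the real `software_init_hook`, and the preprocessor guard only exists so the file still compiles on a non-Thumb-1 host.

```c
#if defined(__thumb__) && !defined(__thumb2__)
/* Thumb-1 only target (for example Cortex-M0). */
__attribute__((naked)) void init_hook_sketch(void)
{
    __asm volatile(
        ".syntax unified\n\t"   /* use unified syntax inside this block only */
        "movs r0, #0\n\t"
        "movs r1, #0\n\t"
        "bx lr\n\t"             /* naked function: return explicitly */
        ".syntax divided\n\t"   /* assumption: compiler output that follows
                                   is in divided (non-unified) syntax */
    );
}
#define SKETCH_HAS_THUMB1_ASM 1
#else
/* Fallback so the sketch builds on other targets; it does nothing. */
void init_hook_sketch(void) { }
#define SKETCH_HAS_THUMB1_ASM 0
#endif
```

This keeps the syntax change local to the block that needs it. It relies on the assumption stated in the comment, so it should be checked against the assembler output of the toolchain actually in use.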

emblocks (gnugcc) said :
#27

That makes sense, Terry.

That whole UAL is a bit obscure to me.
So we end up with one toolchain bug (the lost mixed view) and the other two are Mbed source code problems. That gives a bit more confidence in LTO, although it is very sensitive and has a lot of potential pitfalls for novice users.

Mixed mode:
The mixed view issue is not a big issue for command line users, I guess, because they don't notice it so quickly. But if you are working in an IDE where the mixed mode view is turned on by default, you notice it at once and it causes confusion about what is going on. It looks as if the assembly jumped to non-existing code.

Thanks for the feedback.
