[proposed 4.5 version of ARM fix now in 4.7/trunk, forward-ported to 4.6.0]

Date: Wed, 23 Mar 2011 16:46:42 +0100
From: Bernd Schmidt
Subject: Problem with ARM longcalls
List-Archive:

I've discovered a problem with -mlong-calls on ARM.  The bug was first
reported against a new target, but I'd copied the relevant code from the
ARM backend.

We use current_function_section in arm_is_long_call_p to decide whether
we're calling something that goes into the same section.  The problem
with this is that current_function_section can only be used during
final, since it relies on the global variable in_cold_section_p which is
set up only in assemble_start_function.

On ARM, this problem manifests as short-calls when a long-call would be
required; in the other port it was an "insn doesn't satisfy its
constraints" error.

The following patch is against 4.5, since the problem appears hidden in
mainline (the initialization of first_function_block_is_cold has changed).

Ok for trunk and branches after arm-linux tests complete?


Bernd

	* function.c (init_function_start): Call decide_function_section.
	* varasm.c (decide_function_section): New function.
	(assemble_start_function): When not using
	flag_reorder_blocks_and_partition, don't compute in_cold_section_p
	or first_function_block_is_cold.
	* rtl.h (decide_function_section): Declare.

	* gcc.target/arm/cold-lc.c: New test.

--- gcc-4.6.0/gcc/function.c.~1~	2011-03-09 21:49:00.000000000 +0100
+++ gcc-4.6.0/gcc/function.c	2011-06-21 22:32:05.000000000 +0200
@@ -4488,6 +4488,7 @@ init_function_start (tree subr)
   else
     allocate_struct_function (subr, false);
   prepare_function_start ();
+  decide_function_section (subr);
 
   /* Warn if this value is an aggregate type,
      regardless of which calling convention we are using for it.  */
--- gcc-4.6.0/gcc/rtl.h.~1~	2011-03-09 21:49:00.000000000 +0100
+++ gcc-4.6.0/gcc/rtl.h	2011-06-21 22:32:05.000000000 +0200
@@ -1664,6 +1664,7 @@ extern rtx get_pool_constant (rtx);
 extern rtx get_pool_constant_mark (rtx, bool *);
 extern enum machine_mode get_pool_mode (const_rtx);
 extern rtx simplify_subtraction (rtx);
+extern void decide_function_section (tree);
 
 /* In function.c  */
 extern rtx assign_stack_local (enum machine_mode, HOST_WIDE_INT, int);
--- gcc-4.6.0/gcc/testsuite/gcc.target/arm/cold-lc.c.~1~	1970-01-01 01:00:00.000000000 +0100
+++ gcc-4.6.0/gcc/testsuite/gcc.target/arm/cold-lc.c	2011-06-21 22:32:05.000000000 +0200
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlong-calls" } */
+/* { dg-final { scan-assembler-not "bl\[^\n\]*dump_stack" } } */
+
+extern void dump_stack (void) __attribute__ ((__cold__)) __attribute__ ((noinline));
+struct thread_info {
+    struct task_struct *task;
+};
+extern struct thread_info *current_thread_info (void);
+
+void dump_stack (void)
+{
+    unsigned long stack;
+    show_stack ((current_thread_info ()->task), &stack);
+}
+
+void die (char *str, void *fp, int nr)
+{
+    dump_stack ();
+    while (1);
+}
+
--- gcc-4.6.0/gcc/varasm.c.~1~	2011-02-28 16:36:37.000000000 +0100
+++ gcc-4.6.0/gcc/varasm.c	2011-06-21 22:33:56.000000000 +0200
@@ -1541,6 +1541,38 @@ notice_global_symbol (tree decl)
     }
 }
 
+/* If not using flag_reorder_blocks_and_partition, decide early whether the
+   current function goes into the cold section, so that targets can use
+   current_function_section during RTL expansion.  DECL describes the
+   function.  */
+
+void
+decide_function_section (tree decl)
+{
+  first_function_block_is_cold = false;
+
+  if (flag_reorder_blocks_and_partition)
+    /* We will decide in assemble_start_function.  */
+    return;
+
+  if (DECL_SECTION_NAME (decl))
+    {
+      /* Calls to function_section rely on first_function_block_is_cold
+         being accurate.  The first block may be cold even if we aren't
+         doing partitioning, if the entire function was decided by
+         choose_function_section (predict.c) to be cold.  */
+
+      initialize_cold_section_name ();
+
+      if (crtl->subsections.unlikely_text_section_name
+          && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
+                     crtl->subsections.unlikely_text_section_name) == 0)
+        first_function_block_is_cold = true;
+    }
+
+  in_cold_section_p = first_function_block_is_cold;
+}
+
 /* Output assembler code for the constant pool of a function and associated
    with defining the name of the function.  DECL describes the function.
    NAME is the function's name.  For the constant pool, we use the current
@@ -1553,7 +1585,6 @@ assemble_start_function (tree decl, cons
   char tmp_label[100];
   bool hot_label_written = false;
 
-  first_function_block_is_cold = false;
   if (flag_reorder_blocks_and_partition)
     {
       ASM_GENERATE_INTERNAL_LABEL (tmp_label, "LHOTB", const_labelno);
@@ -1588,6 +1619,8 @@ assemble_start_function (tree decl, cons
 
   if (flag_reorder_blocks_and_partition)
     {
+      first_function_block_is_cold = false;
+
       switch_to_section (unlikely_text_section ());
       assemble_align (DECL_ALIGN (decl));
       ASM_OUTPUT_LABEL (asm_out_file, crtl->subsections.cold_section_label);
@@ -1604,17 +1637,8 @@ assemble_start_function (tree decl, cons
           hot_label_written = true;
           first_function_block_is_cold = true;
         }
+      in_cold_section_p = first_function_block_is_cold;
     }
-  else if (DECL_SECTION_NAME (decl))
-    {
-      /* Calls to function_section rely on first_function_block_is_cold
-         being accurate.  */
-      first_function_block_is_cold
-        = (cgraph_node (current_function_decl)->frequency
-           == NODE_FREQUENCY_UNLIKELY_EXECUTED);
-    }
-
-  in_cold_section_p = first_function_block_is_cold;
 
   /* Switch to the correct text section for the start of the function.  */