[backport from gcc-4.8/trunk r191066 ] Date: Fri, 07 Sep 2012 09:35:50 +0100 From: Richard Earnshaw Subject: Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate List-Archive: On 20/08/12 12:36, Richard Guenther wrote: > On Fri, Aug 17, 2012 at 7:05 PM, Richard Earnshaw wrot= e: >> On 17/08/12 16:20, Richard Earnshaw wrote: >>> Ok, in which case we have to give is_widening_mult_rhs_p enough smarts >>> to not strip >>> >>> (s32)u32 >>> >>> and return u32. >>> >>> I'll have another think about it. >> >> Take two. >> >> This version should address your concerns about handling >> >> (u32)u16 * (u32)u16 -> u64 >> >> We now look at each operand directly, but when doing so we check whether >> the operand is the same size as the result or not. When it is, we can >> strip any conversion; when it isn't the conversion must preserve >> signedness of the inner operand and mustn't be a narrowing conversion. >> >> * tree-ssa-math-opts.c (widening_mult_conversion_strippable_p): >> New function. >> (is_widening_mult_rhs_p): Use it. >> >> Testing underway (again) >> >> OK? > > Ok. > > Thanks, > Richard. > >> R. >> > It turns out that this patch caused a few tests in gcc.target/arm to start failing. Investigating has shown that there are a number of reasons for this. One test (gcc.target/arm/pr50318-1.c) is just bogus. It's scanning for an unsigned widening multiply when the correct operation is a signed widening multiply. Fixing the compiler has just exposed the broken test. Two tests (gcc.target/arm/smlaltb-1.c and gcc.target/arm/smlaltt-1.c) fail because the compiler now generates a subreg in the middle of the widening multiply expression. Our patterns don't recognize this form and I'm really not keen on the compiler doing this sort of thing anyway, subreg operations of this nature are endian dependent and dealing with this is messy. For now I'm going to xfail these two tests. The final two failures (gcc.target/arm/wmul-5.c and gcc.target/arm/wmul-6.c) really should pass. They don't because my first iteration of the patch failed to spot that (sign_extend:DI (zero_extend:SI (reg:HI))) can be simplified legitimately to (zero_extend:DI (reg:HI)). The patch below to widening_mult_conversion_strippable_p fixes this. So this is a clean-up patch to fix these issues. gcc/ 2012-09-07 Richard Earnshaw PR tree-ssa/54295 * tree-ssa-math-opts.c (widening_mult_conversion_strippable_p): Sign-extension of a zero-extended value can be simplified to just zero-extension. gcc/testsuite/ 2012-09-07 Richard Earnshaw * gcc.target/arm/pr50318-1.c: Scan for smlal. * gcc.target/arm/smlaltb-1.c: XFAIL test. * gcc.target/arm/smlaltt-1.c: Likewise. --- gcc-4.7.1/gcc/testsuite/gcc.target/arm/pr50318-1.c.~1~ 2011-09-08 21:45:37.000000000 +0200 +++ gcc-4.7.1/gcc/testsuite/gcc.target/arm/pr50318-1.c 2012-09-09 13:14:33.000000000 +0200 @@ -8,4 +8,4 @@ long long test (unsigned int sec, unsign long)nsecs; } -/* { dg-final { scan-assembler "umlal" } } */ +/* { dg-final { scan-assembler "smlal" } } */ --- gcc-4.7.1/gcc/testsuite/gcc.target/arm/smlaltb-1.c.~1~ 2011-07-07 19:44:14.000000000 +0200 +++ gcc-4.7.1/gcc/testsuite/gcc.target/arm/smlaltb-1.c 2012-09-09 13:14:33.000000000 +0200 @@ -11,4 +11,4 @@ foo (long long x, int in) return x + b * a; } -/* { dg-final { scan-assembler "smlaltb\\t" } } */ +/* { dg-final { scan-assembler "smlaltb\\t" { xfail *-*-* } } } */ --- gcc-4.7.1/gcc/testsuite/gcc.target/arm/smlaltt-1.c.~1~ 2011-07-07 19:44:14.000000000 +0200 +++ gcc-4.7.1/gcc/testsuite/gcc.target/arm/smlaltt-1.c 2012-09-09 13:14:33.000000000 +0200 @@ -11,4 +11,4 @@ foo (long long x, int in1, int in2) return x + b * a; } -/* { dg-final { scan-assembler "smlaltt\\t" } } */ +/* { dg-final { scan-assembler "smlaltt\\t" { xfail *-*-* } } } */ --- gcc-4.7.1/gcc/tree-ssa-math-opts.c.~1~ 2012-09-09 13:14:14.000000000 +0200 +++ gcc-4.7.1/gcc/tree-ssa-math-opts.c 2012-09-09 13:14:33.000000000 +0200 @@ -1995,7 +1995,11 @@ widening_mult_conversion_strippable_p (t the operation and doesn't narrow the range. */ inner_op_type = TREE_TYPE (gimple_assign_rhs1 (stmt)); - if (TYPE_UNSIGNED (op_type) == TYPE_UNSIGNED (inner_op_type) + /* If the inner-most type is unsigned, then we can strip any + intermediate widening operation. If it's signed, then the + intermediate widening operation must also be signed. */ + if ((TYPE_UNSIGNED (inner_op_type) + || TYPE_UNSIGNED (op_type) == TYPE_UNSIGNED (inner_op_type)) && TYPE_PRECISION (op_type) > TYPE_PRECISION (inner_op_type)) return true;