Add IFN_COND_{MUL,DIV,MOD,RDIV}

Message ID 87zi0pct6j.fsf@linaro.org
State New

Commit Message

Richard Sandiford May 24, 2018, 9:13 a.m. UTC
This patch adds support for conditional multiplication and division.
It's mostly mechanical, but a few notes:

* The *_optab name and the .md names are the same as the unconditional
  forms, just with "cond_" added to the front.  This means we still
  have the awkward difference between sdiv and div, etc.

* It was easier to retain the difference between integer and FP
  division in the function names, given that they map to different
  tree codes (TRUNC_DIV_EXPR and RDIV_EXPR).

* SVE has no direct support for IFN_COND_MOD, but it seemed more
  consistent to add it anyway.

* Adding IFN_COND_MUL enables an extra fully-masked reduction
  in gcc.dg/vect/pr53773.c.

* In practice we don't actually use the integer division forms without
  if-conversion support (added by a later patch).
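
As a rough scalar model of the behaviour described in the notes above (a sketch for illustration, not code from the patch): each lane gets the result of the operation where the predicate is set and a fallback value otherwise.  The names cond_mul, cond_div, masked_product and else_vals, and the fixed LANES width, are all assumptions introduced here.

```c
#include <stdbool.h>
#include <stddef.h>

#define LANES 4  /* hypothetical fixed vector length, in elements */

/* Lane-wise model of a conditional multiply: active lanes get a * b,
   inactive lanes get the corresponding fallback value.  */
static void
cond_mul (int *res, const bool *mask, const int *a, const int *b,
          const int *else_vals)
{
  for (size_t i = 0; i < LANES; ++i)
    res[i] = mask[i] ? a[i] * b[i] : else_vals[i];
}

/* Lane-wise model of a conditional truncating division.  Inactive lanes
   never evaluate a / b, so a zero divisor there is harmless.  */
static void
cond_div (int *res, const bool *mask, const int *a, const int *b,
          const int *else_vals)
{
  for (size_t i = 0; i < LANES; ++i)
    res[i] = mask[i] ? a[i] / b[i] : else_vals[i];
}

/* Product of x[0..n-1] using a fully-masked loop: the last, partial
   iteration is handled by the mask instead of a scalar tail.  */
static int
masked_product (const int *x, size_t n)
{
  int acc[LANES];
  for (size_t i = 0; i < LANES; ++i)
    acc[i] = 1;

  for (size_t base = 0; base < n; base += LANES)
    {
      bool mask[LANES];
      int chunk[LANES];
      for (size_t i = 0; i < LANES; ++i)
        {
          mask[i] = base + i < n;
          chunk[i] = mask[i] ? x[base + i] : 0;
        }
      /* Inactive lanes keep their previous accumulator value.  */
      cond_mul (acc, mask, acc, chunk, acc);
    }

  /* Final horizontal reduction of the per-lane partial products.  */
  int result = 1;
  for (size_t i = 0; i < LANES; ++i)
    result *= acc[i];
  return result;
}
```

In this model a masked-off lane never performs the division, which is the property that matters for the integer forms.  masked_product is meant to resemble the kind of fully-masked multiplication reduction that the pr53773.c note refers to, under the stated assumptions.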

Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu.  OK for the non-AArch64 bits?

Richard


2018-05-24  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* doc/sourcebuild.texi (vect_double_cond_arith): Include
	multiplication and division.
	* doc/md.texi (cond_mul@var{m}, cond_div@var{m}, cond_mod@var{m})
	(cond_udiv@var{m}, cond_umod@var{m}): Document.
	* optabs.def (cond_smul_optab, cond_sdiv_optab, cond_smod_optab)
	(cond_udiv_optab, cond_umod_optab): New optabs.
	* internal-fn.def (IFN_COND_MUL, IFN_COND_DIV, IFN_COND_MOD)
	(IFN_COND_RDIV): New internal functions.
	* internal-fn.c (get_conditional_internal_fn): Handle TRUNC_DIV_EXPR,
	TRUNC_MOD_EXPR and RDIV_EXPR.
	* genmatch.c (commutative_op): Handle CFN_COND_MUL.
	* match.pd (UNCOND_BINARY, COND_BINARY): Handle them.
	* config/aarch64/iterators.md (UNSPEC_COND_MUL, UNSPEC_COND_DIV):
	New unspecs.
	(SVE_INT_BINARY): Include mult.
	(SVE_COND_FP_BINARY): Include UNSPEC_COND_MUL and UNSPEC_COND_DIV.
	(optab, sve_int_op): Handle mult.
	(optab, sve_fp_op, commutative): Handle UNSPEC_COND_MUL and
	UNSPEC_COND_DIV.
	* config/aarch64/aarch64-sve.md (cond_<optab><mode>): New pattern
	for SVE_INT_BINARY_SD.

gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_vect_double_cond_arith): Include
	multiplication and division.
	* gcc.dg/vect/pr53773.c: Do not expect a scalar tail when using
	fully-masked loops with a fixed vector length.
	* gcc.dg/vect/vect-cond-arith-1.c: Add multiplication and division
	tests.
	* gcc.target/aarch64/sve/vcond_8.c: Likewise.
	* gcc.target/aarch64/sve/vcond_9.c: Likewise.
	* gcc.target/aarch64/sve/vcond_12.c: Add multiplication tests.
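
The pairings behind the match.pd, internal-fn.def and optabs.def entries above can be pictured roughly as follows.  This is a standalone sketch, not the GCC sources: the enum and function names (toy_tree_code, toy_ifn, toy_conditional_fn, toy_pattern_stem) are hypothetical stand-ins, and only the pairings themselves are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes added to UNCOND_BINARY.  */
enum toy_tree_code { TOY_MULT, TOY_TRUNC_DIV, TOY_TRUNC_MOD, TOY_RDIV };

/* Hypothetical stand-ins for the matching IFN_COND_* functions.  */
enum toy_ifn { TOY_COND_MUL, TOY_COND_DIV, TOY_COND_MOD, TOY_COND_RDIV };

/* Unconditional operation -> conditional internal function, following
   the positional pairing of UNCOND_BINARY and COND_BINARY.  */
static enum toy_ifn
toy_conditional_fn (enum toy_tree_code code)
{
  switch (code)
    {
    case TOY_MULT:      return TOY_COND_MUL;
    case TOY_TRUNC_DIV: return TOY_COND_DIV;
    case TOY_TRUNC_MOD: return TOY_COND_MOD;
    case TOY_RDIV:      return TOY_COND_RDIV;
    }
  return TOY_COND_MUL;  /* not reached for valid input */
}

/* Internal function -> .md pattern name, without the mode suffix.
   Only the integer division and modulus forms depend on signedness.  */
static const char *
toy_pattern_stem (enum toy_ifn fn, bool is_unsigned)
{
  switch (fn)
    {
    case TOY_COND_MUL:  return "cond_mul";
    case TOY_COND_DIV:  return is_unsigned ? "cond_udiv" : "cond_div";
    case TOY_COND_MOD:  return is_unsigned ? "cond_umod" : "cond_mod";
    case TOY_COND_RDIV: return "cond_div";
    }
  return "";  /* not reached for valid input */
}
```

Signed integer division and FP division end up with the same "cond_div" stem in this sketch, since both go through cond_sdiv_optab in the patch; the mode suffix is what tells them apart.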

Comments

Richard Biener May 24, 2018, 10:18 a.m. UTC | #1
On Thu, May 24, 2018 at 11:34 AM Richard Sandiford <richard.sandiford@linaro.org> wrote:

> This patch adds support for conditional multiplication and division.
> It's mostly mechanical, but a few notes:

> * The *_optab name and the .md names are the same as the unconditional
>    forms, just with "cond_" added to the front.  This means we still
>    have the awkward difference between sdiv and div, etc.

> * It was easier to retain the difference between integer and FP
>    division in the function names, given that they map to different
>    tree codes (TRUNC_DIV_EXPR and RDIV_EXPR).

> * SVE has no direct support for IFN_COND_MOD, but it seemed more
>    consistent to add it anyway.

> * Adding IFN_COND_MUL enables an extra fully-masked reduction
>    in gcc.dg/vect/pr53773.c.

> * In practice we don't actually use the integer division forms without
>    if-conversion support (added by a later patch).

> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu.  OK for the non-AArch64 bits?


OK.

Richard.

> Richard



> 2018-05-24  Richard Sandiford  <richard.sandiford@linaro.org>

> gcc/
>          * doc/sourcebuild.texi (vect_double_cond_arith): Include
>          multiplication and division.
>          * doc/md.texi (cond_mul@var{m}, cond_div@var{m}, cond_mod@var{m})
>          (cond_udiv@var{m}, cond_umod@var{m}): Document.
>          * optabs.def (cond_smul_optab, cond_sdiv_optab, cond_smod_optab)
>          (cond_udiv_optab, cond_umod_optab): New optabs.
>          * internal-fn.def (IFN_COND_MUL, IFN_COND_DIV, IFN_COND_MOD)
>          (IFN_COND_RDIV): New internal functions.
>          * internal-fn.c (get_conditional_internal_fn): Handle TRUNC_DIV_EXPR,
>          TRUNC_MOD_EXPR and RDIV_EXPR.
>          * genmatch.c (commutative_op): Handle CFN_COND_MUL.
>          * match.pd (UNCOND_BINARY, COND_BINARY): Handle them.
>          * config/aarch64/iterators.md (UNSPEC_COND_MUL, UNSPEC_COND_DIV):
>          New unspecs.
>          (SVE_INT_BINARY): Include mult.
>          (SVE_COND_FP_BINARY): Include UNSPEC_MUL and UNSPEC_DIV.
>          (optab, sve_int_op): Handle mult.
>          (optab, sve_fp_op, commutative): Handle UNSPEC_COND_MUL and
>          UNSPEC_COND_DIV.
>          * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New pattern
>          for SVE_INT_BINARY_SD.

> gcc/testsuite/
>          * lib/target-supports.exp
>          (check_effective_target_vect_double_cond_arith): Include
>          multiplication and division.
>          * gcc.dg/vect/pr53773.c: Do not expect a scalar tail when using
>          fully-masked loops with a fixed vector length.
>          * gcc.dg/vect/vect-cond-arith-1.c: Add multiplication and division
>          tests.
>          * gcc.target/aarch64/sve/vcond_8.c: Likewise.
>          * gcc.target/aarch64/sve/vcond_9.c: Likewise.
>          * gcc.target/aarch64/sve/vcond_12.c: Add multiplication tests.


> Index: gcc/doc/sourcebuild.texi
> ===================================================================
> --- gcc/doc/sourcebuild.texi    2018-05-24 09:54:37.508451387 +0100
> +++ gcc/doc/sourcebuild.texi    2018-05-24 10:12:10.145352193 +0100
> @@ -1426,8 +1426,9 @@ have different type from the value opera
>   Target supports hardware vectors of @code{double}.

>   @item vect_double_cond_arith
> -Target supports conditional addition, subtraction, minimum and maximum
> -on vectors of @code{double}, via the @code{cond_} optabs.
> +Target supports conditional addition, subtraction, multiplication,
> +division, minimum and maximum on vectors of @code{double}, via the
> +@code{cond_} optabs.

>   @item vect_element_align_preferred
>   The target's preferred vector alignment is the same as the element
> Index: gcc/doc/md.texi
> ===================================================================
> --- gcc/doc/md.texi     2018-05-24 09:32:10.522816506 +0100
> +++ gcc/doc/md.texi     2018-05-24 10:12:10.142352315 +0100
> @@ -6333,6 +6333,11 @@ operand 0, otherwise (operand 2 + operan

>   @cindex @code{cond_add@var{mode}} instruction pattern
>   @cindex @code{cond_sub@var{mode}} instruction pattern
> +@cindex @code{cond_mul@var{mode}} instruction pattern
> +@cindex @code{cond_div@var{mode}} instruction pattern
> +@cindex @code{cond_udiv@var{mode}} instruction pattern
> +@cindex @code{cond_mod@var{mode}} instruction pattern
> +@cindex @code{cond_umod@var{mode}} instruction pattern
>   @cindex @code{cond_and@var{mode}} instruction pattern
>   @cindex @code{cond_ior@var{mode}} instruction pattern
>   @cindex @code{cond_xor@var{mode}} instruction pattern
> @@ -6342,6 +6347,11 @@ operand 0, otherwise (operand 2 + operan
>   @cindex @code{cond_umax@var{mode}} instruction pattern
>   @item @samp{cond_add@var{mode}}
>   @itemx @samp{cond_sub@var{mode}}
> +@itemx @samp{cond_mul@var{mode}}
> +@itemx @samp{cond_div@var{mode}}
> +@itemx @samp{cond_udiv@var{mode}}
> +@itemx @samp{cond_mod@var{mode}}
> +@itemx @samp{cond_umod@var{mode}}
>   @itemx @samp{cond_and@var{mode}}
>   @itemx @samp{cond_ior@var{mode}}
>   @itemx @samp{cond_xor@var{mode}}
> Index: gcc/optabs.def
> ===================================================================
> --- gcc/optabs.def      2018-05-16 12:48:59.194282896 +0100
> +++ gcc/optabs.def      2018-05-24 10:12:10.146352152 +0100
> @@ -222,6 +222,11 @@ OPTAB_D (notcc_optab, "not$acc")
>   OPTAB_D (movcc_optab, "mov$acc")
>   OPTAB_D (cond_add_optab, "cond_add$a")
>   OPTAB_D (cond_sub_optab, "cond_sub$a")
> +OPTAB_D (cond_smul_optab, "cond_mul$a")
> +OPTAB_D (cond_sdiv_optab, "cond_div$a")
> +OPTAB_D (cond_smod_optab, "cond_mod$a")
> +OPTAB_D (cond_udiv_optab, "cond_udiv$a")
> +OPTAB_D (cond_umod_optab, "cond_umod$a")
>   OPTAB_D (cond_and_optab, "cond_and$a")
>   OPTAB_D (cond_ior_optab, "cond_ior$a")
>   OPTAB_D (cond_xor_optab, "cond_xor$a")
> Index: gcc/internal-fn.def
> ===================================================================
> --- gcc/internal-fn.def 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/internal-fn.def 2018-05-24 10:12:10.146352152 +0100
> @@ -145,6 +145,12 @@ DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST,

>   DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
>   DEF_INTERNAL_OPTAB_FN (COND_SUB, ECF_CONST, cond_sub, cond_binary)
> +DEF_INTERNAL_OPTAB_FN (COND_MUL, ECF_CONST, cond_smul, cond_binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_DIV, ECF_CONST, first,
> +                             cond_sdiv, cond_udiv, cond_binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MOD, ECF_CONST, first,
> +                             cond_smod, cond_umod, cond_binary)
> +DEF_INTERNAL_OPTAB_FN (COND_RDIV, ECF_CONST, cond_sdiv, cond_binary)
>   DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MIN, ECF_CONST, first,
>                                cond_smin, cond_umin, cond_binary)
>   DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MAX, ECF_CONST, first,
> Index: gcc/internal-fn.c
> ===================================================================
> --- gcc/internal-fn.c   2018-05-24 09:32:10.522816506 +0100
> +++ gcc/internal-fn.c   2018-05-24 10:12:10.146352152 +0100
> @@ -3246,6 +3246,12 @@ get_conditional_internal_fn (tree_code c
>         return IFN_COND_MIN;
>       case MAX_EXPR:
>         return IFN_COND_MAX;
> +    case TRUNC_DIV_EXPR:
> +      return IFN_COND_DIV;
> +    case TRUNC_MOD_EXPR:
> +      return IFN_COND_MOD;
> +    case RDIV_EXPR:
> +      return IFN_COND_RDIV;
>       case BIT_AND_EXPR:
>         return IFN_COND_AND;
>       case BIT_IOR_EXPR:
> Index: gcc/genmatch.c
> ===================================================================
> --- gcc/genmatch.c      2018-05-24 09:54:37.508451387 +0100
> +++ gcc/genmatch.c      2018-05-24 10:12:10.145352193 +0100
> @@ -487,6 +487,7 @@ commutative_op (id_base *id)

>         case CFN_COND_ADD:
>         case CFN_COND_SUB:
> +      case CFN_COND_MUL:
>         case CFN_COND_MAX:
>         case CFN_COND_MIN:
>         case CFN_COND_AND:
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd        2018-05-24 09:54:37.509451356 +0100
> +++ gcc/match.pd        2018-05-24 10:12:10.146352152 +0100
> @@ -78,10 +78,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   /* Binary operations and their associated IFN_COND_* function.  */
>   (define_operator_list UNCOND_BINARY
>     plus minus
> +  mult trunc_div trunc_mod rdiv
>     min max
>     bit_and bit_ior bit_xor)
>   (define_operator_list COND_BINARY
>     IFN_COND_ADD IFN_COND_SUB
> +  IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV
>     IFN_COND_MIN IFN_COND_MAX
>     IFN_COND_AND IFN_COND_IOR IFN_COND_XOR)

> Index: gcc/config/aarch64/iterators.md
> ===================================================================
> --- gcc/config/aarch64/iterators.md     2018-05-24 09:54:37.508451387 +0100
> +++ gcc/config/aarch64/iterators.md     2018-05-24 10:12:10.142352315 +0100
> @@ -464,6 +464,8 @@ (define_c_enum "unspec"
>       UNSPEC_UMUL_HIGHPART ; Used in aarch64-sve.md.
>       UNSPEC_COND_ADD    ; Used in aarch64-sve.md.
>       UNSPEC_COND_SUB    ; Used in aarch64-sve.md.
> +    UNSPEC_COND_MUL    ; Used in aarch64-sve.md.
> +    UNSPEC_COND_DIV    ; Used in aarch64-sve.md.
>       UNSPEC_COND_MAX    ; Used in aarch64-sve.md.
>       UNSPEC_COND_MIN    ; Used in aarch64-sve.md.
>       UNSPEC_COND_LT     ; Used in aarch64-sve.md.
> @@ -1202,7 +1204,7 @@ (define_code_iterator SVE_INT_UNARY [neg
>   ;; SVE floating-point unary operations.
>   (define_code_iterator SVE_FP_UNARY [neg abs sqrt])

> -(define_code_iterator SVE_INT_BINARY [plus minus smax umax smin umin
> +(define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin
>                                        and ior xor])

>   (define_code_iterator SVE_INT_BINARY_REV [minus])
> @@ -1239,6 +1241,7 @@ (define_code_attr optab [(ashift "ashl")
>                           (neg "neg")
>                           (plus "add")
>                           (minus "sub")
> +                        (mult "mul")
>                           (div "div")
>                           (udiv "udiv")
>                           (ss_plus "qadd")
> @@ -1382,6 +1385,7 @@ (define_mode_attr lconst_atomic [(QI "K"
>   ;; The integer SVE instruction that implements an rtx code.
>   (define_code_attr sve_int_op [(plus "add")
>                                (minus "sub")
> +                             (mult "mul")
>                                (div "sdiv")
>                                (udiv "udiv")
>                                (neg "neg")
> @@ -1540,9 +1544,10 @@ (define_int_iterator UNPACK_UNSIGNED [UN
>   (define_int_iterator MUL_HIGHPART [UNSPEC_SMUL_HIGHPART UNSPEC_UMUL_HIGHPART])

>   (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_ADD UNSPEC_COND_SUB
> +                                        UNSPEC_COND_MUL UNSPEC_COND_DIV
>                                           UNSPEC_COND_MAX UNSPEC_COND_MIN])

> -(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB])
> +(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB UNSPEC_COND_DIV])

>   (define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
>                                        UNSPEC_COND_EQ UNSPEC_COND_NE
> @@ -1573,6 +1578,8 @@ (define_int_attr optab [(UNSPEC_ANDF "an
>                          (UNSPEC_XORV "xor")
>                          (UNSPEC_COND_ADD "add")
>                          (UNSPEC_COND_SUB "sub")
> +                       (UNSPEC_COND_MUL "mul")
> +                       (UNSPEC_COND_DIV "div")
>                          (UNSPEC_COND_MAX "smax")
>                          (UNSPEC_COND_MIN "smin")])

> @@ -1787,10 +1794,14 @@ (define_int_attr cmp_op [(UNSPEC_COND_LT

>   (define_int_attr sve_fp_op [(UNSPEC_COND_ADD "fadd")
>                              (UNSPEC_COND_SUB "fsub")
> +                           (UNSPEC_COND_MUL "fmul")
> +                           (UNSPEC_COND_DIV "fdiv")
>                              (UNSPEC_COND_MAX "fmaxnm")
>                              (UNSPEC_COND_MIN "fminnm")])

>   (define_int_attr commutative [(UNSPEC_COND_ADD "true")
>                                (UNSPEC_COND_SUB "false")
> +                             (UNSPEC_COND_MUL "true")
> +                             (UNSPEC_COND_DIV "false")
>                                (UNSPEC_COND_MIN "true")
>                                (UNSPEC_COND_MAX "true")])
> Index: gcc/config/aarch64/aarch64-sve.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-sve.md   2018-05-24 09:54:37.506451449 +0100
> +++ gcc/config/aarch64/aarch64-sve.md   2018-05-24 10:12:10.141352356 +0100
> @@ -1803,6 +1803,21 @@ (define_expand "cond_<optab><mode>"
>     aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
>   })

> +(define_expand "cond_<optab><mode>"
> +  [(set (match_operand:SVE_SDI 0 "register_operand")
> +       (unspec:SVE_SDI
> +         [(match_operand:<VPRED> 1 "register_operand")
> +          (SVE_INT_BINARY_SD:SVE_SDI
> +            (match_operand:SVE_SDI 2 "register_operand")
> +            (match_operand:SVE_SDI 3 "register_operand"))
> +          (match_operand:SVE_SDI 4 "register_operand")]
> +         UNSPEC_SEL))]
> +  "TARGET_SVE"
> +{
> +  bool commutative_p = (GET_RTX_CLASS (<CODE>) == RTX_COMM_ARITH);
> +  aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
> +})
> +
>   ;; Predicated integer operations.
>   (define_insn "*cond_<optab><mode>"
>     [(set (match_operand:SVE_I 0 "register_operand" "=w")
> @@ -1817,6 +1832,19 @@ (define_insn "*cond_<optab><mode>"
>     "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
>   )

> +(define_insn "*cond_<optab><mode>"
> +  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
> +       (unspec:SVE_SDI
> +         [(match_operand:<VPRED> 1 "register_operand" "Upl")
> +          (SVE_INT_BINARY_SD:SVE_SDI
> +            (match_operand:SVE_SDI 2 "register_operand" "0")
> +            (match_operand:SVE_SDI 3 "register_operand" "w"))
> +          (match_dup 2)]
> +         UNSPEC_SEL))]
> +  "TARGET_SVE"
> +  "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
> +)
> +
>   ;; Predicated integer operations with the operands reversed.
>   (define_insn "*cond_<optab><mode>"
>     [(set (match_operand:SVE_I 0 "register_operand" "=w")
> @@ -1828,6 +1856,19 @@ (define_insn "*cond_<optab><mode>"
>             (match_dup 3)]
>            UNSPEC_SEL))]
>     "TARGET_SVE"
> +  "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
> +)
> +
> +(define_insn "*cond_<optab><mode>"
> +  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
> +       (unspec:SVE_SDI
> +         [(match_operand:<VPRED> 1 "register_operand" "Upl")
> +          (SVE_INT_BINARY_SD:SVE_SDI
> +            (match_operand:SVE_SDI 2 "register_operand" "w")
> +            (match_operand:SVE_SDI 3 "register_operand" "0"))
> +          (match_dup 3)]
> +         UNSPEC_SEL))]
> +  "TARGET_SVE"
>     "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
>   )

> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> --- gcc/testsuite/lib/target-supports.exp       2018-05-24 09:54:37.511451293 +0100
> +++ gcc/testsuite/lib/target-supports.exp       2018-05-24 10:12:10.148352070 +0100
> @@ -5590,8 +5590,9 @@ proc check_effective_target_vect_double
>       return $et_vect_double_saved($et_index)
>   }

> -# Return 1 if the target supports conditional addition, subtraction, minimum
> -# and maximum on vectors of double, via the cond_ optabs.  Return 0 otherwise.
> +# Return 1 if the target supports conditional addition, subtraction,
> +# multiplication, division, minimum and maximum on vectors of double,
> +# via the cond_ optabs.  Return 0 otherwise.

>   proc check_effective_target_vect_double_cond_arith { } {
>       return [check_effective_target_aarch64_sve]
> Index: gcc/testsuite/gcc.dg/vect/pr53773.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-16 12:48:59.115202362 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-24 10:12:10.147352111 +0100
> @@ -14,5 +14,8 @@ foo (int integral, int decimal, int powe
>     return integral+decimal;
>   }

> -/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" } } */
> +/* We can avoid a scalar tail when using fully-masked loops with a fixed
> +   vector length.  */
> +/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" { target { { ! vect_fully_masked } || vect_variable_length } } } } */
> +/* { dg-final { scan-tree-dump-times "\\* 10" 0 "optimized" { target { vect_fully_masked && { ! vect_variable_length } } } } } */

> Index: gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c       2018-05-24 09:54:37.509451356 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c       2018-05-24 10:12:10.147352111 +0100
> @@ -6,6 +6,8 @@ #define N (VECTOR_BITS * 11 / 64 + 3)

>   #define add(A, B) ((A) + (B))
>   #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))

>   #define DEF(OP)                                                        \
>     void __attribute__ ((noipa))                                 \
> @@ -34,6 +36,8 @@ #define TEST(OP)                                      \
>   #define FOR_EACH_OP(T)                         \
>     T (add)                                      \
>     T (sub)                                      \
> +  T (mul)                                      \
> +  T (div)                                      \
>     T (__builtin_fmax)                           \
>     T (__builtin_fmin)

> @@ -54,5 +58,7 @@ main (void)

>   /* { dg-final { scan-tree-dump { = \.COND_ADD} "optimized" { target vect_double_cond_arith } } } */
>   /* { dg-final { scan-tree-dump { = \.COND_SUB} "optimized" { target vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_MUL} "optimized" { target vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_RDIV} "optimized" { target vect_double_cond_arith } } } */
>   /* { dg-final { scan-tree-dump { = \.COND_MAX} "optimized" { target vect_double_cond_arith } } } */
>   /* { dg-final { scan-tree-dump { = \.COND_MIN} "optimized" { target vect_double_cond_arith } } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c      2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c      2018-05-24 10:12:10.147352111 +0100
> @@ -5,6 +5,8 @@

>   #define add(A, B) ((A) + (B))
>   #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>   #define max(A, B) ((A) > (B) ? (A) : (B))
>   #define min(A, B) ((A) < (B) ? (A) : (B))
>   #define and(A, B) ((A) & (B))
> @@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP)       \
>   #define FOR_EACH_INT_TYPE(T, TYPE) \
>     T (TYPE, TYPE, add) \
>     T (TYPE, TYPE, sub) \
> +  T (TYPE, TYPE, mul) \
>     T (TYPE, TYPE, max) \
>     T (TYPE, TYPE, min) \
>     T (TYPE, TYPE, and) \
> @@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
>   #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
>     T (TYPE, CMPTYPE, add) \
>     T (TYPE, CMPTYPE, sub) \
> +  T (TYPE, CMPTYPE, mul) \
> +  T (TYPE, CMPTYPE, div) \
>     T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
>     T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)

> @@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
>   /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
>   /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */

> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
>   /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c      2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c      2018-05-24 10:12:10.148352070 +0100
> @@ -5,6 +5,8 @@

>   #define add(A, B) ((A) + (B))
>   #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>   #define max(A, B) ((A) > (B) ? (A) : (B))
>   #define min(A, B) ((A) < (B) ? (A) : (B))
>   #define and(A, B) ((A) & (B))
> @@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP)       \
>   #define FOR_EACH_INT_TYPE(T, TYPE) \
>     T (TYPE, TYPE, add) \
>     T (TYPE, TYPE, sub) \
> +  T (TYPE, TYPE, mul) \
>     T (TYPE, TYPE, max) \
>     T (TYPE, TYPE, min) \
>     T (TYPE, TYPE, and) \
> @@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
>   #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
>     T (TYPE, CMPTYPE, add) \
>     T (TYPE, CMPTYPE, sub) \
> +  T (TYPE, CMPTYPE, mul) \
> +  T (TYPE, CMPTYPE, div) \
>     T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
>     T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)

> @@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
>   /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
>   /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.d, p[0-7]/m,} 2 } } */

> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
>   /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c     2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c     2018-05-24 10:12:10.147352111 +0100
> @@ -5,6 +5,8 @@

>   #define add(A, B) ((A) + (B))
>   #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>   #define max(A, B) ((A) > (B) ? (A) : (B))
>   #define min(A, B) ((A) < (B) ? (A) : (B))
>   #define and(A, B) ((A) & (B))
> @@ -29,6 +31,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP)       \
>   #define FOR_EACH_INT_TYPE(T, TYPE) \
>     T (TYPE, TYPE, add) \
>     T (TYPE, TYPE, sub) \
> +  T (TYPE, TYPE, mul) \
>     T (TYPE, TYPE, max) \
>     T (TYPE, TYPE, min) \
>     T (TYPE, TYPE, and) \
> @@ -38,6 +41,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
>   #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
>     T (TYPE, CMPTYPE, add) \
>     T (TYPE, CMPTYPE, sub) \
> +  T (TYPE, CMPTYPE, mul) \
> +  /* No div because that gets converted into a mul anyway.  */ \
>     T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
>     T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)

> @@ -58,10 +63,10 @@ FOR_EACH_LOOP (DEF_LOOP)

>   /* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.., z[0-9]+} } } */

> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 14 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 18 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 18 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 18 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 16 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 21 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 21 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 21 } } */

>   /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
>   /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> @@ -73,6 +78,11 @@ FOR_EACH_LOOP (DEF_LOOP)
>   /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
>   /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */

> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -116,6 +126,10 @@ FOR_EACH_LOOP (DEF_LOOP)
>   /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>   /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
Patch

Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi	2018-05-24 09:54:37.508451387 +0100
+++ gcc/doc/sourcebuild.texi	2018-05-24 10:12:10.145352193 +0100
@@ -1426,8 +1426,9 @@  have different type from the value opera
 Target supports hardware vectors of @code{double}.
 
 @item vect_double_cond_arith
-Target supports conditional addition, subtraction, minimum and maximum
-on vectors of @code{double}, via the @code{cond_} optabs.
+Target supports conditional addition, subtraction, multiplication,
+division, minimum and maximum on vectors of @code{double}, via the
+@code{cond_} optabs.
 
 @item vect_element_align_preferred
 The target's preferred vector alignment is the same as the element
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2018-05-24 09:32:10.522816506 +0100
+++ gcc/doc/md.texi	2018-05-24 10:12:10.142352315 +0100
@@ -6333,6 +6333,11 @@  operand 0, otherwise (operand 2 + operan
 
 @cindex @code{cond_add@var{mode}} instruction pattern
 @cindex @code{cond_sub@var{mode}} instruction pattern
+@cindex @code{cond_mul@var{mode}} instruction pattern
+@cindex @code{cond_div@var{mode}} instruction pattern
+@cindex @code{cond_udiv@var{mode}} instruction pattern
+@cindex @code{cond_mod@var{mode}} instruction pattern
+@cindex @code{cond_umod@var{mode}} instruction pattern
 @cindex @code{cond_and@var{mode}} instruction pattern
 @cindex @code{cond_ior@var{mode}} instruction pattern
 @cindex @code{cond_xor@var{mode}} instruction pattern
@@ -6342,6 +6347,11 @@  operand 0, otherwise (operand 2 + operan
 @cindex @code{cond_umax@var{mode}} instruction pattern
 @item @samp{cond_add@var{mode}}
 @itemx @samp{cond_sub@var{mode}}
+@itemx @samp{cond_mul@var{mode}}
+@itemx @samp{cond_div@var{mode}}
+@itemx @samp{cond_udiv@var{mode}}
+@itemx @samp{cond_mod@var{mode}}
+@itemx @samp{cond_umod@var{mode}}
 @itemx @samp{cond_and@var{mode}}
 @itemx @samp{cond_ior@var{mode}}
 @itemx @samp{cond_xor@var{mode}}
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def	2018-05-16 12:48:59.194282896 +0100
+++ gcc/optabs.def	2018-05-24 10:12:10.146352152 +0100
@@ -222,6 +222,11 @@  OPTAB_D (notcc_optab, "not$acc")
 OPTAB_D (movcc_optab, "mov$acc")
 OPTAB_D (cond_add_optab, "cond_add$a")
 OPTAB_D (cond_sub_optab, "cond_sub$a")
+OPTAB_D (cond_smul_optab, "cond_mul$a")
+OPTAB_D (cond_sdiv_optab, "cond_div$a")
+OPTAB_D (cond_smod_optab, "cond_mod$a")
+OPTAB_D (cond_udiv_optab, "cond_udiv$a")
+OPTAB_D (cond_umod_optab, "cond_umod$a")
 OPTAB_D (cond_and_optab, "cond_and$a")
 OPTAB_D (cond_ior_optab, "cond_ior$a")
 OPTAB_D (cond_xor_optab, "cond_xor$a")
Index: gcc/internal-fn.def
===================================================================
--- gcc/internal-fn.def	2018-05-24 09:32:10.522816506 +0100
+++ gcc/internal-fn.def	2018-05-24 10:12:10.146352152 +0100
@@ -145,6 +145,12 @@  DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST,
 
 DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
 DEF_INTERNAL_OPTAB_FN (COND_SUB, ECF_CONST, cond_sub, cond_binary)
+DEF_INTERNAL_OPTAB_FN (COND_MUL, ECF_CONST, cond_smul, cond_binary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (COND_DIV, ECF_CONST, first,
+			      cond_sdiv, cond_udiv, cond_binary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MOD, ECF_CONST, first,
+			      cond_smod, cond_umod, cond_binary)
+DEF_INTERNAL_OPTAB_FN (COND_RDIV, ECF_CONST, cond_sdiv, cond_binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MIN, ECF_CONST, first,
 			      cond_smin, cond_umin, cond_binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MAX, ECF_CONST, first,
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2018-05-24 09:32:10.522816506 +0100
+++ gcc/internal-fn.c	2018-05-24 10:12:10.146352152 +0100
@@ -3246,6 +3246,12 @@  get_conditional_internal_fn (tree_code c
       return IFN_COND_MIN;
     case MAX_EXPR:
       return IFN_COND_MAX;
+    case TRUNC_DIV_EXPR:
+      return IFN_COND_DIV;
+    case TRUNC_MOD_EXPR:
+      return IFN_COND_MOD;
+    case RDIV_EXPR:
+      return IFN_COND_RDIV;
     case BIT_AND_EXPR:
       return IFN_COND_AND;
     case BIT_IOR_EXPR:
Index: gcc/genmatch.c
===================================================================
--- gcc/genmatch.c	2018-05-24 09:54:37.508451387 +0100
+++ gcc/genmatch.c	2018-05-24 10:12:10.145352193 +0100
@@ -487,6 +487,7 @@  commutative_op (id_base *id)
 
       case CFN_COND_ADD:
       case CFN_COND_SUB:
+      case CFN_COND_MUL:
       case CFN_COND_MAX:
       case CFN_COND_MIN:
       case CFN_COND_AND:
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	2018-05-24 09:54:37.509451356 +0100
+++ gcc/match.pd	2018-05-24 10:12:10.146352152 +0100
@@ -78,10 +78,12 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* Binary operations and their associated IFN_COND_* function.  */
 (define_operator_list UNCOND_BINARY
   plus minus
+  mult trunc_div trunc_mod rdiv
   min max
   bit_and bit_ior bit_xor)
 (define_operator_list COND_BINARY
   IFN_COND_ADD IFN_COND_SUB
+  IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV
   IFN_COND_MIN IFN_COND_MAX
   IFN_COND_AND IFN_COND_IOR IFN_COND_XOR)
     
Index: gcc/config/aarch64/iterators.md
===================================================================
--- gcc/config/aarch64/iterators.md	2018-05-24 09:54:37.508451387 +0100
+++ gcc/config/aarch64/iterators.md	2018-05-24 10:12:10.142352315 +0100
@@ -464,6 +464,8 @@  (define_c_enum "unspec"
     UNSPEC_UMUL_HIGHPART ; Used in aarch64-sve.md.
     UNSPEC_COND_ADD	; Used in aarch64-sve.md.
     UNSPEC_COND_SUB	; Used in aarch64-sve.md.
+    UNSPEC_COND_MUL	; Used in aarch64-sve.md.
+    UNSPEC_COND_DIV	; Used in aarch64-sve.md.
     UNSPEC_COND_MAX	; Used in aarch64-sve.md.
     UNSPEC_COND_MIN	; Used in aarch64-sve.md.
     UNSPEC_COND_LT	; Used in aarch64-sve.md.
@@ -1202,7 +1204,7 @@  (define_code_iterator SVE_INT_UNARY [neg
 ;; SVE floating-point unary operations.
 (define_code_iterator SVE_FP_UNARY [neg abs sqrt])
 
-(define_code_iterator SVE_INT_BINARY [plus minus smax umax smin umin
+(define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin
 				      and ior xor])
 
 (define_code_iterator SVE_INT_BINARY_REV [minus])
@@ -1239,6 +1241,7 @@  (define_code_attr optab [(ashift "ashl")
 			 (neg "neg")
 			 (plus "add")
 			 (minus "sub")
+			 (mult "mul")
 			 (div "div")
 			 (udiv "udiv")
 			 (ss_plus "qadd")
@@ -1382,6 +1385,7 @@  (define_mode_attr lconst_atomic [(QI "K"
 ;; The integer SVE instruction that implements an rtx code.
 (define_code_attr sve_int_op [(plus "add")
 			      (minus "sub")
+			      (mult "mul")
 			      (div "sdiv")
 			      (udiv "udiv")
 			      (neg "neg")
@@ -1540,9 +1544,10 @@  (define_int_iterator UNPACK_UNSIGNED [UN
 (define_int_iterator MUL_HIGHPART [UNSPEC_SMUL_HIGHPART UNSPEC_UMUL_HIGHPART])
 
 (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_ADD UNSPEC_COND_SUB
+					 UNSPEC_COND_MUL UNSPEC_COND_DIV
 					 UNSPEC_COND_MAX UNSPEC_COND_MIN])
 
-(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB])
+(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB UNSPEC_COND_DIV])
 
 (define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
 				      UNSPEC_COND_EQ UNSPEC_COND_NE
@@ -1573,6 +1578,8 @@  (define_int_attr optab [(UNSPEC_ANDF "an
 			(UNSPEC_XORV "xor")
 			(UNSPEC_COND_ADD "add")
 			(UNSPEC_COND_SUB "sub")
+			(UNSPEC_COND_MUL "mul")
+			(UNSPEC_COND_DIV "div")
 			(UNSPEC_COND_MAX "smax")
 			(UNSPEC_COND_MIN "smin")])
 
@@ -1787,10 +1794,14 @@  (define_int_attr cmp_op [(UNSPEC_COND_LT
 
 (define_int_attr sve_fp_op [(UNSPEC_COND_ADD "fadd")
 			    (UNSPEC_COND_SUB "fsub")
+			    (UNSPEC_COND_MUL "fmul")
+			    (UNSPEC_COND_DIV "fdiv")
 			    (UNSPEC_COND_MAX "fmaxnm")
 			    (UNSPEC_COND_MIN "fminnm")])
 
 (define_int_attr commutative [(UNSPEC_COND_ADD "true")
 			      (UNSPEC_COND_SUB "false")
+			      (UNSPEC_COND_MUL "true")
+			      (UNSPEC_COND_DIV "false")
 			      (UNSPEC_COND_MIN "true")
 			      (UNSPEC_COND_MAX "true")])
Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md	2018-05-24 09:54:37.506451449 +0100
+++ gcc/config/aarch64/aarch64-sve.md	2018-05-24 10:12:10.141352356 +0100
@@ -1803,6 +1803,21 @@  (define_expand "cond_<optab><mode>"
   aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
 })
 
+(define_expand "cond_<optab><mode>"
+  [(set (match_operand:SVE_SDI 0 "register_operand")
+	(unspec:SVE_SDI
+	  [(match_operand:<VPRED> 1 "register_operand")
+	   (SVE_INT_BINARY_SD:SVE_SDI
+	     (match_operand:SVE_SDI 2 "register_operand")
+	     (match_operand:SVE_SDI 3 "register_operand"))
+	   (match_operand:SVE_SDI 4 "register_operand")]
+	  UNSPEC_SEL))]
+  "TARGET_SVE"
+{
+  bool commutative_p = (GET_RTX_CLASS (<CODE>) == RTX_COMM_ARITH);
+  aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
+})
+
 ;; Predicated integer operations.
 (define_insn "*cond_<optab><mode>"
   [(set (match_operand:SVE_I 0 "register_operand" "=w")
@@ -1817,6 +1832,19 @@  (define_insn "*cond_<optab><mode>"
   "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
 )
 
+(define_insn "*cond_<optab><mode>"
+  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
+	(unspec:SVE_SDI
+	  [(match_operand:<VPRED> 1 "register_operand" "Upl")
+	   (SVE_INT_BINARY_SD:SVE_SDI
+	     (match_operand:SVE_SDI 2 "register_operand" "0")
+	     (match_operand:SVE_SDI 3 "register_operand" "w"))
+	   (match_dup 2)]
+	  UNSPEC_SEL))]
+  "TARGET_SVE"
+  "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
+)
+
 ;; Predicated integer operations with the operands reversed.
 (define_insn "*cond_<optab><mode>"
   [(set (match_operand:SVE_I 0 "register_operand" "=w")
@@ -1828,6 +1856,19 @@  (define_insn "*cond_<optab><mode>"
 	   (match_dup 3)]
 	  UNSPEC_SEL))]
   "TARGET_SVE"
+  "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
+)
+
+(define_insn "*cond_<optab><mode>"
+  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
+	(unspec:SVE_SDI
+	  [(match_operand:<VPRED> 1 "register_operand" "Upl")
+	   (SVE_INT_BINARY_SD:SVE_SDI
+	     (match_operand:SVE_SDI 2 "register_operand" "w")
+	     (match_operand:SVE_SDI 3 "register_operand" "0"))
+	   (match_dup 3)]
+	  UNSPEC_SEL))]
+  "TARGET_SVE"
   "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
 )
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	2018-05-24 09:54:37.511451293 +0100
+++ gcc/testsuite/lib/target-supports.exp	2018-05-24 10:12:10.148352070 +0100
@@ -5590,8 +5590,9 @@  proc check_effective_target_vect_double
     return $et_vect_double_saved($et_index)
 }
 
-# Return 1 if the target supports conditional addition, subtraction, minimum
-# and maximum on vectors of double, via the cond_ optabs.  Return 0 otherwise.
+# Return 1 if the target supports conditional addition, subtraction,
+# multiplication, division, minimum and maximum on vectors of double,
+# via the cond_ optabs.  Return 0 otherwise.
 
 proc check_effective_target_vect_double_cond_arith { } {
     return [check_effective_target_aarch64_sve]
Index: gcc/testsuite/gcc.dg/vect/pr53773.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr53773.c	2018-05-16 12:48:59.115202362 +0100
+++ gcc/testsuite/gcc.dg/vect/pr53773.c	2018-05-24 10:12:10.147352111 +0100
@@ -14,5 +14,8 @@  foo (int integral, int decimal, int powe
   return integral+decimal;
 }
 
-/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" } } */
+/* We can avoid a scalar tail when using fully-masked loops with a fixed
+   vector length.  */
+/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" { target { { ! vect_fully_masked } || vect_variable_length } } } } */
+/* { dg-final { scan-tree-dump-times "\\* 10" 0 "optimized" { target { vect_fully_masked && { ! vect_variable_length } } } } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c	2018-05-24 09:54:37.509451356 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c	2018-05-24 10:12:10.147352111 +0100
@@ -6,6 +6,8 @@  #define N (VECTOR_BITS * 11 / 64 + 3)
 
 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 
 #define DEF(OP)							\
   void __attribute__ ((noipa))					\
@@ -34,6 +36,8 @@  #define TEST(OP)					\
 #define FOR_EACH_OP(T)				\
   T (add)					\
   T (sub)					\
+  T (mul)					\
+  T (div)					\
   T (__builtin_fmax)				\
   T (__builtin_fmin)
 
@@ -54,5 +58,7 @@  main (void)
 
 /* { dg-final { scan-tree-dump { = \.COND_ADD} "optimized" { target vect_double_cond_arith } } } */
 /* { dg-final { scan-tree-dump { = \.COND_SUB} "optimized" { target vect_double_cond_arith } } } */
+/* { dg-final { scan-tree-dump { = \.COND_MUL} "optimized" { target vect_double_cond_arith } } } */
+/* { dg-final { scan-tree-dump { = \.COND_RDIV} "optimized" { target vect_double_cond_arith } } } */
 /* { dg-final { scan-tree-dump { = \.COND_MAX} "optimized" { target vect_double_cond_arith } } } */
 /* { dg-final { scan-tree-dump { = \.COND_MIN} "optimized" { target vect_double_cond_arith } } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c	2018-05-24 09:54:37.510451324 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c	2018-05-24 10:12:10.147352111 +0100
@@ -5,6 +5,8 @@ 
 
 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 #define max(A, B) ((A) > (B) ? (A) : (B))
 #define min(A, B) ((A) < (B) ? (A) : (B))
 #define and(A, B) ((A) & (B))
@@ -27,6 +29,7 @@  #define DEF_LOOP(TYPE, CMPTYPE, OP)				\
 #define FOR_EACH_INT_TYPE(T, TYPE) \
   T (TYPE, TYPE, add) \
   T (TYPE, TYPE, sub) \
+  T (TYPE, TYPE, mul) \
   T (TYPE, TYPE, max) \
   T (TYPE, TYPE, min) \
   T (TYPE, TYPE, and) \
@@ -36,6 +39,8 @@  #define FOR_EACH_INT_TYPE(T, TYPE) \
 #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
   T (TYPE, CMPTYPE, add) \
   T (TYPE, CMPTYPE, sub) \
+  T (TYPE, CMPTYPE, mul) \
+  T (TYPE, CMPTYPE, div) \
   T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
   T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
 
@@ -67,6 +72,11 @@  FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
 
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
+
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
@@ -110,6 +120,14 @@  FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
 
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c	2018-05-24 09:54:37.510451324 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c	2018-05-24 10:12:10.148352070 +0100
@@ -5,6 +5,8 @@ 
 
 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 #define max(A, B) ((A) > (B) ? (A) : (B))
 #define min(A, B) ((A) < (B) ? (A) : (B))
 #define and(A, B) ((A) & (B))
@@ -27,6 +29,7 @@  #define DEF_LOOP(TYPE, CMPTYPE, OP)				\
 #define FOR_EACH_INT_TYPE(T, TYPE) \
   T (TYPE, TYPE, add) \
   T (TYPE, TYPE, sub) \
+  T (TYPE, TYPE, mul) \
   T (TYPE, TYPE, max) \
   T (TYPE, TYPE, min) \
   T (TYPE, TYPE, and) \
@@ -36,6 +39,8 @@  #define FOR_EACH_INT_TYPE(T, TYPE) \
 #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
   T (TYPE, CMPTYPE, add) \
   T (TYPE, CMPTYPE, sub) \
+  T (TYPE, CMPTYPE, mul) \
+  T (TYPE, CMPTYPE, div) \
   T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
   T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
 
@@ -67,6 +72,11 @@  FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
 
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
+
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
@@ -110,6 +120,14 @@  FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
 
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c	2018-05-24 09:54:37.510451324 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c	2018-05-24 10:12:10.147352111 +0100
@@ -5,6 +5,8 @@ 
 
 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 #define max(A, B) ((A) > (B) ? (A) : (B))
 #define min(A, B) ((A) < (B) ? (A) : (B))
 #define and(A, B) ((A) & (B))
@@ -29,6 +31,7 @@  #define DEF_LOOP(TYPE, CMPTYPE, OP)				\
 #define FOR_EACH_INT_TYPE(T, TYPE) \
   T (TYPE, TYPE, add) \
   T (TYPE, TYPE, sub) \
+  T (TYPE, TYPE, mul) \
   T (TYPE, TYPE, max) \
   T (TYPE, TYPE, min) \
   T (TYPE, TYPE, and) \
@@ -38,6 +41,8 @@  #define FOR_EACH_INT_TYPE(T, TYPE) \
 #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
   T (TYPE, CMPTYPE, add) \
   T (TYPE, CMPTYPE, sub) \
+  T (TYPE, CMPTYPE, mul) \
+  /* No div because that gets converted into a mul anyway.  */ \
   T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
   T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
 
@@ -58,10 +63,10 @@  FOR_EACH_LOOP (DEF_LOOP)
 
 /* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.., z[0-9]+} } } */
 
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 14 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 18 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 18 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 18 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 16 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 21 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 21 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 21 } } */
 
 /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
@@ -73,6 +78,11 @@  FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
 
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
+
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
@@ -116,6 +126,10 @@  FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
 
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
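
The pr53773.c change in this patch relies on IFN_COND_MUL to keep a multiplication reduction inside a fully-masked loop, so that no scalar tail is needed. A rough scalar model of that idea follows. It assumes a loop that repeatedly multiplies by a constant (as the `\\* 10` scan pattern suggests) and a hypothetical fixed lane count; `masked_mul_reduction` is an illustrative name rather than a GCC function, and the code the vectorizer actually generates will differ.

```c
#include <stdbool.h>

/* Hypothetical fixed lane count, for illustration only.  */
#define NLANES 4

/* Scalar model of a fully-masked multiplication reduction.
   Returns start * factor^n without a separate scalar tail loop.  */
static int
masked_mul_reduction (int start, int factor, unsigned int n)
{
  int acc[NLANES];
  for (unsigned int lane = 0; lane < NLANES; ++lane)
    acc[lane] = 1;  /* Multiplicative identity.  */

  unsigned int remaining = n;
  while (remaining > 0)
    {
      for (unsigned int lane = 0; lane < NLANES; ++lane)
        {
          /* Lanes past the end of the iteration space are inactive.  */
          bool active = lane < remaining;
          /* Conditional multiply with the accumulator as the fallback:
             inactive lanes keep their previous value.  */
          acc[lane] = active ? acc[lane] * factor : acc[lane];
        }
      remaining = remaining > NLANES ? remaining - NLANES : 0;
    }

  /* Final cross-lane reduction.  */
  int result = start;
  for (unsigned int lane = 0; lane < NLANES; ++lane)
    result *= acc[lane];
  return result;
}
```

Because the last partial iteration is handled by the mask rather than by leftover scalar iterations, the scalar `* 10` operations need not survive in the optimized dump for fixed-length, fully-masked targets.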