Hi,

This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.

This patch introduces the Matrix Multiply (Int8, F32, F64) extensions
to the arm backend.

The following Matrix Multiply instructions are added: vummla, vsmmla,
vusmmla, vusdot, vsudot[1].

[1] https://developer.arm.com/docs/ddi0597/latest/simd-and-floating-point-instructions-alphabetic-order

Committed on behalf of Mihail Ionescu.

gas/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>

        * config/tc-arm.c (arm_ext_i8mm): New feature set.
        (do_vusdot): New.
        (do_vsudot): New.
        (do_vsmmla): New.
        (do_vummla): New.
        (insns): Add vsmmla, vummla, vusmmla, vusdot, vsudot mnemonics.
        (armv86a_ext_table): Add i8mm extension.
        (arm_extensions): Move bf16 extension to context sensitive table.
        (armv82a_ext_table, armv84a_ext_table, armv85a_ext_table):
        Move bf16 extension to context sensitive table.
        (armv86a_ext_table): Add i8mm extension.
        * doc/c-arm.texi: Document i8mm extension.
        * testsuite/gas/arm/i8mm.s: New test.
        * testsuite/gas/arm/i8mm.d: New test.
        * testsuite/gas/arm/bfloat17-cmdline-bad-3.d: Update test.

include/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>

        * opcode/arm.h (ARM_EXT2_I8MM): New feature macro.

opcodes/ChangeLog:

2019-11-07  Mihail Ionescu  <mihail.ionescu@arm.com>

        * arm-dis.c (neon_opcodes): Add i8mm SIMD instructions.

Regression tested on arm-none-eabi.

Is this ok for trunk?

Regards,
Mihail

gdb-9-branch
10 changed files with 195 additions and 5 deletions
@@ -1,4 +1,4 @@
 #name: Bfloat 16 bad extension
 #source: bfloat16-non-neon.s
 #as: -mno-warn-deprecated -march=armv8.1-a+bf16
-#error: .*Error: extension does not apply to the base architecture.*
+#error: .*Error: unknown architectural extension `bf16'*
@@ -0,0 +1,36 @@
+#name: Int8 Matrix Multiply extension
+#source: i8mm.s
+#as: -mno-warn-deprecated -march=armv8.6-a+i8mm+simd -I$srcdir/$subdir
+#objdump: -dr --show-raw-insn
+
+.*: +file format .*arm.*
+
+Disassembly of section \.text:
+
+00000000 <\.text>:
+ *[0-9a-f]+: fcea4c40 vusmmla\.s8 q10, q5, q0
+ *[0-9a-f]+: fc6a4c50 vummla\.u8 q10, q5, q0
+ *[0-9a-f]+: fc6a4c40 vsmmla\.s8 q10, q5, q0
+ *[0-9a-f]+: fcea4d40 vusdot\.s8 q10, q5, q0
+ *[0-9a-f]+: feca4d50 vsudot\.u8 q10, q5, d0\[0\]
+ *[0-9a-f]+: feca4d70 vsudot\.u8 q10, q5, d0\[1\]
+ *[0-9a-f]+: feca4d40 vusdot\.s8 q10, q5, d0\[0\]
+ *[0-9a-f]+: feca4d60 vusdot\.s8 q10, q5, d0\[1\]
+ *[0-9a-f]+: fca5ad00 vusdot\.s8 d10, d5, d0
+ *[0-9a-f]+: fe85ad00 vusdot\.s8 d10, d5, d0\[0\]
+ *[0-9a-f]+: fe85ad20 vusdot\.s8 d10, d5, d0\[1\]
+ *[0-9a-f]+: fe85ad10 vsudot\.u8 d10, d5, d0\[0\]
+ *[0-9a-f]+: fe85ad30 vsudot\.u8 d10, d5, d0\[1\]
+ *[0-9a-f]+: fcea4c40 vusmmla\.s8 q10, q5, q0
+ *[0-9a-f]+: fc6a4c50 vummla\.u8 q10, q5, q0
+ *[0-9a-f]+: fc6a4c40 vsmmla\.s8 q10, q5, q0
+ *[0-9a-f]+: fcea4d40 vusdot\.s8 q10, q5, q0
+ *[0-9a-f]+: feca4d50 vsudot\.u8 q10, q5, d0\[0\]
+ *[0-9a-f]+: feca4d70 vsudot\.u8 q10, q5, d0\[1\]
+ *[0-9a-f]+: feca4d40 vusdot\.s8 q10, q5, d0\[0\]
+ *[0-9a-f]+: feca4d60 vusdot\.s8 q10, q5, d0\[1\]
+ *[0-9a-f]+: fca5ad00 vusdot\.s8 d10, d5, d0
+ *[0-9a-f]+: fe85ad00 vusdot\.s8 d10, d5, d0\[0\]
+ *[0-9a-f]+: fe85ad20 vusdot\.s8 d10, d5, d0\[1\]
+ *[0-9a-f]+: fe85ad10 vsudot\.u8 d10, d5, d0\[0\]
+ *[0-9a-f]+: fe85ad30 vsudot\.u8 d10, d5, d0\[1\]
@@ -0,0 +1,32 @@
+vusmmla.s8 q10, q5, q0
+vummla.u8 q10, q5, q0
+vsmmla.s8 q10, q5, q0
+
+vusdot.s8 q10, q5, q0
+vsudot.u8 q10, q5, d0[0]
+vsudot.u8 q10, q5, d0[1]
+vusdot.s8 q10, q5, d0[0]
+vusdot.s8 q10, q5, d0[1]
+
+vusdot.s8 d10, d5, d0
+vusdot.s8 d10, d5, d0[0]
+vusdot.s8 d10, d5, d0[1]
+vsudot.u8 d10, d5, d0[0]
+vsudot.u8 d10, d5, d0[1]
+
+
+vusmmla q10.s8, q5.s8, q0.s8
+vummla q10.u8, q5.u8, q0.u8
+vsmmla q10.s8, q5.s8, q0.s8
+
+vusdot q10.s8, q5.s8, q0.s8
+vsudot q10.u8, q5.u8, d0.u8[0]
+vsudot q10.u8, q5.u8, d0.u8[1]
+vusdot q10.s8, q5.s8, d0.s8[0]
+vusdot q10.s8, q5.s8, d0.s8[1]
+
+vusdot d10.s8, d5.s8, d0.s8
+vusdot d10.s8, d5.s8, d0.s8[0]
+vusdot d10.s8, d5.s8, d0.s8[1]
+vsudot d10.u8, d5.u8, d0.u8[0]
+vsudot d10.u8, d5.u8, d0.u8[1]