Neon/AArch32: Mark inline asm output as read/write

'buffer' is both passed into the inline assembly code and modified by
it, so its operand must use the "+r" (read/write) constraint rather than
"=r" (write-only).  See
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, 6.47.2.3.

With GCC 4, this commit does not change the generated assembly code at
all.

With GCC 8, this commit fixes an assembly error:

  /tmp/{foo}.s: Assembler messages:
  /tmp/{foo}.s:775: Error: registers may not be the same --
                    `str r9,[r9],#4'

I'm not sure why that error went unnoticed, since I definitely
benchmarked the previous commit with GCC 8.  Anyhow, this commit changes
the generated assembly code slightly but does not alter performance.

With Clang 10, this commit changes the generated assembly code slightly
but does not alter performance.

Refer to #529
DRC 2021-07-12 13:52:38 -05:00
parent 2a2970af67
commit a1bfc05854

@@ -85,7 +85,7 @@ typedef struct {
 #else
 #define SPLAT() { \
   put_buffer = __builtin_bswap32(put_buffer); \
-  __asm__("str %1, [%0], #4" : "=r" (buffer) : "r" (put_buffer)); \
+  __asm__("str %1, [%0], #4" : "+r" (buffer) : "r" (put_buffer)); \
 }
 #endif
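
The constraint change can be tried in isolation with a sketch like the one below. It is not code from the patched file: store_word() is a hypothetical helper, and the portable fallback branch exists only so the sketch also builds on non-Arm hosts. The sketch additionally uses 'volatile' and a "memory" clobber, which the SPLAT() macro does not have, so that the compiler cannot drop or reorder the store when the helper is used on its own.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patched file): store one 32-bit word
 * at 'buffer' and return the pointer advanced past it. */
static unsigned char *store_word(unsigned char *buffer, uint32_t put_buffer)
{
#if defined(__GNUC__) && defined(__arm__) && \
    (!defined(__thumb__) || defined(__thumb2__))
  /* Post-indexed store: the instruction reads %0 as the address and then
   * writes %0 back incremented by 4.  "+r" tells the compiler about both
   * the read and the write.  With "=r" the compiler may assume the incoming
   * value of 'buffer' is unused and pick the same register for both
   * operands, which is the 'str r9,[r9],#4' error quoted above.
   * 'volatile' and the "memory" clobber are additions made for this
   * standalone sketch only. */
  __asm__ volatile("str %1, [%0], #4"
                   : "+r" (buffer)
                   : "r" (put_buffer)
                   : "memory");
#else
  /* Portable fallback with the same effect, assumed here for non-Arm
   * builds of the sketch. */
  memcpy(buffer, &put_buffer, sizeof(put_buffer));
  buffer += sizeof(put_buffer);
#endif
  return buffer;
}
```

Calling store_word(out, w) on a sufficiently large, suitably aligned byte array 'out' should leave the four bytes of 'w' at the start of 'out' in host byte order and return out + 4.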