"xor ax, ax" is still in use today. The main advantage is that it is shorter, ju...

sparkie · 2025-07-03T04:45:49 1751517949

In long mode, compilers will typically emit `xor eax, eax`, as it only needs 2 bytes: the opcode and the ModRM byte. `xor ax, ax` takes 3 bytes due to the operand-size override prefix (0x66), and `xor rax, rax` takes 3 bytes due to the REX.W prefix. `xor eax, eax` will still clear the full 64-bit register, since writes to a 32-bit register zero-extend into the upper 32 bits.
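To make the byte counts concrete, here is a rough sketch that just tabulates the encodings and measures them. The names `ENCODINGS` and `encoded_lengths` are made up for illustration, the bytes use the 0x31 form of XOR (assemblers may pick the equivalent 0x33 form instead), and `mov eax, 0` is my addition for comparison:

```python
# Encodings as they appear in 64-bit mode.
# 0x31 = XOR r/m, r; ModRM 0xC0 selects register 0 (ax/eax/rax) for both operands.
# ENCODINGS and encoded_lengths are illustrative names, not part of any library.
ENCODINGS = {
    "xor eax, eax": bytes([0x31, 0xC0]),              # opcode + ModRM
    "xor ax, ax":   bytes([0x66, 0x31, 0xC0]),        # 0x66 operand-size override prefix
    "xor rax, rax": bytes([0x48, 0x31, 0xC0]),        # 0x48 REX.W prefix
    "mov eax, 0":   bytes([0xB8, 0x00, 0x00, 0x00, 0x00]),  # opcode + 32-bit immediate
}


def encoded_lengths(encodings=ENCODINGS):
    """Map each instruction's text to the length of its encoding in bytes."""
    return {insn: len(code) for insn, code in encodings.items()}


if __name__ == "__main__":
    for insn, size in encoded_lengths().items():
        # bytes.hex(sep) needs Python 3.8 or newer.
        print(f"{insn:<14} {size} bytes  {ENCODINGS[insn].hex(' ')}")
```

If you want to check this against a real toolchain, assembling the same instructions and looking at the disassembly listing should show the same sizes.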

Shorter basically means you can fit more in the instruction cache, which should in theory improve performance marginally.

Someone · 2025-07-03T06:41:42 1751524902

Size isn’t everything. You should start by reading the manual for your CPU to see what it advises. The micro-architecture may treat only one of the sequences specially. On modern x64, I think it is indeed the shorter xor sequence: internally, the CPU just renames the register to one that always contains zero, which makes the instruction independent of any earlier instructions using eax.

IIRC, Intel said a mov was the way to go for some now-ancient x86 CPUs, though.

tyfighter · 2025-07-03T02:54:48 1751511288

Modern x86 implementations don't even do the XOR. They just rename the register to "zero".

burnt-resistor · 2025-07-03T01:59:14 1751507954

Barely. x86 is fading. Arm doesn't do this in GCC or Clang.

> Shorter usually means faster

It depends, so spouting generalities doesn't mean anything. Balancing instruction-cache line filling, cycle reduction, and reservation-station ordering is typically a constrained optimization problem for the compiler.

userbinator · 2025-07-03T02:55:46 1751511346

> Arm doesn't do this in GCC or Clang.

Because Arm64 has a zero register, Arm32 has small immediates, and all instructions are the same length.