What’s the Difference between AMD64 and Intel EM64T?
I just had a misunderstood word to clear up: what in the heck is the difference between the AMD x64 implementation (AMD64) and the Intel implementation (EM64T)? A friend just asked me, and I tried to sound important and tell him, but in the end I realized I didn’t know jack either. So there you go. Serves me right for not applying my LRH Study Tech.
Here’s the answer from Answers:
For much of its history, AMD produced processors patterned after Intel’s. In an ironic twist of computing history, however, AMD64 has been adopted (under the name EM64T or IA-32e) by Intel, the original creator of the x86 processor line, in newer versions of its Pentium 4, Pentium D, Pentium Extreme Edition, Celeron D, Xeon, and Core 2 processors.
The EM64T project began with the codename Yamhill, named after the Yamhill River in Oregon’s Willamette Valley. After several years of denying that the project existed, Intel acknowledged it in early 2004 and gave it the codename CT (Clackamas Technology), also named after an Oregon river, the Clackamas. Within weeks of the CT announcement, Intel gave it several new names. After the spring 2004 IDF, Intel named it IA-32e (IA-32 Extensions), and a few weeks later devised the name EM64T. Intel’s chairman at the time, Craig Barrett, admitted that this was one of the company’s worst-kept secrets. A recent white paper discussing SSE4 and future extensions refers to the instruction set as “Intel64”.
Summary from the Intel website
Intel EM64T improves performance by allowing the system to address more than 4 GiB of both virtual and physical memory. Intel EM64T provides support for: 
- 64-bit flat virtual address space
- 64-bit pointers
- 64-bit wide general purpose registers
- 64-bit integer support
- Up to 1 tebibyte (TiB) of platform address space
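The sizes quoted here follow from the address width: an n-bit flat address space covers 2^n bytes. As a rough illustration (a sketch only; the helper names `addressable_bytes` and `format_size` are made up for this post and do not come from Intel's material):

```python
# Illustrative only: how address width translates into addressable bytes.
# The helper names are hypothetical, not taken from any Intel documentation.

UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]

def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits

def format_size(num_bytes):
    """Render a byte count in binary units, truncating any remainder."""
    unit = 0
    while num_bytes >= 1024 and unit < len(UNITS) - 1:
        num_bytes //= 1024
        unit += 1
    return f"{num_bytes} {UNITS[unit]}"

for bits in (32, 40, 64):
    print(f"{bits}-bit addresses: {format_size(addressable_bytes(bits))}")
```

This prints 4 GiB for 32 bits (the old limit), 1 TiB for 40 bits (the platform figure above), and 16 EiB for a full 64-bit pointer.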
EM64T was first implemented in the E revision (Prescott) of the Pentium 4 line of microprocessors, which was supported by the i915P (Grantsdale) and i925X (Alderwood) chipsets in June 2004. Intel implemented EM64T largely in response to competitive pressure from AMD’s AMD64 technology, which had shipped a year earlier, in 2003, in the Opteron and Athlon 64 lines of microprocessors (otherwise known as the K8 core). The technology was built to be largely compatible with AMD64 and with the then-announced Windows XP Professional x64 Edition, which supported AMD64. Intel’s first processor to activate EM64T was the multi-socket Xeon codenamed Nocona. Because the Nocona Xeon is based directly on Intel’s desktop processor, the Pentium 4, the Pentium 4 also has EM64T built in. As with Hyper-Threading, though, the feature was not initially enabled on the then-new Prescott design, likely because enabling it did not fit Intel’s stance on x86-64 extensions at the time. Intel has since begun selling EM64T-enabled Pentium 4s using the E0 revision of the Prescott core, marketed as the Pentium 4, model F. However, the revision F core was targeted at workstations; Intel’s official launch of EM64T on the desktop was the N0 stepping Prescott-2M. The E0 revision also adds eXecute Disable (XD) support, Intel’s name for the NX bit, to EM64T, and has been included in the current Xeon codenamed Irwindale. All 9xx/8xx/6xx/5x6/5x1/3x6/3x1 series CPUs have EM64T enabled, as do the Core 2 CPUs, and as will all future Intel CPUs. EM64T is also present in the last members of the Celeron D line.
The first Intel mobile processor supporting EM64T is the Merom version of the Core 2 processor, which was released on 27 July 2006. None of Intel’s earlier notebook CPUs (Core Duo, Pentium M, Celeron M, Mobile Pentium 4) support EM64T.
Differences between AMD64 and EM64T
There are a small number of differences between the two instruction sets. Compilers generally produce binaries that run on both AMD64 and EM64T, so the differences are mainly of interest to compiler and operating system developers.
- EM64T’s BSF and BSR instructions act differently when the source is 0 and the operand size is 32 bits. The processor sets the zero flag and leaves the upper 32 bits of the destination undefined.
- AMD64 supports 3DNow! instructions. This includes prefetch with the opcode 0x0F 0x0D and PREFETCHW, which are useful for hiding memory latency.
- EM64T lacks the ability to save and restore a reduced (and thus faster) version of the floating-point state (involving the FXSAVE and FXRSTOR instructions).
- EM64T lacks some model-specific registers that are considered architectural to AMD64. These include SYSCFG, TOP_MEM, and TOP_MEM2.
- EM64T supports microcode update as in 32-bit mode, whereas AMD64 processors use a different microcode update format and control MSRs.
- EM64T’s CPUID instruction is very vendor-specific, as is normal for x86-style processors.
- EM64T supports the MONITOR and MWAIT instructions, used by operating systems to better deal with Hyper-threading.
- AMD64 systems allow the use of the AGP aperture as an IO-MMU. Operating systems can take advantage of this to let normal PCI devices DMA to memory above 4 GiB. EM64T systems require the use of bounce buffers, which are slower.
- SYSCALL and SYSRET are also only supported in IA-32e mode (not in compatibility mode) on EM64T. SYSENTER and SYSEXIT are supported in both modes.
- Near branches with the 0x66 (operand size) prefix behave differently. One type of CPU clears only the top 32 bits, while the other type clears the top 48 bits.
- Early AMD64 processors lacked the CMPXCHG16B instruction, which is an extension of the CMPXCHG8B instruction present on most post-486 processors. Similar to CMPXCHG8B, CMPXCHG16B allows for atomic operations on 128-bit double quadword (or oword) data types. This is useful for high-resolution counters that could be updated by multiple processors (or cores). Without CMPXCHG16B the only way to perform such an operation is by using a critical section.
- Early Intel CPUs with EM64T lacked the LAHF and SAHF instructions supported by AMD64 until the introduction of the Pentium 4 G1 stepping in December 2005. LAHF and SAHF are load and store instructions, respectively, for certain status flags. These instructions are used for virtualization and floating-point condition handling.
- Early Intel CPUs with EM64T also lacked the NX bit (No Execute bit) of the AMD64 architecture. The NX bit marks memory pages as non-executable, allowing protection against many types of malicious code.
- Originally EM64T hardware allowed access only to 2^36 bytes of memory, while AMD64 systems can handle up to 2^40 bytes (with planned expansion to 2^56 bytes). However, as of recent publications, EM64T now provides 2^40 bytes of memory access.
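Several of the differences above (CMPXCHG16B, LAHF/SAHF, NX, SYSCALL, MONITOR/MWAIT, 3DNow!, the fast FXSAVE/FXRSTOR mode, and the address widths) are reported by the CPUID instruction, so software can test for them instead of guessing from the vendor. Below is a minimal sketch of decoding those bits in Python. It assumes the raw register values have already been read by some other tool. The function names are made up for this post, and the sample values at the bottom are invented for illustration, not read from real hardware. The bit positions are the ones documented in the Intel and AMD manuals.

```python
# Sketch: decode CPUID feature bits behind several items in the list above.
# This does not execute CPUID itself; it only interprets register values.

LEAF1_ECX = {                      # CPUID leaf 1, ECX
    "MONITOR/MWAIT": 3,
    "CMPXCHG16B": 13,
}
EXT_ECX = {                        # CPUID leaf 0x80000001, ECX
    "LAHF/SAHF in 64-bit mode": 0,
    "PREFETCHW (3DNow! prefetch)": 8,
}
EXT_EDX = {                        # CPUID leaf 0x80000001, EDX
    "SYSCALL/SYSRET": 11,
    "NX / XD bit": 20,
    "fast FXSAVE/FXRSTOR": 25,
    "long mode (x86-64)": 29,
    "3DNow!": 31,
}

def decode(value, bit_table):
    """Map each feature name to True/False based on its bit in value."""
    return {name: bool((value >> bit) & 1) for name, bit in bit_table.items()}

def describe_cpu(leaf1_ecx, ext_ecx, ext_edx):
    """Combine the three registers into one feature dictionary."""
    features = {}
    features.update(decode(leaf1_ecx, LEAF1_ECX))
    features.update(decode(ext_ecx, EXT_ECX))
    features.update(decode(ext_edx, EXT_EDX))
    return features

def address_widths(ext8_eax):
    """CPUID leaf 0x80000008, EAX: bits 7:0 physical, bits 15:8 virtual."""
    return ext8_eax & 0xFF, (ext8_eax >> 8) & 0xFF

# Invented sample values (hypothetical, for illustration only).
sample = describe_cpu(leaf1_ecx=0x2008, ext_ecx=0x0, ext_edx=0x20000800)
for name, present in sample.items():
    print(f"{name}: {'yes' if present else 'no'}")

physical_bits, virtual_bits = address_widths(0x3024)
print(f"physical address bits: {physical_bits}, virtual: {virtual_bits}")
```

With these made-up inputs the sketch reports long mode, SYSCALL/SYSRET, MONITOR/MWAIT and CMPXCHG16B as present, everything else as absent, and 36 physical address bits against 48 virtual.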