64-Bit Micro-Architecture: IA-64 or x86-64?
Paul Spiteri
BSc (Hons) Software Engineering B.Sc.
Supervisor: Antony Nicol
Second Reader: Rosemary Brown
December 2003
Abstract
A growing number of modern applications are running into the limitations of 32-bit microprocessor
architectures.
This paper examines whether there is a need to move to a 64-bit platform, and compares the two
architectural choices that are currently available to fulfil 64-bit needs, exposing the merits and
weaknesses of each.
An explanation and comparison of RISC & CISC architecture designs examines the advantages of each
and concludes RISC is now the most effective philosophy when coupled with a modern efficient program
compiler that can harness the concurrency potential of super-scalar microprocessors.
The architectures examined and compared are the new Intel IA-64 RISC architecture and AMD’s x86-64
extension to the existing x86 CISC architecture.
Potential problems with existing software backwards-compatibility and compiler dependency are
considered.
The paper concludes that IA-64 is a sensible design which overcomes most issues raised with x86 over
its lifespan and is a worthy successor. However, the x86-64 extension by AMD also addresses some of
the original x86 flaws and modernises the design.
This allows the x86-64 to be currently superior due to its full backwards-compatibility with existing
compiled software, and easy migration support.
1. Introduction
The current 32-bit generation of microprocessors has been in mainstream production since the
launch of the Intel 80386 CPU in 1985. There is a strong argument that the next step up to 64-bit CPUs
is necessary to continue enhancing performance.
Considering 32-bit has been in use for such a long period, and today’s architectures bear little
resemblance to the original 80386 design, this sounds like a sensible suggestion. Bear in mind the 80386
is itself an extension of the original 8086 16-bit architecture.
There are many factors which require considering before the route to a 64-bit platform can be decided.
• Is there an actual need for 64-bit registers?
• Is the current 32-bit addressable memory limitation of 4 GB becoming problematic?
• Should an all new architecture be designed or an existing architecture extended to 64-bit?
The disadvantages of moving to a 64-bit platform are equally important.
This paper will try to answer these questions, and to compare the various approaches to 64-bit
computing from rival CPU manufacturers, specifically the Intel IA-64 architecture and the AMD x86-64
architecture.
Figure 1: Basic Comparison of 32bit & 64bit. The diagram highlights a variety of differences,
including the doubling in size of the internal data paths.
The views of Intel and AMD on 64-bit computing differ considerably. While both believe that there is a
purpose to extending CPU design to 64 bits, they see the demand for it in very different
places, and believe it should be met in different ways.
The Intel approach is aimed solely towards high end enterprise servers, requiring fast processing power
and large quantities of memory for demanding applications such as large scale databases.
(Rattner) claims "It could be the end of the decade" before mainstream desktops need more than 4GB of
memory.
AMD, however, believe there is a demand for the advantages 64-bit brings for workstation and even
desktop computers right now. Their approach is to design a CPU that runs existing software at high
speed, with expansion capabilities to 64 bits. This will allow the 4GB memory barrier to be broken as
well as potentially unlocking extra processing speed in certain circumstances, as described in (Dorian
2002).
According to Tim Sweeney (Lead Developer of Unreal 3-D Engine, Epic Games), there is a genuine use
for 64-bit workstation CPUs today.
“On a daily basis we're running into the Windows 2GB barrier with our next-generation content
development and pre-processing tools.
If cost-effective, backwards-compatible 64-bit CPUs were available today, we'd buy them today. We
need them today.
And our next-generation 3d engine won't just support 64-bit, but will basically REQUIRE it on the
content-authoring side.”
Sweeney believes a 64-bit CPU should be brought to the home sooner rather than later, as his work
requires it. Because Intel disagrees, Sweeney is siding with AMD at this time:
“We tell Intel this all the time, pleading for a cost-effective 64-bit desktop solution. Intel should be
listening to customers and taking the leadership role on the 64-bit desktop transition and not making
ridiculous "end of the decade" statements.”
While 64-bit addressing resolves memory constraint issues, the other advantage brought by 64-bit registers is the
ability to handle a larger dynamic range of numbers. Currently, if a 64-bit value is required, the compiler
pairs two 32-bit registers together to hold it. This has obvious performance drawbacks, as every
operation on the value takes several instructions.
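The cost of pairing registers can be illustrated with a minimal C++ sketch. The names RegisterPair, add64_on_32bit, add64_native and join are hypothetical and chosen purely for illustration; the sketch models what a compiler has to arrange on a 32-bit machine, not the output of any particular compiler.

```cpp
#include <cstdint>

// Hypothetical model: a 64-bit value held in two 32-bit registers.
struct RegisterPair {
    std::uint32_t low;   // lower 32 bits
    std::uint32_t high;  // upper 32 bits
};

// 64-bit addition using only 32-bit operations: add the low halves,
// detect the carry out of the low half, then add the high halves plus carry.
constexpr RegisterPair add64_on_32bit(RegisterPair a, RegisterPair b) {
    const std::uint32_t low = a.low + b.low;              // wraps modulo 2^32
    const std::uint32_t carry = (low < a.low) ? 1u : 0u;  // wrap-around means a carry occurred
    const std::uint32_t high = a.high + b.high + carry;
    return RegisterPair{low, high};
}

// The same addition where a single register is 64 bits wide.
constexpr std::uint64_t add64_native(std::uint64_t a, std::uint64_t b) {
    return a + b;
}

// Joins a register pair back into one 64-bit value so results can be compared.
constexpr std::uint64_t join(RegisterPair p) {
    return (static_cast<std::uint64_t>(p.high) << 32) | p.low;
}
```

The paired version needs two additions and a carry check where the native version needs one addition, which is the drawback described above.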
However, it should be made clear that for most applications 32-bit registers are sufficient to avoid
overflow or underflow.
According to (Stokes 2003), it is mostly the realm of scientific computing that requires 64-bit values,
for simulations and the like. One other use for larger registers is cryptography. As this often involves
multiplication of huge integers, a processor that can handle 64 bits at a time would offer large
performance gains.
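As a rough sketch of why wide multiplication benefits, the following C++ fragment computes the low 64 bits of a product from 32-bit pieces. The function names are hypothetical, and for simplicity each 32-bit piece is held in a 64-bit variable; a real 32-bit machine would use a widening multiply for each partial product.

```cpp
#include <cstdint>

// Low 64 bits of a 64x64-bit product, built from 32-bit pieces as a
// 32-bit processor would have to do it (illustrative model only).
constexpr std::uint64_t mul64_with_32bit_limbs(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t a_lo = a & 0xFFFFFFFFu;
    const std::uint64_t a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu;
    const std::uint64_t b_hi = b >> 32;
    // Three partial products contribute to the low 64 bits of the result;
    // a_hi * b_hi only affects bits 64 and above, so it is not needed here.
    const std::uint64_t lo_lo = a_lo * b_lo;
    const std::uint64_t cross = a_lo * b_hi + a_hi * b_lo;
    return lo_lo + (cross << 32);
}

// The same result with one multiplication on 64-bit registers.
constexpr std::uint64_t mul64_native(std::uint64_t a, std::uint64_t b) {
    return a * b;  // unsigned arithmetic wraps modulo 2^64
}
```

Three multiplications, two additions and a shift are replaced by a single multiplication, and the saving grows with the size of the integers involved.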
Intel’s reluctance to bring 64-bit CPUs to the home user stems from their belief that x86 should not be
given another lease of life by further extension. Their 64-bit architecture is all new, which has positioned
it solely at high-end server use, where new software and operating systems will be less of an
inconvenience to deploy than on the desktop.
AMD, on the other hand, have taken the x86 extension approach (Cleveland 2001). Their chip has been
designed to be fully compatible with existing 32-bit code to avoid the impact of having to immediately
migrate to a new operating system and software – instead, 64-bit support is an optional extra.
Both of these design concepts will be further examined in detail, later in the paper.
To clarify, this 64-bit computer architecture paper is not focused on performance testing. Rather, it is
focused on architecture design, and discusses long-term potentials.
Section 2 identifies critical areas of architecture design that have influenced the two competing designs
examined in detail within section 3. Section 4 considers deployment issues and choices leading towards
conclusions drawn in Section 5.
2. Essential Knowledge
There are two basic requirements for a CPU to be 64-bit. It must be able to address a memory capacity
significantly larger than 4GB (e.g. 1 TB) and have general purpose registers with a 64-bit dynamic
range.
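The memory figures quoted above follow directly from the number of address bits. A short C++ sketch, using the hypothetical helper addressable_bytes and treating GB and TB as binary units (an assumption, as the paper does not say which it means):

```cpp
#include <cstdint>

// Bytes reachable with a given number of address bits.
// Valid for address_bits < 64, since the shift would otherwise overflow.
constexpr std::uint64_t addressable_bytes(unsigned address_bits) {
    return std::uint64_t{1} << address_bits;
}

constexpr std::uint64_t kGiB = std::uint64_t{1} << 30;  // 2^30 bytes
constexpr std::uint64_t kTiB = std::uint64_t{1} << 40;  // 2^40 bytes
```

With these definitions, addressable_bytes(32) equals 4 * kGiB, the current limit, while 40 address bits are already enough for the 1 TB example.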
The distinction between RISC and CISC designs is critical to this paper, as each of the rival
architectures examined uses one of the two.
2.1: RISC vs. CISC
CISC (Complex Instruction Set Computers)
RISC (Reduced Instruction Set Computers)
The simplest way to compare the two architectures is to examine how a relatively simple task is
performed on each, and study the advantages and disadvantages of each.
For this example, the task is to calculate the cube of a given number, 20.
In a high level language such as C++, the statements of code to cube the value of 20 stored in
variable ‘A’ would be:
int A = 20;
A = Cube(A);
2.2: The RISC Approach
RISC processors use only simple instructions that generally can be executed within one clock
cycle. Thus the cube operation would be performed by using the multiply operation twice,
resulting in 20 * 20 * 20 being calculated. The ‘MOVE’ instruction is also used to move data into
the registers.
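In order to perform the exact series of steps described in the C++ code, a compiler would need to
generate five lines of assembly:
MOV A, 20    ; load the value 20 into register A
MOV B, A     ; copy A into register B
MUL B, A     ; B = 20 * 20
MUL B, A     ; B = 400 * 20
MOV A, B     ; store the result, 8000, back in A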
2.3: The CISC Approach
The main goal of a CISC microprocessor is to complete a given task with the smallest number of
actual assembly language instructions. This is achieved by building processor hardware that is
capable of understanding and executing a large number of operations. For this particular example,
the CISC processor comes with an instruction that can calculate the cube of a value.
This allows the compiler to generate code that would, in all probability, look like this:
MOV A, 20    ; store 20 in location A
CUBE A       ; cube the value held in A
‘A’ represents a main memory location, as CISC instructions can usually refer to memory and not
only registers. As you can see, the assembly language of the CISC processor compares very
closely to the original C++ code.
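The two instruction sequences can be traced step by step in a short C++ sketch. The Registers structure and both function names are hypothetical; each statement stands for the assembly instruction named in its comment, and CUBE is the paper's illustrative instruction rather than one from a real instruction set.

```cpp
// Hypothetical two-register machine used to trace both sequences.
struct Registers {
    int a;
    int b;
};

// RISC style: five simple instructions, each doing one small step.
constexpr Registers cube_risc_style(int value) {
    Registers r{0, 0};
    r.a = value;      // MOV A, 20
    r.b = r.a;        // MOV B, A
    r.b = r.b * r.a;  // MUL B, A
    r.b = r.b * r.a;  // MUL B, A
    r.a = r.b;        // MOV A, B
    return r;
}

// CISC style: two instructions, one of which does all the arithmetic.
constexpr int cube_cisc_style(int value) {
    int a = value;  // MOV A, 20
    a = a * a * a;  // CUBE A
    return a;
}
```

Both cube_risc_style(20).a and cube_cisc_style(20) evaluate to 8000; the difference lies in how many instructions are issued and how much work each one represents.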
2.4: RISC / CISC Comparison
At first, the RISC method seems like a much less efficient way of completing the operation. As
there are more instructions to execute, the program file size will be larger to store each opcode.
Also, the compiler has a much more complex task to break the high-level language statement into
the multiple basic operations, rather than simply converting straight to the CISC CUBE instruction.
Debugging the larger RISC code may also be more difficult (Mann 1997) due to the instruction
code being longer and less like the original code.
As each RISC instruction is quick to execute, the total time is likely to be similar to the CISC time
taken even though there are more instructions.
(Tang 1996) points out that the RISC instructions also require fewer transistors
than the complex instructions, leaving more room for general purpose registers etc. As the chips
can be made smaller, this can reduce the per-chip cost dramatically.
RISC has a further advantage in that it can pipeline instructions so that several execute simultaneously,
which can greatly improve performance but is heavily reliant on appropriate code. This greatly
increases the need for an efficient compiler.
The essence of RISC architecture is that it allows the execution of more operations in parallel and
at a higher rate than possible with a CISC architecture employing similar implementation
complexity. It can not only improve parallelism by pipelining, but also make superscalar and
out-of-order execution. (Zhongli Ding)
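The benefit of pipelining can be estimated with a simple idealised model, sketched below in C++. It assumes every stage takes one clock cycle and that no instruction ever has to wait for another, which is exactly the condition the compiler is relied upon to approach; the function names and the figures used are illustrative and are not taken from either architecture.

```cpp
#include <cstdint>

// Without pipelining, each instruction passes through every stage
// before the next instruction starts.
constexpr std::uint64_t cycles_unpipelined(std::uint64_t instructions, std::uint64_t stages) {
    return instructions * stages;
}

// With an ideal pipeline, the first instruction takes `stages` cycles and
// every later instruction completes one cycle after its predecessor.
constexpr std::uint64_t cycles_pipelined(std::uint64_t instructions, std::uint64_t stages) {
    return instructions == 0 ? 0 : stages + (instructions - 1);
}
```

For five instructions on a four-stage design the model gives 20 cycles without pipelining and 8 with it. Dependencies between instructions would add stall cycles to the pipelined figure, which is why the quality of the generated code matters so much.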
3. Architectural Comparison
3.1: The Competitors
There are two 64-bit designs currently in contention, one from each major PC CPU manufacturer
(Intel and AMD).
The Intel architecture, named IA-64, is a clean-slate design conceptualised in the early 1990s.
(Crawford 2000) describes the architecture as a RISC design that can execute multiple instructions
simultaneously. It makes use of VLIW (very long instruction words) for added flexibility, to group
instructions that can be executed in parallel (a characteristic referred to as superscalar).
On the other hand, (Cleveland 2001) describes how AMD is taking a less disruptive approach to
the challenge of 64-bit computing with x86-64. Their design is an extension of the existing x86-32
CISC architecture to overcome maximum addressable memory restrictions, and allow 64-bit
calculations to be performed natively.
Whether one approach is better or not is difficult to judge, as both have obvious advantages and
disadvantages.
Each aspect of the designs will now be examined, to help determine which is superior in differing
scenarios.
3.2: GPR (General Purpose Register) Comparison
As IA-64 is a RISC-based architecture, the number of general purpose registers is large.
Compared with traditional x86 designs, the 128 integer registers and 128 floating point registers are
positively massive. The general purpose registers are 64 bits wide, and the floating point registers
are 82 bits wide. (Turley 2002) / (Huck, Morris 2000).
Figure 2: A simplified diagram of IA-64 internal registers. The diagram shows the general purpose
integer and floating point registers, and their widths. Note the floating point registers are slightly
wider (82-bit), and the Program Counter (PC) is 64-bit, like the general integer registers.
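This architecture gives the programmer or compiler an incredibly rich supply of registers, totalling
328 once the 64 predicate registers and 8 branch registers are counted alongside the 128 integer and
128 floating point registers.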
AMD’s choice of register design was easily decided once they chose to extend the existing x86-32
architecture.
This is comparable to the way the original 8086 register set of eight 16-bit registers was extended to
32 bits for the launch of the 80386 CPU; AMD is implementing their extension to 64-bit in the same way.
The full 32-bit registers were accessed by using different assembly language mnemonics to address
each general purpose register, e.g. AX became EAX. For AMD’s 64-bit extension the new
mnemonic prefix is R, e.g. the AX register becomes RAX for the full 64 bits.
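The relationship between the three mnemonics can be sketched in C++ as three views onto the same storage. GeneralRegister is a hypothetical type for illustration; it models only the fact that AX and EAX name the lower 16 and 32 bits of RAX.

```cpp
#include <cstdint>

// One 64-bit general purpose register with its narrower legacy views.
struct GeneralRegister {
    std::uint64_t bits;

    // RAX: the full 64 bits (x86-64).
    constexpr std::uint64_t rax() const { return bits; }
    // EAX: the lower 32 bits (80386 onwards).
    constexpr std::uint32_t eax() const { return static_cast<std::uint32_t>(bits); }
    // AX: the lower 16 bits (original 8086).
    constexpr std::uint16_t ax() const { return static_cast<std::uint16_t>(bits); }
};
```

For a register holding 0x1122334455667788, rax() returns the whole value, eax() returns 0x55667788 and ax() returns 0x7788, so older code that only uses the narrower names still sees the values it expects.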
(Leibson 2000) details a further augmentation that AMD has implemented to the existing x86
design. While operating in 64-bit mode, the CPU can also make use of an extra eight general
purpose registers, which should help overcome some of the limitations of this ageing architecture.
The new registers double the total GPRs, as the diagram below shows.
Figure 3: A diagram detailing the new/extended register set for x86-64. The general purpose
registers have been doubled in width, and also doubled in quantity. The program counter has also
been doubled in width.
Overall, the first impression when comparing the two register sets is that the IA-64 appears vastly
superior. While the x86-64 has eased the restrictive quantity of registers of its predecessors,
the number of IA-64 registers is still many times larger, which appears very useful and should lead
to substantial performance benefits.
While this is extremely useful during function execution, (Gwennap Oct 1999) points out:
With IA-64’s 128 integer registers, saving and restoring the entire register file takes more than
four times as long as on a standard RISC processor.
Traditional function calls would involve pushing the 128 general purpose registers on to the stack,
and popping on return. This is far from ideal as it is very time consuming.
(Turley 2002) describes how the IA-64 architecture brings with it a method to counter this
problem. The solution is called ‘register frames’.
The registers are split into groups which are individually visible to separate tasks. Each group, or
‘frame’, maps logical register numbers onto different physical registers. Each task’s frame logically
starts at GR32, although the physical register it maps to is unlikely to be GR32.
Parameters can be efficiently passed between functions using this method, by allowing separate
frames to overlap which results in two register names pointing to the same physical register
location.
Figure 4: An example of IA-64 Register Framing. Two tasks have allocated themselves 11
registers each, both starting at different locations yet named GR32 onwards. The frames also overlap
to allow data to be shared.
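The mapping performed by register frames can be sketched in C++ as a simple base-plus-offset translation. This is a simplified model under stated assumptions: the Frame structure, the function names and the chosen base positions (32 and 40) are hypothetical, and the sketch ignores how the real hardware allocates frames and handles running out of physical registers.

```cpp
#include <cstddef>
#include <cstdint>

// Frames are always named from GR32 upwards, whatever physical
// registers they actually occupy.
constexpr std::size_t kFirstFrameRegister = 32;

struct Frame {
    std::size_t physical_base;  // physical register that logical GR32 maps to
    std::size_t size;           // number of registers in the frame
};

// True if the logical register number lies inside the frame.
constexpr bool in_frame(Frame frame, std::size_t logical) {
    return logical >= kFirstFrameRegister && logical - kFirstFrameRegister < frame.size;
}

// Translate a logical register number (GR32, GR33, ...) to a physical index.
constexpr std::size_t physical_index(Frame frame, std::size_t logical) {
    return frame.physical_base + (logical - kFirstFrameRegister);
}

// Two overlapping 11-register frames: the caller writes its GR40 and the
// callee reads the same physical register under the name GR32.
constexpr std::uint64_t pass_parameter(std::uint64_t value) {
    std::uint64_t physical[128] = {};
    const Frame caller{32, 11};  // logical GR32..GR42 -> physical 32..42
    const Frame callee{40, 11};  // logical GR32..GR42 -> physical 40..50
    physical[physical_index(caller, 40)] = value;
    return physical[physical_index(callee, 32)];
}
```

Because the two frames overlap by three physical registers, the value is handed over without any copying, which is the parameter-passing benefit described above.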
Regardless of the large number of registers, eventually the processor will run out. When this
occurs, the traditional approach of pushing registers onto the stack is used. However, this differs from
other architectures as the procedure is fully automated by the processor and does not require handling by
the programmer or compiler while creating the instruction code.
Taken as a whole, one would have to say the IA-64 approach to general purpose registers is
superior. The vast quantity of available registers, and the elegant handling of them, clearly outclasses
x86-64’s implementation. However, there is no doubt x86-64 is still a marked improvement over
x86-32.
3.3: Floating Point Comparison
The IA-64 floating point unit is of course a new design, and while the power is not in question
there do appear to be some interesting quirks in the design.
(Gwennap May 1999) The floating-point instruction set is built around a fused multiply-add
(MAC) construct. Simple addition and multiplication are synthesized, using the constants +0.0 and
+1.0 stored in FR0 and FR1, respectively.
This means there is actually no instruction that simply multiplies or adds two numbers together.
This is due to the FPU being designed to perform MAC Ops (Multiply-Accumulate Operations)
which essentially means two numbers are multiplied together, with the answer added to a third.
The solution to allow ‘standard’ floating point multiplication to be performed is to reset the third
‘adder’ value to zero.
(Sharangpani 2000) also explains how similarly, a floating point add can be performed by setting
the multiplier value to 1, and the ‘adder’ to the 2nd floating-point value to combine.
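How addition and multiplication fall out of a single multiply-add operation can be sketched in C++. The names below are hypothetical, and the model uses an ordinary multiply followed by an add; it therefore illustrates the arrangement of operands only, not the single rounding step of a true fused unit.

```cpp
// Constants the paper describes as held in FR0 and FR1.
constexpr double kFR0 = 0.0;  // +0.0
constexpr double kFR1 = 1.0;  // +1.0

// Model of the multiply-add operation: (a * b) + c.
constexpr double multiply_add(double a, double b, double c) {
    return a * b + c;
}

// Plain multiplication: the 'adder' operand is the constant zero.
constexpr double fp_multiply(double a, double b) {
    return multiply_add(a, b, kFR0);
}

// Plain addition: the multiplier operand is the constant one.
constexpr double fp_add(double a, double b) {
    return multiply_add(a, kFR1, b);
}
```

With these definitions fp_multiply(3.0, 4.0) gives 12.0 and fp_add(3.0, 4.0) gives 7.0, each using the same underlying operation.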
On the other hand, the AMD designers have chosen to modernise the x86 design by choosing to
ignore the legacy x87 floating point unit while updating the architecture to 64-bits. Figure 3 shows
it has not been extended, as programmers are encouraged to use the SSE/SSE2 unit for floating
point operations. This allows much higher performance, using 128-bit wide registers in place of the
80-bit x87 registers. The extra width holds several packed values that are processed together, rather
than a single value of higher precision.
Essentially, both methods employed here are quite similar. The typical disadvantage of the x86
does not apply here as the floating point unit is roughly as modern as IA-64.
AMD have made it quite clear the old method should be completely bypassed and the modern SSE
unit should be used.
I have come to the conclusion that neither of the architectures can claim their method is
significantly superior.
3.4.1: x86 Compatibility – IA-64
The ability to execute existing software written for the highly popular x86 platform is clearly
important when designing a new CPU.
IA-64 does allow for x86 binaries to be executed in an x86 compatibility mode, which essentially
‘emulates’ an x86 CPU through hardware. All of the x86 instructions are supported, including
additional instruction sets such as MMX or SSE, and can be used to run an entire x86-32 operating
system, or just individual applications from within an IA-64 OS.
This compatibility mode actually maps the x86 general purpose registers onto its own IA-64
register set, as shown below.
Figure 5: Diagram showing x86 on IA-64. The standard general purpose registers well known to
x86-32 programmers (EAX, EBX, ECX etc.) map onto the IA-64 general purpose registers starting at
GR8. Only the lower 32 bits are usable.
However, preparing to execute x86 instructions is not a simple task. (Gwennap 99) warns of this:
The switching overhead comes in preparing for the transition. Because of the register overlap, any
shared registers with important data must be explicitly saved to memory before switching modes.
Before calling an x86 routine, IA-64 code must properly set up the x86 segment descriptors, PSR,
and EFLAG registers. This mode-switch overhead makes it impossible to mix x86 and IA-64 code
at the subroutine level.
3.4.2: x86 Compatibility – x86-64
(Zeichick 2003) points out that full binary compatibility with existing x86 software
is probably the greatest asset of AMD x86-64.
This is achieved by using various modes of operation.
The first is ‘legacy mode’, which instructs the CPU to function exactly as a standard x86-32 CPU
would, with full speed compatibility for 32-bit software, and the extra 64-bit registers disabled.
The other mode is ‘64-bit long mode’. This mode is set by the operating system during start-up,
and thus requires a 64-bit operating system to utilise it.
This ‘long mode’ is further split into two sub-modes: ‘64-bit mode’ and ‘compatibility mode’.
This allows individual processes running on the OS to be either 32-bit legacy code or 64-bit code,
with no performance penalty.
Figure 6: Diagram showing how x86-64 allows legacy and 64-bit code to be run simultaneously.
A 64-bit operating system can in fact run legacy 32-bit software in compatibility mode, while also
running 64-bit software, when the CPU has ‘64-bit long mode’ enabled.
Of course, only 64-bit long mode code can take advantage of the extra registers and larger
addressable memory capabilities.
The default pointer size is 64-bit for long mode programs, to ensure they can point to data
anywhere within the maximum addressable memory range. However, the default integer size
remains at 32-bit as the majority of uses do not require such large values. While there should be no
performance loss to use 64-bit integers by default, the waste of memory would be significant
within the internal cache.
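The memory cost of wider default integers is easy to quantify. The C++ sketch below uses a hypothetical helper, footprint_bytes, and an arbitrary element count of 1024 chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes occupied by `count` integers of the given type.
template <typename Integer>
constexpr std::size_t footprint_bytes(std::size_t count) {
    return count * sizeof(Integer);
}
```

For 1024 values, footprint_bytes<std::int32_t>(1024) is 4096 bytes while footprint_bytes<std::int64_t>(1024) is 8192 bytes. The same data therefore occupies twice as much cache when stored as 64-bit integers, even if none of the values needs the extra range.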
The research examined on x86 code compatibility concludes that the advantage here clearly goes to AMD
x86-64. Not only can it execute 32-bit code at a high level of performance, but it can run it
simultaneously alongside full 64-bit applications when under a 64-bit compatible operating
system.
The Intel IA-64 on the other hand struggles to execute x86 code with any serious level of
performance, barely rivalling a Pentium 75 MHz. While this may improve in future revisions of
the design, currently AMD have this area of 64-bit computing under their control.
4. Architectural Summary
4.1 Deployment Issues
The details of each architecture given above show how widely the two approaches to 64-bit
computing differ, with few similarities between them.
As (Hans de Vries 2003) details, the main weakness of the IA-64 architecture is the requirement
for entirely new software to run on it, due to the completely new instruction set etc. The AMD design also
cannot take advantage of its extra features without recompilation of software, but is still arguably
the fastest x86-32 microprocessor architecture created to date.
However, recompiling existing x86 software is made simpler by California-based PathScale
recently (Q4’03) announcing a new version of their compiler suite with native code generation for
x86-64, claiming that this combination achieves the highest performance, by some 40%.
Research is showing companies are unwilling to go through the demanding task of moving software
over to IA-64. When Sun Microsystems CEO Scott McNealy was asked why they chose x86-64
over IA-64 he responded:
We maintained binary compatibility with the entire x86 software base with x86-64. We took the
Xeon Solaris binary and it immediately ran on Opteron (x86-64 Architecture CPU).
Itanium (IA-64 Architecture) would require an entire re-write and recompile and re-certification
of our operating system and then of every application that ran on top of it.
Another deployment issue is cost of production. Theories by (Hans de Vries 2003) & (Leibson
2000) point towards the 64-bit extension element of x86-64 actually being relatively inexpensive
in terms of CPU core size. This is backed up by Marty Seyer, vice president of AMD's
Microprocessor Business Unit, who said:
"I think it will be in the '05 timetable. Late '05," when queried when AMD would cease production
of 32-bit CPUs.
This implies there is little reason to not include 32 & 64-bit support throughout their entire product
line, which can only aid x86-64 take-up by the industry.
However, Froutan (2003) has a differing view on this dual support.
"By building 32-bit support into its 64-bit processors, AMD is actually giving developers less of a
reason to port their applications to 64 bits and, therefore, slowing down the adoption of 64-bit
systems."
I disagree with this argument by Froutan. For AMD to create a 64-bit CPU their most logical path
was to extend their existing 32-bit CPU, the Athlon which is an x86 chip. Judging from recent
market share and financial reports, they do not have the resources to develop or persuade the
industry to adopt an all new architecture platform. To remove the 32-bit compatibility mode would
be suicidal considering the slow uptake of the backwards-incompatible IA-64 during its early
years.
4.2 Compiler Optimisation Importance
For IA-64 it is now the job of the compiler to locate parallelism and take advantage of the
super-scalar architecture. The importance of compilers and parallelism for this architecture is
highlighted by the opening lines of (Bharadwaj, Pierce 2000):
In planning the new architecture, Intel designers wanted to exploit the high level of instruction-
level parallelism (ILP) found in application code. To accomplish this goal, they incorporated a
powerful set of features such as control and data speculation, register framing, and a large
register file. By using these features, the compiler plays a crucial role in achieving the overall
performance of an IA-64 platform.
(Stokes 2003) and (Hans de Vries 2003) both explain that x86-64 does not rely on compiler
optimisations as heavily, mainly due to the inherent x86 handicap of being nonparallel in nature. As
it is a super-scalar design, it does try to find instruction level parallelism in hardware, as well as
reordering instructions. However, it is limited by having no foreknowledge of the entire program,
restricting the scope of the optimisations to the few instructions already in the pipeline.
(Turley 2002) describes how, essentially, the IA-64 and x86-64 are complete opposites in this
area. The IA-64 does not re-order instructions at all, as this is purely the job of the compiler, whereas
the x86-64 has to constantly search the binary stream of opcodes looking for possible ways of
reorganising instructions so as to make the most efficient use of its hardware resources.
All this must be achieved within nanoseconds without affecting the critical path or impacting the
clock frequency.
It is clear from the evidence presented that the IA-64 approach to software optimisation is superior.
Allowing the compiler to create optimisations is obviously more efficient than trying to locate
shortcuts during run-time. This also allows the CPU core to be smaller, as it needs no re-ordering
hardware and the like, which could potentially reduce production costs.
However, the notable disadvantage is that it simply shifts the complexity onto the compiler.
5. Conclusion
The original problems highlighted by this paper such as memory capacity restrictions, can easily
be solved by migrating to a 64-bit platform. However, there is no single solution.
I fully appreciate Intel’s approach to the 64-bit design task. If a 64-bit platform is to be created, it
does seem logical to design an all new architecture using all the latest concepts and developments
to create the fastest microprocessor.
The compiler developments in recent years have helped shift the advantages of CISC architectures
towards RISC, due to RISC being better able to take advantage of super-scalar architecture.
Therefore Intel’s approach does appear ideal in terms of bringing the best theoretical architecture
to the industry.
However, it appears Intel is being too conservative with their forecast of bringing 64-bit CPUs to
the desktop by the end of the decade. There are situations in some desktop applications, and
certainly in workstation scenarios, where 64-bit registers and memory capacity greater than 4GB are
advantageous or even required. This need can only increase in the near future, meaning Intel
risks turning people to AMD’s 64-bit solution if they do not alter their public roadmap and bring
64-bit computing to the desktop ahead of schedule.
(Leibson 2001) describes AMD’s design decision to extend the existing x86 platform to 64 bits
as a logical move, for two reasons.
1. It has worked before (16-bit to 32-bit).
2. They do not have the market share to successfully pioneer an all-new architecture.
It also has clear practical advantages, as existing software that is currently hitting the 4 GB limit can be
recompiled to be immediately faster while no longer suffering this limitation. (Kaplowitz 2003) further
examines the ease with which software can be ported to x86-64.
Also, since it can run 32-bit and 64-bit code concurrently, the migration process is eased
dramatically.
However, the long term future should be considered. Turley believes:
It is inconceivable that Intel cannot extend and enhance IA-64 for another decade or more.
On the other hand, the x86-64 architecture is the result of more than a decade of stretching and
enhancing.
Nonetheless, (McGrath 2000) points out AMD have tried to modernise the x86 architecture in as
many ways as possible. For example, the floating point performance has been brought up to date
with the addition of the SSE/SSE2 FP unit, to replace the antiquated x87 unit. The other complaint
of shortage of registers in x86 has also been partially addressed by doubling the quantity.
Overall, at this time the x86-64 does appear to be an effective answer to the original problems
raised that demand 64-bit solutions. While the IA-64 architecture is fascinating and has clear
advantages over traditional x86, until higher take-up is achieved by lower prices and greater
availability of software, it is potentially destined for failure when compared to the inexpensive, fast
& compatible AMD x86-64.
Of course, if there is one manufacturer who could successfully roll-out a new architecture to the
industry, it is Intel. To discard them would be a considerable error of judgement.
References
Sean Cleveland (2001)
x86-64 Technology White Paper
Advanced Micro Devices Inc.
Daniel Mann
‘Why the x86 CISC beat RISC’
http://www.amd-embedded.com/Benchmarks/whyx86.htm
Linley Gwennap (Oct 1999)
Merced (IA-64) Shows Innovative Design
Microprocessor Report, Volume 13, Number 13
Linley Gwennap (May 1999)
IA-64: A Parallel Instruction Set
Microprocessor Report, Volume 13, Number 7
Peter Song (Jan 1998)
Demystifying EPIC and IA-64
Microdesign Resources
Martin Hopkins (Feb 2000)
A Critical Look at IA-64
Microprocessor Report
Kevin McGrath (Sept 2000)
x86-64: Extending the x86 architecture to 64-bits
Stanford University
Jay Bharadwaj, William Y. Chen, Weihaw Chuang, Gerolf Hoflehner, Kalyan Muthukumar, Jim Pierce
The Intel IA-64 Compiler Code Generator (Sept 2000)
Intel
Harsh Sharangpani, Ken Arora (Oct 2000)
Itanium Processor Micro-Architecture
Intel
John Crawford (Sept 2000)
Introducing the Itanium Processors
Intel
Tim Sweeney - Quote (Feb 2003)
Founder and President, Epic Games
http://www.amd.com/us-en/Weblets/0,,7832_8366_7823_8718%5E8320,00.html
Jerry Huck, Dale Morris, Jonathan Ross, Allan Knies, Hans Mulder, Rumi Zahir (Oct 2000)
Introducing the IA-64 Architecture
Hewlett Packard / Intel
Peter Dorian (Feb 2002)
Making a Sound Choice between 32 and 64 Bits
Meta Group White Paper
Steve Leibson (April 2000)
AMD Drops 64-Bit Hammer On X86
Microprocessor Report
Jon Stokes (Jan 1999)
A Preview of Intel’s IA-64
Arstechnica
Jon Stokes (March 2003)
An Introduction to 64-bit Computing and x86-64
Arstechnica
Yi Gao, Shilang Tang, Zhongli Ding (1996)
Comparison between CISC and RISC
http://doms.uwimona.edu.jm:1104/coursefiles/CS52R/cs52r_lectures/p_29.pdf
Michael Kanellos / Justin Rattner (Intel) (Feb 2003)
Intel takes slow road to 64-bit PC chips
http://zdnet.com.com/2100-1103-985432.html
Alan Zeichick (June 2003)
Extending the x86 Architecture to 64-bits
AMD64 DevSource
Hans de Vries (Sept 2003)
Understanding the detailed Architecture of AMD's 64 bit Core
http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html
Jim Turley (Feb 2002)
64-Bit CPUs: What You Need to Know
http://www.extremetech.com/article2/0,3973,231,00.asp
Robert McMillan (Nov 2003)
Are the Days of 32-Bit Chips Numbered?
http://www.pcworld.com/news/article/0,aid,113516,00.asp
David Kaplowitz (June 2003)
IBM Universal Database Software on the AMD Opteron Processor
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/AMD_IBM_DB2_on_AMD_Opteron_2003-06-17.pdf
Nelson H. F. Beebe
A Selected Bibliography of Publications about Microprocessors
Centre for Scientific Computing, University of Utah
Paul Froutan (June 2003)
Understanding the AMD64 Solution
Microprocessor Report
Scott McNealy - Interview (2003)
Sun Microsystems
http://www.computing.co.uk/News/1151480