IA stands for "Intel Architecture", and IA-64 is the 64-bit Intel architecture. IA-32 and IA-64 are both Intel processor architectures; the number indicates whether the design is 32-bit or 64-bit.
64-Bit Micro-Architecture: IA-64 or x86-64?

Paul Spiteri
BSc (Hons) Software Engineering
Supervisor: Antony Nicol
Second Reader: Rosemary Brown
December 2003

Abstract

More and more modern applications are running into the limits of 32-bit microprocessor architectures. This paper examines whether there is a need to move to a 64-bit platform and compares the two architectural choices currently available to meet 64-bit needs, exposing the merits and weaknesses of each. An explanation and comparison of RISC and CISC architecture designs examines the advantages of each and concludes that RISC is now the more effective philosophy when coupled with a modern, efficient compiler that can harness the concurrency potential of super-scalar microprocessors. The architectures examined and compared are the new Intel IA-64 RISC architecture and AMD’s x86-64 extension to the existing x86 CISC architecture. Potential problems with backwards-compatibility for existing software and with compiler dependency are considered. The paper concludes that IA-64 is a sensible design which overcomes most of the issues raised with x86 over its lifespan and is a worthy successor. However, the x86-64 extension by AMD also addresses some of the original x86 flaws and modernises the design. Its full backwards-compatibility with existing compiled software and its easy migration path make x86-64 the superior choice at present.

1. Introduction

The current 32-bit generation of microprocessors has been in mainstream production since the launch of the Intel 80386 CPU in 1985. There is a strong argument that the step up to 64-bit CPUs is necessary to continue enhancing performance. Considering that 32-bit has been in use for such a long period, and that today’s architectures bear little resemblance to the original 80386 designs, this sounds a sensible suggestion.
Bear in mind that the 80386 is itself an extension of the original 16-bit 8086 architecture. Several factors need to be considered before the route to a 64-bit platform can be decided.

• Is there an actual need for 64-bit registers?
• Is the current 32-bit addressable memory limit of 4 GB becoming problematic?
• Should an all-new architecture be designed, or an existing architecture extended to 64-bit?

The disadvantages of moving to a 64-bit platform are equally important. This paper tries to answer these questions and compares the approaches to 64-bit computing taken by rival CPU manufacturers, specifically the Intel IA-64 architecture and the AMD x86-64 architecture.

The diagram to the left highlights a variety of differences, including the doubling in size of the internal data paths.

Figure 1: Basic Comparison of 32bit & 64bit

The views of Intel and AMD on 64-bit computing differ considerably. While both believe there is a purpose in extending CPU design to 64 bits, they see the demand in very different places and believe it should be met in different ways. The Intel approach is aimed solely at high-end enterprise servers, which require fast processing and large quantities of memory for demanding applications such as large-scale databases. (Rattner) claims "It could be the end of the decade" before mainstream desktops need more than 4GB of memory.

AMD, however, believe there is demand right now for the advantages 64-bit brings to workstations and even desktop computers. Their approach is to design a CPU that runs existing software at high speed, with the capability to expand to 64 bits. This allows the 4GB memory barrier to be broken and can unlock extra processing speed in certain circumstances, as described in (Dorian 2002). According to Tim Sweeney (Lead Developer of Unreal 3-D Engine, Epic Games), there is a genuine use for 64-bit workstation CPUs today.
“On a daily basis we're running into the Windows 2GB barrier with our next-generation content development and pre-processing tools. If cost-effective, backwards-compatible 64-bit CPUs were available today, we'd buy them today. We need them today. And our next-generation 3d engine won't just support 64-bit, but will basically REQUIRE it on the content-authoring side.”

Sweeney believes a 64-bit CPU should be brought to the home sooner rather than later, as his work requires it. As Intel disagree, Sweeney is siding with AMD at this time.

“We tell Intel this all the time, pleading for a cost-effective 64-bit desktop solution. Intel should be listening to customers and taking the leadership role on the 64-bit desktop transition and not making ridiculous "end of the decade" statements.”

While 64-bit resolves memory constraint issues, the other advantage of 64-bit registers is the ability to handle a larger dynamic range of numbers. Currently, if a 64-bit value is required, the compiler pairs two 32-bit registers to hold it, so each operation on the value takes several instructions. This has obvious performance drawbacks. However, it should be made clear that for most applications 32-bit registers are sufficient to avoid overflow or underflow (Stokes 2003). For the most part, only scientific computing requires 64-bit values, for simulations and similar work.

One other use for larger registers is cryptography. As this often involves multiplication of huge integers, a processor that can handle 64 bits at a time would offer large performance gains.

Intel’s reluctance to bring 64-bit CPUs to the home user stems from their belief that x86 should not be given another lease of life by further extension. Their 64-bit architecture is all new, which has positioned it solely for high-end server use, where deploying new software and operating systems is less of an inconvenience than on the desktop. AMD, on the other hand, have taken the x86 extension approach (Cleveland 2001).
Their chip has been designed to be fully compatible with existing 32-bit code, avoiding the need to migrate immediately to a new operating system and software; 64-bit operation is instead an optional extra. Both of these design concepts are examined in further detail later in the paper.

To clarify, this paper on 64-bit computer architecture is not focused on performance testing. Rather, it is focused on architecture design and discusses long-term potential. Section 2 identifies critical areas of architecture design that have influenced the two competing designs, which are examined in detail in Section 3. Section 4 considers deployment issues and choices, leading to the conclusions drawn in Section 5.

2. Essential Knowledge

There are two basic requirements for a CPU to be 64-bit. It must be able to address a memory capacity significantly larger than 4GB (e.g. 1 TB), and it must have general purpose registers with a 64-bit dynamic range. The distinction between RISC and CISC designs is critical to this paper, as each of the rival architectures examined uses one of the two.

2.1: RISC vs. CISC

CISC (Complex Instruction Set Computers)
RISC (Reduced Instruction Set Computers)

The simplest way to compare the two architectures is to examine how a relatively simple task is performed on each, and to study the advantages and disadvantages of each. For this example, the task is to calculate the cube of a given number, 20. In a high-level language such as C++, the code to cube the value of 20 stored in variable ‘A’ would be:

    int A = 20;
    A = Cube(A);

2.2: The RISC Approach

RISC processors use only simple instructions that can generally be executed within one clock cycle. The cube operation would therefore be performed by using the multiply operation twice, so that 20 * 20 * 20 is calculated. The ‘MOVE’ instruction (MOV) is also used to move data into the registers.
In order to perform the exact series of steps described in the C++ code, a compiler would need to generate five lines of assembly:

    MOV A, 20
    MOV B, A
    MUL B, A
    MUL B, A
    MOV A, B

2.3: The CISC Approach

The main goal of a CISC microprocessor is to complete a given task with the smallest number of assembly language instructions. This is achieved by building processor hardware that is capable of understanding and executing a large number of operations. For this particular example, assume the CISC processor comes with an instruction that can calculate the cube of a value. This allows the compiler to generate code that would, in all probability, look like this:

    MOV A, 20
    CUBE A

‘A’ represents a main memory location, as CISC instructions can usually refer to memory and not only to registers. As you can see, the assembly language of the CISC processor corresponds very closely to the original C++ code.

2.4: RISC / CISC Comparison

At first, the RISC method seems a much less efficient way of completing the operation. As there are more instructions to execute, the program file will be larger in order to store each opcode. The compiler also has a much more complex task, as it must break the high-level language statement into multiple basic operations rather than simply converting it straight to the CISC CUBE instruction. Debugging the larger RISC code may also be more difficult (Mann 1997), because the instruction sequence is longer and less like the original code.

However, as each RISC instruction is quick to execute, the total time is likely to be similar to the time taken by the CISC version, even though there are more instructions. (Tang 1996) points out that RISC instructions also require fewer transistors than complex instructions, leaving more room on the chip for general purpose registers and similar resources. As the chips can be made smaller, this can reduce the per-chip cost dramatically.
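The cube example from sections 2.2 and 2.3 can also be sketched in C++ to make the contrast concrete. This is only an illustration: the "registers" are ordinary variables, and the function names `cube_risc_style` and `cube_cisc_style` are hypothetical names chosen for this sketch, not part of any instruction set or library.

```cpp
// Illustrative sketch only. Plain variables stand in for registers A and B,
// and both function names are hypothetical.

// RISC style: a sequence of simple steps, one per instruction in the
// five-line listing (the initial MOV A, 20 is the argument passed in).
constexpr int cube_risc_style(int a) {
    int b = a;   // MOV B, A
    b = b * a;   // MUL B, A   (b now holds a squared)
    b = b * a;   // MUL B, A   (b now holds a cubed)
    a = b;       // MOV A, B
    return a;
}

// CISC style: a single complex operation standing in for the CUBE instruction.
constexpr int cube_cisc_style(int a) {
    return a * a * a;  // CUBE A
}
```

Both functions return the same value for the same input; the difference lies only in how many simple steps are spelled out, which is the trade-off the comparison in this section describes.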
However RISC has a further advantage, as it can pipeline the instructions so that several are in execution at once. This can greatly improve performance but relies heavily on suitable code, which greatly increases the need for an efficient compiler.

The essence of RISC architecture is that it allows the execution of more operations in parallel and at a higher rate than possible with a CISC architecture employing similar implementation complexity. It can not only improve parallelism by pipelining, but also make superscalar and out-of-order execution. (Zhongli Ding)

3. Architectural Comparison

3.1: The Competitors

There are two 64-bit designs currently in contention, one from each major PC CPU manufacturer (Intel and AMD). The Intel architecture, named IA-64, is a clean-slate design conceptualised in the early 1990s. (Crawford 2000) describes the architecture as a RISC design that can execute multiple instructions simultaneously. It makes use of VLIW (very long instruction words) to group instructions that can be executed in parallel, so that several instructions are issued together (a characteristic referred to as superscalar).

On the other hand, (Cleveland 2001) describes how AMD is taking a less disruptive approach to the challenge of 64-bit computing with x86-64. Their design is an extension of the existing x86-32 CISC architecture that overcomes the restrictions on maximum addressable memory and allows 64-bit calculations to be performed natively.

Whether one approach is better than the other is difficult to judge, as both have obvious advantages and disadvantages. Each aspect of the designs will now be examined, to help determine which is superior in differing scenarios.

3.2: GPR (General Purpose Register) Comparison

As IA-64 is a RISC-based architecture, the number of general purpose registers is large. Compared with traditional x86 designs, the 128 integer registers and 128 floating point registers are positively massive. The general purpose registers are 64 bits wide, and the floating point registers are 82 bits wide.
(Turley 2002) / (Huck, Morris 2000).

The diagram to the left shows the general purpose integer and floating point registers, and their widths. Note that the floating point registers are slightly wider (82-bit), and the Program Counter (PC) is 64-bit, like the general integer registers.

Figure 2: A simplified diagram of IA-64 internal registers

This architecture gives the programmer or compiler an incredibly rich supply of registers, totalling 328.

AMD’s choice of register design was easily made once they chose to extend the existing x86-32 architecture. When the original 8086 register set of eight 16-bit registers was extended to 32 bits for the launch of the 80386 CPU, the full 32-bit registers were accessed by using different assembly language mnemonics for each general purpose register, e.g. AX became EAX. AMD is implementing their extension to 64-bit in the same way. For AMD’s 64-bit extension the new mnemonic prefix is R, e.g. the AX register becomes RAX for the full 64 bits.

(Leibson 2000) details a further augmentation that AMD has made to the existing x86 design. While operating in 64-bit mode, the CPU can also make use of an extra eight general purpose registers, which should help overcome some of the limitations of this ageing architecture. The new registers double the total number of GPRs, as the diagram below shows.

This diagram shows that the general purpose registers have been doubled in width and also doubled in quantity. The program counter has also been doubled in width.

Figure 3: A diagram detailing the new/extended register set for x86-64

Overall, the first impression when comparing the two register sets is that the IA-64 appears vastly superior. While the x86-64 has eased the restrictive shortage of registers of its predecessors, the IA-64 still has many times more registers, which appears very useful and should lead to substantial performance benefits.
While this is extremely useful during the execution of a function, Gwennap (Oct 1999) points out:

With IA-64’s 128 integer registers, saving and restoring the entire register file takes more than four times as long as on a standard RISC processor.

Traditional function calls would involve pushing the 128 general purpose registers onto the stack and popping them on return. This is far from ideal, as it is very time consuming. (Turley 2002) describes how the IA-64 architecture brings with it a method to counter this problem. The solution is called ‘register frames’. The registers are split into groups which are individually visible to separate tasks. Each group, or ‘frame’, maps logical register numbers onto different physical registers. Each task’s frame logically starts at GR32, although the physical register behind that name is unlikely to be GR32. Parameters can be passed efficiently between functions using this method, by allowing separate frames to overlap so that two register names point to the same physical register.

The diagram to the left shows an example of register framing, where two tasks have allocated themselves 11 registers each, both starting at different locations yet named GR32 onwards. They also overlap to allow data to be shared.

Figure 4: An example of IA-64 Register Framing

Regardless of the large number of registers, eventually the processor will run out. When this occurs, the traditional method of pushing registers onto the stack is used. However, this differs from other architectures in that the procedure is fully automated by the processor and does not require handling by the programmer or compiler when creating the instruction code.

Taken as a whole, one would have to say the IA-64 approach to general purpose registers is superior. The vast quantity of available registers, and the elegant handling of them, is simply better than x86-64’s implementation. However, there is no doubt that x86-64 is still a marked improvement over x86-32.
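The register framing idea described in this section can be sketched as a small C++ model. It is a simplification under stated assumptions, not a description of the real hardware: the `Frame` type, the `physical_index` and `callee_frame` functions, and the choice of three shared registers are all hypothetical and exist only to show how two frames that each start at "GR32" can map onto different, overlapping physical registers.

```cpp
// Simplified, hypothetical model of register framing. It ignores what
// happens when the physical registers run out (the automatic spill to
// the stack described in the text).

constexpr int kFirstFramedRegister = 32;  // every frame is named from GR32 upwards

struct Frame {
    int base;  // physical index that this frame's GR32 maps onto
    int size;  // number of registers the task has allocated
};

// Translate a logical register number (32 for GR32, 33 for GR33, ...)
// into a physical register index. Returns -1 if the logical register
// lies outside the frame.
constexpr int physical_index(Frame f, int logical) {
    return (logical < kFirstFramedRegister ||
            logical >= kFirstFramedRegister + f.size)
               ? -1
               : f.base + (logical - kFirstFramedRegister);
}

// Build a callee frame whose first `shared` registers overlap the
// caller's last `shared` registers, so parameters need not be copied.
constexpr Frame callee_frame(Frame caller, int shared, int size) {
    return Frame{caller.base + caller.size - shared, size};
}
```

With a caller frame of `Frame{32, 11}` and a callee built by `callee_frame(caller, 3, 11)`, the caller's GR40 and the callee's GR32 resolve to the same physical index, which is how a value written by one task becomes visible to the other without being pushed onto the stack.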
3.3: Floating Point Comparison

The IA-64 floating point unit is of course a new design, and while its power is not in question, there do appear to be some interesting quirks in the design. (Gwennap May 1999)

The floating-point instruction set is built around a fused multiply-add (MAC) construct. Simple addition and multiplication are synthesized, using the constants +0.0 and +1.0 stored in FR0 and FR1, respectively.

This means there is actually no dedicated instruction to multiply or add two numbers together. This is because the FPU is designed to perform MAC Ops (Multiply-Accumulate Operations), in which two numbers are multiplied together and the result is added to a third. To perform a ‘standard’ floating point multiplication, the third ‘adder’ value is set to zero. (Sharangpani 2000) also explains how, similarly, a floating point add can be performed by setting the multiplier value to 1 and the ‘adder’ to the second floating-point value to be combined.

On the other hand, the AMD designers have chosen to modernise the x86 design by ignoring the legacy x87 floating point unit while updating the architecture to 64 bits. Figure 3 shows it has not been extended, as programmers are encouraged to use the SSE/SSE2 unit for floating point operations. This allows much higher performance and uses 128-bit wide registers in place of the 80-bit x87 registers, although each floating point value held in an SSE2 register is at most a 64-bit double.

Essentially, both methods employed here are quite similar. The typical disadvantage of the x86 does not apply here, as the floating point unit is roughly as modern as that of IA-64. AMD have made it quite clear that the old method should be completely bypassed and the modern SSE unit used instead. I have come to the conclusion that neither architecture can claim its method is significantly superior.

3.4.1: x86 Compatibility – IA-64

The ability to execute existing software written for the highly popular x86 platform is clearly important when designing a new CPU.
IA-64 does allow x86 binaries to be executed in an x86 compatibility mode, which essentially ‘emulates’ an x86 CPU in hardware. All of the x86 instructions are supported, including additional instruction sets such as MMX or SSE, and the mode can be used to run an entire x86-32 operating system or just individual applications from within an IA-64 OS. This compatibility mode maps the x86 general purpose registers onto the IA-64 register set, as shown below.

This diagram shows how the standard general purpose registers well known to x86-32 programmers (EAX, EBX, ECX etc.) map onto the IA-64 general purpose registers starting at GR8. Only the lower 32 bits are usable.

Figure 5: Diagram showing x86 on IA-64

However, preparing to execute x86 instructions is not a simple task. (Gwennap 1999) warns of this:

The switching overhead comes in preparing for the transition. Because of the register overlap, any shared registers with important data must be explicitly saved to memory before switching modes. Before calling an x86 routine, IA-64 code must properly set up the x86 segment descriptors, PSR, and EFLAG registers. This mode-switch overhead makes it impossible to mix x86 and IA-64 code at the subroutine level.

3.4.2: x86 Compatibility – x86-64

(Zeichick 2003) points out that the full binary compatibility of AMD x86-64 with existing x86 software is probably its greatest asset. This is achieved by using various modes of operation. The first is ‘legacy mode’, which instructs the CPU to function exactly as a standard x86-32 CPU would, with full-speed compatibility for 32-bit software and the extra 64-bit registers disabled. The other mode is ‘64-bit long mode’. This mode is set by the operating system during start-up, and thus requires a 64-bit operating system to make use of it. This ‘long mode’ is further split into two sub-modes: ‘64-bit mode’ and ‘compatibility mode’.
This allows individual processes running on the OS to be either legacy 32-bit code or 64-bit code, with no performance penalty.

The diagram to the left shows how a 64-bit operating system can run legacy 32-bit software in compatibility mode while also running 64-bit software, when the CPU has ‘64-bit long mode’ enabled.

Figure 6: Diagram showing how x86-64 allows legacy and 64-bit code to be run simultaneously.

Of course, only code running in 64-bit mode can take advantage of the extra registers and the larger addressable memory. The default pointer size is 64-bit for such programs, to ensure they can point to data anywhere within the maximum addressable memory range. However, the default integer size remains 32-bit, as the majority of uses do not require such large values. While there should be no performance loss from using 64-bit integers by default, the waste of space within the internal cache would be significant.

All research on x86 code compatibility concludes that the advantage here clearly goes to AMD x86-64. Not only can it execute 32-bit code at a high level of performance, but it can run it simultaneously alongside full 64-bit applications under a 64-bit compatible operating system. The Intel IA-64, on the other hand, struggles to execute x86 code with any serious level of performance, barely rivalling a 75 MHz Pentium. While this may improve in future revisions of the design, AMD currently have this area of 64-bit computing under their control.

4. Architectural Summary

4.1 Deployment Issues

The details of each architecture given above show how widely the two approaches to 64-bit computing differ, with few similarities between them. As (Hans de Vries 2003) details, the main weakness of the IA-64 architecture is the requirement for entirely new software to run on it, due to the completely new instruction set etc.
The AMD design also cannot take advantage of its extra features without recompilation of software, but it is still arguably the fastest x86-32 microprocessor architecture created to date. Moreover, recompiling existing x86 software has become simpler, as California-based PathScale recently (Q4’03) announced a new version of their compiler suite with native code generation for x86-64, claiming that this combination achieves the highest performance, by some 40%.

Research is showing that companies are unwilling to go through the demanding task of moving software over to IA-64. When Sun Microsystems CEO Scott McNealy was asked why they chose x86-64 over IA-64, he responded:

“We maintained binary compatibility with the entire x86 software base with x86-64. We took the Xeon Solaris binary and it immediately ran on Opteron (x86-64 Architecture CPU). Itanium (IA-64 Architecture) would require an entire re-write and recompile and re-certification of our operating system and then of every application that ran on top of it.”

Another deployment issue is the cost of production. Theories by (Hans de Vries 2003) & (Leibson 2000) point towards the 64-bit extension element of x86-64 being relatively inexpensive in terms of CPU core size. This is backed up by Marty Seyer, vice president of AMD's Microprocessor Business Unit, who said "I think it will be in the '05 timetable. Late '05," when asked when AMD would cease production of 32-bit CPUs. This implies there is little reason not to include both 32-bit and 64-bit support throughout their entire product line, which can only aid the take-up of x86-64 by the industry.

However, Froutan (2003) has a differing view on this dual support.

"By building 32-bit support into its 64-bit processors, AMD is actually giving developers less of a reason to port their applications to 64 bits and, therefore, slowing down the adoption of 64-bit systems."

I disagree with this argument by Froutan.
For AMD to create a 64-bit CPU, their most logical path was to extend their existing 32-bit CPU, the Athlon, which is an x86 chip. Judging from recent market share and financial reports, they do not have the resources to develop an all-new architecture or to persuade the industry to adopt one. To remove the 32-bit compatibility mode would be suicidal, considering the slow uptake of the backwards-incompatible IA-64 during its early years.

4.2 Compiler Optimisation Importance

For IA-64 it is now the job of the compiler to locate parallelism and take advantage of the super-scalar architecture. The importance of compilers and parallelism for this architecture is highlighted by the opening lines of (Bharadwaj, Pierce 2000).

In planning the new architecture, Intel designers wanted to exploit the high level of instruction-level parallelism (ILP) found in application code. To accomplish this goal, they incorporated a powerful set of features such as control and data speculation, register framing, and a large register file. By using these features, the compiler plays a crucial role in achieving the overall performance of an IA-64 platform.

(Stokes 2003) and (Hans de Vries 2003) both explain that x86-64 does not rely on compiler optimisations as heavily, mainly due to the inherent x86 handicap of being nonparallel in nature. As it is a super-scalar design, it does try to find instruction-level parallelism in hardware, as well as reordering instructions. However, it has no foreknowledge of the entire program, which limits the scope of its optimisations to the few instructions already in the pipeline. (Turley 2002) describes how, essentially, the IA-64 and x86-64 are complete opposites in this area.
The IA-64 does not re-order instructions at all, as this is purely the job of the compiler, whereas the x86-64 has to search the binary stream of opcodes constantly, looking for ways of reorganising instructions to make the most efficient use of its hardware resources. All this must be achieved within nanoseconds, without affecting the critical path or impacting the clock frequency.

It is clear from the evidence presented that the IA-64 approach to software optimisation is superior. Allowing the compiler to create optimisations is more efficient than trying to locate shortcuts at run-time, because the compiler can analyse the whole program without time pressure. This also allows the CPU core to be smaller, as it has no re-ordering hardware, which could potentially reduce production costs. However, the notable disadvantage is that it simply shifts the complexity onto the compiler.

5. Conclusion

The original problems highlighted by this paper, such as memory capacity restrictions, can easily be solved by migrating to a 64-bit platform. However, there is no single solution.

I fully appreciate Intel’s approach to the 64-bit design task. If a 64-bit platform is to be created, it does seem logical to design an all-new architecture using all the latest concepts and developments to create the fastest microprocessor. Compiler developments in recent years have helped shift the advantage from CISC architectures towards RISC, as RISC is better able to exploit a super-scalar architecture. Intel’s approach therefore does appear ideal in terms of bringing the best theoretical architecture to the industry.

However, it appears Intel is being too conservative with their forecast of bringing 64-bit CPUs to the desktop by the end of the decade. There are situations in some desktop applications, and certainly in workstation scenarios, where 64-bit registers and memory capacity greater than 4GB are advantageous or even required.
This need can only increase in the near future, meaning Intel risks turning people to AMD’s 64-bit solution if they do not alter their public roadmap and bring 64-bit computing to the desktop ahead of schedule.

(Leibson 2001) describes how AMD’s decision to extend the existing x86 platform to 64 bits was a logical move for two reasons.

1. It has worked before (16-bit to 32-bit).
2. They do not have the market share to successfully pioneer an all-new architecture.

It also has direct advantages, as existing software that is currently hitting the 4 GB limit can be recompiled and be immediately faster while no longer suffering this limitation. (Kaplowitz 2003) further examines the ease with which software can be ported to x86-64. Also, since it can run 32-bit and 64-bit code concurrently, the migration process is eased dramatically.

However, the long-term future should be considered. Turley believes:

It is inconceivable that Intel cannot extend and enhance IA-64 for another decade or more. On the other hand, the x86-64 architecture is the result of more than a decade of stretching and enhancing.

Nonetheless, (McGrath 2000) points out that AMD have tried to modernise the x86 architecture in as many ways as possible. For example, floating point performance has been brought up to date with the addition of the SSE/SSE2 FP unit, to replace the antiquated x87 unit. The other complaint, the shortage of registers in x86, has also been partially addressed by doubling their number.

Overall, at this time the x86-64 does appear to be an effective answer to the original problems raised that demand 64-bit solutions. While the IA-64 architecture is fascinating and has clear advantages over traditional x86, until higher take-up is achieved through lower prices and greater availability of software, it is potentially destined for failure when compared to the inexpensive, fast and compatible AMD x86-64.
Of course, if there is one manufacturer who could successfully roll out a new architecture to the industry, it is Intel. To discard them would be a considerable error of judgement.

References

Sean Cleveland (2001) x86-64 Technology White Paper. Advanced Micro Devices Inc.

Daniel Mann, ‘Why the x86 CISC beat RISC’
http://www.amd-embedded.com/Benchmarks/whyx86.htm

Linley Gwennap (Oct 1999) Merced (IA-64) Shows Innovative Design. Microprocessor Report, Volume 13, Number 13

Linley Gwennap (May 1999) IA-64: A Parallel Instruction Set. Microprocessor Report, Volume 13, Number 7

Peter Song (Jan 1998) Demystifying EPIC and IA-64. Microdesign Resources

Martin Hopkins (Feb 2000) A Critical Look at IA-64. Microprocessor Report

Kevin McGrath (Sept 2000) x86-64: Extending the x86 architecture to 64-bits. Stanford University

Jay Bharadwaj, William Y. Chen, Weihaw Chuang, Gerolf Hoflehner, Kalyan Muthukumar, Jim Pierce (Sept 2000) The Intel IA-64 Compiler Code Generator. Intel

Harsh Sharangpani, Ken Arora (Oct 2000) Itanium Processor Micro-Architecture. Intel

John Crawford (Sept 2000) Introducing the Itanium Processors. Intel

Tim Sweeney - Quote (Feb 2003) Founder and President, Epic Games
http://www.amd.com/us-en/Weblets/0,,7832_8366_7823_8718%5E8320,00.html

Jerry Huck, Dale Morris, Jonathan Ross, Allan Knies, Hans Mulder, Rumi Zahir (Oct 2000) Introducing the IA-64 Architecture. Hewlett Packard / Intel

Peter Dorian (Feb 2002) Making a Sound Choice between 32 and 64 Bits. Meta Group White Paper

Steve Leibson (April 2000) AMD Drops 64-Bit Hammer On X86. Microprocessor Report

Jon Stokes (Jan 1999) A Preview of Intel’s IA-64. Arstechnica

Jon Stokes (March 2003) An Introduction to 64-bit Computing and x86-64. Arstechnica

Yi Gao, Shilang Tang, Zhongli Ding (1996) Comparison between CISC and RISC
http://doms.uwimona.edu.jm:1104/coursefiles/CS52R/cs52r_lectures/p_29.pdf

Michael Kanellos / Justin Rattner (Intel) (Feb 2003) Intel takes slow road to 64-bit PC chips
http://zdnet.com.com/2100-1103-985432.html

Alan Zeichick (June 2003) Extending the x86 Architecture to 64-bits. AMD64 DevSource

Hans de Vries (Sept 2003) Understanding the detailed Architecture of AMD's 64 bit Core
http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html

Jim Turley (Feb 2002) 64-Bit CPUs: What You Need to Know
http://www.extremetech.com/article2/0,3973,231,00.asp

Robert McMillan (Nov 2003) Are the Days of 32-Bit Chips Numbered?
http://www.pcworld.com/news/article/0,aid,113516,00.asp

David Kaplowitz (June 2003) IBM Universal Database Software on the AMD Opteron Processor
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/AMD_IBM_DB2_on_AMD_Opteron_2003-06-17.pdf

Nelson H. F. Beebe, A Selected Bibliography of Publications about Microprocessors. Centre for Scientific Computing, University of Utah

Paul Froutan (June 2003) Understanding the AMD64 Solution. Microprocessor Report

Scott McNealy - Interview (2003) Sun Microsystems
http://www.computing.co.uk/News/1151480