# hw1 by ashrafp

```
Summer CDA5155 Homework 1
Due: May 28th, 2010, 11:59pm

You are not allowed to give or receive help in completing this assignment. Submit the PDF
version of your submission on the e-Learning website before the deadline. Please include the
following sentence in bold at the top of your submission: “I have neither given nor received any
unauthorized aid on this assignment”.

Total Points: 70 pts

1. [10 points] Using the following table, answer the questions below:

Chip                   Num of Cores   Memory performance   Processor performance
Athlon 64 X2 4800+     2              3423                 20178
Pentium EE 840         2              3228                 18893
Pentium D 820          2              3000                 15220
Athlon 64 X2 3800+     2              2941                 17129
Pentium 4              1              2731                  7621
Athlon 64 3000+        1              2953                  7628
Pentium 4 570          1              3501                 11210
Processor X            1              7000                  5000

a. Create a table similar to the given table, except express the results as
normalized to the Pentium 4 for both memory performance and processor
performance.

b. Calculate the arithmetic mean of the performance of each processor using
both the original performance and your normalized performance in part a).

c. Given the answer from part b), are there any conflicting conclusions you can
make?

2. [15 points] Your company’s internal studies show that a single-core system is
sufficient for the demand on your processing power. You are exploring, however,
whether you could save power by using two cores.

a. Assume that your application is 90% parallelizable. By how much could you
decrease the frequency and get the same performance?

b. Assume that the voltage may be decreased linearly with the frequency. Using
the equation in Section 1.5, how much dynamic power would the dual-core
system require as compared to the single-core system?

c. Now assume that the voltage may not decrease below 30% of the original
voltage. This voltage is referred to as the “voltage floor,” and any voltage
lower than that will lose the state. Using the equation in Section 1.5, how
much dynamic power would the dual-core system from part (a) require
compared to the single-core system when taking into account the voltage floor?

3. [10 points] You are designing a 32-bit instruction-set architecture which needs to
support 100 opcodes, three source operands and two destination operands. All the
source and destination operands are registers. Moreover, all the operands should be
able to access all the registers. What is the maximum size of the register file that this
architecture can use (show your computations)?

4. [15 points] In the load-store architecture of MIPS, operands of arithmetic and logical
instructions must come from registers. For a typical integer program, the instruction
distribution and CPI of 4 groups are given in the following table.

Type      Frequency   CPI
ALU       50%         1
Store     15%         2
Branch    10%         4

a. Calculate the average CPI of the integer program.

b. Now, assume that a set of new memory-register arithmetic and logical
instructions is added to the ISA. Each memory-register ALU instruction
combines one Load and one original ALU instruction. It takes 4
cycles to execute this new type of instruction. Assume 60% of the load
instructions can be combined for the program; calculate the new CPI of the
integer program.

c. Assume the modification increases the overall cycle time by 5%. Is
this modification really worthwhile?

5. [20 points] Assume that values A, B, C and D reside in memory. Also assume that
instruction operation codes are represented in 8 bits, memory addresses are 64 bits
and register addresses are 8 bits. Assume all data values are 32 bits, and the instruction
lengths are given in the table below.

a. Write the code sequence for D=A+B*(A+C) for the following instruction set
architectures: 1) Stack; 2) Accumulator; 3) Register (Register-memory); 4)
Register (Load-Store). (You can refer to the class slides, or Figures B.1-B.2 on page
B-4 of Appendix B.)

b. Compute the total number of instructions and the code size for each sequence you wrote.

c. Compute how many bytes are transferred to or from memory in executing
the code sequences, including instruction fetches, data reads, and data writes.

ISA                        Instruction Length (bits)
Stack                      8 or 72
Accumulator                72
Register-memory            32 or 80 or 88