Docstoc

Floating-point representation

					Representation of real numbers
      Fixed-point representation

• Numbers are converted to binary fractions.
• The binary point is fixed between two bits (say,
  between bit 2 and bit 3).


       0   0   0   0   0 . 0   1   1

       3/8 = 0.011(2)
                        Example

       7 1/4 = 111.01(2)

       0   0   1   1   1 . 0   1   0


Using sign-and-magnitude representation,
largest number = 01111111 (2) = 15.875 (10)
smallest number = 11111111 (2) = -15.875 (10)
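
The 8-bit layout above can be sketched in Python. This is a minimal sketch, assuming 1 sign bit, 4 integer bits and 3 fraction bits in sign-and-magnitude form; the names `to_fixed` and `from_fixed` are made up for illustration.

```python
FRAC_BITS = 3   # bits to the right of the binary point (bits 2..0)
TOTAL_BITS = 8  # assumed layout: 1 sign bit + 4 integer bits + 3 fraction bits


def to_fixed(value):
    """Encode value as an 8-bit sign-and-magnitude fixed-point string."""
    sign = "1" if value < 0 else "0"
    # Scaling by 2**FRAC_BITS moves the binary point to the right-hand end;
    # int() then drops any fraction bits that do not fit.
    magnitude = int(abs(value) * 2 ** FRAC_BITS)
    if magnitude >= 2 ** (TOTAL_BITS - 1):
        raise OverflowError("magnitude does not fit in 7 bits")
    return sign + format(magnitude, "07b")


def from_fixed(bits):
    """Decode an 8-bit sign-and-magnitude fixed-point string."""
    magnitude = int(bits[1:], 2) / 2 ** FRAC_BITS
    return -magnitude if bits[0] == "1" else magnitude


print(to_fixed(3 / 8))         # 00000011
print(to_fixed(7.25))          # 00111010
print(from_fixed("01111111"))  # 15.875
print(from_fixed("11111111"))  # -15.875
```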
     Floating-point representation

• Fixed-point representation is not sufficient
  for scientific calculations, which need to
  accommodate both very large integers and
  very small fractions.
• In floating-point representation, the position
  of the binary point is variable, and the binary
  point is said to float.
            Important step
• numbers must be converted (normalized) to
  standard form

Example
480000 = 0.48 x 1000000 = 0.48 x 10^6
0.0007 = 0.7 x 0.001    = 0.7 x 10^-3


               Binary number (standard form)
11010 = 0.11010 x 2^5
0.000101 = 0.101 x 2^-3
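
Python's `math.frexp` happens to return the same standard form (a mantissa between 0.5 and 1, i.e. 0.1xxx in binary, and a power of 2), so the two binary examples can be checked with it. This is a minimal sketch; `normalize` and `fraction_bits` are illustrative names, not part of the original notes.

```python
import math


def normalize(value):
    """Return (mantissa, exponent) with value == mantissa * 2**exponent.

    math.frexp keeps the mantissa in [0.5, 1), which is 0.1xxx in binary.
    """
    return math.frexp(value)


def fraction_bits(mantissa, n):
    """First n bits after the binary point of a mantissa in [0, 1)."""
    return format(int(mantissa * 2 ** n), "0{}b".format(n))


m, e = normalize(0b11010)        # 11010 in binary is 26
print(fraction_bits(m, 5), e)    # 11010 5

m, e = normalize(0.078125)       # 0.000101 in binary
print(fraction_bits(m, 3), e)    # 101 -3
```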

A real number is composed of a mantissa and an exponent.

            0.101 x 2^-3      (mantissa: 0.101, exponent: -3)
                      Example
Assume a 16-bit computer (8 bits for the mantissa, 8 bits for the exponent).

   Using sign-and-magnitude, 0.11010 x 2^5 is stored as

           mantissa               exponent
       0  1 1 0 1 0 0 0       0  0 0 0 0 1 0 1
           Further examples
0.000111(2) = 0.111(2) x 2^-3      (written in binary: 0.111 x 10^-11)

       0  1 1 1 0 0 0 0       1  0 0 0 0 0 1 1

-101100(2) = -0.1011(2) x 2^6      (written in binary: -0.1011 x 10^110)

       1  1 0 1 1 0 0 0       0  0 0 0 0 1 1 0
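
The three stored patterns above can be reproduced with a short Python sketch. It assumes the layout described in the example (8-bit mantissa and 8-bit exponent, each with a leading sign bit followed by 7 magnitude bits); `sign_magnitude` and `encode_sign_magnitude` are hypothetical helper names.

```python
import math


def sign_magnitude(n, width):
    """Sign bit followed by (width - 1) magnitude bits."""
    if abs(n) >= 2 ** (width - 1):
        raise OverflowError("value does not fit")
    return ("1" if n < 0 else "0") + format(abs(n), "0{}b".format(width - 1))


def encode_sign_magnitude(value):
    """16-bit string: 8-bit mantissa field, then 8-bit exponent field."""
    mantissa, exponent = math.frexp(value)   # 0.5 <= |mantissa| < 1
    # Keep 7 fraction bits of the mantissa; further bits are truncated.
    scaled = int(abs(mantissa) * 2 ** 7)
    if mantissa < 0:
        scaled = -scaled
    return sign_magnitude(scaled, 8) + sign_magnitude(exponent, 8)


print(encode_sign_magnitude(0b11010))    # 0110100000000101
print(encode_sign_magnitude(0.109375))   # 0111000010000011  (0.000111 in binary)
print(encode_sign_magnitude(-0b101100))  # 1101100000000110
```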
       Another representation
Suppose a 16-bit binary code is used to
  represent a floating-point number. Let us use
  the leftmost bit to indicate the sign of the
  number, the next 8 bits to represent the
  mantissa, and the rightmost 7 bits to code
  the exponent in 2’s complement.
                 Example
Represent -10.375(10).
-1010.011(2) = -0.1010011(2) x 2^4
Therefore, the floating-point representation of
  -10.375 is

1101001100000100     (1 | 10100110 | 0000100: sign, mantissa, exponent)
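
A minimal Python sketch of this second format follows, assuming the field order given above (1 sign bit, 8 mantissa bits, 7 exponent bits in 2's complement). The function names are illustrative, and extra mantissa bits are simply truncated.

```python
import math


def encode_twos_exponent(value):
    """1 sign bit, 8 mantissa bits, 7 exponent bits in two's complement."""
    mantissa, exponent = math.frexp(abs(value))    # 0.5 <= mantissa < 1
    if not -64 <= exponent <= 63:
        raise OverflowError("exponent does not fit in 7 bits")
    sign = "1" if value < 0 else "0"
    mantissa_bits = format(int(mantissa * 2 ** 8), "08b")  # truncates extra bits
    exponent_bits = format(exponent & 0b1111111, "07b")    # two's complement
    return sign + mantissa_bits + exponent_bits


def decode_twos_exponent(bits):
    """Inverse of encode_twos_exponent for a 16-bit string."""
    mantissa = int(bits[1:9], 2) / 2 ** 8
    exponent = int(bits[9:], 2)
    if exponent >= 64:             # top exponent bit set: negative exponent
        exponent -= 128
    value = mantissa * 2.0 ** exponent
    return -value if bits[0] == "1" else value


print(encode_twos_exponent(-10.375))             # 1101001100000100
print(decode_twos_exponent("1101001100000100"))  # -10.375
```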
    Fixed-point representation
• Advantage
  – accuracy is high (no bits are spent on an exponent)
  – calculation can be done faster (less complicated
    circuitry)
• Disadvantage
  – range is smaller
  Floating-point representation
• Advantage
  – range is wider
  – as the number of bits in the mantissa increases, the
    precision increases
• Disadvantage
  – calculation is slow (more complicated circuitry)
           Arithmetical Errors
•   Truncation error
•   Rounding error
•   Overflow error
•   Underflow error
         Truncation error
In the truncation scheme, any significant
  bit that cannot be accommodated in the
  bits for the mantissa is ignored. For
  example, with an 8-bit mantissa,
  0.101100011 is represented by 0.10110001;
  the final 0.000000001 is ignored.
  This is called a truncation error.
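
The size of the truncation error in this example can be checked with exact fractions. This is a minimal sketch, assuming an 8-bit mantissa; `truncate_mantissa` is an illustrative name.

```python
from fractions import Fraction


def truncate_mantissa(bits, width=8):
    """Keep the first `width` mantissa bits; return (kept, error).

    `bits` holds the digits after the binary point, so "101100011"
    stands for 0.101100011 in binary.
    """
    kept = bits[:width]
    exact = Fraction(int(bits, 2), 2 ** len(bits))
    stored = Fraction(int(kept, 2), 2 ** len(kept))
    return kept, exact - stored


kept, error = truncate_mantissa("101100011")
print(kept)    # 10110001
print(error)   # 1/512, which is 0.000000001 in binary
```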
              Rounding error
In the rounding scheme, if the bits that cannot be
   accommodated correspond to a value less than
   half of the place value of the last bit used to
   represent the mantissa, they are ignored. Otherwise,
   the place value of the last bit used to represent
   the mantissa is added to the mantissa.
For example, the former example 0.101100011 is
   represented by 0.10110010.
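
The rounding rule can be sketched in Python as follows, again assuming an 8-bit mantissa. The dropped bits are worth at least half of the last kept place exactly when the first dropped bit is 1; `round_mantissa` is an illustrative name.

```python
def round_mantissa(bits, width=8):
    """Round the digits after the binary point to `width` bits."""
    bits = bits.ljust(width, "0")        # pad short inputs on the right
    kept = int(bits[:width], 2)
    if bits[width:].startswith("1"):     # dropped part >= half of last place
        kept += 1
    # A carry out of the top bit (0.11111111 rounding up to 1.0) would need
    # renormalizing; this sketch only reports it.
    if kept >= 2 ** width:
        raise OverflowError("mantissa carry: renormalize and adjust exponent")
    return format(kept, "0{}b".format(width))


print(round_mantissa("101100011"))  # 10110010
print(round_mantissa("101100010"))  # 10110001
```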
             Overflow error
• data is too large to be stored

Example
In 16-bit 2’s complement,
any number greater than 32767
causes overflow
            Underflow error
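• data is too small to be stored

Example
In 16-bit 2’s complement,
any number less than -32768
causes underflow
(for floating-point numbers, underflow usually means a nonzero
value too close to zero to be represented)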
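
Python integers never overflow, so the 16-bit limits have to be checked explicitly. The sketch below uses the terms as defined in these notes; `check_int16` and `wrap_int16` are illustrative names, and the wrap-around behaviour is an assumption about hardware that does not detect overflow.

```python
INT16_MIN = -2 ** 15      # -32768
INT16_MAX = 2 ** 15 - 1   # 32767


def check_int16(n):
    """Classify n against the 16-bit two's complement range."""
    if n > INT16_MAX:
        return "overflow"
    if n < INT16_MIN:
        return "underflow"
    return "ok"


def wrap_int16(n):
    """Value kept by 16-bit arithmetic that silently wraps around."""
    return (n + 2 ** 15) % 2 ** 16 - 2 ** 15


print(check_int16(32767 + 1))    # overflow
print(check_int16(-32768 - 1))   # underflow
print(wrap_int16(32767 + 1))     # -32768
```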
                  Parity Bit
• Reason
  – to detect errors during data communication
  (i.e. parity check)
• Method
  – one extra bit is added to each byte/word
• Two types
  – odd parity
  – even parity
                 Odd Parity
• Example
  – 1010001

  0 is added at the end

  => 10100010
            ^ parity bit

       ‘0’ is added so that the number of ‘1’s is odd
                 Even Parity
• Example
  – 1010001

  1 is added at the end

  => 10100011
            ^ parity bit

       ‘1’ is added so that the number of ‘1’s is even
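
Both parity schemes come down to counting the 1s in the code. A minimal sketch follows, assuming the parity bit is appended at the end as in the two examples; `add_parity` is an illustrative name.

```python
def add_parity(code, kind="even"):
    """Append a parity bit so that the total count of 1s is even or odd."""
    ones = code.count("1")
    if kind == "even":
        bit = ones % 2          # 1 only when the count is currently odd
    elif kind == "odd":
        bit = (ones + 1) % 2    # 1 only when the count is currently even
    else:
        raise ValueError("kind must be 'even' or 'odd'")
    return code + str(bit)


print(add_parity("1010001", "odd"))   # 10100010
print(add_parity("1010001", "even"))  # 10100011
```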
                 Exercise
Which bit must be added to each of the following
 codes to give even parity?

  i.       1000010
  ii.      1110000
        Parity checking
• Parity checking is a simple method of checking
  the correctness of received data.
• An ASCII character is stored in 8 bits, but only the
  rightmost 7 bits are used to represent the character,
  leaving the most significant bit as 0. This 8th bit can
  be used as a parity bit.
• To maintain even parity, the parity
  bit is set to 1 if the code being sent has an odd
  number of 1s.
• On the other hand, to maintain odd parity,
  the parity bit is set to 1 if the code being
  sent has an even number of 1s.
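
The sender and receiver sides of a parity check can be sketched as below, assuming the parity bit occupies the most significant bit of the byte as described above. `with_parity` and `parity_ok` are hypothetical helper names.

```python
def with_parity(ch, kind="even"):
    """8-bit string: parity bit in the MSB, 7-bit ASCII code below it."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    ones = bin(code).count("1")
    parity = ones % 2 if kind == "even" else (ones + 1) % 2
    return format((parity << 7) | code, "08b")


def parity_ok(byte, kind="even"):
    """True if the received 8-bit string has the agreed parity."""
    expected = 0 if kind == "even" else 1
    return byte.count("1") % 2 == expected


sent = with_parity("C")        # 'C' is 1000011, which has three 1s
print(sent)                    # 11000011
print(parity_ok(sent))         # True
corrupted = sent[:-1] + "0"    # last bit flipped in transit
print(parity_ok(corrupted))    # False
```

A single parity bit detects any odd number of flipped bits, but an even number of flips leaves the parity unchanged and goes unnoticed.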

				
posted:2/14/2012