					User-Level Interprocess Communication
  for Shared Memory Multiprocessors
 Bershad, B. N., Anderson, T. E., Lazowska, E.D., and Levy, H. M.

                   Presented by Chris Eigner
           Review of LRPC
• RPC concept can be used within a single
  machine as IPC
• Caller and callee in RPC are on the same
  machine, which leaves room for optimizations
  – Run client thread in context of server, avoid
    scheduler
  – Argument stacks allocated in shared memory,
    avoid message copying
  – Domain caching to reduce context-switch
    overhead
     Problems with RPC/LRPC
•   Kernel mediates every cross-address-space
    call: 70% of total overhead
•   Poorly performing cross-address-space
    communication
    – Kernel-level communication + user-level
      thread management
•   Opportunity for more SMP optimizations
         SMP Optimizations
• No need to switch processor to another
  address space
  – Remove kernel from equation!
  – Address spaces share memory directly
  – Processor reallocation can be avoided
    • Preserves valuable cache/TLB contexts
    • Cost can be amortized over independent calls
  – Inexpensive user-level thread management; orders of
    magnitude cheaper than kernel-level
      URPC Responsibilities
• URPC design isolates three components
  of IPC
  – Thread management
  – Data transfer
  – Processor reallocation
        Thread Management
• Context switch
  – Switching processor to another thread in
    same address space


• Processor reallocation
  – Reallocating processor to a thread in a
    different address space
  – via Processor.Donate
An Example
              Data Transfer
• Bi-directional shared memory queue
  – Test-and-set locks (non-spinning) on each
    end
• Client/server model
  – send, receive, start, stop
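
The channel described above can be sketched roughly as follows. This is a minimal illustration, not the URPC implementation: the names TryLockQueue, Channel, try_put, try_get and demo are hypothetical, and Python's non-blocking Lock.acquire stands in for a single test-and-set attempt (try once, never spin).

```python
import threading
from collections import deque


class TryLockQueue:
    """One direction of the channel: a FIFO guarded by a try-once lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = deque()

    def try_put(self, msg):
        # A non-blocking acquire models one test-and-set attempt.
        if not self._lock.acquire(blocking=False):
            return False  # lock busy: caller goes off to do other work
        try:
            self._items.append(msg)
            return True
        finally:
            self._lock.release()

    def try_get(self):
        # Returns None if the lock is busy or the queue is empty.
        if not self._lock.acquire(blocking=False):
            return None
        try:
            return self._items.popleft() if self._items else None
        finally:
            self._lock.release()


class Channel:
    """Bi-directional channel: one queue per direction."""

    def __init__(self):
        self.to_server = TryLockQueue()
        self.to_client = TryLockQueue()


def demo():
    ch = Channel()
    ch.to_server.try_put(("add", 2, 3))   # client: send
    op, a, b = ch.to_server.try_get()     # server: receive
    ch.to_client.try_put(a + b)           # server: send reply
    return ch.to_client.try_get()         # client: receive reply
```

A caller that finds the lock busy simply gets False (or None) back and can run another thread instead of waiting, which is the point of the non-spinning lock.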
      Processor Reallocation
• URPC makes certain assumptions to reduce
  processor reallocation
  – Client has other threads to run or incoming messages
  – Server has or will have a processor to service
    message
• Allows inexpensive context switch during
  blocking phase of cross-address call
• Enables parallel execution of URPC while
  avoiding processor reallocation
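
A toy simulation of the idea above: a client thread that has sent a call yields to another client thread rather than giving up its processor, while the server services messages on its own processor. Generators stand in for user-level threads, and every name here (client_thread, server_step, run) is hypothetical; the single loop only interleaves the two "processors" deterministically.

```python
from collections import deque


def client_thread(name, x, requests, replies, log):
    requests.append((name, x))       # send: enqueue the call in shared memory
    while name not in replies:       # reply has not arrived yet
        yield                        # cheap context switch to another thread
    log.append((name, replies.pop(name)))


def server_step(requests, replies):
    """The server's processor services one pending message, if any."""
    if requests:
        name, x = requests.popleft()
        replies[name] = x * x        # stand-in for the remote procedure


def run(inputs):
    requests, replies, log = deque(), {}, []
    ready = deque(
        client_thread(f"t{i}", x, requests, replies, log)
        for i, x in enumerate(inputs)
    )
    while ready:
        thread = ready.popleft()
        try:
            next(thread)             # run until the thread blocks on its call
            ready.append(thread)     # still waiting: put it back on the queue
        except StopIteration:
            pass                     # thread got its reply and finished
        server_step(requests, replies)
    return log
```

With two client threads, the second call is issued while the first is still outstanding, so the client's processor is never handed to the server's address space.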
                 Performance
• Firefly workstation
  – Four C-VAX processors
  – 32 MB RAM
• Taos OS
  – Provided kernel level threads
• FastThreads
  – User-level thread library
• URPC
  – Channel management
  – Message primitives
Performance
  [Result figures not recovered; one slide is annotated "worse than LRPC"]
                Deficiencies
• Optimistic assumptions won’t always hold
  – Single-threaded applications
  – High-latency I/O
• Processor reallocation occurs after two
  optimization checks (approx. 100 μs)
  – Is there an idle processor?
  – Is there an underpowered address space to which it
    can be reallocated?
• Voluntary return of processors can’t be
  guaranteed
• Two processors for single computation, only one
  active at a time
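
The two checks that precede a processor reallocation could look something like the sketch below. The slides do not define "underpowered", so this sketch assumes it means an address space with pending messages and no processor; AddressSpace, is_underpowered and pick_reallocation_target are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AddressSpace:
    name: str
    processors: int
    pending_messages: int


def is_underpowered(space):
    # Assumption for this sketch: work is waiting but no processor is assigned.
    return space.pending_messages > 0 and space.processors == 0


def pick_reallocation_target(idle_processors, spaces):
    """Return the name of the address space to receive a processor, or None."""
    # Check 1: is there an idle processor?
    if idle_processors == 0:
        return None
    # Check 2: is there an underpowered address space to give it to?
    for space in spaces:
        if is_underpowered(space):
            return space.name
    return None
```

Reallocation only goes ahead when both checks pass; if either fails, the function returns None and no processor moves.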
                 Summary
SMP allows new freedoms in RPC design
• No need to switch processor to another
  address space
  – Preserves valuable cache/TLB contexts
  – 1-2 orders of magnitude improvement
• But, not ideal for all application types
  – Single-threaded applications
  – High-latency I/O

				