

Outline for Today
• Administrative
  – Review session tomorrow – 3pm D106
  – Exam available in hardcopy at the review session and as a PDF online afterwards
  – If you haven't demoed assignment 5, combine it with assignment 6
  – Course evals today – stop me at 3:15
• Objective: Discussion about Performance!

WHY?
Industrial partners voice concerns about a lack of intuition about performance among new hires (in general, not necessarily specific to Duke undergrads).
Goal: to spend one lecture making SURE it doesn't apply to Duke undergrads once YOU are on the job.

How to Write a Bad Program
(with the "help" of the OS)
• Discussion of the different ways the design of a program and the OS can conspire to produce really bad performance (and, conversely, what to avoid in order to produce really good performance).
• This is one way of tying together many topics you've learned this semester.

List To Be Developed in Class
• You will build this list of ideas in class…contribute!

List Developed in Class
• Busywaiting instead of blocking waits.
• Hold locks when you don't need to.
  – Big critical sections / coarse-grain locking
• Create deadlock (on purpose?)
• Algorithms – e.g. bubblesort
  – In your app, choose an O(n²) algorithm instead of O(n) or O(log n) or whatever.
  – Ignore the constant factor
• File layout –
  – Never clean up a log-structured F.S.
  – Encourage fragmentation
  – Separate all related data – metadata, related blocks, etc. all spread out.
• Always make copies of everything
• Engineer priority inversions or just ignore "reasonable" priority assignments.
• Replicate wildly – without consistency
• Use an eviction scheme that fights the locality of your application
• Write through to disk – many individual writes
• Create a likely starvation scenario – give disk reads priority when you know they dominate file access patterns – may lead to write starvation
• Do all I/O as totally synchronous / also eliminate overlap between I/O and CPU
• Ignore spatial locality – failing to take advantage of caching opportunities
• Layer layers upon layers. Build virtual machines, do interpretation of everything.
  – Roll-your-own redundantly
• Use Nachos
• Thrash –
  – Give an address space insufficient memory allocation for its working set.
  – Never deallocate (who?)
    • Your app grows itself a huge memory footprint
    • Your exit syscall doesn't clean up
• Do all communication by I/O (e.g. files)
• Choose a tiny quantum for scheduling
• Don't have any caching
• Ignore your workload – optimize for the least common case.
• Create way too many threads
• Don't delay writes, or always keep temp files longer than they should live.
• Put temp files where no sysadmin will find them for "managed" cleanup.
• Very frequent daemon processes doing whatever "maintenance" activity.
• Circular data structures trying to use reference counting to trigger dealloc.
• "Gaming" the system inversely
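The first item above, busywaiting instead of blocking, can be sketched in Python; the thread and variable names here are illustrative, not from the lecture:

```python
import threading

results = []
done = threading.Event()

def worker():
    results.append(42)   # produce a result
    done.set()           # wake any blocked waiter

t = threading.Thread(target=worker)
t.start()

# Bad: a busywait burns a core re-checking the flag:
#     while not done.is_set():
#         pass
# Good: a blocking wait yields the CPU until the worker signals.
done.wait()
t.join()
```

The blocking version lets the scheduler run other work (including the worker itself, on a single core) instead of spinning.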

• Ignore granularity (e.g. know the page size and make all data structures fit badly)
• Make the page size too big (for your pgm) or too small
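The spatial-locality and granularity items can be illustrated with a traversal-order sketch; the grid dimensions are arbitrary, chosen only to make the point:

```python
# The same 2-D data can be walked in the order it is laid out
# (good locality) or in an order that fights the layout.
ROWS, COLS = 200, 300
grid = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]

def sum_row_major(g):
    # Touches elements in layout order: consecutive accesses
    # fall in the same cache line / page.
    return sum(x for row in g for x in row)

def sum_col_major(g):
    # Strides across rows between touches: same answer,
    # far worse cache and paging behavior in a flat layout.
    return sum(g[r][c] for c in range(COLS) for r in range(ROWS))

total = sum_row_major(grid)
assert total == sum_col_major(grid)
```

Both traversals compute the same result; only the access pattern, and hence the cache and page behavior, differs.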

My List
• Choose an exponential algorithm when a logarithmic algorithm would do
• Ignore the constant factor. Choosing the O(c log n) algorithm over the O(n²) algorithm when c > n² for the values of n that matter.
• Be clueless as to the significance of "n"
• Wrong lock granularity
  – Lock the whole computation with a single monolithic lock → very big critical sections
  – Fine-grain locks that cause lots of context switching
• Deadlocks or Starvation
  – Build a user-level thread package with blocking
• Using busywaiting (spinlocks) where it ties up resources that are needed.
  – Expected long waits, RR scheduling
• Ignore opportunities for I/O overlap.
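The algorithm-choice items above can be made concrete with the slide's own example, bubblesort; this is a minimal sketch contrasting the O(n²) choice with the library's O(n log n) sort:

```python
import random

def bubble_sort(xs):
    """The 'bad program' choice: O(n^2) comparisons."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

data = [random.randrange(1000) for _ in range(200)]
slow = bubble_sort(data)   # ~n^2/2 comparisons
fast = sorted(data)        # O(n log n) library sort
assert slow == fast
```

Both produce the same output; the cost difference only becomes ruinous as n grows, which is exactly the "be clueless as to the significance of n" trap.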

Interactions with Scheduler
• Priority Inversions
• Load Imbalances
  – Create n threads for n−1 processors
• Creating too many processes to do a single task
  – synchronizing among them creates serious contention
  – context switching overhead
  – overcommits resources (memory – see next slide)

Memory
• Lousy locality in a virtual memory
  – pointer-based hither-and-yon data structures
  – access patterns that don't match layout in pages (column-major layout / row-wise access)
  – hashing (randomizes accesses)
• Overcommitted memory
  – large footprint
  – creating lots and lots of processes
• Page allocation at odds with cache (conflict misses in cache)
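The "too many threads/processes for a single task" point can be sketched by bounding a worker pool near the hardware parallelism instead of spawning one thread per task; the task function and pool size here are illustrative:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def work(n):
    return n * n

tasks = range(100)

# Bad: one thread per task (100 threads) overcommits the
# scheduler and memory, and synchronization among them
# adds contention and context-switch overhead.
# Better: bound the pool near the number of processors.
workers = os.cpu_count() or 4
with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(work, tasks))
```

The bounded pool finishes the same 100 tasks with a handful of threads, avoiding the overcommitment the slide warns about.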

Files
• Use really long absolute pathnames.
• Lousy locality of file accesses
  – lousy spatial (block-grain transfers unjustified)
  – no reuse (lousy temporal locality – defeats cache)
• Sync to disk early and often
• Structure data for an application as lots of individual files
  – each mail message being a separate file, for example
• Share data within a highly parallel job through files in a DFS rather than, say, messages.
  – Work the cache consistency mechanism very hard

IPC and Networks
• Use RPC when inappropriate
  – not a client-server communication pattern
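"Sync to disk early and often" and "lots of individual files" both trade throughput for needless disk traffic; a sketch of the batched alternative, with illustrative file and record names:

```python
import os
import tempfile

records = [f"message {i}\n" for i in range(100)]

# Bad: an open/write/fsync cycle per record forces a disk
# round trip each time (and one-file-per-message multiplies
# metadata operations on top of that).
# Better: batch the writes and sync once at a durability boundary.
path = os.path.join(tempfile.mkdtemp(), "mailbox")
with open(path, "w") as f:
    for rec in records:       # buffered in user space
        f.write(rec)
    f.flush()
    os.fsync(f.fileno())      # one sync for the whole batch

with open(path) as f:
    assert f.read() == "".join(records)
```

One file and one fsync give the same durable contents as a hundred individually synced files, at a fraction of the I/O cost.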

