File System Persistence Distributed Objects

					                    File System
• A file system
     – Is responsible for the organization, storage,
       retrieval, naming, sharing and protection of
       files
     – Is designed to store and manage large number
       of files, with facilities for creating, naming and
       deleting files
     – Stores programs and data and makes them
       available as needed

4/16/01               DOS1 - Distributed File Systems       1




                     Persistence
• Probably one of the most important services
  provided by a file system is persistence
     – The files exist after the program, and even the
       computer, has terminated
• Files typically do not go away
     – They are persistent and exist between sessions
     – In conventional systems files are the only
       persistent objects




              Distributed Objects
• Using the OO paradigm it is easy to build
  distributed systems
     – Place objects on different machines
• Systems have been developed that allow this
     – Java RMI
     – CORBA ORB
• Having a persistent object store would be useful
     – Java RMI activation daemon
     – Certain ORB implementations

                              File Model
 • Files contain both data and attributes
      – Data is a sequence of bytes accessible by
        read/write operations
      – Attributes consist of a collection of information
        about the file








             Common File Attributes
                                          File length
                                     Creation timestamp
                                       Read timestamp
                                       Write timestamp
                                     Attribute timestamp
                                       Reference count
                                             Owner
                                         File type
                                     Access control list








           UNIX file system operations
filedes = open(name, mode)          Opens an existing file with the given name.
filedes = creat(name, mode)         Creates a new file with the given name.
                                    Both operations deliver a file descriptor referencing the open
                                    file. The mode is read, write or both.
status = close(filedes)             Closes the open file filedes.
count = read(filedes, buffer, n)    Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)   Transfers n bytes to the file referenced by filedes from buffer.
                                    Both operations deliver the number of bytes actually transferred
                                    and advance the read-write pointer.
pos = lseek(filedes, offset,        Moves the read-write pointer to offset (relative or absolute,
            whence)                 depending on whence).
status = unlink(name)               Removes the file name from the directory structure. If the file
                                    has no other names, it is deleted.
status = link(name1, name2)         Adds a new name (name2) for a file (name1).
status = stat(name, buffer)         Gets the file attributes for file name into buffer.
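
The calls in the table can be tried from Python, whose os module wraps the same system calls. This is only a sketch, assuming a POSIX system whose temporary directory supports hard links; the file names notes.txt and alias.txt are made up for illustration.

```python
import os
import tempfile

def demo_unix_file_ops(directory):
    """Exercise the calls from the table inside `directory`."""
    name1 = os.path.join(directory, "notes.txt")   # hypothetical file names
    name2 = os.path.join(directory, "alias.txt")

    # open() with O_CREAT plays the role of creat(); it returns a file descriptor.
    fd = os.open(name1, os.O_RDWR | os.O_CREAT, 0o600)
    written = os.write(fd, b"hello world")   # advances the read-write pointer
    os.lseek(fd, 6, os.SEEK_SET)             # absolute seek to offset 6
    data = os.read(fd, 5)                    # reads the 5 bytes at offsets 6..10
    os.close(fd)

    os.link(name1, name2)                    # add a second name for the same file
    links = os.stat(name1).st_nlink          # attribute: reference count
    os.unlink(name1)                         # the file survives under name2
    size = os.stat(name2).st_size            # attribute: file length
    os.unlink(name2)                         # last name removed, file is deleted
    return written, data, links, size

with tempfile.TemporaryDirectory() as tmp:
    result = demo_unix_file_ops(tmp)
    print(result)
```

Note that read and write take no position argument: the position lives in the read-write pointer that the system keeps for the open file.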


               File system modules
   Directory module:        relates file names to file IDs

   File module:             relates file IDs to particular files
   Access control module: checks permission for operation requested

   File access module:      reads or writes file data or attributes
   Block module:            accesses and allocates disk blocks

   Device module:           disk I/O and buffering








                         File Service
• A file service allows for the storage and access of
  files on a network
     –    Remote file access is identical to local file access
     –    Convenient for users who use different workstations
     –    Other services can be easily implemented
     –    Makes management and deployment easier and more
          economical
• File systems were the first distributed systems that
  were developed
• Defines the service not the implementation




                         File Server
• A process that runs on some machine and
  helps to implement the file service
     – A system may have more than one file server




                       File Service Models

 [Figure: two panels, each with a Client and a Server. Upload/download model:
  ReadMe.txt appears on both the server and the client. Remote access model:
  ReadMe.txt appears only on the server.]





       Properties of Storage Systems
                                     Sharing  Persistence  Distributed     Consistency  Example
                                                           cache/replicas  maintenance

Main memory                                                                1            RAM
File system                                                                1            UNIX file system
Distributed file system                                                                 Sun NFS
Web                                                                                     Web server
Distributed shared memory                                                               Ivy (Ch. 16)
Remote objects (RMI/ORB)                                                   1            CORBA
Persistent object store                                                    1            CORBA Persistent Object Service
Persistent distributed object store                                                     PerDiS, Khazana






                                Requirements
 • Transparency
        – Access
              • Programs are unaware of the fact that files are distributed
        – Location
              • Programs see a uniform file name space. They do not know, or
                care, where the files are physically located
        – Mobility
              • Programs do not need to be informed when files move
                (provided the name of the file remains unchanged)
        – Performance
        – Scaling
          Transparency Revisited
• Location Transparency
     – Path name gives no hint where the file is
       physically located
     – \\redshirt\ptt\dos\filesystem.ppt
     – File is on redshirt but where is redshirt?
• Naming transparency
     – \\mordor\ptt\dos\filesystem.ppt






                    Requirements
• Concurrent File Updates
     – Changes to a file by one program should not interfere
       with the operation of other clients simultaneously
       accessing the same file
• File Replication
     – A file may be represented by several copies of its
       contents at different locations
• Hardware and Software Heterogeneity
     – An important aspect of openness





                    Requirements
• Fault Tolerance
     – The service can continue to operate in the face of client
       and server failures.
• Consistency
     – UNIX one-copy update semantics
• Security
• Efficiency
     – Should offer facilities that are at least as powerful
       as those found in conventional systems
               File Sharing Semantics
    • When more than one user shares a file, it is
      necessary to define the semantics of reading
      and writing
    • For single processor systems
         – The system enforces an absolute time ordering
           on all operations and always returns the result
           of the last operation
         – Referred to as UNIX semantics




                      UNIX Semantics
    [Figure: single machine. The original file holds "ab". PID 0 writes "c",
     so the file holds "abc". A read by PID 1 gets "abc".]








                               Distributed
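    [Figure: two client machines and a file server. The server copy holds "ab".
     Client 1 (PID 0) reads "ab" into a local copy and writes "c", so its copy
     holds "abc". Client 2 (PID 1) reads "ab" from the server into its own
     local copy. Remaining label at Client 2: Read gets "c".]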
                        Summary
   Method                        Comment


   UNIX semantics                Every operation on a file is instantly
                                 visible to all processes
   Session Semantics             No changes are visible to other
                                 processes until the file is closed
   Immutable Files               No updates are possible; simplifies
                                 sharing and replication
   Transactions                  All changes have the all-or-nothing
                                 property
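
Session semantics can be sketched in a few lines. This is a toy in-memory model, not any real file service: the Server and SessionFile classes and their methods are invented for illustration, and "open" simply downloads the whole file.

```python
class Server:
    """Holds the authoritative copy of each file (hypothetical in-memory store)."""
    def __init__(self):
        self.files = {}

class SessionFile:
    """A client session: work on a private copy, publish it on close."""
    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.copy = bytearray(server.files.get(name, b""))  # download on open

    def read(self):
        return bytes(self.copy)

    def append(self, data):
        self.copy += data              # visible only inside this session

    def close(self):
        self.server.files[self.name] = bytes(self.copy)     # upload on close

server = Server()
server.files["f"] = b"ab"

writer = SessionFile(server, "f")
reader = SessionFile(server, "f")
writer.append(b"c")
before_close = reader.read()                     # the other session still sees "ab"
writer.close()
after_close = SessionFile(server, "f").read()    # a session opened later sees "abc"
print(before_close, after_close)
```

Under UNIX semantics the reader would have seen "abc" immediately after the append. Here the change becomes visible only to sessions opened after the writer closes the file.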






                           Caching
• Attempt to hold what is needed by the
  process in high speed storage
• Parameters
     – What unit does the cache manage?
          • Files, blocks, …
     – What do you do when the cache fills up?
          • Replacement policy
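
As one possible answer to both questions, here is a sketch of a cache that manages blocks and evicts the least recently used block when full. LRU is just one common replacement policy, chosen here as an assumption; the class name and the fetch_block callback are hypothetical.

```python
from collections import OrderedDict

class LRUBlockCache:
    """Caches fixed-size blocks; evicts the least recently used one when full."""
    def __init__(self, capacity, fetch_block):
        self.capacity = capacity
        self.fetch_block = fetch_block   # called on a miss, e.g. a request to the server
        self.blocks = OrderedDict()      # block number -> data, oldest first
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)    # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.fetch_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)      # evict the least recently used block
        return data

cache = LRUBlockCache(capacity=2, fetch_block=lambda n: f"block-{n}")
for n in (1, 2, 1, 3, 2):
    cache.read(n)
print(cache.misses, list(cache.blocks))
```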






                  Cache Consistency
• The real problem with caching and
  distributed file systems is cache consistency
     – If two processes are caching the same file, how
       do the local copies find out about changes made
       to the file?
     – When they close their files, who wins the race?
• Client caching needs to be thought out
  carefully
                     Cache Strategies
     Method                           Comment


     Write Through                    Changes to file sent to server; Works
                                      but does not reduce write traffic.

     Delayed Write                    Send changes to server periodically;
                                      better performance but possibly
                                      ambiguous semantics.
     Write on Close                   Write changes when file is closed;
                                      Matches session semantics.

     Centralized Control              File server keeps track of who has which
                                      file open and for what purpose; UNIX
                                      Semantics, but not robust and scales
                                      poorly
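
The difference in write traffic between write through and write on close can be made concrete with a small counter. This is a toy model with invented names (CountingServer, CachedFile, run) in which every message to the server carries the whole file.

```python
class CountingServer:
    """Hypothetical server that counts how many write messages it receives."""
    def __init__(self):
        self.data = {}
        self.writes = 0

    def store(self, name, contents):
        self.data[name] = contents
        self.writes += 1

class CachedFile:
    """Client-side cached file using either write-through or write-on-close."""
    def __init__(self, server, name, write_through):
        self.server = server
        self.name = name
        self.write_through = write_through
        self.contents = b""
        self.dirty = False

    def write(self, data):
        self.contents += data
        if self.write_through:
            self.server.store(self.name, self.contents)   # every change goes to the server
        else:
            self.dirty = True                             # remember to flush later

    def close(self):
        if self.dirty:
            self.server.store(self.name, self.contents)   # single flush on close
            self.dirty = False

def run(write_through):
    server = CountingServer()
    f = CachedFile(server, "log", write_through)
    for chunk in (b"a", b"b", b"c"):
        f.write(chunk)
    f.close()
    return server.writes, server.data["log"]

print(run(write_through=True), run(write_through=False))
```

Both runs leave the same contents on the server, but write through costs one message per write while write on close costs one message per session.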




                            Replication
• Multiple copies of files are maintained
     – Increase reliability by having several copies of
       a file
     – Allow file access to occur even if a server is
       down
     – Load-balancing across servers
• Replication transparency
     – To what extent is the user aware that some files
       are replicated?




               Types of Replication
[Figure: three panels, each showing a client C and servers S0, S1 and S2.
 Explicit File Replication; Lazy File Replication (one update happens "Now",
 the others "Later"); Group Replication.]



                    Update Protocols
• Okay so now we have replicas, how do we update
  them?
• Primary Copy Replication
     – Change is sent to primary
     – Primary sends changes to secondary servers
• Voting
     – With a primary copy, if the primary is down you are dead
     – With voting, a client must receive permission from multiple
       servers before making an update
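
Voting can be sketched as follows. The slide only says that several servers must agree; the version numbers and the separate read and write quorums are assumptions borrowed from the usual quorum scheme, in which the read quorum plus the write quorum exceeds the number of replicas and the write quorum is a majority. All names are hypothetical.

```python
class Replica:
    """One copy of the file, tagged with a version number."""
    def __init__(self):
        self.version = 0
        self.value = b""
        self.up = True

def write_with_voting(replicas, value, write_quorum):
    """Update only if at least write_quorum replicas are reachable."""
    voters = [r for r in replicas if r.up]
    if len(voters) < write_quorum:
        return False                           # not enough permissions
    chosen = voters[:write_quorum]
    new_version = max(r.version for r in chosen) + 1
    for r in chosen:
        r.version, r.value = new_version, value
    return True

def read_with_voting(replicas, read_quorum):
    """Read from read_quorum replicas and trust the highest version seen."""
    voters = [r for r in replicas if r.up]
    if len(voters) < read_quorum:
        return None
    chosen = voters[:read_quorum]
    return max(chosen, key=lambda r: r.version).value

replicas = [Replica() for _ in range(3)]       # N = 3, R = 2, W = 2
ok1 = write_with_voting(replicas, b"v1", 2)    # lands on replicas 0 and 1
replicas[0].up = False
ok2 = write_with_voting(replicas, b"v2", 2)    # still works with one server down
replicas[0].up, replicas[2].up = True, False
latest = read_with_voting(replicas, 2)         # replica 0 is stale, replica 1 is not
replicas[1].up = False
ok3 = write_with_voting(replicas, b"v3", 2)    # only one server left: refused
print(ok1, ok2, latest, ok3)
```

Because any read quorum overlaps any write quorum, a read always reaches at least one replica holding the newest version, and no single server failure blocks updates.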




           File Service Architecture
[Figure: the client computer runs application programs on top of a client
 module; the server computer runs the directory service and the flat file
 service.]








                     System Modules
• Flat File Service
     – Implements operations on the contents of files
     – UFIDs are used to refer to files (think I-node)
• Directory Service
     – Provides a mapping between text names and UFIDs
     – Note that the name space is not necessarily flat
     – Might be a client of the flat file service if it requires
       persistent storage
• Client Module
     – Provides client access to the system
                       System Modules
 • Flat File Service ← FM
 • Directory Service ← ADS
 • Client Module ← DFS




           Flat File Service Operations

Read(FileId, i, n) -> Data       If 1 ≤ i ≤ Length(File): Reads a sequence of up to n items
  throws BadPosition             from a file starting at item i and returns it in Data.
Write(FileId, i, Data)           If 1 ≤ i ≤ Length(File)+1: Writes a sequence of Data to a
  throws BadPosition             file, starting at item i, extending the file if necessary.
Create() -> FileId               Creates a new file of length 0 and delivers a UFID for it.
Delete(FileId)                   Removes the file from the file store.
GetAttributes(FileId) -> Attr    Returns the file attributes for the file.
SetAttributes(FileId, Attr)      Sets the file attributes (only those attributes that are not
                                 shaded in ).
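
A minimal in-memory sketch of the Read, Write, Create and Delete operations above. It assumes items are bytes and uses a simple counter for UFIDs, which is a simplification made up for this example; attributes are left out.

```python
import itertools

class BadPosition(Exception):
    """Raised when the starting item i is outside the allowed range."""

class FlatFileService:
    def __init__(self):
        self._files = {}                       # UFID -> file contents
        self._next_id = itertools.count(1)     # stand-in for real UFID generation

    def create(self):
        ufid = next(self._next_id)
        self._files[ufid] = bytearray()        # new file of length 0
        return ufid

    def read(self, ufid, i, n):
        data = self._files[ufid]
        if not 1 <= i <= len(data):
            raise BadPosition(i)
        return bytes(data[i - 1:i - 1 + n])    # up to n items starting at item i

    def write(self, ufid, i, new_data):
        data = self._files[ufid]
        if not 1 <= i <= len(data) + 1:
            raise BadPosition(i)
        # Overwrites in place and extends the file if the data runs past the end.
        data[i - 1:i - 1 + len(new_data)] = new_data

    def delete(self, ufid):
        del self._files[ufid]

service = FlatFileService()
f = service.create()
service.write(f, 1, b"abc")
service.write(f, 2, b"XYZ")         # overwrites "bc" and extends by one item
contents = service.read(f, 1, 10)
print(contents)
```

There is no open, close or read-write pointer: every call names the file and the position explicitly.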








                Important Differences
 • Differs from what you would expect..
      – No open or close
      – Read/Write specify a starting point
 • Why?? – fault tolerance
      – Repeatable operations
            • Operations are idempotent; clients may repeat calls
              to which they receive no reply
            • The interface may be implemented by a stateless
              server
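
Why an explicit starting point makes a write repeatable can be seen by comparing it with a pointer-style append. The two helper functions below are illustrative only and operate on plain byte strings.

```python
def write_at(contents, i, data):
    """Positional write; item positions are 1-based as in the flat file service."""
    start = i - 1
    return contents[:start] + data + contents[start + len(data):]

def append(contents, data):
    """Pointer-style write at the current end of the file."""
    return contents + data

original = b"ab"
once = write_at(original, 3, b"c")
twice = write_at(once, 3, b"c")                          # client retries after a lost reply
appended_twice = append(append(original, b"c"), b"c")    # the same retry with append
print(once, twice, appended_twice)
```

Repeating the positional write leaves the file unchanged, so a client that gets no reply can simply send the request again. Repeating the append duplicates the data.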

                                Idempotent
idem·po·tent
   Pronunciation: 'I-d&m-"pO-t&nt
   Function: adjective
   Etymology: Latin idem same + potent-, potens having power – more at POTENT
   Date: 1870
   : relating to or being a mathematical quantity which when applied to
   itself under a given binary operation (as multiplication) equals itself;
   also : relating to or being an operation under which a mathematical
   quantity is idempotent
   - idempotent noun








                          Access Control
    • In a distributed system access checks must
      be performed at the server
         – Otherwise the server becomes an unprotected
           point of access
    • Two approaches
         – Access check is done when obtaining a UFID
         – User identity is submitted with each request





          Directory Service Operations
  Lookup(Dir, Name) -> FileId         Locates the text name in the directory and returns the
    throws NotFound                   relevant UFID. If Name is not in the directory, throws an
                                      exception.
  AddName(Dir, Name, File)            If Name is not in the directory, adds (Name, File) to the
    throws NameDuplicate              directory and updates the file's attribute record.
                                      If Name is already in the directory: throws an exception.
  UnName(Dir, Name)                   If Name is in the directory: the entry containing Name is
    throws NotFound                   removed from the directory.
                                      If Name is not in the directory: throws an exception.
  GetNames(Dir, Pattern) -> NameSeq   Returns all the text names in the directory that match the
                                      regular expression Pattern.
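
The four directory operations map naturally onto a dictionary per directory. A minimal sketch: the make_dir helper and the use of plain integers as UFIDs are assumptions for the example, and updating the file's attribute record is omitted.

```python
import re

class NotFound(Exception):
    pass

class NameDuplicate(Exception):
    pass

class DirectoryService:
    def __init__(self):
        self._dirs = {}                    # directory id -> {name: UFID}

    def make_dir(self, dir_id):            # hypothetical helper, not in the table
        self._dirs[dir_id] = {}

    def lookup(self, dir_id, name):
        entries = self._dirs[dir_id]
        if name not in entries:
            raise NotFound(name)
        return entries[name]

    def add_name(self, dir_id, name, ufid):
        entries = self._dirs[dir_id]
        if name in entries:
            raise NameDuplicate(name)
        entries[name] = ufid

    def un_name(self, dir_id, name):
        entries = self._dirs[dir_id]
        if name not in entries:
            raise NotFound(name)
        del entries[name]

    def get_names(self, dir_id, pattern):
        regex = re.compile(pattern)
        return sorted(n for n in self._dirs[dir_id] if regex.fullmatch(n))

ds = DirectoryService()
ds.make_dir("root")
ds.add_name("root", "notes.txt", 17)
ds.add_name("root", "todo.txt", 18)
ds.add_name("root", "a.out", 19)
txt_files = ds.get_names("root", r".*\.txt")
print(ds.lookup("root", "todo.txt"), txt_files)
```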




                                                NFS
• NFS was originally designed and implemented by
  Sun Microsystems
• Three interesting aspects
       – Architecture
       – Protocol
       – Implementation
• Sun’s RPC system was developed for use in NFS
       – Can be configured to use UDP or TCP






                                        Overview
• Basic idea is to allow an arbitrary collection of
  clients and servers to share a common file system
• An NFS server exports one of its directories
• Clients access exported directories by mounting
  them
• To programs running on the client, there is almost
  no difference between local and remote files






             Local and Remote Access
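[Figure: three directory trees. Server 1: (root) contains export, which contains
 people (big, jon, bob, ...). Client: (root) contains ..., vmunix and usr; usr
 contains students, x and staff. Server 2: (root) contains nfs, which contains
 users (jim, ann, jane, joe). Remote mounts connect students to people on
 Server 1 and staff to users on Server 2.]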


Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1;
the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.
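
The effect of the two remote mounts can be sketched as a path translation on the client. The mount table below copies the paths from the note; the resolve function and the "local" marker are invented for illustration and say nothing about how NFS implements mounting.

```python
# Client mount point -> (server, exported directory), taken from the note above.
MOUNTS = {
    "/usr/students": ("Server 1", "/export/people"),
    "/usr/staff": ("Server 2", "/nfs/users"),
}

def resolve(path, mounts=MOUNTS):
    """Translate a client path into (where it lives, path at that place)."""
    best = None
    for mount_point in mounts:
        inside = path == mount_point or path.startswith(mount_point + "/")
        if inside and (best is None or len(mount_point) > len(best)):
            best = mount_point                 # prefer the longest matching mount point
    if best is None:
        return ("local", path)
    server, remote_root = mounts[best]
    return (server, remote_root + path[len(best):])

print(resolve("/usr/students/jon"))
print(resolve("/usr/staff/ann/x.txt"))
print(resolve("/usr/x"))
```

A program on the client just uses /usr/students/jon; it never sees that the file is /export/people/jon on Server 1.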


                         NFS architecture
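[Figure: on the client computer, application programs make UNIX system calls
 into the UNIX kernel. The kernel's virtual file system passes local requests
 to the UNIX file system (or another file system) and remote requests to the
 NFS client. The NFS client talks to the NFS server on the server computer
 using the NFS protocol. In the server's UNIX kernel, the NFS server reaches
 the UNIX file system through the virtual file system.]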








                             Access Control
    • NFS servers are stateless
          – User’s identity must be verified for each
            request
          – The UNIX UID and GID of the user are used
            for authentication purposes
    • Does this scare you?
    • Kerberized NFS





                                 File Mounting
    • File mounting protocol
          – A client sends a path name to a server and can request
            permission to mount the directory
          – If the request is legal, the server returns a file handle to
            the client
          – The handle identifies the file system type, the disk, the
            I-node number of the directory, and security information
          – Subsequent calls to read/write in that directory use the
            file handle

                   Automounting
• Allows a number of remote directories to be
  associated with a local directory
• Nothing is mounted until a client tries to
  access a remote directory
• Advantages
     – Don’t need to do any work if the files are not
       accessed
     – Some fault tolerance is provided
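
The idea of mounting on first access can be sketched like this. The Automounter class, its candidate table and the mount callback are all hypothetical; trying the candidates in order and keeping the first that succeeds is an assumption about where the fault tolerance comes from.

```python
class Automounter:
    """Mounts a remote directory only when its local directory is first accessed."""
    def __init__(self, candidates, mount):
        self.candidates = candidates    # local dir -> list of (server, remote dir)
        self.mount = mount              # callable returning True if the mount worked
        self.mounted = {}               # local dir -> (server, remote dir) in use

    def access(self, local_dir):
        if local_dir not in self.mounted:
            for server, remote_dir in self.candidates[local_dir]:
                if self.mount(server, remote_dir):
                    self.mounted[local_dir] = (server, remote_dir)
                    break
            else:
                raise OSError(f"no server available for {local_dir}")
        return self.mounted[local_dir]

attempts = []

def fake_mount(server, remote_dir):
    attempts.append(server)
    return server != "server1"          # pretend server1 is down

am = Automounter(
    {"/home": [("server1", "/export/home"), ("server2", "/export/home")]},
    fake_mount,
)
mounted_at_start = dict(am.mounted)     # nothing is mounted before the first access
first = am.access("/home")
second = am.access("/home")             # already mounted, no new attempts
print(first, attempts)
```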





            Directory/File Access
• Clients manipulate files by sending messages to
  servers
     – Most UNIX system calls are supported except for open
       and close
     – Why – to make things stateless
• Problems
     – Makes it tough to open and lock a file
     – NFS needs a separate additional mechanism to handle
       locking
     – Ownership?





                  Server Caching
• Caching on the server side is basically “business
  as usual”
• Special care needs to be taken with write and the
  possibility of a server crash
• Write has two options
     – Data in a write operation is stored in cache and written
       to disk (write-through caching)
     – Data in a write operation is stored in cache until a
       commit is received.

            The Andrew File System
[Figure: workstations and servers connected by a network. Each workstation runs
 a user program and Venus on top of the UNIX kernel. Each server runs Vice on
 top of the UNIX kernel.]






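            The File Name Space of AFS
[Figure: the name space is split into a Local part and a Shared part. Local:
 / (root) contains tmp, bin, ..., vmunix. Shared: cmu, which contains bin.
 Symbolic links connect the two parts.]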






            System Call Interception
[Figure: inside a workstation, the user program issues UNIX file system calls
 to the UNIX kernel. The kernel passes non-local file operations to Venus.
 The kernel's UNIX file system uses the local disk.]
