xrootd
Andrew Hanushevsky
Stanford Linear Accelerator Center 30-May-03
Goals
High Performance File-Based Access
Scalable, extensible, usable
Fault tolerance
Server failures handled in a natural way
Servers may be dynamically added and removed
Flexible Security
Allowing use of almost any protocol
Rootd Compatibility
May 30, 2003 2: xrootd
Achieving High Performance
Scalable request/response protocol
Multi-threaded multi-process architecture
Architecture sensitive polling
MRU scheduling
Sticky sockets
Adaptive reconfiguration
Versatile sfs layer (based on proven oofs)
Scalable Protocol I
Connection multiplexing
One connection per client/host
Multiple logically independent streams
Request redirection supported
Similar to http redirection
Supports dynamic load balancing and fail-over
Uses an intentional request header
Can better optimize request processing
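Connection multiplexing can be illustrated with a minimal sketch. It assumes a hypothetical frame layout (2-byte stream id, 4-byte payload length, then the payload); this layout is for illustration only and is not the xrootd wire format.

```python
import struct

# Hypothetical frame layout (NOT the xrootd wire format):
# 2-byte stream id, 4-byte payload length, then the payload itself.
HEADER = struct.Struct("!HI")


def frame(stream_id: int, payload: bytes) -> bytes:
    """Wrap one payload so it can share a connection with other streams."""
    return HEADER.pack(stream_id, len(payload)) + payload


def demultiplex(data: bytes) -> dict:
    """Split concatenated frames into per-stream lists of payloads."""
    streams = {}
    offset = 0
    while offset < len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        streams.setdefault(stream_id, []).append(data[offset:offset + length])
        offset += length
    return streams
```

Two logical streams can then interleave on one connection, and the receiver sorts them out by stream id.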
Scalable Protocol II
Asynchronous mode allowed
Multiple processing-order-independent requests
Optional application-directed pre-read
I/O segmenting
Able to naturally deal with very large transfers
Better use of server resources
Request deferral
Client waits for resources without using server resources
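I/O segmenting can be sketched as splitting one large request into bounded pieces. The function name and the `max_segment` parameter are assumptions for illustration; the slides do not state the segment size or who performs the split.

```python
def segments(offset: int, length: int, max_segment: int):
    """Yield (offset, length) pairs that cover the request in bounded pieces."""
    if max_segment <= 0:
        raise ValueError("max_segment must be positive")
    end = offset + length
    while offset < end:
        size = min(max_segment, end - offset)
        yield offset, size
        offset += size
```

Each piece can be served with a buffer no larger than `max_segment`, so a very large transfer never ties up a correspondingly large server buffer.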
Scalable Protocol III
Unsolicited Reverse Request Mode
Allows server to manage client for recovery
Asynchronous redirect, deferral, and messages
Protocol may be compatibly extended
Mechanism to send opaque information
Accommodate things that were “forgotten”
Messaging interface
Cache group
Request priority
And so on…
MT/MP Architecture
Normally one multi-threaded server per host
Should be able to utilize available resources
Easy to administer
Optionally, multiple servers per host
Fully utilize large machines
Architecture Sensitive Polling
All POSIX systems support poll()
Used by default
Not always an efficient I/O “interrupt” mechanism
Alternate polling mechanisms allowed
/dev/poll
Available on Solaris and patched RH Linux
Essential to reduce latency
Up to an order of magnitude reduction in CPU
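The idea of preferring a platform-specific mechanism and falling back to poll() can be sketched with Python's `select` module. This is an illustration of the selection logic only, not the server's code; `epoll` is a Linux facility the slides do not mention and is included here as an assumption.

```python
import select


def pick_poller():
    """Return (name, poller) for the most capable mechanism available.

    Preference order is an assumption: /dev/poll (Solaris), then epoll
    (Linux, not named in the slides), then plain POSIX poll().
    """
    for name in ("devpoll", "epoll", "poll"):
        factory = getattr(select, name, None)  # absent on some platforms
        if factory is not None:
            return name, factory()
    raise RuntimeError("no poll-style mechanism available on this platform")
```

All three objects expose `register()` and `poll()`, so the rest of the event loop can stay the same whichever one is chosen.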
MRU Scheduling
Connections processed in most recently used order
Gives priority to active connections
Reduces polling overhead
Essentially a fair scheduling algorithm
Starvation cannot occur
Longer running tasks tend to get started first
Assuming all other things being equal
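A minimal sketch of most-recently-used ordering, assuming connections are identified by a hashable id. The class and method names are hypothetical, and the sketch shows only the ordering, not the fairness or starvation properties claimed above.

```python
from collections import OrderedDict


class MruQueue:
    """Keeps connection ids ordered from most to least recently used."""

    def __init__(self):
        self._order = OrderedDict()

    def touch(self, conn_id):
        """Record activity: the connection moves to the front."""
        self._order[conn_id] = True
        self._order.move_to_end(conn_id, last=False)

    def in_service_order(self):
        """Connections in the order they would be examined."""
        return list(self._order)
```

Active connections stay near the front, so they are found quickly and idle ones drift toward the back.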
Sticky Sockets
Connection temporarily binds to a thread
Avoids polling and scheduling overhead
Significantly reduces latency
Connection automatically unbinds
Client is not sufficiently active
Number of other requests approaches available threads
Adaptive Reconfiguration
Server dynamically adjusts configuration
Number of threads
Kept proportionate to number of active requests
Pre-allocated buffers
Sizes track actual usage profile
Recomputed periodically
Pre-allocated objects
Number tracks recent needs
High latency connections rescheduled
Versatile sfs Layer I
Integrates multiple performance features
Dynamic load balancing
Client redirected to “best” server of the moment
File descriptor partitioning
Reduces socket polling overhead
Prevents open file proliferation and attendant overhead
File system interface reuse
Same file opened in same mode shared by multiple clients
File system interface timeout
Reduces overhead caused by idle opened files
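File system interface reuse can be sketched as a table keyed by (path, mode) with a reference count. The names below are hypothetical and the idle timeout is not modelled; this shows only the sharing idea.

```python
class HandleTable:
    """Shares one underlying handle among clients opening the same (path, mode)."""

    def __init__(self, opener):
        self._opener = opener   # callable(path, mode) -> handle
        self._entries = {}      # (path, mode) -> [handle, reference count]

    def open(self, path, mode):
        """Return the shared handle, opening the file only on first use."""
        key = (path, mode)
        entry = self._entries.get(key)
        if entry is None:
            entry = [self._opener(path, mode), 0]
            self._entries[key] = entry
        entry[1] += 1
        return entry[0]

    def close(self, path, mode):
        """Drop one reference; return True when the last user has gone."""
        key = (path, mode)
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[key]
            return True
        return False
```

Two clients opening the same file in the same mode cost one open; a different mode gets its own handle.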
Dynamic Load Balancing
Dynamic Selection
DLB Implementation
[Figure: DLB message flow between a client and xrootd/dlbd pairs (any number). Labels: subscribe, open, "who has the file?", "I do", wait, try host:port, open again.]
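The open, wait, and "try host:port" exchange can be sketched as a small in-memory redirector. All names are hypothetical, and choosing the least-loaded holder is an assumption: the slides say only that the client is sent to the "best" server of the moment.

```python
class Redirector:
    """Tracks subscribed servers and answers open requests."""

    def __init__(self):
        self._servers = {}  # "host:port" -> {"load": float, "files": set}

    def subscribe(self, address, files, load=0.0):
        """A data server announces itself and the files it holds."""
        self._servers[address] = {"load": load, "files": set(files)}

    def report_load(self, address, load):
        """A subscribed server updates its current load."""
        self._servers[address]["load"] = load

    def open(self, path):
        """Return ("redirect", address) or ("wait", None) if nobody has the file."""
        holders = [(info["load"], address)
                   for address, info in self._servers.items()
                   if path in info["files"]]
        if not holders:
            return ("wait", None)
        return ("redirect", min(holders)[1])  # assumed policy: least loaded
```

A client that receives "wait" would retry later; one that receives "redirect" opens the file again at the returned address.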
Versatile sfs Layer II
Dynamic disk cache integration
Allows unlimited file system size
Provides superior internal load balancing
Mass Storage Integration
HPSS, Castor, Enstore, etc
RFIO Integration
Scalable authorization
From file sub-trees to single files
Cache File System
[Figure: cache file system layout.]
Index area: /databases/mydbfile is a symlink to /cache1/databases:mydbfile
Naming convention allows for audit and index recovery
Data area: multiple independent filesystems (/cache1, /cache2, /cache3)
Any number, any size
Chosen based on free space in LRU order
Optional data cache; default data area
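A minimal sketch of the index-area and data-area arrangement, assuming the naming convention read off the slide (slashes in the logical path become colons in the data file name). The function names are hypothetical, and placement here uses free space only; the LRU aspect is not modelled.

```python
import os
import tempfile


def data_name(logical_path):
    """Flatten '/databases/mydbfile' to 'databases:mydbfile' (assumed convention)."""
    return logical_path.strip("/").replace("/", ":")


def logical_name(data_file):
    """Recover the logical path from a data file name, e.g. to rebuild the index.

    Ambiguous if a path component itself contains ':'.
    """
    return "/" + data_file.replace(":", "/")


def place_file(index_root, data_areas, logical_path, free_space):
    """Create the file in the data area with the most free space and
    symlink it from the index area. Returns the physical path."""
    area = max(data_areas, key=free_space)
    physical = os.path.join(area, data_name(logical_path))
    open(physical, "wb").close()
    link = os.path.join(index_root, logical_path.lstrip("/"))
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(physical, link)
    return physical
```

In practice `free_space` could be `lambda p: shutil.disk_usage(p).free`; it is a parameter here so the placement policy can be swapped or tested.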
Fault Tolerance I
Servers may come and go
Uses load balancing to effect recovery
New servers can be added at any time
Servers may be brought down for maintenance
Files can be moved around in real-time
Clients simply adjust to the new configuration
XTNetFile object handles recovery protocol
Fault Tolerance II
Whenever a client loses an r/o connection
Back to distinguished xrootd(s) for reselection
Whenever a client loses an r/w connection
Limited wait/retry loop on the same server
We will be working to improve this next year!
All handled in the XTNetFile class
Disruptions merely delay the client
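The read-only recovery path (go back for reselection, then try again) can be sketched as a bounded retry loop. This is not the XTNetFile implementation; the function, its parameters, and the use of `ConnectionError` are assumptions for illustration.

```python
import time


def read_with_recovery(select_server, read_from, max_attempts=3,
                       delay=0.0, sleep=time.sleep):
    """Retry a read-only operation, reselecting a server after each failure.

    select_server: callable returning the server to use (stands in for
                   asking the distinguished xrootd again).
    read_from:     callable(server) performing the read.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        server = select_server()
        try:
            return read_from(server)
        except ConnectionError as exc:
            last_error = exc   # connection lost: pause, then reselect
            sleep(delay)
    raise last_error
```

From the application's point of view a server failure shows up only as extra latency, unless every attempt fails.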
Flexible Security
Negotiated Security Protocol
Allows client/server to agree on protocol
E.g., Kerberos, GSI, AFS Kerberos, etc.
Can be easily extended
Multi-protocol authentication support
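Protocol negotiation can be sketched as picking the first protocol the server offers that the client also supports. The protocol names and the preference rule are assumptions; the slides do not say how the choice is made.

```python
def negotiate(server_offers, client_supported):
    """Return the first server-offered protocol the client supports, else None.

    server_offers:    protocol names in the server's order of preference.
    client_supported: collection of protocol names the client can speak.
    """
    for name in server_offers:
        if name in client_supported:
            return name
    return None
```

Adding a protocol then means adding one more name on each side, with no change to the negotiation itself.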
Security Architecture
[Figure: client/server security handshake. Labels: login, client-specific security configuration, authenticate, protocol selection, self configuration, security token, libooseccl.so (shown twice).]
Multiple handshakes allowed during authentication phase (required by some PKI protocols)
Heterogeneous Security Support
• Servers have one or more protocol objects
• Server protocol objects created at server initialization time
• Client selects which protocol to use when security context created
• Protocol object created based on configuration returned by xrootd
• One security context object per physical xrootd connection
• Protocol objects may be shared by one or more contexts
• Each “pass” through a security context object may generate credentials to be passed to xrootd
Simple & Effective Interface
For each login that requires authentication
XrdSecCreateSecurityContext(ipaddr, config)
Returns security protocol object XrdSecClientSecurity
Based on server ipaddr and server-supplied config
XrdSecClientSecurity::getCredentials()
Returns credentials to be sent to the server
Done via authenticate request and possible authmore response
Based on well tested and documented oofs security
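The authenticate/authmore exchange can be sketched as a loop that keeps producing credentials until the server is satisfied. Everything below is a hypothetical stand-in: `TwoStepProtocol`, `fake_server`, and the status strings are invented for illustration and only loosely mirror the getCredentials() call named above.

```python
class TwoStepProtocol:
    """Hypothetical protocol that needs one challenge/response round."""

    def get_credentials(self, server_data=None):
        if server_data is None:
            return b"hello"                      # first pass: no server data yet
        return b"response-to:" + server_data     # later pass: answer the challenge


def fake_server(credentials):
    """Stand-in for the server side: returns (status, data for next pass)."""
    if credentials == b"hello":
        return "authmore", b"challenge"
    if credentials == b"response-to:challenge":
        return "ok", None
    return "denied", None


def authenticate(protocol, send, max_rounds=8):
    """Send credentials until the server stops answering 'authmore'.

    Returns the number of passes used; raises PermissionError on failure.
    """
    server_data = None
    for round_number in range(1, max_rounds + 1):
        credentials = protocol.get_credentials(server_data)
        status, server_data = send(credentials)
        if status == "ok":
            return round_number
        if status != "authmore":
            raise PermissionError("authentication failed: " + status)
    raise PermissionError("too many handshake rounds")
```

A single-pass protocol finishes in one round; protocols that need several handshakes simply go around the loop again.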
Optional Scalable Authorization
Authentication: libooseccl.so
Authorization: libooacc.so
Example authorization entry: u abh rw /slac/rootfiles/usr/abh r /cern/rootfiles
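An entry of this shape can be checked with a longest-prefix match, which covers both whole sub-trees and single files. The sketch assumes the field order exactly as printed on the slide (id type, id, then privilege/path pairs) and assumes the most specific path wins; neither is confirmed as the actual authorization file format.

```python
def parse_rule(line):
    """Parse '<type> <id> <privs> <path> [<privs> <path> ...]' (assumed layout)."""
    fields = line.split()
    id_type, name, rest = fields[0], fields[1], fields[2:]
    if not rest or len(rest) % 2:
        raise ValueError("expected privilege/path pairs")
    grants = {}
    for privs, path in zip(rest[0::2], rest[1::2]):
        grants[path.rstrip("/") or "/"] = set(privs)
    return id_type, name, grants


def allowed(grants, path, priv):
    """True if the most specific matching path prefix grants the privilege."""
    best = None
    for prefix in grants:
        inside = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if inside and (best is None or len(prefix) > len(best)):
            best = prefix
    return best is not None and priv in grants[best]
```

A path outside every listed sub-tree is denied, and matching is on whole path components, so `/cern/rootfilesX` does not match `/cern/rootfiles`.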
Security Summary
Multi-protocol Authentication
Supports distributed heterogeneous environments
Scalable Authorization
Open-ended capability based model
Integrated Auditing
To keep the security hard hats happy
Well defined, proven interfaces
Trivially replaceable for a plug & play architecture
rootd Compatibility
Bilateral compatibility
XTNetFile reverts to TNetFile for rootd servers
xrootd reverts to rootd protocol for TNetFile
Allows for transparent introduction
Can run mixed mode
Binary is multi-environment compatible
Compatibility Modes
Client-Side Compatibility
Server-Side Compatibility
[Figure: two diagrams. Labels: Application, XTNetFile, TNetFile, rootd, xrootd, rootd compatibility.]
xrootd Architecture
Protocol Manager
Protocol Layer
Filesystem Logical Layer
Filesystem Physical Layer
Filesystem Implementation
xrootd Internals
Dynamically loaded
(can also be static)
Conclusion
xrootd provides high performance file access
Improves over afs, ams, nfs, etc.
Unique performance, usability, scalability, security, compatibility, and recoverability characteristics
xrootd can provide a firm server foundation for native file system implementations
E.g. alienfs, gridfs, slashgrid, etc
For now, aim is to support BaBar