Middleware 101
Ken Klingenstein,
Project Director, Internet2 Middleware Initiative
Chief Technologist, University of Colorado at Boulder
Syllabus
Process and acknowledgments
The larger picture
Core middleware: the basic technologies
• Identifiers
• Authentication
• Directories
• PKI
The major projects
• eduPerson, LDAP Recipe, the Directory of Directories, Shibboleth,
HEBCA, HEPKI
What lies ahead
What to do and where to watch
Other middleware sessions
[Figure: map of related sessions. Recoverable session titles:]
• Mware 101 - big picture, identifier basics, authN, directory concepts, PKI overview
• PKI 101 - apps, certs, profiles, policies, trust models
• Mware 202 - identifiers+, authorization, directory deployments
• Middleware 301 - metadirectories, registries
• Labs: Eduperson, Shibboleth, DoD, apps
• HEPKI-PAG - current policy activities
• HEPKI-TAG - current technical activities
• Early Adopters
• Academic Medical Middleware: technology --- policy
• International Issues in Middleware
• Middleware and the Grid
• LDAP Recipe
• Metadirectories BoF
• Multicampus BoF
Acknowledgments
MACE and the working groups
NSF catalytic grant and meeting
Early Adopters
Higher Education partners - campuses, EDUCAUSE, CREN,
AACRAO, SURA, NACUA, etc.
Corporate partners - IBM, ATT, Sun, Accord, Metamerge, et al.
Government partners - including NSF and the fPKI TWG
MACE (Middleware Architecture Committee for Education)
Purpose - to provide advice, create experiments, foster standards, etc. on
key technical issues for core middleware within higher education
Membership - Bob Morgan (UW) Chair, Scott Cantor (Ohio State), Steven
Carmody (Brown), Michael Gettes (Georgetown), Keith Hazelton
(Wisconsin), Paul Hill (MIT), Jim Jokl (Virginia), Mark Poepping (CMU),
Bruce Vincent (Stanford), David Wasley (California), Von Welch (Grid)
European members - Brian Gilmore (Edinburgh), Ton Verschuren
(Netherlands)
Creates working groups in major areas, including directories, interrealm
authentication, PKI, medical issues, etc.
Works via conference calls, emails, occasional serendipitous in-person
meetings...
National Science Foundation
Catalytic grant in Fall 99 started the organized efforts, with Early
Harvest and Early Adopters
NSF Middleware Initiative - three year cooperative agreement,
begun 9/1/01, with Internet2/EDUCAUSE/SURA and the GRIDs
Center, to develop and deploy a national middleware
infrastructure for science, research and higher education
Work products are community standards, best practices,
schema and object classes, reference implementations, open
source services, corporate relations
Work areas are identifiers, directories, authentication,
authorization, GRIDs, PKI, video
Early Harvest
NSF funded workshop in Fall 99 and subsequent activities
Defined the territory and established a work plan
Best practices in identifiers, authentication, and directories
(http://middleware.internet2.edu/internet2-mi-best-practices-00.html)
http://middleware.internet2.edu/earlyharvest/
Early Adopters:
The Campus Testbed Phase
A variety of roles and missions
Commitment to move implementation forward
Provided some training and facilitated support
Develop national models of deployment alternatives
Address policy standards
Profiles and plans are on Internet2 middleware site
Early Adopter Participants
Dartmouth
U. of Hawaii
Johns Hopkins
U. of Maryland, BC
U. of Memphis
U. of Michigan
Michigan Tech U.
U. of Pittsburgh
U. of Southern Cal
Tufts U.
U. of Tennessee, Memphis
Partnerships
EDUCAUSE
CREN, CNI
Grids, JA-SIG, OKI
campuses
higher education professional associations - AACRAO, NACUA,
CUMREC, etc.
increasing international interactions
corporate - IBM, Sun, ATT, etc.
Remedial IT architecture
The proliferation of customizable applications requires a
centralization of “customizations”
The increase in power and complexity of the network requires
access to user profiles
Electronic personal security services are now an impediment to
the next-generation computing grids
Inter-institutional applications require interoperable
deployments of institutional directories and authentication
What is Middleware?
specialized networked services that are shared by applications
and users
a set of core software components that permit scaling of
applications and networks
tools that take the complexity out of application integration
a second layer of the IT infrastructure, sitting above the
network
a land where technology meets policy
the intersection of what network designers and applications
developers each do not want to do
Specifically…
Digital libraries need scalable, interoperable authentication and
authorization.
The Grid is a new paradigm for a computational resource; Globus provides
middleware, including security, location and allocation of resources, and
scheduling. This relies on campus-based services and inter-institutional
standards.
Instructional Management Systems need authentication and directories.
Next-generation portals want common authentication and storage.
Academic collaboration requires restricted sharing of materials between
institutions.
Last time it was about communication; this time it’s about collaboration.
The Grid
a model for a distributed computing environment, addressing
diverse computational resources, distributed databases,
network bandwidth, object brokering, security, etc.
Globus (www.globus.org) is the software that implements most
of these components; Legion is another such software
environment
Needs to integrate with campus infrastructure
Gridforum (www.gridforum.org) umbrella activity of agencies
and academics
Look for grids to occur locally and nationally, in physics,
earthquake engineering, etc.
A Map of Middleware
Core Middleware
Identity - unique markers of who you (person, machine, service,
group) are
Authentication - how you prove or establish that you are that
identity
Directories - where an identity’s basic characteristics are kept
Authorization - what an identity is permitted to do
PKI - emerging tools for security services
Identity Services on One Slide
[Figure: diagram of identity services. Recoverable labels:]
• Interrealm; Objectclass standards (e.g. eduperson, gridperson); Shibboleth exchange of attributes
• Content Portals; Future PKI; DoDHE et al.; Grids et al.
• Security Domain; Learning Management Systems; Personal Portals; Web services and servers
• WebISO
• Campus authentication; Enterprise directory; Future PKI
Simple point-to-point model
[Figure: point-to-point service model. Recoverable labels:]
• Service discovery service
• Authentication
• Service client; target
• Policy enforcement point(s)
• Policy decision point
• Attribute requestor; Attribute authority
• Enterprise LDAP directory (shown twice); Protocols
• Grid directory; Video directory
What is the nature of the campus work?
Technological
• Establish campus-wide services: name space, authentication
• Build an enterprise directory service
• Populate the directory from source systems
• Enable applications to use the directory
Policies and Politics
• Clarify relationships between individuals and institution
• Determine who manages, who can update and who can see
common data
• Structure information access and use rules between departments
and central administrative units
• Reconcile business rules and practices
What are the benefits to the institution?
Economies for central IT - reduced account management, better
web site access controls, tighter network security...
Economies for distributed IT - reduced administration, access to
better information feeds, easier integration of departmental
applications into campus-wide use...
Improved services for students and faculty - access to scholarly
information, control of personal data, reduced legal exposures...
Participation in future research environments - Grids,
videoconferencing, etc.
Participation in new collaborative initiatives - DoDHE,
Shibboleth, etc.
What are the costs to the institution?
Modest increases in capital equipment and staffing
requirements for central IT
Considerable time and effort to conduct campus wide planning
and vetting processes
One-time costs to retrofit some applications to new central
infrastructure
One-time costs to build feeds from legacy source systems to
central directory services
The political wounds from the reduction of duchies in data and
policies
OIDs to reference identifiers
numeric coding to uniquely define many middleware elements,
such as directory attributes and certificate policies
Numbering is only for identification (are two OIDs equal? If so,
the associated objects are the same) - no ordering, search,
hierarchy, etc.
Distributed management; each campus typically obtains an
“arc”, e.g. 1.3.4.1.16.602.1, and then creates OIDs by extending
the arc, e.g. 1.3.4.1.16.602.1.0, 1.3.4.1.16.602.1.1,
1.3.4.1.16.602.1.1.1
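The arc-extension idea above can be sketched in a few lines of Python. This is only an illustration: the helper names (`parse_oid`, `extend_arc`, `same_object`) are made up for this sketch, and the arc is the example value from the slide, not a registered assignment. In line with the slide, the only comparison offered is equality.

```python
from typing import Tuple

Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """Turn dotted-decimal text such as '1.3.4.1.16.602.1' into a tuple of ints."""
    parts = text.strip().split(".")
    if any(not p.isdigit() for p in parts):
        raise ValueError(f"not a dotted-decimal OID: {text!r}")
    return tuple(int(p) for p in parts)


def format_oid(oid: Oid) -> str:
    """Render the tuple form back to dotted-decimal text."""
    return ".".join(str(n) for n in oid)


def extend_arc(arc: Oid, *suffix: int) -> Oid:
    """Create a new OID underneath an arc the campus already holds."""
    if not suffix or any(n < 0 for n in suffix):
        raise ValueError("suffix must be one or more non-negative integers")
    return arc + tuple(suffix)


def same_object(a: str, b: str) -> bool:
    """OIDs are only for identification: equal OIDs name the same object."""
    return parse_oid(a) == parse_oid(b)


# Example arc taken from the slide text (illustrative only).
CAMPUS_ARC = parse_oid("1.3.4.1.16.602.1")
attribute_oid = extend_arc(CAMPUS_ARC, 1, 1)
print(format_oid(attribute_oid))
```

A campus would typically keep a small local registry of which suffix was handed to which attribute or policy, since the numbers themselves carry no meaning.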
Getting an OID
apply at IANA at http://www.iana.org/cgi-bin/enterprise.pl
apply at ANSI at http://web.ansi.org/public/services/reg_org.html
more info at http://middleware.internet2.edu/a-brief-guide-to-OIDs.doc
Cuttings: Identifiers
“Any problem in Computer Science can be solved with another
level of indirection”
• Butler Lampson
“Except the problem of indirection complexity”
• Bob Morgan
Major campus identifiers
UUID
Student and/or emplid
Person registry ID
Account login ID
Enterprise-LAN ID
Student ID card
Net ID
Email address
Library/departmental ID
Publicly visible ID (and pseudo-SSN)
Pseudonymous ID
General Identifier Characteristics
Uniqueness (within a given context)
Dumb vs intelligent (i.e. whether subfields have meaning)
Readability (machine vs human vs device)
Affordance (centrally versus locally provided)
Resolver approach (how identifier is mapped to its associated object)
Metadata (both associated with the assignment and resolution of an identifier)
Persistence (permanence of relationship between identifier and specific object)
Granularity (degree to which an identifier denotes a collection or component)
Format (checkdigits)
Versions (can the defining characteristics of an identifier change over time)
Capacity (size limitations imposed on the domain or object range)
Extensibility (the capability to intelligently extend one identifier to be the basis
for another identifier).
Important Characteristics
Semantics and syntax- what it names and how does it name it
Domain - who issues and over what space is identifier unique
Revocation - can the subject ever be given a different value for
the identifier
Reassignment - can the identifier ever be given to another
subject
Opacity - is the real world subject easily deduced from the
identifier - privacy and use issues
Identifier Mapping Process
Map campus identifiers against a canonical set of functional
needs
For each identifier, establish its key characteristics, including
revocation, reassignment, privileges, and opacity
Shine a light on some of the shadowy underpinnings of
middleware
A key first step toward the loftier middleware goals
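One way to record the output of such a mapping exercise is a small table of identifier profiles. The sketch below is an assumption about how a campus might capture it: the class name, field names, and the sample values for each identifier are hypothetical and would differ from campus to campus.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IdentifierProfile:
    name: str           # which campus identifier this row describes
    domain: str         # who issues it and over what space it is unique
    revocable: bool     # can the subject ever be given a different value?
    reassignable: bool  # can the value ever be given to another subject?
    opaque: bool        # is the real-world subject hard to deduce from it?


def unstable_identifiers(profiles: List[IdentifierProfile]) -> List[str]:
    """Names of identifiers whose binding to a person can change over time."""
    return [p.name for p in profiles if p.revocable or p.reassignable]


def privacy_sensitive(profiles: List[IdentifierProfile]) -> List[str]:
    """Names of identifiers from which the subject is easily deduced."""
    return [p.name for p in profiles if not p.opaque]


# Hypothetical sample values; a real inventory comes from campus interviews.
INVENTORY = [
    IdentifierProfile("Person registry ID", "central registry", False, False, True),
    IdentifierProfile("Net ID", "central IT", True, False, False),
    IdentifierProfile("Email address", "central IT", True, True, False),
]

print(unstable_identifiers(INVENTORY))
print(privacy_sensitive(INVENTORY))
```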
Authentication Options
Password based
• Clear text
• LDAP
• Kerberos (Microsoft or K5 flavors)
Certificate based
Others - challenge-response, biometrics
Inter-realm is now the interesting frontier
Web initial sign-on needs to relate to account login
Cuttings: Authentication
user side management - crack, change, compromise
central-side password management - change management, OS
security
first password assignment - secure delivery
policies - restrictions or requirements on use
Some authentication good practices
Precrack new passwords
Precrack using foreign dictionaries as well as US
Confirm new passwords are different than old
Require password change if possibly compromised
Use shared secrets or positive photo ID to reset forgotten
passwords
US Mail a one-time password (time-bomb)
In-person with a photo ID (some require two)
For remote faculty or staff, an authorized departmental
representative in person, coupled with a faxed photo ID
Initial identification/authentication will emerge as a critical
component of PKI
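The first few practices on this slide (pre-cracking against several dictionaries and confirming that a new password differs from the old one) can be sketched as follows. This is a minimal illustration, not a full password cracker: the function names, the reversed-word check, and the PBKDF2 parameters are assumptions made for the example.

```python
import hashlib
import hmac
from typing import Iterable, Tuple


def hash_password(password: str, salt: bytes, rounds: int = 100_000) -> bytes:
    """Salted hash so the old password never has to be stored in clear text."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)


def precrack(candidate: str, dictionaries: Iterable[Iterable[str]]) -> bool:
    """True if the candidate is a dictionary word, forwards or reversed.

    `dictionaries` may hold several word lists, for example one US English
    list and one or more foreign-language lists.
    """
    lowered = candidate.lower()
    for words in dictionaries:
        for word in words:
            w = word.lower()
            if lowered == w or lowered == w[::-1]:
                return True
    return False


def accept_new_password(new: str, old_hash: bytes, salt: bytes,
                        dictionaries: Iterable[Iterable[str]]) -> Tuple[bool, str]:
    """Apply the two checks and report why a candidate was rejected."""
    if hmac.compare_digest(hash_password(new, salt), old_hash):
        return False, "same as old password"
    if precrack(new, dictionaries):
        return False, "found in dictionary"
    return True, "ok"
```

A production check would go further (common substitutions, appended digits, user-specific words), which is what dedicated cracking tools do.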
Directory Issues
Applications
Overall architecture
• chaining and referrals, redundancy and load balancing, replication,
synchronization, directory discovery
The Schema and the DIT
• attributes, ou’s, naming, object classes, groups
Attributes and indexing
Management
• clients, delegation of access control, data feeds
Directory-enabled applications
Email
Account management
Web access controls
Portal support
Calendaring
Grids
A Campus Directory Architecture
[Figure: campus directory architecture. Recoverable labels:]
• border directory
• enterprise directory
• metadirectory
• enterprise applications
• departmental directories
• OS directories (MS, Novell, etc)
• directory database
• registries
• source systems
Key Architectural Issues
Interfaces and relationships with legacy systems
Performance in searching
Binding to the directory
Load balancing and backups are emerging but proprietary
Who can read or update what fields
How much to couple the enterprise directory with an operating
system
http://www.georgetown.edu/giia/internet2/ldap-recipe/
Which attributes and object classes in which directories
Schema and DIT Good Practices
People, machines, services
Be very flat in people space
Keep accounts as attributes, not as an ou
Replication and group policies should not drive schema
RDN name choices rich and critical
Other keys to index
Creating and preserving unified name spaces
Attribute Good Practices
inetOrgPerson, eduPerson, localPerson
Never repurpose an RFC-defined field. Add new attributes -
adding attributes is easier than thought
Keep schema checking on, unless it is done in the underlying
database; watch performance
Most LDAP clients do not treat multi-valued attributes well, but
doing multiple fields and separate dn’s is no better.
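The schema and attribute practices on the last two slides can be illustrated with a single person entry. The sketch assumes a hypothetical campus suffix (`dc=example,dc=edu`) and a hypothetical local attribute (`localAccount`) in the campus's own `localPerson` class; the point is the shape: one flat people container, accounts held as values on the person entry, and a new local attribute rather than a repurposed standard one.

```python
import re
from typing import Dict, List

BASE_DN = "dc=example,dc=edu"        # hypothetical campus suffix
PEOPLE_OU = f"ou=people,{BASE_DN}"   # one flat container for all people

_SAFE_UID = re.compile(r"^[a-z][a-z0-9]{1,15}$")


def person_dn(uid: str) -> str:
    """Build a DN directly under ou=people, with no departmental sub-trees."""
    if not _SAFE_UID.match(uid):
        raise ValueError(f"uid not usable as an RDN value: {uid!r}")
    return f"uid={uid},{PEOPLE_OU}"


def make_person(uid: str, cn: str, sn: str, affiliation: str,
                accounts: List[str]) -> Dict[str, List[str]]:
    """Return an entry as attribute -> list of values (LDAP values are lists)."""
    return {
        "dn": [person_dn(uid)],
        # Standard classes first, then the campus's own extension class.
        "objectClass": ["inetOrgPerson", "eduPerson", "localPerson"],
        "uid": [uid],
        "cn": [cn],
        "sn": [sn],
        "eduPersonPrimaryAffiliation": [affiliation],
        # Accounts are attribute values on the person, not entries in an ou.
        # 'localAccount' is a made-up attribute name for this sketch.
        "localAccount": list(accounts),
    }


entry = make_person("jdoe", "Jane Doe", "Doe", "student",
                    ["unix:jdoe", "lan:JDOE01"])
print(entry["dn"][0])
```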
Management Good Practices
No trolling permitted; more search than read
LDAP client access versus web access
Give deep thought to who can update
Give deep thought to when to update
LDIF likely to be replaced by XML as exchange format
Delegation of control - scalability
“See also”, referrals, replication, synchronization in practice
Replication should not be done tree-based but should be filtered
by rules and attributes
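The last point, replication filtered by rules and attributes rather than by subtree, can be sketched as below. The `ReplicationRule` type and the sample entries are hypothetical; real directory servers express this in their own configuration, which this sketch does not try to reproduce.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Entry = Dict[str, List[str]]


@dataclass(frozen=True)
class ReplicationRule:
    selects: Callable[[Entry], bool]  # which entries go to the replica
    attributes: Sequence[str]         # which attributes are copied


def build_replica(entries: List[Entry], rule: ReplicationRule) -> List[Entry]:
    """Copy only matching entries, and only the attributes the rule names."""
    replica = []
    for entry in entries:
        if rule.selects(entry):
            replica.append(
                {a: list(entry[a]) for a in rule.attributes if a in entry}
            )
    return replica


# Hypothetical data: the rule picks faculty wherever they sit in the tree.
DIRECTORY = [
    {"uid": ["asmith"], "mail": ["asmith@example.edu"],
     "eduPersonAffiliation": ["faculty"], "homePhone": ["555-0100"]},
    {"uid": ["jdoe"], "mail": ["jdoe@example.edu"],
     "eduPersonAffiliation": ["student"]},
]
FACULTY_RULE = ReplicationRule(
    selects=lambda e: "faculty" in e.get("eduPersonAffiliation", []),
    attributes=("uid", "mail"),
)

print(build_replica(DIRECTORY, FACULTY_RULE))
```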
Metadirectories
The critical functions to glue together what inevitably turns out to
be a number of campus, departmental and application-oriented
directory services
Typically a coordinated set of services that watches updates to
specific directories or from legacy data feeds and spreads those
updates to other directories
Performs several subfunctions
• an identity registry or crosswalk to relate entries in different
directories
• a set of connectors that take changes from one source and
convert them for dissemination to other sources
Basic implementation from Metamerge is free to higher ed
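The two subfunctions named above, an identity registry (crosswalk) and connectors, can be sketched as follows. This is a toy model under stated assumptions, not a description of Metamerge or any other product: the class names, directory labels, and attribute names are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class IdentityRegistry:
    """Crosswalk relating one person's entries in different directories."""

    def __init__(self) -> None:
        self._to_person: Dict[Tuple[str, str], str] = {}
        self._to_local: Dict[Tuple[str, str], str] = {}

    def link(self, person_id: str, directory: str, local_id: str) -> None:
        self._to_person[(directory, local_id)] = person_id
        self._to_local[(person_id, directory)] = local_id

    def crosswalk(self, directory: str, local_id: str,
                  target_directory: str) -> Optional[str]:
        person = self._to_person.get((directory, local_id))
        if person is None:
            return None
        return self._to_local.get((person, target_directory))


class Connector:
    """Takes changes from one source and converts them for one target."""

    def __init__(self, registry: IdentityRegistry, source: str, target: str,
                 attr_map: Dict[str, str],
                 target_store: Dict[str, Dict[str, str]]) -> None:
        self.registry = registry
        self.source = source
        self.target = target
        self.attr_map = attr_map          # source attribute -> target attribute
        self.target_store = target_store  # stands in for the target directory

    def on_change(self, local_id: str, changes: Dict[str, str]) -> bool:
        target_id = self.registry.crosswalk(self.source, local_id, self.target)
        if target_id is None:
            return False  # person unknown in the target; nothing to update
        entry = self.target_store.setdefault(target_id, {})
        for attr, value in changes.items():
            if attr in self.attr_map:     # unmapped attributes are not spread
                entry[self.attr_map[attr]] = value
        return True
```

Usage would be one connector per source and target pair, each sharing the same registry, so a change in a legacy feed reaches every directory that holds the person.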
PKI
First thoughts
Fundamentals - Components and Contexts
The missing pieces - in the technology and in the community
Higher Education activities (CREN, HEPKI-TAG, HEPKI-PAG,
Net@EDU, PKI Labs)
PKI: A few observations
Think of it as wall jack connectivity, except it’s connectivity for
individuals, not for machines, and there’s no wall or jack…but it
is that ubiquitous and important
Does it need to be a single infrastructure? What are the costs
of multiple solutions? Subnets and ITPs...
Options breed complexity; managing complexity is essential
PKI can do so much that right now it does very little
A few more...
IP connectivity was a field of dreams. We built it and then the
applications came. Unfortunately, here the applications have
arrived before the infrastructure, making its development much
harder.
No one seems to be working on the solutions for the agora.
A general-purpose PKI seems like a difficult task, but instituting
a PKI Light as a first step may not have enough paybacks.
The general state of PKI
There are campus and corporate successes
• Corporations use internally for VPN, some authentication, signed
email (with homogeneous client base)
• MIT, UT medical, soon VA, UCOP
Key is limited application use, lightweight policy approaches
There is very limited interrealm, community of interest or
general interoperable work going on
• Federal efforts
• HealthKey
• Higher Ed
• Some European niches
Why X.509/PKI?
Single infrastructure to provide all security services
Established technology standards, though little operational
experience
Elegant technical underpinnings
Serves dozens of purposes - authentication, authorization,
object encryption, digital signatures, communications channel
encryption
Low cost in mass numbers
Why Not X.509/PKI?
High legal barriers
Lack of mobility support
Challenging user interfaces, especially with regard to privacy
and scaling
Persistent technical incompatibilities
Overall complexity
The Four Planes of PKI
on the road to general purpose interrealm PKI
the planes represent different levels of simplification from the
dream of a full interrealm, intercommunity multipurpose PKI
simplifications in policies, technologies, applications, scope
each plane provides experience and value
The Four Planes are:
Full interrealm PKI - multipurpose, spanning broad and
multiple communities, bridges to unite hierarchies, unfathomed
directory issues
Simple interrealm PKI - multipurpose within a community,
operating under standard policies and structured hierarchical
directory services
PKI-Light - containing all the key components of a PKI, but
many in simplified form; may be for a limited set of applications;
may be extended within selected communities
PKI-Ultralight - easiest to construct and useful conveyance;
ignores parts of PKI and not for use external to the institution;
learn how to fly, but not a plane...
Examples of Areas of Simplification
Spectrum of Assurance Levels
Signature Algorithms Permitted
Range of Applications Enabled
Revocation Requirements and Approaches
Subject Naming Requirements
Treatment of Mobility
...
PKI-Light example (Texas-Houston)
CP: VeriSign
CRL: VeriSign
Applications: authentication
Mobility: USB dongle
Signing: md5RSA
Thumbprint: sha1
Naming: X.500
Directory Services needed: I?
Deployment: 5,000 medical students
PKI-Light (MIT)
CP: none
CRL: limit lifetime
Applications: internal web authentication
Mobility: one per system; also password enabled
Signing: md5RSA
Thumbprint: sha1
Naming: X.500
Directory Services needed: none
Deployment: approximately 350,000 over five years
D. Wasley’s PKI Puzzle
Uses for PKI and Certificates
authentication and pseudo-authentication
signing docs
encrypting docs and mail
non-repudiation
secure channels across a network
authorization and attributes
secure multicast
and more...
Implementation varies by contexts/components
[Table: components (rows) against contexts (columns)]
Contexts: Intracampus, Intercampus, General
Components:
• Certificate Systems (inhouse, insource, outsource)
• Application Integration
• I/A processes
• Profiles and Policies
PKI Components
X.509 v3 certs - profiles and uses
Validation - Certificate Revocation Lists, OCSP, path
construction
Cert management - generating certs, using keys, archiving and
escrow, mobility, etc.
Directories - to store certs, and public keys and maybe
private keys
Trust models and I/A
Cert-enabled apps
X.509 certs
purpose - bind a public key to a subject
standard fields
extended fields
profiles to capture prototypes
client and server issues
v2 for those who started too early, v3 for current work,
v4 being finalized to address some additional cert formats
(attributes, etc.)
Standard fields in certs
cert serial number
the subject, as x.500 DN or …
the subject’s public key
the validity field
the issuer, as ID and common name
signing algorithm
signature over the cert, made with the issuer’s private key
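The standard fields listed above map naturally onto a simple record. The sketch below is only a data model for discussion, with hypothetical field names and sample values; it does not parse real certificates or verify signatures, for which a proper X.509 library is needed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CertFields:
    serial_number: int
    subject: str                # e.g. an X.500 distinguished name
    subject_public_key: bytes
    not_before: datetime        # the validity field is a pair of dates
    not_after: datetime
    issuer: str
    signature_algorithm: str
    signature: bytes            # computed by the issuer over the other fields

    def is_within_validity(self, when: datetime) -> bool:
        """True if `when` falls inside the cert's validity period."""
        return self.not_before <= when <= self.not_after


# Hypothetical sample values for illustration only.
cert = CertFields(
    serial_number=1001,
    subject="cn=Jane Doe,ou=people,dc=example,dc=edu",
    subject_public_key=b"\x00placeholder",
    not_before=datetime(2001, 9, 1, tzinfo=timezone.utc),
    not_after=datetime(2002, 9, 1, tzinfo=timezone.utc),
    issuer="cn=Example Campus CA,dc=example,dc=edu",
    signature_algorithm="md5RSA",
    signature=b"\x00placeholder",
)

print(cert.is_within_validity(datetime(2001, 12, 1, tzinfo=timezone.utc)))
```

A short validity period is one way to limit exposure when no CRL is published, which is the approach listed for the MIT example earlier in the deck.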
Extension fields
Examples - authorization/subject subcodes, key usage, LDAP
URL, CRL distribution points, etc.
Key usage is very important - for digital signatures, non-
repudiation, key or data encipherment, etc.
Certain extensions can be marked “critical” - if an app can’t
understand the extension, then it doesn’t use the cert
Requires profiles to document, and great care...
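The rule for critical extensions can be written down in a few lines. This is a sketch of the rule only, with a hypothetical `Extension` record; the two 2.5.29.x OIDs are the standard ones for key usage and CRL distribution points, and the third OID reuses the example arc from the OID slide as a stand-in for a local extension.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Extension:
    oid: str
    critical: bool
    value: bytes = b""


def cert_is_usable(extensions: Iterable[Extension], understood: Set[str]) -> bool:
    """Reject the cert if any critical extension is not understood by the app.

    Non-critical extensions that the app does not understand are ignored.
    """
    return all(ext.oid in understood for ext in extensions if ext.critical)


KEY_USAGE = "2.5.29.15"                # standard keyUsage extension
CRL_DIST_POINTS = "2.5.29.31"          # standard cRLDistributionPoints extension
LOCAL_AUTHZ = "1.3.4.1.16.602.1.9"     # hypothetical campus-defined extension

APP_UNDERSTANDS = {KEY_USAGE, CRL_DIST_POINTS}

ok_cert = [Extension(KEY_USAGE, True), Extension(LOCAL_AUTHZ, False)]
bad_cert = [Extension(KEY_USAGE, True), Extension(LOCAL_AUTHZ, True)]

print(cert_is_usable(ok_cert, APP_UNDERSTANDS))
print(cert_is_usable(bad_cert, APP_UNDERSTANDS))
```

This is why marking a local extension critical needs great care: every relying application that has not been taught the extension will refuse the cert.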
Cert Management
Certificate Management Protocol - for the creation, revocation
and management of certs
Revocation Options - CRL, OCSP
Storage - where (device, directory, private cache, etc.)
and how - format (DER, BER, etc.)
Escrow and archive of keys - when, how, and what else needs
to be kept
Certificate Authority software or outsource options
• Homebrews
• Open Source - OpenSSL, OpenCA, Oscar
• Third party - Baltimore, Entrust, etc.
• OS-integrated - W2K, Sun/Netscape, etc.
Directories
to store certs
to store CRL
to store private keys, for the time being
to store attributes
implement with border directories, or ACLs within the enterprise
directory, or proprietary directories
Certificate Policies (CP) and Practices Statements (CPS)
Policies: legal responsibilities and liabilities (indemnification
issues)
Operations of certificate management systems
Will hopefully be somewhat uniform across the community
Assurance levels - varies according to I/A processes and other
operational factors
Practices - site-specific details of operational compliance with a
cert policy
A Policy Management Authority (PMA) determines if a CPS is
adequate for a given CP.
Inter-organizational trust model components
verifying sender-receiver assurance by finding a common
trusted entity
must traverse perhaps branching paths to establish trust paths
must then use CRLs etc. to validate assurance
if policies are in cert payloads, then validation can be quite
complex
delegation makes things even harder
Hierarchies vs. Bridges
• a philosophy and an implementation issue
• the concerns are transitivity and delegation
• hierarchies assert a common trust model
• bridges pairwise agree on trust models and policy mappings
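Finding a common trusted entity amounts to a path search over "who has certified whom". The sketch below uses a breadth-first search over a hypothetical certification graph with a bridge CA; the names are made up, and policy mapping, signature checking, and real CRL processing are left out. Revocation is reduced to a set of names to skip.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Sequence


def find_trust_path(issued: Dict[str, Sequence[str]], trust_anchor: str,
                    target: str,
                    revoked: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Shortest chain from the relying party's trust anchor to the target.

    `issued` maps a CA name to the names (CAs or end entities) it has
    certified. Cross-certificates appear as edges in both directions.
    """
    queue = deque([[trust_anchor]])
    seen = {trust_anchor}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in issued.get(path[-1], ()):
            if nxt not in seen and nxt not in revoked:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # paths may branch, and none of them reached the target


# Hypothetical two-campus example joined by a bridge CA.
ISSUED = {
    "CampusA-CA": ["alice", "Bridge-CA"],
    "Bridge-CA": ["CampusA-CA", "CampusB-CA"],
    "CampusB-CA": ["bob", "Bridge-CA"],
}

print(find_trust_path(ISSUED, "CampusA-CA", "bob"))
```

In a strict hierarchy the graph is a tree under one root, so the search is trivial; with bridges the graph has cycles, which is why the `seen` set is needed.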
Mobility Options
smart cards
USB dongles
passwords to download from a store or directory
proprietary roaming schemes abound - Netscape, VeriSign, etc.
SACRED within IETF recently formed for standards
Difficulty in integration of certificates from multiple stores (hard
drive, directory, hardware token, etc.)
Will it fly?
Well, it has to…
Scalability
Performance
OBE
“With enough thrust, anything can fly”
The Major Projects
eduPerson and eduOrg (mace-dir)
the Directory of Directories for Higher Education (DoDHE)
Shibboleth (mace-shibboleth) and Webiso (mace-webiso)
HEBCA and PKI-Light (HEPKI-PAG and HEPKI-TAG)
Directories
metadirectories
groups
affiliated directories
Videoconferencing and video on demand (vidmid)
PKI Labs at Dartmouth and Wisconsin
OKI, JA-SIG and the Grids
eduPerson
An object class for higher education
Defines higher education syntax and semantics for general
“person” attributes
Includes several new attributes for instructional, research and
administrative inter-institutional use
Presumes that campuses add local person object class
A joint effort of EDUCAUSE and Internet2
eduPerson 1.0
parent objectclass=inetOrgPerson
includes:
• affiliation (multi-valued)
• primary affiliation (faculty/student/staff)
• orgUnitDN (string)
• nickname (string)
• ePPN (identifier, user@securitydomain)
version 1.5 and beyond will contain other shared attributes
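As an illustration, the attributes listed above might appear on an entry as follows. The sample values and the `parse_eppn` helper are hypothetical; the attribute names are the eduPerson names for the items on the slide.

```python
from typing import Dict, List, Tuple


def parse_eppn(eppn: str) -> Tuple[str, str]:
    """Split an ePPN of the form user@securitydomain into its two parts."""
    user, sep, domain = eppn.partition("@")
    if not sep or not user or not domain or "@" in domain:
        raise ValueError(f"not of the form user@securitydomain: {eppn!r}")
    return user, domain


# Hypothetical sample entry, written as attribute -> list of values.
sample_entry: Dict[str, List[str]] = {
    "objectClass": ["inetOrgPerson", "eduPerson"],
    "cn": ["Jane Doe"],
    "sn": ["Doe"],
    "eduPersonAffiliation": ["student", "staff"],      # multi-valued
    "eduPersonPrimaryAffiliation": ["student"],
    "eduPersonOrgUnitDN": ["ou=physics,dc=example,dc=edu"],
    "eduPersonNickname": ["Janie"],
    "eduPersonPrincipalName": ["jdoe@example.edu"],    # ePPN
}

user, domain = parse_eppn(sample_entry["eduPersonPrincipalName"][0])
print(user, domain)
```

The ePPN looks like an email address but is an identifier scoped to a security domain, so applications should not assume mail can be delivered to it.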
A Directory of Directories
an experiment to build a combined directory search service
to show the power of coordination
will highlight the inconsistencies between institutions
technical investigation of load and scaling issues, centralized
and decentralized approaches
human interface issues - searching large name spaces with
limits by substring, location, affiliation, etc...
to suggest the service to follow
Sun donation of server and 6 million DNs
http://dodhe.internet2.edu/dodhe/
Shibboleth
inter-institutional web authentication and basic authorization
authenticate locally, act globally - the Shibboleth shibboleth
emphasizes privacy through progressive disclosure of attributes
linked to commercial standards development in XML through
OASIS
scenarios and architecture done; coding has commenced with
alpha code due in January, 2002 to pilot sites
coding and design teams feature IBM/Tivoli, CMU, and the Ohio
State University
strong partnership with IBM to develop and deploy
http://middleware.internet2.edu/shibboleth/
Stage 1 - Addressing Three Scenarios
Member of campus community accessing licensed resource
• Anonymity required
Member of a course accessing remotely controlled resource
• Anonymity required
Member of a workgroup accessing controlled resources
• Controlled by unique identifiers (e.g. name)
Taken individually, each of these situations can be solved in a
variety of straightforward ways.
Taken together, they present the challenge of meeting the
user's reasonable expectations for protection of their personal
privacy.
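The privacy idea behind the three scenarios, releasing only the attributes a given target needs, can be sketched as a per-target release policy at the origin site. This is an illustration of the concept only and not Shibboleth's design or API: the policy table, target labels, and attribute names are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical release policy: which attributes each kind of target may see.
RELEASE_POLICY: Dict[str, Set[str]] = {
    "licensed-resource": {"affiliation"},          # scenario 1: anonymous
    "course-resource": {"course"},                 # scenario 2: anonymous
    "workgroup-resource": {"name", "affiliation"}, # scenario 3: identified
}


def release_attributes(user_attrs: Dict[str, str], target: str) -> Dict[str, str]:
    """Return only the attributes the policy allows for this target.

    Unknown targets get nothing, so disclosure is opt-in per target.
    """
    allowed = RELEASE_POLICY.get(target, set())
    return {k: v for k, v in user_attrs.items() if k in allowed}


# Hypothetical user record held at the origin site.
USER = {
    "name": "Jane Doe",
    "affiliation": "student",
    "course": "PHYS-101",
}

print(release_attributes(USER, "licensed-resource"))
print(release_attributes(USER, "workgroup-resource"))
```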
Shibboleth Architecture Concepts - High Level
[Figure: browser, origin site, and target site with its target web server. Recoverable labels:]
• First Access - Unauthenticated
• Authentication Phase
• Authorization Phase
• Pass content if user is allowed
Shibboleth Architecture Concepts (detail)
[Figure: the same flow in more detail, between origin site and target site. Recoverable labels:]
• Components: Browser, Target Web Server, Web Login Server, Attribute Server
• First Access - Unauthenticated
• Redirect User to Local Web Login
• Ask to Obtain Entitlements
• Authentication Phase: Authentication, Auth OK
• Second Access - Authenticated
• Authorization Phase: Req Ent, Ent Prompt, Entitlements
• Pass entitlements for authz decision
• Pass content if user is allowed
• Success!
Shibboleth Architecture - Components and Flow
Shibboleth, eduPerson, and everything else
[Figure: Middleware Inputs & Outputs. Recoverable labels:]
• Top row: Licensed Resources, Embedded App Security, JA-SIG & uPortal, Grids, OKI, Inter-realm calendaring, futures
• Middle: Shibboleth, eduPerson, Affiliated Dirs, etc.
• Bottom row: Enterprise authZ, Campus web SSO, Enterprise Directory, Enterprise Authentication, Legacy Systems
Project Status
Architecture definition finished (v0.9+)
Design/Programming now Underway
• Team membership drawn from IBM/Tivoli, CMU, Ohio State
• First Face-to-Face meeting on Sept 27, 28 at CMU
First Set of Pilot Sites Selected
• Chosen to test all 3 scenarios
• UK participation
Timeline for programming, piloting available end of October
VidMid
Middleware for video
Videoconferencing
authenticated, identified video clients - work with
commercial clients to use the underlying middleware
plumbing
H.323, VRVS, and new SIP-oriented clients
Video on demand
access controls for video resources
schema for meta information
Works closely with ViDe (www.vide.org)
http://middleware.internet2.edu/video/
aggressive time frames
Mace-Med
Unique requirements - HIPAA, disparate relationships, extended
community, etc.
Unique demands - 7x24, visibility
PKI seen as a key tool
Mace-Med recently formed to explore the issues
HEPKI (www.educause.edu/hepki/)
HEPKI - Technical Activities Group (TAG)
• universities actively working technical issues
• topics include Kerberos-PKI integration, public domain CA, profiles
• regular conference calls, email archives
HEPKI - Policy Activities Group (PAG)
• universities actively trying to deploy PKI
• topics include certificate policies, RFP sharing, interactions with
state governments
• regular conference calls, email archives
Internet2 PKI Labs
At Dartmouth and Wisconsin in computer science departments
and IT organizations
Doing the deep research - two to five years out
Policy languages, path construction, attribute certificates, etc.
National Advisory Board of leading academic and corporate PKI
experts provides direction
Catalyzed by startup funding from ATT
OKI, JA-SIG and Grids
OKI
• major open learning management system being developed by MIT,
Stanford, and North Carolina State, funded by the Mellon Foundation;
reference architecture and open source implementation
• http://web.mit.edu/oki/intro.html
JA-SIG
• uPortal is a major portal architecture and implementation being developed
by a number of schools with funding from the Mellon Foundation; also
hopes to share administrative Java applets
• http://www.ja-sig.org/ and
http://mis105.mis.udel.edu/ja-sig/uportal/index.html
GRIDS Center
• expanding use of Grids will reach to many campuses
• integration efforts underway
• http://www.globus.org and http://www.gridforum.org
Middleware 201 typical issues
Access control lists for directories - how to construct, how to
manage
Integration of LAN and enterprise accounts and passwords
Enterprise and LAN directories - schema, replication and
synchronization
Working with legacy and source systems
Processes and processes
Where to Watch
middleware.internet2.edu
www.educause.edu/hepki/
www.cren.org