The Architecture of Secure Multicast Conferencing

                   ICECAR Project, Deliverable D3.1
                                   Peter T. Kirstein, I. Brown, E. Whelan
                                         University College London
                                                30 July 1999

                     This report outlines the architectural principles of secured multicast
                     conferencing. It discusses the general principles of the unsecured
                     activity, giving details on the tools, methods of announcing
                     conferences, and relaying between multicast and unicast. It then
                     considers how to invoke security procedures in a manner consistent
                     with the other applications being pursued in the ICECAR project.

Abstract
1    Introduction
2    Multicast Conferencing in the Absence of Security Considerations
  2.1      Overview Architecture of Multicast Conferencing
  2.2      Multicast Traffic Distribution
  2.3      Transport Protocols
  2.4      The Multimedia Tools
  2.5      Session Announcement, Invitation and Tool Launch
     2.5.1         Introduction
     2.5.2         Session Descriptions
     2.5.3         Session Announcements
     2.5.4         Session Invitations
     2.5.5         Offline Mechanisms
  2.6      Relays
3    Security Considerations in Multicast Conferencing
  3.1      Introduction
  3.2      Encryption of Media Streams
  3.3      Network Level Security and IPSEC
     3.3.1         The Background of IPSEC
     3.3.2         Global architecture
     3.3.3         Principles of IPSEC
     3.3.4         The Security Association Database (SAD)
     3.3.5         Security policies. The Security Policy Database (SPD)
     3.3.6         IPSEC, IP Multicast and Conferencing
  3.4      Encrypted and Authenticated Session Descriptions
     3.4.1         Assumptions on Group Management for Conferencing
     3.4.2         Authentication of Session Announcements
     3.4.3         Shared Secret Distribution
     3.4.4         Distributing Session Descriptions Securely
     3.4.5         Use of Symmetric or Asymmetric Encryption Mechanisms with SAP
     3.4.6         Use of Smart Cards for Secure Conferencing
  3.5      Use of IPSEC in Secure Conferencing
4    Secured Operation with Relays
5    Conclusions
References

30 July 1999                                       ICECAR Security Architecture, v2.4                                                                        1
1 Introduction
As part of the UCL work on the ICECAR project, we contracted to provide a document on the
architecture for secured, multimedia conferencing. This subject could cover a number of
different areas. We decided to cover only those areas that we might be addressing on the
ICECAR project. For this reason we are not addressing the security aspects of H.323
conferencing [h323] - where the mechanisms are not yet agreed in the Standards bodies. We
will, however, address most aspects of Mbone conferencing.
We regard this document as a companion to [kir99], which gives the architecture of the
Mbone conferencing activity as carried out in the MECCANO project [kir99]. The tools and
other components for conferencing are being developed or integrated in the MECCANO
project; this includes the securing of the media tools. However, the integration of such tools
with a security infrastructure, and their alignment with the ICECAR security technology, are
the province of the ICECAR project. Thus in Section 2 we survey the tools and other
components being imported from the MECCANO project in the absence of security
considerations; it is no coincidence or plagiarism that this is a shortened form of the relevant
parts of [kir99]. Besides the tools and methods of announcing conferences and inviting
participants, this includes one of the mechanisms for providing gateways. In Section 3 we
then discuss how to build on the components of Section 2 to provide secured conferences.
While some of the subjects of Section 3 are broached superficially in [kir99], the real detail
on how they might be implemented is provided in this report.
There are two areas that we address here which may be implemented only partially within the
ICECAR project because of insufficient resource; nor did we contract to implement them
fully. One is the use of IPSEC; the second is the full securing of gateways. The components
for the first will be addressed in the COIAS project [coias]. IPSEC had not been mentioned in the
ICECAR proposal, because IPSEC implementations were not sufficiently advanced at the
time that we could be sure that they would be available for the ICECAR project; however,
this has been remedied in the meantime. The second will be addressed partially, but the full
area of securing relays is still a research issue, so that the treatment here, which is given in
Section 4, is incomplete.
There are three other relevant areas that we do not address. One is the securing of media
servers. We had hoped to do more in this area, but have decided that it is too large a subject
in its own right to be given only a superficial treatment here. A second is the deployment of multimedia
conferencing through firewalls. This is a very important area, which is being addressed in the
MECCANO project. We do not expect to be able to deploy this as part of the ICECAR
project, and refer to [kir99] for further detail. A third is the distribution of encryption keys for
multicast using the multicast trees for the distribution. This last is under consideration as a
research area by the Internet Research Task Force, and so is considered too speculative for
inclusion in this report.

2 Multicast Conferencing in the Absence of Security Considerations

2.1   Overview Architecture of Multicast Conferencing
The architecture that has evolved in the Internet is general as well as being scaleable to very
large groups; it permits the open introduction of new media and new applications as they are
devised. As the simplest case, it also allows two persons to communicate via audio only, i.e. it
encompasses IP telephony.

The determining factors of conferencing architecture are communication between (possibly
large) groups of humans and real-time delivery of information. In the Internet, this is
supported at a number of levels. The remainder of this section provides an overview of this
support, and the rest of the document describes each aspect in more detail.
In a conference, information must be distributed to all the conference participants. Early
conferencing systems used a fan-out of data streams, e.g. one connection between each pair of
participants, which means that the same information must cross some networks more than
once. The Internet architecture uses the more efficient approach of multicasting the
information to all participants (cf. Section 2.2).
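The saving can be made concrete with a small count of the streams that sources must originate. The sketch below is illustrative only: the function names are ours, and it assumes that with fan-out each sender transmits one copy to every other participant, whereas with multicast each sender transmits a single copy to the group.

```python
from typing import Optional


def unicast_streams(participants: int, senders: Optional[int] = None) -> int:
    """Streams originated with pairwise fan-out (assumed model).

    Every sender transmits one copy to each of the other participants.
    If senders is omitted, every participant is assumed to send.
    """
    if senders is None:
        senders = participants
    return senders * (participants - 1)


def multicast_streams(participants: int, senders: Optional[int] = None) -> int:
    """Streams originated with multicast: one per sender, whatever the group size."""
    if senders is None:
        senders = participants
    return senders
```

For a ten-party conference in which everybody sends, fan-out requires 90 streams against 10 with multicast; the gap widens quadratically as the group grows.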
Multimedia conferences require real-time delivery of streamed continuous media (audio and
video information) and reliable, near real-time delivery of shared workspace information. In a
datagram network, multimedia information must be transmitted in packets, some of which
may be delayed more than others. In order that audio and video streams be played out at the
recipient with the correct timing, information must be transmitted that allows the recipient to
reconstitute the timing. A transport protocol with the specific functions needed for this has
been defined (cf. Section 2.3). Two of the tools used in the MECCANO project to provide the
streamed continuous media are called RAT (for the audio) and VIC (for the video); some of
the salient characteristics of these tools are described in Section 2.4. There are similar tools
for shared workspace operation; one of those used in MECCANO is a network text editor
called NTE; details of this are also given in Section 2.4.
The humans participating in a conference generally need to have a specific idea of the context
in which the conference is happening, which can be formalised as a conference policy. Some
conferences are essentially crowds gathered around an attraction, while others have very
formal guidelines on who may take part (listen in) and who may speak at which point. In any
case, initially the participants must find each other, i.e. establish communication relationships
(conference set-up, Section 2.5). During the conference, some conference control information
is exchanged to implement a conference policy or at least to inform the participants who is



               Figure 1             Internet multimedia conferencing protocol stacks
Most of the protocol stacks for Internet multimedia conferencing are shown in Fig. 1. Most
of the protocols are not deeply layered, unlike many protocol stacks, but rather are used
alongside each other to produce a complete conference. For secure conferencing, there may
be additional protocols for group management. This question is addressed in Section 3.

The variety of different network technologies, workstation capabilities and conference system
technologies precludes the adoption of a single conferencing system, at a single speed, with
homogeneous facilities. Many mechanisms have been suggested, and some even
implemented, to address system heterogeneity. At some level these may be independent of
any direct intervention inside the network. For example, if layered coding is used, and all
workstations support multicast, a disadvantaged receiver may just not subscribe to all the
relevant multicast groups. However, the above constraints are too prescriptive, and alternatives
must be explored for heterogeneous environments.
There has been a recent move to consider Active Networks, in which each node can do
reasonably complex packet manipulations at the IP level. We consider this much too radical
for deployment in the MECCANO/ICECAR projects. Here we are prepared to put in Active
Components, but only at the Application level at boundaries between technologies. In Section
2.6, we will consider a specific component called the UCL Transcoding Gateway (UTG)
[kir98]. This is a device that is located near a specific change in network technology. While
the tools used in the Mbone technology are provided on both sides of the UTG, it is possible
to provide in that relay multicast-unicast conversion, video and audio multiplexing,
transcoding and packet filtering. It is also possible to arrange for that component to act as a
multicast node from the viewpoint of terminating some multicast groups and allowing clients
to subscribe to a limited range of such groups.
The ability to archive multimedia data from conferences and to introduce stored data into
conferences are requirements in many multimedia conferencing applications. Recording a
multimedia session enables anyone who could not originally participate to replay it and find
out the content of the discussion or seminar. Additionally, a participant in an on-going
conference may play back a pre-recorded clip in order to illustrate a point. Multimedia servers
with multicast capabilities along with recording, playback and editing facilities must be an
integral part of the emerging multimedia computing infrastructure. The five functions of
recording, storage, editing, announcement and playback are quite separate. They may use
quite different equipment and techniques - though there must be an integrating software
system to ensure that the system is easily usable. It is both acceptable, and in the future
probably normal, that there will be diverse systems for recording, with yet others for
playback. Section 2.7 describes some of the architectural considerations in such systems.

2.2     Multicast Traffic Distribution
IP multicast provides efficient many-to-many data distribution in an Internet environment. It
is easy to view IP multicast as simply an optimisation for data distribution; indeed this is the
case, but IP multicast can also result in a different way of thinking about application design.
To see why this might be the case, examine the IP multicast service model, as described by
Van Jacobson [flo95]:
    • Senders just send to the group
    • Receivers express an interest in receiving data sent to the group
    • Routers conspire to deliver data from senders to receivers
With IP multicast, the group is indirectly identified by a single IP class-D multicast address.
 Several things are important about this service model from an architectural point of view.
Receivers do not need to know who or where the senders are to receive traffic from them.
Senders never need to know who the receivers are. Neither senders nor receivers need care
about the network topology as the network optimises delivery.
 The level of indirection introduced by the IP class D address denominating the group solves
the distributed systems binding problem, by pushing this task down into routing. Given a
multicast address (and UDP port), a host can send a message to the members of a group
without needing to discover who they are. Similarly receivers can "tune in" to multicast data
sources without needing to bother the data source itself with any form of request.
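As a minimal sketch of this service model, assuming the standard BSD socket interface as exposed by Python's socket module, a receiver expresses interest by joining the group and a sender simply sends to the group address. The function names are ours, and any group address and port passed in (for example 224.2.0.1 and 5004) are placeholders for illustration rather than values taken from this report.

```python
import ipaddress
import socket
import struct


def is_multicast_group(addr: str) -> bool:
    """True if addr is an IPv4 class D address (224.0.0.0/4)."""
    return ipaddress.IPv4Address(addr) in ipaddress.IPv4Network("224.0.0.0/4")


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure used with IP_ADD_MEMBERSHIP."""
    if not is_multicast_group(group):
        raise ValueError(f"{group} is not a class D address")
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group: str, port: int) -> socket.socket:
    """Receiver side: bind to the port, then express interest in the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_to_group(group: str, port: int, payload: bytes, ttl: int = 1) -> None:
    """Sender side: just send to the group; the receivers are never named."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_UDP) as sock:
        # The TTL limits how far the packet travels (1 = local network only).
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        sock.sendto(payload, (group, port))
```

Note that the sender never opens a connection to, or learns the address of, any receiver; the only shared knowledge is the group address and port.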
 IP multicast is a natural solution for multi-party conferencing because of the efficiency of the
data distribution trees, with data being replicated in the network at appropriate points rather
than in end-systems. It also avoids the need to configure special-purpose servers to support
the session; such servers require support, cause traffic concentration and can be a bottleneck.
For larger broadcast-style sessions, it is essential that data-replication is carried out in a way
that requires only that per-receiver network-state is local to each receiver, and that data-
replication occurs within the network. Attempting to configure a tree of application-specific
replication servers for such broadcasts rapidly becomes a "multicast routing" problem; thus
native multicast support is a more appropriate solution.
There are a number of IETF documents outlining the requirements of hosts and multicast
routing. The most important, which defines the host extensions for IP multicast, is [dee89]. Many
mechanisms have been proposed for multicast routing. It is beyond the scope of this
Deliverable to discuss the differences and advantages of the different proposals.

2.3   Transport Protocols
So-called real-time delivery of video and audio traffic requires little in the way of transport
protocol. In particular, real-time traffic that is sent over more than trivial distances is not re-
transmittable. With packet multimedia data there is no need for the different media
comprising a conference to be carried in the same packets. In fact it simplifies receivers if
different media streams are carried in separate flows (i.e., separate transport ports and/or
separate multicast groups). This also allows the different media to be given different quality
of service. For example, under congestion conditions, a router might preferentially drop
video packets over audio packets. In addition, some sites may not wish to receive all the
media flows. For example, a site with a slow access link may be able to participate in a
conference using only audio and a whiteboard, whereas other sites in the same conference
with more capacity may also send and receive video. This can be done because the video can
be sent to a different multicast group than the audio and whiteboard. This is a first step towards
coping with heterogeneity by allowing the receivers to decide how much traffic to receive,
and hence allowing a conference to scale more gracefully.
The transport protocol for real-time flows is the Real-time Transport Protocol (RTP) [sch97]. This
provides a standard format packet header which gives media specific timestamp data, as well
as payload format information and sequence numbering amongst other things. RTP is
normally carried using UDP. It does not provide or require any connection set-up, nor does it
provide any enhanced reliability over UDP. For RTP to provide a useful media flow, there
must be sufficient capacity in the relevant traffic class to accommodate the traffic. How this
capacity is ensured is independent of RTP. Each original RTP source is identified by a source
identifier, which is carried in every packet. RTP allows flows from several sources to be
mixed in gateways to provide a single resulting flow. When this happens, each mixed packet
contains the source IDs of all the contributing sources. RTP media timestamp units are flow
specific - they are in units that are appropriate to the media flow. For example, 8kHz sampled
PCM-encoded audio has a timestamp clock rate of 8kHz. This means that inter-flow
synchronisation is not possible from the RTP timestamps alone.
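The header fields mentioned above can be illustrated by a short parser for the fixed RTP header. This is a sketch only: it assumes the 12-byte fixed header layout of the RTP specification followed by the list of contributing source identifiers, it ignores header extensions and padding, and the class and function names are ours.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class RtpHeader:
    version: int
    marker: bool
    payload_type: int     # identifies the payload format
    sequence: int         # 16-bit sequence number
    timestamp: int        # media timestamp, in flow-specific clock units
    ssrc: int             # identifier of the source (or of the mixer)
    csrc: List[int]       # contributing sources listed by a mixer


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed RTP header and the CSRC list; payload is not examined."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed RTP header")
    first, second, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = first & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrc = list(struct.unpack(f"!{csrc_count}I", packet[12:end]))
    return RtpHeader(first >> 6, bool(second & 0x80), second & 0x7F,
                     seq, ts, ssrc, csrc)


def media_seconds(ts_start: int, ts_now: int, clock_rate: int) -> float:
    """Elapsed media time between two timestamps of the SAME flow.

    The subtraction is done modulo 2**32 so that timestamp wrap-around
    is tolerated.
    """
    return ((ts_now - ts_start) & 0xFFFFFFFF) / clock_rate
```

With an 8 kHz audio clock, a timestamp difference of 160 units corresponds to 20 ms of audio. Because each flow has its own clock rate and its own starting offset, such differences are only meaningful within one flow.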
Each RTP flow is supplemented by RTP Control Protocol (RTCP) packets. There are a
number of different RTCP packet types. RTCP packets provide the relationship between the
real-time clock at a sender and the RTP media timestamps so that inter-flow synchronisation
can be performed, and they provide textual information to identify a sender in a conference
from the source ID.
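The way in which this clock relationship permits inter-flow synchronisation can be sketched as follows. The sketch assumes that a receiver has, for each flow, one report pairing the sender's real-time clock with an RTP timestamp taken at the same instant; the names SenderReport, to_wallclock and playout_skew are ours, and the 90 kHz video clock in the usage note is an assumed typical value.

```python
from dataclasses import dataclass


@dataclass
class SenderReport:
    wallclock: float      # sender's real-time clock, in seconds
    rtp_timestamp: int    # media timestamp taken at the same instant
    clock_rate: int       # timestamp units per second for this flow


def to_wallclock(report: SenderReport, rtp_timestamp: int) -> float:
    """Map a media timestamp of a flow onto the sender's real-time clock."""
    # Signed 32-bit difference, so that wrap-around and timestamps slightly
    # older than the report are both handled.
    delta = (rtp_timestamp - report.rtp_timestamp) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return report.wallclock + delta / report.clock_rate


def playout_skew(audio: SenderReport, audio_ts: int,
                 video: SenderReport, video_ts: int) -> float:
    """Seconds by which the video sample was captured after the audio sample."""
    return to_wallclock(video, video_ts) - to_wallclock(audio, audio_ts)
```

For example, with an 8 kHz audio flow and a 90 kHz video flow, an audio timestamp 8000 units after its report and a video timestamp 90000 units after its report both map to one second after the reported instant, so the two samples should be played out together.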
Continuous-media tools like RAT and VIC do not normally provide error re-transmission
facilities; they can reconstruct missing audio and video at the receiver, sometimes helped by
forward error correction (FEC) data to correct errors incurred during transmission [per98].
By contrast, shared workspace tools, like the network text editor (NTE) mentioned in Section
2.4, require fully reliable transmission. To achieve this, a reliable multicast transport is
needed. Various techniques to achieve this have been implemented [han97], [jac93]. Again,
the details are beyond the scope of this Deliverable.

2.4   The Multimedia Tools
There are many tools that can be used in multimedia conferencing. In ICECAR, we plan to
provide three particular tools - VIC [mcc95] for video, RAT [har95] for audio and NTE
[han97] as a shared text editor. Fundamental to the transmission of audio and video streams
over digital networks is the use of coders and decoders; their combination is called a codec.
These are devices that sample the analogue signals, and process the resulting digital streams.
This processing, which is done in the codec, will require variable amounts of processing
power, and produce output with different properties. The algorithms used in different codecs
are beyond the scope of this Deliverable. Different codec algorithms can have different
compression factors; i.e. for a given picture, different amounts of data are generated. If one
part of the network is able to transmit at one speed without undue network error, and another
has a lower capacity, it may be necessary to use different coding algorithms in the two
regions. To mediate between the two may require decoding and re-coding (though this may be
possible completely in the digital domain). Devices that carry out this process are called
transcoders.
Another property of codecs is that they may be scaleable, producing different layers of coding
[mcc96]. A receiver may process one, some or all of these layers. With well-structured
layered coding, processing one layer will provide a minimal quality of media; processing
more layers will provide progressively better quality. If all the layers are sent over one
multicast group, then a layered codec may not be architecturally different from other codecs
for the purpose of this paper. However, if it is easy for an intermediate node to recognise the
different layers, then it may be easy to provide digitally the equivalent of transcoding. It is
also possible to send different layers to different multicast groups. By subscribing only to
some groups, a receiver may avoid overloading the network, or its own processor.
Alternatively, by forwarding only certain multicast groups, an active element in the network
may ensure the protection of a lower capacity region. Both these mechanisms do have an impact
on the architecture of multicast conferencing.
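The receiver-driven form of this mechanism can be sketched as a simple selection rule. The sketch assumes that layers are cumulative (the base layer first, each further layer useful only with those before it), that each layer is sent to its own multicast group, and that the receiver knows its usable capacity; the function names and the rates in the usage note are illustrative assumptions.

```python
from typing import List


def layers_to_join(layer_rates_kbps: List[float], capacity_kbps: float) -> int:
    """Number of layers, starting from the base layer, that fit the capacity."""
    total = 0.0
    joined = 0
    for rate in layer_rates_kbps:
        if total + rate > capacity_kbps:
            break          # this layer and all higher ones are skipped
        total += rate
        joined += 1
    return joined


def groups_for_receiver(groups: List[str], layer_rates_kbps: List[float],
                        capacity_kbps: float) -> List[str]:
    """Multicast groups a receiver should subscribe to, base layer first."""
    return groups[:layers_to_join(layer_rates_kbps, capacity_kbps)]
```

With layers of 32, 64 and 128 kbit/s, a receiver with 100 kbit/s of capacity would join the first two groups only. An active element protecting a low-capacity region could apply the same rule to decide which groups to forward.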
There are several shared workspace tools, e.g. WB [jac93], NetMeeting [net] and NTE
[han97]. For ICECAR purposes, we want a tool over which we have complete control, and for
which we have the source code. For this reason we have chosen to work with NTE. This is a
shared text editor, which is very useful for collaborative meetings, in which there is a
considerable amount of text to be reviewed by several parties.

2.5   Session Announcement, Invitation and Tool Launch

2.5.1 Introduction
 There are several functions in the set-up of conferences. First it is necessary to announce the
existence of a conference, and give a basic description of its characteristics. One may
announce it in various ways, and rely on the interested participants to find out about its
existence by themselves; one can announce its existence on a special port, or one can invite
particular people to participate. Having found out about the conference, participants can join
with various mechanisms. We discuss here some of the technologies adopted. These include a
basic form of conference discovery mechanism. The conferences can be announced in a
“broadcast” mode, individuals can be invited in real time, or information about the session
can be provided off-line – by putting the information in a depository or sending it by e-mail.
Each is described briefly below.

2.5.2 Session Descriptions
The rendezvous mechanism for many lightweight sessions is a multicast-based session
description. The session descriptions provide an advertisement that the session will exist, and
also provide sufficient information including multicast addresses, ports, media formats and
session times so that a receiver of the session description can join the session. The Session
Description Protocol (SDP) [han98] describes the content and format of a multimedia session.
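To illustrate the kind of information carried, the sketch below extracts the session name and, for each medium, the port, transport, formats and multicast address from a description in the line-oriented "type=value" form of SDP. It is a deliberately minimal reading, not a complete SDP parser: it handles only the s=, c= and m= lines, and the names Media and parse_sdp are ours.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Media:
    kind: str                 # e.g. "audio" or "video"
    port: int
    transport: str            # e.g. "RTP/AVP"
    formats: List[str]        # payload format identifiers
    address: Optional[str]    # connection address used for this medium


def parse_sdp(text: str) -> Tuple[Optional[str], List[Media]]:
    """Return (session name, media list) from a session description."""
    session_name = None
    session_address = None
    media: List[Media] = []
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue                      # not a "type=value" line
        key, value = line[0], line[2:]
        if key == "s":
            session_name = value
        elif key == "c":
            # "c=IN IP4 <address>[/ttl]"; drop the TTL suffix if present.
            address = value.split()[2].split("/")[0]
            if media:
                media[-1].address = address   # media-level connection line
            else:
                session_address = address     # session-level default
        elif key == "m":
            kind, port, transport, *formats = value.split()
            media.append(Media(kind, int(port.split("/")[0]),
                               transport, formats, None))
    for m in media:
        if m.address is None:
            m.address = session_address
    return session_name, media
```

A receiver that has obtained the address, port and format for each medium in this way has what it needs to start the corresponding tool and join the session.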

2.5.3 Session Announcements
One method of announcing sessions is to send the session description on a well-known
multicast port, with a specific scope, using the Session Announcement Protocol (SAP)
[han199]. The announcement includes some information like the organiser of the conference,
some authentication information and the Session Description. People wishing to participate in
a particular conference must then listen for the SAP announcement, and start up their tools
with the SDP details provided. An automated tool (SDR [sdr]) has been developed which can
receive the "broadcast" session descriptions [han98], browse through all sessions currently
being announced, and then start up the relevant tools. An important aspect is that if the
announcement message is received, there is a high probability that the session itself can
be joined. This mechanism can also be applied to advertise tightly coupled sessions, and
only requires that additional information about the mechanism to use to join the session be
given. However, as the number of sessions in the session directory grows, we expect that
only larger-scale public sessions will be announced in this manner; smaller, more private,
sessions will tend to use direct invitation rather than advertisement. This is because otherwise
either the bandwidth required by SAP, or the interval between announcements will become
too large.
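The trade-off can be shown with a little arithmetic. The sketch below assumes that all announcers in a scope share a fixed bandwidth budget and that every announcement is repeated in turn; the default budget of 4000 bit/s and the minimum interval of 300 s are illustrative assumptions, not figures taken from this report.

```python
def announcement_interval(num_sessions: int, announcement_bytes: int,
                          bandwidth_bps: float = 4000.0,
                          min_interval_s: float = 300.0) -> float:
    """Seconds between repeats of one announcement under a shared budget.

    The time needed to send every announcement once at the budgeted rate
    is 8 * sessions * size / bandwidth; the interval is never shorter than
    the configured minimum.
    """
    needed = 8 * num_sessions * announcement_bytes / bandwidth_bps
    return max(min_interval_s, needed)
```

Under these assumptions, 100 sessions of 500 bytes each can be repeated every 300 s, whereas 1000 such sessions stretch the interval to 1000 s, so that a newly started listener may wait many minutes before hearing about a given session.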

2.5.4 Session Invitations
Not all sessions are advertised, and even those that are advertised may require a mechanism to
invite explicitly, but in real-time, a user to join a session. Such a mechanism is required
regardless of whether the session is a lightweight session or a more tightly coupled session,
although the invitation system must specify the mechanism to be used to join the session.
As users are mobile, it is important that such an invitation mechanism be capable of locating
and inviting a user in a location-independent manner. Thus user addresses need to be used as
a level of indirection rather than routing a call to a specific terminal. The invitation
mechanism should also provide for alternative responses, such as leaving a message or being
referred to another user, should the invited user be unavailable. The Session Initiation
Protocol (SIP) [han99] provides a mechanism whereby a user can be invited to participate in a
conference. SIP does not care whether the session is already ongoing, or is just being created.
It does not care whether the conference is a small tightly coupled session or a huge broadcast
- it merely conveys an invitation to a user in a timely manner, inviting them to participate, and
provides enough information for them to be able to know what sort of session to expect. Thus
although SIP can be used to make telephone-style calls, it is by no means restricted to that
style of conference.

                   Figure 2            Joining a lightweight multimedia session

2.5.5 Offline Mechanisms
It is also possible to use off-line mechanisms for providing the information on up-coming
sessions. One is to send the information by e-mail; mechanisms for parsing e-mail
messages and starting sessions automatically have been provided [scua]. While this is an
adequate method, it really needs to be provided in a generic way, so that it can be parsed
automatically by the mail systems. Since a special MIME type has been defined for
specifying SDP, it would be best to use this for all implementations passing SDP in messages.
It is then possible to provide a MIME Plug-in for each e-mail tool, to parse automatically this
particular MIME type and launch the tool. It is also desirable to provide a mechanism for
obtaining listings of all sessions currently available, or announced for a certain interval. We
plan to provide these from a WWW browser in MECCANO, and then consider their
relevance to secure conferencing in ICECAR.
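As a sketch of the generic approach, the fragment below uses Python's standard email package to attach a session description under the application/sdp MIME type and to recover it at the receiving end, which is the point at which a plug-in would launch the tools. The function names, the attachment file name and the covering text are our assumptions; launching the tools is outside the sketch.

```python
from email.message import EmailMessage
from typing import Optional


def sdp_invitation(sender: str, recipient: str, subject: str,
                   sdp_text: str) -> EmailMessage:
    """Build an e-mail carrying a session description as application/sdp."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("A session description is attached.")
    msg.add_attachment(sdp_text.encode("utf-8"),
                       maintype="application", subtype="sdp",
                       filename="session.sdp")
    return msg


def extract_sdp(msg: EmailMessage) -> Optional[str]:
    """Return the first application/sdp attachment, or None if there is none."""
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/sdp":
            return part.get_content().decode("utf-8")
    return None
```

Because the attachment is identified by its MIME type rather than by conventions in the message text, any mail tool with a suitable plug-in can recognise it without parsing the body.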
Alternatively, the information can be put into a repository known to the potential participants.
The information can then be extracted at will by the potential participants. This mechanism is
convenient if potential participants can be expected to access the directory sufficiently often.
A combination of the use of ordinary e-mail to announce the existence of a conference,
together with a directory mechanism, will probably be the most popular for a large class of
applications. Specialised Web-based tools already exist which allow the browsing through
lists of conferences, together with client plug-ins that can extract the relevant Session
Description and start up the tools. There is still a potential problem regarding address allocation. If
the Session Announcements are made privately, the address allocation is not necessarily
unique; two independent announcements may be made of different conferences on the same
multicast address/port. If the address allocation by the announcer is made randomly, this is
unlikely to occur – and will be easily detected. Moreover, the MMUSIC group is addressing
this problem at the moment; any solutions it develops will also be incorporated in the
MECCANO architecture. Multicast sessions are scope-limited; hence although an
announcement may be retrieved, it does not follow that the retriever can participate in a
particular session. Finally, use of a repository can lead to a single point of failure – the
availability or accessibility of the repository. It is possible to ameliorate this problem by
putting the announcements in several repositories; this is comparable with the running of a
primary and several secondary DNSs in the Internet.
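How unlikely a clash is under random allocation can be estimated with the usual "birthday" calculation. The sketch below assumes that each of a number of concurrent, privately announced sessions picks its address (or address and port pair) independently and uniformly from a pool of a given size; the pool size in the usage note is an illustrative assumption.

```python
def collision_probability(num_sessions: int, address_space: int) -> float:
    """Probability that at least two sessions pick the same address."""
    if num_sessions > address_space:
        return 1.0
    p_all_distinct = 1.0
    for i in range(num_sessions):
        # The (i+1)-th session must avoid the i addresses already taken.
        p_all_distinct *= (address_space - i) / address_space
    return 1.0 - p_all_distinct
```

For example, with 100 concurrent sessions drawing from a pool of 65536 addresses the probability of any clash is roughly 7 per cent, and it falls further if the port is randomised as well.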
None of these mechanisms will scale well to very large conferences, because of the potential
number of messages or depository accesses. However, it is not clear that we have the tools to
manage really large conferences in any case.

2.6   Relays
The UCL Transcoding Gateway (UTG) provides access to multicast conferences for hosts
with only unicast connectivity. In addition, it provides limited transcoding and mixing
functions, primarily for audio.
The UTG was developed as part of the MERCI/MECCANO projects. A conceptual outline of
the UTG system is illustrated in Fig. 3.



Figure 3:       Conceptual outline of the UTG system
The components in the UTG architecture are expected to perform the following tasks:
• The RTSP controller module will provide the control interface to the unicast-only end-
• The access control module is used to verify that requests for transcoding and gatewaying
  are from authorised users.
• One or more media engines are instantiated to perform transcoding and gatewaying when
• Finally, the Mbus controller provides the necessary glue between all the other modules.

A number of the components necessary for the UTG already exist. In particular, the media
engines are well developed.
To ease operations, one engine that will be developed is a SAP proxy. Just as the invitee of
Section 2.5 can extract the Session Description from a session announcement or invitation, so
the relay will be invited to join the conference. The relays may be scattered around the
Internet, and the organiser of a conference will have no awareness of their location or
identities. Would-be participants, normally at the end of unicast links, will activate the
relays in much the same way as they would start the media engines in their own workstations.
For this reason, the relay will be invited to join the conference by the would-be user. The
session descriptions are passed by the relay to the end-user workstation.

3 Security Considerations in Multicast Conferencing

3.1   Introduction
There is a temptation to believe that multicast is inherently less private than unicast
communication since the traffic visits so many more places in the network. In fact, this is not
the case except with broadcast and prune-type multicast routing protocols [dee91]. However,
IP multicast does make it simple for a host to anonymously join a multicast group and receive
traffic destined to that group without the other senders' and receivers' knowledge. If the
application requirement (conference policy) is to communicate between some defined set of
users, then strict privacy can only be enforced in any case through adequate end-to-end encryption.
One way to secure media streams is at the transport level. Continuous media transport uses
RTP [sch97]; there is a standard way to encrypt RTP and RTCP packets using symmetric
encryption schemes such as DES [des]. The standard also specifies a mechanism to
transform plain-text keys using MD5 [riv92], so that the resulting bit string can be used as an
encryption key. Similar techniques can be used for encrypting the contents of the non-AV
portions of the conferences. Most of our early work will be done with DES because of the
prevalence of its implementations; later we will move to more secure encryption algorithms.
The symmetric encryption algorithm used has only pragmatic, not architectural, significance.
DES is now deprecated in the IETF community, being relegated to "historical" status. In this
architectural note we will normally use the word DES as shorthand for a fairly
conventional symmetric encryption algorithm.
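The derivation of an encryption key from a plain-text pass-phrase can be sketched as follows. This is a minimal illustration, not the code used in the media tools: the canonicalisation rules (trimming, collapsing white space, lower-casing, UTF-8 encoding) and the choice of taking the key from the leading bytes of the digest are assumptions made for the example, and the function names are hypothetical.

```python
import hashlib


def canonical_pass_phrase(pass_phrase: str) -> bytes:
    # Assumed canonical form: strip leading/trailing white space, collapse
    # internal runs of white space to one space, lower-case, encode as UTF-8.
    return " ".join(pass_phrase.split()).lower().encode("utf-8")


def derive_key(pass_phrase: str, key_bytes: int = 8) -> bytes:
    """Hash the canonical pass-phrase with MD5 and keep the leading bytes.

    key_bytes=8 gives a 64-bit DES key block (DES itself uses 56 of those
    bits; the remainder are parity bits).
    """
    digest = hashlib.md5(canonical_pass_phrase(pass_phrase)).digest()  # 16 bytes
    if not 1 <= key_bytes <= len(digest):
        raise ValueError("key_bytes must be between 1 and 16")
    return digest[:key_bytes]
```

Because every participant applies the same transformation, a pass-phrase sent by secure e-mail or read out over the telephone yields the same key at each site, whatever the local spacing or capitalisation.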
There are mechanisms defined in the IETF for standard secure operations on IP packets
[tha98]. These are not yet defined fully for multicast operation; in particular, the key-
exchange mechanisms are not yet developed. There will be some discussion of mechanisms
for multicast key distribution given below. We have not yet decided fully whether IPSEC will
be used in the ICECAR project; it was not part of the Programme Plan [icecar]. The use of
IPSEC is part of other research projects that the MECCANO participants (in particular UCL)
are doing with other funding bodies. The results will probably not be reflected, however, into
the ICECAR deliverables during the remaining life of ICECAR.
Because plain-text pass-phrases can be used to derive symmetric encryption keys,
one can use simple out-of-band mechanisms such as any privacy-enhanced mail scheme e.g.
[cal98], [pgp99] or S-MIME for encryption key exchange. It is also possible to integrate the
key-exchange mechanism at least partially into the session announcements and invitations.
Each of these methods is considered below.
There is currently considerable controversy whether Session Announcements are an
appropriate mechanism for announcing limited sessions - and thus whether there is a place for
encrypted Session Announcements at all. One argument is that already, when most sessions
are open, the bandwidth taken up for Session Announcements is quite large. With the
bandwidth allocated for such announcements one typically still has to wait some 10 minutes
before an announcement is refreshed. Another concern is that one of the functions of Session
Announcements is to avoid conflicts in the use of multicast addresses; this avoidance is
impractical if the whole announcement is encrypted, and too much information is released if
the multicast address and time are sent in the clear. The first concern could be addressed by
using Session Announcement cache proxies [swa98]; the second by separating out the address
allocation functionality from the rest of the announcement mechanism. Both questions are
still being discussed in the IETF. In the meantime, the standardisation of encrypted session
announcements is being hampered; some of the mechanisms considered here have not been
ratified by the IETF. The use of proxies will not affect the time taken to refresh the proxies; it
will, however, dramatically reduce the time for a workstation to obtain Session Descriptions
from such proxies.
In Section 3.2, we discuss how privacy is provided by encrypting the data streams in the
media tools at the application level. It is still necessary, of course, to distribute securely the
encryption keys to all authorised participants in the conference. In Section 3.3, we consider
the provision of security at the network level by the use of IPSEC. To set up secure
conferences, it is necessary to distribute securely the Session Description of the conference;
the mechanisms to achieve this are discussed in Section 3.4. There are attempts to use
multicast groups to provide this sort of key distribution; all such techniques are still in the
pure research stage, and will not be pursued in the ICECAR project. In Section 3.5, we
consider how secure conferencing will change if IPSEC techniques are introduced; although
the resulting architecture is explored, it is unlikely that there are sufficient resources to
attempt an implementation under ICECAR auspices.

3.2     Encryption of Media Streams
If the media or shared application tools send their data in the clear, then it is easy for anybody
knowing the time and multicast address of a session to participate. Moreover, since the RTCP
responses could be suppressed, the participation may be unnoticed by the other participants.
For this reason, private conferences will need to have the data streams encrypted. Because of
the processing load arising from encryption, it is customary to encrypt the streams with
symmetric algorithms. Typically one uses DES [des], though it would be possible to use
triple-DES [tdes] or IDEA [lai92] or any other such algorithm if desired. In view of the
current move in the IETF to abandon DES, because of the ease of cracking it, we will
presumably move over to triple-DES or IDEA here too - though this is a pragmatic detail, not
something of architectural significance.
A subset of the tools used in MECCANO has had the capability for encryption added; the
subset is large enough that most of the functionality for encrypted conferencing is available.
These tools are currently VIC, RAT, VAT, WBD and NTE.

3.3     Network Level Security and IPSEC

3.3.1    The Background of IPSEC
The disadvantage of application-level security is that it is necessary to secure each tool
separately. This was the most convenient method of securing conferences until recently,
because the alternative - to secure the media stream at the network level - required a different
network IP stack. While such a stack has been proposed and been on the standardisation track
for nearly a decade, the relevant standards have only been ratified in the IETF over the last
few months. There are now a number of reference implementations, and many suppliers are
about to provide the new functionality in their main production stacks.
The TCP/IP protocols, in their current version 4, were originally designed some 20 years ago for
data transmission. Security of the communication network layers was not a major concern, and
attention was focused on data rates and data loss avoidance. The goal of these protocols was
to ensure reliable transfers between two hosts, without worrying about malevolence. Now, the
need for secure communication has widely increased with the growing use of the Internet by
companies, and the appearance of electronic business. Simplicity, scalability and universality
of IP were the keys to its success. It was designed to be implemented easily in a large range of
networks from high-speed networks to dial-up access via low-rate modems.
A new Internet protocol (IPv6 [dee95]) is being designed in order to prepare for the future. It
takes into account questions of security through defining two new network layer security
mechanisms. These mechanisms, combined with adequate security policies and key
management techniques, provide authentication of IP packet emitters, integrity of IP packets
and confidentiality of communications. Though defined from the perspective of IPv6, these
mechanisms can be used with the IPv4 protocol, without any drawback for transit routers not
implementing IPSEC. These mechanisms are independent of the protocols above the network
layer, thus requiring no modification of either applications or transport protocols.
3.3.2     Global architecture
IPSEC is a security extension to the network layer of the Internet protocol. It protects all
layers from network to application layer during the transfer of a datagram on the network. It
may be implemented at different levels inside or outside an IP protocol stack.
Three types of architecture have been standardised:
• native implementations
IPSEC is integrated in the native IP implementation. This requires access to the IP source
code and is applicable to both hosts and security gateways.

        User environment     Application

                             System kernel interface (Socket)

        Kernel               TCP/UDPv4             TCP/UDPv6
                             IPv4                  IPv6

        MAC layers           Network interface

                      Figure 4           Native implementations of IPSEC
• bump-in-the-stack (BITS) implementations
IPSEC is implemented underneath an existing implementation of an IP protocol stack,
between the native IP and the local network drivers. Source code is not required in this
case.

        User environment     Application

                             System kernel interface (Socket)

        Kernel               TCP/UDPv4             TCP/UDPv6
                             IPv4                  IPv6

                             IPSEC module

        MAC layers           Network interface

            Figure 5       Bump-in-the-stack (BITS) implementations of IPSEC
• bump-in-the-wire (BITW) implementations
An outboard cryptographic processor is attached to the network and serves one or more hosts.
This device is usually IP-addressable. When supporting a single host, it may be quite
analogous to a BITS implementation, but in supporting a router or firewall, it must operate as
a security gateway.

        User environment     Application

                             System kernel interface (Socket)

        Kernel               TCP/UDPv4             TCP/UDPv6
                             IPv4                  IPv6

        MAC layers           Network interface

        Cryptography         IPSEC module

            Figure 6       Bump-in-the-wire (BITW) implementations of IPSEC
3.3.3     Principles of IPSEC
IPSEC consists of the definition of two network layer security mechanisms and their
management. Two new extension headers were defined to convey security information in an
IP packet:
The Authentication Header (AH): The AH is inserted between the IP header and the upper
layer protocol headers. It aims at authenticating the emitter of the packet, and at guaranteeing
that the packet, once sent, was not modified along the path on the network (integrity).
Integrity is not only guaranteed for data, but also for protocol header fields, including source
and destination addresses, TCP ports (when TCP is used), and so on. Processing of AH
consists of calculating an electronic signature on the packet, normally by applying a hash
function with a secret key, and inserting this signature into the packet headers. The recipient
verifies that the received signature is valid using either a secret key shared with the sender or
the sender’s public key.

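[Diagram: the original packet (IP header; data of the upper protocols) is processed with a
secret key to produce a signature; the protected packet consists of the IP header, the
Signature (AH) and the data of the upper protocols.]

                                      Figure 7                  Authentication with IPSEC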
Integrity is guaranteed since the hash result is completely different if only one byte of the
original message is modified. Authentication is due to the secrecy of the key.
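The keyed-hash principle behind the AH can be illustrated with a few lines of code. This is a simplified sketch, assuming HMAC-SHA-1 as the keyed hash and treating the header and payload as opaque byte strings; a real AH implementation excludes header fields that change in transit (for example the time-to-live) before hashing, which is not modelled here. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_packet(secret_key: bytes, header: bytes, payload: bytes) -> bytes:
    """Compute a keyed hash over the header fields and the payload."""
    return hmac.new(secret_key, header + payload, hashlib.sha1).digest()


def verify_packet(secret_key: bytes, header: bytes, payload: bytes,
                  signature: bytes) -> bool:
    """Recompute the keyed hash and compare it in constant time."""
    expected = sign_packet(secret_key, header, payload)
    return hmac.compare_digest(expected, signature)
```

A recipient holding the shared key accepts the packet only if `verify_packet` returns true; a change to either the header bytes or the payload, or the use of a different key, makes the check fail.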
The Encapsulating Security Payload (ESP): The ESP encrypts all of the protected data. It
provides data confidentiality. ESP processing consists of enciphering a part or the entirety of
the packet. Integrity data may (and should) be appended to the ESP header. Only the enciphered
part is authenticated in that case.

[Diagram: the upper-protocol data following the IP header is enciphered with a secret key,
and a signature computed with a secret key is added to the resulting packet.]

       Figure 8           Encapsulating Security Payload with IPSEC in its Transport Mode
Two modes exist for both AH and ESP: transport mode and tunnel mode. The preceding
figure shows how transport mode is handled. Tunnel mode consists of first prepending a new
IP header, called the "outer IP header", whose fields may differ from those
of the "inner IP header", and then applying AH or ESP processing, treating the original packet as
raw data. Tunnel mode thus protects the entire IP packet by the technique of "tunnelling".
Two types of system may implement IPSEC: hosts and security gateways. The latter use IPSEC
to encapsulate traffic leaving the trusted area and unwrap traffic entering the area.
3.3.4       The Security Association Database (SAD)
A Security Association (SA) defines the IPSEC processing that will be applied to a packet. A
Security Association is an agreement between two hosts that consists of the IP destination of
the communication, the IPSEC mechanism to apply (AH or ESP) and its mode, the
cryptographic algorithm and its key(s), etc. A security association is entirely defined by the
triple (destination address, mechanism (AH or ESP), Security Parameters Index (SPI)).
Security Associations are negotiated using mechanisms such as IKE/ISAKMP (Internet Key
Exchange protocol/Internet Security Association and Key Management Protocol) [har98], [mau98].
They may also be maintained with keys derived by other negotiation mechanisms.
SAs are stored in a database called the SAD. Selection of a particular SA depends on the
characteristics of the communicating hosts and on security requirements of the
communicating application or the system security policy.
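As an illustration of how the identifying triple can serve as a lookup key, the following sketch models the SAD as a dictionary. The class and field names are hypothetical and chosen for the example; they are not taken from any IPSEC implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (destination address, mechanism, Security Parameters Index)
SAKey = Tuple[str, str, int]


@dataclass(frozen=True)
class SecurityAssociation:
    destination: str   # IP destination of the communication
    mechanism: str     # "AH" or "ESP"
    spi: int           # Security Parameters Index
    mode: str          # "transport" or "tunnel"
    algorithm: str     # name of the cryptographic algorithm
    key: bytes         # key material for that algorithm


class SecurityAssociationDatabase:
    """Toy SAD: one entry per (destination, mechanism, SPI) triple."""

    def __init__(self) -> None:
        self._entries: Dict[SAKey, SecurityAssociation] = {}

    def add(self, sa: SecurityAssociation) -> None:
        self._entries[(sa.destination, sa.mechanism, sa.spi)] = sa

    def lookup(self, destination: str, mechanism: str,
               spi: int) -> Optional[SecurityAssociation]:
        # Returns None when no SA has been set up for this triple.
        return self._entries.get((destination, mechanism, spi))
```

For an incoming packet, the destination address, the header type (AH or ESP) and the SPI carried in that header are enough to find the parameters needed to process it.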
3.3.5       Security policies. The Security Policy Database (SPD)
The IETF specification defines a database that determines IPSEC processing to be applied to
IP traffic originating from or sent to the host. Each entry, called a Security Policy (SP) is
composed of two parts: a selector and an SA specification. Traffic matching the selector must
be processed using an SA matching the specification. To solve the problem of packets
matching several selectors, SPs are ordered in the SPD, and the first matching entry is applied.
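The effect of ordering the SPD can be shown with a small first-match lookup. This is a sketch under simplifying assumptions: the selector is reduced to a destination prefix (real selectors can also cover source address, protocol and ports), and the SA specification is just a descriptive string. All names and addresses are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class SecurityPolicy:
    selector: str          # destination prefix, e.g. "224.2.0.0/16"
    sa_specification: str  # e.g. "ESP/transport"


class SecurityPolicyDatabase:
    """Toy SPD: policies are kept in order and the first match wins."""

    def __init__(self, policies: List[SecurityPolicy]) -> None:
        self._policies = list(policies)

    def match(self, destination: str) -> Optional[SecurityPolicy]:
        address = ip_address(destination)
        for policy in self._policies:
            if address in ip_network(policy.selector):
                return policy  # earlier entries take precedence
        return None
```

Placing the most specific selectors first lets a single conference address be treated differently from the rest of a multicast range, while a catch-all entry at the end covers all remaining traffic.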
The kernel will choose a hybrid policy combining the maximum security requirements of the
SPD and the user security policy.

3.3.6 IPSEC, IP Multicast and Conferencing
The use of IPSEC in IP multicast environments has not yet been completely defined. The
main reason for this is that the key exchange mechanisms like IKE do not work with multiple
end points. In fact the security associations may have to be defined before all the end-points
are associated with the exchange. To the extent that IP multicast is used for routing packets,
there is no reason why its IPSEC extensions cannot be used directly - provided that the
security policies reflect the reality of how the SAs are set up. If the authentication and
encryption keys are provided by appropriate mechanisms, then there should be no problem in
securing the media streams at the network level.
There is the potential problem that IPSEC normally secures streams at the host level, while
conferencing is more concerned with the identity of persons participating. This gap is bridged
by the mechanisms used for providing keys. For conferencing, these mechanisms should use
personal authentication in order to provide the keys that will be used in the Security Associations.
There are still potential problems. One is that IPSEC is designed to operate at the IP level,
while multicast sessions are identified by IP address and port number. It may be desirable to do different
encryption operations on different conference streams. With the present specifications, it will
be necessary for any streams with different security policies to have different IP addresses.
It is not yet clear whether this will cause operational difficulties.
There is a potential further problem. While at one level an IP multicast address is no different
in syntax from a unicast address, some implementations of IP multicast provide routing activities in
the stack. In some of these implementations [swan], the way that the stack has been
implemented will not work with multicast addresses. This is clearly a temporary defect, which
will be remedied in future versions of the requisite stacks.

3.4   Encrypted and Authenticated Session Descriptions
As mentioned in Section 2.5.1, the announcement of and invitation to conference sessions are
critically dependent on passing the Session Description (SDP) [han98] to the authorised
invitees. This information can be passed by many technologies - both in-band and out-of-band.
There are three main reasons for providing authentication in announcements and invitations.
One is that if one intends to provide billing depending on the announcement itself, then some
form of authentication is essential. The second is that one may wish to be sure that the
conference has indeed been called by someone who is authorised to do so. A third is that there
are also mechanisms for modifying Session Announcements; a simple Denial-of-Service
attack is to modify the announced time or location with unauthenticated announcements.
The main reason for constraining access to the SDP data itself is to ensure that only
authorised people may participate in the conferences. Since the SDP data includes the
encryption key(s), it is essential that the information about the Session Description be passed
securely between the persons authorised to participate. This can be achieved by passing the
SDP data in an encrypted way, where only authorised parties can obtain the encryption keys.
Alternatively, it may be accomplished by storing the SDP itself in a depository, from which
only persons so authorised can retrieve it. Of course the Session Encryption Key (SEK) may be
kept separate from the rest of the SDP. For a conference to be considered private, it will have
to be encrypted; hence as long as the SEK is passed securely, it does not matter if the rest of
the SDP is seen by unauthorised persons.

3.4.1 Assumptions on Group Management for Conferencing
All the mechanisms of this section require that there be some means for managing groups
securely. Either shared secrets must be sent to groups, or access control lists must be
maintained for groups. We assume that for every secure conference, there must be a Security
Manager (SM), a set of Authorised Conference Participants (ACPs), a List of Authorised
Conference Participants (LACP) and a Conference Organiser (CO). If a person leaves an organisation or
an activity, they may no longer remain authorised to participate in the conference; the LACP
must then be updated. It will be the responsibility of the SM to keep a database of the
certificates of all the persons who might be ACPs; it is then up to the CO to decide which of
them will be on the LACP for a specific conference series.
In some cases, the SM will provide public/private key pairs as part of the provision of a
Personal Security Environment (PSE). It is more likely that a proposed participant will
provide only the public key or certificate, together with any documentary proof of identity
required by the specific security policy in force. The derived certificate will be signed by the
SM, and entered into a public database like a Web server [apach], X.509 Directory [mesdir]
or DNSSEC [eas99] database.
We assume also that the Session Encryption Keys (SEKs) will be changed for each
conference. If that is the case, then the key distribution problem is much eased. We assume
also that the LACP is reasonably stable; conferences are held much more often than the
LACP has to be changed. Thus the SEKs must be changed much more often than the LACP.
Finally, we assume that every potential participant in secure conferencing has a public key
certificate, which contains the public authentication key that authenticates them as the
originator of a message, and that they can decrypt with their private encryption key any message
sent to them encrypted with their public encryption key. This certificate will have been signed
in such a way that any other conference participant can verify the certificate; this assumption
is normal to the Public Key Infrastructures being standardised in the IETF [hou99].

3.4.2 Authentication of Session Announcements
Any CO wishing to announce a conference for a specific group can send out a session
announcement. The CO signs the announcement, allowing it to be verified by recipients.

It is possible to send out authenticated announcements by just distributing the CO's public key
at the same time. Anyone receiving this public key can cache it. It is impossible to verify that
the key came from the CO without a certification infrastructure, but the main concern is the
Denial of Service attack arising from an announcement being contradicted by a subsequent
change. Even without a security infrastructure, any ACP can check that the changed, received
announcement came from the same source as the original announcement, and possibly was
signed by the SM – and reject it otherwise. Thus for merely ensuring that only the original
announcer can alter a venue, it is not necessary to maintain an infrastructure.
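The check that a changed announcement comes from the same source as the original can be sketched as a small cache keyed by session. This is an illustrative sketch only: signature verification is assumed to be done elsewhere (for example by a PGP or X.509 toolkit) and is passed in as a boolean, and the class and parameter names are hypothetical.

```python
import hashlib
from typing import Dict


class AnnouncementKeyCache:
    """Remember which public key first announced each session."""

    def __init__(self) -> None:
        # session identifier -> fingerprint of the first announcer's key
        self._fingerprints: Dict[str, str] = {}

    def accept(self, session_id: str, public_key: bytes,
               signature_valid: bool) -> bool:
        """Accept an announcement or modification only if it is correctly
        signed and uses the same key as the first announcement seen."""
        if not signature_valid:
            return False  # never cache a key from a badly signed message
        fingerprint = hashlib.sha256(public_key).hexdigest()
        known = self._fingerprints.setdefault(session_id, fingerprint)
        return known == fingerprint
```

The first correctly signed announcement for a session fixes the key; a later modification signed with any other key is rejected, which is enough to block the venue-change attack without a certification infrastructure.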
Even without privacy requirements in the conference policy, strong authentication of a user is
required if making a network reservation results in usage-based billing. These considerations
are orthogonal to the announcement of sessions; they are relevant, however, to the
mechanisms adopted on joining sessions.

3.4.3 Shared Secret Distribution
Encryption key distribution is closely tied to authentication. Conference or session
description keys can be distributed securely using public-key cryptography on a one-to-one
basis (by e-mail, a directory service, or by an explicit conference set-up mechanism).
However the security is only as good as the certification mechanism used to certify that a key
given by a user is the correct public key for that user. The mechanism outlined in Section
3.4.1 should give this assurance. Such certification mechanisms [pkix] are, however, not
specific to conferencing, and in the conferencing portions of the IETF (the MMUSIC group),
a strong preference for using PGP certificates [cal98] has been expressed.
A number of mechanisms will be used for distributing shared secrets. One is to use secure e-
mail; here S-MIME [ram99] or PGP [cal98], [pgp99] are the message systems used most
commonly. The IETF conferencing community has a strong preference for PGP; the ICECAR
community has an equally strong preference for S-MIME. A second is secure access to a Web
depository; here mechanisms like S-HTTP or SSH are appropriate. A third is access to an
X.509 directory; here access control with strong authentication and encrypted transmission is
appropriate. Both the second and the third will be called "access to a secured depository" in
the subsequent text. In these last two, an Access Control List, using either a password or a
public key certificate, will be used. When a distribution mechanism is suggested, one of the
above will be implied. Although DNSSEC is appropriate for storing public keys, it is not an
appropriate secured depository due to a design decision that there will be no control on access
to DNS databases.
There are problems of security policy and a question of how certificates should be used with
all these methods. First, we have assumed implicitly that the distribution of the shared secret
is to known people using their public key certificates. For some purposes, it is not the
personal name that is relevant, but the role. Hence both with S-MIME and with the depository,
access may have to be granted to persons by their role rather than by their name. Allowing
access to depositories may be easier than using S-MIME in this case, since it must be assumed
that there is local authentication binding the current role to the name. Another problem will be
the question of validation of the certificates used in either S-MIME or the depository.
Certificates may be revoked for many reasons; some will be the change in role of the persons,
but others may be expiration or other reasons. There will probably have to be helper modules,
under the control of the Security Manager, which regularly poll the relevant Certificate
Revocation Lists. These must then keep the validity of the certificates in the certificate base
up to date. Presumably one function of such a helper module will be to inform either the
Conference Organisers or the Authorised Conference Participants if their certificates become
invalid, even if they are authorised participants.

3.4.4 Distributing Session Descriptions Securely
Private sessions can be announced in many ways, and we will be using several in the
ICECAR project. All are based on providing the Session Description (SD), complete with its
Session Encryption Key(s) (SEK), in a secure way to all authorised participants. While each
media stream may use a different SEK, it is important that the same SD can be used
irrespective of the manner in which it is transferred. This allows the facility that launches the encrypted
media tools to be oblivious of how the SD came to the recipient.
Session descriptions may be distributed securely through secured electronic mail (as indicated
in Section 3.4.3), secured SIP session invitations [han99], in encrypted SAP session
announcements, or stored in a secured depository (as indicated in Section 3.4.3) with access
control. The mechanisms of SAP and SIP have been described in Section 2.5. None of these
mechanisms provide for changing keys during a session as might be required in some tightly
coupled sessions, but they are sufficient for most usage in the context of lightweight sessions.

With the use of encrypted announcements via SAP, it is only necessary that the Session
Announcement Encryption Key (SAEK) be distributed securely to all authorised participants.
Using SIP or secure messaging, it is necessary that a Public Key of all authorised participants
be known; the facilities in Section 3.4.1 ensure that this will be the case. If the SAEK is kept
in a secured depository, it is necessary only that the relevant Access Control List be
maintained. For some purposes, it is preferable to keep the whole Session Description in the
depository; for others just the SAEK is required.
The relative advantages of using SAP, SIP or depositories for publicising conferences have
been discussed in Section 2.5. The further complexity raised by the security aspect is that if
SAEKs are changed frequently, and the announcement is made by e-mail, it may be difficult
for people "on the move" to locate the current SAEK - even though they have little difficulty
in joining a session at their current location. Clearly it is possible to provide additional plug-
ins for the mail systems which parse incoming messages for the MIME type which represents
a SAEK; such plug-ins have not yet been developed, but should not prove too difficult.

3.4.5 Use of Symmetric or Asymmetric Encryption Mechanisms with SAP
There has been some controversy over the last few years over the form of encryption that is
optimal for private SAP announcements. One set of people has advocated the use of
symmetric encryption, with the prior distribution of a number of SAEKs,
with or without an identifier. Symmetric encryption algorithms such as DES
[des], Triple DES [tdes] or IDEA [lai92] have been used. The use of an identifier might be
considered to weaken the security, but would ease the decryption process. Alternatively, one
could choose not to use an identifier; the recipients would then need to try to decrypt each incoming
session announcement using one after another of the SAEKs in their cache. This need be
done only the first time the encrypted announcement is transmitted, since there is a unique
hash associated with each announcement, which tells the recipient if they have seen the
announcement before. Others have advocated the prior distribution of a public/private key
pair as a Group Session Announcement Encryption Key (GSAEK key pair) to all the ACPs of
Section 3.4.1; the SAEK is then encrypted with one key of the GSAEK pair and sent out as a header to
the Session Announcement. Only ACPs then have the other part of the key pair to extract the
SAEK from the announcement. The main reason for using the asymmetric mechanisms is that
there exist widely available PKI toolkits that work with asymmetric encryption algorithms;
two such exist from SECUDE [sec] for X.509 certificates and PGP [pgp99] for PGP ones.
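The symmetric scheme without a key identifier can be sketched as follows. This is only a minimal illustration under stated assumptions: the cipher is a toy SHA-256 keystream with an HMAC check, standing in for DES, Triple DES or IDEA so that the example needs nothing beyond the Python standard library; the packet layout and the names AnnouncementCache, encrypt_announcement and try_decrypt are hypothetical; and a SHA-256 digest of the packet stands in for the hash that identifies an announcement.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # length of the HMAC-SHA256 tag appended to each packet


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); NOT a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt_announcement(saek: bytes, plaintext: bytes) -> bytes:
    """Encrypt a session description; assumed layout: nonce | body | tag."""
    nonce = os.urandom(16)
    stream = _keystream(saek, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(saek, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def try_decrypt(saek: bytes, packet: bytes):
    """Return the plaintext if this SAEK fits the packet, otherwise None."""
    nonce, body, tag = packet[:16], packet[16:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(saek, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = _keystream(saek, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class AnnouncementCache:
    """Tries each cached SAEK in turn, but only once per distinct announcement."""

    def __init__(self, saeks):
        self.saeks = list(saeks)
        self.seen = {}   # announcement hash -> plaintext, or None if undecryptable
        self.trials = 0  # number of trial decryptions performed so far

    def receive(self, packet: bytes):
        digest = hashlib.sha256(packet).digest()
        if digest in self.seen:          # repeat transmission: no decryption needed
            return self.seen[digest]
        result = None
        for saek in self.saeks:          # first sighting: try the keys one by one
            self.trials += 1
            result = try_decrypt(saek, packet)
            if result is not None:
                break
        self.seen[digest] = result
        return result
```

With three cached keys, the first copy of an announcement costs at most three trial decryptions; periodic retransmissions of the same announcement are answered from the hash table. Carrying a key identifier in the announcement would remove the loop altogether, at the cost of revealing which key is in use.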
The use of public-key or strong symmetric cryptography for this purpose has not yet been
standardised, because of disagreements on which technology is most suitable. The
considerations include the frequency of change of the groups, the nature of the events to be
announced, the amount of infrastructure one assumes amongst potential participants, and
which is the easiest to implement in view of the security toolkits now available. For this
reason the SAP RFC [whe99] does not prescribe which method is to be used. The standardisation
of these mechanisms would clearly accelerate the use of secure conferencing and commercial
broadcasting.
UCL has released a version of its SAP/SIP tool, which includes starting the tools securely,
based on a version of SDR [sdr] which supports asymmetric encryption with group keys.

3.4.6 Use of Smart Cards for Secure Conferencing
In Section 3.4.1, we stated our assumption that every potential participant in secure
conferencing has a public key certificate that contains the public authentication key that
authenticates them as the originator of a message. We assume also that they can decrypt with
their private encryption key any message sent to them encrypted with their public encryption
key. This certificate will have been signed in such a way that any other conference participant
can verify the certificate.

30 July 1999                   ICECAR Security Architecture, v2.4                              18
There are a number of Public Key Infrastructure (PKI) tools that can be used to set up these
key pairs for individuals, including SECUDE [secude], Netscape [netsc] and ENTRUST
[ent]. It is then necessary to set up a Personal Security Environment (PSE) that holds the
private key. Normally this will be held on an individual's workstation, protected by a PIN, or
kept on a floppy disc, possibly in encrypted form. Many of the operational issues can be
streamlined if the key pair is kept on a smart card [kum99]. Such a card is easily portable,
and can be used on other workstations without the need to transfer a PSE in software. We
expect that such smart cards will greatly ease the operational deployment of secure
conferencing: it should be much easier to automate operations, and to reduce the security
constraints on the workstations. We certainly plan to demonstrate secure conferencing in
ICECAR with the use of such cards.

3.5   Use of IPSEC in Secure Conferencing
Section 3.2 described the operation of secure conferencing with application level security in
the media tools. The tools are launched from a tool like SDR, using a Session Encryption Key
(SEK) to encrypt/decrypt the media, and a Session Description (SD) to set up the tool
parameters. If IPSEC is used in place of application-level security, then the calls to launch the
tool will be unchanged. The SD will still set up the parameters of the tool; the SEK will now
be used in the Encapsulating Security Payload (ESP), and the multicast address and port
number in the Authentication Header (AH).
Until multicast key negotiation mechanisms are standardised, setting up secured sessions will
remain an application-level function. Security parameters are distributed in a session
description using any of the methods described in Section 3.1. A tool such as SDR will insert
the security parameters from the session description into the local Security Association
Database, causing traffic sent to and from that multicast address to be cryptographically
processed using the standard IPSEC mechanisms.
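The step of copying security parameters from a session description into the Security Association Database can be modelled as below. This is a sketch only, not the interface of any IPSEC implementation: the database is an in-memory dictionary, the class and function names are hypothetical, and the session description is assumed to have already been parsed into a dictionary with the illustrative fields spi, address, protocol, algorithm and sek_hex. The one detail taken from the IPSEC architecture is that a Security Association is identified by the triple of SPI, destination address and security protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityAssociation:
    spi: int          # Security Parameters Index carried in each packet
    destination: str  # multicast group address of the media stream
    protocol: str     # "ESP" or "AH"
    algorithm: str    # name of the transform (illustrative string)
    key: bytes        # keying material, here the SEK


class SecurityAssociationDatabase:
    """In-memory stand-in for the host's Security Association Database."""

    def __init__(self):
        self._entries = {}

    def install(self, sa: SecurityAssociation) -> None:
        # An SA is identified by (SPI, destination address, protocol).
        self._entries[(sa.spi, sa.destination, sa.protocol)] = sa

    def lookup(self, spi: int, destination: str, protocol: str):
        return self._entries.get((spi, destination, protocol))


def install_from_session_description(sad, description: dict) -> SecurityAssociation:
    """Copy the security parameters of one media stream into the database."""
    sa = SecurityAssociation(
        spi=description["spi"],
        destination=description["address"],
        protocol=description["protocol"],
        algorithm=description["algorithm"],
        key=bytes.fromhex(description["sek_hex"]),
    )
    sad.install(sa)
    return sa
```

Because the database is keyed on the destination address, every member of the multicast group that installs the same entry can process the same traffic; no per-pair negotiation takes place, which is why the key distribution remains an application-level task.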
Many of the current IPSEC implementations are really designed for tunnel mode VPNs, and
others may find it difficult to accommodate multicast addressing. We have not yet
implemented any secure conferencing using IPSEC, and are not certain that this will be
practicable with the limited resources available under the ICECAR project.

4 Secured Operation with Relays
The relay of Section 2.6 also requires secured operation. This takes several forms; in each it
is necessary to decide whether to trust the relay, and how to operate it, for the following
purposes:
• To control the relay from individual remote hosts
• To provide encryption keys for secured Session Descriptions
• To provide encryption keys for secured sessions
Clearly some level of access control is necessary to ensure that only authorised remote hosts
can control the relay. Either IPSEC from specific hosts or SSL [die99] from specific people
would be good mechanisms to use.
The relay will be used as a SAP proxy. It will cache SAP announcements from the Mbone.
While it could be given encryption keys to allow it to decode all incoming secured
announcements, this would be too great a security risk. It would make the relay a very
tempting target; if compromised, it could be used to gain access to other communications to
that recipient.
Instead, remote hosts will make connections to the SAP proxy. The SAP announcements are
passed by the SAP proxy in the relay to the end-user workstation, whether or not the session
is encrypted. Alternatively, they can be obtained from a trusted depository just as in
Section 3.4. The user can then invite the relay to join the conference with a secured
invitation on behalf of the user of the remote host.
Certain filtering operations can be done without providing the relay with any encryption keys;
an example is joining only some of the multicast groups for layered encoding. If
application-level encryption has been used, even the multicast-unicast conversion can be done
on the encrypted data.
If more powerful transcoding operations are required, some greater risks must be taken. The
recipient should be able to invite the relay to join the conference with the specific SEKs
allowing it to decrypt data from the specific conference it is currently filtering. The relay will
then do the requisite filtering, and re-encrypt with a new SEK for the unicast session agreed
between the user and the relay. For this last, an ordinary IPSEC connection could also be
appropriate. While this mode of working reduces risk to an acceptable level, the relay must
still run in a trusted environment. An attacker should not be able to access the session key or
the media data itself. A schematic of the mechanisms just described is shown in Fig. 9.
[Figure 9 diagram not reproduced. Labels: Trusted environment; Crypto engine; Plugin filters;
Crypto engine; Mobile Host. Legend: secure, wired high-bandwidth links; secure, wireless
low-bandwidth links.]

     Figure 9         Schematic of Operation of Secured Relay with User-supplied SEK

This mechanism has the advantage that a malicious person and a poorly secured relay may
compromise a particular session, but nothing more. Moreover, only those relays that are
directly participating in a session need to know any of the session encryption keys.
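The decrypt, filter and re-encrypt path of such a relay can be sketched as follows. The sketch rests on several assumptions: the cipher is a toy SHA-256 keystream used in place of a real algorithm, so that the example runs with the Python standard library alone; the class Relay, the function toy_cipher and the packet format are hypothetical; and the "filter" is any function supplied by the caller.

```python
import hashlib
import os


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter-mode keystream; the same call encrypts
    and decrypts. A stand-in only, NOT a real cipher."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


class Relay:
    """Holds only the SEK of the one conference it was invited to filter."""

    def __init__(self, conference_sek: bytes, media_filter):
        self.conference_sek = conference_sek
        self.media_filter = media_filter
        self.unicast_seks = {}  # client id -> SEK agreed with that client

    def add_client(self, client_id: str, unicast_sek: bytes) -> None:
        self.unicast_seks[client_id] = unicast_sek

    def forward(self, nonce: bytes, ciphertext: bytes) -> dict:
        """Decrypt once, filter once, then re-encrypt separately per client."""
        media = toy_cipher(self.conference_sek, nonce, ciphertext)
        filtered = self.media_filter(media)
        out = {}
        for client_id, sek in self.unicast_seks.items():
            fresh = os.urandom(16)  # new nonce for each unicast packet
            out[client_id] = (fresh, toy_cipher(sek, fresh, filtered))
        return out
```

The relay object never holds a key for any conference other than the one named in the invitation, so a compromise exposes that session and the unicast keys of its current clients, but nothing else.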
There is no reason why the relay may not be issued with a smart card containing its Security
Environment; it could be arranged that the relay could not be operated without this card as an
additional security measure.
There is no reason why a relay should have only one client. If conference data is encrypted,
the cost of decrypting the data for filtering is shared among the clients. While the relay
needs to be given the conference session key only once to decrypt the data, it can also use
its knowledge of this key to restrict access to authorised clients.
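One way in which a relay might use its knowledge of the conference key to admit only authorised clients is a challenge-response exchange, sketched below. This mechanism is our illustration and is not prescribed anywhere in this report: the names AccessGate and respond are hypothetical, and HMAC-SHA256 is used simply because it is available in the Python standard library.

```python
import hashlib
import hmac
import os


class AccessGate:
    """Admits a client only if it proves knowledge of the conference SEK."""

    def __init__(self, conference_sek: bytes):
        self._sek = conference_sek
        self._pending = {}  # client id -> outstanding challenge

    def challenge(self, client_id: str) -> bytes:
        nonce = os.urandom(16)
        self._pending[client_id] = nonce
        return nonce

    def admit(self, client_id: str, response: bytes) -> bool:
        nonce = self._pending.pop(client_id, None)  # each challenge is single-use
        if nonce is None:
            return False
        expected = hmac.new(self._sek, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def respond(conference_sek: bytes, nonce: bytes) -> bytes:
    """Client side: answer the challenge without sending the key itself."""
    return hmac.new(conference_sek, nonce, hashlib.sha256).digest()
```

A client that received the SEK through a secured announcement or invitation can answer the challenge; one that did not cannot, and a captured response is of no use against a later challenge because each nonce is accepted only once.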

5 Conclusions
In this document we have shown that it is possible to secure the components of multicast,
multimedia conferencing in several ways - most of which will be done in the ICECAR
project. Many rely, however, on components produced in other projects like MECCANO and
COIAS. The main tools that are to be secured in the ICECAR project are the video, audio and
shared editor tools; all have been built so that their data can either be encrypted at the
transport level or be sent over the IPSEC network level. An important element of the
conference is the Session Description; this normally includes Session Encryption Keys
(SEKs) if encryption is to be used.
The SEK for the encryption of the data streams must be transferred securely to all the
authorised participants. We have outlined techniques for achieving this including encrypted
announcements, encrypted invitations, storing the key or the whole Session Description in a
secured depository, or sending the descriptions by secure e-mail. The relative merits of the
different techniques have been discussed; most will be piloted in the ICECAR project.
All the techniques rely on the establishment of conference groups, and the maintenance of a
security infrastructure for that group that is compatible with the rest of the ICECAR
infrastructure; our proposals for tackling this problem are described in detail.
Finally, we have stated that there is a need for transcoding relays or gateways in the total
architecture. We have summarised the salient aspects of these relays, and discussed how
security technology should be used in their protection. Again we have analysed the relative
merits of the different mechanisms.
We are now ready to deploy secure conferencing in earnest over multicast networks.

We acknowledge useful comments from Steve Hailes, Stephen Farrell and Colin Perkins in
the preparation of this report.

[apach] The Apache HTTP Server Project,
[cal98] Callas, J et al., OpenPGP Message Format, RFC 2440, Nov. 1998.
[dee89] Deering, S, Host Extensions for IP Multicasting, STD 5/RFC 1112, Aug. 1989.
[dee91] Deering, S: Multicast Routing in a Datagram Internetwork, PhD thesis, Stanford
        University, Dec. 1991.
[dee95] Deering, S and R. Hinden, Internet Protocol, Version 6 (IPv6) Specification, RFC
        1883, December 1995.
[des] NIST: Data Encryption Standard, National Institute of Standards and Technology
        (NIST), FIPS Publication 46-1, Jan. 1988.
[die99] Dierks, T and C. Allen, The TLS Protocol Version 1.0, RFC 2246, Jan. 1999.
[eas99] Eastlake, DE, Domain Name System Security Extensions, RFC 2535, Mar. 1999.
[ent] -
[flo95] Floyd, S. et al., A Reliable Multicast Framework for Light-weight Sessions and
         Application Level Framing, ACM SIGCOMM 1995, pp 342-356.
[h323] ITU Recommendation H.323, Visual telephone systems and equipment for local area
         networks which provide a non-guaranteed quality of service, ITU, Geneva, 1997.
[han97] Handley, M and J Crowcroft, Network Text Editor (NTE): A scalable shared text
         editor for the MBone, Proc. ACM SIGCOMM 1997, Cannes, France, 1997.
[han98] Handley, M and V. Jacobson, SDP: Session Description Protocol, RFC 2327, Apr.
[han199] Handley, M et al., SAP Session Announcement Protocol, Internet Draft, draft-ietf-
         mmusic-sap-v2-01.txt, 1999.
[han99] Handley, M et al., SIP: Session Initiation Protocol, RFC 2543, Mar. 1999.
[har95] Hardman, V et al., Reliable Audio for Use over the Internet, Proc. INET'95,
         Honolulu, Hawaii, June 1995.
[har98] Harkins, D and D. Carrel, The Internet Key Exchange, RFC 2409, Nov. 1998.
[hin96] Hinsch, E et al., The Secure Conferencing User Agent : A Tool to Provide Secure
         Conferencing with Mbone Multimedia Conferencing Applications, Proc. IDMS '96,
         Berlin, Mar. 96.
[hou99] Housley, R et al., Internet X.509 Public Key Infrastructure Certificate and CRL
         Profile, RFC 2459, Jan. 1999.
[jac93] Jacobson, V, Whiteboard (WB) README file, Lawrence Berkeley Labs, 1993.
[kir98] Kirstein, P.T. et al., Accessing Mbone Sessions over Point-to-Point
        Connection, Submitted to Proc. Multimedia Systems, 1998.
[kir99] Kirstein, PT et al: The MECCANO Internet Multimedia Conferencing Architecture,
         Deliverable 3.1,
[kum99] Kömmerling, O and M. G. Kuhn, Design Principles for Tamper-Resistant Smartcard
         Processors, Proc. USENIX Workshop on Smartcard Technology, May 1999.
[lai92] Lai, X, On the design and security of block ciphers, ETH Series in Information
         Processing, J.L. Massey (editor), Vol. 1, Hartung-Gorre Verlag Konstanz, Technische
         Hochschule (Zurich), 1992.
[Mau98] Maughan, D et al., Internet Security Association and Key Management Protocol,
         RFC 2408, Nov. 1998.
[mcc95] McCanne, S et al., vic: a flexible framework for packet video, Proc. ACM Multimedia
[mcc96] McCanne, S, Scalable Compression and Transmission of Internet Multicast Video,
         PhD thesis, University of California, Berkeley, December 1996.
[net] Microsoft, NetMeeting 2.1,
[netsc] – Netscape,
[per98] Perkins, C et al., A Survey of Packet Loss Recovery Techniques for Streaming Media,
         IEEE Network Magazine, Sep./Oct. 1998.
[pgp99] PGP Software and Users' Guide,
[pkix] CCITT, X.509: The Directory - Authentication Framework, Consultative Committee
         on International Telegraphy and Telephony, 1988.
[ram99] Ramsdell, B, S/MIME Version 3 Message Specification, Internet draft, Apr. 1999.
[riv92] Rivest, R, The MD5 Message-Digest Algorithm, RFC 1321, MIT, 1992.
[sch97] Schulzrinne, H et al., RTP: A Transport Protocol for Real-Time Applications, RFC
         1889, Jan. 1996.
[swa98] Swan, A et al., Layered Transmission and Caching for the Session Directory
         Service, Proceedings of ACM Multimedia '98, Bristol, UK, Sep. 1998.
[swan] Linux FreeS/WAN Home Page,
[tdes] American National Standards Institute, Triple Data Encryption Algorithm Modes of
         Operation, ANSI X9.52-1998, 1998.
[tha98] Thayer, R et al., IP Security Document Roadmap, RFC 2411, Nov. 1998.