
                                   UNIT 5
                    Presentation & Application Layers

Presentation – Formatting – Data compression – Cryptographic algorithms – DES / RSA – Applications – DNS – SMTP – MIME – HTTP – SNMP

In this unit, we will look at the functions provided by the upper three layers of the
OSI model – namely session, presentation and application. We will focus more
on the last two. We first deal with presentation issues (chapter 12), and look at
the standards used for representing data. We then look at two other aspects –
data compression, and security (chapters 13 and 14). We will look at some
techniques used for these two purposes. Finally, we will look at how some of the
common applications that we are so familiar with operate and understand the
protocols behind them (chapter 15).

Structure of the unit

Chapter 12 Session and Presentation Layers
     12.1 Session layer
     12.2 Presentation layer
            12.2.1 Presentation issues
            12.2.2 Examples of standards
            12.2.3 Mark-up Languages
Chapter 13 Data Compression
     13.1 What is data compression
     13.2 Lossless compression
            13.2.1 Run-length encoding
            13.2.2 Differential pulse code modulation
            13.2.3 Dictionary based methods
     13.3 Lossy compression
            13.3.1 Image compression
            13.3.2 Video compression
             13.3.3 Audio compression
Chapter 14 Network Security
     14.1 Security fundamentals
     14.2 Cryptography algorithms
            14.2.1 Symmetric-key algorithms
            14.2.2 Asymmetric-key algorithms
            14.2.3 Authentication
            14.2.4 Key distribution
     14.3 Securing the network
            14.3.1 Network security protocols
            14.3.2 Firewalls
            14.3.3 Intrusion detection systems
            14.3.4 Security tools
Chapter 15 Application Layer
     15.1 Telnet
     15.2 FTP
     15.3 E-mail
            15.3.1 SMTP
            15.3.2 MIME & S/MIME
            15.3.3 POP3
            15.3.4 IMAP
            15.3.5 Web-mail
     15.4 DNS
     15.5 HTTP
     15.6 SNMP
                           Chapter 12
                 Session and Presentation Layers
In the OSI reference model, the three layers above the Transport layer are the
Application, Presentation and Session layers (top down). Unlike the lower layers,
the demarcation between these layers is less rigid in terms of the functions
performed. In this section, we will take a brief look at the functions of the
Session layer and a more detailed look at the Presentation layer.

Lesson 12.1 Session Layer

This layer provides the control for communication between applications: it
manages sessions between the communicating applications. Both the Transport
and Session layers help in end-to-end communication. However, the Session
layer provides additional services like logging on, authentication, file transfer
etc. The Session layer also ties different Transport streams of the same
application together. For example, it might manage the audio and video streams
of a teleconferencing application. It also provides services like Token
Management (allowing only the holder of a token to perform a critical operation)
and Checkpoint Management. The Session layer inserts checkpoints during data
transfer; in case a crash occurs during a data transfer, it is enough to redo the
data transfer from the last checkpoint.

Lesson 12.2 The Presentation Layer
Applications in a network may perform many application-specific operations on
their data. However, there are some general operations that can be performed on
all kinds of data: Compression, Encryption and Presentation Formatting.
Compression deals with removing redundancy in the data, and hence effectively
using less network bandwidth. Encryption deals with security of data.
Presentation Formatting deals with the conversion of data from a sender-
understandable format to a network-acceptable format, and back from that to a
receiver-understandable format during reception. While other layers (Application,
Session and even Network layer) may handle Compression and Encryption, the
Presentation Layer handles Presentation Formatting.

12.2.1 Presentation Issues
We first need to understand why this conversion of data formats becomes
necessary. It is because different machines may have their own way of
representing data and data types. Representation of even a simple data type
such as integer could vary across machines – with different architectures using
different sizes (16-bit, 32-bit etc.). You can then imagine the confusion with
respect to the representation of complex data types like arrays, pointers, lists etc.
Also, there can be differences in the way they store these data types in memory.
Consider an Intel 80x86 machine on the Internet. Let's say it is talking to a
PowerPC processor based machine on the Internet. A long integer on the Intel
machine is 4 bytes long, and the bytes are arranged such that the least
significant byte goes into the lowest address (little-endian style). However, on
the PowerPC, a long integer, which again is 4 bytes long, is stored in such a
way that the least significant byte goes into the highest address (big-endian
style). For example, let us take the number 65539 (0x00010003). Starting from
the lowest memory address, it is laid out in the two machines as follows:
          Intel              03 00 01 00
          PowerPC            00 01 00 03

If the bytes from the Intel machine are sent as such to the PowerPC machine,
the value would be misinterpreted as 50331904 (0 x 256^0 + 1 x 256^1 +
0 x 256^2 + 3 x 256^3).
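
The usual remedy on the Internet is to agree on a network byte order (big-endian)
and convert at each end. As a small illustrative sketch (not part of any standard),
Python's struct module can reproduce the example above:

    import struct

    value = 65539                         # 0x00010003

    little = struct.pack('<I', value)     # Intel-style (little-endian) layout
    big    = struct.pack('>I', value)     # PowerPC / network byte order

    print(little.hex(' '))                # 03 00 01 00
    print(big.hex(' '))                   # 00 01 00 03

    # Little-endian bytes misread as big-endian give the wrong value:
    print(struct.unpack('>I', little)[0])   # 50331904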

This example clearly gives us a feel of the need to convert data from one format
to another. Three different approaches can be taken in the conversion of data
formats.

In the first approach, the sender converts the data to a standard form on the
network, and the receiver converts it back into its own format. The advantage of
this method is that both sender and receiver need to know only how to convert
between their own format and one other external format. This is the method
mostly used in practice.

In the second approach, the sender sends the data as such, without worrying
about conversions. The receiver inspects the data to find the nature of the
sender, and converts the data from the sender's format into its own. This is
referred to as the "receiver makes right" approach. Though it has the advantage
of reducing the sender's burden, it has the disadvantage that the receiver must
know how to convert data from many external formats. However, one cannot
dismiss this method outright, because in typical networks communication mostly
takes place between similar machines. In such cases, this method needs zero
conversions, while the first method leads to conversions at two places.

The third approach is that the sender knows the format of the receiver it
communicates with, and converts to the intermediate format only if necessary. If
the receiver is of the same type as the sender, no conversion is done. The catch
here is: how does the sender know the receiver's type? We need some external
agency from which that can be obtained. This method is not in vogue.

Our focus here is on the first approach. This involves an encoding or marshalling
function at the sender's end, and a decoding or unmarshalling function at the
receiver's end. The terms marshalling and unmarshalling are borrowed from the
RPC world, where arguments passed to the remote procedure and results
returned are converted to standard data types. The intermediate representation
is typically standardized to specify basic data types and complex data types. In
some standards, tags are used with the data to specify the type of data, length of
data etc. We look at two common presentation layer standards of data
representation – XDR, which does not use tags, and ASN.1, which does.

12.2.2 Examples of Standards (XDR, ASN.1)

XDR (External Data Representation)

This is the network format used with SunRPC. Some of its characteristics are:
    • It supports C-like datatypes, but without function pointers.
    • It uses a standard format for data on the network; conversions are made
      at the sender and receiver sides.
    • The data is untagged, except for the array length, which acts as a tag
      (i.e., no 'type' tag is used).
    • Integers are represented in big-endian format.
    • Variable-length arrays are represented by specifying the size in the first
      4 bytes of data, followed by the actual data elements.
    • For both arrays and structures, the size is specified as a multiple of 4.
      Smaller datatypes are padded to 4 bytes with zeroes. An exception to this
      rule is the 'character' datatype, where the size is specified as the actual
      number of bytes.
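
To make these rules concrete, here is a rough sketch (not the real SunRPC
library) of how an integer and a variable-length byte array might be marshalled
in XDR style:

    import struct

    def xdr_int(n: int) -> bytes:
        # 32-bit integer in big-endian (network) order
        return struct.pack('>i', n)

    def xdr_opaque(data: bytes) -> bytes:
        # Variable-length data: 4-byte big-endian length, then the bytes,
        # zero-padded so the total length is a multiple of 4
        pad = (4 - len(data) % 4) % 4
        return struct.pack('>I', len(data)) + data + b'\x00' * pad

    print(xdr_int(1).hex())             # 00000001
    print(xdr_opaque(b'hello').hex())   # 0000000568656c6c6f000000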

ASN.1 (Abstract Syntax Notation One)

This is an ISO standard in which data representation is specified as “Basic
Encoding Rules (BER)”. Some of its characteristics are :

      • It supports the C type system without function pointers.
      • It defines a canonical intermediate form.
      • It uses type tags.
      • It represents each data item as a triple <tag, length, value> (TLV),
        where 'tag' specifies the type of data, 'length' its length in bytes,
        and 'value' the actual value.
      • Compound data types can be constructed by nesting primitive data types,
        in which case the value field itself consists of multiple TLV units.
      • Integers are represented in big-endian format.

It is used in the SNMP protocol.
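
To get a feel for the TLV format, here is a deliberately simplified sketch (it
handles only small non-negative integers and short lengths, unlike full BER):

    def ber_integer(n: int) -> bytes:
        # <tag=0x02 for INTEGER> <length in bytes> <value, big-endian>
        value = n.to_bytes((n.bit_length() + 8) // 8, 'big')
        return bytes([0x02, len(value)]) + value

    print(ber_integer(65539).hex())   # 0203010003: tag 02, length 03, value 01 00 03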

Both XDR and ASN.1 have many properties in common. While ASN.1 has more
flexibility since it takes a tag-based approach, it is not as efficient as XDR for
standard data types. It requires more processing time.

12.2.3 Markup Languages
While XDR and ASN.1 deal with representing primitive data types and
compound data structures, markup languages (MLs) such as HTML, XML etc.
also deal with presentation formatting, though these are normally handled by the
application layer. HTML helps a web browser understand how the server
wishes its content to be displayed. Similarly, XML (eXtensible Markup
Language) allows interpretation of collections of data using schemas, schema
definitions, and user-defined tags. A complete description of the tag structure is
beyond the scope of this course, but you should know that these also deal with
presentation issues.


Have you understood ?

   1.   What is the function of a presentation layer ?
   2.   Why is this functionality required ?
   3.   What is meant by marshalling and unmarshalling ?
   4.   What are ASN.1 and XDR ?


                               Chapter 13
                            Data compression

We are all familiar with zip files, jpg files, mpeg files etc. What is common among
all of them ? – They all employ some kind of compression technique and result in
a reduced file size. We all use it because we can save on storage space. In a
networking scenario, data compression becomes important as it reduces the
bandwidth requirements. So we look at some data compression techniques in
this chapter.

Lesson 13.1 What is data compression ?

Data compression is typically carried out by using fewer bits to represent the
same information by means of some coding technique. For instance, to start
with, the 26 letters A-Z may be represented using 5 bits each. We can optimize
this further if we know that certain letters (say e or r) occur more frequently than
the others, by using fewer bits to code those characters. This is the basis of a
coding technique called Huffman coding. Many such compression techniques
exist, each with its own characteristics and issues. Compression at one end
obviously means that we need decompression at the other end. Often both these
operations, especially decompression, need to be carried out in real time.

Further, sometimes during compression some information may be lost (lossy
compression). But it may be okay for certain applications. Thus, the design of
data compression schemes involves trade-offs among various factors, including
the degree of compression, the amount of distortion introduced and the
computational resources required to compress and uncompress the data. We will
understand these tradeoffs as we look at some details.

First let us look at lossy and lossless compression.

Lossless compression ensures that the data recovered after decompression is
the same as the original data. We use lossless compression for situations where
loss cannot be tolerated – executable code, text files, numeric data etc.
Techniques used for lossless compression usually exploit statistical redundancy
(in the form of discernible patterns) in such a way as to represent the data more
concisely, but nevertheless perfectly. Lossless compression is possible because
most real-world data has statistical redundancy. For example, in English text, the
letter 'e' is much more common than the letter 'z', and the probability that the
letter 'q' will be followed by the letter 'z' is very small. Such information can be
used to advantage.

Lossy data compression on the other hand, is used when some amount of data
loss can be tolerated; or rather compensated for by other factors such as human
perception. For instance, the human eye is more sensitive to subtle variations in
luminance than it is to variations in color. Hence image compression takes
advantage of this fact by dropping or approximating less-important information.
Higher compression rates are achieved using such lossy schemes. Typically they
are used for audio, video and image content.

Lossless compression schemes are reversible so that the original data can be
reconstructed, while lossy schemes accept some loss of data in order to achieve
higher compression.
Lesson 13.2 Lossless compression algorithms
We will take a brief look at three lossless compression algorithms (just the idea)
– Run Length encoding (RLE), Differential pulse code modulation (DPCM), and
dictionary based encoding. Since our focus is more on the lossy ones that really
reduce the bandwidth requirements, we will look at those in detail in the next
section.

13.2.1 Run-Length encoding (RLE):
RLE is a simple brute-force compression technique. It just replaces consecutive
occurrences of a given symbol by a single copy and a count of how many times it
occurs. You can see why it is given the name “run-length”.

A pattern of AAAABBBBBB will be coded as A4B6.
You can see that it will work very well when there is lots of common adjacent
data. For instance, in an image that has large homogeneous regions, it can be
very effective. Similarly, for scanned text documents, it will be very effective – it
can replace huge amounts of white spaces by run-lengths. You can also see that
it will not be effective if there are too many variations in the image.
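
A minimal sketch of RLE in Python (assuming counts never clash with the
symbols themselves):

    import re
    from itertools import groupby

    def rle_encode(s: str) -> str:
        # Each run of a symbol becomes the symbol plus its run length
        return ''.join(f'{ch}{len(list(run))}' for ch, run in groupby(s))

    def rle_decode(s: str) -> str:
        return ''.join(ch * int(n) for ch, n in re.findall(r'(\D)(\d+)', s))

    print(rle_encode('AAAABBBBBB'))   # A4B6
    print(rle_decode('A4B6'))         # AAAABBBBBB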

13.2.2 Differential Pulse Code Modulation (DPCM):
DPCM is based on the idea that we first use a reference symbol, and for all
subsequent symbols, we code the difference between that symbol and the
reference symbol. The pattern AAAABBBBBB will then be coded as
A0000111111. A is the reference symbol, and each A is no different from the
reference – hence a sequence of four 0's. Then comes a set of six B's which
differ from A by one position – hence a sequence of six 1's. The idea is that
differences can be coded in fewer bits than are required to code the symbols
themselves. For example, a difference range of 0-3 can be represented using
2 bits, instead of the 7-8 bits used for each symbol. When the difference
becomes large, we again choose a reference symbol and continue.
DPCM works better for images, since it can take advantage of the fact that
difference in neighboring pixels is typically small.
Note that we can also run RLE on the DPCM coded data to achieve better
compression !
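
A small sketch of the reference-symbol scheme described above, using character
codes as the symbol values:

    def dpcm_encode(symbols):
        # Send the first symbol as the reference, then code every symbol
        # as its difference from that reference
        ref = symbols[0]
        return ref, [s - ref for s in symbols]

    ref, diffs = dpcm_encode([ord(c) for c in 'AAAABBBBBB'])
    print(chr(ref), diffs)   # A [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]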

13.2.3 Dictionary based methods :
A popular algorithm in this class is the Lempel-Ziv (LZ) compression algorithm.
Many variations of this algorithm are used in various places – in the Unix
compress command, the GIF format etc. The idea here is to use or build a
dictionary of variable-length strings or common phrases that we expect to find
in the data. Occurrences of these strings are then coded as indices of the
corresponding patterns in the dictionary. Significant compression rates can be
achieved with this coding, depending on the size of the dictionary. For example,
if we use a 25000-word dictionary, indices need only be 15 bits, rather than
7-bit ASCII for each character of a word. Thus, the word "dictionary" would be
coded in 15 bits instead of 70 bits – a compression ratio of almost 5 !

As you can see, choice of dictionary is important. It may be static, but then it
would be limited in size and domain. If it is dynamic, then the receiver doing the
decompression needs to be sent the dictionary too. LZ follows an adaptive
approach where the dictionary is built based on the document to be encoded. We
do not go into the details of how that is done – but you can see that it is an
interesting problem in itself.
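
One concrete variant, LZW (Lempel-Ziv-Welch, the variant used in GIF and the
Unix compress command), builds its dictionary adaptively as it reads the input.
A compact sketch of the compressor:

    def lzw_compress(text: str) -> list[int]:
        # Start with all single characters, then grow the dictionary
        # with each new phrase seen in the input
        dictionary = {chr(i): i for i in range(256)}
        w, out = '', []
        for c in text:
            if w + c in dictionary:
                w += c
            else:
                out.append(dictionary[w])
                dictionary[w + c] = len(dictionary)
                w = c
        if w:
            out.append(dictionary[w])
        return out

    print(lzw_compress('ABABABAB'))   # [65, 66, 256, 258, 66]

The decompressor rebuilds the same dictionary from the code stream itself, so
the dictionary never needs to be transmitted.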

Lesson 13.3 Lossy compression algorithms :

13.3.1 Image compression :
As we said earlier lossy compression works by dropping some information which
may not actually be perceptible to the human sensory organs. With respect to
images, the following ideas are used :

      • Color palette : reducing the color space to the most common colors in
        the image. This uses the fact that the human eye is more sensitive to
        luminance than to color, so we can afford to play with less color and
        reduce the number of bits required to represent each pixel. The
        selected colors are specified in a color palette in the header of the
        compressed image, and each pixel simply references the index of a color
        in the palette.
      • Chroma sub-sampling : based on the same fact, this works by dropping
        half or more of the chrominance information in the image.
      • Transform coding : this is the most commonly used method. A Fourier-
        related transform such as the DCT or a wavelet transform is applied to
        the image to identify its spatial frequency components. The low
        frequencies correspond to the gross features of the image and the high
        frequencies to fine detail. The idea is to drop the higher frequency
        components that are barely perceived by the eye. The selected
        components are then quantized and encoded for further compression.
      • Fractal compression : this technique relies on the fact that images
        such as natural sceneries exhibit a property called self-similarity. It
        is possible to identify a basic pattern from which larger pieces can be
        constructed using certain mathematical transformations. Compression
        here works by encoding those transformations.

Let us look at the details of the popular JPEG coding standard which is a
transform based method.

JPEG compression :

JPEG is a digital image format named after the Joint Photographic Experts
Group that designed it. It specifies the compression algorithm and the format for
representing image data. It can be used for gray-scale as well as color images.
We will first understand the technique using gray-scale images and then see how
it can be readily extended to color images.

JPEG compression involves 3 steps (Fig. 13.1) – applying the discrete cosine
transform (DCT), quantization, and encoding. The image is divided into 8x8
blocks and all these steps are carried out on each block.


     [Source image] --> DCT --> Quantization --> Encoding --> [Compressed image]

Fig. 13.1 JPEG compression

Step 1 : DCT

The DCT is a Fourier-related transform, closely related to the Fast Fourier
Transform (FFT). It takes an 8x8 matrix of pixel values as input and produces an
8x8 matrix of spatial frequency coefficients as output. The first coefficient of
the output matrix is the DC component, representing the average value of the 64
pixels. The other 63 elements are the AC coefficients, representing higher and
higher frequencies as you traverse the matrix in a zigzag manner from (0,1) to
(7,7).

Step 2 : Quantization

This step is where the compression begins – and also becomes lossy. It is here
that we determine which components to keep and how many bits to use for
coding the chosen components. Different quanta can be used for the different
coefficients. This is specified by means of a quantization table. Different
quantization tables can be used to give different levels of compression.

Example of a Quantization Table :

3     5       7    9    11   13   15   17
5     7       9    11   13   15   17   19
7     9       11   13   15   17   19   21
9     11      13   15   17   19   21   23
11    13      15   17   19   21   23   25
13    15      17   19   21   23   25   27
15    17      19   21   23   25   27   29
17    19      21   23   25   27   29   31

The zig-zag traversal of the quantized coefficients is as shown in Fig 13.2.

Fig 13.2 Zig-zag traversal for JPEG
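
A tiny sketch of the quantize / dequantize step on one row of hypothetical DCT
coefficients (made-up values), using the first row of the quantization table
above, shows where the loss creeps in:

    import numpy as np

    coeffs = np.array([200, 120, 45, 10, 6, 3, 1, 0])   # made-up coefficients
    quanta = np.array([3, 5, 7, 9, 11, 13, 15, 17])     # row 1 of the table

    quantized = np.round(coeffs / quanta).astype(int)
    print(quantized)            # [67 24  6  1  1  0  0  0]  small ints, many zeros
    print(quantized * quanta)   # [201 120  42   9  11   0   0   0]  not the original!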

Step 3 : Encoding

In this step, the quantized coefficients are encoded in a compact form. A lossless
compression technique is used. RLE is used to code the 0 coefficients (which are
likely to be large in number depending on the quantization). Other coefficients
are coded using Huffman coding.

In addition to this, the DC components across blocks are coded using differential
coding – as the difference from the previous block's DC component. Here again,
good compression is achieved when the variation between adjacent blocks is
small – which is typically the case in many images.

Color images are handled by compressing each of the color planes – red, green
and blue in the RGB representation – using the same process.

13.3.2 Video Compression :

A video consists of a sequence of frames which can actually be considered to be
a succession of still images. Hence, a natural extension from image compression
would be to compress each of the frames using the image compression
techniques. Going one step further, we can then look at eliminating inter-frame
redundancy and compress the video further. Often, there is very little change
between successive frames. We can capitalise on that to improve our
compression of videos. The popular MPEG standard defined by the Moving
Picture Experts Group takes this approach of eliminating both intra-frame and
inter-frame redundancies. Other standards such as H.261/H.263, are also
similar. Flash is a standard that takes a slightly different approach – it defines a
set of polygons and lines, and uses a sequence of vectors that determine how
these objects move over time.

MPEG standard :
MPEG takes a sequence of frames as input and compresses them into three
types of frames – I, B and P frames. I frames are intra-picture frames that are
considered as reference frames. They are self-contained, and do not depend on
any other frames. P frame is a predicted picture frame which specifies the
relative difference from the previous reference frame (I frame). B frames are
bidirectional predicted picture frames that give an interpolation between the
previous and subsequent I or P frames.

The figure below (Fig. 13.3) illustrates a sequence of frames encoded as I, P and
B frames. The two I frames stand alone – they can be decompressed without
waiting for any other frames. The P frame depends on the preceding I frame, and
can be decompressed at the receiver only if the corresponding I frame has been
received. The B frames depend on the preceding and succeeding I/P frames, and
require that both reference frames be present for decompression. Thus the
order in which the frames will be transmitted is: I P B B I B B.

[Figure: an input stream of seven frames is compressed by MPEG into the
stream I B B P B B I. The P frame is produced by forward prediction from the
preceding I frame; the B frames are produced by bidirectional prediction from
the surrounding I/P frames.]
Fig. 13.3 An example sequence of I, P, and B frames in MPEG

We can see that the I frames are similar to the JPEG compressed version of the
source frame. The difference is that MPEG uses 16x16 macroblocks. For a color
video represented in YUV format (Y – luminance, U & V – chrominance), the U
and V components in each macro block are downsampled into an 8x8 block,
while the Y component uses 16x16 blocks. Again, this banks on the idea that the
human eye is more sensitive to luminance than color – so we can afford to lose
some color detail !

The P and B frames are also processed in macroblocks, but they carry
information about the motion component in the video: the direction and distance
of movement of a macroblock with respect to the reference frame (referred to as
the motion vector).
Thus MPEG effectively uses inter-frame redundancies as well as intra-frame
redundancies. Compression ratios from 90-to-1 to 150-to-1 have been achieved
with MPEG coding. The main drawback of MPEG is that the compression is
computationally expensive and not suitable for on-line operation. Decoding is
relatively simpler and faster and is typically done on-line.

13.3.3 Audio Compression (MP3):

Audio compression is also specified as part of the MPEG standard. It can be
used to compress audio accompanying a video or stand-alone audio.

The first question that comes up when we talk of audio compression is – what is
the data that is being compressed. Telephone-quality voice data is sampled at a
rate of 8 KHz, with 8 bits per sample giving a data rate of 64 kbps – which is not
very high. However, CD quality data is sampled at a rate of 44.1 KHz, with 16
bits per sample, resulting in a data rate of 1.41 Mbps. That clearly is high – and
demands compression for efficient transfer. MP3 provides for 3 levels / layers of
compression with compression factors of 4 , 8 and 12.

To achieve this compression ratio, MP3 does the following :

   1. It splits the audio stream into a certain number of frequency sub-bands.
      The trick here is in determining how many bits are used for each sub-band.
      The selection of sub-bands is achieved using psycho-acoustic models and
      is the key to the quality of the encoding.
   2. Each sub-band is split into a sequence of blocks which vary in length
      (from 64 to 1024 samples).
   3. Each block is transformed using Modified DCT, quantized, and Huffman
      encoded.

Very high rates of compression are achieved using MP3 compression.

Have you understood ?

   1. Why is data compression important in a network scenario ?
   2. What is meant by lossy compression ?
   3. How is lossy compression acceptable ?
   4. What is meant by run-length encoding ?
   5. What is the compression technique used in gif files ?
   6. What are the steps in a JPEG encoding ?
   7. What is meant by quantization ?
   8. Which step in JPEG contributes to the loss ?
   9. How is MPEG coding different from JPEG ?
   10. Why do we need audio compression ?
                               Chapter 14
                            Network Security
Security is a buzz-word that we keep hearing again and again – we talk of
system security, database security, secure O/S, transaction security and so on.
What do we actually mean by security? Let us look at this first, and then take a
closer look at how network security is achieved.

Lesson 14.1 Security fundamentals
Whenever we have a shared resource, used by many, there is a chance of
misuse – malicious or otherwise. There are different types of "security attacks".
They may vary from simple passive attacks, such as eavesdropping, to active
attacks, such as changing data, or masquerading as somebody else and
accessing data. Guarding against such misuse normally comes under security
measures. Security measures are typically characterized as providing one or
more of the following primary properties : Confidentiality, Integrity, and
Authentication (CIA).

Confidentiality ensures that an adversary cannot read/understand the
data/transaction in progress. This is normally achieved with the help of
encryption techniques. Encryption transforms a message into an unintelligible
form to any person who does not have the secret necessary to reverse the
transformation.

Even with confidentiality ensured, it is possible that an attacker may change
some value/data without being identified. Hence we need techniques to detect
tampering of messages/data. These techniques ensure the integrity of the data,
which includes ensuring its timeliness and originality as well. Some
cryptographic techniques are usually employed to guarantee integrity.

Further, we need to be sure that the two ends of the transaction / data transfer
are who they actually claim to be. To ensure this we bring in authentication and
access control. Authentication ensures that you are really talking to the party
you think you are talking to. Similarly, access control ensures that you really
have the rights to access whatever you are accessing.

Thus we need techniques to ensure all of these, plus their many variants.
Cryptography is a solution to many of these security problems, so let us first
understand the cryptographic algorithms, tools, and techniques.

Lesson 14.2 Cryptographic Algorithms

The need to convey information to one's associates without allowing malicious
and unwanted people to get hold of it is the basic driving force behind
cryptography. Cryptography works by transforming / encrypting the information
into a form that cannot be understood by an interceptor. Encryption is the
process of converting data into a form that, even if intercepted by malicious
parties, will not yield any meaningful data until it has been decrypted.
Decryption is usually a reversal of the encryption algorithm. Cryptography has
historically been associated with data security and data integrity. The earliest
ciphers known to modern man include the Atbash cipher and the Caesar cipher.
While these are simple encryption techniques, they form the basis for modern
cryptographic algorithms.

Cryptography, simply defined, is the process of combining some input data,
called the plaintext, with a user-specified password to generate an encrypted
output, called cipher-text, in such a way that, given the cipher-text, no one can
recover the original plaintext without the encryption password in a reasonable
amount of time. The algorithms that combine the keys and plaintext are called
ciphers. Many ciphers accept a fixed-length password (also called a key). The
key-space is the total number of possible keys. For a cipher that accepts 160-bit
keys, this is 2^160, or approximately 1.46 x 10^48. Although recommended key
lengths change as computing power grows, the currently secure key length for
encryption is 128 bits, with most modern algorithms using keys at least this
long.

So what makes one cipher better than another? What makes a cipher secure?
Although these questions are the essence of cryptography, their answers are
relatively simple: if there is no other way to "break" the algorithm (recover the
plaintext or key given some ciphertext) other than searching through every
possible key, then the algorithm is secure. This is where a large key length
comes in -- the larger the key length, the more possible keys to search through,
and therefore the more secure the algorithm. Cryptographic attacks are simply
means of reducing the number of keys that need to be searched.

Modern key-based encryption algorithms are of two distinct types: Symmetric
Key Algorithms and Asymmetric Key Algorithms. The major difference between
the two is the relation between the keys on the encryption and decryption sides.
Symmetric-key techniques require the key to be the same on both sides,
whereas asymmetric-key techniques allow the keys to be different on either side,
provided neither is derivable from the other. The main challenge in private-key
(symmetric) encryption is the distribution of the secret key itself among the
communicating entities. Public-key encryption provides a solution to this, as one
key is known to all – made public – and the other is kept secret by the
party/entity who owns it. Hence there is no need for sharing a secret key.

Symmetric algorithms are divided into stream ciphers and block ciphers. Block
ciphers encrypt data a fixed-size block (e.g., 64 bits) at a time, while stream
ciphers operate on a continuous stream of data, typically a bit at a time. Stream
ciphers can be thought of as seeded random number generators (with the seed
being the key), with the random numbers being combined with the plaintext to
generate the ciphertext. The more random the generated numbers are, the more
secure the stream cipher is.

14.2.1 Symmetric Key Algorithms (Private key ciphers):

Some of the popular Symmetric Key Algorithms are:

DES

The Data Encryption Standard is a very popular symmetric-key algorithm. DES is
a block cipher with a 64-bit block size. The algorithm uses a 56-bit key to
encipher/decipher a 64-bit block of data. It works by applying a series of XOR,
permutation, and S-box substitution operations that randomize the block of data.
The key is always presented as a 64-bit block, every 8th bit of which is ignored.
However, it is usual to set each 8th bit so that each group of 8 bits has an odd
number of bits set to 1. The fact that it uses 56-bit keys makes it susceptible to
exhaustive key search with modern computers and special-purpose hardware.
DES is still strong enough to keep most random hackers and individuals out, but
it is easily breakable with special hardware.

The limited key length problem can be overcome by using double or triple length
keys, as in the Triple-DES (3DES) algorithm. 3DES is a variant of DES involving
the application of DES three times, usually in an encrypt - decrypt - encrypt
(EDE) sequence with three different keys.

If we consider a triple-length key to consist of three 56-bit keys K1, K2, K3,
then encryption is as follows:
        Encrypt with K1
        Decrypt with K2
        Encrypt with K3
Decryption is the reverse process:
        Decrypt with K3
        Encrypt with K2
        Decrypt with K1

Setting K3 equal to K1 in these processes gives us a double-length key K1, K2.
Setting K1, K2 and K3 all equal to K has the same effect as using a single-length
(56-bit) key. Thus it is possible for a system using triple-DES to be compatible
with a system using single-DES.
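
A structural sketch of the EDE sequence, with a toy XOR "cipher" standing in
for real single-DES purely to show how the keys are used:

    def ede_encrypt(block, k1, k2, k3, E, D):
        # Encrypt with K1, decrypt with K2, encrypt with K3
        return E(D(E(block, k1), k2), k3)

    def ede_decrypt(block, k1, k2, k3, E, D):
        # Reverse: decrypt with K3, encrypt with K2, decrypt with K1
        return D(E(D(block, k3), k2), k1)

    # Toy stand-in: XOR is its own inverse (real DES would go here)
    E = D = lambda block, key: block ^ key

    c = ede_encrypt(0x1234, 0xA, 0xB, 0xC, E, D)
    print(hex(ede_decrypt(c, 0xA, 0xB, 0xC, E, D)))   # 0x1234

With k1 == k2 == k3, the inner encrypt and decrypt cancel out, and the whole
sequence collapses to single DES – which is what gives the backward
compatibility mentioned above.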

3DES is now being superseded by another algorithm called Advanced Encryption
Standard (AES).

The Blowfish Algorithm
The most interesting portion of Blowfish is its non-invertible f function. We will
look at this idea because similar ideas are used in many security mechanisms
such as hash functions. This function uses modular arithmetic to generate
indexes into the S-boxes; modular arithmetic is commonly used to create
non-invertible functions. The non-invertible idea can be understood with a
simple example:

Take the function f(x) = x^2 mod 7.

               x            1    2    3    4    5    6    7
               x^2          1    4    9    16   25   36   49
               x^2 mod 7    1    4    2    2    4    1    0

Given an output, there is no function that can recover the specific input to f(x).
For example, if you knew that the function has a value of 4 at some x, there is
no way to know whether that x is 2, 5, or any other x whose f(x) = 4. Blowfish
does its arithmetic mod 2^32.

S-boxes are just large arrays of predefined data. During the process of key setup,
the key is combined with the S-boxes. Key setup in Blowfish is designed to be
relatively slow. This is actually a benefit, as someone doing a brute-force search
of keys will have to go through the slow key setup process for each key tried.
However, someone doing encryption and decryption must only go through the
key setup process once. Encryption and decryption are relatively fast.
14.2.2 Asymmetric key algorithms (Public key ciphers) :

Asymmetric ciphers work by choosing a pair of keys, of which one is made
public and the other is kept secret. For confidentiality, the encryption key is the
public one, allowing anyone to encrypt with that key, while ensuring that only the
proper recipient, who holds the corresponding private key, can decrypt the
message. Thus one is called the public key and the other the private key. The
security provided by these ciphers is based on keeping the private key secret,
and on making sure that the private key cannot be derived from the public key.

The public key technique also allows for authentication / digital signatures. If the
sender uses his private key to encrypt a message, then anybody (everybody) can
decrypt the message using the publicly available public key. Since the message
can be decrypted only with that public key, it must have been encrypted with the
corresponding private key. And since the private key is not available to anybody
other than the sender, this provides an assurance that the message was sent by
the correct sender. That is, anybody successfully decrypting such a message can
be sure that only the owner of the secret key could have encrypted it. This fact
is the basis of the digital signature technique.

Let us look at one of the most common public key encryption algorithms – the
RSA algorithm.

RSA:

RSA is a public key algorithm invented by Rivest, Shamir and Adleman. The
algorithm is based on modular exponentiation: numbers e, d and N are chosen
with the property that if A is a number less than N, then (A^e mod N)^d mod N = A.

This means that you can encrypt A with e and decrypt using d. Conversely you
can encrypt using d and decrypt using e (this is used for signing and verification).
• The pair of numbers (e,N) is known as the public key and is published.
• The pair of numbers (d,N) is known as the private key and must be kept secret.

The number e is known as the public exponent, the number d is known as the
private exponent, and N is known as the modulus. When talking of key lengths in
connection with RSA, what is meant is the modulus length.

Without going into detail about how e, d and N are related, d can be deduced
from e and N if the factors of N can be determined. Therefore the security of RSA
depends on the difficulty of factorizing N. Because factorization is believed to be
a hard problem, the longer N is, the more secure the cryptosystem. Given the
power of modern computers, a length of 768 bits is considered reasonably safe,
but for serious commercial use 1024 bits is recommended.
The problem with choosing long keys is that RSA is very slow compared with a
symmetric block cipher such as DES, and the longer the key the slower it is. The
best solution is to use RSA for digital signatures and for protecting DES keys.
Bulk data encryption should be done using DES / AES etc.
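
A toy run of the algorithm with deliberately tiny primes (real keys use moduli of
1024 bits or more), using Python's built-in modular arithmetic:

    p, q = 61, 53
    N = p * q                    # modulus: 3233
    phi = (p - 1) * (q - 1)      # 3120
    e = 17                       # public exponent
    d = pow(e, -1, phi)          # private exponent: 2753 (Python 3.8+)

    A = 65                       # the 'message'; must be less than N
    cipher = pow(A, e, N)        # A^e mod N  -> 2790
    plain  = pow(cipher, d, N)   # (A^e mod N)^d mod N -> 65
    print(cipher, plain)

Anyone can see (e, N) and the ciphertext; recovering d requires factorizing N,
which is trivial for 3233 but infeasible for a real-sized modulus.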

14.2.3 Authentication

Another genre of algorithms is the cryptographic hash algorithms, used for
authentication and data integrity (tamper-proofing!). A hash algorithm is used to
generate an authenticator – a value that serves to ensure the authenticity and
integrity of the message. To ensure data integrity, the authenticator includes
redundant information about the message contents – like a checksum or CRC.
To support authentication, it includes some proof that whoever created the
authenticator knows a secret that only he/she is supposed to know.

The cryptographic hash function is chosen such that it generates sufficient
redundant information about a message to expose any tampering. The value
generated is generally known as a "message digest", and is appended to the
message being sent. Remember that the message itself may or may not be
encrypted, depending on whether or not confidentiality is a concern. The digest
is normally of a fixed length (some n bits). Many message patterns may map to
the same digest value, as in the case of any hash function – this many-to-one
nature of the function is the key. Hash functions are chosen to be one-way, so
that given a digest it is computationally infeasible to find another message with
the same digest.

MD5 and SHA-1 are examples of hash algorithms. They generate 128-bit and
160-bit digests respectively. The longer the digest, the longer a brute-force
attack would take to find a matching message.

To provide authentication, this digest is encrypted. The receiver decrypts the
digest and compares it with the digest calculated over the plain text. If the two
match, it indicates that the message is indeed from the correct sender and has
not been tampered with !
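
A quick sketch with Python's standard hashlib module: even a one-character
change to the message produces a completely different digest.

    import hashlib

    message = b'transfer 100 to alice'
    digest = hashlib.sha1(message).hexdigest()            # 160-bit digest

    tampered = b'transfer 900 to alice'
    print(hashlib.sha1(tampered).hexdigest() == digest)   # False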

The encryption may be carried out using a private-key or a public-key algorithm.
If a public-key algorithm is used, then the private key of the sender is used to
encrypt the digest, and anybody can use the public key to decrypt it and verify
the sender's authenticity. Such an encrypted digest is referred to as a "digital
signature", and provides non-repudiation – the ability to prove to any third party
that the sender actually sent the message.

DSS (Digital Signature Standard) is a signature format standardized by NIST.
Yet another technique for authentication is message authentication code (MAC)
based. Here, the authenticator uses a hash-like function that takes a secret value
as a parameter and generates an output value called the MAC. The sender
appends the MAC to the message. The receiver recomputes the MAC from the
plain text and verifies by comparing it with the received MAC.

We can also combine the hash and MAC to get what is called hashed message
authentication code (HMAC). In this, the hash is applied to the concatenation of
the plain text message and the secret value to obtain the HMAC. The HMAC is
then appended to the message. The rest of the process is the same.
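
Python's standard hmac module implements the standardized HMAC
construction (slightly more elaborate than plain concatenation, but the same
idea). A short sketch:

    import hashlib, hmac

    secret  = b'shared-secret-key'
    message = b'transfer 100 to alice'

    mac = hmac.new(secret, message, hashlib.sha256).hexdigest()

    # The receiver, who shares the secret, recomputes and compares
    check = hmac.new(secret, message, hashlib.sha256).hexdigest()
    print(hmac.compare_digest(mac, check))   # True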

Thus we can provide confidentiality, authentication and data integrity using a
combination of the above techniques. Note that all of them are dependent on
some key-based algorithm. That brings us to the next issue – how do we
secretly distribute the key to the communicating partners. If it is a public-key
based algorithm, we do not have to worry – the public key is published (!) and the
private key is private ! But we cannot always use public key algorithms as they
are computationally expensive – and time consuming – so they are not suited for
all kinds of situations. So we need to use private key techniques, and hence we
are faced with the problem of secretly distributing the secret keys. Let us look at
how this is handled.

14.2.4 Key distribution

What most systems normally do is to try and combine the advantages of both the
private- and public-key encryption mechanisms. The actual message transfer is
carried out using private-key techniques for speed, and the key itself is
distributed over a secure channel created by means of public-key encryption. To
prevent the private key from being compromised, it is changed from time to
time – say, from session to session. Thus we have the concept of session keys
(short-lived private keys), which are distributed using public keys (long-lived
and predistributed).

That brings us to how the public keys are generated and published. We need a
fool-proof mechanism to publish the public keys – proof that a public key
actually belongs to the person who is staking a claim to that key. Otherwise, an
adversary can falsely claim that a key belongs to him/her. Thus we need an
authority to certify that the key actually belongs to the claimer. All this is
normally handled by what is called a public key infrastructure (PKI). A
certification authority (CA), which is a well-trusted entity, is used to
authenticate and publish public-key certificates. The certificates are digitally
signed by the CA. Of course, we should know the public key of the CA; we can
then use it to verify the digital signature and have the certificate authenticated.
Then we can use the verified public key of the party publishing its key, and use
that to encrypt the private (session) key to be used.
There is a lot more detail in terms of how these things have been formalised and
standardized. It is a whole course in itself. But the idea is pretty simple and that is
what you should definitely know.

PKI is not the only scheme for key exchange. Other protocols, such as Diffie-
Hellman key exchange, which do not require any predistributed keys, are also
widely used.

Lesson 14.3 Securing the network
14.3.1 Network Security protocols

Equipped with all this background on security, let us look at how security is
provided in the network. Security may be incorporated between any two
communicating entities in the network, using a combination of protocols to
provide C, I and A. That means we can have security provided at the link level,
network (or IP) level, transport layer (port) level, or application level.
Many of these security measures have been standardized. At the IP level, we
have IPSec, a secure IP protocol that encrypts IP packets so that the contents
are not apparent to a person in the middle. Either the message body alone, or
the headers as well, may be secured with IPSec.
Similarly, at the transport layer we have TLS (Transport Layer Security), which
is based on the SSL (Secure Sockets Layer) standard. This protocol actually sits
between the TCP layer and the application layer. It allows the participants to
choose among different algorithms for C, I and A in a handshake phase that
precedes the encrypted data transfer phase. Depending on the application,
suitable algorithms may be chosen. When HTTP runs on top of TLS, it is referred
to as HTTPS (HTTP Secure).

E-mail security is provided by a protocol called PGP – Pretty Good Privacy. It
uses the concept of private session keys distributed using public keys. Secure
Shell (SSH) is a secure remote login service – a secure equivalent of rlogin. It
takes care of secure transport, authentication and connection. SET – Secure
Electronic Transaction – is a security protocol defined for e-commerce
applications such as credit card transactions.

Other than these cryptography-based security solutions, there are also some
practical solutions widely employed today to secure a network from the bad
world outside: firewalls and IDSs (intrusion detection systems). They are
something you should know about.

14.3.2 Firewalls

A firewall is any device that prevents a specific type of information from moving
between the untrusted network outside and the trusted network inside. There are
five recognized generations of firewalls. The firewall may be:
       • a separate computer system
       • a service running on an existing router or server
       • a separate network containing a number of supporting devices

First Generation firewalls
These are called packet filtering firewalls. They examine every incoming packet
header and selectively filter packets based on an Access Control List (ACL)
determined by address, packet type, port request, and other factors. The
restrictions most commonly implemented are based on:

       • IP source and destination address
       • Direction (inbound or outbound)
       • TCP or UDP source and destination port requests

Second Generation Firewalls

These are called application-level firewalls or proxy servers. Such a firewall is
often a dedicated computer separate from the filtering router. With this
configuration, the proxy server, rather than the Web server, is exposed to the
outside world. Additional filtering routers can be implemented behind the proxy
server. The primary disadvantage of application-level firewalls is that they are
designed for a specific protocol and cannot easily be reconfigured to protect
against attacks on protocols for which they were not designed.

Third Generation Firewalls

These are called stateful inspection firewalls. They keep track of each network
connection established between internal and external systems, using a state
table which tracks the state and context of each packet in the conversation by
recording which station sent what packet and when. If the stateful firewall
receives an incoming packet that it cannot match in its state table, it defaults
to its ACL to determine whether to allow the packet to pass. These firewalls can
also track connectionless packet traffic such as UDP and remote procedure call
(RPC) traffic.

Fourth Generation Firewalls

While static filtering firewalls, such as the first and third generations, allow
entire sets of one type of packet to enter in response to authorized requests, a
dynamic packet filtering firewall allows only a particular packet with a particular
source, destination, and port address to enter. It does this by understanding how
the protocol functions, and opening and closing "doors" in the firewall based on
the information contained in the packet header. In this manner, dynamic packet
filters are an intermediate form between traditional static packet filters and
application proxies.
Fifth Generation Firewalls

The final form of firewall is the kernel proxy, a specialized form that works under
the Windows NT Executive, which is the kernel of Windows NT. It evaluates
packets at multiple layers of the protocol stack, checking security in the kernel
as data is passed up and down the stack.

Packet filtering routers

Most organizations with an Internet connection have some form of router as the
interface at the perimeter between the organization's internal networks and the
external service provider. Many of these routers can be configured to filter
packets that the organization does not allow into the network. This is a simple
but effective means of lowering the organization's risk of external attack. The
drawbacks of this type of system include a lack of auditing and strong
authentication, and the fact that the access control lists used to filter the
packets can grow complex enough to degrade network performance.

14.3.3 Intrusion detection systems

IDSs work like burglar alarms. They require complex configuration to provide the
desired level of detection and response. An IDS operates as either network-
based, when the technology is focused on protecting network information assets,
or host-based, when the technology is focused on protecting server or host
information assets. IDSs use one of two detection methods: signature-based or
statistical-anomaly-based.
Types of Intrusion detection systems

A network intrusion detection system (NIDS) is an independent platform which
identifies intrusions by examining network traffic, and which monitors multiple
hosts. Network intrusion detection systems gain access to network traffic by
connecting to a hub, a network switch configured for port monitoring, or a
network tap. An example of a NIDS is Snort.

A protocol-based intrusion detection system consists of a system or agent that
typically sits at the front end of a server, monitoring and analyzing the
communication protocol between a connected device (a user / PC or system) and
the server. For a web server, this would typically monitor the HTTP/HTTPS
protocol stream, understanding the HTTP protocol relative to the web server /
systems it is trying to protect. Where HTTPS is in use, this system would need
to reside at the point where HTTPS is decrypted, immediately before the traffic
enters the web presentation layer.

An application-protocol-based intrusion detection system consists of a system or
agent that typically sits within a group of servers, monitoring and analyzing the
communication on application-specific protocols. For instance, in a web server
with a database, this would monitor the SQL protocol specific to the middleware /
business logic as it transacts with the database.

A host-based intrusion detection system consists of an agent on a host that
identifies intrusions by analyzing system calls, application logs, file-system
modifications (binaries, password files, capability / ACL databases) and other
host activities and state. An example of a HIDS is OSSEC.

A hybrid intrusion detection system combines two or more approaches: host
agent data is combined with network information to form a comprehensive view
of the network. An example of a hybrid IDS (not a HIDS) is Prelude.

14.3.4 Security tools

A few words on a few other security tools :

Scanning analysis tools
Scanners, sniffers, and other analysis tools are useful to security administrators
because they let the administrator see what the attacker sees. Scanners and
analysis tools can find vulnerabilities in systems.
Packet Sniffers

A packet sniffer is a network tool that collects copies of packets from the
network and analyzes them. Packet sniffers can be used to eavesdrop on network
traffic, and are also known as network analyzers or protocol analyzers.

Content Filters

Although technically not a firewall, a content filter is a software filter that allows
administrators to restrict the content accessible from within a network. Content
filtering restricts Web sites with inappropriate content, and is commonly used by
organizations such as offices and schools to prevent computer users from
viewing inappropriate web sites or content. Filtering rules are typically set by a
central IT department and may be implemented via software on individual
computers, or at a central point on the network such as the proxy server or
Internet router. Depending on the sophistication of the system used, it may be
possible for different computer users to have different levels of Internet access.

Have you understood ?
1.     What is meant by CIA in security ?
2.     What are the two major categories of cryptography algorithms ?
3.     What is the idea in private-key based algorithms ?
4.     Name two common private-key based algorithms.
5.     What is the advantage of public-key algorithms ?
6.     Name two common public-key based algorithms.
7.     What is a message digest ?
8.     What is a MAC ?
9.     What is a digital signature ?
10.    What is meant by PKI ?
11.    How is security provided in the network layer protocols ?
12.    What is a firewall ?
13.    What is intrusion detection ?
14.    What is a packet filter ?
                                 Chapter 15
                                Applications

In this chapter, we’ll take a look at a few common applications, to get an idea of
how the application layer is organized. We’ll look at the following applications:

       TELNET
       FTP
       E-mail
       DNS
       HTTP
       SNMP

Lesson 15.1 TELNET

An important use of networking is that a user can use the resources of a remote
machine. These resources may be either hardware (a faster processor, larger
hard disks etc.) or software resources. To enable such resource sharing,
applications for remote login are available. TELNET is the protocol in the TCP/IP
suite for doing remote login.

How does TELNET work?

   • Telnet allows a user to establish a TCP connection with a login server on
     any machine in the network.

   • It then passes the keystrokes to the distant machine, where the server
     process gives these keystrokes to the machine's OS.

   • Similarly, the characters to be displayed are transferred to the client
     machine, where they are displayed.

As we can see from the above points, TELNET is transparent: it creates the
appearance that the user's I/O devices are directly connected to the remote
machine. TELNET allows multiple clients to connect to the same login server.
The TELNET protocol offers three basic services :

   • It defines a Network Virtual Terminal (NVT) which standardizes the
     interface to remote systems. This ensures easy communication among
     heterogeneous systems. For example, on the client the input need not come
     from the keyboard and the output need not go to a display; instead, any
     program can be the client.

   • TELNET ensures that the connection is symmetric.

   • Various parameters like the data format, type of display etc. can be set.
     TELNET provides a mechanism whereby the client and the server can
     negotiate these parameters or options.

For a remote login application to work, some basic support is needed from the
operating system. The OS should allow the keystrokes to come from any source
and allow the characters to be displayed to go to any destination. The OS entry
point to which keystrokes go, and from which the display contents are obtained,
is called a 'pseudo terminal'. Because of this 'pseudo terminal', TELNET can
run as an application which interacts with the 'pseudo terminal', rather than
being part of the OS. The disadvantage of being an application, though, is that
the keystrokes and the displayed characters have to travel through several
processes.

TELNET Command sequence:

To control the application which is running at the remote machine, certain control
characters (like CTRL-C, CTRL-Z etc.) have to be sent. NVT accommodates
this by encoding them in the range of 128-255. Control functions are encoded as
escape sequences, with the first byte being IAC (Interpret As Command) and the
second byte being the actual control character. Some of the control functions are
IP (Interrupt Process), AYT (Are You There - to test if the server is alive), EC
(Erase Character), EL (Erase Line) etc.
This system of sending control functions in the data stream may not always
work. Consider a scenario where the application running at the server stops
taking input. Soon, the receiver buffer will fill and so data won't be accepted from
the TCP connection. In such a case, the program can't be given any control
function, because commands won't be accepted at the server end. To overcome
this, TELNET resorts to 'out of band' signaling feature of the TCP. To do that,
initially a command called SYNCH is sent. For the commands following SYNCH,
the urgent bit is set to bypass the flow control mechanism. Finally a data mark is
sent, after which transfer becomes normal (without the URGENT bit set).
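
To make this concrete, here is a minimal Python sketch (not part of the standard
text) of how a client might send such IAC escape sequences; host.example.org is
a hypothetical server, and the command codes are the ones defined in RFC 854.

import socket

# TELNET command codes from RFC 854
IAC = 255   # Interpret As Command
DM  = 242   # Data Mark (used with the SYNCH)
IP  = 244   # Interrupt Process
AYT = 246   # Are You There

# host.example.org is a hypothetical server; 23 is the well-known TELNET port
with socket.create_connection(("host.example.org", 23)) as s:
    s.sendall(bytes([IAC, AYT]))      # a control function sent as an IAC escape
    s.sendall(bytes([IAC, IP]))       # interrupt the remote process
    # the SYNCH: the data mark travels as TCP urgent ("out of band") data,
    # bypassing the normal flow control
    s.sendall(bytes([IAC, DM]), socket.MSG_OOB)
    print(s.recv(256))                # the server's reply, if any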

TELNET Options and their negotiation:

In TELNET, both the client and the server can set various options. For example,
even though the default data format is 7 bit ASCII, it can be set as 8 bit binary
data. The options can be set either by the server or by the client. One side can
request a particular option. The other side can either accept that request or
reject it. So if one end doesn’t understand the options requested by the other, it
can simply reject them and both can run the basic versions of NVT. Hence, it is
possible to use different versions of TELNET clients and servers.

Lesson 15.2 FTP – File Transfer Protocol

Transferring of files is often performed in a networked environment. What are
the issues involved in such a transfer?
      Authorization of the client which requests the transfer
      File ownership, access rights etc
      Data format of the two machines between which file transfer takes place.
All these issues are handled by the File Transfer Protocol or FTP.

Some important features of FTP
   FTP requires clients to authorize themselves by giving a password.
   Provides an interactive interface for users
 Users can specify the data format (like ASCII, binary etc.) that should be
      used during file transfer.

How FTP Works?
The FTP server process listens to a standard port at the server. Whenever a
client request comes, a new process is created to handle this request so that
multiple clients can connect to the same server. This slave process handles the
control connection with that client. The control connection persists till the client
terminates the connection. Whenever data has to be transferred, a new process
is created to handle that. For data transfer, FTP uses the NVT specification
specified for TELNET.

Let us see in detail, how the connections are established. The client chooses a
random local port and connects to the server at a well known port (port 21). This
is for the control connection. For data transfer, the server uses port 20. The
client specifies a local port to which data connections have to be made. This port
information is sent to the server using the control connection.

How does a user interact with the FTP program?
FTP has a set of user commands. Using these commands a user can perform
operations like sending/receiving a file, changing the current directory, changing
the data format used for transfer etc. The user specifies the domain name or IP
address of the FTP Server. Once a TCP connection is established, he gives out
the login name and the password. Then the user can perform a series of
operations till he terminates the connection. There is a provision in FTP for doing
an 'anonymous' login, with the user's e-mail address as the password. Obviously an
anonymous user will have fewer privileges than other users. Anonymous FTP is
widely used in the Internet for downloading files.
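
As an illustration, the sequence described above can be exercised with Python's
standard ftplib module. This is only a sketch; ftp.example.org, the directory
and the file name are hypothetical.

from ftplib import FTP

# connect to a hypothetical server on the control port (21) and log in
# anonymously; ftplib supplies a default e-mail-style password
with FTP("ftp.example.org") as ftp:
    ftp.login()                        # USER anonymous / PASS anonymous@
    ftp.cwd("/pub")                    # command sent on the control connection
    ftp.retrlines("LIST")              # listing arrives on a separate data connection
    with open("readme.txt", "wb") as f:
        ftp.retrbinary("RETR readme.txt", f.write)   # binary-mode transfer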

TFTP
TFTP stands for Trivial FTP. As the name suggests this is a much simpler
protocol than FTP. It doesn't provide authentication facility and provides simple
file transfer operations. TFTP doesn't need a reliable transport service and so
can run on top of UDP. How then does it ensure reliable file transfer ? Files
are sent as fixed size blocks. For each of the blocks, an acknowledgement is
sent at the TFTP level.
Lesson 15.3 E-mail
E-mail is one of the oldest applications on the internet. E-mail needs no
introduction from a users’ perspective. So let us straight away jump to the details
of the architecture and protocols related to e-mail.

E-mail has three primary components – user agents, mail servers, and the
Simple Mail Transfer Protocol (SMTP). User agents are the programs (e-mail
readers) that allow us to read, compose, edit, reply, forward, send and save
messages. Mail servers are the programs that send our mails to the recipients
and receive our mails from the other mail servers, and maintain our mailboxes.
Usually an e-mail address consists of a mail box name and a machine name
separated by '@' (example scse@annauniv.edu). The mail box name identifies
the user, and the machine name identifies the server. Now it is not necessary
that the machine name is an actual domain name or that mailbox is a user in that
machine. They may be aliases to some other machine and user. SMTP is the
principal protocol used for sending mails. In addition, e-mail also makes use of
one of two mail access protocols - POP3 or IMAP, through which user agents can
automatically retrieve messages from the server. We will look at the details of
these three protocols.

15.3.1 SMTP

As with many application layer protocols, SMTP has two sides – a client side and
a server side. Actually both the client and server sides run on every mail server.
When the mail server is sending data, it acts as an SMTP client. When it is
receiving data, it acts as an SMTP server.
SMTP defines a format for sending messages, and a sequence of messages that
have to be exchanged between the client and the server.

 Message format
 The message has a header and a body.
 Header is divided into lines each of which has a keyword, a colon symbol and
   a value (example : From: abc@hotmail.com). Each line ends with the
   carriage-return and line-feed characters (CR-LF) which are defined in the
   standard 7-bit ASCII character set.
 The message and the header should contain only readable text. (That is how
   e-mail messages were originally sent – and continue to be sent today! How
   then do we send images, pictures, and movies as attachments ? A small trick
   is used here. We will see that shortly. Now focus on the text based protocol. )

Message transfer
The communication between the client and the server is in readable text. The
steps involved in transferring are as follows :
      The client establishes a connection with the server and waits for a
       'READY FOR MAIL' message to come from the server giving the name of
       the server and indicating that it is ready to receive the message from the
       client. Every message has a number associated with it. The number for
       this is 220.

      When it receives message 220, the client sends a HELO Command along
       with its name.

      The server then acknowledges (OKs) the HELO by sending a HELO from
       its side – a message with number 250. The mail connection is thus
       established.

      After this, any number of mail messages can be sent to the server. To do
       that, the following command exchange takes place.

      The client sends a MAIL command followed by the FROM field which tells
       the address from whom the mail is being sent (and to which errors should
       be reported).

      The server on receipt of the above command gives out an OK (250)
       message.

      The client then sends a series of RCPT commands specifying the
       receivers. Each is either acknowledged by an OK (250) message or an
       error message (550).

      After the set of RCPT commands the client sends a DATA command to
       indicate that it is ready to send data. On receiving this the server sends
       out the message ‘Start Mail Input’ (354).

      The client then sends the data and terminates it with a new line with a
       single “.” character.

      The server responds with an OK (250) message.

      After transferring all the messages, the client sends the QUIT command.

      Server responds with a 221 message and closes the connection.

The entire exchange is a text based exchange of messages. You can look at a
transcript of this exchange and easily understand what has happened.
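
For instance, the exchange can be reproduced with a few lines of Python over a
raw socket, as sketched below. The server name and addresses are hypothetical,
and error handling is omitted.

import socket

def send(sock, line):
    sock.sendall((line + "\r\n").encode("ascii"))   # SMTP lines end in CR-LF

# mail.example.org is a hypothetical SMTP server; 25 is the SMTP port
with socket.create_connection(("mail.example.org", 25)) as s:
    print(s.recv(1024))                    # 220 ready-for-mail greeting
    send(s, "HELO client.example.org");  print(s.recv(1024))   # 250
    send(s, "MAIL FROM:<aa@xx.org>");    print(s.recv(1024))   # 250
    send(s, "RCPT TO:<bb@yy.org>");      print(s.recv(1024))   # 250 or 550
    send(s, "DATA");                     print(s.recv(1024))   # 354 start mail input
    send(s, "Subject: test")
    send(s, "")                            # blank line ends the header
    send(s, "hello")
    send(s, ".")                           # a lone dot terminates the message
    print(s.recv(1024))                    # 250 OK
    send(s, "QUIT");                     print(s.recv(1024))   # 221, connection closes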

This type of exchange takes place between the mail agent and the server. Again,
a similar exchange takes place between your server and the recipient’s server.
The only thing is that this may not be done immediately, but at the discretion of the server.
This is referred to as delayed delivery. How is delayed delivery implemented?

Once a user sends an e-mail message, the message along with additional
information like the sender, receiver etc are stored locally in a spool. A
background mail transfer process periodically checks the spool. If there is some
message in the spool, it tries to establish a TCP connection with the
destination. If a connection could be established and the message could be
successfully sent, the copy of the message in the local spool is deleted. Also, if a
message could not be transferred in a reasonable period of time, say a few days,
it is removed from the spool after sending an error message to the sender.

Forwarding mail to multiple addresses:-
You might know about the concept of mailing lists whereby a message can be
sent to multiple users. This can be easily achieved because of aliasing. As the e-
mail address may not correspond to an actual domain name and user, a mail
forwarding software maps this to the actual destination. By making this mapping
one to many, a single alias can correspond to different destinations. Conversely,
if this is many-to-one, a real destination can be accessed by different aliases.

15.3.2 MIME & S/MIME

The problem with non-text attachments :

We mentioned that the header and the message should contain only readable
text. But do we only send text in our messages ? No – we obviously have to send
attachments that consist of pictures, audio files etc. which are clearly non-text
messages or non-ascii messages. Look at the problem we may have when we
just send such data as part of our mail message – i.e in the DATA portion of the
mail message. Remember that a single dot on a line is the ending delimiter for
the DATA portion. If the message contains only text, you can make sure that this
pattern appears only at the end when you actually want to end the message.
Now, if we take non-ASCII data and send it as it is, this pattern corresponding to
end of data could occur any where in the non-ASCII data – and would be
interpreted by the SMTP server as end of data. That is a problem. So we have
to make sure that such a sequence of characters never occurs as part of the data
being sent. (Does it ring a bell – we have seen similar problems at the data link
layer and solutions such as bit stuffing and character stuffing !).

So what we do is use an encoding scheme defined by a standard called MIME –
Multipurpose Internet Mail Extensions. MIME encodes the non-ASCII data into
what is called base64 encoding. It takes 3 bytes of the non-ASCII data and
divides them into four 6-bit units (base-64 units). Each 6-bit unit maps to a
printable character in the 7-bit ASCII range, so the encoded data appears as
pure text and can be carried safely as part of DATA in the SMTP protocol. We
have to let the server know that base64 encoding
has been used so that it can decode it back to the original content. This is
achieved by adding a few header lines to the SMTP format. We add at least 3
fields to the header – a MIME version field specifying the version of MIME being
used, Content-Transfer-Encoding specifying that base64 encoding is used, and a
content type field that specifies the type of content. Examples of content type
would be image/jpeg, text/plain, audio/mp3, application/Openoffice and so on.
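
The mechanics can be seen with Python's standard base64 module; this is just a
sketch of the 3-bytes-to-4-characters mapping described above.

import base64

chunk = b"\xff\x00\x7f"                    # 3 arbitrary non-ASCII bytes
encoded = base64.b64encode(chunk)          # -> 4 printable ASCII characters
print(encoded)                             # b'/wB/'
print(base64.b64decode(encoded) == chunk)  # True: decoding recovers the data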

If multiple-attachments are sent, a Multi-part header is added indicating the
message consists of multiple parts, and content related header fields are
specified separately for each part.

A sample message :

From: aa@xx.org
To: bb@yy.org
Subject: sample message
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="frontier"

This is a multi-part message in MIME format.

--frontier
Content-Type: text/plain

This is the body of the message.

--frontier
Content-Transfer-Encoding: base64
Content-Type: image/jpeg

PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGh
pcyBpcyB0aGUg(base 64 encoded data …)
…

--frontier--
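
A message like the sample above can be built with Python's standard email
package, as in the sketch below; the addresses and the attachment file name are
hypothetical.

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

msg = MIMEMultipart()                 # multipart/mixed, with a generated boundary
msg["From"] = "aa@xx.org"
msg["To"] = "bb@yy.org"
msg["Subject"] = "sample message"
msg.attach(MIMEText("This is the body of the message."))
with open("photo.jpg", "rb") as f:    # hypothetical attachment
    msg.attach(MIMEImage(f.read(), _subtype="jpeg"))   # base64-encoded part
print(msg.as_string()[:400])          # headers plus boundary-delimited parts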

S/MIME is a version of MIME with security features built-in. It has additional
functionality to encrypt and digitally sign the data being sent.

15.3.3 POP3 – Post Office Protocol

This is one of the two commonly used mail access protocols. You understand
why we need this protocol, don't you ? That is, when somebody sends you a mail,
it is stored in your mail server to be accessed by you at your convenience. This
access is facilitated by the mail access protocols. These protocols give you other
facilities to keep track of messages that you have already downloaded from the
server, authorize access to your mailbox, maintain your mailbox, obtain mail
statistics etc.

POP3 also uses a client-server paradigm with the POP3 server running on the
mail server, and the POP3 client running on your local machine, often as part of
your mail agent program (say MS Outlook). The messages exchanged are again
simple text based messages which contain a sequence of commands and
response to these commands. POP3 goes through three phases – an
authorization phase, a transaction phase and an update phase. Authorization
involves sending of username and password to the server, which allows further
access if the authorization information is correct.

A sample exchange :
S : +OK POP3 server ready
C : USER ram
S : +OK      /* or an -ERR message in case of error */
C : PASS lakshman
S : +OK user successfully logged on
When the client establishes a connection with the server, the server first
responds with an OK message.

Then comes the transaction phase in which commands such as list (list size of all
stored messages), retr (retrieve one by one), dele (delete ) etc. are sent from the
client to the server, and the appropriate actions are performed. To understand
the various features of POP3 – just look at any mail agent program such as
outlook express, and look at all the information that you configure and all the
options that you are provided (download and delete, download and keep - keep
a copy on server etc). They are all features of the POP3 protocol ! A quit
command ends the POP3 session. The mail server carries out any updates
(deletes) specified by the client – the update phase.
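
The three phases can be walked through with Python's standard poplib module,
as in this sketch (the server name and credentials are hypothetical):

import poplib

pop = poplib.POP3("mail.example.org")   # server greets with +OK
pop.user("ram")                         # authorization phase
pop.pass_("lakshman")
resp, listings, octets = pop.list()     # transaction phase: numbers and sizes
for entry in listings:                  # e.g. b'1 593'
    num = int(entry.split()[0])
    resp, lines, octets = pop.retr(num) # retrieve one message, line by line
    print(b"\r\n".join(lines)[:80])
pop.quit()                              # update phase: pending deletes are applied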

15.3.4 IMAP

IMAP is a more feature-rich protocol than the POP3 protocol. It provides all the
functionalities of POP3, plus many more. Prominent among them are : creation of
mail folders at the server, and movement of messages across folders, search for
messages by sender name or subject, retrieve header only, selectively retrieve
messages / attachments etc. It also maintains user state information across
sessions (POP3 does not). All these are useful features especially for the
nomadic user, who wants to access mail from different locations.
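
Again as a sketch, a few of these server-side features can be seen through
Python's standard imaplib module (server and credentials hypothetical):

import imaplib

imap = imaplib.IMAP4("mail.example.org")
imap.login("ram", "lakshman")
imap.select("INBOX")                                  # a server-side folder
typ, data = imap.search(None, '(FROM "aa@xx.org")')   # search on the server
for num in data[0].split():
    typ, parts = imap.fetch(num, "(BODY[HEADER])")    # retrieve headers only
    print(parts[0][1][:120])
imap.logout()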

15.3.5 Web-based mail
A discussion on e-mail cannot be complete without a discussion on web-based
mail. Surely, all of you have accounts on one or more of the web-based mail
services – hotmail, yahoo, gmail …

What is the difference here ? In this service, the user agent is any ordinary web
browser, and the messages are exchanged between the user and the mail server
using HTTP messages. When you access mail from the mailbox – you are using
HTTP’s GET, rather than POP3. Similarly when you compose and send your
message – it goes to your server using HTTP POST, not SMTP. But when your
server contacts the recipient server to actually deliver the mail, it uses SMTP. It is
only the user interface that is done through HTTP.
Can you list the advantages of web-based mail ?

Lesson 15.4 DNS - Domain Name System
So far, we have been identifying end hosts by IP addresses, which is very
convenient for processing at routers and intermediate systems. But for us
humans, remembering the IP addresses of the machines that we may want to
access is a problem. We would be more comfortable remembering meaningful
names for the servers that we want to access. For instance, we specify
www.google.com, www.annauniv.edu and so on. We call these domain names (or host names).
But for the network (TCP/IP) to actually contact this host, it needs the
corresponding IP address. Where do we get this from ? Who provides the
mapping between domain names and IP addresses ?

The answer to these questions is the domain name system (DNS) which itself is
an application providing a service – naming service. Let us look at how this DNS
works.

When users present a domain name to an application such as a web browser or
mail application, this application in turn takes the help of the DNS application to
get the name resolved to an IP address. Since the number of IP addresses and
domain names is very large, it is not practical to maintain the mapping of all
names in one single central server. (The earliest versions of DNS did that in a
single hosts.txt file !) Hence DNS maintains a distributed database of the
domain-name to IP address mappings (or bindings, as they are called), and uses
a well-defined procedure to access it. The name space
itself is organized in a hierarchical fashion, that also helps to decide where the
binding information is stored.

Hierarchical name space: The entire DNS hierarchy can be visualized as a tree.
The root of the tree is the topmost in the hierarchy, and major domains such as
edu, com, net, gov, org, mil, and country domains such as in, au, sg, etc.
appear as the first level nodes. These domains are further divided into sub-domains,
and finally the leaves of the tree correspond to the hosts being named. The name
is thus obtained by tracing up from the leaf to the root with a dot separating the
node names.

For example : cs.annauniv.edu can be traced by going up the tree given in Fig.
15.1.




[The figure shows the domain name tree: below the root are first-level domains
such as edu, com, gov, mil, org, net, uk and fr; below these are second-level
domains such as annauniv and mit (under edu), cisco and yahoo (under com),
nasa and nsf (under gov), arpa and navy (under mil), and acm and ieee (under
org); under annauniv are cs, ee and physics; and under cs are the hosts ux01
and ux04.]

Fig. 15.1     The domain names tree

Taking advantage of the hierarchical naming, DNS servers are also arranged in
a hierarchical fashion with each server maintaining the bindings of its immediate
next lower level. It may also temporarily cache other information, but it is an
authoritative name server for that set of bindings. Thus, to start with we have a
root name server, which will definitely have the bindings of the first level servers
– edu, com etc. Each of these servers (called Top-Level Domain or TLD
servers) in turn will have information of its next lower level of servers and so on
until we get to the leaves which identify the end machine. The idea is that if we
start at the root, we can definitely find the mapping by systematically querying the
different name servers down the tree. The host name specified is parsed from
right to left to correspond to searching on this hierarchy.

Actually, each DNS client first contacts a local DNS server which may be
common to the local network. (The IP address of this local DNS server is
configured in the client – either manually or by DHCP). The local DNS server will
cache the bindings of the queries that it answers. When many nodes in a network
are accessing the same server, they would have identical queries, and they can
be answered immediately from the cache. If the local DNS does not have the info,
then it acts as a DNS client and sends the query to the root or next level name
server.
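
From an application's point of view, all of this is hidden behind a simple
library call to the local resolver. A Python sketch (the host name is just
illustrative):

import socket

# a stub-resolver call: the query goes to the configured local DNS server,
# which performs the iterative/recursive resolution described below
name, aliases, addresses = socket.gethostbyname_ex("www.annauniv.edu")
print(name)        # canonical host name, after following any aliases
print(aliases)     # alias names, if the queried name was an alias
print(addresses)   # the resolved IP addresses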

In general, when DNS clients (or local name server) send DNS queries to the
name servers, they respond with direct answers or pointers to other DNS servers
which are likely to have the information. The client or local name server can then
query the next DNS server and so on. This is referred to as iterative querying (fig.
15.2). Alternatively, when the DNS client contacts a DNS server, the contacted
server can take the responsibility of contacting the next level DNS server that is
likely to have the info, which in turn will contact the next level, and so on, until
the authoritative name server is reached. Once that is done, it returns the result
back to the previous
server, which caches the result for future use, and passes the result back along
the path that the query was received. This process is called recursive querying
(Fig. 15.3).




Fig. 15.2 Iterative query
Fig. 15.3 Recursive DNS query

Both schemes – iterative and recursive have their advantages and
disadvantages. While iterative querying puts a burden on the local server initially,
in the process, it helps the local server accumulate a lot of intermediate
information which can be cached and used for subsequent requests. Recursive
querying alleviates the burden on the local server, but it hides information from
the local server. With recursive querying, if all the queries start contacting the
root server, the burden on the root can become very heavy. Hence, normally a
combination of recursive and iterative querying is used. That is, the local name
server first contacts the root name server (iterative process) and gets the address
of the next level name server. Then it uses the recursive process by sending a
recursive query request to that name server (Fig. 15.4). Thus a judicious combination of
both approaches is often used. The choice of the type of querying is specified by
status bits in the header of the DNS message.
Fig. 15.4 Iterative and recursive querying in DNS

15.4.1 DNS resource records

All information pertinent to DNS, and exchanged between the DNS clients and
servers is maintained in the form of resource records (RRs). Essentially an RR is
a name-to-value mapping organized as a 5 tuple as shown below :

< name, value, type, class, TTL >

The name and value fields are self-explanatory. The type field identifies the type
of RR, and specifies how the name and value fields are to be interpreted. The
class field identifies the class of addresses being resolved – the default value is
IN – internet addresses. The TTL field (time to live) specifies the time duration
for which this record is valid. This is necessary for effective caching: when the
TTL expires, the entry must be evicted from the cache.
Coming to the type field – we need to understand some important types, viz.,
type A, NS, CNAME, and MX.
An A type record gives the actual binding of host name and IP addresses. For an
A type record, the Name field is a host name, and the value field is the
corresponding IP address.
Eg. : < annauniv.edu, 194.3.4.23, A, IN, nn >

An NS type record gives the domain name for a host that is running a name
server that knows how to resolve names of that domain. If Type = NS, then
Name field is a domain, and the Value field is the host name of the authoritative
server for that domain.
Eg. : <cs.annauniv.edu, dns.annauniv.edu, NS, IN, nn >
This would be returned by the edu name server for a query on cs.annauniv.edu,
saying that it does not have the mapping for cs.annauniv.edu, but knows the
authoritative server for that.

A CNAME record is used to define aliases, and gives the canonical name for a
host. If Type = CNAME, then Name field is the alias name, and the Value field
gives the canonical name for which an A type record would exist giving the actual
binding.
Eg. : <cs.annauniv.edu, dcse.annauniv.edu, CNAME, IN, nn>

An MX record is used to identify the mail server for a domain. If Type = MX, then
the Name field is the domain name, and the Value field gives the canonical
name of the mail server for that domain (for which an A type record would
exist).
Eg.: <annauniv.edu, mail.annauniv.edu, MX, IN, nn>

An example showing all of the above existing in the annauniv.edu name server:

< dcse.annauniv.edu, 195.43.64.23, A, IN, nn >
<cs.annauniv.edu, dcse.annauniv.edu, CNAME, IN, nn>
< annauniv.edu, mail.annauniv.edu, MX, IN, nn>
<mail.annauniv.edu, 195.43.64.24, A, IN, nn >
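
To see how these records work together, here is a toy in-memory resolver over
the four example records above, following the CNAME chain to the A record (a
sketch only; real servers also handle TTLs, classes and delegation):

# the example records, keyed by (name, type)
records = {
    ("dcse.annauniv.edu", "A"):     "195.43.64.23",
    ("cs.annauniv.edu", "CNAME"):   "dcse.annauniv.edu",
    ("annauniv.edu", "MX"):         "mail.annauniv.edu",
    ("mail.annauniv.edu", "A"):     "195.43.64.24",
}

def resolve(name):
    # follow CNAME aliases until an A record yields the IP address
    while (name, "A") not in records:
        name = records[(name, "CNAME")]    # KeyError if the name is unknown
    return records[(name, "A")]

print(resolve("cs.annauniv.edu"))   # -> 195.43.64.23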

In terms of actual implementation, the hierarchy is partitioned into subtrees called
zones for administrative purposes. Each zone can be thought of as an
administrative authority responsible for that portion of the hierarchy. Each zone
would then have a name server that manages all the sub-domains within that
zone. This is depicted in Fig 15.5 below.
Fig. 15.5 DNS domains and zones

15.4.2 DNS Message Format

DNS uses a query-reply paradigm. Query and reply messages have the format
shown below. It is pretty much self-explanatory. The distinction between query
and reply is by means of a bit in the flags field. The format of a DNS message is
shown in Fig 15.6 below.

The first field is a 16-bit identifier that identifies the query, and matches the reply
with the query. The flags field has information about – query/reply, recursion
available or not, whether it is an authoritative name server etc. This is followed by
the number of questions, answer RRs, authority RRs and additional RRs being
carried in that message. This is followed by the actual queries and the answer RRs.
Fig. 15.6 DNS message format



Lesson 15.5 HTTP
The Hyper Text Transfer Protocol is central to the browsing on the World Wide
Web that we are all so familiar with. It allows a client program, typically the web
browser, to send requests to a server that hosts web pages and to retrieve
content from the server. HTTP defines the structure of the messages exchanged
between the server and the client for this purpose. The web-page may consist of
all kinds of objects – HTML file, JPEG file, WAV file, Java Applet etc. Irrespective
of the type of the object, HTTP helps to retrieve the object from the server. Let us
look at how this happens.

15.5.1 Basic Operation

When you want to access a web page – what do you do ? You type the URL –
which is a complete address for the object (host name + the path name for that
object on that host) – in the web browser, and press enter. In turn, the browser
uses the HTTP client protocol to send the request to the server. The HTTP client
protocol establishes a TCP connection with the HTTP server program. (In
between – we will have a DNS resolution performed to identify the IP address of
the host name specified before TCP is invoked). The server program will parse
the request, identify the type of request, and respond with a reply message which
will contain the base file that is requested (if it is a GET file kind of request). The
base file may have references to other objects. These are retrieved by the
browser by sending separate requests for each of the objects.

Modes of operation

Here, two modes of operation are possible – persistent connection, and non-
persistent connection. With a non-persistent connection, as soon as a single
request is served, the corresponding TCP connection is closed. Hence to retrieve
each subsequent object in the base file, separate TCP connections are
established and closed. Obviously this takes more time – each object’s retrieval
will take 1 RTT to establish the TCP connection, at least 1 RTT for retrieving the
data, and 1 RTT for closing the connection. Thus if a web page contains 5
objects (6 files including the base file), it would take 6*3 RTTs for retrieving the
full information. One
optimization here could be the use of parallel TCP connections – where the 5
object requests can be sent in 5 parallel TCP connections. Most web browsers
allow the user to configure this feature. This mode of operation was used in the
original version of HTTP, and is still supported in current versions to maintain
compatibility.

The default option in current version of HTTP (ver 1.1) is persistent connection.
In this mode, once a TCP connection is established, it is used for retrieving
further objects from the same server. We do not use separate TCP connections
for each object. Advantage is that we obviously save on time. For the same
example mentioned above, this would take 1 RTT to establish the connection + 6
RTTs to retrieve the 6 files (7 RTTs). But how long do we keep the connection
open ? The server typically will close the connection if it isn’t used for a certain
time (some timeout interval).

There are two variations of persistent connections – without pipelining and with
pipelining. In the non-pipelined mode of operation, the objects to be retrieved are
obtained one by one, in a serial fashion. That is, the request for a second file is
sent only after the first has been served. This would amount to the 7 RTT
response time w.r.t. the previous example. Obviously we can do better than this –
by sending all the requests together – back-to-back. The server processes the
requests in whatever order, and replies are also sent back-to-back. This is what
is done in the pipelined mode of operation. Obviously, the time taken comes
down even further – 1 RTT to establish connection, and 1 RTT to get the back-to-
back responses – just 2 RTT + you can add 1 to close the connection – though
that is actually done in the background.

Persistent connection with pipelining is the default mode of operation, though the
other modes can also be used.
Whether persistent or non-persistent, you should note that HTTP itself is a
stateless protocol. That is, the server does not maintain any information about
the clients or their requests. If you request the same object again, the server will
respond with the same object again. It does not tell you that you just retrieved the
object. This does not sound very intelligent, but it helps to keep the HTTP server
simple.
Some tricks at the client side, and some special features in the HTTP protocol
are used to overcome such inefficiencies. We will look at that shortly. Before that
let us look at the format of the messages exchanged.

15.5.2 Format Of HTTP Messages

A HTTP request message has the following general format.

   [METH] [REQUEST-URI] HTTP/[VER]
   [fieldname1]: [field-value1]
   [fieldname2]: [field-value2]
   <Empty line>
   [Request body, if any]

The first line specifies the function requested (in the METH field), URL requested,
and the HTTP version. This is followed by other header lines which may give
information on
 Host : the name of the server host to which the request is directed,
 Connection : close or keep-alive – whether the connection is to be closed
   after this request (non-persistent) or kept open (persistent)
 User-agent : Mozilla/5.0 – the user program that is making the request (this
   will allow the server to serve content differently if desired)
 Accept-language : ln – specification of which language you want the content
   delivered in, and so on.

The header portion ends with a blank line. It may be followed by some data if any,
that is to be sent along with the request.
The METH field can take different values – GET, POST, HEAD etc. GET is used
to get pages from the server, POST to send some data to the server (say for
filling in forms etc.), and HEAD is used to retrieve only the Header portion and so
on.
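
To see the format in action, the sketch below hand-builds a GET request and
sends it over a plain socket; www.example.org is a placeholder host.

import socket

request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.org\r\n"          # header lines: fieldname: value
    "Connection: close\r\n"              # ask for a non-persistent connection
    "\r\n"                               # the blank line ends the header
)
with socket.create_connection(("www.example.org", 80)) as s:
    s.sendall(request.encode("ascii"))
    reply = b""
    while chunk := s.recv(4096):         # read until the server closes
        reply += chunk
print(reply.decode("ascii", "replace")[:200])   # status line and headers
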
A HTTP Response Message looks like this:

HTTP/[VER] [CODE] [TEXT]
Field1: Value1
Field2: Value2

...Document content here...
The first line shows the HTTP version used, followed by a three-digit number (the
HTTP status code) and a reason phrase in plain text meant for us to understand
the status. If there is no error, the code is 200 and the phrase is "OK". The first
line is followed by some lines which contain information about the document
including
Date : 7 Jan 2008 16:53:00           // date and time of sending the reply
Server : Apache/1.3.0                // server program serving the request
Last-Modified : date                 // date at which the file was last modified
Content-length : nnnn                // length of the content
Content-type : text/html             // type of content
and such similar information which is quite self-explanatory.
The header ends with a blank line, followed by the document content.

If there is an error – the code field on the first line would give us that information.
All of you will be familiar with this “HTTP/1.1 404 Not Found” message. Similarly,
other messages are broadly of the following 5 types :

Code   Type            Example reasons
1xx    Informational   Request received, continuing process
2xx    Success         OK
3xx    Redirection     Moved permanently
4xx    Client error    Bad syntax in request
5xx    Server error    Internal server error

15.5.3 Other features of HTTP

Caching

We can easily see the benefits of caching – in the HTTP scenario. As we
mentioned earlier, HTTP is a stateless protocol, and if you request the same
information again, it will be sent by the server again. This is wasteful of
bandwidth and inefficient. One way to handle this inefficiency is by caching the
content on the user’s machine. The browser itself can take care of this activity.
This does not require any modification to the server or the HTTP protocol ! All
that the browser program has to do is store recently accessed files on the local
machine, and when a request originates, check if the corresponding content is
already available in the store (cache). If it is, the request can be served locally,
without contacting the server.

Alternatively, a proxy server or web caching scheme can be employed. The
concept is the same – caching. Just that the cache is common to a group of
machines. This is suitable for a network architecture where several machines are
connected to the internet through a gateway or a proxy server. The caching can
then be done at this intermediate point. The advantage is that any content
recently accessed by some other machine is also locally available and can be
accessed from the local network itself. This saves time and reduces unnecessary
traffic. Of course, the caching service has to intercept the request messages and
do the necessary interpretation.

Actually, this idea has been used very successfully, to the extent that caches are
maintained at a number of points on the internet, and cooperative caching
schemes, which involve caches talking to each other by means of an internet
cache protocol, have been implemented.

Conditional GET

There is one problem with caching though. What if the cached copy is stale ? i.e.,
the page has been updated in the server, but the cache has old data. How would
you identify this ? This requires some checking with the server. That is facilitated
by the conditional GET command in HTTP. A conditional GET message is
specified by a header line which says :

If-Modified-Since : date.

If there is no modification, the server sends back a message with the status field
specifying “not modified”. There is no body to this message. Now, the cache copy
can be considered to be up-to-date and the content delivered locally. This
requires that the last modified date of the retrieved content is also stored in the
cache.
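
A sketch of a conditional GET using Python's standard http.client (the host and
the date are illustrative; the date would normally be the stored Last-Modified
value):

import http.client

conn = http.client.HTTPConnection("www.example.org")   # hypothetical server
conn.request("GET", "/index.html",
             headers={"If-Modified-Since": "Mon, 07 Jan 2008 16:53:00 GMT"})
resp = conn.getresponse()
if resp.status == 304:         # Not Modified: the cached copy is up-to-date
    print("serve from cache")
else:                          # 200: the body carries a fresh copy
    page = resp.read()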

Cookies

Cookies are another solution to the stateless property of HTTP. Since HTTP
does not maintain any information across requests or across sessions, it
becomes difficult for web-sites to keep track of their users. And normally, web
sites would like to keep track of their users – either to restrict access or to
provide special services. Cookies have been devised to keep track of user
information at the user's machine itself, and seamlessly pass that information to
the server application. Let us look at how this works.

The Cookie mechanism has four components – a cookie header line in the
HTTP response message, a cookie header line in the HTTP request message, a
cookie file kept on the user’s system and managed by the user’s browser, and a
back-end database of user-related information at the web-site.
Cookies are installed in the user’s machine, by means of a Set-Cookie header in
the HTTP response from a site that wants to use cookies. The cookie is actually
just an identification number. The Set-cookie header line has this identification
number as its value field. This number is generated by the server for a user, and
maintained in its data base. On seeing this line in the response header, the
browser adds this information to its cookie file. The cookie file keeps track of
cookies for various sites. Subsequently, whenever a request is sent to that site,
the browser looks up the cookie file, picks up the cookie number, and sends this
information in the header of the request message as

Cookie : number

The web-site, on seeing this number, indexes its database with it, extracts
user-specific information, and accordingly serves the user.
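
The round trip can be sketched with http.client as below; the host, paths and
cookie value are hypothetical.

import http.client

conn = http.client.HTTPConnection("www.example.org")   # hypothetical site
conn.request("GET", "/")
resp = conn.getresponse()
cookie = resp.getheader("Set-Cookie")    # e.g. "id=1678; Path=/"
resp.read()                              # drain the body to reuse the connection
if cookie:
    ident = cookie.split(";")[0]         # keep just the "name=value" part
    # the browser would now replay this on every request to the site
    conn.request("GET", "/account", headers={"Cookie": ident})
    print(conn.getresponse().status)
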
Cookies are used to authenticate users, track user behavior (especially for e-
commerce web-sites) etc. Speaking of authentication, there is another header
line that is used specifically for authentication purposes.

Lesson 15.6 Network Management
The network, with all its protocols and standards that we have looked at so far,
would be of little value if it cannot be managed properly. The problem of
management becomes really hard (as with any management!) as the size of the
network grows, especially with heterogeneous systems. Consider the current
scenario where we are connecting - not just computers but a host of other
devices as well - switches, private branch exchanges, UPSes, peripheral devices
etc. How does one manage such a network?

Obviously, some standard should be adhered to – by all the devices that need to
be monitored / managed. A discussion of such a standard brings us to network
management protocols.

We’ll look at the architecture of some standard network management protocols
to understand how things are handled. Before that, we’ll look at what we mean
by management of a network; what are we trying to monitor etc.

15.6.1 Network Management Modules

Overall, we would like to monitor the system with regard to
          o     Configuration
          o     Fault
          o     Performance
          o     Security
          o     Accounting.

Thus one may say that these are the modules that should constitute a network
management system. The standard and the protocol should be such that these
functions are facilitated on the “objects” that are managed. Note the
word “object”. We are not just talking about “systems” that need to be managed.
By the word “object” we are referring to any entity in the system that may need
to be monitored. It may be a protocol in the system, it may be an add-on card in
the system, or it may be the system itself.
Functions Of These Modules

The “configuration management” module should be responsible for initializing,
operating, closing down and reconfiguring the managed objects. For this
purpose, it could associate names with managed objects, set up parameters for
the objects and collect data about the operations to detect any change in the
state of the system.

The “fault management module” should be responsible for detecting, isolating
and repairing problems. To this end, it should support an ability to trace faults
through the systems, carry out diagnostics, and correct faults on the detection of
errors. Maintenance of error logs, time-stamps, analyzing error logs to detect
faults and their causes, etc., are also to be taken care of.

The “performance management” module would have to handle gathering of
statistical data and analyzing this to determine the performance of the system.
Performance may be determined with respect to some performance measures
such as - throughput, load, delay, initialization etc. It may do passive monitoring
or active monitoring. Passive monitoring would mean that it would just report
performance measures. Active monitoring would mean that, when the measure
hits a threshold, some corrective action would be taken.

The “security management” module is responsible for “protecting” the managed
objects. It could provide authentication procedures, rules, support for encryption,
maintenance of keys for that purpose, maintaining security logs, authorization
facilities etc. Basically, it would control the access rights for the managed objects.

The “Accounting Management” module is the accountant. It should keep track of
network usage, charges, costs and any limits on these. Thus, these are the basic
activities that any network management system should support.

We’ll now look at some of the common standards that have been developed for
network management.
15.6.2 Network Management Standards

There are two prevalent standards for network management – the OSI network
management standard and the Internet network management standard. The
basic methodology specified by both these standards is pretty much the same.
They differ in certain details. For an understanding of how they work, we’ll focus
on the Internet standard, which is the more popular of the two. (The second
reason being that we are [hopefully!] familiar with the working of the internet by now!)

As with all internet related documents, these standards are also available as
RFCs. Before suggesting that you take a look at these RFCs, let us outline the
basic structure of a network management system. (We took a look at the
functionality that should be provided by an NMS – now a peep into how it is
intended to be done.)

Network Management System (NMS)

A Network management system is essentially a collection of tools for network
monitoring and control. It typically views the entire network as a single unified
architecture, and is able to uniquely address each point of control or object in it. It
may be viewed as having 4 key elements.

   o   Management station, or manager.
   o   Agent(s)
   o   Management information base(MIB)
   o   Network management protocol

The ‘Management station’ is the interface for the network manager (human!) to
the NMS.
    It should translate the network manager’s requirements into the actual
      monitoring and control of the network objects.
    It will maintain the data collected from the different objects.
    It will have a set of utilities to analyse this data, generate reports, identify
      faults, recover from them etc.

The “agent” (or agent s/w) resides on the individual network elements (say the
hosts, bridges, routers, hubs etc.).
    It is responsible for collecting information and passing it on to the
       management station.
    It normally responds to requests for information and actions from the
       management stations.
    It may also report unusual conditions/alarms to the manager –(even when
       not asked for)

The “MIB” is a collection of the objects that are to be managed. Actually it is a
collection of data variables representing different attributes / aspects of the entity
to be managed. These data variables or objects are standardized across a class
of network elements. For instance, all routers support the same management
objects. The format in which these objects are represented is standardized.
The MIB is acted upon by the agent under the control of the management station.

The “Network management protocol” is the protocol used for communication
between the agent and the management station to exchange information
regarding the MIB. Coming to the standards, they basically differ in the
specifications of the MIB, and the communication protocol. The Internet standard
uses a protocol called the SNMP (Simple Network Management Protocol). The
OSI standard uses CMIP – the Common Management Information Protocol.
SNMPV2 is a protocol that is intended to work with both standards. The figure
below (fig. 15.7) depicts a typical NMS.




Fig. 15.7 A typical NMS

Note that an intermediate entity can act as both manager and agent – i.e a 2nd
level manager that manages agents below it. We’ll now look at the Internet
standard.

Internet standard

As mentioned above, the standard essentially outlines the organization of the
MIB and the communication protocol. So we’ll examine the Internet MIB, and
the SNMP/ SNMPV2 protocol.

Internet MIB

As given above, the MIB identifies the objects to be managed, and provides a
naming mechanism to uniquely identify each managed object. In addition to this,
it also specifies access rights about this information for the users, and how this
information is reported.

For the internet MIB, the internet Structure of Management Information (SMI)
describes the identification scheme and the structure of the managed objects.
The object definitions are specified by RFCs.

The identification scheme describes the names used to identify the managed
objects. These names serve as object identifiers. A hierarchical naming scheme
is used. This naming scheme is basically derived from the hierarchical
registration scheme used to categorize objects.

This is best understood with an example: consider the figure below (fig. 15.8) that
gives a portion of the internet registration hierarchy.




Fig. 15.8 Internet Standard MIB hierarchy

The internet object group is identified by the name 1.3.6.1. This name is derived
by tracing down the tree, beginning at the root. Thus, if we want to refer to the
interface type, we note that it is the 3rd item under the ifEntry, which is the 1st
entry of the ifTable, which is the 2nd entry under interfaces; interfaces is the
2nd entry under “mib” and so on. Therefore iftype is identified by 1.3.6.1.2.1.2.1.3.
If the value corresponding to this object is 6, it indicates that it is an Ethernet interface.

Such identifiers uniquely identify a managed object. However, it may not always
be necessary to specify the complete identifier. For example, if a network
management protocol is exchanging traffic about Internet managed objects, then
the prefix identifier 1.3.6.1.2.1 does not change. It need not be exchanged or
stored every time. Only the suffix part may be exchanged between the manager
and the agent.

A quick look at what the different object groups under the Internet MIB
represent :

System:
    This object defines the name and version of the hardware, operating
      system and network software.
    Hierarchical name of the group.
    An indication of when the management portion was initialized, etc.

Interface:
      This gives details about the network interface.
      Number of network interfaces supported.
      Type of interface operating below the IP layer.
      Acceptable datagram size at the interface.
      Speed of the interface.
      Address of the interface.
      Operational state (whether up/down).
      Traffic details - number of packets received, delivered and discarded etc.

At:
      The address translation group gives
      the address translation tables for network-to-physical address translation
       and vice-versa.

IP:
         The IP group gives details of the activity at the IP layer.
         Does the m/c forward datagrams?
         The TTL value for datagrams originating at this host.
         Traffic details
         Address tables, subnet masks.
         Routing tables etc.,

ICMP:
         This specifies data on ICMP packets
         Number of various types of ICMP packets received, transmitted etc.
         Statistics on the problems.

TCP:
         This group gives TCP layer details.
         The retransmission algorithm, retransmission counter values.
         Number of TCP connections supported.
         Traffic details.
         End-point address for each connection.
UDP:
      Relevant details of UDP protocol
      Traffic details.

EGP:
      This gives details of the EGP protocol
      Traffic details
      Routing tables etc.

Transmission group:
    Provides information on types of transmission schemes and interfaces.

SNMP:
   This has details of the SNMP protocol itself.
    Statistics on SNMP traffic.
   SNMP capabilities etc.

Thus to understand this better, the MIB details of a couple of groups are given
in figure 15.9 below.
Fig. 15.9 MIB details of system and interfaces

Can you guess what these items mean ? The entries are (almost!) self-
explanatory. Refer to the RFC for details.

How are these objects defined?

The objects are defined using templates and ASN.1 notation (Abstract Syntax
Notation) – (Remember that ASN.1 gives a neat way of specifying details across
heterogeneous systems – just what we need.) The complete ASN.1 set is not
used to specify the syntax of the object types.

The following primitive object types are used :
      Integer
      Octet String
      Object Identifier
      Null

In addition to this, Sequence & Sequence of constructs are allowed.

Also, several Internet types are defined by the SMI using the primitive types
mentioned above.
    Network Address - a CHOICE type, specific to the protocol family.
    IP Address - an octet string used to specify a 32-bit address.
    Time ticks - a non-negative integer used to represent time in multiples of
       1/100th of a second.
    Gauge - a 32-bit non-negative integer that can increase or decrease in
       value, but does not wrap around.
    Counter - a non-negative integer that can wrap around.

SNMPV2 allows a few more data types.
    Opaque - an octet string that can be used to pass anything (without
     bothering about the details)

Template:

The template contains 5 keywords that define the object, as shown below.
     OBJECT - a name for the object type, with its corresponding OBJECT
       IDENTIFIER (in ASCII text)
     SYNTAX - ASN.1 coding to describe the syntax of the object type -
       integer, octet string etc.
     DEFINITION - a textual description of the managed object, meant for the user.
     ACCESS - access options of whether the object is
           o Read-only
           o Write-only
           o Read-write or
           o Not accessible
     STATUS - status of this object type - whether it is mandatory or optional to
       implement this object.

For eg., a definition of the sysUpTime object as per this template will be as shown
below.

system OBJECT IDENTIFIER ::= { mib-2 1 }
sysUpTime OBJECT-TYPE
SYNTAX              : TimeTicks
DEFINITION          : Time since last reinitialization.
ACCESS              : Read-only.
STATUS              : Mandatory.
It is necessary to understand the MIB specification in the ASN.1 notation to
actually do anything with a network management protocol - writing agent
software or management software. So take a look at some of the related RFCs!!

The protocol messages

Coming to the protocol that is used for communication, the SNMP protocol
consists of a few simple messages. They are :

Get Request
   o Issued by a manager to the agent. Has a list of object names for which
      values are requested. If the get operation is successful, the agent
      responds with a Response-PDU that contains the values of the objects requested.
      Else an error is indicated and a partial list may be returned.
Get Next Request
   o Issued by a Manager.
   o It requests for the object that is next in lexicographic order i.e. next in
      terms of its position in the tree structure.
   o Useful when the manager does not know the set of objects that are
      supported by the agent.
   o Agent responds with Response-PDU
Get Bulk Request
   o Issued by a Manager.
   o Same principle as get next request i.e. gets next-object instance in
      lexicographic order, but can get multiple lexicographic successors, as well.
   o Can specify fields for which only single lexicographic successor is needed
      and those for which multiple successors are needed.
   o Useful to retrieve entries of a table.
   o Facilitates retrieval of large blocks of data.
Set Request
   o Issued by a Manager to modify the values of objects at an agent.
   o The agent responds with a Response-PDU.
Trap
   o Sent by an agent to the manager, without being asked, to report an
      unusual condition or alarm.
Inform Request
   o A notification that is acknowledged by the receiver (used in SNMPV2 for
      manager-to-manager notification).

The format of these PDUs is given below (Fig. 15.10).

  PDU Type | request-id | 0 | 0 | Variable-bindings
  (a) Get Request-PDU, Get Next Request-PDU, Set Request-PDU,
      SNMPV2-Trap-PDU, Inform Request-PDU

  PDU Type | request-id | Error-status | Error-index | Variable-bindings
  (b) Response-PDU

  PDU Type | request-id | non-repeaters | max-repetitions | Variable-bindings
  (c) Get Bulk Request-PDU

  name1 | value1 | name2 | value2 | ... | namen | valuen
  (d) Variable-bindings

Fig. 15.10 SNMP message format

Note that a very limited set of messages is used, which implies a lightweight
agent software that can work with all types of network elements that need to be
managed. That is the beauty of this protocol !
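
The lexicographic-successor idea behind Get Next Request can be sketched in a
few lines of Python, with OIDs held as tuples of integers (the toy MIB values
below are illustrative; the iftype entry is the example from the text):

# a toy agent-side MIB: object identifiers as integer tuples
mib = {
    (1, 3, 6, 1, 2, 1, 1, 3): 421337,    # e.g. an uptime value
    (1, 3, 6, 1, 2, 1, 2, 1, 3): 6,      # the iftype example (6 = Ethernet)
}

def get_next(oid):
    # the smallest stored OID strictly greater than the requested one;
    # tuple comparison in Python is exactly lexicographic order
    successors = sorted(o for o in mib if o > oid)
    return (successors[0], mib[successors[0]]) if successors else None

# repeatedly asking "get next", starting from the root, walks the whole MIB
print(get_next((1, 3, 6, 1)))   # -> ((1, 3, 6, 1, 2, 1, 1, 3), 421337)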

To Sum it up:

An overview of one network management standard has been given. Other
standards are similar. One basically has to get hold of the related MIB (RFCs
are normally the source for such stuff), and figure out how to implement an NMS
using them. As an exercise think of what all you can do using the internet MIB -
GOOD LUCK!

Have you understood ?

   1.       Which is the protocol used for e-mail exchange ?
   2.       What are the mail access protocols ?
   3.       Why is the MIME protocol used with SMTP ?
   4.       Why is a domain name service required ?
   5.       How is the DNS organized ?
   6.       What is an authoritative name server ?
   7.       What is meant by recursive querying and iterative querying ?
   8.       What is the information contained in a DNS resource record ?
   9.       Is HTTP a stateless or stateful protocol ?
   10.      How does a persistent HTTP operate ?
   11.      What is the purpose of a conditional get message in HTTP ?
   12.      Why are cookies used ?
   13.      What are the functions of a network management system ?
   14.      Which are the four modules of a typical NMS ?
   15.      What is a MIB ?
   16.      Why is SNMP a “simple” protocol ?
   17.      What are the messages used in SNMP ?

Summary of the unit

        The session layer helps to organize data transfer into sessions and deals
         with issues such as maintaining continuity of data transfer in a session,
         authentication etc.
        The presentation layer provides standard data representation for
         communicating among heterogeneous machines.
    It marshals the data at the sender end – converts to standard format from
     native format, and unmarshals the data at the receiver end – converts from
     standard format to the native format of the receiver.
   The application layer deals with actual application related issues.
   Thus presentation and session layers provide services to the application
    layer.
   Security and data compression are also common services that are
    required by most applications. They are handled at any layer (one or
    more) below the application layer. They may also be handled by individual
    applications.
   Different techniques for data compression are used depending on the type
    of data – text, image, audio, video etc.
   Security is primarily provided by use of cryptography techniques.
   The primary characteristics of a secure system are confidentiality, integrity
    and authentication (CIA).
   Commonly, confidentiality is ensured by use of secret keys and
    encryption; integrity by use of hashing techniques, and message
    authentication codes; and authentication using passwords and secret keys.
   Various algorithms exist for each of these operations.
    In private (secret) key based techniques, secure exchange of the secret keys
     themselves is a challenge. Public key based systems, which use a combination
     of public and private keys, are useful in that sense. DES and RSA are popular
     examples of the private and public key techniques respectively.
   The computational complexity of the security schemes varies. Your choice
    of the security scheme depends on the computational effort you can afford,
    and your security needs.
    SSL and IPSEC are standards that incorporate security at the transport
     layer and the IP layer respectively.
   Of the upper three layers, viz., session, presentation and application
    layers – the TCP/IP model accommodates only the application layer.
   Common application layer protocols are SMTP for mail exchange, HTTP
    for web access, DNS for domain name resolution, FTP for file transfer,
    TELNET for remote access, SNMP for network management, etc.
   SMTP works by exchanging ASCII-based information between a mail
    server and a mail client. IMAP, and POP3 are mail access protocols that
    help a user access his/her mails from the mail server.
   MIME is an extension that helps SMTP to send non-text messages and
    attachments.
   HTTP also uses the client server paradigm. The web browser is the client
    that Gets and puts files from/to the web server. Again a simple ASCII
    based exchange of information is used.
   HTTP is a stateless protocol. Performance varies based on the different
    flavors of HTTP – persistent HTTP, non-persistent HTTP, with pipelining,
    and without pipelining of requests.
   FTP is a stateful protocol. It uses two connections – one for the data and
    one for control.
   DNS is the domain name service which helps to resolve domain names to
    IP addresses.
   The DNS uses a hierarchical organization of the domain names, with the
    domain name mappings distributed across many servers – root name
    server, authoritative name server, local name server etc. These servers
    exchange requests and replies to retrieve the required mapping from the
    appropriate server.
   An iterative or recursive query process may be used by the DNS servers.
   SNMP is a versatile network management protocol that can monitor any
    network entity.
   It consists of 4 components – a manager, an agent, a management
    information base, and a protocol for communication between the manager
    and agent.
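
The marshalling point can be illustrated in a few lines of Python. This is only a sketch using the standard struct module with made-up field values, not one of the presentation-layer standards covered in the unit:

    import struct

    # Sender: marshal a 32-bit host id and a 16-bit port from native
    # integers into a fixed big-endian ("network order") wire format.
    wire = struct.pack('!IH', 0xC0A80101, 8080)

    # Receiver: unmarshal the same six bytes back into native integers.
    host_id, port = struct.unpack('!IH', wire)
    print(hex(host_id), port)    # 0xc0a80101 8080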
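Similarly, the integrity bullet can be made concrete with a message authentication code. The sketch below uses Python's standard hmac and hashlib modules; the key and message are illustrative values only:

    import hashlib
    import hmac

    key = b'shared-secret-key'                    # known to sender and receiver
    message = b'transfer 100 to account 42'

    # Sender: compute a MAC over the message and send both.
    mac = hmac.new(key, message, hashlib.sha256).digest()

    # Receiver: recompute the MAC and compare in constant time.
    ok = hmac.compare_digest(mac, hmac.new(key, message, hashlib.sha256).digest())
    print('integrity verified' if ok else 'message was altered')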
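For the MIME bullet, the sketch below builds a message with a non-text attachment using Python's standard email package; the addresses and attachment bytes are placeholders. Printing the message shows the MIME headers and the base64-encoded body that SMTP actually carries as ASCII:

    from email.message import EmailMessage

    msg = EmailMessage()
    msg['From'] = 'a@example.com'
    msg['To'] = 'b@example.com'
    msg['Subject'] = 'Monthly report'
    msg.set_content('The spreadsheet is attached.')    # text/plain part

    # Attach binary (non-ASCII) data; MIME base64-encodes it for SMTP.
    msg.add_attachment(b'...binary spreadsheet bytes...',
                       maintype='application',
                       subtype='vnd.ms-excel',
                       filename='report.xls')

    print(msg.as_string()[:400])    # inspect headers, boundaries, encoding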
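For the persistent-HTTP bullet, the sketch below fetches several objects over a single TCP connection using Python's standard http.client module (the host and paths are placeholders). A non-persistent client would instead open a new connection for each object:

    import http.client

    conn = http.client.HTTPConnection('www.example.com')    # one TCP connection
    for path in ('/index.html', '/logo.png', '/style.css'):
        conn.request('GET', path)
        resp = conn.getresponse()
        body = resp.read()     # drain the body before reusing the connection
        print(path, resp.status, len(body))
    conn.close()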
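Finally, the DNS resolution bullet: from an application's point of view, resolution is a single library call to the host's configured (local) name server, which performs the recursive or iterative work on the host's behalf. The host name below is a placeholder:

    import socket

    # Ask the resolver (and hence the local name server) for the
    # IP addresses that the given domain name maps to.
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            'www.example.com', 80, proto=socket.IPPROTO_TCP):
        print(sockaddr[0])     # one resolved IP address per record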


Objective type questions

1. The protocol used by the server that handles outgoing e-mail is
      a. POP3
      b. SLIP
      c. LDAP
      d. SMTP

2. Examples of passive security attacks are
      a. Altering message contents
      b. Masquerading
      c. Denial of service
      d. All of the above
      e. None of the above

3. The encryption system that eliminates the problem of key exchange is
      a. Transposition cipher
      b. DES
      c. Public-key encryption
      d. AES
4. A digital signature
      a. Has no place in electronic commerce
      b. Can be imitated by someone else
      c. Is the network equivalent of signing a message
      d. Can be decoded by the receiver using the sender's private key

5. One common method of network access control is
     a. User IDs and passwords
     b. Encrypting the data
     c. Not linking the network to the internet
     d. Locking the doors to the network operations area

6. A firewall
       a. Is usually a combination of hardware and software
       b. Enforces a boundary between two or more networks
       c. Normally logs all transactions that pass through it
       d. All of the above

7. SNMP is
     a. A protocol for high speed data transmission on simple networks
     b. A network management protocol
     c. An alternate name for the common management information
        protocol
     d. A simple protocol to send mail


Review questions

1. What is the encoding used by MIME? What problem does it solve?
2. Suppose a host decides to use a name server not within its organization
   for name resolution. When would this result in more total traffic, for
   queries not found in any DNS cache, than with a local name server? When
   might this result in a better DNS cache hit rate and possibly less total
   traffic?

3. Suppose A sends a message to B using his web-based e-mail account,
   and B accesses his mail from his server using POP3. How does the
   message move from A's host to B's host? List the various application
   layer protocols used.
4. Each internet host will have at least one local name server and one
   authoritative name server. What role does each of these servers play in
   DNS?
5. Is it possible that an organization's web server and mail server have the
   same alias for a host name? What would be the different RRs used in this
   case?
6. Suppose that you send an e-mail message whose only data is a Microsoft
   Excel attachment. What would the header lines look like?
7. Consider accessing a web page consisting of 10 objects (assume that
   each object fits in 1 MSS) using persistent HTTP. Draw a time-line
   diagram showing the transfer of data, taking congestion control into
   consideration.
8. Why should each name server know the IP address of its parent instead of
   the domain name of its parent?
9. Suppose that an HTML file references three small objects on the same server.
   Neglecting transmission times, how much time elapses with (a) non-
   persistent HTTP with no parallel TCP connections, (b) non-persistent
   HTTP with parallel connections, and (c) persistent HTTP with pipelining?
