"An Introduction to Voice over IP"
Lance Parr
Lead Systems Administrator
VoIP Laboratory Manager
Texas A&M University
College Station, Texas

What is VoIP?
• VoIP (Voice over Internet Protocol), sometimes referred to as Internet telephony, is a method of digitizing voice, encapsulating the digitized voice into packets, and transmitting those packets over a packet-switched IP network.

Voice over IP - the basics
• Most implementations use the H.323 protocol
  – Same protocol that is used for IP video.
  – Uses TCP for call setup.
  – Traffic is actually carried on RTP (Real-time Transport Protocol), which runs on top of UDP.

VoIP Protocols
• H.323 Multimedia Standard
  – H.225 RAS - Registration, Admission, Status
  – Q.931 - Call Signaling (Setup & Termination)
  – H.245 - Call Control (Preferences, Flow Control, etc.)
  – Lots of G.7XX CODECS for audio
• SIP – Session Initiation Protocol
  – Covered in next presentation

Here’s how it stacks up: H.323 Multimedia Protocol
  H.225    Call setup & control – RAS (Q.931)
  H.235    Security & authentication
  H.245    Call negotiation, capability exchange
  H.450    Other supplemental services
  H.246    Circuit-switched network interop.
  H.332    Conferencing
  H.26X    Video CODECS
  G.7XX    Audio CODECS

How they fit in: The ISO/OSI Model
  ISO Model Layer    Protocol or Standard
  Presentation       Applications / CODECS
  Session            H.323 & SIP
  Transport          RTP / UDP / TCP
  Network            IP – Non QOS
  Data Link          ATM, FR, PPP, Ethernet

Comparison of Packet vs. Circuit Switching
                            Circuit                    Packet
  Call Setup                Database / SS7 Overlay     H.323 & SIP
  Communications Channel    Dedicated                  Shared
  Addressing                NANP                       IPv4 & IPv6

H.323
• Definition: a multimedia standard that provides a foundation to transport voice, video and data communications in an IP-based non-QOS network.
• H.323 Zone – Collection of terminals, gateways, and MCUs registered with a single gatekeeper.
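The packetizing step described under "What is VoIP?" (digitized voice wrapped in RTP and carried over UDP) can be sketched roughly as follows. This is only an illustration and is not taken from the presentation: it assumes the fixed 12-byte RTP header with no extensions and payload type 0 (G.711 mu-law), and the function name `build_rtp_packet` and all sample values are hypothetical.

```python
import struct

RTP_VERSION = 2
PT_PCMU = 0  # static RTP payload type for G.711 mu-law (codec assumed for this sketch)

def build_rtp_packet(seq, timestamp, ssrc, payload, payload_type=PT_PCMU):
    """Prepend a fixed 12-byte RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6        # V=2, P=0, X=0, CC=0
    byte1 = payload_type & 0x7F     # marker bit 0, 7-bit payload type
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload

# 20 ms of 8 kHz, 8-bit audio is 160 bytes (0xFF is mu-law silence).
frame = bytes([0xFF]) * 160
packet = build_rtp_packet(seq=1, timestamp=160, ssrc=0x1234ABCD, payload=frame)
print(len(packet), "bytes of RTP, before UDP and IP headers are added")
```

In a real endpoint the resulting bytes would be handed to a UDP socket; the sequence number and timestamp would advance with every packet so the receiver can detect loss and reorder.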

H.323 Entities
• Terminals (LAN Endpoints)
• Gateways (Optional but really useful)
• Gatekeepers (Also optional)
• MCUs

H.323 Equipment
• Gateway
  – Device that connects an H.323 voice network to a non-H.323 voice network (SIP or PSTN)
  – Allows H.323 terminals to communicate with non-H.323 terminals
• Gatekeeper
  – Provides address translation (H.323 & E.164 to IP)
  – Admission control for H.323 terminals and gateways
  – Manages bandwidth allocation
  – Other optional services

H.323 Equipment
• MCU (multipoint control unit)
  – MC – multipoint controller
    • Routes call and control signaling to ensure endpoint compatibility
  – MP – multipoint processor
    • Switches, mixes and processes voice and video streams to conferencing equipment

H.323 Equipment
• Terminal
  – An endpoint that supports 2-way streaming with another H.323 terminal or gateway
  – Originates and terminates calls
  – Includes videoconferencing stations, hard phones, & soft phones

Call Setup using H.225 RAS
• Registration, Admission and Status (RAS) is responsible for registration, admission, and disengage procedures between the H.323 Gatekeeper and Gateway.
• Discovery: GRQ, GCF, GRJ
  – Unicast discovery using UDP port 1718. The endpoint knows the GK IP & registers directly
  – Multicast using UDP multicast address 224.0.1.41 – non static, less admin overhead

Call Setup using H.225 RAS
• Registration by terminals, Gateways & MCUs using H.323 ID or E.164 address
  – RRQ Registration Request
  – RCF Registration Confirm
  – RRJ Registration Reject
  – URQ Un-registration Request
  – UCF Un-registration Confirm
  – URJ Un-registration Reject

H.323 – H.225 RAS Messages
• LRQ – location request
  – Gatekeeper A requests contact information from the directory gatekeeper.
• LCF – location confirm
  – Gatekeeper B returns the IP address of the destination gateway to gatekeeper A.

Signaling using Q.931 messages
• Q.931 is a signaling protocol used to set up, manage, and terminate H.323 connections between endpoints.
  – ARQ, ACF, ARJ Admission messages
  – LRQ, LCF, LRJ Location Request messages
  – IRQ, IRR, IACK, INAK Status messages
  – BRQ, BCF, BRJ, RAI, RAC Bandwidth messages

H.323 – H.225 RAS Messages
• ARQ – admission request
  – Gateway A requests admission to make a call.
• ACF – admission confirm
  – Gatekeeper A responds with the IP address of the destination gateway.

H.323 – H.225 RAS Messages
• Request in Progress
  – RIP
• Bandwidth change
  – BRQ, BCF, BRJ
• Resource Availability
  – RAI (Indicator)
  – RAC (Confirm)

H.323 – H.225 RAS Messages
• Gatekeeper Discovery
  – GRQ, GCF, GRJ
• Terminal/Gateway Registration
  – RRQ, RCF, RRJ
• Terminal/Gateway Unregistration
  – URQ, UCF, URJ
• Disengage
  – DRQ, DCF, DRJ

H.323 – H.225 RAS Messages
• Status Queries
  – IRQ – info request
  – IRR – info request response
  – IACK – info request ACK
  – INAK – info request NAK

H.323 – Q.931 Messages
• Alerting
  – Called user has been alerted (phone is ringing)
• Call Proceeding
  – Call establishment has been initiated; no more call establishment information will be accepted
• Connect
  – Acceptance of call by called party
• Setup
  – Indicates an H.323 party wants to set up a connection to the called party

H.323 – Q.931 Messages
• Release Complete
  – H.225 (Q.931) call has been released; the signaling channel is now available for reuse
• Status
  – Sent when an unknown call signaling message or a status inquiry message is received
• Status Inquiry
  – Requests a call’s status

H.323 – H.245
• Establishes logical channels for transmission of H.323 data
• Negotiates:
  – channel usage
  – master/slave configuration
  – flow control
  – Codec used
• H.245 ports
  – 1024-5000 TCP in Cisco implementation

H.323 – H.245 Messages
• Master/Slave Determination
  – Determines which terminal will be master and which will be slave in the call
• Terminal Capability Set
  – Contains information on a terminal’s ability to send and receive multimedia streams
• Open Logical Channel
  – Opens a logical channel for transport of multimedia data
• Close Logical Channel
  – Closes the logical channel between two endpoints

H.323 – H.245 Messages
• Request Mode
  – Receive terminal requests a type of transmission from a transmit terminal
  – Types of Modes:
    • Video
    • Audio
    • Data
    • Encryption

H.323 – H.245 Messages
• Send Terminal Capability Set
  – Instructs the far-end terminal to send its transmit and receive capabilities
• End Session Command
  – Indicates the end of the H.245 session

H.323 Call Setup via Gatekeepers

Directory and Tier 1 Gatekeeper Call Setup
[Figure: call setup between two IP phones, each behind a VoIP PBX and a Tier 1 Gatekeeper, with a Directory Gatekeeper between the two Tier 1 Gatekeepers. Labeled steps: 1. ARQ; 2. LRQ and RIP; 3. LRQ; 4. LCF; 5. ACF; 6. Q.931 Call Setup; 7. ARQ; 8. ACF and Q.931 Call Proceed; followed by H.245 and RTP between the endpoints.]

Gatekeeper Peering and Redundancy

Codec ITU G.711
• G.711 is the international standard for encoding telephone audio on a 64 kbps channel. It is a pulse code modulation (PCM) scheme operating at an 8 kHz sample rate, with 8 bits per sample, fully meeting ITU-T recommendations. The module is designed and tested on the TI TMS320C54x platform but can be ported to other DSP and RISC platforms, as well as MS Windows.
Information from http://www.spiritcorp.com/

Codec ITU G.711 (cont)
• Features
• Fully compliant with ITU-T G.711
• 64 kbit/s expander input rate
• 104 or 112 kbit/s expander output rate
• A-law or mu-law expander input
• Uniform PCM expander output
• 104 or 112 kbit/s compressor input rate
• 64 kbit/s compressor output rate
• Uniform PCM compressor input
• A-law or mu-law compressor output
• Selectable frame/buffer memory size according to the system needs
• Very simple application interface
• Compliant with TI's eXpressDSP standard. Code is reentrant and supports multithreading and dynamic memory allocation. At the same time it allows a direct (non-eXpressDSP) interface to enable static memory allocation
• Can be easily ported to any DSP or RISC platform

Codec ITU G.722.1
• G.722.1 is a low-bit-rate wideband coder, which codes speech at 24 kbps or 32 kbps. The quality at 32 kbps is the same as that of G.722 SB-ADPCM at 64 kbps.
It uses a transform-coding scheme called Modulated Lapped Transform (MLT), with a 20 ms frame size. The algorithmic delay is 40 ms (20 ms frame size + 20 ms look-ahead).

Codec ITU G.722.1 (cont)
• Supports all bit rates, viz. 24/32 kbps at 16 kHz sampling rate
• C-callable API for initialization, encoding and decoding of speech data
• Supports multi-channel capability
• Optimized implementation
• Bit compliant with ITU-T test vectors
Information found at http://www.ittiam.com/pages/products/g722-1.htm

Codec ITU G.723.1
• G.723.1 is a speech compression algorithm standardized by the ITU. G.723.1 has dual coding rates of 5.3 and 6.3 kbps. The vocoders process signals with 30 ms frames and a 7.5 ms look-ahead, and have low distortion while passing DTMF tones through. The input/output of this algorithm is 16-bit linear PCM samples.
• The middle-bit-rate G.723.1 vocoder delivers one of the highest compression ratios of any of the current ITU standards without compromising speech quality. This vocoder can perform full duplex compression and decompression functions for multimedia, visual telephony, wireless telephony, and videoconferencing products.

Codec ITU G.723.1 (cont)
• Features
• Fully bit exact with ITU-T G.723.1
• 5.3 and 6.3 kbps encoded bit stream rates
• Discontinuous transmission support (DTX) using Voice Activity Detection (VAD) and Comfort Noise Generation (CNG)
• Includes optional High Pass Filter and optional Post Filter
• Direct interface with PCM 8 kHz sampled data. Both sample-by-sample and block-based processing supported
• Very simple application interface
• Can be easily ported to any platform.

Codec ITU G.726
• ITU-T G.726 provides speech compression and decompression at rates of 16, 24, 32 and 40 kbps based on Adaptive Differential Pulse Code Modulation (ADPCM). It can be used effectively for speech compression in applications such as speech storage, digital circuit multiplication and telephony.
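As a small worked example of the G.726 rates above: at the 8 kHz PCM sampling rate, the four bit rates correspond to a whole number of ADPCM bits per sample. The sketch below only does that arithmetic; the function name `adpcm_bits_per_sample` is hypothetical and nothing here is a codec implementation.

```python
PCM_SAMPLE_RATE_HZ = 8000   # 8 kHz PCM sampling rate
G711_RATE_BPS = 64000       # G.711 reference: 8 bits per sample at 8 kHz

def adpcm_bits_per_sample(bit_rate_bps, sample_rate_hz=PCM_SAMPLE_RATE_HZ):
    """Return the number of code bits per sample for a given ADPCM bit rate."""
    bits, remainder = divmod(bit_rate_bps, sample_rate_hz)
    if remainder:
        raise ValueError("bit rate is not a whole number of bits per sample")
    return bits

for rate in (16000, 24000, 32000, 40000):
    bits = adpcm_bits_per_sample(rate)
    ratio = G711_RATE_BPS / rate
    print(f"{rate // 1000} kbps: {bits} bits/sample, {ratio:.1f}:1 relative to G.711")
```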

Codec ITU G.726 (cont)
• Features
• Fully bit exact with ITU-T G.726
• Sample-by-sample or block-based analog input
• 16, 24, 32 or 40 kbps bit stream rate
• A-law, mu-law and 14-bit uniform 8 kHz PCM input/output
• Direct interface with PCM 8 kHz sampled data. Both sample-by-sample and block-based processing supported
• Very simple application interface
• Can be easily ported to any DSP or RISC platform

Codec ITU G.728
• ITU-T recommendation G.728 Annex G is the fixed-point version of the coding of speech at 16 kbps using Low Delay Code Excited Linear Prediction (LD-CELP). It uses backward adaptation of predictors and gain to achieve an algorithmic delay of 0.625 ms. Under error-free transmission conditions the perceived quality of a 16 kbit/s LD-CELP codec is equivalent to that of a codec conforming to 32 kbit/s ADPCM. The codec is suitable for applications such as VoIP.

Codec ITU G.728 (cont)
• Features:
• API functions for initialization, encoding and decoding of speech data
• Supports multi-channel operation
Information from http://www.hellosoft.com

Codec ITU G.729
• The ITU-T recommendation G.729 codec belongs to the Code-Excited Linear-Prediction (CELP) family of speech coders and uses Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP) for coding speech signals at 8 kbit/s. The coder operates on speech frames of 10 ms, corresponding to 80 samples at a sampling rate of 8000 samples per second, and the total algorithmic delay is 15 milliseconds. The encoder functionality includes Voice Activity Detection and Comfort Noise Generation (VAD/CNG), and the decoder is capable of accepting silence frames. G.729 provides near toll quality performance under clean channel conditions, is the default codec prescribed by the Frame Relay Forum, and is also suitable for voice over network (VoIP) applications.
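The frame and delay figures quoted for G.723.1 and G.729 follow from simple arithmetic: algorithmic delay is frame length plus look-ahead, and samples per frame is the sample rate times the frame length. The sketch below reproduces those numbers. The 5 ms look-ahead for G.729 is an assumption inferred from the quoted 10 ms frame and 15 ms total delay, and all names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # narrowband sampling rate quoted for G.729 and G.723.1

# codec -> (frame length in ms, look-ahead in ms)
# G.729 look-ahead is inferred: 15 ms total delay minus the 10 ms frame.
CODECS = {
    "G.723.1": (30.0, 7.5),
    "G.729": (10.0, 5.0),
}

def algorithmic_delay_ms(frame_ms, lookahead_ms):
    """Delay the encoder must incur before it can emit a frame."""
    return frame_ms + lookahead_ms

def samples_per_frame(frame_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    return int(sample_rate_hz * frame_ms / 1000)

def bits_per_frame(bit_rate_bps, frame_ms):
    return bit_rate_bps * frame_ms / 1000

for name, (frame_ms, lookahead_ms) in CODECS.items():
    print(name,
          samples_per_frame(frame_ms), "samples/frame,",
          algorithmic_delay_ms(frame_ms, lookahead_ms), "ms algorithmic delay")

# G.729 at 8 kbit/s with 10 ms frames carries 80 bits (10 bytes) per frame.
print(bits_per_frame(8000, 10.0), "bits per G.729 frame")
```

Algorithmic delay is only the codec's contribution; packetization, network transit, and jitter buffering add to the end-to-end figure.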

Codec ITU G.729 (cont)
• Features:
• C-callable API functions for initialization, encoding and decoding of speech data
• Voice Activity Detection and Comfort Noise Generation
• Supports multi-channel operation and reentrancy
• Code passes all test vectors specified by ITU-T
• Optimized implementation

Codec ITU G.729A
• ITU-T recommendation G.729 Annex A (referred to as G.729A) is the reduced-complexity version of the G.729 recommendation and operates at 8 kbit/s. This version was developed mainly for multimedia simultaneous voice and data applications, although the use of the codec is not limited to these applications. This version is bit stream interoperable with the full version (G.729). The coder operates on speech frames of 10 ms, corresponding to 80 samples at a sampling rate of 8000 samples per second, and the total algorithmic delay is 15 milliseconds. The encoder functionality includes Voice Activity Detection and Comfort Noise Generation (VAD/CNG), and the decoder is capable of accepting silence frames. The performance of this codec may not be as good as that of G.729 in certain circumstances. The codec is suitable for voice over network (VoIP) applications.

Codec ITU G.729A (cont)
• Features:
• C-callable API functions for initialization, encoding and decoding of speech data
• Voice Activity Detection and Comfort Noise Generation
• Supports multi-channel operation and reentrancy
• Code passes all test vectors specified by ITU-T
• Optimized implementation

GIPS
• GIPS Enhanced G.711 - G.711 with a GIPS-developed enhancement providing superior packet loss robustness. GIPS Enhanced G.711 consists of the G.711 codec combined with an enhancement to provide packet loss robustness. Call setup is done with G.711, and the enhancement is detected and activated after call setup if both endpoints have the enhancement. The enhancement unit works similarly to an encryption method: the packets are transcoded to provide packet loss robustness instead of privacy.
A SoundWare solution with GIPS Enhanced G.711 in combination with NetEQ™ provides a PSTN speech quality level at packet loss/delay rates up to 10%. This is achieved without increasing the bit rate, and without significant increases in latency and complexity.
• Information provided by http://www.globalipsound.com

GIPS (cont)
• GIPS Enhanced G.711
• QUALITY: At parity with PSTN, even under severe packet loss conditions
• SAMPLING RATE: 8 kHz
• BITRATE: Variable, on average equal to G.711
• COMPLEXITY: Very low
• PACKET LOSS ROBUSTNESS: Very high
• ALGORITHMIC DELAY: Equal to G.711
• VOICE ACTIVITY DETECTION: Available
• COMPATIBILITY: Transparent with G.711 at the endpoints; only activated if both endpoints are improved with a GIPS codec
• DTMF, FAX AND MODEM COMPATIBLE: Yes

Speech Codec Comparison
  Codec        Type               Rate (kbit/s)        Algorithmic Delay (ms)
  G.711        A-Law / µ-Law      64                   0
  G.722        SB-ADPCM           64 / 56 / 48         0
  G.723.1/A    MP-MLQ / ACELP*    6.3 / 5.3            37.5
  G.726        ADPCM              16 / 24 / 32 / 40    0
  G.727        Embedded ADPCM     16 / 24 / 32 / 40    0
  G.728        LD-CELP            16                   <2
  G.729        CS-ACELP           8                    15
  G.729A       CS-ACELP           8                    15
  G.729B       CS-ACELP*          8                    15
  G.729AB      CS-ACELP*          8                    15

Notes on Table
• All codecs are voice-band and run at an 8 kHz sampling rate, except for G.722, which has a 7 kHz bandwidth and 16 kHz sample rate
• * These rates are nominal due to utilization of silence compression schemes
• G.711 and G.722 are provided free of charge with G.728 if required for H.320
• Algorithmic delay of "non-predictive" codecs is effectively zero
Information from http://www.spa.com.au/faqs/codecs.html

[Figure labels: Matching PSTN Quality; Better Than PSTN Quality]

VOIP Codecs - bandwidth vs. Quality
• The tradeoffs:
  – How much do you need (quality)?
  – How much can you afford?
  – How much coding delay can you tolerate?
  – Do you have special needs?
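To put a number on "how much can you afford", the codec rates in the comparison table can be converted to approximate on-the-wire bandwidth per call direction. The sketch below is an estimate under stated assumptions that are not from the slides: 40 bytes of IPv4 + UDP + RTP header per packet, no link-layer framing, no header compression, and no silence suppression. The function name and the packetization intervals are hypothetical choices.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options or extensions

def call_bandwidth_kbps(codec_rate_kbps, packet_ms=20):
    """Approximate one-way IP bandwidth for a voice stream, in kbit/s."""
    payload_bytes = codec_rate_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    total_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return total_bytes * 8 * packets_per_second / 1000

# (codec, nominal rate in kbit/s, assumed audio per packet in ms)
for name, rate, packet_ms in (("G.711", 64, 20),
                              ("G.726", 32, 20),
                              ("G.729", 8, 20),
                              ("G.723.1", 6.3, 30)):
    print(f"{name}: {call_bandwidth_kbps(rate, packet_ms):.1f} kbit/s per direction")
```

The header overhead is fixed per packet, so low-rate codecs save less bandwidth than their nominal rates suggest; larger packets reduce overhead but add delay.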

Issues with VoIP
• Firewalls
• NAT
• QoS
• Network Testing

VoIP Issues – Firewalls
• A set of security mechanisms that an organization implements to prevent unsecured access from the outside world to its internal network.
• Typically work by blocking access of certain network protocols to specific ports.

VoIP Issues – NAT
• Helps protect the intranet from exposure to unwanted traffic by providing one single external address to remote users.
• Translates local intranet addresses into an external address.
• Remote users connect to this external address to reach the local user, without actually knowing its local address.

Issues with Firewalls and NAT
• H.323 requires the use of specific static ports for RAS messages, and a number of dynamic ports for RTP.
• SIP has one port (5060) for SIP messages, as well as dynamic ports for RTP.
• For these protocols to pass the firewall, the specific static ports and the range of dynamic ports must be opened for all traffic.

H.323 Ports used by Cisco equipment
  Source: Call Manager    Dest.: Gatekeeper
    Description           Type    Destination Port
    H.225 RAS             UDP     1719
  Source: Call Manager    Dest.: Call Manager
    Description           Type    Destination Port
    H.225 Call Setup      TCP     1720
    H.245 Call Control    TCP     1024-5000

H.323 Ports used by Cisco equipment
  Source: Gatekeeper      Dest.: Gatekeeper
    Description           Type    Destination Port
    H.225 RAS             UDP     1719
  Source: Terminal        Dest.: Call Manager
    Description           Type    Destination Port
    Skinny                TCP     2000

H.323 Ports used by Cisco equipment
  Source: Terminal        Dest.: Terminal
    Description           Type    Destination Port
    RTP                   UDP     1024-65535
    RTCP                  UDP     1024-65535

Port Usage
• Cisco Call Manager -- Call Control for IP Phone
  – 2000 TCP
• Cisco Call Manager -- H.225 Signaling
  – 1720 TCP
• Cisco IP Phone -- RTP Streaming
  – 16384-32766 UDP (dynamically allocated)
• Cisco IP Phone -- TFTP, DNS
  – 49152-53247 TCP (dynamically allocated)
• Cisco IP Phone -- Call Control
  – 49152-53247 TCP (dynamically allocated)

Voice Over IP - the reasons that we have all heard!
• Perception
  – It is cheaper to run just one network.
  – It is easier to integrate advanced technology when your phone is on the network (CTI).
  – If you don’t do it, someone else will.
• Reality
  – Convergence will occur some day, so it is important that we build the required relationships now.

TCP/IP implementations
• Departmental VOIP PBX
• Centralized VOIP PBX
• Road warriors
• VOIP trunking
  – Intranet
  – Internet

VOIP decisions
• Power for VOIP phones
• E-911 mapping
• Which Codec to use

Parameters that impact VOIP - How much is too much?
• Packet loss
• Latency
• Jitter

QoS issues
• QoS has been added to the H.245 OLC (Open Logical Channel) packets to allow endpoints to set QoS parameters for the media streams, including RSVP parameters. H.323 only communicates QoS information between H.323 devices. Actual reservation and control of resources is outside the scope of the standard.

QoS options
• Prioritize by Application
• Prioritize by Address - many applications
• Create separate VLANs
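
The packet loss and jitter parameters listed above are usually measured from packet timestamps during network testing. As a rough sketch (not part of the original slides), the code below uses the running interarrival jitter estimate defined for RTP in RFC 3550, J = J + (|D| - J) / 16, together with a simple loss fraction. The function names and the sample timings are hypothetical.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate, in the same time units as the inputs."""
    jitter = 0.0
    for i in range(1, len(send_times)):
        # D: change in transit time between two consecutive packets
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter

def loss_fraction(expected, received):
    """Fraction of expected packets that never arrived."""
    return (expected - received) / expected if expected else 0.0

# Packets sent every 20 ms; the third one arrives 5 ms late (times in ms).
sent = [0, 20, 40]
arrived = [50, 70, 95]
print(interarrival_jitter(sent, arrived))   # smoothed jitter estimate in ms
print(loss_fraction(expected=100, received=97))
```

Latency needs synchronized clocks or a round-trip measurement, so it is left out of this sketch.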