RESEARCH AREAS by linzhengnd



              Advanced Image coding

1. K. Yu et al, “Practical real-time video codec for mobile devices”, IEEE ICME
2003, vol. III, pp. 509-512, 2003.

Developed a practical low complexity real-time video codec for mobile
devices. Reduces computational cost in ME, integer DCT, DCT/quantizer
bypass. Applied to H.263. Extend this to H.261, MPEG1,2,4 (MPEG 4
Visual, SP, ASP) and to H.264 baseline profile.

2. G. Lakhani, “ Optimal Huffman coding of DCT blocks”, IEEE Trans.
CSVT, vol.14, pp.522-527, April 2004.

Modified the Huffman coding in JPEG baseline. Can similar techniques be
applied to other video coding standards (H.261, MPEG series, H.263 3D-VLC
etc.)? See Table I for the number of bits for each image (comparison).
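
As a starting point for such bit-count comparisons, the sketch below builds a Huffman code from the symbol statistics of one image and reports the total number of bits. It is a generic Huffman construction, not Lakhani's modified scheme; the function names and the idea of passing in a flat list of symbols (e.g. run/size pairs of a DCT block) are assumptions made for illustration.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built
    from the symbol counts of the given sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def total_bits(symbols):
    """Total coded size in bits when each symbol uses its Huffman code."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)
```

Running total_bits on the symbol stream of each test image, once with the default table statistics and once with image-optimized statistics, gives the kind of per-image comparison described above.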

3. M. Horowitz et al, “ H.264 baseline profile decoder complexity analysis”,
IEEE Trans. CSVT, Vol. 13, pp. 704-716, July 2003 and
V. Lappalainen, A. Hallapuro and T.D. Hamalainen, “Complexity of
optimized H.26L video decoder implementation”, IEEE Trans. CSVT, vol. 13,
pp. 717-725, July 2003.

Develop similar complexity analysis for H.264 Main Profile (both
encoder/decoder) and compare with MPEG-2 Main Profile.

Develop similar complexity analysis for H.264 High Profile (both
encoder/decoder) and compare with MPEG-2 Main Profile.

Review the following papers:
a) A. Molino et al, “ Low complexity video codec for mobile video
conferencing”, EUSIPCO 2004, Vienna, Austria, Sept. 2004.
b) M. Li et al, “ DCT-based phase correlation motion estimation”, IEEE
ICIP 2004, Singapore, Oct. 2004.
c) M. Song, A. Cai and J. Sun, “Motion estimation in DCT domain”,
Proc. 1996 IEEE Intl. Conf. on Communication Technology, vol. 12,
pp. 670-674, Beijing, China, 1996.
d) S-F. Chang and D.G. Messerschmitt, “ Manipulation and compositing of
MC-DCT compressed video”, IEEE JSAC, vol. 13, pp. 1-11, Jan. 1995.

ME/MC (generally implemented in spatial domain) is computationally
intensive. ME/MC in transform domain may simplify this. Implementation
complexity is a critical factor in designing codecs for wireless (mobile)
communications. Consider this and other functions in a codec based on e.g.,
H. 264 (baseline profile) and other standards.
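
To make the cost of spatial-domain ME concrete, here is a minimal sketch of full-search block matching with the sum of absolute differences (SAD). Frames are plain lists of rows; the function names, block size and search range are illustrative assumptions and are not taken from any of the cited papers.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block
    of ref displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, n, search_range):
    """Return (dx, dy, cost) of the best match within +/- search_range."""
    height, width = len(ref), len(ref[0])
    best = (None, None, float("inf"))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Each block needs up to (2*search_range + 1)^2 SAD evaluations of n*n pixels each, which is the cost that fast or transform-domain ME schemes try to reduce.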

4. J. But, “ A novel MPEG-1 partial encryption scheme for the purposes of streaming
video”, Ph.D Thesis, ECSE Dept, Monash University, Clayton, Victoria, Australia,
2004. (copy of the thesis is in our lab). Also papers by J. But (under review)

Implementing Encrypted Streaming Video in a Distributed Server
 - Submitted to IEEE Multimedia

An Evaluation of Current MPEG-1 Ciphers and their Applicability to
 - Submitted to ICON 2004

KATIA - A Partial MPEG-1 Video Stream Cipher for the purposes of
 - Submitted to ACM Transactions on Multimedia Computing,
Communications and


In view of the improvements in networks, internet services, DSL, cable modems, satellite
dishes, set-top-boxes, hand-held mobile devices etc, video streaming has lots of potential
and promise. One direct and extensive application is in the entertainment industry where
a client can browse, select and access movies, video clips, sports events,
historical/political/tourist/medical/geographical/scientific encoded video using a video
on demand (VoD) service. A major problem for content providers and distributors is
providing this service to bona fide/authorized clients and collecting the regulated revenues
without unauthorized persons duplicating/distributing the protected material. While
encryption schemes have been developed for storage media, (DVD, Video CD etc), there
is an urgent need to extend and implement this approach to video streaming over public
networks such as internet, satellite links, terrestrial and cable channels. The thesis by But
addresses this highly relevant and beneficial subject. Encryption techniques are applied to
MPEG-1 coded bit streams such that, with the proper (authorized) decryption key, clients can
access and watch the video of their choice without duplicating/distributing it. This approach
requires a thorough understanding of MPEG-1 encoder/decoder algorithms, together with
video/audio systems and the encryption details.

The author has proposed a range of modifications to the distributed server design that will
lead to lower implementation costs and also increase the customer base. The development
of an MPEG-1 partial selection scheme for encryption of streaming video is significant
since the current encryption algorithms are mainly designed for encryption and protection
of stored video. Extension of the MPEG-1 cipher to MPEG-2 bit streams is discussed in
general terms, with details left for further research. Additional research areas include
developing encryption schemes for other encoded bit streams based on MPEG-4 Visual,
H.263 and the emerging H.264/MPEG-4 Part 10.

Further research: Apply/extend/implement these techniques to video
streaming based on MPEG-2, MPEG-4 Visual, H.263 and H.264/MPEG-4 Part 10
(encryption, authentication, authorization, robustness, copyright protection etc.). Pl see
chapter 7 (Conclusion) of the thesis for summary and further research.

5. H.264/MPEG-4 Part 10, the latest video coding standard specifies only video
coding unlike MPEG1,2,4 (visual), H.263 etc. (see IEEE Trans. CSVT, vol. 13, July
2003, Special issue on H.264/MPEG-4 Part 10). For all video applications, audio is

       Investigate multiplexing of H.264/MPEG-4 Part 10 (encoded video) with encoded
       audio based on the MPEG-2,4 Systems compatibility at the transmitter side
       followed by inverse operations (demultiplexing into video and audio bit streams
       and decoding these two media along with the lip-sync and other aspects) at the
       receiver side. There are several standards/non-standards based algorithms for
       encoding/decoding audio (M. Bosi and R.E. Goldberg, “Introduction to digital
       audio coding standards”, Norwell, MA: Kluwer, 2002).

H.264/MPEG-4 Part 10 video can be in various profiles/levels, and likewise the audio
(mono, stereo, surround sound etc) can be aimed at various quality levels/applications
and bit rates. This research can lead to several M.S. Theses. It also has
practical/industrial applications.

Below are the comments by industry experts actively involved in the video/audio
       Just like MPEG-2 video, the audio standards used in broadcast applications are defined
       by application standards such as ATSC (US Terrestrial Broadcast), SCTE (US/Canada
       Cable), ARIB (Japan) and DVB (Europe). ATSC and SCTE specify AC-3 (Dolby) audio
       while DVB specifies both MPEG-1 audio as well as AC-3. ARIB specifies MPEG-2 AAC.

        The story for audio to be used with H.264 is more complex. DVB is considering AAC with
       SBR (called AAC plus) while ATSC has selected AC-3 plus from Dolby. In addition, for
       compatibility all the application standards will continue to use the existing audio
       standards (AC-3, MPEG-1 and MPEG-2 AAC).

        The glue to all of these is the MPEG-2 transport that provides the audio/video
       synchronization mechanism for all the video and audio standards.
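
       The synchronization mentioned above rests on presentation time stamps (PTS)
       carried in the MPEG-2 systems layer, counted in ticks of a 90 kHz clock in a
       33-bit field. The sketch below computes PTS values for video and audio frames
       and the resulting audio/video skew. The function names are hypothetical, and
       the frame rate and samples-per-frame values in the usage note are example
       inputs, not requirements of any particular broadcast standard.

```python
PTS_CLOCK_HZ = 90_000   # MPEG-2 systems time stamps count 90 kHz ticks
PTS_MODULO = 1 << 33    # the PTS field is 33 bits wide and wraps around

def video_pts(frame_index, fps_num, fps_den, start=0):
    """PTS of a video frame for a frame rate of fps_num / fps_den."""
    return (start + frame_index * PTS_CLOCK_HZ * fps_den // fps_num) % PTS_MODULO

def audio_pts(frame_index, samples_per_frame, sample_rate, start=0):
    """PTS of an audio frame holding samples_per_frame samples."""
    return (start + frame_index * samples_per_frame * PTS_CLOCK_HZ
            // sample_rate) % PTS_MODULO

def av_skew_ms(v_pts, a_pts):
    """Video-minus-audio presentation offset in milliseconds
    (ignores wrap-around, which a real demultiplexer must handle)."""
    return (v_pts - a_pts) * 1000.0 / PTS_CLOCK_HZ
```

       For example, video_pts(25, 25, 1) is one second of 25 frame/s video, and
       audio_pts(1, 1024, 48000) is the second 1024-sample audio frame at 48 kHz.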

       I have the document ETSI TS 101 154 V1.6.1 (2005) DVB: Implementation guidelines
       for the use of video and audio coding in broadcasting applications based on the MPEG-2
       transport stream (file: ETSI-DVB).

6. H.264/MPEG-4 Part 10 (see item 5 above)

           Several new profiles/extensions have been
           developed. (ex. Studio and/or digital cinema, 12
           bpp intensity resolutions, 4:2:2 and 4:4:4
           formats, file and optical disk storage/transport
           over IP networks etc). See G.J. Sullivan, P.
           Topiwala and A. Luthra, “The H.264/AVC advanced video coding
           standard: Overview and introduction to the fidelity range extensions”, SPIE Conf. on
           applications of digital image processing XXVII, vol. 5558, pp. 53-74, Aug. 2004. This
           paper discusses the extensions to H.264 including various new profiles (high, High
           10, High 4:2:2 and High 4:4:4) and compares the performance with previous
           standards. G.J. Sullivan, “ The H.264/MPEG-4 AVC video coding standard and
           its deployment status”, SPIE/VCIP 2005, Vol. 5960, pp. 709-719, Beijing,
           China, July 2005.
See also Y. Su, Ming-Ting Sun and Kun-Wei Lin, “Encoder optimization
for H.264/AVC fidelity range extensions”, VCIP2005, SPIE,
vol. 5960, pp. 2067-2075, Beijing, China, July 2005.

These extensions can lead to several M.S. Theses and possibly Ph.D dissertations.

7. See K. Yu et al, “Practical real-time video codec for mobile devices”, IEEE ICME
2003, Vol. III, pp. 509-512, 2003. They have developed a practical low-complexity
real-time video codec for mobile devices based on H.263. Explore/develop similar
codecs based on H.264 baseline profile. See also the paper S.K. Dai et al,
“Enhanced intra-prediction algorithm in AVS-M”, Proc. ISIMP, pp. 298-301, Oct.
2004. (M is for mobile applications).
8. Fractal lossless image coding. See proc. of EC-VIP-MC 2003 in our lab. Extend
this approach to color images, video etc.

9. Explore Fractal/DWT (Similar to Fractal/DCT) in image/video coding.

10. Explore fractal/SVD in image/video coding

11. Y-C. Hu, “Multiple images embedding scheme based on moment preserving
block truncation coding”, Real-Time Imaging (under review)

Embedding multiple secret images in grayscale cover image using BTC
(compression of secret images) followed by DES encryption is proposed. Extensive
literature survey is very helpful (read review papers)
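
For orientation, the classic moment preserving BTC of a single block can be sketched as follows. This is the textbook two-level quantizer that preserves the block mean and variance; it is an assumption that the paper under review uses exactly this variant, and the function names are illustrative.

```python
import math

def btc_encode(block):
    """Encode a flat list of pixel values as (low, high, bitmap)."""
    n = len(block)
    mean = sum(block) / n
    var = sum(p * p for p in block) / n - mean * mean
    std = math.sqrt(max(var, 0.0))
    bitmap = [1 if p >= mean else 0 for p in block]
    q = sum(bitmap)               # number of pixels at or above the mean
    if q == n or std == 0.0:      # flat block: a single level is enough
        return mean, mean, bitmap
    low = mean - std * math.sqrt(q / (n - q))
    high = mean + std * math.sqrt((n - q) / q)
    return low, high, bitmap

def btc_decode(low, high, bitmap):
    """Rebuild the block from the two levels and the bitmap."""
    return [high if b else low for b in bitmap]
```

A 4x4 block of 8-bit pixels is thus reduced to a 16-bit bitmap plus two levels, which is the compressed form of the secret image that gets embedded.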


Extend this to color images (RGB to YCbCr). Consider embedding in Y, Cb or
Cr with different combinations. Follow this by DES encryption (robustness).
Consider compression schemes other than BTC for the secret images (see Figs. 3
and 4). Consider schemes other than LSB substitution for embedding secret images.
Evaluate capacity, robustness, complexity etc. (Review the theses by Ramaswamy
and Sally. Their software will be very helpful.) Consider embedding secret images in
JPEG or JPEG2000 (see conclusions of the above paper).
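
As a baseline against which other embedding schemes can be compared, plain k-bit LSB substitution can be sketched as below. It operates on the 8-bit samples of whichever component is chosen (Y, Cb or Cr); the function names and the bit packing order are assumptions for illustration and are not taken from Hu's paper.

```python
def embed_lsb(cover, secret_bits, k=1):
    """Hide secret_bits (list of 0/1) in the k least significant bits
    of successive 8-bit cover samples. Returns the stego samples."""
    needed = -(-len(secret_bits) // k)      # ceiling division
    if needed > len(cover):
        raise ValueError("cover too small for the secret payload")
    padded = secret_bits + [0] * (needed * k - len(secret_bits))
    mask = (1 << k) - 1
    stego = list(cover)
    for i in range(needed):
        chunk = 0
        for bit in padded[i * k:(i + 1) * k]:
            chunk = (chunk << 1) | bit
        # Clear the k low bits of the cover sample, then write the payload.
        stego[i] = (stego[i] & ~mask & 0xFF) | chunk
    return stego

def extract_lsb(stego, n_bits, k=1):
    """Recover the first n_bits hidden by embed_lsb with the same k."""
    mask = (1 << k) - 1
    bits = []
    for sample in stego:
        chunk = sample & mask
        bits.extend((chunk >> (k - 1 - j)) & 1 for j in range(k))
        if len(bits) >= n_bits:
            break
    return bits[:n_bits]
```

Capacity is k bits per sample and the worst-case change per sample is 2^k - 1 gray levels, which gives a simple reference point for the capacity and distortion evaluation.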

12. P. Tsai, Y-C. Hu and C.C. Chang, “ A progressive secret reveal system based on
SPIHT image transmission”, SP: Image communication, vol. 19, pp. 285-297, March

Secret image is directly embedded in a SPIHT encoded cover image (monochrome).
Sally has extended this to color (RGB to YCbCr). She has also investigated
robustness to various attacks. Sally’s thesis and software are in our lab.


Tsai, Hu and Chang suggest adding encryption schemes (DES, RSA) to encrypt the
secret image before embedding. The objective is to enhance the security by
steganography and encryption. INVESTIGATE THIS.

Ramaswamy has completed his thesis using SHA, DES, RSA for encrypting H.264
video (verify integrity, identify sender/content creator etc). Both Puthussery and
Ramaswamy have all the software (operational). This research topic is viable and
N. Ramaswamy, “Digital signature in H.264/AVC MPEG4 Part 10”, M.S. Thesis, UTA,
Aug. 2004.
S. Puthussery, “A progressive secret reveal system for color images”, M.S. Thesis, UTA,
Aug. 2004.


   1. I. Avcibas, N. Memon and B. Sankur, “ Steganalysis using image quality
      metrics”, IEEE Trans. IP, vol. 12, pp. 221-229, Feb. 2003.
   2. I. Avcibas, B. Sankur and K. Sayood, “ Statistical evaluation of image quality
      measures”, J. of Electronic Imaging, vol. 11, pp. 206-223, April 2002.
   3. A.M. Eskicioglu and P.S. Fisher, “ Image quality measures and their
      performance”, IEEE Trans. Commun., vol.43, pp. 2959-2965, Dec. 1995.
   4. A.M. Eskicioglu, “ Application of multidimensional quality measures to
      reconstructed medical images”, Opt.Eng., vol. 35, pp. 778-785, March 1996.
   5. B. Lambrecht, Ed., “ Special issue on image and video quality metrics”, Signal
      Process., vol. 70, Oct. 1998.

13. D. Yang, H. Ai, C. Kyriakakis and C.-C.J. Kuo, “High-fidelity multichannel
audio coding with Karhunen-Loeve transform”, IEEE Transactions on Speech and
Audio Processing, vol. 11, pp. 365-380, July

Review this paper and related ones cited in the references.

KLT is applied to advanced audio coding (AAC) adopted in MPEG-2. Can this
technique be extended to other multi channel audio coding algorithms?
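
The core idea, removing inter-channel redundancy with a KLT before coding, can be illustrated for the two-channel case, where the KLT reduces to a rotation with a closed-form angle. This is only a sketch of inter-channel decorrelation under that simplification; it is not the scheme of the cited paper, and the function names are hypothetical.

```python
import math

def klt_stereo(left, right):
    """Rotate a stereo pair onto its principal axes.
    Returns (theta, (mean_l, mean_r), first, second)."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    c_ll = sum((a - mean_l) ** 2 for a in left) / n
    c_rr = sum((b - mean_r) ** 2 for b in right) / n
    c_lr = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right)) / n
    # Angle of the major eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c_lr, c_ll - c_rr)
    c, s = math.cos(theta), math.sin(theta)
    first = [c * (a - mean_l) + s * (b - mean_r) for a, b in zip(left, right)]
    second = [-s * (a - mean_l) + c * (b - mean_r) for a, b in zip(left, right)]
    return theta, (mean_l, mean_r), first, second

def inverse_klt_stereo(theta, means, first, second):
    """Undo klt_stereo by applying the transposed rotation."""
    c, s = math.cos(theta), math.sin(theta)
    left = [c * u - s * v + means[0] for u, v in zip(first, second)]
    right = [s * u + c * v + means[1] for u, v in zip(first, second)]
    return left, right
```

After the rotation the two output channels are uncorrelated and most of the energy sits in the first one. For more than two channels the rotation is replaced by the eigenvectors of the full inter-channel covariance matrix.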

14. Pl search Google for HE-AAC or AAC plus (HE is high efficiency). This is an
improved multi channel audio coder adopted in MPEG-4 and also by various companies. The
corresponding MP3 version is called MP3 pro. It is both backward and forward
compatible with AAC. (See M. Wolters, K. Kjorling and H. Purnhagen, “A closer look into
MPEG-4 high efficiency AAC”, 115th AES Convention, New York, NY, 10-13 Oct. 2003, and
P. Ekstrand, “Bandwidth extension of audio signals by spectral band
replication”, Proc. 1st IEEE Benelux Workshop on Model Based Processing and Coding of Audio
(MPCA-2002), Leuven, Belgium, Nov. 2002.)

Can the KLT approach described in item 13 above be applied to the AAC part of HE-AAC (the
other part is SBR, spectral band replication) to further improve the coding efficiency?

(See M. Wolters et al, “ A closer look into MPEG-4 high efficiency AAC”, 115 AES convention,
10-13, Oct. 2003, New York, NY. Also
16. Encode H.264 High profile (FRExtensions) video and HE-AAC audio, multiplex the two coded
bit streams using MPEG-2 or MPEG-4 systems (or any other), followed by inverse operations at
the receiver (demultiplex into video and audio coded bit streams and decode). ISMA (Internet
Streaming Media Alliance) has adopted H.264 along with HE-AAC for streaming media over
internet. Access

I have the hard copy of the PP slides on New technologies in MPEG audio presented by Dr.
Quackenbush (MPEG audio research group chair, Audio Research Labs) at the one day
workshop on MPEG international video and audio standards, HKUST, Hong Kong, on
22 Jan. 2005. Several research projects can be explored based on these slides.

17. See thesis from NTU, Development of AAC-Codec for streaming in wireless mobile
applications (E. Kurniawati) 2004. I have this thesis. This research develops various techniques
in reducing the implementation complexity while maintaining the same quality desirable for mobile
communications. One concept is using odd DFT for both psychoacoustic analysis and
MDCT. Extend these techniques to HE-AAC audio, (see item 14 above) MPEG-1 Audio and

    18. See S. Srinivasan et al, “ Windows media video 9: Overview and applications”, Signal
    Processing: Image Communication, vol. 19 , pp. 851-875, Oct.2004.
    S. Srinivasan and S.L. Regunathan, “An overview of VC-1”, SPIE/VCIP2005,
    vol. 5960, pp. 720-728, Beijing, China, July 2005. These papers describe the
    state-of-the-art video coding developed by Microsoft. It is being standardized by SMPTE
    (to be named VC-9) and being adopted/considered by Blu-Ray DVD and HD-DVD. Compare its
    performance with “The H.264/AVC advanced video coding standard: Overview and
    introduction to the fidelity range extensions”, SPIE Conf. on applications of digital image
    processing XXVII, vol. 5558, pp. 55-74, Aug. 2004, by G. J. Sullivan, P. Topiwala and A.
    Luthra (similar analysis as in this paper). Carry out comparative performance analysis of
    VC-9 (Microsoft), H.264 FRExtensions and AVS China (see items 19-22 below). See also
A.E. Bell and C.J. Cookson, “ Next generation DVD: application requirements and
    technology”, Signal Processing: Image Communication, vol. 19, pp.909-920, Oct.

        19. See the paper W. Gao et al, “AVS – The Chinese next-generation video
        coding standard”, NAB 2004, Las Vegas, NV, April 2004. This deals with
        Audio-Video standard of China similar to H.264. It also claims high coding
        efficiency compared to MPEG-2. There are also several papers in the
        special session on AVS in ISIMP2004 held in Hong Kong (Oct. 2004). The
        MPL has the proceedings on CD. One paper deals with AVS-to-MPEG2
        transcoding system. It is designed for transcoding from AVS coded
        bitstream to MPEG-2 coded bitstream applicable to MPEG-2 decoders.
        Develop similar transcoding schemes between H. 264 and MPEG-2.

Rochelle Pereira has completed her thesis MPEG-2 Main Profile to H.264 Main Profile transcoder
Her research opens up a # of related thesis topics. I have her M.S. Thesis, pp slides and
1. MPEG-2 various profiles to H.264 various profiles transcoders and the reverse
2. An immediate and relevant topic is H.264 Main Profile to MPEG-2 Main Profile transcoder.

        M. Bosi and R.E. Goldberg, “Introduction to digital audio coding standards”,
        Norwell, MA: Kluwer, 2002.

    H.264 and MPEG-2 (CONSIDER ALL LEVELS AND PROFILES).. see also L.
    Yu et al, “ Overview of AVS-Video: Tools, performance and complexity”,
    VCIP2005, pp..             ,Beijing, China, July 2005. I have pp slides related
    AVS China.

    20. Transcoding AVS China to- H.264 and vice versa: Does this have any
    significance or relevance? This transcoding can be based on various
    profiles/levels at different bit rates/quality levels and spatial/temporal
    resolutions. Similarly transcoding between AVS China and WMV-9 (Microsoft
    video coder) and between H.264 and WMV-9. All levels and profiles.

    See the paper H. Kalva, B. Petljanski and B. Furht, “ Complexity reduction
    tools for MPEG-2 to H.264 video transcoding”, WSEAS Trans. on Info.
    Science & Applications, vol. 2, pp. 295-300, March 2005, (This has several
    interesting papers listed in references.).

    See also Y. Su et al, “ Efficient MPEG-2 to H.264/AVC intra transcoding in
    transform-domain”, IEEE ISCAS 2005. (CD in our lab).

Jing Wang, Lei Shi, Li-Wei Guo, Hui Xu, Fu-Rong Zhang, Jian Lou and Lu Yu, “An AVS-to-MPEG2
Transcoding System”, in Proc. of 2004 International Symposium on Intelligent Multimedia, Video and
Speech Processing, Oct.2004 (CD in our lab)

“An AVS-to-MPEG2 Transcoding System”, oral presentation, ISIMP 2004, Hong Kong, Oct.22-
24, 2004

    Sony play station player has developed a MPEG-2 to AVC transcoder –
    pspvideo9 freeware

21. see the paper “Enhanced intra-prediction algorithm in AVS-M”, There are
also several papers in the special session on AVS in ISIMP2004 held in Hong
Kong (Oct. 2004). MPL has the proceedings on CD. Propose and evaluate
similar techniques for H.264-M (here M is mobile. this is not the designation

22. See the paper “ Architecture of AVS hardware decoding system”, as in
21) above develop similar architecture for H.264 decoder at several

   23) See the paper, A. Ehret et al, “ Audio coding technology of ExAC”,
   Proc. ISIMP 2004, pp. 290-293, Hong Kong, Oct. 2004, This paper
   discusses a new low bit rate audio coding technique based on enhanced
   Audio Coding (EAC) and SBR (spectral band replication). Multiplex this
   audio coder with AVS video coder, demultiplex into audio/video coded
   bitstreams and decode them to reconstruct the video/audio. Consider
   several audio/video levels/profiles, (bit rates, spatial/temporal resolutions,
   mono/stereo/5.1 audio channels etc)
   ( See M. Bosi and R.E. Goldberg, “ Introduction to digital audio coding
   standards”, Norwell, MA: Kluwer, 2002).

24) Compare the performances of WMV9, H.264 with FRExtensions and AVS
of China. Consider complexity, profiles/levels, error resilience, bit rates
PSNR/subjective quality and other parameters.
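
For the objective part of such comparisons, PSNR between the original and decoded samples can be computed as in the sketch below. Inputs are flat lists of samples of one component; the function name and the default peak of 255 (8-bit video) are assumptions, and 10-bit FRExt material would use peak=1023.

```python
import math

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length
    sample sequences; returns infinity for identical inputs."""
    if len(original) != len(decoded):
        raise ValueError("sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)
```

Plotting PSNR against bit rate for each codec at matched profiles/levels gives the usual rate-distortion curves for the comparison.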

25) WMV9 , H.264 with FRExtensions and AVS of China use different
(although similar) 8x8 integer DCTs. Compare their coding gains, complexity
(fixed point), ringing artifacts and related issues.

   26) Repeat item 25 for 4x4 integer DCTs used in H.264 and WMV9.

   See Y-J Chung, Y-C. Huang and J-L. Wu, “An efficient algorithm for
   splitting an 8x8 DCT into four 4x4 modified DCTs used in AVC/H.264”, 5th
   Eurasip conf., EC-SIP-M2005, pp. 311-316, Smolenice, Slovakia, June-July, 2005.
Can these four 4x4 modified DCTs used in AVC/H.264 be combined to get 8x8
27) Implement/evaluate scalability extensions of H.264 (see current JVT
documents). JSVM (Joint scalable video model) and SVC (scalable video
   28) Design/evaluate/simulate rate control techniques for all profiles/levels in
   29) In HE-AAC or AAC-plus reduce complexity (also lossless audio) by using
       lifting scheme for MDCT/MDST. See Yoshi’s dissertation (UTA-EE Dept.)
   30) Consider using 4x8 and 8x4 integer DCTs in H.264FRExtensions besides
       4x4 and 8x8 integer DCTs. (WMV9 uses all these four transforms).
       Develop encoder/decoder based on these four transforms and evaluate
       any gains in coding.
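
       As a reference point for such experiments, the 4x4 integer core transform of
       H.264 can be sketched as below. The forward matrix is the well-known H.264
       core matrix. The inverse shown here is an exact rational inverse chosen for
       clarity; it is an assumption of this sketch and not the standard's own
       inverse, which uses a different integer matrix and folds the scaling into
       quantization.

```python
from fractions import Fraction

# Forward core transform matrix of the H.264 4x4 integer transform.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_4x4(block):
    """Y = CF * X * CF^T, computed in exact integer arithmetic."""
    return matmul(matmul(CF, block), transpose(CF))

def inverse_4x4(coeffs):
    """Exact inverse: the rows of CF are orthogonal with squared norms
    (4, 10, 4, 10), so coefficient (i, j) is divided by norm_i * norm_j
    before applying CF^T on the left and CF on the right."""
    norms = [sum(v * v for v in row) for row in CF]
    scaled = [[Fraction(coeffs[i][j], norms[i] * norms[j]) for j in range(4)]
              for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)
```

       The same pattern (integer matrix, per-coefficient scaling) extends to 4x8,
       8x4 and 8x8 transforms by swapping in the corresponding row and column
       matrices.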

   31) Design/implement/simulate digital rights management (DRM) for H.264
       codecs (video streaming/VOD/DVD etc). See C.C. Jay_Kuo’s tutorial on
       DRM. ISIMP2004, Oct.2004. Also review the paper WMV9 by Microsoft.
       (see item 18).

   32) Nvidia has developed a software decoder to
       transcode MPEG-2 content into WMV 9 player. Develop a software
       decoder to transcode WMV content into MPEG-2 player. This may
       require release from Microsoft. Pl see J. Xin, C-H. Lin and M-T. Sun, “
       Digital video transcoding”, Proc. IEEE, vol. 93, pp. 84-97, Jan. 2005.

   33) MPEG is considering the need for development of a new voluntary
       standard specifying fixed point approximation to ideal IDCT (also for
       DCT) 8x8 (see ISO/IEC/SC29/WG11/N6915, Hong Kong , Jan 2005)

      This document provides all details including evaluation criteria. Develop
      this 8x8 INTDCT/INTIDCT that can meet the evaluation criteria and
      integrate with MPEG codecs.

   34) One-day workshop on MPEG International Video and audio standards,
       22 Jan. 2005, HKUST, Hong Kong,( Right after 71st MPEG meeting in
       HKUST) Several thought provoking R&D topics have been suggested in
       this workshop. (lecture notes in our lab). Some of these are

H.264 Scalable video coding
Multiview coding 3DAV
Scalable audio coding   
Spatial audio coding    
Joint speech and music coding

H.264 Scalable video coding (new project Jan.2005) temporal/SNR/spatial

Topics in this workshop are as follows:
   1. The MPEG Story
   2. Past, Present and Future of MPEG video
   3. New Technologies in MPEG audio
   4. Recent trends in multimedia storage :HD-DVD
   5. Recent trends in multimedia IC and set-top box
   6. Panel discussion: Where is MPEG going ?
   7. The China AVS story
   8. AVS 1.0 and HDTV for 2008 Olympic Games AVS-M and 3G
   9. Hong Kong ITC Consumer Electronics R& D Center under ASTRI
   10. Recent trends of Digital video broadcast and HDTV in Greater China
   11. Recent trends of IC industry in Greater China
   12. Recent trends of mobile multimedia services in Greater China
   13. Panel discussion :Challenges and opportunities of AVS and MPEG in the
       telecommunication and consumer electronics market in Greater China

For encoded/decoded audio quality evaluation refer to

ITU-R BS 1387-1 “Method for objective measurements of perceived audio
quality”, (I have the document). FastVDO’s H.264 High Profile decoder

Due to demand for HD test data, FastVDO is pleased to provide a consolidated
10-bit HD data set (mostly 1080p) for the research community. Content
includes a rich set of both film and non-film data. The data is from a
variety of sources, which retain data rights; usage rights are limited to
testing, research, standards development, and technical

Please check the site below, where some preliminary information on this is

Included are:
1. some brief descriptions, including scene selections, provided by Dolby
and FastVDO when this data was first made available to this
(JVT-J039 and JVT-J042), and
2. some instructions for obtaining this data

More information will be added shortly.

Dr. Pankaj Topiwala              Voice: 410-309-6066
President/CEO FastVDO LLC            Fax: 410-309-6554
7150 Riverwood Dr.,        Mobile: 443-538-3782
Columbia, MD 21046-1245 USA      Email:

    The document reference is ETSI TS 101 154 V1.6.1 (2005-01), "DVB:
Implementation guidelines for the use
of Video and Audio Coding in Broadcasting Applications based on the MPEG-2
Transport Stream".
DVB specifies H.264 Main profile level3 for SDTV and High profile level 4 for

The following information about the Joint Video Team (JVT) and its work may be
helpful to some of you.

The primary work of the JVT currently consists of:
1) scalable video coding (SVC) extension development, and
2) maintenance of the existing Advanced Video Coding (AVC) standard ITU-T
Rec. H.264 & ISO/IEC 14496-10, e.g., including errata reporting and
maintenance of reference software and conformance specifications.

The JVT currently has 3 active email reflectors.
You can subscribe to two of them (the general JVT reflector and the
conformance/interop bitstream exchange activity reflector) through and
To subscribe to the 3rd JVT reflector (which is devoted to SVC work), send email
to "" containing "subscribe svc" in the body of
the message.

JVT and VCEG documents can be found at No
password is required for access to nearly all documents. A select few
documents (such as integrated-format standard drafts) require password access,
using a password given only to formal JVT members.

The next JVT meeting will be in Poznań, Poland. The dates that were
preliminarily announced for that meeting were 23-29 July 2005. After the
Poznań meeting, the plan for the next two JVT meetings will be to co-locate them
with each MPEG meeting (i.e., 16-21 October 2005 in Nice, France and 15-20
January 2006 in Bangkok, Thailand). That co-location of meetings is expected to
continue until the 1st 2006 meeting of ITU-T SG 16, upon which we plan to meet
alongside SG16, approximately 12-17 March 2006. We are then likely to return
to meeting with MPEG (16-21 July 2006 in Klagenfurt, Austria and 22-27 Oct.
2006 in Hangzhou China).

The JVT has two parent bodies, which are MPEG (ISO/IEC JTC 1/SC 29/WG 11)
and VCEG (ITU-T SG 16 Q.6). Participation in the JVT is open to anyone who is
qualified to participate either in MPEG or VCEG, and to those personally invited
by the chairmen. We are liberal in granting invitation requests.

To progress the work of the JVT between meetings, the JVT has created the
following ad-hoc groups, and has appointed the following listed chairpersons for
that work. The discussions involved in the work of those ad-hoc groups will be
conducted on the above-listed email reflectors.
1. JVT Project Management and Errata Reporting (Gary Sullivan, Jens Rainer Ohm,
   Ajay Luthra, and Thomas Wiegand)
2. JM Description and Reference Software (Thomas Wiegand, Karsten Sühring,
   Alexis Tourapis, and Keng Pang Lim)
3. Bitstream Exchange and Conformance (Teruhiko Suzuki and Lowell Winger)
4. SVC Core Experiments (Justin Ridge, Ulrich Benzler)
5. JSVM software improvement and new functionality integration (Greg Cook)
6. JSVM Text and WD Text Editing (Julien Reichel, Heiko Schwarz, Mathias Wien)
7. Spatial Scalability Resampling Filters (Gary Sullivan)
8. Test conditions and applications for error resilience (Ye Kui Wang)
9. Test conditions for coding efficiency work and JSVM performance
   evaluation (Mathias Wien, Heiko Schwarz)
10. Study of 4:4:4 video coding functionality (Teruhiko Suzuki)

In the work on scalable video coding (SVC), the JVT is conducting the following
core experiments (CEs). A document describing each of these CEs is available
on the JVT ftp site in the 2005_04_Busan directory as document number JVT-
O3xx, where "xx" is the number of the core experiment as listed below. The
appointed core experiment coordinator, some participating companies, and some
relevant documents (prefix the numbers below by "JVT-O" for the complete
document number) are also listed below.
CE1: MCTF memory management (009, 026, 027, 028) (Visiowave, Panasonic,
Nokia) Julien Reichel
CE2: Improved de-blocking filter settings (non-normative?) (RWTH, FTRD) (067)
Mathias Wien
CE3: Coding efficiency of entropy coding (SKKU, ETRI, Samsung) Woong Il
Choi, (021, 063)
CE4: Inter-layer motion prediction (Samsung, LG) Kyohyuk Lee (058)
CE5: Quality Layers (FTRD, Nokia, ...) (044, 055) Isabelle Amonou
CE6: Improvement of update step (015, 030, 062) (Samsung, MSRA, Nokia,
FhG-HHI) Woo-Jin Han
CE7: Enhancement-layer intra prediction (Thomson, FhG-HHI, Sharp, Huawei,
Samsung) (010, 053, 065) Jill Boyce
CE8: Region of Interest (NCTU, ICU, ETRI, I2R) (020) Zhongkang Lu
CE9: Improvement of quantization (046, 060, 066, 069) (FTRD, Panasonic,
Siemens, RWTH, FhG-HHI, Microsoft, Sharp) Stéphane Pateux
CE10: Extended spatial scalability (Thomson, FTRD, Sharp, LG) (008, 041, 042)
Edouard Francois
CE11: Improvement of FGS (055) (Nokia, FhG-HHI, NCTU) Justin Ridge
CE12: Weighted prediction from FGS layers (054) (Nokia, Visiowave, FhG-HHI)
Yiliang Bao

On the ISO/IEC side, our standards are published as part of the ISO/IEC 14496
suite of standards, which is available for purchase at:

Anyone can get copies of 3 ITU-T standards for free by using the following link:

The links to the JVT's standards at ITU-T are as follows:
Title: H.264 (03/05) : Advanced video coding for generic audiovisual services
Title: H.264.1 (03/05) : Conformance specification for H.264 advanced video
Title: H.264.2 (03/05) : Reference software for H.264 advanced video coding

Best regards,

Gary Sullivan


DIRAC CODEC (comparison/evaluation with H.264)

Dirac is a conventional hybrid motion-compensated
(overlapped block motion compensation is used) video
codec. Dirac uses arithmetic coding.

Main difference from MPEG: Dirac uses a wavelet
transform rather than the DCT – or DCT-like,
I am still reviewing/evaluating codec from the point
of view of D-Cinema (DCI spec, 2k-4k scalability,

--- Jean-Marc Glasser <> wrote:

>   Dear JVT experts,
>   Please find here the link to an alternative CODEC :
>   I wonder if it fits within the JVT framework and how
>   it compares to H.264.

MPEG-2 MULTIPLEXER FOR TS and PS page may help you.

More specifically,
implements "An ISO-13818 compliant multiplexer for generating MPEG2
transport and program streams".
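
When checking the output of such a multiplexer, it helps to inspect the 4-byte header of each 188-byte transport stream packet. The sketch below decodes the fixed header fields defined in the MPEG-2 systems specification (ISO/IEC 13818-1); the function name and the returned dictionary layout are assumptions of this example.

```python
def parse_ts_header(packet):
    """Decode the fixed 4-byte header of one MPEG-2 transport stream packet."""
    if len(packet) != 188:
        raise ValueError("a transport stream packet is 188 bytes long")
    if packet[0] != 0x47:
        raise ValueError("missing sync byte 0x47")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        "transport_priority": bool(packet[1] & 0x20),
        # 13-bit packet identifier: 5 bits from byte 1, 8 bits from byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "scrambling_control": (packet[3] >> 6) & 0x03,
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }
```

Grouping packets by PID and checking that each PID's continuity counter increments modulo 16 is a quick sanity test before demultiplexing the video and audio elementary streams.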

US PATENTS (US Patent and Trademark Office). While the
claims made in these patents can be simulated, no products/devices
based on these patents can be used for commercial purposes (proper
licensing, patent release etc must be obtained).

1. US Patent 4,999,705 dated March 12, 1991, A. Puri, “Three
dimensional motion compensated video coding”, Assignee: AT & T Bell
Labs, Murray Hill, NJ.

2. US Patent 4,958,226 dated Sept. 18, 1990, B.G. Haskell and A.
Puri, “Conditional motion compensated interpolation of digital motion
video”, Assignee: AT & T Bell Labs, Murray Hill, NJ.

Patent # 1 discusses adaptive 2D or 3D DCT of MxN or MxNxP blocks and a
special “zig-zag-zog” scan for the 3D DCT case. Here MxN is the
spatial block and P is in the temporal domain. It has a number of interesting
features and claims improved compression. It follows the GOP concept
size of GOP. This patent can be the basis of a number of research topics, especially
at the M.S. level.
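
For simulation purposes, a 3D DCT of an MxNxP block can be computed separably, one dimension at a time. The sketch below uses an orthonormal DCT-II along x, y and t; it illustrates only the transform itself, not the adaptive 2D/3D switching or the scan order of the patent, and the function names are hypothetical.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out

def dct_3d(block):
    """Separable 3-D DCT of block[t][y][x] (P frames of M rows by N columns)."""
    p, m, n = len(block), len(block[0]), len(block[0][0])
    # 1) transform every row (x direction)
    out = [[dct_1d(row) for row in frame] for frame in block]
    # 2) transform every column (y direction)
    for t in range(p):
        for x in range(n):
            col = dct_1d([out[t][y][x] for y in range(m)])
            for y in range(m):
                out[t][y][x] = col[y]
    # 3) transform along the temporal direction
    for y in range(m):
        for x in range(n):
            tube = dct_1d([out[t][y][x] for t in range(p)])
            for t in range(p):
                out[t][y][x] = tube[t]
    return out
```

For a static scene all the energy along t collapses into the first temporal plane of coefficients, which is where the compression gain of a 3D transform over a GOP comes from.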

3. US Patent 5,309,232, May 3, 1994, J. Hartung et al, “Dynamic bit
allocation for three-dimensional subband video coding”, Assignee: AT & T
Bell Labs, Murray Hill, NJ. (See also S-J Choi and J.W. Woods,
“Motion-compensated 3-D subband coding of video”, IEEE Trans. IP, vol. 8,
pp. 155-167, Jan. 1998.)

This patent has a number of interesting features and can lead to several
research topics, especially at the M.S. level.

Residual color transform

In the FRExtensions to H.264/MPEG-4 Part 10, a new addition is
the residual color transform. In this technique, the
input/output and stored reference pictures are in the RGB
domain, while the forward and inverse color
transformations are brought inside the encoder and decoder for
processing of the residual data only. Color transformations
are RGB to YCgCo (orange and green chroma) and the
inverse. Residual data implies (I assume) intra or motion
compensated prediction errors.
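
A reversible integer form of the RGB to YCgCo mapping, often written YCgCo-R, can be sketched with lifting steps as below. It is shown here because it round-trips exactly on integer residuals (which may be negative); whether the residual color transform uses exactly this lifting form is an assumption of this sketch, and the function names are illustrative.

```python
def rgb_to_ycgco_r(r, g, b):
    """Forward lifting-based YCgCo-R transform on integer samples."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

def ycgco_r_to_rgb(y, cg, co):
    """Inverse transform: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because each lifting step is undone exactly, the transform is lossless for any integer input, at the price of one extra bit of dynamic range in Cg and Co.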

                 pl see email from Woo-Shik Kim

In RCT, the YCgCo transform is applied to the residual signal after intra/inter prediction
and before integer transform/quantization at the encoder, and the inverse YCgCo
transform is applied to the reconstructed residual signal after dequantization/inverse
integer transform and before intra/inter prediction compensation at the decoder.
Since this is not a SVC subject, if you need further discussion you can use the JVT
reflector or 4:4:4 AhG reflector.
The e-mail address is the same for both, and [4:4:4] is
added to the subject for the 4:4:4 AhG reflector.

Best Regards,
Woo-Shik Kim

        Advanced 4:4:4 profile in H.264/MPEG-4 Part 10

Intra residual lossless DPCM coding is proposed in advanced 4:4:4 Profile of
H.264. Implement this and compare with RESIDUAL COLOR TRANSFORM. See
JVT-Q035 17-21 Oct. 2005.
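
The general idea of lossless residual DPCM can be sketched as follows: with the transform bypassed, each residual sample is predicted from its left neighbour and only the difference is entropy coded. This is a generic horizontal DPCM written for illustration; the prediction directions and other details of the JVT-Q035 proposal are not reproduced here, and the function names are hypothetical.

```python
def dpcm_rows(residual):
    """Horizontal DPCM: keep the first sample of each row and replace
    every other sample by its difference from the left neighbour."""
    return [[row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]
            for row in residual]

def inverse_dpcm_rows(diffs):
    """Rebuild the residual rows by running sums of the differences."""
    out = []
    for row in diffs:
        acc = 0
        rec = []
        for d in row:
            acc += d
            rec.append(acc)
        out.append(rec)
    return out
```

Comparing the entropy of the DPCM output with that of the residual after the residual color transform gives a first estimate of which tool is more effective for lossless 4:4:4 coding.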

Rahul et al, (31 Jan. 2006)

The most important thing to know about the High 4:4:4 profile is that we have removed it (or are
in the process of removing it) from the standard. We are working on a new Advanced 4:4:4
profile. So the prior High 4:4:4 profile should be considered only a historical curiosity for
purposes of academic study now.
In answer to your specific question, the primary other difference in the High 4:4:4 profile, in
addition to support of the 4:4:4 chroma sampling grid in a straightforward fashion similar to what
was done to support 4:2:2 versus 4:2:0, was the support of a more efficient lossless coding mode,
as controlled by a flag called qpprime_y_zero_transform_bypass_flag. This flag, when equal to
1, causes invoking of a special lossless mode when the QP' value for the nominal "Y" component
(which would be the G component for RGB video) is equal to 0. In the special lossless mode, the
transform is bypassed, and the differences are coded directly in the spatial domain using the
entropy coding processes that are otherwise ordinarily applied to transform coefficients.

Best Regards,

-Gary Sullivan

                                residual color prediction (H.264)

                             Pl open both documents. These files can be
                        sources for research projects.

                                MODEL AIDED CODER

    TRADITIONAL HYBRID CODING (mc-transform/prediction) is proposed by
    Thomas Wiegand (file: vicawiegand). This involves considerable original
    research to develop/design/implement this coder.


    free-viewpoint video (FVV, almost free navigation), Omni directional video
    (look around views) MPEG-4 2D/3D scene and object models are some of the
    research areas proposed by Jens-Rainer Ohm (file vicawiegand). These can
    lead to innovative research topics..

