ARM Graphics and Video Solution for Connected Home
             Alan Tsai
        Technology Specialist

              March 2010

A World of Infinite Content

                     Armchair browsing
                          Broadcast, pay-per-view, pre-recorded, web content,
                           user generated content, video conferencing…
                          Internet usage in 2012 predicted to be 75x that of 2008 [1]
                          Increasing popularity of IPTV
                          Challenge for cable operators to enable access to
                           video via the web (YouTube, iPlayer, UGC)

                     Almost infinite array of content
                          How to personalize and simplify access?
                          How to draw consumers to directed advertising?

                     Good, intelligent user interface design
                          Critical for any screen-based solution in the home

                      [1] Source: Cisco, October 2008 set top box conference

  Personalizing access to content

Widgets provide configurability and easy access to preferred content

Pleasing look and feel:
- simple blending of windows
- transparency (reveal the window

• EPG evolving into customized “home-page” – quick/easy navigation
• Attractive UI to draw consumers to high-reach, high-frequency

Battle for your eyeballs

    EPG on my DTV                  STM 3D UI

  Which UI would you expect on a $1500 HDTV?

3D games in the home, without a console

                     Not far away..

  Now we have …

HD resolution = high pixel throughput

Even simple UI effects challenging
     32 bits per pixel
     Over 60Mb/frame

Chart: frame size by display type (QVGA, 480x320, WVGA, HD 1080), annotated "x94 bit rate"

             F/Phone   QVGA      iPhone    WVGA      HD 1080p
resolution   220x196   320x240   480x320   854x480   1920x1080
bit/pixel    16        18        24        32        32
Mb/frame     0.7       1.3       3.5       12.5      66
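
The frame sizes in the table follow from width x height x bits per pixel. A minimal sketch of that arithmetic, assuming 1 Mb means 10^6 bits (some of the smaller table entries look closer to 2^20-bit rounding, so small differences from the slide are expected). The dictionary and function names are illustrative only.

```python
# Frame size in bits = width * height * bits per pixel.
# Resolutions and colour depths are copied from the table above;
# the megabit convention (10**6 bits) is an assumption.
DISPLAYS = {
    "F/Phone": (220, 196, 16),
    "QVGA": (320, 240, 18),
    "iPhone": (480, 320, 24),
    "WVGA": (854, 480, 32),
    "HD 1080p": (1920, 1080, 32),
}


def frame_bits(width, height, bpp):
    """Number of bits needed to hold one uncompressed frame."""
    return width * height * bpp


def frame_megabits(width, height, bpp):
    """Frame size in decimal megabits (assumed unit)."""
    return frame_bits(width, height, bpp) / 1e6


for name, (w, h, bpp) in DISPLAYS.items():
    print(f"{name:10s} {frame_megabits(w, h, bpp):6.1f} Mb/frame")
```

The 1080p row comes out at roughly 66 Mb per frame, which is why even a single full-screen blend is expensive at HD resolution.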

High quality UI = additional processing per pixel

Trilinear filtering for all perspective texturing               Bilinear filtering for all 2D texturing

Additional texture fetch and blending for reflections   Anti-aliasing for all geometry edges

Mali – The World’s Most Licensed GPU
Increasing success versus competition
• Mali is the most widely licensed embedded GPU
• Tier 1 OEM and SiP adoption

Chart: Cumulative number of Mali GPU Licensees, 2006 to 2009

                                            Proven IP that works with ARM CPUs,
                                             fabric and your IP
                                                 27 GPU licensees, >60M video engines
                                            Easily integrated, high quality software
                                            Optimized for embedded designs
                                                 Low power and small silicon area
                                                 Industry-leading bandwidth efficiency, leading to
                                                  lower SoC power consumption

Mali-400 MP Overview
• High end, scalable 2D and 3D accelerator
      World’s 1st embedded multi-core GPU
      1 to 4 fragment processors
• 2D and 3D graphics acceleration
      OpenVG 1.1, OpenGL ES 2.0 / 1.1
• High performance and image quality
      Scales to HD 1080p resolutions
      L2 cache tuned for maximum throughput
      Power efficient use of memory bandwidth
      Single, optimised driver for all configurations
• Best bandwidth performance for 1080p
      WVGA to 1080p, pixel rate up 5.4x, while bandwidth
       only increases 2.5x
• Mali still the ONLY GPU IP to pass Khronos
  conformance test at 1080p
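
The 5.4x pixel-rate figure matches the ratio of 1080p to WVGA pixel counts if WVGA is taken as 800x480, the resolution quoted on the anti-aliasing slide. A short check under that assumption; the 2.5x bandwidth growth is taken from the slide as given, not derived.

```python
# Assumption: WVGA here means 800x480 (not 854x480).
WVGA = (800, 480)
FULL_HD = (1920, 1080)
BANDWIDTH_GROWTH = 2.5  # quoted on the slide, not computed here


def pixels(resolution):
    """Pixel count of a (width, height) resolution."""
    width, height = resolution
    return width * height


# Growth in pixels per frame when moving from WVGA to 1080p.
pixel_growth = pixels(FULL_HD) / pixels(WVGA)

# Bandwidth spent per pixel at 1080p relative to WVGA.
relative_bandwidth_per_pixel = BANDWIDTH_GROWTH / pixel_growth

print(f"pixel growth: {pixel_growth:.1f}x")
print(f"bandwidth per pixel vs WVGA: {relative_bandwidth_per_pixel:.2f}x")
```

Read this way, the claim is that bandwidth per rendered pixel falls to a little under half of its WVGA value at 1080p.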

Mali 4x Multi-Sampled Anti-Aliasing
• Detail from 3DMM06 Samurai – WVGA 800x480
   Less than 10% performance drop for 4xMSAA

Single-sampling “NoAA”          4x Multi-sampling

Mali GPU for efficient rendering of UI

UI requirement                                      Mali-400 MP dual core

~100 to 200 triangles for cover images,             30M triangles per second
reflection, text and background

For 1080p/30 HDTV, require min 120 MPix/s           550M pixels/sec

Assuming at least 50% bi- or tri-linear filtered    Efficient trilinear filtering & dedicated internal
texture data, fetch ~360M 24-bit pixels per sec     caches to minimize texture bandwidth usage;
(1.1 Gbyte/s)                                       bilinear filtering with no overhead

Min 3.2 GFLOPS for filtering operations (not        >12 GFLOPS; alpha blending with no overhead
including transparency / blending)                  (texture fetch and blending for reflections in above UI)
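
The requirement figures above can be sanity-checked with simple arithmetic. The sketch below assumes that the 120 MPix/s requirement is the 1080p/30 screen rate multiplied by an overdraw factor; the slide does not state this, so the implied factor is an inference. Variable names are illustrative.

```python
WIDTH, HEIGHT, FPS = 1920, 1080, 30
REQUIRED_FILL = 120e6      # pixels/s, from the slide
TEXEL_FETCHES = 360e6      # 24-bit texels/s, from the slide
BYTES_PER_TEXEL = 3        # 24 bits
MALI_DUAL_FILL = 550e6     # pixels/s, from the slide

# Pixels per second needed to paint every screen pixel exactly once.
screen_rate = WIDTH * HEIGHT * FPS

# Assumed interpretation: the remainder of the requirement is overdraw.
implied_overdraw = REQUIRED_FILL / screen_rate

# Texture bandwidth implied by the texel fetch rate.
texture_bandwidth = TEXEL_FETCHES * BYTES_PER_TEXEL

# Fill-rate margin of the dual-core figure over the requirement.
fill_headroom = MALI_DUAL_FILL / REQUIRED_FILL

print(f"screen rate:       {screen_rate / 1e6:.1f} MPix/s")
print(f"implied overdraw:  {implied_overdraw:.2f}x")
print(f"texture bandwidth: {texture_bandwidth / 1e9:.2f} GB/s")
print(f"fill headroom:     {fill_headroom:.2f}x")
```

The texture bandwidth works out to about 1.08 GB/s, consistent with the 1.1 Gbyte/s quoted above.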

Combining video and graphics
   Several use cases
        User interfaces
        Transition effects
        Video post processing
        Picture-in-Picture

   Different formats used for GPU and Video
      GPUs use RGB, video decoders produce YUV
      Separate YUV-RGB conversion consumes
         bandwidth and CPU MHz
                                                              Image courtesy of The Astonishing Tribe

   Getting data from the video decoder into GPU
      GPU & video hardware from different vendors
      No common programming interfaces
      How does a user application control graphics and video?

Using Mali to optimize data flow
Block diagram (top to bottom): Application; OpenGL ES, EGL, Video decoder; Mali GPU
Legend: Delivered by ARM; Integrator / 3rd Party

Mali - unified memory arch - read from any part of memory
EGL - Khronos std for integrating graphics with OS and window system

Application
- calls into EGL to create shared resource
- programs video decoder to output YUV into shared resource
- sets up GL rendering using the shared resource as texture

Mali GPU
- reads YUV data directly (offload CPU, no separate conversion passes)
- convert to RGB using fragment shader
- YUV to RGB in just 2 cycles/pixel
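
For reference, the per-pixel conversion that the fragment shader performs can be sketched on the CPU. The slide does not say which colour matrix is used; this sketch assumes 8-bit BT.601 video-range YUV (Y in 16..235, U and V centred on 128), and the function names are illustrative rather than part of any Mali or EGL API.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgb_bt601(y, u, v):
    """Convert one 8-bit video-range YUV sample to 8-bit RGB.

    Assumed coefficients: the common BT.601 video-range matrix.
    """
    c = 1.164 * (y - 16)   # expand luma from 16..235 to full range
    d = u - 128            # blue-difference chroma, centred on zero
    e = v - 128            # red-difference chroma, centred on zero
    r = c + 1.596 * e
    g = c - 0.392 * d - 0.813 * e
    b = c + 2.017 * d
    return clamp8(r), clamp8(g), clamp8(b)


print(yuv_to_rgb_bt601(16, 128, 128))
print(yuv_to_rgb_bt601(235, 128, 128))
```

In a shader the same three multiply-add expressions run once per fragment, which is why sampling the decoder output directly avoids a separate conversion pass over the frame.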

Graphics demands increasing

  Per-pixel depth of field effects
  Per-pixel diffraction effects
  YUV texturing to “pull” video into graphics pipeline, including realtime contrast,
  brightness and saturation

    Same number of output pixels, but more work done for each pixel
    Requires at least 6 GFLOPS
    Can be efficiently handled by Mali GPU fragment shader

    For example, a dual core Mali-400 MP can perform 20 shader operations per cycle in
  addition to texture filtering, blending, anti-aliasing...
    …realtime contrast, brightness and saturation adjustment with YUV texturing at up to 275MPix/s
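
Contrast, brightness and saturation are convenient to adjust while the video is still in YUV, because luma and chroma are already separated. The sketch below shows one common formulation, assuming 8-bit video-range YUV; it is not taken from the slides, and the function and parameter names are hypothetical.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def adjust_yuv(y, u, v, contrast=1.0, brightness=0.0, saturation=1.0):
    """Adjust one YUV sample.

    contrast   scales luma around the black level (16, assumed video range)
    brightness is added to luma after scaling
    saturation scales both chroma channels around their neutral value (128)
    """
    y_out = (y - 16) * contrast + 16 + brightness
    u_out = (u - 128) * saturation + 128
    v_out = (v - 128) * saturation + 128
    return clamp8(y_out), clamp8(u_out), clamp8(v_out)


# Neutral settings leave the sample unchanged.
print(adjust_yuv(120, 90, 200))
# Zero saturation removes colour and keeps luma.
print(adjust_yuv(120, 90, 200, saturation=0.0))
```

Each adjustment is a multiply-add per channel, so it fits into the same fragment shader pass that samples the YUV texture.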

Advanced effects require ever higher pixel throughput

   Radial blur transition above easily adds 10x HDTV pixels
   ...start approaching 800M pixels per second and beyond...
   ...requiring a 3 or possibly even 4-core Mali-400 MP
   Bandwidth efficiency is critical
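
The core counts above are consistent with a rough estimate based on the 550M pixels/sec quoted earlier for the dual-core configuration. This sketch assumes fill rate scales linearly with the number of fragment processors, which the slides do not state explicitly, and it ignores memory bandwidth limits.

```python
import math

DUAL_CORE_FILL = 550e6                # pixels/s, quoted for Mali-400 MP dual core
PER_CORE_FILL = DUAL_CORE_FILL / 2    # assumption: linear scaling per core


def cores_needed(target_pixels_per_sec):
    """Smallest number of cores whose combined fill rate meets the target."""
    return math.ceil(target_pixels_per_sec / PER_CORE_FILL)


print(cores_needed(120e6))   # basic 1080p/30 UI requirement
print(cores_needed(800e6))   # radial-blur style transition
print(cores_needed(1.1e9))   # "and beyond"
```

Under these assumptions an 800M pixels/sec effect needs three cores, and anything past 825M pixels/sec needs the fourth.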

Mali-VE Video Engine overview
   High Definition Video Engine
        Full 1080p HD at 60fps
        Scalable architecture - up to 20x slow motion
   Multi-standard, multi-session codec
        Encode H.264, MPEG4, JPEG and H.263
        Decode H.264, MPEG4, DivX, JPEG, MPEG2,
                VC-1, VP6, Real 8/9/10 and H.263
   Designed for energy efficiency
        Low power consumption, 1080p30 at 60mW
        Small Si area of just 2.15 mm²
        Industry leading low memory bandwidth
        Firmware driven for configurability
   Applications
        Home entertainment
           HD DTV, Blu-ray devices, HD camcorders
        Portable entertainment
           Smartphone video, video telephony
                                                                             Data quoted at 40G, 0.9V

• Bewildering array of content available
• Good UI design allows personalization and simplifies access
  to content
• GPU is a must-have in connected home
     Use of 3rd dimension & the “Apple effect” for UI
     HD screens demand very high pixel throughput & bandwidth
     Adobe Flash 10.1 demands OpenGL ES 2.0 GPU
• ARM and partners deploying GPU today in wide range of
• Bringing the connected home to life

Want to know more?
• See Mali demos from our partners
      ST
      Mentor Graphics
      Digital Aria
      Telechips

• Come and see us at the ARM booth
    “Canvas” demo - OpenGL ES 2.0 UI on tablet
        Soft shadows, per-pixel lighting, multiple env
            maps, dynamic textures
           Over 30 FPS with 4x anti-aliasing at 720p HD

...or go to Mali Developer Center
• Full range of Mali developer resources
    Tools, Development boards, Demos, Examples, Tutorials


