MHP_Tutorial 
The lifecycle of an MHP application
The middleware in the receiver contains a component called the application manager, which is responsible for monitoring the current services and starting or stopping applications as appropriate. This application manager has ultimate authority over what happens to an application - the application manager starts it, the application manager stops it and any interaction between the rest of the MHP environment and the application that may affect its lifecycle is carried out via the application manager. Before the application manager can actually run an MHP application, several things have to happen. First of all, the receiver has to know that an MHP application actually exists. Second, it has to know that the user is allowed to run it at the current time. Finally, it has to be able to access everything that it actually needs to run the application, such as class files and data files or assets. The first and second parts are handled by the same mechanism. MHP defines an extra service information (SI) table called the Application Information Table (AIT). This table is broadcast for every service that contains an MHP application, and it contains an entry for every MHP application that's valid for that service. Thus, if a service has two applications associated with it, this table will contain two entries. The AIT contains all the information that the receiver will need to run the application and to tell the user what applications are available in a meaningful way. This includes elements such as the name of the application, the location of its files and any arguments that should be passed to the application when it starts. Each application that is broadcast is given a unique identifier, which is also stored in the AIT. This identifier allows other parts of the system to actually be able to refer to an application uniquely, since the name or other attributes may not be unique. 
Each identifier consists of two parts - a 32 bit organization ID, which is unique to every organization that produces MHP applications, and a 16 bit application ID. This application ID does not have to be unique, but no two applications signalled in the same AIT can have the same organization ID and application ID. Given the size of the ID range, it's good practice not to reuse application IDs where possible.
Applications may be started or stopped automatically by the receiver, based on a status indicator signalled in the AIT. This status indicator shows whether the application should be started automatically when the service is selected, whether the application should be killed automatically by the receiver if it is running, or whether the user can start it by hand. This allows a broadcaster to specify that applications which are time-critical can only be run in a given time period, and that if a user selects that service within that time period then they will always see that application.
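As a rough illustration of the identifier rules above, the sketch below models the 32 bit organization ID and 16 bit application ID and checks that no pair appears twice in one AIT. The class and method names (AppIdentifier, toKey, allUnique) are made up for this example and are not part of the MHP API, and the code is written for a desktop JDK (Java 8 or later) rather than for a receiver.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical value type for illustration only; not part of the MHP API.
final class AppIdentifier {
    final long organizationId; // 32 bit unsigned value, held in a long
    final int applicationId;   // 16 bit unsigned value, held in an int

    AppIdentifier(long organizationId, int applicationId) {
        if (organizationId < 0 || organizationId > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("organization ID must fit in 32 bits");
        }
        if (applicationId < 0 || applicationId > 0xFFFF) {
            throw new IllegalArgumentException("application ID must fit in 16 bits");
        }
        this.organizationId = organizationId;
        this.applicationId = applicationId;
    }

    // Pack both parts into one 48 bit key, organization ID in the high bits.
    long toKey() {
        return (organizationId << 16) | applicationId;
    }

    // True if no two entries share the same organization ID and application ID.
    static boolean allUnique(List<AppIdentifier> entries) {
        Set<Long> seen = new HashSet<>();
        for (AppIdentifier id : entries) {
            if (!seen.add(id.toKey())) {
                return false;
            }
        }
        return true;
    }
}
```

A tool that generates AITs could run a check like allUnique() over the entries for a service before the table goes on air.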
The diagram below shows the structure of the AIT:
table ID                                              8 bits
other section information                             60 bits
number of common descriptors                          12 bits
common descriptors {
    external application authorization descriptor
    transport protocol descriptor
    IP routing descriptor*
}
reserved                                              4 bits
number of descriptors in the application loop         12 bits
application loop descriptors {
    organization ID                                   32 bits
    application ID                                    16 bits
    application control code                          8 bits
    reserved                                          4 bits
    number of application descriptors                 12 bits
    application descriptors {
        application descriptor
        application name descriptor
        application icons descriptor
        pre-fetch descriptor
        DVB-J application descriptors OR DVB-HTML application descriptors
    }
}
Descriptors in italics are optional. Descriptors marked with an asterisk (*) may appear several times. Some descriptors are grouped, and will appear together or not at all. For instance, all descriptors relating to DVB-J applications will appear together, as will all descriptors relating to DVB-HTML applications. Descriptors may appear in the AIT in a different order to that listed here. This doesn't matter - the fact that the descriptors are present and that the overall syntax of the table is correct is what actually matters. The application files themselves are broadcast as part of the MPEG-2 transport stream, in a DSM-CC object carousel. Many developers (especially receiver developers) are not big fans of DSM-CC because it's big and complex. At the same time, it is probably the best existing format for the job, and allows things like dynamic updates that are very useful for MHP applications. There is nothing particularly special about the object carousels used by MHP, so almost any object carousel generator can be used.
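To make the fixed-size part of an application loop entry more concrete, here is a minimal parsing sketch. It assumes the field widths from the AIT structure above (32 bit organization ID, 16 bit application ID, 8 bit control code, 4 reserved bits, then a 12 bit field for the application descriptors) and big-endian byte order. The class name AitEntryHeader is invented for this example; a real receiver would get these bytes from its service information code, not from a hand-built array.

```java
// Hypothetical helper for illustration only; not part of the MHP API.
final class AitEntryHeader {
    static final int LENGTH_BYTES = 9; // 32 + 16 + 8 + 4 + 12 = 72 bits

    final long organizationId;  // 32 bits
    final int applicationId;    // 16 bits
    final int controlCode;      // 8 bits
    final int descriptorField;  // 12 bit field that follows the 4 reserved bits

    private AitEntryHeader(long organizationId, int applicationId,
                           int controlCode, int descriptorField) {
        this.organizationId = organizationId;
        this.applicationId = applicationId;
        this.controlCode = controlCode;
        this.descriptorField = descriptorField;
    }

    // Parse one entry header starting at the given offset.
    static AitEntryHeader parse(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < LENGTH_BYTES) {
            throw new IllegalArgumentException("need " + LENGTH_BYTES + " bytes");
        }
        long org = ((data[offset] & 0xFFL) << 24)
                 | ((data[offset + 1] & 0xFFL) << 16)
                 | ((data[offset + 2] & 0xFFL) << 8)
                 | (data[offset + 3] & 0xFFL);
        int app = ((data[offset + 4] & 0xFF) << 8) | (data[offset + 5] & 0xFF);
        int code = data[offset + 6] & 0xFF;
        // Mask off the 4 reserved bits in the upper half of this byte.
        int field = ((data[offset + 7] & 0x0F) << 8) | (data[offset + 8] & 0xFF);
        return new AitEntryHeader(org, app, code, field);
    }
}
```

The descriptors themselves follow this header and would need their own parsing, which this sketch does not attempt.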
Of course, having the object carousel containing the application files is only a part of the issue - the receiver still has to actually find it. Each entry in the AIT has an application location descriptor (which will have slightly different formats for DVB-J and DVB-HTML applications) that will identify the object carousel that contains the application, as well as the path within that object carousel since an object carousel can contain more than one application. So this describes what happens when you select a service that contains an MHP application. Well, partly. Of course, when you select a service, you're normally switching from another service. And that service might also contain an MHP application. It might even contain the same MHP application. Confused yet? This isn't actually as scary as it sounds. Basically, what happens is this: when you select a new service, the receiver watches for an AIT for that service. If it finds one, it compares the list of signalled applications with the applications that are currently running. Any applications that are running and which aren't signalled as being allowed to run in the new AIT get killed, and any applications in the new AIT that are signalled to start automatically get started. One potential problem with the AIT is that any applications that are explicitly signalled in it are either signalled as being available to start, or as being automatically killed. The AIT also allows broadcasters to take a neutral stance on some external applications, however. The external application authorization descriptor is a descriptor that may appear in the AIT that allows the broadcaster to indicate that an application may continue running if it is already started, although a new copy of it may not be started. 
For example, a news ticker application from CNN may be known not to interfere with any applications broadcast on ESPN, and so it may be listed in the external application authorization descriptor for ESPN and will continue running. However, there may be a known conflict with an application being broadcast on MTV, and so the news ticker will not be listed in the external application authorization descriptor for MTV.
Why do we need this functionality? Why not just signal it in the AIT? Well, the control codes that are possible in the AIT either explicitly kill or start an application, or say that it is available for starting. There is no control code which says that "this application may be allowed to continue if it's currently running, but no more copies can be started." Arguably, this is an oversight and it would have been simpler to include an extra control code instead of adding the external application authorization descriptor, but for now we have to use the external application authorization descriptor.
This is all made marginally more complex by the fact that some applications can be signalled in the AIT as being bound to the current service. This adds another step to the process, where any applications that are signalled as being service bound get killed before anything else happens.
What this means is that the final process that occurs when an MHP receiver switches to a new service is something like this:
• The application manager examines the current set of applications. Any which are signalled as being service bound are killed immediately.
• The application manager examines every application that is signalled on the new service. Any applications that are signalled as auto-start are loaded and started.
• Any applications that were already running and are not signalled directly in the AIT are compared against the application identifiers listed in the external application authorization descriptors.
• Any already running application that is not signalled, and is not listed in these descriptors, is killed.
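The steps above can be sketched as a small function that works out which applications are left running after a service change. Everything here is a simplification for illustration: applications are identified by plain strings, the Control enum names are invented labels for the three kinds of status described earlier (start automatically, available to start, kill), and the code is written for a desktop JDK (Java 8 or later), not for a receiver.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the service-switch decision; not receiver code.
final class ServiceSwitchSketch {
    // Invented names for the status signalled for each application.
    enum Control { AUTOSTART, PRESENT, KILL }

    // running maps each running application to its "service bound" flag.
    // Returns the applications that are running after the switch.
    static Set<String> switchService(Map<String, Boolean> running,
                                     Map<String, Control> newAit,
                                     Set<String> externallyAuthorized) {
        Set<String> result = new TreeSet<>();

        // Step 1: service bound applications are killed before anything else.
        for (Map.Entry<String, Boolean> e : running.entrySet()) {
            if (!e.getValue()) {
                result.add(e.getKey());
            }
        }

        // Step 2: apply the signalling of the new service.
        for (Map.Entry<String, Control> e : newAit.entrySet()) {
            if (e.getValue() == Control.AUTOSTART) {
                result.add(e.getKey());
            } else if (e.getValue() == Control.KILL) {
                result.remove(e.getKey());
            }
        }

        // Steps 3 and 4: anything still running that is neither signalled nor
        // listed in an external application authorization descriptor is killed.
        result.removeIf(app -> !newAit.containsKey(app)
                && !externallyAuthorized.contains(app));
        return result;
    }
}
```

In this model a news ticker that is not signalled on the new service survives only if it appears in the externallyAuthorized set, which matches the CNN and ESPN example given earlier.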
To further complicate matters, each application specifies which MHP profile and profile version it requires to run. This allows a receiver to decide if an application can actually be run on that MHP implementation, and only show the user those applications that will run. Any AIT entry with a profile or profile version entry higher than the receiver can handle will be ignored.
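A receiver-side check along these lines might look like the sketch below. It makes two assumptions that are not stated in the text: that profiles can be compared as plain integers, with a higher number meaning a more capable profile, and that a profile version is a three-part number compared part by part. The class and method names are invented for the example.

```java
// Hypothetical compatibility check; the encoding of profiles and versions is assumed.
final class ProfileCheck {
    // True if a receiver with rxProfile and rxVersion can run an application
    // that asks for appProfile and appVersion. Versions are {major, minor, micro}.
    static boolean canRun(int appProfile, int[] appVersion,
                          int rxProfile, int[] rxVersion) {
        if (appProfile > rxProfile) {
            return false; // the application needs a profile the receiver lacks
        }
        for (int i = 0; i < 3; i++) {
            if (appVersion[i] != rxVersion[i]) {
                // The first differing part decides the comparison.
                return appVersion[i] < rxVersion[i];
            }
        }
        return true; // identical versions
    }
}
```

An application manager would simply skip any AIT entry for which a check like canRun() returns false, so the user never sees it.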
Interactive applications in MHP
MHP applications come in two flavours. The first type is DVB-HTML applications. These are not very popular, partly because the specification for DVB-HTML was only completed with MHP 1.1, and partly because many broadcasters, box manufacturers and content developers find it too complex and difficult to implement. DVB-HTML applications are a set of HTML pages that are broadcast as part of a service. The spec is based around a modularized version of XHTML 1.1, and also includes CSS 2.0, DOM 2.0 and ECMAScript. By now you're probably starting to see why developers don't really want it. Most of these haven't been implemented properly in PC browsers yet, let alone on a set-top box. Suffice to say, the number of people who are even thinking about developing DVB-HTML applications can be counted on the thumbs of one hand.
This is probably the last time that I'll mention DVB-HTML in this document. Version 1.1 of the MHP specification has much more detail than version 1.0.1, but it's still only in draft form. If you really care, you can get a copy of the MHP 1.1 specification from www.mhp.org.
The second, and by far the most popular, flavour is DVB-J applications. These are written in Java using the MHP API set and consist of a set of class files that are broadcast with a service. Since DVB-J applications are overwhelmingly the most popular type, they are the only type of application that we'll discuss in the rest of these pages.
DVB-J applications are known as Xlets. These are a concept similar to applets for web pages that has been introduced by Sun in the JavaTV specification. Like applets, the Xlet interface allows an external source (the application manager in the case of an MHP receiver) to start and stop an application. The Xlet interface is found in the javax.tv.xlet package:
public interface Xlet {
    public void initXlet(XletContext ctx) throws XletStateChangeException;
    public void startXlet() throws XletStateChangeException;
    public void pauseXlet();
    public void destroyXlet(boolean unconditional) throws XletStateChangeException;
}
If you compare this to the java.applet.Applet class, you'll notice some similarities. Like the applet class, the Xlet has methods that allow the application manager to initialize it, start, and stop it. There are some major differences, however, between Xlets and applets. The biggest of these is that an Xlet can also be paused and resumed. The reason for this is very simple - in an environment like an MHP receiver where multiple applications may be running at the same time, and yet hardware restrictions mean that only one of those applications may be visible, nonvisible applications must be paused in order to keep resources free for the application which is visible. An Xlet is also much simpler than an applet - broadcasters and box manufacturers are paranoid, and so Xlets are in some way more limited in what they can do when interacting with their environment. Some of the other tasks that can be carried out via the applet class are supported in MHP, but other APIs must be used to do them. So, an Xlet has four main states - Loaded, Paused, Started, and Destroyed. If we examine the lifecycle of an Xlet, we can see where these states fit into the overall picture:
1. The application manager loads the Xlet's main class file (as signalled by the broadcaster) and creates an instance of the Xlet by calling the default constructor. This can happen at any point after the application is signalled. Once this has happened, the Xlet is in the Loaded state.
2. When the user chooses to start the Xlet (or the AIT signals that the Xlet should start automatically), the application manager in the receiver calls the initXlet() method, passing in a new XletContext object for the Xlet. The Xlet may use this XletContext to initialize itself, and to preload any large assets such as images that may require some time to load from the object carousel. When the initialization is complete, the Xlet is in the Paused state and is ready to start immediately.
3. Once the initXlet() method returns, the application manager calls the startXlet() method. This will move the Xlet from the Paused state into the Started state, and the Xlet will be able to interact with the user.
4. During the execution of the Xlet, the application manager may call the pauseXlet() method. This will cause the application to move from the Started state back to the Paused state. The application will later be moved back to the Started state by calling the startXlet() method again. This may happen several times during the Xlet's life.
5. At the end of the Xlet's life, the application manager will call the destroyXlet() method, which will cause the Xlet to move into the Destroyed state and free all its resources. After this point, this instance of the Xlet cannot be started again.
It's important to remember that an Xlet is not a standard Java application. There are many important differences between the two. An Xlet is conceptually much closer to an applet. Like an applet, there may be more than one Xlet running at any one time, which means that Xlets should not take certain actions that will globally affect the Java virtual machine.
For instance, an Xlet should never, ever, EVER call the System.exit() method. I've seen some early applications that do this, and it's highly frustrating when an application simply shuts down the entire VM when it terminates. Some more do's and don'ts are listed below.
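The lifecycle described in the numbered steps earlier can be pictured as a small state machine. The sketch below is my own model of those four states and the transitions between them; it is not how any particular application manager is implemented, and the class name XletLifecycle is invented for this example.

```java
// Hypothetical model of the Xlet states and the transitions between them.
final class XletLifecycle {
    enum State { LOADED, PAUSED, STARTED, DESTROYED }

    private State state = State.LOADED; // the instance has just been created

    State state() {
        return state;
    }

    // initXlet(): Loaded -> Paused
    void init() {
        require(State.LOADED, "initXlet");
        state = State.PAUSED;
    }

    // startXlet(): Paused -> Started (first start or a resume)
    void start() {
        require(State.PAUSED, "startXlet");
        state = State.STARTED;
    }

    // pauseXlet(): Started -> Paused
    void pause() {
        require(State.STARTED, "pauseXlet");
        state = State.PAUSED;
    }

    // destroyXlet(): any live state -> Destroyed, and there is no way back
    void destroy() {
        if (state == State.DESTROYED) {
            throw new IllegalStateException("already destroyed");
        }
        state = State.DESTROYED;
    }

    private void require(State expected, String method) {
        if (state != expected) {
            throw new IllegalStateException(
                    method + "() is not valid in state " + state);
        }
    }
}
```

Calling init(), start(), pause(), start() and destroy() in that order walks through every state, and any later call to start() is rejected because a destroyed instance cannot be started again.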
Xlet Contexts
Each Xlet has a context attached to it - an object that implements the javax.tv.xlet.XletContext interface. This is similar to the AppletContext that is associated with an applet - in both cases the context is used to provide a way for the application to get more information about its environment and to communicate any changes in its state to its environment.
public interface XletContext {
    public static final String ARGS = "javax.tv.xlet.args";
    public void notifyDestroyed();
    public void notifyPaused();
    public void resumeRequest();
    public Object getXletProperty(String key);
}
The notifyDestroyed() and notifyPaused() methods allow an Xlet to notify the receiver that it is about to terminate or pause itself - in this way, the receiver knows the state of every application and can take appropriate action. These methods should be called immediately before the Xlet enters the Paused or Destroyed states, because the receiver may take action that the application is otherwise unprepared for.
An application can request that it be moved from the Paused state back to the Started state using the resumeRequest() method. This may happen when a given event has occurred, for instance a certain time is reached or a certain event is detected in the MPEG stream. This effectively allows an application to 'sleep' for a while. Having said that, this method only requests that an application is started again - the receiver software may choose to ignore this request due to resource limitations, display issues or simply because it's feeling cruel.
The getXletProperty() method allows the Xlet to access properties that are defined for it in the information signalled by the broadcaster. JavaTV itself defines only one property. The XletContext.ARGS property enables an application to access any arguments that are given to it in the application signalling (the AIT), since the Xlet model does not directly allow for command-line arguments to be passed to it. MHP also defines the following Xlet properties:
• dvb.app.id - the application ID of the application, as indicated in the application signalling
• dvb.org.id - the organization ID of the application, as indicated in the application signalling
• dvb.caller.parameters - the parameters passed to this application if it was started by a mechanism other than application signalling.
The main difference between the XletContext.ARGS property and the dvb.caller.parameters property is that the former refers to parameters passed in via application signalling while the latter refers to parameters passed in via the application listing and launching API. In the latter case, the XletContext.ARGS property will still carry the arguments as they are signalled. Applications can also access system properties using the standard System.getProperty() method. However, only a few system properties are supported by MHP.
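Here is a minimal sketch of how an Xlet might read these properties. So that it compiles without the JavaTV classes, it uses a stand-in interface (PropertyLookup, invented for this example) that has the same getXletProperty() signature as XletContext; in a real Xlet you would call the method on the context passed to initXlet(). The sketch assumes that the ARGS value is a String array, and it prints the dvb.* values as returned without assuming a particular format.

```java
// Stand-in for the single XletContext method used here, so that the sketch
// compiles on a desktop JDK. In a real Xlet, use the XletContext instead.
interface PropertyLookup {
    Object getXletProperty(String key);
}

// Hypothetical helper for illustration only.
final class XletPropertyDump {
    // Same string value as the XletContext.ARGS constant.
    static final String ARGS = "javax.tv.xlet.args";

    // Returns the signalled arguments, or an empty array if none are available.
    static String[] signalledArgs(PropertyLookup context) {
        Object value = context.getXletProperty(ARGS);
        return (value instanceof String[]) ? (String[]) value : new String[0];
    }

    // Builds a one-line summary suitable for the debug output.
    static String describe(PropertyLookup context) {
        return "org=" + context.getXletProperty("dvb.org.id")
                + " app=" + context.getXletProperty("dvb.app.id")
                + " args=" + signalledArgs(context).length;
    }
}
```

Checking the type of the returned object before casting keeps the Xlet from failing with a ClassCastException if a receiver returns null or something unexpected.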
Do's and don'ts for application developers
We've already seen that applications should not call the System.exit() method. But there are a few other things that Xlets should do, depending on their state:
• The destroyXlet() method should remember to kill all application threads and cancel any existing asynchronous requests that are currently outstanding in the service information or section filtering APIs.
• The destroyXlet() (and ideally the pauseXlet()) methods should free any graphics contexts that the application has created. The middleware will maintain references to these unless they are disposed of properly with a call to java.awt.Graphics.dispose().
• The application should remember that it may be paused or destroyed at any time, and should make sure that it can always clean up after itself.
• Resource issues are especially important in an MHP environment, as we will see later. An application should cooperate with other Xlets where possible on resource issues, and especially should not keep scarce resources longer than it has to.
• Remember that your application should be as reliable as possible. If a method throws an exception, CATCH IT! Exceptions get thrown for a reason.
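The first point in the list above, killing application threads, is the one that gets forgotten most often, so here is one way it might be handled. This is only a sketch: the WorkerOwner class is invented for the example, the sleep stands in for real work, and it is written for a desktop JDK (Java 8 or later). The idea is that startXlet() would call start() and destroyXlet() would call stop().

```java
// Hypothetical helper that owns an Xlet's background thread.
final class WorkerOwner {
    private Thread worker;

    // Create the thread and return straight away, as startXlet() must.
    synchronized void start() {
        if (worker != null) {
            return; // already running
        }
        worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(50); // placeholder for the real work
                }
            } catch (InterruptedException e) {
                // Interrupted while sleeping: fall through and let the thread end.
            }
        }, "xlet-worker");
        worker.start();
    }

    // Ask the thread to stop and wait up to one second for it to finish.
    synchronized void stop() {
        if (worker == null) {
            return;
        }
        worker.interrupt();
        try {
            worker.join(1000);
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status instead of swallowing it.
            Thread.currentThread().interrupt();
        }
        worker = null;
    }

    synchronized boolean isRunning() {
        return worker != null && worker.isAlive();
    }
}
```

Interrupting the thread rather than using a deprecated method such as Thread.stop() gives the worker a chance to release whatever it is holding before it ends.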
Writing your first Xlet
Now that we've seen what an Xlet looks like, let's actually write one. In the grand tradition of first programs, we'll start with a "Hello world" type of application.
This example should work on all MHP implementations. While all of the samples on this website have been tested on a real implementation, it's important to remember that implementations do still differ, and until the MHP test suites are widely used, there may still be some changes needed for a particular platform.
Now that we've got the disclaimers out of the way, let's write the code. First, we'll start with the basic skeleton. This class implements the javax.tv.xlet.Xlet interface, and provides stub implementations of all its methods.
/**
 * The simplest Xlet that you will ever write. This Xlet does absolutely nothing,
 * not even print a message. However, it is a complete skeleton Xlet that will
 * compile, even if it does nothing useful once it's been compiled.
 *
 * This Xlet implements the javax.tv.xlet.Xlet interface, in order to be the main
 * class for an Xlet. All methods are inherited from javax.tv.xlet.Xlet
 */
public class MyFirstExampleXlet implements javax.tv.xlet.Xlet {

    /**
     * Every Xlet should have a default constructor that takes no arguments.
     * No other constructor will get called.
     */
    public MyFirstExampleXlet() {
        // The constructor should contain nothing. Any initialisation
        // should be done in the initXlet() method, or in the startXlet method
        // if it's time- or resource-intensive. That way, the MHP middleware
        // can control when the initialisation happens in a much more predictable
        // way
    }

    /**
     * Initialise the Xlet. The context for this Xlet will get passed in to this
     * method, and a reference to it should be stored in case it's needed later.
     * This is the place where any initialisation should be done, unless it takes
     * a lot of time or resources. If something goes wrong, then an
     * XletStateChangeException should get thrown to let the runtime system know
     * that the Xlet can't be initialised.
     */
    public void initXlet(javax.tv.xlet.XletContext context)
            throws javax.tv.xlet.XletStateChangeException {
        // Do nothing for now
    }

    /**
     * Start the Xlet. At this point the Xlet can display itself on the screen and
     * start interacting with the user, or do any resource-intensive tasks. These
     * kinds of function should be kept in startXlet(), and should *not* be done
     * in initXlet().
     *
     * As with initXlet(), if there is any problem this method should throw an
     * XletStateChangeException to tell the runtime system that it can't start.
     *
     * One of the common pitfalls is that the startXlet() method must return to its
     * caller. This means that the main functions of the Xlet should be done in
     * another thread. The startXlet() method should really just create that thread
     * and start it, then return.
     */
    public void startXlet() throws javax.tv.xlet.XletStateChangeException {
        // Do nothing for now
    }

    /**
     * Pause the Xlet. Unfortunately, it's not clear to anyone (including the
     * folks who wrote the MHP specification) what this means. Generally, it
     * means that the Xlet should free any scarce resources that it's using,
     * stop any unnecessary threads and remove itself from the screen.
     *
     * Unlike the other methods, pauseXlet() can't throw an exception to indicate
     * a problem with changing state. When the Xlet is told to pause itself, it
     * must do that.
     */
    public void pauseXlet() {
        // Do nothing for now
    }

    /**
     * Stop the Xlet. The boolean parameter tells the method whether the Xlet has to
     * obey this request. If the value of the parameter is true, the Xlet must
     * terminate and the runtime system will assume that when the method returns, the
     * Xlet has terminated. If the value of the parameter is false, the Xlet can
     * request that it not be killed, by throwing an XletStateChangeException. If the
     * MHP middleware still wants to kill the Xlet, it should call destroyXlet()
     * again with the parameter set to true.
     */
    public void destroyXlet(boolean unconditional)
            throws javax.tv.xlet.XletStateChangeException {
        // Do nothing for now
    }
}
This is the most basic Xlet that you can get: it's a valid Xlet that does absolutely nothing. You'll notice that its constructor is empty. This is deliberate. When the MHP middleware starts an application, it first needs to create an instance of the main class. Doing this will invoke the default constructor, and any code in the constructor will get executed. However, the Xlet has another method that should be used for this kind of setup - the initXlet() method. Doing this work in the initXlet() method allows better control over when this happens and means that it only gets done when the Xlet is actually initialised. In short, do not put any setup code in the default constructor of your Xlet. Do all the initialisation work in the initXlet() method, or in the startXlet() method if the initialisation uses a lot of resources.
From our basic skeleton, let's move forward and make it do something interesting. We will add some code to the four main methods - initXlet(), startXlet(), pauseXlet() and destroyXlet(). Since this is a simple Xlet, we don't need to add any other classes or methods before we get our working Xlet. This example looks fairly scary, but it's mostly comments:
// The main class of every Xlet must implement this interface - if it doesn't do this,
// the MHP middleware can't run it.
public class MySecondXlet implements javax.tv.xlet.Xlet {

    // Every Xlet has an Xlet context, just like the Applet context that applets in a
    // web page are given. This is created by the MHP middleware and passed in to the
    // Xlet as a parameter to the initXlet() method.
    private javax.tv.xlet.XletContext context;

    // A private field to hold the current state. This is needed because the startXlet()
    // method is called both to start the Xlet for the first time and also to make the
    // Xlet resume from the paused state. This field lets us keep track of whether we're
    // starting for the first time.
    private boolean hasBeenStarted;

    /**
     * Every Xlet should have a default constructor that takes no arguments.
     * No other constructor will get called.
     */
    public MySecondXlet() {
        // The constructor should contain nothing. Any initialisation
        // should be done in the initXlet() method, or in the startXlet method
        // if it's time- or resource-intensive. That way, the MHP middleware
        // can control when the initialisation happens in a much more predictable
        // way
    }

    /**
     * Initialise the Xlet. The context for this Xlet will get passed in to this
     * method, and a reference to it should be stored in case it's needed later.
     * This is the place where any initialisation should be done, unless it takes
     * a lot of time or resources. If something goes wrong, then an
     * XletStateChangeException should get thrown to let the runtime system know
     * that the Xlet can't be initialised.
     */
    public void initXlet(javax.tv.xlet.XletContext context)
            throws javax.tv.xlet.XletStateChangeException {
        // store a reference to the Xlet context that the Xlet is executing in
        this.context = context;

        // The Xlet has not yet been started for the first time, so set
        // this variable to false.
        hasBeenStarted = false;

        // Since this is a simple Xlet, we'll just print a message to the debug output
        System.out.println("The initXlet() method has been called. Our Xlet "
                + "context is " + context);
    }

    /**
     * Start the Xlet. At this point the Xlet can display itself on the screen and
     * start interacting with the user, or do any resource-intensive tasks. These
     * kinds of function should be kept in startXlet(), and should *not* be done
     * in initXlet().
     *
     * As with initXlet(), if there is any problem this method should throw an
     * XletStateChangeException to tell the runtime system that it can't start.
     *
     * One of the common pitfalls is that the startXlet() method must return to its
     * caller. This means that the main functions of the Xlet should be done in
     * another thread. The startXlet() method should really just create that thread
     * and start it, then return.
     */
    public void startXlet() throws javax.tv.xlet.XletStateChangeException {
        // Again, we print a message on the debug output to tell the user that
        // something is happening. In this case, what we print depends on
        // whether the Xlet is starting for the first time, or whether it's
        // been paused and is resuming

        // have we been started?
        if (hasBeenStarted) {
            System.out.println("The startXlet() method has been called to resume the "
                    + "Xlet after it's been paused. Hello again, world!");
        } else {
            System.out.println("The startXlet() method has been called to start the "
                    + "Xlet for the first time. Hello, world!");

            // set the variable that tells us we have actually been started
            hasBeenStarted = true;
        }
    }

    /**
     * Pause the Xlet. Unfortunately, it's not clear to anyone (including the
     * folks who wrote the MHP specification) what this means. Generally, it
     * means that the Xlet should free any scarce resources that it's using,
     * stop any unnecessary threads and remove itself from the screen.
     *
     * Unlike the other methods, pauseXlet() can't throw an exception to indicate
     * a problem with changing state. When the Xlet is told to pause itself, it
     * must do that.
     */
    public void pauseXlet() {
        // Since we have nothing to pause, we will tell the user that we are
        // pausing by printing a message on the debug output.
        System.out.println("The pauseXlet() method has been called. Bedtime...");
    }

    /**
     * Stop the Xlet. The boolean parameter tells the method whether the Xlet has to
     * obey this request. If the value of the parameter is true, the Xlet must terminate
     * and the runtime system will assume that when the method returns, the Xlet has
     * terminated. If the value of the parameter is false, the Xlet can request that it
     * not be killed, by throwing an XletStateChangeException. If the MHP middleware
     * still wants to kill the Xlet, it should call destroyXlet() again with the
     * parameter set to true.
     */
    public void destroyXlet(boolean unconditional)
            throws javax.tv.xlet.XletStateChangeException {
        if (unconditional) {
            // We have been ordered to terminate, so we obey the order politely and
            // release any scarce resources that we are holding.
            System.out.println("The destroyXlet() method has been called telling the "
                    + "Xlet to stop unconditionally. Goodbye, cruel world!");
        } else {
            // We have had a polite request to die, so we can refuse this request if
            // we want.
            System.out.println("The destroyXlet() method has been called requesting that "
                    + "the application stops, but giving it the choice. So, I'll decide "
                    + "not to stop.");

            // Throwing an XletStateChangeException tells the MHP middleware that the
            // application would like to keep running if it's allowed to.
            throw new javax.tv.xlet.XletStateChangeException("Please don't kill me!");
        }
    }
}
As you can see from this code, it simply prints out a different message when each method is called. Nothing complex, but enough to let you see what's going on. Now that we've got our code, we can compile it using javac or using your favourite IDE (make sure that the MHP classes are in your classpath). We will use this example as the basis of some other Xlets. More code samples are available in the code library.
Moving to MHP
So, you've finished reading all these pages, and you've taken a look at the specification, and you've decided that you like this MHP stuff. The only trouble is, you've got a big installed base of current receivers and broadcast equipment out there that don't support MHP. What can you do? One of the advantages of the MHP APIs is that they give you very low level access to the MPEG stream. This lets you do all sorts of interesting things, including access legacy data formats. For instance, suppose you have an OpenTV application that you're currently broadcasting in a DSM-CC data carousel, and that you want to convert to MHP. While you could broadcast an MHP application and all the assets in a DSM-CC object carousel as well, this is pretty wasteful of bandwidth. A better option is to simply broadcast the MHP application while re-using the assets from the OpenTV version. This takes a little more work in the application the first time you want to do it, but it can potentially save a lot of bandwidth. Although MHP doesn't include a method for accessing DSM-CC data carousels, it does provide a section filter API that allows you to directly access the MPEG-2 sections that make up the data carousel and parse it yourself. I've seen this technique used a number of times to great effect, and without a huge performance hit. When this is combined with MHP's support for various common iTV data formats and the way MHP builds on existing standards (especially at the lower
levels), switching to MHP starts to look less like an insurmountable obstacle. Yes, it will cost money, but there are benefits to be gained from doing it. Of course, there is a lot more to moving to MHP than that. You have to have content developed, and that means finding content developers - either hiring your own or contracting to an external company. If you are a content development company then this obviously means hiring your own or retraining existing staff. What does MHP mean to content developers? Since all MHP applications are written in either Java or HTML, there's a temptation to think that if you can write web applications then you can write MHP applications. This is simply not true. MHP is not the web. There are some major differences between the philosophy of web applications and the philosophy of MHP applications, not least the importance of cooperating nicely with other applications and with the receiver, the importance of bullet-proof reliability and the need to run on many different sets of hardware and software. In many cases, existing digital TV developers who have a knowledge of Java are probably more suited to MHP development than web developers trying to get into digital TV. Why? Well, at the end of the day, Java (or god forbid, HTML) are just languages. Learning them isn't that complex. Learning what happens when you build an unoptimized object carousel, or learning why your application really should catch all exceptions, check all return values and handle all possible cases takes time - it's about experience and attitude and domain knowledge. I'm not going to be arrogant and claim that it's impossible to learn this quickly, but usually it takes time because there's no fixed set of guidelines that you can apply. For both middleware developers and application developers, the delivery mechanism itself needs to be considered. MPEG-2 and broadcast delivery is a very different animal from IP-based delivery, and the whole focus changes. 
The application isn't the king - it's definitely secondary to its parent service, and that doesn't change even if the application is the only thing on that service. The underlying technology is also very different, and this gets underestimated so many times. I've seen many receivers that claim to support MHP, only to look closer and see that they actually run personalJava and some of the MHP APIs, but the middleware authors haven't started looking at MPEG-2 yet. The APIs related to MPEG-2 are easily the toughest parts of MHP to implement well. I know - I've been involved in developing MHP middleware for a major CE company, and I've seen how long it took the folks writing the MPEG-related middleware to get a stable implementation, let alone one that was product quality. If you underestimate the complexity of this part of the stack, it will turn round and bite you. You have been warned.
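To make the earlier point about parsing a legacy data carousel by hand a little more concrete, here is a minimal sketch of the very first step: decoding the three-byte header that starts every MPEG-2 private section. It assumes the raw section bytes have already been delivered to the application by a section filter. The SectionHeader class and its method names are invented for this illustration and are not part of any MHP API. The two table_id values are the ones that DSM-CC assigns to control messages and download data blocks.

```java
// Illustrative helper, not part of MHP: decodes the fixed three-byte
// header found at the start of every MPEG-2 private section.
final class SectionHeader {
    // DSM-CC table_id values (ISO/IEC 13818-6).
    static final int TABLE_ID_DSMCC_UN_MESSAGE = 0x3B;    // control messages
    static final int TABLE_ID_DSMCC_DOWNLOAD_DATA = 0x3C; // download data blocks

    final int tableId;
    final boolean sectionSyntaxIndicator;
    final int sectionLength; // number of bytes that follow the length field

    private SectionHeader(int tableId, boolean syntax, int length) {
        this.tableId = tableId;
        this.sectionSyntaxIndicator = syntax;
        this.sectionLength = length;
    }

    static SectionHeader parse(byte[] section) {
        if (section == null || section.length < 3) {
            throw new IllegalArgumentException(
                "a section header needs at least 3 bytes");
        }
        // Bytes are signed in Java, so mask before using them as numbers.
        int tableId = section[0] & 0xFF;
        boolean syntax = (section[1] & 0x80) != 0;
        // section_length is 12 bits: the low 4 bits of byte 1
        // followed by all 8 bits of byte 2.
        int length = ((section[1] & 0x0F) << 8) | (section[2] & 0xFF);
        return new SectionHeader(tableId, syntax, length);
    }

    // True if this section carries a block of carousel data.
    boolean isCarouselData() {
        return tableId == TABLE_ID_DSMCC_DOWNLOAD_DATA;
    }
}
```

A real parser would go on to check the CRC and decode the DSM-CC message inside the section, and that is where most of the work is. Treat this as a starting point only.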
The MHP APIs
MHP has APIs to do almost everything except make coffee for the developers, and that will probably get added in version 1.2. In general, these APIs can be put into several categories:
• low-level MPEG access APIs
• media control APIs
• application lifecycle APIs
• graphics APIs
• communication APIs
• other APIs
It's important to realize that the MHP APIs are the way they are for a number of reasons, and that one of the biggest of these is paranoia. Any consumer electronics application has to be incredibly reliable, and this gets to be even more important when you are talking about complex CE applications that can download and run third-party code. There are three main reasons for this paranoia:
• Convenience. These applications have to work all the time, on every box. That's tough to achieve without restricting what an application can do, and how it does it.
• People are used to their PC crashing - they're a lot less used to their TV crashing. Stopping someone from working is one thing, but if you interrupt their TV watching, most people get mad.
• People will probably not blame the content authors for any crashes. If you've got a Sony set-top box, it's the Sony STB that's crashed, not the application. This means that CE manufacturers are extra-cautious, because they will get the blame even if it's not their fault.
This thinking affects a lot of the API design. In many cases, you'll notice that API methods 'suggest' or 'hint' or 'request' that something happens. You'll also notice that the APIs allow for a lot of different failure conditions that an application can handle. The whole purpose of this is to make applications (and the receiver itself) more reliable, by making sure that the system software on the receiver is always in overall control. One of the implications for software developers is that they should take into account all these potential failure conditions that the receiver may report, and use these to build reliable applications. There's a well-known software development saying: 'never check for a failure condition that you don't know how to handle'. In a digital TV application, developers should know how to handle all failure conditions.
Managing resources in an MHP application
A digital TV receiver is usually a fairly low-end device, with limited resources available to applications and a requirement that those resources be shared effectively between concurrent applications, and between applications and the
receiver middleware. This makes resource management a major issue for receiver manufacturers and application developers, and it's something that you will encounter fairly frequently in the MHP APIs. Many of the APIs in MHP are based on the DAVIC resource notification API contained in the org.davic.resources package. Note the choice of words there - this is not a resource management API. Resource management is carried out purely by the receiver middleware as it sees fit, and an application has very little say in whether it gets to keep a scarce resource that is requested by another application. The resource notification API consists of three main interfaces, and is not intended to be a complete API in its own right. Instead, it's designed to be used by other APIs in a way that best suits them, although it does define some common concepts. To make this clearer in the following description, any references to a 'using' API mean an MHP API that implements the resource notification API as part of its own specification. As you can see from looking at the API specification, it's based around a client-server model, with one minor change for security and resilience. First, let's deal with the standard aspects. The ResourceServer interface is implemented by a class in the using API. This is responsible for handling requests for scarce resources, and for managing how those scarce resources are allocated. These could be either software resources (e.g. a software section filter) or hardware resources (e.g. a modem or MPEG decoder) - the API makes no distinction.
public interface ResourceServer {
    public abstract void addResourceStatusEventListener(
        ResourceStatusListener listener);
    public abstract void removeResourceStatusEventListener(
        ResourceStatusListener listener);
}
You'll notice that the ResourceServer interface defines no methods for actually requesting or releasing exclusive access to a scarce resource. This is a deliberate design decision, and it allows the using API to provide these functions in a way that's most natural to it. The way that a resource is reserved in a telephony API will very likely be different from the way a resource is reserved in a section filter API, for instance. The only methods that this interface defines allow an application to register as a listener for events indicating a change in the status of a resource. APIs which use this interface can provide events to show how specific resources change their status. The ResourceClient interface is, not surprisingly, the client side of the API. This interface is implemented by a class in the MHP application and is responsible for managing the application's use of the resource. Basically, this interface allows the receiver middleware to notify an application that a resource is
needed by something else in the receiver, and that it should give it up. Now maybe it becomes clear why this is called a resource notification API.
public interface ResourceClient {
    public abstract boolean requestRelease(
        ResourceProxy proxy, Object requestData);
    public abstract void release(
        ResourceProxy proxy);
    public abstract void notifyRelease(
        ResourceProxy proxy);
}
As you can see from the interface definition above, the ResourceClient interface contains three methods. These methods are all used to inform the application, with varying degrees of politeness, that a resource is needed by another application. The requestRelease() method is the most polite of these, in effect requesting that an application give up a scarce resource so that it can be used by something else in the receiver. The application is perfectly within its rights to refuse to release the resource (by returning the value false), either because it is currently using it for something critical or simply because it has decided not to release it. The release() method tells the client that it must give up the scarce resource. In this case, the client has no choice - all it can do is give up the resource so that whatever else needs to use it can do so. When this method returns, the receiver will assume that the resource is available for other parts of the system to use. This assumes that an application is going to be cooperative, of course. In the case above, the application could simply not return from the release() method and it would keep access to the resource, right? Wrong. If the release() method doesn't return within an appropriate timeout period, the receiver can assume that the application has crashed or is malicious, and reclaim the resource anyway. If this happens, the receiver will call the notifyRelease() method, which tells the client that the resource has been taken away from it and that it should clean up as best it can. This is a potentially brutal operation, and so it is saved for those cases where the application really is not cooperating. A class implementing the ResourceProxy interface sits between the client and the actual resource. This serves an important security function (which we will see below), but there are several other reasons for its existence.
The first purpose of the resource proxy is to provide an application with a simple way to set up the resource. If you have a fairly complex resource that needs a lot of setting up, like a modem, you don't want to have to explicitly set parameters on it every time that you request it. The design of the resource notification API lets you set parameters on the resource proxy, which can then be downloaded to the real resource with a single method call. Since the resource proxy exists across multiple request/release cycles, this can make life significantly easier.
A second reason for having the resource proxy is flexibility. In most using APIs, resource proxies are created by the client and have no link to a real resource until they are attached to it. This means that the resource client can create multiple resource proxies, each with different parameter settings, and then choose to use only one of these when it really needs to access the resource. The final and possibly most important reason is security. By using the resource proxy as an indirection mechanism, the application never has a direct reference to the Java object that actually controls the scarce resource. This makes it much, much easier for the receiver to take a resource away from a misbehaving application. If the application has a direct link to the object controlling the resource, there is no way that it can be forced to break that link. However, consider the situation when we have the resource proxy acting as an indirection mechanism. In this case, the following steps happen when the application requests access to a resource:
1. The resource client calls the appropriate method on the ResourceServer implementation for the API and requests access to the resource, passing the ResourceProxy object to it as a parameter.
2. If access is granted, the resource server calls a private method on the resource proxy. This tells the resource proxy that it is now valid, and establishes a link between the resource proxy and the resource itself.
3. The resource client calls methods on the resource proxy to manipulate the resource. The resource proxy forwards these requests to the real resource.
4. When the resource client wishes to give up the resource, it calls the appropriate method on the resource server (which is defined by the using API) to release the resource, passing the resource proxy as a parameter.
5. The resource server calls a private method on the resource proxy. This tells it that it is no longer valid and should not forward any more requests to the underlying resource.
6. The resource server updates its internal table of the state of the resources and marks the resource that was being used as free again.

So far, so standard. Now let's look at what happens when access to a resource is revoked. For this example, we'll assume that the application is malicious and refuses to give up control of the resource:
1. The resource server calls a private method on the resource proxy that has access to the scarce resource, telling it that it is no longer valid.
2. The resource proxy breaks the link between it and the scarce resource, and resets its internal state to take account of this.
3. The receiver calls the notifyRelease() method on the class in the application that implements the ResourceClient interface. This informs the application that it no longer has access to the resource and that it should do any housekeeping necessary.
4. Any further attempts by the application to access the resource have no effect on the scarce resource.

In this case, there is absolutely nothing that the application can do to interfere with the communication between the middleware and the resource proxy. Since the resource proxy is implemented by the receiver manufacturer, and can thus be trusted by the rest of the middleware, the resource proxy can be guaranteed to always give up the resource when it is asked. Similarly, since the methods used to validate or invalidate the resource proxy are not defined by the resource notification API (and are probably declared as private), a malicious application can do nothing to spoof a message from the resource server granting access to the resource. This may seem a little extreme, but it's the only way that reliability and paranoia concerns can be satisfied.
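The revocation sequence described above can be sketched in a few lines of Java. This is only an illustration of the indirection idea, not how any particular receiver implements it. ResourceProxy and ResourceClient are cut-down stand-ins for the org.davic.resources interfaces so that the sketch compiles on a desktop JVM, and Tuner, TunerProxy, TunerServer and StubbornClient are hypothetical classes invented for this example. The attach() and detach() methods are package-private here, standing in for the private validation methods that a real middleware would keep out of the application's reach.

```java
// Stand-ins for the org.davic.resources interfaces.
interface ResourceProxy {
    ResourceClient getClient();
}

interface ResourceClient {
    boolean requestRelease(ResourceProxy proxy, Object requestData);
    void release(ResourceProxy proxy);
    void notifyRelease(ResourceProxy proxy);
}

// Hypothetical scarce resource, owned by the middleware.
final class Tuner {
    int frequencyKhz;
}

// Hypothetical proxy, supplied by the receiver manufacturer.
final class TunerProxy implements ResourceProxy {
    private final ResourceClient client;
    private Tuner tuner; // null whenever the proxy is not valid

    TunerProxy(ResourceClient client) { this.client = client; }

    public ResourceClient getClient() { return client; }

    // Only the middleware is meant to call these two methods.
    void attach(Tuner t) { tuner = t; }
    void detach() { tuner = null; }

    // Forwarded to the real resource only while the proxy is valid.
    boolean tune(int frequencyKhz) {
        if (tuner == null) {
            return false;
        }
        tuner.frequencyKhz = frequencyKhz;
        return true;
    }
}

// Hypothetical resource server inside the middleware.
final class TunerServer {
    private final Tuner tuner = new Tuner();
    private TunerProxy owner;

    boolean reserve(TunerProxy proxy) {
        if (owner != null) {
            return false;
        }
        owner = proxy;
        proxy.attach(tuner);
        return true;
    }

    // Forced revocation: break the link first, then tell the client.
    void revoke() {
        if (owner == null) {
            return;
        }
        TunerProxy old = owner;
        old.detach();
        owner = null;
        old.getClient().notifyRelease(old);
    }

    int currentFrequencyKhz() { return tuner.frequencyKhz; }
}

// A client that never cooperates, to show that it doesn't matter.
final class StubbornClient implements ResourceClient {
    boolean notified;

    public boolean requestRelease(ResourceProxy proxy, Object requestData) {
        return false;
    }
    public void release(ResourceProxy proxy) { /* ignores the request */ }
    public void notifyRelease(ResourceProxy proxy) { notified = true; }
}
```

After revoke() has been called, the application still holds its TunerProxy reference, but every call to tune() returns false and the real tuner is left untouched. That is the whole point of the indirection.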
Using the resource notification API
In those APIs that use the resource notification API, resources are typically accessed as follows:
1. The application creates an instance of a ResourceProxy object.
2. The application sets some parameters on the ResourceProxy that will be passed to the real resource when it is acquired.
3. The application calls an API-dependent method on the ResourceServer for the API (usually passing the ResourceProxy it has created as a parameter) in order to reserve the real resource.
4. The application uses the resource as it wants to.
5. When it has finished, the application calls an API-dependent method on the ResourceServer object to release the resource for use by another application.

Although it sounds complex, in practice it's not that difficult to actually use. Knowing the philosophy behind this approach helps a lot - as one of the designers of this API, I've had to explain the reasoning behind it many times in the past, and every time I've done so it's answered the many questions that people usually have about this API. There are some things that an application author should consider when they are writing an application. As an application writer, you must be aware when you are using a scarce resource - your application may be stopping others from using the resource, and if another application with a higher priority wants the resource that you are using, your access to it may be revoked at any time. For these reasons, an application should only reserve scarce resources when it actually needs to use them, and it should also be prepared to lose access to its resources at any time so that it can recover gracefully when this happens. Graceful handling of conditions like these is the type of consideration that makes for a really reliable MHP application. While your application doesn't have to implement any sensible precautions in the methods provided by the ResourceClient interface, it's obviously much better if it does. Failing to take account of the way that resources are used by your application (and especially those resources that are reclaimed by the receiver middleware) will make your application more unreliable and error-prone.
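As a rough illustration of what 'sensible precautions' might look like on the application side, here is a sketch of a ResourceClient implementation that refuses a polite request only while it is doing something critical, and that remembers when work was interrupted so that it can retry later. ResourceProxy and ResourceClient are again cut-down stand-ins for the org.davic.resources interfaces, and DownloadTask with all of its helper methods is a hypothetical class invented for this example.

```java
// Stand-ins for the org.davic.resources interfaces.
interface ResourceProxy {
    ResourceClient getClient();
}

interface ResourceClient {
    boolean requestRelease(ResourceProxy proxy, Object requestData);
    void release(ResourceProxy proxy);
    void notifyRelease(ResourceProxy proxy);
}

// Hypothetical application class that uses some scarce resource.
final class DownloadTask implements ResourceClient {
    private boolean holdingResource;
    private boolean inCriticalSection;
    private boolean retryNeeded;

    // Called by the application once the using API grants the resource.
    void resourceGranted() {
        holdingResource = true;
        retryNeeded = false;
    }

    void beginCriticalWork() { inCriticalSection = true; }
    void endCriticalWork() { inCriticalSection = false; }

    // Polite request: refuse only while something critical is running.
    public boolean requestRelease(ResourceProxy proxy, Object requestData) {
        if (holdingResource && inCriticalSection) {
            return false;
        }
        cleanUp(false);
        return true;
    }

    // Not negotiable: stop using the resource and return promptly.
    public void release(ResourceProxy proxy) {
        cleanUp(inCriticalSection);
    }

    // The resource has already gone: tidy up as best we can.
    public void notifyRelease(ResourceProxy proxy) {
        cleanUp(true);
    }

    private void cleanUp(boolean interrupted) {
        holdingResource = false;
        inCriticalSection = false;
        retryNeeded = retryNeeded || interrupted;
    }

    boolean holdsResource() { return holdingResource; }
    boolean needsRetry() { return retryNeeded; }
}
```

The important design choice is that all three callbacks end up in the same clean-up path, so the application is left in a consistent state no matter how politely it was asked.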
Application lifecycle control APIs
There are two APIs in an MHP receiver that deal with actions that can affect the execution state of another MHP application, either by causing an application to start running, or by killing an already running application. As we will see later, the service selection API is one of these, but this is not its main purpose and application lifecycle control is really a small part of what it does. If you really want detailed control over which applications run and don't run in the current service, then the application discovery and launching API is for you. This is contained in the org.dvb.application package. This API is organized
around the AppsDatabase class - this maintains a database of all applications currently being signalled on the current service(s). An application can get information about other applications, start them and possibly stop them, and monitor the status of other applications and the contents of the database. The philosophy behind the design of the application listing and launching API is that there are two types of operation that are likely to be carried out. Some applications will only want (or only be allowed) to get information about other applications that are being signalled. This is all encapsulated in the AppAttributes class, which provides access to the information contained in the AIT. Other applications will want to actually manipulate other applications. This functionality is provided by the AppProxy class and its subclasses, which have methods that enable the controlling application to change the state of another application. Both of these classes are described in more detail below. By separating these functions into two different objects, the application manager can check the security of requests more easily - anything can access information about other applications, so no security checking is necessary in the AppAttributes class. On the other hand, any request for an AppProxy object will cause the permissions of the requesting application to be checked, in order to make sure that it has permission to control applications. The AppsDatabase is a singleton object, and applications can get a reference to it by calling the static AppsDatabase.getAppsDatabase() method. Applications can request information from the applications database using the AppsDatabase.getAppAttributes() method. There are two versions of this, one taking an application ID (actually an instance of the AppID class that represents an application ID), and one taking an AppsDatabaseFilter. The latter allows the application to select only the applications that match a set of conditions.
The only filter currently defined in MHP is the CurrentServiceFilter that selects only those applications signalled as part of the current service, but others may be defined later and an application can define its own. Only the MHP implementation, however, could define any really useful new filters, so there's not much point in a user-defined filter at the moment.
AppsDatabaseFilter and its subclasses have an accept() method that takes an
application identifier as an argument. The AppsDatabaseFilter implementation checks the application with this identifier against the criteria that are hardwired into that particular filter. If the specified application matches these criteria, then the method returns true - otherwise, it returns false. This allows a pluggable mechanism for adding new types of filter in future versions of MHP.
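To show how the accept() mechanism works, here is a minimal sketch of a user-defined filter that accepts only applications from one organization. AppID and AppsDatabaseFilter are simplified stand-ins for the classes of the same name in org.dvb.application, declared here so that the sketch compiles without an MHP stack. OrganisationFilter is a hypothetical class invented for this example. As noted above, a filter like this only sees the application identifier, which is why user-defined filters are of limited use in practice.

```java
// Simplified stand-in for org.dvb.application.AppID.
final class AppID {
    private final int oid; // organization ID
    private final int aid; // application ID

    AppID(int oid, int aid) {
        this.oid = oid;
        this.aid = aid;
    }

    int getOID() { return oid; }
    int getAID() { return aid; }
}

// Simplified stand-in for org.dvb.application.AppsDatabaseFilter.
abstract class AppsDatabaseFilter {
    abstract boolean accept(AppID appid);
}

// Hypothetical filter: accepts applications from a single organization.
final class OrganisationFilter extends AppsDatabaseFilter {
    private final int organisationId;

    OrganisationFilter(int organisationId) {
        this.organisationId = organisationId;
    }

    boolean accept(AppID appid) {
        return appid != null && appid.getOID() == organisationId;
    }
}
```

An instance of such a filter would be passed to the filter version of getAppAttributes() in exactly the same way as a CurrentServiceFilter.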
Controlling applications
If an application actually wants to manipulate another application, it can call the AppsDatabase.getAppProxy() method. This takes an application ID as a parameter, and returns an AppProxy object representing that application (if the requesting application actually has permission to manipulate it). Now that we've seen what the AppsDatabase can actually do, here is the complete interface of the AppsDatabase class:
public class AppsDatabase {
    public static AppsDatabase getAppsDatabase();
    public int size();
    public Enumeration getAppIDs(
        AppsDatabaseFilter filter);
    public Enumeration getAppAttributes(
        AppsDatabaseFilter filter);
    public AppAttributes getAppAttributes(AppID key);
    public AppProxy getAppProxy(AppID key);
    public void addListener(
        AppsDatabaseEventListener listener);
    public void removeListener(
        AppsDatabaseEventListener listener);
}
The contents of the database can change with time, depending on actions by the broadcaster or the user. For instance, the broadcaster may choose to stop signalling one application, while signalling another in its place (e.g. when one TV show finishes and another starts). Alternatively, the user may switch to a different channel, which is signalling a completely new set of applications. An application can get notification of these changes by listening to AppsDatabaseEvents. These allow an application to be notified about changes to the contents of the applications database. The addListener() and removeListener() methods on the AppsDatabase let applications register (and unregister, respectively) to receive events when the contents of the AIT change. An AppsDatabaseEvent gets generated whenever an application is added or removed from the AIT, when an application is changed, or when the whole AIT gets changed (e.g. when the user changes channel).

The AppAttributes class

The AppAttributes class, as we saw earlier, maintains all the information that the applications database holds about a given application. As such, it's not that complex and I won't describe it in too much detail. AppAttributes is an interface that provides get methods for all the attributes listed in the AIT for any application, ranging from the name and the type of the application (DVB-J or DVB-HTML) to the priority that it's been given by the broadcaster.
public interface AppAttributes {
    public final int DVB_J_application;
    public final int DVB_HTML_application;
    public int getType();
    public String getName();
    public String getName(String iso639Code)
        throws LanguageNotAvailableException;
    public String[][] getNames();
    public String[] getProfiles();
    public int[] getVersions(String profile)
        throws IllegalProfileParameterException;
    public boolean getIsServiceBound();
    public boolean isStartable();
    public AppID getIdentifier();
    public AppIcon getAppIcon();
    public int getPriority();
    public org.davic.net.Locator getServiceLocator();
    public Object getProperty(String index);
}
This class is relatively simple, and so we won't cover it in too much detail here.

The AppProxy class

The AppProxy class is also relatively simple. As we can see from the interface definition below, it provides a set of methods to control the state of the application:
public interface AppProxy {
    public int getState();
    public void start();
    public void stop(boolean forced);
    public void pause();
    public void addAppStateChangeEventListener(
        AppStateChangeEventListener listener);
}
The only non-obvious method here is the addAppStateChangeEventListener() method. This allows an application to listen to events (specifically, AppStateChangeEvent and its subclasses) that notify it of changes in the state of another application. For instance, if one application needs to know when another application terminates, this mechanism lets the first application receive notification of this. You might have noticed that both the AppAttributes class and AppProxy class are actually defined as interfaces. There was much argument during the design of this API as to whether separate classes were needed, or whether they should be merged into one. In the end, the decision was that for security reasons, separate interfaces were better. By defining these as interfaces, however, an MHP receiver may choose to implement both interfaces in a single class. This is a compromise, but doesn't significantly affect security. Applications should not assume that these two interfaces will be implemented by the same class. Given that we've now seen all the important elements of the application listing and launching API, let's take a look at an example:
import java.util.Enumeration;
import org.dvb.application.*;

// This example starts all applications that are
// signalled on the current service.

// The first thing that we have to do is get a
// reference to the applications database. Since
// this is a singleton object, we use the
// getAppsDatabase() static method.
AppsDatabase theDatabase;
theDatabase = AppsDatabase.getAppsDatabase();

// Now that we've got a reference to the database,
// we can find what applications are being
// signalled on the current service by using a
// CurrentServiceFilter to tell us what is being
// signalled on this service.
AppsDatabaseFilter filter;
filter = new CurrentServiceFilter();
Enumeration attributes;
attributes = theDatabase.getAppAttributes(filter);

// Since the version of the method that uses a
// filter returns an Enumeration to us, we need
// to loop through the returned elements and
// handle each application in turn.
while (attributes.hasMoreElements()) {
    // First, get the attributes of the application
    // from the enumeration
    AppAttributes info;
    info = (AppAttributes) attributes.nextElement();

    // The attributes contain the application ID
    // which we can get by using the getIdentifier()
    // method on the AppAttributes object.
    AppID id = info.getIdentifier();

    // We then use this to get an AppProxy object
    // that represents the application by using
    // the getAppProxy() method on the AppsDatabase.
    AppProxy proxy;
    proxy = theDatabase.getAppProxy(id);

    // Now that we've got a reference to a proxy for
    // the application, we can start it.
    proxy.start();
}
One thing that you'll notice if you read the specification is that the application 'requests' that operations get carried out on other applications. The main reason behind this is to allow the application manager running in the receiver to refuse to carry out a request. This could be either for security or for resource reasons, but it's aimed at improving the reliability of the receiver.
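Because start() is only a request and returns nothing, the only way for the calling application to find out what happened is to listen for the resulting state change event. The sketch below shows that pattern. The types are simplified stand-ins modelled on org.dvb.application, declared locally so that the sketch compiles without an MHP stack. FakeAppProxy, its constructor flag (which plays the part of the application manager's decision), its state values and the StartOutcome class are all hypothetical and exist only for this illustration.

```java
// Simplified stand-in modelled on org.dvb.application.AppStateChangeEvent.
final class AppStateChangeEvent {
    private final int fromState;
    private final int toState;
    private final boolean failed;

    AppStateChangeEvent(int fromState, int toState, boolean failed) {
        this.fromState = fromState;
        this.toState = toState;
        this.failed = failed;
    }

    int getFromState() { return fromState; }
    int getToState() { return toState; }
    boolean hasFailed() { return failed; }
}

interface AppStateChangeEventListener {
    void stateChange(AppStateChangeEvent evt);
}

// Hypothetical proxy. The state values are chosen for this sketch only.
final class FakeAppProxy {
    static final int NOT_LOADED = 0;
    static final int STARTED = 1;

    private final boolean managerAllowsStart;
    private int state = NOT_LOADED;
    private AppStateChangeEventListener listener;

    FakeAppProxy(boolean managerAllowsStart) {
        this.managerAllowsStart = managerAllowsStart;
    }

    void addAppStateChangeEventListener(AppStateChangeEventListener l) {
        listener = l;
    }

    int getState() { return state; }

    // A request, not a command: the outcome is reported through an event.
    void start() {
        int from = state;
        boolean failed = !managerAllowsStart;
        if (!failed) {
            state = STARTED;
        }
        if (listener != null) {
            listener.stateChange(new AppStateChangeEvent(from, state, failed));
        }
    }
}

// Application-side listener that records what actually happened.
final class StartOutcome implements AppStateChangeEventListener {
    boolean started;
    boolean refused;

    public void stateChange(AppStateChangeEvent evt) {
        if (evt.hasFailed()) {
            refused = true;
        } else if (evt.getToState() == FakeAppProxy.STARTED) {
            started = true;
        }
    }
}
```

In a real receiver the event will normally arrive asynchronously, so an application should not assume that the other application is running just because start() has returned.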
Graphics APIs
No application would be complete (or even functional in many cases) without being able to draw stuff on the screen. The MHP specification uses a mixture of a cut-down version of AWT and a GUI API from the HAVi specification. The graphics model is probably one of the most complex parts of MHP, and for good reason: there are a lot of different issues that need to be addressed when you are talking about graphics in a TV environment. Here are just some of the things that you typically have to consider:
• Pixel aspect ratios. Video and TV applications typically use non-square pixels, while computer graphics APIs usually assume that pixels are square. Combining these two models can make things very interesting. To make it worse, in some receivers they may not be combined, so an application trying to produce perfect overlays of graphics on video could be in for a very hard time.
• Aspect ratio changes. As if the previous point wasn't enough, the aspect ratio of a TV signal may change from the standard 4:3 to widescreen (16:9) or even 14:9. This could be changed by the TV signal itself, or by the user. In either case the effect on graphics and images not designed for that aspect ratio can be pretty unpleasant from an application developer's point of view. It doesn't look good when all circles in an application stop being circular, for instance.
• Translucency and transparency. How can we make graphics transparent so that the viewer can see what's underneath them?
• Colour space issues - how do we map the RGB colour space used by Java on to the YUV colour space used by TV signals?
• No window manager. Window managers are too complex to be used on many MHP receivers, so an application needs some other way of getting an area it can draw on. This also means that an application will have to coexist with other applications in a way that's different from standard Java or PC applications.
• User interface differences - an application may not have input focus if another application is active. The application may also need a new user interface metaphor, if the user only has a TV remote to interact with it.
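To put some numbers on the first two points in the list above, here is a small sketch of the arithmetic involved in keeping a circle circular. It makes the simplifying assumption that the whole pixel raster is stretched to fill the whole display, which is not exactly true of real broadcast signals but is close enough to show the effect. The PixelAspect class and its method names are invented for this example and are not part of MHP or HAVi.

```java
// Illustrative arithmetic only, not part of any MHP or HAVi API.
final class PixelAspect {
    // Width of one pixel divided by its height, assuming the full raster
    // (widthPx x heightPx) fills a display of shape displayW:displayH.
    static double pixelAspectRatio(int widthPx, int heightPx,
                                   int displayW, int displayH) {
        double displayAspect = (double) displayW / displayH;
        double rasterAspect = (double) widthPx / heightPx;
        return displayAspect / rasterAspect;
    }

    // Horizontal radius, in pixels, of a shape that should look circular
    // on screen when its vertical radius is verticalRadiusPx pixels.
    static int horizontalRadius(int verticalRadiusPx, double pixelAspect) {
        return (int) Math.round(verticalRadiusPx / pixelAspect);
    }
}
```

For a 720x576 raster on a 4:3 display this gives a pixel aspect ratio of 16/15 (about 1.07), so a circle with a vertical radius of 150 pixels needs a horizontal radius of about 141 pixels. Switch the same raster to 16:9 and the ratio becomes 64/45 (about 1.42), so the same circle now needs a horizontal radius of about 105 pixels. An application that hard-codes either value will draw ellipses when the aspect ratio changes.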
Given all these issues, it's probably not surprising that there are a lot of graphics-related issues that you need to think about as a developer. We'll cover the parts that are the same before we go on to the differences, though, and there are a lot of things that are the same. Many of the lessons learned from developing AWT applications will still apply here, although typically you won't want the standard PC-like look and feel for any applications. Basically, MHP uses a subset of the personalJava AWT API for its UI handling. There are a few differences caused by the constraints of a TV environment, but most of the time anyone who can develop an AWT application will be happy working with the graphics APIs in MHP. The differences come in the extensions that MHP has added. The HAVi (Home Audio Video Interoperability) standard defines a set of Java GUI extensions known as the HAVi Level 2 GUI. DVB has adopted these extensions for MHP, because they fulfill almost all of the requirements that MHP had and because using them saved DVB from having to define its own extensions. These extensions are contained in the org.havi.ui package, and are related to several areas of graphics operations. The first of these, and the easiest to understand, is a set of extra GUI widgets that are designed for use on a TV screen, both to fit TV-specific limitations and requirements, and to provide a TV-specific interaction model. These classes (such as org.havi.ui.HComponent and org.havi.ui.HContainer) should be used in place of their AWT equivalents. We will examine several elements of the org.havi.ui package in the rest of this section, but before we can do that, we have to understand the graphics model in an MHP receiver.
Screen Devices and the MHP display model
An interactive TV display can typically be split into three layers. The background layer is usually only capable of displaying a single colour, or if
you're lucky a still image. On top of that, there is the video layer. As its name suggests, this is the layer where video is shown. Video decoders in set-top boxes usually have a fairly limited feature set, and may only display video at full-screen and quarter-screen resolutions and a limited set of coordinates. On top of that layer, there is the graphics layer. This is the layer that graphics operations in MHP will actually draw to. This graphics layer may have a different resolution from the video or background layers, and may even have a different shape for the pixels (video pixels are typically rectangular, graphics pixels are typically square). Given the nature of a typical set-top box, you can't expect anything too fancy from this graphics layer - 256 colours at something approaching 320x200 may be the best that an application developer can expect.
All of these layers may be configured separately, since there are quite possibly different components in the system responsible for generating all of these layers. As you can imagine, this is potentially a major source of headaches for MHP graphics developers. To complicate it further, while the layers can be configured separately, there is some interaction, and configuring one layer in a specific way may impose constraints on the way the other layers can be configured. HAVi (and MHP) defines an HScreen class to help solve this problem. This represents a physical display device, and every MHP receiver will have one HScreen instance for every physical display device that's connected to it - typically this will only be one. Every HScreen has a number of HScreenDevice objects. These represent the various layers in the display. Typically, every HScreen will have one each of the following HScreenDevice subclasses:
• HBackgroundDevice, representing the background layer
• HVideoDevice, representing the video layer
• HGraphicsDevice, representing the graphics layer
As we can see from the HScreen class definition below, we can get references to these devices by calling the appropriate methods on the HScreen object:
public class HScreen {
    public static HScreen[] getHScreens()
    public static HScreen getDefaultHScreen()
    public HVideoDevice[] getHVideoDevices()
    public HGraphicsDevice[] getHGraphicsDevices()
    public HVideoDevice getDefaultHVideoDevice()
    public HGraphicsDevice getDefaultHGraphicsDevice()
    public HBackgroundDevice getDefaultHBackgroundDevice()
    public HScreenConfiguration[]
        getCoherentScreenConfigurations()
    public boolean setCoherentScreenConfigurations(
        HScreenConfiguration[] hsca)
}
We can get access to the default devices, but in the case of the video and graphics devices, there may also be other devices that we can access. Obviously we can only have one background, but features like picture-in-picture allow multiple video devices, and if multiple graphics layers are supported, we can have a graphics device for each layer. This is shown in the diagram below.
Configuring screen devices
Once we have a device, we can configure it using instances of the HScreenConfiguration class and its subclasses. Using these, we can set the pixel aspect ratio, screen aspect ratio, screen resolution and other parameters that we may wish to change. These configurations are set using subclasses of the HScreenConfigTemplate class. An HScreenConfigTemplate provides a mechanism that lets us set the parameters we would like for a given screen device, and we can then query whether this configuration is possible given the configuration of the other screen devices. Let's examine how this works, taking the graphics device as an example. The HGraphicsDevice class has the following interface:
    public class HGraphicsDevice extends HScreenDevice {
        public HGraphicsConfiguration[] getConfigurations()
        public HGraphicsConfiguration getDefaultConfiguration()
        public HGraphicsConfiguration getCurrentConfiguration()
        public boolean setGraphicsConfiguration(
            HGraphicsConfiguration hgc)
            throws SecurityException,
                   HPermissionDeniedException,
                   HConfigurationException
        public HGraphicsConfiguration getBestConfiguration(
            HGraphicsConfigTemplate hgct)
        public HGraphicsConfiguration getBestConfiguration(
            HGraphicsConfigTemplate hgcta[])
    }
The first three methods are pretty obvious - these let us get a list of possible configurations, the default configuration and the current configuration. The next method is equally obvious, and lets us set the configuration of the device. The only thing to note here is the HConfigurationException that may get thrown if an application tries to set a configuration that's not compatible with the configurations of other screen devices.
The last two methods are the interesting ones to us. The getBestConfiguration() method takes an HGraphicsConfigTemplate (a subclass of HScreenConfigTemplate) as an argument. This allows the application to construct a template and set some preferences for it (we'll see how to do this in a few moments) and then see what configuration can be generated that is the best fit for those preferences, given the current configuration of the other screen devices. The variant of this method takes an array of HGraphicsConfigTemplate objects and returns the configuration that best fits all of them, but this is used less often. Once we have a suitable configuration, we can set it using the setGraphicsConfiguration() method.
As we've already seen, the HAVi API uses the HScreenConfigTemplate class to define a set of preferences for the configuration. These preferences have two parts - the preference itself, and a priority. The preferences are fairly self-explanatory, and valid preferences are defined by constant values in the HScreenConfigTemplate class and its subclasses. We'll only mention three of these preferences here - the ZERO_GRAPHICS_IMPACT and ZERO_VIDEO_IMPACT preferences are used to specify that any configuration shouldn't have an effect on already running graphical applications or on currently playing video; and the VIDEO_GRAPHICS_PIXEL_ALIGNED preference indicates whether the pixels in the video and graphics layers should be perfectly aligned (e.g. if the application wants to use pixel-perfect graphics overlays on to video).
The more interesting part is the priority that's assigned to every preference. This can take one of the following values:
• REQUIRED - this preference must be met
• PREFERRED - this preference should be met, but may be ignored if necessary
• UNNECESSARY - the application has no preferred value for this preference
• PREFERRED_NOT - this preference should not take the specified value, but may if necessary
• REQUIRED_NOT - this preference must not take the specified value
This allows the application a reasonable degree of freedom in specifying the configuration it wants, while still allowing the receiver to be flexible in matching the desires of the application with those of other applications and the constraints imposed by other screen device configurations. When the application calls the HGraphicsDevice.getBestConfiguration() method (or the equivalent method on another screen device), the receiver checks the preferences specified in the HScreenConfigTemplate and attempts to find a configuration that matches all the preferences specified as REQUIRED and REQUIRED_NOT while meeting the constraints specified by other applications. If this can be done, the method returns an HGraphicsConfiguration object representing the new configuration; if the constraints can't all be met, it returns null. This sounds horribly complex, but it's not as bad as it seems at first. The table below shows how these classes are used together:
Device              Configuration class         Configuration template
HGraphicsDevice     HGraphicsConfiguration      HGraphicsConfigTemplate
HVideoDevice        HVideoConfiguration         HVideoConfigTemplate
HBackgroundDevice   HBackgroundConfiguration    HBackgroundConfigTemplate
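To make the matching rules a little more concrete, here is a minimal plain-Java sketch of how one candidate configuration might be tested against a template. It deliberately does not use the HAVi classes: Priority, TemplateMatcher and the use of strings as preference names are all hypothetical stand-ins, and a real receiver also has to take the constraints of the other screen devices and other applications into account.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the five HAVi priority constants. */
enum Priority { REQUIRED, PREFERRED, UNNECESSARY, PREFERRED_NOT, REQUIRED_NOT }

/** Illustrative model only; this class is not part of org.havi.ui. */
class TemplateMatcher {

    /**
     * Returns true if a candidate configuration, described by the set of
     * preference names it provides, meets every REQUIRED and REQUIRED_NOT
     * entry in the template. Other priorities never reject a candidate.
     */
    static boolean satisfies(Map<String, Priority> template, Set<String> provided) {
        for (Map.Entry<String, Priority> entry : template.entrySet()) {
            boolean present = provided.contains(entry.getKey());
            Priority priority = entry.getValue();
            if (priority == Priority.REQUIRED && !present) {
                return false;
            }
            if (priority == Priority.REQUIRED_NOT && present) {
                return false;
            }
        }
        return true;
    }

    /** Counts the PREFERRED and PREFERRED_NOT entries a candidate honours. */
    static int softScore(Map<String, Priority> template, Set<String> provided) {
        int score = 0;
        for (Map.Entry<String, Priority> entry : template.entrySet()) {
            boolean present = provided.contains(entry.getKey());
            Priority priority = entry.getValue();
            if (priority == Priority.PREFERRED && present) {
                score++;
            }
            if (priority == Priority.PREFERRED_NOT && !present) {
                score++;
            }
        }
        return score;
    }
}
```

In this model, a candidate that fails satisfies() corresponds to the case where getBestConfiguration() has nothing to return, while softScore() is one possible way of ranking the candidates that remain. How a real receiver ranks them is up to the implementation.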
One thing that becomes apparent from this method of setting the graphics configuration is that the configuration may change while an application is running. Given that applications usually want to know when their display properties have changed, the HScreenDevice class allows applications to add themselves as listeners for HScreenConfigurationEvent events using the addScreenConfigurationListener() method. These events inform applications when the configuration of a device changes in a way that isn't compatible with the template that's specified as an argument to the addScreenConfigurationListener() method. The application can use this to find out when its display has changed in a way that means it needs to change its display settings, and react accordingly. Something this complex deserves a code example, so here's one that shows how to set the configuration on a graphics device:
    // First, get the HScreen. We'll just use the default HScreen
    // since we probably only have one. (All of the classes used
    // here are in the org.havi.ui package.)
    HScreen screen = HScreen.getDefaultHScreen();

    // Now get the default HGraphicsDevice, again because we'll
    // probably only have one of them
    HGraphicsDevice device;
    device = screen.getDefaultHGraphicsDevice();

    // Create a new template for the graphics configuration and
    // start setting preferences
    HGraphicsConfigTemplate template;
    template = new HGraphicsConfigTemplate();

    // We prefer a configuration that supports image scaling
    template.setPreference(
        HGraphicsConfigTemplate.IMAGE_SCALING_SUPPORT,
        HGraphicsConfigTemplate.PREFERRED);

    // We also need a configuration that doesn't affect
    // applications that are already running
    template.setPreference(
        HGraphicsConfigTemplate.ZERO_GRAPHICS_IMPACT,
        HGraphicsConfigTemplate.REQUIRED);

    // Now get a device configuration that matches our preferences
    HGraphicsConfiguration configuration;
    configuration = device.getBestConfiguration(template);

    // Finally, we can actually set the configuration. However,
    // before doing this, we need to check that our configuration
    // is not null (to make sure that our preferences could
    // actually be met).
    if (configuration != null) {
        try {
            device.setGraphicsConfiguration(configuration);
        } catch (Exception e) {
            // We will ignore the exceptions for this example
        }
    }
Background Configuration Issues
The background layer in an MHP receiver can display either a solid colour or a still image. An application can choose which of these two options to use by setting the appropriate preference in the HBackgroundConfigTemplate when getting a configuration. If an application chooses the still image configuration, then the configuration that's actually returned will be a subclass of HBackgroundConfiguration. This subclass, HStillImageBackgroundConfiguration, adds two extra methods over the standard HBackgroundConfiguration class:
    public void displayImage(
        HBackgroundImage image)
    public void displayImage(
        HBackgroundImage image,
        HScreenRectangle r)
Both of these methods take an HBackgroundImage as a parameter. HBackgroundImage is a class designed to handle images in the background layer, as its name suggests. Why do we need a separate class for this? After all, we've got a perfectly good java.awt.Image class.
The big problem is that the background image isn't part of the AWT component hierarchy, so the AWT Image class isn't really suitable. The HBackgroundImage class is designed as a replacement that doesn't have all the baggage of the AWT Image class, and only has the methods that are needed for displaying background images.
    public class HBackgroundImage {
        public HBackgroundImage(String filename)
        public void load(HBackgroundImageListener l)
        public void flush()
        public int getHeight()
        public int getWidth()
    }
We won't examine this class in too much detail because it's not very complex, but at the same time you should at least know of its existence and how to use it. The HBackgroundImage class takes the filename of either a GIF, JPEG, PNG or MPEG I-frame image as an argument to the constructor, and the image must be loaded and disposed of explicitly, using the load() and flush() methods respectively. This has the big advantage of allowing the application absolute control over when the image (which may be quite large) is resident in memory.
Device Configuration Gotchas
As we've seen, setting the configuration of a display device is a pretty complex business, and there are a number of things that developers need to be aware of when they are developing applications. The first of these is that the different shapes of video and graphics pixels may mean that the video and graphics are not perfectly aligned. Luckily this can be fixed by setting the VIDEO_GRAPHICS_PIXEL_ALIGNED preference in the configuration template, although there's no guarantee that a receiver can actually support this.
As if this wasn't enough, there are other factors that can cause display problems. Overscan in the TV can mean that 5% of the display is off-screen, so developers should stick to the 'safe' area, which does not include the 5% of the display near each of the screen edges. If you really want to put graphics over the entire display, be aware that some of them may not appear on the screen.
Another problem is that what the receiver outputs may not be what's actually displayed on the screen. Given that many modern TV sets allow the viewer to control the aspect ratio, the receiver may be producing a 4:3 signal that's being displayed by the TV in 16:9 mode. Even then, there are various options for how the TV will actually adjust the display to the new aspect ratio. While there's nothing that your application can do about this, because it won't even know, it's something that you as a developer may need to be aware of because even if the application isn't broadcast as part of a 16:9 signal now, it may be in the future.
Finally, as we've seen earlier, the graphics, video and background configurations are not completely independent, and changing one may have an effect on the others. Applications that care about the graphics or video configuration should monitor them, to make sure that the configuration stays compatible with what the application is expecting, and to be able to adapt when the configuration does change.
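The 5% overscan margin mentioned above is easy to turn into numbers. The following is a minimal sketch in plain Java, assuming that the margin is measured as 5% of the width and height respectively and rounded up to a whole pixel to stay on the cautious side. SafeArea is a hypothetical helper for illustration, not an MHP class.

```java
/** Illustrative helper only; SafeArea is not part of any MHP API. */
class SafeArea {
    final int x;
    final int y;
    final int width;
    final int height;

    SafeArea(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /**
     * Insets a screen of the given pixel size by 5% on every edge.
     * Margins are rounded up so the result never strays into the
     * overscan region.
     */
    static SafeArea of(int screenWidth, int screenHeight) {
        int marginX = (screenWidth * 5 + 99) / 100;   // ceiling of 5%
        int marginY = (screenHeight * 5 + 99) / 100;  // ceiling of 5%
        return new SafeArea(
            marginX,
            marginY,
            screenWidth - 2 * marginX,
            screenHeight - 2 * marginY);
    }
}
```

For a 720 x 576 display, SafeArea.of(720, 576) gives an origin of (36, 29) and a size of 648 x 518 pixels, which is the region where it should be safe to place text and important graphics.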
Coordinate systems in MHP
The three display layers in MHP don't just cause us problems with configuring them. MHP has three different coordinate systems, which are used to provide several different ways of positioning objects accurately on the screen. Typically, an MHP application that uses graphics to any great degree will need to be aware of all of them, even if it doesn't use them all. Different APIs will have different support for the various coordinate systems.
Normalized coordinates
The normalized coordinate system has its origin in the top left corner of the screen, and has a maximum value of (1, 1) in the bottom right corner. This is an abstract coordinate system that allows the positioning of objects relative to each other without specifying absolute coordinates. Thus, an application can decide to place an object in the very centre of the screen without having to know the exact screen resolution. Normalized coordinates are usually used by the HAVi classes. In particular, they are used by the HScreenPoint and HScreenRectangle classes.
Screen coordinates
The screen coordinate system also has its origin in the top left corner of the screen, and has its maximum X and Y coordinates in the bottom right of the screen. These maximum values are not fixed, and will vary with the display device. The MHP specification states that the maximum X and Y coordinates will be at least (720, 576), however. Effectively, the screen coordinate space is defined by the HGraphicsDevice. Screen coordinates are used by HAVi for positioning HScenes and for configuring the resolution of HScreenDevice objects.
AWT coordinates
The AWT coordinate system is the standard system used by AWT (but you'd probably guessed that from the name). Like the screen coordinate system, it is pixel-based, but instead of dealing with the entire screen, it deals with the root window for the application. The origin is in the top left corner of the AWT root container for the application (which, as we shall see below, is the HScene), and its maximum values are at the bottom right corner of that window.
This is a very rough guide to the coordinate systems used in MHP. If you are feeling very curious, the MHP specification describes the various coordinate systems and the relationship between them in more detail than is ever likely to be useful to you.
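As a rough illustration of how the three systems relate, here is a minimal sketch in plain Java. It assumes that a normalized value of 1.0 maps onto the full width or height of the screen in pixels, and that an AWT coordinate is simply a screen coordinate offset by the position of the application's root container. Coordinates and its method names are hypothetical; the precise mapping and rounding rules are the ones given in the MHP specification.

```java
/** Illustrative helper only; not part of the HAVi or MHP APIs. */
class Coordinates {

    /** Converts a normalized coordinate (0.0 to 1.0) to a screen pixel. */
    static int toScreen(float normalized, int screenSize) {
        return Math.round(normalized * screenSize);
    }

    /** Converts a screen pixel back to a normalized coordinate. */
    static float toNormalized(int screenPixel, int screenSize) {
        return (float) screenPixel / screenSize;
    }

    /**
     * Converts a screen pixel to an AWT coordinate, given the screen
     * position of the origin of the application's root container.
     */
    static int toAwt(int screenPixel, int containerOrigin) {
        return screenPixel - containerOrigin;
    }
}
```

On a 720 x 576 screen, the normalized centre (0.5, 0.5) becomes the screen coordinate (360, 288). If the application's root container starts at screen position (100, 50), that same point is (260, 238) in AWT coordinates.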
HScenes and HSceneTemplates
Since set-top boxes usually have such limited memory and CPU power, it's not very likely that a set-top box will run a fully-fledged window manager. What this means for MHP developers is that you can't use the java.awt.Frame class as a top-level window for applications, because this class is simply too dependent on a window manager to work without one. Instead, we use the org.havi.ui.HScene class. This defines something conceptually similar to a Frame, but without all the extra baggage that a Frame has associated with it. It also has a few security limitations that the Frame class doesn't. The biggest one of these is that an application cannot see more of the AWT hierarchy than its own HScene and any components in it. HScenes from other applications are not visible to it, and other applications can't see the HScene belonging to this application. Thus, every application is almost totally isolated from the graphical activities of any other application. We'll see what almost means in a moment.
In the diagram above, the application that owns the green components cannot manipulate the white or cyan components, since these belong to other applications. Indeed, the application won't even know that these components exist. One of the other differences between an HScene and a Frame is that an application may only create one HScene - any attempts to create an HScene before the current one is disposed of will fail. While this may seem strange at first, it makes sense if you remember that the receiver probably has no window manager - after all, how many applications need more than one top-level window? The only case where an application is allowed to have more than one HScene is where the HScenes appear on different HScreens (i.e. on different display devices).
The HScene also lets us control TV-specific functionality that a Frame doesn't cover, such as the aspect ratio of the screen and issues related to blending and transparency between the graphics layer and the other layers of the display. Given these differences, and the amount of information that we actually need to specify to set up an HScene the way we want it, we can't simply create one in the same way that we'd create a Frame. Instead, we use the org.havi.ui.HSceneFactory class to create one for us. This allows the runtime system to make sure that the HScene we get is as close to our requirements as it can be, while at the same time meeting all the limitations of the platform. For instance, some platforms may not allow HScenes to overlap on screen. In this case, if you request an HScene that overlaps with an existing one, that request can't be met and so your application will not get the HScene that it requested. We tell the HSceneFactory what type of HScene we want using almost the same technique that was used for screen devices: a template.
The HSceneTemplate allows us to specify a set of constraints on our HScene, such as size, location on screen and a variety of other factors (see the interface below), and it also lets us specify how important those constraints are to us. For instance, we may need our HScene to be a certain size so that we can fit all our UI elements into it, but we may not care so much where on screen it appears. To handle this, each property on an HScene has a priority: either REQUIRED, PREFERRED or UNNECESSARY. This lets us specify the relative importance of the various properties.
    public class HSceneTemplate extends Object {
        // priorities
        public static final int REQUIRED
        public static final int PREFERRED
        public static final int UNNECESSARY

        // possible preferences that can be set
        public static final Dimension LARGEST_DIMENSION
        public static final int GRAPHICS_CONFIGURATION
        public static final int SCENE_PIXEL_RESOLUTION
        public static final int SCENE_PIXEL_RECTANGLE
        public static final int SCENE_SCREEN_RECTANGLE

        // methods to set and get priorities
        public void setPreference(
            int preference,
            Object object,
            int priority)
        public Object getPreferenceObject(int preference)
        public int getPreferencePriority(int preference)
    }
Properties that are UNNECESSARY will be ignored when the HSceneFactory attempts to create the HScene, and PREFERRED properties may be ignored if that's the only way to create an HScene that fits the other constraints. However, REQUIRED properties will never be ignored - if the HSceneFactory can't create an HScene that meets all the REQUIRED properties specified in the template while still meeting all the other constraints that may exist, the calling application will not be given an HScene.
So now that we've seen the mechanism that we need to use to get an HScene, how do we actually do it? The HSceneFactory.getBestScene() method takes an HSceneTemplate as an argument, and will return an HScene if the constraints in the HSceneTemplate can be met, or null otherwise. Once you have the HScene, it can be treated just like any other AWT container class. The only real difference is that you have to explicitly dispose of an HScene when you're done with it, so that any resources that it holds can be freed up. If we look at the HSceneFactory again in some more detail, we can see that there are a few other methods that allow us to manipulate HScenes:
    public class HSceneFactory extends Object {
        public static HSceneFactory getInstance()
        public HSceneTemplate getBestSceneTemplate(
            HSceneTemplate hst)
        public HScene getBestScene(HSceneTemplate hst)
        public void dispose(HScene scene)
        public HSceneTemplate resizeScene(
            HScene hs,
            HSceneTemplate hst)
            throws java.lang.IllegalStateException
        public HScene getSelectedScene(
            HGraphicsConfiguration selection[],
            HScreenRectangle screenRectangle,
            Dimension resolution)
        public HScene getFullScreenScene(
            HGraphicsDevice device,
            Dimension resolution)
    }
We've already seen the dispose() and getBestScene() methods, and some of the other methods are self-explanatory. The getBestSceneTemplate() method allows an application to 'negotiate' for available resources - the application calls this method with an HSceneTemplate that describes the HScene it would ideally like, and the HSceneTemplate that gets returned describes the closest match that could fit the input template. Of course, since an MHP receiver can have more than one application running at the same time, there's no guarantee that an HScene can be created that exactly matches this 'best' HSceneTemplate, since another application may create an HScene or do something else that changes the graphics configuration in an incompatible way.
High-Level Graphics Issues
Now that we've seen the lower-level graphics issues we can discuss some of the higher level issues relating to graphics and user interfaces in MHP. The first thing to cover is the image formats that are supported by MHP. MHP supports the following four image formats:
• GIF
• JPEG
• PNG
• MPEG I-frame
In addition to this, it supports both the DVB subtitle and DVB teletext formats for subtitles (yes, this is confusing). The image formats can be used with any java.awt.Image object, just like you'd expect. To manipulate the subtitles, however, you need to use the org.davic.media.SubtitlingLanguageControl. This is a Java Media Framework control (see the section on JMF for information about JMF controls) that allows you to turn subtitles on or off, query their status and select the language of the subtitles. This is an optional control, however, so many receivers may not support it.
Colours and transparency in MHP
As I mentioned at the start of this section, one of the differences between Java graphics and TV signals is that Java graphics use the RGB colour system while TV signals use the YUV system. You'll be pleased to hear that all conversion is handled within the MHP implementation, and that developers don't need to care about this.
However, there is one colour-related issue that you should be aware of. Given that an MHP receiver is primarily a video-based device (and I just know that someone will probably disagree with this, but I'll say it anyway), it's generally useful for the graphics layer to be able to contain elements that are transparent through to the video layer so that the video can be seen through it. Since MHP is based on the Java 1.1 APIs, though, there is no default support in AWT for this. MHP allows colours to be transparent to the video layer by using a mechanism similar to the java.awt.Color class in Java 2. The class org.dvb.ui.DVBColor adds an alpha value to the java.awt.Color class from JDK 1.1. While the alpha (transparency) level is an eight-bit value, only fully transparent, 50% transparent and fully opaque must be supported by the receiver. Other levels may be supported, but this can't be guaranteed.
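An application that wants predictable results on every receiver can restrict itself to the three guaranteed transparency levels. The following is a minimal sketch in plain Java of snapping an arbitrary eight-bit alpha value to the nearest of those levels. It assumes the java.awt.Color convention that 0 is fully transparent and 255 is fully opaque, and it takes 128 as the eight-bit value for 50%. GuaranteedAlpha is a hypothetical helper, and a receiver's own handling of unsupported levels may well differ.

```java
/** Illustrative helper only; not part of the org.dvb.ui package. */
class GuaranteedAlpha {
    static final int TRANSPARENT = 0;
    static final int HALF = 128;    // assumption: eight-bit value used for 50%
    static final int OPAQUE = 255;

    /** Returns whichever guaranteed level is nearest to the given alpha. */
    static int snap(int alpha) {
        if (alpha < 0 || alpha > 255) {
            throw new IllegalArgumentException("alpha out of range: " + alpha);
        }
        if (alpha < 64) {
            return TRANSPARENT;     // closer to 0 than to 128
        }
        if (alpha < 192) {
            return HALF;            // closer to 128 than to 0 or 255
        }
        return OPAQUE;
    }
}
```

The snapped value can then be used as the alpha component when the colour is created, so that the application never asks for a level that the receiver is not obliged to support.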
The MHP Widget Set
Something else that we mentioned earlier is the fact that the AWT package in MHP has been cut down dramatically from the desktop AWT version that developers are familiar with. The MHP version of AWT supports 'lightweight' components only (i.e. those with no peer classes), which rules out things like java.awt.Button, java.awt.Dialog and java.awt.Menu. The HAVi Level 2 GUI that is used as part of MHP (the org.havi.ui package that we've already seen) provides a set of widgets that can be used instead of the missing AWT ones. The widgets provided by the HAVi API are the standard set that you'd expect to find in any GUI system:
• buttons, check-boxes and radio buttons
• icons
• scrollable lists
• dialog boxes (these are a subclass of HContainer, the HAVi replacement for the java.awt.Container class)
• text entry fields
Of course, this is not a comprehensive list, and there are others. Check the org.havi.ui package for details of the entire widget set. One of the problems that application designers have with any standard widget set, of course, is that it enforces a standard look and feel. While a standard look and feel is fine (and generally considered essential) for a PC application, on a TV it's not so important or so useful. In a TV environment, the interaction model is usually simpler; also, the emphasis is less on efficiency and more on fun and differentiating your application from every other application.
To help content developers differentiate their applications from everyone else's, the HAVi API allows the appearance of the widget set to be changed fairly easily. The org.havi.ui.HLook interface and the classes that implement it allow an application to override the paint() method for certain GUI classes without having to define new subclasses for the GUI elements that will get the new look. HLook objects can only be used with the org.havi.ui.HVisible class and its subclasses, but this still offers a fairly large selection of possibilities. The HLook.showLook() method will get called by the paint() method of HVisible and its subclasses, so applications themselves do not need to call it. Effectively, all the HLook does is override those methods that are actually related to drawing the GUI element. An HLook instance is attached to an HVisible object using the HVisible.setLook() method. This takes an HLook object as an argument, and simply sets the HVisible to use that HLook object instead of using the internal implementation of the paint() method.
The most popular way for applications to provide their own look and feel so far, however, is for applications to simply provide their own widgets. It's not clear whether this is because developers are still unfamiliar with the HAVi API set, or whether there is a real benefit to rolling your own widgets. The big advantage seems to be for those applications that don't require complex widgets, but then using these complex widgets on a system where you may only have a TV remote control for user input is a potential usability nightmare.
User input event handling
Most MHP receivers will receive user input from a remote control. While input from a keyboard (and possibly a mouse) may be available, it's not very likely that this will be used in most cases. The MHP receiver defines the following key events that may be generated by a remote:
Constant name       Key                                   Key code (if standardized)
VK_UP               up arrow                              not standardized
VK_DOWN             down arrow                            not standardized
VK_LEFT             left arrow                            not standardized
VK_RIGHT            right arrow                           not standardized
VK_ENTER            enter (also known as select or OK)    not standardized
VK_0 to VK_9        number keys                           48 - 57
VK_TELETEXT         teletext key                          459
VK_COLORED_KEY_0    first coloured key                    403
VK_COLORED_KEY_1    second coloured key                   404
VK_COLORED_KEY_2    third coloured key                    405
VK_COLORED_KEY_3    fourth coloured key                   406
Key codes for VK_UP, VK_DOWN, VK_LEFT, VK_RIGHT and VK_ENTER are all defined by the Java platform. While the remote may generate other key
codes as well, these are the only ones that are actually guaranteed to be available on every receiver. Relying on others is a little dangerous. The full list of defined key events can be found in the org.havi.ui.event.HRcEvent class in the MHP specification.
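Since only these keys are guaranteed, an application may want to check whether a key it depends on is in the guaranteed set. Here is a minimal sketch in plain Java that does this using the key codes from the table above. RemoteKeys is a hypothetical helper; the arrow, enter and number key constants are taken from the standard java.awt.event.KeyEvent class, while the teletext and coloured key codes are copied from the table (in a real MHP application they would come from org.havi.ui.event.HRcEvent instead).

```java
import java.awt.event.KeyEvent;

/** Illustrative helper only; not part of the MHP APIs. */
class RemoteKeys {
    // Values copied from the key code table in this section
    static final int VK_COLORED_KEY_0 = 403;
    static final int VK_COLORED_KEY_3 = 406;
    static final int VK_TELETEXT = 459;

    /** Returns true if the key code is one of the guaranteed remote keys. */
    static boolean isGuaranteed(int keyCode) {
        // Number keys: VK_0 to VK_9 (48 - 57)
        if (keyCode >= KeyEvent.VK_0 && keyCode <= KeyEvent.VK_9) {
            return true;
        }
        // The four coloured keys (403 - 406)
        if (keyCode >= VK_COLORED_KEY_0 && keyCode <= VK_COLORED_KEY_3) {
            return true;
        }
        switch (keyCode) {
            case KeyEvent.VK_UP:
            case KeyEvent.VK_DOWN:
            case KeyEvent.VK_LEFT:
            case KeyEvent.VK_RIGHT:
            case KeyEvent.VK_ENTER:
            case VK_TELETEXT:
                return true;
            default:
                return false;
        }
    }
}
```

A user interface that only reacts to keys for which isGuaranteed() returns true should be usable with any remote, whereas one that depends on other keys needs a fallback for receivers that never generate them.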
Keyboard events and input focus
The AWT event model offers a pretty good basis for user input in MHP (not least because of its familiarity). In the conventional AWT event model, a component may only receive keyboard events when it has user input focus. In an environment such as MHP, this restriction can cause some problems. Imagine the case of an EPG that only pops up when a particular button is pressed. If this application wanted to maintain an invisible AWT component on the screen to detect any AWT events, it would have to be contained in an HScene. As we saw in the section on graphics, some platforms don't allow HScenes to overlap on the screen. This means that the EPG could be restricting the space available to other applications, even though it had no real visible component on screen. Since this is not very polite, having no components in the AWT hierarchy is a far better idea. Even if we had the invisible component, that component would need focus to get the event anyway. All in all, this is less than useful.
So, we need an API to get user input events without having to use AWT. The org.dvb.event package defines an API for allowing applications to access events before they enter the AWT event mechanism. A class that implements the UserEventListener interface can receive input events even though that application does not have user input focus. The events that get generated by this API are not AWT events, but are instead instances of the UserEvent class. This class covers more than just keyboard events in theory - at present, however, it only supports keyboard events, although more classes of events may be added later. The information about which key was pressed and what modifiers (shift, CTRL, ALT or META keys, for example) are active can be obtained and the event acted upon just like an AWT event. Before an application can receive user events, it must define a UserEventRepository that defines the group of events that the application wishes to receive.
As we can see from the API for this class, it contains methods to allow various combinations of keys to be added and removed:
    public class UserEventRepository {
        public UserEventRepository(String name)
        public void addUserEvent(UserEvent event)
        public UserEvent[] getUserEvent()
        public void removeUserEvent(UserEvent event)
        public void addKey(int keycode)
        public void removeKey(int keycode)
        public void addAllNumericKeys()
        public void addAllColourKeys()
        public void addAllArrowKeys()
        public void removeAllNumericKeys()
        public void removeAllColourKeys()
        public void removeAllArrowKeys()
    }
The OverallRepository is a subclass of UserEventRepository that contains all the user events. Once the application has defined the set of keys it wishes to receive events for, it can use the EventManager class to request access to these events. This class is a singleton, and its instance can be obtained by calling the EventManager.getInstance() method. Once an instance has been obtained, the addUserEventListener() and removeUserEventListener() methods can be used to add and remove listeners for the set of events defined in a UserEventRepository object. Access to these events is considered a scarce resource (we will see why below), and so an application may lose access to them and stop receiving them. Since the EventManager class implements the org.davic.resources.ResourceServer interface, an application that cares about this can subscribe to ResourceStatusEvents from the event manager to discover when it has lost access to user input events.
Exclusive access to keyboard events
This approach does bring some problems, however. When entering sensitive information (e.g. credit card numbers or passwords), allowing other applications to receive these keyboard events is a Very Bad Thing. For this reason, applications can request that they are allowed exclusive access to specific keyboard events. The EventManager class has two methods that we haven't discussed yet. These are addExclusiveAccessToAWTEvent() and removeExclusiveAccessToAWTEvent(), and as their names suggest they allow an application to request exclusive access to a set of events. Unlike non-exclusive user input events, exclusive events are received using the standard event mechanism from AWT. The addExclusiveAccessToAWTEvent() method takes a UserEventRepository and an org.davic.resources.ResourceClient as arguments - the first of these contains the set of events for which exclusive access is requested, while the second indicates which part of the application should be notified when exclusive access is revoked. When an MHP receiver gets a user input event, it has to decide which application (or applications) should get that event. This flowchart illustrates just how incoming events are routed to the appropriate applications:
Since the application can request exclusive access to user input events, these are treated like a scarce resource (since by definition, only one application can have exclusive access to a given user input event). The exact way in which this resource (or actually, resources) is managed is platform-dependent. Any application that requests exclusive access to user input events must take into account the fact that it could lose that exclusive access at any time.
Media control APIs
MHP has two APIs that can be used for controlling the display of video signals and choosing which signals to actually present to the user. The first of these is the Java Media Framework (JMF), with a few modifications and a lot of extensions. This is mainly used for controlling individual media clips. The second API that we have is the JavaTV service selection API. This is designed to let applications switch entire services, so using it to manipulate a video or audio stream is a little like using a sledgehammer to crack a nut, and is equally dangerous. Using this API involves a very real risk that your application may not survive the experience. Finally, and only tangentially related to media presentation, there is the tuning API. The tuning API provides applications with a way to access another transport stream, mainly so that an application can access data-only services that are on a different transport stream.
Locators, More locators and Yet More Locators
Before we start looking at how we can manipulate and control media in MHP, there's one important topic that needs covering first. Most of the media- and transport stream-related APIs in MHP use a concept called a locator to refer to a particular piece of media. Locators are opaque objects that do not directly expose their reference to the content. Internally, these references may be pointers, numbers or anything else that the implementation wishes to use. However, all locators have a human-readable format (an external form) that can be used, and it's probably no surprise that this format is a URL. Of course, a new URL format had to be defined for referring to DVB services, and we'll see what this looks like below.
Now, it's confession time. Locators in MHP are a bit of a mess. Even worse, they're a mess that I'm partially responsible for. Sorry folks - this really was the best way of handling this situation that the various people who were involved could think of. Most sensible implementations will make things a little easier internally by making the various locator classes relate to one another pretty closely, but it's still ugly work to use them across APIs. Just about everything ends up getting passed around as a string. Why are locators such a mess? First of all, there are so many different flavours of them:
• javax.media.MediaLocator
• org.davic.media.MediaLocator
• javax.tv.locator.Locator
• org.davic.net.Locator
• org.davic.net.dvb.DvbLocator
Some of these are subclasses of the others, and they can be split into three main class hierarchies: javax.media.MediaLocator and subclasses, javax.tv.locator.Locator (which currently has no subclasses) and org.davic.net.Locator and its subclasses.
Each of these is used to refer to some media, but they are all used in subtly different ways. For instance, a javax.media.MediaLocator is used by the Java Media Framework to refer to a media clip that will be presented, and so this may refer to either a DVB service (or service component), an audio clip or a video drip (we'll see more about all of these later).
org.davic.net.Locator and its subclasses are designed to refer to DVB transport streams or services. javax.tv.locator.Locator is a more general locator defined by JavaTV that can refer to any piece of digital TV content (either a service, a transport stream or a file in a broadcast filesystem). Unlike the org.davic.net.Locator class, this may refer to any type of digital TV system, not just a DVB-based system. In the case of an MHP receiver, however, JavaTV locators refer to DVB transport streams and services just like the org.davic.net.Locator.

One of the major problems in MHP is that all these locator classes may be mutually incompatible - there are a number of times when you will want to convert a locator from one format to another, and unfortunately there is no easy way of doing this. The only reliable way of converting between them is to convert the locator to a URL string using the toExternalForm() method, and then create a new locator of the desired type from that string. While it's not too painful, it is annoying and it's something that most application developers would prefer not to have to do.

Creating locators

Some of these locator types (javax.media.MediaLocator, org.davic.net.Locator and their subclasses) can be created directly from a URL. The org.davic.net.Locator class also has a number of other constructors that allow the various components of the locator to be specified directly. The exact nature of these components, and the URL formats for these locators, are introduced in the next section. Instances of javax.tv.locator.Locator, on the other hand, can only be created by using the javax.tv.locator.LocatorFactory class. This factory class provides the createLocator() method, which takes a URL of the same format as the previous Locator classes.
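The conversion described above can be sketched in a few lines. This is only an illustration that runs on a desktop JDK: SketchTvLocator and SketchMediaLocator are hypothetical stand-ins for two unrelated locator hierarchies, not the real MHP classes. The only thing taken from the text is the technique itself: get the external form as a string, then build a locator of the other type from that string.

```java
/** Hypothetical stand-in for a locator from one hierarchy (not an MHP class). */
final class SketchTvLocator {
    private final String url;
    SketchTvLocator(String url) { this.url = url; }
    /** Returns the human-readable (URL) form of this locator. */
    String toExternalForm() { return url; }
}

/** Hypothetical stand-in for a locator from a second, unrelated hierarchy. */
final class SketchMediaLocator {
    private final String url;
    SketchMediaLocator(String url) { this.url = url; }
    String toExternalForm() { return url; }
    /** The part of the URL before the first ':' (for example "dvb"). */
    String getProtocol() { return url.substring(0, url.indexOf(':')); }
}

final class LocatorConversionSketch {
    /** Converts by round-tripping through the external (string) form. */
    static SketchMediaLocator toMediaLocator(SketchTvLocator source) {
        return new SketchMediaLocator(source.toExternalForm());
    }

    public static void main(String[] args) {
        SketchTvLocator tv = new SketchTvLocator("dvb://123.456.789");
        SketchMediaLocator media = toMediaLocator(tv);
        System.out.println(media.toExternalForm());
    }
}
```

On a receiver the same pattern would apply to the real classes, with the string returned by toExternalForm() passed to the constructor or factory of the target locator type.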
Content Formats in MHP
The media-related APIs in MHP refer to three main content formats. This section will hopefully give you a little insight into them, since they are quite different from the content formats that you may be familiar with. It's important to remember, however, that many APIs will only be able to accept locators to some of these content formats - usually DVB services. Only the Java Media Framework can handle all of the content formats that are listed here.

DVB Services

Many of the APIs in MHP refer to DVB services. As we're already pretty familiar with the format of DVB services, we'll concentrate here on the DVB locator format. Locators that refer to DVB services are created using the DVB URL format:
    dvb://<onID>.<tsID>.<sID>[.<ctag>[&<ctag>]][;<evID>][<path>]
Where the various parts of the URL are as follows:

• onID - The Original Network ID, which identifies the broadcaster or network that produced the content (not the network currently broadcasting it, if they are different)
• tsID - The Transport Stream ID, which identifies a specific transport stream that the network is broadcasting
• sID - The Service ID, which refers to a service within that transport stream
• ctag - The Component Tag, which refers to a specific elementary stream that has been tagged in the service information
• evID - The Event ID, which identifies a specific event that is part of the service
• path - The path to a file in a broadcast filesystem that's being transmitted on that elementary stream

All of these components (except for the path component) are represented as hexadecimal values without the leading 0x. The path component uses the standard URL path format as defined in RFC 2396. All of the numeric identifiers used for the various components of the URL match the identifiers used in the DVB service information that is part of the transport stream. Some example DVB URLs are:

    dvb://123.456.789 (identifies a DVB service)
    dvb://123.456.789.42 (identifies an elementary stream within a service)
    dvb://123.456.789;66 (identifies a DVB event)
    dvb://123.456.789/images/logo.gif (identifies a file in an object carousel)
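To make the URL format concrete, here is a minimal sketch of a parser that splits a DVB URL into the components listed above. DvbUrlSketch is a hypothetical helper written for a desktop JDK; it is not part of MHP, and a real receiver would use the locator classes instead. It assumes the optional parts appear in the order component tags, event ID, path, and that a path always starts with a slash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an MHP class): splits a DVB URL into its parts. */
final class DvbUrlSketch {
    private static final String HEX = "[0-9a-fA-F]+";
    // Three mandatory IDs, then optional component tags, event ID and path.
    private static final Pattern DVB_URL = Pattern.compile(
        "dvb://(" + HEX + ")\\.(" + HEX + ")\\.(" + HEX + ")"
        + "(?:\\.(" + HEX + "(?:&" + HEX + ")?))?"   // component tag(s)
        + "(?:;(" + HEX + "))?"                       // event ID
        + "(/.*)?");                                  // path

    final int onId, tsId, sId;
    final List<Integer> componentTags = new ArrayList<>();
    final int eventId;   // -1 when the URL has no event ID
    final String path;   // null when the URL has no path

    DvbUrlSketch(String url) {
        Matcher m = DVB_URL.matcher(url);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a DVB URL: " + url);
        }
        // All numeric components are hexadecimal without a leading 0x.
        onId = Integer.parseInt(m.group(1), 16);
        tsId = Integer.parseInt(m.group(2), 16);
        sId = Integer.parseInt(m.group(3), 16);
        if (m.group(4) != null) {
            for (String tag : m.group(4).split("&")) {
                componentTags.add(Integer.parseInt(tag, 16));
            }
        }
        eventId = (m.group(5) == null) ? -1 : Integer.parseInt(m.group(5), 16);
        path = m.group(6);
    }

    public static void main(String[] args) {
        DvbUrlSketch u = new DvbUrlSketch("dvb://123.456.789/images/logo.gif");
        System.out.println(u.onId + " " + u.tsId + " " + u.sId + " " + u.path);
    }
}
```

Because the values are hexadecimal, the service ID in dvb://123.456.789 is 0x789, not decimal 789.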
As we can see, only the first three elements are required. This means that a DVB URL will always refer to a service, although in some cases (most notably the org.davic.net.dvb.DvbLocator class when used in certain APIs) it is possible that the service ID will be ignored and that a locator created from the URL will refer to a transport stream. The tuning API is the case where this is most obvious.

MPEG audio clips

This content type is pretty straightforward, and should be familiar to most developers. It's simply a piece of MPEG-1 audio that is loaded from a file. This file will typically be stored in a DSM-CC object carousel. A standard file:// URL is used to refer to the clip - this makes things slightly simpler conceptually, but it does mean that the clip must reside on a DSM-CC object carousel that the receiver is already connected to.

Video 'drips'

This is a new content format that's pretty much unique to the digital TV world. The main aim of this format is to provide a memory-efficient way of displaying several similar images. Basically, it is a very short piece of MPEG-2 - the first thing in the file is an MPEG-2 I frame that can be decoded and presented to the user. This is followed by one or more P frames, which are then decoded based on the preceding I frame. This allows the decoder to update a static image in a very memory-efficient way. In this format, the data is passed to the Java Media Framework via an array of bytes, so the content is already loaded from a file or other data source. Because of this, the format doesn't really have a locator that identifies the place the data is loaded from. A URL is still needed, though, to create locators and create the appropriate objects to decode this content format. For this reason, the entire URL for every piece of video 'drip' content is
dripfeed://
Believe it or not, this is a valid URL, and this tells the locator what the content format is, but without needing to refer to a specific piece of data. We'll see more about how this works in the next section.
The Java Media Framework
The Java Media Framework (JMF) was chosen for use in MHP mainly because it was a standardized API for media control that already existed, and DVB doesn't like re-inventing the wheel unless it has to. Although the API hasn't changed much from Sun's original JMF specification, there are some new restrictions on what does and doesn't work, and on the return values from some of the methods. These were needed because JMF tends to focus on playing media where it has complete control of the media source, and this just isn't the case in a broadcast environment. For instance, a normal JMF-based application that's playing a media clip from a hard disk can pause the playback and start it later, choose which point in the stream to start the playback from, and fast-forward and rewind through the clip. With a broadcast stream, this just isn't possible because the receiver must deal with whatever data is currently being broadcast. In a broadcast environment, if you pause the video stream and un-pause it after a few seconds, the video will jump to what's currently being broadcast and not start playing from where it stopped.
This is only one of the places where there is a fairly major change in philosophy between the PC and broadcast worlds. To be honest, JMF probably wasn't the ideal choice for a media control API given its strong focus on PC-based media, but it was a lot easier to use JMF than to define something new, and there are an awful lot of people who already know how to use JMF. If you're not one of those people, the rest of this part is a fairly gentle introduction to JMF. If you are familiar with JMF, you can skip to the next section, which describes the DVB-specific JMF extensions. The JMF API lives in the javax.media package, although the MHP-specific extensions are located in a couple of other packages that we'll see later. MHP uses version 1.1 of the JMF API - this has some substantial differences from the JMF 2.0 API that is standard at the time of writing. Mostly, JMF 1.1 only handles playback and does not cover recording, which makes the overall API a lot simpler and smaller. JMF has three major concepts that you need to understand before you can really use it effectively. The most important element of JMF is a Player. This is the class that is actually responsible for decoding and playing the media. Every Player object has a set of zero or more Control objects associated with it. A JMF Control is an extension to the Player object that allows new functionality to be added to a player without having to create a subclass. For instance, controls are typically used to provide things like freeze-frame functionality, or language choice, on top of the built-in functions of the Player class. The final element is the DataSource class. This is the class that actually gets the media data that the player will decode. Why is this separate from the player? Well, this design enables a player to receive data from more than one source without having to handle all the different sources itself. 
The DataSource class can be considered a little like the unified I/O model on a Unix platform, where everything appears like a file and can be manipulated with file operations. The application (and the player) can manipulate the DataSource object and get the media data through one single interface, without having to care if that data is coming from memory, a local hard disk, an HTTP connection or something even more exotic. A Player always has an associated DataSource object - it is not possible to create a player that does not have a data source attached to it.
The diagram above (not based on what is supported in MHP) shows how these relationships work. On the left, we have four possible data sources that we can choose from. These can load data from an HTTP connection, a local file, from memory or from an MPEG stream respectively. The byte stream that we will get out of these data sources will be identical - an application could use any data source and get data in the same way using the same interface. Only the way that the data was loaded from would change. The data source then sends its data to a player. In this example, we have three players that we can choose from. One handles MPEG-2 data, the second handles QuickTime movie clips and the final one handles MP3 audio data. Finally, we have two controls available. The freeze-frame control is only applicable to those players that handle video, which in this case is the MPEG2 and QuickTime players. The volume control, on the other hand, is applicable to all three players. Now that we've seen the overall architecture, let's take a look at how this actually fits with the API. The javax.media.Manager class is the main entry point for any application wishing to use the JMF API.
    public class Manager {
        public static Player createPlayer(URL sourceURL)
            throws IOException, NoPlayerException;
        public static Player createPlayer(MediaLocator sourceLocator)
            throws IOException, NoPlayerException;
        public static Player createPlayer(DataSource source)
            throws IOException, NoPlayerException;

        public static DataSource createDataSource(URL sourceURL)
            throws IOException, NoDataSourceException;
        public static DataSource createDataSource(MediaLocator sourceLocator)
            throws IOException, NoDataSourceException;

        public static TimeBase getSystemTimeBase();
        public static Vector getDataSourceList(String protocolName);
        public static Vector getHandlerClassList(String contentName);
    }
The most interesting methods are the createPlayer() and createDataSource() methods. The two versions of the createDataSource() method both create a new DataSource instance that will fetch data from the location referred to by the URL or the MediaLocator instance. Similarly, the createPlayer() method creates a new Player object that can be used to play a media clip. The only difference between this method and the createDataSource() method is that a player can be created from a DataSource as well as from a URL or a locator.
The player creation process
Given that a JMF implementation can have multiple data sources, and players to handle multiple media types, how do we get from a URL or a locator to a player that's presenting a specific type of media from a specific source? Let's consider the process of creating a Player from a URL. We'll take a URL as our example, because in the end a locator also maps on to a URL. First, the application calls the static Manager.createPlayer() method. In this case, let's assume it calls it with the following URL:
    http://www.example.net/media/SomeContent.mp3

• The first thing that happens is that the Manager examines the protocol part of the URL. The protocol indicates the data source that is needed to access this content, and the Manager uses this to construct the class name of the DataSource object that it needs to create. This is calculated as

    <class prefix>.<protocol>.DataSource

  where in this case, the protocol is http. The class prefix is simply a fixed string that points to the top of the class hierarchy containing the data source implementation classes. So, if our class prefix was com.stevem.media.protocol, the resulting class name would be

    com.stevem.media.protocol.http.DataSource

• Once the class name has been created, the class is loaded and instantiated. The DataSource object will be used to get the media data that will be presented.
• Once the DataSource is loaded and instantiated, the Manager uses this data source to connect to the location specified in the URL (we will cover this process in more detail below).
• Once a connection has been established, the Manager uses the getContentType() method on the data source to find the MIME content type of the data that the data source is connected to.
• The MIME content type is then used to construct the class name for the Player that will be loaded. This is very similar to the process used to create the class name for the data source. The class name takes the form

    <class prefix>.<MIME type>.<MIME subtype>.Player

  So, if the class prefix was com.stevem.media.players and the MIME content type was audio/mp3 (since we've connected to a URL for an MP3 file), the resulting class name for the player is

    com.stevem.media.players.audio.mp3.Player

• Now that the Manager has constructed the class name for the Player object, it loads the class and instantiates it.
• Once the player is instantiated, the Manager calls the Player.setSource() method to associate the data source with the player.
• The Manager now has a completely instantiated player and data source, and the player is returned to the application.
This appears quite complex, but it's really not. The most complex part is finding the MIME content type for the content referred to by the URL, but in many cases in an MHP receiver a protocol will only provide access to one type of content and so this is hard-coded in the DataSource classes.
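The class name construction in the steps above can be sketched as plain string handling. JmfClassNameSketch is a hypothetical helper for illustration, not the code of any real Manager; the class prefixes are the example values used in the text, and the final step (loading by name and instantiating reflectively) is shown with a standard JDK class because the example data source and player classes do not exist here.

```java
/** Hypothetical helper showing how the class names in the text are derived. */
final class JmfClassNameSketch {

    /** "<class prefix>.<protocol>.DataSource", protocol taken from the URL. */
    static String dataSourceClassName(String classPrefix, String url) {
        int colon = url.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("No protocol in URL: " + url);
        }
        String protocol = url.substring(0, colon);
        return classPrefix + "." + protocol + ".DataSource";
    }

    /** "<class prefix>.<type>.<subtype>.Player", e.g. audio/mp3 -> audio.mp3. */
    static String playerClassName(String classPrefix, String mimeType) {
        return classPrefix + "." + mimeType.replace('/', '.') + ".Player";
    }

    /** Loads a class by name at runtime and creates an instance of it. */
    static Object instantiate(String className) throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        String url = "http://www.example.net/media/SomeContent.mp3";
        System.out.println(dataSourceClassName("com.stevem.media.protocol", url));
        System.out.println(playerClassName("com.stevem.media.players", "audio/mp3"));
        // Stand-in for loading the data source or player class by name.
        System.out.println(instantiate("java.lang.StringBuilder").getClass().getName());
    }
}
```

Since the class to load is only known at runtime, it has to be created through a no-argument constructor, which is why the locator and data source are supplied afterwards through setter methods.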
A closer look at data sources
As you'll have noticed from the steps above, there are a few steps that we glossed over during the player creation process. Probably one of the most important of these is the actual mechanics of the data source. The other is the mechanics of the player, which we'll see in the next section. First, let's take a closer look at the DataSource class interface:
    public abstract class DataSource {
        public DataSource();
        public DataSource(MediaLocator source);

        public void setLocator(MediaLocator source);
        public MediaLocator getLocator();

        public abstract String getContentType();

        public abstract void connect() throws IOException;
        public abstract void disconnect();

        public abstract void start() throws IOException;
        public abstract void stop() throws IOException;
    }
This isn't very complex, but it's important to understand it. The first thing to notice is the two constructors - while a DataSource object can be constructed using a MediaLocator directly, this is not the usual way that it gets created. Instead, the Manager normally ends up calling Class.newInstance() to instantiate this object (since we don't actually know what class we're loading until runtime). For this reason, we have separate methods to set and get the locator as well as being able to set it using the constructor. Without an associated locator, the data source is useless.

We can only set the locator once - attempting to set it a second time will generate an error. The reason for this restriction is that if an application could set the locator, we would have to check that the locator was actually compatible with the data source (e.g. that only an HTTP URL was used when setting the locator for an HTTP data source). Since creating a new data source is not very resource intensive (although using the data source may be), it's easier to simply force the application to create a new data source. We can always trust the Manager to set the locator correctly, because that's a part of the middleware and so the receiver manufacturer can verify that this always does the right thing.

Once we've set the locator, we actually have a data source that points to some data. The next thing that needs to be done is to connect the data source to the location specified by the locator. Until we've done this, we have no way of accessing the data. The connect() method may do different things depending on the data source. For instance, in a File data source, the connect() method will open the file, while in an HTTP data source, the connect() method will set up an HTTP connection to the URL that's specified in the locator. Having made a connection, we can actually find the type of data that the locator refers to using the getContentType() method that we've already seen.
In a PC-based JMF implementation, it's almost impossible for the JMF implementation to know the content type of the data until a connection exists - for example, it may need to get the information from the HTTP headers or even from header information contained in the data itself. In an MHP implementation, however, where there are a limited number of standard data sources and content types that are supported by the receiver, the content type can sometimes be known just from the type of data source used to access it. A dvb:// locator that does not contain a path component (which would be ignored by JMF anyway) is always used to access a DVB service or elements of a DVB service, and so the data source for the dvb protocol knows that it will never be accessing any content type other than multipart/dvb.service (the MIME content type for a DVB service).
The start() and stop() methods respectively start and stop data transfer. It's only when the start() method is called that the data source actually has data that it can pass to a player. Obviously, the data source must be connected to the actual source of the media data before start() can be called. Once the data source is no longer in use, the disconnect() method can be called to disconnect the data source from the actual source. The advantage of doing this is that if some scarce resources are needed to keep the connection alive (e.g. a modem connection to a server), then the resources can be explicitly released when they are not in use. By separating the connection/disconnection process from the process of actually getting data, the time-consuming parts such as connection setup can be carried out before the data is actually needed, thus saving time when the data is required.
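The ordering rules described in this section (set the locator once, connect, then start, and disconnect when finished) can be summarised in a small stand-in class. SketchDataSource is hypothetical and is not javax.media.protocol.DataSource: the locator is held as a plain string, and the exception types and the "unknown" content type are assumptions made for this sketch rather than behaviour taken from the JMF specification.

```java
import java.io.IOException;

/** Illustrative stand-in only; not the JMF DataSource class. */
class SketchDataSource {
    private String locator;     // external form of the locator, set once
    private boolean connected;
    private boolean started;

    void setLocator(String source) {
        if (locator != null) {
            // Assumption: signal the "set twice" error with this exception type.
            throw new IllegalStateException("locator already set");
        }
        locator = source;
    }

    String getLocator() { return locator; }

    void connect() throws IOException {
        if (locator == null) throw new IOException("no locator set");
        connected = true;   // a real source would open a file or connection here
    }

    String getContentType() {
        if (!connected) throw new IllegalStateException("not connected");
        // A dvb:// source only ever serves one content type, so it is hard-coded.
        return locator.startsWith("dvb://") ? "multipart/dvb.service" : "unknown";
    }

    void start() throws IOException {
        if (!connected) throw new IOException("connect() must be called before start()");
        started = true;
    }

    void stop() { started = false; }

    void disconnect() {     // releases whatever connect() acquired
        started = false;
        connected = false;
    }

    boolean isStarted() { return started; }

    public static void main(String[] args) throws IOException {
        SketchDataSource ds = new SketchDataSource();
        ds.setLocator("dvb://123.456.789");
        ds.connect();
        System.out.println(ds.getContentType());
        ds.start();
        ds.stop();
        ds.disconnect();
    }
}
```

Keeping connect() separate from start() is what allows the slow setup work to be done ahead of time, as the paragraph above explains.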
JMF players
We've seen how to create a player, and so now it's time to see how we can actually use it.
    public interface Player extends MediaHandler, Controller, Duration {
        public abstract void setStopTime(Time stopTime);
        public abstract Time getStopTime();
        public abstract void setMediaTime(Time now);
        public abstract Time getMediaTime();
        public abstract long getMediaNanoseconds();
        public abstract Time getSyncTime();

        public abstract float getRate();
        public abstract float setRate(float factor);

        public abstract int getState();
        public abstract int getTargetState();

        public abstract void realize();
        public abstract void prefetch();
        public abstract void start();
        public abstract void syncStart(Time at);
        public abstract void stop();
        public abstract void deallocate();
        public abstract void close();

        public abstract Control[] getControls();
        public abstract Control getControl(String forName);
        public abstract GainControl getGainControl();

        public abstract void setSource(DataSource source)
            throws IOException, IncompatibleSourceException;

        public abstract void addControllerListener(ControllerListener listener);
        public abstract void removeControllerListener(ControllerListener listener);
    }
As you can see, this is a pretty complex interface. What's really scary is that this is not the whole interface. Before you start panicking too much, don't worry - you don't need to understand or even know about most of the stuff that a Player can do. If you're really interested, I recommend the book 'Programming with the Java Media Framework' (Sean Sullivan, Loren Winzeler, Jeannie Deagen and Deanna Brown, pub. Wiley).
There are some differences between players in the MHP implementations of JMF and players in the desktop PC version. Some of these will be described later, but we'll cover a couple of them here as well. The first of these differences is that some of the other standard features of players in a desktop implementation are not available. These include things like the setSource() method (inherited from the MediaHandler interface), since setting the data source in this way has little or no meaning in an MHP environment due to limitations imposed by the hardware. We'll look at a different way of doing the same thing below. Another difference is that the selection of media may be driven more by user preferences or platform settings than would be the case in a desktop implementation. This includes things like the choice of audio track or subtitles based on user preferences and the language settings in the receiver firmware.

The largest difference, however, is that players in a desktop JMF implementation will usually have a user interface or control panel attached, while in an MHP implementation they won't. The Player.getVisualComponent() and Player.getControlPanelComponent() methods will usually return null in a JMF implementation for MHP, because the media is typically played in the video layer (see the graphics section for a discussion of the various layers in an MHP display device). It is possible for Player.getVisualComponent() to actually return a java.awt.Component instance under certain circumstances, however. Players that are presenting video streams typically display their content in the video layer, which is outside the AWT graphics hierarchy. Some high-end receivers may allow the 'true' integration of graphics and video, however, depending on how MPEG decoding is implemented, and in these cases the Player.getVisualComponent() method may return a component that can be used by the application to handle scaling and positioning of the video.
These limitations on how video is decoded are caused by the receiver. Typically, a digital TV receiver will use hardware MPEG decoders and demultiplexers. Given that DTV receivers have very low profit margins, these usually have limits on the scaling and positioning capability of the decoder/demultiplexer. The true integration of video and graphics can really only be done if the MPEG decoding is done in software, and most processors are simply not fast enough to do this at a price point that is suitable for DTV use.
The player state machine
A player can be in one of several states, depending on what it's doing at the time. There are four major states that a player can be in:
• • •
Unrea