Video Workbench

The Video Workbench: a direct manipulation interface for digital media editing by amateur videographers

Michael Steele
Computer Science Division - EECS, University of California, Berkeley, CA 94720-1776, USA

Marti Hearst
School of Information Management & Systems, Berkeley, CA 94720-4600, USA

Lawrence A. Rowe
Computer Science Division - EECS, University of California, Berkeley, CA 94720-1776, USA

Abstract

The Video Workbench is a nonlinear digital video editor designed to edit home movies. Unlike existing nonlinear video editors which allow experts to create complex arrangements of shots, sounds, and special effects but take some time to learn, the Video Workbench is intended to support the most common editing tasks in amateur projects while requiring only a few minutes to learn. The Video Workbench provides a visual conceptual model of how audio and video content is being manipulated by representing audio and video clips as visualizations that can be moved about on a virtual workbench surface. The specialized tools on the workbench operate directly on the visualizations, and in so doing, manipulate the content they represent.

1 Introduction

Consumer video equipment and home computers with low cost video capture cards have extended video production from an expensive endeavor undertaken only by groups of professionals to a means of self-expression available to a growing number of amateurs and hobbyists. Up to now, the prohibitive cost and complexity of video editing systems has limited the production of most home movies to "in camera" editing, with every shot retaining its position in the original shooting order. Amateur productions can benefit from even the simplest editing system by trimming and sequencing shots more flexibly to tell a narrative. As digital video becomes more ubiquitous, many amateur videographers will want to exploit nonlinear editing because it allows for the most flexibility in constructing the program [13]. Unfortunately, existing nonlinear video editors require much time and effort to learn to use, even when needed only for simple and common editing tasks.

Videography terms used in this paper

Videographer. The artist producing a video program.
Video editor. The video editing software.
Shot. An uninterrupted sequence of video.
Trimming. Removing unwanted material from the beginning and end of a shot.
Linear editing. Editing that requires the program to be assembled from beginning to end.
Nonlinear editing. Editing that allows material to be added or removed from anywhere in the program at any time.
Bin. A collection of audio and video source material.
Scrubbing. Rapid control over which frame of a video clip to display.

Most nonlinear video editors employ a timeline interface. Assembling a program with one of these video editors requires the creation of a timeline composition by bringing audio and video segments from bins to positions on the timeline according to how the segments are to fit together. The video editor then uses the completed timeline as a blueprint for constructing a new movie.

While a timeline interface allows experts to construct very complex arrangements of video, audio, and special effects, it has some significant drawbacks for more inexperienced users. First, a timeline is not a direct manipulation interface so it is not clear initially how to begin editing or what must be done to get a result. Second, the process of editing takes place across multiple windows, including at least the timeline, one or more bins, one or more clip trimming windows, and a program preview window. The user must become familiar with all these windows and how to use them together before being able to edit anything. Third, feedback on how an editing choice looks is not instantaneous; the user must either render a test movie or invoke a preview mode first.
This paper describes the design and implementation of the Video Workbench, a nonlinear digital video editor designed specifically to make home movies. It is not a full-featured editing system. Instead, it supports the most common editing tasks with intuitive interface controls that are easy to learn and use. The intention is to trade precise and powerful control over the final product for an interactive experience that is more creative, approachable, and fun.

The Video Workbench interface is loosely based on the metaphor of shuffling film clips about on a 2D workbench surface equipped with specialized tools for working with the clips, as shown in Figure 1. The metaphor is loose because the clips are not presented as pieces of film celluloid, but rather as abstract visualizations that reveal the source, duration, and composition of the audio and video content they represent. As the clips are split apart (“cut”) and joined together (“spliced”) by workbench tools, the visualizations shrink and grow to expose the changes to their content. A clip can represent just audio, just video, or video with audio.

Figure 1: The Video Workbench represents audio and video clips as visualizations that can be moved about on the workbench surface.

The Video Workbench supports two editing tasks in particular: trimming away unwanted footage and resequencing footage. Both tasks use the magic eye, shown in the middle of the workbench in Figure 1. Dragging a clip over the magic eye plays the video content on the virtual monitor located beneath the magic eye. Which frame is displayed in the magic eye monitor depends on the clip’s position over the magic eye; the left end of the clip corresponds to the beginning of the video. A scissors button, which does not appear until a clip is placed on the eye, cuts the clip on top of the eye into two separate clips, with the cut occurring right before the currently displayed frame.
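The relationship between a clip's position over the magic eye and the displayed frame, and the rule that a cut falls right before that frame, can be illustrated with a short Python sketch. It is not taken from the prototype, which is written in Tcl/Tk. The names `Clip`, `frame_under_eye`, and `cut_before`, the pixel units, and the clamping at the clip ends are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    x_left: float   # screen x of the visualization's left end (pixels, assumed unit)
    length: float   # on-screen length of the visualization (pixels)
    frames: int     # number of frames the clip represents


def frame_under_eye(clip, eye_x):
    """Map the eye's position along the clip to a frame index.

    The left end of the clip corresponds to the first frame; positions
    outside the clip are clamped to the first or last frame (assumption).
    """
    fraction = (eye_x - clip.x_left) / clip.length
    index = int(fraction * clip.frames)
    return max(0, min(clip.frames - 1, index))


def cut_before(frames, current):
    """Cut right before the current frame.

    Returns the frame ranges of the "before" and "after" pieces; the
    current frame stays with the "after" piece.
    """
    return range(0, current), range(current, frames)


# A hypothetical 360-frame clip drawn 240 pixels long, starting at x = 100.
clip = Clip(x_left=100.0, length=240.0, frames=360)
current = frame_under_eye(clip, eye_x=220.0)   # eye over the clip's midpoint
before, after = cut_before(clip.frames, current)
```

Dragging the clip changes `x_left`, which changes the frame returned for a fixed `eye_x`; that is the scrubbing behavior described above.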
Trimming a clip is accomplished by placing the clip over the magic eye so that the frame where the unwanted footage begins or ends appears on the magic eye monitor. When the clip is cut, one of the resulting clips contains the unwanted footage and can be thrown away.

Resequencing footage requires use of the joiner tool, which can splice two clips together. The joiner is at the top of the workbench in Figure 1. Resequencing footage is accomplished using the magic eye and the joiner in tandem. The magic eye cuts the footage into multiple pieces, and the joiner concatenates the pieces together in any order desired.

The Video Workbench also supports independent editing of audio and video tracks so that it is possible to add voice-overs or music to home movies. The magic eye and joiner work with audio clips in a manner similar to that of video clips. The Video Workbench can also separate the audio and video tracks of a clip so they can be edited separately. The audio and video tracks can later be recombined (or more generally, any audio can be synchronized with any video) by using the joiner.

The remainder of this paper describes the design and implementation of the Video Workbench. Section 2 presents an example video editing session to demonstrate the functionality of the Video Workbench. Section 3 outlines the design goals of the Video Workbench. Section 4 describes the implementation of the Video Workbench and the media stream and editing abstractions upon which it is built. Section 5 discusses two critical design issues not addressed in the current prototype. Section 6 surveys related work, and Section 7 concludes the paper.

2 Video Workbench Interface

This section follows an example editing session with the Video Workbench to demonstrate its interface in detail. The session begins with three source clips:

• A video with audio clip of a researcher talking and then whistling.
• A video-only clip of a launching rocket.
• An audio-only clip of some music.
The goal is to create a movie that begins with the rocket launching to the music, followed by a cut to the talking researcher as the music continues to play, followed by the music ending so that the whistling can be heard.

As with any video editor, the first step is to locate and bring in the source material. Ultimately, clips will be brought onto the Video Workbench by drag and drop from a variety of sources, including file system windows, media queries across the web, or a “clip capture” area for digitizing new material, but the current implementation just opens video and audio files with a dialog box.

Once on the Video Workbench, the visual representation of the clip denotes various properties of the source material. For example, when the clip of the talking and whistling researcher is brought in, it appears as in Figure 2.

Figure 2: A clip’s visualization on the workbench reflects the duration and tracks of the source material. The clip shown here has synchronized audio and video from a single source.

The visualization of the clip of the talking and whistling researcher, shown in Figure 2, exhibits the following characteristics:

• The length of the visualization is proportional to the duration of the clip source, in this case 12 seconds. A 4-second clip would have a visualization one third as long as this one. The left end of the visualization corresponds to the beginning of the clip and the right end corresponds to the end of the clip.
• Because the source material consists of audio and video, the visualization has both audio and video tracks. A clip’s audio track is placed above its video track and has a smaller width.
• Both the audio and video tracks are striped diagonally so that pieces derived from this clip visually indicate how they originate from the clip.
• Both tracks are given an arbitrary color, in this case red. Clips or tracks from the same source file are given the same color, so any copy of this clip will also be red.
• The visualization has a handle to make it easier to click on when it is small, and to link the audio and video tracks together when the clip contains both.

The visualizations of our three source clips, including the short video-only clip of a launching rocket and the longer audio-only clip of music, are shown in Figure 3.

Figure 3: The visualizations of the three source clips, as they appear on the workbench, along with a description of each clip’s contents. “Position” gives the logical start and end times, in seconds, of each segment within the whole clip. “Source” gives the start and end times, also in seconds, of each segment in its source clip.

Next, we will cut the clip of the researcher to separate the earlier talking portion from the later whistling portion so we can deal with the audio tracks of these two portions separately. A clip can be viewed or cut with the magic eye. The magic eye projects the contents of a clip on the virtual monitor below the eye as shown in Figure 4. The frame currently projected corresponds to the position of the clip over the eye pupil, which is denoted by the crosshair. The beginning of the video is displayed when the clip’s left end is positioned over the magic eye. Dragging the clip left and right over the magic eye scrubs the video. Standard VCR controls are also available on the magic eye monitor. The clip will automatically shift left or right over the magic eye as the current frame changes.

Figure 4: The magic eye has two purposes: viewing clips and cutting clips. The clip can be viewed by either sliding the clip over the magic eye to scrub the video or by using the standard VCR controls.

We position the researcher clip over the magic eye at the frame that begins the whistling and click the scissors button beneath the magic eye to cut the clip at this frame, resulting in two clips. The cut operation works on both the video and audio tracks.
The current frame remains with the “after” clip (the right clip), so this clip remains over the magic eye as the “before” clip is ejected to the left, as shown in Figure 5. Each track in the two resulting clips retains part of the original diagonal stripe to indicate from where in the source clip each piece originates.

Figure 5: The scissors button beneath the magic eye cuts the clip at the currently displayed frame.

Next, we remove the audio track from the clip without whistling (the 4-second clip). This operation is performed with the clip dock, where one of a variety of operations can be performed on the single clip in the dock. When the clip is dropped in the clip dock, several buttons appear to allow operations on the clip. Clicking the “Break audio and video” button separates the clip’s audio and video tracks into separate clips, as demonstrated in Figure 6. The 4-second audio clip can now be thrown away by dragging it onto the trash can.

Figure 6: The audio and video tracks of a clip are separated by dropping the clip onto the clip dock and clicking the “Break audio and video” button.

The clip dock provides three operations: save the clip, copy the clip, and separate the clip’s audio and video tracks. The clip dock can easily incorporate more operations when more functionality is added to the Video Workbench. For example, it could offer operations to stretch clips for slower playback or to squeeze them for faster playback, to apply special effects, to superimpose titles, or to break up a video clip along shot boundaries using a shot change detection algorithm.

The clip dock replaces the idea of selection, which is problematic in video editing because of the care needed to set the “in” and “out” points of the selection. Instead, operations are performed on a subsequence of a clip by cutting out the subsequence using the magic eye, applying an operation to it in the clip dock, and then recombining the clips.

Next we look through the rocket video clip.
This operation can be done quickly by right-clicking the rocket clip, which plays the clip from the beginning in the virtual monitor in the bottom-left corner of the workbench, as shown in Figure 7. While the clip plays in this monitor, a stop button appears for halting the video before reaching the finish. This monitor is very useful for quickly viewing clips without having to drag them across the workbench to the magic eye.

Figure 7: Right-clicking a clip anywhere on the workbench plays it from the beginning in the leftmost virtual monitor.

We are now ready to join the rocket clip with the video-only clip of the talking researcher and the music clip. These operations are done using the joiner, shown in Figure 8, which combines any two clips at a time. When the two clips inserted into the joiner are both audio clips or both video clips, the joiner splices the right clip after the left clip. When one clip is an audio clip and the other is a video clip, the joiner synchronizes the audio to the video.

Two clips are joined by dropping the first clip into the left slot of the joiner and the second clip into the right slot. Clicking the button in the middle of the joiner causes the right clip to slide leftward until it collides with the left clip. The collision occurs when the right clip’s audio track hits the left clip’s audio track, when the right clip’s video track hits the left clip’s video track, or when the right clip’s audio or video track hits the handle of the left clip, whichever happens first. The sliding and collision are animated to demonstrate clearly the operation’s behavior.

Because we want the launching rocket to appear before the researcher, we place the rocket clip into the left slot of the joiner and the researcher clip into the right slot. Figure 8 shows how the joiner splices these two clips together. The joined clip shifts into the joiner’s left slot, so that several other clips can be joined to the end of this one in quick succession.
Figure 8: Splicing two video-only clips with the joiner tool.

Next we join the music clip, which will synchronize the music to the video. The right clip’s audio track slides leftward until it collides with the left clip’s handle. Figure 9 shows how the audio and video clips are placed into the joiner and the resulting clip. In this case, placing the video-only clip in the joiner’s right side and the music clip in the left side would also work.

Figure 9: Synchronizing audio and video by joining an audio-only clip to a video-only clip.

There are other ways to join these three clips into a single clip. For example, we could have joined the rocket clip to the music clip first, followed by joining the researcher clip. The joiner can join any two clips, even when one or both of the clips have audio and video tracks. How the joined clip is formed depends on a simple physics where the right clip slides leftward until one of its tracks collides with one of the tracks or handle of the left clip.

Finally, we add the remainder of the whistling researcher video. But before we can do this, we need to chop away the extra length of the audio track in the clip constructed so far. (If we tried to join the whistling clip without doing this, the video track would not be contiguous. The music would play to its end, without any video, before the researcher begins to whistle.) After the whistling clip is joined, the seam in the video track is removed because both sides of the join are part of a contiguous source clip. These steps are shown in Figure 10.

Figure 10: Before the last clip is joined, the extra length of the audio track is removed using the magic eye so that the video track in the resulting clip will be contiguous.

Now that the clip is completed, it can be dropped on the clip dock to save it to a file. In future versions of the Video Workbench, the final clip could be rendered as a new movie using the clip dock.
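The joiner's sliding rule used throughout this session can be modeled in a few lines of Python. This is one reading of the rule rather than the prototype's code: each track is treated as a time interval in seconds, the left clip's handle is assumed to sit at that clip's left end, and the function name `join_offset` and the rocket and music durations are hypothetical.

```python
def join_offset(left, right):
    """Return how far from the left clip's start the right clip comes to rest.

    Each clip is a dict mapping "audio" and/or "video" to a (start, end)
    interval in seconds, measured from that clip's own left end.
    """
    # Assumption: the handle marks the left end of the left clip.
    handle = min(start for start, _ in left.values())
    stops = []
    for kind, (r_start, _) in right.items():
        # Any track of the right clip is stopped by the left clip's handle.
        stops.append(handle - r_start)
        # A track is also stopped by the left clip's track of the same kind.
        if kind in left:
            stops.append(left[kind][1] - r_start)
    # Sliding leftward, the clip stops at the first obstacle it meets,
    # which is the largest of the candidate resting offsets.
    return max(stops)


rocket = {"video": (0.0, 6.0)}                     # hypothetical 6 s clip
talking = {"video": (0.0, 4.0)}                    # 4 s, audio removed
music = {"audio": (0.0, 30.0)}                     # hypothetical 30 s clip
whistling = {"video": (0.0, 8.0), "audio": (0.0, 8.0)}

splice_at = join_offset(rocket, talking)           # video hits video
picture = {"video": (0.0, 10.0)}                   # rocket + talking
sync_at = join_offset(picture, music)              # audio hits the handle
untrimmed = {"video": (0.0, 10.0), "audio": (0.0, 30.0)}
gap_at = join_offset(untrimmed, whistling)         # audio hits audio first
trimmed = {"video": (0.0, 10.0), "audio": (0.0, 10.0)}
clean_at = join_offset(trimmed, whistling)
```

Under these assumptions the untrimmed join leaves the whistling video starting only where the music ends, which is the non-contiguous video track the text warns about; trimming the music first lets audio and video meet at the same point.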
3 Design Goals

The Video Workbench makes video editing more available to non-professionals by providing:

1. A readily grasped conceptual model of how the video and audio content is being manipulated.
2. Interactive editing, so that the results of an edit are available immediately after the edit.
3. A consolidated workspace so as to minimize interruption of artistic “flow”.
4. An easy way to informally organize the work in progress.

The remainder of this section discusses these design goals in detail.

3.1 A readily grasped conceptual model

Editing video with a timeline is unlike editing text in a word processor or an image in a graphics editor in that it is not so much the manipulation of the media itself but rather the manipulation of instructions for constructing the final work from the media. That is, a timeline is not a WYSIWYG direct manipulation interface, so its conceptual model can be confusing to the uninitiated.

Some nonlinear video editors also offer a “three-point” or “four-point” editing interface, largely based on a conventional linear editing system which assembles the edited program on a destination tape deck by selectively copying footage from one or more source tape decks. An edit is accomplished by specifying any combination of three or four “in” and “out” points in the source footage and the destination program, followed by performing an operation to either insert or overlay video on the destination tape deck. This editing paradigm is process-oriented and does not offer the user an obvious conceptual model of how the content is edited together.

The Video Workbench provides a visual conceptual model of the editing process by tightly coupling the video and audio content with visualizations that can be directly manipulated. From the beginning of the editing process, the audio and video clips appear as real entities.
With the aid of simple animation, the operations that shape these clips into a final composition, for example, cut and splice, are also visibly manifest. The interface thus implements a simple physics that takes but a few minutes to understand.

3.2 Interactive Editing

For users who are learning to edit video, instant feedback is crucial because it can be difficult to anticipate how an editing choice will actually “look and feel” when placed into context. In many digital video editors, no feedback is possible without compiling a test movie. The lengthy “edit-compile-preview” process also mandates that multiple editing choices be made at once, further complicating the problem of imagining how the result will look. Some digital video editors offer a preview mode, but this interface is invoked only when requested. The preview interface is not the primary interface.

The Video Workbench supports interactive editing, so all source and edited content can always be examined immediately after each step of editing. There is never a need to compile test movies or previews for rapid feedback. This design goal places some restrictions on the kinds of operations that can ultimately be supported, but for amateur videographers we believe this restriction is a reasonable tradeoff for a more responsive interaction with the video editor.

3.3 Consolidated workspace

In any creative endeavor, an artist pursues a flow of expression that should not be interrupted by his or her tools. For this reason, the Video Workbench consolidates all the material and tools needed to edit video into a single workspace: the workbench surface. While this approach may not scale for productions involving huge numbers of clips, it aims to alleviate the effort in editing amateur productions by avoiding the clutter and clumsiness of multiple windows to accomplish simple tasks, as is common in many digital video editors.
3.4 Informal organization of work

The workbench surface offers another key benefit: it provides an informal means for sorting the material as it is being edited. The videographer, for example, can place in one region of the workbench all clips under consideration for a particular scene, while another region holds clips of completed scenes. The method of organization is entirely up to the videographer and can change to serve the task at hand as needed.

4 Implementation

The Video Workbench prototype is implemented in about 2500 lines of code written in the Tcl/Tk [14] scripting language using the Berkeley Continuous Media Toolkit (CMT) [3, 9]. CMT includes a media playback and editing API designed to simplify the development of playback and editing applications. The prototype was completed in less than two person-months, which validates the CMT approach for the rapid prototyping of continuous media applications. This section describes the CMT abstraction for representing digital media, the playback and editing abstractions, and the implementation of the Video Workbench.

4.1 Introduction to CMT

CMT is a Tcl extension that supports the development of distributed continuous media applications. It provides a number of modular, reusable objects including:

• Segment objects, which read from a video or audio file on disk so that they can send the video or audio data to another object.
• Player objects, which decode a stream of video or audio data and play it on a device.
• Objects to package and send or receive media from a process on a different host.
• A Logical Time System (LTS) object for synchronizing media objects and controlling their playback.
• Media objects for a variety of video and audio devices and codecs.

All CMT objects are accessed using Tcl commands. CMT objects are combined to create playchains to play video or audio across a network or off a local hard disk.
A playchain can be thought of as the pipeline through which continuous media data flows from the source to a destination. Figure 11 shows a simple example of a playchain for the controlled playback of an MPEG file from a local hard disk.

Figure 11: A playchain is a pipeline for continuous media data that flows from a file to an output device. The LTS is the valve of the pipeline.

This playchain is composed of three CMT objects. The MPEG Segment object reads from an MPEG video file (“hello.mpg”) and sends the video data to an MPEG Play object. The MPEG Play object decodes the data sent from the MPEG Segment object and displays the video in a window. The LTS object synchronizes the objects in the playchain and controls the playback of the video.

The LTS object has speed and value properties that control playback rate, direction (i.e., forward or backward), and location within the source. Setting the LTS speed causes the video to play, stop, rewind, or fast-forward. Setting the LTS value causes the video to immediately jump to the frame at that logical time value. When the video is being played, the LTS value automatically updates to indicate the current frame being displayed.

Playchains become more complicated when the presentation incorporates video or audio segments from many different clips and the source material is stored on different file servers. For example, to play one video file after another, a playchain must be created with a segment object for each file, a play object for each kind of video data to display, and a single LTS that synchronizes the objects so they transfer data when required in the right order. The CMT media playback API, introduced next, creates and manages these complex playchains.

4.2 Introduction to the CMT Media Playback and Editing API

The CMT Media Playback and Editing (MPE) API is a higher-level abstraction that simplifies the development of playback and editing applications by managing playchains.
It provides stream and mediaPlayer abstractions to create and manipulate sequences of media clips and objects to play and edit streams.

A stream is a sequence of non-overlapping video or audio segments, where a segment is a subsequence of video or audio frames from a single source clip. Each stream is either a video stream or an audio stream. The segments in a stream are played back in order according to the logical time defined in the stream. Each stream segment has a logical start (ls) time and a logical end (le) time. A stream can have any number of segments, so long as no two segments occupy the same position in logical time, that is, only one segment is played at any time. Each segment has a source start (ss) time and a source end (se) time, which specify the portion of the source file to play. When the segment is played back in the stream at normal speed, the logical duration (le - ls) equals the source duration (se - ss).* A single source clip can have several segments in the same stream.

* The playback system actually stretches or shrinks a segment if the durations differ. Low-level support for this operation hides problems with capture devices, such as audio boards, that return variable numbers of samples.

Two kinds of mediaPlayers are provided: 1.) a video mediaPlayer which plays the contents of a video stream in a window and 2.) an audio mediaPlayer which plays the contents of an audio stream to a speaker. A single stream can be played back by several mediaPlayer objects. Three important properties of the mediaPlayer object are:

• stream, which identifies the stream to play.
• speed, which controls whether the stream is playing, stopped, rewinding, or fast-forwarding.
• value, which is the current frame of video or audio being played.

Setting the speed and value properties controls the playback of the stream or jumps the video or audio to a particular frame.
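The stream, segment, speed, and value concepts above can be made concrete with a small Python model. It is only a sketch of the ideas and does not reproduce the MPE API, which is accessed through Tcl commands: the names `Segment`, `source_time`, and `LogicalClock`, the use of seconds as the time unit, and the file names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source: str   # source clip the frames come from
    ls: float     # logical start time in the stream
    le: float     # logical end time in the stream
    ss: float     # source start time
    se: float     # source end time


def source_time(stream, t):
    """Return (source, time within source) for logical time t, or None."""
    for seg in stream:
        if seg.ls <= t < seg.le:
            # Stretch or shrink when logical and source durations differ.
            scale = (seg.se - seg.ss) / (seg.le - seg.ls)
            return seg.source, seg.ss + (t - seg.ls) * scale
    return None  # no segment occupies this logical time


class LogicalClock:
    """Toy stand-in for the speed and value properties (not the CMT LTS)."""

    def __init__(self):
        self.speed = 0.0    # 0 stopped, 1 normal play, negative rewinds
        self._value = 0.0   # logical time at the last change
        self._wall = 0.0    # wall-clock time of the last change

    def value_at(self, wall):
        return self._value + self.speed * (wall - self._wall)

    def set_speed(self, speed, wall):
        # Freeze the current value before changing the playback rate.
        self._value, self._wall, self.speed = self.value_at(wall), wall, speed

    def set_value(self, value, wall):
        # Jump to a logical time; playback continues at the current speed.
        self._value, self._wall = value, wall


# Hypothetical stream: 5 s of one file followed by the first 4 s of another.
stream = [Segment("rocket.mpg", 0.0, 5.0, 0.0, 5.0),
          Segment("researcher.mpg", 5.0, 9.0, 0.0, 4.0)]
clock = LogicalClock()
clock.set_speed(1.0, wall=0.0)
where = source_time(stream, clock.value_at(6.0))
```

Wall-clock time is passed in explicitly so the model stays deterministic; a real player would read it from the system.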
The stream and mediaPlayer objects together create and manage the CMT playchains necessary to play the clips of the stream on the correct device. For every mediaPlayer instance, an LTS object is created to control the playback of the stream. It is the speed and value of this LTS object that are set when the speed and value of the mediaPlayer are set.

Most importantly, the stream and mediaPlayer objects automatically adjust the playchains whenever the stream is modified. This abstraction is a key service provided by the API for the Video Workbench because it allows editing changes to be viewed immediately after the edit is performed. Immediate playback is necessary for the Video Workbench to support interactive editing.

4.3 Video Workbench implementation issues

The Video Workbench implements a clip abstraction on top of the stream abstraction provided by the MPE API. The clip abstraction manages the audio and video streams and the clip visualization on the workbench surface. Both the workbench surface and the clip visualizations on it are implemented as Tk canvas objects.

A key implication of the implementation with CMT is that the Video Workbench supports only editing operations and effects that can be rendered in real-time. A primary advantage of a timeline interface is that it can render video clips that take more time to render than to play back, such as when the video has many transitions, special effects, or superimpositions. The lengthy rendering process is not performed until after all the editing choices have been made. The Video Workbench, on the other hand, supports interactive editing. But, because rendering a whole video clip takes too much time on most end-user computer systems, the Video Workbench also postpones the rendering of a clip until after all editing choices have been made. It only provides the illusion that edits are rendered immediately by instantly reconfiguring the playchains to reflect the edit.
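One way to see why edits can appear instantaneous is that a cut or a splice only rewrites a short list of segment descriptions; no media data is copied. The Python sketch below models that bookkeeping under stated assumptions: the `Segment` fields follow the ls, le, ss, and se times of Section 4.2, playback is at normal speed, and the function names `cut` and `splice` and the file name are hypothetical. It is not the prototype's Tcl code and does not touch playchains.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    source: str
    ls: float   # logical start
    le: float   # logical end
    ss: float   # source start
    se: float   # source end


def cut(stream, t):
    """Split a stream at logical time t into "before" and "after" streams."""
    before, after = [], []
    for seg in stream:
        if seg.le <= t:
            before.append(seg)
        elif seg.ls >= t:
            # Rebase so the "after" stream starts at logical time 0.
            after.append(replace(seg, ls=seg.ls - t, le=seg.le - t))
        else:
            split = seg.ss + (t - seg.ls)   # assumes normal-speed segments
            before.append(replace(seg, le=t, se=split))
            after.append(replace(seg, ls=0.0, le=seg.le - t, ss=split))
    return before, after


def splice(left, right):
    """Append right after left, removing a seam when the source is contiguous."""
    shift = left[-1].le if left else 0.0
    joined = list(left)
    for seg in right:
        moved = replace(seg, ls=seg.ls + shift, le=seg.le + shift)
        last = joined[-1] if joined else None
        if (last is not None and last.source == moved.source
                and last.se == moved.ss and last.le == moved.ls):
            # Both sides come from adjacent parts of one source: merge them.
            joined[-1] = replace(last, le=moved.le, se=moved.se)
        else:
            joined.append(moved)
    return joined


# The 12-second researcher clip, cut where the whistling begins (4 s).
clip = [Segment("researcher.mpg", 0.0, 12.0, 0.0, 12.0)]
talking, whistling = cut(clip, 4.0)
rejoined = splice(talking, whistling)    # seam disappears
reordered = splice(whistling, talking)   # two segments remain
```

Rejoining the pieces in their original order yields a single segment again, mirroring the seam removal described at the end of Section 2.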
So long as the new playchains can play back in real-time, the illusion of instantaneous rendering is maintained.

5 Remaining Design Challenges

The design implemented in the current Video Workbench prototype fails to address two important concerns: 1.) visually distinguishing clips by their audio or video content and 2.) dealing with the orders of magnitude difference in clip length between the shortest clip and longest clip that the editor can support. Both issues, and ideas for dealing with them, are discussed below.

5.1 Content-based clip visualizations

The Video Workbench currently implements an edit-based visualization scheme for the clips on the workbench. The visualizations are rather generic when the source clips are first brought onto the workbench, and portray more information about the content they represent only after some editing.

A purely edit-based visualization is problematic for many reasons, especially at the beginning of an editing session. First, it can be difficult to recall what video or audio content belongs to each clip. For example, the only visual cues distinguishing two video source clips are the color and length of their visualizations. Second, the visualization does not distinguish the various shots in a clip unless the shots originated from different source clips. For these reasons, the visualization should also portray the clip’s contents to aid the task of editing.

Many digital video editors provide content-based visualizations of clips. A common technique to represent audio content, for example, is with a compressed waveform. This technique should work well in the Video Workbench for representing the content of a clip audio track. The most common technique to represent video content is with thumbnails, which are miniature frames sampled from the video at regular intervals.
Though easily understood, this technique does not meet the requirements of the Video Workbench interface because a single thumbnail may represent several dozen frames or more. If there are several shots in the time span covered by a thumbnail, some of the shots may be completely absent in the thumbnail representation. Also, when cutting a clip through a thumbnail, one is not sure if the cut will be made through the shot depicted in the thumbnail or if it is made through another shot that has slipped in underneath the thumbnail.

Another visual representation of video is offered by Arman et al. [1], in which separate shots of a video sequence are each represented by an r-frame, which is a thumbnail of the shot framed by a motion tracking region. A variation of this idea is used for QBIC video content mosaics [7] where each r-frame is a salient still [15] from the shot it represents. Both representations, however, are also inappropriate for the Video Workbench because the widths of r-frames do not correspond to the durations of the shots they represent. We believe it is important to maintain this time dependent cue to avoid distortions between the source material and locations along the representation.

Perhaps the most promising technique for visually representing video content is with a videogram, introduced by MacNeil [12] and implemented in the Media Streams browser [6]. A videogram takes strips across the center of individual frames and stacks them side-by-side. Videograms are temporally continuous. On the other hand, a purely content-based visualization is less useful when the frame content changes little over time, as in a “talking head” video. A hybrid of a videogram with the source striping technique already implemented in the current Video Workbench prototype provides the advantages of both a content-based visualization and an edit-based visualization. Figure 12 shows an example of this hybrid technique.
Figure 12: This clip visualization is a hybrid of a videogram with source striping.

5.2 Orders of magnitude differences in clip durations

The Video Workbench must handle clips as short as a few frames and as long as a feature-length movie, up to 3 hours. If the clip visualization length is directly proportional to the clip duration, as in the current prototype, the longest visualization would be nearly 50,000 times longer than the shortest visualization. No choice of scale between visualization length and clip duration will prevent short clips from becoming too small to work with or long clips from becoming too cumbersome to move. The visualization length needs to be a nonlinear function of the clip duration.

Two schemes can distort the visualizations in this manner. In the first, the length of a visualization is solely a function of the clip duration and does not depend on the visualization’s position on the workbench. The behavior of the clips in this scheme may sometimes surprise the user because after a clip is cut, both pieces will expand in length. Similarly, when two clips are spliced together, the resulting clip will have to shrink so that its length is less than the sum of the lengths of the two clips.

The second scheme avoids this surprising behavior by distorting the workbench surface itself, so that the visualizations are distorted according to their position on the workbench. That way, when a clip is cut, the two resulting clips do not need to resize because neither falls far from where it came from. An example of this scheme is to wallpaper the workbench surface onto a Perspective Wall [11], as depicted in Figure 13.

Figure 13: The workbench surface is wallpapered onto a Perspective Wall, so that clips become more compressed as they approach the left or right sides of the workbench. The fisheye distortion is applied only along the horizontal dimension to make full use of rectangular screen space.
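The difference between the two schemes can be sketched numerically. The following is an illustration under stated assumptions, not the design of the prototype: the frame rate (30 frames per second) and the shortest clip (7 frames) are chosen only to be roughly consistent with the ratio quoted above, the logarithm in scheme 1 and the tanh mapping in scheme 2 are stand-ins for whatever nonlinear functions a real design would use, and all function names and constants are hypothetical.

```python
import math

FPS = 30  # assumed frame rate; the paper does not state one


def duration_ratio(longest_hours: float = 3.0, shortest_frames: int = 7,
                   fps: int = FPS) -> float:
    """Ratio of the longest to the shortest clip duration, in frames."""
    return longest_hours * 3600 * fps / shortest_frames


def length_from_duration(frames: float, scale: float = 40.0) -> float:
    """Scheme 1: on-screen length depends only on duration.

    A logarithm is used as an example of a concave function (assumption).
    """
    return scale * math.log1p(frames)


def wall_x(t: float, half_width: float = 400.0, focus: float = 3000.0) -> float:
    """Scheme 2: map a workbench coordinate t (frames from the center) to screen x.

    tanh is nearly linear near t = 0 (the focused region) and saturates toward
    +/- half_width, so clip ends crowd toward the edge without passing it.
    """
    return half_width * math.tanh(t / focus)


def extent(start: float, end: float) -> float:
    """On-screen length of a clip occupying the workbench interval [start, end]."""
    return wall_x(end) - wall_x(start)


if __name__ == "__main__":
    print("duration ratio:", round(duration_ratio()))

    # Scheme 1: cutting a 9000-frame clip in half makes the pieces "expand".
    whole = length_from_duration(9000)
    pieces = 2 * length_from_duration(4500)
    print("scheme 1, whole vs. sum of pieces:", whole, pieces)

    # Scheme 2: the two pieces occupy exactly the extent of the original clip.
    print("scheme 2, whole vs. sum of pieces:",
          extent(-4500, 4500), extent(-4500, 0) + extent(0, 4500))
```

Under these assumed functions, the two halves of a clip cut under scheme 1 have a combined length greater than that of the original clip, whereas under scheme 2 the pieces together cover the same screen extent as the original because the distortion depends only on position.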
Clips in a central focused region, where the magic eye and joiner tools are located, have maximum magnification. This region is flanked on the left and right by two regions of “Fisheye View” distortion, with the most demagnification at the extreme left and right edges of the workbench. Thus, a clip’s left end would never reach the left side of the workbench because the left side of the clip collapses inward as the clip is positioned closer to the left side. This distortion scheme in particular has the advantage of providing focus+context viewing of long clips. Examining part of a long clip in detail is done by dragging that part of the clip into the center of the workbench.

6 Related Work

Much innovative research has addressed, directly or indirectly, the goal of making nonlinear video editing more accessible to amateur videographers.

Ueda et al. [16, 17] developed IMPACT, a video editing system targeted toward nonprofessionals that exploits sophisticated image processing algorithms to aid editing. A cut-detection algorithm splits footage into separate cuts, each of which is represented with a moving icon (micon). The user can then trim and sequence the cuts. This system also aids editing by offering automatic classification of cuts by camera movement (such as zoom or pan) and object extraction. The IMPACT interface is more oriented around managing individual shots than the Video Workbench, which can represent video clips consisting of multiple shots as discrete entities.

Baecker et al. [2] built the Movie Authoring and Design (MAD) system for the design and creation of visual presentations. By providing a hierarchical structure for the inclusion of text, pictures, audio and video, it supports the simultaneous top-down design and bottom-up construction of the presentation.
MAD is not specifically targeted toward amateur videographers, but it does provide a variety of visual representations of the work in progress, including a real-time playback mode, which can help them visualize the final creation and develop a coherent narrative, sequencing, and rhythm of shots.

Video Mosaic, introduced by Mackay and Pagani [10], is an augmented reality system that uses a storyboard drawn on paper as input to a digital video editing system for production. This system attempts to give digital video editing the advantages of initial planning on paper (i.e., a quick, portable and informal means of exploring ideas over a large space). The idea of implementing the Video Workbench as a similar augmented reality system, but one where audio and video are edited by cutting and taping tangible strips of paper, is worth more exploration.

Buchanan and Zellweger [4, 5] describe Firefly, an authoring system that automatically schedules the presentation of multimedia elements using a constraint-based specification of temporal relationships between elements. In terms of video editing, the videographer is able to explicitly express desires such as which scenes go before others, and is freed from the need to manually manipulate the scheduling of scenes to obtain the desired effect.

Hudson and Hsi [8] promote a “walk-through” approach in which multimedia documents are created by walking through a presentation under construction, possibly several times over. An example of using this approach to edit video is first to walk through a number of video clips to establish an ordering among them, and then to walk through this sequence to lay down a synchronized audio track. This interface is truly a direct manipulation approach because media elements are manipulated directly, rather than through proxies as in the Video Workbench.
7 Conclusion

The Video Workbench is an interface for a nonlinear video editor designed to create more sophisticated home movies. It is targeted toward amateurs and hobbyists who need a video editor that supports the most commonly needed editing operations while requiring minimal learning time. It attempts to match their needs by providing a visual conceptual model of the editing process modeled on the behavior of physical objects (i.e., clips lying on a workbench surface with specialized tools to manipulate them).

The initial results of simple usability tests with the current prototype are very promising. All three participants, with no previous video editing experience, were able to construct simple movies using sample clips provided to them after a five-minute introduction to the interface. More extensive and systematic testing is needed to validate the Video Workbench editing paradigm and to identify the strengths and weaknesses of the design currently implemented in the prototype.

References

1. Arman, F., Depommier, R., Hsu, A., and Chiu, M-Y., 1994. Content-based Browsing of Video Sequences. Proc. ACM Multimedia 94, 97-103.
2. Baecker, R., Rosenthal, A., Friedlander, N., Smith, E., and Cohen, A., 1996. A Multimedia System for Authoring Motion Pictures. Proc. ACM Multimedia 96, 31-42.
3. Baldeschwieler, J.E., 1996. Editing Extensions to the Berkeley Continuous Media Toolkit. Master’s thesis, University of California at Berkeley.
4. Buchanan, M.C. and Zellweger, P., 1992. Specifying Temporal Behavior in Hypermedia Documents. Proc. ACM Hypertext 92, 262-271.
5. Buchanan, M.C. and Zellweger, P., 1993. Automatic Temporal Layout Mechanisms. Proc. ACM Multimedia 93, 341-350.
6. Davis, M., 1995. Media Streams: An Iconic Visual Language for Video Representation. In Baecker, R.M., Grudin, J., Buxton, W., and Greenberg, S., Readings in Human Computer Interaction: Toward the Year 2000, Morgan Kaufmann, 854-866.
7. Flickner, M., et al., 1995. Query by Image and Video Content: The QBIC System. IEEE Computer, September 1995, 23-32.
8. Hudson, S. and Hsi, C-N., 1994. The Walk-Through Approach to Authoring Multimedia Documents. Proc. ACM Multimedia 94, 173-180.
9. Jackson, M., Baldeschwieler, J.E., and Rowe, L., 1996. Berkeley Continuous Media Toolkit API. Submitted for publication. University of California, Berkeley.
10. Mackay, W. and Pagani, D., 1994. Video Mosaic: Laying Out Time in a Physical Space. Proc. ACM Multimedia 94, 165-172.
11. Mackinlay, J., Robertson, G., and Card, S., 1991. Perspective Wall: Detail and Context Smoothly Integrated. Proc. CHI 91, 173-179.
12. MacNeil, R., 1991. Generating Multimedia Presentations Automatically Using TYRO: the Constraint, Case-based Designer’s Apprentice. Proc. 1991 IEEE Workshop on Visual Languages, 74-79.
13. Ohanian, T., 1993. Digital Nonlinear Editing: New Approaches to Editing Film and Video. Focal Press, Boston, MA.
14. Ousterhout, J., 1994. Tcl and the Tk Toolkit. Addison-Wesley Publishing Co., Reading, MA.
15. Teodosio, L., and Bender, W., 1993. Salient Video Stills: Content and Context Preserved. Proc. ACM Multimedia 93, 39-46.
16. Ueda, H., Miyatake, T., and Yoshizawa, S., 1991. IMPACT: An Interactive Natural-Motion-Picture Dedicated Multimedia Authoring System. Proc. CHI 91, 343-350.
17. Ueda, H., Miyatake, T., Sumino, S., and Nagasaka, A., 1993. Automatic Structure Visualization for Video Editing. Proc. InterCHI 93, 137-141.
