Comprehensive Automated Multimedia Recoding for Wikimedia
Benefits to Wikimedia:
Wikimedia will be poised to easily present audio and video sourced from contributions
uploaded in a wide range of multimedia formats using an in-browser player.
Synopsis:
A quick analysis of contributions will occur at upload time to determine their type and, in
the case of audio or video files, whether the server knows how to recode them to an Ogg
format. In the case of files that can be recoded, the user will receive a simple “upload
successful” message, and the source file will be added to a queue of recoding jobs. In the
case of eccentric formats that the server can’t decode, the user will be informed that their
contribution won’t be available on the in-browser player, and directed to help on how to
upload contributions in compatible formats.
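The upload-time decision described above can be sketched roughly as follows. This is a minimal illustration only: it guesses the container from well-known leading byte signatures rather than using a metadata library such as getID3, and the names `sniff_container` and `upload_decision` are hypothetical.

```python
from typing import Optional, Tuple


def sniff_container(header: bytes) -> Optional[str]:
    """Guess a container format from the first bytes of an upload.

    Only a handful of well-known signatures are checked; anything else
    is reported as unknown (None).
    """
    if header.startswith(b"OggS"):
        return "ogg"
    if header.startswith(b"fLaC"):
        return "flac"
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    if header.startswith(b"RIFF") and header[8:12] == b"AVI ":
        return "avi"
    if header.startswith(b"\x1a\x45\xdf\xa3"):
        return "matroska"  # EBML header, shared by Matroska and WebM
    if header[4:8] == b"ftyp":
        return "mp4"  # ISO base media family (MP4, MOV, 3GP)
    if header.startswith(b"ID3"):
        return "mp3"  # MP3 carrying an ID3v2 tag
    return None


def upload_decision(header: bytes) -> Tuple[str, Optional[str]]:
    """Return ('queued', kind) for recognized media, else ('unsupported', None)."""
    kind = sniff_container(header)
    if kind is None:
        # Assumption: the caller shows the "not available in the player" help text.
        return ("unsupported", None)
    # Assumption: the caller shows "upload successful" and enqueues a recode job.
    return ("queued", kind)
```

A real analyzer would also need to confirm that the streams inside the container can be decoded, which a signature check alone cannot establish.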
A second script (running perhaps as a cron job) will routinely monitor the job queue,
dispatching new jobs to the appropriate recoding software and monitoring its progress.
Upon job completion, the script will update the necessary databases to reflect the recoded
file’s availability. Jobs that fail unexpectedly at this stage could be reported to
editors/administrators, much as unsupported formats are reported to uploaders.
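As a rough sketch of the dispatch step, the manager could run one recoding command at a time and translate the outcome into a job status. The function name `run_job`, the status strings, and the one-hour timeout are assumptions made for illustration, not part of any existing tool.

```python
import subprocess
from typing import List, Tuple


def run_job(command: List[str], timeout_seconds: int = 3600) -> Tuple[str, str]:
    """Run one recoding command and return (status, detail).

    status is 'done' on a zero exit code, otherwise 'failed' with a short
    explanation suitable for reporting to an administrator.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except OSError as exc:
        # Covers a missing or non-executable recoder binary.
        return ("failed", "could not start %s: %s" % (command[0], exc))
    except subprocess.TimeoutExpired:
        return ("failed", "timed out after %d seconds" % timeout_seconds)
    if result.returncode != 0:
        # Keep only the tail of stderr so the report stays short.
        return ("failed", result.stderr.strip()[-500:])
    return ("done", "")
```

The caller would then write the returned status back to the job record, and could forward the detail string for failed jobs to whoever monitors the queue.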
Deliverables:
* An upload analyzer script that rapidly determines if a given file is a valid audio/video
type, and if so determines the feasibility of recoding it to Vorbis or Theora. It will report
its findings to the contributors at upload time, and enqueue suitable jobs for recoding.
* A recode manager script, designed to run either daemon-style or as a frequently
recurring cron-job, will monitor the recoding job queue, issue commands to the necessary
applications to decode/recode many file types, and update appropriate databases when
jobs finish. At minimum, I would make use of ffmpeg and mplayer to support the formats
they can decode. One job would be run at a time per server.
* Minor tweaks to existing media playback pages to make them aware of complete/not
yet available recode jobs. Ideally this would be in collaboration with another student
working on the in-browser player.
Development Methodology:
There is a wealth of software I can make use of to help in the basic task, some of it even
native to PHP – see for example getID3.sourceforge.net, which can tell me a whole lot
about what an upload is or isn’t right from the get-go. Using information from this and
other utilities accessed through command-line interfaces I can establish exactly what an
uploaded file is and use this information to determine what software / commands will be
necessary to decode/resize/downsample it for use in streaming to a browser-based player.
I anticipate using ffmpeg2theora as the final step in all video encoding jobs.
For encoding Vorbis files, ffmpeg would also be a reasonable choice, but I might
choose something else.
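To make the command-selection step concrete, here is a small sketch of how a job might be turned into a command line. It assumes ffmpeg2theora for video and, purely as one possible choice, an ffmpeg build with libvorbis support for audio; the function name `build_recode_command` and the `max_width` parameter are hypothetical.

```python
from typing import List, Optional


def build_recode_command(
    kind: str, source: str, dest: str, max_width: Optional[int] = None
) -> List[str]:
    """Build the argument list for recoding one file to an Ogg format."""
    if kind == "video":
        command = ["ffmpeg2theora", "--output", dest]
        if max_width is not None:
            # Scale the picture to the given width in pixels.
            command += ["--width", str(max_width)]
        command.append(source)
        return command
    if kind == "audio":
        # Assumption: ffmpeg was built with libvorbis; -vn drops any video stream.
        return ["ffmpeg", "-i", source, "-vn", "-acodec", "libvorbis", dest]
    raise ValueError("unsupported media kind: %r" % kind)
```

Returning a list of arguments rather than a shell string avoids quoting problems with unusual upload file names.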
The queue would likely be implemented as a single MySQL database table, containing
source file information, job status, and queue order. This design should be easily
adaptable to have the recode processing occur on another machine (or even multiple
machines) in a server cluster, especially if the servers have access to a shared MySQL
database. With that, all that is necessary is a means for the recoding system(s) to access
the original file over the network. The recoding script would then just download or mount
the source file after selecting an unclaimed job from the queue.
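A single-table queue with safe job claiming might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table name, column names, and status values are assumptions, and the row id doubles as the queue order.

```python
import sqlite3
from typing import Optional, Tuple

SCHEMA = """
CREATE TABLE recode_queue (
    id INTEGER PRIMARY KEY,            -- also serves as queue order
    source_path TEXT NOT NULL,         -- where a worker can fetch the original
    status TEXT NOT NULL DEFAULT 'queued',
    claimed_by TEXT
)
"""


def enqueue(conn: sqlite3.Connection, source_path: str) -> int:
    """Add a job to the end of the queue and return its id."""
    cur = conn.execute(
        "INSERT INTO recode_queue (source_path) VALUES (?)", (source_path,)
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection, worker: str) -> Optional[Tuple[int, str]]:
    """Claim the oldest unclaimed job for this worker, or return None."""
    row = conn.execute(
        "SELECT id, source_path FROM recode_queue "
        "WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The status check in the WHERE clause guards against two workers
    # claiming the same job: only one UPDATE can match the row.
    cur = conn.execute(
        "UPDATE recode_queue SET status = 'running', claimed_by = ? "
        "WHERE id = ? AND status = 'queued'",
        (worker, row[0]),
    )
    conn.commit()
    if cur.rowcount != 1:
        return None  # another worker got there first
    return (row[0], row[1])
```

With several recoding machines sharing one database, each would call `claim_next` with its own worker name and then download or mount the returned source path.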
In projects like this, I have found that most of the work lies in maximizing compatibility
with as many media sources as possible. This means researching the most common formats
users are likely to upload content in (existing usage data from Commons would be a good
starting point), obtaining as many sample files in those formats as possible, and testing
them with both the upload analyzer and recode components. Often someone’s $40
hardware encoder or meta-information writer isn’t quite following official specifications,
causing either the upload analyzer or decoder to choke; uncovering problems like this
and finding workarounds to as many of them as possible is where I expect much of my
time will be invested in the later stages of the project.
An interesting challenge is investigating the possibility of developing code that behaves
reasonably on platforms not administered by Wikimedia, since they would not
necessarily provide the same set of multimedia analyzing and recoding utilities. This
would be a necessary step if this project were to be incorporated in the mainstream
MediaWiki. While my primary development goal will be achieving flawless performance
with one homogeneous set of utilities for use on properly configured servers, if I have time
I will also take a look at probing Linux-like systems for common codecs and utilities. The
upload analyzer and recode manager would then take available utilities into consideration
when determining if an upload can be recoded.
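Probing a system for available utilities could start as simply as the sketch below. The tool list and the mapping from media kind to required tool are assumptions for illustration; `shutil.which` only confirms that an executable is on the search path, not which codecs it was built with.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional

# Hypothetical candidate list, taken from the utilities named in this proposal.
CANDIDATE_TOOLS = ("ffmpeg2theora", "ffmpeg", "mplayer")

# Hypothetical mapping from media kind to the encoder it depends on.
REQUIRED_TOOL = {"video": "ffmpeg2theora", "audio": "ffmpeg"}


def probe_tools(
    candidates: Iterable[str] = CANDIDATE_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, str]:
    """Return a mapping of installed tool name to its resolved path."""
    found = {}
    for name in candidates:
        path = which(name)
        if path:
            found[name] = path
    return found


def can_recode(kind: str, available: Dict[str, str]) -> bool:
    """Report whether the tool needed for this media kind was found."""
    tool = REQUIRED_TOOL.get(kind)
    return tool is not None and tool in available
```

The upload analyzer could call `probe_tools` once at startup and consult `can_recode` before promising the user that a file will be recoded.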
Timeline:
Week 1: Search for and become familiar with additional open-source utilities that could
assist me in analyzing or recoding uploaded media.
Week 2-3: Make it. I expect a first implementation of the main deliverables can be
completed in this time.
A day or two around this time could be spent adapting media viewing pages on test
servers to make use of the new database information on ready/not yet available resources.
Week 4 – 8?: Testing against the numerous multimedia formats. Scour the Commons and
other multimedia repositories for content in a diverse range of formats, run it all through
the recoder, and fix issues as they are found.
Remainder: Assuming quality results are reliably obtained by this time, work on
extending the project to do the best/most possible with the set of utilities given on a
particular system. The intent would be to make it usable in mainstream MediaWiki.
Bio:
I am an undergraduate at the University of Minnesota Duluth, where I major in Computer
Science. I am self-taught in web development, but have had direct hands-on experience
with PHP and MySQL for nearly five years, mostly doing from-scratch web applications
for various clients acquired by word-of-mouth. Of these, two have required in-depth
analysis of multimedia files at upload time to determine their specifications, metadata,
etc. One required realtime audio recoding on a Linux platform (see
http://www.wazee.org/wmd/). In another I implemented an asynchronous recoding job
queue capable of providing done/not done status directly to the site’s end users, as would
be required for this project.