Over the last few years the BBC’s video on demand service iPlayer (a kind of British Hulu) has moved beyond desktop PCs to games consoles, mobile phones, tablets, set-top boxes and internet-connected TVs. But there’s still a segment of the population who aren’t being catered for: those whose primary computing device is an eReader.
To address this I’ve designed a system to automatically convert a terrestrial digital TV transmission, as captured by a DVB card or PVR, into a sequence of static images annotated with the subtitles from the stream, conveying the story in a kind of comic strip format. The result can be viewed as HTML or PDF, which expands the number of devices the content can be viewed on.
In this example I’ve converted an episode of Britain’s favourite aspirational soap, ‘Eastenders’. The technical details of the conversion process are given later on in this post; first, here is the basic operation of the system.
A basic strategy is to only render the frames where subtitles appear (subtitle frames). This is very simple to implement given a video file and corresponding timestamped subtitle file, and for some sequences this is quite adequate. For example:
However, introductions and reaction shots often have no dialogue but may still contain key plot points. These will be missed by using the subtitle strategy alone. In a programme like Eastenders, where the characters spend much of their time staring into the distance in despair, these silent scenes must be included. To cope with this, scene detection software is used to produce a list of the start and end frames of each scene, then a snapshot is taken from the midpoint of each scene (scene frames). For example, here’s a sequence where Phil sneaks a look at somebody else’s mail after they’ve left the room:
To produce a complete rendering of the entire programme, the subtitle frames and the scene frames are combined. Then any scene frames from scenes that also contain a subtitle frame are discarded. An HTML file can now be created that simply presents the remaining frames in order. This HTML file can then be used as is, converted to a PDF for download onto an eReader, or sent to a printer. This last format is not only for inveterate Luddites; it also allows the content to be consumed in areas where electronic devices are not suitable, for instance on the beach, or in an aircraft during take off and landing. It also allows distribution via the postal service. With a typical PDF episode weighing in at only 20 MB and with offline viewing, this format has a clear advantage over existing mobile iPlayer services.
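The frame selection described above can be sketched in a few lines. This is an illustrative Python sketch, not the script used for this post (which was written in PHP); the function name, the tuple layout and the example frame numbers are all made up for the illustration, and scene ranges are assumed to be inclusive.

```python
from bisect import bisect_left

def select_frames(subtitle_frames, scenes):
    """Combine subtitle frames with scene-midpoint frames.

    subtitle_frames: frame numbers at which a caption is on screen.
    scenes: (start_frame, end_frame) pairs, inclusive (an assumption).
    A scene contributes its midpoint frame only if no subtitle frame
    falls inside it.
    """
    subs = sorted(set(subtitle_frames))
    selected = [(frame, "subtitle") for frame in subs]
    for start, end in scenes:
        # Find the first subtitle frame at or after the scene start.
        i = bisect_left(subs, start)
        has_subtitle = i < len(subs) and subs[i] <= end
        if not has_subtitle:
            selected.append(((start + end) // 2, "scene"))
    # Present everything in programme order.
    return sorted(selected)

# Hypothetical data: three scenes, with dialogue only in the second.
scenes = [(0, 99), (100, 349), (350, 400)]
subtitle_frames = [120, 300]
print(select_frames(subtitle_frames, scenes))
# [(49, 'scene'), (120, 'subtitle'), (300, 'subtitle'), (375, 'scene')]
```

The silent first and last scenes each contribute a midpoint snapshot, while the middle scene is represented only by its two subtitle frames.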
Here’s the complete programme presented in an embedded player:
Eastenders, BBC1, Tue 31 May 2011 19:30 BST
Press the ‘Play’ button to advance to the next frame, and the other transport buttons to navigate the programme.
To convert the video, the DVB stream is first demultiplexed into the audio, video and subtitle streams using ProjectX. The DVB subtitles are actual images of the rendered captions carried as a subpicture stream. These could be used directly; however, I had access to a pre-processed subtitle file in XML format. This gave a file with the caption text together with the start time, end time, position and colour of each caption. It was created from the subpicture titles using OCR and so occasionally suffers from misspellings and incorrect caption colours, but the XML format is extremely easy to process. An example XML subtitles file is given in the resources section of this post and more information about DVB subtitling can be found in the ETSI document ETSI EN 300 743 ‘DVB Subtitling Systems’.
Once the subtitles are available, a list of scenes in the programme is obtained. I used lav2yuv to do this, which works well enough for this demonstration. To use it, first convert the MPEG file to MJPEG using ffmpeg:
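To give an idea of how little work the XML needs, here is a rough Python sketch that reads captions and turns their times into frame numbers. The element and attribute names (`caption`, `start`, `end`, `x`, `y`, `colour`), the `HH:MM:SS.mmm` timestamp layout and the sample captions are all assumptions made up for this illustration, not the real file format; see the example file in the resources section for that. Taking the midpoint of each caption's display interval as its subtitle frame is likewise just one plausible choice.

```python
import xml.etree.ElementTree as ET

FPS = 25  # PAL frame rate, matching the -r 25 used with ffmpeg

# Made-up sample in a hypothetical schema (not the real subtitle file).
SAMPLE = """<subtitles>
  <caption start="00:00:12.480" end="00:00:14.920" x="120" y="480" colour="#ffff00">First caption</caption>
  <caption start="00:01:03.000" end="00:01:05.200" x="96" y="440" colour="#ffffff">Second caption</caption>
</subtitles>"""

def timestamp_to_ms(ts):
    """Convert 'HH:MM:SS.mmm' to integer milliseconds (avoids float rounding)."""
    hms, _, frac = ts.partition(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    millis = int(frac.ljust(3, "0")[:3]) if frac else 0
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis

def parse_captions(xml_text, fps=FPS):
    """Return one dict per caption, with times converted to frame numbers."""
    captions = []
    for node in ET.fromstring(xml_text).iter("caption"):
        start = timestamp_to_ms(node.get("start")) * fps // 1000
        end = timestamp_to_ms(node.get("end")) * fps // 1000
        captions.append({
            "text": (node.text or "").strip(),
            "colour": node.get("colour"),
            "position": (int(node.get("x")), int(node.get("y"))),
            "start_frame": start,
            "end_frame": end,
            # Assumed choice: snapshot halfway through the caption.
            "frame": (start + end) // 2,
        })
    return captions

for caption in parse_captions(SAMPLE):
    print(caption["frame"], caption["colour"], caption["text"])
```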
ffmpeg -i source_video -r 25 -an -vcodec mjpeg destination_video.avi
Then build the scene list using lav2yuv:
lav2yuv -S list.txt destination_video.avi
This produces a file which is simply a list of start and end frames for each scene. Finally, a simple PHP script generates the HTML output, which can be passed through wkhtmltopdf to produce a PDF file. The whole process could easily be added to iPlayer’s current automated transcoding chain.
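The HTML generation step is the simplest part. The sketch below shows the general idea in Python rather than PHP; it is not the script used for this post, and the panel structure, image file names and caption styling are all hypothetical.

```python
from html import escape

def render_html(title, panels):
    """Build a plain HTML page with one image per panel, captions underneath.

    panels: list of dicts with an "image" path and an optional "captions"
    list of {"text": ..., "colour": ...} dicts (a made-up structure).
    """
    safe_title = escape(title)
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',
        "<title>" + safe_title + "</title>",
        "</head>",
        "<body>",
        "<h1>" + safe_title + "</h1>",
    ]
    for panel in panels:
        src = escape(panel["image"], quote=True)
        parts.append('<div class="panel">')
        parts.append('<img src="' + src + '" alt="">')
        for caption in panel.get("captions", []):
            colour = escape(caption["colour"], quote=True)
            text = escape(caption["text"])
            parts.append('<p style="color: ' + colour + '">' + text + "</p>")
        parts.append("</div>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)

# Hypothetical panels: one silent scene frame, one subtitle frame.
panels = [
    {"image": "frames/000049.jpg"},
    {"image": "frames/000342.jpg",
     "captions": [{"text": "First caption", "colour": "#ffff00"}]},
]
with open("episode.html", "w", encoding="utf-8") as handle:
    handle.write(render_html("Example episode", panels))
```

The resulting file can then be converted with `wkhtmltopdf episode.html episode.pdf`.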
- The DVB stream also includes an audio description track for the visually impaired, which provides commentary about the scenes. Currently this only exists as audio; however, if it were available as a timecoded text file similar to the subtitles file it would greatly improve the contextualisation of scenes. As an example, here’s the audio description immediately prior to the scene above with Masood: Example audio description
- It would be simpler and more accurate to use the DVB subpicture subtitles directly. Perhaps this can be done with the mplayer -sid option when capturing frames? However, mplayer had difficulty capturing specific frames, preferring instead to capture the nearest keyframe.
- The scene detection software is not infallible and can miss some scenes. It also requires the programme to be converted to MJPEG before processing, which is slow. A more efficient, tunable solution would be desirable.
Here’s the HTML version on a Kindle, inside a lovely case courtesy of Velvetmutineer.
The embedded HTML player shown above can also be viewed directly on a mobile device at http://cdn.frisnit.com/dvb2pdf/: