WebVTT and Video Subtitles
With the web industry embracing HTML5’s ability to embed video in the browser, we need to take a further look at what the future holds for video and HTML5, particularly in the field of accessibility.
Updated to reflect changes in the WebVTT specification.
It’s all very well to embed audio and video into your website, but how accessible are these? Simply adding a video is fine for those who don’t need any help in viewing it, but for those who might need to read what’s being said or have something read it out to them (to take one particular example) it’s not so useful.
The WHATWG, and in particular Silvia Pfeiffer, have attempted to address this with a new file format called WebVTT: Web Video Text Tracks. This file, used in conjunction with the HTML5 track element, can be used to specify accessible information for a multimedia source, such as subtitles, captions, descriptions and navigation.
That’s quite a lot, and today we’ll simply concentrate on video subtitles, but I aim to cover the others, with examples, in other articles in the near future.
WebVTT file format
A WebVTT file is simply a text file with a .vtt extension that follows a certain format. No surprise there.
A WebVTT file takes the following format:
    WEBVTT FILE

    [idstring]
    [hh:]mm:ss.msmsms --> [hh:]mm:ss.msmsms [cue settings]
    TextLine1
    TextLine2
    ...
As you can see, the file needs to start with a WEBVTT FILE header, followed by the details of each cue to be displayed with the video. (The idstring can be one or more characters, but it must not contain --> or a line break: \r, \n or \r\n.)
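If you ever need to check or convert cue timings outside the browser, the timestamp format above is easy to handle with a regular expression. Here is a minimal Python sketch; parse_timestamp is a hypothetical helper name of my own, not part of any WebVTT library, and it assumes the three-digit millisecond form with optional hours described above.

```python
import re

# Matches [hh:]mm:ss.ttt as described above. The hours part is optional
# and the fraction is assumed to be exactly three digits (milliseconds).
TIMESTAMP = re.compile(r"^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$")


def parse_timestamp(text: str) -> float:
    """Convert a WebVTT timestamp string to seconds (hypothetical helper)."""
    match = TIMESTAMP.match(text.strip())
    if match is None:
        raise ValueError(f"not a WebVTT timestamp: {text!r}")
    hours, minutes, seconds, millis = match.groups()
    return (
        int(hours or 0) * 3600
        + int(minutes) * 60
        + int(seconds)
        + int(millis) / 1000
    )
```

Calling parse_timestamp("00:00:03.500") gives 3.5, and anything that doesn't fit the pattern raises a ValueError rather than being silently accepted.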
Using this, let’s take a look at a sample WebVTT file:
    WEBVTT FILE

    1
    00:00:03.500 --> 00:00:05.000 vertical:lr align:start
    Everyone wants the most from life

    2
    00:00:06.000 --> 00:00:09.000 align:start
    Like internet experiences that are rich <b>and</b> entertaining
This particular example defines two cues, which are, and must be, separated from each other by at least one blank line. The timings are pretty self-explanatory, as are the lines of text, but I’ve thrown in some cue settings there too, which I’ve not yet spoken about (you’ll also notice some HTML in the cue text; more about that in a bit).
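To make the structure concrete, here is a rough Python sketch that splits a file like the sample into its cues, relying on the blank-line rule. The split_cues function and the dictionary keys are names I made up for illustration, and this is a simplification, not a spec-compliant parser (for instance, it assumes the settings are separated from the end time by a single space).

```python
import re

SAMPLE = """WEBVTT FILE

1
00:00:03.500 --> 00:00:05.000 vertical:lr align:start
Everyone wants the most from life

2
00:00:06.000 --> 00:00:09.000 align:start
Like internet experiences that are rich <b>and</b> entertaining
"""


def split_cues(vtt_text: str) -> list:
    """Split WebVTT text into cue dictionaries (simplified sketch)."""
    # Normalise line endings, then split on one or more blank lines.
    normalised = vtt_text.replace("\r\n", "\n").replace("\r", "\n")
    blocks = [b for b in re.split(r"\n{2,}", normalised.strip()) if b.strip()]

    if not blocks or not blocks[0].startswith("WEBVTT"):
        raise ValueError("file must start with a WEBVTT header")

    cues = []
    for block in blocks[1:]:
        lines = block.split("\n")
        # The identifier line is optional: it is present when the first
        # line of the block is not the timing line.
        identifier = None
        if "-->" not in lines[0]:
            identifier = lines.pop(0)
        if not lines or "-->" not in lines[0]:
            continue  # not a cue block; skip it
        start, rest = lines[0].split("-->", 1)
        end, _, settings = rest.strip().partition(" ")
        cues.append({
            "id": identifier,
            "start": start.strip(),
            "end": end,
            "settings": settings.strip(),
            "text": "\n".join(lines[1:]),
        })
    return cues
```

Running split_cues(SAMPLE) returns two dictionaries, one per cue, with the settings still held as a raw string.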
There are a number of settings that can be specified for a subtitle cue which affect how it is displayed on the video:
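- vertical:lr | rl – displays the text vertically, with lines growing left to right (lr) or right to left (rl)
- line:XX% – specifies the line position relative to the video frame
- align:start | middle | end – indicates the text alignment
- position:XX% – specifies the text position
- size:XX% – specifies the width of the cue box, not the size of the text

Note that in the file itself there is no space between the colon and the value, as in the sample above.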
For a more comprehensive view of these, see cue settings.
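Since each setting is just a name:value pair separated by spaces, pulling them apart takes only a few lines. The following Python sketch shows one way to do it; parse_cue_settings is a hypothetical name, and the function does not check that the names or values are ones the specification actually allows.

```python
def parse_cue_settings(settings: str) -> dict:
    """Turn a cue settings string into a dictionary (illustrative only)."""
    parsed = {}
    for token in settings.split():
        name, sep, value = token.partition(":")
        if not sep or not name or not value:
            continue  # ignore malformed tokens rather than failing
        parsed[name] = value
    return parsed
```

For the first cue in the sample, parse_cue_settings("vertical:lr align:start") returns {"vertical": "lr", "align": "start"}.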
As mentioned above, you can also add styling to the text within the cue itself. The example above uses a simple <b> element but CSS classes can also be added:
    3
    00:00:11.000 --> 00:00:14.000 align:end
    Phone conversations where people truly <c.highlight>connect</c>
If you’d prefer the text to appear step-by-step (as in karaoke), you can simply add timestamps within the cue text itself:
    00:00:11.000 --> 00:00:14.000 align:end
    Phone<00:00:11.000> conversations<00:00:12.000> where people<00:00:13.000> truly <c.highlight>connect</c>
These are the basics, although there are some other cue text formatting options available.
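To see how the inline timestamps in the karaoke-style cue break the text into steps, here is a small Python sketch that pairs each piece of text with the time at which it should appear. The timed_segments name is my own invention, and the sketch leaves other markup such as <c.highlight> untouched in the text.

```python
import re

# An inline timestamp tag such as <00:00:12.000> (hours optional).
INLINE_TIMESTAMP = re.compile(r"<((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})>")


def timed_segments(cue_start: str, cue_text: str) -> list:
    """Return (timestamp, text) pairs for a karaoke-style cue (sketch)."""
    # Splitting on a pattern with one capture group yields
    # text, stamp, text, stamp, ..., text.
    parts = INLINE_TIMESTAMP.split(cue_text)
    # Text before the first inline timestamp appears at the cue start.
    segments = [(cue_start, parts[0])]
    for i in range(1, len(parts), 2):
        segments.append((parts[i], parts[i + 1]))
    return segments
```

Passing the cue start time and the cue text from the example above gives four segments: "Phone" at the cue start, then " conversations", " where people" and the final highlighted phrase at their own timestamps.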
So how do we actually tell the browser to use this WebVTT file? Simple: use the <track> element, which you place inside the <video> element after the sources have been specified.
    <track label="English subtitles" kind="subtitles" srclang="en" src="upc-video-subtitles-en.vtt">
The <track> element has a kind attribute, which is set to subtitles here. The same attribute will also be used to specify captions, navigation, descriptions and so on for the same video source.
(More about the track element).
I have used the excellent Playr polyfill by Julien Villetorte, although there are others out there, such as Captionator by Chris Giffard.
That’s all well and good, but I bet you actually want to see it working. I’ve put together a WebVTT Example, which uses the Playr polyfill mentioned above, to illustrate how it works.
The full WebVTT file can also be viewed.
This was only a taster of what the WebVTT file format can do for HTML5 video and for web accessibility. There’s more to come!