Eating the Captioning Dog Food

I'm prepping for an accessibility presentation and realized it was finally time for me to bite the bullet and caption a video. I even set aside an entire afternoon for it. Surprisingly, though, it only took me about 2 hours (for a 2 min video) to input the text, make the final adjustments and export the final product. And it was much more relaxing than I thought it would be.

Captioning Content and Tools

The video is from the Internet Archive, specifically a 1948 Pedestrian Crossing public service video from the U.K. government (midcentury educational videos are often entertaining). I downloaded the 512K MPEG/.

The tool I'm using is Parity (developed by colleague Pat Besong). It's free to Penn State users (and installed in our lab) and will hopefully have a more public release in the future. It works mostly with QuickTime files and right now is Mac only (a common platform for many video shops, although not necessarily for the average instructor).

Here's what a section looks like (warning you not to get distracted while crossing or "you'll soon find yourself in trouble").

Parity has an option to either import a text transcript or let you start from scratch (as in this case). To start from scratch, you open Parity, create a New Project then load a movie into the view screen. You can play portions of the video within Parity as needed.

Once you load a movie, just start to play it and type in the black caption box. The great thing about Parity is that it will continuously loop a 4-second segment until you are able to complete a caption. This looping is actually a huge time saver - a surprising amount of captioning time is wasted on pressing rewind, getting back to the right section, then starting over. The looping really kept me in the captioning groove.

After you write a caption for the 1st 4 seconds, you can hit Return to go to the next 4 seconds (which also loops). You can keep going until the end. The big gotcha here is NOT to hit the Return key to start a new line in the caption (because it kicks you into the next section).
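To make the segment idea concrete, here is a minimal Python sketch of how a video could be divided into 4-second chunks with time codes. This is not Parity's code; the function names and the HH:MM:SS.hh time-code format are assumptions made for illustration.

```python
def format_time_code(seconds):
    """Format a number of seconds as HH:MM:SS.hh (hundredths of a second)."""
    hundredths = round(seconds * 100)
    hours, rem = divmod(hundredths, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, hund = divmod(rem, 100)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{hund:02d}"


def segment_time_codes(duration_seconds, segment_seconds=4):
    """Return (start, end) time-code pairs that cover the whole video."""
    segments = []
    start = 0
    while start < duration_seconds:
        # The last segment may be shorter than the others.
        end = min(start + segment_seconds, duration_seconds)
        segments.append((format_time_code(start), format_time_code(end)))
        start = end
    return segments


if __name__ == "__main__":
    # Hypothetical 10-second clip, just to show the shape of the output.
    for start, end in segment_time_codes(10):
        print(start, "-", end)
```

By this arithmetic, a 2 minute video works out to 30 four-second segments, each needing its own caption.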

Below is a view of the Parity screen with a series of captions and automatically generated time codes.

Set of caption lines at right. Frowning man in video on left

Editing Captions

For what it's worth, I recommend going straight through a video to get the bare captions, then worrying about spell checking and shifting words around later. If you have to go back and listen to a segment, move your cursor to the appropriate caption line, then click the Start button to start the looping process.

You can also shuffle text between lines as needed. One thing to look out for is duplicate captions (if you redo a section). Just delete a caption if you need to remove a line. You can also insert and split captions.

To make final adjustments, I used the "Preview in QuickTime" option to view the captioned video. I could decide to move text around then click the preview again until I was happy. Parity also lets you adjust font formatting and color, although the default is probably fine.

Export for the Web

Parity exports a caption text file that is played alongside the movie. For QuickTime (the default), it exports a .txt file (with the transcript and formatting) and a .smi (SMIL) file, an XML file that calls the movie and the captions and puts them together. As you can imagine, the movie, the SMIL file and the .txt file must be in the same directory.
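For a rough idea of what such a pair of files might contain, here is a minimal Python sketch that builds both as strings. It is an illustration under assumptions, not a copy of Parity's actual output: the `{QTtext}` header and bracketed time stamps are modeled loosely on QuickTime text descriptors, the SMIL document uses only the basic `par`, `video` and `textstream` elements, and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_caption_text(captions):
    """Build a QuickTime-style caption text file from (time_code, text) pairs."""
    # Header values are assumptions; a real export would carry more formatting.
    lines = ["{QTtext}{timeScale:100}"]
    for time_code, text in captions:
        lines.append(f"[{time_code}]")
        lines.append(text)
    return "\n".join(lines) + "\n"


def build_smil(movie_file, caption_file):
    """Build a minimal SMIL document that plays the movie and captions together."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")  # children of <par> play in parallel
    ET.SubElement(par, "video", src=movie_file)
    ET.SubElement(par, "textstream", src=caption_file)
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    captions = [
        ("00:00:00.00", "First caption"),
        ("00:00:04.00", "Second caption"),
    ]
    print(build_caption_text(captions))
    print(build_smil("movie.mov", "captions.txt"))
```

Note that the SMIL file refers to the movie and the captions by bare file name, which is why all three files need to sit in the same directory.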

There are other caption export options including Flash, Adobe Encore and others, but you have to manually convert a QuickTime video to Flash. I recommend an external track whenever possible. That way you (or someone else) can edit typos or adjust formatting even if you don't have access to Parity at the moment.

A main exception is streaming QuickTime video, in which captions must be burned in or embedded as part of the video (ah well).

Final Product

The simplest result is probably QuickTime, so here it is. I used the embed tag to link to a SMIL file. Click the play button to start the video.

Grey Means Go - Color Blindness in Transportation

"Grey Means Go" is a Web site devoted to color blindness issues in transportation and ways that traffic signals can be enhanced so that color blind users know which is the red light.

The site argues (I think successfully) that the new designs would make it easier for everyone. The author, Brian Chandler, cites crash statistics in some cases.

Minimizing Captioning/Transcription Hours

Common accessibility accommodations are audio transcription and video captioning, especially now that audio and video files are easier to create than ever. The transcription, on the other hand, can be a labor intensive process. I've been through a captioning project and I know it does require some hours. Fortunately, I was happy to see that half the class used the captions. It was something all students appreciated.

However, there may be some ideas that can help with getting a quicker transcript. Once you have the transcript, it's much easier to embed it into a video caption, either through a commercial service or through captioning programs. For audio, of course, the transcript is the end product.

Finding a Pre-Made Transcript

Here are some possible ways to get a transcript relatively painlessly. There may be costs involved, but probably less than the alternatives of paying someone to transcribe audio.

  1. Write a script first if you can. Chris Millet from Digital Commons has a good policy that video tutorials aren't made until a text-based version exists first. Not only does this provide a transcript ahead of time, but it lets the shooter plan what tools to cover and in which sequence.

    The text document and the video may not be the same, but they will be much closer, and someone who really needs the content in text format will have something to use. This technique is also good for other informational videos (including fun science videos and role playing scenarios). The main gap is filming a spontaneous interview - that transcript will have to be done after the fact.

  2. Get the script or transcript. There are probably transcriptions for many commercial or popular videos already out there on the Web. Many news organizations are releasing interview or video transcripts on the Web, and many Web sites have music lyrics or may quote key passages of videos. If that fails, you may be able to buy transcripts or screenplays from the source (but you definitely want it to be digital...or you will have to do an OCR scan or (ugh) re-type).

  3. Buy the DVD. Similarly, many DVDs, especially those for movies and TV shows, include an English language subtitle track. You may not be able to stream it, but you may be able to loan it out to a student who requires it. Other students may be able to rent it from the library or the video store.

Copyright Issues?

Technically a screenplay to a film is copyrighted just as the original media is. What restrictions are there? One way to approach it is to buy or obtain transcripts only for the students who request them. Publishers may ask students to sign a license NOT to distribute materials. Another is to consider it from a Fair Use or TEACH perspective. It's doubtful that a transcript to a legitimately acquired video will cause much economic harm. However, if a lawsuit is filed, it must be defended in court.

If a transcript from the original organization exists on the Web, it would probably be best to provide a link. A purchased transcript just for a student needing an accommodation is probably also safe. Other scenarios may require thought, but may be doable.

Can't get a transcript?

If you can't get a transcript, then you may want to consider these options:

  1. First, consider how much of a video you may need. A documentary may be 90 minutes long, but maybe you really only need a 5 minute segment. The same is true for any taped interview. In general, the shorter the video, the easier it is to view, store and transcribe.

  2. Second, you can experiment with speech recognition technology such as Dragon NaturallySpeaking. In theory, speech recognition can take an audio file (or live feed) and generate a transcript. Note, though, that you should proof transcripts, especially for new speakers and content on technical subjects with unusual vocabulary.

    If a faculty member finds the creation of instructional podcasts useful, then this solution may be even more effective because speech recognition works best when it is "trained" to the same speaker over a period of time. The transcriptions will likely become more accurate over time.

  3. Finally there is the least popular option - paying someone to transcribe audio. This is what makes everyone wince, but you don't necessarily have to tie up your multimedia person for hours on end or pay a specialist in California. Transcription is something most people can do from an intern to a deserving grad student needing a little extra cash (even an instructional designer if the clip is short enough). I myself temped as a medical transcriptionist for 2 weeks - it was better than filing by a long shot.

    I would add that funding a grad student TA (or upper level work study) in the department that the course is taught in can be beneficial because that person will know the terminology and the content already. There should be far fewer hiccups along the way.

Live Captioning

A live captionist is someone who is able to transcribe speech as it happens. This kind of specialist does (and should) command a high fee, because of the skill involved. On the other hand, once it's done, you have your transcript all ready to go when the recording is posted.

Before I end this, I thought I would bring up some other factors to consider.

Student Made Video?

A lot of courses are including student video assignments. Do these need to be captioned? In most cases, probably not...but if some kind of peer review is required, and the course includes a hearing impaired student, then yes.

Again writing a script first may help students create a better video and provide a ready-made transcript. However, interview footage will still need to be transcribed.

What about Music or Non-Spoken Audio?

I asked a version of this question to the Office of Disability Services, and the truth is that it's a little hard to predict. For some courses (e.g. a wildlife course featuring bird calls), a brief description of the audio may suffice. In other cases you may want to refer to visuals such as music scores or acoustic spectrograms (which can be made with free software).

If an instructor is in this situation, the Office is willing to learn more about the course to determine what should be done.

Warn your Students in the Syllabus

Which leads into the next point - if your course features video or other technology applications (Excel, PowerPoint, Photoshop, Second Life....), you should inform students in the syllabus. If a student thinks an accommodation is needed, then he or she will know ahead of time to request it early in the semester rather than at the last minute.

Finally, Is that audio necessary?

This is sort of a dangerous question to ask, but one worth asking. For many cases, the answer is absolutely yes. Podcasting is critical in many courses from language to journalism, and a good video is invaluable in almost any discipline. Spoken language is also beneficial to many learning disabled students and preferred by many learners. But if you're factoring in the costs of accommodation, it is worth reviewing your rationale for audio (and making sure that information is available in another format).

I have to admit to some bias here because I don't like podcast audio or its older incarnation, talk radio (either the NPR format or the other format). If content is on audio only, I may avoid it altogether. For one thing, it takes much more time to listen to audio than to read the same amount of text. For another, speaking on a podcast or recording an instructional video well is an acquired skill just like writing for the Web is, and not everyone is good at it (yet). I am one of the many hearing learners who benefit from a transcript!

A good audio is definitely worth the time and effort put into it. You just want to make sure that effort isn't going into a mediocre audio presentation....

ALT Tags without Tears?

I've been talking a lot about accessibility recently, but the one thing I have utterly failed to convey is that it's not as scary as it sounds. Sometimes it can be relatively painless if you just know the right trick.

So I am going to switch up strategies and talk about some tips and tools I have found that make my accessification task easier. First up is the infamous ALT Tag for images.

ALT Tags "Reconceptualized"

The term ALT tag implies a scary HTML tag, but maybe it's better to think of it as a caption to use if the image doesn't download. That is, if a user can't access the image (i.e. it doesn't download correctly or it's not visible), then the browser (or a screen reader) presents an alternate description instead.

Depending on your connection, I think we've all experienced a missing image for a button or link, so wouldn't it be nice to know what it's supposed to be? Voilà, the ALT Tag.
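For anyone curious what that description looks like behind the scenes, here is a minimal Python sketch that builds the HTML for an image with alternate text. The `img_tag` function name, the file name and the description are hypothetical and only for illustration.

```python
from html import escape


def img_tag(src, alt_text):
    """Return an HTML <img> tag whose alt attribute holds the description."""
    # escape() keeps quotes and ampersands in the description from breaking the HTML.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt_text, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical image file and description.
    print(img_tag("crossing.jpg", "Frowning man waiting at a pedestrian crossing"))
```

The description is just one short attribute on the image, which is why most tools can offer it as a single form field.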

How to do it

You can insert an image ALT tag in many tools, even if you don't know any HTML, usually by just filling out a description field in the image upload process. See the links below for inserting ALT tags in different tools:

Work Flow

I admit that if your course (or Web site) uses hundreds of images, then it will be a chore to tag them all at once. So...I don't usually tag them all at once. Instead, I try to tag them in small batches as the course is being developed.

Two strategies I have used:

1. If I am working on a Web site, then I tag each image as I create each page. I actually use Dreamweaver a lot even if the content will end up somewhere else (e.g. ANGEL, Drupal). Because the Dreamweaver ALT tag option is basically a form field in the Properties window (or the initial pop-up I get when I insert the image), I really don't have to touch the code that much (other than to batch change the URL).

2. If you are working in Word first but converting to the Web later, it may make sense to just type in an ALT tag below the image as you insert it. When it comes time for the content to migrate, then the ALT tag will be there to be cut and pasted.

I've been using this process for the last 5 years now, so I can say that most images I use have some sort of ALT tag, and I don't spend too much time...unless I forget to tag as I add.
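For the times when tagging does get forgotten, a small script can list the images that still need a description. Here is a minimal Python sketch using the standard library's HTML parser; the class and function names are hypothetical, and it only checks whether an alt attribute is present, not whether the description is any good.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An empty alt="" counts as present; it is often used for decorative images.
        if "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html_text):
    """Return the src values of images in html_text that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    return finder.missing


if __name__ == "__main__":
    page = '<p><img src="a.png" alt="Logo"><img src="b.png"></p>'
    print(images_missing_alt(page))
```

Running something like this on each page as it is finished fits the small-batch approach, since the list of untagged images never gets a chance to grow long.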

I know there are times when people are batch loading images to a site (e.g. some photo sharing sites) where it is very difficult to add an ALT tag. But I really think that should be the minority case since images are often collected and processed over a period of weeks. Maybe I'm missing something though. That's why I have a comments section.