Live Captioning/Transcription

What is Live Captioning/Transcription?

There are two primary ways that live captioning is used as a support service at IC. The first is as a classroom support for students who are deaf or hard of hearing. In this scenario, a transcriptionist types out the content of a lecture or discussion while a student reads the captions from a computer or mobile device in real time. The second scenario is where captions are used for live video streams (closed captioning). Zoom meetings and webinars, YouTube livestreams, and Kaltura livestreams can all be captioned in this way.

The terminology for live captioning can be a bit confusing. It can be called any or all of the following:

  • Live captioning
  • Live transcription
  • Closed captioning
  • Real-time stenography
  • Real-time captioning
  • Speech-to-text services
  • Classroom transcription
  • Text-based accommodations

Types of Live Captioning

The preferred method for live captioning is to have human-created captions. This means that a trained captionist listens to the live video feed and types out captions in the moment. Human-created captions are the only method that meets accommodation standards.

Captioning options

The following definitions are adapted from “Communication Considerations A-Z” and “Speech-to-Text Services: An Introduction.”

CART (Communication access real-time translation)

CART is a method of live captioning that provides “word for word” (verbatim) transcription. This is the type of captioning/transcription provided by court stenographers and is also used for broadcast television. CART providers are highly trained and use specialized equipment, software, and techniques. Stenographic equipment is connected to a computer, where the words appear in English for the viewer to read at speeds of up to 300 words per minute.

CART transcription includes nearly every word spoken, including false starts, misspeaks, and filler phrases. CART is the standard for live events and broadcast television.

Meaning for meaning

Meaning-for-meaning service providers listen to spoken language and translate it into grammatically correct written language. False starts and misspeaks are typically eliminated, which results in fewer words than CART transcription.

Captionists who create “meaning for meaning” transcription use one of two types of specialized software: C-Print or TypeWell. These programs are run on a standard laptop and captionists are trained in abbreviation standards and text-condensing strategies. As with CART, the viewer will see regular English words that condense the discussion or instruction in class. Meaning for meaning is often appropriate in classroom settings (or online instruction) but is NOT used for broadcast television.

Automatic speech recognition (ASR) can now be used to generate live captions. At this time, ASR does NOT consistently meet accessibility standards. It can be quite accurate when the speaker is easily heard, the audio quality is high, and the vocabulary is relatively uncomplicated, but at other times the quality is too low to be useful.

There are a number of tools and programs that offer ASR-generated captions:

Providing Live Captioning Tech Guides

Below are some primary tools used at IC and guides for faculty/staff when Live Captioning is a needed accommodation. Please note: SAS provides the funding for this accommodation in academic settings for course requirements.

Zoom does NOT have automatically generated live captions. Live captions in Zoom can only be provided by assigning captioning to a third party (a person or a service). There are several ways for students to access captions for synchronous classes:

  1. The meeting host can assign captioning duties to a participant in the meeting (using the CC function within Zoom). The participant would then type directly into the captioning window.
  2. The meeting host can pass an API token to a meeting participant, also through the CC function. This allows the participant (usually a contracted service) to connect specialized software to the Zoom meeting for more efficient captioning (the software can raise output from standard typing speeds to as much as 300 wpm in extreme cases).
  3. The captionist/transcriptionist joins the meeting as a participant to hear the audio but uses separate software to convey captions to the student (generally through a website). This eliminates the need for the meeting host to assign captioning duties every meeting. The student just needs to make sure the captionist has the Zoom link.
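
For readers curious about what option 2 involves technically, the following is a minimal sketch of how captioning software might send one line of text to a Zoom meeting using the API token. It rests on assumptions not stated in this guide: that the token is a URL, that each caption is sent as a plain-text POST, and that `seq` (a running counter) and `lang` are added as query parameters. The function names are hypothetical, and the details should be checked against Zoom's current documentation before use.

```python
import urllib.request


def build_caption_url(api_token_url: str, seq: int, lang: str = "en-US") -> str:
    """Append the sequence number and language to the token URL.

    Assumption: the service expects `seq` and `lang` query parameters.
    The token itself is left untouched so its signature is not altered.
    """
    separator = "&" if "?" in api_token_url else "?"
    return f"{api_token_url}{separator}seq={seq}&lang={lang}"


def build_caption_request(api_token_url: str, text: str, seq: int,
                          lang: str = "en-US") -> urllib.request.Request:
    """Prepare (but do not send) a plain-text POST carrying one caption line."""
    return urllib.request.Request(
        build_caption_url(api_token_url, seq, lang),
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def send_caption(api_token_url: str, text: str, seq: int) -> int:
    """Send one caption line and return the HTTP status code."""
    request = build_caption_request(api_token_url, text, seq)
    with urllib.request.urlopen(request) as response:
        return response.status
```

In practice, a captioning service would call something like `send_caption` once per line of text, increasing `seq` by one each time so that captions are displayed in order.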

Live captioning in Kaltura (media.ithaca.edu) is used only during large events that will be streamed live (such as graduation or an all-staff or faculty meeting). If you would like to learn more about live captioning in Kaltura, contact Information Technology at servicedesk@ithaca.edu.

Live Encoding Best Practices Guide

YouTube Live caption requirements

ACS: YouTube Live, Sharing Your Captioning URL

  • Microsoft Teams
  • YouTube Premiere
  • VoiceThread
  • FlipGrid