Exalead Creates Automatic Searchable Speech Transcripts
French search specialist Exalead thinks it's cracked one of the big challenges of search, making video searchable, and has landed a high-profile customer to prove it: the Web site of France's President.
Elysee.fr, the presidential Web site, is the first to use Voxalead, a tool developed by Exalead to make video and audio files searchable by automatically transcribing the speech they contain.
Now, Voxalead makes it possible to search all the videos of President Nicolas Sarkozy stored on the site for speeches and press conferences in which he used a particular word or phrase, and to jump straight to the spot in the video where he utters it. The transcripts can also be overlaid on the videos as subtitles.
The tool will enable people to easily identify the president's position on particular issues, according to Exalead.
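The jump-to-word search described above can be illustrated with a minimal sketch. It assumes the speech recognizer produces a time-aligned transcript, that is, a list of words each tagged with its start time in seconds. Exalead has not published how Voxalead works internally, so the data layout, the sample sentence and the function names `normalize` and `find_phrase` are all hypothetical.

```python
import string
from typing import List, Tuple

# Assumed format: one (start time in seconds, word) pair per spoken word.
Transcript = List[Tuple[float, str]]


def normalize(word: str) -> str:
    """Lowercase a word and strip leading/trailing punctuation."""
    return word.lower().strip(string.punctuation)


def find_phrase(transcript: Transcript, phrase: str) -> List[float]:
    """Return the start time of every occurrence of `phrase`."""
    target = [w for w in (normalize(p) for p in phrase.split()) if w]
    if not target:
        return []
    words = [normalize(w) for _, w in transcript]
    hits = []
    # Slide a window the length of the phrase over the transcript.
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            hits.append(transcript[i][0])
    return hits


# Made-up sample data, not taken from any real speech.
sample: Transcript = [
    (0.0, "Good"), (0.4, "evening."), (1.2, "The"), (1.4, "economy"),
    (2.0, "is"), (2.2, "growing,"), (3.0, "and"), (3.2, "the"),
    (3.4, "economy"), (4.0, "matters."),
]

print(find_phrase(sample, "the economy"))  # [1.2, 3.2]
```

A video player could seek to any of the returned times, and the same timed words could be grouped into subtitle lines. A production system would more likely use an inverted index than a linear scan, but the principle is the same.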
Google's YouTube also offers searchable transcripts of some of its videos, and it too uses those transcripts for subtitling. Users can choose to automatically generate a transcript for an English-language video when they upload it to YouTube, but the speech recognition can be very hit-or-miss. YouTube also offers a more reliable option, where uploaders attach a text file containing their own transcript of the video, which YouTube then synchronizes with the audio track to generate subtitles and allow searchers to jump straight to a particular word.
Exalead apparently still has some fine-tuning to do, too, if its software is to correctly transcribe the president's voice. In a video of a speech he gave on March 24, the quality of transcription is good, but Voxalead repeatedly hears the sound "eh" as "ee", leading it to transcribe "c'est" (it is) as "six" (6), among a number of similar errors.
Voxalead has been available for months as an open beta test on Exalead's "Labs" site, where the company showcases it creating searchable transcripts of news broadcasts in English, French, Chinese, Arabic, Spanish and Russian.
The transcription and search tool was developed in Exalead's laboratories, but the company also acknowledged the support of researchers working for the French National Scientific Research Center (CNRS) and at Vecsys Research, a French company specializing in speech recognition.