The Bottleneck – Transcription File to Captioning File
After working through the Watson – Amara approach, it became clear that a piece is missing. Watson does a decent job of creating a transcript. It even has something it calls a “Timing File”. Unfortunately, from the demo page, it is not possible to easily save either this file or the JSON file that is also available. I presume this is because Watson Speech-to-Text is a paid service and I am using a demo, so I will have to try the full-fledged service.
Trying out the Watson Speech-to-Text API
The biggest question is “Can the Watson speech API output a Caption file format?” I went over to Wikipedia for some background and a list of the different formats.
A short simple introduction can be found here: https://en.wikipedia.org/wiki/Timed_text
A more in-depth article on Subtitles can be found here: https://en.wikipedia.org/wiki/Timed_text
What I was looking for, a chart of the formats, is here:
From the Amara.com upload dialog the accepted formats are:
Our site accepts SRT, SSA, SBV, DFXP, TXT, and VTT format. Only files ending in .srt, .ssa, .sbv, .dfxp, .txt, .vtt or .xml (for dfxp) are accepted.
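Since SRT is on that list, here is a minimal sketch of how word-level timings could be turned into an SRT file by hand. It assumes the Watson JSON carries a `timestamps` list of `[word, start, end]` triples under `results[].alternatives[]`, which is my understanding of the response when timestamps are requested. The function names and the seven-words-per-cue grouping are my own placeholders, not anything Watson or Amara provides.

```python
def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group [word, start, end] triples into numbered SRT cues.

    max_words is an arbitrary illustrative limit on words per cue.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w[0] for w in chunk)
        start, end = chunk[0][1], chunk[-1][2]
        cues.append(
            f"{len(cues) + 1}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


def watson_json_to_srt(response, max_words=7):
    """Collect word timings from an (assumed) Watson-style response dict."""
    words = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # Use only the first (highest-ranked) alternative.
            words.extend(alternatives[0].get("timestamps", []))
    return words_to_srt(words, max_words)


# Hand-written sample data in the assumed shape, not real Watson output.
sample = {
    "results": [
        {"alternatives": [{"timestamps": [["hello", 0.0, 0.5],
                                          ["world", 0.5, 1.25]]}]}
    ]
}
print(watson_json_to_srt(sample))
```

Saving the returned string to a file ending in .srt should give something Amara's upload dialog will accept, although I have not verified that against a real Watson response.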
Attempt 1 – GitHub, SubtitleMe
I did what all of us do: I googled “Watson Speech API convert to subtitles.” Some of the first entries returned were GitHub entries, so I tried the first.
It is a program called “SubtitleMe”, which claims to use the Watson Speech API to create a subtitle file. Here is my first attempt:
This certainly could be user error, but I definitely want the easiest solution. I will try the second GitHub entry.
Attempt 2 – GitHub, Subtitler
It turns out that Subtitler is a fork of the first one, but it did seem to get a little further. I fed it a one-minute file. After more than a minute of streaming the file, I was still getting no results. I set that window aside and moved on to the next approach.
Attempt 3 – Using IBM-Watson Nodes for Node-RED
In the next article, I will be going through how I used Node-RED to query IBM Watson’s Speech-to-Text API. For now, here is a screenshot: