This video tutorial gives a brief introduction to WebMAUS, the web interface to the Munich AUtomatic Segmentation system (MAUS). MAUS aligns speech signals to linguistic categories, which makes it possible, among other things, to align the audio track of a video to its transcript. As input, WebMAUS needs a speech signal and some form of transcription of the spoken text.
To produce the output, the input text is first normalized. The Balloon tool then generates the expected pronunciation in SAMPA (a phonetic alphabet). In the next step, all other possible pronunciation variants are generated, each with its probability. These variants are represented in a probabilistic graph, in which WebMAUS finally searches for the path of phonetic units that were actually spoken. The result is a transcript of the real pronunciation along with its segmentation.
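The search step can be illustrated with a toy example. The sketch below is not WebMAUS's actual implementation: it reduces the probabilistic graph to a flat list of whole-word variants, and the SAMPA strings, prior probabilities, acoustic scores, and the function name `best_variant` are all made up for illustration. It only shows the idea of combining a variant's probability with how well it fits the signal and picking the best-scoring one.

```python
import math

# Hypothetical pronunciation variants of the German word "haben" in SAMPA,
# each with an invented prior probability (not taken from WebMAUS).
variants = {
    "h a: b @ n": 0.5,  # expected (canonical) pronunciation
    "h a: b n": 0.3,    # schwa deleted
    "h a: m": 0.2,      # strongly reduced
}

# Invented acoustic log-likelihoods: how well each variant fits the signal.
acoustic_loglik = {
    "h a: b @ n": -12.0,
    "h a: b n": -9.5,
    "h a: m": -11.0,
}


def best_variant(variants, acoustic_loglik):
    """Return the variant with the highest combined log score."""
    def score(variant):
        # Log prior of the variant plus its acoustic log-likelihood.
        return math.log(variants[variant]) + acoustic_loglik[variant]
    return max(variants, key=score)


best = best_variant(variants, acoustic_loglik)
print(best)
```

With these invented numbers the schwa-deleted variant wins, even though the canonical form has the higher prior, because it fits the (pretend) signal better.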
WebMAUS is available both as an open-source download and as a web application. It is free to use for all academic users in Europe.