Using automatic annotation tools for transcription files
The EXMARaLDA Partitur-Editor (musical score editor), from version 1.6 onward, provides access to the web services offered by WebLicht and the CLARIN-D infrastructure. These linguistic tools can augment a transcription with automatic annotations, including morphological and syntactic information. Neither additional software installation nor format conversion is needed. WebLicht as a Service workflows can be defined once and later run with a single button click.
Especially relevant for
Everybody who works with the EXMARaLDA Partitur-Editor and wants to annotate their files automatically, for example researchers working in:
- linguistics
- anthropology
- political science, especially when working with video and audio files
Starting point:
A transcription file in the EXMARaLDA Partitur-Editor. Supported import formats include, among others, EXMARaLDA, FOLKER, EAF, and Praat files.
Task:
Adding multiple layers of automatic annotation without conversion or installation steps
Solution:
Enable access from the EXMARaLDA Partitur-Editor to WebLicht as a Service
Related CLARIN-D tools and services
Short guide to using WebLicht as a Service in the EXMARaLDA Partitur-Editor:
Preparation - setting up a processing chain:
- go to WebLicht
- click on 'Start Weblicht'
- log in with your institutional account
- click on 'Start'
- press the 'Browse' button to upload a TCF file (a transcript can be exported as a TCF file from the Partitur-Editor)
- click on 'OK'
- choose the tools that define the workflow. This example uses the 'Morphology' and 'TreeTagger' services (for German)
- click on 'Run Tools' and wait for the workflow to finish
- click on 'Download chain' and save the processing chain file
Example files (for German):
Preparation - setting up an API key:
- go to the WebLicht API key website and click on 'generate'
- copy and paste the API key into a TXT file and save it
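Because the key is transferred by copy and paste, the saved TXT file can easily end up with a trailing newline or stray spaces. The following is a minimal Python sketch for checking the file before using the key in the Partitur-Editor. The function name `read_api_key` is hypothetical, and the check that a key contains no internal whitespace is an assumption about the key format, not something stated by WebLicht.

```python
from pathlib import Path


def read_api_key(key_file):
    """Return the API key stored in a plain-text file, without surrounding whitespace."""
    text = Path(key_file).read_text(encoding="utf-8")
    # Copy and paste often adds a trailing newline or spaces; remove them.
    key = text.strip()
    if not key:
        raise ValueError(f"No API key found in {key_file}")
    # Assumption: a valid key is a single token without internal whitespace.
    if any(ch.isspace() for ch in key):
        raise ValueError("The API key contains whitespace; it may have been pasted incorrectly")
    return key
```

Calling `read_api_key("apikey.txt")` (with a file name of your choice) returns the cleaned key, or raises a `ValueError` if the file is empty or looks malformed.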
WebLicht as a Service in the Partitur-Editor:
- open your transcript in the Partitur-Editor
- go to the menu 'CLARIN WebLicht' ...
- set the following settings:
  - transcription language
  - segmentation algorithm (according to the transcription conventions, or 'GENERIC')
  - path to the processing chain file
  - API key
  - desired output format: when selecting the output folder, also select the format of the output file: TCF, TEI (ISO standard for spoken language transcription), and/or HTML (visualization format)
- click on 'OK'
- processing progress is shown in a new window
- the annotated transcription is written to the output folder as a TCF, TEI, and/or HTML file
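To check which annotation layers the processing chain actually added, the TCF output can be inspected with a few lines of Python. This is a minimal sketch, not part of EXMARaLDA or WebLicht: the function name `list_tcf_layers` is hypothetical, and the namespace `http://www.dspin.de/data/textcorpus` is assumed to be the one used for the `TextCorpus` element in TCF 0.4 files.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the TextCorpus element in TCF 0.4 files.
TC_NS = "http://www.dspin.de/data/textcorpus"


def list_tcf_layers(tcf_path):
    """Return {layer name: number of child elements} for each layer in a TCF file."""
    root = ET.parse(tcf_path).getroot()
    # TextCorpus is expected as a direct child of the document root.
    corpus = root.find(f"{{{TC_NS}}}TextCorpus")
    if corpus is None:
        raise ValueError("No TextCorpus element found; is this a TCF file?")
    layers = {}
    for layer in corpus:
        name = layer.tag.split("}", 1)[-1]  # drop the namespace part of the tag
        layers[name] = len(layer)  # e.g. number of tokens or tags in the layer
    return layers
```

For a chain like the one in this guide, the returned dictionary would be expected to contain entries for the token layer plus the layers added by the chosen tools. The exact layer names depend on the services in the chain.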