------------------------------------------------------------------------------------
2012-03-12 - German Sign Language Corpus SIGNUM

Dipl.-Ing. Ulrich von Agris of RWTH Aachen (Institut für Mensch-Maschine-Kommunikation) makes SIGNUM, the corpus of German Sign Language completed in 2009, freely available to the CLARIN infrastructure. The 847 GB video corpus was funded by the DFG and shows words and phrases signed by 25 subjects under optimal technical conditions. From the CMDI metadata description: "The SIGNUM Database contains both isolated and continuous utterances of various signers. Since we use a vision-based approach for sign language recognition, the corpus was recorded on video. For quick random access to individual frames, each video clip is stored as a sequence of images. The vocabulary comprises 450 basic signs of German Sign Language (DGS) representing different word types. Based on this vocabulary, a total of 780 sentences were constructed. Each sentence ranges from two to eleven signs in length. No intentional pauses are placed between signs within a sentence, but the sentences themselves are separated. The entire corpus, i.e. all 450 basic signs and all 780 sentences, was performed once by 25 native signers of different sexes and ages. One of them was chosen to be the so-called reference signer; his performances were recorded not once but three times. The SIGNUM Database was created within the framework of a research project at the Institute of Man-Machine Interaction at RWTH Aachen University in Germany. The SIGNUM (Signer-Independent Continuous Sign Language Recognition for Large Vocabulary Using Subunit Models) project was funded by the Deutsche Forschungsgemeinschaft (German Research Foundation) and aimed to develop a video-based automatic sign language recognition system."
------------------------------------------------------------------------------------
2013-01-17 - Italian Studies at Ludwig-Maximilians-Universität München imports Calabrian language documentation into CLARIN

Prof. Dr. Thomas Krefeld and Dr. Stephan Luecke of the Italian Studies department at Ludwig-Maximilians-Universität München have deposited the 'AsiCa' language corpus in the Munich CLARIN repository. The ingest went largely without problems, since the required metadata could be exported directly from a SQL database and fed into the CMDI tool COALA. Excerpt from the corpus metadata: "The AsiCa corpus is basically a documentation of the southern Italian dialect 'Calabrese'. The main objectives in building this corpus were the analysis of syntactical structures and their geolinguistic mapping in the form of interactive, web-based cartography. The corpus consists of several audio files containing recordings of some sixty speakers of Calabrese, one half of whom have migration experience in Germany, the other half having almost always stayed in Calabria. Furthermore, the informants were selected to be equally balanced regarding gender, age and geographical origin. For most of the informants there exists at least one recording with spontaneous speech and one recording based on stimuli. The results of the syntactical analysis (maps and text) can be seen on the project's website at http://www.asica.gwi.uni-muenchen.de." With the transfer of the metadata into the CLARIN infrastructure, Prof. Krefeld's data automatically become visible in all access mechanisms such as the VLO or Federated Content Search.
------------------------------------------------------------------------------------
2012-12-12 - University of Graz takes German CLARIN speech corpora as a model and uses CLARIN tools

Prof.
Martin Hagmüller of the University of Graz models his own recordings of Austrian German on the structure of the German CLARIN speech corpus BAS PD1, which has been made freely available via the CLARIN repository of the BAS. In addition, Prof. Hagmüller's group uses the CLARIN recording tool SpeechRecorder and creates the pronunciation lexicon of the new speech corpus with the help of the CLARIN tool BALLOON, which was further developed by Uwe Reichel within the German CLARIN project and will soon be available as a free web service.
------------------------------------------------------------------------------------
2013-02-05 - WebMAUS segments rare/endangered languages in DOBES

The MPI for Ethnology in Leipzig uses a new variant of WebMAUS for the segmentation of interview recordings with speakers of very rare or endangered languages. On the one hand, the new language variant 'sampa' is used, with which WebMAUS processes all possible phonetic SAM-PA symbols language-independently; on the other hand, the ethnologists use the newly implemented batch mode of MAUS, which allows the processing of very long audio files that have been pre-segmented into dialogue turns. In this way the ethnologists can fully automatically achieve a word-accurate indexing of their video data, which forms the basis for further analyses of the material. First results were presented in contributions to the workshop 'On Exploring Data from Language Documentation' in Berlin (May 2013) and to the DoBeS conference in Hannover (June 2013).
------------------------------------------------------------------------------------
2013-08-12 - WebMAUS drives a game avatar

A start-up company from Munich uses WebMAUS to control the lip movements of its game avatar.
Quote from the developer's email: "The game is designed as an app for the iPad; graphically it uses simple, child-friendly cartoon-like 2D graphics and simple step-by-step animations. Explanations and hints for the learning games are given by animated characters that speak certain texts, which are provided as individual speech audio files. The speaking is additionally visualized by animating the characters' mouths. There are only two animation states: mouth open and mouth closed. In a related predecessor game this was solved by sampling the amplitudes of all audio sequences in a preprocessing step. If the amplitude was above a defined threshold, this was interpreted as 'mouth open', below it as 'mouth closed'. The results of this preprocessing were stored in data structures that were later loaded together with the audio when the sounds were played in the game, so that the animation was driven synchronously. The result of this approach is not fully satisfactory, however, because the mouth movements do not look natural. Our impression is that consonants in particular do cause an amplitude excursion, but that one does not expect an open mouth in the animation. For the current game we therefore looked for a different solution. As a basis we have both the speaker audio files and all corresponding texts, but no 'phonetic transcription'. Our basic idea was to tend to show the mouth open for all vowels and closed for all consonants. While researching solutions we came across MAUS. Using WebMAUS we made the attempt, for an example sequence, to have an audio file phonetically segmented together with the text file of the speaker's script.
Using the MAUS output we could drive the animation with the intervals given there, assigning each phonetic 'letter' one of the two animation states. The result was astonishing: the animation looks much more natural than with the old algorithm."
------------------------------------------------------------------------------------
2013-11-11 - Alignment of the 'Dialogstrukturen' corpus with WebMAUS

The corpus 'Dialogstrukturen (DS)' was collected in Freiburg in the 1970s under the direction of Hugo Steger as one of the first conversational corpora of German, and was fully digitized in the Archiv für Gesprochenes Deutsch at the beginning of the 2000s. A major shortcoming for its usability, however, was always the missing text-to-audio alignment, i.e. the link between sections of the transcript and the corresponding passages in the recording. With the help of the WebMAUS service, which the BAS in Munich offers via CLARIN, this deficit has now been remedied. After a manual pre-alignment, the WebMAUS service was used to align the transcripts of the DS corpus word by word with the corresponding recordings. The result can now be used via the Datenbank für Gesprochenes Deutsch (DGD2, http://dgd.ids-mannheim.de). (Thomas Schmidt)
------------------------------------------------------------------------------------
2014-01-06 - The Sad Story of SCRIBE

This is not a success story but a good, typical example of what happens when no CLARIN takes care of data. Quote from http://www.phon.ucl.ac.uk/resource/scribe/, author presumably Mark Huckvale: "Status of the corpus: The available audio recordings and annotations were released on eleven CD-ROMs (labelled SCRIBE_0 to SCRIBE_11) in April 1992.
These were originally distributed by the Speech Group at the National Physical Laboratory, but after this was closed down the disks were passed to the MOD Speech Research Unit at Malvern, which passed the disks on to a private contractor (who kept them in his garage). The Speech Research Unit itself became part of 2020Speech Ltd in 2000. The current availability of the CD-ROMs is unknown. At UCL we have one complete set which is labelled "Copyright © University of Cambridge, University of Edinburgh and University College London". The main documentation on the CD-ROMs has been collated into a SCRIBE Manual which can be viewed online - see below. Investigation of the annotated components of the corpus has revealed a number of file labelling and annotation alignment errors. Mark Huckvale has put a lot of effort into correcting these, and the corrected annotated sub-component of the corpus is now available for download. This is only a small sub-set of the entire corpus and is made available on the understanding that ownership remains with the original producers, and that this material may not be sold or used in commercial products or services."
------------------------------------------------------------------------------------
2015-11-01 - MAUS has a new descendant: Praatalign - a praat plugin for phonetic alignment

Mart Lubbers and Francisco Torreira of the Max Planck Institute in Nijmegen have successfully adapted parts of the MAUS system (hdl.handle.net/11858/00-1779-0000-000C-DA82-F) into a praat plugin, publicly available at https://github.com/dopefishh/praatalign/. The new plugin allows users of the well-known labelling software praat (www.praat.org) to define a tier with an orthographic transcript that marks chunks of recorded speech together with the words spoken.
Using an external pronunciation dictionary or a rule-based text-to-phoneme algorithm ('phonetizer'), the praatalign plugin then replaces the orthography with the standard pronunciations and calls an HTK aligner to align the resulting string of phonetic symbols with the signal. For this the HTK aligner uses the publicly available MAUS HTK HMM sets, downloadable from the MAUS software package at the Bavarian Archive for Speech Signals (http://www.bas.uni-muenchen.de/forschung/Bas/software/). The result of the Viterbi decoding is the segmentation of the phonetic symbol string onto the speech signal, shown in a separate praat annotation tier. Special customized rule sets can be loaded to extend the search space of the Viterbi decoding by pronunciation variants such as replacements, reductions or even insertions of phonetic segments. Thus a user can interactively add and try out new rules to improve the signal alignment process. The CLARIN community welcomes and encourages members of the scientific community to use, improve and extend existing CLARIN services and resources for the benefit of the Digital Humanities. We are therefore very pleased about the initiative of our colleagues Lubbers and Torreira at the Max Planck Institute. We very much hope that this valuable praat plugin will be widely used and further extended to more supported languages, like the CLARIN WebMAUS web services themselves (currently 18 languages supported).
------------------------------------------------------------------------------------
2016-01-20 - The 'Archiv Siebenbürger-Sächsischer Dialekte' (ASD) now in CLARIN

The historic speech collection ASD of Saxons living in Romania (a German-speaking minority), collected in the 1960s-1970s, is now being transferred to the BAS CLARIN repository and thus made available to a larger science community. Edited by Dr. Stephan Lücke and Prof.
Thomas Krefeld of Ludwig-Maximilians-Universität München, Germany, this valuable collection of historic recordings comprises read ('Wenker Sätze') and spontaneous (storytelling) speech of 1811 speakers of Transylvanian Saxon dialects (a variety of German). The recordings were assigned CLARIN-standard CMDI metadata instances that describe speaker features and annotation files for each recording. The ASD corpus will be available to all academic users from the beginning of June 2016 via the standard CLARIN repository of the BAS.
------------------------------------------------------------------------------------
2016-03-01 - New Speech Corpora of Cochlear Implant Speakers 'CI_2'

Four new speech corpora (corpus ID 'CI_2_...') have been added to the BAS CLARIN repository, thanks to Dr. Veronika Neumeyer who courteously provided us with her data sets. These corpora are a follow-up to the first cochlear implant speech corpus 'CI_1' and provide the research community with large samples of speakers using cochlear implant hearing aids. In contrast to 'CI_1', the new corpora contain not only more speakers and recording situations but also new linguistic material such as vowels, consonant clusters, VOT measurements and sibilant data. All data come with manually checked phonetic segmentations and are freely downloadable for scientific use.
------------------------------------------------------------------------------------
2016-04-25 - The SOS Project in Beijing, China

Gunnar Lindenblatt at a clinic in Beijing uses the G2P web service (called via REST!) to create IPA transcriptions of European names. The background is that the 'SOS Card Project' aims to outfit foreign students who do not speak Mandarin with so-called 'SOS Cards' on which - among other vital information - there is also a phonetic encoding of the student's name (in case a doctor has to address the student in an emergency, for instance).
Since IPA is not known in China, the encoding is done in Mandarin characters that Chinese readers can easily phonetize. As input, however, the algorithm needs an IPA version of the name, which is normally given in a Western alphabet. Thanks to the BAS web service G2P ('grapheme-to-phoneme'), the researchers of the 'SOS Card Project' can now automatically translate a new student's name into IPA and then into Chinese characters.
------------------------------------------------------------------------------------
2017-01-20 - The new Chunker algorithm by Nina Poerner

Nina Poerner, a member of the BAS CLARIN team, has developed a new algorithm that is capable of automatically segmenting (or 'chunking') very long transcribed speech recordings into treatable 'chunks'. The trick of her method is that both the signal and the transcript are chunked in parallel. Nina applies several state-of-the-art techniques from speech technology, such as speech recognition and phonetic alignment, for her idea. The algorithm is the heart of the new BAS WebService 'Chunker', which has been published recently (and incorporated into the popular BAS WebService 'Pipeline'). This enables users for the first time to process very long recordings (in the range of several hours) in a MAUS pipeline. Her work has been published here: Nina Poerner and Florian Schiel (2018): A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings, Proceedings of LREC, Miyazaki, Japan.
------------------------------------------------------------------------------------
2017-08-12 - Automatic anonymization of annotated speech data by Florian Schiel

A new BAS WebService 'Anonymizer' has recently been added to the service suite. This service allows users to automatically mask certain (defined) words or names in the speech signal, while at the same time all references to these words are masked in the linguistic annotations as well.
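A minimal sketch of the underlying idea (not the actual BAS implementation; all names and data structures here are invented for illustration): given word-level annotation intervals and the raw signal, silence the samples inside each target interval and overwrite the corresponding label.

```python
# Toy anonymizer: masks target words in both the signal and the annotation.
# Invented for illustration; the real BAS 'Anonymizer' service operates on
# actual audio files and annotation formats.

def anonymize(samples, annotation, targets, rate=16000):
    """samples: list of floats; annotation: list of (start_s, end_s, word)."""
    masked_samples = list(samples)
    masked_annotation = []
    for start, end, word in annotation:
        if word.lower() in targets:
            # silence the word in the signal ...
            for i in range(int(start * rate), min(int(end * rate), len(samples))):
                masked_samples[i] = 0.0
            # ... and mask its label in the annotation
            word = "***"
        masked_annotation.append((start, end, word))
    return masked_samples, masked_annotation

# Example: a one-second "recording" with two annotated words
samples = [0.1] * 16000
annotation = [(0.0, 0.5, "hello"), (0.5, 1.0, "Miller")]
masked, ann = anonymize(samples, annotation, {"miller"})
print(ann)           # [(0.0, 0.5, 'hello'), (0.5, 1.0, '***')]
print(masked[8000])  # 0.0  (first sample of the masked interval)
```

The point of doing both steps in one pass is exactly the service's selling point: signal and annotation can never get out of sync.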
The service thus makes it possible to mask, for instance, sensitive information such as names, addresses or other personal data before publication of the speech data.
------------------------------------------------------------------------------------
2018-03-22 - Release of the new BAS WebService 'Subtitle' by Nina Poerner

This new service developed by Nina Poerner is an easy-to-use tool for the creation of subtitle tracks from transcripts. In combination with an ASR-MAUS pipeline this enables the user to create a subtitle track from scratch fully automatically. The service can also be used to map the result of a MAUS pipeline back to the original transcript and thus restore punctuation etc. that was removed in the tokenization process of the pipeline.
------------------------------------------------------------------------------------
2018-06-15 - New tool 'Voice Activity Detection' by Thomas Kisler and Fritz Seebauer

This new service of the BAS WebServices suite segments a speech recording into stretches of speech and silence ('voice activity'). It is based on a frame-based TensorFlow DNN followed by a smoothing stage.
------------------------------------------------------------------------------------
2019-08-21 - Signal Processing for Dummies: the 'AudioEnhance' tool by Florian Schiel and SoX

The BAS WebService suite presents a new service that offers several signal processing tools via an easy-to-use web interface (or RESTful interface). The tool is mainly aimed at helping laymen optimize their media files for BAS pipeline processing, but it can also be used to manipulate speech in several ways: simple filters, normalization, re-sampling, extracting sound tracks from video, noise cancelling, channel selection/mix, and manipulating fundamental frequency and speech rate.
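To give a flavour of what one of these operations does, here is a minimal peak-normalization sketch in plain Python. The actual AudioEnhance service delegates such processing to SoX; the function and its parameters below are invented for illustration only.

```python
# Toy peak normalization: scale the signal so that its largest absolute
# sample reaches a target level. AudioEnhance performs this (and the
# other listed operations) via SoX; this sketch only shows the idea.

def peak_normalize(samples, target=1.0):
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silent input: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]

quiet = [0.0, 0.25, -0.5, 0.125]       # peak at 0.5
print(peak_normalize(quiet))           # [0.0, 0.5, -1.0, 0.25]
```

Normalization is typically the first thing a layman's recording needs before further pipeline processing, which is why it is worth singling out here.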
------------------------------------------------------------------------------------
2020-06-11 - New text conversion service 'TextEnhance' by David Huss and Florian Schiel

This new tool is a handy little gadget to solve all the mysterious problems that users experience when moving text files across applications or operating systems. The tool reads a large number of different text file formats and produces a homogeneous UTF-8 text file that is cleaned of BOMs, differing line terminators, encodings etc. The BAS WebServices offer this tool as a stand-alone service but also apply it routinely to all input transcription files fed into our pipeline processing. We estimate that this tool will automatically resolve about 40% of the user support enquiries to our hotline.
------------------------------------------------------------------------------------
2020-07-14 - Always the wrong format? The new annotation converter 'AnnotConv'

The new BAS WebService 'AnnotConv' is now operational; the service translates signal-based annotation files into several formats such as praat TextGrid, EMU SDMS, CSV, ISO TEI, EXMARaLDA and ELAN.
------------------------------------------------------------------------------------
2020-10-26 - New Automatic Transcription service provided by Fraunhofer Institute

The German Fraunhofer Institute IAIS kindly provides us with access to their German ASR system for the purpose of scientific speech transcription. The system supports German and American English; in an in-house evaluation the German recognizer showed the best results compared to commercial ASR systems.
------------------------------------------------------------------------------------
2020-12-03 - Speaker Diarization by Fritz Seebauer

'Speaker diarization' is the automatic segmentation of a multi-speaker recording into speaker turns and the correct assignment of each turn to a consistent speaker label.
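As a toy illustration of the labelling step (invented for this text, and far simpler than real diarization): each turn is represented by a speaker embedding; a turn is assigned to the most similar already-seen speaker, or opens a new speaker label when no existing speaker is similar enough.

```python
# Toy speaker-turn labelling by greedy cosine-similarity matching.
# Real diarization systems are far more sophisticated; this sketch only
# illustrates what "consistent speaker labels" means.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def label_turns(embeddings, threshold=0.9):
    speakers = []  # one reference embedding per discovered speaker
    labels = []
    for emb in embeddings:
        sims = [cosine(emb, ref) for ref in speakers]
        if sims and max(sims) >= threshold:
            labels.append(sims.index(max(sims)))   # reuse existing label
        else:
            speakers.append(emb)                   # open a new speaker label
            labels.append(len(speakers) - 1)
    return labels

# Four turns: speaker A, A again, speaker B, A again
turns = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [1.0, 0.1]]
print(label_turns(turns))  # [0, 0, 1, 0]
```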
Our new service of the BAS WebService suite is based on the 'pyannote' library, with some extensions that allow users to predefine the number of speakers in the recording.
------------------------------------------------------------------------------------
2021-01-12 - Release of the new front-end design 3.0 of the BAS WebServices by Markus Jochim

A complete re-design and re-implementation of the front end, together with many new features, has been released today. Markus Jochim, a member of the BAS team, has spent several months working toward this day. We very much hope that you will all enjoy the new user experience.
------------------------------------------------------------------------------------