SWAC Metatags
Current technology makes it possible to systematically record the pronunciation of words and sentences and to build audio linguistic collections (with specific tools, the pronunciation of 1000 words can be recorded in less than one hour).
Such audio collections can be useful for:
- Linguistic research use (record and compare the pronunciation in different regions)
- Didactic use (didactic audio collection such as «English Irregular Verbs»)
- Illustration use (for electronic dictionaries)
Exchanging audio files is now easy thanks to the Internet: files can be easily copied and downloaded. However, to be indexed and used, recordings of words or sentences have to be associated with some additional information (which word or sentence is recorded, in which language, etc.). It would be useful to have a standardized way of exchanging audio recordings along with their associated information. That way, audio collections could easily be produced and used by different software systems, on different platforms, by different people.
This document proposes a convenient and simple way to associate information with an audio recording. Its purpose is not to define which pieces of information have to be provided, but rather how.
Several solutions are possible. For example, each audio file could be associated with a text file containing its individual information. This solution has several drawbacks: the audio and text files could become separated, and each recording is represented by two files.
The Vorbis Comment metatag system makes it possible to store additional information in Ogg and FLAC audio files, which is very useful for organizing word collections. It is an existing, free, and widely supported technology. It allows audio files to be transferred easily together with their associated information, without any external description, because the information is embedded directly in the audio file as metadata in the Vorbis Comment tag.
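As an illustration of how such embedded information is laid out, a Vorbis Comment block is essentially a vendor string followed by a list of UTF-8 « NAME=value » strings, each prefixed by a 32-bit little-endian length. The Python sketch below packs and unpacks that list. The function names are hypothetical, and the codec-specific wrapping (Ogg packet header and framing bit, FLAC metadata block header) is deliberately left out, so this is not a complete tag writer.

```python
import struct


def pack_comments(vendor: str, comments: list[tuple[str, str]]) -> bytes:
    """Serialize a vendor string and a list of (name, value) comments."""
    vendor_bytes = vendor.encode("utf-8")
    parts = [
        struct.pack("<I", len(vendor_bytes)),  # vendor string length
        vendor_bytes,
        struct.pack("<I", len(comments)),      # number of comments
    ]
    for name, value in comments:
        entry = f"{name}={value}".encode("utf-8")
        parts.append(struct.pack("<I", len(entry)))
        parts.append(entry)
    return b"".join(parts)


def unpack_comments(data: bytes) -> tuple[str, list[tuple[str, str]]]:
    """Inverse of pack_comments: return (vendor, [(name, value), ...])."""
    pos = 0
    (vendor_len,) = struct.unpack_from("<I", data, pos)
    pos += 4
    vendor = data[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len
    (count,) = struct.unpack_from("<I", data, pos)
    pos += 4
    comments = []
    for _ in range(count):
        (entry_len,) = struct.unpack_from("<I", data, pos)
        pos += 4
        entry = data[pos:pos + entry_len].decode("utf-8")
        pos += entry_len
        # Field names cannot contain "=", so split on the first one only.
        name, _sep, value = entry.partition("=")
        comments.append((name, value))
    return vendor, comments
```

In practice an existing tagging library would normally handle this; the sketch only shows that SWAC fields travel inside the file as ordinary name/value pairs.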
Below is a proposed list of standard field names with a description of their intended use. We recommend the adoption of a common naming scheme by the communities that produce and consume word collections, in the spirit of the list of standard field names in the Vorbis Comment specification: you don't have to provide a field with the name of the artist of your Ogg tune, but if you do, you should call it "ARTIST" and not "BAND" or anything else. This way, it is easy for players to check whether title information is present and what it is.
None of these fields are intended to be mandatory, although we believe that no real automated
processing can be done without the SWAC_TEXT and SWAC_LANG fields.
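A consumer of a collection might want to check for these two fields before processing a record. The following is a minimal sketch, assuming the tags have already been read into a plain dictionary; the function name is hypothetical.

```python
CORE_FIELDS = ("SWAC_TEXT", "SWAC_LANG")


def missing_core_fields(tags: dict[str, str]) -> list[str]:
    """Return the core SWAC fields that are absent or empty in `tags`."""
    # Vorbis Comment field names are case-insensitive, so compare in upper case.
    present = {name.upper() for name, value in tags.items() if value.strip()}
    return [field for field in CORE_FIELDS if field not in present]
```

For example, `missing_core_fields({"SWAC_TEXT": "house"})` reports that `SWAC_LANG` is missing, while a record carrying both fields yields an empty list.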
FIELDS
1. Information about the pronounced text:
- SWAC_TEXT
-
Text pronounced by the speaker
- « house »
- « it's raining cats and dogs ! »
- SWAC_LANG
-
The language of the word which is pronounced (ISO 639-3)
record             value
« rendezvous »     eng
« rendez-vous »    fra
« crocodile »      eng
« crocodile »      fra
- SWAC_ALPHAIDX
-
Items that allow a program to automatically generate the alphabetical index of the audio collection
The separator is « | » (U+007C)
record                                   value
« house » (eng)                          house
« It's raining cats and dogs! » (eng)    rain|cat|dog
« I am » (eng)                           be
« 啊 » (chi)                              ā
« se laver » (fra)                       laver (se)
« j'ai faim » (fra)                      avoir|faim
« ett fönster » (swe)                    fönster
« telefonul » (ron)                      telefon
- SWAC_BASEFORM
-
When the record is a derived form of a word, this field indicates the base word
record               value
« I was » (eng)      to be
« je vais » (fra)    aller
« друзей » (rus)     друг
- SWAC_FORM_NAME
-
When SWAC_BASEFORM is defined, this field indicates the name of the form
record               value
« je vais » (fra)    Present. 1p.S.
« друзей » (rus)     Gen. Pl.
- SWAC_FORM_REF
-
Name of the reference scheme used by the SWAC_FORM_NAME field (such as the LMF codification)
- SWAC_HOMOGRAPHIDX
-
Index that helps the user distinguish between homographs in the audio collection
The SWAC_HOMOGRAPHIDX is normally based on the grammatical difference between homographs.
If the difference is not grammatical, it can instead be a translation into another language (typically English) or a short explanation.
record                   value
« пропа́сть » (rus)      verb
« про́пасть » (rus)      noun
« os » (fra) /os/        sing
« os » (fra) /o/         plur
record                   value
« мука́ » (rus)          flour
« му́ка » (rus)          pain
« bass » (eng)           fish
« bass » (eng)           music
- SWAC_HOMOGRAPHIDX_REF
-
Name of the reference scheme used by the SWAC_HOMOGRAPHIDX field.
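The way SWAC_ALPHAIDX is meant to drive index generation can be sketched as follows: each record is filed under every item of its « | »-separated value. The function name is hypothetical, and two choices here are assumptions rather than part of the proposal: records without SWAC_ALPHAIDX fall back to their SWAC_TEXT, and entries are sorted by plain Unicode code point instead of language-specific collation.

```python
from collections import defaultdict


def build_alpha_index(records: list[dict[str, str]]) -> dict[str, list[str]]:
    """Map each index item to the texts of the records filed under it."""
    index = defaultdict(list)
    for record in records:
        text = record.get("SWAC_TEXT", "")
        # Assumption: fall back to the recorded text when no index is given.
        raw_items = record.get("SWAC_ALPHAIDX") or text
        for item in raw_items.split("|"):
            item = item.strip()
            if item:
                index[item].append(text)
    # Assumption: simple code-point ordering, not locale-aware collation.
    return dict(sorted(index.items()))
```

With the example values listed above, « It's raining cats and dogs! » would appear under « cat », « dog » and « rain », while « I am » would appear under « be ».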
2. Information about the speaker:
- SWAC_SPEAK_NAME
-
Speaker's name
- « Jacques Durand »
- « Иван Иванович Иванов »
- SWAC_SPEAK_GENDER
-
Speaker's gender [M/F]
- M: masculine
- F: feminine
- SWAC_SPEAK_BIRTH_YEAR
-
Speaker's year of birth
(Format: YYYY)
- SWAC_SPEAK_LANG
-
Speaker's native language
(ISO 639-3)
- SWAC_SPEAK_LANG_COUNTRY
-
Country where the speaker acquired the SWAC_SPEAK_LANG
(ISO-3166-1)
- SWAC_SPEAK_LANG_REGION
-
Region where the speaker acquired the SWAC_SPEAK_LANG
- « Pays basque »
- SWAC_SPEAK_PRON
- General note about the speaker's pronunciation (for example, a speech impediment)
- SWAC_SPEAK_LIV_COUNTRY
-
Code of the country where the speaker lives
(ISO-3166-1)
- SWAC_SPEAK_LIV_TOWN
-
Town where the speaker lives
- « Saint-Jean-Pied-de-Port »
- SWAC_SPEAK_CONTACT
-
Information for contacting the speaker
- « jacques-durand@shtooka.net »
- SWAC_SPEAK_DESC
- Free note about the speaker
3. Information about the pronunciation of the word:
- SWAC_PRON_INTONATION
-
Note about the intonation
record    value
« oh »    Surprise
« oh »    Realization
- SWAC_PRON_SPEED
-
Pronunciation speed [1/2/3]
- 1: slow pronunciation for pedagogical use
- 2: normal pronunciation
- 3: fast pronunciation
- SWAC_PRON_COMMENT
-
Comment about the pronunciation of the word by the speaker
record                                   value
« abasourdir » (fra) /a.ba.zuʁ.diʁ/     Academic pronunciation
« abasourdir » (fra) /a.ba.suʁ.diʁ/     Popular pronunciation
« догово́р » (rus)                       Standard pronunciation
« до́говор » (rus)                       Popular pronunciation in the south of Russia
- SWAC_PRON_API
- Phonetic transcription in the International Phonetic Alphabet (IPA; « API » in French)
- SWAC_PRON_PHON
-
Phonetic transcription in the system specific to the language concerned
record            value
« мука » (rus)    мука́ (with the diacritic symbol)
« 啊 » (chi)       ā (the pinyin transcription)
4. Information about the audio collection:
- SWAC_COLL_NAME
-
Name of the audio collection
- « Base Audio Libre De Mots Français »
- SWAC_COLL_SECTION
- Section in the audio collection
- SWAC_COLL_DESC
- Description of the collection
- SWAC_COLL_ORG
- Organization producing the audio collection
- SWAC_COLL_ORG_URL
- URL where information about the organization producing the audio collection can be found
- SWAC_COLL_LICENSE
- License which applies to the collection
- SWAC_COLL_COPYRIGHT
- Copyrights of the audio collection
- SWAC_COLL_AUTHORS
- Authors of the collection
- SWAC_COLL_URL
- URL where general information about the collection can be found
5. Technical information:
- SWAC_TECH_QLT
-
Audio Quality [1/2/3/4/5]
- 1: very poor
- 2: poor
- 3: normal
- 4: good
- 5: very good
- SWAC_TECH_DATE
-
Date of recording
(Format: YYYY-MM-DD)
- SWAC_TECH_SOFT
- The program which was used to record the sound
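Several of the fields above have a constrained format (M/F, YYYY, 1 to 3, 1 to 5, YYYY-MM-DD). A tool that produces or imports a collection could check them along the following lines. This is only a sketch: the function and table names are hypothetical, tags are assumed to be available as a dictionary of strings, and fields that are absent are simply skipped since none of them is mandatory.

```python
import re
from datetime import date


def _is_iso_date(value: str) -> bool:
    """True for a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


# One predicate per constrained field, following the formats listed above.
FORMAT_CHECKS = {
    "SWAC_SPEAK_GENDER": lambda v: v in ("M", "F"),
    "SWAC_SPEAK_BIRTH_YEAR": lambda v: re.fullmatch(r"[0-9]{4}", v) is not None,
    "SWAC_PRON_SPEED": lambda v: v in ("1", "2", "3"),
    "SWAC_TECH_QLT": lambda v: v in ("1", "2", "3", "4", "5"),
    "SWAC_TECH_DATE": _is_iso_date,
}


def invalid_fields(tags: dict[str, str]) -> list[str]:
    """Return the names of present fields whose value breaks its format."""
    return sorted(
        name
        for name, is_valid in FORMAT_CHECKS.items()
        if name in tags and not is_valid(tags[name])
    )
```

Free-text fields such as SWAC_SPEAK_DESC or SWAC_COLL_NAME are intentionally not checked.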
Note about the Vorbis Comment specification:
For more information about the general specification of comment tags, please consult the Vorbis Comment home page: http://xiph.org/vorbis/doc/v-comment.html
The content of tags such as TITLE, DESCRIPTION, LICENSE and COPYRIGHT can be set to any value.
These fields can be filled automatically using information provided by the SWAC fields. However, it is
recommended to set the GENRE field to « Speech ».
- GENRE
- « Speech »
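Filling the general tags from SWAC fields could look like the sketch below. The mapping itself is an assumption made for illustration (this document does not prescribe which SWAC field feeds which general tag), and the names are hypothetical. Values already set by the user are left untouched.

```python
# Assumed mapping from general Vorbis Comment tags to SWAC source fields.
ASSUMED_SOURCES = {
    "TITLE": "SWAC_TEXT",
    "DESCRIPTION": "SWAC_COLL_DESC",
    "LICENSE": "SWAC_COLL_LICENSE",
    "COPYRIGHT": "SWAC_COLL_COPYRIGHT",
}


def with_general_tags(tags: dict[str, str]) -> dict[str, str]:
    """Return a copy of `tags` with general tags derived from SWAC fields."""
    result = dict(tags)
    for general, source in ASSUMED_SOURCES.items():
        # Only fill a general tag that is not already present.
        if source in tags and general not in result:
            result[general] = tags[source]
    # GENRE is the one general tag with a recommended value.
    result.setdefault("GENRE", "Speech")
    return result
```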
According to the general Vorbis Comment specification, the use of additional fields is allowed. This
enables SWAC fields to coexist with other application-specific pieces of information. For example, electronic
dictionaries can use specific tags such as « OMEGAWIKI_ARTICLEIDX » to link audio items to their
articles.
Note about the ID3v2 Tagging Format:
Since version 2.4 of the ID3 Tagging Format, it is possible to store Unicode character strings in MP3
audio files. We do not recommend the use of this tagging format; however, SWAC fields can be stored as
« TXXX » frames.
Please consult the ID3 Tagging Format home page for more information at: http://www.id3.org/
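To make the « TXXX » option concrete, here is a minimal sketch of how one SWAC field could be serialized as a single ID3v2.4 user-defined text frame, with the field name as the frame description. It assumes UTF-8 text encoding (encoding byte 0x03) and no frame flags. The helper names are hypothetical, and the enclosing ID3 tag header that a real MP3 file needs is not produced.

```python
def synchsafe(number: int) -> bytes:
    """Encode an integer as 4 bytes of 7 significant bits each (ID3v2.4 sizes)."""
    if not 0 <= number < (1 << 28):
        raise ValueError("size does not fit in a 28-bit synchsafe integer")
    return bytes((number >> shift) & 0x7F for shift in (21, 14, 7, 0))


def txxx_frame(description: str, value: str) -> bytes:
    """Build one TXXX frame: header (ID, size, flags) followed by the body."""
    body = (
        b"\x03"                         # text encoding: UTF-8
        + description.encode("utf-8")   # e.g. the SWAC field name
        + b"\x00"                       # terminator after the description
        + value.encode("utf-8")
    )
    return b"TXXX" + synchsafe(len(body)) + b"\x00\x00" + body
```

For example, `txxx_frame("SWAC_LANG", "fra")` yields a 10-byte frame header followed by a 14-byte body.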
Note about this document:
This document is distributed and licensed by the Shtooka Project under the Creative Commons BY-CA License. More information about the license at: http://creativecommons.org/licenses/by/2.0/fr/deed.en_GB
