SWAC Metatags

Current technology makes it possible to systematically record the pronunciation of words and sentences and to build audio linguistic collections (with specific tools, the pronunciation of 1000 words can be recorded in less than one hour).

Such audio collections can be useful for:

  • Linguistic research use (record and compare the pronunciation in different regions)
  • Didactic use (didactic audio collection such as «English Irregular Verbs»)
  • Illustration use (for electronic dictionaries)

Exchanging audio files is now easy thanks to the Internet: files can easily be copied and downloaded. However, to be indexed and used, records of words or sentences have to be associated with some additional information (which word or sentence is recorded, in which language, etc.). It would be useful to have a standardized way of exchanging audio records along with their associated information. In this way, audio collections could easily be produced and used by different software systems, on different platforms, by different people.

This document proposes a convenient and simple way to associate information with an audio record. Its purpose is not to define which pieces of information have to be provided, but rather how they are provided.

Several solutions are possible. For example, each audio file could be associated with a text file containing its information. This solution has several drawbacks: the audio file and the text file could become separated, and each record is represented by two files.

The Vorbis Comment metatag system makes it possible to store additional information in Ogg and FLAC audio files. This is very useful for organizing word collections. It is an existing, free, and widely supported technology. It allows audio files to be transferred easily together with their associated information, without the need for any external description, because the information is embedded directly in the audio file as metadata in the Vorbis Comment tag.
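To make the storage model concrete, here is a minimal sketch of how a list of "NAME=value" comments is serialized in a Vorbis Comment payload: a length-prefixed vendor string, a comment count, then length-prefixed UTF-8 entries, with all lengths stored as 32-bit little-endian integers. The function names are hypothetical, and the sketch only builds and parses the comment payload itself; it does not write a complete Ogg or FLAC file, which in practice would be left to an existing tagging tool or library.

```python
import struct


def encode_vorbis_comments(vendor, comments):
    """Serialize (name, value) pairs into a Vorbis Comment payload."""
    vendor_bytes = vendor.encode("utf-8")
    parts = [
        struct.pack("<I", len(vendor_bytes)),  # vendor string length
        vendor_bytes,
        struct.pack("<I", len(comments)),      # number of comments
    ]
    for name, value in comments:
        entry = f"{name}={value}".encode("utf-8")
        parts.append(struct.pack("<I", len(entry)))
        parts.append(entry)
    return b"".join(parts)


def decode_vorbis_comments(payload):
    """Parse a payload back into (vendor, [(name, value), ...])."""
    pos = 0

    def read_string():
        nonlocal pos
        (length,) = struct.unpack_from("<I", payload, pos)
        pos += 4
        text = payload[pos:pos + length].decode("utf-8")
        pos += length
        return text

    vendor = read_string()
    (count,) = struct.unpack_from("<I", payload, pos)
    pos += 4
    comments = []
    for _ in range(count):
        name, _sep, value = read_string().partition("=")
        # Field names are case-insensitive, so normalize them to upper case.
        comments.append((name.upper(), value))
    return vendor, comments


payload = encode_vorbis_comments(
    "example encoder",  # hypothetical vendor string
    [("SWAC_TEXT", "house"), ("SWAC_LANG", "eng")],
)
print(decode_vorbis_comments(payload))
```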

Below is a proposed list of standard field names with a description of their intended use. We recommend that the communities that produce and consume word collections adopt a common naming scheme, in the spirit of the list of standard field names in the Vorbis Comment specification: you don't have to provide a field with the name of the artist of your Ogg tune, but if you do, you should call it "ARTIST" and not "BAND" or anything else. This way, it is easy for players to check whether a given piece of information, such as the title, is present and what its value is.

None of these fields are intended to be mandatory, although we believe that no real automated processing can be done without the SWAC_TEXT and SWAC_LANG fields.
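As an illustration of that last point, a consumer of a collection might check for the two core fields before processing a record. This is only a sketch: the tags are assumed to be already available as a plain dictionary, and the names CORE_FIELDS and missing_core_fields are hypothetical.

```python
# Fields without which automated processing is not really possible.
CORE_FIELDS = ("SWAC_TEXT", "SWAC_LANG")


def missing_core_fields(tags):
    """Return the core SWAC fields that are absent or empty in `tags`."""
    return [name for name in CORE_FIELDS if not tags.get(name, "").strip()]


print(missing_core_fields({"SWAC_TEXT": "house"}))
print(missing_core_fields({"SWAC_TEXT": "house", "SWAC_LANG": "eng"}))
```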

FIELDS

1. Information about the pronounced text:

SWAC_TEXT
Text pronounced by the speaker
  • « house »
  • « it's raining cats and dogs ! »
SWAC_LANG
The language of the word which is pronounced (ISO 639-3)
  record               value
  « rendezvous »       eng
  « rendez-vous »      fra
  « crocodile »        eng
  « crocodile »        fra
SWAC_ALPHAIDX
Items that allow a program to automatically generate the alphabetical index of the audio collection. The separator is «|» (U+007C).
  record                                     value
  « house » (eng)                            house
  « It's raining cats and dogs! » (eng)      rain|cat|dog
  « I am » (eng)                             be
  « 啊 » (chi)                               ā
  « se laver » (fra)                         laver (se)
  « j'ai faim » (fra)                        avoir|faim
  « ett fönster » (swe)                      fönster
  « telefonul » (ron)                        telefon
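A program could build the alphabetical index from SWAC_ALPHAIDX roughly as follows. This is a sketch under assumptions: records are plain dictionaries of tags, the fallback to SWAC_TEXT when SWAC_ALPHAIDX is missing is a choice made here and not part of the proposal, and sorting is by code point rather than by language-aware collation.

```python
from collections import defaultdict


def build_alpha_index(records):
    """Map each index item to the texts of the records listed under it."""
    index = defaultdict(list)
    for tags in records:
        # Assumption: fall back to the pronounced text if no index is given.
        raw = tags.get("SWAC_ALPHAIDX") or tags.get("SWAC_TEXT", "")
        for item in raw.split("|"):  # «|» (U+007C) separates the items
            item = item.strip()
            if item:
                index[item].append(tags.get("SWAC_TEXT", ""))
    return dict(sorted(index.items()))


records = [
    {"SWAC_TEXT": "It's raining cats and dogs!", "SWAC_ALPHAIDX": "rain|cat|dog"},
    {"SWAC_TEXT": "house", "SWAC_ALPHAIDX": "house"},
]
for item, texts in build_alpha_index(records).items():
    print(item, texts)
```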
SWAC_BASEFORM
When the record is a derivative form of a word, this field indicates the base word
  record                value
  « I was » (eng)       to be
  « je vais » (fra)     aller
  « друзей » (rus)      друг
SWAC_FORM_NAME
When the SWAC_BASEFORM is defined, this field indicates the name of the form
  record                value
  « je vais » (fra)     Present. 1p.S.
  « друзей » (rus)      Gen. Pl.
SWAC_FORM_REF
Name of the reference scheme used by the SWAC_FORM_NAME field (such as the LMF codification)
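One possible use of SWAC_BASEFORM and SWAC_FORM_NAME is to gather the recorded forms of each base word. The sketch below assumes records are plain dictionaries of tags; the function name and the choice of (language, base form) as the grouping key are illustrative only.

```python
from collections import defaultdict


def group_by_baseform(records):
    """Group recorded forms under their (language, base form) pair."""
    paradigms = defaultdict(dict)
    for tags in records:
        base = tags.get("SWAC_BASEFORM")
        if not base:
            continue  # the record is not marked as a derived form
        key = (tags.get("SWAC_LANG", ""), base)
        form_name = tags.get("SWAC_FORM_NAME", "")
        paradigms[key][form_name] = tags.get("SWAC_TEXT", "")
    return dict(paradigms)


records = [
    {"SWAC_TEXT": "je vais", "SWAC_LANG": "fra",
     "SWAC_BASEFORM": "aller", "SWAC_FORM_NAME": "Present. 1p.S."},
    {"SWAC_TEXT": "house", "SWAC_LANG": "eng"},
]
print(group_by_baseform(records))
```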
SWAC_HOMOGRAPHIDX
Index that helps the user distinguish between homographs within the audio collection. Typically, SWAC_HOMOGRAPHIDX is based on the grammatical difference between the homographs.
  record                   value
  « пропа́сть » (rus)       verb
  « про́пасть » (rus)       noun
  « os » (fra) /os/        sing
  « os » (fra) /o/         plur
But it can also be a translation into another language (typically English) or a short explanation if the difference is not grammatical in nature.
  record                value
  « мука́ » (rus)        flour
  « му́ка » (rus)        pain
  « bass » (eng)        fish
  « bass » (eng)        music
SWAC_HOMOGRAPHIDX_REF
Name of the reference scheme used by the SWAC_HOMOGRAPHIDX field.
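As a sketch of how an application might use SWAC_HOMOGRAPHIDX, the function below appends the hint to the displayed label only when several records share the same language and text. Records are assumed to be plain dictionaries of tags, and the label format "text (hint)" is an arbitrary choice made for this example.

```python
from collections import Counter


def label_records(records):
    """Return one display label per record, disambiguating homographs."""
    counts = Counter(
        (tags.get("SWAC_LANG"), tags.get("SWAC_TEXT")) for tags in records
    )
    labels = []
    for tags in records:
        text = tags.get("SWAC_TEXT", "")
        hint = tags.get("SWAC_HOMOGRAPHIDX")
        is_homograph = counts[(tags.get("SWAC_LANG"), tags.get("SWAC_TEXT"))] > 1
        labels.append(f"{text} ({hint})" if is_homograph and hint else text)
    return labels


records = [
    {"SWAC_TEXT": "bass", "SWAC_LANG": "eng", "SWAC_HOMOGRAPHIDX": "fish"},
    {"SWAC_TEXT": "bass", "SWAC_LANG": "eng", "SWAC_HOMOGRAPHIDX": "music"},
    {"SWAC_TEXT": "house", "SWAC_LANG": "eng"},
]
print(label_records(records))
```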

2. Information about the speaker:

SWAC_SPEAK_NAME
Speaker's name
  • « Jacques Durand »
  • « Иван Иванович Иванов »
SWAC_SPEAK_GENDER
Speaker's gender [M/F]
  • M: male
  • F: female
SWAC_SPEAK_BIRTH_YEAR
Speaker's year of birth

(Format: YYYY)

SWAC_SPEAK_LANG
Speaker's native language

(ISO 639-3)

SWAC_SPEAK_LANG_COUNTRY
Country where the speaker acquired the SWAC_SPEAK_LANG

(ISO-3166-1)

SWAC_SPEAK_LANG_REGION
Region where the speaker acquired the SWAC_SPEAK_LANG
  • « Pays basque »
SWAC_SPEAK_PRON
General note about the speaker's pronunciation (for example, about a pronunciation defect)
SWAC_SPEAK_LIV_COUNTRY
Code of the country where the speaker lives

(ISO-3166-1)

SWAC_SPEAK_LIV_TOWN
Town where the speaker lives
  • « Saint-Jean-Pied-de-Port »
SWAC_SPEAK_CONTACT
Information that allows the speaker to be contacted
  • « jacques-durand@shtooka.net »
SWAC_SPEAK_DESC
Free note about the speaker
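The formats given for the speaker fields lend themselves to simple checks. The sketch below only verifies the shape of the values (M/F, four digits, three lowercase letters); it does not check that a language code actually exists in ISO 639-3, and it leaves the country fields alone because the exact ISO 3166-1 code form is not specified here. The function name is hypothetical.

```python
import re


def check_speaker_fields(tags):
    """Return a list of problems found in the speaker fields of `tags`."""
    problems = []

    gender = tags.get("SWAC_SPEAK_GENDER")
    if gender is not None and gender not in ("M", "F"):
        problems.append(f"SWAC_SPEAK_GENDER: expected M or F, got {gender!r}")

    year = tags.get("SWAC_SPEAK_BIRTH_YEAR")
    if year is not None and not re.fullmatch(r"[0-9]{4}", year):
        problems.append(f"SWAC_SPEAK_BIRTH_YEAR: expected YYYY, got {year!r}")

    lang = tags.get("SWAC_SPEAK_LANG")
    if lang is not None and not re.fullmatch(r"[a-z]{3}", lang):
        problems.append(f"SWAC_SPEAK_LANG: expected a 3-letter code, got {lang!r}")

    return problems


print(check_speaker_fields({
    "SWAC_SPEAK_NAME": "Jacques Durand",
    "SWAC_SPEAK_GENDER": "M",
    "SWAC_SPEAK_BIRTH_YEAR": "75",
    "SWAC_SPEAK_LANG": "fra",
}))
```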

3. Information about the pronunciation of the word:

SWAC_PRON_INTONATION
Note about the intonation
  record       value
  « oh »       Surprise
  « oh »       Realization
SWAC_PRON_SPEED
Speed of the pronunciation [1/2/3]
  • 1: slow pronunciation for pedagogical use
  • 2: normal pronunciation
  • 3: fast pronunciation
SWAC_PRON_COMMENT
Comment about the pronunciation of the word by the speaker
  record                                    value
  « abasourdir » (fra) /a.ba.zuʁ.diʁ/       Academic pronunciation
  « abasourdir » (fra) /a.ba.suʁ.diʁ/       Popular pronunciation
  « догово́р » (rus)                         Standard pronunciation
  « до́говор » (rus)                         Popular pronunciation in the south of Russia
SWAC_PRON_API
Phonetic transcription in the International Phonetic Alphabet (IPA, abbreviated API in French)
SWAC_PRON_PHON
Language-specific phonetic transcription, in the system customary for the language concerned
  record              value
  « мука » (rus)      мука́ (with the diacritic symbol)
  « 啊 » (chi)        ā (the pinyin transcription)

4. Information about the audio collection:

SWAC_COLL_NAME
Name of the audio collection
  • « Base Audio Libre De Mots Français »
SWAC_COLL_SECTION
Section in the audio collection
SWAC_COLL_DESC
Description of the collection
SWAC_COLL_ORG
Organization producing the audio collection
SWAC_COLL_ORG_URL
URL where information about the organization producing the audio collection can be found
SWAC_COLL_LICENSE
License which applies to the collection
SWAC_COLL_COPYRIGHT
Copyright notice for the audio collection
SWAC_COLL_AUTHORS
Authors of the collection
SWAC_COLL_URL
URL where general information about the collection can be found

5. Technical information:

SWAC_TECH_QLT
Audio Quality [1/2/3/4/5]
  • 1: very poor
  • 2: poor
  • 3: normal
  • 4: good
  • 5: very good
SWAC_TECH_DATE
Date of recording

(Format: YYYY-MM-DD)

SWAC_TECH_SOFT
Program used to record the sound
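The two constrained technical fields can be interpreted mechanically. The following sketch assumes the tags are available as a plain dictionary of strings; the function name and the keys of the returned dictionary are hypothetical. It maps SWAC_TECH_QLT to its label and converts SWAC_TECH_DATE to a date object, raising ValueError on malformed values.

```python
import re
from datetime import date

QUALITY_LABELS = {1: "very poor", 2: "poor", 3: "normal", 4: "good", 5: "very good"}


def parse_tech_fields(tags):
    """Interpret SWAC_TECH_QLT and SWAC_TECH_DATE when they are present."""
    result = {}

    quality = tags.get("SWAC_TECH_QLT")
    if quality is not None:
        if quality not in ("1", "2", "3", "4", "5"):
            raise ValueError(f"SWAC_TECH_QLT must be 1 to 5, got {quality!r}")
        result["quality"] = QUALITY_LABELS[int(quality)]

    recorded = tags.get("SWAC_TECH_DATE")
    if recorded is not None:
        match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", recorded)
        if not match:
            raise ValueError(f"SWAC_TECH_DATE must be YYYY-MM-DD, got {recorded!r}")
        # date() itself rejects impossible dates such as 2008-02-31.
        result["date"] = date(*(int(part) for part in match.groups()))

    return result


print(parse_tech_fields({"SWAC_TECH_QLT": "4", "SWAC_TECH_DATE": "2008-03-15"}))
```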

Note about the Vorbis Comment specification:

For more information about the general comment tag specification, please consult the Vorbis Comment home page at: http://xiph.org/vorbis/doc/v-comment.html

The content of tags such as TITLE, DESCRIPTION, LICENSE and COPYRIGHT can be set to any value. These fields can be filled automatically using information provided by SWAC fields. It is, however, recommended to set the GENRE field to « Speech ».

GENRE
« Speech »
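Filling the standard tags automatically could look like the sketch below. The mapping between standard tags and SWAC fields is an assumption made for this example (this document does not prescribe one), and the function never overwrites a value that is already set.

```python
# Assumed mapping from standard Vorbis Comment tags to SWAC source fields.
ASSUMED_SOURCES = {
    "TITLE": "SWAC_TEXT",
    "DESCRIPTION": "SWAC_COLL_DESC",
    "LICENSE": "SWAC_COLL_LICENSE",
    "COPYRIGHT": "SWAC_COLL_COPYRIGHT",
}


def fill_standard_tags(tags):
    """Return a copy of `tags` with missing standard tags filled in."""
    filled = dict(tags)
    for standard, source in ASSUMED_SOURCES.items():
        if standard not in filled and tags.get(source):
            filled[standard] = tags[source]
    filled.setdefault("GENRE", "Speech")  # recommended value
    return filled


print(fill_standard_tags({"SWAC_TEXT": "house", "SWAC_LANG": "eng"}))
```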

According to the general Vorbis Comment specification, the use of additional fields is allowed. This enables SWAC fields to coexist with other, application-specific pieces of information. For example, electronic dictionaries can use specific tags such as « OMEGAWIKI_ARTICLEIDX » to link audio items to their articles.

Note about the ID3v2 Tagging Format:

Since version 2.4 of the ID3 tagging format, it is possible to store UTF-8 encoded Unicode strings in MP3 audio files. We do not recommend the use of this tagging format; however, SWAC fields can be stored as « TXXX » frames.

Please consult the ID3 Tagging Format home page for more information at: http://www.id3.org/
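For readers who nevertheless need MP3 files, the sketch below shows the byte layout of a single ID3v2.4 « TXXX » (user-defined text) frame carrying one SWAC field: a 10-byte frame header with a synchsafe size, then an encoding byte (0x03 for UTF-8), the description, a null terminator, and the value. It builds one frame only, not a complete ID3 tag, and the function names are hypothetical; an existing tagging library would normally be used instead.

```python
def synchsafe(number):
    """Encode an integer as a 4-byte synchsafe value (7 bits per byte)."""
    if not 0 <= number < 2 ** 28:
        raise ValueError("value does not fit in a synchsafe integer")
    return bytes([
        (number >> 21) & 0x7F,
        (number >> 14) & 0x7F,
        (number >> 7) & 0x7F,
        number & 0x7F,
    ])


def txxx_frame(description, value):
    """Build one ID3v2.4 TXXX frame holding a SWAC field as UTF-8 text."""
    body = (
        b"\x03"                       # text encoding: UTF-8
        + description.encode("utf-8")  # field name, e.g. SWAC_LANG
        + b"\x00"                     # terminator after the description
        + value.encode("utf-8")
    )
    header = b"TXXX" + synchsafe(len(body)) + b"\x00\x00"  # no flags set
    return header + body


print(txxx_frame("SWAC_LANG", "eng").hex())
```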

Note about this document:

This document is distributed and licensed by the Shtooka Project under the Creative Commons BY-CA License. More information about the license at: http://creativecommons.org/licenses/by/2.0/fr/deed.en_GB
