What is Oriental COCOSDA?
Oriental COCOSDA (O-COCOSDA) was originally the Oriental branch of COCOSDA, the International Committee for the Coordination and Standardisation of Speech Databases and Assessment Techniques.
Established in 1997, O-COCOSDA aims primarily to exchange ideas, share insights, and discuss regional matters related to the creation, use, and distribution of spoken language corpora for Oriental languages.
O-COCOSDA is now independent, with minimal ties to COCOSDA or other regional groups. It also focuses on assessing speech recognition and synthesis systems and on promoting speech research in Oriental languages.
Conference History
The annual Oriental COCOSDA International Conference is the flagship event of O-COCOSDA.
The first preparatory meeting took place in Hong Kong in 1997, and since then, 27 workshops have been hosted in various countries, including:
Japan
Taiwan
China
Korea
Thailand
Singapore
India
Indonesia
Malaysia
Vietnam
Nepal
Macau
Myanmar
Philippines
Background & Purpose
It is well understood that large amounts of speech data of various kinds must be collected and maintained with unrestricted access, so that they can be used for research and development as well as for assessing recognizer performance.
Why Speech Corpora Matter
Research Repeatability: Utilization of common speech corpora increases repeatability and objectivity of speech research
Cultural Preservation: From the linguistic or cultural viewpoint, it is necessary and important to preserve speech data of various languages, especially those that are becoming extinct
Urgency: Many local languages or dialects are disappearing by the day
Hence there is a pressing need to preserve a natural record of such languages. This is another important purpose of speech databases.
Figure: The necessity and purpose of speech corpora
Our Missions
O-COCOSDA supports the development of spoken language resources and speech technology evaluation.
Resource Development
Promoting the development of distinctive types of spoken language data corpora for the purpose of building and/or evaluating current or future spoken language technology.
Research Coordination
Offering coordination of projects and research efforts to improve their efficiency.
Strategy
Technical interests are organized on both a country and a topical basis.
Country Representation
Each country is represented on the central committee by country rapporteurs.
Topic Domains
Each agreed topic domain is represented by a topic domain rapporteur.
Synergy
Interaction between regional and topical rapporteurs provides the basis for promotion and coordination activities informed by both local and global expertise.
Topic Domains
O-COCOSDA supports the development of new topic domains based on technological needs.
Current Topic Areas
Our focus areas include:
Speech recognition
Speech synthesis
Speech classification
Speech corpora
Corpus annotation tools
Local languages
Criteria for New Domains
A new topic domain is warranted by a new speech technology application ONLY if that application places new demands on:
Data corpora form and structure
Technology evaluation approaches
Open Documentation
We're open to topic domains that relate to the formal documentation of spoken language without reference to any specific technological application.
Avoiding Redundancy
If redundancy is seen between a new topic domain proposal and an existing one, a combined topic domain will be considered.