What is Oriental COCOSDA?
Oriental COCOSDA (O-COCOSDA) originally is the Oriental branch of COCOSDA, which stands for the International Committee for the Coordination and Standardisation of Speech Databases and Assessment Techniques.
Oriental Cocosda was set up at the beginning of the 1990’s to enable people concerned with spoken language processing to exchange ideas, share information and discuss regional matters on every issues related to the creation, the utilization and the dissemination of spoken language corpora of oriental languages, and also to launch the design of assessment methods of speech recognition/synthesis systems as well as to promote speech research on oriental languages.
Established independently in 1997, its primary goal is to foster idea exchange, share insights, and discuss regional matters related to the creation, use, and distribution of spoken language corpora for Oriental languages.
Now O-COCOSDA is an independent organization to promote speech research in Asia, with minimal ties to COCOSDA or other regional groups. Additionally, O-COCOSDA focuses on assessing speech technology like speech recognition and synthesis systems for oriental languages.
🌏 Conference History
The annual Oriental COCOSDA International Conference is the flagship event of O-COCOSDA.
The first preparatory meeting took place in Hong Kong in 1997, and since then, 27 workshops have been hosted in various countries, including:
🇯🇵 Japan
🇹🇼 Taiwan
🇨🇳 China
🇰🇷 Korea
🇹🇭 Thailand
🇸🇬 Singapore
🇮🇳 India
🇮🇩 Indonesia
🇲🇾 Malaysia
🇻🇳 Vietnam
🇳🇵 Nepal
🇲🇴 Macau
🇲🇲 Myanmar
🇵🇭 Philippines
🎯 Background & Purpose
It has been well understood that it is necessary to collect and maintain large amounts of speech data of various kinds, allowing unrestricted access so that they can be utilized for research and development as well as for recognizer performance assessment.
Why Speech Corpora Matter
🔬 Research Repeatability: Utilization of common speech corpora increases repeatability and objectivity of speech research
🌍 Cultural Preservation: From the linguistic or cultural viewpoint, it is necessary and important to preserve speech data of various languages, especially those that are becoming extinct
⏰ Urgency: Many local languages or dialects are disappearing by the day
Hence there is a pressing need to preserve natural record of such languages. This is another important purpose of speech databases.

Figure: The necessity and purpose of speech corpora
🎯 Our Missions
O-COCOSDA supports the development of spoken language resources and speech technology evaluation.
Resource Development
Promoting the development of distinctive types of spoken language data corpora for the purpose of building and/or evaluating current or future spoken language technology.
Research Coordination
Offering coordination of projects and research efforts to improve their efficiency.
📋 Strategy
Technical interests are organized on both country and topical basis.
Country Representation
Each country is represented on the central committee by country rapporteurs.
Topic Domains
Each agreed topic domain is represented by a topic domain rapporteur.
Synergy
Interaction between regional and topical rapporteurs provides the basis for promotion and coordination activities informed by both local and global expertise.
🔬 Topic Domains
O-COCOSDA supports the development of new topic domains based on technological needs.
Current Topic Areas
Our focus areas include:
🎤 Speech recognition
🗣️ Speech synthesis
🏷️ Speech classification
📚 Speech corpora
🔧 Corpus annotation tools
🌏 Local languages
Criteria for New Domains
A new topic domain is warranted by a new speech technology application ONLY if that application places new demands on:
📊 Data corpora form and structure
🧪 Technology evaluation approaches
Open Documentation
We’re open to topic domains that relate to the formal documentation of spoken language without reference to any specific technological application.
Avoiding Redundancy
If redundancy is seen between a new topic domain proposal and an existing one, a combined topic domain will be considered.