Odia Bhasa Bruksha
A collaborative initiative to build an Odia UD Treebank and CorefNet
Problem Statement
Despite the availability of parsers and treebanks for many high-resource languages, there remains a significant gap in resources for low-resource languages. Odia, spoken by millions, is one such language. Although it has a rich linguistic heritage, it currently lacks a comprehensive treebank within the Universal Dependencies (UD) framework. UD is a widely adopted linguistic resource that provides consistent grammatical annotation of parts of speech, morphological features, and syntactic dependencies; it covers over 100 languages and contains nearly 200 treebanks.
The existing Odia Treebank is limited to just 100 annotated sentences, which is insufficient for model training and research purposes. This lack of a robust, scalable resource has hindered the development of advanced natural language processing (NLP) models and other language technologies for Odia. Expanding this treebank to include more annotated data would be a crucial step forward, enabling more effective experimentation and the development of applications for the Odia language.
Proposed Solution
This project seeks to address the current limitations by creating a Universal Dependencies (UD) Treebank and Parser for the Odia language. Building on existing resources and insights from the current Odia Treebank, we aim to significantly expand the treebank’s size and scope. The key goals of this initiative include:
- Expanding the Treebank: Annotating a much larger set of Odia sentences, so that the treebank is comprehensive and suitable for training robust NLP models.
- Creating a UD-Compatible Parser: Developing a parser that can automatically annotate Odia text according to the Universal Dependencies framework, allowing for scalable processing of large datasets.
Through this work, we hope to fill the gap in linguistic resources for Odia and pave the way for further research and development in Odia language technologies.
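UD treebanks are distributed in the CoNLL-U format, in which each token occupies one line of ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) and sentences are separated by blank lines. As an illustration of what an annotated sentence looks like and how it can be read, here is a minimal Python sketch. The function name parse_conllu, the Token class, and the two-word English sample are assumptions made for this example only; they are not part of the project's tooling or data.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One syntactic word from a CoNLL-U sentence (subset of the 10 columns)."""
    id: int
    form: str
    lemma: str
    upos: str
    feats: str
    head: int
    deprel: str


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences (each a list of Token).

    Hypothetical helper for illustration: comment lines, multiword-token
    ranges (e.g. "1-2") and empty nodes (e.g. "1.1") are skipped.
    """
    sentences, current = [], []
    for raw in text.splitlines():
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not cols[0].isdigit():
            continue  # multiword-token range or empty node
        current.append(
            Token(
                id=int(cols[0]),
                form=cols[1],
                lemma=cols[2],
                upos=cols[3],
                feats=cols[5],
                head=int(cols[6]),
                deprel=cols[7],
            )
        )
    if current:
        sentences.append(current)
    return sentences


# Toy English sample, invented for this sketch (not taken from any treebank).
SAMPLE = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in parse_conllu(SAMPLE):
        for tok in sentence:
            print(tok.id, tok.form, tok.upos, tok.head, tok.deprel)
```

In this format a HEAD value of 0 marks the root of the sentence, and every other token points to the ID of its syntactic head. An expanded Odia treebank would carry the same columns for Odia sentences.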
Team
Dr. Kusum Lata
Sharda University, Greater Noida, UP, India
(Project Coordinator)
Dr. Kalayanimalini Sahoo
University of Artois, France
Dr. Satya Ranjan Dash
KIIT University, Bhubaneswar, India
John Bauer
Stanford University
(Generously funded the Odia Bhasa Bruksha initiative)
Dr. Bijayalaxmi Dash
Ravenshaw University, Cuttack, India
Dr. Atul Kumar Ojha
University of Galway, Galway, Ireland
Sourav Kumar Behera
Researcher
Nirmal Naik
Researcher
Srustiprava Satapathy
Researcher
Shashikanta Sahoo
Researcher
Contact
For any queries, please contact Dr. Kusum Lata (project coordinator): ranapoo@gmail.com