    Odia Bhasa Bruksha

    A collaborative initiative to build an Odia UD Treebank and CorefNet

  • Problem Statement

    Despite the availability of parsers and treebanks for many high-resource languages, a significant resource gap remains for lesser-known, low-resource languages. Odia, spoken by millions, is one such example. Although it has a rich linguistic heritage, it currently lacks a comprehensive treebank within the Universal Dependencies (UD) framework. UD is a widely adopted linguistic resource that provides consistent grammatical annotation of parts of speech, morphological features, and syntactic dependencies; it covers over 100 languages and contains nearly 200 treebanks.

    The existing Odia Treebank is limited to just 100 annotated sentences, which is insufficient for model training and research purposes. This lack of a robust, scalable resource has hindered the development of advanced natural language processing (NLP) models and other language technologies for Odia. Expanding this treebank to include more annotated data would be a crucial step forward, enabling more effective experimentation and the development of applications for the Odia language.

    Proposed Solution

    This project seeks to address the current limitations by creating a Universal Dependencies (UD) Treebank and Parser for the Odia language. Building on existing resources and insights from the current Odia Treebank, we aim to significantly expand the treebank’s size and scope. The key goals of this initiative include:

    • Expanding the Treebank: Annotating a much larger set of Odia sentences, so that the treebank is comprehensive and suitable for training robust NLP models.
    • Creating a UD-Compatible Parser: Developing a parser that can automatically annotate Odia text according to the Universal Dependencies framework, allowing for scalable processing of large datasets.

    Through this work, we hope to fill the gap in linguistic resources for Odia and pave the way for further research and development in Odia language technologies.
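
    To make the target annotation concrete: UD treebanks are distributed in the CoNLL-U format, where each token is one line of ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch of reading such data in plain Python. The sample sentence is an English illustration written for this sketch, not data from the Odia treebank, and the names `read_conllu` and `SAMPLE` are hypothetical helpers rather than part of any project tooling.

```python
from typing import Dict, List

# The ten tab-separated columns of a CoNLL-U token line.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]

# Illustrative English sentence written for this sketch.
# It is NOT taken from the Odia treebank.
SAMPLE = "\n".join([
    "# text = The dog barks.",
    "1\tThe\tthe\tDET\t_\tDefinite=Def|PronType=Art\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\t_\tNumber=Sing\t3\tnsubj\t_\t_",
    "3\tbarks\tbark\tVERB\t_\tNumber=Sing|Person=3|Tense=Pres\t0\troot\t_\tSpaceAfter=No",
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
    "",
])


def read_conllu(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are skipped here.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


sentences = read_conllu(SAMPLE)
for token in sentences[0]:
    # Print each word with its part of speech, head index, and relation.
    print(token["form"], token["upos"], token["head"], token["deprel"])
```

    Counting `len(read_conllu(...))` over a treebank file gives its sentence count, which is the figure the expansion effort aims to grow.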

  • Team

    Dr. Kusum Lata

    Sharda University, Greater Noida, UP, India

    (Project Coordinator)

    Dr. Kalayanimalini Sahoo

    University of Artois, France

    Dr. Satya Ranjan Dash

    KIIT University, Bhubaneswar, India

    John Bauer

    Stanford University

    (Generously funded the Odia Bhasa Bruksha initiative)

    Dr. Bijayalaxmi Dash

    Ravenshaw University, Cuttack, India

    Dr. Atul Kumar Ojha

    University of Galway, Galway, Ireland

    Sourav Kumar Behera

    Researcher

    Nirmal Naik

    Researcher

    Srustiprava Satapathy

    Researcher

    Shashikanta Sahoo

    Researcher

  • Contact

    For any queries, please contact Dr. Kusum Lata (project coordinator): ranapoo@gmail.com