• Publication

    Recent Conference and Workshop Papers Accepted/Published

    Conference/Workshop: The 10th Workshop on Asian Translation (WAT 2023), MT Summit, 2023, Macau, China

     

    Description: This paper offers an in-depth overview of the "ODIAGEN" team's translation system submitted to the Workshop on Asian Translation (WAT2023). Our focus lies in the domain of Indic multimodal tasks, specifically targeting English-to-Hindi, English-to-Malayalam, and English-to-Bengali translation. The system uses a state-of-the-art Transformer-based architecture, the NLLB-200 model, fine-tuned on language-specific Visual Genome datasets. With this robust system, we were able to handle both text-to-text and multimodal translation, demonstrating versatility across translation modes.

     

    Our results showcase strong performance across the board, with particularly promising results on the Hindi and Bengali translation tasks. A noteworthy achievement is the system's performance on the text-to-text translation tasks: in the English-to-Hindi, English-to-Bengali, and English-to-Malayalam categories, our system claimed the top positions on both the evaluation and challenge sets.

     

    Authors: Sk Shahid, Guneet Singh Kohli, Sambit Sekhar, Debasish Dhal, Adit Sharma, Shubhendra Kushwaha, Shantipriya Parida, Stig-Arne Grönroos, Satya Ranjan Dash

     

    Status: Published. [Paper]

    Generative Chatbot Adaptation for Odia Language: A Critical Evaluation

    Conference/Workshop: IEEE International Conference on Circuits, Power and Intelligent Systems (CCPIS), September 2023, Bhubaneswar, India

     

    Description: Large Language Models (LLMs) have gained significant attention in the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI) due to their ability to generate human-like text and facilitate conversational interactions. However, most LLMs are developed primarily for English, limiting their accessibility and effectiveness for non-English-speaking populations. In India, where only 10% of the population is proficient in English, the need for LLMs adapted to regional languages becomes crucial. This research paper focuses on the adaptability of LLMs to the Odia language, spoken by approximately 50 million people in India. With the primary objective of catering to the Odia-speaking community, we evaluate existing LLMs such as ChatGPT and Olive, an instruction-following Odia LLM, specifically in the context of generating conversational outputs in Odia. We employ a critical evaluation approach to assess the performance, language understanding, and response generation capabilities of these models for the Odia language. Through experiments and comparative analysis, we seek to determine the strengths, weaknesses, and potential areas of improvement of the existing models. Our findings will contribute to the development of more effective and contextually accurate generative chatbots for the Odia language, enabling better communication and accessibility for the Odia-speaking population.

     

    Authors: Parul Agarwal, Aisha Asif, Shantipriya Parida, Sambit Sekhar, Satya Ranjan Dash, Subhadarshi Panda

     

    Status: Published. [Paper]

     

    Presentation Video

    Olive: An Instruction Following LLaMA Model for Odia Language

    Conference/Workshop: IEEE SILCON 2023, NIT Silchar, November 2023

     

    Authors: Shantipriya Parida, Sambit Sekhar, Subhadarshi Panda, Swateek Jena, Abhijeet Parida, Soumendra Kumar Sahoo, Satya Ranjan Dash

     

    Status: Published. [Paper]

    Building a Llama2-finetuned LLM for Odia Language Utilizing Domain Knowledge Instruction Set

     

    Conference/Workshop: Generative AI Workshop, 3rd International Conference AI-ML Systems 2023

     

    Authors: Guneet Singh Kohli, Shantipriya Parida, Sambit Sekhar, Samirit Saha, Nipun B Nair, Parul Agarwal, Sonal Khosla, Kusumlata Patiala, Debasish Dhal

     

    Status: Published. [Paper]