Overview
OdiaGenAI released a new BengaliGPT model as part of its initiative to build Generative AI and LLM-based technologies for Odia and Indic languages.
The BengaliGPT model is based on Llama-7b and finetuned with a 252k Bengali instruction set. The instruction set is translated from open-source resources, which gives the model good Bengali instruction understanding and response generation capabilities.
Data
This dataset is a mix of Bengali instruction sets translated from open-source instruction sets:
- Dolly
- Alpaca
- ChatDoctor
- Roleplay
- GSM
Each record in the dataset contains Bengali instruction, input, and output strings. The instruction set is available on Hugging Face for research and non-commercial purposes.
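To make the record layout concrete, here is a minimal sketch of one record with the three fields named above, formatted into a prompt. It assumes the Alpaca prompt template, since the training and inference scripts are described as following Alpaca-LoRA. The record is hypothetical: it reuses the English gloss of the Figure 2 question, whereas a real row would hold Bengali strings.

```python
import json

# Alpaca-style prompt templates (an assumption; the exact template used for
# BengaliGPT is not stated in this post).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Format one record; the input section is added only when input is non-empty."""
    if record.get("input"):
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


# Hypothetical record for illustration only (not an actual dataset row).
record = {
    "instruction": "What is the sum of 10 plus 20?",
    "input": "",
    "output": "30",
}

prompt = build_prompt(record)
print(prompt)
# ensure_ascii=False keeps Bengali characters readable when records are dumped.
print(json.dumps(record, ensure_ascii=False))
```

During fine-tuning, the output string would be appended after the "### Response:" marker as the target text.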
Training
The first experimental model was trained on GPU for 5 epochs using the Alpaca-LoRA training script. The training parameters are shown in Table 1 and the loss curve in Figure 1.
Table 1: Training Hyperparameters
Figure 1: Train/Eval Loss curve
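As a rough illustration of why LoRA fine-tuning is cheap compared with full fine-tuning, the sketch below counts the trainable adapter parameters. The rank and target modules are assumptions taken from the Alpaca-LoRA script defaults (rank 8 on the query and value projections), not from Table 1, so the actual BengaliGPT run may differ.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Llama-7b architecture: 32 transformer layers, hidden size 4096.
HIDDEN = 4096
LAYERS = 32

# Assumed values (Alpaca-LoRA defaults, not confirmed for this model).
RANK = 8
TARGETS_PER_LAYER = 2  # q_proj and v_proj

per_matrix = lora_trainable_params(HIDDEN, HIDDEN, RANK)
total = per_matrix * TARGETS_PER_LAYER * LAYERS

print(f"adapter parameters per projection: {per_matrix}")
print(f"total trainable parameters: {total}")
```

Under these assumptions only about 4.2 million parameters are trained, a small fraction of the roughly 7 billion in the base model.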
Model
The BengaliGPT model, odiagenAI-bengali-lora-model-v1, was released on 10th June 2023 through Hugging Face under a CC BY-NC-SA 4.0 license. It uses Llama-7b as the base model and is finetuned on the translated Bengali instruction set for 5 epochs. The Hugging Face model card gives the model description and running instructions. The code (translation, training, and inference) is available on GitHub.
Inference
The inference script is adapted from Alpaca-LoRA considering the base model Llama-7b with the odiagenAI-bengali-lora-model-v1 weights.
The inference prompt accepts Bengali text as input and generates Bengali text as output. Text-to-speech is integrated, so the output is also converted to speech.
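The loading pattern used by Alpaca-LoRA style inference can be sketched as follows. This is a sketch under stated assumptions, not the released inference script: `base_model` and `lora_weights` are placeholders for the Hugging Face identifiers of Llama-7b and the adapter (neither is spelled out here), `max_new_tokens=128` is an arbitrary choice, and the "### Response:" marker assumes an Alpaca-style prompt. The heavy libraries are imported inside the function so the helper can be read and tested without a GPU.

```python
def extract_response(generated: str) -> str:
    """Return the text after the last '### Response:' marker (Alpaca-style prompts)."""
    return generated.split("### Response:")[-1].strip()


def generate(prompt: str, base_model: str, lora_weights: str,
             max_new_tokens: int = 128) -> str:
    """Load the base model, attach the LoRA adapter, and generate a reply.

    base_model and lora_weights are caller-supplied identifiers or local
    paths; this sketch does not assume specific repository names.
    """
    # Imported lazily: requires torch, transformers, peft, and accelerate.
    import torch
    from peft import PeftModel
    from transformers import LlamaForCausalLM, LlamaTokenizer

    tokenizer = LlamaTokenizer.from_pretrained(base_model)
    model = LlamaForCausalLM.from_pretrained(
        base_model, torch_dtype=torch.float16, device_map="auto"
    )
    # Wrap the frozen base model with the fine-tuned LoRA adapter weights.
    model = PeftModel.from_pretrained(model, lora_weights, torch_dtype=torch.float16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
        )
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return extract_response(text)


# The helper can be checked on its own, without loading any model.
sample = "### Instruction:\nWhat is the sum of 10 plus 20?\n\n### Response:\n30"
print(extract_response(sample))
```

The returned string is what a text-to-speech step would receive; the TTS component itself is not shown because this post does not name the library used.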
Figure 2: Sample Inference. The question is, “What is the sum of 10 plus 20?” The answer is “The sum of 10 plus 20 = 10 + 20 = 30 and the sum of 10 plus 20 can be expressed as a number with 30”
Figure 3: Sample Inference. The question is, “What are the benefits of eating an apple a day?” The answer is “Benefits of eating an apple a day Apples are a healthy and wholesome food to eat.”
Figure 4: Sample Inference. The question is, “What is the primary source of energy that causes evaporation of water from the surface of a body of water?” The input is { "text": [ "solar radiation", "conduction by plants", "heat from surrounding land mass", "convection currents in water" ], "label": [ "A", "B", "C", "D" ] }. The answer is “Solar radiation by plants is a primary source of energy that causes evaporation of water from the surface of water bodies.”
Figure 5: Sample Inference. The question is, "Write python code for Fibonacci Series". The answer is "The following code can be use to write python code for the Fibonacci Series [python code]".
How to Use
The BengaliGPT model can be run on Colab free of charge. It is trained to generate text in Bengali, so you can use it to create Bengali content, try creative writing, or build Bengali language applications.
The model also comes with an integrated Text-To-Speech (TTS) feature: generated Bengali text can be converted into speech, which opens up uses in audio content creation, language learning, accessibility, and more.
To run the model on Colab, follow these steps:
- Step 1: Open the link to the Bengali Generative AI model on Colab: [https://colab.research.google.com/drive/1HYHZJwsNWk9auZ_o39G3AIGMtGkqVG2o?usp=sharing].
- Step 2: In Colab, navigate to "Runtime" and select "Run all cells". This will initiate the model and load all the necessary dependencies.
- Step 3: Once the cells have finished running, you will be provided with a Gradio URL. Gradio is a user-friendly interface that allows you to interact with the model.
- Step 4: Click on the Gradio URL, and it will open a web interface where you can input your desired text in Bengali.
Figure 6: Gradio URL for Inference
You can choose to generate text or have it converted into speech using the integrated TTS feature. Play around with the model, generate Bengali text, and listen to the TTS output. Explore its capabilities and get creative with your ideas.
Analysis
Although the current model accepts Bengali input text and generates answers in Bengali, it still fails to answer general knowledge questions about India due to a lack of domain knowledge. Our evaluation also shows that it fails at critical reasoning.
Future Plan
The plan includes i) fine-tuning with a more domain-specific Bengali instruction set, ii) pre-training a Bengali LLM, and iii) developing a chatbot that supports Bengali.
Acknowledgment
We thank the following institutions/organizations for their LLM resources and support.
OdiaGenAI Team