গড়েইপা লৌশিং (AI) | Bishnupriya Manipuri Language Development Project

Building open-source AI for 500,000+ Bishnupriya Manipuri speakers worldwide

Bishnupriya Manipuri (BPY) is spoken across Assam, Tripura, Manipur, and Bangladesh. Despite its rich cultural heritage, it has zero support in Google Translate, Microsoft Translator, or any major AI model.

We are changing that.

🎯 Our Mission

Develop and maintain open-source NLP resources for Bishnupriya Manipuri, including:

All resources are free, open-source, and community-driven under MIT/Apache 2.0 licenses.

🚀 Our Models

NLLB Bishnupriya Manipuri Series

Fine-tuned versions of Meta AI's NLLB-200 for English → BPY translation.

| Model | Status | Notes |
|---|---|---|
| nllb-bpy-beng-v8-5-3-merged | ✅ Production | Latest stable. 95%+ accuracy. Fixes number+noun patterns. Live endpoint available. |
| nllb-bpy-beng-v8-5-3 | Adapter | LoRA weights for fine-tuning. Requires base NLLB-200. |
| nllb-bpy-beng-v9-0 | 🔨 In Progress | Training on Wikipedia + scanned book corpus. Target: 10k+ pairs. |
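
A minimal sketch of how the merged model and the LoRA adapter listed above might be loaded with Transformers and PEFT. Several things here are assumptions rather than documented facts: the Hub repo IDs (the organization name from the citation URL joined with the model name), the target language tag `ben_Beng` (guessed from the "beng" in the model names, since stock NLLB-200 has no BPY tag), and the base checkpoint for the adapter, which is deliberately left as a required argument because this README does not say which NLLB-200 variant was used.

```python
HF_ORG = "BishnupriyaManipuri"  # organization name taken from the citation URL


def repo_id(model_name: str, org: str = HF_ORG) -> str:
    """Build a Hub repo ID. The org/name layout is an assumption."""
    return f"{org}/{model_name}"


def translate(
    text: str,
    model_name: str = "nllb-bpy-beng-v8-5-3-merged",
    src_lang: str = "eng_Latn",
    tgt_lang: str = "ben_Beng",  # assumption: the fine-tune reuses the Bengali tag
    max_new_tokens: int = 128,
) -> str:
    """Translate English text with the merged checkpoint (downloads the model)."""
    # Imported lazily so repo_id() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = repo_id(model_name)
    tokenizer = AutoTokenizer.from_pretrained(name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # NLLB selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def load_adapter(base_name: str, adapter_name: str = "nllb-bpy-beng-v8-5-3"):
    """Attach the LoRA adapter to a base NLLB-200 checkpoint supplied by the caller."""
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM

    base = AutoModelForSeq2SeqLM.from_pretrained(base_name)
    return PeftModel.from_pretrained(base, repo_id(adapter_name))
```

Calling `translate("Good morning")` would download the merged checkpoint on first use. `load_adapter` needs the same base NLLB-200 variant the adapter was trained on, so check the model card before choosing `base_name`.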

Live Demo: https://manipuri.com/articles/bpy.php#translator

Key Fixes in V8.5.3:

📊 Datasets

Coming Soon: BPY Training Data Repository

We are building the first comprehensive open dataset for Bishnupriya Manipuri:

Planned releases:

  1. bpy-parallel-v1 - 10k+ English↔BPY sentence pairs from Wikipedia, books, and community contributions
  2. bpy-monolingual-v1 - 100k+ BPY sentences for LM pretraining
  3. bpy-eval-v1 - Standard test set for MT evaluation
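
The dataset format has not been published yet, so the following is only a sketch of what a contributor-side check for parallel pairs could look like. It assumes a hypothetical JSONL layout with `en`, `bpy`, and `source` fields; none of these field names come from the project, and the placeholder strings in the usage example stand in for real BPY text.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    en: str
    bpy: str
    source: str


def parse_pairs(lines):
    """Parse JSONL lines into pairs, skipping blanks, empty sides, and duplicates."""
    seen = set()
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        en = record.get("en", "").strip()
        bpy = record.get("bpy", "").strip()
        if not en or not bpy:
            continue  # a pair needs text on both sides
        key = (en, bpy)
        if key in seen:
            continue  # keep only the first copy of an identical pair
        seen.add(key)
        pairs.append(Pair(en, bpy, record.get("source", "unknown")))
    return pairs
```

For example, `parse_pairs(open("pairs.jsonl", encoding="utf-8"))` would return the cleaned pairs from a file in this assumed layout, where `pairs.jsonl` is a hypothetical file name.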

Current sources:

Want to contribute data? See CONTRIBUTING or email us.

🤝 Collaborate With Us

We welcome researchers, developers, and BPY speakers to join:

We need help with:

  1. Data Collection - Scan books, transcribe text, translate sentences
  2. Model Training - Fine-tune LLMs, experiment with architectures
  3. Evaluation - Build test sets, human eval, error analysis
  4. Tools - OCR for Bengali script, tokenizers, text normalization
  5. Applications - Chatbots, TTS, ASR, educational tools
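
As a starting point for the text normalization item above, here is a small sketch using only the Python standard library. It is not the project's pipeline, just one reasonable baseline: apply Unicode NFC, drop a few invisible characters that often come from OCR and web scraping, and collapse whitespace. Keeping ZWJ and ZWNJ is a design assumption, since they can affect how Bengali-script conjuncts render.

```python
import re
import unicodedata

# Characters removed outright: zero-width space, byte-order mark, soft hyphen.
# ZWNJ (U+200C) and ZWJ (U+200D) are kept on purpose.
_INVISIBLE = dict.fromkeys(map(ord, "\u200b\ufeff\u00ad"))
_WHITESPACE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Return text in NFC with invisible debris removed and whitespace collapsed."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(_INVISIBLE)
    return _WHITESPACE.sub(" ", text).strip()
```

NFC matters for Bengali script because the same visible character can be encoded in more than one way. The vowel sign O, for instance, may arrive as two code points (U+09C7 followed by U+09BE) or as the single code point U+09CB, and normalizing both to one form keeps tokenizers and deduplication consistent.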

Tech Stack: PyTorch, Transformers, PEFT/LoRA, Hugging Face Hub, PHP/Python

Join the Community

📚 Resources & Research

Base Models Used:

Papers & Docs:

Language Info:

📜 License & Citation

All models and datasets are released under the MIT License and are free for commercial use.

If you use our work, please cite:

@misc{bishnupriya-manipuri-nllb-2026,
  title={NLLB Bishnupriya Manipuri: Open-Source Machine Translation},
  author={Bishnupriya Manipuri Language Development Project},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/BishnupriyaManipuri}
}

থাকাত | Thank you for supporting low-resource language AI.

গড়েইপা লৌশিং (AI) - Let's build AI together.