AI Sauna/GenAI for Moroccan Arabic

From Meta, a Wikimedia project coordination wiki

GenAI for Moroccan Arabic

Description

Using GenAI to generate biographies in Moroccan Arabic.

The team

Created and run by: Ideophagous

Results

Our method

Stage 1: trying different prompts to generate biographies in Moroccan Darija

  • Preliminary results are encouraging: ChatGPT4 can generate a coherent text, but follow-up prompts are still needed to adjust word choice and, sometimes, grammar.
  • The next step would be to automate the process and iteratively refine the prompts. Adding tokens from Wikidata may prove useful.
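
The automation step described above could be sketched roughly as follows. This is a minimal illustration, not the method used in the project: `build_prompt`, `refine`, `generate` and `review` are hypothetical names, the facts dictionary stands in for data that might be pulled from Wikidata, and `generate` is a placeholder for whatever model call is used.

```python
from typing import Callable, Dict, List


def build_prompt(name: str, facts: Dict[str, str], corrections: List[str]) -> str:
    """Assemble a biography prompt from structured facts and earlier corrections."""
    lines = [
        f"Write a short biography of {name} in Moroccan Darija.",
        "Use only these facts:",
    ]
    lines += [f"- {label}: {value}" for label, value in facts.items()]
    if corrections:
        lines.append("Apply these corrections from earlier drafts:")
        lines += [f"- {note}" for note in corrections]
    return "\n".join(lines)


def refine(
    name: str,
    facts: Dict[str, str],
    generate: Callable[[str], str],      # hypothetical: sends a prompt to a model
    review: Callable[[str], List[str]],  # hypothetical: returns correction notes
    max_rounds: int = 3,
) -> str:
    """Regenerate the draft until the reviewer has no notes or rounds run out."""
    corrections: List[str] = []
    draft = ""
    for _ in range(max_rounds):
        draft = generate(build_prompt(name, facts, corrections))
        feedback = review(draft)
        if not feedback:
            break
        # Carry the notes forward so the next prompt includes them.
        corrections.extend(feedback)
    return draft
```

Here `review` could be a human editor or an automated check for word choice and grammar; keeping it as a plain callable leaves that choice open.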

Stage 2: testing retrieval-augmented generation (RAG)

Resources we used

General knowledge of prompting and GenAI

Conclusion

Generating texts in Moroccan Darija with AI is feasible but requires extra effort compared with languages like English, which have far more online content to draw from.

What next

As already noted in the results section, the next step is to refine and automate the process, and to use RAG to obtain better results by supplying the AI with curated Moroccan Darija content.
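
To make the RAG idea concrete, here is a rough sketch under stated assumptions. RAG usually works by retrieving relevant passages and placing them in the prompt, rather than by retraining the model. The function names and the word-overlap scoring are illustrative only; a real setup would more likely use embeddings and a vector index, and the curated corpus is assumed to be a plain list of Darija passages.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Split on whitespace; a deliberately naive tokenizer for illustration."""
    return set(text.lower().split())


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Return up to k passages sharing the most words with the query."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(passage)), index, passage)
        for index, passage in enumerate(corpus)
    ]
    # Drop passages with no overlap; sort by score, then by original order.
    ranked = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),
    )
    return [passage for _, _, passage in ranked[:k]]


def build_rag_prompt(task: str, query: str, corpus: List[str], k: int = 2) -> str:
    """Prepend the retrieved passages to the task so the model can imitate them."""
    passages = retrieve(query, corpus, k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Reference passages in Moroccan Darija:\n{context}\n\nTask: {task}"
```

The retrieved passages serve as examples of vocabulary and grammar, which is where the earlier stage needed the most follow-up corrections.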

Links, images, documentation

Moroccan Darija text and wikitext generated with ChatGPT4