Building the Oranian-English Parallel Corpus: Methodology and Compilation Process

Dou, Abdelbasset; Kissi, Khalida

Journal of Languages and Translation
Volume 4, Number 2, Pages 161-174
2024-07-01

Abstract

The scarcity of linguistic resources poses a major challenge for the automated translation and processing of dialects. Such resources are crucial for natural language processing (NLP) researchers working on dialect recognition, processing, and machine translation. This paper describes the compilation of a dataset for a low-resource Algerian dialect and emphasizes the importance of developing resources for Algerian dialects. It reviews existing relevant corpora and details the creation process and distinctive features of the Oranian-English Parallel Corpus (OEPC), the first parallel corpus built from scratch that pairs an Algerian dialect with English translations. The paper outlines the criteria and steps involved in compiling a monolingual corpus for the Oranian dialect (ORN), including data sources and formats. ORN comprises 8500 sentences, which were then translated into English to form OEPC. This resource is a product of the ERAD project, an initiative aimed at providing NLP professionals with diverse Algerian mono-, multi-, and cross-dialectal corpora. The paper also explains the data compilation and augmentation techniques used to expand the project's outputs.

Keywords

low-resource languages; machine translation; Oranian dialect; parallel corpus; ERAD
