Revue de l'Information Scientifique et Technique
Volume 24, Number 1, Pages 66-83
2019-05-22
Author: Boucham Souhila
The Arabic language has attracted increasing interest in the field of Multilingual Information Retrieval (MIR). This work addresses the problem of information retrieval over a trilingual corpus containing documents in Arabic, French, and English. We propose a language-independent approach based on a pivot language. The approach combines surface analysis with the Latent Semantic Analysis (LSA) statistical algorithm in a new way, breaking the terms of LSA down into units that correspond more closely to morphemes. These units are variable-length character n-gram candidates extracted from fragments separated by borders. The results obtained are encouraging and competitive with state-of-the-art results in the multilingual field.
Keywords: multilingual document representation ; multilingual information retrieval including Arabic ; virtual document ; principle of border ; fragments and variable length character n-grams ; parallel corpus ; surface analysis and the LSA statistical algorithm ; concept types ; pivot language
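To make the idea of variable-length character n-grams extracted from border-delimited fragments more concrete, here is a minimal sketch in Python. It is not the author's implementation. The abstract does not define what a border is, so the sketch assumes a border is any run of non-word characters (whitespace or punctuation). The function names `char_ngrams` and `count_matrix` and the length bounds `n_min` and `n_max` are hypothetical choices made for illustration only.

```python
import re
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Return all character n-grams of length n_min..n_max from each fragment.

    Assumption (not from the source): fragments are obtained by splitting
    the text on runs of non-word characters, which stand in for "borders".
    """
    fragments = [f for f in re.split(r"\W+", text.lower()) if f]
    grams = []
    for frag in fragments:
        for n in range(n_min, n_max + 1):
            # Slide a window of width n over the fragment; n-grams never
            # cross a border because each fragment is handled separately.
            for i in range(len(frag) - n + 1):
                grams.append(frag[i:i + n])
    return grams


def count_matrix(docs, n_min=2, n_max=4):
    """Build a document-by-n-gram count matrix as a list of rows."""
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    vocab = sorted({g for c in counts for g in c})
    rows = [[c.get(g, 0) for g in vocab] for c in counts]
    return rows, vocab


if __name__ == "__main__":
    docs = [
        "information retrieval",
        "recherche d'information",
        "استرجاع المعلومات",
    ]
    rows, vocab = count_matrix(docs)
    print(len(vocab), "distinct n-grams over", len(rows), "documents")
```

In an LSA setting, a count matrix of this kind (usually after some weighting) would be factorized with a truncated singular value decomposition so that documents in different languages can be compared in a shared low-dimensional space. That step is omitted here to keep the sketch dependency-free.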