Arabic Name Search

Arabic Name Search™ is Celatro's pattern matching plug-in designed for the retrieval of Arabic personal names.

Matching Arabic names is a notoriously difficult task for a variety of reasons. Arabic names can be morphologically very complex and can contain, in addition to the name itself (i.e., the root), prefixes, articles, and suffixes. Some of these non-root constituents merely create variations of the same name, whereas others distinguish one name from another. For example, "al Zarqawi" and "Zarqawi" could refer to the same person, although the former contains the article "al" and the latter does not. On the other hand, "Abd-ur-Rachman" and "Ab-ur-Rachman" are different names and therefore probably do not refer to the same person: despite having the same root ("Rachman"), they contain different prefixes, "abd" ("servant of") versus "abu" ("father of"). It is therefore very important to have a retrieval mechanism that returns only names with the same semantic value as the query. Celatro's Arabic Name Search has built-in knowledge about the semantic value of all name constituents, and uses this knowledge to create the correct match list for a given query.
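The distinction between constituents that can be ignored and constituents that change the name can be illustrated with a minimal Python sketch. It is not Celatro's implementation: the `ARTICLES` table and the `parse_name` and `same_name` helpers are hypothetical, and the sketch simply assumes that the last remaining token is the root and that every other non-article token is a meaning-bearing prefix.

```python
import re

# Hypothetical table of article-like connectors, for illustration only.
ARTICLES = {"al", "el", "ul", "ur", "ud", "us"}


def parse_name(name):
    """Return (prefixes, root) for one name, ignoring articles.

    Assumption: the last non-article token is the root, and every earlier
    non-article token is a prefix that contributes to the name's meaning.
    """
    tokens = [
        t for t in re.split(r"[\s\-]+", name.lower())
        if t and t not in ARTICLES
    ]
    if not tokens:
        raise ValueError(f"no root found in {name!r}")
    return tuple(tokens[:-1]), tokens[-1]


def same_name(a, b):
    """Two names match only if both their prefixes and their roots agree."""
    return parse_name(a) == parse_name(b)
```

Under these assumptions, `same_name("al Zarqawi", "Zarqawi")` is true because the article is dropped, while `same_name("Abd-ur-Rachman", "Ab-ur-Rachman")` is false because the prefixes differ even though the root is shared.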

An additional problem facing Arabic name retrieval is the huge number of possible name variations in the original language. Some of these variations are a product of spoken-language contractions (e.g. "Nur-al-Din", "Nureddine"; "Abd-al-Salam", "Abdussalam"), and others stem from different pronunciations of the same name in different dialects of Arabic (e.g. "Muhammad", "Emhemmed"). The number of name variations is further enlarged by transliteration from Arabic into Roman script. Not only do different languages (e.g. French and English) have different ways to transliterate the same Arabic name, they also frequently have very vague transliteration rules, which yield many variations even within the same language (e.g. "Gadafi", "Kaddafi", "Ghaddafi", "Quadhafi", etc. in English).
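To make the transliteration problem concrete, the following Python sketch reduces a romanized name to a crude consonant key so that spellings of the same name collide. The `spelling_key` function and its substitution rules are assumptions chosen only to cover the examples above; they are far coarser than a production matcher would need and do not handle contractions such as "Nur-al-Din" and "Nureddine".

```python
import re

# Hypothetical substitutions that fold a few competing romanizations
# of the same sound onto one letter. Applied in order.
SUBSTITUTIONS = [("gh", "k"), ("dh", "d"), ("qu", "k"), ("q", "k"), ("g", "k")]


def spelling_key(name):
    """Build a rough comparison key: fold digraphs, drop vowels, collapse doubles."""
    key = re.sub(r"[^a-z]", "", name.lower())   # keep letters only
    for old, new in SUBSTITUTIONS:
        key = key.replace(old, new)
    key = re.sub(r"[aeiou]", "", key)           # vowels vary the most
    return re.sub(r"(.)\1+", r"\1", key)        # "dd" and "d" are equivalent
```

With these rules, "Gadafi", "Kaddafi", "Ghaddafi", and "Quadhafi" all reduce to the same key, as do "Muhammad" and "Emhemmed". A key this aggressive will also merge some unrelated names, so it is better suited to generating candidates than to deciding a match on its own.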

Arabic Name Search™ takes all the above issues into account. Its sophisticated parsing technologies identify Arabic name constituents (prefixes, articles, roots, and suffixes), irrespective of spelling variations. Each constituent is treated differently, depending on its relative semantic importance. This enables Arabic Name Search to produce accurate results even when operating on incomplete names. If presented with a full name that contains more than one component, it not only parses out the constituents, but also assigns them to an individual component (e.g. the components of "Mohammad Khayr-ud-Dene Al Arussi" are "Mohammad", "Khayr-ud-Dene", "Al Arussi"). This enables the retrieval of lower-scoring partial matches (e.g. "Mohammad Al Arussi"). Many Arabic names have a large number of components, not all of which are necessary to identify an individual, so this approach results in a very high accuracy rate. This is especially true when compared with techniques that generate large lists of name variations from a given Arabic name and then try to find an exact match within the generated list.
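Component-level matching with partial scores can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's scoring method: the `ARTICLES` set, the helper names, and the choice of a Jaccard-style ratio (shared components divided by all distinct components) are all hypothetical.

```python
# Hypothetical set of free-standing articles, for illustration only.
ARTICLES = {"al", "el"}


def split_components(full_name):
    """Group whitespace-separated tokens into components.

    A free-standing article is attached to the token that follows it,
    so "Al Arussi" stays together as a single component.
    """
    components, pending = [], []
    for token in full_name.split():
        pending.append(token)
        if token.lower() not in ARTICLES:
            components.append(" ".join(pending))
            pending = []
    if pending:  # trailing article with nothing to attach to
        components.append(" ".join(pending))
    return components


def component_key(component):
    """Normalize a component for comparison: lowercase, article removed."""
    return " ".join(t for t in component.lower().split() if t not in ARTICLES)


def match_score(query, record):
    """Score in [0, 1]: shared components over all distinct components."""
    q = {component_key(c) for c in split_components(query)}
    r = {component_key(c) for c in split_components(record)}
    if not q or not r:
        return 0.0
    return len(q & r) / len(q | r)
```

Here `split_components("Mohammad Khayr-ud-Dene Al Arussi")` yields the three components named above, and the query "Mohammad Al Arussi" scores 2/3 against that record, compared with 1.0 for the complete name. Scoring components rather than enumerating spelling variants means a missing middle component lowers the score instead of eliminating the match.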