𝔖 Bobbio Scriptorium
✦   LIBER   ✦

An image-based automatic Arabic translation system

✍ Scribed by Yi Chang; Datong Chen; Ying Zhang; Jie Yang


Publisher: Elsevier Science
Year: 2009
Tongue: English
Weight: 377 KB
Volume: 42
Category: Article
ISSN: 0031-3203


✦ Synopsis


In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate text detection as a binary classification problem and apply gradient boosting tree (GBT), support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with a compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment on the Arabic Transparent font, the BLEU score increases from 18.70 to 33.47 with the use of the error correction module.
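The synopsis mentions a bigram language model used to reduce word segmentation errors but gives no detail on how it is applied. The sketch below is one plausible reading, not the authors' implementation: it scores every way of splitting an unspaced string into known words with add-one smoothed bigram probabilities and keeps the best-scoring split. The toy English corpus, the smoothing choice, and the names `CORPUS`, `bigram_logprob`, and `segment` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy training corpus (hypothetical; a real system would use a large Arabic corpus).
CORPUS = [
    "the exit is on the left",
    "the left exit is closed",
    "no exit on the left",
]

# Count unigrams and bigrams, with "<s>" marking the start of each sentence.
unigrams = Counter()
bigrams = Counter()
for sentence in CORPUS:
    words = ["<s>"] + sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

VOCAB = {w for w in unigrams if w != "<s>"}
MAX_LEN = max(len(w) for w in VOCAB)


def bigram_logprob(prev, word):
    """Add-one smoothed log P(word | prev) (smoothing scheme is an assumption)."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + len(VOCAB)))


def segment(text):
    """Return the most probable split of `text` into vocabulary words, or None."""
    # best[i] maps the last word of a segmentation of text[:i] to (score, words).
    best = [dict() for _ in range(len(text) + 1)]
    best[0]["<s>"] = (0.0, [])
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - MAX_LEN), end):
            word = text[start:end]
            if word not in VOCAB:
                continue
            # Extend every segmentation ending at `start` with this word.
            for prev, (score, words) in best[start].items():
                cand = score + bigram_logprob(prev, word)
                if word not in best[end] or cand > best[end][word][0]:
                    best[end][word] = (cand, words + [word])
    if not best[-1]:
        return None  # no split into known words exists
    return max(best[-1].values(), key=lambda item: item[0])[1]


print(segment("theexitisontheleft"))
```

With this toy corpus, `segment("theexitisontheleft")` returns `['the', 'exit', 'is', 'on', 'the', 'left']`, and a string that cannot be split into known words, such as `"xyz"`, returns `None`. Tracking the last word at each position is what lets the bigram score depend on the preceding word rather than on each word in isolation.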


📜 SIMILAR VOLUMES


An automatic English-Arabic HTML page tr
✍ Rached N. Zantout; Ahmed A. Guessoum 📂 Article 📅 2001 🏛 Elsevier Science 🌐 English ⚖ 506 KB

The Internet and the World Wide Web have become an integral part of everyday life, an important source of information and a communication medium. One of the main problems confronting non-English speakers in using the Internet is that it is heavily dominated by the English language. Knowledge of Engl

An automatic system for dirt in pulp ins
✍ F. Duarte; H. Araújo; A. Dourado 📂 Article 📅 1999 🏛 Elsevier Science 🌐 English ⚖ 314 KB

An automatic visual inspection system designed for dirt inspection in the pulp and paper industry is presented. A new hierarchical region oriented segmentation algorithm is introduced. The algorithm is tuned according to the singular characteristics of the pulp samples. A criterion based on the maxi

Automatic registration of brain magnetic
✍ Yeji Han; HyunWook Park 📂 Article 📅 2004 🏛 John Wiley and Sons 🌐 English ⚖ 951 KB

Purpose: To demonstrate a robust registration method of brain magnetic resonance (MR) images based on the Talairach reference system with automatic determinations of the fiducial points. Materials and Methods: Eight specified landmark points of the Talairach reference system are

An image feature-based approach to autom
✍ R. Joe Stanley; Soumya De; Dina Demner-Fushman; Sameer Antani; George R. Thoma 📂 Article 📅 2011 🏛 Elsevier Science 🌐 English ⚖ 565 KB

The illustrations in biomedical publications often provide useful information in aiding clinicians' decisions when full text searching is performed to find evidence in support of a clinical decision. In this research, image analysis and classification techniques are explored to automatically extract