AcTiV (Arabic Text in Video) is the first publicly accessible annotated dataset designed to assess the performance of Arabic video OCR systems. The challenges it addresses are the variability of text patterns (colors, fonts, sizes, positions, etc.) and the presence of complex backgrounds containing objects that resemble text characters. AcTiV enables users to test their systems' abilities to locate, track, and read text objects in videos.
The dataset includes 80 videos (more than 600,000 frames) collected from 4 different Arabic news channels (see Figure 1). Two types of video stream were chosen: Standard-Definition (720x576, 25 fps) and High-Definition (1920x1080, 25 fps). We mainly focus on text displayed as overlay in news video, which can be classified into two types: static and dynamic. Text whose content, position, size, and color do not change within its display interval is considered static. This group usually includes event information, the speaker's name, etc. Dynamic text, as targeted in our study, refers to horizontally scrolling text that usually resides in the lower third of the television screen. On Arabic channels, dynamic text moves from left to right.
|Fig. 1. Typical video frames from the proposed dataset. Top Sub-figures: examples of Russia Today and ElWataniya1 frames. Bottom Sub-figures: examples of Aljazeera HD and France 24 frames.|
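The static/dynamic distinction above can be illustrated with a minimal Python sketch. The `TextObservation` structure and the `is_static` helper are hypothetical names introduced only for this illustration; they are not part of the AcTiV tools or annotation format. The sketch assumes a text object has already been tracked across the frames of its display interval.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TextObservation:
    """One text box as seen in a single frame (hypothetical structure)."""
    frame: int
    content: str
    box: Tuple[int, int, int, int]  # x, y, width, height in pixels
    color: str


def is_static(track: List[TextObservation]) -> bool:
    """Return True if content, position, size, and color never change."""
    if not track:
        raise ValueError("track must contain at least one observation")
    first = track[0]
    return all(
        obs.content == first.content
        and obs.box == first.box
        and obs.color == first.color
        for obs in track
    )


# A caption that stays put for three frames (made-up values).
caption = [TextObservation(f, "speaker name", (100, 480, 300, 40), "white")
           for f in range(3)]
# A scrolling line whose x position shifts by 4 pixels per frame (made-up values).
ticker = [TextObservation(f, "news line", (10 + 4 * f, 540, 200, 30), "white")
          for f in range(3)]

print(is_static(caption), is_static(ticker))
```

Exact equality is used here for simplicity; a real tracker would probably need a tolerance on box coordinates and color.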
The annotation process consists of two levels. The global annotation, which concerns the entire video, is performed manually through a user interface. This step produces the global XML file. The local annotation, which concerns any specific frame extracted from the video, is done automatically from the information contained in the global metafile (for more details about the annotation framework, see Zayene et al., 2014).
|Fig. 2. A sample frame (a) and its corresponding specific XML file (b)|
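The two-level scheme described above, a manually produced global file from which per-frame files are derived automatically, can be sketched as follows. The XML element and attribute names (`video`, `textbox`, `start`, `end`, `transcription`, and so on) are assumptions made for this example and do not reproduce the actual AcTiV schema. The sketch covers static text only; scrolling text would additionally need a per-frame position.

```python
import xml.etree.ElementTree as ET

# Hypothetical global annotation: each text box has a display interval
# (start/end frame indices, inclusive) and a fixed position.
GLOBAL_XML = """
<video name="example" fps="25">
  <textbox id="1" start="0" end="2" x="100" y="480" width="300" height="40">
    <transcription>speaker name</transcription>
  </textbox>
  <textbox id="2" start="2" end="3" x="50" y="20" width="200" height="30">
    <transcription>event information</transcription>
  </textbox>
</video>
""".strip()


def local_annotation(global_root: ET.Element, frame_index: int) -> ET.Element:
    """Build a per-frame element with the text boxes visible at frame_index."""
    frame = ET.Element(
        "frame",
        {"video": global_root.get("name", ""), "index": str(frame_index)},
    )
    for box in global_root.iter("textbox"):
        start, end = int(box.get("start")), int(box.get("end"))
        if start <= frame_index <= end:
            # Copy the geometry and transcription into the frame-level file.
            attrs = {k: box.get(k) for k in ("id", "x", "y", "width", "height")}
            local = ET.SubElement(frame, "textbox", attrs)
            ET.SubElement(local, "transcription").text = box.findtext("transcription")
    return frame


root = ET.fromstring(GLOBAL_XML)
frame_2 = local_annotation(root, 2)
print(ET.tostring(frame_2, encoding="unicode"))
```

Because every frame-level file is computed from the single global file, a correction made once at the global level carries over to all affected frames when the local files are regenerated.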
Video OCR; Video database; benchmark; Arabic text
 Zayene, O., S. M. Touj, J. Hennebert, R. Ingold, and N. E. Ben Amara, "Semi-automatic news video annotation framework for Arabic text", Image Processing Theory, Tools and Applications (IPTA), 2014 4th International Conference on, pp. 1-6, Oct, 2014.
Download: license agreement AcTiV Database