SUMMARY : Session O24-EW Evaluation of Information Retrieval

 

Title Building a Heterogeneous Information Retrieval Test Collection of Arabic Document Images
Authors A. Abdelsapor, N. Adly, K. Darwish, O. Emam, W. Magdy, M. Nagi
Abstract This paper describes the development of an Arabic document image collection containing 34,651 documents from 1,378 different books, along with 25 topics and their relevance judgments. The books from which the collection is drawn are part of a larger collection of 75,000 books being scanned for archival and retrieval at the Bibliotheca Alexandrina (BA). The documents in the collection vary widely in topic, font, and degradation level. Initial baseline experiments were performed to examine the effectiveness of different index terms, with and without blind relevance feedback, on OCR-degraded Arabic text.
Keywords
Full paper Building a Heterogeneous Information Retrieval Test Collection of Arabic Document Images