LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title A Self-Expanding Corpus Based on Newspapers on the Web
Authors Hofland Knut (HIT Centre, University of Bergen, Allegt. 27, N-5007 Bergen, Norway, email: Knut.Hofland@hit.uib.no)
Keywords Batch Download, Corpus, Newspapers, Web, Web-Based Concordance
Session Session WO15 - Language Resources Projects
Abstract A Unix-based system is presented which automatically collects newspaper articles from the web, converts the texts, and includes them in a newspaper corpus. The corpus can be searched from a web browser. It currently contains 70 million words and grows by 4 million words each month.
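
The collect, convert, and include steps named in the abstract can be illustrated with a minimal Python sketch. It is not the system described in the paper: the function names (`html_to_text`, `add_to_corpus`, `corpus_size`) and the in-memory dictionary used as a corpus are assumptions made for illustration, and the download step is left out, so already-fetched HTML is passed in as a string.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the content of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Conversion step (assumed): strip markup and return plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def add_to_corpus(corpus, url, html):
    """Inclusion step (assumed): store the text under its URL.

    `corpus` is a hypothetical dict mapping URL -> plain text.
    Returns the number of words added; 0 if the URL was already stored.
    """
    if url in corpus:
        return 0
    text = html_to_text(html)
    corpus[url] = text
    return len(text.split())


def corpus_size(corpus):
    """Total number of whitespace-separated words in the corpus."""
    return sum(len(text.split()) for text in corpus.values())
```

Keying the corpus on the article URL is one simple way to avoid counting the same article twice when a batch download is repeated; a real system would likely need a more robust duplicate check.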

 
