Title

Progress on Multi-lingual Named Entity Annotation Guidelines using RDF(S)

Authors

Nigel Collier (National Institute of Informatics 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan)

Koichi Takeuchi (National Institute of Informatics 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan)

Chikashi Nobata (Communications Research Laboratory 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan)

Junichi Fukumoto (Ritsumeikan University Noji-higashi, Kusatsu-shi, Shiga 525-8577, Japan)

Norihiro Ogata (Osaka University 1-8 Machikaneyama, Toyonaka, Osaka, Japan)

Session

WO23: Corpus Analysis, Annotation, Representation

Abstract

This paper provides a discussion and concise summary of the PIA (Portable Information Access) project guidelines for annotators and tool developers on annotating what we call named entity ‘plus’ (NE+) expressions, such as individual names or technical terms, that we want to distinguish, for whatever reason, from the rest of a text. In particular, we consider how to annotate locally ambiguous syntactic and semantic structures. We provide notation that conforms to RDF(S) so that annotated documents can have their content accessed on the Semantic Web, i.e. the next-generation World Wide Web. In this new framework, named entities become instances of concepts in an explicit ontology, and the base text provides links to the annotation and ontology data files.

Keywords

Multilingual, RDFS

Full Paper

329.pdf