Title | Exploiting Anchor Text as a Lexical Resource |
Author(s) | Peter Anick, Yahoo! |
Session | P6-T |
Abstract | Anchor texts, the strings associated with hyperlinks on a web page, are currently employed to express millions of referrals to sites and topics on the World Wide Web. We consider how these strings might be exploited as a lexical resource, particularly when viewed from the perspective of their target documents rather than their sources. We find that for many target pages, incoming anchors form a miniature corpus of reference expressions whose properties, in relation both to other target sites and to each other, can be put to use for mining lexical information. |
Keyword(s) | Anchor text, data mining, entity extraction, proper names, hyperlinks, World Wide Web |
Language(s) | English |
Full Paper | 756.pdf |