<HTML>
<HEAD>
<TITLE>LREC 2000 - Paper 141 summary</title>
<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript">
<!--
// preload images:
 if(document.images)
  {
  hom_d= new Image(100,20);   hom_d.src="../eikones/hom_d.gif";
  pap_g=new Image(100,20);    pap_g.src="../eikones/pap_g.gif";  
  pap_d=new Image(100,20);    pap_d.src="../eikones/pap_d.gif";  
  pap_l=new Image(100,20);    pap_l.src="../eikones/pap_l.gif";  
  hom_l=new Image(100,20);    hom_l.src="../eikones/hom_l.gif";
  aut_d=new Image(100,20);     aut_d.src="../eikones/aut_d.gif";
  aut_l=new Image(100,20);    aut_l.src="../eikones/aut_l.gif";
  Key_d=new Image(100,20);    Key_d.src="../eikones/Key_d.gif";
  Key_l=new Image(100,20);   Key_l.src="../eikones/Key_l.gif";
  ses_d=new Image(100,20);    ses_d.src="../eikones/ses_d.gif";
  ses_l=new Image(100,20);   ses_l.src="../eikones/ses_l.gif";
  abs_l=new Image(100,20);   abs_l.src="../eikones/abs_l.gif";
  abs_d=new Image(100,20);    abs_d.src="../eikones/abs_d.gif";
}

function changimg(imgName,imgObjName)
 {
  if (document.images)
   {
   document.images[imgName].src=window[imgObjName].src;
   }
 }
//-->
</SCRIPT>

</HEAD>
<BODY marginwidth="0" marginheight="0" leftmargin="0" topmargin="0" rightmargin="0"  background="../eikones/fonto.jpg">
<TABLE align="center" border="0" width="100%" cellspacing="0" cellpadding="0" >
<TR>
<TD height="50" valign="center" colspan="7" bgcolor="#003163"><font face="Arial" size="4" color="#ffffff"><b>LREC 2000</b> 2<sup>nd</sup>
      International Conference on Language Resources &amp; Evaluation</font></TD>
</TR>
 <tr bgcolor="#003162">
 <td width="100" valign="center"><A href="../../default.htm" onmouseout="changimg('home','hom_d')" onmouseover="changimg('home','hom_l')"><IMG border="0" height="20" name="home" src="../eikones/hom_d.gif" width="100"></A></td>
 <TD width="100"><A href="../session.htm" onmouseout="changimg('sessions','ses_d')" onmouseover="changimg('sessions','ses_l')"><IMG border="0" height="20" name="sessions" src="../eikones/ses_d.gif" width="100"></A></TD>
 <TD width="100"><A href="../paper.htm" onmouseout="changimg('papers','pap_d')" onmouseover="changimg('papers','pap_l')"><IMG border="0" height="20" name="papers" src="../eikones/pap_d.gif" width="100"></a></TD>
 <TD width="100"><A href="../abstract.htm" onmouseout="changimg('abstracts','abs_d')" onmouseover="changimg('abstracts','abs_l')"><IMG border="0" height="20"  name="abstracts" src="../eikones/abs_d.gif" width="100"></A></TD>
 <TD width="100"><A href="../author.htm" onmouseout="changimg('authors','aut_d')" onmouseover="changimg('authors','aut_l')"><IMG border="0" height="20"  name="authors" src="../eikones/aut_d.gif" width="100"></a></TD>
 <TD width="100"><A href="../keyword.htm" onmouseout="changimg('keywords','Key_d')" onmouseover="changimg('keywords','Key_l')"><IMG border="0" height="20" name="keywords" src="../eikones/Key_d.gif" width="100"></A></TD>
<td width="1000">&nbsp;</td>
 </tr>
 </TABLE>
<BLOCKQUOTE style="MARGIN-RIGHT: 0px">
  <P><A href="140.htm">Previous Paper</A>&nbsp;&nbsp; <A href="142.htm">Next Paper</A></P></BLOCKQUOTE>
  <center>
<TABLE width="95%" Align="center" Border="1" bordercolor="#669999" cellspacing="1">
    <tr>
      <td width="15%" height="40"><b>Title</b></td>
      <td width="85%" height="40"><font color="#990033" size="4">Integrating Seed Names and ngrams for a Named Entity List and Classifier</font></td>
    </tr>
    <tr>
      <td height="40"><b>Authors</b></td>
      <td height="40"><font color="#006600">Buchholz Sabine</font> (ILK / Computational Linguistics, Tilburg University, P.O. Box 90153, NL-5000 LE Tilburg, The Netherlands, email: S.Buchholz@kub.nl, http://ilk.kub.nl)<br><font color="#006600">van den Bosch Antal</font> (ILK / Computational Linguistics, Tilburg University, P.O. Box 90153, NL-5000 LE Tilburg, The Netherlands, email: vdnBosch@kub.nl, http://ilk.kub.nl)</td>
    </tr>
    <tr>
      <td height="40"><b>Keywords</b></td>
      <td height="40">&nbsp;</td>
    </tr>
      <tr>
      <td height="40"><b>Session</b></td>
      <td height="40">Session WO14 - Named Entity Recognition</td>
    </tr>
     <tr>
      <td height="40"><b>Full Paper</b></td>
            <td height="40"><a href="../../ps/141.ps" target="newps" type="application/postscript">141.ps</a>, <a href="../../pdf/141.pdf" target="newpdf" type="application/pdf">141.pdf</a></td>
    </tr>
      <tr>
      <td height="40"><b>Abstract</b></td>
             <td height="40">We present a method for building a named-entity list and a machine-learned named-entity classifier from a corpus of Dutch newspaper text, a rule-based named-entity recognizer, and labeled seed name lists taken from the internet. The seed names, each labeled as a PERSON, LOCATION, ORGANIZATION, or ADJECTIVAL name, are looked up in an 83-million-word corpus, and their immediate contexts are stored as instances of their label. These context 8-grams are used by a memory-based machine learning algorithm that, after training, (i) can produce high-precision labeling of instances to be added to the seed lists, and (ii) can, more generally, label new, unseen names. Unlabeled named-entity types are labeled with a precision of 61% and a recall of 56%. On free text, named-entity token labeling accuracy is 71%.</td>
    </tr>
  </table><br>
  </center>
</BODY>
</html>