<HTML>
<HEAD>
<TITLE>LREC 2000 - Paper 287 summary</title>
<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript">
<!--
// preload images:
 if(document.images)
  {
  hom_d= new Image(100,20);   hom_d.src="../eikones/hom_d.gif";
  pap_g=new Image(100,20);    pap_g.src="../eikones/pap_g.gif";  
  pap_d=new Image(100,20);    pap_d.src="../eikones/pap_d.gif";  
  pap_l=new Image(100,20);    pap_l.src="../eikones/pap_l.gif";  
  hom_l=new Image(100,20);    hom_l.src="../eikones/hom_l.gif";
  aut_d=new Image(100,20);     aut_d.src="../eikones/aut_d.gif";
  aut_l=new Image(100,20);    aut_l.src="../eikones/aut_l.gif";
  Key_d=new Image(100,20);    Key_d.src="../eikones/Key_d.gif";
  Key_l=new Image(100,20);   Key_l.src="../eikones/Key_l.gif";
  ses_d=new Image(100,20);    ses_d.src="../eikones/ses_d.gif";
  ses_l=new Image(100,20);   ses_l.src="../eikones/ses_l.gif";
  abs_l=new Image(100,20);   abs_l.src="../eikones/abs_l.gif";
  abs_d=new Image(100,20);    abs_d.src="../eikones/abs_d.gif";
}

function changimg(imgName,imgObjName)
 {
  if (document.images)
   {
   // Look up the preloaded Image by its global name instead of using eval().
   document.images[imgName].src=window[imgObjName].src;
   }
 }
//-->
</SCRIPT>

</HEAD>
<BODY marginwidth="0" marginheight="0" leftmargin="0" topmargin="0" rightmargin="0"  background="../eikones/fonto.jpg">
<TABLE align="center" border="0" width="100%" cellspacing="0" cellpadding="0" >
<TR>
<TD height="50" valign="center" colspan="7" bgcolor="#003163"><font face="Arial" size="4" color="#ffffff"><b>LREC 2000</b> 2<sup>nd</sup>
      International Conference on Language Resources &amp; Evaluation</font></TD>
</TR>
 <tr bgcolor="#003163">
 <td width="100" valign="center"><A href="../../default.htm" onmouseout="changimg('home','hom_d')" onmouseover="changimg('home','hom_l')"><IMG border="0" height="20" name="home" src="../eikones/hom_d.gif" width="100"></A></td>
 <TD width="100"><A href="../session.htm" onmouseout="changimg('sessions','ses_d')" onmouseover="changimg('sessions','ses_l')"><IMG border="0" height="20" name="sessions" src="../eikones/ses_d.gif" width="100"></A></TD>
 <TD width="100"><A href="../paper.htm" onmouseout="changimg('papers','pap_d')" onmouseover="changimg('papers','pap_l')"><IMG border="0" height="20" name="papers" src="../eikones/pap_d.gif" width="100"></a></TD>
 <TD width="100"><A href="../abstract.htm" onmouseout="changimg('abstracts','abs_d')" onmouseover="changimg('abstracts','abs_l')"><IMG border="0" height="20"  name="abstracts" src="../eikones/abs_d.gif" width="100"></A></TD>
 <TD width="100"><A href="../author.htm" onmouseout="changimg('authors','aut_d')" onmouseover="changimg('authors','aut_l')"><IMG border="0" height="20"  name="authors" src="../eikones/aut_d.gif" width="100"></a></TD>
 <TD width="100"><A href="../keyword.htm" onmouseout="changimg('keywords','Key_d')" onmouseover="changimg('keywords','Key_l')"><IMG border="0" height="20" name="keywords" src="../eikones/Key_d.gif" width="100"></A></TD>
<td width="1000">&nbsp;</td>
 </tr>
 </TABLE>
<BLOCKQUOTE style="MARGIN-RIGHT: 0px">
  <P><A href="286.htm">Previous Paper</A>&nbsp;&nbsp; <A href="288.htm">Next Paper</A></P></BLOCKQUOTE>
  <center>
<TABLE width="95%" Align="center" Border="1" bordercolor="#669999" cellspacing="1">
    <tr>
      <td width="15%" height="40"><b>Title</b></td>
      <td width="85%" height="40"><font color="#990033" size="4">Developing Guidelines and Ensuring Consistency for Chinese Text Annotation</font></td>
    </tr>
    <tr>
      <td height="40"><b>Authors</b></td>
      <td height="40"><font color="#006600">Xia Fei</font> (Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA, fxia@linc.cis.upenn.edu)<br><font color="#006600">Palmer Martha</font> (Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA, mpalmer@linc.cis.upenn.edu)<br><font color="#006600">Xue Nianwen</font> (Linguistics Department, University of Delaware, Newark, DE 19716, USA, xueniwen@UDel.Edu)<br><font color="#006600">Okurowski Mary Ellen</font> (US Department of Defense, Ft. Meade, MD 20755, USA, meokuro@super.org)<br><font color="#006600">Kovarik John</font> (US Department of Defense, Ft. Meade, MD 20755, USA, kovariks@worldnet.att.net)<br><font color="#006600">Chiou Fu-Dong</font> (Linguistics Department, University of Pennsylvania, Philadelphia, PA 19104, USA, chioufd@linc.cis.upenn.edu)<br><font color="#006600">Huang Shizhe</font> (East Asian Studies Program, Haverford College, Haverford, PA 19041, USA, shuang@haverford.edu)<br><font color="#006600">Kroch Tony</font> (Linguistics Department, University of Pennsylvania, Philadelphia, PA 19104, USA, kroch@linc.cis.upenn.edu)<br><font color="#006600">Marcus Mitch</font> (Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA, mitch@linc.cis.upenn.edu)</td>
    </tr>
    <tr>
      <td height="40"><b>Keywords</b></td>
      <td height="40">Annotation Guidelines, Bracketed Corpus (Treebank), Chinese Language Processing, Quality Control</td>
    </tr>
      <tr>
      <td height="40"><b>Session</b></td>
      <td height="40">Session WO1 - Corpus Tagging</td>
    </tr>
     <tr>
      <td height="40"><b>Full Paper</b></td>
            <td height="40"><a href="../../ps/287.ps" target="newps" type="application/postscript">287.ps</a>, <a href="../../pdf/287.pdf" target="newpdf" type="application/pdf">287.pdf</a></td>
    </tr>
      <tr>
      <td height="40"><b>Abstract</b></td>
             <td height="40">With growing interest in Chinese Language Processing, numerous NLP tools (e.g., word segmenters, part-of-speech taggers, and parsers) for Chinese have been developed all over the world. However, since no large-scale bracketed corpora are available to the public, these tools are trained on corpora with different segmentation criteria, part-of-speech tagsets, and bracketing guidelines, which makes comparisons difficult. As a first step towards addressing this issue, we have been preparing a 100-thousand-word bracketed corpus since late 1998 and plan to release it to the public in summer 2000. In this paper, we address several challenges in building the corpus, namely creating annotation guidelines, ensuring annotation accuracy, and maintaining a high level of community involvement.</td>
    </tr>
  </table><br>
  </center>
</BODY>
</html>