NER
Information Extraction (IE) is the task of extracting information of
interest from unconstrained text. IE involves two main tasks:
recognizing named entities and recognizing the relationships among
them. Named Entity Recognition (NER) involves identifying proper names
in text and classifying them into types of named entities (e.g.,
persons, organizations, locations). NER is important not only in IE
but also in lexical acquisition for the development of robust NLP
systems [4]. Moreover, NER has proven fruitful for tasks such as
document indexing and the maintenance of databases containing
identified named entities.
We concentrate on Chinese NER. We propose a hybrid method that
combines the advantages of rule-based and machine learning (ML) based
NER systems. Rule-based NER systems can explicitly encode human
knowledge and are easy to tune, while ML-based systems are robust,
portable, and inexpensive to develop. Our hybrid system, Mencius,
incorporates a rule-based knowledge representation and
template-matching tool, InfoMap, into a maximum entropy (ME)
framework. Named entities are represented in InfoMap as templates,
which serve as ME features in Mencius. These features are edited
manually, and their weights are estimated by the ME framework from the
training data. To avoid errors caused by word segmentation, we model
NER as a character-based tagging problem. In our experiments, Mencius
outperforms both pure rule-based and pure ME-based NER systems. The
F-measures for person names (PER), location names (LOC), and
organization names (ORG) are 94.3%, 77.8%, and 75.3%, respectively. We
also compared NER results with and without word segmentation and found
only slight differences.
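To illustrate what modeling NER as a character-based tagging problem can look like, here is a minimal sketch. It assumes a simple B/I/O tag scheme (begin, inside, outside) with the PER/LOC/ORG types named above; the tag set Mencius actually uses is not stated here, and the function names and the example sentence are hypothetical. Each character receives exactly one tag, so no word segmentation step is needed before tagging.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type); an illustrative representation only.
Span = Tuple[int, int, str]


def spans_to_char_tags(text: str, spans: List[Span]) -> List[str]:
    """Assign one tag per character: B-X opens an entity, I-X continues it, O is outside."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def char_tags_to_spans(tags: List[str]) -> List[Span]:
    """Recover entity spans from a per-character tag sequence."""
    spans: List[Span] = []
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the end of the text.
    for i, tag in enumerate(tags + ["O"]):
        if start is not None and tag != f"I-{etype}":
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


# Hypothetical example: a person name followed by a location name.
text = "王小明在台北"
gold = [(0, 3, "PER"), (4, 6, "LOC")]
tags = spans_to_char_tags(text, gold)
recovered = char_tags_to_spans(tags)
```

In a tagger of this kind, a classifier such as an ME model would predict the tag of each character from features of its context, and the decoding step above would turn the predicted tags back into entities.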
Demo site URL:
Wen-Lian Hsu
Professor, IEEE Fellow
Research Fellow
Institute of Information Science,
Academia Sinica, Taipei, Taiwan, R.O.C.
Phone: 886-2-27883799 ext. 1804
Fax: 886-2-27824814
E-mail: hsu@iis.sinica.edu.tw
Ting-Yi Sung
Research Fellow
Institute of Information Science,
Academia Sinica, Taipei, Taiwan, R.O.C.
Phone: 886-2-27883799 ext. 1711
Fax: 886-2-27824814
E-mail: tsung iis.sinica.edu.tw