Research Interests

Our main research topic is natural language understanding. Nearly all of the following systems require some degree of language understanding to achieve high precision: semantic search on the web, Chinese voice input and output, spelling checking, and machine translation. Our Chinese input system, GOING, which automatically translates a chu-in sequence into characters with a hit ratio close to 96%, is widely used in Taiwan. It received the Distinguished Chinese Information Product Award (中文傑出資訊產品獎) in 1993. In the PC Home software download area, GOING has been downloaded 600,000 times. Among the top 10 downloaded programs, it is the only one developed domestically.

Our model for concept understanding can utilize heterogeneous knowledge representation systems. We have extended this model to intelligent agents on the Internet, especially database agents. These software agents will become indispensable in semantic search engines and electronic commerce on the Internet. Another direction we are moving into is the development of educational tutoring systems. We have successfully implemented a system that can understand and solve (and explain how to solve) third-grade primary school mathematics word problems.

Our major achievement is the development of a knowledge representation kernel, InfoMap, for the semantic analysis of natural language, which can be applied to a wide variety of application systems. We are currently using this kernel to develop an Intelligent Knowledge Management System over the World Wide Web. Several technology transfer programs with private companies are currently under way.

In DNA sequence analysis, we have been studying the physical mapping problem and the clone assembly problem. We have developed an error-tolerant algorithm for the clone assembly problem (as well as the physical mapping problem) that, when the experimental error is within 15%, can determine the relative positions of the clones (respectively, the probes) given the clone overlap relationships. By combining our knowledge management tools, InfoMap, and our natural language agent, we are currently constructing a Question Answering system for genomic and proteomic knowledge. We shall further extend this system to help biologists execute certain natural language scripts automatically in their dry labs. Finally, we shall utilize InfoMap to facilitate the accurate search of various relationships in the biological literature.