Sirindhorn International Institute of Technology, Thammasat University, Thailand
The Royal Society of Thailand
This talk introduces the characteristics of the Thai language from a computational point of view and gives a historical overview of Thai language processing research. Thai is one of the more difficult languages to analyze because it has optional vowel expression, no word boundaries, no sentence boundaries, flexible word composition, and flexible grammatical structure. Advances in Thai language processing have so far been limited by a lack of language resources and by the language's naturally low degree of structural standardization. Successful processing of Thai tends to depend strongly on semantic-level processing, which is the hardest part from both theoretical and practical points of view. Over the past four decades, Thai researchers have conducted studies on Thai-to-English translation (a pioneering project in 1981) and the Multilingual Machine Translation project (1986), followed by work on information retrieval and information extraction, speech recognition and synthesis, optical character recognition, text categorization, text summarization, and sentiment analysis. Among Thai language processing applications, text summarization is one of the most challenging research topics, since it requires combining multiple processing levels, from character-level to semantic-level components, in a setting with no explicit word, phrase, or sentence boundaries and with flexible word and grammatical structure. As part of a series of research works on automatic summarization of multiple Thai news documents, the talk discusses studies on word segmentation, named entity (NE) extraction and recognition, part-of-speech tagging, predicate-oriented relations among words, document relation discovery, and the text summarization paradigm.
Thanaruk Theeramunkong is currently a professor at the School of Information, Computer and Communication Technology at Sirindhorn International Institute of Technology (SIIT), Thammasat University, Bangkok, Thailand. He is also the Program Director of Information and Communication Technology for Embedded Systems (ICTES) at TAIST Tokyo Tech, National Science and Technology Development Agency (NSTDA). He received his bachelor's degree in Electrical and Electronics Engineering and his master's and doctoral degrees in Computer Science from Tokyo Institute of Technology. He was a research associate at the Japan Advanced Institute of Science and Technology and an MIS manager at C.P. Seven Eleven Public Co., Ltd. in Thailand. He has received several awards, including the Very Good Research Award in the engineering field from Thammasat University in 2008, 2009 and 2010. In 2014, he received the National Outstanding Researcher Award in the field of Information Technology and Communication Arts. He has also received several best paper awards from conferences and societies, including the Japanese Society for Artificial Intelligence, PAKDD workshops, and KICSS. In 2015, he received a Gold Medal with the Congratulations of the Jury at the 43rd International Exhibition of Inventions of Geneva for his inventions in automatic semantic-based multi-document summarization and its application to public hearings. His research interests are natural language processing, data mining, text mining, machine learning, and applications to service science. He is a member of the Steering Committee of the Pacific-Asia Conferences on Knowledge Discovery and Data Mining (PAKDD) and an associate editor of the Institute of Electronics, Information and Communication Engineers (IEICE). He is the author of more than 40 papers in journals with impact factors and more than 100 conference papers.