Biography

I am currently working as a research assistant on the Speech and Text Understanding Research Team at the National Electronics and Computer Technology Center (NECTEC). My responsibilities are to research and develop technologies related to human intelligence, including natural language understanding, information retrieval, and social text analysis. I am especially interested in analyzing the semantics of sentences posed by humans, using large amounts of text, to improve question answering systems. I am also an open-minded software engineer with broad interests in artificial intelligence. I have been studying information retrieval, natural language processing, machine learning, and deep learning.

Experiences

FernUniversität in Hagen
Research Assistant and PhD Student
National Electronics and Computer Technology Center
Research Assistant
Conducting research and development on technologies related to human-computer interaction, including natural language, information retrieval and technologies for social text analysis
National Software Contest
Committee
Reviewing project proposals in the field of artificial intelligence
Ratchaphruek University
Lecturer
Teaching object-oriented programming, introduction to programming, web programming, systems analysis and design, and database management systems

Projects

  2019 Thai Word Similarity using Deep Learning: Creating a tool for finding related words based on text meaning from Thai Wikipedia articles.

  2019 Reviewers and Journals Selection System for Thai Editors: The system consists of two modules: (1) Reviewer Finder, which automatically suggests potential reviewers within the research field of a manuscript, and (2) Journal Finder, which automatically finds suitable journals for an author that match the details of a submitted manuscript.

  2019 Thai Question Answering Framework: The key objective of the Thai question answering framework is to automatically generate a correct answer to a question by finding relevant documents containing answer strings in Thai Wikipedia articles. The framework consists of three main components: (1) question processing, (2) passage retrieval, and (3) answer processing.

  2019 ThaiQCor 2.0: Thai Query Correction via Soundex and Word Approximation: ThaiQCor is a tool for correcting spelling errors that arise from typographical and cognitive mistakes during typing.

  2018 Journal Extraction and Plagiarism Checking System for TCI: A tool for metadata and bibliographic data extraction and plagiarism checking.

  2015 Thesis and Academic Work Plagiarism Checking and Management System: This system is divided into two main parts: (1) CopyCatch is a system for detecting plagiarized content in both Thai and English. CopyCatch can handle different types of plagiarized text, from copy and paste to more difficult cases in which content is partially modified. (2) MyCatch is a thesis submission management system that helps students, instructors, and department administrators work seamlessly throughout the thesis submission workflow. It can also help draft a thesis and store the completed version.

  2013 Thai PDF Converter Tool: A tool that performs text cleansing based on lexical analysis.

  2012 A Scalable Soundex Search System for a Very Large Lexicon: A tool that applies a grapheme-to-phoneme converter to both the search term and the database and then performs approximate string matching on phoneme sequences, where similarity is defined in terms of pronunciation edit distance. Further improvement can be gained by modifying the edit distance cost function to reflect sound similarity in Thai.
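
  The matching step described above can be illustrated with a minimal sketch. It assumes that the grapheme-to-phoneme conversion has already been done, so both the query and the lexicon entries are given as phoneme sequences. The phoneme groups, the cost values, and the names `SIMILAR_GROUPS`, `substitution_cost`, `phoneme_distance`, and `search` are hypothetical choices for illustration and are not taken from the original system.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical groups of similar-sounding phonemes (illustration only,
# not the groups used in the original system).
SIMILAR_GROUPS = [{"p", "ph", "b"}, {"t", "th", "d"}, {"k", "kh"}]


def substitution_cost(a: str, b: str) -> float:
    """Return 0 for identical phonemes, 0.5 for similar ones, 1 otherwise."""
    if a == b:
        return 0.0
    for group in SIMILAR_GROUPS:
        if a in group and b in group:
            return 0.5
    return 1.0


def phoneme_distance(x: Sequence[str], y: Sequence[str]) -> float:
    """Weighted edit distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = [float(j) for j in range(len(y) + 1)]
    for i, a in enumerate(x, start=1):
        curr = [float(i)]
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1.0,                          # delete a
                curr[j - 1] + 1.0,                      # insert b
                prev[j - 1] + substitution_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]


def search(query: Sequence[str],
           lexicon: Dict[str, List[str]],
           max_distance: float = 1.0) -> List[Tuple[float, str]]:
    """Return (distance, word) pairs within max_distance, closest first."""
    scored = ((phoneme_distance(query, phones), word)
              for word, phones in lexicon.items())
    return sorted((d, w) for d, w in scored if d <= max_distance)
```

  This sketch scans the whole lexicon linearly, so it shows only the distance function and not the indexing that a very large lexicon would need.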

  2011 TVIS-Flood: Location-based Information System in Flood Situation: TVIS-Flood, a location-based information system for flood situations, consists of two main components: (1) a user location identifier and (2) a social media text processor. For the location identifier, a hybrid approach that considers coordinates from both GPS and a telephone network is proposed to solve the problem of GPS signal loss. For the social media text processor, a text categorization technique is applied to classify information from social media into informative groups, making it easier for a user to view. Location extraction is also applied to obtain geographic coordinates automatically from a given text.

  2010 Opinion Mining System for Hotel Reviews: The system combines opinion mining with information visualization to present summarized opinions in a graphical view. First, the system receives as input a tourist’s opinion written in Thai. Then, the opinion is analyzed by a feature-based sentiment analysis and summarization method. The result of this process is a summarized opinion in which the feature and polarity of each opinion segment have been determined. Finally, the summarized opinion is converted to a graphical view.
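
  The summarization step can be sketched as follows, assuming each opinion segment has already been tagged with a feature and a polarity. The tagging itself is not shown, and the function name `summarize_opinions` and the two polarity labels are hypothetical choices for illustration, not details of the original system.

```python
from typing import Dict, Iterable, Tuple

POLARITIES = ("positive", "negative")  # assumed label set


def summarize_opinions(
    segments: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, int]]:
    """Count positive and negative segments for each feature."""
    summary: Dict[str, Dict[str, int]] = {}
    for feature, polarity in segments:
        if polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {polarity}")
        # Start every feature with zero counts for both polarities.
        counts = summary.setdefault(feature, {p: 0 for p in POLARITIES})
        counts[polarity] += 1
    return summary
```

  The per-feature counts are the kind of data a chart could be drawn from, for example one bar per feature split by polarity.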

  2008 Expert Finder System for Thai National Research Repository: The system adopts several techniques, including information retrieval, text mining, and information visualization. It includes four key processes: (1) collect project and publication information from several databases, (2) develop a search engine for the collected information, (3) perform content analysis to extract expertise keywords for each expert, and (4) apply social network concepts with information visualization to display social and topical relationships among researchers. The system can be used for decision support and knowledge management tasks. For example, a project analyst could search for a list of experts in a certain domain to be included in a committee or to serve as project consultants. An organization could use the system to plan project collaboration with other organizations.

  2007 ThaiReSearch: Thailand's Research Information Search Portal: ThaiReSearch integrates research information from various databases, such as researchers, research projects, patent records, and publications. The framework also provides an intelligent information analysis module that incorporates statistical analysis and natural language processing (NLP).

Talks

- Development of Reviewers and Journals Selection System [slides]

- Thai Question Answering [slides][article]

- EPS Integration Process with ThaiJO [slides]

- Tag Suggestion [slides]

- TCI Format Checker [slides]

- Scopus API [slides]

- Expertise Keywords Representation Identifying From Scopus Database [slides]

- CopyCatch: Thai Plagiarism Detection [slides][website]

- CopyAlert: Monitoring and Protecting Your Valuable Contents [slides]

Publications

  • Santipong Thaiprayoon, Herwig Unger, and Mario Kubek, "Graph and Centroid-based Word Clustering," In Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval (NLPIR 2020), 2020. [pdf][slides]

  • Santipong Thaiprayoon, Kanokorn Trakultaweekoon, and Pornpimon Palingoon, "Design and Development of a Plagiarism Corpus in Thai for Plagiarism Detection," In Proceedings of the 11th International Conference on Knowledge and Systems Engineering (KSE 2019), 2019. [pdf][slides]

  • Santipong Thaiprayoon, Pornpimon Palingoon, Kanokorn Trakultaweekoon, and Supon Klaithin, "Developing a Framework for a Thai Plagiarism Corpus," In Proceedings of the 16th International Conference of the Pacific Association for Computational Linguistics (PACLING 2019), 2019.

  • Kanokorn Trakultaweekoon, Santipong Thaiprayoon, and Anocha Rugchatjaroen, "The First Wikipedia Questions and Factoid Answers Corpus in the Thai Language," In Proceedings of the 14th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP 2019), 2019. [pdf][slides]

  • Supon Klaithin, Pornpimon Palingoon, Kanokorn Trakultaweekoon, and Santipong Thaiprayoon, "Annotation-tool for creating Thai plagiarism corpus," In Proceedings of the 14th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP 2019), 2019.

  • Santipong Thaiprayoon, Alisa Kongthon, and Choochart Haruechaiyasak, "ThaiQCor 2.0: Thai Query Correction via Soundex and Word Approximation," In Proceedings of the 5th International Conference on Advanced Informatics: Concept Theory and Applications (ICAICTA 2018), 2018. [pdf][slides]

  • Alisa Kongthon, Choochart Haruechaiyasak, and Santipong Thaiprayoon, "Automatically Constructing Areas of Expertise Based on R&D Publication Data," In Proceedings of Portland International Conference on Management of Engineering and Technology (PICMET 2017), 2017.

  • Santipong Thaiprayoon and Siriporn Ummeepien, "Web Plagiarism Monitoring System using Informative Text Selection Method," Information Technology Journal, vol. 12(2), pp. 1-9, 2016.

  • Santipong Thaiprayoon, Choochart Haruechaiyasak, and Alisa Kongthon, "PDF Extraction Based on Lexical Analysis for Thai Texts," International Journal of Applied Computer Technology and Information Systems, vol. 5(1), pp. 7-9, 2015. [pdf][slides]

  • Ananlada Chotimongkol, Santipong Thaiprayoon, and Sumonmas Thatphithakkul, "Flexible Proper Name Search Using a Sound Approximation Approach," In Proceedings of the 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (Oriental COCOSDA 2014), 2014. [poster]

  • Thanapol Wisuttikul, Choochart Haruechaiyasak, and Santipong Thaiprayoon, "Using Multi-Linguistic Techniques for Thailand Herb and Traditional Medicine Registration Systems," International Journal of Medical, Health, Biomedical, Bioengineering and Pharmaceutical Engineering, vol. 7(10), pp. 634-638, 2013.

  • Santipong Thaiprayoon, "Automatic Plagiarism Detection Based on Information Retrieval and Natural Language Processing," In Proceedings of Applied Computer Technology and Information Systems (ACTIS 2013), 2013.

  • Santipong Thaiprayoon, Alisa Kongthon, Pornpimon Palingoon, and Choochart Haruechaiyasak, "Search Result Clustering for Thai Twitter Based on Suffix Tree Clustering," In Proceedings of the 9th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON 2012), 2012. [pdf][slides]

  • Santipong Thaiprayoon and Choochart Haruechaiyasak, "Web Plagiarism Detection Based on Search Result Snippets," In Proceedings of the 25th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2010), 2010. [pdf]

  • Choochart Haruechaiyasak, Santipong Thaiprayoon, and Alisa Kongthon, "Expertise Mapping Based on a Bibliographic Keyword Annotation Model," In Proceedings of the 12th International Conference on Asia-Pacific Digital Libraries (ICADL 2010), 2010.

  • Santipong Thaiprayoon, Choochart Haruechaiyasak, and Alisa Kongthon, "Visualizing Expert Network via User-Interface Markup Language and ActionScript," In Proceedings of the 6th International Joint Conference on Computer Science and Software Engineering (JCSSE 2009), 2009. [pdf][slides]

  • Choochart Haruechaiyasak, Santipong Thaiprayoon, and Alisa Kongthon, "Expert Identification for Multidisciplinary R&D Project Collaboration," In Proceedings of Portland International Conference on Management of Engineering and Technology (PICMET 2009), 2009.

  • Choochart Haruechaiyasak, Santipong Thaiprayoon, and Alisa Kongthon, "Building a Thailand Researcher Network Based on a Bibliographic Database," In Proceedings of the 9th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2009), 2009.

  • Alisa Kongthon, Choochart Haruechaiyasak, and Santipong Thaiprayoon, "Constructing Term Thesaurus using Text Association Rule Mining," In Proceedings of the 5th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON 2008), 2008.

  • Alisa Kongthon, Choochart Haruechaiyasak, Marut Buranarach, Santipong Thaiprayoon, and Niran Angkawattanawit, "A Framework for Managing R&D for Thai Research Community Using Text Information Exploitation," In Proceedings of Portland International Conference on Management of Engineering and Technology (PICMET 2008), 2008.

  • Alisa Kongthon, Choochart Haruechaiyasak, and Santipong Thaiprayoon, "Enhancing the Literature Review Using Author-Topic Profiling," In Proceedings of the 11th International Conference on Asia-Pacific Digital Libraries (ICADL 2008), 2008.