A Study on Text Classification for Webmining Based Spatio Temporal Analysis of the Spread of Tropical Diseases

December 3, 2010

Proc. of International Conference on Advance Computer Science & Information System (ICACSIS) 2010, pp.311-314, Bali, Indonesia, 2010

Fatimah Wulandini, Iqbal Yasin, Anto Satriyo Nugroho, Bowo Prasetyo, Mohammad Teduh Uliniansyah, Vitria Pragesjvara, Made Gunawan, Gunarso, Ratih Irbandini, and Dwi Handoko

“The rapid spread of tropical diseases in Indonesia has led to countless victims. Experts have tried to address the problem by monitoring the spread and collecting useful information about these diseases. Web mining is one technique for collecting data from the Internet. Spatio-temporal data on tropical diseases can be collected through web mining so that useful information can be extracted for further analysis. The main objective of this study is to create a text classification system that classifies web documents using several learning methods, including naïve Bayes, nearest neighbor, decision tree, and support vector machine (SVM) with the Sequential Minimal Optimization (SMO) algorithm. The classification is intended to support a spatio-temporal analysis of documents classified as health-related. The results show that naïve Bayes and SVM-SMO achieve good performance (naïve Bayes: 95%; SVM-SMO: 92%). The multinomial distribution of naïve Bayes is able to normalize document length, while SVM-SMO performs well on high-dimensional data.”
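
For readers unfamiliar with the multinomial naïve Bayes classifier named in the abstract, here is a minimal pure-Python sketch of the general technique. It is not the authors' implementation: the whitespace tokenizer, the Laplace smoothing value, the class name `MultinomialNaiveBayes`, and the six toy documents with their "health" and "other" labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class MultinomialNaiveBayes:
    """Illustrative multinomial naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.class_doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(n_docs / self.total_docs)
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy corpus, not data from the study.
train_docs = [
    "dengue fever cases rise in jakarta hospital",
    "malaria outbreak reported by health officials",
    "hospital treats patients with dengue fever",
    "football team wins the national cup final",
    "stock market closes higher after election",
    "new smartphone released by technology company",
]
train_labels = ["health", "health", "health", "other", "other", "other"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict("dengue outbreak in hospital"))
print(model.predict("national football final"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from forcing that class's probability to zero.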
