Title: Development of soft computing models for data mining
Authors: Sivanandam, S N
Shanmugam, A
Sumathi, S
Usha, K
Issue Date: Dec-2001
Publisher: NISCAIR-CSIR, India
Abstract: The increasing amount and complexity of data available in science, business, industry and many other areas create an urgent need to accelerate the discovery of knowledge in large databases. Such data can provide a rich resource for knowledge discovery and decision support. To understand, analyze and eventually use this data, a multidisciplinary approach called data mining has been proposed. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Pattern classification is one particular category of data mining, which enables the discovery of knowledge from very large databases (VLDB). In this paper, the database is mined through pattern classification using two important mining tools: the K-Nearest Neighbour algorithm and decision trees. K-Nearest Neighbour (K-NN) is a popular conventional statistical approach to data mining. K-NN classifies each record in a data set based on a combination of the classes of the K records most similar to it in a historical data set. A fuzzy version of K-NN, and crisp and fuzzy versions of nearest prototype classifiers, have also been proposed. The decision tree is one of the best machine learning approaches for data mining. A decision tree is a predictive model that, as its name implies, can be viewed as a tree. Briefly, decision trees are tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a data set. Classification and Regression Tree (CART) and ID3 are the two decision tree methods used in this paper. The classification rules have been extracted in the form of IF-THEN rules. A performance analysis of the K-NN methods and the tree-based classifiers has been carried out. The proposed methods have been tested on three applications: Landsat imagery, letter image recognition and optical recognition of handwritten digits. The simulation algorithms have been implemented in C++ on a UNIX platform.
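The K-NN rule described in the abstract can be illustrated with a minimal C++ sketch. This is not the authors' implementation: the names `Record`, `squaredDistance` and `classifyKnn` are hypothetical, Euclidean distance is assumed as the similarity measure, and a simple majority vote is assumed as the "combination of the classes" of the K nearest records. Class labels are assumed to be non-negative integers.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One labelled record of the historical (training) data set.
// Hypothetical type; labels are assumed to be non-negative integers.
struct Record {
    std::vector<double> features;
    int label;
};

// Squared Euclidean distance (an assumed similarity measure).
// Both vectors are assumed to have the same length.
double squaredDistance(const std::vector<double>& a,
                       const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Classify `query` by majority vote among its k nearest training records.
// Returns -1 if the training set is empty or k is zero.
// Vote ties are resolved in favour of the smallest label.
int classifyKnn(const std::vector<Record>& training,
                const std::vector<double>& query,
                std::size_t k) {
    if (training.empty() || k == 0) {
        return -1;
    }
    k = std::min(k, training.size());

    // Pair every training record's distance to the query with its label.
    std::vector<std::pair<double, int>> scored;
    scored.reserve(training.size());
    for (const Record& r : training) {
        scored.emplace_back(squaredDistance(r.features, query), r.label);
    }

    // Move the k smallest distances to the front, in sorted order.
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(k),
                      scored.end());

    // Count the class labels of the k nearest records.
    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) {
        ++votes[scored[i].second];
    }

    // Pick the label with the most votes.
    int bestLabel = -1;
    int bestVotes = 0;
    for (const auto& entry : votes) {
        if (entry.second > bestVotes) {
            bestLabel = entry.first;
            bestVotes = entry.second;
        }
    }
    return bestLabel;
}
```

A fuzzy K-NN variant, as mentioned in the abstract, would replace the hard vote count with class membership degrees; that variant is not sketched here.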
Page(s): 327-340
ISSN: 0975-1017 (Online); 0971-4588 (Print)
Appears in Collections: IJEMS Vol.08(6) [December 2001]

Files in This Item:
File: IJEMS 8(6) 327-340.pdf
Size: 2.98 MB
Format: Adobe PDF

Items in NOPR are protected by copyright, with all rights reserved, unless otherwise indicated.