Data Mining Case

The simplest way to understand data mining is to picture an organizational infrastructure in which data from different sources are collected. At a second stage, these data are analyzed to determine which data sets are most promising for the organization's progress. Technically, data mining is a computerized process that converts raw data into an understandable form so that they can be put to further use. Other elements of data mining include database organization and data management, data pre-processing, building data models from the inferences drawn during the second-stage analysis of the raw data, finding workable algorithms, taking complexity considerations into account, post-processing the discovered structures, visualization, and online updating.
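
As a rough illustration of how raw data might be turned into an understandable form, the minimal Python sketch below walks a handful of records through the pre-processing, modeling, and post-processing elements listed above. The records, the field names (`source`, `defect_rate`), and the helper functions are all hypothetical and chosen only for illustration; a real data mining system would use far richer data and models.

```python
from statistics import mean

# Hypothetical raw records pooled from two departments (illustrative only).
raw_records = [
    {"source": "production", "defect_rate": 0.04},
    {"source": "production", "defect_rate": None},  # missing measurement
    {"source": "quality_control", "defect_rate": 0.02},
    {"source": "production", "defect_rate": 0.06},
    {"source": "quality_control", "defect_rate": 0.03},
]


def preprocess(records):
    """Pre-processing: drop records whose measurement is missing."""
    return [r for r in records if r["defect_rate"] is not None]


def summarize(records):
    """A deliberately simple 'model': the mean defect rate per source."""
    grouped = {}
    for r in records:
        grouped.setdefault(r["source"], []).append(r["defect_rate"])
    return {source: mean(values) for source, values in grouped.items()}


def rank(summary):
    """Post-processing: order sources from highest to lowest mean rate."""
    return sorted(summary, key=summary.get, reverse=True)


clean = preprocess(raw_records)
summary = summarize(clean)
ranking = rank(summary)

print(summary)  # mean defect rate per source
print(ranking)  # sources ordered by mean defect rate
```

Each stage is kept as a separate function so that any one of them could be replaced, for example by a more careful cleaning rule or a different model, without touching the others.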

The basic function of data mining is to compare older data with newly collected data. The result serves as knowledge, or as an infrastructural add-on, that helps identify the most appropriate solution to an existing problem or the measures that could counter a future hurdle in the organization. The data scrutinized during data mining come from an array of sources tied to different sectors of the organization, such as production, quality control, and marketing. These data are stored in data warehouses, in both physical and digitized form. Data mining can follow many different processes, and the choice among them depends on the organization's management. KDD, or ‘Knowledge Discovery in Databases’, describes five key stages of data processing: selection, pre-processing, transformation, data mining, and evaluation. The KDD process can involve significant iteration and can contain loops between any two steps (Fayyad and Piatetsky-Shapiro et al., 1997). On the other hand, CRISP-DM or Cross Industry Standard Process for Data Mining suggests that data mining involves six major


