					   International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
       Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com
Volume 1, Issue 2, July – August 2012                                          ISSN 2278-6856




     Improved Data mining approach to find
    Frequent Itemset Using Support count table
                          Ramratan Ahirwal1, Neelesh Kumar Kori2 and Dr.Y.K. Jain3
                                            1
                                             Samrat Ashok Technological Institute
                                                Vidisha (M. P.) 464001 India
                                                 2
                                                 Samrat Ashok Technological
                                                 Vidisha (M. P.) 464001 India
                                            3
                                             Samrat Ashok Technological Institute
                                                Vidisha (M. P.) 464001 India

Abstract: Mining frequent itemsets has been widely studied over the last decade. Past research focused on mining frequent itemsets from static databases, but in many new applications, mining time series and data streams is now an important task. Over the last decade there have been mainly two kinds of algorithms for frequent pattern mining: Apriori, based on generate-and-test, and FP-growth, based on divide-and-conquer, which has been widely used in static data mining. With the new requirements of data mining, however, frequent pattern mining is no longer restricted to that scenario. In this paper we focus on a new mining algorithm that finds frequent patterns in a single scan of the database, with no candidate generation required. To achieve this goal, our algorithm employs one table that retains the support count information of each itemset. The table is virtual for a static database, meaning it is generated whenever it is required to produce frequent itemsets, and it may also be useful for time series databases. Our algorithm is therefore suitable for static as well as dynamic data mining. Results show that the algorithm is useful in today's data mining environment.

Keywords: Apriori, Association Rule, Frequent Pattern, Data Mining

1. INTRODUCTION
Mining data streams is a very important research topic and has recently attracted a lot of attention, because in many cases data is generated by external sources so rapidly that it may become impossible to store and analyze it offline. Moreover, in some cases streams of data must be analyzed in real time to provide information about trends, outlier values, or regularities that must be signaled as soon as possible. The need for online computation is a notable challenge with respect to classical data mining algorithms [1], [2]. Important application fields for stream mining are as diverse as financial applications, network monitoring, security problems, telecommunication networks, Web applications, sensor networks, analysis of atmospheric data, etc. Innovations in computer science have made it possible to acquire and store enormous amounts of data digitally in databases, currently gigabytes or terabytes in a single database and even more in the future. Many fields and systems of human activity have become increasingly dependent on collected, stored, and processed information. However, the abundance of the collected data makes it laborious to find the essential information in it for a specific purpose. Data mining is the analysis of (often large) observational datasets, drawn from a database, data warehouse, or other large repository of incomplete, noisy, and ambiguous data, to find unsuspected relationships and to summarize the data in ways that are both understandable and useful to the data owner. It comprises data extraction, cleaning, transformation, analysis, and other modeling steps, and it automatically discovers the patterns and interesting knowledge hidden in large amounts of data, which helps us make decisions based on a wealth of data. In software development, information communication depends on collecting, analyzing, and mining out the useful information hidden in the various data exchanged between developers and staff and in interactions with managers, and then using that knowledge to make decisions.
Oustead College currently uses database technology to manage its library. Its main purpose is to facilitate the procurement of books, cataloging, and circulation management. In order to better satisfy the needs of readers, we must explore those needs and proactively provide the information they require. Most current library evaluation techniques focus on frequencies and aggregate measures; such statistics hide the underlying patterns. Discovering these patterns is the key to how library services are used [3], and data mining has been applied to library operations [4]. With the fast development of the technology and the growing requirements of users, the dynamic elements in data mining are becoming more important, including dynamic databases and knowledge bases, users' interests, and data varying with time and space. In order to solve problems such as low effectiveness, high randomness, and hard implementation in dynamic mining, more research on dynamic data mining has been done. In [5][6], an evolutionary immune mechanism was proposed, based on the fact that the elements involved in the domains could be modeled as the ones in immune models. It focused on how to utilize the relationship between antigens and antibodies in dynamic data mining tasks such as


incremental mining. However, the sole immune mechanism and its associated algorithm run effectively only in incremental situations; their performance and function have to be improved when used in more complex and dynamic environments such as the Web.
We provide here an overview of executing data mining services and association rules. The rest of this paper is arranged as follows: Section 2 introduces data mining and KDD; Section 3 gives a literature review; Section 4 describes the proposed work; Section 5 presents a result analysis of the algorithm and the proposed work; Section 6 gives the conclusion and outlook.

2. DATA MINING AND KDD
Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information, information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Several algorithms have been devised for this [5]. The process is shown in Figure 1.
Although data mining is a relatively new term, the technology is not. Companies have used powerful computers to sift through volumes of supermarket scanner data and analyze market research reports for years. However, continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy of analysis while driving down the cost.

Figure 1: Data Mining Algorithm

At an abstract level, the KDD field is concerned with the development of methods and techniques for making sense of data. The basic problem addressed by the KDD process is one of mapping low-level data (which are typically too voluminous to understand and digest easily) into other forms that might be more compact (for example, a short report), more abstract (for example, a descriptive approximation or model of the process that generated the data), or more useful (for example, a predictive model for estimating the value of future cases). At the core of the process is the application of specific data-mining methods for pattern discovery and extraction.
The traditional method of turning data into knowledge relies on manual analysis and interpretation. For example, in the health-care industry, it is common for specialists to periodically analyze current trends and changes in health-care data, say, on a quarterly basis. The specialists then provide a report detailing the analysis to the sponsoring health-care organization; this report becomes the basis for future decision making and planning for health-care management. In a totally different type of application, planetary geologists sift through remotely sensed images of planets and asteroids, carefully locating and cataloging such geologic objects of interest as impact craters. Be it science, marketing, finance, health care, retail, or any other field, the classical approach to data analysis relies fundamentally on one or more analysts becoming intimately familiar with the data and serving as an interface between the data and the users and products. For these (and many other) applications, this form of manual probing of a data set is slow, expensive, and highly subjective. In fact, as data volumes grow dramatically, this type of manual data analysis is completely impractical in many domains.
Databases are increasing in size in two ways: (1) the number N of records or objects in the database, and (2) the number d of fields or attributes per object. Databases containing on the order of N = 10^9 objects are becoming increasingly common, for example, in the astronomical sciences. Similarly, the number of fields d can easily be on the order of 10^2 or even 10^3, for example, in medical diagnostic applications. Who could be expected to digest millions of records, each having tens or hundreds of fields? We believe that this job is certainly not one for humans; hence, analysis work needs to be automated, at least partially. The need to scale up human analysis capabilities to handle the large number of bytes that we can collect is both economic and scientific. Businesses use data to gain competitive advantage, increase efficiency, and provide more valuable services. Data we capture about our environment are the basic evidence we use to build theories and models of the universe we live in. Because computers have enabled humans to gather more data than we can digest, it is only natural to turn to computational techniques to help us


unearth meaningful patterns and structure from the massive volumes of data. Hence, KDD is an attempt to address a problem that the digital information era has made a fact of life for all of us: data overload.

3. LITERATURE REVIEW
In 2011, Jinwei Wang et al. [12] proposed, to overcome the shortcomings and deficiencies of existing interpolation techniques for missing data, an interpolation technique for missing context data based on the Time-Space Relationship and Association Rule Mining (TSRARM). It performs spatial and time series analysis on sensor data and generates strong association rules to interpolate the missing data. A simulation experiment on acquired temperature sensor data verifies the rationality and efficiency of TSRARM.
In 2011, M. Chaudhary et al. [13] proposed a new and more optimized algorithm for online rule generation. The advantage of this algorithm is that the graph it generates has fewer edges than the lattice used in the existing algorithm. The proposed algorithm generates all the essential rules, and no rule is missing. The use of non-redundant association rules helps significantly in reducing irrelevant noise in the data mining process. This graph-theoretic approach, called the adjacency lattice, is crucial for online mining of data. The adjacency lattice can be stored either in main memory or in secondary memory. The idea of the adjacency lattice is to pre-store a number of large itemsets in a special format, which reduces the disk I/O required to answer queries.
In 2011, Fu et al. [14] analyzed how real-time monitoring data mining has become a necessary means of improving the operational efficiency, economic safety, and fault detection of power plants. Based on a data mining algorithm for interactive association rules, and taking full advantage of the association characteristics of real-time test-spot data during steam turbine operation, they put forward a principle for mining quantificational association rules among the parameters of real-time steam turbine monitoring data. By analyzing the practical run results of a certain steam turbine with this interactive-rule data mining method, they show that it can supervise steam turbine operation and condition monitoring, and provide model reference and decision-making support for fault diagnosis and condition-based maintenance.
In 2011, Xin et al. [15] analyzed the use of association rule learning to process statistical data on the private economy and to analyze the results in order to improve the quality of those statistics. The article also provides some exploratory comments and suggestions about the application of association rule mining in private economy statistics.

4. PROPOSED WORK AND ALGORITHM
Frequent itemset mining was introduced in [2] by Agrawal and Srikant. To facilitate our discussion, we give the formal definitions as follows.
Let I = (i1, i2, i3, ..., im) be a set of items. An itemset X is a subset of I. X is called a k-itemset if |X| = k, where k is the size (or length) of the itemset. A transaction T is a pair (tid, X), where tid is a unique identifier of a transaction and X is an itemset. A transaction (tid, X) is said to contain an itemset Y iff Y ⊆ X. A dataset D is a set of transactions.
Given a dataset D, the support of an itemset X, denoted Supp(X), is the fraction of transactions in D that contain X. An itemset X is frequent if Supp(X) is no less than a given threshold S0. An important property of frequent itemsets, called the Apriori property, is that every nonempty subset of a frequent itemset must also be frequent.
The problem of finding frequent itemsets can be specified as: given a dataset D and a support threshold S0, find every itemset whose support in D is no less than S0. It is clear that the Apriori algorithm needs at most l + 1 scans of the database D if the maximum size of a frequent itemset is l. In the context of data streams, to avoid disk access, previous studies focus on finding approximations of the frequent itemsets within a bound on space complexity.
When mining frequent itemsets in static databases, all the frequent itemsets and their support counts derived from the original database are retained. When transactions are added or expired, the support counts of the frequent itemsets contained in them are recomputed. By reusing the retained frequent itemsets and their counts, the number of candidate itemsets generated during the mining process can be reduced. Rescanning the original database is still required, however, because non-frequent itemsets can become frequent after the database is updated. Therefore, such methods cannot work without seeing the entire database and cannot be applied to data streams.
In our approach we introduce a new method that requires only a single scan of the database D to count the support of each itemset; no candidate generation or pruning is required to find the frequent itemsets. Our algorithm thus reduces disk access time and finds the frequent itemsets directly by using the support count table. The method is applicable to static databases, and to dynamic databases as well if the table is created at the initial stage.

  4.1 Support Count Table:
As stated previously, every itemset X of a transaction T is a subset of I (X ⊆ I), and a set of such transactions is the database D. So in database D, every transaction itemset X will be an element of 2^I − {∅}, where 2^I is the power set of I. The power set of I contains all the subsets of I that may appear as transaction itemsets in the transaction database D, except ∅. Hence our algorithm employs one table, called the support count table. That table



is assumed to be virtual and is created when required for finding frequent itemsets. The length of the table is (2^|I| − 1) × 2; its two fields (attributes) are itemset and support count. In this table we record the frequency count of each itemset observed in the transaction database. The frequency count of an itemset is the number of occurrences of exactly that itemset as a transaction in database D. The table is generated and may be kept in cache memory until the frequent itemsets have been found. The generated table may be used for a stationary database as well as for a time series database. The table is laid out as given below.

  4.2 Entries in the Support count table:
The support count table may be used to find frequent itemsets from static datasets as well as from streaming datasets, where we use a windowing concept. For a static database, the table may be created when we want to analyze the database: with a single scan of the database, we make entries in the table for every transaction. Initially all support count entries in the table are set to zero. If the database D is static and fixed, we update the table by a single scan of D, making an entry for each itemset: for each transaction itemset X in D, find the corresponding itemset in the table and increment its count. In this way we make an entry for each transaction T. The table may then be retained in memory until the observation is complete, so that added or expired transactions only require updating the table.
If we consider the database D to be a random or streaming database, the table is even more useful, because every incoming or expired transaction only requires updating the table by incrementing or decrementing the count of the corresponding itemset, and the table can be stored efficiently so that we can use it to find the frequent itemsets or association rules. In this approach we are not required to save the database on disk; it is only necessary to save the table and use it whenever we need to find the frequent itemsets.

               Table 1: Support count table ST.

        No.        Itemset (A)     Support count (Scount)
        1
        .
        .
        2^|I|-1

For example, let I = (i1,i2,i3,i4) be the set of items. The different itemsets that may be generated from I are {i1},{i2},{i3}, ..., {i1,i2,i3,i4}; every transaction itemset X that may occur in database D will be one of these subsets of I. The table created initially is then as given below.

   Table 2: Initial support count table for I = (i1,i2,i3,i4).

        No.    Itemset (A)      Support count (Scount)
        1      {i1}             0
        2      {i2}             0
        3      {i3}             0
        4      {i4}             0
        5      {i1,i2}          0
        6      {i1,i3}          0
        7      {i1,i4}          0
        8      {i2,i3}          0
        9      {i2,i4}          0
        10     {i3,i4}          0
        11     {i1,i2,i3}       0
        12     {i1,i2,i4}       0
        13     {i1,i3,i4}       0
        14     {i2,i3,i4}       0
        15     {i1,i2,i3,i4}    0

  4.3 Proposed Method to find frequent itemsets
In our proposed work we give a method that may be useful for static as well as streaming databases to find frequent itemsets. We employ the support count table, which requires scanning the database only once to make the entries in the table for each transaction; the table retains this information until the observation is complete or the frequent itemsets have been found. When transactions are added to the dataset or expire from it, the table is updated simultaneously, so the updated support count table always holds the frequency count of each itemset. To find the frequent itemsets for any threshold value we scan the table, not the database. Whereas Apriori requires l + 1 scans of the dataset and generates candidates to find the frequent set, our approach uses only a single scan of the database and no candidate generation.
The table holds the frequency count of every itemset, not its total support count. The frequency count of an itemset is the number of occurrences of exactly that itemset in the transaction database D; to find the frequent itemsets we need the total support count, which is the number of transactions in D that contain all the items of the itemset. In our scheme this total count is calculated by scanning the table; the total support count so found is then compared with the threshold S0, and if the count is no less than the threshold, the itemset is included in the frequent set. This procedure is repeated for every itemset to find all the frequent ones.

  Algorithm: To find frequent itemsets

  Input: A database D and the support threshold S0.

  Output: Frequent itemsets Fitemset.

  Method




  Step:1 Scan the transaction database D and update the         Step2: To find frequent itemset we make use of support
  Support count table ST. As given in sec.4.2,                  count table given below as follows:

  Fitemset={  }                                                      Table 3: Frequency count for above example

  Step:2 for ( i=1; i<2I ; i++)                                            No     Itemset (A)      Supportcount(Scount)
                                                                           .
  //for each itemset A in ST repeat the steps.
                                                                           1      {10}             2
  //2I gives total element in power set of I
                                                                           2      {20}             0
                                                                           3      {30}             0
  TCount =0;            //Total count                                      4      {40}             1
                                                                           5      {10,20}          1
  Step3: for (j=1; j< 2I ; j++) // Repeat step3 to find total              6      {10,30}          2
  count                                                                    7      {10,40}          0
                                                                           8      {20,30}          2
               Step:3.1 If Ai ⊆ Aj
                                                                           9      {20,40}          0
                        TCount = TCount +Scount(j)                         10     {30,40}          2
                                                                           11     {10,20,30}       2
  Step:4 If (Tcount ≥ S0)                                                  12     {10,20,40}       0
                                                                           13     {10,30,40}       0
               Then Fitemset = Fitemset U Ai
                                                                           14     {20.30,40}       2
  Step:5 Go to step 2                                                      15     {10,20,30,40     1
                                                                                  }
  Step:6 End

To better explain our algorithm, we now consider one example. Let I = {10, 20, 30, 40} be the set of four items, and let the value assumed for the threshold be 2. The total number of transactions in D is 15. The table of transactions of D is given below:

            tid    transactions
            1      {10}
            2      {10,20}
            3      {30,40}
            4      {10,20,30,40}
            5      {10,30}
            6      {10,30}
            7      {30,40}
            8      {20,30,40}
            9      {20,30,40}
            10     {10,20,30}
            11     {20,30}
            12     {40}
            13     {20,30}
            14     {10,20,30}
            15     {10}

Step 1: By scanning the database, the table of support counts is built as given in Table 3.

To check whether itemset {10} is frequent or not, we obtain its total support count by scanning the support count table for {10}; from the table, the total support of {10} is 8. This total support count is compared with the threshold value 2; since the threshold, 2, is less than the total count, the itemset {10} is a frequent itemset and is included in Fitemset. This process is repeated for every itemset. In this way we obtain every frequent itemset using the support count table.

The frequent itemsets for the given dataset are:

Fitemset = {{10}, {20}, {30}, {40}, {10,20}, {10,30}, {20,30}, {20,40}, {30,40}, {10,20,30}, {20,30,40}}

5. RESULT ANALYSIS
To study the performance of our proposed algorithm, we have carried out several experiments. The experimental environment is an Intel Core processor running Windows XP, and the algorithm is implemented in Java with NetBeans 7.1. The parameters used are as follows: D denotes the transaction database, I the number of items in transactions, and S0 the MINsupport. Table 4 shows the execution time in seconds when I = 5, the transaction database D scales up from 50 to 1000, and MINsupport S scales up from 2 to 8. We see from the table
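The frequency check described above can be sketched in code as follows. This is an illustrative reconstruction, not the paper's implementation: the class and method names are ours, and for brevity it enumerates every non-empty subset of I directly and compares each total support count against the threshold, which reproduces the 11-itemset Fitemset of the example.

```java
import java.util.*;

public class SupportCountDemo {
    static final int[] ITEMS = {10, 20, 30, 40};
    static final int THRESHOLD = 2;   // MINsupport from the example

    // The 15 transactions of the example database D.
    static List<Set<Integer>> transactions() {
        int[][] rows = {
            {10}, {10,20}, {30,40}, {10,20,30,40}, {10,30},
            {10,30}, {30,40}, {20,30,40}, {20,30,40}, {10,20,30},
            {20,30}, {40}, {20,30}, {10,20,30}, {10}
        };
        List<Set<Integer>> db = new ArrayList<>();
        for (int[] row : rows) {
            Set<Integer> tx = new HashSet<>();
            for (int v : row) tx.add(v);
            db.add(tx);
        }
        return db;
    }

    // Total support count of an itemset: the number of transactions containing it.
    static int support(Set<Integer> itemset, List<Set<Integer>> db) {
        int count = 0;
        for (Set<Integer> tx : db) if (tx.containsAll(itemset)) count++;
        return count;
    }

    // Every non-empty subset of I whose total support reaches the threshold.
    static List<Set<Integer>> frequentItemsets(List<Set<Integer>> db) {
        List<Set<Integer>> frequent = new ArrayList<>();
        for (int mask = 1; mask < (1 << ITEMS.length); mask++) {
            Set<Integer> itemset = new TreeSet<>();
            for (int i = 0; i < ITEMS.length; i++)
                if ((mask & (1 << i)) != 0) itemset.add(ITEMS[i]);
            if (support(itemset, db) >= THRESHOLD) frequent.add(itemset);
        }
        return frequent;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = transactions();
        System.out.println("support({10}) = " + support(Set.of(10), db)); // prints 8
        System.out.println("Fitemset = " + frequentItemsets(db));         // 11 itemsets
    }
}
```

Running this confirms the walkthrough: the support of {10} is 8, and exactly the 11 itemsets listed above pass the threshold.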

that when MINsupport is scaled up along a row, the execution time decreases roughly linearly, whereas when the database D is scaled up, the time increases, though not in a strictly linear way.

Table 4: Execution time (s) when D scales up from 50 to 1000 and S scales up from 2 to 8.

  No. of             Different Minimum Support (S)
  transactions     2       3      4      5     6     8
  50               2       1.7    1.4    1.1   0.9   0.6
  100              4       3.4    2.8    2.2   1.8   1.2
  200              8       6.8    4.2    3.2   2.6   2
  250              9.5     8.5    6.7    4.6   4     3
  300              12      10.2   8.4    6.8   5.2   3.5
  400              14      12     10     8     6     5
  500              16.5    14     12.5   8.5   7     6
  1000             30      25     20     18    14    10

Figure 4: Comparison of execution time (s) for MINsupport (S0=2) with the algorithm given in reference [16].

Figure 4 shows the comparison of our proposed algorithm's execution time, with S0=2 and database D scaled up from 50 to 175, against the method proposed in reference [16]. The comparison shows that our approach performs somewhat better.
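Measurements like those in Table 4 could be reproduced with a small timing harness along the following lines. This is only a sketch: the names are ours, the database is synthetic, and a naive subset-enumeration miner stands in for the paper's support-count-table algorithm, so absolute times will differ from Table 4.

```java
import java.util.*;

public class ScaleUpTimer {
    // Stand-in miner: counts the itemsets over `items` with support >= minSupport.
    static long mine(List<Set<Integer>> db, int minSupport, int[] items) {
        long frequent = 0;
        for (int mask = 1; mask < (1 << items.length); mask++) {
            Set<Integer> itemset = new HashSet<>();
            for (int i = 0; i < items.length; i++)
                if ((mask & (1 << i)) != 0) itemset.add(items[i]);
            int count = 0;
            for (Set<Integer> tx : db) if (tx.containsAll(itemset)) count++;
            if (count >= minSupport) frequent++;
        }
        return frequent;
    }

    public static void main(String[] args) {
        int[] items = {10, 20, 30, 40, 50};            // I = 5, as in the experiments
        Random rnd = new Random(42);                   // fixed seed: reproducible runs
        for (int n : new int[]{50, 100, 200}) {        // scale up the database D
            List<Set<Integer>> db = new ArrayList<>();
            for (int t = 0; t < n; t++) {
                Set<Integer> tx = new HashSet<>();
                for (int item : items) if (rnd.nextBoolean()) tx.add(item);
                if (tx.isEmpty()) tx.add(items[0]);    // avoid empty transactions
                db.add(tx);
            }
            for (int s : new int[]{2, 4, 8}) {         // scale up MINsupport S
                long t0 = System.nanoTime();
                long found = mine(db, s, items);
                long ms = (System.nanoTime() - t0) / 1_000_000;
                System.out.printf("D=%d S=%d frequent=%d time=%dms%n", n, s, found, ms);
            }
        }
    }
}
```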
Figure 2: Execution time (s), MINsupport (S0=2).

Figure 2 shows that the algorithm's execution time {for MINsupport (S0=2), I=5} increases almost linearly with the size of the dataset. It can be concluded that our algorithm scales well. To examine the scalability of our algorithm further, we increased the dataset D from 1000 to 6000 with the same parameters, MINsupport (S0=2) and I=5; the result is given in Figure 5.

Figure 3: Execution time (s), Transaction database (D=200).

Figure 5: Scale-up: Number of transactions. (Chart: execution time in sec., 0 to 120, vs. number of transactions, 1000 to 6000.)

6. CONCLUSION AND OUTLOOK
Data mining is the exploration of knowledge from the large sets of data generated as a result of various data processing activities. Frequent pattern mining is a very important task in data mining. Previous approaches to generating frequent itemsets generally adopt candidate generation and pruning techniques to achieve the desired objective. In this paper we have presented an algorithm that supports data mining and knowledge discovery without candidate generation; our approach reduces disk access time and finds the frequent itemsets directly by using a support count table. The proposed method works well with static datasets, and it is also suited to mining data streams, which require fast, real-time processing in order to keep up with the high data arrival rate and where mining results are expected to be available within a short response time. We also validate the algorithm on static datasets through the graph results presented above.
In this paper we improved performance by eliminating candidate generation. The experiments indicate that the algorithm is faster and more efficient than previously presented itemset mining algorithms.

REFERENCES
[1] M. M. Gaber, A. Zaslavsky, and S. Krishnaswamy, "Mining data streams: A review," ACM SIGMOD Record, vol. 34, no. 1, 2005.
[2] C. C. Aggarwal, Data Streams: Models and Algorithms. Springer, 2007.
[3] S. Nicholson, "The Bibliomining Process: Data Warehousing and Data Mining for Library Decision-Making," Information Technology and Libraries, 22(4):146-151, 2003.
[4] Jiann-Cherng Shieh and Yung-Shun Lin, "Bibliomining User Behaviors in the Library," Journal of Educational Media & Library Sciences, 44(1):36-60, 2006.
[5] Yiqing Qin, Bingru Yang, Guangmei Xu, et al., "Research on Evolutionary Immune Mechanism in KDD," in Proceedings of Intelligent Systems and Knowledge Engineering 2007 (ISKE 2007), Cheng Du, China, October 2007, pp. 94-99.
[6] B. R. Yang, Knowledge Discovery Based on Inner Mechanism: Construction, Realization and Application. USA: Elliott & Fitzpatrick Inc., 2004.
[7] Binesh Nair and Amiya Kumar Tripathy, "Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions," Journal of Emerging Trends in Computing and Information Sciences, vol. 2, no. 7, July 2011, pp. 317-324.
[8] E. Ramaraj and N. Venkatesan, "Bit Stream Mask-Search Algorithm in Frequent Itemset Mining," European Journal of Scientific Research, ISSN 1450-216X, vol. 27, no. 2, 2009, pp. 286-297.
[9] Shilpa and Sunita Parashar, "Performance Analysis of Apriori Algorithm with Progressive Approach for Mining Data," International Journal of Computer Applications (0975-8887), vol. 31, no. 1, October 2011, pp. 13-18.
[10] G. Cormode and M. Hadjieleftheriou, "Finding frequent items in data streams," in Proceedings of the 34th International Conference on Very Large Data Bases (VLDB), Auckland, New Zealand, 2008, pp. 1530-1541.
[11] D. Y. Chiu, Y. H. Wu, and A. L. Chen, "Efficient frequent sequence mining by a dynamic strategy switching algorithm," The VLDB Journal, 18(1):303-327, 2009.
[12] Jinwei Wang and Haitao Li, "An Interpolation Approach for Missing Context Data Based on the Time-Space Relationship and Association Rule Mining," Multimedia Information Networking and Security (MINES), IEEE, 2011.
[13] M. Chaudhary, A. Rana, and G. Dubey, "Online mining of data to generate association rules in large databases," in Recent Trends in Information Systems (ReTIS), 2011 International Conference, IEEE, Dec. 2011.
[14] Fu Jun, Yuan Wen-hua, Tang Wei-xin, and Peng Yu, "Study on Monitoring Data Mining of Steam Turbine Based on Interactive Association Rules," Computer Distributed Control and Intelligent Environmental Monitoring (CDCIEM), IEEE, 2011.
[15] Xin Jinguo and Wei Tingting, "The application of association rules mining in data processing of private economy statistics," E-Business and E-Government (ICEE), IEEE, 2011.
[16] Weimin Ouyang and Qinhua Huang, "Discovery Algorithm for Mining both Direct and Indirect Weighted Association Rules," International Conference on Artificial Intelligence and Computational Intelligence, IEEE, 2009, pp. 322-325.

AUTHORS
Mr. Ram Ratan Ahirwal received his B.E. (First) degree in Computer Science & Engineering from GEC Bhopal, University RGPV Bhopal, in 2002. In August 2003 he joined Samrat Ashok Technological Institute, Vidisha (M.P.), as a lecturer in the Computer Science & Engineering Dept., and completed his M.Tech degree (with honours) as a sponsored candidate in CSE from SATI (Engg. College), Vidisha, University RGPV Bhopal (M.P.), India, in 2009. Currently he is working as an assistant professor in the CSE Dept., SATI Vidisha. He has more than 12 publications in various refereed international journals and international conferences to his credit. His areas of interest are data mining, image processing, computer networks, network security, and natural language processing.

Neelesh Kumar Kori received his B.E. (First) degree in Information Technology from UIT, BU Bhopal (M.P.), India, in 2008, and is currently pursuing an M.Tech in Computer Science & Engineering from SATI Vidisha (M.P.), India.

Dr. Y. K. Jain is Head of the CSE Dept., SATI (Degree) Engg. College, Vidisha (M.P.), India. He has 30-40 publications in various refereed international journals and international conferences to his credit. His areas of interest are image processing and computer networks.