Scalable Fuzzy Clustering Algorithms

Lawrence O. Hall

Department of Computer Science and Engineering, ENB118
University of South Florida
4202 E. Fowler Ave.
Tampa, FL 33620-9951

Abstract

Clustering is the most common way to group unlabeled data. Very large unlabeled data sets are now available, and many are too large to fit in the memory of a typical computer. Some are so large that they can only be treated as data streams, because not all of the data can be stored cost-effectively. Fuzzy clustering algorithms are known to be very useful on small to medium-size data sets. This talk focuses on how to make some well-understood classic fuzzy clustering algorithms scale to very large data sets and streaming data sets. The goal is to create a data partition that reflects the whole data set while requiring only practical computation times. In particular, we show that the fuzzy c-means family of algorithms can be scaled to provide data partitions that are very close, and potentially identical, to those obtained by clustering all the data at once.

The general idea is to cluster subsets of the data and create weighted examples from the subsets. The weighted examples from one or more previous partitions are used with new data to create a new partition, which reflects both the examples currently loaded in memory and those partitioned previously. This process can be repeated until all the data has been clustered. Several variations on the theme of summarizing previous partitions with a set of weighted examples are given. Some history can be ignored, for example in time-changing data streams. One could also choose to cluster summarizations.

Experimental data sets include several that contain tens of millions of examples, as well as streaming data sets. Results on real-world data sets show that excellent partitions are obtained. For data sets of tractable size, the partitions are shown to be comparable to those from fuzzy c-means when it clusters all the data.

Biography

Lawrence O. Hall is a Professor of Computer Science and Engineering at the University of South Florida. He received his Ph.D. in Computer Science from Florida State University in 1986 and a B.S. in Applied Mathematics from the Florida Institute of Technology in 1980. He is a Fellow of the IEEE. His research interests lie in distributed machine learning, data mining, pattern recognition, and integrating AI into image processing. The exploitation of imprecision through the use of fuzzy logic in pattern recognition, AI, and learning is a research theme. He has authored over 190 publications in journals, conferences, and books. Recent publications appear in Artificial Intelligence in Medicine, Neural Computation, Pattern Recognition Letters, JAIR, the Journal of Machine Learning Research, IEEE Transactions on Systems, Man, and Cybernetics, the International Conference on Data Mining, the Multiple Classifier Systems Workshop, and the FUZZ-IEEE conference.

He co-edited the 2001 joint North American Fuzzy Information Processing Society (NAFIPS) and IFSA conference proceedings. He was the co-Program Chair of NAFIPS 2004. He received the IEEE SMC Society Outstanding Contribution Award in 2000 and an Outstanding Research Achievement Award from the University of South Florida in 2004. He is a past president of NAFIPS and the former vice president for membership of the SMC Society. He is the President-elect of the SMC Society for 2005. He is currently the Editor-in-Chief of the IEEE Transactions on Systems, Man and Cybernetics, Part B, and an associate editor for IEEE Transactions on Fuzzy Systems, the International Journal of Intelligent Data Analysis, and the International Journal of Approximate Reasoning.
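
Illustrative sketch

The general idea stated in the abstract (cluster a subset, summarize it as weighted examples, and cluster those together with the next subset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: the function names, the choice of carrying exactly c weighted centroids, the rule that a centroid's weight is the weighted sum of its memberships, and the naive initialization from the first c points are all assumptions made here for the example.

```python
def memberships(points, centers, m):
    """Fuzzy memberships u[k][i] of point k in cluster i (each row sums to 1)."""
    power = -1.0 / (m - 1.0)
    rows = []
    for p in points:
        # Squared Euclidean distances, floored to avoid division by zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(p, v)), 1e-12) for v in centers]
        inv = [d ** power for d in d2]
        total = sum(inv)
        rows.append([x / total for x in inv])
    return rows


def weighted_fcm(points, weights, centers, m=2.0, iters=200, tol=1e-8):
    """Weighted fuzzy c-means, started from the given centers."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            # Each example contributes in proportion to weight * membership^m.
            coef = [w * row[i] ** m for w, row in zip(weights, u)]
            total = sum(coef)
            new_centers.append(
                [sum(cf * p[k] for cf, p in zip(coef, points)) / total
                 for k in range(dim)]
            )
        shift = max(abs(a - b)
                    for v, nv in zip(centers, new_centers)
                    for a, b in zip(v, nv))
        centers = new_centers
        if shift < tol:
            break
    return centers


def single_pass_fcm(chunks, c, m=2.0):
    """Cluster chunks one at a time, carrying forward c weighted centroids."""
    centers, carried = None, []
    for chunk in chunks:
        points = [list(p) for p in chunk]
        weights = [1.0] * len(points)  # fresh examples have unit weight
        if centers is None:
            # Naive initialization from the first c points (an assumption).
            centers = [list(p) for p in points[:c]]
        else:
            # Previous partition enters as c weighted examples.
            points = centers + points
            weights = carried + weights
        centers = weighted_fcm(points, weights, centers, m)
        u = memberships(points, centers, m)
        # Assumed summarization rule: weight of centroid i is the
        # weighted sum of memberships in cluster i.
        carried = [sum(w * row[i] for w, row in zip(weights, u)) for i in range(c)]
    return centers, carried
```

Calling single_pass_fcm(chunks, c=2) on an iterable of chunks returns the final centroids and their accumulated weights. In this sketch only one chunk plus c weighted examples is held in memory at a time, and the carried weights always sum to the number of examples seen so far, since each example's memberships sum to one.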
