(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 8, No. 5, 2010
http://sites.google.com/site/ijcsis/
ISSN 1947-5500

Density Based Clustering Algorithm using Sparse Memory Mapped File

J. Hencil Peter
Department of Computer Science
St. Xavier’s College, Palayamkottai, India.
hencilpeter@hotmail.com

A. Antonysamy
Department of Mathematics
St. Xavier’s College, Kathmandu, Nepal.
fr_antonysamy@hotmail.com

Abstract:

The DBSCAN [1] algorithm is popular in the data mining field because it can mine clusters of arbitrary shape while discarding noise. Because the original DBSCAN algorithm computes the distance between objects to find neighbours, it consumes considerable processing time and its computational complexity is O(N^2). In this paper we propose a new algorithm for mining density based clusters using a Sparse Memory Mapped File (Sparse MMF) [3]. All the given objects are first loaded into their corresponding locations in the Sparse Memory Mapped File. During the SparseMemoryRegionQuery operation, the cells surrounding each object are visited to find its neighbour objects, instead of computing the distance between that object and every other object in the data set. Using the Sparse MMF approach, we show that the DBSCAN algorithm can process a huge number of objects without runtime failures, and the performance analysis shows that the proposed solution is much faster than the existing algorithm.

Keywords: Sparse Memory Mapped File; Sparse MMF; Sparse Memory; Neighbour Cells; Sparse Memory DBSCAN.

I. INTRODUCTION

Data mining is a fast growing field in which clustering plays a very important role. Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects [2]. Among the many clustering algorithms that have been proposed, DBSCAN is one of the most popular because of the high quality of its noise-free output clusters.

Most density based clustering algorithms require O(N^2) computation time and a huge amount of main memory in real-world scenarios. Since the seed object list grows at run time, it is very difficult to predict the memory required to process all the objects in the data set. If the memory is insufficient to hold the growing seed objects, the DBSCAN algorithm will crash at run time. To remove this instability and improve the performance, a new solution is proposed in this paper.

The rest of the paper is organised as follows. Section 2 gives a brief history of related work in the same area. Section 3 introduces the original DBSCAN algorithm and Section 4 explains the proposed solution. After the explanation of the new algorithm, Section 5 shows the experimental results and Section 6 presents the conclusion and the future work associated with this algorithm.

II. RELATED WORK

DBSCAN (Density Based Spatial Clustering of Applications with Noise) [1] is the basic algorithm for mining clusters based on object density. In this algorithm, the number of objects present within the neighbour region (Eps) of an object is computed first. If the neighbour object count is below the given threshold value, the object is marked as NOISE. Otherwise a new cluster is formed from the core object by finding the group of density connected objects that is maximal w.r.t. density-reachability.

The OPTICS [4] algorithm adapts the original DBSCAN algorithm to deal with clusters of varying density. This algorithm computes an ordering of the objects based on the reachability distance, which represents the intrinsic hierarchical clustering structure. The valleys in the resulting plot indicate the clusters. However, the input parameter ξ is critical for identifying the valleys as ξ-clusters.

The DENCLUE [5] algorithm uses kernel density estimation. The density function yields local density maxima, and these local density values are used to form the clusters. If the local density value is very small, the objects of the cluster are discarded as NOISE.

A Fast DBSCAN (FDBSCAN) algorithm [6] was devised to improve the speed of the original DBSCAN algorithm. The improvement is achieved by considering only a few selected representative objects inside a core object’s neighbour region as seed objects for further expansion. Hence this algorithm is faster than the basic version of the DBSCAN algorithm, but it suffers from a loss of result accuracy.

The MEDBSCAN [7] algorithm was proposed recently to improve the performance of the DBSCAN algorithm without losing result accuracy. This algorithm uses three queues in total: the first queue stores the neighbours of the core object which belong
inside Eps distance, the second queue stores the neighbours of the core object which belong inside 2*Eps distance, and the third queue is the seeds queue, which stores the unhandled objects for further expansion. This algorithm guarantees a notable performance improvement if the result is not very sensitive to the Eps value.

Though the DBSCAN algorithm’s complexity can be reduced to O(N * logN) using spatial trees, constructing and organizing the tree is an extra effort, and the tree requires additional memory to hold the objects. The new algorithm achieves a different complexity, O(N * 2^Eps), and we show that this complexity is better than that of the previous versions of the DBSCAN algorithm when the Eps value is small.

III. INTRODUCTION TO DBSCAN ALGORITHM

The working principles of the DBSCAN algorithm are based on the following definitions:

Definition 1: Eps Neighbourhood of an object p

The Eps Neighbourhood of an object p is denoted N_Eps(p) and defined as
N_Eps(p) = {q ∈ D | dist(p, q) <= Eps}.

Definition 2: Core Object Condition

An object p is a core object if its neighbour object count is >= the given threshold value (MinObjs), i.e.
|N_Eps(p)| >= MinObjs
where MinObjs is the minimum number of neighbour objects needed to satisfy the core object condition. In other words, if the number of neighbours of p that lie within the Eps radius is >= MinObjs, p is a core object.

Definition 3: Directly Density Reachable Object

An object p is directly density reachable from another object q w.r.t. Eps and MinObjs if
p ∈ N_Eps(q) and
|N_Eps(q)| >= MinObjs (Core Object condition)

Definition 4: Density Reachable Object

An object p is density reachable from another object q w.r.t. Eps and MinObjs if there is a chain of objects p_1, …, p_n with p_1 = q and p_n = p such that p_(i+1) is directly density reachable from p_i.

Definition 5: Density connected object

An object p is density connected to another object q if there is an object o such that both p and q are density reachable from o w.r.t. Eps and MinObjs.

Definition 6: Cluster

A cluster C is a non-empty subset of a database D w.r.t. Eps and MinObjs which satisfies the following conditions.

For every p and q, if p ∈ C and q is density reachable from p w.r.t. Eps and MinObjs, then q ∈ C.
For every p, q ∈ C, p is density connected to q w.r.t. Eps and MinObjs.

Definition 7: Noise

An object which doesn’t belong to any cluster is called noise.

The DBSCAN algorithm finds the Eps Neighbourhood of each object in a database during the clustering process. Before the cluster expansion, if the algorithm finds a non core object, it is marked as NOISE. With a core object, the algorithm initiates a cluster and the surrounding objects are added to the queue for further expansion. Each object in the queue is popped out and the Eps neighbour objects of the popped object are found. When the popped object is a core object, all its neighbour objects are assigned the current cluster id and its unprocessed neighbour objects are pushed into the queue for further processing. This process is repeated until there is no object left in the queue.

IV. PROPOSED SOLUTION

A new algorithm is proposed in this paper to improve the performance as well as to process a huge amount of data. This algorithm relies entirely on the Sparse MMF, and the Sparse MMF concept is explained briefly below:

A. Sparse Memory Mapped File (Sparse MMF)

The Sparse MMF [3] is a mechanism derived from the Memory Mapped File. The Memory Mapped File [3] is like virtual memory: it allows reserving a region of address space and committing physical storage to the region. The difference is that the physical storage comes from a file that is already on the disk instead of the system’s paging file. A memory mapped file can be used to access a data file on disk (even a very large one), to load and execute executable files and libraries, and to allow multiple processes running on the same machine to share data with each other. The Sparse MMF is similar to the Memory Mapped File but it occupies only the required storage space in the physical file. If we use a Memory Mapped File to reserve the region of memory, then when the changes are committed to the file on disk, the file size will be equal to the size of the created Memory Mapped File. If instead we use a Sparse MMF, the final file’s size will be equivalent to the non-zero elements which are
stored in the Sparse MMF. So the Sparse MMF gives a better storage result and hence it has been used in our research.

B. Object’s Structure

As this algorithm’s core is the Sparse MMF, the objects to be processed by this algorithm are organized a bit differently: each object’s structure has three additional fields, NextObjectOffset, NextSeedObjectOffset and NextTempObjectOffset.

Figure 1. Sparse Memory Mapped File Object’s Structure

While the objects are loaded into the Sparse MMF, they are chained in a sequence similar to a linked list (but not exactly a linked list). The first additional field, NextObjectOffset, holds the offset of the next object: the first object holds the offset of the second, the second object holds the offset of its immediate successor, and so on. The final object’s NextObjectOffset is set to NULL to indicate that there are no more objects to visit during the clustering process. So the first object’s address must always be retained in order to visit all the objects loaded in the Sparse MMF. The other two fields, NextSeedObjectOffset and NextTempObjectOffset, are used by the SparseMemoryRegionQuery function, which is explained in the section below.

C. SparseMemoryRegionQuery function

The proposed algorithm doesn’t use any extra buffer or queue to store the seed objects or the neighbour objects at run time. Instead, each object has a corresponding offset field in which the exact offset of the next seed object is stored. The original DBSCAN algorithm uses the RegionQuery function to retrieve the neighbour objects; the new algorithm introduces the SparseMemoryRegionQuery function in its place. This function visits all the required surrounding cells in memory, and the objects in the non-empty cells are chained and returned as seed objects. That is, the function starts from the center cell and visits the neighbour cells one by one. When a non-empty cell is found for the first time, the center object’s NextSeedObjectOffset field is assigned the offset of the new object (Address(NewObject) - Address(CenterObject)). The next time a new object is found, its offset is stored in the previous object’s NextSeedObjectOffset field, and so on. Eventually the last object’s NextSeedObjectOffset field is assigned NULL. Thus the extra memory and the buffer/queue required to store the seed objects are eliminated in this solution. The function can update the neighbour objects’ offsets in either the NextSeedObjectOffset field or the NextTempObjectOffset field. If the function receives the update flag UpdateMasterSeedOffset, the neighbour objects’ offsets are stored in the NextSeedObjectOffset field; if the update flag is UpdateTempSeedOffset, the NextTempObjectOffset field is updated with the neighbour objects’ offsets.

The DBSCAN algorithm’s computation complexity depends on the RegionQuery function, which uses a distance function to compute the neighbours present within a certain radius (Eps). In the new approach, the distance computation is removed from the SparseMemoryRegionQuery function call, which instead visits the required number of neighbour cells around the center cell.

Figure 2. Neighbour Cells Diagram

For this proposed solution we selected a two dimensional dataset for the experiment, and the above diagram shows the neighbour cells at different distances. The center cell is painted red, and the distance of the object stored in that cell is zero. The immediate neighbours, whose distance from the center cell is 1, are painted blue; the yellow cells are at a distance greater than 1 and <= 2; and so on. The offsets of these neighbour cells are pre-computed and stored in an M x 2 array, which is passed to the SparseMemoryRegionQuery function so that it visits only the required number of neighbour cells. Thus the distance computation between objects is not required.
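The cell-visiting idea described above can be illustrated with a minimal Python sketch. It is not the authors' Visual C++ implementation: the function names `build_offset_arrays` and `sparse_region_query`, the one-byte-per-cell occupancy grid, the grid size and the use of Euclidean distance between cell positions are all assumptions made for illustration, and an anonymous in-memory mapping stands in for the sparse file on disk. The sketch returns the neighbours as a list rather than chaining them through offset fields.

```python
import mmap

def build_offset_arrays(max_eps):
    """Pre-compute neighbour cell offsets, sorted by distance from the center.

    Returns (nco_array, io_array): nco_array[i] is a (dx, dy) cell offset and
    io_array[eps] is the last nco_array index lying within distance eps.
    """
    cells = sorted(
        (dx * dx + dy * dy, dx, dy)
        for dx in range(-max_eps, max_eps + 1)
        for dy in range(-max_eps, max_eps + 1)
        if dx * dx + dy * dy <= max_eps * max_eps
    )
    nco_array = [(dx, dy) for _, dx, dy in cells]
    io_array = []
    for eps in range(max_eps + 1):
        within = sum(1 for d2, _, _ in cells if d2 <= eps * eps)
        io_array.append(within - 1)
    return nco_array, io_array

def sparse_region_query(grid, width, height, x, y, eps, nco_array, io_array):
    """Return the occupied cells within eps of (x, y), center cell included.

    No distance is computed here: only the pre-computed offsets are visited.
    """
    neighbours = []
    for dx, dy in nco_array[: io_array[eps] + 1]:
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height and grid[ny * width + nx]:
            neighbours.append((nx, ny))
    return neighbours

WIDTH, HEIGHT = 10, 10                      # illustrative grid size
grid = mmap.mmap(-1, WIDTH * HEIGHT)        # anonymous, zero-filled mapping
for px, py in [(5, 5), (5, 6), (6, 6), (9, 9)]:
    grid[py * WIDTH + px] = 1               # mark the cell as occupied
nco_array, io_array = build_offset_arrays(max_eps=3)
hits = sparse_region_query(grid, WIDTH, HEIGHT, 5, 5, 1, nco_array, io_array)
```

With these assumptions the cost of one query depends only on the number of pre-computed offsets for the chosen Eps, not on the number of objects in the data set.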
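The offset-chained object layout of Sections B and C can be sketched in the same spirit. The record format below (a cluster id followed by three signed offsets), the names `RECORD`, `cell_address`, `load_objects` and `walk_objects`, and the choice of offsets relative to the record itself with 0 meaning NULL are hypothetical; the paper does not give the exact field sizes. As before, an anonymous mapping replaces the sparse file.

```python
import mmap
import struct

# Hypothetical fixed-size record stored at each grid cell:
# cluster_id, next_object, next_seed_object, next_temp_object.
# The three "next" fields hold byte offsets relative to the record itself;
# 0 plays the role of NULL.
RECORD = struct.Struct("<iqqq")
EMPTY, UNCLASSIFIED = 0, 1          # zero-filled memory reads as an empty cell
WIDTH, HEIGHT = 8, 8                # illustrative grid size

def cell_address(x, y):
    """Byte address of the record for grid cell (x, y)."""
    return (y * WIDTH + x) * RECORD.size

def load_objects(buf, points):
    """Write one record per point and chain the records through next_object."""
    first = prev = None
    for x, y in points:
        addr = cell_address(x, y)
        RECORD.pack_into(buf, addr, UNCLASSIFIED, 0, 0, 0)
        if prev is None:
            first = addr            # must be kept to reach all other objects
        else:
            cid, _, seed, temp = RECORD.unpack_from(buf, prev)
            RECORD.pack_into(buf, prev, cid, addr - prev, seed, temp)
        prev = addr
    return first

def walk_objects(buf, first):
    """Yield record addresses by following next_object until it is 0."""
    addr = first
    while addr is not None:
        yield addr
        step = RECORD.unpack_from(buf, addr)[1]
        addr = addr + step if step else None

buf = mmap.mmap(-1, WIDTH * HEIGHT * RECORD.size)   # anonymous mapping
points = [(1, 1), (5, 2), (0, 7)]
first = load_objects(buf, points)
visited = list(walk_objects(buf, first))
```

Because every object sits at the address derived from its coordinates, two objects with identical coordinates would overwrite each other in this layout, which matches the duplicate-object limitation the paper notes later.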
D. Neighbour Cells and Index Offset Array

Figure 3. NCOArray and IOArray

Two additional arrays are used in this algorithm to avoid the distance computation and improve the performance. The first, the Neighbour Cells Offset Array (NCOArray), is an M x 2 array that stores the offsets of the neighbour cells relative to the center object. The second, the Index Offset Array (IOArray), is a K x 1 array that stores, for each Eps value starting from 0, the last NCOArray index belonging to that Eps value. For example, if the Eps value is 1, IOArray[1] indicates that NCOArray elements 0 to 4 hold the cell offsets that SparseMemoryRegionQuery must visit during the neighbour objects computation. The value of M is decided by the maximum Eps value supported by the algorithm, and the value of K is determined from it as well. Both arrays are populated with the required values before the actual clustering process.

E. Algorithm

1) Input D, Eps, MinObjs.
2) Create the Sparse Memory Mapped File.
3) Load the pre-computed Neighbour Cells Offset Array “NCOArray” and Index Offset Array “IOArray” values.
4) Initialize the Sparse Memory Mapped File with the dataset D, assign UNCLASSIFIED to the ClusterID field of all objects and preserve the first object’s address.
5) ClusterID = NOISE, CurrentObject = FirstObject.
6) WHILE CurrentObject <> NULL
7)   If (CurrentObject.ClusterID == UNCLASSIFIED) Then
8)     Call the SparseMemoryRegionQuery function with the CurrentObject, Eps, UpdateMasterSeedOffset, NCOArray and IOArray parameters; the function returns FirstSeedObject, LastSeedObject and SeedObjectsCount.
9)     If (SeedObjectsCount >= MinObjs) Then // Core Object condition
10)      ClusterID = GetNextID(ClusterID).
11)      Assign the ClusterID to all the seed objects.
12)      Move CurrentSeedObject to point to its next seed object using the OffsetValue and assign NULL to the previous CurrentSeedObject’s NextSeedObjectOffset field.
13)      WHILE CurrentSeedObject <> NULL
14)        Call the SparseMemoryRegionQuery function with the CurrentSeedObject, Eps, UpdateTempSeedOffset, NCOArray and IOArray parameters; the function returns TempFirstSeedObject, TempLastSeedObject and TempSeedObjectsCount.
15)        If (TempSeedObjectsCount >= MinObjs) Then
16)          TempCurrentSeedObject = TempFirstSeedObject.
17)          For I = 1 to TempSeedObjectsCount
18)            If TempCurrentSeedObject.ClusterID IN {UNCLASSIFIED, NOISE} Then
19)              If TempCurrentSeedObject.ClusterID == UNCLASSIFIED Then
20)                Append the TempCurrentSeedObject to the LastSeedObject.
21)              End If
22)              TempCurrentSeedObject.ClusterID = ClusterID.
23)            End If
24)            Move TempCurrentSeedObject to point to its next seed object using the OffsetValue and assign NULL to the previous TempCurrentSeedObject’s NextTempObjectOffset field.
25)          End For
26)        End If
27)        If (CurrentSeedObject.NextSeedObjectOffset == 0) Then
28)          CurrentSeedObject = NULL.
29)        Else
30)          Move CurrentSeedObject to point to its next seed object using the OffsetValue and assign NULL to the previous CurrentSeedObject’s NextSeedObjectOffset field.
31)        End If
32)      END WHILE
33)    Else // Non Core Object
34)      CurrentObject.ClusterID = NOISE.
35)      Assign NULL to the NextSeedObjectOffset field of all the seed objects.
36)    End If
37)  End If
38)  If (CurrentObject.NextObjectOffset == 0) Then
39)    CurrentObject = NULL.
40)  Else
41)    Move CurrentObject to point to its next object using the OffsetValue.
42)  End If
43) END WHILE

This algorithm starts by creating the Sparse MMF with the required size and loading the Neighbour Cells Offset and Index Offset array values. The objects of the dataset D are read one by one and each object is placed in its corresponding memory location. As mentioned in Section IV.B, while the Sparse MMF is initialized with objects, each successive object’s memory offset is stored in the previous object’s NextObjectOffset field and the last object’s NextObjectOffset field is assigned NULL. Thus it is essential to preserve the first object’s address in order to visit all the remaining objects.

The algorithm starts the traversal from the first object and visits the subsequent objects one by one using the next object’s offset stored in the current object itself. When it finds an object whose cluster ID is UNCLASSIFIED, the SparseMemoryRegionQuery function is called with the required parameters. As the new cluster is not yet formed, SparseMemoryRegionQuery is called with the UpdateMasterSeedOffset flag to update the seed objects’ NextSeedObjectOffset field. SparseMemoryRegionQuery returns FirstSeedObject, LastSeedObject and SeedObjectsCount. If the current object is a non core object, it is marked as NOISE and the NextSeedObjectOffset field of all its seed objects is set to NULL. Otherwise the current object is a core object, and the cluster expansion starts by creating a new cluster ID. The new cluster ID is assigned to all the seed objects chained from FirstSeedObject. The remaining objects in the seed chain (all except FirstSeedObject) are then processed one by one, and for each of them SparseMemoryRegionQuery is called with the UpdateTempSeedOffset flag to update the NextTempObjectOffset field. This avoids overwriting the seed objects which already exist in the main seed chain. So if the object is a core object, the neighbour objects are visited using NextTempObjectOffset instead of NextSeedObjectOffset. The UNCLASSIFIED objects present in the temporary seed chain are appended to the LastSeedObject (main seed chain) for further processing, and all the UNCLASSIFIED and NOISE objects present in the temporary seed chain are assigned the current cluster ID. The LastSeedObject member always points to the last object in the seed chain. All the objects present in the main seed chain are processed one by one, and the cluster expansion stops when the traversal reaches the LastSeedObject and no more seed objects remain. The complete clustering process stops once the initial loop has processed all the objects in the data set.

Figure 4. Result of Dataset 1

F. Advantages

The proposed algorithm is very stable. The main drawback of the original DBSCAN algorithm is instability. Even though all the objects in the data set can be loaded by the DBSCAN algorithm, if there is not sufficient main memory to hold the growing seed objects, the DBSCAN algorithm will crash at run time. The new algorithm doesn’t rely on a growing seed list, so it is guaranteed to process all the objects as long as it is able to load them. The second advantage of the new algorithm is that it is capable of processing a huge number of objects. Since this algorithm is based on the Sparse MMF, it can support a few GBs of data on a 32 bit operating system, where the traditional approach supports only a few MBs of data in real-world scenarios. The algorithm can also be customised to process a very large data set (e.g. > 10 GB) using the Sparse MMF. The beauty of the Sparse MMF is that, although more memory is pre-allocated at the beginning, the memory actually occupied depends on the consumption. Finally, the performance is very fast as the algorithm works directly on the memory.

G. Limitations

As this algorithm uses the Sparse MMF and only very few languages support this feature, the scope for implementing this algorithm is limited. The second limitation is memory
customization. If we plan to apply this algorithm to a multidimensional dataset, the memory layout needs to be customized accordingly and the computation complexity may vary. Also, if the minimum distance between an object and its nearest object is greater than or less than one unit, the offset array values change and must be recomputed. Moreover, creating and populating the offset arrays is an extra task. The last drawback of this algorithm is that it doesn’t support duplicate objects. As each object is loaded into its corresponding memory location, it is not possible to write another object to the same location. These are the notable limitations of this algorithm.

H. Computation Complexity

The DBSCAN algorithm’s complexity is calculated from the number of RegionQuery function calls. Each RegionQuery function call needs N distance computations, and hence the computation complexity becomes O(N^2) for processing all the N objects present in the dataset. As the new algorithm’s SparseMemoryRegionQuery processes the neighbour cells, the complexity varies with the Eps value, and each SparseMemoryRegionQuery requires a traversal of not more than 2^(Eps+1) cells. So for processing all the N objects, our algorithm requires O(N * 2^(Eps+1)) time. The constant 1 can be removed as it is very small, and the final complexity becomes O(N * 2^Eps). This complexity is a real reduction when the Eps value is reasonable (e.g. 1~10) and N is very large. At the same time, if there are very few objects and the Eps value is too big, this new complexity is not attractive. However, the real processing time will still be much shorter than with the traditional RegionQuery function call, as SparseMemoryRegionQuery traverses the memory directly.

TABLE 1. COMPARISON OF ALGORITHMS

Feature                                                     DBSCAN   DBSCANSMMF
Better Performance                                          No       Yes
Stability                                                   No       Yes
Doesn’t depend on distance function                         No       Yes
Supports Duplicate Objects?                                 Yes      No
Ability to process huge dataset?                            No       Yes
Doesn’t Require extra Buffer (because of growing Seed)?     No       Yes

The above table shows the comparison of some key features, and DBSCANSMMF is superior in most of them.

V. EXPERIMENTAL RESULTS

The newly proposed algorithm and the original DBSCAN algorithm were implemented in Visual C++ (2008) on Windows Vista and run on a PC with a 2.0 GHz processor and 4 GB RAM to observe the performance. Two dimensional synthetic datasets of different sizes were used, and the running time results are given below:

TABLE 2. RUNNING TIME OF DBSCAN AND DBSCANSMMF IN SECONDS

Number of Objects   1.DBSCANSMMF   1.DBSCAN   2.DBSCANSMMF   2.DBSCAN
1500                0.0007         0.3892     0.0005         0.2176
3000                0.0043         0.5395     0.0051         0.5684
6000                0.0081         1.8030     0.0094         1.8920
10000               0.0137         4.9124     0.0166         5.1122
20000               0.0261         20.4426    0.0255         18.2351
30000               0.0377         43.3875    0.0269         41.1765
40000               0.0545         77.6204    0.0587         79.6543
60000               0.0799         195.8284   0.0676         181.8745

Fig 5. Scalability of Algorithm with different size of dataset

The above table and graph show that the new algorithm performs increasingly better than DBSCAN as the input data set size grows. This is the expected result, as the new algorithm visits only the required neighbour cells during the SparseMemoryRegionQuery function call instead of computing the distance between the center object and all the objects in the data set. Another reason is that accessing the memory directly is much faster than processing the data through the buffers that are usually used to implement such algorithms.

VI. CONCLUSION AND FUTURE ENHANCEMENT

In this paper we have proposed the DBSCANSMMF algorithm to improve the performance as well as to process a huge amount of data using the Sparse MMF. This new algorithm doesn’t use a growing seed list, which causes a crash at run time when there is not sufficient memory to store the seed objects. Instead, the new algorithm maintains the seed list using offset values, and these values are stored internally in each object’s corresponding offset field. So there is no need of creating duplicate objects
for processing the objects. Also, this new algorithm has O(N * 2^Eps) computation complexity, which is a better complexity as long as the Eps value is reasonable.

Future work will be to customize this algorithm to support duplicate objects. This can be achieved using an internal counter which gives the number of identical objects, and SparseMemoryRegionQuery also needs to be customized accordingly to produce correct output. The next extension will be customizing this algorithm to process a very large data set (e.g. 50 GB). One of the real uses of a Memory Mapped File is mapping the required portion of the file into memory for processing, then unmapping the currently mapped region and remapping the next consecutive file region to process it later. In this way a file of any size can be processed, and this algorithm needs to be customized to support this feature.

REFERENCES

[1] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD’96), Portland, Oregon, 1996, pp. 226-231.

[2] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann, 2006.

[3] J. Richter and C. Nasarre, Windows via C/C++. Microsoft Press, 2008.

[4] M. Ankerst, M. Breunig, H.-P. Kriegel, and J. Sander, “OPTICS: Ordering Points to Identify the Clustering Structure,” in Proc. ACM SIGMOD International Conference on Management of Data, 1999, pp. 49-60.

[5] A. Hinneburg and D. Keim, “An efficient approach to clustering in large multimedia data sets with noise,” in 4th International Conference on Knowledge Discovery and Data Mining, 1998, pp. 58-65.

[6] SHOU Shui-geng, ZHOU Ao-ying, JIN Wen, FAN Ye, and QIAN Wei-ning, “A Fast DBSCAN Algorithm,” Journal of Software, 2000, pp. 735-744.

[7] Li Jian, Yu Wei, and Yan Bao-Ping, “Memory effect in DBSCAN algorithm,” in 4th International Conference on Computer Science & Education (ICCSE ’09), 25-28 July 2009, pp. 31-36.

AUTHOR PROFILES

J. Hencil Peter is a Research Scholar at St. Xavier’s College (Autonomous), Palayamkottai, Tirunelveli, India. He earned his MCA (Master of Computer Applications) degree from Manonmaniam Sundaranar University, Tirunelveli. He is now pursuing a Ph.D in Computer Applications and Mathematics (Interdisciplinary) at Manonmaniam Sundaranar University, Tirunelveli. His research interest is the invention of algorithms in data mining.
Email: hencilpeter@hotmail.com

Dr. A. Antonysamy is Principal of St. Xavier’s College, Kathmandu, Nepal. He completed his Ph.D in Mathematics for research on “An algorithmic study of some classes of intersection graphs”. He has guided, and continues to guide, many research students in Computer Science and Mathematics. He has published many research papers in national and international journals. He has organized seminars and conferences at the state and national level.
Email: fr_antonysamy@hotmail.com