									Tamsui Oxford Journal of Management Sciences March 2007, Vol. 23, No. 2 (01-22)
Aletheia University

         An Application of Fuzzy Clustering on the Analysis of Customer Needs

                                           Ming-chang Ke1
                (Received March 23, 2007; Revised May 14, 2007; Accepted May 30, 2007)

                                         Abstract
     The market has entered the era of the consumer. Businesses must now know who their
customers are, what they want and how to sell products to them. With this knowledge,
businesses can develop products that sell to the targeted consumers. This paper applies
fuzzy clustering to the analysis of customer needs. Fuzzy clustering was used to group
customers into clusters based on their needs, which in turn suggests to businesses more
effective strategies for grouping unknown customers. Two different similarity measures
are used in our experiments. With the first, the clustering algorithm is fuzzy clustering
based on the product rule. The second is calculated with single linkage and Euclidean
distance in SPSS, and hierarchical clustering is then used to cluster the data.

     Our results suggest that there are eight types of customers in our study: (1)
economical; (2) basic; (3) fashion-based; (4) function-based; (5) standard; (6) practical;
(7) advanced and (8) luxurious. The clustering shows that about 72.38% of the customers
are economical. The similarity threshold can be adjusted to obtain more appropriate
clusters, i.e. customer groups. This helps businesses adapt to the fast-changing market
and to customer behavior, and therefore gives them a competitive advantage.

Keywords: Fuzzy Clustering; Customer Needs; Customer Satisfaction; Marketing
          Segmentation Theory.

1. Introduction

1.1 Research Motivation

     Generally speaking, consumer behavior is shaped by lifestyle, culture, community
and psychological factors, which in turn determine buying behavior and decision making.
Long-term observations and surveys are often needed before consumer behavior can be fully
understood. Partitioning consumers into suitable markets helps us investigate the
similarity in behavior between the consumers in the same partition. However, consumers
are often held back from making a purchase by a lack of information, especially when
there is a huge variety of products to choose from. In this paper we use a questionnaire
and fuzzy clustering theory to extract the important needs of consumers, which
manufacturers of MP3 players can then take into consideration when adopting marketing
strategies. Only when consumer demand is met by the manufacturers’ supply can both sides
gain the best value and profit from the marketing process.

1   Dept. of Information Management, Aletheia University

1.2 Research Aims

      In this study we used fuzzy clustering to group consumers with different needs into
distinct clusters, providing an effective clustering strategy for an unknown population.
Based on effective consumer clustering, MP3 manufacturers can enhance their market
competitiveness and performance by introducing strategies that reflect consumers’ needs;
retailers can satisfy consumers through effective promotion and give feedback to the
manufacturers, who can then devise more realistic marketing strategies that take both
demand and supply into account. In sum, the aims of this study are:
      1. Enhancing consumers’ value and satisfaction to win their loyalty.
      2. Further improving a company’s strategies so that they meet consumers’ demand.

2. Research Methodologies

2.1 Cluster Analysis

     Clustering is often needed in everyday life, especially for problems with multiple
variables and multiple indicators, and cluster analysis based on multivariate statistics
is commonly adopted.

        Cluster analysis is a logical analysis process that groups existing data into a
    number of clusters according to their similarity and dissimilarity, and hence
    simplifies the problem. When similar data are grouped in the same cluster, the
    inter-cluster dissimilarity becomes more apparent, which helps us understand the
    differences in the data (Liu et al., 2003).

       Generally there are two types of cluster analysis:
       1. Hierarchical analysis methods: linkage method, Ward’s minimum variance
          method.
       2. Nonhierarchical cluster analysis methods: K-means.

         Provided that the optimal number of clusters is known, the Objective Function
    Method (OFM) offers much better accuracy and computational efficiency than the
    hierarchical method. The advantage of the hierarchical method is that it does not
    require a pre-defined number of clusters, since the number is determined
    automatically, but it is less accurate than OFM and less computationally efficient.

2.1.1 Hierarchical Analysis Methods




    This type of method is characterized by a tree or hierarchical structure. Some
  methods work bottom-up: they first treat each instance as a single-dot cluster, join the
  two most similar dots, and then, one step at a time, either add the most similar
  instance to an existing cluster or form a new cluster, until every instance belongs to
  one of the clusters. Another approach works top-down: all of the instances are first
  divided into two clusters based on the mean distance between the instances, and the
  instances with the largest mean distances in each existing cluster are then split into
  two clusters, repeatedly, until each instance is a cluster by itself (Huang, 1998).

     The most commonly used methods for determining the distance between clusters are
  the linkage method and the minimum variance method. The linkage method has a few
  variants:
     1. Single linkage: the minimum distance that can be found between two instances
        (dots) in two different clusters represents the distance between the clusters,
        e.g. $d_{24}$.
     2. Complete linkage: the maximum distance that can be found between two
        instances (dots) in two different clusters represents the distance between the
        clusters, e.g. $d_{35}$.
     3. Average linkage: the average distance between all pairs of instances in two
        different clusters represents the distance between the clusters,
        e.g. $(d_{14} + d_{15} + d_{24} + d_{25} + d_{34} + d_{35}) / 6$.




                           Figure 1. Linkage methods representation
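
The three linkage variants can be illustrated with a minimal Python sketch. The point coordinates and the function names `euclidean` and `linkage_distances` are hypothetical and chosen only for illustration; the sketch assumes Euclidean distance between instances and one cluster of three dots and one of two, so that six pairwise distances arise as in the average linkage example.

```python
import math
from itertools import product


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def linkage_distances(cluster_a, cluster_b):
    """Return (single, complete, average) linkage distances between two clusters."""
    # All pairwise distances between an instance of A and an instance of B.
    pairwise = [euclidean(p, q) for p, q in product(cluster_a, cluster_b)]
    single = min(pairwise)                    # closest pair
    complete = max(pairwise)                  # farthest pair
    average = sum(pairwise) / len(pairwise)   # mean over all pairs
    return single, complete, average


# Hypothetical data: dots 1-3 in one cluster, dots 4-5 in the other.
cluster_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]
single, complete, average = linkage_distances(cluster_a, cluster_b)
```

With these made-up points the three measures differ, which is why the choice of linkage can change the order in which clusters are merged.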

     The minimum variance method is also known as Ward’s method, which merges clusters
in order. This method first treats each instance as a cluster, and the order in which the
clusters are merged is determined by the total within-group variance. The merge that
produces the minimum total within-group variance is given priority over the others. In
other words, the earlier instances are merged together, the more similar they are to
each other.
     The distance between clusters A and B is defined as the sum, over the two clusters,
of the number of instances within the cluster multiplied by the squared distance from its
centroid ($\bar{x}_A$ for cluster A, $\bar{x}_B$ for cluster B) to the centroid $\bar{x}$
of the merged cluster of A and B:

\[ d_{A,B} = n_A \left\| \bar{x}_A - \bar{x} \right\|^2 + n_B \left\| \bar{x}_B - \bar{x} \right\|^2 \]        (1)

where $n_A$ and $n_B$ are the numbers of instances in clusters A and B. The distance
between clusters yielded by the minimum variance method, $d_{A,B}$, is called the between
group sum of squares (Chang, 2002).
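
As a rough illustration of equation (1), the following Python sketch computes the minimum variance distance for two small clusters. The data and the helper names `centroid`, `squared_distance` and `ward_distance` are assumptions made for the example and do not come from the study.

```python
def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def ward_distance(cluster_a, cluster_b):
    """Between group sum of squares of equation (1)."""
    merged = centroid(cluster_a + cluster_b)   # centroid of the merged cluster
    return (len(cluster_a) * squared_distance(centroid(cluster_a), merged)
            + len(cluster_b) * squared_distance(centroid(cluster_b), merged))


# Hypothetical clusters with centroids (1, 0) and (6, 0); merged centroid (3.5, 0).
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[5.0, 0.0], [7.0, 0.0]]
d_ab = ward_distance(cluster_a, cluster_b)   # 2 * 2.5**2 + 2 * 2.5**2
```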

2.2 The Application of Fuzzy Clustering

     In the theory of fuzzy clustering, the relationship between an instance and a fuzzy
cluster is represented by a membership degree rather than by a yes-or-no dichotomy. That
is, the extent to which an element belongs to a fuzzy cluster is represented by a value
in the continuous interval [0,1], so that the fuzzy phenomenon can be described clearly
with numbers. Fuzzy theory broadens traditional set theory and introduces its own set
operations. Many real-world problems involve fuzzy decision making, so using fuzzy
clustering on the membership of the data generates more rational results (Zheng, 2003).

     The number of clusters was unknown prior to the study, so we used the fuzzy
clustering method for analysis, a clustering method based on fuzzy relations. The
procedure is as follows: (1) transforming the raw data; (2) computing the matrices of
fuzzy similarity; (3) obtaining the fuzzy clustering relationship; and (4) applying fuzzy
clustering to the data and evaluating the clustering results. Each step is explained in
detail below.

1. Transforming the raw data:

     The concept and approach of transforming raw data are the same as in systematic
cluster analysis and include standardization, normalization, logarithmic transformation
and so on. Suppose there are n samples to be clustered, denoted x1, x2, x3, …, xn, where
n is the number of people surveyed in the experiment, and each sample has m attributes
denoted y1, y2, …, ym, where yj is the jth index and corresponds to the jth question in
the questionnaire used in the survey. The value of the jth attribute in sample xi is
written xij. The indices of the n samples can then be represented by the following table
(Luo, 1994).

                 Table 1. Transformation of raw data for fuzzy clustering
           Sample                         Index (Question ID)
     (Questionnaire ID)    y1         y2          y3         ……           ym
              x1           x11        x12         x13        ……           x1m
              x2           x21        x22         x23        ……           x2m
              :             :          :           :           :           :
              :             :          :           :           :           :
              xn           xn1        xn2         xn3        ……           xnm

     Each attribute was graded on a five-point scale from 0 to 4, with 0 being the least
important and 4 the most important. Each respondent gave a score to each attribute
according to how important they perceived that attribute to be.

                              Table 2. Conversion of raw data
                        Index (Question) 1               ………   Index (Question) m
    Sample              Grade  Importance         Score  ………   Grade  Importance         Score
    Sample 1            A      Very Important     4      ………   A      Very Important     4
    (Questionnaire 1)   B      Important          3      ………   B      Important          3
                        C      Neutral            2      ………   C      Neutral            2
                        D      Unimportant        1      ………   D      Unimportant        1
                        E      Very Unimportant   0      ………   E      Very Unimportant   0
    Sample 2            A      Very Important     4      ………   A      Very Important     4
    (Questionnaire 2)   B      Important          3      ………   B      Important          3
                        C      Neutral            2      ………   C      Neutral            2
                        D      Unimportant        1      ………   D      Unimportant        1
                        E      Very Unimportant   0      ………   E      Very Unimportant   0
        :               :      :                  :        :   :      :                  :
        :               :      :                  :        :   :      :                  :
    Sample n            A      Very Important     4      ………   A      Very Important     4
    (Questionnaire n)   B      Important          3      ………   B      Important          3
                        C      Neutral            2      ………   C      Neutral            2
                        D      Unimportant        1      ………   D      Unimportant        1
                        E      Very Unimportant   0      ………   E      Very Unimportant   0

     This study applied the conversion proposed by Zhou WenZhen (Zhou, 2001) to the raw
data above: the original scores (0 ~ 4) were binary-encoded with the digits 0 and 1. In
the resulting matrix, a score of 4 (Very Important) was converted into 1111; a score of 3
(Important) into 1110; a score of 2 (Neutral) into 1100; a score of 1 (Unimportant) into
1000; and a score of 0 (Very Unimportant) into 0000. After the binary encoding, the
number of sub-indices is 4m, four times the number m of original indices.
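
The binary encoding described above can be sketched in a few lines of Python. The function names `encode_score` and `encode_sample` and the three example answers are hypothetical; only the mapping from scores to four-digit codes is taken from the text.

```python
def encode_score(score):
    """Encode a score from 0 to 4 as four binary digits: `score` ones, then zeros."""
    if not isinstance(score, int) or not 0 <= score <= 4:
        raise ValueError("score must be an integer from 0 to 4")
    return [1] * score + [0] * (4 - score)


def encode_sample(scores):
    """Concatenate the codes of all m answers into 4m binary sub-indices."""
    bits = []
    for score in scores:
        bits.extend(encode_score(score))
    return bits


# Hypothetical respondent answering m = 3 questions: Very Important, Neutral, Very Unimportant.
sample = [4, 2, 0]
encoded = encode_sample(sample)   # 1111 1100 0000 as a flat list of 12 digits
```

One effect of this encoding is that adjacent scores differ in exactly one digit, so the distance between two encoded answers grows with the difference between the original scores.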

                               Table 3. Binary-encoded matrix
  Sample/Index          Index 1 (Score)          ……          Index M (Score)
    Sample 1            X11 X12 … X14            ……          X1(4M-3) X1(4M-2) … X1(4M)
    Sample 2            X21 X22 … X24            ……          X2(4M-3) X2(4M-2) … X2(4M)
       …                      …                  ……                      …
    Sample N            XN1 XN2 … XN4            ……          XN(4M-3) XN(4M-2) … XN(4M)

     We then standardize the indices. We first work out the mean $\bar{x}_j$ and the
standard deviation $s_j$ of each column and then apply the standardization
transformation. Here $x_{ij}$ denotes the entry in row i and column j of the
binary-encoded matrix in Table 3.

\[ \bar{x}_j = \frac{1}{N} \sum_{i=1}^{N} x_{ij} \]        (2)
          ($\bar{x}_j$ is the mean of the jth column, where j = 1, 2, …, 4M.)

\[ s_j = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( x_{ij} - \bar{x}_j \right)^2 \right]^{1/2} \]        (3)
          ($s_j$ is the standard deviation of the jth column.)

\[ x'_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} \]        (4)
          ($x'_{ij}$ is the standardized value.)


After the standardization each column has a mean of 0 and a variance of 1, and the
standardized values remain relatively stable when the samples change.
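
A minimal Python sketch of the standardization in equations (2) to (4) is given below. The function name `standardize_columns` and the small data matrix are hypothetical. The handling of a constant column (standard deviation 0) is an assumption of the sketch, since the text does not say how such a column is treated.

```python
import math


def standardize_columns(matrix):
    """Standardize each column to mean 0 and variance 1, as in equations (2)-(4)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in matrix) / n_rows)
            for j in range(n_cols)]
    # Assumption: a constant column (s_j = 0) is mapped to zeros to avoid division by zero.
    return [[(row[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0
             for j in range(n_cols)]
            for row in matrix]


# Hypothetical binary-encoded data: 4 samples, 3 sub-indices.
data = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0],
        [0, 1, 0]]
z = standardize_columns(data)
```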

2. Computing the matrices of fuzzy similarity:

    All the values $x'_{ij}$ in the standardized matrix were then compressed into the
range [0,1], and a matrix of fuzzy similarity was formed. The similarity coefficients
$r_{ij}$, which represent the similarity between samples i and j, were computed using the
product rule:

\[ r_{ij} = 1 - \frac{1}{X} \sum_{k=1}^{4M} \left( x_{ik} - x_{jk} \right)^2 \]        (5)

where X is a positive number with
$X \ge \max_{i,j} \sum_{k=1}^{4M} \left( x_{ik} - x_{jk} \right)^2$. The resulting
matrix is reflexive and symmetric.
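
Equation (5) can be sketched in Python as follows. The function name `fuzzy_similarity_matrix`, its `scale` parameter (standing for X) and the example rows are hypothetical. Choosing X equal to the largest squared distance is an assumption of the sketch; the text only requires X to be at least that large.

```python
def fuzzy_similarity_matrix(rows, scale=None):
    """Similarity r_ij = 1 - (1/X) * sum_k (x_ik - x_jk)^2, as in equation (5)."""
    n = len(rows)
    squared = [[sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
                for j in range(n)]
               for i in range(n)]
    largest = max(max(line) for line in squared)
    if scale is None:
        # Assumption: use the smallest admissible X (or 1 if all rows are identical).
        scale = largest if largest > 0 else 1.0
    if scale < largest:
        raise ValueError("X must be at least the largest squared distance")
    return [[1.0 - squared[i][j] / scale for j in range(n)] for i in range(n)]


# Hypothetical samples, one question each, scored 2, 1 and 0 and binary-encoded.
rows = [[1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0]]
r = fuzzy_similarity_matrix(rows)
```

The diagonal entries equal 1 (reflexive) and r[i][j] equals r[j][i] (symmetric), which matches the properties stated in the text.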

3. Obtaining the fuzzy clustering relationship:


     The matrices of fuzzy similarity obtained in the previous procedure were reflexive
and symmetric but not transitive, so they were repeatedly composed with themselves,
$R \to R^2 \to R^3 \to \dots \to R^n$, where the composition is defined by

\[ R \circ R = \{ r^{*}_{ij} \}, \qquad r^{*}_{ij} = \max_{\delta = 1, \dots, n} \min \left( r_{i\delta}, r_{\delta j} \right) \]

After each composition we checked whether the resulting fuzzy matrix was transitive, and
the process was repeated until a transitive matrix was obtained. After a finite number
of compositions we obtain a fuzzy equivalence relation, which is reflexive, symmetric
and transitive (Luo, 1994).
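
The max-min composition and the search for a transitive matrix can be sketched in Python as below. The function names and the example matrix are hypothetical. As an assumption of the sketch, the matrix is composed with itself at every step (R, R², R⁴, …) rather than multiplied by R one power at a time; for a reflexive matrix both sequences reach the same transitive matrix.

```python
def max_min_compose(r, s):
    """Max-min composition: entry (i, j) is the max over k of min(r[i][k], s[k][j])."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r, max_iter=100):
    """Compose the matrix with itself until it no longer changes."""
    current = r
    for _ in range(max_iter):
        composed = max_min_compose(current, current)
        if composed == current:   # composing changes nothing: the matrix is transitive
            return current
        current = composed
    raise RuntimeError("no transitive matrix reached within max_iter compositions")


# Hypothetical reflexive, symmetric similarity matrix that is not transitive:
# samples 0 and 2 have similarity 0 although both are similar to sample 1.
similarity = [[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]]
equivalence = transitive_closure(similarity)
```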



4. Applying fuzzy clustering on the data and evaluating the clustering results:

     Here we applied the λ-cut (also called the α-cut) to the fuzzy matrices obtained in
procedure 3. To obtain more precise values, λ was decreased from 1 in steps of 0.01 and
the iterations stopped when 0 was reached. When λ = 1, each sample was a cluster by
itself. As λ decreased, clusters whose similarity degree was greater than λ merged
with one another: clusters that already contained several instances were the most likely
to merge into a single cluster, whereas isolated samples were less likely to merge with
other clusters (Wang, 1997). Considering these effects, we worked out the possible
clusterings over the whole range of λ and took the one that yielded the largest number of
clusters as the optimal result.
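
A minimal Python sketch of the λ-cut and of the sweep from 1 down to 0 in steps of 0.01 follows. The function name `lambda_cut_clusters` and the example matrix are hypothetical, and the sketch assumes the input is already a fuzzy equivalence matrix (reflexive, symmetric and transitive). It only records the number of clusters at each λ; the rule for picking the optimal clustering is left to the text.

```python
def lambda_cut_clusters(equivalence, lam):
    """Put samples i and j in the same cluster whenever equivalence[i][j] >= lam."""
    n = len(equivalence)
    clusters = []
    assigned = set()
    for i in range(n):
        if i in assigned:
            continue
        # Transitivity guarantees that row i lists the whole cluster of sample i.
        members = [j for j in range(n) if equivalence[i][j] >= lam]
        clusters.append(members)
        assigned.update(members)
    return clusters


# Hypothetical fuzzy equivalence matrix for three samples.
equivalence = [[1.0, 0.8, 0.4],
               [0.8, 1.0, 0.4],
               [0.4, 0.4, 1.0]]

# Sweep lambda from 1.00 down to 0.00 in steps of 0.01.
cluster_counts = {}
for t in range(100, -1, -1):
    lam = t / 100
    cluster_counts[lam] = len(lambda_cut_clusters(equivalence, lam))
```

In this example every sample is its own cluster at λ = 1, samples 0 and 1 merge once λ drops to 0.8, and all three samples form one cluster once λ drops to 0.4.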

2.3 Reliability Analysis

     Reliability reflects the validity and accuracy of a measurement and includes
stability and consistency (Huang, 1998). Kerlinger (1999) argued that reliability measures
the trustworthiness, consistency and stability of a tool (e.g. a questionnaire). If all
the elements measured against the same criterion show similar or identical
characteristics, these elements are considered to be genuinely correlated. If a
particular element is not correlated with the other elements measured against the same
criterion, it is considered irrelevant to the criterion and should be eliminated.

     Reliability analysis largely depends on Cronbach’s α coefficient. Consider a
questionnaire with n elements x1, x2, …, xn whose scores are positively or negatively
correlated with the true scores T, and let $H = \sum_{i=1}^{n} x_i$ be their sum. Each
score xi consists of a true component Ti and an error term Ei, so that xi = Ti + Ei. The
reliability of H is then the square of the correlation between T and H, where Cov(T,H)
denotes their covariance:

\[ R_H = \rho_{TH}^2 = \frac{(\mathrm{Cov}(T,H))^2}{\mathrm{Var}(T)\,\mathrm{Var}(H)} = \frac{n}{n-1} \left( 1 - \frac{\sum_{i=1}^{n} \mathrm{Var}(x_i)}{\mathrm{Var}\left( \sum_{i=1}^{n} x_i \right)} \right) \]        (6)

           (n is the number of questions in the questionnaire, i = 1, …, n;
           $\mathrm{Var}(x_i)$ is the variance of the scores given to the ith question by
           all the people surveyed; $\mathrm{Var}(\sum_{i=1}^{n} x_i)$ is the variance of
           the total score of all the questions given by the people.)
              Table 4. Relationship between reliability and Cronbach’s α coefficient
                  Reliability                                  Cronbach’s α Coefficient
                  Unreliable                                   α < 0.3
                  Barely reliable                              0.3 ≤ α < 0.4
                  Reliable                                     0.4 ≤ α < 0.5
                  Very reliable (most common range)            0.5 ≤ α < 0.7
                  Very reliable (second most common range)     0.7 ≤ α < 0.9
                  Most reliable                                0.9 ≤ α
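
The right-hand side of equation (6) can be computed with a short Python sketch. The function names and the score matrix are hypothetical and are not the survey data. The sketch uses the population variance (dividing by the number of respondents); using the sample variance instead would give the same coefficient, because the same factor appears in the numerator and the denominator of the ratio.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def cronbach_alpha(scores):
    """Cronbach's alpha, where scores[p][i] is respondent p's score on question i."""
    n_items = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of 4 respondents to 3 questions on the 0-4 scale.
scores = [[4, 4, 3],
          [3, 3, 3],
          [2, 2, 1],
          [1, 1, 1]]
alpha = cronbach_alpha(scores)
```

For these made-up scores the item variances sum to 3.5 and the variance of the totals is 10, so the coefficient is 1.5 × (1 − 0.35) = 0.975, which would fall in the "Most reliable" row of Table 4.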

3. Experiments

3.1 The Samples

     We sent out 400 questionnaires by convenience sampling. All the subjects surveyed
were university students, including on-the-job students, and we obtained 105 valid
samples. Table 5 lists some of the demographic data of the samples: 58.1% of the
respondents were male; 87.62% were aged between 21 and 40; 84.76% had used an MP3 player
before; 79.05% mainly used an MP3 player for leisure; and 82.86% would purchase one if it
was priced at NTD $1000 to NTD $4000.

                    Table 5. Demographic statistics of the samples
    Item                      Distribution          Count    Percentage
    Gender                    Male                    61       58.1%
                              Female                  44       41.9%
    Age                       Under 20                11       10.48%
                              21 ~ 30                 68       64.76%
                              31 ~ 40                 24       22.86%
                              41 ~ 50                  2        1.9%
    Experience of use         Yes                     89       84.76%
                              No                      16       15.24%
    Purpose                   Job/Work-related         8        7.62%
                              Leisure                 83       79.05%
                              Study-related            6        5.71%
                              Other                    8        7.62%
    Occupation                In employment           66       62.86%
                              Public servant           9        8.57%
                              Student                 30       28.57%
    Acceptable price (NTD)    Under 1000              10        9.52%
                              1001~2000               32       30.48%
                              2001~3000               41       39.05%
                              3001~4000               14       13.33%
                              4001~5000                2        1.9%
                              5001~6000                1        0.95%
                              6001~7000                1        0.95%
                              Over 7000                4        3.81%

3.2 Model Construction

     We collected information from the websites of MP3 manufacturers and from forums and
found that buyers may consider the following when purchasing an MP3 player: price,
capacity, features, design and so on. We then devised our questionnaire around these
considerations.

                               Table 6. Analysis on MP3 player features
      Category          Feature              Description
      Basic             MP3 playing          Most players now support formats such as MP3
                                             and WMA. Some advanced models support ASF
                                             and OGG formats.
                        USB storage          File transfer via USB 2.0 (or 1.1).
                        Audio recording      Long-hour audio recording feature which also
                                             allows the user to adjust quality and maximum
                                             recording time. Some products record audio in
                                             formats other than MP3, which often require
                                             special software to be transferred to a computer.
      Middle-Class      FM radio             The player is able to receive FM radio channels.
      Mainstream        Multilingual         Most basic models only support Chinese and
                                             English fonts and will display incomprehensible
                                             characters if other languages are encountered.
                        A-B section          Users can set a starting point A and an end point
                        repeating            B when playing audio, and the section between A
                                             and B can be repeated for song practice or
                                             language study.
                        Multiple             Built-in audio equalizer (EQ) settings such as
                        equalizer settings   rock, bass, live and so on.
      Advanced          FM radio             Recording FM programs in real time.
                        recording
                        Digital line-in      Converting audio from external devices into
                        recording            MP3 format.
                        Synchronized         Using specialized plug-ins to display song lyrics
                        lyrics               when playing.
                        Expandable           An SD memory card can be used to further expand
                        memory slot          the memory of the player, as the built-in memory
                                             is often limited.
                        Update-enabled       The firmware on the player can be updated
                        firmware             when the manufacturer releases a newer version
                                             that supports more features or formats.

     Taking the features mentioned above into account, we looked at nine factors that
consumers are likely to take into consideration when purchasing an MP3 player, namely
(1) price; (2) specification; (3) data transfer; (4) exterior design; (5) power supply;
(6) supplementary features; (7) operability; (8) peripherals and accessories; and (9)
manufacturer factors.

                      Table 7. Various needs for an MP3 player surveyed
      Factor            Need
      Price             Acceptable price for purchase of an MP3 player
      Specification     Slim and thin body for easy carrying
                        Large capacity of memory
                        Built-in memory
                        Displaying song titles and lyrics when playing
                        Good acoustic quality
                        Long battery life for music playback
      Data transfer     High data transferring rate
                        Universal Plug and Play as standard, which requires no extra
                        drivers for data transfer
      Exterior design   Body case should be stain or scratch resistant
                        Fashionable and trendy design
                        Color display or back-light display
                        Body case should come with a fashion pattern or picture, e.g.
                        Hello Kitty
                        Retractable USB head
      Power supply      The player can accept various types of rechargeable battery
                        (Lithium-ion battery, Ni-MH battery, recharging through USB)
                        Apart from the rechargeable battery, the player can also be
                        powered by external power supply or dry batteries
                        Automatic switch-off
      Supplementary     FM radio, recording and radio channels settings
      features          Supporting multilingual fonts
                        The player can also be used as a flash drive
                        Language learning features
                        Supporting other formats of multimedia (wav, wma, video and
                        images etc)
                        Multiple audio effects
                        Shockproof
                        Built-in speakers
      Operability       Buttons or keys with sensitive response
                        Supporting remote control or other control
                        Comprehensible user manual
                        Easy navigation and interface
      Peripherals and   Free accessories (memory card, drivers, carrying cases and so on)
      accessories       Supporting external memory cards
                        Supporting external speakers or amplifiers
      Manufacturer      Long warranty
      factors           Repair services
                        Reputation

3.3 Study Procedures

     We carried out reliability analysis on the returned 105 questionnaires and a selection
of the questions is listed below. The value of Cronbach’s α was 0.927.

                                   Table 8. Purchase weight factors
      Weight factors                  Items
      Price                           Acceptable price
      Specification                   Dimensions
                                      Memory capacity
                                      Long-hour music playback
                                      Flash memory
      Data transfer                   Data transfer mode
      Exterior design                 Sleek body
                                      Back-light display
                                      Retractable USB connector
      Power supply                    Rechargeable battery
                                      External power supply
      Supplementary features          Built-in speaker
                                      Long-hour audio recording
                                      Flash drive
                                      FM radio
                                      Supporting multiple formats
                                      Sectional repeat
                                      Multiple equalizers and effects
                                      Shockproof
                                      Multilingual
      Operability                     Quick response and sensitivity
                                      Control
                                      Interface
      Peripherals and accessories     Free gifts
      Manufacturer factors            Warranty
                                      Repair
                                      Reputation

     We then used SPSS to carry out cluster analysis and fuzzy clustering analysis based
on the weight factors mentioned above (Qiu, 2004; Lin, 2004).

                            Table 9. Reliability analysis of weight factors
                                         Item-Total Statistics
                           Scale Mean if   Scale Variance    Corrected Item-     Cronbach's Alpha
          Item             Item Deleted    if Item Deleted   Total Correlation   if Item Deleted
          Price                73.95           379.563             .671                .922
          Volume               73.13           401.367             .436                .925
          Storage              73.11           397.816             .600                .924
          Long play            73.04           402.342             .479                .925
          Memory               73.36           394.233             .525                .924
          Fashion              74.24           390.103             .445                .926
          Back light           74.29           384.006             .539                .924
          USB port             73.57           390.692             .519                .924
          Transmission type    73.57           383.844             .544                .924
          Charge battery       73.69           387.792             .528                .924
          Outer power source   73.77           386.098             .589                .923
          Sensitivity          73.27           397.734             .535                .924
          Operation type       73.43           387.096             .666                .922
          Operate interface    73.96           377.978             .646                .922
          Inner speaker        74.42           382.691             .548                .924
          Long recording       73.90           383.707             .576                .923
          Mobile disk          73.51           389.727             .529                .924
          FM radio             74.16           380.499             .595                .923
          External file play   73.74           386.477             .570                .923
          Part repeat          74.29           379.501             .563                .924
          Stereo               74.03           379.221             .595                .923
          Against shake        73.55           389.705             .549                .924
          Multi-language       74.29           384.693             .503                .925
          Gift                 73.94           392.703             .402                .926
          Service              73.34           391.762             .551                .924
          Maintain             73.28           392.668             .557                .924
          Brand                73.47           388.050             .635                .923

                                        Reliability Statistics
          Cronbach's Alpha     Cronbach's Alpha Based on Standardized Items     N of Items
               .927                              .930                               27
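
     The reliability figures of the kind reported in Table 9 can be outlined in a few lines
of code. The sketch below is a minimal illustration that assumes the standard formula for
Cronbach's α, namely α = k/(k−1) · (1 − sum of item variances / variance of the total
score) for k items. The function names and the toy responses are hypothetical and are
not the survey data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn (needs >= 3 items)."""
    k = len(rows[0])
    return [
        cronbach_alpha([row[:j] + row[j + 1:] for row in rows])
        for j in range(k)
    ]


# Hypothetical answers of five respondents to four items (scores 0-4).
toy = [
    [4, 3, 3, 4],
    [3, 4, 4, 3],
    [4, 4, 4, 4],
    [2, 2, 3, 2],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(toy), 3))
print([round(a, 3) for a in alpha_if_item_deleted(toy)])
```

     An item whose removal raises α noticeably would be a candidate for exclusion; in
Table 9 every "if item deleted" value stays at or below the overall α of .927.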

3.3.1 SPSS Cluster Analysis

     The following need to be defined before the cluster analysis:
1. Similarity measure: we used the squared Euclidean distance to determine the similarity
     between two observed instances by measuring the distance between them in the vector
     space. The longer the distance, the less similar the two instances are to each other,
     and vice versa. Hence the distance between X and Y can be written as:

            $$\mathrm{Distance}(X, Y) = \sum_{i} (X_i - Y_i)^2 \tag{7}$$

     (where X and Y are two different observed instances and i = 1, 2, … indexes the
     variables)

2. Clustering method: Closely distanced instances will be clustered together. SPSS
  provides seven different clustering methods, which are given below:
    (1) Between-group linkage
    (2) Within-group linkage
    (3) Nearest neighbor, also known as single linkage
    (4) Furthest neighbor, also known as complete linkage
    (5) Centroid clustering
    (6) Median clustering
    (7) Ward’s method

     We employed the single linkage in our cluster analysis, which yielded seven clusters,
 the optimal number of clusters for our samples.
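
     The two choices above (squared Euclidean distance and single linkage) can be
sketched as follows. This is a minimal pure-Python illustration, not the SPSS
implementation; the function names and the five toy points are hypothetical. The first
print uses the scores of samples 1 and 2 from Table 11.

```python
def squared_euclidean(x, y):
    """Equation (7): sum of squared differences over all variables."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def single_linkage(points, n_clusters):
    """Naive agglomerative clustering; cluster distance = closest pair."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the nearest members.
                d = min(
                    squared_euclidean(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])   # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]


# Samples 1 and 2 of Table 11 (scores for Q1..Q9).
print(squared_euclidean((4, 3, 3, 3, 4, 4, 3, 3, 0),
                        (3, 4, 4, 3, 3, 4, 3, 2, 0)))   # 5

# Hypothetical 2-D points, cut at three clusters.
toy = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
print(single_linkage(toy, 3))   # [[0, 1], [2, 3], [4]]
```

     Because single linkage merges on the closest pair, one isolated point such as the
last toy point stays alone until very late, which is the kind of behaviour that leaves
many small clusters next to one large one.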

             Table 10. Clusters found by SPSS (the numbers are sample IDs)
 Cluster 1:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26,
             27, 29, 30, 31, 34, 35, 37, 39, 41, 42, 46, 49, 51, 55, 58, 60, 61, 62, 65,
             66, 67, 69, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91,
             92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105
 Cluster 2:  11, 53, 72
 Cluster 3:  15, 64
 Cluster 4:  28, 68
 Cluster 5:  44, 74
 Cluster 6:  59, 70
 Cluster 7:  63, 73, 77

3.3.2 Fuzzy Clustering

     In this section we show how we applied fuzzy clustering to the 105 samples, taking
into account the weight factors shown in Table 8:

1. Conversion of the raw data
     (1) The importance that the respondents assigned to each factor was converted into
         scores, as mentioned earlier in Table 2.

                        Table 11. Conversion of the raw data into scores
 Sample ID   Q1        Q2       Q3        Q4        Q5        Q6       Q7        Q8        Q9
    1         4         3        3         3         4         4        3         3         0
    2         3         4        4         3         3         4        3         2         0
    3         4         4        4         4         4         4        4         4         3
    4         4         4        4         3         3         4        3         4         4
    5         3         4        4         3         4         4        3         2         0
    6         4         4        4         4         4         4        4         4         4
    7         3         3        4         3         3         3        3         3         3
    8         3         3        4         3         3         3        4         3         3
    9         4         3        3         3         3         3        3         3         0
   10         4         4        4         0         3         4        4         3         0

     (2) The scores were then transformed into binary-encoded representation.

                                Table 12. Binary-encoded scores
 Sample ID            Q1        Q2       Q3        Q4        Q5        Q6       Q7        Q8        Q9
     1      Score      4         3        3         3         4         4        3         3         0
            Code    1111      1110     1110      1110      1111      1111     1110      1110      0000
     2      Score      3         4        4         3         3         4        3         2         0
            Code    1110      1111     1111      1110      1110      1111     1110      1100      0000
     3      Score      4         4        4         4         4         4        4         4         3
            Code    1111      1111     1111      1111      1111      1111     1111      1111      1110
     4      Score      4         4        4         3         3         4        3         4         4
            Code    1111      1111     1111      1110      1110      1111     1110      1111      1111
     5      Score      3         4        4         3         4         4        3         2         0
            Code    1110      1111     1111      1110      1111      1111     1110      1100      0000
     6      Score      4         4        4         4         4         4        4         4         4
            Code    1111      1111     1111      1111      1111      1111     1111      1111      1111
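
     The encoding in Table 12 writes a score s (0 to 4) as s ones followed by 4 − s zeros.
A minimal sketch of that mapping is given below; the function names are hypothetical,
and the fixed width of four bits is inferred from the table rather than stated in the text.

```python
def encode_score(score, width=4):
    """Encode a score as `score` ones followed by zeros, e.g. 3 -> '1110'."""
    if not 0 <= score <= width:
        raise ValueError(f"score must be between 0 and {width}")
    return "1" * score + "0" * (width - score)


def encode_row(scores, width=4):
    """Encode every score of one respondent (one code string per question)."""
    return [encode_score(s, width) for s in scores]


def to_bits(scores, width=4):
    """Flatten the codes of one respondent into a single list of 0/1 values."""
    return [int(bit) for code in encode_row(scores, width) for bit in code]


# Sample 1 of Table 11 (Q1..Q9).
sample_1 = [4, 3, 3, 3, 4, 4, 3, 3, 0]
print(encode_row(sample_1))
# ['1111', '1110', '1110', '1110', '1111', '1111', '1110', '1110', '0000']
print(len(to_bits(sample_1)))   # 36
```

     With nine questions and four bits each, every respondent becomes a vector of 36
binary values.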

     (3) After encoding, the standardization transformation was applied to the encoded
         scores.

                            Table 13. Standardization transformation
     1. Computing the fuzzy similarity matrix:
     0.2225,0.2858,-0.8126,0.1387,0.2450,0.2858,0.8126,0.0976,0.2225,0.2450,-1.1718,0.0976,
     0.1981,0.2225,-1.2947,0.1707,0.3405,0.4374,0.9489,0.4976,0.6001,0.8126,-0.5855,-1.8283,
     -1.5736,-1.2434,-0.5709,0.3229,0.3575,0.5124,-0.8959,0.4063,0.4828,0.5417,0.8288,0.3741,
     0.4374,0.5417,-0.8619,0.3405,0.4678,0.5417,-0.7341,0.1981,0.1981,0.3047,-0.9489,0.2660,
     0.3047,0.4374,1.0438,0.4976,0.5271,0.6148,1.3492,0.5563,0.7189,0.9133,-0.5855,0.4527,
     0.4828,0.6001,-0.7189,0.3405,0.3904,0.4063,1.0639,-1.9905,-1.6916,-1.1950,-0.7037,0.3575,
     0.4374,0.5709,-0.7966,0.6001,0.6887,0.7966,-0.6887,0.5417,0.5563,0.6589,-0.7495,0.2858,
     0.3904,0.4976,1.0845,-1.7803,-1.5736,-1.1056,-0.6738,0.4220,0.5124,0.7189,1.2687,0.2660,
     0.3229,0.3741,-1.0639,0.2450,0.2858,0.3741,0.8788,0.2225,0.3229,0.5417,-0.9858,
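
     The text does not give the formula behind the standardization transformation. A
common choice in fuzzy cluster analysis is the column-wise z-score, in which each
variable is shifted to mean 0 and scaled by its standard deviation. The sketch below
assumes that choice (with the population standard deviation); it is an illustration
under that assumption, and the function name and toy matrix are hypothetical.

```python
from statistics import mean, pstdev


def standardize_columns(matrix):
    """Column-wise z-score: (value - column mean) / column std deviation.

    Assumption: population standard deviation; constant columns map to 0.0.
    """
    cols = list(zip(*matrix))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mus, sds)]
        for row in matrix
    ]


# Hypothetical matrix: three respondents, two variables.
toy = [
    [1, 0],
    [2, 0],
    [3, 0],
]
for row in standardize_columns(toy):
    print([round(v, 4) for v in row])
```

     Standardized values can be negative or exceed 1, as the entries of Table 13 do, which
is why a further transformation into [0,1] is needed before a fuzzy relation can be built.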
     2. Obtaining fuzzy clustering relationship



     (4) The standardized matrix in Table 13 was then compressed into the range [0,1]
         and a fuzzy similarity matrix was formed.

                              Table 14. Matrix of fuzzy similarity
     1.0000,0.2747,0.5133,0.5583,0.5811,0.5580,0.6497,0.5622,0.4075,0.5220,0.1831,0.5851,
     0.5562,0.5793,0.4468,0.5871,0.5557,0.3964,0.5698,0.5865,0.5326,0.5700,0.4591,0.5655,
     0.4871,0.6033,0.5597,0.4182,0.5246,0.4884,0.4870,0.3515,0.2602,0.6440,0.5133,0.3205,
     0.6192,0.4038,0.6332,0.3519,0.5382,0.4792,0.4210,0.4237,0.2984,0.4830,0.3822,0.3271,
     0.5493,0.3075,0.6551,0.5034,0.1294,0.1712,0.5614,0.4073,0.4972,0.6611,0.0833,0.5414,
     0.5365,0.5812,0.4442,0.4378,0.4522,0.4399,0.5631,0.4174,0.5191,0.2899,0.4797,0.2774,
     0.4404,0.4101,0.4174,0.5593,0.4559,0.6195,0.5529,0.5210,0.5580,0.5580,0.5580,0.6296,
     0.5750,0.5580,0.5703,0.5879,
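
     Step (4) can be illustrated as follows. The sketch assumes two things that the text
does not spell out: that the compression into [0,1] is a column-wise range (min-max)
transformation, and that the similarity is computed with a scaled dot product, a common
way of building a fuzzy similarity matrix. Function names and data are hypothetical, so
the output is not expected to reproduce Table 14.

```python
def range_transform(matrix):
    """Column-wise min-max scaling into [0, 1] (assumed compression step)."""
    cols = list(zip(*matrix))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp > 0 else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in matrix
    ]


def dot_product_similarity(matrix):
    """Assumed similarity: r_ii = 1, r_ij = (x_i . x_j) / M for i != j,
    where M is the largest off-diagonal dot product, so every r_ij <= 1."""
    n = len(matrix)
    dots = [
        [sum(a * b for a, b in zip(matrix[i], matrix[j])) for j in range(n)]
        for i in range(n)
    ]
    largest = max(
        (dots[i][j] for i in range(n) for j in range(n) if i != j), default=0.0
    )
    scale = largest if largest > 0 else 1.0
    return [
        [1.0 if i == j else dots[i][j] / scale for j in range(n)]
        for i in range(n)
    ]


# Hypothetical standardized values: three respondents, two variables.
standardized = [[1.0, -2.0], [0.0, 0.0], [-1.0, 2.0]]
scaled = range_transform(standardized)
similarity = dot_product_similarity(scaled)
for row in similarity:
    print(row)
```

     The resulting matrix is reflexive (ones on the diagonal) and symmetric, which are
the two properties a fuzzy similarity relation needs before transitivity is enforced.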


     (5) The fuzzy similarity relation was composed with itself repeatedly until it
         satisfied transitivity, which yielded the fuzzy clustering relation.

                          Table 15. Fuzzy clustering relationship
     1.0000,0.4869,0.6379,0.6611,0.6379,0.6611,0.6611,0.6611,0.4926,0.6060,0.4092,0.6611,0.66
     0.4869,1.0000,0.4869,0.4869,0.4869,0.4869,0.4869,0.4869,0.4869,0.4869,0.4092,0.4869,0.48
     0.6379,0.4869,1.0000,0.6379,0.6457,0.6379,0.6379,0.6379,0.4926,0.6060,0.4092,0.6379,0.63
     0.6611,0.4869,0.6379,1.0000,0.6379,0.7026,0.7026,0.7026,0.4926,0.6060,0.4092,0.7026,0.70
     0.6379,0.4869,0.6457,0.6379,1.0000,0.6379,0.6379,0.6379,0.4926,0.6060,0.4092,0.6379,0.63
     0.6611,0.4869,0.6379,0.7026,0.6379,1.0000,0.7184,0.7141,0.4926,0.6060,0.4092,0.9082,0.82
     0.6611,0.4869,0.6379,0.7026,0.6379,0.7184,1.0000,0.7141,0.4926,0.6060,0.4092,0.7184,0.7184
     0.6611,0.4869,0.6379,0.7026,0.6379,0.7141,0.7141,1.0000,0.4926,0.6060,0.4092,0.7141,0.71
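
     A usual way to make a fuzzy similarity relation transitive is the max-min transitive
closure: the matrix is composed with itself (R∘R, where each entry is the maximum over k
of min(r_ik, r_kj)) until it no longer changes. The sketch below assumes that this is the
composition meant in step (5); the function names and the 3 × 3 matrix are hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Square the relation repeatedly until it stops changing."""
    current = [row[:] for row in r]
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


# Hypothetical reflexive, symmetric similarity matrix.
r = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
for row in transitive_closure(r):
    print(row)
# [1.0, 0.8, 0.5]
# [0.8, 1.0, 0.5]
# [0.5, 0.5, 1.0]
```

     In the toy example the weak direct link 0.2 is raised to 0.5 because the first and
third instances are connected through the second one. The many repeated values in the
rows of Table 15 are typical of a relation obtained in this way.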

     (6) Fuzzy clustering was applied to the data and the clustering results were evaluated.

                          Table 16. Fuzzy clustering with different λs
                    (clustering results; the numbers are sample IDs)
 λ = 0.7
   Cluster:  4, 6, 7, 8, 12, 13, 16, 24, 26, 27, 37, 39, 49, 58, 78, 79, 81, 82, 83, 84,
             85, 86, 87, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100, 101, 104, 105
   Cluster:  35, 76
   66 non-clustered instances

 λ = 0.6
   Cluster:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 22, 23, 24, 25, 26, 27,
             29, 30, 34, 35, 37, 39, 41, 42, 46, 49, 51, 55, 58, 60, 61, 62, 65, 67, 69,
             76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
             96, 97, 98, 99, 100, 101, 102, 103, 104, 105
   33 non-clustered instances

 λ = 0.55
   Cluster:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26,
             27, 29, 30, 34, 35, 37, 39, 41, 42, 46, 49, 51, 55, 58, 60, 61, 62, 65, 66,
             67, 69, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93,
             94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105
   34 non-clustered instances

 λ = 0.53
   Cluster:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26,
             27, 29, 30, 31, 34, 35, 37, 39, 41, 42, 43, 46, 49, 51, 52, 55, 57, 58, 60,
             61, 62, 65, 66, 67, 69, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
             89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105
   Cluster:  9, 36
   Cluster:  15, 64
   Cluster:  28, 68
   Cluster:  44, 74
   21 non-clustered instances

 λ = 0.5
   Cluster:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25,
             26, 27, 28, 29, 30, 31, 34, 35, 37, 39, 41, 42, 43, 44, 46, 47, 49, 51, 52,
             55, 57, 58, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 71, 73, 74, 75, 76, 77,
             78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
             97, 98, 99, 100, 101, 102, 103, 104, 105
   Cluster:  9, 36
   Cluster:  11, 53
   14 non-clustered instances

     Fuzzy clustering was carried out on the transitive fuzzy matrix R, and the cluster to
which a sample belonged was determined by the threshold λ. For example, when λ was equal
to 0.9, entries of the matrix R greater than 0.9 were assigned a value of 1 and the rest
were assigned a value of 0. The instances linked by a value of 1 were then placed in
the same cluster. The same process was repeated with a reduced λ until the largest number
of clusters containing more than one instance was obtained. In our experiment, this
condition was reached when λ = 0.53.
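
     The thresholding step can be sketched as a λ-cut of the transitive matrix. The code
below is a minimal illustration with hypothetical function names. It uses the
conventional cut r ≥ λ, which agrees with "greater than λ" unless an entry equals λ
exactly. The data are the upper-left 8 × 8 block of Table 15 (samples 1 to 8), so the
output can be compared with Table 16.

```python
def lambda_cut_clusters(closure, lam):
    """Group indices whose mutual membership in the closure is >= lam.

    Assumes `closure` is reflexive, symmetric and max-min transitive, so
    each row directly lists the members of that instance's cluster.
    """
    n = len(closure)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if closure[i][j] >= lam]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


def count_multi(clusters):
    """Number of clusters that contain more than one instance."""
    return sum(1 for c in clusters if len(c) > 1)


# Upper-left 8 x 8 block of Table 15 (samples 1..8).
R8 = [
    [1.0000, 0.4869, 0.6379, 0.6611, 0.6379, 0.6611, 0.6611, 0.6611],
    [0.4869, 1.0000, 0.4869, 0.4869, 0.4869, 0.4869, 0.4869, 0.4869],
    [0.6379, 0.4869, 1.0000, 0.6379, 0.6457, 0.6379, 0.6379, 0.6379],
    [0.6611, 0.4869, 0.6379, 1.0000, 0.6379, 0.7026, 0.7026, 0.7026],
    [0.6379, 0.4869, 0.6457, 0.6379, 1.0000, 0.6379, 0.6379, 0.6379],
    [0.6611, 0.4869, 0.6379, 0.7026, 0.6379, 1.0000, 0.7184, 0.7141],
    [0.6611, 0.4869, 0.6379, 0.7026, 0.6379, 0.7184, 1.0000, 0.7141],
    [0.6611, 0.4869, 0.6379, 0.7026, 0.6379, 0.7141, 0.7141, 1.0000],
]

for lam in (0.7, 0.6):
    clusters = lambda_cut_clusters(R8, lam)
    ids = [[i + 1 for i in c] for c in clusters]   # 0-based index -> sample ID
    print(lam, ids, count_multi(clusters))
# 0.7 [[1], [2], [3], [4, 6, 7, 8], [5]] 1
# 0.6 [[1, 3, 4, 5, 6, 7, 8], [2]] 1
```

     At λ = 0.7 samples 4, 6, 7 and 8 already form a cluster, and at λ = 0.6 samples
1, 3 and 5 join it while sample 2 stays apart, in line with the corresponding columns
of Table 16. Sweeping λ downwards and tracking `count_multi` is one way to find the
threshold with the largest number of multi-member clusters.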

    We then applied fuzzy clustering again to the remaining 21 non-clustered instances to
obtain the largest number of clusters of samples.

                     Table 17. Fuzzy clustering when λ = 0.53 (the numbers are sample IDs)
 Cluster 1:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26,
             27, 29, 30, 31, 34, 35, 37, 39, 41, 42, 43, 46, 49, 51, 52, 55, 57, 58, 60,
             61, 62, 65, 66, 67, 69, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
             89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105
 Cluster 2:  9, 36
 Cluster 3:  15, 64
 Cluster 4:  28, 68
 Cluster 5:  44, 74
 Cluster 6:  11, 53, 72
 Cluster 7:  59, 70
 Cluster 8:  40, 63, 73




3.3.3. Comparison between Cluster Analysis and Fuzzy Clustering in SPSS

     In the experiment we applied cluster analysis and fuzzy clustering to the samples.
Here we compare the results obtained with both approaches in the hope of finding an
effective clustering strategy for analyzing customer needs. The comparison is shown in
Table 18, which lists the largest number of clusters found by each approach.

     As seen in Table 18, the results yielded by the two approaches were very similar,
which indicated that the clustering methods were reliable. One interesting finding
was that samples ID 9 and ID 36 were clustered only by fuzzy clustering. When we looked at
these two samples we found that the differences in the importance scores given by the
respondents could not be easily distinguished. This might suggest that the cluster analysis
in SPSS was unable to cluster instances with minor differences when fuzziness in the
instances was present. In fuzzy clustering, by contrast, the similarity between instances
was represented by a value in [0,1], and therefore instances or clusters with low
similarity could be more easily distinguished from one another, which led to more
effective clustering. In our experiment the non-clustered instances were dynamically
disregarded in the clustering process because of their low similarity to other instances
or clusters.

       Table 18. Comparison between cluster analysis and fuzzy clustering in SPSS
                                     SPSS Cluster Analysis
 Cluster 1:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26,
             27, 29, 30, 31, 34, 35, 37, 39, 41, 42, 46, 49, 51, 55, 58, 60, 61, 62, 65,
             66, 67, 69, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91,
             92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105
 Cluster 2:  11, 53, 72
 Cluster 3:  15, 64
 Cluster 4:  28, 68
 Cluster 5:  44, 74
 Cluster 6:  59, 70
 Cluster 7:  63, 73, 77

                                   Fuzzy Clustering (λ = 0.53)
 Cluster 1:  1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26,
             27, 29, 30, 31, 34, 35, 37, 39, 41, 42, 43, 46, 49, 51, 52, 55, 57, 58, 60,
             61, 62, 65, 66, 67, 69, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
             89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105
 Cluster 2:  9, 36
 Cluster 3:  15, 64
 Cluster 4:  28, 68
 Cluster 5:  44, 74
 Cluster 6:  11, 53, 72
 Cluster 7:  59, 70
 Cluster 8:  40, 63, 73

3.4 Results

    Fuzzy clustering grouped the 105 samples into eight types based on the needs
 specified in the questionnaire. The numbers below are the sample IDs of the
 105 samples.

     Type 1 customer: 1, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25,
26, 27, 29, 30, 31, 34, 35, 37, 39, 41, 42, 43, 46, 49, 51, 52, 55, 57, 58, 60, 61, 62, 65, 66,
67, 69, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
98, 99, 100, 101, 102, 103, 104, 105. This type of customer placed more importance on
specification, exterior design, power supply, data transfer, operability, supplementary
features, peripherals and accessories, manufacturer factors and price.
     Type 2 customer: 9, 36. They placed more importance on specification, peripherals
and accessories and price.

    Type 3 customer: 15, 64. They placed more importance on exterior design, power
supply, operability and peripherals and accessories.

    Type 4 customer: 28, 68. They placed more importance on specification, power
supply, operability and supplementary features.

     Type 5 customer: 44, 74. They placed more importance on specification, operability,
manufacturer factors and price.

    Type 6 customer: 11, 53, 72. They placed more importance on specification, power
supply, operability, peripherals and accessories, manufacturer factors and price.

    Type 7 customer: 59, 70. They placed more importance on specification, power
supply, data transfer, operability, supplementary features, peripherals and accessories and
manufacturer factors.

     Type 8 customer: 40, 63, 73. They placed more importance on specification, data
transfer, power supply, operability and peripherals and accessories.

    Therefore we can specify customers in eight different categories based on their
needs:

                     Table 19. Categories of customers and their needs
          Category                        Needs
          Category 1: Economical          Specification, exterior design, power supply, data
                                          transfer, operability, supplementary features,
                                          peripherals and accessories, manufacturer factors
                                          and price
          Category 2: Basic               Specification, peripherals and accessories and price
          Category 3: Fashion-based       Exterior design, power supply, operability and
                                          peripherals and accessories
          Category 4: Function-based      Specification, power supply, operability and
                                          supplementary features
          Category 5: Standard            Specification, operability, manufacturer factors and
                                          price
          Category 6: Practical           Specification, power supply, operability, peripherals
                                          and accessories, manufacturer factors and price
          Category 7: Advanced            Specification, power supply, data transfer,
                                          operability, supplementary features, peripherals and
                                          accessories and manufacturer factors
          Category 8: Luxurious           Specification, data transfer, power supply,
                                          operability and peripherals and accessories

1. Economical customers hope to purchase an MP3 player with most of the functionality at
   a low price. Manufacturers or retailers can establish more effective communication
   with this type of customer, so that the customers understand the product and their own
   needs and can be well advised by the manufacturers or the retailers.

2. Basic customers normally look for the simplest features in an MP3 player and do not
   need many supplementary ones. This type of customer usually pays attention to
   convenience and fashion in a product and does not need to use an MP3 player for long
   continuous periods. Manufacturers can consider introducing slim models at a low price
   for these customers.

3. Fashion-based customers need an MP3 player that not only plays music but also
   has special acoustic effects, a sleek body design and a wide range of accessories.
   Manufacturers should look at introducing a variety of peripherals and accessories that
   can be personalized, such as changeable cases or memory card packages.

4. Function-based customers emphasize their needs in operability, such as the
  convenience of a remote control. This type of user is willing to spend more on an
  MP3 player that has more complete functionality and is easy to operate.

5. Standard customers usually look for some basic and standard features and easy
  operation in an MP3 player, without fancy supplementary functions, because of their
  limited budget. They also have limited knowledge of the product, so a manufacturer's
  reputation becomes an important guideline when they are purchasing the product.
  The retailers can merchandise the products according to customers' preferred brands.

6. Practical customers also have limited budget like standard customers do but they often
   know what specification and features they are after in an MP3 player. For example,
   students usually want a player that has long battery life for recording and music
   playback for their academic purposes.

7. Advanced customers usually know and compare specification and features of several
   models. They are also particular about acoustic and video quality, number of formats
   supported, operability and expansibility and so on. The manufacturers should pay more
   attention to the features and functions of their products and release refined models to
   the market.

8. Luxurious customers are willing to spend money on products in the high-end market and
   often pay attention to the latest products. The manufacturers should pay extra attention
   to their marketing strategies and emphasize the latest features of their products to draw
   the customers' interest in buying.

      The results of the clustering analysis show that 72.38% of the respondents were
economical customers, and we found that the optimal number of clusters can be obtained
by tuning the similarity threshold. From the manufacturers' perspective, being able to
adapt to the fast-changing market and understand the dynamics of customer behaviour
enhances their advantages and competitiveness in the market.
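
      The 72.38% figure follows directly from the cluster sizes: Type 1 contains 76 of the
105 samples. The short sketch below reproduces that arithmetic. The sizes are counted
from the sample-ID lists given for the eight types above, and the variable names are
hypothetical.

```python
# Cluster sizes counted from the sample-ID lists of the eight customer types.
sizes = {
    "Economical": 76,
    "Basic": 2,
    "Fashion-based": 2,
    "Function-based": 2,
    "Standard": 2,
    "Practical": 3,
    "Advanced": 2,
    "Luxurious": 3,
}
total_samples = 105

# Share of each type among all respondents, in percent.
shares = {name: round(100 * n / total_samples, 2) for name, n in sizes.items()}
unclustered = total_samples - sum(sizes.values())

print(shares["Economical"])   # 72.38
print(unclustered)            # 13
```

      The remaining samples are those that stayed outside every cluster at the chosen
threshold.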

4. Conclusions and Future Work

4.1. Conclusions


     In traditional analysis, binary clustering assigns each instance to either "belong to"
or "not belong to" a particular cluster. However, this approach is not always applicable
in real-world scenarios where many uncertainties are present. This study has found that
when the observed instances are not significantly different from one another, the cluster
analysis in SPSS is unable to separate them on the basis of such small differences.

      Today's market is oriented towards consumers' needs. If a manufacturer cannot meet
the diverse needs of consumers, they will turn to its competitors. We applied fuzzy
clustering to the analysis of consumers' needs and provided the manufacturer with useful
information that can be used as guidelines when it is planning to launch new marketing
strategies in the fast-changing market.

4.2. Future Work

    There are three directions for future work:
    1. Factors other than product functionality and features were not considered in
       this study; future work can therefore focus on these other factors that could
       potentially influence customer behavior.
    2. The MP3 player has now become a built-in feature of a number of electronic devices
       such as cell phones and portable gaming consoles. Future work can look at what
       impact these devices have on customer behavior.
    3. Only students at Aletheia University were surveyed in this study. Surveying a
       broader population and also considering the views of the manufacturers should
       help us further understand and improve the relationship between demand and
       supply.

                                         References

[1]  Wang, Wen Jun, 1997. Knows Fuzzy. Chuan Hwa Book Co., Ltd., Taipei,
     6.17-6.21.
[2]  Lin, Xin Cheng, and Sheng Wen Xiao, 2003. Annotate the materials fuzzily and
     classify the application that is catalogued in the library. Educational materials and
     library science, 41(1), pp. 61-76.
[3]  Zhou, Wen Zhen, 2001. Application of Fuzzy-Clustering Methodology to the
     Development of a Demand-Responsive Distribution System. M.S. Dissertation.
     Department of Transportation, Warehousing and Logistics, National Kaohsiung First
     University of Science and Technology.
[4]  Chang, Dun Cheng, 2002. Apply Fuzzy Cluster Method for Identifying the Spatial
     Distribution of Pollutants around Kaohsiung Coastal Water. M.S. Dissertation.
     Department of Marine Environment and Engineering, National Sun Yat-Sen
     University.
[5]  Chen Che operating room, 2003. SPSS statistical analysis - basic page. Gotop
     Information Inc., Taipei, 5.2-5.19.
[6]  Qiu, ZhenKun, 2004. SPSS counts teaching instance application. Kingsinfo
     information, Taipei.
[7]  Lin, JieBin, ChuanXiong Lin, and MingDe Liu, 2004. SPSS12 statistic modeling and
     the practice. DrSmart Press Co., Ltd., Taipei.
[8]  Huang, JunYing, 1998. Many variable analysis, 6th Edition. The economic
     enterprise research institute of China, pp. 239-259.
[9]  Liu, Jonen, Kwoting Fang, and Shih Ya Yueh, 2003. Using Fuzzy C-means to
     Analyze the Competitive Market for Mobile Phone. The International Journal of
     Management, Chung Hua University, 4(3), pp. 49-62.
[10] Zheng, Cai Fang, 2003. One group of laws of fuzzy set produces the research
     wrapped up outside in the industry that designs IC. M.S. Dissertation. Department of
     administration and institute of Chung Hua University.
[11] Luo, JiYu, 1994. Plural statistics analytical method and application. Scientific and
     technological books Limited Company, Taipei, pp. 225-233.



