
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 9, December 2010

ENHANCING AND DERIVING ACTIONABLE KNOWLEDGE FROM DECISION TREES

1. P. Senthil Vadivu, Head, Department of Computer Applications, Hindusthan College of Arts and Science, Coimbatore-641028, Tamil Nadu, India. Email: sowju_sashi@rediffmail.com
2. Dr. (Mrs) Vasantha Kalyani David, Associate Professor, Department of Computer Science, Avinashilingam Deemed University, Coimbatore, Tamil Nadu, India. Email: vasanthadavid@yahoo.com

Abstract

Data mining algorithms are used to discover customer models for distributional information. In customer relationship management (CRM), such models have been used to point out which customers are loyal and which are attritors, but they require human experts to discover the knowledge manually. Many post-processing techniques have been introduced, yet they do not suggest actions that increase an objective function such as profit. In this paper, a novel algorithm is proposed that suggests actions to change customers from an undesired status to the desired one. These algorithms can discover cost-effective actions that transform customers from undesirable classes to desirable ones. Many tests have been conducted, and the experimental results are analyzed in this paper.

Keywords: CRM, BSP, ACO, decision trees, attrition

1. Introduction

Much research has been done in data mining. Various models, such as Bayesian models, decision trees, support vector machines, and association rules, have been applied to industrial applications such as customer relationship management (CRM) [1][2], which maximizes profit and reduces costs while relying on post-processing techniques such as visualization and interestingness ranking.

Because of massive industry deregulation across the world, each customer faces an ever-growing number of choices in the telecommunications and financial-services industries [3][10]. The result is that an increasing number of customers switch from one service provider to another. This phenomenon is called customer "churning" or "attrition".

A main approach in the data mining area is to rank customers according to their estimated likelihood of responding to direct marketing actions, and to compare the rankings using a lift chart or the area-under-the-curve (AUC) measure from the ROC curve. Ensemble-based methods have been examined under cost-sensitive learning frameworks; for example, boosting algorithms have been integrated with cost considerations. A class of reinforcement learning problems and associated techniques is used to learn how to make sequential decisions based on delayed reinforcement so as to maximize cumulative rewards.

A common problem in current applications of data mining in intelligent CRM is that people tend to focus on, and be satisfied with, building and interpreting the models, but not with using them to gain profit explicitly. More specifically, most data mining algorithms only aim at constructing customer profiles, which predict the characteristics of customers of certain classes. An example of this kind of question is: what kinds of customers are likely attritors, and what kinds are loyal customers? Acting on such knowledge is possible in the telecommunications industry, for example, by reducing the monthly rates or increasing the service level for valuable customers.

Unlike distributional knowledge, actionable knowledge must take into account resource constraints such as direct mailing and sales promotion [14]. To make a decision, one must take into account the cost as well as the benefit of the actions to the enterprise.

This paper presents several algorithms, covering the creation of the decision tree, BSP (Bounded Segmentation Problem), Greedy-BSP, and ACO (Ant Colony Optimization), which help us obtain actions for maximizing profit and find the number of customers who are likely to be loyal.
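As a concrete illustration of the ranking-based evaluation mentioned above, the AUC and the lift at the top of a ranking can be computed directly from predicted loyalty scores. This is a minimal, self-contained sketch; the toy labels and scores are invented for illustration and are not from the paper's experiments:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive example
    is ranked above a randomly chosen negative one.
    """
    pairs = sorted(zip(scores, labels))
    # Assign 1-based ranks, averaging ranks within tied score groups.
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    n_pos = sum(label for _, label in pairs)
    n_neg = len(pairs) - n_pos
    rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def lift_at(labels, scores, fraction=0.1):
    """Response rate in the top `fraction` of the ranking divided by
    the overall response rate (the usual lift-chart reading)."""
    ranked = [label for _, label in
              sorted(zip(scores, labels), reverse=True)]
    top = ranked[:max(1, int(len(ranked) * fraction))]
    overall_rate = sum(labels) / len(labels)
    return (sum(top) / len(top)) / overall_rate
```

A model whose high scores concentrate the true responders gets an AUC near 1 and a top-of-list lift well above 1, which is what the lift-chart comparison above measures.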
230 http://sites.google.com/site/ijcsis/ ISSN 1947-5500

2. Extracting Actions in Decision Trees

For CRM applications, a decision tree can be built from a set of examples described by attributes such as name, sex, and birthday, by financial information such as yearly income, and by family information such as lifestyle and number of children. Decision trees are widely used in data mining because the model can easily be converted into rules, and the characteristics of the customers who belong to a certain class can be obtained. The algorithm used in this paper does not rely on prediction alone; it can also classify which customers are loyal, and such rules can easily be derived from decision trees.

The first step is to extract rules when there is no restriction on the number of rules that can be produced. This is called the unlimited resource case [3]. The overall process of the algorithm is as follows:

Algorithm 1:
Step 1: Import the customer data, with data collection, data cleaning, data preprocessing, and so on.
Step 2: Build a decision tree using a decision-tree learning algorithm [11] to predict whether a customer is in the desired status or not. One improvement for the tree building is to use the area under the curve of the ROC curve [7].
Step 3: Search for the optimal actions for each customer using the key component, the proactive solution [3].
Step 4: Produce reports for domain experts to review the actions and then deploy the actions.

2.1 A search for a leaf node with unlimited resources

This algorithm searches for optimal actions and transforms each leaf node to another leaf node in a more desirable fashion. Once the customer profile is built, each customer in the training examples falls into a particular leaf node; moving a customer into a leaf node with a more desirable status yields a probability gain, which can then be converted into expected gross profit.

When a customer is moved from one leaf to another, some attribute values of the customer must be changed. When an attribute value is transformed from V1 to V2, this corresponds to an action that incurs a cost, which is defined in a cost matrix.

The domain-specific net profit of an action can be defined as follows:

    PNet = PE * Pgain - Σi COSTij    (1)

where PNet denotes the net profit, PE denotes the total profit of the customer in the desired status, Pgain denotes the probability gain, and COSTij denotes the cost of each action involved.

The leaf-node search algorithm searches all the leaves in the tree so that, for every source leaf node, a best destination leaf node is found to move the customer to:

Algorithm: leaf-node search
1. For each customer x, do
2. Let S be the source leaf node into which x falls;
3. Let D be the destination leaf node for x with the maximum net profit PNet;
4. Output (S, D, PNet).

An example of a customer profile:

[Figure: an example customer-profile decision tree. The root splits on Service (Low, Med, High); the internal nodes split on Rate (L, H) and Sex (F, M). The five leaf nodes A, B, C, D, and E carry loyalty probabilities 0.9, 0.2, 0.1, 0.8, and 0.5 respectively.]

Consider the decision tree above. The tree has five leaf nodes, A, B, C, D, and E, each with a probability of the customer being loyal; the probability of being an attritor is simply 1 minus this probability.

Consider a customer Alexander whose record states that Service = Low (the service level is low), Sex = M (male), and Rate = L (the mortgage rate is low). The decision tree classifies Alexander into leaf node B, which predicts that Alexander will have only a 20 percent chance of being loyal. The algorithm will now search through all the other leaves (A, C, D, and E) in the decision tree to see whether Alexander can be "moved" to a better leaf with the highest net profit.
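Before walking through Alexander's case leaf by leaf, the leaf-node search with the net-profit formula of Eq. (1) can be sketched in code. The leaf probabilities come from the example tree; PE = $1,000 is implied by the example's $300 gross profit for a 30 percent gain, and the split of the $250 combined action cost into $200 for Service L -> H and $50 for Rate L -> H is an assumption made for illustration:

```python
# Loyalty probability of each destination leaf and the assumed total cost
# of the actions needed to move Alexander there from leaf B (prob. 0.2).
# Sex is a "hard" attribute, so moving to A costs effectively infinity.
P_E = 1000.0           # profit if the customer reaches the desired status
SOURCE_PROB = 0.2      # Alexander's current leaf, B
INF = float("inf")

DESTINATIONS = {
    "A": (0.9, INF),    # would require changing Sex (a hard attribute)
    "C": (0.1, 0.0),    # lower probability than B: no gain possible
    "D": (0.8, 200.0),  # assumed cost of Service L -> H
    "E": (0.5, 250.0),  # Service L -> H plus Rate L -> H (the paper's $250)
}

def leaf_node_search(source_prob, destinations):
    """Return (best leaf, net profit), where
    P_net = P_E * (dest_prob - source_prob) - action cost  (Eq. 1)."""
    best_leaf, best_net = None, float("-inf")
    for leaf, (prob, cost) in destinations.items():
        p_net = P_E * (prob - source_prob) - cost
        if p_net > best_net:
            best_leaf, best_net = leaf, p_net
    return best_leaf, best_net

best_leaf, best_net = leaf_node_search(SOURCE_PROB, DESTINATIONS)
net_profit_E = P_E * (DESTINATIONS["E"][0] - SOURCE_PROB) - DESTINATIONS["E"][1]
```

With these assumed costs, the search agrees with the walkthrough: leaf D yields the highest net profit, and moving to E nets $50.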
The search considers the collection of moves required to maximize the net profit:

1. Consider leaf node A. It has a high probability of being loyal (90 percent), but the cost of the action would be very high, because Alexander would have to be changed to female; so the net profit is negative infinity.
2. Consider leaf node C. It has a lower probability of being loyal than B, so we can easily skip it.
3. Consider leaf node D. The probability gain is 60 percent (80 percent - 20 percent) if Alexander falls into D, and the action needed is to change Service from L (low) to H (high).
4. Consider leaf node E. The probability gain is 30 percent (50 percent - 20 percent), which translates into $300 of expected gross profit. Assume that the cost of the actions (change Service from L to H and change Rate from L to H) is $250; then the net profit of moving Alexander from B to E is $50 (300 - 250).

Clearly, the leaf node with the maximum net profit for Alexander is D, which suggests the action of changing Service from L to H.

3. COST MATRIX

Each attribute-value change incurs a cost, and the cost for each attribute is determined by domain experts. The values of many attributes, such as sex, address, and number of children, cannot be changed with any reasonable amount of money. These attributes are called "hard attributes", and the user must assign a large number to every corresponding entry in the cost matrix. Other values can be changed with reasonable costs; these attributes, such as the service level, interest rate, and promotion packages, are called "soft attributes".

The hard attributes should still be included in the tree-building process in the first place, rather than left out to prevent customers from being moved to other leaves, because many hard attributes are important for accurate probability estimation at the leaves. Continuous-valued attributes, such as an interest rate that varies within a certain range, must have their numerical ranges discretized first for feature transformation.

4. THE LIMITED RESOURCE CASE: POSTPROCESSING DECISION TREES

4.1 BSP (Bounded Segmentation Problem)

In the example considered above, each leaf node of the decision tree is assumed to be a separate customer group, and for each customer group we have to design actions to increase the net profit. In practice, however, the company may be limited in its resources. When such limitations occur, it is difficult to merge all the nodes into k segments; instead, a responsible manager can apply several actions to each segment to increase the overall profit.

The inputs of the BSP problem are prepared as follows:
Step 1: Build a decision tree with a collection S of m source leaf nodes and a collection D of destination leaf nodes.
Step 2: Choose a constant k (k < m), where m is the total number of source leaf nodes.
Step 3: Build a cost matrix for attribute values u and v.
Step 4: Build a unit benefit vector for when a customer belongs to the positive class.
Step 5: Build a set of test cases.

The goal is to find a solution with maximum net profit by transforming the customers that belong to a source node S to a destination node D via a number of attribute-value changing actions.

GOALS: Transform a set of source leaf nodes S to a destination leaf node D (S -> D).
ACTIONS: To change a customer, one applies attribute-value changing actions, each denoted by {Attr, u -> v}.

Thus, the BSP problem is to find the best k groups of source leaf nodes {Group_i, i = 1, 2, ..., k} and their corresponding goals and associated action sets, so as to maximize the total net profit for a given data set Ctest.

[Figure: a second example decision tree. The root splits on Service (Low, High); below it are nodes Status (values A, B) and Rate (values C, D). The four leaf nodes L1, L2, L3, and L4 carry loyalty probabilities 0.9, 0.2, 0.8, and 0.5 respectively.]

Example: To illustrate the limited-resources problem, consider the decision tree in the figure above. Suppose that we wish to find a single customer segment (k = 1). A candidate group is {L2, L4} with the selected action set {Service <- H, Rate <- C}, which can transform the group to node L3. Transforming L2 to L3 changes the service level only and thus has a profit gain of (0.8 - 0.2) * 1 - 0.1 = 0.5, while L4 has a profit gain of (0.8 - 0.5) * 1 - 0.1 = 0.2. Thus, the net benefit for this group is 0.5 + 0.2 = 0.7.

As an example of the profit-matrix computation, the part of the profit matrix corresponding to source leaf node L2 is shown in Table 1, where Aset1 = {Status = A}, Aset2 = {Service = H, Rate = C}, and Aset3 = {Service = H, Rate = D}. Here, for convenience, we ignore the source values of the attributes, which depend on the actual test cases.

TABLE 1: An example of the profit matrix

    Aset1 (L2) (Goal -> L1)   Aset2 (L2) (Goal -> L3)   Aset3 (L2) (Goal -> L4)
    0.6                       0.4                       0.1
    ...                       ...                       ...

The BSP problem then becomes one of picking the best k columns of the profit matrix M, whose rows are source leaf nodes and whose columns are action sets, such that the sum over source leaf nodes of the maximum net-profit value among the k chosen columns is maximized. When all Pij elements are of unit cost, this is essentially a maximum coverage problem, which aims at finding k sets such that the total weight of the covered elements is maximized, where the weight of an element is the same for all sets. A special case of the BSP problem is thus equivalent to the maximum coverage problem with unit costs, and our aim is therefore to find approximate solutions to the BSP problem.

Algorithm for BSP:
Step 1: Choose any combination of k action sets.
Step 2: Group the leaf nodes into k groups.
Step 3: Evaluate the net benefit of the action sets on the groups.
Step 4: Return the k action sets with the associated leaf nodes.

Since BSP needs to examine every combination of k action sets, its computational complexity is high. To avoid this, we have developed the Greedy-BSP algorithm, which reduces the computational cost while guaranteeing the quality of the solution.

We illustrate the intuition of the Greedy-BSP algorithm using an example profit matrix M, shown in Table 2, where we assume a limit of k = 2. In this table, each number is a profit value Pij computed from the input parameters. Greedy-BSP processes the matrix in a sequenced manner for k iterations. In each iteration it considers adding one additional column of the matrix M, until it has considered all k columns; that is, it considers how to expand the customer group by one, by checking which additional column would increase the total net profit to the highest value.

TABLE 2: Illustrating the Greedy-BSP algorithm

    Source nodes   Aset1 (goal -> D1)   Aset2 (goal -> D2)   Aset3 (goal -> D3)   Aset4 (goal -> D4)
    S1             2                    0                    1                    1
    S2             0                    1                    0                    0
    S3             0                    1                    0                    0
    S4             0                    1                    0                    0
    Column sum     2                    3                    1                    1
    Selected                            X

5. IMPROVING THE ROBUSTNESS USING MULTIPLE TREES

The advantage of the Greedy-BSP algorithm is that it can significantly reduce the computational cost while at the same time guaranteeing a high-quality solution. However, the decision tree built for Greedy-BSP always chooses the most informative attribute as the root node. Therefore, we have also proposed an algorithm, referred to as Greedy-BSP-Multiple, which is based on integrating an ensemble of decision trees [16],[5],[15]. The basic idea is to construct multiple decision trees using different top-ranked attributes as their root nodes. For each set of test cases, the ensemble of decision trees returns the median net profit and the corresponding leaf nodes and action sets as the final solution. Thus, we expect that when the training data are unstable, the ensemble-based decision tree methods perform much more stably than a single decision tree.

Algorithm Greedy-BSP-Multiple:
Step 1: Given a training data set described by p attributes:
  1.1 Calculate gain ratios to rank all the attributes in descending order.
  1.2 For i = 1 to p: use the ith attribute as the root node to construct the ith decision tree. End for.
Step 2: Take a set of testing examples as input:
  2.1 For i = 1 to p: use the ith decision tree to calculate the net profit by calling algorithm Greedy-BSP. End for.
  2.2 Return the k action sets corresponding to the median net profit.

Since Greedy-BSP-Multiple relies on building multiple decision trees to calculate the median net profit, a different sampling of the training data can only affect the construction of a small portion of the decision trees. Therefore, Greedy-BSP-Multiple produces net profit estimates with less variance.

6. AACO (ADAPTIVE ANT COLONY OPTIMISATION)

The searching process of ACO is based on positive feedback reinforcement [4][12].
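The positive-feedback principle can be illustrated with a minimal, generic pheromone-update sketch. This is not the paper's AACO; the evaporation rate rho and the deposit constant q are assumed parameters chosen for illustration:

```python
def update_pheromone(tau, best_tour, tour_length, rho=0.1, q=1.0):
    """One generation of positive feedback: evaporate every trail,
    then reinforce the edges of the best tour found so far."""
    for edge in tau:
        tau[edge] *= (1.0 - rho)        # evaporation on all edges
    deposit = q / tour_length           # shorter tours deposit more
    for edge in zip(best_tour, best_tour[1:]):
        tau[edge] = tau.get(edge, 0.0) + deposit
    return tau

# Three edges with equal pheromone; the best tour 0 -> 1 -> 2 (length 2.0)
# reinforces edges (0, 1) and (1, 2), while (0, 2) only evaporates.
trails = update_pheromone({(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0},
                          best_tour=[0, 1, 2], tour_length=2.0)
```

Edges on the best tour keep gaining pheromone relative to the rest, and this strong reinforcement is also why escaping local optima requires extra care in ACO variants.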
Because of this positive feedback, however, escape from local optima is more difficult than with other metaheuristics. Therefore, recognizing the searching status and having a technique for escaping from local optima are important for improving the search performance of ACO. For the recognition of the searching status, the proposed algorithm utilizes the transition of the distance of the best (shortest) tours; a period in which each ant builds a tour represents one generation. There is also a "cranky" ant, which selects the shortest path among those that have not yet been selected.

ADAPTIVE ANT COLONY ALGORITHM
1. Initialize the parameters α, tmax, t, sx, sy
2. For each agent do
3.   Place the agent at a randomly selected site on the grid
4. End for
5. While (not termination) // such that t <= tmax
6.   For each agent do
7.     Compute the agent's fitness f(agent) and activation probability Pa(agent) according to (4) and (7)
8.     r <- random([0, 1])
9.     If r <= Pa then
10.      Activate the agent and move it to a randomly selected neighboring site not occupied by another agent
11.    Else
12.      Stay at the current site and sleep
13.    End if
14.  End for
15.  Adaptively update the parameters α; t <- t + 1
16. End while
17. Output the locations of the agents

7. DIFFERENCE FROM A PREVIOUS WORK

Machine learning and data mining research has been carried over to business practice by addressing some issues in marketing. One issue is that direct marketing is cost-sensitive; to address it, an association-rule-based approach has been used to differentiate between positive-class and negative-class members and to use these rules for segmentation. Collecting customer data [9] and using the data for direct marketing operations have also become increasingly possible. One such approach, known as database marketing, creates a bank of information about individual customers from their orders, queries, and other activities, and uses it to analyze customer behavior and develop intelligent strategies [10],[13],[6]. Another important computational aspect is to segment a customer group into sub-groups; AACO is the new technique used in this paper to find the best accuracy.

All the above research works have aimed at finding a segmentation of the customer database and taking a predefined action for every customer based on that customer's current status. None of them have addressed discovering actions that might be taken from a customer database; in this paper, we have addressed how to extract actions and find the best accuracy for keeping the customer loyal.

EXPERIMENTAL EVALUATION

This experimental evaluation shows the entropy value that has been calculated for each parameter. The best tree is selected from the various experimental results analyzed. In our example, each action set contains a number of actions, and the experiment uses four different action sets with attribute changes. The experiment with Greedy-BSP found action sets with the maximum net profit and is more efficient than optimal BSP. We also conducted the experiment with AACO and found that AACO is more accurate than the Greedy-BSP algorithms.

REFERENCES

[1] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proc. 20th Int'l Conf. Very Large Data Bases (VLDB '94), pp. 487-499, Sept. 1994.
[2] Bank Marketing Association, Building a Financial Service Plan: Working Plans for Product and Segment Marketing, Financial Source Book, 1989.
[3] A. Berson, K. Thearling, and S.J. Smith, Building Data Mining Applications for CRM, McGraw-Hill, 1999.
[4] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, 1981.
[5] M.S. Chen, J. Han, and P.S. Yu, "Data Mining: An Overview from a Database Perspective," IEEE Trans. Knowledge and Data Engineering, 1996.
[6] R.G. Drozdenko and P.D. Drake, Optimal Database Marketing, 2002.
[7] J. Huang and C.X. Ling, "Using AUC and Accuracy in Evaluating Learning Algorithms," IEEE Trans. Knowledge and Data Engineering, pp. 299-310, 2005.
[8] Lyche, A Guide to Customer Relationship Management, 2001.
[9] H. Mannila, H. Toivonen, and A.I. Verkamo, "Efficient Algorithms for Discovering Association Rules," Proc. Workshop on Knowledge Discovery in Databases.
[10] E.L. Nash, Database Marketing, McGraw-Hill, 1993.
[11] J.R. Quinlan, C4.5: Programs for Machine Learning, 1993.
[12] S. Rajasekaran and Vasantha Kalyani David, Pattern Recognition Using Neural and Functional Networks, 2008.
[13] Rashes and M. Stone, Database Marketing, John Wiley, 1998.
[14] Q. Yang, J. Yin, C.X. Ling, and T. Chen, "Postprocessing Decision Trees to Extract Actionable Knowledge," Proc. IEEE Conf. Data Mining, pp. 685-688, 2003.
[15] X. Zhang and C.E. Brodley, "Boosting Lazy Decision Trees," Proc. Int'l Conf. Machine Learning (ICML), pp. 178-185, 2003.
[16] Z.H. Zhou, J. Wu, and W. Tang, "Ensembling Neural Networks," Proc. IEEE Conf. Data Mining, pp. 585-588, 2003.

P. Senthil Vadivu received her MSc (Computer Science) from Avinashilingam Deemed University in 1999 and completed her M.Phil at Bharathiar University in 2006. She is currently working as the Head of the Department of Computer Applications, Hindusthan College of Arts and Science, Coimbatore-28. Her research area is decision trees using neural networks.

