Data Mining A Tutorial-Based Primer

2008. 05. 15
권오진, HPC Lab, University of Seoul

Outline
Simple Analysis of NN in DM
8.1 Feed-Forward Neural Networks
8.2 Neural Network Training: A Conceptual View
8.3 Neural Network Explanation
8.4 General Considerations
8.5 Neural Network Training: A Detailed View

Simple Analysis of NN in DM
• TS="data Mining" and TS="Neural Network"
(Prof. 박상찬, Department of Industrial Engineering, KAIST)

[Figure: A taxonomy of feed-forward and recurrent/feedback network architectures. Each architecture is marked S (supervised) or U (unsupervised).]
Source: Jain, A.K., Mao, J., "Artificial Neural Networks: A Tutorial", Computer, March, 31-44

Neural network learning can be supervised or unsupervised. Learning is accomplished by modifying network connection weights while a set of input instances is repeatedly passed through the network. Once trained, an unknown instance passing through the network is classified according to the value(s) seen at the output layer.
Source: Jain, A.K., Mao, J., "Artificial Neural Networks: A Tutorial", Computer, March, 31-44

History of Neural Networks
• 1943: McCulloch and Pitts model the neuron for parallel distributed processing
• 1958: Rosenblatt introduces the perceptron
• 1969: Minsky and Papert publish limits on the ability of a perceptron to generalize
• 1970s and 1980s: ANN renaissance
• 1986: Rumelhart, Hinton, and Williams present backpropagation
• 1989: Tsividis: neural network on a chip

8.1 Feed-Forward Neural Networks

[Figure 8.1: A fully connected feed-forward neural network. Input layer: Node 1 (input 1.0), Node 2 (input 0.4), Node 3 (input 0.7). Hidden layer: Node j, Node i. Output layer: Node k. Weights W1j, W1i, W2j, W2i, W3j, W3i connect the input nodes to the hidden nodes; Wjk and Wik connect the hidden nodes to the output node.]

Table 8.1 • Initial Weight Values for the Neural Network Shown in Figure 8.1

W1j    W1i    W2j    W2i    W3j    W3i    Wjk    Wik
0.20   0.10   0.30   -0.10  -0.10  0.20   0.10   0.50

The user specifies the number of hidden layers as well as the number of nodes within each hidden layer.

Neural Network Input Format (1/2)
Categorical data
Color = {red, green, blue, yellow}
Ex 1) Straightforward technique: red = 0.00, green = 0.33, blue = 0.67, yellow = 1.00
Pitfall: the values impose an ordering and a distance between colors that the data does not contain (e.g., red appears closer to green than to yellow).
Ex 2)
Additional input nodes: red = [0,0], green = [0,1], blue = [1,0], yellow = [1,1]

Neural Network Input Format (2/2)
Conversion of numerical data
1. Divide all attribute values by the largest attribute value (drawback: a problem when no attribute values lie close to 0).
2. Use the maximum and minimum values:

$$\mathit{newValue} = \frac{\mathit{originalValue} - \mathit{minimumValue}}{\mathit{maximumValue} - \mathit{minimumValue}}$$

where
newValue is the computed value falling in the [0,1] interval range
originalValue is the value to be converted
minimumValue is the smallest possible value for the attribute
maximumValue is the largest possible attribute value

3. For skewed data, apply a logarithmic transformation with base 2 or base 10.

Neural Network Output Format
Ambiguous output values
Example: credit card promotion
• Promotion = Yes: Node 1 = 1, Node 2 = 0
• Promotion = No: Node 1 = 0, Node 2 = 1
• Network output Node 1 = 0.9, Node 2 = 0.2: Promotion = Yes
• Network output Node 1 = 0.2, Node 2 = 0.3: ?
An output value of 0.8 or higher is taken to mean that Promotion = Yes is likely. How, then, should an output of 0.45 be handled? Use a KNN (nearest neighbor) approach.

Prediction
Example: stock price prediction. If the network output is 0.35 and the stock's minimum value is $10.00 and its maximum value is $100.00, the predicted price is
($100.00 - $10.00)(0.35) + $10.00 = $41.50

The Sigmoid Function
Evaluation function: the output lies in the [0,1] interval, with a maximum output of 1.

$$f(x) = \frac{1}{1 + e^{-x}}$$

where e is the base of natural logarithms, approximated by 2.718282.
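The min-max conversion, the mapping of a network output back to the attribute's range, and the sigmoid function above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`min_max_scale`, `to_original_scale`, `sigmoid`) are hypothetical and do not come from the slides.

```python
import math


def min_max_scale(original_value, minimum_value, maximum_value):
    """Map an attribute value into the [0, 1] interval (conversion rule 2)."""
    return (original_value - minimum_value) / (maximum_value - minimum_value)


def to_original_scale(network_output, minimum_value, maximum_value):
    """Invert the min-max conversion to read a [0, 1] output as a prediction."""
    return network_output * (maximum_value - minimum_value) + minimum_value


def sigmoid(x):
    """Sigmoid evaluation function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Stock price example: output 0.35 with a $10.00 to $100.00 range.
predicted_price = to_original_scale(0.35, 10.0, 100.0)
print(round(predicted_price, 2))

# Sigmoid of a node input of 0.25, as used for node j in Figure 8.1.
print(round(sigmoid(0.25), 3))
```

The two helper functions are inverses of each other, so scaling a value and mapping it back should return the original value up to floating-point rounding.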
[Figure: plot of the sigmoid function f(x) for x from -6 to 6; f(x) rises from near 0 toward 1.]

[Figure 8.1, repeated: the hidden layer and output layer nodes use the sigmoid function.]

Using the inputs of Figure 8.1 and the initial weights of Table 8.1:
Input to node j = (1.0)(0.2) + (0.4)(0.3) + (0.7)(-0.1) = 0.25
f(0.25) = 0.562

8.2 Neural Network Training: A Conceptual View

Supervised Learning with Feed-Forward Networks
• Backpropagation Learning

[Figure 8.1, repeated: annotated with the direction in which the weight values are adjusted.]

Unsupervised Clustering with Self-Organizing Maps

[Figure 8.3: A 3x3 Kohonen network with two input layer nodes (Node 1 and Node 2) connected to the output layer.]

8.3 Neural Network Explanation
• Sensitivity Analysis
• Average Member Technique

Sensitivity analysis (Supervised)
Used to gain insight into the effect individual attributes have on neural network output.
1. Divide the data into a training set and a test dataset.
2. Train the network with the training data.
3. Use the test data to create a new instance I. Each attribute value for I is the average of all attribute values within the test data.
4. For each attribute:
   a. Vary the attribute value within instance I and present the modification of I to the network for classification.
   b. Determine the effect the variations have on the output of the neural network.
   c. The relative importance of each attribute is measured by the effect of attribute variations on network output.
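The sensitivity analysis steps above might be sketched as follows. Everything named here is an assumption made for illustration: `predict` stands in for an already trained network, `toy_predict` is a made-up stand-in model rather than a real neural network, and varying each attribute by plus or minus 0.1 is just one possible way to "vary the attribute value".

```python
def average_instance(test_data):
    """Step 3: build instance I from the per-attribute averages of the test data."""
    count = len(test_data)
    width = len(test_data[0])
    return [sum(row[a] for row in test_data) / count for a in range(width)]


def sensitivity(predict, test_data, deltas=(-0.1, 0.1)):
    """Step 4: vary one attribute of I at a time and record the output change."""
    base = average_instance(test_data)
    base_output = predict(base)
    scores = []
    for a in range(len(base)):
        changes = []
        for delta in deltas:
            varied = list(base)          # copy I, then modify a single attribute
            varied[a] = base[a] + delta
            changes.append(abs(predict(varied) - base_output))
        scores.append(max(changes))      # larger change = more important attribute
    return scores


def toy_predict(instance):
    """Hypothetical stand-in for a trained network (assumption, not from the slides)."""
    return 2.0 * instance[0] + 0.0 * instance[1] - 0.5 * instance[2]


test_data = [[0.2, 0.4, 0.6], [0.4, 0.6, 0.8]]
scores = sensitivity(toy_predict, test_data)
print(scores)
```

With this stand-in model the first attribute produces the largest output change and the second produces none, which is the kind of ranking step 4c describes.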
Average member technique
The average or most typical member of each class is computed by finding the average value for each class attribute.
The average member technique is used with unsupervised learning.

Using supervised learning to explain unsupervised learning:
1. Transform the data for unsupervised clustering.
2. Cluster the data using the neural network.
3. Name each cluster as a class.
4. Use the result as training data for a supervised classification model that has a rule generator.
5. Examine the generated rules to understand the content of each class.

8.4 General Considerations
• What input attributes will be used to build the network?
• How will the network output be represented?
• How many hidden layers should the network contain?
• How many nodes should there be in each hidden layer?
• What condition will terminate network training?
  - Minimum total error, a specific time criterion, or a maximum number of iterations

The process of building a neural network is both an art and a science.

Neural Network Strengths
• Work well with noisy data.
• Can process numeric and categorical data.
• Appropriate for applications requiring a time element.
• Have performed well in several domains.
• Appropriate for supervised learning and unsupervised clustering.

Weaknesses
• Lack explanation capabilities.
• May not provide optimal solutions to problems.
• Overtraining can be a problem.

8.5 Neural Network Training: A Detailed View

The Backpropagation Algorithm: An Example
Backpropagation works by making modifications in weight values starting at the output layer and then moving backward through the hidden layers.
Input to node j = (0.2)(1.0) + (0.3)(0.4) + (-0.1)(0.7) = 0.250
Output from node j = 0.562
Input to node i = (0.1)(1.0) + (-0.1)(0.4) + (0.2)(0.7) = 0.200
Output from node i = 0.550
Input to node k = (0.1)(0.562) + (0.5)(0.550) = 0.331
Output from node k = 0.582

Backpropagation Error: Output Layer

$$\mathrm{Error}(k) = (T - O_k)\, f'(x_k)$$

where
T = the target output
O_k = the computed output at node k
(T - O_k) = the actual output error
f'(x_k) = the first-order derivative of the sigmoid function
x_k = the input to the sigmoid function at node k

For the sigmoid function, f'(x_k) = O_k(1 - O_k), so:

$$\mathrm{Error}(k) = (T - O_k)\, O_k (1 - O_k)$$

With T = 0.65:
Error(k) = (0.65 - 0.582)(0.582)(1 - 0.582) = 0.017
The error at hidden node j is the output error weighted by Wjk, times the sigmoid derivative at node j:
Error(j) = (0.017)(0.1)(0.562)(1 - 0.562) = 0.00042

Weight updates, with the learning parameter set to 0.5:
ΔWjk = (0.5)(0.017)(0.562) = 0.0048
The updated value for Wjk = 0.1 + 0.0048 = 0.1048
ΔW1j = (0.5)(0.00042)(1.0) = 0.0002
The updated value for W1j = 0.2 + 0.0002 = 0.2002
ΔW2j = (0.5)(0.00042)(0.4) = 0.000084
The updated value for W2j = 0.3 + 0.000084 = 0.300084
ΔW3j = (0.5)(0.00042)(0.7) = 0.000147
The updated value for W3j = -0.1 + 0.000147 = -0.099853

Backpropagation learning algorithm
1. Initialize the network.
   a. Create the network topology by choosing the number of nodes for the input, hidden, and output layers.
   b. Initialize the weights for all node connections to arbitrary values between -1.0 and 1.0.
   c. Choose a value between 0 and 1.0 for the learning parameter.
2. For all training set instances:
   a. Feed the training instance through the network.
   b. Determine the output error.
   c. Update the network weights using the previously described method.
3. If the terminating condition has not been met, repeat step 2.
4. Test the accuracy of the network on a test dataset. If the accuracy is less than optimal, change one or more parameters of the network topology and start over.
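The worked example above (one forward pass and one weight update for the network of Figure 8.1) can be reproduced with a short Python sketch. The variable names are hypothetical, and the code keeps full floating-point precision instead of rounding intermediate results as the slides do, so later digits differ slightly from the hand calculation (for instance, Error(j) comes out near 0.00041 rather than 0.00042).

```python
import math


def sigmoid(x):
    """Sigmoid evaluation function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Inputs and initial weights from Figure 8.1 and Table 8.1.
inputs = [1.0, 0.4, 0.7]
w_j = [0.2, 0.3, -0.1]    # W1j, W2j, W3j
w_i = [0.1, -0.1, 0.2]    # W1i, W2i, W3i
w_jk, w_ik = 0.1, 0.5
target = 0.65             # T in the worked example
rate = 0.5                # learning parameter used in the worked example

# Forward pass: hidden nodes j and i, then output node k.
out_j = sigmoid(sum(w * x for w, x in zip(w_j, inputs)))
out_i = sigmoid(sum(w * x for w, x in zip(w_i, inputs)))
out_k = sigmoid(w_jk * out_j + w_ik * out_i)

# Output-layer error: (T - O_k) * O_k * (1 - O_k).
err_k = (target - out_k) * out_k * (1.0 - out_k)

# Hidden-layer errors: output error weighted by the connecting weight,
# times the sigmoid derivative at the hidden node.
err_j = err_k * w_jk * out_j * (1.0 - out_j)
err_i = err_k * w_ik * out_i * (1.0 - out_i)

# Weight updates: delta = rate * error at receiving node * input on the link.
new_w_jk = w_jk + rate * err_k * out_j
new_w_ik = w_ik + rate * err_k * out_i
new_w_j = [w + rate * err_j * x for w, x in zip(w_j, inputs)]
new_w_i = [w + rate * err_i * x for w, x in zip(w_i, inputs)]

print(round(out_j, 3), round(out_i, 3), round(out_k, 3))
print(new_w_jk, new_w_j)
```

Repeating the forward pass and update for every training instance, and looping until a terminating condition is met, gives steps 2 and 3 of the learning algorithm.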
Root Mean Squared Error

$$\mathrm{RMS} = \sqrt{\frac{\sum_{n}\sum_{i}\,(T_{in} - O_{in})^{2}}{n\,i}}$$

(Equation 8.8)

where
n = the total number of training set instances
i = the total number of output nodes
T_in = the target output for the nth instance and the ith output node
O_in = the computed output for the nth instance and the ith output node

A common criterion is to terminate backpropagation learning when RMS < 0.10.

Kohonen Self-Organizing Maps: An Example

[Figure 8.4: Connections for two output layer nodes. Input layer: Node 1 (input 0.4), Node 2 (input 0.7). Output layer: Node i, Node j. Weights: W1i = 0.2, W2i = 0.1, W1j = 0.3, W2j = 0.6.]

Distance between the input vector and each output node's weight vector:

$$\sqrt{(0.4 - 0.2)^{2} + (0.7 - 0.1)^{2}} = 0.632 \quad (\text{node } i)$$

$$\sqrt{(0.4 - 0.3)^{2} + (0.7 - 0.6)^{2}} = 0.141 \quad (\text{node } j)$$

Node j is closer to the input, so its weights are adjusted, with learning rate r = 0.5:
Δw1j = (0.5)(0.4 - 0.3) = 0.05
Δw2j = (0.5)(0.7 - 0.6) = 0.05
w1j(new) = 0.3 + 0.05 = 0.35
w2j(new) = 0.6 + 0.05 = 0.65
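The Kohonen example above can be sketched as a single competitive-learning step in Python. The names (`distance`, `weights`, `winner`) are hypothetical, and the sketch adjusts only the winning node, as the worked example does; a fuller self-organizing map would typically also adjust nodes in the winner's neighborhood.

```python
import math


def distance(inputs, node_weights):
    """Euclidean distance between the input vector and a node's weight vector."""
    return math.sqrt(sum((x - w) ** 2 for x, w in zip(inputs, node_weights)))


inputs = [0.4, 0.7]                # values at input Node 1 and Node 2
weights = {
    "i": [0.2, 0.1],               # W1i, W2i
    "j": [0.3, 0.6],               # W1j, W2j
}
rate = 0.5                         # r in the worked example

# The output node whose weight vector is closest to the input wins.
distances = {node: distance(inputs, w) for node, w in weights.items()}
winner = min(distances, key=distances.get)

# Move the winner's weights toward the input: w(new) = w + r * (input - w).
weights[winner] = [w + rate * (x - w) for x, w in zip(inputs, weights[winner])]

print(winner, {node: round(d, 3) for node, d in distances.items()})
print([round(w, 2) for w in weights[winner]])
```

Because the update moves the winning weight vector halfway toward the input, presenting the same instance again would leave node j even closer to it than before.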
