Generalized Algorithm to Perform Short-Term Load Forecasting

1 Introduction

Electric power demand has been growing rapidly in the recent past, driven by the development of industries and the growth of population in urban and rural areas. This growth has introduced more random variation into the daily load consumption pattern, making the operation of the power system more cumbersome. Short-term load forecasting also becomes difficult, and hence an accurate and robust methodology is the need of the hour. Short-term load forecasting helps the utility in operational planning, unit commitment, maintenance scheduling and the allotment of spinning reserves. Apart from these operational benefits, the utility will also be in a position to bring down its revenue losses. This has gained more importance with the advent of open access in the deregulated electricity market.

1.1 Traditional Approach

Traditionally, short-term load forecasting has been carried out using time series analysis such as the Box-Jenkins method, ARIMA models and regression analysis. In this type of analysis, the predicted load demand is modeled as a function of previous historical loads. Artificial intelligence techniques such as genetic algorithms, neural networks, expert systems and fuzzy logic have also been employed for short-term load forecasting. The methods mentioned above focused on developing different strategies and aimed at solving the problem by presenting the data differently, but they did not emphasize the neural network design needed to arrive at an optimal solution. The neural network architecture was selected on a random basis, and hence it cannot be concluded that the solution obtained is optimal.

1.2 Proposed Method

The algorithm proposed in our project concentrates on network optimization by predicting the number of neurons in the hidden layer and also the best suited training algorithm.
The proposed algorithm adopts a two-step approach: the first step identifies the optimal training algorithm and the corresponding network size, and the second step uses this best training algorithm for the given dataset along with the identified number of hidden neurons. The attributes of the artificial neural network obtained from the first step are then used for the implementation of short-term load forecasting. The neural network, after being assigned random weights and biases, is trained with a sufficient amount of data and simulated with the best training algorithm. The short-term load forecasting results obtained from the proposed algorithm are validated by comparing them with the actual load curves; this is done by forecasting for Tamil Nadu state for the year 2011. The proposed algorithm paves the way for obtaining optimal results for short-term load forecasting using an artificial neural network. The following chapters deal with the different aspects involved in our project.

2 Load Forecasting

2.1 Introduction

Load forecasting is the prediction of the electrical load demand on the power station at a particular time in the future. It helps an electric utility to make important decisions, including decisions on purchasing and generating electric power, load switching, and infrastructure development. Load forecasts are of extreme importance for energy suppliers, financial institutions and other participants in electric energy generation, transmission, distribution and markets. Load forecasting has always been critical for the planning and operational decisions made by utility companies; with the deregulation of the energy industries, it has become even more important.
With supply and demand fluctuating, weather conditions changing, and energy prices increasing by a factor of ten or more during peak situations, load forecasting is vitally important for utilities.

2.2 Types of Load Forecasting

Load forecasting can be divided into three categories: short-term, medium-term and long-term load forecasting.

Short-term forecasts are usually made for a period ranging from one hour to one week. Short-term load forecasting helps to estimate load flows and to make decisions that can prevent overloading of the power system. Timely implementation of such decisions leads to improved network reliability and fewer equipment failures and blackouts. Various functions of the power plant engineer, such as unit commitment, operational planning and power purchase, are based on short-term load forecasting results.

Medium-term forecasts are carried out for a period usually ranging from a week to a year. Medium-term forecasting serves the purposes of power system planning and operation; the corresponding planning and operational decisions depend largely on its results.

Long-term forecasts are made for a period longer than a year. Long-term load forecasting results help in the future commissioning of generating stations and in determining the capacity of each generating unit.

2.3 Factors Affecting Load Forecasting

For short-term load forecasting, several factors should be considered, such as time factors, weather data and possible customer classes. Medium- and long-term forecasts take into account the historical load and weather data, the number of customers in different categories, the appliances in the area and their characteristics (including age), economic and demographic data and their forecasts, appliance sales data, and other factors. Weather conditions also influence the load.
In fact, forecasted weather parameters are the most important factors in short-term load forecasts. Various weather variables can be considered; temperature and humidity are the most commonly used load predictors. Historical load data, such as the load at the previous hour, the load at the same hour in the previous week, and seasonal components, also influence the forecasting results. A load forecasting algorithm should therefore consider the aforesaid factors so that it can perform accurately and reliably.

2.4 Methods for Load Forecasting

Over the last few decades a number of forecasting methods have been developed. Two of them, the end-use and econometric approaches, are broadly used for medium- and long-term forecasting, while a variety of methods are used for short-term forecasting. The development, improvement and investigation of appropriate mathematical tools will lead to more accurate load forecasting techniques. Some of the methods used are: regression analysis, stochastic time series (AR, ARIMA models), neural networks, fuzzy logic, and knowledge-based expert systems.

2.5 Conclusion

In this project, short-term load forecasting is performed using an artificial neural network, chosen from the different methods mentioned above. The attributes of the artificial neural network used for performing short-term load forecasting are explained in detail in the next section.

3 Artificial Neural Networks

3.1 Introduction

An artificial neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

1. Knowledge is acquired by the network from its environment through a learning process.

2.
Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

The procedure used to perform the learning process is called a learning algorithm; its function is to modify the synaptic weights of the network in an orderly fashion to attain a desired design objective.

3.2 Basic Neuron Model

A neuron is an information processing unit that is fundamental to the operation of a neural network. The neuron model shown in Fig. 1 forms the basis of an artificial neural network. Here we identify three basic elements of the neuronal model:

i. A set of synapses, each of which is characterized by a weight of its own. Specifically, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj. The first subscript refers to the neuron in question and the second subscript refers to the input end of the synapse to which the weight refers.

ii. An adder for summing the input signals, weighted by the respective synapses of the neuron.

iii. An activation function for limiting the amplitude of the output of a neuron. The activation function is also referred to as a squashing function, in that it squashes (limits) the permissible amplitude range of the output signal to some finite value. Typically, the normalized amplitude range of the output of a neuron is written as the closed unit interval [0, 1] or alternatively [-1, 1].

In mathematical terms, we may describe a neuron k by writing the following pair of equations:

uk = sum over j = 1 to m of (wkj * xj)

and

yk = phi(uk + bk)

where x1, x2, ..., xm are the input signals; wk1, wk2, ..., wkm are the synaptic weights of neuron k; uk is the linear combiner output due to the input signals; bk is the bias; phi is the activation function; and yk is the output signal of neuron k.

Fig. 1 Basic neuron model

3.3 Network Architecture

The manner in which the neurons of a neural network are structured is intimately linked with the learning algorithm used to train the network.
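The neuron model of Section 3.2, the pair of equations for uk and yk, can be sketched in a few lines of Python. This is only an illustrative sketch (the project itself uses MATLAB's Neural Network Toolbox); the function name and the choice of tanh as the squashing function are assumptions made here for demonstration.

```python
import numpy as np

def neuron_output(x, w, b, phi=np.tanh):
    """Basic neuron model: u_k = sum_j w_kj * x_j, y_k = phi(u_k + b_k).

    x   : input signals x_1 .. x_m
    w   : synaptic weights w_k1 .. w_km of neuron k
    b   : bias b_k
    phi : activation (squashing) function; tanh limits the output to [-1, 1]
    """
    u = np.dot(w, x)      # linear combiner output
    return phi(u + b)     # activation function limits the amplitude

# Example: two input signals with illustrative weights and bias
y = neuron_output(np.array([1.0, 2.0]), np.array([0.3, -0.1]), 0.05)
```

Note how the squashing function guarantees the output stays in the finite range [-1, 1] regardless of the size of the inputs.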
We may therefore speak of learning algorithms (rules) used in the design of neural networks as being structured. In general, we may identify three fundamentally different classes of network architecture:

1. Single-layer feedforward networks. In a layered neural network the neurons are organized in the form of layers. In the simplest layered network, an input layer of source nodes projects onto an output layer of neurons; this type is also known as acyclic. Here, "single layer" refers to the output layer of computation nodes (neurons).

2. Multilayer feedforward networks. This class distinguishes itself by the presence of one or more hidden layers, whose computation nodes are correspondingly called hidden neurons or hidden units; their function is to intervene between the external input and the network output in some useful manner. The source nodes in the input layer of the network supply the respective elements of the activation pattern (input vector), which constitute the input signals applied to the neurons (computation nodes) in the second layer (i.e., the first hidden layer). The output signals of the second layer are used as inputs to the third layer, and so on for the rest of the network. Typically, the neurons in each layer of the network have as their inputs the output signals of the preceding layer only. The set of output signals of the neurons in the output (final) layer of the network constitutes the overall response of the network to the activation pattern supplied by the source nodes in the input (first) layer.

3. Recurrent networks. A recurrent network distinguishes itself from a feedforward neural network in that it has at least one feedback loop. A recurrent network may consist of a single layer of neurons, with each neuron feeding its output signal back to the inputs of all the other neurons.
The feedback loops involve the use of particular branches composed of unit-delay elements, which result in nonlinear dynamical behavior, assuming that the neural network contains nonlinear units.

3.4 Training

Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place. A prescribed set of well-defined rules for the solution of a learning problem is called a training (learning) algorithm. Once the network weights and biases are initialized, the network is ready for training. A multilayer feedforward network can be trained for function approximation (nonlinear regression) or pattern recognition. The training process requires a set of examples of proper network behavior: network inputs p and target outputs t. Training a neural network involves tuning the values of its weights and biases to optimize network performance, as defined by the network performance function. The default performance function for feedforward networks is the mean square error (mse), the average squared error between the network outputs a and the target outputs t. Backpropagation is a gradient algorithm in which the network weights are moved along the negative of the gradient of the performance function.

Backpropagation Algorithm

Step 1: Pick the synaptic weights and biases from a uniform distribution whose mean is zero and whose variance is chosen to make the standard deviation of the induced local fields of the neurons lie at the transition between the linear and saturated parts of the sigmoid activation function.

Step 2: Present the network with an epoch of training examples. For each example in the set, perform the sequence of forward and backward computations described in Step 3 and Step 4.
Step 3: Compute the induced local fields and function signals of the network by proceeding forward through the network, layer by layer. Compute the error signal

ej(n) = dj(n) - yj(n)

where dj(n) is the desired response of neuron j and yj(n) is the output of neuron j after the forward computation.

Step 4: Compute the local gradient dj(n) (delta) for all the neurons present in the network. Adjust the synaptic weights of the network in each layer according to the generalized delta rule

wji(n + 1) = wji(n) + eta * deltaj(n) * yi(n)

where eta is the learning-rate parameter.

Step 5: Repeat Step 3 and Step 4 (forward and backward computations), presenting new epochs of training examples, until the stopping criterion is met.

The following training algorithms, which are variants of the backpropagation algorithm, are used for training. The variants differ in the way each computes the weight update from the gradient; the update rule for each algorithm is explained as follows.

3.4.1 Gradient Descent Backpropagation

Gradient descent backpropagation is a network training function that updates weight and bias values according to gradient descent. It can train any network as long as its weight, net input and transfer functions have derivative functions. Backpropagation is used to calculate the derivatives of the performance function with respect to the weight and bias variables X. Each variable is adjusted according to gradient descent:

dX = lr*dperf/dX

3.4.2 Gradient Descent with Momentum Backpropagation

This training algorithm provides faster convergence than the previous algorithm. Momentum allows a network to respond not only to the local gradient, but also to recent trends in the error surface.
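Steps 2 through 5 above can be sketched as a short batch training loop in Python. This is an illustrative sketch, not the project's MATLAB implementation: the network size (four tanh hidden units, linear output), the learning rate and the synthetic data are all assumptions made here to show the forward pass, the error signal, and the generalized delta rule in action.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 8 inputs, 1 target (sizes and data are illustrative)
X = rng.normal(size=(50, 8))
t = np.tanh(X @ rng.normal(size=(8, 1)))

# Step 1: small random weights, zero biases
W1, b1 = 0.1 * rng.normal(size=(8, 4)), np.zeros(4)
W2, b2 = 0.1 * rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.05  # learning-rate parameter eta

def forward(X):
    h = np.tanh(X @ W1 + b1)      # hidden-layer function signals
    return h, h @ W2 + b2         # linear output neuron

for epoch in range(200):          # Steps 2 and 5: epochs until stopping
    h, y = forward(X)             # Step 3: forward computation
    e = t - y                     # error signal e_j = d_j - y_j
    if epoch == 0:
        mse_start = float(np.mean(e**2))
    # Step 4: local gradients, propagated backward layer by layer
    d2 = e                             # linear output layer: delta = e
    d1 = (d2 @ W2.T) * (1.0 - h**2)    # tanh derivative at the hidden layer
    # Generalized delta rule: w <- w + lr * delta * input signal
    W2 += lr * h.T @ d2 / len(X); b2 += lr * d2.mean(0)
    W1 += lr * X.T @ d1 / len(X); b1 += lr * d1.mean(0)

mse_final = float(np.mean((t - forward(X)[1])**2))
```

Running the loop moves the weights along the negative gradient of the squared-error performance, so mse_final ends below the error at the first epoch.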
Backpropagation is used to calculate the derivatives of the performance function with respect to the weight and bias variables. Each variable is adjusted according to gradient descent with momentum:

dX = mc*dXprev + lr*(1-mc)*dperf/dX

3.4.3 Gradient Descent with Adaptive Learning Rate

The performance of the gradient descent algorithm can be improved if the learning rate is allowed to change during the training process. An adaptive learning rate attempts to keep the learning step size as large as possible while keeping learning stable. This algorithm can train any network as long as its weight, net input and transfer functions have derivative functions. Backpropagation is used to calculate the derivatives of the performance function with respect to the weight and bias variables. Each variable is adjusted according to gradient descent:

dX = lr*dperf/dX

3.4.4 Gradient Descent with Momentum and Adaptive Learning Rate Backpropagation

This algorithm combines the adaptive learning rate with momentum training. Backpropagation is used to calculate the derivatives of the performance function with respect to the weight and bias variables. Each variable is adjusted according to gradient descent with momentum:

dX = mc*dXprev + lr*mc*dperf/dX

3.4.5 Resilient Backpropagation

The purpose of resilient backpropagation is to eliminate the harmful effects of the magnitudes of the partial derivatives: only the sign of the derivative is used to determine the direction of the weight update. Backpropagation is used to calculate the derivatives of the performance function with respect to the weight and bias variables. Each variable is adjusted according to

dX = deltaX.*sign(gX);

where the elements of deltaX are all initialized to del0 and gX is the gradient. In each iteration the elements of deltaX are modified as follows: if an element of gX changes sign in successive iterations, the corresponding element of deltaX is decreased by deltadec.
If an element of gX maintains the same sign in successive iterations, the corresponding element of deltaX is increased by deltainc.

3.4.6 Conjugate Gradient with Fletcher-Reeves Updates

In most conjugate gradient algorithms, the step size is adjusted at each iteration: a search is made along the conjugate gradient direction to determine the step size that minimizes the performance function along that line. Each variable is adjusted according to

X = X + a*dX

where dX is the search direction. The parameter a is selected to minimize the performance along the search direction; a line search function is used to locate the minimum point. The first search direction is the negative of the gradient of performance. In succeeding iterations the search direction is computed from the new gradient and the previous search direction according to the formula

dX = -gX + dXold*Z

where gX is the gradient. The parameter Z can be computed in several different ways. For the Fletcher-Reeves variation of conjugate gradient it is computed according to

Z = normnew2/norm2

where norm2 is the norm square of the previous gradient and normnew2 is the norm square of the current gradient.

3.4.7 Conjugate Gradient with Polak-Ribiére Updates

The storage requirements for Polak-Ribiére updates are slightly larger than for Fletcher-Reeves. Each variable is adjusted according to

X = X + a*dX

where dX is the search direction. The parameter a is selected to minimize the performance along the search direction; a line search function is used to locate the minimum point. The first search direction is the negative of the gradient of performance. In succeeding iterations the search direction is computed from the new gradient and the previous search direction according to the formula

dX = -gX + dXold*Z

The parameter Z can be computed in several different ways.
For the Polak-Ribiére variation of conjugate gradient, it is computed according to the formula

Z = ((gX - gXold)'*gX)/norm2;

where norm2 is the norm square of the previous gradient and gXold is the gradient on the previous iteration.

3.4.8 Conjugate Gradient with Powell-Beale Restarts

In this algorithm each variable is adjusted according to

X = X + a*dX;

where dX is the search direction. The parameter a is selected to minimize the performance along the search direction; a line search function is used to locate the minimum point. The first search direction is the negative of the gradient of performance, and in succeeding iterations the search direction is computed from the new gradient and the previous search direction, as in the other conjugate gradient variants. The distinguishing feature of this algorithm is the Powell-Beale restart: whenever very little orthogonality remains between the current gradient and the previous one, tested with the inequality |gXold'*gX| >= 0.2*normnew2 (where normnew2 is the norm square of the current gradient), the search direction is reset to the negative of the gradient.

3.4.9 Scaled Conjugate Gradient

The line search used in the other conjugate gradient algorithms is computationally expensive; the scaled conjugate gradient algorithm is aimed at removing this cost. This algorithm can train any network as long as its weight, net input and transfer functions have derivative functions. Backpropagation is used to calculate the derivatives of performance with respect to the weight and bias variables X.

3.4.10 BFGS Quasi-Newton Backpropagation

Newton's method often converges faster than conjugate gradient methods. Each variable is adjusted according to

X = X + a*dX;

where dX is the search direction. The parameter a is selected to minimize the performance along the search direction; a line search function is used to locate the minimum point. The first search direction is the negative of the gradient of performance.
In succeeding iterations the search direction is computed according to the formula

dX = -H\gX;

where gX is the gradient and H is an approximate Hessian matrix.

3.4.11 One-Step Secant Algorithm

This algorithm does not store the complete Hessian matrix as the BFGS algorithm does; it assumes at each iteration that the previous Hessian was the identity matrix. Each variable is adjusted according to

X = X + a*dX;

where dX is the search direction. The parameter a is selected to minimize the performance along the search direction; a line search function is used to locate the minimum point. The first search direction is the negative of the gradient of performance. In succeeding iterations the search direction is computed from the new gradient and the previous steps and gradients according to the formula

dX = -gX + Ac*Xstep + Bc*dgX;

where gX is the gradient, Xstep is the change in the weights on the previous iteration, and dgX is the change in the gradient from the last iteration.

3.4.12 Levenberg-Marquardt Algorithm

Backpropagation is used to calculate the Jacobian jX of the performance function with respect to the weight and bias variables X. Each variable is adjusted according to the Levenberg-Marquardt rule

jj = jX * jX
je = jX * E
dX = -(jj+I*mu) \ je

where E is the vector of all errors and I is the identity matrix. The adaptive value mu is increased by muinc until the change above results in a reduced performance value; the change is then made to the network and mu is decreased by mudec. The parameter mem_reduc indicates how to trade memory against speed when calculating the Jacobian jX; higher values decrease the amount of memory needed but increase training time.

3.4.13 Bayesian Regularization

Bayesian regularization minimizes a linear combination of squared errors and weights.
It also modifies the linear combination so that at the end of training the resulting network has good generalization qualities. This Bayesian regularization takes place within the Levenberg-Marquardt algorithm. Backpropagation is used to calculate the Jacobian jX of the performance function with respect to the weight and bias variables X. Each variable is adjusted according to the Levenberg-Marquardt rule

jj = jX * jX
je = jX * E
dX = -(jj+I*mu) \ je

where E is the vector of all errors and I is the identity matrix. The adaptive value mu is increased by muinc until the change shown above results in a reduced performance value; the change is then made to the network, and mu is decreased by mudec. The parameter mem_reduc indicates how to trade memory against speed when calculating the Jacobian jX. If mem_reduc is 1, the algorithm runs fastest but can require a lot of memory; increasing mem_reduc to 2 cuts the memory required roughly in half but slows the algorithm somewhat. Higher values continue to decrease the amount of memory needed but increase training time.

3.5 Conclusion

A multilayer neural network is used for the implementation of short-term load forecasting, and the training algorithms described above are used to train it. The implementation of short-term load forecasting using an artificial neural network is elaborated in the next section.

4 Implementation of Artificial Neural Network for Short-Term Load Forecasting

4.1 Introduction

Load forecasting can be performed using many methods, such as regression analysis, stochastic time series models, neural networks, fuzzy logic and knowledge-based expert systems. Of all these methods, the neural network stands out for the implementation of load forecasting: its ability to capture nonlinear relationships and its use of parallel computing make it more favorable than the other methods.
Because of parallel computing, the computation proceeds faster than with other methods. Moreover, a failure in one part of the network is tolerated, and the rest of the computation continues toward the end result. The implementation of a neural network for the purpose of short-term load forecasting is discussed in the following sections.

4.2 Collection of Data

Before beginning the network initialization process, sample data must be collected and prepared. It is generally difficult to incorporate prior knowledge into a neural network, and therefore the network can only be as accurate as the data used to train it. It is important that the data cover the range of inputs for which the network will be used. Multilayer networks can be trained to generalize well within the range of inputs for which they have been trained; however, they do not have the ability to accurately extrapolate beyond this range, so it is important that the training data span the full range of the input space. After the data are collected, two steps need to be performed before the data are used to train the network: the data need to be preprocessed, and they need to be divided into subsets. The next two sections describe these two steps.

4.3 Preprocessing and Postprocessing of Data

Neural network training can be made more efficient if certain preprocessing steps are performed on the network inputs and targets. The most common of these preprocessing techniques are provided automatically when you create a network, and they become part of the network object, so that whenever the network is used, the data coming into the network are preprocessed in the same way. For example, in multilayer networks, sigmoid transfer functions are generally used in the hidden layers. These functions become essentially saturated when the net input is greater than three. If this happens at the beginning of the training process, the gradients will be very small and network training will be very slow.
In the first layer of the network, the net input is the product of the input and the weight, plus the bias. If the input is very large, then the weight must be very small to prevent the transfer function from becoming saturated. For this reason, it is standard practice to normalize the inputs before applying them to the network. Generally, the normalization step is applied to both the input vectors and the target vectors in the data set. In this way, the network output always falls into a normalized range; it can then be transformed back into the units of the original target data when the network is put to use in the field. It is easiest to think of the neural network as having a preprocessing block between the input and the first layer of the network, and a postprocessing block between the last layer of the network and the output, as shown in Fig. 4.1.

Fig. 4.1 Processing of data (input, preprocessing, neural network, postprocessing, output)

Normalization is done using functions such as mapminmax, mapstd, processpca, fixunknowns and removeconstantrows. Usually the mapminmax function is preferred on both the input and the output side, though the other functions may also be used. Here, normalization is done between the limits 0.2 and 0.8 using the mapminmax function, which is given by the formula

y = (ymax - ymin) * (x - xmin) / (xmax - xmin) + ymin

4.4 Division of Data

When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases. The second subset is the validation set; the error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error.
However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error. The test set error is not used during training, but it is used to compare different models. It is also useful to plot the test set error during the training process: if the error on the test set reaches a minimum at a significantly different iteration number than the validation set error, this may indicate a poor division of the data set. Four functions are provided for dividing data into training, validation and test sets: dividerand (divides the data randomly), divideblock (divides the data into contiguous blocks), divideint (divides the data into an interleaved selection), and divideind (divides the data by index). The data division is normally performed automatically when the network is trained. In our project, dividerand is used for the division of data. When net.divideFcn is set to 'dividerand' (the default), the data are randomly divided into the three subsets using the division parameters net.divideParam.trainRatio, net.divideParam.valRatio and net.divideParam.testRatio. The fraction of data placed in the training set is trainRatio/(trainRatio+valRatio+testRatio), with a similar formula for the other two sets. The default ratios for training, testing and validation are 0.7, 0.15 and 0.15, respectively.

4.5 Creation and Initialization of the Network

4.5.1 Creation of the Network

After the collection and preparation of data, the next step is to create the network object. To create a custom network, start with an empty network and set its properties as desired:

net = network;

The above statement creates an empty network, whose properties are then modified as follows. The first two properties that have to be set are the number of inputs and the number of layers the network needs.
net.numInputs and net.numLayers allow the user to set these two parameters, respectively:

net.numInputs = 1;
net.numLayers = 2;

Now the network has one input and two layers beyond the input. We then designate the number of neurons in each layer. The input layer has eight neurons, equal to the number of input variables; initially the number of neurons in the hidden layer is set to one. This is done by the following statements:

net.inputs{1}.size = 8;
net.layers{1}.size = 1;
net.layers{2}.size = 1;

The hidden layer is the first of the two layers mentioned, and the output layer forms the second and last layer. The next step is the connection of the layers. The input is connected to the first layer by the command

net.inputConnect(1) = 1;

Similarly, the connection to the output layer is established by net.outputConnect(i), where i refers to the ith layer, and the connection between the hidden layer and the output layer is given by net.layerConnect(j,i), where the outputs of the ith layer are connected to the jth layer. These two connections are expressed as follows:

net.outputConnect(2) = 1;
net.layerConnect(2,1) = 1;

This concludes the discussion on the creation of the network and the modification of its parameters.

4.5.2 Initialization of the Network

Before training a feedforward network, the weights and biases must be initialized. The init function takes a network object as input and returns a network object with all weights and biases initialized; it can also be used to reinitialize them. Here is how the network is initialized (or reinitialized):

net.biasConnect = [1;1];
net = init(net);

The first statement attaches biases to both layers of the network. The network is now initialized with random weights and biases and is ready to be trained.
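For readers without the MATLAB toolbox, the same structure, eight inputs, one hidden neuron (here generalized to n_hidden), one output neuron, and biases on both layers, can be sketched in plain Python/NumPy. The class and method names below are illustrative assumptions, not toolbox APIs, and the tanh/linear transfer functions are chosen only for demonstration.

```python
import numpy as np

class TwoLayerNet:
    """Network mirroring net.inputs{1}.size = 8, net.layers{1}.size,
    net.layers{2}.size = 1, with biases attached to both layers."""

    def __init__(self, n_in=8, n_hidden=1, n_out=1, seed=None):
        rng = np.random.default_rng(seed)
        # Random weights and biases, as produced by init(net)
        self.W1 = rng.uniform(-1, 1, size=(n_in, n_hidden))
        self.b1 = rng.uniform(-1, 1, size=n_hidden)
        self.W2 = rng.uniform(-1, 1, size=(n_hidden, n_out))
        self.b2 = rng.uniform(-1, 1, size=n_out)

    def simulate(self, X):
        """Forward pass: tanh hidden layer, linear output layer."""
        h = np.tanh(X @ self.W1 + self.b1)
        return h @ self.W2 + self.b2

net = TwoLayerNet(seed=42)
y = net.simulate(np.zeros((1, 8)))   # output for an all-zero input pattern
```

Since the weights are random, the untrained output is arbitrary; training (next section) is what tunes these weights and biases.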
4.6 Training the Network

Once the network weights and biases are initialized, the network is ready for training. The multilayer feedforward network can be trained for function approximation (nonlinear regression) or pattern recognition. The training process requires a set of examples of proper network behavior: network inputs p and target outputs t. Training a neural network involves tuning the values of its weights and biases to optimize network performance, as defined by the network performance function net.performFcn. The default performance function for feedforward networks is the mean square error (mse), but here we use the mean absolute error (mae).

There are two different ways in which training can be implemented: incremental mode and batch mode. In incremental mode, the gradient is computed and the weights are updated after each input is applied to the network. In batch mode, all the inputs in the training set are applied to the network before the weights are updated.

For training multilayer feedforward networks, any standard numerical optimization algorithm can be used to optimize the performance function, but a few key ones have shown excellent performance for neural network training. These optimization methods use either the gradient of the network performance with respect to the network weights, or the Jacobian of the network errors with respect to the weights. The gradient and the Jacobian are calculated using a technique called the backpropagation algorithm, which involves performing computations backward through the network. For training, the backpropagation algorithm and its variants, such as Levenberg-Marquardt, quasi-Newton, conjugate gradient and scaled conjugate gradient methods, are used. These algorithms have already been discussed in detail in the previous chapter.
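The difference between the two modes can be illustrated with a deliberately simplified sketch. The Python code below trains a single linear neuron (not the multilayer network, and not any of the MATLAB training algorithms) under the mae performance function, whose gradient with respect to the error is simply its sign; the only difference between the two functions is where the weight update happens:

```python
import random

def mae(preds, targets):
    """Mean absolute error, the performance function used here."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def predict(w, b, X):
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in X]

def sign(e):
    return (e > 0) - (e < 0)   # d|e|/de for the mae criterion

def train_incremental(w, b, X, t, lr=0.01):
    """Incremental mode: weights updated after each input pattern."""
    for x, ti in zip(X, t):
        g = sign(sum(wi * xi for wi, xi in zip(w, x)) + b - ti)
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g
    return w, b

def train_batch(w, b, X, t, lr=0.01):
    """Batch mode: gradient accumulated over all patterns, one update."""
    gs = [sign(p - ti) for p, ti in zip(predict(w, b, X), t)]
    n = len(X)
    w = [wi - lr * sum(g * x[i] for g, x in zip(gs, X)) / n
         for i, wi in enumerate(w)]
    b -= lr * sum(gs) / n
    return w, b

rng = random.Random(1)
X = [[rng.uniform(0.2, 0.8) for _ in range(8)] for _ in range(50)]
t = [sum(x) / 8 for x in X]              # toy targets
w, b = [0.0] * 8, 0.0
before = mae(predict(w, b, X), t)
for _ in range(200):
    w, b = train_batch(w, b, X, t)
after = mae(predict(w, b, X), t)
```

In batch mode one pass over the data yields one update, while incremental mode performs as many updates per pass as there are patterns.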
4.7 Conclusion

This chapter describes the need for an artificial neural network for the implementation of short-term load forecasting and also puts forth the key aspects that are to be considered during its implementation. The algorithm proposed for short-term load forecasting using an artificial neural network is discussed in detail in the next section.

6 Development of Algorithm for Artificial Neural Network based Short-term Load Forecasting

6.1 Introduction

An algorithm is developed to overcome the problem of the random selection of the training algorithm and of the number of neurons in the hidden layer, a problem faced by the traditional methods mentioned in the previous sections. The next two paragraphs discuss the need for concentrating on the training algorithm and the hidden layer size.

There are numerous training algorithms which are variants of the backpropagation algorithm, and each performs best for different functionalities. Picking a training algorithm at random is questionable, as some other algorithm may provide better results with the same dataset. Hence a systematic selection of the training algorithm is necessary to remove the ambiguity involved in the process.

The number of neurons in the hidden layer can also affect the performance of the network. A random choice will simply produce a result without any assurance that it is optimal. The number of neurons in the hidden layer should not be too small, as the network would then fail to generalize the non-linear relationship; a very large number of neurons will increase the computational time, making the process ineffective. The proposed algorithm is aimed at achieving optimal results for short-term load forecasting while keeping in mind the computational efficiency of the entire process.
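The selection idea described above, searching over training algorithms and hidden layer sizes by MAPE and then reusing the best pair, can be sketched as follows. This Python sketch stands in for the MATLAB implementation: the evaluate function is a placeholder for "train and simulate the network, then compute the MAPE", and the toy version below is hard-wired for illustration only:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 / len(actual) * sum(
        abs(a - p) / a for a, p in zip(actual, predicted))

def select_best(train_fcns, n_inputs, evaluate):
    """First step: grid search over training algorithms and hidden layer
    sizes (n_inputs up to 5*n_inputs), returning the combination with the
    minimum MAPE, i.e. the best training function (BTF) and neuron count N."""
    best_fcn, best_n, best_err = None, None, float("inf")
    for fcn in train_fcns:
        for neurons in range(n_inputs, 5 * n_inputs + 1):
            err = evaluate(fcn, neurons)   # train + simulate, then MAPE
            if err < best_err:
                best_fcn, best_n, best_err = fcn, neurons, err
    return best_fcn, best_n, best_err

# Toy stand-in for training and simulation: pretends trainrp with 22
# hidden neurons yields the lowest error, echoing the reported result.
def evaluate(fcn, neurons):
    return 3.1889 if (fcn, neurons) == ("trainrp", 22) else 4.0

btf, n, err = select_best(["traingd", "trainrp", "trainlm"], 8, evaluate)
# Second step: set TrainFcn = btf and Neurons = n, then train and simulate.
```

The second step then performs a single final training and simulation run with the selected pair.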
These two attributes of the network are considered in the algorithm to improve the performance of the neural network. The development of the algorithm is a two-step process. The first step involves training and simulating the network for the various training algorithms and neuron counts, which gives the performance of each training algorithm at different hidden layer sizes for the same dataset. The second step uses the selected training algorithm and the optimal number of neurons to obtain the short-term load forecast.

6.2 Algorithm

Step 1: Start
Step 2: Enter the number of input variables "a"
Step 3: Initialize i = 1
Step 4: Read the data of the input variable
Step 5: If i = a, go to step 6; else increment i by one and go to step 4
Step 6: Initialize "s" as the number of values in each variable
Step 7: Initialize i = 1
Step 8: Initialize j = 1
Step 9: Calculate the change in demand using the formula
        ΔP_d = (P_(d-1) - P_(d-2)) / P_(d-2)
Step 10: Normalize ΔP_d between 0.2 and 0.8 using the formula
        X_n = (Y_max - Y_min) * (X - X_min) / (X_max - X_min) + Y_min
Step 11: If j = s, go to step 12; else increment j by 1 and go to step 9
Step 12: Increment i by 1
Step 13: Initialize j = 1
Step 14: Calculate the change in the input variable using the formula
        ΔX_d = X_d - X_(d-1)
Step 15: Normalize ΔX_d between 0.2 and 0.8 using the formula
        X_n = (Y_max - Y_min) * (X - X_min) / (X_max - X_min) + Y_min
Step 16: If j = s, go to the next step; else increment j by 1 and go to step 14
Step 17: If i = a, go to the next step; else increment i by 1 and go to step 13
Step 18: Set layer{1}.size = a
Step 19: Set TrainFcn = 1
Step 20: Create the network and initialize weights and biases randomly
Step 21: Initialize n = 1
Step 22: Train and simulate the network
Step 23: Calculate the Mean Absolute Percentage Error (MAPE) using the formula
        MAPE = (1/N) * Σ |Actual - Predicted| / Actual * 100 %
Step 24: If n = 50, go to step 26; else go to the next step
Step 25: If n = 5*a, go to the next step; else increment n by 1 and go to step 22
Step 26: Store the MAPE calculated
Step 27: If TrainFcn = 13, go to the next step; else increment TrainFcn by 1 and go to step 20
Step 28: Display BTF (the training algorithm corresponding to the minimum MAPE) and N (the number of neurons corresponding to the minimum MAPE)
Step 29: Set TrainFcn = BTF
Step 30: Set Neurons = N
Step 31: Train and simulate the network
Step 32: Print the MAPE and the predicted values
Step 33: Stop

6.3 Flowchart

[Flowchart of the proposed algorithm, following steps 1 to 33 above: data entry and preprocessing, normalization, the nested loop that trains and simulates the network for each of the 13 training algorithms and each hidden layer size while storing the MAPE, the display of BTF and N, and the final training and simulation run.]

5 Test Case and Analysis of Results

5.1 Introduction

The implementation of the
proposed algorithm is based on the peak demand data obtained from the Southern Regional Load Dispatch Centre (SRLDC) and the weather data obtained from Wunderground. The data used and the analysis of the results obtained from the implementation of the neural network are discussed in detail in this chapter.

5.2 Test Data

For implementing load forecasting using an artificial neural network, a sufficient amount of data that has some relationship with the final output (the load demand, in this case) is needed for training the neural network. The peak load demand data for the state of Tamil Nadu was obtained from the Southern Regional Load Dispatch Centre (SRLDC) for the years 2010 and 2011. Since more inputs can improve the performance of the neural network, weather data that included temperature, humidity, dew point and wind speed was obtained from www.wunderground.com. The peak demand data consisted of one peak demand value per day. The weather data included the maximum temperature, minimum temperature, maximum humidity, minimum humidity, dew point and wind speed.

5.3 Treatment of Data

The data to be fed into the neural network is first preprocessed. The peak load demand is preprocessed using the formula

ΔP_d = (P_(d-1) - P_(d-2)) / P_(d-2)

The other data, which includes the previous week's peak demand for the same day (calculated from the peak demand itself) and the weather information for the day (maximum temperature, minimum temperature, maximum humidity, minimum humidity, dew point and wind speed), is preprocessed using the formula

ΔX_d = X_d - X_(d-1)

Once the preprocessing of the data is done, it should be normalized between certain limits before being fed into the neural network.
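A minimal Python sketch of this treatment of data follows (illustrative only: the index convention for the demand change is our reading of the formula, the normalization limits are the 0.2 to 0.8 range used in this project, and the sample values are hypothetical):

```python
def demand_change(peaks):
    """Relative change in peak demand: dP_d = (P_(d-1) - P_(d-2)) / P_(d-2)."""
    return [(peaks[d - 1] - peaks[d - 2]) / peaks[d - 2]
            for d in range(2, len(peaks))]

def variable_change(values):
    """First difference of any other input: dX_d = X_d - X_(d-1)."""
    return [values[d] - values[d - 1] for d in range(1, len(values))]

def normalize(x, y_min=0.2, y_max=0.8):
    """mapminmax-style scaling:
    X_n = (Y_max - Y_min) * (X - X_min) / (X_max - X_min) + Y_min."""
    x_min, x_max = min(x), max(x)
    return [(y_max - y_min) * (v - x_min) / (x_max - x_min) + y_min
            for v in x]

peaks = [10500.0, 10800.0, 10650.0, 11000.0]     # hypothetical MW values
dp = demand_change(peaks)                         # two usable samples
dt = variable_change([31.0, 33.0, 30.0, 32.0])    # e.g. max temperature
norm = normalize(dt)                              # scaled into [0.2, 0.8]
```

After this step every input series lies in the same [0.2, 0.8] band, which keeps the hidden-layer activation function away from its saturated regions.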
For normalization, the mapminmax function is used, which implements the formula

X_n = (Y_max - Y_min) * (X - X_min) / (X_max - X_min) + Y_min

Saturation of the input variables can be avoided by choosing normalization limits appropriate to the activation functions used. The activation function used on the input side is the logsig transfer function and the one used on the output side is the purelin transfer function. These activation functions have been explained in the previous chapters.

5.4 Training and Simulation of the Network

After the treatment of the data, it is fed into the neural network. The network is trained with the assigned training algorithm and number of hidden neurons, and is simulated immediately afterwards. The process is repeated for all training algorithms as the size of the hidden layer varies from i to 5*i, where i is the number of input variables. The Mean Absolute Percentage Error (MAPE) is calculated at each simulation and the results are stored. The training algorithm corresponding to the minimum MAPE is selected as the best training function (BTF), and the corresponding number of neurons is selected as the optimal number of neurons. Training and simulation are then performed with these new parameters.

[Fig 7.1: Implementation of the Artificial Neural Network. The pre-processing stage feeds the peak demand, previous week demand, maximum and minimum temperature, maximum and minimum humidity, dew point and wind speed into the network, whose output passes through a post-processing stage to give the load demand.]

5.5 Results

The Mean Absolute Percentage Error is taken as the measure of accuracy, and hence a tabulation of the MAPE for all the training algorithms for neurons ranging from 8 to 40 is shown in Table 7.1(a) and (b). The network performs differently for each training algorithm and arrives at different results.
The variation of the MAPE for each training algorithm is shown graphically in Figure 7.2. Of the 13 algorithms implemented in the neural network, Resilient Backpropagation (trainrp) produced the best results for this particular dataset. The results obtained are shown below.

Training algorithm: Resilient Backpropagation
Number of neurons: 22
MAPE: 3.1889

Neurons  traingd  traingda  traingdm  traingdx  traincgb  traincgp
8        3.5566   3.6142    3.6446    3.6542    4.2308    3.5538
9        3.6765   3.6773    4.9320    3.6950    3.7074    3.6926
10       3.3942   3.4636    3.5834    3.4933    3.4077    3.3898
11       3.7281   3.9953    4.4414    3.7557    3.7298    3.7357
12       3.6878   3.7062    4.3873    3.7035    3.7120    3.6050
13       3.6011   3.6626    3.6806    3.6246    3.4646    3.6141
14       3.7652   3.8461    4.1154    3.7971    3.8231    3.7647
15       4.1631   4.6565    4.5875    4.4307    4.6440    3.6785
16       3.8235   3.8845    4.1297    3.8503    3.8665    3.8031
17       4.3082   4.7334    5.7101    4.3180    4.3517    3.8279
18       4.0229   4.1954    4.8219    4.0568    3.7689    3.5565
19       4.3741   4.7705    4.7745    4.4445    3.6634    4.1703
20       3.7945   3.9384    5.7914    3.8190    3.6428    3.7795
21       4.1720   3.9922    4.0701    3.9848    4.2744    3.8421
22       4.1954   4.1806    5.2781    4.1830    3.7481    3.5589
23       4.8444   4.6714    5.2312    4.6697    3.6607    4.7133
24       4.4193   4.0606    4.2972    3.9868    3.9832    3.5944
25       4.9920   4.2516    4.6786    4.2094    4.5154    3.8038
26       3.9280   4.2385    4.4956    3.9376    3.8658    3.8714
27       4.5996   4.0278    5.9912    3.9121    3.8988    3.8919
28       4.8294   4.1418    4.6231    3.8996    4.4652    3.9469
29       5.1908   4.7105    5.1845    4.4469    4.3724    4.3734
30       4.9180   4.8041    5.3129    4.3819    3.7792    4.0353
31       5.2324   4.6641    4.6601    4.2932    3.9074    3.7326
32       4.8248   4.6545    4.5299    4.4363    3.9348    5.1736
33       4.6442   3.9915    5.5154    3.9617    4.6660    3.9471
34       4.4819   5.3709    5.8031    4.4482    4.4374    4.0862
35       4.8206   4.3382    4.4090    4.0333    3.7811    3.8015
36       5.9694   5.0785    4.9501    4.8601    5.0868    4.9670
37       5.1436   4.5973    5.2020    4.3688    3.6085    4.3352
38       4.6616   4.4122    4.6985    4.2381    3.8670    5.1917
39       4.9155   4.3986    4.9893    4.4027    4.6731    3.7218
40       5.5223   4.5349    4.6636    4.4342    3.7643    4.3519

Table 7.1(a) Variation of MAPE with the number of neurons

Neurons  traincgf  trainscg  trainbfg  trainoss  trainrp  trainbr  trainlm
8        4.7305    3.5883    3.4412    3.5584    3.5222   3.3565   3.4780
9        3.9929    3.7028    3.7031    3.6962    3.6777   3.4012   3.5798
10       3.3918    3.3974    3.6649    3.3967    3.6146   3.3516   3.5709
11       3.8604    3.7508    3.7455    3.7815    3.5758   3.3224   3.5955
12       4.0028    3.6876    3.6912    3.6802    3.5802   3.2661   3.5821
13       3.8865    3.6096    3.5603    3.6519    3.4322   3.3417   3.5565
14       3.7632    3.8586    3.9183    3.7674    3.5679   3.2754   3.4304
15       5.5560    4.4004    3.6911    3.8419    3.6346   3.3654   3.7166
16       3.7545    3.8280    3.7086    3.8071    4.0016   3.3028   3.3985
17       3.8708    4.3831    3.9054    4.1713    3.9973   3.3168   3.6861
18       4.0418    3.8278    3.7321    3.6377    3.5565   3.2964   3.4083
19       4.1743    4.4441    3.5869    3.6464    3.5954   3.3501   3.5942
20       3.7802    3.7764    3.7031    3.7956    4.0947   3.3865   3.4476
21       3.8287    3.8554    3.7748    3.9916    4.2762   3.3504   3.6972
22       3.6790    3.9917    3.6842    3.6223    3.7509   3.1889   3.6890
23       4.6951    4.6795    3.8679    4.4104    3.6749   3.3988   3.7893
24       3.9901    3.9931    3.6208    3.9717    3.4682   3.3668   3.4117
25       3.4732    4.1719    3.6072    4.2780    4.2328   3.3750   4.1287
26       3.8884    3.8687    3.9019    3.8548    3.9721   3.2984   3.7376
27       3.8959    3.8949    3.7717    3.8754    3.4286   3.5958   3.3967
28       4.0499    3.8996    3.7976    3.8311    3.6527   3.3411   3.9200
29       4.3707    4.3106    3.9719    4.3626    3.7276   3.3721   3.6024
30       4.5084    4.1593    3.8026    3.8609    4.0198   3.3363   3.9529
31       5.3186    3.8768    3.7899    3.8667    3.5337   3.3318   3.4641
32       4.4387    4.4367    4.3636    4.4299    3.5577   3.3407   3.5315
33       3.9484    4.0093    3.6347    4.1600    3.8811   3.4154   3.8207
34       3.4397    4.3936    3.6877    4.5409    4.2813   3.3603   4.0805
35       4.3266    4.0263    3.7373    3.8333    3.8174   3.2435   3.7092
36       5.5198    4.9339    3.7556    4.1364    4.1922   3.3147   4.0773
37       4.3309    4.3346    3.7799    3.5802    3.6243   3.2823   3.6581
38       5.6628    3.6954    3.6069    4.1825    4.5596   3.4827   3.5465
39       3.7005    4.3383    3.8234    4.3441    3.9268   3.3728   4.4555
40       4.4426    4.3850    4.0083    3.9271    3.8938   3.4401   3.9163

Table 7.1(b) Variation of MAPE with the number of neurons

[Fig. 7.2: Variation of the MAPE (%) with the number of neurons (8 to 40) for all 13 training algorithms.]

[Fig. 7.3: Actual demand vs. predicted demand.]

5.6 Conclusion

The implementation of the neural network for short-term load forecasting was carried out with 13 training algorithms, and the results show that, for this particular dataset, trainrp proves to be the best training algorithm, providing optimal results with better computational efficiency.
