RESEARCH REPORT
Statistical Reporting Service
U.S. Department of Agriculture

A COMPARISON OF SEVERAL REGRESSION MODELS FOR FORECASTING PECAN YIELDS

by Chapman P. Gleason
Research Division
Research and Development Branch
NOVEMBER 1974

CONTENTS

SUMMARY
INTRODUCTION
DATA COLLECTION PROCEDURES
    Block Selection
    Sample Tree Selection
    Sample Limb Selection
    Sample Limb Counts
    Photography Procedures
    Counts of Nuts from Photographs
    Nut Droppage Prior to Harvest
    Harvest Data
DATA EXPANSIONS
    Limb Expansions
    Photography Expansions
    Drop Expansions
RESULTS
    General
    Correlation Coefficients
    Analysis of Regression Models
DISCUSSION OF RESULTS
CONCLUSIONS
RECOMMENDATIONS
REFERENCES

SUMMARY

Several different regression models are compared to determine which are best for forecasting average yield per tree. Criteria are proposed to determine which variables and ultimately which regression models are better than others. Using the proposed criteria, a simple linear regression model using the number of nuts counted on photographs was found to be "best". A new method of expanding the number of nuts counted on photographs to the tree level is also presented. Recommendations are made for further research. The study was based upon data collected in 1972 in central and southern Mississippi.

INTRODUCTION

Research studies have shown that limb sampling and photographic data collection procedures are promising methods of providing data with which to forecast the average yield (number or weight of nuts) per tree. Two different approaches to forecasting the average yield per tree were proposed in 1971. The first involved using two simple linear regressions: yield versus the number of nuts on sample limbs, and yield versus the number of nuts on photographs. The second approach to the forecasting problem involves a multiple regression: yield versus the number of nuts on limbs and photographs. The research was aimed at answering two questions:

1. Which of the above approaches is better?
2. If simple linear regression is as good as multiple regression, which regression gives the best estimate of average yield per tree?

From a cost standpoint, one variable may be easier and cheaper to collect and provide more precise forecasts. Tests of statistical hypotheses will be formulated to answer the first question. Standard errors, R², and C.V.'s will be compared to answer the second.

DATA COLLECTION PROCEDURES

Block Selection: Five blocks of Stuart variety pecans were subjectively selected in two separate geographic areas of Mississippi.
Three blocks were located in central Mississippi (Hinds County), all managed by one operator. Two additional blocks were located in the southwestern corner of the State (Wilkinson County), each managed by different individuals.

Sample Tree Selection: For each of the three blocks in Hinds County, the trees used in a 1971 research project were used again (Wood (8)). In Wilkinson County it was necessary to select four trees in each of the newly selected blocks. A two-stage procedure was used to select the trees.

1. Two rows were randomly selected with equal probability of selection for each row.
2. Within each selected row, two trees were randomly selected with equal probability.

In this approach, if rows are of varying lengths, trees in short rows have a greater probability of selection than those in long rows.

Sample Limb Selection: For each selected tree, the total number of accessible limbs (reachable by a six-foot ladder) was enumerated and a 50 percent simple random sample with equal probabilities of selection was taken. Sample limbs were defined as those with cross-sectional area between 1.8 and 5.5 square inches. For each tree, the total number of sample limbs (both accessible and inaccessible) was estimated using either bare tree mappings of limbs or bare tree stereo photographs. N_i will denote the total estimated number of sample limbs for the i-th tree. The trees in Hinds County had stereo photographs taken in early April 1971. (Huddleston (4) describes the uses of photography to estimate the total number of sample limbs.) Bare tree mappings of limbs were made and used to estimate the total number of sample limbs for the trees selected in Wilkinson County.

Sample Limb Counts: For each tree, once the sample limbs were selected, all nuts on the limb were counted by tagging each cluster of fruit and counting the number of nuts in each tagged cluster. This prevented counting errors and gave an indication of monthly fruit droppage from the clusters.
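The remark under Sample Tree Selection, that trees in short rows are more likely to be chosen, follows directly from multiplying the two stage probabilities. The Python sketch below illustrates this under the stated design (two rows per block, two trees per selected row). The function names and the example row lengths are hypothetical and are not taken from the study.

```python
import random


def tree_selection_probability(row_lengths, row_index, rows_per_block=2, trees_per_row=2):
    """Chance that one particular tree in row `row_index` is selected.

    Stage 1: `rows_per_block` rows are drawn with equal probability.
    Stage 2: `trees_per_row` trees are drawn with equal probability
    within each selected row.
    """
    p_row = min(1.0, rows_per_block / len(row_lengths))
    p_tree_given_row = min(1.0, trees_per_row / row_lengths[row_index])
    return p_row * p_tree_given_row


def select_trees(row_lengths, rows_per_block=2, trees_per_row=2, rng=None):
    """Draw one two-stage sample; returns (row, tree) index pairs.

    Assumes every row holds at least `trees_per_row` trees.
    """
    rng = rng or random.Random()
    rows = rng.sample(range(len(row_lengths)), rows_per_block)
    return [(r, t)
            for r in sorted(rows)
            for t in sorted(rng.sample(range(row_lengths[r]), trees_per_row))]


# Hypothetical block with rows of 5, 10 and 20 trees.
example_rows = [5, 10, 20]
p_short = tree_selection_probability(example_rows, 0)  # (2/3) * (2/5)
p_long = tree_selection_probability(example_rows, 2)   # (2/3) * (2/20)
```

With these hypothetical row lengths a tree in the 5-tree row is four times as likely to be selected as a tree in the 20-tree row, which is the unequal-probability effect the text describes.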
Photography Procedures: To avoid the poor quality photography that plagued the research efforts in 1970 and 1971, the following techniques were used:

1. The tripod which held the camera was located 50 feet from the base of the tree with the sun at the back of the photographer.
2. A florist stake was placed directly below the tripod.
3. The metal photography frame was placed two feet in front of the camera lens.
4. The angle of the camera from the tree was recorded.

The photographs were taken up a vertical column of the tree through a metal frame. A Miranda Sensorex camera with an in-lens light meter improved the photographs significantly over those taken with a camera which has no in-lens light meter.

Counts of Nuts from Photographs: Each slide was projected on a grid. The number of nuts in each cell was counted by a photo interpreter. A certain subset of the slides was recounted for computation of photo adjustment factors. (See Wood (7, p.19) for a discussion of methods used to compute photo adjustment factors.)

Nut Droppage Prior to Harvest: On the first photography visit, two square plots, each two feet by two feet, were randomly located on the ground beneath the canopy of each tree. The identified area was then gleaned for nuts. On each subsequent field visit, the amount of droppage (number of nuts) in the plot was counted and removed.

Harvest Data: At harvest, each tree was shaken and all "good" nuts were collected. The nuts that remained on the ground were deemed "bad". Each tree was visited three times to collect harvest data. Three one-pound samples of nuts were selected from each collection of nuts for a tree. However, it was apparent that for several of the trees these collections of nuts were mixed due to classification errors by the laborers who harvested the trees. For this reason, and the fact that a good nut cannot be distinguished from a bad nut on a photograph, the biological yield was used as the dependent variable in the analysis that follows. The term LBNUTS will denote the collection of all nuts. The total harvest data for each tree are given in Table 1.

Table 1 - Harvest Data, Mississippi Pecans, 1972

DATA EXPANSIONS

Limb Expansions: The expanded number of nuts from sample limbs (NNSL) for each tree was computed as follows:

$$ NNSL = \frac{N_i}{n_i} \sum_{j=1}^{n_i} X_{ij} \tag{1} $$

where for the i-th tree, N_i is the estimated total number of sample limbs, n_i is the number of sample limbs selected, and X_ij is the total number of fruit counted on the j-th selected sample limb. It is noted that we are sampling only those limbs which are accessible, whereas (1) is an expansion to the total tree based on the fallacious assumption that each of the N_i sample limbs had a non-zero chance of selection. Horticulturists contend that the lower limbs produce fewer nuts than the higher limbs. Hence, an under-estimate of the total number of fruit will be realized from (1).

Photography Expansions: The counts of nuts using ground photographs were expanded to a tree level by two methods. The first expansion assumed that the shape of a tree is a sphere. (Wood (7, p.20) gives a discussion of the method using the sphere assumption.) The independent variable using this assumption is NNPS (number of nuts from photographs, sphere assumption). The second expansion to a tree level assumes that the shape of the tree is a paraboloid. For this expansion two parameters for every tree must be estimated, the height (h) and the radius (r). It can be proved (see Stout (5)) that the estimated bearing surface area of the tree, assuming the tree is shaped as a paraboloid, is:

$$ SAP = \frac{\pi r}{6h^2} \left[ (r^2 + 4h^2)^{3/2} - r^3 \right] $$

Thus, the number of nuts counted on photographs using the paraboloid assumption (NNPP) is,

$$ NNPP = \frac{SAP}{TAMF} \left( \sum_{j=1}^{n_i} X_{ij} \right) \tag{2} $$

where n_i is the number of photographs taken on the i-th tree, X_ij is the number of nuts on the j-th photograph, and TAMF is the total area of the middle frame. (See Wood (7, p.22).)

The numbers of nuts counted on photographs were adjusted to reflect the fact that each photo interpreter counts a different number of nuts for any given slide. To minimize interpreter differences and to measure such deviation in counting from the "norm", a balanced incomplete block design was used in counting the slides to estimate interpreter differences. (Wood (7, p.19) gives a discussion of methods used for interpreter adjustment factors.) The count of fruit on each slide was adjusted by multiplying the interpreter adjustment factor times the number of fruit counted. When two interpreters counted the same slide these adjusted counts were averaged.

The radius (r) was estimated by averaging the longest and the shortest distance from the trunk to the edge of the tree canopy. The height (h) was roughly estimated by using the number of photographs taken, n_i, and knowing the distance from the trunk to the camera.

As with limb counts, these methods of expansion to the tree level will under-estimate the true number of nuts on the tree since all nuts do not grow on the periphery of the tree. However, since flower buds develop on new growth that tends to occur on the periphery of the tree, most of the fruit is produced near the surface.

Drop Expansion: The nut droppage from the i-th tree was estimated as follows:

$$ DROP = \frac{\pi r^2}{8} \left( \sum_{j=1}^{2} X_{ij} \right) \tag{3} $$

where r is the estimated radius, and X_ij is the number of nuts in the j-th drop count unit for the i-th tree. Observe that πr² is the area of a circle and 8 is the total area (in square feet) sampled using both 2' x 2' drop units, so the ratio πr²/8 is an area expansion factor.
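The three expansions in equations (1) through (3) are simple enough to write out as code. The following Python sketch is illustrative only: the function and argument names are hypothetical, all lengths are assumed to be in feet and areas in square feet, and the paraboloid surface area uses the standard closed form for a paraboloid of revolution, which is assumed here to be the formula the report intends.

```python
import math


def nnsl(total_sample_limbs, limb_counts):
    """Equation (1): (N_i / n_i) times the nuts counted on the n_i selected limbs."""
    n_i = len(limb_counts)
    if n_i == 0:
        raise ValueError("tree has no selected sample limbs")
    return total_sample_limbs / n_i * sum(limb_counts)


def paraboloid_surface_area(r, h):
    """Bearing surface (SAP) of a paraboloid with base radius r and height h."""
    return math.pi * r / (6.0 * h ** 2) * ((r ** 2 + 4.0 * h ** 2) ** 1.5 - r ** 3)


def nnpp(r, h, photo_counts, total_frame_area):
    """Equation (2): (SAP / TAMF) times the nuts counted on all photographs."""
    return paraboloid_surface_area(r, h) / total_frame_area * sum(photo_counts)


def drop(r, plot_counts, sampled_area=8.0):
    """Equation (3): canopy circle area over sampled plot area, times nuts found.

    The default of 8 square feet corresponds to two 2 ft x 2 ft plots.
    """
    return math.pi * r ** 2 / sampled_area * sum(plot_counts)
```

For instance, a tree with an estimated 40 sample limbs, of which 4 were counted with 10, 14, 6 and 10 nuts, expands to `nnsl(40, [10, 14, 6, 10])`, that is 40/4 times 40. These inputs are invented for illustration.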
RESULTS

General: Previous investigations (by Wood (7,8)) found both NNSL and NNPS to be significantly correlated with the estimated number of good nuts at harvest. In this investigation the biological yield, or total weight of harvested nuts (LBNUTS), was the dependent variable. In addition to the reasons mentioned previously, the number of good nuts is a variable that is influenced by marketing and other economic conditions which were uncontrolled or immeasurable in the research project.

Two data sets were used in the analysis. The first were unadjusted counts from color transparencies; the second were counts adjusted for interpreter differences.

Correlation Coefficients: Tables 2 through 7 give the product moment correlation coefficient, its significance probability, and the number of observations for each pairwise combination of variables. The significance probability of a correlation coefficient is the probability that a larger (in absolute value) correlation coefficient should arise by chance if the true population correlation parameter ρ=0. The pairwise correlation was computed based on the assumption that the random variables have bivariate normal distributions.

Table 2: Correlation Matrix, unadjusted photography data, Mississippi Pecans, July 1972
Correlation Coefficients / Prob > |R| under Ho: ρ=0 / number of observations

          LBNUTS                NNPS                  NNPP                  LIMB 1/
LBNUTS    1.000000/0.0000/20    0.589406/0.0063/20    0.692073/0.0010/20    0.449205/0.0512/19
NNPS                            1.000000/0.0000/20    0.973957/0.0001/20    0.853912/0.0001/19
NNPP                                                  1.000000/0.0000/20    0.814804/0.0001/19
LIMB                                                                        1.000000/0.0000/19

Table 3: Correlation Matrix, unadjusted photography data, Mississippi Pecans, August 1972
Correlation Coefficients / Prob > |R| under Ho: ρ=0 / number of observations

          LBNUTS                NNPS                  NNPP                  LIMB 1/               DROP
LBNUTS    1.000000/0.0000/20    0.835472/0.0001/20    0.909419/0.0000/20    0.439499/0.0571/19    -0.022093/0.9235/20
NNPS                            1.000000/0.0000/20    0.970281/0.0001/20    0.565588/0.0112/19    -0.038837/0.8651/20
NNPP                                                  1.000000/0.0000/20    0.473863/0.0384/19    -0.051213/0.8245/20
LIMB                                                                        1.000000/0.0000/19    0.392730/0.0931/19
DROP                                                                                              1.000000/0.0000/20

1/ There were no accessible sample limbs on tree F4.

Table 4: Correlation Matrix, unadjusted photography data, Mississippi Pecans, September 1972
Correlation Coefficients / Prob > |R| under Ho: ρ=0 / number of observations

          LBNUTS                NNPS                  NNPP                  LIMB 1/               DROP
LBNUTS    1.000000/0.0000/20    0.750568/0.0003/20    0.821828/0.0001/20    0.427667/0.0650/19    -0.066414/0.7771/20
NNPS                            1.000000/0.0000/20    0.977678/0.0001/20    0.638995/0.0035/19    -0.093471/0.6967/20
NNPP                                                  1.000000/0.0000/20    0.568388/0.0108/19    -0.142778/0.5545/20
LIMB                                                                        1.000000/0.0000/19    0.335533/0.1573/19
DROP                                                                                              1.000000/0.0000/20

Table 5: Correlation Matrix, adjusted photography data, Mississippi Pecans, July 1972
Correlation Coefficients / Prob > |R| under Ho: ρ=0 / number of observations

          LBNUTS                NNPS                  NNPP                  LIMB 1/
LBNUTS    1.000000/0.0000/20    0.71448/0.0001/20     0.805644/0.0001/20    0.449205/0.0512/19
NNPS                            1.000000/0.0000/20    0.972430/0.0001/20    0.745579/0.0004/19
NNPP                                                  1.000000/0.0000/20    0.664007/0.0022/19
LIMB                                                                        1.000000/0.0000/19

1/ There were no accessible sample limbs on tree F4.

Table 6: Correlation Matrix, adjusted photography data, Mississippi Pecans, August 1972
Correlation Coefficients / Prob > |R| under Ho: ρ=0 / number of observations

          LBNUTS                NNPS                  NNPP                  LIMB 1/               DROP
LBNUTS    1.000000/0.0000/20    0.899645/0.0001/20    0.914822/0.0001/20    0.439499/0.0571/19    -0.022093/0.9235/20
NNPS                            1.000000/0.0000/20    0.971275/0.0001/20    0.462843/0.0438/19    -0.057109/0.8058/20
NNPP                                                  1.000000/0.0000/20    0.332710/0.1611/19    -0.069774/0.7669/20
LIMB                                                                        1.000000/0.0000/19    0.392730/0.0931/19
DROP                                                                                              1.000000/0.0000/20

Table 7: Correlation Matrix, adjusted photography data, Mississippi Pecans, September 1972
Correlation Coefficients / Prob > |R| under Ho: ρ=0 / number of observations

          LBNUTS                NNPS                  NNPP                  LIMB 1/               DROP
LBNUTS    1.000000/0.0000/20    0.846902/0.0001/20    0.874279/0.0001/20    0.427667/0.0650/19    -0.066414/0.7777/20
NNPS                            1.000000/0.0000/20    0.973825/0.0001/20    0.538335/0.0166/19    -0.108465/0.6530/20
NNPP                                                  1.000000/0.0000/20    0.418659/0.0715/19    -0.163419/0.5024/20
LIMB                                                                        1.000000/0.0000/19    0.335533/0.1573/19
DROP                                                                                              1.000000/0.0000/20

1/ There were no accessible sample limbs on tree F4.
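The significance probability defined under Correlation Coefficients can be computed directly from r and the number of observations. The Python sketch below is one way to do so under the bivariate normal assumption: it integrates the exact null density of r numerically with Simpson's rule. The function names are hypothetical, and the report's tabulated probabilities came from the SAS program, so values computed this way need not agree with them to the last digit.

```python
import math


def pearson_r(x, y):
    """Product moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def significance_probability(r, n, steps=2000):
    """P(|R| > |r|) when rho = 0, for n bivariate normal observations (n >= 4).

    Under rho = 0 the density of R on [-1, 1] is proportional to
    (1 - x^2)^((n - 4) / 2); the normalising constant is 1 / B(1/2, (n - 2) / 2).
    """
    if n < 4:
        raise ValueError("need at least 4 observations")
    power = (n - 4) / 2.0
    norm = math.exp(math.lgamma((n - 1) / 2.0)
                    - math.lgamma(0.5) - math.lgamma((n - 2) / 2.0))

    def density_kernel(x):
        return max(0.0, 1.0 - x * x) ** power

    lower, upper = abs(r), 1.0
    width = (upper - lower) / steps  # steps is even, as Simpson's rule requires
    total = density_kernel(lower) + density_kernel(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density_kernel(lower + i * width)
    upper_tail = norm * total * width / 3.0
    return min(1.0, 2.0 * upper_tail)
```

This probability is the same as the two-sided p-value of a t-test with n-2 degrees of freedom on t = r·sqrt(n-2)/sqrt(1-r²), which is why a pairwise correlation and the corresponding simple regression F-test lead to the same conclusion.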
Analysis of Regression Models: Consider the linear regression model:

$$ Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon \tag{1} $$

In the classical linear regression, Y is an observable random variable, the X_i's are fixed observable independent variables, and the error term is an unobserved random disturbance. However, in our situation the regressor variables are observable stochastic variables which are assumed to be distributed independently of the disturbance. All the classical estimation procedures and tests are valid when this assumption can be justified. (Goldberger (3) discusses stochastic regression.)

In the analysis presented, the following models of the above form were considered:

(M1)  Y = β0 + β1(NNPS) + ε
(M2)  Y = β0 + β1(NNPP) + ε
(M3)  Y = β0 + β1(LIMB) + ε
(M4)  Y = β0 + β1(NNPS) + β2(LIMB) + ε
(M5)  Y = β0 + β1(NNPP) + β2(LIMB) + ε
(M6)  Y = β0 + β1(NNPS) + β2(LIMB) + β3(DROP) + ε
(M7)  Y = β0 + β1(NNPP) + β2(LIMB) + β3(DROP) + ε

Y is the dependent variable LBNUTS. Note that M6 and M7 have more terms than M1 and M2 because of the inclusion of two additional independent variables. The difference between M1 and M2 (or between M4 and M5) is just the different methods of expansion of the photography variable.

Several questions arise about the models M1 through M7. Are all the independent variables necessary in models M6 and M7? Which, if any, of the seven regression models is the "best" model for forecasting average weight of nuts per tree? These questions were considered and criteria formulated to answer them.

Seven criteria will be proposed to answer the above questions. The list of criteria is certainly not exhaustive but was chosen to evaluate certain desirable properties. The criteria are as follows:

(C1) The square of the multiple correlation coefficient, R². The R² value should increase by the inclusion of another or possibly several independent variables into the model. The larger the R² value, the better the model explains the variation in the data. A substantial increase in the R² value for any model over M1 (or M2) by including the LIMB variable into the regression would indicate that the LIMB variable is explaining some additional variation in the data.

(C2) The standard error of estimate, $s = \sqrt{s^2}$. The residual mean square $s^2$ estimates $\sigma^2_{Y \cdot X}$, the variance about regression. The smaller the value of s, the more precise will be the predictions.

(C3) The coefficient of variation, CV. The CV = s/Ȳ should decrease by the inclusion of another variable if increased precision is obtained by the inclusion of another variable.

(C4) The sequential F-test. This criterion assesses the contribution of another variable added to an equation in stages. In (1) let SS(b0,...,bk) be the sum of squares due to regression. Now for j=1,2,...,k let SS(bj | b0,b1,...,bj-1) be the sequential sum of squares for the j-th beta parameter. SS(bj | b0,b1,...,bj-1) is the difference between the sum of squares due to the regression of Y on X1,...,Xj and the sum of squares due to the regression of Y on X1,...,Xj-1. These are denoted by SS(b0,b1,...,bj) and SS(b0,b1,...,bj-1), respectively. The j-th sequential F-test, j=1,2,...,k, is:

$$ F(b_j \mid b_0, \ldots, b_{j-1}) = \frac{SS(b_j \mid b_0, \ldots, b_{j-1})}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))} $$

ESS(b0,b1,...,bk) is the residual SS of general model (1), and N is the number of units in the sample. The above F has 1 and N-(k+1) degrees of freedom. Note that,

$$ \sum_{j=1}^{k} SS(b_j \mid b_0, \ldots, b_{j-1}) = SS(b_0, \ldots, b_k). $$

Thus, the total sum of squares due to regression in the full model (1) is just partitioned into single degrees of freedom SS's. This criterion considers the order in which the variables enter into the model.

(C5) The partial F-test criterion. This criterion assesses the value of a variable as if it were to enter the regression equation last. The effect of Xj may be larger when the regression equation includes only Xj. However, when the same variable is entered into the equation after other variables, it may affect the response very little. The F-test is as follows. For j=1,...,k

$$ F(b_j \mid b_0, \ldots, b_{j-1}, b_{j+1}, \ldots, b_k) = \frac{SS(b_j \mid b_0, b_1, \ldots, b_{j-1}, b_{j+1}, \ldots, b_k)}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))} $$

where SS(bj | b0,...,bj-1,bj+1,...,bk) = SS(b0,...,bk) - SS(b0,...,bj-1,bj+1,...,bk), and SS(b0,...,bj-1,bj+1,...,bk) is the sum of squares due to the regression of Y on X1,X2,...,Xj-1,Xj+1,...,Xk, i.e. the regression on all variables except the j-th. This F has 1 and N-(k+1) degrees of freedom. It is noted that √F has the T distribution with N-(k+1) d.f., and this statistic is used to test if βj=0 in (1). Thus, the j-th partial F-test is equivalent to a T-test of βj=0.

(C6) The extra sums of squares criterion. This criterion assesses whether it was worthwhile to include certain terms in the general regression model (1). It is a joint test of the parameters βq+1,...,βk in (1). Consider the reduced model

$$ Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_q X_q + \varepsilon \tag{2} $$

where q<k, and let SS(b0,...,bq) denote the SS due to the regression (2). Then

$$ SS(b_{q+1}, \ldots, b_k \mid b_0, \ldots, b_q) = SS(b_0, \ldots, b_k) - SS(b_0, b_1, \ldots, b_q) $$

is the extra SS due to the inclusion of the terms βq+1Xq+1+...+βkXk into the model (1). Now, the sum of squares SS(b0,...,bk) has k d.f. and SS(b0,...,bq) has q d.f., thus SS(bq+1,...,bk | b0,...,bq) has k-q d.f. So if βq+1=βq+2=...=βk=0 then SS(bq+1,...,bk | b0,...,bq) is distributed as σ²χ² with k-q d.f., and is independent of ESS(b0,b1,...,bk). Hence,

$$ F(b_{q+1}, \ldots, b_k \mid b_0, \ldots, b_q) = \frac{SS(b_{q+1}, \ldots, b_k \mid b_0, \ldots, b_q)/(k-q)}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))} $$

has the F distribution with k-q and N-(k+1) d.f.

(C7) Significance of regression. This criterion determines whether the regression of Y on X1,...,Xk is significant. The test is

$$ F = \frac{SS(b_0, \ldots, b_k)/k}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))} $$

This is a test of the hypothesis H: β1=β2=...=βk=0, which is equivalent to testing that the true multiple correlation coefficient R is 0.

The seven criteria can be broken into two general areas. First, R², s, and CV are measures of how well the linear regression model fitted the data. The other four criteria are statistical tests on certain β parameters present in the regression model.
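The seven criteria can all be derived from regression sums of squares of nested models. The Python sketch below shows one way to compute them with ordinary least squares via the normal equations. It is not the SAS STEPWISE procedure used in the report; the function names are hypothetical, and the regressor columns are taken in the order supplied, so the sequential F values depend on that order.

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def regression_ss(y, columns):
    """Sum of squares due to regression of y on the columns (intercept included)."""
    if not columns:
        return 0.0
    n = len(y)
    ybar = sum(y) / n
    x = [[1.0] + [c[i] for c in columns] for i in range(n)]
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    beta = _solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    return sum((f - ybar) ** 2 for f in fitted)


def criteria(y, columns):
    """Criteria C1-C5 and C7 for the model with all `columns` (k regressors)."""
    n, k = len(y), len(columns)
    ybar = sum(y) / n
    total_ss = sum((v - ybar) ** 2 for v in y)
    full_ss = regression_ss(y, columns)
    mse = (total_ss - full_ss) / (n - (k + 1))  # residual mean square s^2
    s = mse ** 0.5
    sequential = [(regression_ss(y, columns[:j + 1]) - regression_ss(y, columns[:j])) / mse
                  for j in range(k)]
    partial = [(full_ss - regression_ss(y, columns[:j] + columns[j + 1:])) / mse
               for j in range(k)]
    return {"R2": full_ss / total_ss, "s": s, "CV%": 100.0 * s / ybar,
            "sequential_F": sequential, "partial_F": partial,
            "F_regression": (full_ss / k) / mse}


def extra_ss_f(y, columns, q):
    """Criterion C6: joint F-test that the coefficients of columns[q:] are zero."""
    n, k = len(y), len(columns)
    ybar = sum(y) / n
    total_ss = sum((v - ybar) ** 2 for v in y)
    full_ss = regression_ss(y, columns)
    mse = (total_ss - full_ss) / (n - (k + 1))
    return ((full_ss - regression_ss(y, columns[:q])) / (k - q)) / mse
```

Two identities from the text are visible in this formulation: the sequential F values for a model sum to k times the F for significance of regression, and the sequential and partial F values coincide for whichever variable is entered last.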
Tables 8 through 13 give the seven criteria to determine the "best" regression model. The analysis was done using the STEPWISE procedure of the Statistical Analysis System (1). This program deleted records with missing observations. Since tree F4 has no accessible sample limbs it was deleted in the analysis presented.

Table 8: Criteria to determine the "best" regression model, adjusted photography data, Mississippi Pecans, July 1972

                                  Sequential F-test           Partial F-test              F for
                                  for parameters 2/           for parameters 2/           significance
MODEL   R2      s        C.V.%    β1         β2      β3 1/    β1         β2      β3 1/    of regression
M1      0.630   38.780   63.8     28.97*                      28.97*                      28.97*
M2      0.741   32.486   53.5     48.52*                      48.52*                      48.52*
M3      0.202   56.977   93.8      4.30**                      4.30**                      4.30**
M4      0.676   37.414   61.6     31.13*     2.26             23.43*     2.26             16.70*
M5      0.767   31.716   52.2     50.90*     1.84             38.87*     1.84             26.37*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ Drop was not observed the first month.
2/ A blank indicates that the F-test is not applicable with this model.

Table 9: Criteria to determine the "best" regression model, adjusted photography data, Mississippi Pecans, August 1972

                                  Sequential F-test           Partial F-test              F-test for extra SS     F for
                                  for parameters 1/           for parameters 1/           criteria, M6 and M7     significance
MODEL   R2      s        C.V.%    β1         β2      β3       β1         β2      β3       Ho: β2=β3=0             of regression
M1      0.901   20.089   33.1     154.32*                     154.32*                                             154.32*
M2      0.869   23.065   38.0     112.96*                     112.96*                                             112.96*
M3      0.193   57.284   94.3       4.07**                      4.07**                                              4.07**
M4      0.901   20.707   34.1     145.25*    0.00             114.10*    0.00                                      72.62*
M5      0.888   21.999   36.2     124.17*    2.68              99.26*    2.68                                      68.43*
M6      0.904   20.997   34.6     141.27*    0.44    0.12     105.85*    0.56    0.12     0.281                    47.28*
M7      0.888   22.720   37.4     116.42*    2.52    0.00      88.21*    2.09    0.00     1.26                     39.65*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable with this model.

Table 10: Criteria to determine the "best" regression model, adjusted photography data, Mississippi Pecans, September 1972

                                  Sequential F-test           Partial F-test              F-test for extra SS     F for
                                  for parameters 1/           for parameters 1/           criteria, M6 and M7     significance
MODEL   R2      s        C.V.%    β1         β2      β3       β1         β2      β3       Ho: β2=β3=0             of regression
M1      0.818   27.200   44.8     76.45*                      76.45*                                              76.45*
M2      0.812   27.655   45.5     73.41*                      73.41*                                              73.41*
M3      0.183   57.647   94.9      3.81**                      3.81**                                              3.81**
M4      0.823   27.653   45.5     73.96*     0.45             57.87*     0.44                                     37.20*
M5      0.815   28.271   46.5     70.24*     0.27             54.68*     0.27                                     35.26*
M6      0.846   26.683   43.9     79.45*     0.94    1.73     61.75*     2.19    1.73     1.33                    27.37*
M7      0.829   28.050   46.2     71.35*     1.52    0.00     54.45*     1.25    0.00     0.76                    24.28*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable with this model.

Table 11: Criteria to determine the "best" regression model, unadjusted photography data, Mississippi Pecans, July 1972

                                  Sequential F-test           Partial F-test              F for
                                  for parameters 2/           for parameters 2/           significance
MODEL   R2      s        C.V.%    β1         β2      β3 1/    β1         β2      β3 1/    of regression
M1      0.421   48.527   79.9     12.36*                      12.36*                      12.36*
M2      0.518   44.258   72.9     18.30*                      18.30*                      18.30*
M3      0.202   56.977   93.8      4.30**                      4.30**                      4.30**
M4      0.462   48.235   79.4     12.51*     1.20              7.72**    1.21              6.86*
M5      0.575   42.878   70.6     19.49*     2.11             14.02*     2.11             10.80*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ Drop was not observed the first visit.
2/ A blank indicates that the F-test is not applicable with this model.

Table 12: Criteria to determine the "best" regression model, unadjusted photography data, Mississippi Pecans, August 1972

                                  Sequential F-test           Partial F-test              F-test for extra SS     F for
                                  for parameters 1/           for parameters 1/           criteria, M6 and M7     significance
MODEL   R2      s        C.V.%    β1         β2      β3       β1         β2      β3       Ho: β2=β3=0             of regression
M1      0.793   29.041   47.8      64.98*                      64.98*                                              64.98*
M2      0.874   22.663   37.3     117.62*                     117.62*                                             117.62*
M3      0.193   57.284   94.3       4.07**                      4.07**                                              4.07**
M4      0.799   29.496   48.6      62.99*    0.48              48.12*    0.48                                      31.73*
M5      0.874   23.358   38.5     110.71*    0.00              86.24*    0.00                                      55.36*
M6      0.806   29.906   49.2      61.27*    0.47    0.56      44.56*    0.94    0.56     0.546                    20.77*
M7      0.876   23.894   39.3     105.81*    0.20    0.09      78.32*    0.29    0.09     0.147                    35.37*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable with this model.

Table 13: Criteria to determine the "best" regression model, unadjusted photography data, Mississippi Pecans, September 1972

                                  Sequential F-test           Partial F-test              F-test for extra SS     F for
                                  for parameters 1/           for parameters 1/           criteria, M6 and M7     significance
MODEL   R2      s        C.V.%    β1         β2      β3       β1         β2      β3       Ho: β2=β3=0             of regression
M1      0.668   36.724   60.5     34.27*                      34.27*                                              34.27*
M2      0.749   31.915   52.6     50.88*                      50.88*                                              50.88*
M3      0.183   57.647   94.9      3.81**                      3.81**                                              3.81**
M4      0.688   36.978   60.9     33.80*     0.77             25.32*     0.77                                     17.28*
M5      0.756   32.492   53.5     49.09*     0.40             37.51*     0.40                                     24.74*
M6      0.714   36.328   59.8     35.02*     0.79    1.58     26.40*     2.02    1.58     1.19                    12.46*
M7      0.790   31.114   51.2     53.53*     1.00    1.88     41.44*     2.45    1.88     1.41                    18.81*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable under this particular model.

DISCUSSION OF RESULTS

Inspecting the correlation matrices (Tables 2-7) indicates that most of the variables are significantly correlated with yield (LBNUTS).
However, DROP is not significantly correlated with yield for any month, nor was DROP significantly correlated with any of the other independent variables. This confirms earlier findings of Wood (7, p.15; 6, p.12). However, in all the regression analysis presented DROP was included, since even if the correlation between two variables is small it may influence the multiple correlation coefficient (R) a great deal when several variables are in a regression model simultaneously. In this particular case, it was not true that DROP had a substantial effect on the coefficient of determination, which is the square of the multiple correlation coefficient. This can be seen by comparing Models M4 and M6 and Models M5 and M7 in Tables 8 through 13. Based on these findings and previous results (by Wood (6,7)) drop counts should not be included in any further work in developing a pecan forecast model.

There also appears to be a stronger relationship between final harvested weight of nuts and adjusted photography than with the unadjusted photography. This indicates that interpreter adjustment factors are necessary. Also, the sample correlation coefficient of the photo variates with LBNUTS is always greater than that of the limb count variable LIMB. In general, the photo count variable NNPP has a larger sample correlation coefficient than does the photo count variable NNPS. This could possibly be attributed to a more precise estimate of the bearing surface of a tree by assuming the tree is a paraboloid rather than a sphere.

Tables 8 through 13 show that:

1. Each regression is significant at the .01 level.
2. The F-test (where applicable) in the extra SS criterion is insignificant. Thus, β2 and β3 are simultaneously zero in Models M6 and M7.
3. The partial F-test indicates that the LIMB and DROP variables contributed very little when they were included in the last stage. However, the contribution of the photo count variables is important even when the LIMB and/or DROP variables are introduced in the equation first. This is indicated by the significant partial F of the β1 parameter.
4. Once the photo variable was in the model the contribution of additional variables was not significant. This is indicated by the sequential F-test.
5. Comparing Models M1 through M3 indicates that in each case M1 and M2 have larger R²'s and smaller standard errors than Model M3.

What the seven criteria indicate is that only one variable needs to be considered (and collected); it is the photographic variable. Further, M1 or M2 is the "best" regression model to use to forecast yield per tree. Tables 14 and 15 give the estimated regression parameters for Models M1 and M2. When fitting these models, plots of residuals were examined for any departure from any of the underlying assumptions. None was found. (Draper and Smith (2) describe methods for examining residuals.)

Table 14: Estimate of regression parameters, adjusted data, by month and model, Mississippi Pecans, 1972.

    MONTH   JULY, AUGUST, SEPTEMBER
    MODEL   M1, M2
    β1       0.023    0.026
    β0      13.541   18.779
    β1       0.025    0.022
    β1       0.016    0.148
    β0      15.457   11.567   15.154   18.972

Table 15: Estimate of regression parameters, unadjusted data, by month and model, Mississippi Pecans, 1972.

    MONTH   JULY, AUGUST, SEPTEMBER
    MODEL   M1, M2
    β1       0.019    0.024
    β0      18.516   16.687
    β1       0.021    0.023
    β1       0.013    0.015
    β0      24.200   18.435   21.317   18.521

CONCLUSIONS

Based upon the analysis performed on data collected in 1972, only photographs need to be collected in any further pecan research for forecasting yield per tree (LBNUTS) until improvements can be achieved in the limb sampling procedure which will make the accessible limbs representative of a larger portion of the tree. This will require the use of a type of mechanical lift equipment which was not available for this study. The variable DROP failed to be significantly correlated with yield, nor was it useful in model building. The variable LIMB is not needed in any forecasting model once any type of photographic variable is in the model.

RECOMMENDATIONS

Future research studies should focus attention on photographic data collection and improving this technique for this particular nut crop. Different expansions to a tree level using photography may produce even better results. A more refined estimate of the height may also improve the expansions on a per tree basis. However, other characteristics of the tree and its immediate environment must not be overlooked. For example, how do differing management techniques influence yield? Possibly this answer is "greatly", indicating that stratification based on management practices might be necessary. A random selection of blocks of different varieties and ages is needed to determine if different forecast models are needed. This will necessitate a complete sampling frame of operations for the population. Accurate estimates of tree numbers by individual blocks must be secured for each operation.
year-to-year, on photographs of under-estiforecast The under-estimation needs further mation analysis If the magnitude is consistent the use of a relative change of production limb counts. example, could be utilized based on either photo counts or accessible is used in Florida on citrus. form for a particular variety For and This method of estimation of the following an estimation age class might be x x P t-l , where P t is the forecast of productton in year t, year, P - is the actual production t l Nt is the forecasted for the previous average weight (or number) of nuts per tree using the photo expansion Nt-l is the average weight photo expansion for year t, (or number) of nuts per tree using the for year t-l, trees of a particular age and variety Tt is the number of bearing for year t, T t-l is the number of bearing for year t-l. trees of a particular age and variety Another proportion ratio (Ht/Ht_l) of nuts intended could be included for commercial in (1) to indicate the harvest. This ratio would 28 probably cyclic harvest be very volatile since price and the tendency whether of the trees to be operator will in yield usually determine his pecan crop. a noncommercial It should be noted that for this forecast needed for a particular estimates Observe the actual production is region by variety and age of trees. change in number of. bearing change Also, accurate of the relative also that Nt/N _ t 1 trees must be secured. weight of nuts is the relative in estimated per tree, so that if the method of expansion under-estimates the true weight and estimation consistently of nuts per tree, this effect will cancel discussion (6). is tedius procedures difficult, and very of this forecast method out in the ratio. A more detailed can be found in Stout Finally, counting (5) and Williams the nuts on slides fruit counting time consuming. desirable Automated would be extremely for any operational level study. m __ ._~~ ~. . _ 29 REFERENCES 1. 
Barr, Anthony Statistical J., and Goodnight, System", James H., "A User's Guide to the Student Store, North Carolina Analysis Raleigh: State University, 2. Draper, York: 3. 1971. Analysis", New Norman and Smith, Harry, IIApplied Regression John Wiley and Sons, 1966. Arthur S., "Econometric Theory", Goldberger, New York: John Wiley and Sons, 1964. 4. Huddleston, Harold F., liThe Use of Photography Economic in Sampling for Number of Fruit Per Tree", Agriculture No.3. Research, July 1971, Vol. 23, 5. Stout, Roy G., "Estimating Surveyl', Journal pp.1037-1049. 6. William, S. R., "Forecastihg January Citrus Production November by Use of Frame Count 1962, Vol. XLIV, No.4. of Farm Economics, Florida Citrus Production Crop and Livestock Methodology Reporting & Development", Service", 7. 1971, "Florida Orlando, Florida. of the Pecan Tree for Branch, U.S. Wood, Ronald A., "A study of the Characteristics Use in Objective Standards Department Yield Forecasting", Division. Research and Development Reporting and Research Statistical D.C. Service, of Agriculture, Washington, 8. Wood, Ronald A., liThe Development Yield for Pecan Treesl', Research sion, Statistical Washington, D.C. Reporting of Objective and Development Procedures Branch, to Estimate Research Divi- Service, U. S. Department of Agriculture,
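
The relative change forecast proposed in the Recommendations (last year's production scaled by the year-to-year ratio of nuts per tree and the ratio of bearing trees) can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not part of the original report: the function name `relative_change_forecast` and every numeric input are hypothetical. The second call assumes the photo expansion under-estimates nuts per tree by the same proportion in both years, which is the case in which the bias cancels in the ratio.

```python
def relative_change_forecast(prev_production, nuts_per_tree, prev_nuts_per_tree,
                             bearing_trees, prev_bearing_trees):
    """Forecast production as (N_t / N_{t-1}) * (T_t / T_{t-1}) * P_{t-1}.

    Hypothetical helper for illustration; the argument names are assumptions.
    """
    if prev_nuts_per_tree <= 0 or prev_bearing_trees <= 0:
        raise ValueError("previous-year values must be positive")
    yield_ratio = nuts_per_tree / prev_nuts_per_tree   # relative change per tree
    tree_ratio = bearing_trees / prev_bearing_trees    # relative change in trees
    return yield_ratio * tree_ratio * prev_production


# Hypothetical inputs, NOT taken from the report.
forecast = relative_change_forecast(
    prev_production=1_000_000.0,  # actual production last year (lb)
    nuts_per_tree=44.0,           # photo-expanded lb of nuts per tree, year t
    prev_nuts_per_tree=40.0,      # photo-expanded lb of nuts per tree, year t-1
    bearing_trees=10_500,
    prev_bearing_trees=10_000,
)

# Assume the expansion captures only 80% of the true weight in both years.
bias = 0.8
biased_forecast = relative_change_forecast(
    1_000_000.0, 44.0 * bias, 40.0 * bias, 10_500, 10_000
)

print(round(forecast), round(biased_forecast))  # the two forecasts agree
```

The cancellation holds only for a constant proportional bias; an under-estimate that varies from year to year would carry through to the forecast.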