A Comparison of Several Regression Models for Forecasting Pecan Yields

					RESEARCH REPORT
Statistical Reporting Service U.S. Department of Agriculture

A COMPARISON OF SEVERAL REGRESSION MODELS FOR FORECASTING PECAN YIELDS

by Chapman P. Gleason Research Division Research and Development Branch

NOVEMBER 1974

CONTENTS

SUMMARY
INTRODUCTION
DATA COLLECTION PROCEDURES
    Block Selection
    Sample Tree Selection
    Sample Limb Selection
    Sample Limb Counts
    Photography Procedures
    Counts of Nuts from Photographs
    Nut Droppage Prior to Harvest
    Harvest Data
DATA EXPANSIONS
    Limb Expansions
    Photography Expansions
    Drop Expansions
RESULTS
    General
    Correlation Coefficients
    Analysis of Regression Models
DISCUSSION OF RESULTS
CONCLUSIONS
RECOMMENDATIONS
REFERENCES

SUMMARY

Several different regression models are compared to determine which are best for forecasting average yield per tree. Criteria are proposed to determine which variables and ultimately which regression models are better than others. Using the proposed criteria, a simple linear regression model using the number of nuts counted on photographs was found to be "best". A new method of expanding the number of nuts counted on photographs to the tree level is also presented. Recommendations are made for further research. The study was based upon data collected in 1972 in central and southern Mississippi.


INTRODUCTION

Research studies have shown that limb sampling and photographic data collection procedures are promising methods of providing data with which to forecast the average yield (number or weight of nuts) per tree. Two different approaches to forecasting the average yield per tree were proposed in 1971. The first involved using two simple linear regressions: yield versus the number of nuts on sample limbs, and yield versus the number of nuts on photographs. The second involves a multiple regression approach to the forecasting problem: yield versus the number of nuts on limbs and photographs.

The research was aimed at answering two questions:

1. Which of the above approaches is better?

2. If simple linear regression is as good as multiple regression, which regression gives the best estimate of average yield per tree?

From a cost standpoint, one variable may be easier and cheaper to collect and provide more precise forecasts. Tests of statistical hypothesis will be formulated to answer the first question. Standard errors, R², and C.V.'s will be compared to answer the second.

DATA COLLECTION PROCEDURES

Block Selection: Five blocks of Stuart variety pecans were subjectively selected in two separate geographic areas of Mississippi. Three blocks were located in central Mississippi (Hinds County), all managed by one operator. Two additional blocks were located in the southwestern corner of the State (Wilkinson County), each managed by different individuals.

Sample Tree Selection: For each of the three blocks in Hinds County, the trees used in a 1971 research project were used again (Wood (8)). In Wilkinson County it was necessary to select four trees in each of the newly selected blocks. A two-stage procedure was used to select the trees.

1. Two rows were randomly selected with equal probability of selection for each row.

2. Within each selected row, two trees were randomly selected with equal probability.

In this approach, if rows are of varying lengths, trees in short rows have a greater probability of selection than those in long rows.

Sample Limb Selection: For each selected tree, the total number of accessible limbs (reachable by a six-foot ladder) was enumerated and a 50 percent simple random sample with equal probabilities of selection was taken. Sample limbs were defined as those with cross-sectional area between 1.8 and 5.5 square inches.

For each tree, the total number of sample limbs (both accessible and inaccessible) was estimated using either bare tree mappings of limbs or bare tree stereo photographs. N_i will denote the total estimated number of sample limbs for the i-th tree. The trees in Hinds County had stereo photographs taken in early April 1971. (Huddleston (4) describes the uses of photography to estimate the total number of sample limbs.) Bare tree mappings of limbs were made and used to estimate the total number of sample limbs for the trees selected in Wilkinson County.

Sample Limb Counts: For each tree, once the sample limbs were selected, all nuts on the limb were counted by tagging each cluster of fruit and counting the number of nuts in each tagged cluster. This prevented counting errors and gave an indication of monthly fruit droppage from the clusters.

Photography Procedures: To avoid the poor quality photography that plagued the research efforts in 1970 and 1971, the following techniques were used:

1. The tripod which held the camera was located 50 feet from the base of the tree with the sun at the back of the photographer.

2. A florist stake was placed directly below the tripod.

3. The metal photography frame was placed two feet in front of the camera lens.

4. The angle of the camera from the tree was recorded.

The photographs were taken up a vertical column of the tree through a metal frame. A Miranda Sensorex camera with an in-lens light meter improved the photographs significantly over those taken with a camera which has no in-lens light meter.

Counts of Nuts from Photographs: Each slide was projected on a grid. The number of nuts in each cell was counted by a photo interpreter. A certain subset of the slides was recounted for computation of photo adjustment factors. (See Wood (7, p.19) for a discussion of methods used to compute photo adjustment factors.)
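The remark under Sample Tree Selection, that trees in short rows are more likely to be selected than trees in long rows, can be illustrated with a short sketch. The function name and the block layout below (10 rows, one of 5 trees and one of 20 trees) are hypothetical and are not taken from the study; the sketch only assumes that two rows are drawn with equal probability and then two trees are drawn with equal probability within each selected row.

```python
def tree_selection_probability(num_rows, row_length, rows_sampled=2, trees_per_row=2):
    """Inclusion probability of a single tree under the two-stage scheme.

    Stage 1: rows_sampled rows are drawn with equal probability from num_rows.
    Stage 2: trees_per_row trees are drawn with equal probability from the row.
    """
    if rows_sampled > num_rows or trees_per_row > row_length:
        raise ValueError("cannot sample more units than exist")
    p_row = rows_sampled / num_rows
    p_tree_given_row = trees_per_row / row_length
    return p_row * p_tree_given_row


# Hypothetical block with 10 rows: compare a tree in a 5-tree row
# with a tree in a 20-tree row.
short_row = tree_selection_probability(num_rows=10, row_length=5)
long_row = tree_selection_probability(num_rows=10, row_length=20)
print(short_row, long_row)
```

Under these assumptions the tree in the short row is four times as likely to enter the sample, which is the unequal-probability effect the text describes.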

Nut Droppage Prior to Harvest: On the first photography visit, two square plots, each two feet by two feet, were randomly located on the ground beneath the canopy of each tree. The identified area was then gleaned for nuts. On each subsequent field visit, the amount of droppage (number of nuts) in the plot was counted and removed.

Harvest Data: At harvest, each tree was shaken and all "good" nuts were collected. The nuts that remained on the ground were deemed "bad". Each tree was visited three times to collect harvest data. Three one-pound samples of nuts were selected from each collection of nuts for a tree. However, it was apparent that for several of the trees these collections of nuts were mixed due to classification errors by the laborers who harvested the trees. For this reason, and the fact that a good nut cannot be distinguished from a bad nut on a photograph, the biological yield was used as the dependent variable in the analysis that follows. The term LBNUTS will denote the collection of all nuts. The total harvest data for each tree are given in Table 1.

Table 1: Harvest Data, Mississippi Pecans, 1972

DATA EXPANSIONS

Limb Expansions: The expanded number of nuts from sample limbs (NNSL) for each tree was computed as follows:

$$\mathrm{NNSL} = \frac{N_i}{n_i} \sum_{j=1}^{n_i} X_{ij} \qquad (1)$$

where for the i-th tree, N_i is the estimated total number of sample limbs, n_i is the number of sample limbs selected, and X_ij is the total number of fruit counted on the j-th selected sample limb. It is noted that we are sampling only those limbs which are accessible, whereas (1) is an expansion to the total tree based on the fallacious assumption that each of the N_i sample limbs had a non-zero chance of selection. Horticulturists contend that the lower limbs produce fewer nuts than the higher limbs. Hence, an under-estimate of the total number of fruit will be realized from (1).

Photography Expansions: The counts of nuts using ground photographs were expanded to a tree level by two methods. The first expansion assumed that the shape of a tree is a sphere. (Wood (7, p.20) gives a discussion of the method.) The independent variable using this assumption is NNPS (number of nuts counted from photographs, sphere assumption). The second expansion to a tree level assumes that the shape of the tree is a paraboloid. For this expansion two parameters for every tree must be estimated, the height (h) and the radius (r). It can be proved (see Strout (5)) that the estimated bearing surface area of the tree, assuming the tree is shaped as a paraboloid, is:

$$\mathrm{SAP} = \frac{\pi r}{6h^2}\left[(r^2 + 4h^2)^{3/2} - r^3\right]$$

Thus, the number of nuts counted on photographs using the paraboloid assumption (NNPP) is

$$\mathrm{NNPP} = \frac{\mathrm{SAP}}{\mathrm{TAMF}} \sum_{j=1}^{n_i} X_{ij} \qquad (2)$$

where n_i is the number of photographs taken on the i-th tree, X_ij is the number of nuts on the j-th photograph, and TAMF is the total area of the middle frame. (See Wood (7, p.22).)
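The limb expansion (1) and the paraboloid photography expansion (2) can be sketched in a few lines. This is a minimal illustration, not the program used in the study: the function names and all numeric inputs are hypothetical, the surface area uses the standard closed form for the curved surface of a paraboloid of revolution (whose leading factor πr/(6h²) matches the report), and TAMF is assumed to be expressed in the same squared units as the surface area.

```python
import math


def limb_expansion(total_sample_limbs, limb_counts):
    """Equation (1): NNSL = (N_i / n_i) * sum of nuts on the n_i counted limbs."""
    n = len(limb_counts)
    if n == 0:
        raise ValueError("tree has no counted sample limbs")
    return total_sample_limbs / n * sum(limb_counts)


def paraboloid_surface_area(radius, height):
    """Curved surface area of a paraboloid with base radius r and height h."""
    r, h = radius, height
    return math.pi * r / (6.0 * h ** 2) * ((r ** 2 + 4.0 * h ** 2) ** 1.5 - r ** 3)


def photo_expansion(radius, height, photo_counts, frame_area):
    """Equation (2): NNPP = SAP * (sum of nuts over photographs) / TAMF."""
    return paraboloid_surface_area(radius, height) * sum(photo_counts) / frame_area


# Hypothetical tree: 40 estimated sample limbs, 4 of them counted.
nnsl = limb_expansion(40, [10, 14, 6, 10])

# Hypothetical tree: radius 20 ft, height 35 ft, three photographs,
# 12 sq ft of canopy covered by the middle frames.
nnpp = photo_expansion(20.0, 35.0, [15, 22, 9], 12.0)
print(nnsl, round(nnpp))
```

Both functions are ratio expansions: a count per sampled unit (limb, or square foot of canopy photographed) multiplied by the estimated number of such units on the tree.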

The numbers of nuts counted on photographs were adjusted to reflect the fact that each photo interpreter counts a different number of nuts for any given slide. To minimize interpreter differences and to measure such deviation from the "norm", a balanced incomplete block design was used in counting the slides. (Wood (7, p.19) gives a discussion of methods used to estimate interpreter adjustment factors.) The count of fruit on each slide was adjusted for interpreter differences by multiplying the interpreter adjustment factor times the number of fruit counted. When two interpreters counted the same slide these adjusted counts were averaged.

The radius (r) was estimated by averaging the longest and the shortest distance from the trunk to the edge of the tree canopy. The height (h) was roughly estimated by using the number of photographs n_i taken and knowing the distance from the trunk to the camera.

As with limb counts, these methods of expansion to the tree level will under-estimate the true number of nuts on the tree since all nuts do not grow on the periphery of the tree. However, since flower buds develop on new growth that tends to occur on the periphery of the tree, most of the fruit is produced near the surface.
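The interpreter adjustment and the radius estimate described above amount to two small averaging steps. The sketch below assumes the adjustment factors have already been estimated (the report defers that to Wood (7)); the interpreter labels, factor values and distances are hypothetical.

```python
def adjusted_slide_count(raw_counts, adjustment_factors):
    """Average of the factor-adjusted counts for one slide.

    raw_counts: dict mapping interpreter name -> nuts counted on the slide.
    adjustment_factors: dict mapping interpreter name -> multiplicative factor.
    """
    adjusted = [adjustment_factors[name] * count for name, count in raw_counts.items()]
    return sum(adjusted) / len(adjusted)


def canopy_radius(longest, shortest):
    """Radius r as the average of the longest and shortest trunk-to-canopy-edge distances."""
    return (longest + shortest) / 2.0


# Hypothetical slide counted by two interpreters with hypothetical factors.
slide = adjusted_slide_count({"A": 20, "B": 25}, {"A": 1.10, "B": 0.90})
radius = canopy_radius(longest=24.0, shortest=18.0)
print(slide, radius)
```

A slide counted by a single interpreter goes through the same function and simply returns that interpreter's adjusted count.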

Drop Expansion: The nut droppage from the i-th tree was estimated as follows:

$$\mathrm{DROP} = \frac{\pi r^2}{8} \sum_{j=1}^{2} X_{ij} \qquad (3)$$

where r is the estimated radius, and X_ij is the number of nuts in the j-th drop count unit for the i-th tree. Observe that πr² is the area of a circle and 8 is the total area sampled using both 2' x 2' drop units, so the ratio πr²/8 is an area expansion factor.
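Expansion (3) is an area ratio applied to the plot counts. A minimal sketch, with a hypothetical function name and hypothetical inputs, assuming the radius is measured in feet so that it is commensurate with the two 2 ft by 2 ft plots:

```python
import math


def drop_expansion(radius_ft, plot_counts, plot_side_ft=2.0):
    """Equation (3): DROP = (pi * r^2 / sampled area) * sum of nuts in the drop plots."""
    sampled_area = len(plot_counts) * plot_side_ft ** 2  # 8 sq ft for two 2 x 2 plots
    return math.pi * radius_ft ** 2 / sampled_area * sum(plot_counts)


# Hypothetical tree: canopy radius 20 ft, 3 and 1 nuts found in the two plots.
drop = drop_expansion(20.0, [3, 1])
print(round(drop, 1))
```

Because only 8 square feet are sampled under a canopy of over a thousand square feet, each nut found in a plot stands for a large number of dropped nuts, so the estimate is sensitive to small changes in the plot counts.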

RESULTS

General: Previous investigations (by Wood (7,8)) found both NNSL and NNPS to be significantly correlated with the estimated number of good nuts at harvest. In this investigation the biological yield, or total weight of harvested nuts (LBNUTS), was the dependent variable. In addition to the reasons mentioned previously, the number of good nuts is a variable that is influenced by marketing and other economic conditions which were uncontrolled or immeasurable in the research project.

Two data sets were used in the analysis. The first were unadjusted counts from color transparencies; the second were counts adjusted for interpreter differences.

Correlation Coefficients: Tables 2 through 7 give the product moment correlation coefficient for each pairwise combination of variables, its significance probability and the number of observations. The significance probability of a correlation coefficient is the probability that a larger (in absolute value) correlation coefficient should arise by chance if the true population correlation parameter ρ=0. The pairwise correlation was computed based on the assumption that the random variables have bivariate normal distributions.
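The significance probability defined above can be sketched with the usual test under bivariate normality, in which t = r·sqrt((n−2)/(1−r²)) is referred to a t distribution with n−2 degrees of freedom. The report does not state how its software computed the probability, so this sketch is an assumption and may not reproduce the tabled values to the last digit. The function names are hypothetical and the t distribution is integrated numerically to keep the example free of external libraries.

```python
import math


def pearson_r(x, y):
    """Product moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|), using Simpson's rule for the central area on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def correlation_significance(r, n):
    """Two-sided probability of a larger |r| arising by chance when rho = 0."""
    if abs(r) >= 1:
        return 0.0
    df = n - 2
    return two_sided_p(r * math.sqrt(df / (1 - r * r)), df)


# Correlation of 0.589406 from 20 trees, as tabled for LBNUTS versus NNPS in July.
print(round(correlation_significance(0.589406, 20), 4))
```

With 19 or 20 trees, a correlation must exceed roughly 0.44 to 0.46 in absolute value to be significant at the .05 level, which is why several of the LIMB correlations sit near the borderline.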

Table 2: Correlation Matrix, unadjusted photography data, Mississippi Pecans, July 1972

Each cell gives: Correlation Coefficient / Prob > |R| under H0: ρ=0 / number of observations

            LBNUTS      NNPS        NNPP        LIMB 1/
LBNUTS      1.000000    0.589406    0.692073    0.449205
            0.0000      0.0063      0.0010      0.0512
            20          20          20          19

NNPS                    1.000000    0.973957    0.853912
                        0.0000      0.0001      0.0001
                        20          20          19

NNPP                                1.000000    0.814804
                                    0.0000      0.0001
                                    20          19

LIMB                                            1.000000
                                                0.0000
                                                19

1/ There were no accessible sample limbs on tree F4.

Table 3: Correlation Matrix, unadjusted photography data, Mississippi Pecans, August 1972

Each cell gives: Correlation Coefficient / Prob > |R| under H0: ρ=0 / number of observations

            LBNUTS      NNPS        NNPP        LIMB 1/     DROP
LBNUTS      1.000000    0.835472    0.909419    0.439499    -0.022093
            0.0000      0.0001      0.0000      0.0571      0.9235
            20          20          20          19          20

NNPS                    1.000000    0.970281    0.565588    -0.038837
                        0.0000      0.0001      0.0112      0.8651
                        20          20          19          20

NNPP                                1.000000    0.473863    -0.051213
                                    0.0000      0.0384      0.8245
                                    20          19          20

LIMB                                            1.000000    0.392730
                                                0.0000      0.0931
                                                19          19

DROP                                                        1.000000
                                                            0.0000
                                                            20

1/ There were no accessible sample limbs on tree F4.

Table 4: Correlation Matrix, unadjusted photography data, Mississippi Pecans, September 1972

Each cell gives: Correlation Coefficient / Prob > |R| under H0: ρ=0 / number of observations

            LBNUTS      NNPS        NNPP        LIMB 1/     DROP
LBNUTS      1.000000    0.750568    0.821828    0.427667    -0.066414
            0.0000      0.0003      0.0001      0.0650      0.7771
            20          20          20          19          20

NNPS                    1.000000    0.977678    0.638995    -0.093471
                        0.0000      0.0001      0.0035      0.6967
                        20          20          19          20

NNPP                                1.000000    0.568388    -0.142778
                                    0.0000      0.0108      0.5545
                                    20          19          20

LIMB                                            1.000000    0.335533
                                                0.0000      0.1573
                                                19          19

DROP                                                        1.000000
                                                            0.0000
                                                            20

1/ There were no accessible sample limbs on tree F4.

Table 5: Correlation Matrix, adjusted photography data, Mississippi Pecans, July 1972

Each cell gives: Correlation Coefficient / Prob > |R| under H0: ρ=0 / number of observations

            LBNUTS      NNPS        NNPP        LIMB 1/
LBNUTS      1.000000    0.71448     0.805644    0.449205
            0.0000      0.0001      0.0001      0.0512
            20          20          20          19

NNPS                    1.000000    0.972430    0.745579
                        0.0000      0.0001      0.0004
                        20          20          19

NNPP                                1.000000    0.664007
                                    0.0000      0.0022
                                    20          19

LIMB                                            1.000000
                                                0.0000
                                                19

1/ There were no accessible sample limbs on tree F4.

Table 6: Correlation Matrix, adjusted photography data, Mississippi Pecans, August 1972

Each cell gives: Correlation Coefficient / Prob > |R| under H0: ρ=0 / number of observations

            LBNUTS      NNPS        NNPP        LIMB 1/     DROP
LBNUTS      1.000000    0.899645    0.914822    0.439499    -0.022093
            0.0000      0.0001      0.0001      0.0571      0.9235
            20          20          20          19          20

NNPS                    1.000000    0.971275    0.462843    -0.057109
                        0.0000      0.0001      0.0438      0.8058
                        20          20          19          20

NNPP                                1.000000    0.332710    -0.069774
                                    0.0000      0.1611      0.7669
                                    20          19          20

LIMB                                            1.000000    0.392730
                                                0.0000      0.0931
                                                19          19

DROP                                                        1.000000
                                                            0.0000
                                                            20

1/ There were no accessible sample limbs on tree F4.

Table 7: Correlation Matrix, adjusted photography data, Mississippi Pecans, September 1972

Each cell gives: Correlation Coefficient / Prob > |R| under H0: ρ=0 / number of observations

            LBNUTS      NNPS        NNPP        LIMB 1/     DROP
LBNUTS      1.000000    0.846902    0.874279    0.427667    -0.066414
            0.0000      0.0001      0.0001      0.0650      0.7777
            20          20          20          19          20

NNPS                    1.000000    0.973825    0.538335    -0.108465
                        0.0000      0.0001      0.0166      0.6530
                        20          20          19          20

NNPP                                1.000000    0.418659    -0.163419
                                    0.0000      0.0715      0.5024
                                    20          19          20

LIMB                                            1.000000    0.335533
                                                0.0000      0.1573
                                                19          19

DROP                                                        1.000000
                                                            0.0000
                                                            20

1/ There were no accessible sample limbs on tree F4.

Analysis of Regression Models: Consider the linear regression model:

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon \qquad (1)$$

In the classical linear regression, Y is an observable random variable, the X_i's are fixed observable variables, and the error term ε is an unobserved random disturbance. However, in our situation the regressor variables are observable stochastic variables which are assumed to be distributed independent of the disturbance. All the classical tests and estimation procedures are valid when this assumption can be justified. (Goldberger (3) discusses stochastic regression.)

In the analysis presented, the following models of the above form were considered:

(M1) Y = β0 + β1(NNPS) + ε
(M2) Y = β0 + β1(NNPP) + ε
(M3) Y = β0 + β1(LIMB) + ε
(M4) Y = β0 + β1(NNPS) + β2(LIMB) + ε
(M5) Y = β0 + β1(NNPP) + β2(LIMB) + ε
(M6) Y = β0 + β1(NNPS) + β2(LIMB) + β3(DROP) + ε
(M7) Y = β0 + β1(NNPP) + β2(LIMB) + β3(DROP) + ε

Y is the dependent variable LBNUTS.

Note that M6 and M7 have more terms than M1 and M2 because of the inclusion of two independent variables. The difference between M1 and M2 (or between M4 and M5) is just the different methods of expansion of the photography variable.

Several questions arise about the models M1 through M7. Are all the independent variables necessary in models M6 and M7? Which, if any, of the seven models is the "best" regression model for forecasting average weight of nuts per tree? These questions were considered and criteria formulated to answer them.
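The single-variable models M1 through M3 are ordinary least squares lines. As a minimal sketch, the intercept and slope can be computed from the usual closed-form expressions; the function name and the three data points below are hypothetical and serve only to show the arithmetic.

```python
def fit_simple_regression(x, y):
    """Least squares estimates (b0, b1) for the model Y = b0 + b1 * X + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical trees: expanded photo counts (X) and pounds of nuts harvested (Y).
photo_counts = [1000.0, 2000.0, 3000.0]
pounds = [35.0, 60.0, 85.0]
b0, b1 = fit_simple_regression(photo_counts, pounds)

# Forecast for a hypothetical tree with an expanded count of 2500 nuts.
forecast = b0 + b1 * 2500.0
print(b0, b1, forecast)
```

The same function serves M1, M2 or M3 by passing NNPS, NNPP or LIMB as the X variable; the multiple-regression models M4 through M7 need a general least squares solver.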

Seven criteria will be proposed to answer the above questions. The list of criteria is certainly not exhaustive but was chosen to evaluate certain desirable properties. The criteria are as follows:

(C1) The square of the multiple correlation coefficient, R². The R² value should increase by the inclusion of another or possibly several independent variables into the model. The larger the R² value, the better the model explains the variation in the data. A substantial increase in the R² value for any model over M1 (or M2) by including the variable LIMB into the regression would indicate that the LIMB variable is explaining some additional variation in the data.

(C2) The standard error of estimate, s = √s². The residual mean square s² estimates the variance about regression, σ²(Y·X). The smaller the value of s, the more precise will be the predictions.

(C3) The coefficient of variation, CV. The CV = s/Ȳ should decrease if increased precision is obtained by the inclusion of another variable.

(C4) The sequential F-test. This criterion assesses the contribution of another variable added to an equation in stages. In (1) let SS(b0,...,bk) be the sum of squares due to regression. Now for j=1,2,...,k let SS(bj | b0,b1,...,bj-1) be the sequential sum of squares for the j-th beta parameter. SS(bj | b0,b1,...,bj-1) is the difference between the sum of squares due to the regression of Y on X1,...,Xj and the sum of squares due to the regression of Y on X1,...,Xj-1. These are denoted by SS(b0,b1,...,bj) and SS(b0,b1,...,bj-1), respectively. The j-th sequential F-test, j=1,2,...,k, is:

$$F(b_j \mid b_0, \ldots, b_{j-1}) = \frac{SS(b_j \mid b_0, \ldots, b_{j-1})}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))}$$

ESS(b0,b1,...,bk) is the residual SS of the general model (1), and N is the number of units in the sample. The above F has 1 and N-(k+1) degrees of freedom. Note that

$$\sum_{j=1}^{k} SS(b_j \mid b_0, \ldots, b_{j-1}) = SS(b_0, \ldots, b_k).$$

Thus, the total sum of squares due to regression in the full model (1) is just partitioned into single degree of freedom SS's. This criterion considers the order in which the variables enter into the model.

(C5) The partial F-test criterion. This criterion assesses the value of a variable as if it were to enter the regression equation last. The effect of Xj may be larger when the regression equation includes only Xj. However, when the same variable is entered into the equation after other variables, it may affect the response very little. The F-test is as follows. For j=1,...,k

$$F(b_j \mid b_0, \ldots, b_{j-1}, b_{j+1}, \ldots, b_k) = \frac{SS(b_j \mid b_0, b_1, \ldots, b_{j-1}, b_{j+1}, \ldots, b_k)}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))}$$

where SS(bj | b0,...,bj-1,bj+1,...,bk) = SS(b0,...,bk) - SS(b0,...,bj-1,bj+1,...,bk), and SS(b0,...,bj-1,bj+1,...,bk) is the sum of squares due to the regression of Y on X1,X2,...,Xj-1,Xj+1,...,Xk, i.e. the regression on all variables except the j-th. This F has 1 and N-(k+1) degrees of freedom. It is noted that the ratio of bj to its estimated standard error, whose square is this F, has the T distribution with N-(k+1) d.f., and this statistic is used to test if βj=0 in (1). Thus, the j-th partial F-test is equivalent to a T-test of βj=0.

(C6) The extra sums of squares criterion. This criterion assesses whether it was worthwhile to include certain terms in the general regression model (1). It is a joint test of the parameters βq+1,...,βk in (1). Consider the reduced model

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_q X_q + \varepsilon \qquad (2)$$

where q<k, and let SS(b0,...,bq) denote the SS due to the regression (2). Then SS(bq+1,...,bk | b0,...,bq) = SS(b0,...,bk) - SS(b0,b1,...,bq) is the extra SS due to the inclusion of the terms βq+1 Xq+1 + ... + βk Xk into the model (1). Now, the sum of squares SS(b0,...,bk) has k d.f. and SS(b0,...,bq) has q d.f.; thus SS(bq+1,...,bk | b0,...,bq) has k-q d.f. So if βq+1 = βq+2 = ... = βk = 0, then SS(bq+1,...,bk | b0,...,bq) is distributed as σ²χ² with k-q d.f., and is independent of ESS(b0,b1,...,bk). Hence,

$$F(b_{q+1}, \ldots, b_k \mid b_0, \ldots, b_q) = \frac{SS(b_{q+1}, \ldots, b_k \mid b_0, \ldots, b_q)/(k-q)}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))}$$

has the F distribution with k-q and N-(k+1) d.f.

(C7) Significance of regression. This criterion determines whether the regression of Y on X1,...,Xk is significant. The test is

$$F = \frac{SS(b_0, \ldots, b_k)/k}{ESS(b_0, b_1, \ldots, b_k)/(N-(k+1))}$$

This is a test of the hypothesis H: β1=β2=...=βk=0, which is equivalent to testing that the true multiple correlation coefficient R is 0.

The seven criteria can be broken into two general areas. First, R², s, and CV are measures of how well the linear regression model fitted the data. The other four criteria are statistical tests on certain parameters in the regression model. Tables 8 through 13 present the seven criteria to determine the "best" regression model. The analysis was done using the STEPWISE procedure of the Statistical Analysis System (1). This program deleted records with missing observations. Since tree F-4 has no accessible sample limbs it was deleted in the analysis presented.
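Criteria C1 through C7 can all be derived from regression sums of squares for nested sets of variables. The sketch below is not the STEPWISE procedure used in the study; it is a small illustration that fits each sub-model by solving the normal equations directly. The function names and the eight data points are hypothetical.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def regression_ss(columns, y):
    """Sum of squares due to regression (about the mean) of y on columns plus an intercept."""
    if not columns:
        return 0.0  # an intercept-only model has no regression SS about the mean
    n = len(y)
    ybar = sum(y) / n
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    return sum((f - ybar) ** 2 for f in fitted)


def criteria(columns, y, q=None):
    """R2, s, CV and the F statistics of C4-C7 for the model using all columns."""
    n, k = len(y), len(columns)
    ybar = sum(y) / n
    total_ss = sum((v - ybar) ** 2 for v in y)
    ss_full = regression_ss(columns, y)
    mse = (total_ss - ss_full) / (n - (k + 1))  # residual mean square
    s = math.sqrt(mse)
    result = {
        "R2": ss_full / total_ss,
        "s": s,
        "CV_percent": 100.0 * s / ybar,
        "sequential_F": [(regression_ss(columns[:j + 1], y) - regression_ss(columns[:j], y)) / mse
                         for j in range(k)],
        "partial_F": [(ss_full - regression_ss(columns[:j] + columns[j + 1:], y)) / mse
                      for j in range(k)],
        "overall_F": ss_full / k / mse,
    }
    if q is not None:  # C6: joint test that the coefficients after the first q are zero
        result["extra_SS_F"] = (ss_full - regression_ss(columns[:q], y)) / (k - q) / mse
    return result


# Hypothetical data for a two-variable model such as M4.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]
fit = criteria([x1, x2], y, q=1)
print(fit)
```

Two identities from the text can be checked on the output: the sequential F values sum to k times the overall F, and the sequential and partial F values agree for the variable entered last.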

Table 8: Criteria to determine the "best" regression model, adjusted photography data, Mississippi Pecans, July 1972

                                  Sequential F-test 2/       Partial F-test 2/          F for significance
MODEL   R2      s         C.V.%   β1         β2      β3 1/   β1         β2      β3 1/   of regression
M1      0.630   38.780    63.8    28.97*                     28.97*                     28.97*
M2      0.741   32.486    53.5    48.52*                     48.53*                     48.52*
M3      0.202   56.977    93.8    4.30**                     4.30**                     4.30**
M4      0.676   37.414    61.6    31.13*     2.26            23.43*     2.26            16.70*
M5      0.767   31.716    52.2    50.90*     1.84            38.87*     1.84            26.37*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ Drop was not observed the first month.
2/ A blank indicates that the F-test is not applicable with this model.

Table 9: Criteria to determine the "best" regression model, adjusted photography data, Mississippi Pecans, August 1972

The extra SS F-test applies to models M6 and M7 and tests H0: β2=β3=0.

                                  Sequential F-test 1/       Partial F-test 1/          Extra SS  F for significance
MODEL   R2      s         C.V.%   β1         β2      β3      β1         β2      β3      F-test    of regression
M1      0.901   20.089    33.1    154.32*                    154.32*                              154.32*
M2      0.869   23.065    38.0    112.96*                    112.96*                              112.96*
M3      0.193   57.284    94.3    4.07**                     4.07**                               4.07**
M4      0.901   20.707    34.1    145.25*    0.00            114.10*    0.00                      72.62*
M5      0.888   21.999    36.2    124.17*    2.68            99.26*     2.68                      68.43*
M6      0.904   20.997    34.6    141.27*    0.44    0.12    105.85*    0.56    0.12    0.281     47.28*
M7      0.888   22.720    37.4    116.42*    2.52    0.00    88.21*     2.09    0.00    1.26      39.65*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable with this model.

Table 10: Criteria to determine the "best" regression model, adjusted photography data, Mississippi Pecans, September 1972

The extra SS F-test applies to models M6 and M7 and tests H0: β2=β3=0.

                                  Sequential F-test 1/       Partial F-test 1/          Extra SS  F for significance
MODEL   R2      s         C.V.%   β1         β2      β3      β1         β2      β3      F-test    of regression
M1      0.818   27.200    44.8    76.45*                     76.45*                               76.45*
M2      0.812   27.655    45.5    73.41*                     73.41*                               73.45*
M3      0.183   57.647    94.9    3.81**                     3.81**                               3.81**
M4      0.823   27.653    45.5    73.96*     0.45            57.87*     0.44                      37.20*
M5      0.815   28.271    46.5    70.24*     0.27            54.68*     0.27                      35.26*
M6      0.846   26.683    43.9    79.45*     0.94    1.73    61.75*     2.19    1.73    1.33      27.37*
M7      0.829   28.050    46.2    71.35*     1.52    0.00    54.45*     1.25    0.00    0.76      24.28*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable with this model.

Table 11: Criteria to determine the "best" regression model, unadjusted photography data, Mississippi Pecans, July 1972

                                  Sequential F-test 2/       Partial F-test 2/          F for significance
MODEL   R2      s         C.V.%   β1         β2      β3 1/   β1         β2      β3 1/   of regression
M1      0.421   48.527    79.9    12.36*                     12.36*                     12.36*
M2      0.518   44.258    72.9    18.30*                     18.30*                     18.30*
M3      0.202   56.977    93.8    4.30**                     4.30**                     4.30**
M4      0.462   48.235    79.4    12.51*     1.20            7.72**     1.21            6.86*
M5      0.575   42.878    70.6    19.49*     2.11            14.02*     2.11            10.80*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ Drop was not observed the first visit.
2/ A blank indicates that the F-test is not applicable with this model.

Table 12: Criteria to determine the "best" regression model, unadjusted photography data, Mississippi Pecans, August 1972

The extra SS F-test applies to models M6 and M7 and tests H0: β2=β3=0.

                                  Sequential F-test 1/       Partial F-test 1/          Extra SS  F for significance
MODEL   R2      s         C.V.%   β1         β2      β3      β1         β2      β3      F-test    of regression
M1      0.793   29.041    47.8    64.98*                     64.98*                               64.98*
M2      0.874   22.663    37.3    117.62*                    117.62*                              117.62*
M3      0.193   57.284    94.3    4.07**                     4.07**                               4.07**
M4      0.799   29.496    48.6    62.99*     0.48            48.12*     0.48                      31.73*
M5      0.874   23.358    38.5    110.71*    0.00            86.?4*     0.00                      55.36*
M6      0.806   29.906    49.2    61.27*     0.47    0.56    44.56*     0.94    0.56    0.546     20.77*
M7      0.876   23.894    39.3    105.81*    0.20    0.09    78.32*     0.29    0.09    0.147     35.37*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable with this model.

Table 13: Criteria to determine the "best" regression model, unadjusted photography data, Mississippi Pecans, September 1972

The extra SS F-test applies to models M6 and M7 and tests H0: β2=β3=0.

                                  Sequential F-test 1/       Partial F-test 1/          Extra SS  F for significance
MODEL   R2      s         C.V.%   β1         β2      β3      β1         β2      β3      F-test    of regression
M1      0.668   36.724    60.5    34.27*                     34.27*                               34.27*
M2      0.749   31.915    52.6    50.88*                     50.88*                               50.88*
M3      0.183   57.647    94.9    3.81**                     3.81**                               3.81**
M4      0.688   36.978    60.9    33.80*     0.77            25.32*     0.77                      17.28*
M5      0.756   32.492    53.5    49.09*     0.40            37.51*     0.40                      24.74*
M6      0.714   36.328    59.8    35.02*     0.79    1.58    26.40*     2.02    1.58    1.19      12.46*
M7      0.790   31.114    51.2    53.53*     1.00    1.88    41.44*     2.45    1.88    1.41      18.81*

** Indicates the F is significant, α = .10
*  Indicates the F is significant, α = .01
1/ A blank indicates that the F-test is not applicable under this particular model.
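The F statistic for significance of regression (criterion C7) is tied to R² by a simple identity, because the regression and residual sums of squares are R² and 1−R² times the total sum of squares. The sketch below applies that identity to the values reported for model M4 in Table 8; the function name is hypothetical, and N=19 assumes tree F4 was dropped from every fitted model, as the text states.

```python
def overall_f_from_r2(r2, n, k):
    """F for significance of regression from R2, sample size n and k regressors."""
    return (r2 / k) / ((1.0 - r2) / (n - (k + 1)))


# Model M4, adjusted data, July: R2 = 0.676 with two regressors and 19 trees.
f_m4 = overall_f_from_r2(0.676, 19, 2)
print(round(f_m4, 2))
```

The result is about 16.69, against a tabled value of 16.70; the small gap is consistent with R² being reported to three decimals.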

DISCUSSION OF RESULTS

Inspecting the correlation matrices (Tables 2-7) indicates that most of the variables are significantly correlated with yield (LBNUTS). However, DROP is not significantly correlated with yield for any month, nor was DROP significantly correlated with any of the other independent variables. This confirms earlier findings of Wood (7, p.15; 6, p.12). However, DROP was included in all the regression analysis presented since even if the correlation between two variables is small it may influence the multiple correlation coefficient (R) a great deal when several variables are in a regression model simultaneously. In this particular case, it was not true that DROP had a substantial effect on the coefficient of determination, which is the square of the multiple correlation coefficient. This can be seen by comparing Models M4 and M6 and Models M5 and M7 in Tables 8 through 13. Based on these findings and previous results (by Wood (6,7)), drop counts should not be included in any further work in developing a pecan forecast model.

There also appears to be a stronger relationship between final harvested weight of nuts and the adjusted photography variates than with the unadjusted photography. This indicates that interpreter adjustment factors are necessary. The sample correlation coefficient of the photo variable with LBNUTS is always greater than that of the limb count variable LIMB. Also, in general, the photo count variable NNPP has a larger sample correlation coefficient than does the photo count variable NNPS. This could possibly be attributed to a more precise estimate of the bearing surface of a tree by assuming the tree is a paraboloid rather than a sphere.

Tables 8 through 13 show that:

1. Each regression is significant at the .01 level, except Model M3 (LIMB alone), which is significant only at the .10 level.

2. The F-test in the extra SS criterion (where applicable) is insignificant. Thus, the hypothesis that β2 and β3 are simultaneously zero in Models M6 and M7 is not rejected.

3. The partial F-test indicates that the LIMB and DROP variables contributed very little when they were included in the last stage. However, the contribution of the photo count variables is important even when the LIMB and/or DROP variables are introduced in the equation first. This is indicated by the significant partial F of the β1 parameter.

4. Once the photo variable was in the model, the contribution of additional variables was insignificant. This is indicated by the sequential F-test.

5. Comparing Models M1 through M3 indicates that in each case M1 and M2 have larger R²'s and smaller standard errors than Model M3.

What the seven criteria indicate is that only one variable needs to be considered (and collected); it is the photographic variable. Further, M1 or M2 is the "best" regression model to use to forecast yield per tree. Tables 14 and 15 give the estimated regression parameters for Models M1 and M2. When fitting these models, plots of residuals were examined for any departure from any of the underlying assumptions. (Draper and Smith (2) describe methods for examining residuals.) None was found.
Table 14: Estimate of regression parameters, adjusted data, by month and model, Mississippi Pecans, 1972.

                                 MONTH
            JULY               AUGUST             SEPTEMBER
MODEL       β0        β1       β0        β1       β0        β1
M1          15.457    0.023    13.541    0.025    15.154    0.016
M2          11.567    0.026    18.779    0.022    18.972    0.148

Table 15: Estimate of regression parameters, unadjusted data, by month and model, Mississippi Pecans, 1972.

                                 MONTH
            JULY               AUGUST             SEPTEMBER
MODEL       β0        β1       β0        β1       β0        β1
M1          24.200    0.019    18.516    0.021    21.317    0.013
M2          18.435    0.024    16.687    0.023    18.521    0.015

CONCLUSIONS

Based upon the analysis performed on data collected in 1972, only photographs need to be collected in any further pecan research for forecasting yield per tree (LBNUTS) until improvements can be achieved in the limb sampling procedure which will make the accessible limbs representative of a larger portion of the tree. This will require the use of a type of mechanical lift equipment which was not available for this study. The variable DROP failed to be significantly correlated with yield, nor was it useful in model building. The variable LIMB is not needed in any forecasting model once any type of photographic variable is in the model.

RECOMMENDATIONS

Future research studies should focus attention on photographic data collection and improving this technique for this particular nut crop. Different expansions to a tree level using photography may produce even better results. A more refined estimate of the height may also improve the expansions on a per tree basis. However, other characteristics of the tree and its immediate environment must not be overlooked. For example, how do differing management techniques influence yield? Possibly this answer is "greatly", indicating that stratification based on management practices might be necessary.

A random selection of blocks of different varieties and ages is needed to determine if different forecast models are needed. This will necessitate a complete sampling frame of operations for the population. Accurate estimates of tree numbers by individual blocks must be secured for each operation.

Future investigation should also consider whether monthly models are necessary for forecasting yield per tree. Possibly the regression parameters would be stable over months during the growing season, indicating that just the development and maintenance of only one forecasting equation would be necessary. Thus, monthly change in average yield per tree would be reflected in the change in average photo counts per tree and not in the change in beta parameters.

The under-estimation of total number of nuts counted on photographs needs further analysis and investigation. If the magnitude of under-estimation is consistent year-to-year, the use of a relative change forecast of production could be utilized based on either photo counts or accessible limb counts. This method of estimation is used in Florida on citrus. For example, an estimation of the following form for a particular variety and age class might be
P_t = \frac{N_t}{N_{t-1}} \times \frac{T_t}{T_{t-1}} \times P_{t-1},    (1)

where

P_t is the forecast of production in year t,

P_{t-1} is the actual production for the previous year,

N_t is the forecasted average weight (or number) of nuts per tree using the photo expansion for year t,

N_{t-1} is the average weight (or number) of nuts per tree using the photo expansion for year t-1,

T_t is the number of bearing trees of a particular age and variety for year t,

T_{t-1} is the number of bearing trees of a particular age and variety for year t-1.
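The relative change forecast defined above can be sketched in a few lines. This is an illustration under stated assumptions: the function name `relative_change_forecast` is hypothetical, and every figure passed to it is invented for one imagined variety and age class rather than taken from the report.

```python
def relative_change_forecast(p_prev, n_t, n_prev, t_t, t_prev):
    """Forecast production in year t from the previous year's actual production.

    p_prev      : actual production in year t-1
    n_t, n_prev : photo-expanded average weight (or number) of nuts per tree
    t_t, t_prev : number of bearing trees of the given age and variety
    """
    if n_prev <= 0 or t_prev <= 0:
        raise ValueError("previous-year nut and tree figures must be positive")
    return (n_t / n_prev) * (t_t / t_prev) * p_prev

# Hypothetical figures (not from the report).
forecast = relative_change_forecast(
    p_prev=1_000_000, n_t=44.0, n_prev=40.0, t_t=10_500, t_prev=10_000
)

# Scaling both per-tree estimates by the same factor, for example photographs
# that consistently show 60 percent of the nuts, leaves the forecast unchanged.
biased = relative_change_forecast(
    p_prev=1_000_000, n_t=0.6 * 44.0, n_prev=0.6 * 40.0, t_t=10_500, t_prev=10_000
)
print(forecast, biased)
```

Because only the ratio of the two per-tree estimates enters the calculation, a constant proportional under-count in both years has no effect on the result.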

Another ratio (H_t/H_{t-1}) could be included in (1) to indicate the proportion of nuts intended for commercial harvest. This ratio would probably be very volatile, since price and the tendency of the trees to be cyclic in yield usually determine whether a noncommercial operator will harvest his pecan crop.
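Written out, the extended form might look like the following. Entering the harvest ratio as one more multiplicative factor is an assumption, since the report does not display the extended equation; H_t is taken here to denote the proportion of nuts intended for commercial harvest in year t.

```latex
P_t = \frac{N_t}{N_{t-1}} \times \frac{T_t}{T_{t-1}} \times \frac{H_t}{H_{t-1}} \times P_{t-1}
```

With H_t = H_{t-1} the added factor equals one and the expression reduces to (1).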

It should be noted that for this forecast the actual production is needed for a particular region by variety and age of trees. Also, accurate estimates of the relative change in number of bearing trees must be secured. Observe also that N_t/N_{t-1} is the relative change in estimated weight of nuts per tree, so that if the method of expansion and estimation consistently under-estimates the true weight of nuts per tree, this effect will cancel out in the ratio. A more detailed discussion of this forecast method can be found in Stout (5) and Williams (6).

Finally, counting the nuts on slides is tedious, difficult, and very time consuming. Automated fruit counting procedures would be extremely desirable for any operational level study.


REFERENCES

1. Barr, Anthony J., and Goodnight, James H., "A User's Guide to the Statistical Analysis System", Raleigh: Student Store, North Carolina State University, 1971.

2. Draper, Norman and Smith, Harry, "Applied Regression Analysis", New York: John Wiley and Sons, 1966.

3. Goldberger, Arthur S., "Econometric Theory", New York: John Wiley and Sons, 1964.

4. Huddleston, Harold F., "The Use of Photography in Sampling for Number of Fruit Per Tree", Agricultural Economics Research, July 1971, Vol. 23, No. 3.

5. Stout, Roy G., "Estimating Citrus Production by Use of Frame Count Survey", Journal of Farm Economics, November 1962, Vol. XLIV, No. 4, pp. 1037-1049.

6. Williams, S. R., "Forecasting Florida Citrus Production Methodology & Development", Florida Crop and Livestock Reporting Service, Orlando, Florida, January 1971.

7. Wood, Ronald A., "A Study of the Characteristics of the Pecan Tree for Use in Objective Yield Forecasting", Research and Development Branch, Standards and Research Division, Statistical Reporting Service, U.S. Department of Agriculture, Washington, D.C.

8. Wood, Ronald A., "The Development of Objective Procedures to Estimate Yield for Pecan Trees", Research and Development Branch, Research Division, Statistical Reporting Service, U.S. Department of Agriculture, Washington, D.C.


				