# ICS3M Assignment #3: Finding the Line of Best Fit, by dfhercbml

```
ICS3M – Assignment #3
“FINDING THE LINE OF BEST FIT”
In this assignment we are going to look at using arrays and applets to do some
introductory statistics. The data set we are using compares the hours a student studies
with the grade achieved in a particular class. The data will be used to create a “scatter
plot”, similar to the one shown below.

[Figure: “ICS3M Marks Scatterplot”. Horizontal axis: Hours Studied (0 to 20); vertical axis: 0 to 120.]

The next step in the process will be to add the “Line of Best Fit”. The line of best
fit is a linear approximation of the data. (Simply put, it’s the straight line that best fits the
data.)

[Figure: “ICS3M Marks Scatterplot” with a straight line through the points, labelled “Line of Best Fit”. Horizontal axis: Hours Studied (0 to 20); vertical axis: 0 to 120.]

Calculating the Line of Best Fit

Recall that the equation of a line is:

    Y = mX + b

The values of m and b can be calculated with the formulas below, which are written in “sigma” notation. These formulas might look difficult at first, but if you split them into pieces and store the result of each piece, they are not too hard to handle.

\[
m = \frac{\sum_{i=1}^{N} X_i Y_i - \frac{\left(\sum_{i=1}^{N} X_i\right)\left(\sum_{i=1}^{N} Y_i\right)}{N}}
         {\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}}
\]

\[
b = \frac{\sum_{i=1}^{N} Y_i}{N} - m \, \frac{\sum_{i=1}^{N} X_i}{N}
\]

Unfortunately, the formula for calculating the line of best fit is not straightforward: it involves several summations of the data, written in sigma notation (∑, “sigma”). Here is an example of how to convert a sum of N numbers into a loop in Java.

Sigma:
    sumX = \sum_{i=1}^{N} X_i

    In sigma notation, we start at 1 and go to N.

Java:
    int sumX = 0;
    for (int i = 0; i < N; i++)
        sumX += x[i];

    Remember that array indices start at 0, so we loop from 0 to N – 1.


What is to be submitted:

   • Hard copy of the program (print out).
   • All program files on a floppy disk (attach to hard copy).
   • Saved copy of the program on the shared I: drive.
   • External documentation containing:
        - Pseudo-code.
        - A write-up summarizing the program design (1 page minimum).
        - A spreadsheet including data and scatter plot to verify the solution to the program.
        - Hand-calculated results (or a spreadsheet showing totals of the summations above).

Try to follow these steps to successfully complete the assignment:

Step 1   Complete pseudo-code (expand on steps 2-6).
   • Test the data using spreadsheet software.
   • Do the calculations by hand or using a spreadsheet to verify possible solutions. (This way you can check your program for errors as you progress.)

Step 2   Start coding your program:
   • Set up the applet.
   • Try to use methods wherever applicable; they will make the programming easier.

Step 3   Draw the scatter plot using the data in your array(s).
   • Use a dot, an ‘x’ or a small circle for each data point.
   • Use a nice colour scheme: try different colours to show pass / fail, or different colours for each Level of Achievement (0-49, 50-59, 60-69, 70-79, 80-100).

Step 4   Calculate the line of best fit.
   • Use the hand calculations from step 1 to check the program for errors.

Step 5   Draw the line of best fit.
   • Label the line.
   • Show the equation of the line.

Step 6   Attempt any of the higher expectations you can do.

Marking Scheme                                NAME: __________________________          /100
SP2.01 – use constants, variables, expressions, and assignment statements to store and manipulate numeric, character and
logical data in programs.
SP2.02 – incorporate one-dimensional and two-dimensional arrays into computer programs.
SP2.03 – write programs that use related arrays to store and extract data.
K/U & App. | Level 1 | Level 2 | Level 3 | Level 4 | Marks
Scatterplot | Scatter plot and points are poorly presented | Scatter plot and points are presented | Scatter plot and points are well presented | Presentation of scatter plot and points is exemplary | /20
Line of Best Fit | Line of best fit is incorrect and poorly displayed | Line of best fit is correct but poorly displayed | Line of best fit is correct and well displayed | Line of best fit is correct, labeled and well displayed | /20
Higher Expectations | *see assignment for possible extras | *see assignment for possible extras | *see assignment for possible extras | *see assignment for possible extras | /30
Total | | | | | /70
SPV.03 – produce appropriate internal and external documentation
SP2.09 – adhere to defined programming style, including naming conventions for variables and subroutines, indentation and
spacing.
SP2.10 – incorporate and maintain internal documentation to a specific set of standards, including author, date, file name,
SP2.11 – develop external documentation to summarize the design.
Comm. | Level 1 | Level 2 | Level 3 | Level 4 | Marks
SP2.09 Naming and Indenting | Has barely adhered to programming style. | Has partially adhered to programming style. | Has mostly adhered to programming style. | Has adhered to defined programming style. | /5
SP2.10 Internal Documentation | Has incorporated … | Has incorporated … | Has incorporated … | Has many meaningful … | /5
SP2.11 External Documentation | Has poorly summarized the program design | Has adequately summarized the program design | Has mostly summarized the program design | Has effectively summarized the program design | /5
Total | | | | | /15
SP1.07 – solve the same problem using various tools (spreadsheet software).
SP1.08 – verify solutions to problems.
Think. / Inq. | Level 1 | Level 2 | Level 3 | Level 4 | Marks
SP1.07 Solve using software | Has poorly solved the problem using software | Has adequately solved the problem using software | Has mostly solved the problem using software | Has effectively solved the problem using software | /7
SP1.08 Verify Solution | Has poorly described and verified the solution. | Has poorly described or verified the solution. | Has mostly described and verified the solution. | Has excellently described and verified the solution. | /8
Total | | | | | /15

Additions to the project must be included in the program summary documentation. (5 marks each)

   • Calculating and displaying the mean (average) for X & Y.

        mean = \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}

   • Calculating and displaying the median (middle value) for X & Y.
   • Calculating and displaying the mode (most frequently occurring value) for X & Y.
   • Displaying all the data points in the applet (Labels).
   • Calculating and displaying the variance and standard deviation.

        var = S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}
        S.Dev = S = \sqrt{var}

   • Instead of using Labels to list the coordinates, make your applet interactive by using TextFields and allowing the user to change data points. Update all of the information on screen after every change. (15 marks!)

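[Figure: sample applet screen. The “ICS3M Marks Scatterplot” (horizontal axis: Hours Studied, 0 to 20; vertical axis: 0 to 120) is drawn with its line labelled “Line of Best Fit, Y = mX + b”, and the data points are listed beside the plot.]

Data points shown on the sample screen:

     X    Y
    15   60
    19   98
     1   23
     9   50
    13   76
    16   85
    17   82
     5   46
     4   32
    18   95
    18   85
    12   65
    13   61
    14   79
     8   59

Statistics shown beside the vertical axis:  Mean = 66.4, Median = 65, Mode = 85
Statistics shown below the horizontal axis (Hours Studied):  Mean = 12.13, Median = 13, Mode = 13 or 18, Variance = 31.12, Standard Deviation = 5.58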

```
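
The handout's advice to split the m and b formulas into pieces and store each piece could be sketched in Java roughly as follows. This is only one possible arrangement, not the assignment's required solution: the class name `BestFitSketch` and the helper methods `sum`, `sumOfProducts` and `fit` are hypothetical names chosen for illustration. It assumes both arrays have the same length and that the X values are not all identical (otherwise the denominator is zero). The data are the points listed on the handout's sample screen.

```java
// Hypothetical sketch: class and method names are illustrative, not part of the assignment.
class BestFitSketch {

    // Sigma of X_i for i = 1..N becomes a loop over array indices 0..N-1.
    static double sum(double[] values) {
        double total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Sigma of (A_i * B_i); passing the same array twice gives the sum of squares.
    static double sumOfProducts(double[] a, double[] b) {
        double total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i] * b[i];
        }
        return total;
    }

    // Returns {m, b} for Y = mX + b, storing each piece of the formulas separately.
    // Assumes x and y have equal length and x contains at least two distinct values.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sumX = sum(x);
        double sumY = sum(y);
        double sumXY = sumOfProducts(x, y);
        double sumXX = sumOfProducts(x, x);

        double numerator = sumXY - (sumX * sumY) / n;
        double denominator = sumXX - (sumX * sumX) / n;
        double m = numerator / denominator;
        double b = sumY / n - m * (sumX / n);
        return new double[] { m, b };
    }

    public static void main(String[] args) {
        // Data points copied from the handout's sample screen.
        double[] hours = { 15, 19, 1, 9, 13, 16, 17, 5, 4, 18, 18, 12, 13, 14, 8 };
        double[] marks = { 60, 98, 23, 50, 76, 85, 82, 46, 32, 95, 85, 65, 61, 79, 59 };
        double[] line = fit(hours, marks);
        System.out.printf("Y = %.2fX + %.2f%n", line[0], line[1]);
    }
}
```

For the sample data this works out to roughly Y = 3.77X + 20.70, which can be compared against a spreadsheet or hand calculation as Step 1 suggests.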
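
The higher-expectation statistics (mean, median, mode, variance and standard deviation) might be computed along the following lines. Again this is a sketch under assumptions rather than the expected answer: `StatsSketch` and its method names are hypothetical, the variance uses the N - 1 denominator given in the handout, and the mode comparison relies on exact equality, which is safe for whole-number data such as hours and marks but not for arbitrary decimals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: class and method names are illustrative, not part of the assignment.
class StatsSketch {

    // Mean: the sum of the values divided by N.
    static double mean(double[] values) {
        double total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total / values.length;
    }

    // Median: middle value of a sorted copy (average of the two middle values if N is even).
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Mode: every value that occurs most often (there may be more than one).
    static List<Double> modes(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        List<Double> result = new ArrayList<>();
        int bestCount = 0;
        int i = 0;
        while (i < sorted.length) {
            int j = i;
            while (j < sorted.length && sorted[j] == sorted[i]) {
                j++; // walk to the end of this run of equal values
            }
            int count = j - i;
            if (count > bestCount) {
                bestCount = count;
                result.clear();
                result.add(sorted[i]);
            } else if (count == bestCount) {
                result.add(sorted[i]);
            }
            i = j;
        }
        return result;
    }

    // Variance with the N - 1 denominator used in the handout's formula.
    static double variance(double[] values) {
        double avg = mean(values);
        double sumSquares = 0;
        for (int i = 0; i < values.length; i++) {
            double diff = values[i] - avg;
            sumSquares += diff * diff;
        }
        return sumSquares / (values.length - 1);
    }

    // Standard deviation: square root of the variance.
    static double stdDev(double[] values) {
        return Math.sqrt(variance(values));
    }

    public static void main(String[] args) {
        // X values (hours studied) copied from the handout's sample screen.
        double[] hours = { 15, 19, 1, 9, 13, 16, 17, 5, 4, 18, 18, 12, 13, 14, 8 };
        System.out.printf("Mean = %.2f, Median = %.0f, Mode = %s%n",
                mean(hours), median(hours), modes(hours));
        System.out.printf("Variance = %.2f, Standard Deviation = %.2f%n",
                variance(hours), stdDev(hours));
    }
}
```

Run on the hours-studied values, these methods should agree with the figures on the sample screen (mean 12.13, median 13, modes 13 and 18, variance 31.12, standard deviation 5.58), which gives a quick way to verify the program.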