
									                                                 ICS3M – Assignment #3
                  “FINDING THE LINE OF BEST FIT”
       In this assignment we are going to look at using arrays and applets to do some
introductory statistics. The data set we are using compares the hours a student studies
with the grade achieved in a particular class. The data will be used to create a “scatter
plot”, similar to the one shown below.


[Figure: ICS3M Marks Scatterplot. Grade Achieved (0 to 120) plotted against Hours Studied (0 to 20).]


        The next step in the process will be to add the “Line of Best Fit”. The line of best
fit is a linear approximation of the data. (Simply put, it’s the straight line that best fits the
data.)

[Figure: ICS3M Marks Scatterplot with the labelled Line of Best Fit. Grade Achieved (0 to 120) plotted against Hours Studied (0 to 20).]

Calculating the Line of Best Fit

Recall that the equation of a line is: Y = mX + b
       We can calculate the values of m and b using the formulas below, which are written
in “sigma” notation. These formulas might look difficult at first, but if you split them up
into pieces and store the result of each piece, they are not too hard to handle.


$$ m = \frac{\sum_{i=1}^{N} X_i Y_i - \frac{\left(\sum_{i=1}^{N} X_i\right)\left(\sum_{i=1}^{N} Y_i\right)}{N}}{\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}} $$

$$ b = \frac{\sum_{i=1}^{N} Y_i}{N} - m \left( \frac{\sum_{i=1}^{N} X_i}{N} \right) $$

       Unfortunately the formula for calculating the line of best fit is not straightforward.
It involves many summations of the data. In this assignment we will be using sigma
notation (∑, “sigma”). Here’s an example of how to convert a sum of ‘N’ numbers into a
loop in Java.

                  Sigma                                      Java
          sumX = \sum_{i=1}^{N} X_i                  int sumX = 0;
                                                     for (int i = 0; i < N; i++)
                                                         sumX += x[i];

   In sigma notation, we start            Remember that array indices start
   at 1 and go to N.                      at 0, so we loop from 0 to N – 1.
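
The same loop pattern extends to every sum in the formulas for m and b. The following is only a sketch of one way to do it, not the required solution: the class name `BestFit`, the method name `fit`, and the choice of `double` arrays (to avoid integer division) are all assumptions made for illustration.

```java
// Sketch only: class, method, and variable names are illustrative, not from the handout.
class BestFit {
    /** Returns {m, b} for the line Y = mX + b through the points (x[i], y[i]). */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;      // N in the formulas
        double sumX = 0;       // sum of X_i
        double sumY = 0;       // sum of Y_i
        double sumXY = 0;      // sum of X_i * Y_i
        double sumX2 = 0;      // sum of X_i squared

        // One pass over the related arrays, index 0 to N - 1.
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumX2 += x[i] * x[i];
        }

        // Each piece of the formula is stored first, then combined.
        double m = (sumXY - sumX * sumY / n) / (sumX2 - sumX * sumX / n);
        double b = sumY / n - m * sumX / n;
        return new double[] { m, b };
    }

    public static void main(String[] args) {
        // Made-up points that lie exactly on Y = 2X + 1, so the fit should recover m = 2, b = 1.
        double[] x = { 1, 2, 3, 4 };
        double[] y = { 3, 5, 7, 9 };
        double[] line = fit(x, y);
        System.out.println("m = " + line[0] + ", b = " + line[1]);
    }
}
```

Testing on points that are known to lie on a line is a quick way to catch a misplaced bracket before trying the real data. The sketch assumes at least two points with different X values; otherwise the denominator is zero.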

What is to be submitted:

        Hard copy of the program (print out)
        All program files on a floppy disk (Attach to hard copy).
        Saved copy of the program on the shared I: drive.
        External documentation containing:
          Pseudo-code
          A write-up summarizing the program design (1 page minimum).
          A spreadsheet including data and scatter plot to verify solution to program.
          Hand calculated results (or spreadsheet showing totals of the summations
            above).


Try to follow these steps to successfully complete the assignment:

Step 1             Complete pseudo-code (Expand on steps 2-6).
                   Test the data using spreadsheet software.
                   Do the calculations by hand or using a spreadsheet to verify possible solutions.
                    (This way you can check your program for errors as you progress.)
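
As a rough illustration of the hand check, the sums for the fifteen sample points listed in the sample output at the end of this handout work out as follows. Values are rounded to two decimals; treat this as a way to check your own arithmetic, not as an official answer key.

$$ N = 15, \quad \sum X_i = 182, \quad \sum Y_i = 996, \quad \sum X_i Y_i = 13726, \quad \sum X_i^2 = 2644 $$

$$ m = \frac{13726 - \frac{(182)(996)}{15}}{2644 - \frac{182^2}{15}} = \frac{1641.2}{435.73} \approx 3.77 $$

$$ b = \frac{996}{15} - m \left( \frac{182}{15} \right) \approx 66.40 - 45.70 = 20.70 $$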

Step 2             Start coding your program:
                     Set-up the applet.
                     Try to use methods wherever applicable; they will make the programming
                        easier.

Step 3             Draw the scatter plot using the data in your array(s).
                     Use a dot, an ‘x’ or a small circle for each data point.
                     Use a nice colour scheme: try different colours to show pass / fail, or
                       different colours for each Level of Achievement (0-49, 50-59, 60-69, 70-79,
                       80-100).
                     Label your axes.
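
The plotting in Step 3 mostly comes down to converting each (hours, grade) pair into pixel coordinates and choosing a colour group. Below is a minimal sketch under stated assumptions: the plot area size, the axis maximums (taken from the sample figure), the class name `PlotMapping`, and the group numbering 0 to 4 are all hypothetical. In an applet's paint method, the results could be passed to a call such as `g.fillOval(px - 3, py - 3, 6, 6)`.

```java
// Sketch only: the plot dimensions and all names below are assumptions for illustration.
class PlotMapping {
    // Hypothetical plot area, in pixels, inside the applet window.
    static final int LEFT = 50, TOP = 20, WIDTH = 400, HEIGHT = 240;
    // Axis ranges taken from the sample figure: 0 to 20 hours, 0 to 120 marks.
    static final double MAX_HOURS = 20, MAX_GRADE = 120;

    /** Converts hours studied to a horizontal pixel position. */
    static int toPixelX(double hours) {
        return LEFT + (int) Math.round(hours / MAX_HOURS * WIDTH);
    }

    /** Converts a grade to a vertical pixel position (screen y grows downward). */
    static int toPixelY(double grade) {
        return TOP + HEIGHT - (int) Math.round(grade / MAX_GRADE * HEIGHT);
    }

    /** Returns a colour group index: 0 for 0-49, then 1 to 4 for 50-59, 60-69, 70-79, 80-100. */
    static int levelOf(int grade) {
        if (grade < 50) return 0;
        if (grade < 60) return 1;
        if (grade < 70) return 2;
        if (grade < 80) return 3;
        return 4;
    }

    public static void main(String[] args) {
        int[] hours = { 15, 19, 1 };
        int[] grades = { 60, 98, 23 };
        for (int i = 0; i < hours.length; i++) {
            System.out.println("(" + toPixelX(hours[i]) + ", " + toPixelY(grades[i])
                    + ") group " + levelOf(grades[i]));
        }
    }
}
```

Keeping the conversion in its own methods means the same code can place the data points, the axis tick marks, and later the line of best fit.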

Step 4             Calculate the line of best fit.
                   Use hand calculations from step 1 to check program for errors.

Step 5             Draw the line of best fit.
                   Label the line.
                   Show equation of the line.
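
For Step 5, a straight line only needs two points, so one approach is to evaluate Y = mX + b at both ends of the X axis and join them. The sketch below assumes m and b have already been calculated in Step 4. The class name `LineLabel`, its method names, and the two-decimal format are illustrative choices, not requirements of the assignment.

```java
import java.util.Locale;

// Sketch only: names are illustrative; m and b are assumed to come from the Step 4 calculation.
class LineLabel {
    /** Returns the Y value on the line Y = mX + b at the given X. */
    static double yAt(double m, double b, double x) {
        return m * x + b;
    }

    /** Formats the equation with two decimal places, for example "Y = 2.00X + 1.00". */
    static String equation(double m, double b) {
        String sign = (b < 0) ? "-" : "+";
        return String.format(Locale.ROOT, "Y = %.2fX %s %.2f", m, sign, Math.abs(b));
    }

    public static void main(String[] args) {
        double m = 2.0, b = 1.0;   // placeholder values, not results from the handout's data
        // The line is drawn between its values at the two ends of the X axis (0 and 20 hours).
        System.out.println("From (0, " + yAt(m, b, 0) + ") to (20, " + yAt(m, b, 20) + ")");
        System.out.println(equation(m, b));
    }
}
```

The two endpoints would still need converting to pixel coordinates before being drawn in the applet.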

Step 6             Attempt any of the higher expectations you can do.
Marking Scheme                                NAME: __________________________          Total: /100
SP2.01 – use constants, variables, expressions, and assignment statements to store and manipulate numeric, character and
logical data in programs.
SP2.02 – incorporate one-dimensional and two-dimensional arrays into computer programs.
SP2.03 – write programs that use related arrays to store and extract data.
K/U & App.            Level 1               Level 2               Level 3               Level 4               Marks
Scatterplot           Scatter plot and      Scatter plot and      Scatter plot and      Presentation of       /20
                      points are poorly     points are            points are well       scatter plot and
                      presented             adequately            presented             points is
                                            presented                                   exemplary
Line of Best Fit      Line of best fit is   Line of best fit is   Line of best fit is   Line of best fit is   /20
                      incorrect and         correct but poorly    correct and well      correct, labeled
                      poorly displayed      displayed             displayed             and well displayed
Higher Expectations   *see assignment       *see assignment       *see assignment       *see assignment       /30
                      for possible extras   for possible extras   for possible extras   for possible extras
Subtotal                                                                                                      /70
SPV.03 – produce appropriate internal and external documentation
SP2.09 – adhere to defined programming style, including naming conventions for variables and subroutines, indentation and
spacing.
SP2.10 – incorporate and maintain internal documentation to a specific set of standards, including author, date, file name,
purpose, and explanatory comments.
SP2.11 – develop external documentation to summarize the design.
Comm.                 Level 1               Level 2               Level 3               Level 4               Marks
SP2.09                Has barely            Has partially         Has mostly            Has adhered to        /5
Naming and            adhered to defined    adhered to defined    adhered to defined    defined
Indenting             programming           programming           programming           programming
                      style.                style.                style.                style.
SP2.10                Has incorporated      Has incorporated      Has incorporated      Has many              /5
Internal              few comments          some comments         many comments         meaningful
Documentation         and header blocks     and header blocks     and header blocks     comments and
                                                                                        header blocks.
SP2.11                Has poorly            Has adequately        Has mostly            Has effectively       /5
External              summarized the        summarized the        summarized the        summarized the
Documentation         program design        program design        program design        program design
Subtotal                                                                                                      /15
SP1.07 – solve the same problem using various tools (spreadsheet software).
SP1.08 – verify solutions to problems.
Think. / Inq.         Level 1               Level 2               Level 3               Level 4               Marks
SP1.07                Has poorly solved     Has adequately        Has mostly solved     Has effectively       /7
Solve using           the problem using     solved the            the problem using     solved the
spreadsheet           spreadsheet           problem using         spreadsheet           problem using
software.             software              spreadsheet           software              spreadsheet
                                            software                                    software
SP1.08                Has poorly            Has poorly            Has mostly            Has excellently       /8
Verify Solution       described and         described or          described and         described and
                      verified the          verified the          verified the          verified the
                      solution.             solution.             solution.             solution.
Subtotal                                                                                                      /15
Additions to the project must be included in the program summary
documentation. (5 marks each)

       Calculating and displaying the mean (average) for X & Y.
$$ \text{mean} = \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} $$


         Calculating and displaying the median (middle value) for X & Y.
         Calculating and displaying the mode (most occurring value) for X & Y.
         Displaying all the data points in the applet (Labels).
         Calculating and displaying the variance and standard deviation.
$$ \text{var} = S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1} $$

$$ \text{S.Dev} = S = \sqrt{\text{var}} $$

       Instead of using Labels to list the coordinates, make your applet interactive by
        using TextFields and allowing the user to change data points. Update all of the
        information on screen after every change. (15 marks !!!)
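
The mean, median, and mode extras listed above could be sketched along the following lines. This is one possible approach under stated assumptions: the class name `BasicStats` and its method names are hypothetical, the data is assumed to be a non-empty `int` array, and `modes` returns every value that ties for most frequent, since a data set can have more than one mode.

```java
import java.util.Arrays;

// Sketch only: names are illustrative; every method assumes a non-empty array.
class BasicStats {
    /** Sum of the values divided by how many there are. */
    static double mean(int[] data) {
        double sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum / data.length;
    }

    /** Middle value of the sorted data, or the average of the two middle values. */
    static double median(int[] data) {
        int[] sorted = data.clone();   // sort a copy so the caller's array keeps its order
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Every value that ties for the highest count, in ascending order. */
    static int[] modes(int[] data) {
        int[] sorted = data.clone();
        Arrays.sort(sorted);
        int[] result = new int[sorted.length];
        int found = 0;   // how many modes are currently stored in result
        int best = 0;    // highest count seen so far
        int i = 0;
        while (i < sorted.length) {
            int j = i;
            while (j < sorted.length && sorted[j] == sorted[i]) {
                j++;     // equal values sit side by side after sorting
            }
            int count = j - i;
            if (count > best) {
                best = count;
                found = 0;   // a new highest count replaces the earlier candidates
            }
            if (count == best) {
                result[found++] = sorted[i];
            }
            i = j;
        }
        return Arrays.copyOf(result, found);
    }

    public static void main(String[] args) {
        int[] sample = { 4, 8, 8, 15, 16, 23, 23, 42 };   // made-up values
        System.out.println("Mean = " + mean(sample));
        System.out.println("Median = " + median(sample));
        System.out.println("Mode = " + Arrays.toString(modes(sample)));
    }
}
```

Sorting a copy keeps the X and Y arrays in their original order, which matters because they are related arrays: index i in one must stay matched with index i in the other.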


[Figure: sample applet output. ICS3M Marks Scatterplot with the Line of Best Fit (Y = mX + b), Grade Achieved (0 to 120) plotted against Hours Studied (0 to 20).]

Data points:
     X    Y
    15   60
    19   98
     1   23
     9   50
    13   76
    16   85
    17   82
     5   46
     4   32
    18   95
    18   85
    12   65
    13   61
    14   79
     8   59

Grade Achieved (Y):   Mean = 66.4    Median = 65    Mode = 85
Hours Studied (X):    Mean = 12.13   Median = 13    Mode = 13 or 18
                      Variance = 31.12   Standard Deviation = 5.58
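
The variance and standard deviation formulas can be checked against the sample output above. The sketch below uses the hours-studied values from the sample data; the class name `Spread` and its method names are assumptions, and the code divides by N - 1 as in the handout's formula.

```java
import java.util.Locale;

// Sketch only: names are illustrative; the data must contain at least two values.
class Spread {
    /** Sample variance: the sum of squared differences from the mean, divided by N - 1. */
    static double variance(int[] data) {
        int n = data.length;
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += data[i];
        }
        double mean = sum / n;

        double squares = 0;
        for (int i = 0; i < n; i++) {
            double diff = data[i] - mean;
            squares += diff * diff;
        }
        return squares / (n - 1);
    }

    /** Standard deviation: the square root of the variance. */
    static double standardDeviation(int[] data) {
        return Math.sqrt(variance(data));
    }

    public static void main(String[] args) {
        // Hours studied (X) from the sample data in the handout.
        int[] hours = { 15, 19, 1, 9, 13, 16, 17, 5, 4, 18, 18, 12, 13, 14, 8 };
        System.out.println(String.format(Locale.ROOT, "Variance = %.2f", variance(hours)));
        System.out.println(String.format(Locale.ROOT, "Standard Deviation = %.2f",
                standardDeviation(hours)));
    }
}
```

For the hours-studied data, the two printed values should agree with the 31.12 and 5.58 shown in the sample output.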

								