					Assignment 2
Weekly Logs – Back Up


Steven Graham
B00444855
Table of Contents
Learning Log
Week 1 – 20th September – 24th September
   Lecture
   Tutorial
   Practical
       Exercise 2
       Exercise 3
Week 2 – 27th September – 1st October
   Lecture
   Tutorial
   Practical
Learning Log
Week 3 – 4th October - 8th October
   Lecture
   Tutorial
   Practical
       Telecare
       Telehealth
Learning Log
Week 4 – 11th October - 15th October
   Lecture
   Tutorial
   Practical
Learning Log
Week 5 – 18th October – 22nd October
   Lecture
   Tutorial
Learning Log
Week 6 – 25th October - 29th October
   Lecture
   Tutorial
   Practical
Week 7 – 1st November – 5th November
   Practical
Week 8 – 8th November – 12th November
   Practical
   Task 1
   Task 2
       Resubmit
Learning Log
Week 9 – 15th November – 19th November
   Practical
       Task 3.1
       Task 3.2
       Task 3.3
       Task 3.4
       Task 3.5
       Task 3.6
Learning Log
Week 10 – 22nd November – 26th November
   Practical
       Task 2.1
       Task 2.3
       Task 2.5
       Task 2.6
       Tasks 3.1
       Task 3.2
Learning Log


Week 1 – 20th September – 24th September
Lecture


This week’s lecture was an introduction to the module. It covered the different areas that we
will be studying during the module, and I found that I was interested in the topics that
were listed. I was also happy that I had chosen the Emerging healthcare module for the
second semester. I also got a better understanding of what health informatics means and how it has
evolved within the medical profession.

I believe that this module is going to be very interesting and enjoyable.



Tutorial


This week’s tutorial was an introduction to the Matlab software. We were taken through some of
the basic commands which are used in Matlab, in preparation for the practical.



Practical


This week’s practical was an introduction to Matlab. Below are the answers to the practical
questions.

Exercise 2


Q1. >> a = [1 2 3 4 5]

a=

   1     2    3    4     5



Q2. >> b = [6, 7 , 8 , 9]

b=

   6     7    8    9

Q3. >> c = [1; 2; 3; 4; 5]
c=

     1

     2

     3

     4

     5



Q4. >> d = [

123

456

789]



d=

  123

  456

  789



Q5. >> e = d'

e=

  123 456 789



Q6. >> f = [1 2 3 4 5; 6 7 8 9 10];



Q7. A ' on a matrix transposes it, swapping its rows and columns; here it turns the column of
numbers in d into a single row.

Q8. A ; on the end of a command stops the result being shown on screen.

Q9. >> [12345]
ans =

         12345

Q10. The above row vector was stored inside a variable called ans



Q11. >> u = [0:8]

u=

     0     1     2        3        4        5    6     7     8



Q12. >> s = [0:2:100]

s=

 Columns 1 through 13

     0     2     4        6        8    10       12    14    16    18    20    22    24

 Columns 14 through 26

     26    28        30       32       34       36    38    40    42    44    46    48    50

 Columns 27 through 39

     52    54        56       58       60       62    64    66    68    70    72    74    76

 Columns 40 through 51

     78    80        82       84       86       88    90    92    94    96    98 100



Q13. >> t = [2:2:6; 7 2 9]

t=

     2     4     6

     7     2     9
Q14. >> t = t'

t=

     2   7

     4   2

     6   9



Q15. >> v = t(1:3)

v=

     2   4    6



Q16. >> v = t(1,2)

v=

     7



Q17. >> v = t (2,1)

v=

     4



Q18. >> a = [1 2 3 4 5; 6:10; 11:2:19]

a=

     1   2    3    4        5

     6   7    8    9    10

  11     13   15       17       19

Q19. >> a(:,2)=[]

a=

     1   3    4    5

     6   8    9    10

  11     15   17       19
Q20. >> a = [1 2 3 4 5; 6:10; 11:2:19]

a=

   1    2    3    4        5

   6    7    8    9    10

  11    13   15       17       19



Q21. >> a(4,3)

??? Attempted to access a(4,3); index out of bounds because size(a)=[3,5].



Q22. >> a

a=

   1    2    3    4        5

   6    7    8    9    10

  11    13   15       17       19



Q23. >> a - 1

ans =

   0    1    2    3        4

   5    6    7    8        9

  10    12   14       16       18



Q24. >> a = a - 1

a=

   0    1    2    3        4

   5    6    7    8        9

  10    12   14       16       18
Exercise 3


Q1. >> a = a'

a=

     0   5   10

     1   6   12

     2   7   14

     3   8   16

     4   9   18



Q3. >> a = [1 2 3 4];

>> a

a=

     1   2   3    4

a now contains the numbers 1 2 3 4

Q4. >> b = [ 5 6 7 8];

>> b

b=

     5   6   7    8

b now contains the numbers 5 6 7 8



Q5. >> c = a + b

c=

     6   8   10   12

c now contains variable a plus variable b. The addition is element by element: each number in a is
added to the number in the same position in b, for example a(1) + b(1) = 1 + 5 = 6 = c(1).
Q6. >> m = [1:2:9; 10:2:19];

>> m

m=

     1   3    5     7        9

  10     12    14       16       18

This creates a variable called m holding a matrix of two rows by five columns. The first row
contains the numbers 1 to 9 in steps of two. The second row counts from 10 in steps of two
and stops at 18, the last value that does not go past the limit of 19.

Q7. >> b = [2:2:10; 11:2:20]

b=

     2   4    6     8    10

  11     13    15       17       19

The above statement works in the same way as the previous question, this time storing the matrix in b.

Q8. >> c = m-b

c=

  -1     -1   -1    -1       -1

  -1     -1   -1    -1       -1

This statement creates a variable called c and performs a mathematical calculation,
subtracting each number in variable b from the number in the same position in variable m.

Q9. >> a =[1 2 3; 4 5 6]

a=

     1   2    3

     4   5    6
Q10. >> b = a'

b=

     1   4

     2   5

     3   6

This transposes variable a, turning its two rows into two columns, and stores the result in a
new variable called b. Variable a itself is not changed.



Q11. >> c = a*b

c=

  14     32

  32     77

This is multiplying the two matrices. Each number in c comes from one row of a and one
column of b. To work out 14 (row 1 of a with column 1 of b) we do the following:

Multiply a(1,1) by b(1,1), which equals 1 x 1 = 1

We then

Multiply a(1,2) by b(2,1), which equals 2 x 2 = 4

We then

Multiply a(1,3) by b(3,1), which equals 3 x 3 = 9

We then add the three answers together

1 + 4 + 9, which equals 14

Below is the website link which I used to learn how matrices are multiplied

http://www.intmath.com/Matrices-determinants/4_Multiplying-matrices.php

Accessed on 28/09/2010 at 21:03
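
The row-by-column rule used in Q11 can be checked directly in Matlab. The lines below are only a sketch for illustration: the names c11 and c12 are my own and are not part of the practical sheet. They rebuild two entries of c by hand and compare them with the result of a*b.

```matlab
a = [1 2 3; 4 5 6];
b = a';                 % transpose of a, as in Q10
c = a * b;              % matrix product, as in Q11

% Entry c(1,1): row 1 of a times column 1 of b, then summed
c11 = a(1,1)*b(1,1) + a(1,2)*b(2,1) + a(1,3)*b(3,1);   % 1 + 4 + 9

% Entry c(1,2): row 1 of a times column 2 of b, written more compactly
c12 = sum(a(1,:) .* b(:,2)');                          % 4 + 10 + 18

disp([c11 c12])         % should match the first row of c
```

The other two entries of c follow the same pattern using row 2 of a.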

Q12. >> a(3,:) = []

??? Index of element to remove exceeds matrix dimensions.
Q13. >> a = [a(1,:) ; a(2,:); [7 8 0]]

a=

       1   2   3

       4   5   6

       7   8   0

Q14. >> z = []

z=

  []

This creates a variable called z but places no information inside it.



Q15. >> for i = 1:10;

z(i) = i*i;

end



Q16. z =

       1   4   9   16   25   36   49     64   81 100
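
The loop in Q15 makes z grow by one element on every pass. As a sketch only (the variable zVec is my own name), the same result can be produced by setting the size of z in advance, or without a loop at all by squaring the whole range in one step:

```matlab
% Loop version with z created at its final size first
z = zeros(1, 10);
for i = 1:10
    z(i) = i * i;       % store the square of i in position i
end

% Loop-free version: .^ squares every element of 1:10
zVec = (1:10).^2;

disp(isequal(z, zVec))  % displays 1 when both versions agree
```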
Week 2 – 27th September – 1st October


Lecture


As this week’s lecture was not available, I used the time to read through one of the articles to which we
have access, namely the Medical Informatics Past, Present and Future 2010 article. The article was
a great read and gave a good understanding of how far health informatics has come over the last
decade. As it says in the paper, “We can hardly imagine diagnostic procedures without, for instance,
diagnostic imaging tools such as computer tomography, or therapeutic actions without the software
that checks for medication interactions or uses computer-assisted tools for surgery”. Personally, I
would find it very strange to walk into my own doctor’s surgery and not check myself in automatically
using the touch screen arrival system, or to see the doctor writing inside one of the old patient files rather
than filling in boxes on their computer screen.


From reading the article you can see that health informatics is still a relatively new discipline and it is
only over the last couple of years that it has really grown, with more and more medical tasks
turning to some sort of computing.

Health care is constantly changing and so is health informatics; people and groups are constantly
trying to improve on existing methods and create new ways of achieving something, be this new equipment to help
monitor our elderly relatives in their homes or new ways of manipulating and creating images from
scans to achieve better diagnoses.



Tutorial


This week’s tutorial was a further extension of our work with the Matlab software. In the tutorial we were
shown how Matlab could be used to –

    1.   Plot 2D graphs
    2.   Manipulate the lines on the graphs, e.g. colour, style and marker style
    3.   Produce figure windows, which could then be saved as a JPEG file
    4.   Create an executable file
    5.   Plot medical data
    6.   Create subplots
    7.   Create contour plots
    8.   Create surface plots

This tutorial has shown how powerful this piece of software is and how useful it could be within
health informatics.
Practical


This week’s practical helped me to gain a better understanding of how to turn
medical data into visual form, i.e. graphs. In exercise one I was asked to create an
executable file which would plot a graph; below are the commands from the file and also the
figure that they produced.

In exercise two I was asked to plot a graph of EMG data; on this graph I also had to change
the line colours for each finger. Below are a few screenshots taken from exercise two.

The above image shows the commands used to plot the EMG data graph, which is also shown in the image.

The above image shows the EMG commands and a subplot of averages, which is now
included inside the same figure as the graph seen above.

The above image was produced using the commands seen in the screenshot. The graph
shows a plot of ECG data.
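
As the screenshots are not reproduced here, the lines below are a rough sketch of the kind of commands involved, not the commands from the practical itself. The data is made up: the sine waves, the 100 Hz sample rate and the three "fingers" are all assumptions used only so that the example runs on its own.

```matlab
% Made-up stand-in for the EMG recording: 200 samples for 3 fingers
t = (0:199)' / 100;                         % time in seconds (assumed 100 Hz)
emg = [sin(2*pi*1*t), 0.5*sin(2*pi*2*t), 0.25*sin(2*pi*3*t)];

figure;
subplot(2, 1, 1);                           % top panel: one line per finger
plot(t, emg(:,1), 'r-', t, emg(:,2), 'g--', t, emg(:,3), 'b:');
xlabel('Time (s)');
ylabel('Amplitude');
legend('Finger 1', 'Finger 2', 'Finger 3');
title('EMG data (made-up values)');

subplot(2, 1, 2);                           % bottom panel: averages
avg = mean(abs(emg));                       % mean absolute value of each column
bar(avg);
xlabel('Finger');
ylabel('Mean absolute amplitude');
```

The line styles 'r-', 'g--' and 'b:' set the colour and style of each line, and subplot(2, 1, n) places both graphs inside the same figure window. A command such as print('-djpeg', 'emg_plot.jpg') could then be used to save the figure as a JPEG file.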
Learning Log


Week 3 – 4th October - 8th October
Lecture
The first part of this week’s lecture showed us where to upload our assignments for the
module.

The main part of this week’s lecture was based on patient informatics. This part of medical
informatics is new and its aim is to empower patients. Two of the main facilities that would
be desired are;

    1. Email reminder of appointments
    2. Schedule your own appointments online

More and more people are now using the internet to research their medical conditions
and are also using this information to tell their doctors what they think is wrong (self-diagnosis).
Although there are a number of good websites out there, there are also quite a few bad ones.

One patient education website which we were shown during the lecture was the WebMD website.
The main features of this website are;

       Massive health library
       New treatment information
       Symptom checker

Another is Revolution Health, which appears to be more personalised, with its;

       Personal health record
       Health checker



Tutorial


This week’s tutorial was an overview of the second assignment.
Practical


This week’s practical was a research-based task. I had to research companies that claim to
provide telecare and telehealth. Below are definitions of telecare and telehealth;

Telecare
Telecare is a term given to offering remote care of elderly and vulnerable people, providing
the care and reassurance needed to allow them to remain living in their own homes.

Wikipedia - http://en.wikipedia.org/wiki/Telecare

Below are a few companies within Northern Ireland who claim to provide telecare




Figure 1 - http://www.aidcall.co.uk/healthcare/

Figure 2 - http://www.mcelwainegroup.com/index.php?page=mcelwaine-smart

Figure 3 - http://www.foldgroup.co.uk/pages/27/telecare

Telehealth


Telehealth is the delivery of health-related services and information via telecommunications
technologies.

Wikipedia - http://en.wikipedia.org/wiki/Telehealth

Below are a few companies within Northern Ireland who claim to provide telehealth

Figure 4 - http://www.hometelehealthltd.co.uk/

Figure 5 - http://www.telehealthsolutions.co.uk/

The results of this research will form the basis of our discussion topic in the week 4 tutorial.
Learning Log


Week 4 – 11th October - 15th October
Lecture


This week’s lecture was based on technologies that are used to take measurements of our bodies. The
topics discussed were;

        X-RAY
        MRI
        CT
        ULTRASOUNDS
        ECG
        EEG
        EMG
        EOG

The above technologies have allowed health care professionals to examine a patient’s

    1.   Nervous System
    2.   Cardiovascular System
    3.   Respiratory System
    4.   Skeletal System

Although X-rays are still safer than surgery, they have their problems: the radiation can cause radiation sickness
and can lead to mutations such as cancers. A major drawback of X-rays is that they can only produce a
two-dimensional image; this was overcome by the invention of the CAT or CT scan, which allowed
2D X-rays to be processed into 3D images.

The MRI scan then took this imaging to a new level of quality; it also reduced the risk to the patient and
is said not to be as harmful.

The next types of measurements are biosignals and these include;

        Electrocardiogram (ECG)

                The ECG records the electrical activity of the heart, from which the patient’s heart rate
                can be measured, usually using a 12-lead ECG machine.

        Electroencephalogram (EEG)

                The EEG is used to measure the brain’s electrical activity. This is done by using
                between 16 and 25 electrodes on the patient’s scalp.

        Electromyogram (EMG)

                The EMG is used to measure muscle function and activity. This is achieved by either
                placing needle electrodes into the muscles or by placing gel electrodes on the skin.
                Placing the needle electrodes into the muscle gives a more specific
                measurement.

        Electrooculogram (EOG)

                The EOG is used to measure the resting potentials of the retina. This is achieved by
                placing the electrodes either above, below or to the side of the eye.



Another type of measurement is the acoustic measurement, which picks up vibrations from the
heart and lungs and turns them into sounds. These measurements are taken using stethoscopes;
in more recent years electronic stethoscopes have been invented, which have a sensor
inside the chest piece.

Tutorial


As I was unable to attend this week’s tutorial, I used the time to do some research for my
assignment.

Practical


This week’s practical was an introduction to audio processing using a piece of software called
GoldWave. From using this product in the practical I found that the quality of the sound it
produced after the filters were applied was excellent. After completing the last task, where I
removed the person talking from the breathing patterns, I thought that the software was very
powerful; it didn’t seem to lose any quality or any of the breathing patterns. Below are a couple
of screenshots taken from the practical.
Learning Log


Week 5 – 18th October – 22nd October


Lecture


This week’s lecture was an introduction to medical data and how it is processed, to PACS and
ePrescribing, and to the associated security issues.

Medical data is crucial to information processing and decision making; computers are used to
process this information in three ways

    1. Observation
    2. Diagnosis
    3. Therapy

This medical data can be anything from ECG results to family history; it is usually something that can be
observed. There are four different types of data;

    1.   Narrative data
    2.   Discrete Numerical Values
    3.   Analog Data
    4.   Visual Data

Picture Archiving and Communication Systems, or PACS, are computers, commonly servers, which allow
medical professionals to

        View images – for example X-Rays
        Archive images
        Communicate these images between different areas

PACS uses an independent standard for image storage: Digital Imaging and
Communications in Medicine, or DICOM.

ePrescribing is the introduction of paperless prescriptions. The doctor will simply fill in the
prescription on screen and send it directly to your pharmacy. The aim of ePrescribing is to reduce
the number of errors that currently occur; for example, 1 in 20 hospital admissions within the UK are
thought to be related to medication errors.

ePrescribing may be a good idea and may save lives, but thinking about it for my own local area, I
have two pharmacies next to my doctor’s surgery that I could use, and I know people who will travel to a
pharmacy nearer their home, for example in Lisburn, where there must be at least 20 pharmacies. So
when the system is implemented all these pharmacies will have to be listed, and an
error could occur where the doctor accidentally selects the wrong pharmacy. The patient won’t find
out that their prescription has gone to the wrong pharmacy until they go to their usual one, and then how
do they find out which pharmacy it has gone to?

That’s one problem that I envisage could happen, but I am sure some sort of measure
could be put in place to prevent it.



Tutorial


This week’s tutorial fell in a reading week. While writing the log I searched around for some information
on PACS and found a couple of websites which talk about PACS; please find the links below.

NHS Connect - http://www.connectingforhealth.nhs.uk/systemsandservices/pacs

eHow - http://www.ehow.co.uk/about_6771301_job-description-pacs-administrator.html
Learning Log


Week 6 – 25th October - 29th October
Lecture


This week’s lecture was based on patient records; these are historical records of patient care.
Previously patient records had been paper based and this led to a number of problems, which
included

    1.   Illegible handwriting
    2.   Loss due to fire
    3.   Loss due to flood etc.
    4.   Loss due to human error

Paper records also take up a lot of room; if every person on earth had a paper patient record there
wouldn’t be enough room to store them all. Below is an example of a patient records warehouse.

Now the paper-free patient records era has begun with the introduction of the EHR, or Electronic Health
Record, which is a repository of electronically maintained information about an individual’s
health.

The electronic health records system has five functional components

   1.   Integrated view of data
   2.   Clinical decision support
   3.   Clinical order entry
   4.   Access to knowledge resources
   5.   Integrated communications support

EHR systems have the potential to bring huge benefits to both patients and health professionals, and
this is the reason why they are being implemented across the developed world. The EHR aims to
provide easy navigation through the entire medical history of a patient.

There are a number of different uses for the EHR system; these include

   1.   Inpatient
   2.   Outpatient
   3.   Primary care
   4.   Disease specific
   5.   Intensive care
   6.   Emergency department
   7.   Hospitals
   8.   Nursing homes
   9.   Research departments

The main disadvantages for this system are the

       Initial costs
       Maintenance costs
       Treatment of old paper based records
       Security

Below is an example of an EHR system
Tutorial


This week’s tutorial was an introduction to the English health service’s IT programme, called NPfIT or National
Programme for IT. It was announced in 2002 and was due to be completed within 7 to 8 years at a
cost of £6 billion. The project has still not been completed and is well over budget. The main
components of the system were to be a

       National record system
             o Electronic transfer of prescriptions
             o Choose and book
             o PACS
             o NHS care records service
       IT infrastructure

The aims of the system were to provide

    o   Improved sharing of patient records
    o   The ability for patients and GPs to book hospital appointments
    o   ePrescribing
    o   A national network (N3)
    o   NHS email services
    o   PACS
    o   An online personal health organiser
    o   An NHS care website for both patients and care providers
    o   A common user interface in partnership with Microsoft – in researching the user interface I
        found the Microsoft website - http://www.mscui.net/



Practical


I will include this week’s practical in the logs for weeks 7 and 8.
Week 7 – 1st November – 5th November
Practical
      Attributes are the variables

      Total number of instances – 150

      Percentage of correctly classified instances – 96%

      Percentage of incorrectly classified instances – 4%

      Cross-validation is a method of estimating the performance of a predictive model

      A confusion matrix is a visualisation tool used in supervised learning. Each row represents the
       instances from one class

      Every instance would be placed in its correct class

      In this confusion matrix, class a has 49 correct and one has been incorrectly placed in class b

               Class b has 47 correct and 3 have been placed incorrectly in class c

               Class c has 48 correct and 2 have been placed incorrectly in class b

               Overall 6 have been wrongly classified
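
The percentages above can be worked out from the confusion matrix itself. The sketch below rebuilds the matrix from the counts listed in this log; the layout (rows for the actual class, columns for the class chosen by the classifier) is my assumption, and the variable names are my own.

```matlab
% Confusion matrix rebuilt from the counts above
% (assumed layout: rows = actual class a, b, c; columns = assigned class)
cm = [49  1  0;
       0 47  3;
       0  2 48];

total     = sum(cm(:));              % every instance in the data set
correct   = sum(diag(cm));           % the diagonal holds the correct ones
incorrect = total - correct;         % everything off the diagonal
accuracy  = 100 * correct / total;   % percentage correctly classified

fprintf('%d of %d correct (%.0f%%), %d wrong\n', ...
        correct, total, accuracy, incorrect);
```

The correct classifications lie on the diagonal, so 49 + 47 + 48 = 144 of the 150 instances are correct, which gives the 96% reported, and the remaining 6 are the wrongly classified instances.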
Week 8 – 8th November – 12th November
Practical


       University of Massachusetts Amherst

Citation Policy:

If you publish material based on databases obtained from this repository, then, in your
acknowledgements, please note the assistance you received by using this repository. This will help
others to obtain the same data sets and replicate your experiments. We suggest the following
pseudo-APA reference format for referring to this repository:

Frank, A. & Asuncion, A. (2010). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml].
Irvine, CA: University of California, School of Information and Computer Science.

Here is a BiBTeX citation as well:

@misc{Frank+Asuncion:2010 ,
author = "A. Frank and A. Asuncion",
year = "2010",
title = "{UCI} Machine Learning Repository",
url = "http://archive.ics.uci.edu/ml",
institution = "University of California, Irvine, School of Information and Computer Sciences" }

A few data sets have additional citation requests. These requests can be found on the bottom of
each data set's web page.

       To donate a data set to the repository you simply fill out the online form and attach the
        dataset

http://archive.ics.uci.edu/ml/about.html

http://archive.ics.uci.edu/ml/citation_policy.html

http://archive.ics.uci.edu/ml/donation_policy.html

http://archive.ics.uci.edu/ml/donation_form.html

Task 1
Citation - B. Kaluza, V. Mirchevska, E. Dovgan, M. Lustrek, M. Gams, An Agent-based Approach to
Care in Independent Living, International Joint Conference on Ambient Intelligence (AmI-10),
Malaga, Spain, In press

Abstract: Data contains recordings of five people performing different activities. Each person wore
four sensors (tags) while performing the same scenario five times.

http://archive.ics.uci.edu/ml/datasets/Localization+Data+for+Person+Activity
Task 2


Data Set Characteristics:     Multivariate      Number of Instances:      336
Attribute Characteristics:    Real              Number of Attributes:     8
Associated Tasks:             Classification    Missing Values?           No

Instances – data points / records

Attributes – features / variables

Dataset – collection of data points / records

Associated tasks – the kinds of analysis that the dataset is suited to, here classification

Missing values – whether any values are missing from the dataset; missing values can lead to incorrect results

Attribute list -

1. Sequence Name: Accession number for the SWISS-PROT database

2. mcg: McGeoch's method for signal sequence recognition.

3. gvh: von Heijne's method for signal sequence recognition.

4. lip: von Heijne's Signal Peptidase II consensus sequence score. Binary attribute.

5. chg: Presence of charge on N-terminus of predicted lipoproteins. Binary attribute.

6. aac: score of discriminant analysis of the amino acid content of outer membrane and periplasmic
proteins.

7. alm1: score of the ALOM membrane spanning region prediction program.

8. alm2: score of ALOM program after excluding putative cleavable signal regions from the
sequence.

http://archive.ics.uci.edu/ml/datasets/Ecoli
Resubmit
Cross-validation is a technique for assessing how the results of a statistical analysis will generalize to
an independent data set. It is mainly used in settings where the goal is prediction, and one wants to
estimate how accurately a predictive model will perform in practice. One round of cross-validation
involves partitioning a sample of data into complementary subsets, performing the analysis on one
subset (called the training set), and validating the analysis on the other subset (called the validation
set or testing set). To reduce variability, multiple rounds of cross-validation are performed using
different partitions, and the validation results are averaged over the rounds.

Cross-validation is used within data mining to estimate how well a model will generalise, which
helps when comparing classifiers or fine-tuning their settings.
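
The partitioning described above can be sketched in a few lines of Python. This is only a minimal illustration, not how Weka implements it: the function names `k_fold_indices` and `cross_validate` and the `evaluate` callback are my own assumptions, and the folds here are shuffled but not stratified by class.

```python
import random

def k_fold_indices(n_instances, k, seed=0):
    """Split range(n_instances) into k shuffled, near-equal folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]

def cross_validate(n_instances, k, evaluate, seed=0):
    """Average evaluate(train, test) over the k rounds.

    `evaluate` is a hypothetical callback that trains on the `train` indices
    and returns a score measured on the `test` indices.
    """
    folds = k_fold_indices(n_instances, k, seed)
    scores = []
    for i, test in enumerate(folds):
        # Every fold except fold i is used for training in round i.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(evaluate(train, test))
    return sum(scores) / k

folds = k_fold_indices(569, 10)
print([len(fold) for fold in folds])
```

Each instance lands in exactly one test fold, so every record is used for validation once and for training in the other rounds.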

A confusion matrix is a visualisation tool for classification results. Each row of the matrix shows
the instances in an actual class and each column shows the instances in a predicted class. It can be
used to check whether a system is confusing two classes.

A perfect classifier would produce a matrix with non-zero counts only on the diagonal.
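
As a small sketch of how such a matrix is built, the Python below counts (actual, predicted) pairs. It uses the same layout as Weka's "classified as" output, with rows as actual classes and columns as predicted classes. The labels in `actual` and `predicted` are invented for illustration and are not taken from any dataset in this log.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical labels, made up for this example.
actual    = ["M", "M", "B", "B", "B", "M"]
predicted = ["M", "B", "B", "B", "M", "M"]

matrix = confusion_matrix(actual, predicted, ["M", "B"])
for label, row in zip(["M", "B"], matrix):
    print(row, "| actual =", label)
```

The off-diagonal cells are the mistakes: here one M was classified as B and one B was classified as M.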
Learning Log


Week 9 – 15th November – 19th November


Practical


Task 1
Task 2
Task 3.1


Supervised learning is where the machine infers a function from labelled training data. The
training data consists of training examples. Each example is a pair consisting of an input object
and an output value. The supervised algorithm analyses the training data and produces an
inferred function or classifier.
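
To make the idea of (input, output) pairs concrete, here is a minimal sketch of a very simple supervised learner: a one-level decision rule (a "stump") on a single numeric input. It is not the J48 algorithm used later in this log. The function name `train_stump` and the training pairs are assumptions made up for illustration, and the sketch assumes exactly two class labels and at least two distinct input values.

```python
def train_stump(examples):
    """Infer a threshold classifier from (input value, label) pairs.

    Tries the midpoint between each pair of neighbouring inputs and keeps
    the threshold that misclassifies the fewest training examples.
    """
    values = sorted({x for x, _ in examples})
    labels = sorted({y for _, y in examples})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        # Try both ways round: which label goes below the threshold.
        for below, above in [(labels[0], labels[1]), (labels[1], labels[0])]:
            errors = sum(
                (below if x <= threshold else above) != y for x, y in examples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    _, threshold, below, above = best
    # The returned function is the "inferred classifier".
    return lambda x: below if x <= threshold else above

# Hypothetical training pairs: (area, class), with B = benign, M = malignant.
training = [(400.0, "B"), (520.0, "B"), (610.0, "B"), (950.0, "M"), (1200.0, "M")]
classify = train_stump(training)
print(classify(450.0), classify(1000.0))
```

The learner never sees a rule written by hand; it only sees the labelled pairs and infers the threshold from them.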

Task 3.2
=== Run information ===



Scheme:     weka.classifiers.trees.J48 -C 0.25 -M 2

Relation: WDBC-weka.filters.unsupervised.attribute.Reorder-
R2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,1-
weka.filters.supervised.attribute.AddClassification-Wweka.classifiers.rules.ZeroR-
weka.filters.supervised.attribute.AddClassification-Wweka.classifiers.rules.ZeroR-
weka.filters.supervised.attribute.AddClassification-Wweka.classifiers.rules.ZeroR-
weka.filters.supervised.attribute.AddClassification-Wweka.classifiers.rules.ZeroR-
weka.filters.AllFilter

Instances: 569

Attributes: 31

        radius1

        texture1

        perimeter1

        area1

        smoothness1

        compactness1

        concavity1

        concave1

        symmetry1

        fractal_dimension1

        radius2

        texture2

        perimeter2
          area2

          smoothness2

          compactness2

          concavity2

          concave2

          symmetry2

          fractal_dimension2

          radius3

          texture3

          perimeter3

          area3

          smoothness3

          compactness3

          concavity3

          concave3

          symmetry3

          fractal_dimension3

          class

Test mode: 10-fold cross-validation



=== Classifier model (full training set) ===



J48 pruned tree

------------------



area3 <= 880.8

| concave3 <= 0.1357
| | area2 <= 36.46: B (319.0/3.0)

| | area2 > 36.46

| | | radius1 <= 14.97

| | | | texture2 <= 1.978: B (11.0)

| | | | texture2 > 1.978

| | | | | texture2 <= 2.239: M (2.0)

| | | | | texture2 > 2.239: B (3.0)

| | | radius1 > 14.97: M (2.0)

| concave3 > 0.1357

| | texture3 <= 27.37

| | | concave3 <= 0.1789

| | | | area2 <= 21.91: B (12.0)

| | | | area2 > 21.91

| | | | | perimeter2 <= 2.615: M (6.0/1.0)

| | | | | perimeter2 > 2.615: B (6.0)

| | | concave3 > 0.1789: M (4.0)

| | texture3 > 27.37: M (21.0)

area3 > 880.8

| concavity1 <= 0.0716

| | texture1 <= 19.54: B (9.0/1.0)

| | texture1 > 19.54: M (10.0)

| concavity1 > 0.0716: M (164.0)



Number of Leaves :       13



Size of the tree :       25
Time taken to build model: 0.06 seconds



=== Stratified cross-validation ===

=== Summary ===



Correctly Classified Instances       530          93.1459 %

Incorrectly Classified Instances      39          6.8541 %

Kappa statistic                  0.8544

Mean absolute error                 0.0741

Root mean squared error               0.2579

Relative absolute error            15.8366 %

Root relative squared error           53.331 %

Total Number of Instances             569



=== Detailed Accuracy By Class ===



               TP Rate  FP Rate  Precision  Recall  F-Measure  ROC Area  Class
                 0.925    0.064      0.895   0.925      0.91      0.927  M
                 0.936    0.075      0.954   0.936      0.945     0.927  B
Weighted Avg.    0.931    0.071      0.932   0.931      0.932     0.927



=== Confusion Matrix ===



 a b <-- classified as

196 16 | a = M

 23 334 | b = B
Task 3.3
Sensitivity – 0.931 (weighted average TP rate)

Specificity – 0.929 (1 − the weighted average FP rate of 0.071)
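
Sensitivity and specificity can also be read directly off the confusion matrix for a single positive class. The sketch below uses the J48 counts from Task 3.2 and assumes malignant (M) is treated as the positive class; the log does not state which class is positive, so that choice is my assumption.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Counts from the J48 confusion matrix in Task 3.2, taking M as the positive class.
tp, fn = 196, 16   # actual M: classified as M, classified as B
fp, tn = 23, 334   # actual B: classified as M, classified as B

sens, spec = sensitivity_specificity(tp, fn, fp, tn)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
# prints: sensitivity = 0.925, specificity = 0.936
```

These match the TP rates Weka reports for M (0.925) and B (0.936), because the specificity for M is the same as the TP rate for B in a two-class problem.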

Task 3.4
Random Forest Classification –

        Correct Classification – 95.7821%
        Incorrect Classification – 4.2179%
        TP rate – 0.958
        Kappa – 0.91

Decision Table Classification –

        Correct Classification – 94.0246%
        Incorrect Classification – 5.9754%
        TP rate – 0.94
        Kappa – 0.871

JRip Classification –

        Correct Classification – 92.7944%
        Incorrect Classification – 7.2056%
        TP rate – 0.928
        Kappa – 0.846



Task 3.5
In terms of correct classification, Random Forest was the best with 95.7821%.

In terms of TP rate, Random Forest had the highest with 0.958.

In terms of kappa, Random Forest had the highest with 0.91.

Of the three classifiers compared above, Random Forest provided the best results.
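
The kappa statistic used in this comparison measures how much better the classifier agrees with the true classes than chance would. The confusion matrices for the three classifiers above are not recorded in this log, so the sketch below recomputes kappa from the J48 matrix in Task 3.2 instead. The function name `cohen_kappa` is my own.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows actual, columns predicted)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: for each class, (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1 - expected)

j48_matrix = [[196, 16], [23, 334]]
print(round(cohen_kappa(j48_matrix), 4))
# prints: 0.8544
```

This agrees with the kappa of 0.8544 that Weka reported for J48, so a higher kappa such as 0.91 means more agreement beyond chance.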
Task 3.6
Results from increasing the FOLD

                         Correct              Incorrect                TP                  Kappa
                      Classification        Classification
     10 fold            95.7821%               4.2179%                0.958                 0.91
     20 fold            93.3216%               6.6784%                0.933                0.8577
     30 fold            94.9033%               5.0967%                0.949                0.8905
     40 fold            94.0246%               5.9754%                0.94                 0.8719
     50 fold            95.2548%               4.7452%                0.953                0.8982


Changing the fold from 10 up to 40 made the results worse; 10 fold provided the best
classification. When I entered 50 fold the results appeared to start improving. To see whether a
higher fold gives a better result, I decided to enter a fold of 100. The results are below.

                         Correct              Incorrect                TP                  Kappa
                      Classification        Classification
    100 fold            95.4306%               4.5694%                0.954                0.9011

As you can see, the results improved slightly on 50 fold (95.4306% against 95.2548%) but were still
below the 10-fold result (95.7821%).
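
To see what changing the fold actually does to the data, the sketch below works out how the 569 instances are divided for each fold setting tried above. It is plain arithmetic and assumes the folds are made as equal as possible; Weka's exact splitting may differ in detail. The function name `fold_sizes` is my own.

```python
def fold_sizes(n_instances, k):
    """Sizes of k near-equal folds: the first n % k folds get one extra instance."""
    base, extra = divmod(n_instances, k)
    return [base + 1] * extra + [base] * (k - extra)

for k in (10, 20, 30, 40, 50, 100):
    sizes = fold_sizes(569, k)
    print(f"{k:>3} folds: test folds of {min(sizes)}-{max(sizes)} instances, "
          f"about {569 - max(sizes)} for training")
```

With 100 folds each test fold holds only 5 or 6 instances. The differences between the runs are small, so they may partly reflect how the instances happened to be split rather than a real effect of the fold count.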
Learning Log


Week 10 – 22nd November – 26th November


Practical


Task 2.1


Unsupervised learning is a class of problems where you seek to determine how the data is organised.
There are many methods employed here which are based on data mining methods used to pre-
process data. It is different from supervised learning as the learner is only given unlabelled
examples.
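
As a rough illustration of learning from unlabelled examples, the sketch below groups numbers into two clusters using a simple k-means loop. This is a different, simpler technique from the EM clusterer used in the tasks below. The function name `k_means_1d`, the starting centres and the measurements are all assumptions made up for this example.

```python
def k_means_1d(values, centres, iterations=10):
    """Group unlabelled numbers around the nearest of the given starting centres."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for v in values:
            # Assign each value to the centre it is closest to.
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters

# Hypothetical unlabelled measurements; no class column is supplied.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centres, clusters = k_means_1d(values, centres=[0.0, 6.0])
print(centres)
print(clusters)
```

No labels are given at any point; the two groups come only from how the values are spread.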

Task 2.3
I expect to see two clusters from the dataset

Task 2.5
Sensitivity = 0.08421
Specificity = 0.04761


Task 2.6
EM -1

Using EM-1 did not cluster the data correctly.
EM -2

        Sensitivity = 0.5507
        Specificity = 0.1383


Task 3.1


Data cleansing is where the detection and correction or removal of corrupt or inaccurate records
from the record set takes place.
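
A minimal sketch of what a cleansing step might do is shown below. It is not one of Weka's filters. The function name `cleanse`, the field names and the records are invented for illustration. The one borrowed convention is that "?" marks a missing value, as it does in ARFF files.

```python
def cleanse(records, valid_classes):
    """Drop records with missing values or an unknown class; tidy the class label."""
    cleaned = []
    for record in records:
        if any(value is None or str(value).strip() in ("", "?")
               for value in record.values()):
            continue  # removal: the record has a missing value
        label = str(record["class"]).strip().upper()
        if label not in valid_classes:
            continue  # removal: the class label is not recognised
        # Correction: store the tidied class label.
        cleaned.append({**record, "class": label})
    return cleaned

# Hypothetical records, invented for illustration.
records = [
    {"radius": 14.2, "class": "B"},
    {"radius": None, "class": "M"},    # missing value
    {"radius": 20.1, "class": " m "},  # stray spaces and wrong case
    {"radius": 17.9, "class": "X"},    # unknown class label
]
print(cleanse(records, {"B", "M"}))
```

The first record is kept as it is, the third is corrected and kept, and the other two are removed.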



Task 3.2




Data cleansing algorithms can be found in Weka under the Preprocess tab, by selecting Filter.

				