
Online Foreign Currency Comparison Tool



Student: Jamie Law
Student ID: 7146174
Supervisor: Sean Bechhofer

Final Year Project Report
Computer Science BSc
University of Manchester, May 5, 2010

1 Abstract
The goal of this project is to produce an online comparison tool similar to those
available on MoneySupermarket.com [a]. The comparison tool will be capable of
allowing a user to quickly determine which company offers the best rate on a
particular foreign currency [b].
    This project is successful in extracting the foreign currency rates from several
online providers and producing a sortable table of results for comparison by the
user.

   [a] MoneySupermarket.com is a website specialising in comparing the prices of many
companies for a wide range of products, from insurance (car, travel, home etc.) to mobile
phone contracts, to find the best deal.
   [b] Thus saving them the trouble of having to manually find and search 10+ different
providers to find the best rate.

2 Acknowledgements
Upon entering third year I had the task of choosing what to do for my third year
project. It was my supervisor's early involvement that prompted me in the right
direction and led me on to a suitable project to undertake. Over the course of
the year he has been of great help in making sure I stuck to the task at hand and
prevented me from straying off in the wrong direction. I'm grateful for his help
and support in my final year, and for his insight on how best to tackle the
challenging aspects of my project.

Contents

1 Abstract
2 Acknowledgements
3 Introduction
   3.1 Background
   3.2 Project Proposal
   3.3 Use Case
   3.4 Existing Work
   3.5 Tackling the Problem
   3.6 Chapter Overview
4 Design
   4.1 System Architecture
5 Implementation
   5.1 Database
   5.2 Update Scripts
   5.3 Index Page
   5.4 Sorting Routines
6 Results
   6.1 Discussion of Results
   6.2 Summary of Results
7 Testing and Evaluation
   7.1 Testing Approach
   7.2 Updating data with fresh data
   7.3 Checking the correct values are being extracted
   7.4 Estimating the accuracy of the data
   7.5 Summary of Testing
8 Conclusions
   8.1 Achievement of Objectives
   8.2 Changes to Original Plan and Expectations
   8.3 Future Work
   8.4 Conclusion and Personal Comments

3 Introduction

3.1 Background

Online comparison tools are cropping up on an almost daily basis on the Inter-
net. More and more people are using them as a means to quickly check several
providers for the best rates available for a particular product in a bid to save both
time and money.
   The idea behind these websites is that the owners profit from the commis-
sions earned through the referral of visitors to the respective websites. In the
foreign currency market, however, there is by its very nature no affiliate
incentive scheme, and so there is only one online comparison tool able to compare
the foreign currency providers [4].

3.2 Project Proposal

The proposal was to create an online comparison tool that would produce a sortable
table to allow a user to determine which company offers the best exchange rate
for a particular foreign currency. The tool would be updated automatically at set

time intervals with fresh data to reflect the latest exchange rates available from the
respective companies. The user would then be able to click through and visit the
site from the table in order to purchase the currency at the rate displayed.
    A user should be able to see if a particular company offers the foreign cur-
rency requested. The user should also be able to sort the table alphabetically and
numerically, in order to check their preferred company's exchange rates if desired [c].

3.3 Use Case

As an example of the type of person who would use this online comparison tool,
consider a person who is travelling abroad next week and hasn’t got the time to
pop in to town to purchase some foreign currency from the travel agents.
    Either at work or home, the user searches the Internet for 'foreign cur-
rency exchange rates'. The first result [d] is my site, which the user visits and
sorts by AUD [e], filtering the data numerically according to the best
rates available. The user then proceeds to click through to the online company to
purchase their foreign currency.
    The alternative is for the user to take time out of their schedule to pay a
visit to the town centre to purchase currency. With this comes the chore of
visiting several shops in order to find out who has the best rate and by how
much, or of manually visiting the known online retailers to see what their
exchange rates are.
   [c] It is well known that users tend to stick to established, well-known, trusted brands
when making purchases over the Internet.
   [d] This is just a suggested use case, as there are thousands of possible keywords the user
could type. Naturally, for a keyword such as 'foreign currency exchange rates' with lots of big
competition, the chances of my newly established site appearing first are slim to none, but this
is merely a theoretical example.
   [e] Australian Dollars.

    This comparison tool will alleviate the additional time and effort taken by
presenting the user with the necessary data to make a decision all in one place.
The need for a one-stop shop is clearly there, as can be seen from the success of
online comparison sites such as MoneySupermarket.com, which in the UK alone
sees millions of users a year visit its website to use its comparison tools [1].

3.4 Existing Work

This project is based on the use of the Simple HTML DOM Parser [f], which is a
PHP library [2]. The art of scraping data from sites isn't new, but the ability to
extract dynamic data is something that more and more sites are trying to tackle.
With the evolution of algorithms such as Genetify [g], we need to come up with
more robust solutions for manipulating data from websites [3].
    The Simple HTML DOM Parser hasn't been updated in several years, meaning
that once websites make use of better dynamic algorithms to boost sales the
parser will become ineffective for tackling data of this nature. This is because
the algorithm used in Genetify learns over time from the behaviour of visitors
to the website, so the identifying characteristics of data on external websites
could be manipulated by the Genetify algorithm to better serve the visitors to
that website, leaving you with no way of directly accessing a particular piece of
data if it changes. That is, of course, if we only take into account the semantics
of the data as opposed to the data values.
   [f] The PHP Simple HTML DOM Parser is a PHP library which allows you to extract and
manipulate data from websites.
   [g] Genetify is similar in nature to Google Website Optimizer, except that you can apply
weighting to the split A/B and multi-variate tests.

3.5 Tackling the Problem

There are a handful of different methods used to scrape data from websites. As
I've previously used the Simple HTML DOM Parser, I felt at ease using that li-
brary to fulfill the requirements I had set for the project. Alternative solutions
could have included converting the websites to XML, in order to give the data
some meaning and format so that XQuery could be used on it. Using a lexical
analyser would also have been possible, allowing me to convert the websites into
a series of tokens which could then be processed and analysed.

3.6 Chapter Overview

Design: This chapter will explain the design process behind the comparison tool,
as well as demonstrating how it automatically updates the data to keep it fresh.
Implementation: This chapter will explain how the tool was implemented, with
special attention paid to the problems encountered at the different stages.
Testing: This chapter will explain the testing methodology used and give some
examples of the data being verified manually.
Evaluation: This chapter will critically evaluate the project and reflect upon
both the successes and failures of the project.
Conclusion: This chapter will give a personal overview of the project and how
it went, along with a look into what the future holds for the tool.

4 Design
This chapter explains the high-level design aspects of the project and the rea-
sons why they were chosen, while the exact implementation of these features is
left for later chapters.

4.1 System Architecture

The system is split up into:

   • Database.

   • Update Scripts.

   • Index Page.

   • Sorting Routines.

   The database holds the data from the various providers and stores it for
easy and efficient manipulation. It contains three tables: currencies, currency
values and sites. Each one is updated by the update scripts, which overwrite the
data when and if it changes. I chose the MySQL database management system as it
is both free and extensively documented online. I'm also at ease with it, as I've
used it for several years now to store all kinds of different data.
   The update scripts contain the code to extract the data from the external web-
sites so it can be stored in the database for later manipulation. These scripts
are triggered by cron jobs running on the local server, which force them to run
every 6 hours. Once triggered, an update script checks the respective company
website to see if the data has changed and, if necessary, updates the database
with the fresh data.
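A schedule of the kind just described might look like the following crontab entry; the script path and filename here are illustrative assumptions, not the project's actual ones:

```shell
# Hypothetical crontab entry: run the provider update scripts every 6 hours
# (at 00:00, 06:00, 12:00 and 18:00). The path is illustrative only.
0 */6 * * * php /var/www/currency-tool/update_scripts/run_all.php
```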
   The index page is what holds the front-end code to display the table and pull
it all together. This page makes several calls to the database to pull the data and
put it into a table for viewing.
   The sorting routines are what allow the table to be organised alphabetically
and numerically.

    The development process used during my project was an incremental one,
using the Unified Process (UP) [h]. The most important aspect of following this
process was identifying critical tasks early on; in the case of this project that
was making sure that sites were scrapable and would allow it.
    I implemented the Model-View-Controller (MVC) design pattern, which iso-
lates the application logic (in this case the extraction of data into the database)
from the UI, thus permitting independent development and testing of each,
because one isn't directly reliant on the other per se. The table can be thought of
as the View. The controller can be thought of as the sorting mechanism on the
table. The database and data extraction (i.e. the back-end) can be thought of as
the Model.

5 Implementation

5.1 Database

The database contains three separate tables for the different types of currency,
currency values and the sites which are to be scraped. These tables are queried
by the update script in order to both update them and use the data to scrape the
relevant sites.
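A minimal sketch of the three tables described above is given below; the table names, column names and types are illustrative assumptions rather than the project's exact schema:

```sql
-- Hypothetical MySQL schema; names and types are illustrative assumptions.
CREATE TABLE sites (
    site_id INT AUTO_INCREMENT PRIMARY KEY,
    name    VARCHAR(100) NOT NULL,
    url     VARCHAR(255) NOT NULL        -- page the update script scrapes
);

CREATE TABLE currencies (
    currency_id INT AUTO_INCREMENT PRIMARY KEY,
    code        CHAR(3) NOT NULL,        -- e.g. 'AUD'
    name        VARCHAR(50) NOT NULL     -- e.g. 'Australian Dollars'
);

CREATE TABLE currency_values (
    site_id     INT NOT NULL,
    currency_id INT NOT NULL,
    rate        DECIMAL(10,4) NOT NULL,  -- overwritten when the rate changes
    updated_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (site_id, currency_id),
    FOREIGN KEY (site_id)     REFERENCES sites(site_id),
    FOREIGN KEY (currency_id) REFERENCES currencies(currency_id)
);
```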

5.2 Update Scripts

There is a specific script for each site, containing both the site URL and the
types of currency available from that site. The scripts contain the code necessary
to access the elements on the respective websites in order to extract the data and
update the database if necessary.
   [h] The Unified Software Development Process is a popular iterative and incremental
software development process framework.
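The extraction step can be sketched as follows. The project itself uses the Simple HTML DOM Parser, but to keep this snippet self-contained the same idea is shown with PHP's built-in DOMDocument and DOMXPath; the HTML fragment and the `rate-aud` class name are invented for illustration:

```php
<?php
// Illustrative sketch: find a rate cell by a CSS class identifier and return
// its value, or null if the identifier no longer matches (e.g. the provider
// has changed its page layout).
function extract_rate(string $html, string $cellClass): ?float
{
    $doc = new DOMDocument();
    // Suppress warnings caused by imperfect real-world markup.
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//td[@class='$cellClass']");
    if ($nodes->length === 0) {
        return null; // identifier not found: site layout may have changed
    }
    return (float) trim($nodes->item(0)->textContent);
}

// Example fragment standing in for a provider's rates table.
$html = "<table><tr><td class='rate-aud'>1.6423</td></tr></table>";
$rate = extract_rate($html, 'rate-aud'); // 1.6423
```

In the real update scripts the `$html` would be fetched from the site URL stored in the database rather than hard-coded.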

5.3 Index Page

This is a PHP file which creates an HTML table and imports data from the database
into the correct rows and columns to populate it.
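The table-building idea can be sketched as below. In the real index page the rows come from MySQL queries; here a hard-coded array with made-up provider names stands in for the database result:

```php
<?php
// Minimal sketch of the index page: turn a result set into an HTML table.
// The provider names and rates are invented example data.
function render_table(array $rows): string
{
    $html = "<table>\n<tr><th>Provider</th><th>Currency</th><th>Rate</th></tr>\n";
    foreach ($rows as $row) {
        $html .= sprintf(
            "<tr><td>%s</td><td>%s</td><td>%.4f</td></tr>\n",
            htmlspecialchars($row['provider']), // escape in case of odd names
            htmlspecialchars($row['currency']),
            $row['rate']
        );
    }
    return $html . "</table>\n";
}

$rows = [
    ['provider' => 'Example FX',    'currency' => 'AUD', 'rate' => 1.6423],
    ['provider' => 'Sample Travel', 'currency' => 'AUD', 'rate' => 1.6380],
];
echo render_table($rows);
```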

5.4 Sorting Routines

The sorting routines allow the user to order the columns both numerically and
alphabetically.
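The two sort orders can be sketched as a single comparator-driven routine; the function name and the choice of descending order for rates (best rate first) are illustrative assumptions about the tool's behaviour:

```php
<?php
// Sketch of the sorting routines: numeric sort puts the highest (best) rate
// first; alphabetical sort orders providers A to Z.
function sort_rows(array $rows, string $column, bool $numeric): array
{
    usort($rows, function ($a, $b) use ($column, $numeric) {
        if ($numeric) {
            if ($a[$column] == $b[$column]) {
                return 0;
            }
            return ($b[$column] > $a[$column]) ? 1 : -1; // descending
        }
        return strcmp($a[$column], $b[$column]);          // ascending
    });
    return $rows;
}

$rows = [
    ['provider' => 'B Travel', 'rate' => 1.61],
    ['provider' => 'A Money',  'rate' => 1.64],
];
$byRate = sort_rows($rows, 'rate', true);      // best rate (1.64) first
$byName = sort_rows($rows, 'provider', false); // 'A Money' first
```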

6 Results

6.1 Discussion of Results

This project was successful in creating a comparison tool that the user can use
to establish which company offers the best exchange rate on a particular foreign
currency.

6.2 Summary of Results

In this chapter we have looked at the tool and provided an overview of the func-
tionality available in the table to the user.
    In the next chapter I will talk about how I went about testing the tool.

7 Testing and Evaluation

7.1 Testing Approach

The testing approach I used involved a combination of both white and black box
testing of the system. The white box testing involved running tests with the knowl-
edge of the internal structure of the system e.g. database. This allowed me to
check for example if the data was being stored correctly in the database and if it
was getting updated with fresh data. The black box testing involved external test-
ing of the system without any knowledge of how the internal structure worked.
These tests involved ensuring the correct values were in each cell in the table and
that this matched up with what was being displayed on the company website.

7.2 Updating data with fresh data

In order to test whether fresh data was replacing old data as it should, I first had
to find some old data. Once I did, I manually ran the update script for
that specific site to see if the old data was replaced and the table updated accordingly.
I'm pleased to say it was, and so everything on this end is working fine.

7.3 Checking the correct values are being extracted

This was a problem I noticed during the demo and in testing: the identifiers
I was using to extract data were sometimes changing, meaning that effectively I
was pulling the wrong values. To combat this problem I need to introduce some
more robust integrity checks, to ensure that the data being extracted is
reasonable and in line with the other data for the same currency from other
sites.
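One plausible shape for such an integrity check is sketched below: a freshly scraped rate is accepted only if it lies within a tolerance of the median rate for the same currency across the other sites. The function name and the 10% tolerance are illustrative assumptions, not the project's implementation:

```php
<?php
// Hypothetical sanity check: reject a scraped rate that is wildly out of
// line with the other sites' rates for the same currency, which usually
// means the wrong cell was extracted.
function is_plausible(float $newRate, array $otherRates, float $tolerance = 0.10): bool
{
    if (empty($otherRates)) {
        return true; // nothing to compare against
    }
    sort($otherRates);
    $n = count($otherRates);
    $mid = (int) ($n / 2);
    $median = ($n % 2)
        ? $otherRates[$mid]
        : ($otherRates[$mid - 1] + $otherRates[$mid]) / 2;
    return abs($newRate - $median) <= $tolerance * $median;
}

is_plausible(1.64, [1.61, 1.63, 1.66]);  // true: close to the median
is_plausible(164.2, [1.61, 1.63, 1.66]); // false: likely the wrong cell
```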
   For the majority of data the correct values were being extracted and had not
changed over a 3-month period. But a more robust solution to tackle the problem
if something does change is something I will be implementing over the Summer.

7.4 Estimating the accuracy of the data

In order to estimate the accuracy of the data I needed to manually check each com-
pany website to see if the value was accurate at the time of checking. As this was
likely to take around an hour, and the values could be updated in that time, I
opened the company websites in tabs to ensure I was comparing all the values at
the same point in time.
   There are 48 different values in the table which needed checking, and 41 of
those values were accurate. By accurate I mean it was the same value that was
displayed on the company website and therefore up-to-date and fresh. This means
that approximately 85 percent of values are accurate, which I'm reasonably happy
with. Ideally I'd like that to be in the 95+ percent range, so I will monitor the
results of this test on a daily basis for a week to see if there are any regular
occurrences, such as a particular site which updates more often than others. If
possible I will then increase the frequency with which I scrape the data from that
specific site, to make sure the data is as fresh as possible.

7.5 Summary of Testing

In this chapter we have looked at how I tested the tool in order to determine that
the data was accurate.
   These tests included:

   • Updating data with fresh data.

   • Checking the correct values are being extracted.

   • Estimating the accuracy of the data by determining what percentage is up-to-date.

8 Conclusions

8.1 Achievement of Objectives

When I set out to tackle this project my objectives were to:

   • Create a table of foreign currencies from several different providers.

   • Allow the table to be sorted both alphabetically and numerically.

   • Maintain fresh data so the exchange rates are accurate or very close.

   • Simplify the task of searching for the best exchange rate online.

   I successfully managed to import all the necessary data from several different
online providers into a database, in order to allow users to manipulate it according
to the foreign currency they desired. This data was sortable both alphabetically
and numerically, by the type of currency and also by the provider. The data is
replenished in the database on a 6-hour rotation, to ensure both that the external
servers of the providers are not hit so frequently as to trigger a ban, and that
the data is updated regularly throughout the day. I feel confident in saying that
the task of searching for the best exchange rate is simplified by the use of the
comparison tool. A user can now visit the site and search up to 10 different
providers automatically by sorting the table, which is a great saving in time
compared with the manual method of searching each individual provider for the
correct currency.

8.2 Changes to Original Plan and Expectations

One important lesson this project has taught me is that of time management. It is
so easy to slip into the trap of underestimating the time taken to complete parts
of the project, which led to me continually having to update my Gantt chart to
reflect the latest project timeframe. A software engineering principle is that
whatever timeframe you estimate for something, you should always triple it. It is
only in hindsight, and with the experience gained on this project, that I have
realised how true this is.
    In an ideal world I would have liked to have had the tool looking more polished
on the UI front. Whilst the tool performs as expected and fulfills the objectives it
would have been nice to have it looking visually appealing. This is something that
I plan to do over the Summer and is discussed in the ’Future Work’ subsection.

8.3 Future Work

The tool as it stands is functional, but the UI and general aesthetics need improve-
ment. This is something that will be done as the number of visitors to the site
increases and the Genetify algorithm has been running for long enough to satisfy
me that there are clear changes to make based on the results of the weighted el-
ements. I plan to run a series of experiments to determine things such as which
colour is most effective and what naming conventions users prefer for currencies,
e.g. AUD or Australian Dollars.
    Google is continually improving in the web application field, which paves the
way for automated graphs to be created to give a historical view of foreign
currency rates. This would allow a user to see where a particular currency has
been in terms of value over a set period of time. Ideally, I'd like to get to the
stage where a user can select a currency and then choose a period of time to
display on the graph.

Having searched for this myself before purchasing currency online I know this to
be a useful feature.
    One of the more important aspects is the ability to integrate more robust val-
idation of data into the back-end in order to spot anomalies in the extracted data.
Online stores are continually adapting and becoming more advanced and algorith-
mic in their approach to displaying data on their sites. In this day and age it is
no longer a given that data fields will remain the same for long. Technology and
advances in online algorithms such as Genetify mean that more sophisticated
methods have to be used.
    It would be nice to track when and if particular sites update at a regular time.
If they do this would allow me to time the update scripts to ensure maximum
freshness of data and possibly reduce the number of times I scrape the sites on a
daily basis.

8.4 Conclusion and Personal Comments

This project has resulted in the creation of an online comparison tool capable of
comparing the exchange rates from several different online providers for a number
of different currencies.
    Having always wanted to understand how comparison sites such as MoneySu-
permarket.com work and make their money, I feel a certain sense of accomplish-
ment in knowing that I could replicate many of their tools with the knowledge
gained from this project. It will also allow me to contribute my knowledge to the
existing MoneySavingExpert.com [i] comparison tools, and hopefully allow me to
design and create some new ones for them.
   [i] MoneySavingExpert.com is a website dedicated to providing advice to consumers on how
to find the best deals, and has the consumer's interest at heart in its motto of "standing up
for the little guy".

   I expect the big push by Google into the use of microformats will change how
data scraping is done and simplify it, as the Internet evolves towards a more
standardised markup. I hope in the coming years that future students are able to
look at this report as a basis for how best to tackle the problems of data
scraping, and the various ways it can be achieved when there is no standardised
markup in use.
   I hope to continue this project over the Summer and develop a very clean and
polished UI for it. I then plan to push it out as a widget through the WordPress
repositories and develop more comparison tools for use by the average consumer.
Whilst this project has been lengthy, it has been of great benefit in exploring
the issues of the semantic web, and it has sparked an ambition for me to convert
my own personal sites to microformats where possible to allow better indexing.

References

 [1] Alexa Site Information. Alexa provides traffic statistics for websites.

 [2] PHP Simple HTML DOM Parser. An HTML DOM parser written in PHP 5+
     that lets you manipulate HTML in a very easy way.

 [3] Genetify. An algorithm which applies weighting to A/B and multi-variate
     tests.

 [4] Travel Money. Find the best online deal for your holiday cash.
