Bioinformatics analysis with Perl
basic level, 7.5 ECTS
Welcome to this course in Perl programming with applications in
bioinformatics. The course is an introduction to programming, with the aim
of teaching you how to program in the language Perl specifically. The
exercises and assignments have a bioinformatics flavor, i.e., they show you
how to analyze and process biological data.
This introductory lecture gives some practical information about the
content of the course.
The course coordinator and examiner is Angelica Lindlöf,
who can be contacted by email: firstname.lastname@example.org or by phone:
Lecturers and supervisors on the course will be Angelica Lindlöf and
Benjamin Ulfenborg. Benjamin can be contacted by email:
Course lecture overview
Course assignment/laboration overview
Introduction to bioinformatics
Install Perl on your computer
Short version of course syllabus Slide 2
In this introductory lecture the above topics will be covered. In order to get
started with Perl programming it is important that you read through the last
Describe the basic principles of procedural programming
In a structured way, analyze and break down simpler
problems into smaller sub-problems
Based on a general description of a bioinformatics
problem, create a procedural program that solves the
problem
Interpret and create pseudo code for simpler
problems
Interpret and create procedural programs in the
programming language Perl
Short version of course syllabus Slide 2
Here is a short version of the syllabus, just giving you an overview of the
aims of the course.
The course focuses on the procedural programming paradigm, which you
can read more about here:
You therefore need to learn how to think procedurally, i.e., how to break
down a problem into sub-problems and create a solution for each sub-problem.
Using procedural programming in Perl, you will look at some
bioinformatics problems and learn how to analyze biological data with a
small Perl program.
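As a rough illustration of procedural thinking, the sketch below splits one small task (reporting how often each base occurs in a DNA sequence) into three sub-problems, each solved by its own subroutine. The subroutine names and the example sequence are made up for this illustration and are not taken from the course material.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sub-problem 1: clean up the raw input (upper case, no whitespace).
sub clean_sequence {
    my ($raw) = @_;
    (my $seq = uc $raw) =~ s/\s+//g;
    return $seq;
}

# Sub-problem 2: count how often each of the four bases occurs.
sub count_bases {
    my ($seq) = @_;
    my %count = (A => 0, C => 0, G => 0, T => 0);
    for my $base (split //, $seq) {
        $count{$base}++ if exists $count{$base};
    }
    return %count;
}

# Sub-problem 3: turn the counts into one printable report line.
sub format_report {
    my (%count) = @_;
    return join ' ', map { "$_=$count{$_}" } sort keys %count;
}

# Main program: combine the sub-solutions to solve the whole problem.
my $dna    = clean_sequence("acgt tgca\naag");   # made-up example sequence
my %counts = count_bases($dna);
print format_report(%counts), "\n";              # prints "A=4 C=2 G=3 T=2"
```

Each subroutine can be written and tested on its own, which is the main point of breaking a problem down in this way.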
You will also be introduced to pseudo code later in the course, but here is an
introduction to this topic:
Course overview – 8 modules
1. Introduction to procedural programming
2. Data types
3. Program control
4. Input and output (I/O)
5. Subroutines and references
6. Problem solving and pseudo code
7. Regular expressions
Course lecture content Slide 3
This course covers 8 modules, or topics, which correspond to approximately 12
lectures in the campus version. Some topics therefore require more study
time than others, namely modules 3, 5 and 7. Spend some extra time on
these in order to fully understand them.
For each lecture you will be given reading instructions in the form of a
PowerPoint presentation (similar to this one), which highlights the important
things to focus on. There will also be references to the course literature,
which you are expected to read, and exercises tied to each topic. It is
strongly recommended that you do all the exercises, since there is only one
way to learn how to program: by actually programming.
Programming is not a theoretical subject but very much a practical one, and
therefore you cannot learn it by memorization alone.
Consequently, this course requires that you do a lot of self-study and
practice by programming.
Course overview – 5 laborations
1. Data types
2. Program control + I/O
3. Subroutines and references
4. Problem solving and pseudo code
5. Regular expressions
Course laboration content Slide 4
The course has five optional assignments/laborations, which give 60 points
in total. Although they are optional, you are strongly recommended to do
them in order to pass the course.
For each assignment you are required to hand in a report. More details on
this will be available on the home site.
The assignments are strictly individual and no cooperation is allowed! Any
suspicion of cheating will be acted upon. This also applies to the report: you
are not allowed to copy text from someone else’s work without proper
referencing.
Written home exam – 60 points
Assignment points will be added to exam points, i.e.
exam points + assignment points = final points
Final points set the grade
Assignments – 60 points
5 optional assignments
Total 60 points
Examination Slide 5
The examination consists of two parts: a written home examination that
gives at most 60 points, and the assignments, which give at most
60 points. In total you can therefore get at most 120 points. In order to
pass you need at least 60 points after the written exam and assignment
points have been added together.
Once again, the assignments as well as the home examination are
individual! Any external material you use must be referenced.
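The point arithmetic can be sketched in Perl as follows. Only the two 60-point maxima and the 60-point pass limit come from the text above; the subroutine names and the example student are hypothetical, and grade limits above a pass are not modeled since they are not stated here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $MAX_EXAM        = 60;   # maximum points on the written home exam
my $MAX_ASSIGNMENTS = 60;   # maximum points on the five assignments together
my $PASS_LIMIT      = 60;   # final points needed to pass

# Final points = exam points + assignment points.
sub final_points {
    my ($exam, $assignments) = @_;
    die "exam points out of range\n"
        if $exam < 0 || $exam > $MAX_EXAM;
    die "assignment points out of range\n"
        if $assignments < 0 || $assignments > $MAX_ASSIGNMENTS;
    return $exam + $assignments;
}

# Returns 1 if the final points reach the pass limit, otherwise 0.
sub has_passed {
    my ($exam, $assignments) = @_;
    return final_points($exam, $assignments) >= $PASS_LIMIT ? 1 : 0;
}

# Hypothetical student: 35 exam points and 30 assignment points.
my $total = final_points(35, 30);
printf "Final points: %d of %d, passed: %s\n",
    $total, $MAX_EXAM + $MAX_ASSIGNMENTS,
    has_passed(35, 30) ? "yes" : "no";
# prints "Final points: 65 of 120, passed: yes"
```

Note that a student who skips the optional assignments would need all 60 exam points to reach the pass limit, which is why doing the assignments is strongly recommended.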
Tisdall, J. (2001) Beginning Perl for Bioinformatics.
Schwartz et al. (1997) Learning Perl
Online book: http://docstore.mik.ua/orelly/perl/learn/index.htm
Lindlöf, A. (2005) Programming in Perl – with
Applications in Bioinformatics
which will be used as reference material and is useful for those
who are already familiar with programming in general
Literature Slide 7
There are two course books, by Tisdall and by Schwartz et al., which will
be referred to frequently. You need to buy the first, but the second is
available on the internet.
There is also a reference compendium available, which is useful if you are
already familiar with programming and want a jump-start into Perl.
Bioinformatics – what is it?
Application of information technology to molecular biology
The use of computer science to solve biological
problems or analyze biological data within the field of
For example, analyze gene sequences
Introduction to bioinformatics Slide 8
Bioinformatics has developed into a large research field, involving
thousands of researchers all over the world. In order to understand the
exercises and assignments in this course, it helps to have a basic
understanding of what bioinformatics is.
A bioinformatician is a person who either uses existing tools/software to
analyze molecular biological data or develops new computer programs to
do the same. You commonly start with an existing tool to solve a problem
or analyze some experimental data, but often run into the problem
that the program cannot do exactly what you want it to do.
You then have to develop your own small analysis tool (commonly
referred to as a script), and that is what you can use Perl for.
One common task for biologists is to analyze thousands of gene
sequences, e.g., to find out each gene’s function and its relatedness to other
genes. You cannot do this by hand, since it would take far too many
working hours. Instead, you develop a script to do the work for you, which
can reduce the task to just one or a couple of hours.
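To give a feeling for what such a script might look like, here is a minimal sketch that loops over a handful of sequences and reports the length and GC content (the percentage of G and C bases) of each. The gene names and sequences are invented placeholders, and GC content is only one simple example of a per-sequence analysis; a real script would read thousands of sequences from a file instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented gene names and sequences, standing in for real data.
my %genes = (
    geneA => 'ATGCGCGTTA',
    geneB => 'ATGAAATTTTAA',
    geneC => 'ATGGGCCCGG',
);

# Percentage of G and C bases in a sequence.
sub gc_content {
    my ($seq) = @_;
    return 0 if length($seq) == 0;
    my $gc = ($seq =~ tr/GCgc//);   # tr/// returns the number of matching characters
    return 100 * $gc / length($seq);
}

# The same analysis is applied to every sequence, however many there are.
for my $name (sort keys %genes) {
    printf "%s\tlength=%d\tGC=%.1f%%\n",
        $name, length($genes{$name}), gc_content($genes{$name});
}
```

The loop does not change when the number of sequences grows, which is what makes a script so much faster than working by hand.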
For this topic read the following pages:
Tisdall: pp. vii-xii, Preface
Tisdall: pp. 1-5, Chapter 1
Thereafter, try to answer the following questions (you may need to look
things up in an encyclopedia or search the internet for more information):
- What is DNA?
- What is a DNA sequence?
- What is a protein sequence?
- What does the concept “in silico” mean?
- What is an NP-complete problem?
Installing Perl on your computer
Operating system and platform?
Windows, Unix, Linux …
Application depending on operating system
Run Perl programs
Installing Perl Slide 9
How to install Perl on your computer depends on which operating system
you have. Perl is a small application that needs to
be downloaded and installed on your computer. Read the following
pages in order to install Perl:
Tisdall: pp. 8-17 in Chapter 2
Schwartz: chapters 1.1-1.3
Make sure that you install the correct Perl for your operating system.
To run Perl programs under Windows you need to open an MS-DOS
command window, which can be found under:
Start -> All programs -> Accessories -> Command prompt
Type perl -v at the prompt to see whether Perl has been installed
on your computer.
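Once the version check works, a first test program can confirm that Perl also runs scripts. The sketch below assumes you save it under the hypothetical file name hello.pl and run it by typing perl hello.pl at the prompt; the subroutine name is also just an example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the greeting in a subroutine so it is easy to reuse.
sub greeting {
    # $^V holds the version of the Perl interpreter running this script.
    return sprintf "Hello from Perl %vd", $^V;
}

print greeting(), "\n";
```

If the installation is working, the program prints a greeting followed by the same version number that perl -v reports.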
Perl programs are written in a text editor, e.g., Notepad if you are
using Windows. For Windows users I can recommend Crimson Editor,
which is a neat text editor. It is freely available
and can be downloaded from the following page:
If you decide to use Crimson you can find the MS-DOS window under
the Tools menu.
If you run into problems, do not hesitate to contact me!