Technical
CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
Sébastien Moretti
Fabrice Armougom
Olivier Poirot
Cédric Notredame
www.tcoffee.org
T-Coffee Web Server Installation
(September 2006)
Centre National de la Recherche Scientifique, France
T - C O F F E E W E B S E R V E R I N S T A L L A T I O N
License and Terms of Use
    T-Coffee is distributed under the Gnu Public License
    T-Coffee code can be re-used freely
    T-Coffee can be incorporated in any pipeline: Plug-in/Plug-out…
    A Web of Web-Servers
Addresses and Contacts
    Contributors
    Addresses
Pre-requisites: Setting up Your Environment
    Getting the t-coffee server files ?
    Installing an Apache http (web) server:
        Downloading Apache rpm
        Finding Apache documentation
        Choosing the right Port for Apache
        Checking Apache Status
    Installing Perl
    Installing Perl Modules
    Network clients:
    Which Account
Program repository
    PATHs, Rights and BIN_DIR
        BIN_DIR
        Execution Permission
        Archives
    Intel Compilers
    T-coffee
    Blast
    The PDB 3D Structure Database
    webblast
        Description
        Installation
        Using a BLAST Cluster
    SAP
    Fugue
        Joy
        Fugue
    M-coffee
    Exonerate
    Protogene
        Installation
        Perl modules
    Wrapping it up
HTML repository
    Documentation repository
    Image repository
    Temporary and result repository (say Tmp):
    T-Coffee cache
    Web security:
        Access Rights
        Redirection
        Robots
CGI repository
    Perl modules required:
    The CGI_DIR
    Web security:
Environmental Variables
    Defining your block
    Logos and colors
    Paths for the different repositories:
    E-mail and cache:
License and Terms of Use
T-Coffee is distributed under the Gnu Public License
Please make sure you have agreed with the terms of the license attached to the
package before using the T-Coffee package or its documentation. T-Coffee is free,
open-source software distributed under the GNU General Public License (GPL). This
means that there is no restriction on its use, in either an academic or a
non-academic environment.
T-Coffee code can be re-used freely
Our philosophy is that code is meant to be re-used, including ours. No permission is
needed, although we are always happy to receive pieces of improved code.
T-Coffee can be incorporated in any pipeline: Plug-in/Plug-out…
Our philosophy is to ensure that as many methods as possible can be used as
plug-ins within T-Coffee. Likewise, we will give as much support as possible to anyone
wishing to turn T-Coffee into a plug-in for another method. For more details on how
to do this, see the plug-in and the plug-out sections of the Tutorial Manual.
Again, you do not need our permission to use either T-Coffee or your method as a
plug-in/out, but if you let us know, we will ensure the stability of T-Coffee within
your system through future releases.
A Web of Web-Servers
Please let us know when you set up a mirror or your own web server, as we try to
maintain a list of active mirrors around the world.
Addresses and Contacts
Contributors
The T-Coffee Web Server is developed by the following people:
Olivier Poirot
Sébastien Moretti moretti.sebastien@gmail.com
Fabrice Armougom
Cédric Notredame cedric.notredame@europe.com
Addresses
We are always very eager to get user feedback. Please do not hesitate to drop
us a line at: cedric.notredame@europe.com. The latest updates of T-Coffee are
always available at: www.tcoffee.org.
Pre-requisites: Setting up Your Environment
Getting the t-coffee server files ?
You can get this documentation and the t-coffee server archive upon request from
cedric.notredame@europe.com.
Installing an Apache http (web) server:
Downloading Apache rpm
Apache manages the applications running over the web server. We recommend
Apache 2 (release 2.2.3). Check the installed version with:
ROOT: apache2ctl -v
WARNING ! Depending on your distribution, 'apache2' may have to be replaced by apache,
httpd2, httpd…
Apache is often installed by default on UNIX and Linux systems. Otherwise, use the
following command to install Apache from an rpm:
ROOT: rpm -Uvh apache2....rpm
Finding Apache documentation
Apache can also be downloaded from httpd.apache.org/. You will find all
the information you need to make a complete installation at:
httpd.apache.org/docs/
www.linux-france.org/prj/edu/p-mcurie/apache.html
www.apachefrance.com/
Choosing the right Port for Apache
By default, Apache listens on port 80. If this port is already used by another web
server, you will need to change the Apache default:
1- open the file httpd.conf
2- change the line Listen 80 to Listen 8080 or Listen 8090
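As an illustration, the two steps above can be scripted. The sketch below is not part of the server distribution: the function name set_listen_port is made up for this example, and the location of httpd.conf depends on your distribution. It keeps a backup copy of the file and rewrites only uncommented Listen lines that carry a bare port number.

```sh
# set_listen_port FILE PORT   (hypothetical helper, not shipped with the server)
# Rewrites the "Listen <port>" directive of an Apache configuration file,
# keeping a backup copy of the original in FILE.bak.
set_listen_port() (
    conf=$1
    port=$2
    cp "$conf" "$conf.bak" || exit 1
    # Only lines of the form "Listen <number>" are changed; commented lines
    # (starting with #) do not match the pattern and are left untouched.
    sed 's/^\([[:space:]]*\)Listen[[:space:]][[:space:]]*[0-9][0-9]*[[:space:]]*$/\1Listen '"$port"'/' \
        "$conf.bak" > "$conf"
)
```

For instance, as root, set_listen_port /etc/apache2/httpd.conf 8080 (the path is an assumption) switches the server to port 8080. Apache must be restarted for the change to take effect.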
Checking Apache Status
Once Apache is installed, you must check that it is running. As root, try this
command:
ROOT: /etc/init.d/apache2 status
If Apache is running, you will see that its status is running, otherwise you should
start it with:
ROOT: /etc/init.d/apache2 start
You can also get the system to start apache automatically:
chkconfig apache2 on
Finally, open the following URL with your favorite web browser:
http://localhost/
If you have modified the default port, use
http://localhost:###
where ### is the value given to the Listen directive.
You should now see a web page with information from your distribution and an
Apache logo, or these simple words: It works !
Installing Perl
The second main program you need is Perl. You need it to run the
CGI scripts, which are the skeleton of the T-Coffee servers, and some other
programs. Perl normally comes with your distribution. Nevertheless,
you can get it from source at the Perl homepage: www.perl.org/. Choose Perl 5.6
or higher (5.8.8 is the latest stable release at the time of writing). Use the
following command to check your current version:
perl -v
Install every module contained in your distribution, as some of these can be difficult
to install directly from source (e.g. the GD module).
Installing Perl Modules
Several of the web server scripts described here require the installation of some
extra Perl modules. These modules can be installed through the Comprehensive Perl
Archive Network (CPAN). Let us imagine we want to install the following module: URI::Escape.
perl -MURI::Escape -e 1 # Prints nothing if the module is installed, an error otherwise
If the module is missing, you will have to install it through CPAN. Before you do so,
make sure you have either wget, ncftp or lynx installed, then, as root, run:
ROOT: perl -MCPAN -e shell #validate everything if first use
Validate everything and choose a mirror in your country. To install Perl modules,
and their dependencies automatically, simply use this command from the CPAN
shell:
install URI::Escape #Or any other package
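When many modules are needed, the same check can be run in a loop. The sketch below is only an illustration; the function name missing_perl_modules is an assumption of this example. It relies on the fact that perl -MModule -e 1 exits with an error when the module cannot be loaded.

```sh
# missing_perl_modules MODULE...   (hypothetical helper)
# Prints the name of every module that the local perl cannot load.
missing_perl_modules() {
    for module in "$@"; do
        # "-e 1" is a trivial program: perl exits right after loading the module.
        perl -M"$module" -e 1 2>/dev/null || printf '%s\n' "$module"
    done
}
```

For instance, missing_perl_modules URI::Escape Getopt::Long prints the names of the modules that still have to be installed from the CPAN shell, and prints nothing when both are available.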
WARNING !
If the download times out, your network may be down, or you may be
behind a NAT or firewall and need to query CPAN with ftp in passive mode:
1-Locate the libnet.cfg file
2-Edit or add the ftp_int_passive setting and set it to 1
If you don't have this file, you can get it from: bioanalyse.free.fr/libnet.cfg . Copy
it into your Perl library location, which should look something like
/usr/lib/perl5/5.8.8/Net/ (where 5.8.8 is your Perl version)
Network clients:
You need to have ftp and wget installed to install the server and run it. These
packages are normally part of most standard UNIX distributions.
Warning: Our CGI scripts are not yet optimized for mod_perl or FastCGI, but they should be in the
near future.
Which Account
It is strongly advised to create a tcoffee account where all the packages and
programs will be installed.
Program repository
The first thing to do when setting up your server is to install all the programs that
T-Coffee needs to run properly. This section explains what you need and how to
install it.
PATHs, Rights and BIN_DIR
BIN_DIR
Before you start, you must decide where to install all the programs that are needed
to run the server. The most convenient thing to do is to copy ALL your executables
into some standard directory (BIN_DIR). We recommend:
/usr/local/bin/
although you may use any other convenient location.
You will need to be root in order to copy programs into the /usr/local/bin directory.
Execution Permission
Unix is a multi-user environment where not everybody is allowed to do anything
anywhere... This is why UNIX has very precise control of permissions (x:
execution, w: writing, r: reading). It is your responsibility to ensure that Apache has
the right to run the applications it needs on your server. To achieve this, you may
have to change the execution permissions of your programs. For instance, the
following command allows execution (+x) of the file program by all users (a).
chmod a+x program
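To find the programs that still need this treatment, you can list the files of BIN_DIR that are not executable by everybody. This is a minimal sketch: the function name list_not_executable is made up for this example, and it assumes a find command that supports -maxdepth (GNU and BSD find do).

```sh
# list_not_executable DIR   (hypothetical helper)
# Prints every regular file directly inside DIR that lacks the execute
# bit for at least one of user, group and others (i.e. not "chmod a+x").
list_not_executable() {
    find "$1" -maxdepth 1 -type f ! -perm -111
}
```

For instance, list_not_executable /usr/local/bin prints nothing once every program has been made executable with chmod a+x.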
Archives
We also recommend creating a directory containing all the program packages,
where you will store the program archives, their configuration files (especially for blast,
fugue, dialign-t and poa) and so on. Keep that directory apart from BIN_DIR.
Intel Compilers
Compilers are programs that turn source code into executables (programs). A proper
Intel compiler can increase your performance from 20% up to 200% (okay, that's
only twice as fast, but still...). Intel provides a series of compilers optimized for its
processors at:
www.intel.com/cd/software/products/asmo-na/eng/compilers/
You can use them free of charge for evaluation, for 32 and 64 bit architectures. We
will use the C/C++ and the Fortran compilers, which are provided as RPM packages.
Here is how to download and install the icc compiler:
1-Get your evaluation license from Intel
2-Copy your license file onto your computer
3-ROOT: gunzip -c l_cc_....tar.gz | tar xvf -
4-ROOT: cd l_cc_dir
5-ROOT: ./install.sh
6-add the icc programs to your PATH
7-link the icc libraries to /usr/lib/
Note: For 64 bit architectures, you have to link the 64 bit libraries to /usr/lib64/
Note: You can also use your default compiler (gcc on most systems)
T-coffee
T-Coffee is a multiple sequence alignment package. It also behaves like a shell
containing several packages such as seq_reformat, aln_compare and iRMSD.
You will need T-Coffee to run most of the services associated with this server
(iRMSD, Expresso, Core, Combine...). The installation procedure of T-Coffee is
described in detail in the technical documentation. Here is a brief outline of how to
compile T-Coffee using the icc compiler rather than gcc:
1- Get the latest T-Coffee version from: www.tcoffee.org
2- gunzip -c T-COFFEE_distribution_Version_X.xx.tar.gz | tar xvf -
3- cd T-COFFEE_distribution_Version_X.xx/
4- cd t_coffee_source
5- make CC='icc -O3' t_coffee
6- cd ..
7- cp bin/* /usr/local/bin (or any BIN_DIR)
Warning: If the compilation of a program fails with icc, try to compile this program with
a standard compiler.
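If the same build commands have to work on machines with and without the Intel compiler, the choice can be made automatically. The sketch below is only an illustration; pick_compiler is a name made up for this example, and gcc is assumed to be the standard compiler of the system.

```sh
# pick_compiler   (hypothetical helper)
# Prints "icc" when the Intel compiler is found on the PATH, "gcc" otherwise.
pick_compiler() {
    if command -v icc >/dev/null 2>&1; then
        echo icc
    else
        echo gcc
    fi
}
```

From the t_coffee_source directory, the compilation step then becomes: make CC="$(pick_compiler) -O3" t_coffee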
Blast
BLAST searches sequence databases for sequences homologous to a given query
sequence.
BLAST is needed by several applications, including Protogene and Expresso. The
easiest way to get BLAST is to download the binary corresponding to your
architecture from:
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/
You should then copy the executables into the appropriate location on your system
(your BIN_DIR):
cp blastall /usr/local/bin (or any other BIN_DIR)
cp formatdb /usr/local/bin
BLAST also needs to know where to find the substitution matrices and other data it
needs to run. This is declared through the environment variable BLASTMAT:
export BLASTMAT=packages_directory/blast/data/
Add this last line to your /etc/profile (as root) if you want to use BLAST as a
command line program.
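When the installation is repeated, the same export line should not pile up in the profile. Here is a minimal sketch of an idempotent way to add it; the function name add_export is an assumption of this example, and the BLAST data path must be replaced by your own.

```sh
# add_export FILE NAME VALUE   (hypothetical helper)
# Appends "export NAME=VALUE" to FILE unless FILE already exports NAME.
add_export() (
    file=$1
    name=$2
    value=$3
    touch "$file" || exit 1
    # Leave the file alone when the variable is already declared.
    if grep -q "^export $name=" "$file"; then
        exit 0
    fi
    printf 'export %s=%s\n' "$name" "$value" >> "$file"
)
```

For instance, as root: add_export /etc/profile BLASTMAT packages_directory/blast/data/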
The PDB 3D Structure Database
PDB is a database containing all the protein (and nucleic acid) sequences with a known 3D
structure. Many T-Coffee applications require the PDB database. You
will need the sequences of these structures, which you can get from:
ftp://ftp.rcsb.org/pub/pdb/derived_data/pdb_seqres.txt
The easiest way to get this database is to use the PDB_seqres-transfert.sh script
that we provide (cf. server distribution). It will automatically update your local
database every week if you copy it into your cron directory. Edit it and adapt it to
your system first.
ROOT: cp PDB_seqres-transfert.sh /etc/cron.weekly/
If you have spare disk space, you can also mirror the whole PDB structure
database (about 7 GB) on your computer with our PDB_transfert.sh script.
Having the full PDB installed is not compulsory. Whenever there is no local mirror,
the structures are automatically fetched from RCSB. The only drawback is that it
makes things a bit slower.
Webblast
Description
Webblast.pl is a script that acts as an interface between T-Coffee and
BLAST, or between Protogene and BLAST. It allows these programs to efficiently
fetch the sequences they need.
Webblast can interact with all sorts of BLAST servers. The simplest (and slowest)
mode is the interaction with the NCBI BLAST, where your queries are sent to the
NCBI. This does not require any local installation. Webblast can also interact with
your local copy of BLAST (blastall) and locally installed databases. Finally, webblast
can also use BLAST clusters like the gigablaster
(www.igs.cnrs-mrs.fr/adele/~database/remoteblast.cgi) as long as it is properly configured.
Installation
Install Webblast following the described procedure and copy the executable into
your BIN_DIR (/usr/local/bin if you have been using the defaults). Normally the
required Perl packages should be installed automatically during the default
installation procedure. If they are not, install them manually using the
procedure described in the first section of this manual:
strict, warnings, diagnostics, CGI, Env, LWP::UserAgent,
HTTP::Request::Common, HTML::Parser, URI::Escape, Getopt::Long
At the top of webblast.pl, edit the following fields:
my $database_expresso=<PDB Seqres.fasta>;
my $blast_dir_expresso=<blastall>;
my $BLASTMAT="export BLASTMAT= <matrix dir of BLAST>";
my $BLASTDB="export BLASTDB=<PDB Seqres File>";
my $runblast=<runblast.pl>;
Make sure you indicate the absolute path of the files shown between angle brackets. For instance:
my $database_expresso="/usr/local/data/PDB_seqres.fasta";
my $blast_dir_expresso="/usr/local/bin/blastall";
my $BLASTMAT="export BLASTMAT=/usr/local/data/blast/";
my $BLASTDB="export BLASTDB=/usr/local/data/PDB_seqres.fasta";
my $runblast="runblast.pl";
Finally, copy your modified webblast.pl into your BIN_DIR:
cp webblast.pl /usr/local/bin
Using a BLAST Cluster
If you want to use the gigablaster, make sure that the script runblast.pl is in your
BIN_DIR.
SAP
SAP aligns sequences using structural information. It uses the double dynamic
programming algorithm of Orengo and Taylor. SAP is needed by EXPRESSO.
SAP is available upon request from wtaylor@nimr.mrc.ac.uk and will compute
pairwise structure-based sequence alignments. T-Coffee/Expresso use Fugue to
combine sequences and structures. Download, compile and copy sap into your
BIN_DIR directory. Edit the makefile if you want to use icc rather than gcc.
Fugue
Fugue computes sequence-structure alignments using structural information
when only one structure is available. These types of alignments are sometimes
called threading. Fugue is needed by EXPRESSO.
Fugue is available from www-cryst.bioc.cam.ac.uk/fugue/. The license is
free of charge for academic users. Fugue can be tricky to install. You do not need
to install all the packages that come along with it (cf. T-Coffee documentation for more
details).
Joy
gunzip -c fugue_distribution.tar.gz | tar xvf -
cd fugue_distribution/
tar xvf joy-xx.tar
tar xvf joy_related-xx.tar
cd joy-xx/
./configure -prefix=$PWD
cd ../joy_related-xx
make
make check
make install
cd ..
ROOT: cp joy_dir/bin/joy /usr/local/bin (or any other BIN_DIR)
ROOT: cp joy_related_dir/bin/hbond /usr/local/bin
ROOT: cp joy_related_dir/bin/sstruc /usr/local/bin
Note: You can use the Intel Fortran compiler for the joy_related tools.
Fugue
cd fugue
make CC=icc
cp fugueali /usr/local/bin (or any other BIN_DIR)
ln -s /full_path/fugue_distribution/ /data
ln -s /full_path/fugue_distribution/fugue/ /data/fugue/SUBST
M-coffee
M-Coffee is a special mode of T-Coffee that combines the output of several
multiple sequence alignment methods.
M-Coffee requires the installation of several multiple sequence alignment packages
(Muscle, POA, …). Follow the procedure described in the T-Coffee documentation.
Exonerate
Exonerate is a splice alignment program. It makes pairwise alignments between
protein sequences and their genomic counterparts. Exonerate is needed by
Protogene to turn multiple protein sequence alignments into their bona fide multiple
DNA sequence alignments. Exonerate is available from:
www.ebi.ac.uk/~guy/exonerate
T-Coffee and Protogene have only been tested with version 1.0.0. Exonerate
requires the glib-config and glib-devel packages to be installed. Compile exonerate:
./configure --prefix=$PWD; make; make check; make install
and copy its binary into your BIN_DIR (/usr/local/bin):
cp exonerate/bin/exonerate /usr/local/bin
Protogene
ProtoGene is a Perl script that generates bona fide translations or gene structures
from protein sequence alignments. Protogene uses BLAST to identify the gene
corresponding to each of your proteins, and then uses Exonerate to make a splice
alignment between your protein and its genomic counterpart. Protogene is meant to
interact with BLAST and T-Coffee.
Installation
Download protogene. All the following operations must be done in your home
directory.
Protogene comes with its own Perl modules. Create a .perllib/ directory in your
home directory:
mkdir ~/.perllib
and either copy the *.pm files that are in the protogene distribution into it, or
make symbolic links to them (use absolute paths as link targets, otherwise the links
will dangle):
ln -s ~/protogene/*.pm ~/.perllib/
Edit the Protogene script to specify this location (line 17). Be sure to use absolute paths
rather than relative ones.
use lib 'Path_to_ProtoGene_cache/.perllib/';
Edit line 31 to specify the BIN_DIR
$ENV{'PATH'} .= ':BIN_DIR';
Protogene stores all its temporary results in a cache directory. You must create this
directory in your home. For instance:
mkdir ~/.protogene
You must then declare the location of this directory on line 38 of the protogene
script. Use absolute paths.
my $cachedir='My_cache_dir';
Copy the script into your BIN_DIR so that the program can be used and accessed by
apache
ROOT: cp protogene /usr/local/bin
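The directory preparation described in this section can be gathered in one function. This is only a sketch under stated assumptions: the name setup_protogene_home is made up for this example, and the directory of the unpacked ProtoGene distribution is passed as an argument. The edits to the protogene script itself (lines 17, 31 and 38) still have to be made by hand.

```sh
# setup_protogene_home DIST_DIR   (hypothetical helper)
# Creates ~/.perllib and ~/.protogene, then links every *.pm file found in
# the ProtoGene distribution directory DIST_DIR into ~/.perllib.
setup_protogene_home() (
    dist=$1
    mkdir -p "$HOME/.perllib" "$HOME/.protogene" || exit 1
    # Resolve DIST_DIR to an absolute path so that the links never dangle.
    dist_abs=$(cd "$dist" && pwd) || exit 1
    for pm in "$dist_abs"/*.pm; do
        [ -e "$pm" ] || continue    # no *.pm file matched the pattern
        ln -sf "$pm" "$HOME/.perllib/"
    done
)
```

For instance, run from your home directory: setup_protogene_home protogene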
Perl modules
Protogene requires several public modules to be installed. Use the CPAN procedure
outlined in the first section:
strict, warnings, diagnostics, Env, lib, File::Basename,
File::Copy, File::Which, Getopt::Long, Mail::Send
Wrapping it up
The complete list of binaries that should be in your BIN_DIR directory is given
in Appendix A.
The complete list of Perl modules required by the programs and CGI scripts is given
in Appendix B.
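Before moving on, it can be worth checking that the programs installed in this section are found on the PATH. A minimal sketch, in which the function name missing_programs is an assumption of this example:

```sh
# missing_programs NAME...   (hypothetical helper)
# Prints the name of every program that cannot be found on the PATH.
missing_programs() {
    for prog in "$@"; do
        command -v "$prog" >/dev/null 2>&1 || printf '%s\n' "$prog"
    done
}
```

For instance, with the programs named in this section: missing_programs t_coffee blastall formatdb webblast.pl sap joy hbond sstruc fugueali exonerate protogene. Note that the PATH of the account running Apache may differ from yours.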
HTML repository
HTML documents are files which are not executed by Apache, like documentation in
HTML format, images, programs, or result files. They should therefore be kept separate from the CGI
scripts. If you cannot separate them, we will show you some basic security practices to deal with
it. Nevertheless, we recommend asking an expert how to secure your servers.
You can group these repositories into the same directory to ease maintenance (say HTML_DIR).
Depending on your configuration, Apache may have created a directory named
/var/www/html/ (Red Hat, Mandrake, Fedora) or
/srv/www/htdocs/ (Suse) where you can copy your HTML_DIR. If no such
directory exists, simply create your HTML_DIR and follow the rest of the instructions.
We will assume the following structure, with HTML_DIR=/srv/www/htdocs/tcoffee_server/
/srv/www/htdocs/tcoffee_server/Doc
/srv/www/htdocs/tcoffee_server/Images
/srv/www/htdocs/tcoffee_server/Tmp
/srv/www/htdocs/tcoffee_server/tcoffee_cgi
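The directory tree listed above can be created in one go. This is a sketch only: the function name make_html_tree is made up for this example, and the document root is passed as an argument because it depends on your distribution.

```sh
# make_html_tree DOCUMENT_ROOT   (hypothetical helper)
# Creates the tcoffee_server sub-directories listed above under DOCUMENT_ROOT.
make_html_tree() {
    for sub in Doc Images Tmp tcoffee_cgi; do
        mkdir -p "$1/tcoffee_server/$sub" || return 1
    done
}
```

For instance, as root on a Suse system: make_html_tree /srv/www/htdocs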
Documentation repository
Contains documentation for sequence formats and a symbolic link for the server
configuration file: configuration_file.txt (cf. V 1):
ln -fs /full_path/CGI_DIR/configuration_file.txt HTML_DIR/Doc/
Image repository
Contains images and logos.
Temporary and result repository (say Tmp):
Contains temporary files and result files, which are stored for a pre-defined number
of days (cf. IV). It's fine to group these three directories (Doc, Images & Tmp), but
this directory can grow large and may require a lot of disk space. You should therefore
keep it on a larger scratch area of your disk and symbolically link it into
HTML_DIR.
T-Coffee cache
T-Coffee produces lots of temporary data that it stores in the .t_coffee
directory. You should move the .t_coffee/ directory into the Tmp directory. You should
also symbolically link it to its former place, so that the program runs smoothly from
the command line.
mv $HOME/.t_coffee/ HTML_DIR/Tmp/
ln -s HTML_DIR/Tmp/.t_coffee/ $HOME/
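The two commands above fail when they are run a second time, or when T-Coffee has not created its cache yet. The sketch below handles both cases; the function name relocate_cache is an assumption of this example, and both arguments must be absolute paths so that the link stays valid.

```sh
# relocate_cache HOME_DIR TMP_DIR   (hypothetical helper)
# Moves HOME_DIR/.t_coffee into TMP_DIR and leaves a symbolic link behind.
relocate_cache() (
    home=$1
    tmp=$2
    cache="$home/.t_coffee"
    # Nothing to do when the cache has already been replaced by a link.
    [ -L "$cache" ] && exit 0
    mkdir -p "$tmp" || exit 1
    if [ -d "$cache" ]; then
        mv "$cache" "$tmp/" || exit 1
    else
        # No cache yet: create an empty one at its new location.
        mkdir -p "$tmp/.t_coffee" || exit 1
    fi
    ln -s "$tmp/.t_coffee" "$cache"
)
```

For instance: relocate_cache "$HOME" /srv/www/htdocs/tcoffee_server/Tmp (the Tmp location is the one assumed in this manual).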
Web security:
Access Rights
Programs must NOT be located where they can be accessed and executed from the
web pages. This is the reason why the images and the programs must be separated.
Apache must be the only one able to access executables.
If the location of your images and documentation is not standard, you must tell Apache to forbid
execution anywhere except in the CGI directory. To do so, edit the file:
/etc/apache2/mod_userdir.conf
Within this file, find the directive <Directory /home/*/public_html> and set its
Options line as follows:
<Directory /home/*/public_html>
Options -ExecCGI -Indexes +SymLinksIfOwnerMatch
</Directory>
-ExecCGI forbids CGI and program execution in the HTML_DIR
directory and its sub-directories.
-Indexes disables the automatic generation of an index when Apache
cannot find an index file in the directory the web browser has reached. An automatic
index could help someone find information for breaking into your system.
We provide a basic index.html file to avoid this. We recommend copying this
file into all directories except the ones where you want another kind of index file, with a
re-direction for instance.
+SymLinksIfOwnerMatch makes Apache follow a symbolic link only if the link and its
target are owned by the same user. A + sign enables an option while a - sign
disables it; do not mix signed and unsigned options on the same Options line. See
www.cgsecurity.org/Articles/apache.html for a full article on Apache security.
Redirection
You can use a re-direction in an index file to send users directly to the homepage
of your servers. We provide such a file in the HTML_DIR. It sends requests to the
main CGI script.
Remember: put an index.html file into all directories and sub-directories on
your servers.
Robots
If you don't want Google, or other web crawlers, to index the result files on your
server, or if you want to hide your server entirely, you can use a robots.txt file. This file is
read by web crawlers before indexing and includes directives for them. It must be put
at the root of your server(s) (for you, it should for instance go into
/var/www/html/).
Here is an example, in which HTML_DIR and CGI_DIR stand for the corresponding URL paths (starting with /):
# Robots exclusion file
User-agent: * # * means for all web crawlers
Disallow: HTML_DIR/Doc/ # web crawlers should not index the Doc directory
Disallow: HTML_DIR/Images/
Disallow: CGI_DIR/tcoffee_cgi/
Disallow: HTML_DIR/Tmp/
It works fine for web crawlers that read the robots.txt file (most of them), but
this file also gives information on the sub-directory structure of your server. See
Google's file for instance (www.google.com/robots.txt), where the tree structure of
Google appears very clearly.
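A robots.txt file of this kind can also be generated from a list of URL paths. The sketch below is only an illustration: the function name write_robots is made up for this example, and the URL paths shown in the usage line are assumptions derived from the directory layout used in this manual.

```sh
# write_robots URL_PATH...   (hypothetical helper)
# Prints a robots.txt asking all web crawlers to skip each URL path given.
write_robots() {
    echo '# Robots exclusion file'
    echo 'User-agent: *'
    for url_path in "$@"; do
        printf 'Disallow: %s\n' "$url_path"
    done
}
```

For instance, as root: write_robots /tcoffee_server/Doc/ /tcoffee_server/Images/ /tcoffee_server/Tmp/ /tcoffee_cgi/ > /var/www/html/robots.txt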
CGI repository
CGI scripts are the backbone of your T-Coffee server. They must be in a directory
where they are executed rather than served as plain text (say CGI_DIR or
tcoffee_cgi). This hides your code and the variables embedded in your scripts.
A local Apache folder may already exist on your computer for that purpose
(/var/www/cgi-bin/ or /srv/www/cgi-bin/). Simply copy your
CGI_DIR directory into this folder.
Perl modules required:
The T-Coffee web server CGI requires several Perl modules to work. Make sure
they are all installed before you start:
strict, warnings, diagnostics, IP::Country::Fast, CGI, CGI::Carp,
Test::Pod::Coverage, Carp, File::stat, Time::Local,
Time::localtime, locale, Date::Calc, GD, GD::Graph::bars,
GD::Graph::pie, Mail::Send
WARNING! The GD Perl module can be very difficult to install. However, it
should be provided with your Linux or Unix distribution, or be available
through Google or RPM repositories such as http://rpmfind.net/.
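One way to check the list above is to ask Perl to load each module in turn, since "perl -MModule -e 1" exits with a non-zero status when the module cannot be found. The following is a sketch under that assumption. The helper names are hypothetical, and it expects a perl executable on your PATH.

```python
import subprocess

# Modules listed in this section for the CGI scripts.
CGI_MODULES = [
    "strict", "warnings", "diagnostics", "IP::Country::Fast", "CGI",
    "CGI::Carp", "Test::Pod::Coverage", "Carp", "File::stat", "Time::Local",
    "Time::localtime", "locale", "Date::Calc", "GD", "GD::Graph::bars",
    "GD::Graph::pie", "Mail::Send",
]


def perl_can_load(module, perl="perl"):
    """Return True if `perl -MModule -e 1` exits with status 0."""
    result = subprocess.run(
        [perl, "-M" + module, "-e", "1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def missing_modules(modules, can_load=perl_can_load):
    """Return the modules that the can_load check rejects, in input order."""
    return [module for module in modules if not can_load(module)]
```

Calling missing_modules(CGI_MODULES) on the server returns the modules that still have to be installed. The can_load argument exists only so that the check can be replaced in tests.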
The CGI_DIR
Your CGI_DIR directory should contain these files:
configuration_file.txt #The configuration file for the servers
crashlog.xml #log files
userlog.xml
evol.cgi #Statistics and survey scripts
date.cgi
ip.cgi
ip.pl
env_param.pm #Environmental variable configuration
feedback.cgi #Script to manage result waiting
index.cgi #The core of the server
index.html #To avoid an automatic index file and
#redirect calls onto index.cgi
Web security:
Only CGI scripts may be executed from the CGI_DIR directory. No file may be
served as plain text, and access to non-CGI files must be forbidden. The
following Apache directives achieve this.
You must add two directives to the Apache configuration file. The first one
(ScriptAlias) tells Apache that URLs whose path begins with /tcoffee_cgi/ (in
our case) are mapped to CGI scripts located in CGI_DIR. The second one
(<Directory>) sets the permissions for that directory:
ScriptAlias /tcoffee_cgi/ "full_path_to_CGI_DIR/"
<Directory "full_path_to_CGI_DIR">
    AllowOverride None
    # Allow CGI execution
    Options +ExecCGI -Includes
    # CGI scripts must have a .cgi extension
    AddHandler cgi-script .cgi
    Order allow,deny
    Allow from all
    <FilesMatch "\.(txt|doc|pl|pm|xml)$">
        Order allow,deny
        Deny from all
    </FilesMatch>
</Directory>
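To see which files of the CGI_DIR the FilesMatch block denies, you can try its regular expression on the file names listed earlier. This is only an illustration: it uses Python's re module, which treats this simple pattern the same way as Apache's regular expression engine, and the function name is_denied is hypothetical.

```python
import re

# Same pattern as in the <FilesMatch> directive.
DENIED = re.compile(r"\.(txt|doc|pl|pm|xml)$")


def is_denied(filename):
    """Return True if the file name ends with one of the protected extensions."""
    return DENIED.search(filename) is not None
```

With this pattern, configuration_file.txt, env_param.pm, ip.pl and the XML logs are denied, while index.cgi and index.html are not.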
You must restart your Apache server every time you change the Apache
configuration file!
Environmental Variables
Customizing your web server requires declaring many paths and values that describe your
system to the web server. We have tried to group these variables in one convenient file:
env_param.pm
Defining your block
You should look at the block called "dummy server" and customize it to your
needs. The first thing to do is to decide on a location name that will match the
variable $place.
Logos and colors
Taste and colors are a personal matter… This section defines the look and feel of
your server. Look for logos that have the same size as the existing ones and place
them in the HTML_DIR/Images directory. Choose a color scheme that suits you. Take
your time, we are still discussing ours :-) You will find the colors near the end
of the block, in the look section:
$colorTitle='#7f31e0';
$colorAppli='#b47df8';
$Rgb=180;
$rGb=125;
$rgB=248;
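In the sample values above, $Rgb, $rGb and $rgB (180, 125, 248) match the red, green and blue components of $colorAppli (#b47df8), so the two presumably have to be kept in sync when you change the color scheme. The helper below is a sketch for computing the three numbers from a hexadecimal color. The function name hex_to_rgb is hypothetical and not part of env_param.pm.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' color into a (red, green, blue) tuple of integers."""
    value = color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a color of the form #rrggbb")
    # Each pair of hexadecimal digits is one component in the range 0..255.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
```

For example, hex_to_rgb('#b47df8') gives (180, 125, 248), the values used in the listing.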
Logo images have to be declared in the $logo variable. Change only the image
names and alt values if your logos look like the ones from www.tcoffee.org/
($place =~ /IGS/) or tcoffee.vital-it.ch/ ($place =~ /Vital-IT/i).
Paths for the different repositories:
The Directories section holds the names of the different repositories that we
defined in the previous section:
$dir_tcoffee = 'HTML_DIR name';
$dir_cgi = 'CGI_DIR name';
$dir_tmp = 'Temporary directory name';
$dir_images = 'Images directory name';
$dir_doc='Doc directory name';
$dir_exe='Path for programs';
$programme=$dir_exe.'/t_coffee'; # Default program if none specified
The Server directories section holds the paths where these directories can be
found:
$home_web='Path for CGI repository'; # (/var/www/cgi-bin/)
$home_html='Path for HTML repositories'; # (/var/www/html)
$home_dir = "HTML_DIR";
$home_cgi = "CGI_DIR";
$scratch_area="HTML_DIR/temporary_dir";
The Web directories section holds the URLs used to reach the servers:
$web_base="http://somewhere/$dir_tcoffee";
$web_cgi_base = "$web_base/$dir_cgi";
$web_base_images="$web_base/$dir_images";
$web_base_doc="$web_base/$dir_doc";
$web_base_link_tmp="$web_base/$dir_tmp";
E-mail and cache:
The last section to describe, which is the first one in the block, is about configuration:
$config_infile = 'configuration_file.txt file name';
$pg_source = 'index.cgi file name of the CGI skeleton';
$tmpOldFiles=9; # Keep only files less than $tmpOldFiles days old
$dir_pdb='local PDB RCSB mirror'; # Leave it empty if none
$webmaster='webmaster e-mail'; # for help messages
$fromEMail='results are sent from this e-mail address';
$ppath="\$PATH";
Use only one \ if programs are run on the same computer as the server, or
three \ ( \\\ ) if programs are run on a computer different from the server
computer.
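The $tmpOldFiles setting expresses an age limit in days. How the server itself purges its temporary directory is not shown here, so the following is only a sketch of the rule, useful for previewing which files a given limit would affect. The function name stale_files is hypothetical.

```python
import os
import time


def stale_files(directory, max_age_days, now=None):
    """Return the files in directory last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for entry in os.scandir(directory):
        # Only regular files directly inside the directory are considered.
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)
```

Calling stale_files on your temporary directory with a limit of 9 lists the files that are older than the sample value of $tmpOldFiles.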
Whenever something is unclear, look at the corresponding section in the IGS or the
Vital-IT block. All these variables should allow you to configure your own server.
Nevertheless, some settings are still defined only in the index.cgi file;
process management is one example. As a guide, the IGS configuration is for a
single computer, and the Vital-IT configuration is for a front-end computer that
sends program execution to a cluster.
To handle the special extensions of t_coffee result files, such as score_pdf or
score_html, you must add these extensions to the /etc/mime.types file so that
score_html files open automatically in your browser and score_pdf files in a PDF
viewer:
application/pdf pdf score_pdf
text/html html htm score_html
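The two lines above map file extensions to content types. As an illustration of that mapping only (it does not reproduce how Apache reads /etc/mime.types), the same associations can be expressed with Python's standard mimetypes module. The function name build_types is hypothetical.

```python
import mimetypes


def build_types():
    """Return a lookup table with the two T-Coffee result extensions added."""
    types = mimetypes.MimeTypes()
    types.add_type("application/pdf", ".score_pdf")
    types.add_type("text/html", ".score_html")
    return types
```

With this table, build_types().guess_type("result.score_html") reports text/html as the content type, which is what lets a browser display the file directly.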
4) Mirrors:
The display of mirrors is managed by the goodFlag function of the env_param.pm
file. It is another example where variables are defined both in the index.cgi
file and in the env_param.pm file. This should be improved in the near future.
See the index.cgi file to see how mirrors are printed.
To add mirrors, see the next chapter.
WARNING! Finally, change the location name in the CGI scripts to match your
variable configuration:
my $location='the_place_you_chose';
in the index.cgi, feedback.cgi, evol.cgi, date.cgi and ip.cgi scripts.
Server configuration
1) Example from the configuration_file.txt file:
The configuration_file.txt file defines the server applications, their
names, options, forms, ... It is split into server sections:
server::TCOFFEE::Regular::Computation of a Multiple Sequence Alignment::12824354,10964570
Lines beginning with server:: define the beginning of a server declaration.
The server flag is followed by the server name, here TCOFFEE. The server
name is followed by the form mode, Regular or Advanced. The form mode
is followed by a definition of the server function. It may be followed by
PubMed ID(s). Double ':' is the field separator, and lines beginning with '#'
are comments.
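The field layout just described can be summarized in a few lines of code. The following is a sketch of a parser for a server:: line based only on the description above, not the code used by index.cgi. The function name and the dictionary keys are hypothetical.

```python
def parse_server_line(line):
    """Split a 'server::' declaration into its fields, using '::' as separator."""
    fields = line.strip().split("::")
    if fields[0] != "server" or len(fields) < 4:
        raise ValueError("not a server declaration: %r" % line)
    # The fifth field (PubMed IDs) is optional and comma-separated.
    pubmed_ids = fields[4].split(",") if len(fields) > 4 and fields[4] else []
    return {
        "name": fields[1],
        "mode": fields[2],
        "description": fields[3],
        "pubmed_ids": pubmed_ids,
    }
```

Applied to the example line, this gives the name TCOFFEE, the mode Regular, the description "Computation of a Multiple Sequence Alignment" and the two PubMed IDs.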
Around these server declarations you may find some special flags: mirrors,
which defines the mirrors available for t_coffee servers, and section, which
defines program groups.
The configuration_file.txt file is read line by line, from beginning to end.
Thus flags, and what they define, are printed in the same order on web pages.
Within a server section, the order of the forms is kept, as it is for the whole
server list on the front page.
Within a server section, there are sub-sections that define form display,
options and so on:
config::email::mandatory    E-mail address must be provided by users
paragraph_in                Print these lines into the form
parameter::-in              Input sequences
parameter::-output          Output formats
parameter::xxx              Option xxx and its arguments. These lines are
                            passed to the program directly, so you can define
                            here every option the program accepts on the
                            command line.
sendto                      Create a button to send results to another server
paragraph_out               Print these lines into the result page
2) Build your own server:
Now you can build your own server form:
######################################################################
server::TCOFFEE::Regular::Computation of a Multiple Sequence Alignment::12824354,10964570
######################################################################
# The e-mail request is commented out, so the e-mail address is optional
#config::email::mandatory
# Print the server function into the form as a description
paragraph_in::Description:: Computes a <b>multiple sequence alignment</b>
# Print the sequence input section
paragraph_in::Sequence Input
# Print information about available actions
paragraph_in::Action::<b>Paste or upload</b> your set of sequences
# Define the upload box
parameter::-in::Upload File::upload::empty::empty::doc_txt:FASTA Sequences
# Define available methods, and the ones used, as option (hidden for users)
parameter::-in::Method::hidden::Mlalign_id_pair,Mclustalw_pair,Mslow_pair::Mlalign_id_pair::empty
# Define available output formats, and the ones used (hidden)
parameter::-output::Alignment Format::hidden::msf::msf::doc_txt:Output Format
# Define a hidden option, about max number of sequences
parameter::-maxnseq::Max number of sequences::hidden::50,100::50::empty
# Define a hidden option, about max length of sequences
parameter::-maxlen::Max length::hidden::1000,2000::2000::empty
# Create a 'send result' button to a server, with this file format, into the result page
sendto::ProtoGene@fasta_aln::MyHits@fasta_aln
######################################################################
Look at the different applications in the configuration file to learn how to
configure your own application.
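Judging from the example above, each parameter:: line carries six fields after the flag. The sketch below splits such a line. The field names (option, label, widget, choices, default, doc) are assumptions inferred from the example, as is the reading of the word "empty" as "no value". This is not the parser used by index.cgi.

```python
def parse_parameter_line(line):
    """Split a 'parameter::' line into named fields (names are assumptions)."""
    fields = line.strip().split("::")
    if len(fields) != 7 or fields[0] != "parameter":
        raise ValueError("not a parameter line: %r" % line)
    _flag, option, label, widget, choices, default, doc = fields
    return {
        "option": option,    # command-line option passed to the program
        "label": label,      # text shown in the form
        "widget": widget,    # for instance 'upload' or 'hidden'
        "choices": [] if choices == "empty" else choices.split(","),
        "default": None if default == "empty" else default,
        "doc": None if doc == "empty" else doc,
    }
```

For the -maxnseq line of the example, this yields the choices 50 and 100 with 50 as the default, in a hidden form field.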
APPENDIX
A:
Programs which should be in your BIN_DIR directory (plus an index.html
file if the directory is reachable by Apache):
blastall (+ formatdb + fastacmd)
clustalw
dialign-t
exonerate
fugueali
hbond
joy
mafft
melody
muscle
pcma
poa
probcons
ProtoGene.pl
( runblast_pdb.pl )
runblast.pl
sap
sstruc
t_coffee
webblast3d.pl
webblast.pl
B:
Perl modules required by programs and CGI scripts:
strict
warnings
diagnostics
libs
Env
Carp
CGI
CGI::Carp
HTTP::GHTTP (optional, for former versions of runblast; needs gnome-config, from the gnome-libs-devel rpm)
LWP::UserAgent
HTML::Parser
HTTP::Request::Common
URI::Escape
GD
GD::Graph::pie
GD::Graph::bars
GD::Image
GD::locale
Time::localtime
Time::Local
Time::Format (optionally, used in former versions of ProtoGene)
Date::Calc
File::stat
File::Basename
File::Which
File::Copy
Getopt::Long
Test::Pod::Coverage
Pod::Text
Mail::Send
IP::Country::Fast
Term::ANSIColor (optional, should be used in the near future)