					NASC
Microarrays

UKCROP.NET

Garnet.arabidopsis.org.uk

Seeds, DNA, mutants and natural variants

Bioinformatics

Nottingham Arabidopsis Stock Centre

Funded by the

bbsrc

Transcriptomics database
Other databases and applications

DB2

AGR

Germplasm database
MySQL

MySQL

Other databases and applications

Bioinformatics

large files / difficult mining
Arrays
2002 = 500 expts
2003 = 1000+ expts
Each = 22K objects

NASC

connect
Seeds AGR
~ 25,000 genes Variable sub-objects (e.g. clones, sequence)

1991 = 200 objects
1999 = 20,000 objects
Today ~ 450,000 objects

Difficult visualisation

Complex objects / difficult curation

Germplasm database
Mutants, ecotypes, other germplasm Order seed

Image
Description of phenotype
Blast for similar sequences

DNA sequence

AGR

Other ‘similar’ mutants / genes

Protein sequence

Microarray database

Expression profiles

Spot histories Scatter plots

Germplasm
• NASC is based at the University of Nottingham and provides seed and information resources primarily to Europe.
• The Americas are served by ABRC, and both centres together serve the rest of the world.

Stocks:
• 1991: 200 lines
• 2002: 200,000 lines
• Now: ~450,000 lines represented

Distribution:
• ~25,000 tubes of seed a year, mainly to European countries.
• Approximately 20% of our users get seeds for free; the rest pay an average price of about 2 euros per line.

• All of these seeds can be ordered through our online catalogue at http://arabidopsis.info.

• Almost 3000 users throughout Europe (and the rest of the world)

Distribution of NASC Users in Europe
[Bar chart; y-axis 0 to 900. Countries along the x-axis: UK, Germany, France, Netherlands, Spain, Switzerland, Sweden, Belgium, Denmark, Italy, Israel, Poland, Finland, Norway, Austria, Czech Republic, Portugal, Hungary, Russia, Ireland, Greece, Turkey, Estonia, Luxembourg, Ukraine, Romania, Slovakia]

Functional genomics seed tools: donors
Holland, Singapore, Germany, Multinational, France, UK, US

International statistics for openly distributed genetic resources.
Seed tools: recipients (May 2002-Apr 2003)
Brazil, N. Korea, S. Korea, Czech Republic, Israel, Denmark, Japan, Spain, Hong Kong, South Africa, New Zealand, Singapore, Poland, USA, The Netherlands, France, Canada, Hungary, Taiwan, Norway, Australia, Austria, Italy, Sweden, Germany, Malaysia, India, Argentina, Portugal, Finland, China, Belgium, Switzerland, UK

What do the Seed Donors get ?
• Security – we preserve the seeds under the best possible conditions for longevity and germination frequency.
• Stability – funded from 1991-2007 (and counting). Any seeds donated can be recovered if local stocks get compromised or lost (or just lose their germination state) many years after they have been donated.
• Publication – many journals now require seeds to be donated to the stock centres before publication. We give donors a NASC code on receipt of the seeds.
• Less work – we distribute on the donor’s behalf, and their name is associated with the donated stock forever.
• Publicity – we associate stocks with documents from the donor’s lab, e.g. instructions on use, observations etc., even logos if that is what the donor wants.

http://ukcrop.net/


UKCrop.net
• A consortium of plant genome databases.
• Established in 1996.
• Develops, manages, and distributes information relating to comparative mapping and genome research in crop plants.
• The 6 projects comprising UK CropNet are located at:
– Institute of Grassland & Environmental Research (IGER) in Aberystwyth
– John Innes Centre (JIC) in Norwich
– Nottingham Arabidopsis Stock Centre (NASC) at the University of Nottingham
– Scottish Crop Research Institute (SCRI) in Dundee

5 species-specialised UK databases:
• Arabidopsis
• Barley
• Brassica
• Forage grasses
• Millet

2 broad species UK databases:
• ComapDB
• CropseqDB

~25 external world databases mirrored

http://ukcrop.net/agr


AGR
The Arabidopsis Genome Resource, based at NASC:
• built as a part of the UKCrop.net project.
• holds sequence and mapping data for Arabidopsis.
• promotes comparative analysis between Arabidopsis and crop plants.
• integrated with the Arabidopsis stock catalogue.

http://ukcrop.net/agr


OR USE SEQUENCE / BLAST

http://arabidopsis.org.uk/insertwatch


Insert Watch / Blast
InsertBlast (http://arabidopsis.org.uk/blast.html)
• Since 2000: allows searching for ‘insert sequences’ that match a gene of interest to the user.
• Linked to the NASC stock catalogue: users can order the corresponding insert-containing stock.

InsertWatch (http://arabidopsis.org.uk/insertwatch)
• Since 2000: automated searching of newly sequenced insert populations. Matches are e-mailed.
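The InsertWatch idea (a standing query that is re-run against each new insert population, with matches e-mailed) can be illustrated with a small Python sketch. This is not the InsertWatch implementation: the sequence-matching step is assumed to have already produced a gene identifier per insert line, and the function name, e-mail addresses, line names and watch lists are all hypothetical.

```python
from collections import defaultdict


def match_watchlist(watchlist, new_inserts):
    """Group newly reported insert lines by the users watching the hit gene.

    watchlist:   {email: [gene_id, ...]}           (hypothetical structure)
    new_inserts: [(insert_line_id, gene_id), ...]  (assumed pre-computed hits)
    """
    # Invert the watch list: gene -> set of users interested in it.
    watchers = defaultdict(set)
    for email, genes in watchlist.items():
        for gene in genes:
            watchers[gene.upper()].add(email)

    # Collect, per user, the new insert lines that hit a watched gene.
    notifications = defaultdict(list)
    for insert_id, gene in new_inserts:
        for email in sorted(watchers.get(gene.upper(), ())):
            notifications[email].append((insert_id, gene.upper()))
    return dict(notifications)


# Made-up example data, for illustration only.
watchlist = {
    "user1@example.org": ["AT1G01010"],
    "user2@example.org": ["AT1G01010", "AT2G01008"],
}
new_inserts = [("LINE_A", "AT1G01010"), ("LINE_B", "AT4G00005")]

for email, hits in match_watchlist(watchlist, new_inserts).items():
    print(email, hits)
```

In a real service the returned dictionary would feed an e-mail step; that part is left out here because the slides do not describe it.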

Affymetrix service

Database

Specific User

Web application

Request & Receive RNA
[Plot: fluorescence against time (seconds), with 18S and 28S peaks marked]

committee

approval

NASC service

DATA

General Users

Data tools

Browsing, downloading, automated delivery by CD (AffyWatch), searching, scatterplots, spot histories, mass downloads, gene swinger, etc.

Caution !

Flower
Clustering

Array data are CLUES !!
Superman

Arabidopsis Zinc finger protein

Scatter plot

Yoda Stomata
Arabidopsis MAP kinase

Genomics / annotation

Transcriptomics Proteomics metabolomics

ArrayExpress


As part of the Affy service, your data are submitted to ArrayExpress (required for publication) under your name.

Bioinformatics

Multiple small and medium size data donors (e.g. CSHL, Pereira, Bancroft, Sundaresan)

NASC

EMBL
Genomics (AGR)
Insert data insert germplasm (JIC)

Germplasm & DNA data TAIR

NASCproteomicsDB

MIPS
ArrayExpress
NASCarrays
Local generation:
• Spotted arrays
• Affymetrix gene chips

ABRC
CRIME TranscriptomicsDB

Proteomics (Cambridge)

Metabolomics (Long Ashton)

PLANET
• EU consortium to generate a federated set of databases for plant genomics (Arabidopsis as first focus)
• Started 2002
• Goal is federation on a distributed model (including a GRID or GRID-like approach as it becomes possible)
• 2003: the selected model became web services / BioMOBY
• Please pick up the leaflet and look at the posters!

Future plans


BioMoBY
• Explores ways to represent, distribute and discover biological data.
• Data providers agree on a standard defining how their data and tools are represented.
• A 'registry' tracks which data sources implement which services.
• Scripts consult the registry to determine which data sources to query for which data type or operation.
• We intend to make our data fully BioMOBY compliant.
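The registry pattern described in these bullets can be sketched in a few lines of Python. This is a toy, in-memory stand-in and not the MOBY Central API: the class, the service names, the provider hosts and the data-type labels are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """One registered service; every field here is an illustrative assumption."""
    name: str
    provider: str      # who runs the service
    input_type: str    # kind of object the service accepts
    output_type: str   # kind of object the service returns


class ToyRegistry:
    """In-memory stand-in for a service registry (not the MOBY Central API)."""

    def __init__(self):
        self._by_input = defaultdict(list)

    def register(self, service):
        """Providers announce what they accept and what they return."""
        self._by_input[service.input_type].append(service)

    def find(self, input_type, output_type=None):
        """Return services accepting input_type, optionally filtered by output."""
        matches = self._by_input.get(input_type, [])
        if output_type is None:
            return list(matches)
        return [s for s in matches if s.output_type == output_type]


# Hypothetical providers and services.
registry = ToyRegistry()
registry.register(Service("get_germplasm", "provider-a.example", "AGI_Locus", "Germplasm"))
registry.register(Service("get_expression", "provider-b.example", "AGI_Locus", "Expression"))
registry.register(Service("get_citation", "provider-c.example", "EMBL_ID", "Citation"))

# A client script asks: "what can I do with an AGI locus?"
for service in registry.find("AGI_Locus"):
    print(service.provider, service.name)
```

The point of the design is that the client never hard-codes a provider: it asks the registry by data type and discovers providers at run time.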

MOBY-Services:
● Accepts MOBY Object(s) as queries via SOAP
● Returns MOBY Object(s) in response via SOAP
● Service provider registers with MOBY Central
– "Port"
– Service Type (from service hierarchy)
– URL of service
– Human-readable description

Mock Response Sequence XML for Apetala3
<MOBY Authority="ncbi.nlm.nih.gov" log="Query/ID" id="1334543">
  <Sequence namespace="GenBank/Acc" id="D21125.1">
    <CrossReference>
      <Object namespace="PubMed/ID" id="7948893"/>
      <Object namespace="SwissProt/ID" id="BAA65.1"/>
      <Object namespace="TAIR/Locus" id="AP3"/>
      <Object namespace="GO/Acc" id="GO:0001835"/>
      <Object namespace="EMBL/ID" id="AF056541"/>
    </CrossReference>
    <Length>876</Length>
    <SequenceString>gatcaatcca tgttagtttc taactgtggc caacttagtt ….</SequenceString>
  </Sequence>
</MOBY>
Slide callouts: Envelope, Instance, X-Refs, Payload
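A client receiving a response like the mock above would typically pull out the cross-references and follow them to other services. The Python sketch below parses a well-formed copy of the mock response with the standard library. It assumes the attribute values are quoted as XML requires (the slide omits some quotes), and it is not a BioMOBY client; the sequence string is the truncated fragment shown on the slide, not the full 876-base sequence.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the mock response from the slide (quoting normalised).
MOCK_RESPONSE = """
<MOBY Authority="ncbi.nlm.nih.gov" log="Query/ID" id="1334543">
  <Sequence namespace="GenBank/Acc" id="D21125.1">
    <CrossReference>
      <Object namespace="PubMed/ID" id="7948893"/>
      <Object namespace="SwissProt/ID" id="BAA65.1"/>
      <Object namespace="TAIR/Locus" id="AP3"/>
      <Object namespace="GO/Acc" id="GO:0001835"/>
      <Object namespace="EMBL/ID" id="AF056541"/>
    </CrossReference>
    <Length>876</Length>
    <SequenceString>gatcaatcca tgttagtttc taactgtggc caacttagtt ….</SequenceString>
  </Sequence>
</MOBY>
"""

root = ET.fromstring(MOCK_RESPONSE.strip())
sequence = root.find("Sequence")

# Map each cross-reference namespace to its identifier.
xrefs = {
    obj.get("namespace"): obj.get("id")
    for obj in sequence.find("CrossReference").findall("Object")
}

print("Primary record:", sequence.get("namespace"), sequence.get("id"))
print("Declared length:", sequence.findtext("Length"))
for namespace, identifier in xrefs.items():
    print("x-ref:", namespace, identifier)
```

Each namespace/identifier pair is a possible input to another service, which is what makes the cross-references useful for discovering further data.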

Where does new info come from?
“SERVICE AMPLIFICATION”
"I can find functionally similar genes From GO_Ids"
X-REFS: "I can get Medline Expression data SwissProt From EMBL ID" PubMed EMBL Citation GO_Annotation Sequence attccg ggtcac

"I can cluster affy data in real time"

AA Locus What can I do with DNA Medline, EMBL, GO GO_Annot …??? Homologues Expression

MOBY Central

Future plans


NASC Data made available to PLANet
• Phenomics data will be made available to PLANet
• XML file for each seed generated
• Transcriptomics data generated in-house will be made available to PLANet
• Any other external database can also use the services as they become available for internal or public use.
• For example, would you like to enhance YOUR database by adding germplasm data pulled by AGI code from our database, without going through our pages?
• i.e. Data YOUR WAY, when YOU want it, in YOUR format and layout.
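To make "germplasm data pulled by AGI code" concrete, here is a minimal Python sketch of what such a lookup could look like from the consuming database's side. It is not a NASC service or API: the function name, the record layout and the stock identifier are hypothetical placeholders, and the data are held in a local dictionary rather than fetched over the network. Only the AGI locus code format (for example AT1G01010) is taken as given.

```python
import json
import re

# AGI locus codes: "AT", a chromosome (1-5, C or M), "G", then five digits.
AGI_PATTERN = re.compile(r"^AT[1-5CM]G\d{5}$")

# Made-up placeholder records; real data would come from the provider's service.
PLACEHOLDER_STOCKS = {
    "AT1G01010": [{"stock": "EXAMPLE-0001", "type": "insertion line"}],
}


def germplasm_for(agi_code, records=PLACEHOLDER_STOCKS):
    """Return germplasm records for one AGI code (hypothetical helper)."""
    code = agi_code.strip().upper()
    if not AGI_PATTERN.match(code):
        raise ValueError(f"not an AGI locus code: {agi_code!r}")
    # An unknown but well-formed code simply has no stocks.
    return {"agi": code, "stocks": records.get(code, [])}


# The caller decides the output format; JSON is used here as one option.
print(json.dumps(germplasm_for("at1g01010"), indent=2))
```

Returning plain data keyed by AGI code, rather than a rendered page, is what lets the calling database choose its own format and layout.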

Near future - Ensembl


Arabidopsis

Internal NASC ENSEMBL development plan 2003/4 launch

Nottingham Arabidopsis Stock Centre

NASC
Taking from the data and seed rich and giving to the data and seed poor

Bioinformatics / Curation team
David Craigon – Affy DB
Graeme Gill – Germplasm curation
Janet Higgins – Affy / CATMA QC curation
Emma Humphreys – Germplasm curation
Nick James – Germplasm DB
John Okyere – Affy analysis
Beatrice Shildknecht – Planet/BioMoby
Guo-an Sun – AGR DB
Seeds Bioinformatics Molecular Biology

Bold = here in York


The future ?

It’s already here