POSTED ON: 1/8/2010
Nottingham Arabidopsis Stock Centre (NASC)
Funded by the BBSRC
• Microarrays
• UKCROP.NET
• Garnet.arabidopsis.org.uk
• Seeds, DNA, mutants and natural variants
• Bioinformatics

Databases (diagram labels)
• NASC Transcriptomics database
• AGR
• Germplasm database
• Other databases and applications
• Database engines labelled on the diagram: DB2, MySQL (twice)

Bioinformatics
• Arrays: 2002 = 500 expts; 2003 = 1000+ expts; each = 22K objects. Large files / difficult mining.
• Seeds: 1991 = 200 objects; 1999 = 20,000 objects; today ~ 450,000 objects. Complex objects / difficult curation.
• AGR: ~ 25,000 genes; variable sub-objects (e.g. clones, sequence). Difficult visualisation.
• NASC connect

NASC Germplasm database (diagram labels)
• Mutants, ecotypes, other germplasm
• Order seed
• Image
• Description of phenotype
• Blast for similar sequences
• DNA sequence
• AGR
• Other ‘similar’ mutants / genes
• Protein sequence
• Microarray database: expression profiles, spot histories, scatter plots

NASC Germplasm
• NASC is based at the University of Nottingham and provides seed and information resources primarily to Europe.
• The Americas are served by ABRC and both centres serve the rest of the world together.
Stocks:
• 1991: 200 lines
• 2002: 200,000 lines
• Now ~450,000 lines represented
Distribution:
• 25,000 tubes of seed a year to mainly European countries.
• Approximately 20% of our users get seeds for free, the rest enjoy an average price of about 2 euros per line
• All of these seeds can be ordered through our online catalogue at http://arabidopsis.info.

Distribution of NASC Users in Europe
[Figure: bar chart, axis scale 0 to 900. Countries shown: UK, Germany, France, Netherlands, Spain, Switzerland, Sweden, Belgium, Denmark, Italy, Israel, Poland, Finland, Norway, Austria, Czech Republic, Portugal, Hungary, Russia, Ireland, Greece, Turkey, Estonia, Luxembourg, Ukraine, Romania, Slovakia]
• Almost 3000 users throughout Europe (and the rest of the world)

Functional genomics seed tools
International statistics for openly distributed genetic resources.
[Figure: two charts]
• Donors: Holland, Singapore, Germany, Multinational, France, UK, US
• Recipients (May 2002-Apr 2003): Brazil, N. Korea, S. Korea, Czech Republic, Israel, Denmark, Japan, Spain, Hong Kong, South Africa, New Zealand, Singapore, Poland, USA, The Netherlands, France, Canada, Hungary, Taiwan, Norway, Australia, Austria, Italy, Sweden, Germany, Malaysia, India, Argentina, Portugal, Finland, China, Belgium, Switzerland, UK

What do the Seed Donors get ?
• Security – we preserve the seeds under the best possible conditions for longevity and germination frequency.
• Stability – Funded from 1991-2007 (and counting). Any seeds donated can be recovered if local stocks get compromised or lost (or just lose their germination state) many years after they have been donated.
• Publication – many journals now require seeds to be donated to the stock centres before publication. We give donors a NASC code on receipt of the seeds.
• Less work – we distribute on the donor’s behalf, and their name is associated with the donated stock forever.
• Publicity – we associate stocks with documents from your lab, e.g. instructions on use, observations etc, even logos if that is what the donor wants.

UKCrop.net
http://ukcrop.net/
• A consortium of plant genome databases.
• Established in 1996.
• Develop, manage, and distribute information relating to comparative mapping and genome research in crop plants.
• The 6 projects comprising UK CropNet are located at:
  – Institute of Grassland & Environmental Research (IGER) in Aberystwyth
  – John Innes Centre (JIC) in Norwich
  – Nottingham Arabidopsis Stock Centre (NASC) at the University of Nottingham
  – Scottish Crop Research Institute (SCRI) in Dundee

5 species-specialised UK databases:
• Arabidopsis
• Barley
• Brassica
• Forage grasses
• Millet
2 broad species UK databases:
• ComapDB
• CropseqDB
~25 external world databases mirrored

AGR
http://ukcrop.net/agr
The Arabidopsis Genome Resource, based at NASC.
• built as a part of the UKCrop.net project.
• holds sequence and mapping data for Arabidopsis.
• promotes comparative analysis between Arabidopsis and crop plants.
• integrated with the Arabidopsis stock catalogue.

OR USE SEQUENCE / BLAST
http://ukcrop.net/agr

Insert Watch / Blast
http://arabidopsis.org.uk/insertwatch
InsertBlast (http://arabidopsis.org.uk/blast.html)
• Since 2000: allows searching of ‘insert sequences’ matching a gene of users’ interest.
• Linked to NASC stock catalogue - users can order the corresponding insert-containing stock.
InsertWatch (http://arabidopsis.org.uk/insertwatch)
• Since 2000: automated searching of new sequenced insert populations. Matches are e-mailed.

Affymetrix service
[Diagram labels: Database, Specific User, Web application, Request & Receive RNA, committee approval, NASC service, DATA, General Users]
[Figure: RNA trace, Fluorescence (0 to 55) against Time (seconds) (19 to 69), with 18S and 28S peaks labelled]
Data tools: browsing, downloading, automated delivery by CD (AffyWatch), searching, scatterplots, spot histories, mass downloads, gene swinger, etc.

Caution ! Array data are CLUES !!
[Diagram labels: Flower, Clustering, Scatter plot, Stomata]
• Superman: Arabidopsis Zinc finger protein
• Yoda: Arabidopsis MAP kinase
• Genomics / annotation
• Transcriptomics
• Proteomics
• Metabolomics

ArrayExpress
As part of the Affy service: Your data submitted to ArrayExpress (required for publication) under your name.

Bioinformatics
[Data-flow diagram labels]
• Multiple small and medium size data donors (e.g. CSHL, Pereira, Bancroft, Sundaresan)
• EMBL
• Genomics (AGR)
• Insert data
• insert germplasm (JIC)
• Germplasm & DNA data
• TAIR
• NASCproteomicsDB
• MIPS
• ArrayExpress
• NASCarrays
• Local generation: spotted arrays, Affymetrix gene chips
• ABRC
• CRIME
• TranscriptomicsDB
• Proteomics (Cambridge)
• Metabolomics (Long Ashton)

PLANET
• EU consortium to generate a federated set of databases for plant genomics (Arabidopsis as first focus)
• Started 2002
• Goal is federation on a distributed model (inc. GRID or GRID-like approach as it becomes possible)
• 2003 – the selected model became web-services / BioMOBY
• Pick up the leaflet and look at the posters please !!!!

Future plans
BioMoBY
• Explores ways to represent, distribute and discover biological data.
• Data providers agree on a standard defining how their data and tools are represented.
• A 'registry' tracks which data sources implement which services.
• Scripts consult registry to determine which data sources to query for which data type or operation.
• We intend to make our data fully BioMOBY compliant.

MOBY-Services:
• Accepts MOBY Object(s) as queries via SOAP
• Returns MOBY Object(s) in response via SOAP
• Service provider registers with MOBY Central
  – "Port"
  – Service Type (from service hierarchy)
  – URL of service
  – Human-readable description

Mock Response
Sequence XML for Apetala3
<MOBY Authority="ncbi.nlm.nih.gov" log="Query/ID" id="1334543">
  <Sequence namespace="GenBank/Acc" id="D21125.1">
    <CrossReference>
      <Object namespace="PubMed/ID" id="7948893"/>
      <Object namespace="SwissProt/ID" id="BAA65.1"/>
      <Object namespace="TAIR/Locus" id="AP3"/>
      <Object namespace="GO/Acc" id="GO:0001835"/>
      <Object namespace="EMBL/ID" id="AF056541"/>
    </CrossReference>
    <Length>876</Length>
    <SequenceString>gatcaatcca tgttagtttc taactgtggc caacttagtt ….</SequenceString>
  </Sequence>
</MOBY>
Callouts on the slide: Envelope, Instance, X-Refs, Payload

Where does new info come from?
“SERVICE AMPLIFICATION”
[Diagram: an object and its cross-references, with services registered at MOBY Central]
Speech bubbles on the diagram:
• "I can find functionally similar genes From GO_Ids"
• "I can get Expression data From EMBL ID"
• "I can cluster affy data in real time"
• "What can I do with …???"
X-REFS: Medline, SwissProt, PubMed, EMBL
Other labels: Citation, GO_Annotation, Sequence (attccg ggtcac), AA, Locus, DNA, Medline, EMBL, GO, GO_Annot, Homologues, Expression, MOBY Central

Future plans
NASC Data made available to PLANet
• Phenomics data will be made available to PLANet
• XML file for each seed generated
• Transcriptomics data generated in-house will be made available to PLANet
• Any other external database can also use the services as they become available for internal or public use.
• e.g. would you like to enhance YOUR database by adding germplasm data pulled by AGI code from our database without going through our pages ?
• i.e. Data YOUR WAY when YOU want it in YOUR format and layout.

Near future - Ensembl
• Arabidopsis
• Internal NASC ENSEMBL development plan
• 2003/4 launch

Nottingham Arabidopsis stock centre
Taking from the data and seed rich and giving to the data and seed poor

Bioinformatics / Curation team
• David Craigon – Affy DB
• Graeme Gill – Germplasm Curation
• Janet Higgins – Affy / CATMA QC curation
• Emma Humphreys – Germplasm curation
• Nick James – Germplasm DB
• John Okyere – Affy analysis
• Beatrice Shildknecht – Planet/BioMoby
• Guo-an Sun – AGR DB
Teams: Seeds, Bioinformatics, Molecular Biology
Bold = here in York

The future ? It’s already here
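
Appendix sketch: the registry idea from the BioMoBY and "Service amplification" slides can be illustrated with a few lines of Python. This is only a toy under stated assumptions, not the BioMOBY API: the class names, the two service names and the example.org URLs are hypothetical, the registry is an in-memory list standing in for MOBY Central, and the SOAP transport is left out entirely. The cross-references fed in are the ones from the mock Apetala3 response.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Service:
    """One registered service. All values used below are hypothetical."""
    name: str
    service_type: str      # position in a service-type hierarchy
    input_namespace: str   # namespace of the objects it accepts, e.g. "EMBL/ID"
    url: str               # where the service would be called
    description: str       # human-readable description


@dataclass
class Registry:
    """Toy in-memory stand-in for a central registry such as MOBY Central."""
    services: List[Service] = field(default_factory=list)

    def register(self, service: Service) -> None:
        self.services.append(service)

    def find(self, input_namespace: str) -> List[Service]:
        """Return every service that accepts objects from this namespace."""
        return [s for s in self.services if s.input_namespace == input_namespace]


def amplify(registry: Registry,
            cross_references: List[Tuple[str, str]]) -> Dict[Tuple[str, str], List[str]]:
    """For each (namespace, id) cross-reference, list services that could be called next."""
    options: Dict[Tuple[str, str], List[str]] = {}
    for namespace, identifier in cross_references:
        names = [s.name for s in registry.find(namespace)]
        if names:  # skip cross-references nobody has registered a service for
            options[(namespace, identifier)] = names
    return options


# Hypothetical providers announce what they can do.
registry = Registry()
registry.register(Service(
    name="find_similar_genes",
    service_type="Retrieval",
    input_namespace="GO/Acc",
    url="http://example.org/similar-genes",
    description="I can find functionally similar genes from GO ids",
))
registry.register(Service(
    name="get_expression_data",
    service_type="Retrieval",
    input_namespace="EMBL/ID",
    url="http://example.org/expression",
    description="I can get expression data from an EMBL id",
))

# Cross-references taken from the mock Apetala3 response.
xrefs = [("PubMed/ID", "7948893"), ("GO/Acc", "GO:0001835"), ("EMBL/ID", "AF056541")]
options = amplify(registry, xrefs)
for (namespace, identifier), names in options.items():
    print(namespace, identifier, "->", ", ".join(names))
```

The client never hard-codes which provider to ask: it only knows the namespaces of the cross-references it holds, and the registry tells it what can be done with each one. New services become usable as soon as they register.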
"MS Powerpoint file"