                            Introduction to Bioinformatics

                                    Database Search

This lab introduces you to the databases available online at the National Center for
Biotechnology Information (NCBI). If you are not familiar with NCBI resources, a visit
to the site is well worth your time.

1. The NCBI website contains all sorts of information you may need to become a
   bioinformatician. One of its most convenient features is the Entrez information
   retrieval system, a search engine that links several databases. List the databases
   you can access through the Entrez website. Get familiar with how to retrieve
   information from these databases through the Entrez portal.
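
The database list asked for in item 1 can also be obtained programmatically through
NCBI's E-utilities, whose EInfo endpoint returns the Entrez database names as XML.
The following is a minimal sketch, not a required part of the lab. The function names
and the three-entry sample reply are invented for illustration, and fetch_db_names
needs network access, so only the offline parser is run here.

```python
import urllib.request
import xml.etree.ElementTree as ET

# EInfo endpoint of the NCBI E-utilities; with no parameters it lists all databases.
EINFO_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"


def parse_db_names(xml_data):
    """Return the sorted database names found in an EInfo XML reply."""
    root = ET.fromstring(xml_data)
    return sorted(node.text for node in root.iter("DbName") if node.text)


def fetch_db_names(timeout=30):
    """Ask EInfo for the current database list (requires network access)."""
    with urllib.request.urlopen(EINFO_URL, timeout=timeout) as reply:
        return parse_db_names(reply.read())


# Hand-trimmed sample reply with three entries, for illustration only.
SAMPLE_REPLY = """<eInfoResult><DbList>
<DbName>pubmed</DbName><DbName>gene</DbName><DbName>omim</DbName>
</DbList></eInfoResult>"""

print(parse_db_names(SAMPLE_REPLY))
```

Calling fetch_db_names() on a networked machine should return the full, current list,
which you can compare with what the Entrez web page shows.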

2. Visit the NCBI Map Viewer website. Briefly discuss the content of this database.
   How many genes in the human genome contain the term "homeo" in their name? To be
   sure you find them all, search for "*homeo*". The asterisks are wildcards, which
   means that you are searching for "homeo" preceded or followed by any other
   characters. Please answer the following questions:
   (1) Number found: ______ .
   (2) Which chromosome contains the largest number of these genes? How many?
   (3) Among the genes found in question (1), find one that has a role in insulin action.
       (a) Name of the gene: ________________________________.
       (b) Four-character ID of the gene: ______ .
       (c) Which chromosome contains this gene?
       (d) According to OMIM, what is the role of the protein encoded by this gene?
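
To see why item 2 asks for asterisks on both sides of "homeo", the sketch below
compares three patterns using Python's fnmatch module. This is only an analogy: the
shell-style matching used here is an assumption standing in for the Map Viewer search
engine, whose exact rules may differ, and the gene names are a hand-typed illustrative
list, not search results.

```python
from fnmatch import fnmatchcase


def matches(pattern, name):
    """Case-insensitive wildcard match; '*' stands for any run of characters."""
    return fnmatchcase(name.lower(), pattern.lower())


# Hand-typed example names for illustration, not results from Map Viewer.
names = [
    "homeobox A1",
    "NK2 homeobox 1",
    "paired like homeodomain 2",
    "insulin receptor",
]

for pattern in ("homeo", "homeo*", "*homeo*"):
    hits = [n for n in names if matches(pattern, n)]
    print(pattern, len(hits), hits)
```

Under these matching rules, only the pattern with a wildcard on each side picks up
names in which "homeo" appears in the middle of the name.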

3. Here is a nucleotide sequence:
   Please use database searches to tell us as much as you can about this sequence.
   (1)   What database(s) did you search, and what tool(s) did you use to search? What
         parameter settings did you use?

   (2)   If this is not an exact match to a database entry, which entries did you consider as
         potential matches? Select what you consider to be the best match and explain why it is
         the best. (In the case of a tie, for the purposes of this exercise, use the human version.)
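
One possible way to make the choice in question (2) explicit is to rank the candidate
hits by a fixed set of criteria. The sketch below assumes a ranking by lowest E-value,
then highest percent identity and query coverage, with the human entry winning a tie.
These criteria are one reasonable choice, not a prescribed method, and the hit records
are hypothetical placeholders rather than real search results.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    accession: str   # placeholder labels, not real accession numbers
    organism: str
    evalue: float    # expected number of chance matches; lower is better
    identity: float  # percent identity over the aligned region
    coverage: float  # percent of the query covered by the alignment


def rank_key(hit):
    """Lowest E-value first, then highest identity and coverage, then human."""
    return (hit.evalue, -hit.identity, -hit.coverage,
            hit.organism != "Homo sapiens")


# Hypothetical hits, invented purely to demonstrate the tie-break.
hits = [
    Hit("HYPOTHETICAL_1", "Mus musculus", 0.0, 100.0, 100.0),
    Hit("HYPOTHETICAL_2", "Homo sapiens", 0.0, 100.0, 100.0),
    Hit("HYPOTHETICAL_3", "Pan troglodytes", 2e-150, 98.5, 100.0),
]

best = min(hits, key=rank_key)
print(best.accession, best.organism)
```

Whatever criteria you use, state them in your answer so the reader can see why the
chosen entry ranks first.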

   The remaining questions apply to your best match, and you may need to search
   databases other than those at NCBI.

   (3)   What is the official symbol for this gene?

   (4)   What is the name for this gene?

   (5)   What organism is it from?

   (6)   What protein does this gene code for?

   (7)   What is the amino acid sequence of this protein?

   (8)   Is the function of this protein known? If so, what does it do?

   (9)   Is the secondary structure of this protein known? If so, how many alpha helices and
         beta sheets are there in it? How did you determine the number?

   (10) Is the tertiary structure of this protein known? If so, what is the PDB structure
        accession number?

   (11) Anything else interesting you found about this sequence.
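
For question (9), one way to arrive at a count is to read the secondary structure
records of a PDB-format file: each HELIX record describes one helix, and each SHEET
record describes one strand labeled with the ID of the sheet it belongs to. The sketch
below assumes the legacy fixed-column PDB format (sheet ID in columns 12-14). The
function name is made up for this example, and the sample records are invented to
show the layout, not taken from a real entry.

```python
def count_helices_and_sheets(pdb_text):
    """Count HELIX records and distinct sheet IDs in PDB-format text."""
    helices = 0
    sheet_ids = set()
    for line in pdb_text.splitlines():
        record = line[:6]
        if record == "HELIX ":
            helices += 1                         # one record per helix
        elif record == "SHEET ":
            sheet_ids.add(line[11:14].strip())   # columns 12-14 hold the sheet ID
    return helices, len(sheet_ids)


# Invented records: two helices, and two sheets (A and B) of two strands each.
SAMPLE = "\n".join([
    "HELIX    1  H1 GLY A   10  GLY A   20  1                                  11",
    "HELIX    2  H2 SER A   30  LEU A   42  1                                  13",
    "SHEET    1   A 2 THR A  50  ARG A  54  0",
    "SHEET    2   A 2 ILE A  58  THR A  62 -1",
    "SHEET    1   B 2 VAL A  70  LYS A  74  0",
    "SHEET    2   B 2 GLU A  78  ASP A  82 -1",
])

print(count_helices_and_sheets(SAMPLE))
```

Note that this counts every HELIX record regardless of helix class, so a file that
also lists non-alpha helices would need an extra check before reporting the number
of alpha helices.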

4. Lab report
   Each group should submit a report describing its findings from this lab.
