
									                Supplemental Materials for




    Inferring Causal Relationships among Different Histone

               Modifications and Gene Expression




 Hong Yu*, Shanshan Zhu*, Bing Zhou*, Huiling Xue and Jing-Dong J. Han




Chinese Academy of Sciences Key Laboratory of Molecular Developmental
Biology, Center for Molecular Systems Biology, Institute of Genetics and
Developmental Biology, Chinese Academy of Sciences, Beijing, 100101,
China




Supplemental Table 1. Pair-wise Pearson Correlation Coefficients (PCC)
among the components of PRCs and H3K27me3. Genes with no binding
events for any of the factors were excluded from PCC calculation.

            Suz12   Rnf2    Phc1    H3K27me3

  Eed       0.57    0.45    0.49    0.16
  H3K27me3  0.23    -0.10   -0.04
  Phc1      0.44    0.48
  Rnf2      0.42
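
The exclusion rule in the caption of Supplemental Table 1 can be illustrated with a minimal sketch. It assumes binding is stored as a genes x factors matrix in which 0 means no binding event. The toy matrix, the factor names and the function name `pairwise_pcc` are hypothetical and are not taken from the paper's data or code.

```python
import numpy as np

def pairwise_pcc(binding, names):
    """Pearson correlation for every pair of columns, ignoring all-zero rows."""
    binding = np.asarray(binding, dtype=float)
    # Drop genes (rows) that have no binding event for any factor.
    bound = binding[np.any(binding != 0, axis=1)]
    corr = np.corrcoef(bound, rowvar=False)
    return {
        (names[i], names[j]): float(corr[i, j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }

# Hypothetical toy matrix: 6 genes x 3 factors (1 = bound, 0 = not bound).
toy_names = ["FactorA", "FactorB", "FactorC"]
toy_binding = [
    [1, 1, 0],
    [0, 0, 0],  # excluded: no binding for any factor
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 0],  # excluded: no binding for any factor
]

pcc = pairwise_pcc(toy_binding, toy_names)
for pair, r in pcc.items():
    print(pair, round(r, 2))
```

Rows that are zero for every factor would otherwise count as agreement between all pairs of factors and push the coefficients upward, which is presumably why such genes were left out.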




Supplemental Table 2. Enriched GO annotations among genes in Cluster H
and L, as well as subclusters of Cluster H and Cluster L. Genes associated
with an enriched GO term in a cluster are also listed in the last column.
Provided as Supplemental Table 2.xls.


Supplemental Table 3. Enriched KEGG pathway annotations among genes in
Cluster H and L, as well as subclusters of Cluster H and Cluster L. Genes
associated with an enriched KEGG pathway annotation in a cluster are also
listed in the last column. Provided as Supplemental Table 3.xls.


Supplemental Table 4. Boundaries found by k-means clustering to define
discretized count values for each of the 20 histone modifications, Pol II, CTCF
and H2A.Z within TSS ± 1 kb of the 12,078 genes.




               Boundary      Boundary       Boundary        Count    Count      Count
               of Low        of Medium      of High         of Low   of Medium  of High    Total

 CTCF          0-13          14-41          42-164          9102     2354       622        12078

 H2AZ          0-22          23-54          55-161          6303     3917       1858       12078

 H2BK5me1      0-18          19-55          56-301          8504     3152       422        12078

 H3K27me1      0-8           9-19           20-60           5772     4207       2099       12078

 H3K27me2      0-5           6-11           12-30           9052     2587       439        12078

 H3K27me3      0-7           8-18           19-58           8816     2268       994        12078

 H3K36me1      0-7           8-13           14-38           5597     4776       1705       12078

 H3K36me3      0-7           8-34           35-382          8894     3092       92         12078

 H3K4me1       0-29          30-73          74-356          6178     4582       1318       12078

 H3K4me2       0-24          25-57          58-163          5379     4607       2092       12078

 H3K4me3       0-130         131-322        323-763         4944     4273       2861       12078

 H3K79me1      0-8           9-13           14-72           4902     5250       1926       12078

 H3K79me2      0-1           2-2            3-8             9170     1903       1005       12078

 H3K79me3      0-12          13-31          32-209          8211     3286       581        12078

 H3K9me1       0-24          25-54          55-149          4360     4918       2800       12078

 H3K9me2       0-4           5-9            10-30           7702     3660       716        12078

 H3K9me3       0-3           4-18           19-71           10086    1944       48         12078

 H3R2me1       0-7           8-12           13-45           4956     4883       2239       12078

 H3R2me2       0-4           5-7            8-49            6406     4060       1612       12078

 H4K20me1      0-36          37-119         120-637         8966     2746       366        12078

 H4K20me3      0-48          50-180         186-505         12032    34         12         12078

 H4R3me2       0-4           5-7            8-48            6508     4035       1535       12078

 Pol II        0-34          35-109         110-565         7356     4145       577        12078

* Zero counts or no counts were excluded from the input to k-means, but are categorized as low counts
for Bayesian network inference.
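
The discretization behind Supplemental Table 4 can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: it uses a plain one-dimensional Lloyd's k-means with quantile-based starting centres, and the function name `discretize_counts` and the toy counts are hypothetical. Only the handling of zero counts (excluded from clustering, then labelled low) follows the table footnote.

```python
import numpy as np

def discretize_counts(counts, k=3, n_iter=100):
    """Cluster non-zero counts into k levels; zero counts are labelled low (0)."""
    counts = np.asarray(counts, dtype=float)
    nonzero = counts[counts > 0]  # zeros are excluded from the k-means input
    # Assumption: deterministic starting centres at evenly spaced quantiles.
    centers = np.quantile(nonzero, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        assign = np.argmin(np.abs(nonzero[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            nonzero[assign == j].mean() if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    centers = np.sort(centers)
    # Label every gene by its nearest centre, then force zero counts to low.
    levels = np.argmin(np.abs(counts[:, None] - centers[None, :]), axis=1)
    levels[counts == 0] = 0
    # Observed count range of each level, analogous to the boundary columns.
    ranges = {
        int(lv): (int(counts[levels == lv].min()), int(counts[levels == lv].max()))
        for lv in np.unique(levels)
    }
    return levels, ranges

# Hypothetical tag counts for 11 genes.
toy_counts = [0, 0, 1, 2, 3, 20, 22, 25, 90, 95, 100]
levels, ranges = discretize_counts(toy_counts)
print(levels.tolist())
print(ranges)
```

Because the boundaries are reported as ranges of observed counts, adjacent levels need not meet exactly, as in the H4K20me3 row.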




Supplemental Table 5. Boundaries used to define discretized values of
relative T cell expression levels compared to other tissues for 12,078 genes
that have been measured by a gene expression microarray (Su et al. 2004).
Provided as Supplemental Table 5.xls.


Supplemental Table 6. Conditional probability distribution for each node in the
integrated Bayesian network model. Provided as Supplemental Table 6.xls.

Supplemental Table 7. Methyltransferases and demethylases for histone
modifications inferred to form causal relationships, as shown in Supplemental
Fig. 3. The references for each enzyme-substrate relationship are listed.
Provided as Supplemental Table 7.xls.


Supplemental Figure 1. One additional dependency input to H3K27me3
converts all but one edge in the polycomb gene network into compelled edges,
provided the input is independent of Eed or Suz12 binding. The pseudo-factor is
labeled ‘Add-in’. Compelled edges are marked in red and are directional, whereas
the reversible edge is green and undirected.


Supplemental Figure 2. Effects of the level of discretization on the accuracy and
coverage of the models. To compare the accuracy of the networks inferred with
different k values for k-means clustering, the cross-validation accuracy and
coverage were first derived under each k from ChIP-seq counts within TSS ± 1
kb regions. The network coverages were then normalized by the ratio of total
edges inferred under the different k values, i.e., multiplied by a factor n/m,
where n is the total number of edges under a specific k value (36, 32 and 26 for
k=2, 3 and 4, respectively) and m is the maximal number of edges in the final
Bayesian network under the three different k values. The accuracy-coverage
curves indicate that k=3 performs best, in agreement with the relative
robustness of edges in each model: the average appearance frequency of an edge
in the 100 minimal networks is 65%, 90% and 84% for networks derived under
k=2, 3 and 4, respectively.
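
The coverage normalization described for Supplemental Figure 2 amounts to a small piece of arithmetic, sketched here. The edge counts are those given in the caption. Taking m as the largest of the three counts (36) is an interpretation of "the maximal number of edges", and the helper `normalize_coverage` and the example coverage value are hypothetical.

```python
# Total edges inferred under each k, as stated in the caption.
edges = {2: 36, 3: 32, 4: 26}

# Assumption: m is the largest edge count among the three k values.
m = max(edges.values())

# Normalization factor n/m for each k.
factors = {k: n / m for k, n in edges.items()}

def normalize_coverage(coverage, k):
    """Scale a raw coverage value by n/m for the given k (hypothetical helper)."""
    return coverage * factors[k]

for k, f in factors.items():
    print(f"k={k}: n/m = {edges[k]}/{m} = {f:.3f}")

# Hypothetical raw coverage of 0.5 under k=4.
print(round(normalize_coverage(0.5, 4), 3))
```

Under this reading, coverage for the network with the most edges is left unchanged, while coverage for the sparser networks is scaled down in proportion to their edge counts, so the curves are compared on a common edge scale.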


Supplemental Figure 3. Methyltransferases and demethylases as potential
points to perturb the network and test the model. Methyltransferases and
demethylases listed in recent reviews and other publications (see
Supplemental Table 7 for references) are shown as nodes that activate
(methyltransferase) or inhibit (demethylase) a histone methylation form.
By perturbing an upstream node and observing the level of the
downstream node, an epistatic relationship can be validated. A node label for
a methyltransferase or demethylase lists both the methylation status-specific
enzymes (before the semicolon) and those whose specific target
histone methylation status (mono-, di- or tri-methylation) has not been
determined (after the semicolon).

References

Su, A.I., T. Wiltshire, S. Batalov, H. Lapp, K.A. Ching, D. Block, J. Zhang, R. Soden,
        M. Hayakawa, G. Kreiman, M.P. Cooke, J.R. Walker, and J.B. Hogenesch.
        2004. A gene atlas of the mouse and human protein-encoding transcriptomes.
        Proc Natl Acad Sci U S A 101: 6062-6067.



