Publication information

Authors: Kulakovskiy I.V., Vorontsov I.E., Yevshin I.S., Sharipov R.N., Fedorova A.D., Rumynskiy E.I., Medvedeva Y.A., Magana-Mora A., Bajic V.B., Papatsenko D.A., Kolpakov F.A., Makeev V.J.
Title: HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis
Bibliographic reference: Kulakovskiy I.V., Vorontsov I.E., Yevshin I.S., Sharipov R.N., Fedorova A.D., Rumynskiy E.I., Medvedeva Y.A., Magana-Mora A., Bajic V.B., Papatsenko D.A., Kolpakov F.A., Makeev V.J. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis // Nucleic Acids Research. - 2018. - Vol. 46. - Iss. D1. - P. D252-D259. - ISSN 0305-1048. - EISSN 1362-4962.
External systems: DOI: 10.1093/nar/gkx1106; RSCI (РИНЦ): 35536203; PubMed: 29140464; SCOPUS: 2-s2.0-85040905364; WoS: 000419550700039
Abstract: We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
Keywords: SEQUENCES; EXPANSION; GENE; MOTIFS; MARKOV-MODELS; SITES MODELS; OPEN-ACCESS DATABASE; SELF-RENEWAL; PROFILES
Published: 2018
References:
1. Alam, T., Medvedeva, Y. A., Jia, H., Brown, J. B., Lipovich, L. and Bajic, V. B. (2014) Promoter analysis reveals globally differential regulation of human long non-coding RNA and protein-coding genes. PLoS One, 9, e109443.
2. Schwartz, A. M., Demin, D. E., Vorontsov, I. E., Kasyanov, A. S., Putlyaeva, L. V., Tatosyan, K. A., Kulakovskiy, I. V. and Kuprash, D. V. (2017) Multiple single nucleotide polymorphisms in the first intron of the IL2RA gene affect transcription factor binding and enhancer activity. Gene, 602, 50-56.
3. Schwartz, A. M., Putlyaeva, L. V., Covich, M., Klepikova, A. V., Akulich, K. A., Vorontsov, I. E., Korneev, K. V., Dmitriev, S. E., Polanovsky, O. L., Sidorenko, S. P. et al. (2016) Early B-cell factor 1 (EBF1) is critical for transcriptional control of SLAMF1 gene in human B cells. Biochim. Biophys. Acta-Gene Regul. Mech., 1859, 1259-1268.
4. Boeva, V. (2016) Analysis of genomic sequence motifs for deciphering transcription factor binding and transcriptional regulation in Eukaryotic cells. Front. Genet., 7, 24.
5. Vorontsov, I. E., Khimulya, G., Lukianova, E. N., Nikolaeva, D. D., Eliseeva, I. A., Kulakovskiy, I. V. and Makeev, V. J. (2016) Negative selection maintains transcription factor binding motifs in human cancer. BMC Genomics, 17, 395.
6. Medvedeva, Y. A., Khamis, A. M., Kulakovskiy, I. V., Ba-Alawi, W., Bhuyan, M. S. I., Kawaji, H., Lassmann, T., Harbers, M., Forrest, A. R. R. and Bajic, V. B. (2014) Effects of cytosine methylation on transcription factor binding sites. BMC Genomics, 15, 119.
7. Eggeling, R., Roos, T., Myllymäki, P. and Grosse, I. (2015) Inferring intra-motif dependencies of DNA binding sites from ChIP-seq data. BMC Bioinformatics, 16, 375.
8. Siebert, M. and Söding, J. (2016) Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences. Nucleic Acids Res., 44, 6055-6069.
9. Kulakovskiy, I. V., Levitsky, V., Oshchepkov, D., Bryzgalov, L., Vorontsov, I. and Makeev, V. (2013) From binding motifs in ChIP-Seq data to improved models of transcription factor binding sites. J. Bioinform. Comput. Biol., 11, 1340004.
10. Mathelier, A. and Wasserman, W. W. (2013) The next generation of transcription factor binding site prediction. PLoS Comput. Biol., 9, e1003214.
11. Forrest, A. R. R., Kawaji, H., Rehli, M., Baillie, J. K., de Hoon, M. J. L., Lassmann, T., Itoh, M., Summers, K. M., Suzuki, H., Daub, C. O. et al. (2014) A promoter-level mammalian expression atlas. Nature, 507, 462-470.
12. Gursky, V. V., Kozlov, K. N., Kulakovskiy, I. V., Zubair, A., Marjoram, P., Lawrie, D. S., Nuzhdin, S. V. and Samsonova, M. G. (2017) Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network. PLoS One, 12, e0184657.
13. Balwierz, P. J., Pachkov, M., Arnold, P., Gruber, A. J., Zavolan, M. and van Nimwegen, E. (2014) ISMARA: automated modeling of genomic signals as a democracy of regulatory motifs. Genome Res., 24, 869-884.
14. Medina-Rivera, A., Defrance, M., Sand, O., Herrmann, C., Castro-Mondragon, J. A., Delerce, J., Jaeger, S., Blanchet, C., Vincens, P., Caron, C. et al. (2015) RSAT 2015: Regulatory Sequence Analysis Tools. Nucleic Acids Res., 43, W50-W56.
15. Zhang, Y., Liu, T., Meyer, C. A., Eeckhoute, J., Johnson, D. S., Bernstein, B. E., Nusbaum, C., Myers, R. M., Brown, M., Li, W. et al. (2008) Model-based analysis of ChIP-Seq (MACS). Genome Biol., 9, R137.
16. Narlikar, L. and Jothi, R. (2012) ChIP-Seq data analysis: identification of protein-DNA binding sites with SISSRs peak-finder. Methods Mol. Biol., 802, 305-322.
17. Guo, Y., Mahony, S. and Gifford, D. K. (2012) High resolution genome wide binding event finding and motif discovery reveals transcription factor spatial binding constraints. PLoS Comput. Biol., 8, e1002638.
18. Zhang, X., Robertson, G., Krzywinski, M., Ning, K., Droit, A., Jones, S. and Gottardo, R. (2011) PICS: Probabilistic Inference for ChIP-seq. Biometrics, 67, 151-163.
19. Yevshin, I., Sharipov, R., Valeev, T., Kel, A. and Kolpakov, F. (2016) GTRD: a database of transcription factor binding sites identified by ChIP-seq experiments. Nucleic Acids Res., 45, D61-D67.
20. Kulakovskiy, I. V., Vorontsov, I. E., Yevshin, I. S., Soboleva, A. V., Kasianov, A. S., Ashoor, H., Ba-Alawi, W., Bajic, V. B., Medvedeva, Y. A., Kolpakov, F. A. et al. (2016) HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models. Nucleic Acids Res., 44, D116-D125.
21. Levitsky, V. G., Kulakovskiy, I. V., Ershov, N. I., Oshchepkov, D. Y., Makeev, V. J., Hodgman, T. C. and Merkulova, T. I. (2014) Application of experimentally verified transcription factor binding sites models for computational analysis of ChIP-Seq data. BMC Genomics, 15, 80.
22. Kulakovskiy, I. V., Boeva, V. A., Favorov, A. V. and Makeev, V. J. (2010) Deep and wide digging for binding motifs in ChIP-Seq data. Bioinformatics, 26, 2622-2623.
23. Kulakovskiy, I. V., Medvedeva, Y. A., Schaefer, U., Kasianov, A. S., Vorontsov, I. E., Bajic, V. B. and Makeev, V. J. (2013) HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Res., 41, D195-D202.
24. Wingender, E., Schoeps, T. and Dönitz, J. (2012) TFClass: an expandable hierarchical classification of human transcription factors. Nucleic Acids Res., 41, D165-D170.
25. Wingender, E., Schoeps, T., Haubrock, M. and Dönitz, J. (2015) TFClass: a classification of human transcription factors and their rodent orthologs. Nucleic Acids Res., 43, D97-D102.
26. Medina-Rivera, A., Abreu-Goodger, C., Thomas-Chollier, M., Salgado, H., Collado-Vides, J. and Van Helden, J. (2011) Theoretical and empirical quality assessment of transcription factor-binding motifs. Nucleic Acids Res., 39, 808-824.
27. Wilson, S., Qi, J. and Filipp, F. V. (2016) Refinement of the androgen response element based on ChIP-Seq in androgen-insensitive and androgen-responsive prostate cancer cell lines. Sci. Rep., 6, 32611.
28. Dolfini, D. and Mantovani, R. (2012) YB-1 (YBX1) does not bind to Y/CCAAT boxes in vivo. Oncogene, 32, 4189-4190.
29. Wang, J., Zhuang, J., Iyer, S., Lin, X. Y., Greven, M. C., Kim, B. H., Moore, J., Pierce, B. G., Dong, X., Virgil, D. et al. (2013) Factorbook.org: a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium. Nucleic Acids Res., 41, D171-D176.
30. Papatsenko, D., Darr, H., Kulakovskiy, I. V., Waghray, A., Makeev, V. J., MacArthur, B. D. and Lemischka, I. R. (2015) Single-cell analyses of ESCs reveal alternative pluripotent cell states and molecular mechanisms that control self-renewal. Stem Cell Rep., 5, 207-220.
31. Maaskola, J. and Rajewsky, N. (2014) Binding site discovery from nucleic acid sequences by discriminative learning of hidden Markov models. Nucleic Acids Res., 42, 12995-13011.
32. Gagliardi, A., Mullin, N. P., Ying Tan, Z., Colby, D., Kousa, A. I., Halbritter, F., Weiss, J. T., Felker, A., Bezstarosti, K., Favaro, R. et al. (2013) A direct physical interaction between Nanog and Sox2 regulates embryonic stem cell self-renewal. EMBO J., 32, 2231-2247.
33. Hojo, H., Ohba, S., He, X., Lai, L. P. and McMahon, A. P. (2016) Sp7/Osterix is restricted to bone-forming vertebrates where it acts as a Dlx co-factor in osteoblast specification. Dev. Cell, 37, 238-253.
34. Sebastian, A. and Contreras-Moreira, B. (2014) footprintDB: a database of transcription factors with annotated cis elements and binding interfaces. Bioinformatics, 30, 258-265.
35. Verfaillie, A., Imrichova, H., Janky, R. and Aerts, S. (2015) iRegulon and i-cisTarget: reconstructing regulatory networks using motif and track enrichment. Curr. Protoc. Bioinformatics, 52, 2.16.1-2.16.39.
36. Weirauch, M. T., Yang, A., Albu, M., Cote, A. G., Montenegro-Montero, A., Drewe, P., Najafabadi, H. S., Lambert, S. A., Mann, I., Cook, K. et al. (2014) Determination and inference of eukaryotic transcription factor sequence specificity. Cell, 158, 1431-1443.
37. The UniProt Consortium (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res., 40, D71-D75.
38. Yusuf, D., Butland, S. L., Swanson, M. I., Bolotin, E., Ticoll, A., Cheung, W. A., Zhang, X. Y. C., Dickman, C. T. D., Fulton, D. L., Lim, J. S. et al. (2012) The transcription factor encyclopedia. Genome Biol., 13, R24.
39. Mathelier, A., Fornes, O., Arenillas, D. J., Chen, C., Denay, G., Lee, J., Shi, W., Shyr, C., Tan, G., Worsley-Hunt, R. et al. (2015) JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res., 44, D110-D115.
40. Mathelier, A., Zhao, X., Zhang, A. W., Parcy, F., Worsley-Hunt, R., Arenillas, D. J., Buchman, S., Chen, C., Chou, A., Ienasescu, H. et al. (2014) JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Res., 42, D142-D147.
41. Schmidt, F., Gasparoni, N., Gasparoni, G., Gianmoena, K., Cadenas, C., Polansky, J. K., Ebert, P., Nordström, K., Barann, M., Sinha, A. et al. (2017) Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction. Nucleic Acids Res., 45, 54-66.
42. Deplancke, B., Alpern, D., Gardeux, V., Adam, R. C., Yang, H., Rockowitz, S., Larsen, S. B., Nikolova, M., Oristian, D. S., Polak, L. et al. (2016) The genetics of transcription factor DNA binding variation. Cell, 166, 538-554.
43. Afanasyeva, M. A., Putlyaeva, L. V., Demin, D. E., Kulakovskiy, I. V., Vorontsov, I. E., Fridman, M. V., Makeev, V. J., Kuprash, D. V. and Schwartz, A. M. (2017) The single nucleotide variant rs12722489 determines differential estrogen receptor binding and enhancer properties of an IL2RA intronic region. PLoS One, 12, e0172681.
44. Touzet, H. and Varré, J.-S. (2007) Efficient and accurate P-value computation for Position Weight Matrices. Algorithms Mol. Biol., 2, 15.
45. Bailey, T. L., Johnson, J., Grant, C. E. and Noble, W. S. (2015) The MEME suite. Nucleic Acids Res., 43, W39-W49.