Motivation: Looping is one of the mechanisms responsible for gene regulation in eukaryotes. Chromatin loops are known to enable enhancer-promoter interactions despite the distance that separates the two elements along the nucleotide sequence. The loops are mediated by transcription factors that bind these sequences and catalyze the formation of the looping complex. These interactions are very difficult to investigate and to predict, since they involve the formation of complexes that include different proteins and DNA strands. This work aims at extracting information about these interactions by exploiting the available data on interacting enhancer-promoter pairs (EP-pairs) and by characterizing conserved pairs of transcription factor binding sites (TFBS; TF-pairs). This analysis can help in constructing networks that represent looping-catalyzing DNA-protein interactions. Furthermore, the results can eventually serve as an information resource for developing computational approaches to detect chromatin looping interactions.

Methods: We first generated a database of EP-pairs by downloading experimentally validated human enhancers from VISTA (http://enhancer.lbl.gov/) and collecting the corresponding promoters from UCSC (http://genome.ucsc.edu/). The number of experimentally validated human enhancers is 712, and the total number of possible EP-pairs amounted to about 1000. VISTA annotates whether a sequence is an enhancer but gives no information about the promoter(s) it enhances. As a proof of principle, we assumed that: i) any experimentally validated enhancer (from VISTA) is paired to the most proximal promoter (UCSC) when its location is intragenic; ii) any experimentally validated enhancer (from VISTA) is paired to the most proximal upstream and downstream promoters (UCSC) when its location is intergenic. To identify statistically significant TF-pairs putatively involved in looping, we adopted a binomial test.
The test positively scores two TFs as belonging to a pair when their binding sites are found on coupled enhancers and promoters.

Results: After collecting enhancers and promoters, we performed a knowledge-based prediction of TFBS over the 1000 paired elements using a set of 776 position frequency matrices collected from JASPAR, a matrix profile database of human transcription factor binding sites (http://jaspar.genereg.net/). The prediction gave 2263 and 1718 TFBS in promoters and enhancers, respectively (p-value threshold set at 10^-7). By applying the statistical test to estimate the presence of significant TF-pairs, we extracted 37 significant TF-pairs (significance threshold of 10^-5) involving 24 different TFs. None of the significant TF-pairs contained binding sites of the same transcription factor, suggesting that looping is mediated by protein-protein interaction. As a preliminary validation of our procedure, we took advantage of the recently released ENCODE data (http://genome.ucsc.edu/ENCODE/). ChIP-seq data are available for 3 of our 24 TFs. By collecting the corresponding bound DNA sequences and merging them with the 5C chromatin interaction data that detect enhancer-promoter pairs, we found that one TF-pair was also experimentally detected. The pair includes the transcriptional repressor protein YY1 and Protein C-ets-1 (ETS1). YY1 is a multifunctional transcription factor that exerts positive and negative control on a large number of cellular and viral genes by binding to sites overlapping the transcription start site; similarly, ETS1 may act as either a transcriptional activator or a repressor of numerous genes and is involved in stem cell development, cell senescence and death, and tumorigenesis. The two proteins may be directly or indirectly linked in looping mechanisms, as suggested by the human protein-protein interaction map (STRING, http://string-db.org/).
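The two pairing assumptions stated in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the `Gene` record, the `pair_enhancer` function, the use of the enhancer midpoint, and the reading of "upstream/downstream" as the lower- and higher-coordinate sides of a single chromosome are all assumptions made here for illustration. Enhancers that only partially overlap a gene are treated as intergenic, which is another simplification.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record on a single chromosome."""
    name: str
    start: int  # gene body start (genomic coordinate)
    end: int    # gene body end
    tss: int    # transcription start site, used as the promoter position


def pair_enhancer(enh_start, enh_end, genes):
    """Return the names of the promoters paired to one enhancer."""
    mid = (enh_start + enh_end) // 2
    # Assumption i: an intragenic enhancer (fully inside a gene body)
    # is paired to the single most proximal promoter.
    if any(g.start <= enh_start and enh_end <= g.end for g in genes):
        nearest = min(genes, key=lambda g: abs(g.tss - mid))
        return [nearest.name]
    # Assumption ii: an intergenic enhancer is paired to the nearest
    # promoter on each side, when one exists.
    by_tss = sorted(genes, key=lambda g: g.tss)
    i = bisect_left([g.tss for g in by_tss], mid)
    paired = []
    if i > 0:
        paired.append(by_tss[i - 1].name)  # lower-coordinate side
    if i < len(by_tss):
        paired.append(by_tss[i].name)      # higher-coordinate side
    return paired


# Toy coordinates, invented for the example only.
genes = [
    Gene("A", 100, 500, 100),
    Gene("B", 1000, 2000, 2000),
    Gene("C", 5000, 6000, 5000),
]
intragenic = pair_enhancer(200, 300, genes)
intergenic = pair_enhancer(2500, 2700, genes)
```

Under this reading each enhancer contributes one or two EP-pairs, which is compatible with 712 enhancers yielding about 1000 pairs.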
In summary, our procedure uncovers putative associations among transcription factors that bind to paired promoter and enhancer regions in DNA.
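The abstract does not specify the null model behind the binomial test, so the following is only a sketch of one plausible formulation. It assumes that, if the two TFs were independent, the probability of finding the first on the enhancer and the second on the promoter of the same EP-pair is the product of their individual site frequencies, and it computes the upper-tail probability of the observed co-occurrence count. The function names and the toy input are hypothetical.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for i in range(k, n + 1):
        log_coef = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        total += exp(log_coef + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def tf_pair_pvalue(ep_pairs, tf_enh, tf_prom):
    """ep_pairs: list of (enhancer_tf_set, promoter_tf_set), one per EP-pair."""
    n = len(ep_pairs)
    in_enh = sum(tf_enh in e for e, _ in ep_pairs)
    in_prom = sum(tf_prom in p for _, p in ep_pairs)
    both = sum(tf_enh in e and tf_prom in p for e, p in ep_pairs)
    # Assumed null: the two TFs bind independently of each other.
    p0 = (in_enh / n) * (in_prom / n)
    return binomial_upper_tail(both, n, p0)


# Toy input, invented for the example: four EP-pairs, two generic TFs.
toy_pairs = [
    ({"TF_A"}, {"TF_B"}),
    ({"TF_A"}, {"TF_B"}),
    (set(), set()),
    (set(), set()),
]
p_value = tf_pair_pvalue(toy_pairs, "TF_A", "TF_B")
```

In a setting like the one described, a TF-pair would be retained when its p-value falls below the chosen significance threshold (10^-5 in the abstract). With about 1000 EP-pairs the log-space sum avoids the overflow that a direct product of binomial coefficients and powers could cause.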
Aggazio F, Casadio R (2013). A genomic scale investigation on enhancer-promoter interactions mediated by transcription factors.
A genomic scale investigation on enhancer-promoter interactions mediated by transcription factors.
AGGAZIO, FRANCESCO;CASADIO, RITA
2013