The increasing availability of ChIP-seq data demands for advanced statistical tools to analyze the results of such experiments. The inherent features of high-throughput sequencing output call for a modelling framework that can account for the spatial dependency between neighboring regions of the genome and the temporal dimension that arises from observing the protein binding process at progressing time points; also, multiple biological/technical replicates of the experiment are usually produced and methods to jointly account for them are needed. Furthermore, the antibodies used in the experiment lead to potentially different immunoprecipitation efficiencies, which can affect the capability of distinguishing between the true signal in the data and the background noise. The statistical procedure proposed consist of a discrete mixture model with an underlying latent Markov random field: the novelty of the model is to allow both spatial and temporal dependency to play a role in determining the latent state of genomic regions involved in the protein binding process, while combining all the information of the replicates available instead of treating them separately. It is also possible to take into account the different antibodies used, in order to obtain better insights of the process and exploit all the biological information available.

Spatio-temporal model for multiple ChIP-seq experiments

RANCIATI, SAVERIO;VIROLI, CINZIA;WIT, ERNST
2015

Abstract

The increasing availability of ChIP-seq data demands for advanced statistical tools to analyze the results of such experiments. The inherent features of high-throughput sequencing output call for a modelling framework that can account for the spatial dependency between neighboring regions of the genome and the temporal dimension that arises from observing the protein binding process at progressing time points; also, multiple biological/technical replicates of the experiment are usually produced and methods to jointly account for them are needed. Furthermore, the antibodies used in the experiment lead to potentially different immunoprecipitation efficiencies, which can affect the capability of distinguishing between the true signal in the data and the background noise. The statistical procedure proposed consist of a discrete mixture model with an underlying latent Markov random field: the novelty of the model is to allow both spatial and temporal dependency to play a role in determining the latent state of genomic regions involved in the protein binding process, while combining all the information of the replicates available instead of treating them separately. It is also possible to take into account the different antibodies used, in order to obtain better insights of the process and exploit all the biological information available.
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/466770
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? 0
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact