Learning large-scale networks of interactions from microarray data is an important and challenging problem in bioinformatics. A widely used approach is to assume that the available data constitute a random sample from a multivariate distribution belonging to a Gaussian graphical model. As a consequence, the prime objects of inference are full-order partial correlations, that is, partial correlations between two variables given all the remaining ones. In the context of microarray data the number of variables exceeds the sample size, and this precludes the application of traditional structure learning procedures because a sample version of full-order partial correlations does not exist. In this paper we consider limited-order partial correlations, which are partial correlations computed on marginal distributions of manageable size, and provide a set of rules that allow one to assess the usefulness of these quantities for deriving the independence structure of the underlying Gaussian graphical model. Furthermore, we introduce a novel structure learning procedure based on a quantity, obtained from limited-order partial correlations, that we call the non-rejection rate. The applicability and usefulness of the procedure are demonstrated on both simulated and real data.
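As a rough illustration of the quantities named in the abstract, the following Python sketch computes a limited-order partial correlation from a small marginal covariance matrix and then estimates a non-rejection rate for a pair of variables as the fraction of randomly drawn conditioning subsets of size q for which a test of zero partial correlation is not rejected. Several parts are assumptions made for illustration rather than details taken from this record: the function names `partial_corr` and `non_rejection_rate`, the Monte Carlo sampling of subsets, the default `alpha` and `n_subsets`, and the use of a Fisher z-test in place of whatever test the paper itself employs.

```python
import numpy as np
from statistics import NormalDist


def partial_corr(S, i, j, Q):
    """Partial correlation of variables i and j given the variables in Q.

    S is a covariance matrix; only the marginal over {i, j} and Q is inverted,
    so the computation stays feasible even when the full matrix is singular.
    """
    idx = [i, j, *Q]
    K = np.linalg.inv(S[np.ix_(idx, idx)])  # concentration matrix of the marginal
    return -K[0, 1] / np.sqrt(K[0, 0] * K[1, 1])


def non_rejection_rate(X, i, j, q, n_subsets=200, alpha=0.05, seed=0):
    """Monte Carlo estimate of the non-rejection rate for the pair (i, j).

    X is an (n samples) x (p variables) data matrix. Assumption: a two-sided
    Fisher z-test is used for each sampled conditioning subset of size q.
    """
    n, p = X.shape
    if q > n - 4:
        raise ValueError("q must be small relative to the sample size n")
    rng = np.random.default_rng(seed)
    S = np.cov(X, rowvar=False)
    others = [k for k in range(p) if k not in (i, j)]
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    not_rejected = 0
    for _ in range(n_subsets):
        Q = rng.choice(others, size=q, replace=False)
        # Clip to keep arctanh finite in case of numerical round-off.
        r = np.clip(partial_corr(S, i, j, Q), -0.999999, 0.999999)
        z = np.sqrt(n - q - 3) * np.arctanh(r)
        if abs(z) < z_crit:
            not_rejected += 1
    return not_rejected / n_subsets
```

Under these assumptions, a pair of variables that is strongly dependent given any small conditioning set should yield a rate near 0, while a pair that is independent should yield a rate near 1 minus `alpha`. Note that with p larger than n the full sample covariance matrix cannot be inverted, whereas each (q + 2)-dimensional marginal can be as long as q is small relative to n.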

A robust procedure for Gaussian graphical model search from microarray data with p larger than n / R. Castelo; A. Roverato. - In: JOURNAL OF MACHINE LEARNING RESEARCH. - ISSN 1532-4435. - PRINT. - 7:(2006), pp. 2621-2650.

A robust procedure for Gaussian graphical model search from microarray data with p larger than n

ROVERATO, ALBERTO
2006

R. Castelo; A. Roverato

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/37628