Let $X$ be a $p$-variate random vector and $\widetilde{X}$ a knockoff copy of $X$ (in the sense of \cite{CFJL18}). A new approach for constructing $\widetilde{X}$ (henceforth, NA) has been introduced in \cite{JSPI}. NA has essentially three advantages: (i) To build $\widetilde{X}$ is straightforward; (ii) The joint distribution of $(X,\widetilde{X})$ can be written in closed form; (iii) $\widetilde{X}$ is often optimal under various criteria. However, for NA to apply, $X_1,\ldots,X_p$ should be conditionally independent given some random element $Z$. Our first result is that any probability measure $\mu$ on $\mathbb{R}^p$ can be approximated by a probability measure $\mu_0$ of the form $$\mu_0\bigl(A_1\times\ldots\times A_p\bigr)=E\Bigl\{\prod_{i=1}^p P(X_i\in A_i\mid Z)\Bigr\}.$$ The approximation is in total variation distance when $\mu$ is absolutely continuous, and an explicit formula for $\mu_0$ is provided. If $X\sim\mu_0$, then $X_1,\ldots,X_p$ are conditionally independent. Hence, with a negligible error, one can assume $X\sim\mu_0$ and build $\widetilde{X}$ through NA. Our second result is a characterization of the knockoffs $\widetilde{X}$ obtained via NA. It is shown that $\widetilde{X}$ is of this type if and only if the pair $(X,\widetilde{X})$ can be extended to an infinite sequence so as to satisfy certain invariance conditions. The basic tool for proving this fact is de Finetti's theorem for partially exchangeable sequences. In addition to the quoted results, an explicit formula for the conditional distribution of $\widetilde{X}$ given $X$ is obtained in a few cases. In one of such cases, it is assumed $X_i\in\{0,1\}$ for all $i$.

Generating knockoffs via conditional independence / Dreassi Emanuela; Leisen Fabrizio; Pratelli Luca; Rigo Pietro. - In: ELECTRONIC JOURNAL OF STATISTICS. - ISSN 1935-7524. - ELETTRONICO. - 18:1(2024), pp. 119-144. [10.1214/23-EJS2198]

Generating knockoffs via conditional independence

Rigo Pietro
2024

Abstract

Let $X$ be a $p$-variate random vector and $\widetilde{X}$ a knockoff copy of $X$ (in the sense of \cite{CFJL18}). A new approach for constructing $\widetilde{X}$ (henceforth, NA) has been introduced in \cite{JSPI}. NA has essentially three advantages: (i) To build $\widetilde{X}$ is straightforward; (ii) The joint distribution of $(X,\widetilde{X})$ can be written in closed form; (iii) $\widetilde{X}$ is often optimal under various criteria. However, for NA to apply, $X_1,\ldots,X_p$ should be conditionally independent given some random element $Z$. Our first result is that any probability measure $\mu$ on $\mathbb{R}^p$ can be approximated by a probability measure $\mu_0$ of the form $$\mu_0\bigl(A_1\times\ldots\times A_p\bigr)=E\Bigl\{\prod_{i=1}^p P(X_i\in A_i\mid Z)\Bigr\}.$$ The approximation is in total variation distance when $\mu$ is absolutely continuous, and an explicit formula for $\mu_0$ is provided. If $X\sim\mu_0$, then $X_1,\ldots,X_p$ are conditionally independent. Hence, with a negligible error, one can assume $X\sim\mu_0$ and build $\widetilde{X}$ through NA. Our second result is a characterization of the knockoffs $\widetilde{X}$ obtained via NA. It is shown that $\widetilde{X}$ is of this type if and only if the pair $(X,\widetilde{X})$ can be extended to an infinite sequence so as to satisfy certain invariance conditions. The basic tool for proving this fact is de Finetti's theorem for partially exchangeable sequences. In addition to the quoted results, an explicit formula for the conditional distribution of $\widetilde{X}$ given $X$ is obtained in a few cases. In one of such cases, it is assumed $X_i\in\{0,1\}$ for all $i$.
2024
Generating knockoffs via conditional independence / Dreassi Emanuela; Leisen Fabrizio; Pratelli Luca; Rigo Pietro. - In: ELECTRONIC JOURNAL OF STATISTICS. - ISSN 1935-7524. - ELETTRONICO. - 18:1(2024), pp. 119-144. [10.1214/23-EJS2198]
Dreassi Emanuela; Leisen Fabrizio; Pratelli Luca; Rigo Pietro
File in questo prodotto:
File Dimensione Formato  
23-EJS2198.pdf

accesso aperto

Descrizione: pdf editoriale
Tipo: Versione (PDF) editoriale
Licenza: Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione 382.92 kB
Formato Adobe PDF
382.92 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/950484
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? 0
social impact