Principal component analysis (PCA) and cluster analysis are used frequently to derive dietary patterns. Decisions on how many patterns to extract are primarily based on subjective criteria, whereas different solutions vary in their food-group composition and perhaps association with disease outcome. Literature on reliability of dietary patterns is scarce, and previous studies validated only 1 preselected solution. Therefore, we assessed reliability of different pattern solutions ranging from 2 to 6 patterns, derived from the aforementioned methods. A validated food frequency questionnaire was administered at baseline (1993–1997) to 39,678 participants in the European Prospective Investigation into Cancer and Nutrition–The Netherlands (EPIC-NL) cohort. Food items were grouped into 31 food groups for dietary pattern analysis. The cohort was randomly divided into 2 halves, and dietary pattern solutions derived in 1 sample through PCA were replicated through confirmatory factor analysis in sample 2. For cluster analysis, cluster stability and split-half reproducibility were assessed for various solutions. With PCA, we found the 3-component solution to be best replicated, although all solutions contained ≥1 poorly confirmed component. No quantitative criterion was in agreement with the results. Associations with disease outcome (coronary heart disease) differed between the component solutions. For all cluster solutions, stability was excellent and deviations between samples were negligible, indicating good reproducibility. All quantitative criteria identified the 2-cluster solution as optimal. Associations with disease outcome were comparable for different cluster solutions. In conclusion, reliability of obtained dietary patterns differed considerably for different solutions using PCA, whereas cluster analysis derived generally stable, reproducible clusters across different solutions.
Quantitative criteria for determining the number of patterns to retain were valuable for cluster analysis but not for PCA. Associations with disease risk were influenced by the number of patterns that are retained, especially when using PCA. Therefore, studies on associations between dietary patterns and disease risk should report reasons to choose the number of retained patterns.
Fransen HP, May AM, Stricker MD, Boer JMA, Hennig C, Rosseel Y, et al. (2014). A Posteriori Dietary Patterns: How Many Patterns to Retain? Journal of Nutrition, 144, 1274-1282 [10.3945/jn.113.188680].