The concept of compositional data analysis in practice - Total major element concentrations in agricultural and grazing land soils of Europe.

DINELLI, ENRICO;
2012

Abstract

Applied geochemistry and environmental sciences invariably deal with compositional data. Classically, the original or log-transformed absolute element concentrations are studied. However, compositional data do not vary independently, and a concentration-based approach to data analysis can lead to faulty conclusions. For this reason, a better statistical approach, based exclusively on relative information, was introduced in the 1980s. Because the difference between the two methods should be most pronounced in large-scale, and therefore highly variable, datasets, a new dataset of agricultural soils covering all of Europe (5.6 million km²) at an average sampling density of 1 site/2500 km² is used here to demonstrate and compare both approaches. Absolute element concentrations are certainly of interest in a variety of applications and can be provided in tabulations or concentration maps. Maps for the opened data (ratios to other elements) provide more specific additional information. For compositional data, XY plots of raw or log-transformed data should only be used with care, in an exploratory data analysis (EDA) sense, to detect unusual data behaviour, candidate subgroups of samples, or to compare pre-defined groups of samples. Correlation analysis and the Euclidean distance are not mathematically meaningful concepts for this data type. Element relationships have to be investigated via a stability measure of the (log-)ratios of elements. Logratios are also the key ingredient for an appropriate multivariate analysis of compositional data.
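As an illustration of what a logratio-based stability measure can look like, the following minimal Python sketch computes the sample variance of ln(a/b) for pairs of parts, together with a centred log-ratio (clr) transform. It is not taken from the paper: the function names `logratio_variance` and `clr` and the three-part sample values are hypothetical, and the variance of the log-ratio is only one common choice of stability measure, assumed here for illustration.

```python
import math
import statistics


def logratio_variance(a, b):
    """Sample variance of ln(a_i / b_i); values near zero mean a stable ratio."""
    return statistics.variance([math.log(x / y) for x, y in zip(a, b)])


def clr(composition):
    """Centred log-ratio transform of one composition (all parts must be > 0)."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical three-part compositions (wt%), invented for illustration only.
samples = [
    (60.0, 30.0, 10.0),
    (40.0, 20.0, 40.0),
    (70.0, 35.0, 5.0),
    (50.0, 25.0, 25.0),
]
part_a, part_b, part_c = zip(*samples)

var_ab = logratio_variance(part_a, part_b)  # a/b is identical in every sample
var_ac = logratio_variance(part_a, part_c)  # a/c changes from sample to sample
clr_first = clr(samples[0])                 # clr coordinates sum to zero
```

In this toy data the ratio of the first two parts is constant, so its log-ratio variance is essentially zero, while the first and third parts give a clearly positive value. Both quantities depend only on ratios between parts, not on the absolute concentrations.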
Reimann C.; Filzmoser P.; Fabian K.; Hron K.; Birke M.; Demetriades A.; Dinelli E.; Ladenberger A.; Albanese S.; Andersson M.; Arnoldussen A.; Baritz R.; Batista M.J.; Bel-lan A.; Cicchella D.; De Vivo B.; De Vos W.; Duris M.; Dusza-Dobek A.; Eggen O.A.; Eklund M.; Ernstsen V.; Finne T.E.; Flight D.; Forrester S.; Fuchs M.; Fugedi U.; Gilucis A.; Gosar M.; Gregorauskiene V.; Gulan A.; Halamic J.; Haslinger E.; Hayoz P.; Hobiger G.; Hoffmann R.; Hoogewerff J.; Hrvatovic H.; Husnjak S.; Janik L.; Johnson C.C.; Jordan G.; Kirby J.; Kivisilla J.; Klos V.; Krone F.; Kwecko P.; Kuti L.; Lima A.; Locutura J.; Lucivjansky P.; Mackovych D.; Malyuk B.I.; Maquil R.; McLaughlin M.J.; Meuli R.G.; Miosic N.; Mol G.; Négrel P.; O'Connor P.; Oorts K.; Ottesen R.T; Pasieczna A.; Petersell V.; Pfleiderer S.; Ponavic M.; Prazeres C.; Rauch U.; Salpeteur I.; Schedl A.; Scheib A.; Schoeters I.; Sefcik P.; Sellersjö E.; Skopljak F.; Slaninka I.; Šorša A.; Srvkota R.; Stafilov T.; Tarvainen T.; Trendavilov V.; Valera P.; Verougstraete V.; Vidojevic D.; Zissimos A.M.; Zomeni Z.
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/123892