S. Lodi, R. Ñanculef, C. Sartori (2010). Learning Multi-Class Support Vector Models from Distributed Data using Core-Sets (Extended Abstract). Bologna: Società Editrice Esculapio.
Learning Multi-Class Support Vector Models from Distributed Data using Core-Sets (Extended Abstract)
Lodi, Stefano; Sartori, Claudio
2010
Abstract
We explore a technique for learning Support Vector Models (SVMs) when the training data is partitioned among several data sources. The basic idea applies to SVMs that can be reduced to Minimal Enclosing Ball (MEB) problems in a feature space: such SVMs can be computed efficiently by finding a core-set for the image of the data in that space. Our main result is that the union of local core-sets provides a close approximation to a global core-set from which the SVM can be recovered. The method hence requires only a single pass through each data source to compute the local core-sets, after which the SVM is recovered from their union. Extensive simulations on real datasets are presented to evaluate accuracy and efficiency, in comparison with a widely used single-pass heuristic for learning standard SVMs.
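
The local-then-union scheme described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the data stays in input space (as with a linear kernel) rather than a kernel-induced feature space, the core-set of each source is computed with a Bădoiu-Clarkson style iteration as a stand-in for whatever core-set procedure the authors use, and the step that recovers an SVM from the MEB solution is omitted. The names `meb_coreset` and `distributed_meb` are hypothetical.

```python
import numpy as np

def meb_coreset(points, eps=0.1):
    """Approximate Minimal Enclosing Ball via a Badoiu-Clarkson style iteration.

    Assumption: points live in input space (no kernel map).
    Returns (center, radius, core_indices), where core_indices are the
    rows of `points` that were selected as farthest points.
    """
    center = points[0].astype(float)      # astype returns a copy
    core = {0}
    n_iter = int(np.ceil(1.0 / eps ** 2))
    for i in range(1, n_iter + 1):
        dists = np.linalg.norm(points - center, axis=1)
        far = int(np.argmax(dists))       # farthest point from current center
        core.add(far)
        center += (points[far] - center) / (i + 1)
    # Radius that encloses every given point around the final center.
    radius = float(np.linalg.norm(points - center, axis=1).max())
    return center, radius, sorted(core)

def distributed_meb(partitions, eps=0.1):
    """One pass per source: local core-sets, then an MEB on their union."""
    local_cores = [part[meb_coreset(part, eps)[2]] for part in partitions]
    union = np.vstack(local_cores)        # only core-set points are shared
    center, _, _ = meb_coreset(union, eps)
    return center, union

# Hypothetical usage: four sources holding 500 three-dimensional points each.
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 3))
sources = np.array_split(data, 4)

center_dist, union = distributed_meb(sources, eps=0.1)
center_full, radius_full, _ = meb_coreset(data, eps=0.1)
# Radius needed for the distributed center to cover all of the data.
radius_dist = float(np.linalg.norm(data - center_dist, axis=1).max())
print(len(union), radius_dist, radius_full)
```

In this sketch each source transmits at most `ceil(1/eps**2) + 1` points, so the amount of data that leaves a source depends on `eps` and not on the size of that source. Comparing `radius_dist` with `radius_full` gives a rough check of how close the ball obtained from the union is to the one computed on the pooled data.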