It has recently been shown that the quadratic programming formulation underlying a number of kernel methods can be treated as a minimal enclosing ball (MEB) problem in a feature space in which the data have been embedded. Core Vector Machines (CVMs), in particular, exploit this equivalence to compute Support Vector Machines (SVMs) from very large datasets in the batch scenario. In this paper we study two algorithms for online classification that extend this family of algorithms to large data streams. Both algorithms use analytical rules to adjust the model extracted from the stream instead of recomputing the entire solution on the augmented dataset. We show that these algorithms are more accurate than the current extension of CVMs to data streams, which uses an analytical rule instead of solving large quadratic programs. Experiments also show that the online approaches are considerably more efficient than periodic recomputation of CVMs, even when warm start is used.
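To illustrate what an analytical one-pass update of an enclosing ball can look like, here is a minimal sketch in Python. It is not either of the two algorithms studied in the paper, whose rules the abstract does not spell out. It is a generic streaming rule applied in plain Euclidean space with no kernel embedding: when a new point falls outside the current ball, the ball is replaced by the smallest ball containing both the old ball and the point. The function names `update_ball` and `one_pass_ball` are hypothetical.

```python
import math


def update_ball(center, radius, point):
    """Return the smallest ball containing the old ball and `point`.

    Generic illustrative rule, assumed here; not taken from the paper.
    """
    dist = math.dist(center, point)
    if dist <= radius:
        # The point is already enclosed; the model is left unchanged.
        return center, radius
    # The new diameter runs from the far side of the old ball to the point.
    new_radius = (radius + dist) / 2.0
    # Move the center toward the point by (dist - radius) / 2.
    shift = (dist - radius) / (2.0 * dist)
    new_center = tuple(c + shift * (p - c) for c, p in zip(center, point))
    return new_center, new_radius


def one_pass_ball(stream):
    """Scan the stream once, keeping only a center and a radius."""
    it = iter(stream)
    center = tuple(next(it))  # the first point is a ball of radius zero
    radius = 0.0
    for point in it:
        center, radius = update_ball(center, radius, point)
    return center, radius


if __name__ == "__main__":
    points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
    print(one_pass_ball(points))
```

Each update costs time linear in the dimension and no past point is stored, which is the property that makes rules of this kind attractive for streams. The resulting ball encloses every point seen so far but is in general only an approximation of the minimal one.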
Ricardo Ñanculef, Héctor Allende, Stefano Lodi, Claudio Sartori (2011). Two One-Pass Algorithms for Data Stream Classification Using Approximate MEBs. Heidelberg: Springer-Verlag. doi:10.1007/978-3-642-20267-4_38.
Two One-Pass Algorithms for Data Stream Classification Using Approximate MEBs
Lodi, Stefano; Sartori, Claudio
2011