G. Codispoti, C. Mattia, A. Fanfani, F. Fanzago, F. Farina, C. Kavka, et al. (2009). CRAB: A CMS application for distributed analysis. IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 56, 2850-2858 [10.1109/TNS.2009.2028076].
CRAB: A CMS application for distributed analysis
FANFANI, ALESSANDRA;
2009
Abstract
Beginning in 2009, the CMS experiment will produce several petabytes of data each year, which will be distributed across many computing centres in different countries. The CMS computing model defines how the data is to be distributed and accessed so that physicists can run their analyses over it efficiently. The analysis will be performed in a distributed way using Grid infrastructure. CRAB (CMS Remote Analysis Builder) is a tool, designed and developed by the CMS collaboration, that allows the end user to access distributed data transparently. CRAB interacts with the local user environment, the CMS data management services and the Grid middleware; it takes care of data and resource discovery; it splits the user's task into several processes (jobs) and distributes and parallelizes them over different Grid environments; and it performs process tracking and output handling. The end user needs only very limited knowledge of the underlying technical details. The tool can be used as a direct interface to the computing system, or it can delegate the task to a server, which takes care of job handling and provides services such as automatic resubmission in case of failures and notification of the task status to the user. Its current implementation can interact with the gLite and OSG Grid middleware. Furthermore, through the same interface, it enables access to local data and to batch systems such as the Load Sharing Facility (LSF). CRAB has been in production and in routine use by end users since Spring 2004. It has been used extensively in studies to prepare the Physics Technical Design Report, in the analysis of reconstructed event samples generated during the Computing Software and Analysis Challenges, and in the preliminary cosmic ray data taking. The CRAB architecture and its usage inside the CMS community will be described in detail, as well as the current status and future development.