= Introduction =

The PROOF (Parallel ROOT Facility) library is designed to run parallelized ROOT-based analyses on a possibly heterogeneous cluster of computers, and it can be used interactively. A simple PROOF cluster is composed of one master node and several worker nodes. The parallelization is intended to be transparent to the user. The "xrootd" service is used for communication: the PROOF master and each worker thus refer to an xrootd service running on a given node.

= PROOF at the NAF =

A PROOF cluster can be created on demand by the user at the NAF, using interactive SGE batch jobs that start the worker daemons. This method was originally developed for the CMS experiment and has been adapted for ATLAS. It relies on a single script ("proofcluster.pl"). Information on the original CMS script can be found here: https://twiki.cern.ch/twiki/bin/view/CMS/HamburgWikiComputingNAFPROOF

=== The "proofcluster.pl" script ===

Please use the version available at this path:
{{{
~calfayan/scripts/proofcluster.pl
}}}

PROOF requires ROOT. If the environment variable "ROOTSYS" is not set, please enable a recent ROOT version:
{{{
ini ROOT52200
}}}

 * To configure "proofcluster.pl", please execute:
{{{
~calfayan/scripts/proofcluster.pl config
}}}
 The user is then required to specify the following settings:
   * "PROOF master server name": keep localhost. The login node will be used as PROOF master.
   * "Number of workers": a large number increases the waiting time until all slots are available on the batch system. For tests, please keep the default value ("4").
   * "Choose the site": you can safely keep the default value ("any").
   * "Select queue": you can safely keep the default value ("normal").
   * "RAM required per worker (in MB)": the default value ("500") is enough to process D3PDs. When reading AODs, please increase it (700 is enough for this tutorial).
   * "xrootd port": port number for the xrootd service. You can safely keep the default value, which is generated randomly.
   * "xrootd protocol port": port number for the xproofd service. You can safely keep the default value, which is generated randomly.
   * "Reconfigure ports automatically": if several users want to start a PROOF cluster from the same host and with the same ports, this option enables the automatic reconfiguration of the port numbers. It is recommended to keep the default value ("y").
   * "Path to ROOT installation or auto for automatic selection": you can safely keep the default value ("auto") and set up the ROOTSYS path externally.
   * "Keep (copy) worker log files when stopping the cluster": the default value is no ("n"), since problems could occur during the copy in the current setup.
 The configuration files as well as the logs will be copied to /scratch/current/atlas//proofcluster/.
 * To start the PROOF cluster, do:
{{{
~calfayan/scripts/proofcluster.pl start
}}}
 * To get the status of the PROOF cluster, try:
{{{
~calfayan/scripts/proofcluster.pl status
}}}
 * To stop the PROOF cluster, execute:
{{{
~calfayan/scripts/proofcluster.pl stop
}}}

PLEASE DON'T FORGET TO STOP THE PROOF CLUSTER AFTER USE!

= How to connect to the PROOF cluster and manage PROOF sessions =

First start a PROOF cluster with the command:
{{{
~calfayan/scripts/proofcluster.pl start
}}}

The class [[http://root.cern.ch/root/html/TProof.html|TProof]] manages a PROOF session. A PROOF cluster is specified by a user login name, the name of the PROOF master node, and the corresponding xrootd port:
{{{
@:
}}}

When starting a PROOF cluster with the default settings at the NAF, the PROOF master will be your login node.
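As a small illustration of the cluster specification described above, the following standalone C++ sketch joins the three parts (user login name, master node, port) into a single string. The helper name `make_proof_url` is hypothetical and not part of ROOT, and the sample values in the usage note are made up; the sketch only assumes the `user@host:port` form that also appears in the session URLs printed by `TProofMgr::QuerySessions()`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (NOT a ROOT function): joins the three parts that
// identify a PROOF cluster into the "user@host:port" form.
std::string make_proof_url(const std::string& user,
                           const std::string& host,
                           int port)
{
    // Guard against values that cannot be a TCP port.
    assert(port > 0 && port <= 65535);
    return user + "@" + host + ":" + std::to_string(port);
}
```

In a ROOT session, the result could then be handed to TProof, for example `TProof::Open(make_proof_url("jdoe", "localhost", 1399).c_str())`, where the user name and port are placeholders for your own values.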
To connect to the PROOF cluster and create a PROOF session, one can execute the following:
{{{
root -l
TProof* proof = TProof::Open("@localhost:")
}}}
or, alternatively, create a PROOF manager ([[http://root.cern.ch/root/html/TProofMgr.html|TProofMgr]]) and then create a session:
{{{
root -l
TProofMgr* mgr = TProofMgr::Create("@localhost:") ;
TProof* proof = mgr->CreateSession() ;
}}}

Running "proofcluster.pl start" additionally generates the ROOT script "connectproof.C". This file contains the correct host name and port number for the PROOF master. One can start a PROOF session by simply executing this script:
{{{
root -l /scratch/current/atlas/`whoami`/proofcluster/connectproof.C # creates a PROOF session (gProof)
}}}

Then, if you want to modify the current session:
 * To get/set the number of active workers:
{{{
gProof->SetParallel(3)
gProof->GetParallel()
gProof->SetParallel(4)
}}}
 * For a more detailed view of the workers:
{{{
gProof->GetListOfSlaveInfos()->Print()
}}}
 With 4 active workers, the output would be something like:
{{{
Collection name='TSortedList', class='TSortedList', size=4
Slave: 0.0 hostname: tcx011.naf.desy.de msd: perf index: 100 active
Slave: 0.1 hostname: tcx01a.naf.desy.de msd: perf index: 100 active
Slave: 0.2 hostname: tcx016.naf.desy.de msd: perf index: 100 active
Slave: 0.3 hostname: tcx015.naf.desy.de msd: perf index: 100 active
}}}
 The numbers "0.X" identify the workers.
 * To deactivate/activate particular workers, for example:
{{{
gProof->DeactivateWorker("0.0")
gProof->DeactivateWorker("0.2")
gProof->ActivateWorker("0.0")
}}}
 and the corresponding printout:
{{{
Collection name='TSortedList', class='TSortedList', size=4
Slave: 0.0 hostname: tcx011.naf.desy.de msd: perf index: 100 active
Slave: 0.1 hostname: tcx01a.naf.desy.de msd: perf index: 100 active
Slave: 0.2 hostname: tcx016.naf.desy.de msd: perf index: 100 not active
Slave: 0.3 hostname: tcx015.naf.desy.de msd: perf index: 100 active
}}}

It is also possible to retrieve the PROOF manager from a TProof instance:
{{{
TProofMgr* mgr = gProof->GetManager()
}}}
Then:
 * To print the URL of the PROOF master:
{{{
mgr->Print()
}}}
 * One PROOF cluster can handle several PROOF sessions. To print information (ID, status, ...) about the current PROOF sessions:
{{{
mgr->QuerySessions()
}}}
 In the case of two PROOF sessions, this command will print something like the following:
{{{
// # 1
// alias: tcx030.naf.desy.de, url: "proof://calfayan@tcx030.naf.desy.de:1399/"
// tag: tcx030-1253634508-11580
// status: idle, attached: YES (remote ID: 0)
// # 2
// alias: tcx030.naf.desy.de, url: "proof://calfayan@tcx030.naf.desy.de:1399/"
// tag: tcx030-1253634843-17686
// status: idle, attached: NO (remote ID: 1)
(class TList*)0x8f50ee8
}}}
 * To explicitly shut down a given PROOF session:
{{{
mgr->ShutdownSession(1) // "1" being the local ID of the session as printed previously
}}}

= How to run a PROOF-based analysis =

This section assumes you already have a running PROOF cluster associated with your user name, with your login node acting as the PROOF master.

== The TSelector ==

The ROOT class [[http://root.cern.ch/root/html/TSelector.html|TSelector]] provides a framework for data analysis by managing the initialization, the event processing, and the termination. Analysis code must inherit from TSelector in order to process data with PROOF.
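To make the three phases mentioned above (initialization, event processing, termination) more concrete, here is a toy sketch in plain C++. It is not ROOT's TSelector and does not use PROOF: the names `ToySelector`, `run_selector` and `CountEven` are hypothetical and only mimic the idea that the framework, not the analysis code, drives the call order.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the selector pattern; NOT ROOT's TSelector.
struct ToySelector {
    virtual ~ToySelector() = default;
    virtual void Begin() {}                        // initialization
    virtual bool Process(std::int64_t entry) = 0;  // called once per event
    virtual void Terminate() {}                    // termination
};

// Hypothetical driver playing the role of the framework: it decides when
// each hook of the user's analysis class is called.
inline void run_selector(ToySelector& sel, std::int64_t n_entries)
{
    assert(n_entries >= 0);
    sel.Begin();
    for (std::int64_t i = 0; i < n_entries; ++i) {
        sel.Process(i);
    }
    sel.Terminate();
}

// Example "analysis": counts the entries with an even index.
struct CountEven : ToySelector {
    long n_even = 0;
    bool finished = false;

    void Begin() override { n_even = 0; finished = false; }
    bool Process(std::int64_t entry) override
    {
        if (entry % 2 == 0) ++n_even;
        return true;
    }
    void Terminate() override { finished = true; }
};
```

A possible use would be `CountEven sel; run_selector(sel, 10);`, after which `sel.n_even` holds the number of even-indexed entries. With PROOF, the same division of labour applies, except that the event loop is distributed over the workers.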
Documentation on developing a TSelector-based analysis can be found at this URL: http://root.cern.ch/drupal/content/developing-tselector

In the TSelector class, the methods SlaveBegin() and SlaveTerminate() are executed on each worker, before and after the event processing, respectively. The method Process() is executed for each event, on the worker that handles it. It does not load the event by default: one has to either call GetEntry(entry_index) or, to increase performance, load only the entries of the desired branches.

== Running on D3PD files ==

Some D3PD files (ROOT-readable files for ATLAS analysis) are available at this path:
{{{
/scratch/current/atlas/calfayan/ADT09/D3PD
}}}
These D3PD files have been generated via TauDPDMaker (using v14.2.25 AODs), and contain the analysis tree "ControlSample0".

=== Simple example ===

 * No additional setup is necessary for the following. You can for instance "cd" into a new directory:
{{{
mkdir -p ~/d3pd-test
cd ~/d3pd-test
}}}
 * Generate a skeleton of a TSelector that suits your D3PD format:
{{{
root -b /scratch/current/atlas/calfayan/ADT09/D3PD/d3pd-sample.root
ControlSample0->MakeSelector();
.q
}}}
 By default, the files ControlSample0.h and ControlSample0.C are created. They contain the definition of the class ControlSample0, which inherits from TSelector. The generated class is well documented, so it can easily be edited to suit your analysis.
 * Edit ControlSample0.h and ControlSample0.C to implement your analysis.
 You might want to overwrite the generated macros with an example by doing:
{{{
tar xf /scratch/current/atlas/calfayan/ADT09/proof/ControlSample0.tar
}}}
 * Connect to the PROOF cluster:
{{{
root /scratch/current/atlas/`whoami`/proofcluster/connectproof.C
}}}
 * Specify the input file(s) within a TDSet (similar to a TChain):
{{{
TDSet *set = new TDSet("TTree", "ControlSample0");
.L /scratch/current/atlas/calfayan/ADT09/proof/AddFilesToTDSet.C
AddFilesToTDSet(set, "/scratch/current/atlas/calfayan/ADT09/D3PD/d3pd-filelist.txt");
//set->Add("/scratch/current/atlas/calfayan/ADT09/D3PD/...");
}}}
 * Process the analysis:
 To start the analysis, the Process() method of either the TProof or the TDSet class can be used. Further documentation on the supported arguments can be found at http://root.cern.ch/root/html/TProof.html#TProof:Process and http://root.cern.ch/root/html/TDSet.html#TDSet:Process

 It is possible to provide PROOF with the source code of the analysis. The suffix "+" triggers the compilation of the class and the generation of the corresponding CINT dictionary. The suffix "++" forces the recompilation. The character "O" enables an optimized compilation. If no suffix is provided, the analysis is interpreted with CINT.
{{{
set->Process("ControlSample0.C+O");
}}}
 or
{{{
gProof->Process(set, "ControlSample0.C+O");
}}}
 One could also first explicitly load the analysis macro into the PROOF instance, and then process it by providing the class name:
{{{
gProof->Load("ControlSample0.C+O");
gProof->Process(set, "ControlSample0");
}}}
 or
{{{
// compile and load the analysis on the client
.L ControlSample0.C+O
// load the compiled library on all workers
gProof->Exec("gSystem->Load(\"~/d3pd-test/ControlSample0_C.so\")")
gProof->Process(set, "ControlSample0");
}}}

=== How to manage output objects (histos, trees, ...) in a PROOF analysis ===

The following is an example.
Such objects could be declared as attributes of the analysis class, instantiated in SlaveBegin(), and filled in Process(). To be able to retrieve them after processing, they could be 'booked' in SlaveBegin() as output objects.

In ControlSample0.h:
{{{
#include "TH1F.h"

class ControlSample0 : public TSelector {
   // ...
   TH1F* htest ;
   // ...
} ;
}}}
In ControlSample0.C:
{{{
#include <iostream>
using namespace std ;

void ControlSample0::SlaveBegin(TTree * /*tree*/) {
   // Instantiate objects
   htest = new TH1F("htest", "htest", 100, 0, 100000) ;

   // Book all objects defined in current TDirectory
   TList* obj_list = (TList*) gDirectory->GetList() ;
   TIter next_object((TList*) obj_list) ;
   TObject* obj ;
   cout << "-- Booking objects:" << endl;
   while ((obj = next_object())) {
      TString objname = obj->GetName() ;
      cout << " " << objname << endl ;
      fOutput->Add(obj) ;
   }
}
}}}
{{{
Bool_t ControlSample0::Process(Long64_t entry) {
   // Load entry (GetEntry returns the number of bytes read)
   int nb = GetEntry(entry) ;
   htest->Fill(nb) ;
   return kTRUE ;
}
}}}
To retrieve the objects once processing is complete (i.e. after TProof::Process() has returned) and store them in a ROOT file, it is then possible to use the following code:
{{{
// Define output file
TFile* output_file = new TFile("output.root", "recreate") ;

// Retrieve objects
TList* list = gProof->GetOutputList() ;
TIter next_object((TList*) list);
TObject* obj ;
cout << "-- Retrieved objects:" << endl ;
output_file->cd() ;
while ((obj = next_object())) {
   TString objname = obj->GetName() ;
   cout << " " << objname << endl ;
   obj->Write() ;
}

// Write output file
output_file->Write() ;
}}}
With the preceding method, problems may occur if your output is too large.
To cope with this issue, please refer to this documentation: http://root.cern.ch/drupal/content/handling-large-outputs-root-files

=== How to write the log file of the master and each worker to one text file ===

In a ROOT session, after having processed your analysis with PROOF:
{{{
// get proof manager (if not already available)
TProofMgr* mgr = gProof->GetManager() ;
// get proof logs
TProofLog *log = mgr->GetSessionLogs() ;
// save log
int flag = log->Save("*", "./log_all-workers.txt") ;
}}}

== Running on AOD files ==

To run on AOD files with ROOT, the Athena package AthenaROOTAccess can be used. To use it together with PROOF, the package ara_analysis has been written. It simplifies the process and enables analysis loops in either Python or compiled C++. It is available (and already compiled) at:
{{{
/scratch/current/atlas/calfayan/ADT09/proof/ara_proof.tar.gz
}}}
To try it, we will use Athena v15.5.0. Please set up the Athena release without the local flag:
{{{
mkdir -p ~/atlas/testarea/15.5.0
cd ~/atlas/testarea/15.5.0
source ~/cmthome/setup.sh -tag=15.5.0
}}}
Please ensure you use a PROOF cluster that has been set up with the ROOT version of your current Athena framework, by restarting it within this environment:
{{{
~calfayan/scripts/proofcluster.pl stop
~calfayan/scripts/proofcluster.pl start
}}}
Then, install ara_proof:
{{{
cd ~/atlas/testarea/15.5.0/
tar xfz /scratch/current/atlas/calfayan/ADT09/proof/ara_proof.tar.gz
cd ara_proof/ara_analysis/cmt
cmt config
source setup.sh
#make # only if you want to recompile it
cd ../scripts
sh generate_profiles.sh
}}}
The package ara_analysis contains the following classes:
 * ara_selector_base: derives from TSelector, and converts an AOD persistent tree to a ROOT-readable tree in the init() method.
 * ara_selector_py: derives from ara_selector_base and allows a Python-based event loop. Please edit ara_proof/ara_analysis/python/ara_analysis.py to fit your needs.
 No recompilation of the package ara_analysis is needed when modifying this file.
 * ara_selector_cpp: derives from ara_selector_base and allows an event loop based on compiled C++. Please edit ara_proof/ara_analysis/src/ara_selector_cpp.cxx and ara_proof/ara_analysis/ara_analysis/ara_selector_cpp.h to fit your needs. To reduce compilation time, please comment out the lines related to ara_selector_py and ara_analysis_cpp in the file ara_proof/ara_analysis/cmt/requirements, as in the following:
{{{
#library ara_analysis_cpp ara_analysis_cpp.cxx
#library ara_selector_py ara_selector_base.cxx ara_selector_py.cxx
library ara_selector_cpp ara_selector_base.cxx ara_selector_cpp.cxx
private
#apply_pattern lcgdict dict=ara_analysis_cpp selectionfile=selection_anacpp.xml headerfiles="../ara_analysis/ara_analysis_cppDict.h"
#apply_pattern lcgdict dict=ara_selector_py selectionfile=selection_py.xml headerfiles="../ara_analysis/ara_selector_pyDict.h"
apply_pattern lcgdict dict=ara_selector_cpp selectionfile=selection_cpp.xml headerfiles="../ara_analysis/ara_selector_cppDict.h"
end_private
}}}
 * ara_analysis_cpp: a wrapper written in C++, intended for standalone tests of ara_selector_cpp.

To test the framework, you can use the script "start_ara_proof.C", which launches the analysis:
{{{
mkdir -p ~/atlas/testarea/15.5.0/run
cd ~/atlas/testarea/15.5.0/run
cp /scratch/current/atlas/calfayan/ADT09/proof/start_ara_proof.C .
root -l start_ara_proof.C
}}}
By default, the script "start_ara_proof.C" uses "ara_selector_cpp". To use the Python version of the event loop, please use:
{{{
gProof->Process(set, "ara_selector_py") ;
}}}
By default, the script "start_ara_proof.C" provides the PROOF instance with the local path to the libraries of ara_analysis.
Since all workers can see your login node, it is not necessary to send the complete package to the workers (the directory "ara_proof/ara_analysis/python/" is still uploaded, to avoid hardcoding absolute paths in the ara_analysis package). It is also possible to send the complete package to the workers and compile it automatically on each of them, but this takes more time. To do so, please change "start_ara_proof.C" as follows:
{{{
//gSystem->Exec("tar cfzh ara_proof.par ../ara_proof/ara_analysis/python") ;
gSystem->Exec("tar cfzh ara_proof.par --exclude=../ara_proof/ara_analysis/i686-slc4-gcc34-opt/* ../ara_proof/") ;
}}}
For more information on how to upload and enable additional software with PROOF, please refer to the following documentation: http://root.cern.ch/drupal/content/preparing-uploading-and-enabling-additional-software

= Links =

 * Very useful information from the ROOT website (description, howtos, tutorials): http://root.cern.ch/drupal/content/proof
 * ATLAS-D Meeting Bonn09, tutorial on AthenaROOTAccess: https://znwiki3.ifh.de/ATLAS/WorkBook/NAF/ADT09AthenaROOTAccess
 * ATLAS-D Meeting Bonn09: http://indico.cern.ch/conferenceDisplay.py?confId=52623