ATLAS-D Tutorial 2010: ROOT and PROOF

Inside D3PD

ini ROOT530
root -l

Open the file below in ROOT and confirm that opening the file was successful. Use the TFile::Get method to get a pointer to the "egamma" tree.

TFile *f = TFile::Open("dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/data10_7TeV/NTUP_EGAMMA/f282_p207/data10_7TeV.00161379.physics_Egamma.merge.NTUP_EGAMMA.f282_p207_tid159706_00/NTUP_EGAMMA.159706._000372.root.1")
.ls
TTree *egamma = (TTree*) f->Get("egamma");

 egamma->Print(); 

a text file with the printout can be found here.

and the variables  RunNumber, EventNumber 

egamma->Scan("EventNumber:el_n:el_pt:el_tight") 


Remark: Note the columns row and instance. Each event has one row number but may have several instances in this table. The value of a "plain" variable (just one value per event) is repeated in all instances. The entries of arrays are shown in the different instances.


Note that el_n is a single integer giving the number of reconstructed electron candidates in the current event. el_pt and el_tight are arrays with el_n entries. Each entry in el_pt gives the transverse momentum of one electron. Each entry in el_tight is either 0 or 1 and shows whether the corresponding electron meets the tight selection criteria (1) or not (0).
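
The row/instance layout described above can be sketched in plain C++ without ROOT. The types Event and ScanLine and the function scan are hypothetical names used only for this illustration; they are not ROOT classes. Events without electron candidates are simply skipped here, and how ROOT itself prints such events is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one entry of the egamma tree (not a ROOT class).
struct Event {
    int event_number;            // "plain" variable: one value per event
    std::vector<float> el_pt;    // one entry per electron candidate
    std::vector<int> el_tight;   // 1 = passes tight selection, 0 = fails
    int el_n() const { return static_cast<int>(el_pt.size()); }
};

// One printed line of a Scan-like table.
struct ScanLine {
    int row;           // index of the event in the tree
    int instance;      // index of the array element within the event
    int event_number;  // repeated for every instance of the same row
    int el_n;          // repeated for every instance of the same row
    float el_pt;       // changes with the instance
    int el_tight;      // changes with the instance
};

// Expand events into (row, instance) lines: plain variables are repeated,
// array variables advance with the instance index.
std::vector<ScanLine> scan(const std::vector<Event>& events) {
    std::vector<ScanLine> lines;
    for (std::size_t row = 0; row < events.size(); ++row) {
        const Event& ev = events[row];
        assert(ev.el_pt.size() == ev.el_tight.size());
        for (std::size_t i = 0; i < ev.el_pt.size(); ++i) {
            lines.push_back({static_cast<int>(row), static_cast<int>(i),
                             ev.event_number, ev.el_n(),
                             ev.el_pt[i], ev.el_tight[i]});
        }
    }
    return lines;
}
```

An event with two electron candidates yields two lines with the same row and event number but instance 0 and 1.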


Solution

  egamma->Draw("el_pt");
  egamma->Draw("Sum$(el_tight)");

  egamma->Draw("el_pt","el_tight");
  gPad->SetLogy();
  gPad->SetLogy(0);

  egamma->Draw("Sum$(el_pt*el_tight)","Sum$(el_tight) != 0");
  // Note:
  egamma->Draw("Sum$(el_pt)","el_tight");
  // is different! Sum$() always sums over all "instances", independent of the "selection" value.
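
The difference between the last two Draw calls can be emulated in plain C++. This is only a sketch of the semantics as described in the note above, not ROOT code; Event, fills_tight_sum and fills_full_sum_per_tight are hypothetical names, and each returned value stands for one histogram fill.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one event (not a ROOT class).
struct Event {
    std::vector<double> el_pt;
    std::vector<int> el_tight;
};

// Emulates Draw("Sum$(el_pt*el_tight)", "Sum$(el_tight) != 0"):
// expression and selection are both per-event scalars,
// so there is at most one fill per event.
std::vector<double> fills_tight_sum(const std::vector<Event>& events) {
    std::vector<double> fills;
    for (const Event& ev : events) {
        assert(ev.el_pt.size() == ev.el_tight.size());
        double sum = 0.0;
        int ntight = 0;
        for (std::size_t i = 0; i < ev.el_pt.size(); ++i) {
            sum += ev.el_pt[i] * ev.el_tight[i];
            ntight += ev.el_tight[i];
        }
        if (ntight != 0) fills.push_back(sum);
    }
    return fills;
}

// Emulates Draw("Sum$(el_pt)", "el_tight"):
// the selection is evaluated per instance, but Sum$ still runs over
// all instances, so every tight electron fills the full sum.
std::vector<double> fills_full_sum_per_tight(const std::vector<Event>& events) {
    std::vector<double> fills;
    for (const Event& ev : events) {
        assert(ev.el_pt.size() == ev.el_tight.size());
        const double sum =
            std::accumulate(ev.el_pt.begin(), ev.el_pt.end(), 0.0);
        for (std::size_t i = 0; i < ev.el_tight.size(); ++i) {
            if (ev.el_tight[i] != 0) fills.push_back(sum);
        }
    }
    return fills;
}
```

For an event with el_pt = {10, 20} and el_tight = {1, 0}, the first function fills 10 once, while the second fills 30 once.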

Setup Proofcluster

When you run proofcluster for the first time, or later run proofcluster config, you are asked for configuration options. After running the configuration, please check the file  /scratch/hh/current/atlas/users/<your username>/proofcluster/setup.conf .

Make sure its contents are the following:

master=localhost
workers=10
cmssw=nocmssw
protocolport=XXXX
port=XXXX
os=any
rootsys=/afs/naf.desy.de/products/root/amd64_rhel50/5.30.00
site=XX
queue=normal
autocopylog=n
ram=2048


Start the proofcluster:

Troubleshooting:

PROOF Datasets

root -l
p = TProof::Open("localhost:<your protocol port>");
.!hostname
p->Exec(".!hostname");
cout << "Hello World" << endl;
p->Exec(" cout << \"Hello World\" << endl; ")

root -l /scratch/hh/current/atlas/users/<your user name>/proofcluster/connectproof.C

Use "gProof" instead of "p" in this case.


TFileCollection *fc = new TFileCollection("tutfilelist","Tutorial file list");
fc->AddFromFile("/afs/naf.desy.de/group/atlas/ADT11/hh/XX/tutfilelist.txt");
fc->Print();

TDSet *ds = new TDSet("tutset","egamma");
ds->Add( fc->GetList() );
ds->Print();

.x /afs/naf.desy.de/group/atlas/ADT11/hh/XX/createtutds.C


Solution

root -l  /scratch/hh/current/atlas/users/<your user name>/proofcluster/connectproof.C
.x /afs/naf.desy.de/group/atlas/ADT11/hh/XX/createtutds.C

ds->Draw("el_pt");
ds->Draw("el_pt","Sum$(el_tight)==2");

ds->Draw("sqrt( Sum$(el_E*el_tight)^2-Sum$(el_px*el_tight)^2-Sum$(el_py*el_tight)^2-Sum$(el_pz*el_tight)^2 )>>masshisto(200,0,200000)", "Sum$(el_tight)==2");

ds->Draw("sqrt( Sum$(el_E*el_tight)^2-Sum$(el_px*el_tight)^2-Sum$(el_py*el_tight)^2-Sum$(el_pz*el_tight)^2)>>masshisto(200,0,200000)", "Sum$(el_tight)==2&&Sum$(el_pt*el_tight)>60000");
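
The Draw expressions above compute m = sqrt( (sum E)^2 - (sum px)^2 - (sum py)^2 - (sum pz)^2 ), where the sums run over the tight electrons only. A minimal plain C++ sketch of the same calculation follows; Electron and tight_pair_mass are hypothetical names, and returning -1 for rejected events is a choice made for this illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for one electron candidate (not a ROOT class).
struct Electron {
    double E, px, py, pz;  // four-momentum components, all in the same unit
    int tight;             // 1 = tight, 0 = not tight
};

// Invariant mass of the tight electrons in one event.
// Returns -1 if the event does not contain exactly two tight electrons,
// mirroring the selection "Sum$(el_tight)==2".
double tight_pair_mass(const std::vector<Electron>& electrons) {
    double E = 0.0, px = 0.0, py = 0.0, pz = 0.0;
    int ntight = 0;
    for (const Electron& e : electrons) {
        assert(e.tight == 0 || e.tight == 1);
        if (e.tight != 1) continue;  // same effect as multiplying by el_tight
        ++ntight;
        E += e.E;
        px += e.px;
        py += e.py;
        pz += e.pz;
    }
    if (ntight != 2) return -1.0;
    const double m2 = E * E - px * px - py * py - pz * pz;
    // Rounding can push m2 slightly below zero for nearly collinear pairs.
    return m2 > 0.0 ? std::sqrt(m2) : 0.0;
}
```

Two back-to-back electrons with E = 45000 each and negligible mass give an invariant mass of 90000 in the same unit.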

Remark: You can list the result histograms from the last query with gProof->GetOutputList()->Print() and then draw one of them with gProof->GetOutputList()->FindObject("htemp")->Draw().

PROOF with TSelector

For more complex computations, the Draw function reaches its limits. The TSelector class allows you to write complex analysis code with full access to all TTree variables.

A skeleton to process a given TTree can be generated automatically with the TTree::MakeSelector method. To create such a skeleton for the egamma tree, do the following:

f = TFile::Open("dcap://dcache-atlas-dcap.desy.de:22125//pnfs/desy.de/atlas/dq2/atlaslocalgroupdisk/data10_7TeV/NTUP_EGAMMA/f282_p207/data10_7TeV.00161379.physics_Egamma.merge.NTUP_EGAMMA.f282_p207_tid159706_00/NTUP_EGAMMA.159706._000372.root.1");
egamma = (TTree*) f->Get("egamma");
egamma->MakeSelector();

In this example the selector skeleton will be modified to plot the invariant mass of "tight" electrons in the egamma tree.


Remark: You can always check whether the TSelector is formally correct by compiling it with .L egamma.C+. This lets you check for typos after each of the following modifications.

If you want to fill histograms (or use other variables) in your code, you first need to declare them in the header file.

Modify the egamma.h file and add a TH1D* member variable to the selector class. You also need to include the header file for TH1D, because it is not included in the skeleton. egamma.h

[...]
#include <TH1D.h>
[...]
public:
  TH1D * fMasshist;

The hook SlaveBegin is called before any event is processed. In contrast to the hook Begin, SlaveBegin is called on each worker node. When running with PROOF, each PROOF worker node has its own instance of the TSelector in memory. To be able to fill a histogram when looping over all events, a histogram has to be created on each PROOF node.

However, although we create a histogram on each node, we only want one result histogram. Luckily, PROOF merges all histograms into one if we register the histogram in the fOutput list by calling fOutput->Add(fMasshist).
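
The idea behind the merge can be sketched in plain C++: every worker fills its own histogram with identical binning, and the merged result is the bin-by-bin sum. Hist1D and merge are hypothetical names for this illustration; this is not how PROOF or TH1 is implemented internally, and underflow and overflow are ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal fixed-binning histogram (hypothetical, not a ROOT class).
class Hist1D {
public:
    Hist1D(std::size_t nbins, double lo, double hi)
        : counts_(nbins, 0.0), lo_(lo), hi_(hi) {
        assert(nbins > 0 && hi > lo);
    }
    void fill(double x) {
        if (x < lo_ || x >= hi_) return;  // out-of-range values are dropped
        std::size_t bin = static_cast<std::size_t>(
            (x - lo_) / (hi_ - lo_) * static_cast<double>(counts_.size()));
        if (bin >= counts_.size()) bin = counts_.size() - 1;  // rounding guard
        counts_[bin] += 1.0;
    }
    // Bin-by-bin addition; both histograms must have the same binning.
    void add(const Hist1D& other) {
        assert(counts_.size() == other.counts_.size());
        assert(lo_ == other.lo_ && hi_ == other.hi_);
        for (std::size_t i = 0; i < counts_.size(); ++i)
            counts_[i] += other.counts_[i];
    }
    double bin_content(std::size_t i) const { return counts_.at(i); }
    double entries() const {
        double n = 0.0;
        for (double c : counts_) n += c;
        return n;
    }
private:
    std::vector<double> counts_;
    double lo_, hi_;
};

// Combine the per-worker histograms into one result histogram.
Hist1D merge(const std::vector<Hist1D>& per_worker) {
    assert(!per_worker.empty());
    Hist1D total = per_worker.front();
    for (std::size_t i = 1; i < per_worker.size(); ++i)
        total.add(per_worker[i]);
    return total;
}
```

Because the merge is a plain sum, the result does not depend on which worker processed which events.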

Find the SlaveBegin hook in the egamma.C file and modify it to contain the following.

void egamma::SlaveBegin(TTree * /*tree*/)
{
   // The SlaveBegin() function is called after the Begin() function.
   // When running with PROOF SlaveBegin() is called on each slave server.
   // The tree argument is deprecated (on PROOF 0 is passed).

   TString option = GetOption();
   fMasshist = new TH1D("masshist","Mass histogram; m_inv [MeV]",200,0,200000);
   fOutput->Add(fMasshist);

}

The hook Terminate is called only once, on the client. In contrast, SlaveTerminate is called on each worker node. Terminate can be used to post-process the merged output of the PROOF session. In our case the histogram (filled in parallel) is plotted on the screen, but more sophisticated things, like fitting, could be done here. Because we want to use the merged result of the PROOF session, we cannot just use the fMasshist variable, but have to pick the histogram from the fOutput list. Additionally, we have to include the TCanvas.h header, because it is not included in the automatically generated header.

Find the Terminate hook in the egamma.C file and modify it to contain the following code. egamma.C

[...]
#include <TCanvas.h>
[...]
void egamma::Terminate()
{
   // The Terminate() function is the last function to be called during
   // a query. It always runs on the client, it can be used to present
   // the results graphically or save the results to file.
  fMasshist = dynamic_cast<TH1D *>(fOutput->FindObject("masshist"));
  if ( fMasshist != NULL )
    {
      new TCanvas;
      fMasshist->Draw();
    }

}

Finally, we have to implement the code to calculate the invariant mass. The Process hook is called for each event in the dataset. Here, we can do complex calculations and "cut events away" by simply not filling them into the result histogram. Within the Process function, we have access to all the variables of the current event. However, although a C++ variable exists for each TTree variable, these variables are not filled by default.

In Process, we have to call fChain->GetTree()->GetEntry(entry) to read the full event, or b_<tree variable name>->GetEntry(entry) to read a single branch. After this, the event's tree variables are available in the corresponding C++ variables. Because reading the whole event into memory can be a large overhead, the second method is preferred for large trees.
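
The benefit of reading single branches can be illustrated with a small mock in plain C++. MockBranch and sum_selected are hypothetical names and not part of ROOT; the mock only counts how often each branch is read. The cheap cut variable is read for every entry, the payload only for entries that pass the cut.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a branch: returns one value per entry
// and counts the number of read operations.
class MockBranch {
public:
    explicit MockBranch(std::vector<double> values)
        : values_(std::move(values)), reads_(0) {}
    double get_entry(std::size_t entry) {
        ++reads_;
        return values_.at(entry);
    }
    std::size_t size() const { return values_.size(); }
    std::size_t reads() const { return reads_; }
private:
    std::vector<double> values_;
    std::size_t reads_;
};

// Read the cut branch for every entry, but the payload branch only
// for entries that pass the cut.
double sum_selected(MockBranch& cut, MockBranch& payload) {
    assert(cut.size() == payload.size());
    double sum = 0.0;
    for (std::size_t entry = 0; entry < cut.size(); ++entry) {
        if (cut.get_entry(entry) == 0.0) continue;  // rejected: skip payload
        sum += payload.get_entry(entry);
    }
    return sum;
}
```

If only a small fraction of the entries passes the cut, the payload branch is touched correspondingly rarely, which is the motivation for the per-branch GetEntry calls.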

Find the Process hook in the egamma.C file and modify it to contain the following code. egamma.C

[...]
#include <TLorentzVector.h>
[...]
Bool_t egamma::Process(Long64_t entry)
{
   // The Process() function is called for each entry in the tree (or possibly
   // keyed object in the case of PROOF) to be processed. The entry argument
   // specifies which entry in the currently loaded tree is to be processed.
   // It can be passed to either egamma::GetEntry() or TBranch::GetEntry()
   // to read either all or the required parts of the data. When processing
   // keyed objects with PROOF, the object is already loaded and is available
   // via the fObject pointer.
   //
   // This function should contain the "body" of the analysis. It can contain
   // simple or elaborate selection criteria, run algorithms on the data
   // of the event and typically fill histograms.
   //
   // The processing can be stopped by calling Abort().
   //
   // Use fStatus to set the return value of TTree::Process().
   //
   // The return value is currently not used.

  //fChain->GetTree()->GetEntry(entry);

  b_el_n->GetEntry(entry);
  b_el_tight->GetEntry(entry);

  int ntight = 0;
  for ( int i = 0; i != el_tight->size(); ++i  )
    if ( (*el_tight)[i] == 1 ) ++ntight;
  if (ntight != 2) return kFALSE;


  b_el_pt->GetEntry(entry);
  b_el_px->GetEntry(entry);
  b_el_py->GetEntry(entry);
  b_el_pz->GetEntry(entry);

  TLorentzVector sumP4(0,0,0,0);

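  for ( int i = 0; i != el_n; ++i )
    {
      if ( (*el_tight)[i] == 1 )
        {
          TLorentzVector elTightP4;
          // electron mass in MeV (assuming the momenta are in MeV, as the histogram range suggests)
          elTightP4.SetXYZM( (*el_px)[i], (*el_py)[i], (*el_pz)[i], .511 );
          sumP4 += elTightP4;
        }
    }

  // fill once per event, after all tight electrons have been summed
  fMasshist->Fill( sumP4.M() );
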


   return kTRUE;
}

Remark: Instead of modifying the automatically generated class, one can create a derived class to implement the actual code of the TSelector. This way, only the base selector has to be regenerated if the structure of the tree changes. The following template might help you create such a class. However, to use this derived selector, the base selector has to be loaded on each PROOF worker. Because of this, you have to call gProof->Load("egamma.C+") before you can use the myegamma selector.

#include "egamma.h"

#ifndef _myegamma_h_
#define _myegamma_h_

#include "TH1D.h"

class myegamma : public egamma
{
 protected:
  TH1D *fMasshist;

  virtual void    SlaveBegin(TTree *tree);
  virtual Bool_t  Process(Long64_t entry);
  virtual void    Terminate();
};

#endif //_myegamma_h_

#include "myegamma.h"


void myegamma::SlaveBegin(TTree * /*tree*/)
{
   // The SlaveBegin() function is called after the Begin() function.
   // When running with PROOF SlaveBegin() is called on each slave server.
   // The tree argument is deprecated (on PROOF 0 is passed).

   TString option = GetOption();

}


Bool_t myegamma::Process(Long64_t entry)
{
   // The Process() function is called for each entry in the tree (or possibly
   // keyed object in the case of PROOF) to be processed. The entry argument
   // specifies which entry in the currently loaded tree is to be processed.
   // It can be passed to either egamma::GetEntry() or TBranch::GetEntry()
   // to read either all or the required parts of the data. When processing
   // keyed objects with PROOF, the object is already loaded and is available
   // via the fObject pointer.
   //
   // This function should contain the "body" of the analysis. It can contain
   // simple or elaborate selection criteria, run algorithms on the data
   // of the event and typically fill histograms.
   //
   // The processing can be stopped by calling Abort().
   //
   // Use fStatus to set the return value of TTree::Process().
   //
   // The return value is currently not used.


   return kTRUE;
}

void myegamma::Terminate()
{
   // The Terminate() function is the last function to be called during
   // a query. It always runs on the client, it can be used to present
   // the results graphically or save the results to file.

}

ATLAS: WorkBook/NAF/ADT11PROOF (last edited 2011-09-21 07:27:47 by WolfgangEhrenfeld)