Tools for Event Filter Physics Studies

I wrote a set of scripts for running the combined ATRIG/ATRECON efficiently with different parameter settings. The main ideas were:

  • Link and stage all files needed for and produced by ATRIG/ATRECON with one command
  • Allow concurrent runs with different parameter settings on the same data sets
  • Automatic bookkeeping of runs with different settings
  • Provide mnemonics for tape and file sequence numbers
  • Start several runs with the same parameter setting on different data sets in one go

The scripts and other files are packaged in this tar.gz file (excluding atrecon.exe). Unpack the file in a subdirectory, for example one called atrecon, and set the environment variable ATRECON_DIR to this directory. This gives the following directory structure, including some files. More details can be found below, or by clicking on a file name.

Subdirectory  Files                       Description
bin           atrecon.run                 Start multiple runs of ATRECON
              startjob                    Start a single run, called from atrecon.run
              makedatacard                Generate the ATRECON datacard with different options, called from startjob
              getinfo                     Get information on the parameter settings of different runs
              getlog                      Get the log file from a specific run
              atrecon.del                 Delete data produced by multiple runs
              deldata                     Delete data from a single run, called by atrecon.del
              getname                     Translate a mnemonic to tape number and file sequence and get the stage size for data files
              getcounter                  Determine the next free run number (used by startjob)
              atrecon.exe                 ATRECON executable or a link to it
work                                      Used for the bookkeeping and for holding the temporary execution directories during a run
              mnemonic.dat                Data file for mnemonics and the size of stage files
kumacs        getdata.kumac               PAW kumac for loading ntuples
datacard      atlas_progflow.tit          ATRIG steering card
              tdrhilumi_param.newest.tit  ATRIG parameter card for high luminosity
              tdrlolumi_param.newest.tit  ATRIG parameter card for low luminosity
              jetfinder.cone.tit          Jetfinder steering card, set up for the cone algorithm
ntuples       (empty)                     Place for links to staged ntuples in atlas_pool
stdout        (empty)                     Location to store the log files of individual runs
output        (empty)                     Place for links to staged bank outputs in atlas_pool, if an output was requested

Additionally, you need to include $ATRECON_DIR/bin in your PATH.
All these files can also be found under /afs/cern.ch/user/m/mommsen/public/atrecon_tools.
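
As an illustration, the whole setup could look like this in a csh-type login shell (the tarball name is only a placeholder for the file linked above):

  mkdir atrecon
  cd atrecon
  gunzip -c ../atrecon_tools.tar.gz | tar xf -
  setenv ATRECON_DIR `pwd`
  setenv PATH ${PATH}:${ATRECON_DIR}/bin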

atrecon.run

Usage:  atrecon.run tapename.fileseq[-fileseq]|mnemonic  [options]

atrecon.run calls startjob for tapename.fileseq. If a range of file sequences is specified, ATRECON runs over all files in one go. The options are passed to startjob and are documented there.

Instead of tapename.fileseq[-fileseq] it is also possible to use a mnemonic as defined in mnemonic.dat. In this case, jobs for several mnemonics can be started at once if the following naming scheme is used: a mnemonic, for example dijet, stands for y00341.1-44; the next tape, y00384.1-16, gets the mnemonic dijet_2, i.e. the same base name dijet followed by an underscore and a number. With this scheme, atrecon.run dijet-2 runs over both tapes, and atrecon.run dijet_4-11 runs over all mnemonics dijet_4, dijet_5, ..., dijet_11.
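
For illustration, using the example mnemonics above, typical calls might look like:

  atrecon.run y00341.1-44             # run over files 1-44 of tape y00341 in one go
  atrecon.run dijet                   # the same data set, addressed by its mnemonic
  atrecon.run dijet-2 queue=8nh       # run over dijet and dijet_2, passing queue=8nh on to startjob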

startjob

Usage:  startjob tapename.fileseq[-fileseq]|mnemonic [options]

startjob prepares the environment for running ATRIG/ATRECON, assigns a run number, takes care of the bookkeeping, and starts the execution for a single run. The job to be run can either be specified by tapename.fileseq[-fileseq] or by mnemonics as defined in mnemonic.dat.

The options have to be given as keyword=value and can appear in any order. If a keyword is omitted, its default value is used.

atreconexe
    Default: atrecon.ACX; if the hostname includes linux, atrecon.linux is chosen.
    The name of the ATRIG/ATRECON executable, as found in the subdirectory bin.

hostname
    Default: linux if the $OSTYPE of the submitting machine is "linux"; otherwise
    hname==atlasb01 || hname==atlasb02 || hname==atlasb03 || hname==atlasb04 ||
    hname==atlasb07 || hname==atlasb08 || hname==atlasb09 || hname==atlasb10.
    The value of hostname is used to request a resource at job submission (BSUB -R). If the
    $OSTYPE of the submitting machine is "linux", the job is started on the Linux cluster;
    otherwise the resource string selects HP batch machines with the same configuration.
    With hostname=linux the job can be forced to run on the Linux cluster.

queue
    Default: 1nh; if the mnemonic includes the word pile, the queue 8nh is chosen.
    The job is submitted to the specified batch queue. queue=none runs the job in interactive
    mode on the same machine on which startjob is started.

titlecard
    Default: tdrlolumi_param.newest.tit; if the mnemonic includes the word pile,
    tdrhilumi_param.newest.tit is chosen.
    Name of the ATRIG parameter card, as found in the subdirectory datacard.

jetfinder
    Default: jetfinder.cone.tit.
    Name of the jetfinder steering card, as found in the subdirectory datacard.

Besides these keywords, all settings for makedatacard also have to be specified here.

The ATRIG parameter card and the jetfinder steering card are copied to the temporary execution directory. It is therefore possible to modify the original files as soon as startjob has finished, i.e. before the job actually starts running. This is not true for atreconexe: to preserve disk space, only a symbolic link from the temporary execution directory to the executable is created.
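
A sketch of possible calls using these keywords (the data sets and option values are merely illustrative combinations of what is listed above):

  startjob y00341.1 queue=none atreconexe=atrecon.linux
  atrecon.run dijet_2 hostname=linux queue=8nh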

makedatacard

Usage:  makedatacard runID [options]

makedatacard generates the datacard for ATRECON. The runID serves as an identifier when makedatacard is run from within startjob; in general it is not called from the command line. makedatacard uses default settings for electron identification with xKalman, which can be modified with keyword=value options. There are also some options for other settings, for example to run with iPatRec instead of xKalman, but using them correctly requires additional modifications in the code.

The options have to be given as keyword=value and can appear in any order. If a keyword is omitted, its default value is used. In general these options are given together with those described for startjob; the two sets of keywords can be mixed freely.

keyword default value description
trig 9999 Maximal number of events to be processed
redigi 0 redigi=1 switches the redigitisation of the calorimeter on
output 0 output=1 writes out the EVNT, KINE, and RECB banks
t2gl 1 t2gl=0 switches the LVL2 global histogram off
atrig 1 atrig=0 switches ATRIG off
ecalrec 1 ecalrec=0 turns the reconstruction of the e.m. calorimeter off
hcalrec 0 hcalrec=1 turns the reconstruction of the hadron calorimeter on
emisrec 0 emisrec=1 turns the reconstruction of the missing energy on
xkalman 1 xkalman=0 turns the pattern recognition xKalman off
ipatrec 0 ipatrec=1 turns the pattern recognition iPatRec on
pixlrec 0 pixlrec=1 turns the pattern recognition PixlRec on
conversion 0 conversion=1 turns the conversion finding on
muonbox 0 muonbox=1 turns the muonbox on
cbnt 1 cbnt=0 turns the CBNT off
ecalthresh 5 Minimal Et (GeV) for e.m. cluster
ecaleta 3.2 abs(eta) for reconstruction of e.m. calorimeter
eseed 0 eseed=1 turns the seeding with LVL2 electron RoI on
noise 1 noise=0 turns the noise in the calorimeter off
threshold 0 Noise threshold in sigmas (negative values for symmetric zero suppression)
pileup 0 Flag for pile-up addition, pileup=-2 used together with redigi=1 for redigitisation
digifilter 1 Switches the digital filtering of the e.m. calorimeter on and off (set to 0 if the mnemonic includes the word pile)
ptmin 5 Minimal pt (GeV) for track search with xKalman
hwetse 0.5 Road half-width in eta
hwfise 0.5 Road half-width in phi
ndivbar 1 Number of divisions in the barrel; misused for xKalman++ to choose the reconstruction method: 4 = Pixel+SCT => TRT, 5 = TRT => SCT+Pixel (Fortran xKalman method)

All these options are written to the bookkeeping file, which makes it possible to select runs with a given parameter setting (see getinfo).
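
For example, a run over the dijet data set with at most 1000 events, the hadron calorimeter reconstruction switched on, and the bank output written could be started with (an illustrative combination of the keywords above):

  atrecon.run dijet trig=1000 hcalrec=1 output=1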

getinfo

Usage:  getinfo tapename.fileseq[-fileseq]|mnemonic [run_number]

getinfo prints all available run settings for a given tapename.fileseq[-fileseq] or mnemonic. If a run number is specified, only the settings of this run are printed.
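
For example (the run number is a placeholder):

  getinfo dijet          # print the settings of all runs done on the dijet data set
  getinfo dijet 3        # print only the settings of run 3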

getlog

Usage:  getlog tapename.fileseq[-fileseq]|mnemonic run_number

getlog prints the log file produced during the run for a given tapename.fileseq[-fileseq] or mnemonic and run number. The log files are stored in the subdirectory stdout. On the first call of getlog the corresponding log file is gzipped to save disk space.

atrecon.del

Usage:  atrecon.del tapename.fileseq[-fileseq]|mnemonic run_number|all [force]

atrecon.del calls deldata for one or several runs. The data set can be specified with tapename.fileseq[-fileseq] or a mnemonic; the same syntax can be used as for atrecon.run. The second argument specifies the run number to be deleted, or all to delete all runs of the given data set. Normally the user is asked for each run whether it should be deleted. This prompt can be suppressed with the word force as third argument. Obviously, this option has to be used with great care.
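
For example (the run number is a placeholder):

  atrecon.del dijet 3            # delete run 3 of the dijet data set, with confirmation
  atrecon.del dijet all force    # delete all runs of the dijet data set without being asked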

deldata

Usage:  deldata tapename.fileseq[-fileseq]|mnemonic run_number [force]

deldata deletes all data produced by a run, i.e. log files, staged ntuples, the entry in the bookkeeping file, bank outputs, etc. The run to be deleted can be specified either by tapename.fileseq[-fileseq] or by a mnemonic. The user is asked whether the selected run should really be deleted; this prompt can be suppressed by giving the word force as third argument.

getname

Usage:  getname mnemonic [size]

getname parses the flat file mnemonic.dat for the mnemonic and returns the corresponding tapename.fileseq[-fileseq]. If no matching mnemonic is found, the first argument is returned unchanged. If the word size is given as second argument, the stage size of the corresponding data file(s) is returned; if no size is defined, a default of 500 MB is returned.
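
Using the example mapping from above (dijet stands for y00341.1-44), illustrative calls are:

  getname dijet          # prints y00341.1-44
  getname dijet size     # prints the stage size of the dijet data files (500 if none is defined)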

mnemonic.dat

This flat file defines the mapping of mnemonics to tapename.fileseq[-fileseq] and the stage size for the data files. The file has three columns: the first specifies the mnemonic, the second the tapename.fileseq[-fileseq], and the third the size of each data file in MB.
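
An illustrative excerpt, using the example mnemonics from above (the sizes are placeholders):

  dijet     y00341.1-44   500
  dijet_2   y00384.1-16   500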

The following naming convention for the mnemonics is used:

  • The name specifies the particle type, and if necessary also the pt
  • If the data sets include pileup, i.e. high luminosity, the word pile is attached without any separation
  • For multiple mnemonics addressing the same kind of data sets, the base name of the first one is reused and a number is appended, separated by an underscore.

Internally, all scripts use the unambiguous tapename.fileseq[-fileseq]. A mnemonic can therefore be changed or deleted at any time; the run data remains addressable via the corresponding tapename.fileseq[-fileseq].

getdata.kumac

Usage:  getdata file runno [[file runno] [file runno] ...]
        file :      tapename.fileseq[-fileseq]|mnemonic
        runno:      run number

getdata generates a chain of ntuples in PAW. Each ntuple is specified by its tapename.fileseq[-fileseq] or mnemonic and its run number. If multiple file runno combinations are given, all ntuples are chained and the chain is named after the first file.
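
For example, inside a PAW session the ntuples of the dijet and dijet_2 data sets can be chained with (the run numbers are placeholders, and the kumacs directory is assumed to be in PAW's macro search path):

  PAW > exec getdata dijet 1 dijet_2 1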

getcounter

Usage:  getcounter tapename.fileseq[-fileseq]|mnemonic

getcounter returns the next free run number for a data set given by tapename.fileseq[-fileseq] or its mnemonic. It is used by startjob and is normally not called from the command line.

atrecon.exe

The combined ATRIG/ATRECON executables can be found under /afs/cern.ch/user/m/mommsen/public/atrecon_tools/bin or can be downloaded by clicking on the corresponding filename (size about 10 MB).

atrecon.ACX     HP executable including ATRIG, calorimeters, xKalman, time-stamping, and the
                extension to the CBNT, based on the CVS/SRT release 0.0.33
atrecon.linux   Linux executable with the same packages, but based on CVS/SRT release 0.0.37
