Namespace: Event
-
namespace event
Functions
-
template<typename ...Args>
inline auto CombineFlags(ROOT::RDF::RNode df, const std::string &outputname, Args... args)
This function combines multiple boolean flags into a single boolean value based on the selected mode (“any_of”, “all_of”, or “none_of”). The mode determines how the flags are evaluated:
"any_of": Returns true if at least one of the flags is true
"all_of": Returns true if all flags are true
"none_of": Returns true if none of the flags are true
Note
The mode ("any_of", "all_of", or "none_of") is extracted as the last argument in the args parameter pack, and the rest of the arguments are treated as individual flag columns.
- Template Parameters:
Args – variadic template parameter pack representing the flag columns plus mode
- Parameters:
df – input dataframe
outputname – name of the output column containing the combined flag
args – parameter pack of column names that contain the considered flags of type bool, with the last argument being the mode ("any_of", "all_of", or "none_of")
- Returns:
a dataframe with a new column
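The three modes described above can be illustrated without ROOT. The following is a minimal sketch, not the library's implementation: `combine_flags` is a hypothetical helper that takes the per-event flag values directly, and throwing on an unknown mode is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the documented API): evaluates the flag
// values of one event according to the mode string.
inline bool combine_flags(const std::vector<bool> &flags, const std::string &mode) {
    const auto is_true = [](bool flag) { return flag; };
    if (mode == "any_of")
        return std::any_of(flags.begin(), flags.end(), is_true);
    if (mode == "all_of")
        return std::all_of(flags.begin(), flags.end(), is_true);
    if (mode == "none_of")
        return std::none_of(flags.begin(), flags.end(), is_true);
    // Assumption for this sketch: unknown modes are rejected.
    throw std::invalid_argument("unknown mode: " + mode);
}
```

For an event with flags {true, false}, "any_of" yields true, while "all_of" and "none_of" both yield false.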
-
namespace filter
Functions
-
ROOT::RDF::RNode GoldenJSON(ROOT::RDF::RNode df, correctionManager::CorrectionManager &correction_manager, const std::string &filtername, const std::string &run, const std::string &luminosity, const std::string &json_path)
This function applies a filter to the input dataframe using a Golden JSON file, which contains a mapping of valid run-luminosity pairs. The dataframe is filtered by checking if the run and luminosity values for each row match the entries in the Golden JSON. Rows with invalid run-luminosity pairs are removed.
The Golden JSON files are taken from the CMS recommendations.
Run2: https://twiki.cern.ch/twiki/bin/view/CMS/LumiRecommendationsRun2
Run3: https://twiki.cern.ch/twiki/bin/view/CMS/LumiRecommendationsRun3 (not added yet)
- Parameters:
df – input dataframe
correction_manager – correction manager responsible for loading the Golden JSON
filtername – name of the filter to be applied (used in the dataframe report)
run – name of the run column
luminosity – name of the luminosity column
json_path – path to the Golden JSON file
- Returns:
a filtered dataframe
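The run-luminosity check can be sketched independently of the correction manager. The example below assumes the usual Golden JSON layout, in which each run number maps to a list of inclusive luminosity-block ranges; the type aliases and `is_certified` are hypothetical names, and loading the JSON file is left out.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Assumed in-memory form of a Golden JSON: run number -> list of inclusive
// [first, last] luminosity-block ranges.
using LumiRanges = std::vector<std::pair<unsigned int, unsigned int>>;
using GoldenMap = std::map<unsigned int, LumiRanges>;

// Hypothetical helper: true if (run, lumi) lies inside any listed range.
inline bool is_certified(const GoldenMap &golden, unsigned int run, unsigned int lumi) {
    const auto it = golden.find(run);
    if (it == golden.end())
        return false; // run is not listed at all
    for (const auto &range : it->second) {
        if (lumi >= range.first && lumi <= range.second)
            return true;
    }
    return false;
}
```

A row would be kept only if `is_certified` returns true for its run and luminosity values.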
-
inline ROOT::RDF::RNode Flag(ROOT::RDF::RNode df, const std::string &filtername, const std::string &flagname)
This function applies a filter to the input dataframe based on a boolean flag column. It returns only the rows where the flag value is true.
Use case examples are the noise filters recommended by the CMS JetMET group (https://twiki.cern.ch/twiki/bin/viewauth/CMS/MissingETOptionalFiltersRun2).
- Parameters:
df – input dataframe
filtername – name of the filter to be applied (used in the dataframe report)
flagname – name of the boolean flag column to use for filtering
- Returns:
a filtered dataframe
-
inline ROOT::RDF::RNode InvertedFlag(ROOT::RDF::RNode df, const std::string &filtername, const std::string &flagname)
This function applies a filter to the input dataframe based on a boolean flag column. It returns only the rows where the flag value is false.
- Parameters:
df – input dataframe
filtername – name of the filter to be applied (used in the dataframe report)
flagname – name of the boolean flag column to use for filtering
- Returns:
a filtered dataframe
-
template<typename ...Args>
inline auto Flags(ROOT::RDF::RNode df, const std::string &filtername, Args... args)
This function filters the rows of the input dataframe by evaluating multiple boolean flags according to a specified mode. The filtering mode can be “any_of”, “all_of”, or “none_of”:
"any_of": Keeps the rows where at least one flag is true
"all_of": Keeps the rows where all flags are true
"none_of": Keeps the rows where none of the flags are true
Note
The last argument must be the mode, while the preceding arguments are the boolean flag columns to be evaluated.
- Template Parameters:
Args – variadic template parameter pack representing the flag columns plus mode
- Parameters:
df – input dataframe
filtername – name of the filter to be applied (used in the dataframe report)
args – parameter pack of column names that contain the considered flags of type bool, with the last argument being the mode ("any_of", "all_of", or "none_of")
- Returns:
a filtered dataframe
-
template<typename T>
inline ROOT::RDF::RNode Quantity(ROOT::RDF::RNode df, const std::string &filtername, const std::string &quantity, const std::vector<T> &selection)
This function filters the rows of the input dataframe by checking if a specified quantity exists in the provided selection vector. Rows where the quantity is found in the selection vector are kept, while others are removed.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
filtername – name of the filter to be applied (used in the dataframe report)
quantity – name of the quantity column in the dataframe of type T
selection – a vector containing the selection of values of type T to filter the quantity against
- Returns:
a filtered dataframe
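The per-row decision made by such a filter amounts to a membership test. A minimal sketch, assuming a linear search is sufficient; `is_selected` is a hypothetical name and not part of the documented API:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: true if `value` is one of the allowed `selection` values.
template <typename T>
bool is_selected(const T &value, const std::vector<T> &selection) {
    return std::find(selection.begin(), selection.end(), value) != selection.end();
}
```

A row with quantity value 2 is kept for the selection {1, 2, 3} and removed for the selection {4, 5}.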
-
namespace quantity
Functions
-
ROOT::RDF::RNode GenerateSeed(ROOT::RDF::RNode df, const std::string &outputname, const std::string &lumi, const std::string &run, const std::string &event, const UInt_t &master_seed = 42)
This function defines a new column in the dataframe with seeds for a random number generator for each event.
The seed value for each event is calculated by concatenating the master seed and the event index variables into the string {seed}_{lumi}_{run}_{event}. A SHA256 hash of that string is then calculated, and the first four bytes of the hash are used to create a 32-bit unsigned integer, which serves as the event seed.
- Parameters:
df – input dataframe
outputname – name of the new column containing the generated event seeds
lumi – name of the column containing the luminosity block number
run – name of the column containing the run number
event – name of the column containing the event number
master_seed – master seed value to be added to the hash used for event seed generation
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode EvenOddFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity)
This function creates a flag column based on a quantity. The flag is set to true if the quantity value is even and false if it is odd. This can be useful for splitting datasets into two subsets.
- Template Parameters:
T – type of the quantity (e.g. ULong64_t, int)
- Parameters:
df – input dataframe
outputname – name of the new flag column
quantity – name of the column containing a quantity that can be used to define the flag (e.g., event ID)
- Returns:
a dataframe with the new flag column
-
template<typename T>
inline ROOT::RDF::RNode MinFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T &threshold)
This function defines a flag for event quantities that satisfy a minimum threshold requirement. The flag is created by comparing the value in the specified quantity column with the given threshold, marking elements as true if they pass the cut and false otherwise.
- Template Parameters:
T – type of the threshold and input quantity (e.g. float, int)
- Parameters:
df – input dataframe
outputname – name of the new column containing the selected event flag
quantity – name of the quantity column for which the cut should be evaluated, expected to be of type T
threshold – minimum threshold value of type T
- Returns:
a dataframe containing the new flag as a column
-
template<typename T>
inline ROOT::RDF::RNode AbsMinFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T &threshold)
This function defines a flag for event quantities that satisfy a minimum threshold requirement. The flag is created by comparing the absolute value in the specified quantity column with the given threshold, marking elements as true if they pass the cut and false otherwise.
- Template Parameters:
T – type of the threshold and input quantity (e.g. float, int)
- Parameters:
df – input dataframe
outputname – name of the new column containing the selected event flag
quantity – name of the quantity column for which the cut should be evaluated, expected to be of type T
threshold – minimum threshold value of type T
- Returns:
a dataframe containing the new flag as a column
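The difference between the plain and the absolute-value threshold flags can be sketched as follows. The helper names are hypothetical, and the inclusive comparison (`>=`) is an assumption; the source does not state whether the cut is inclusive or exclusive.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helpers; the inclusive comparison (>=) is an assumption.
template <typename T>
bool passes_min(const T &value, const T &threshold) {
    return value >= threshold;
}

template <typename T>
bool passes_abs_min(const T &value, const T &threshold) {
    return std::abs(value) >= threshold; // compare |value| instead of value
}
```

A value of -2.5 with threshold 2.4 fails the plain minimum cut but passes the absolute-value variant.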
-
template<typename T>
inline ROOT::RDF::RNode MaxFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T &threshold)
This function defines a flag for event quantities that satisfy a maximum threshold requirement. The flag is created by comparing the value in the specified quantity column with the given threshold, marking elements as true if they pass the cut and false otherwise.
- Template Parameters:
T – type of the threshold and input quantity (e.g. float, int)
- Parameters:
df – input dataframe
outputname – name of the new column containing the selected event flag
quantity – name of the quantity column for which the cut should be evaluated, expected to be of type T
threshold – maximum threshold value of type T
- Returns:
a dataframe containing the new flag as a column
-
template<typename T>
inline ROOT::RDF::RNode AbsMaxFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T &threshold)
This function defines a flag for event quantities that satisfy a maximum threshold requirement. The flag is created by comparing the absolute value in the specified quantity column with the given threshold, marking elements as true if they pass the cut and false otherwise.
- Template Parameters:
T – type of the threshold and input quantity (e.g. float, int)
- Parameters:
df – input dataframe
outputname – name of the new column containing the selected event flag
quantity – name of the quantity column for which the cut should be evaluated, expected to be of type T
threshold – maximum threshold value of type T
- Returns:
a dataframe containing the new flag as a column
-
template<typename T>
inline ROOT::RDF::RNode EqualFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T &threshold)
This function defines a flag for event quantities that satisfy an exact threshold requirement. The flag is created by comparing the value in the specified quantity column with the given threshold, marking elements as true if they pass the cut and false otherwise.
- Template Parameters:
T – type of the threshold and input quantity (e.g. float, int)
- Parameters:
df – input dataframe
outputname – name of the new column containing the selected event flag
quantity – name of the quantity column for which the cut should be evaluated, expected to be of type T
threshold – exact threshold value of type T
- Returns:
a dataframe containing the new flag as a column
-
template<typename T>
inline ROOT::RDF::RNode AbsEqualFlag(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T &threshold)
This function defines a flag for event quantities that satisfy an exact threshold requirement. The flag is created by comparing the absolute value in the specified quantity column with the given threshold, marking elements as true if they pass the cut and false otherwise.
- Template Parameters:
T – type of the threshold and input quantity (e.g. float, int)
- Parameters:
df – input dataframe
outputname – name of the new column containing the selected event flag
quantity – name of the quantity column for which the cut should be evaluated, expected to be of type T
threshold – exact threshold value of type T
- Returns:
a dataframe containing the new flag as a column
-
template<typename T>
inline ROOT::RDF::RNode Rename(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity)
This function creates a new column in the dataframe with the specified outputname, copying the values from an existing quantity column. The original column remains unchanged.
- Template Parameters:
T – type of the input quantity values
- Parameters:
df – input dataframe
outputname – name of the new column
quantity – name of the existing column to copy values from
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode Define(ROOT::RDF::RNode df, const std::string &outputname, T const &value)
This function adds a new column to the dataframe, assigning it a constant value for all entries.
- Template Parameters:
T – type of the value to be assigned
- Parameters:
df – input dataframe
outputname – name of the new column
value – constant value to be assigned to the new column
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode GenerateRandomVector(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const int seed = 42)
This function defines a new column in the dataframe, where each element is a randomly generated number. The random values are generated using TRandom3, seeded with a user-specified value and uniformly distributed in the range [0,1]. The number of generated values matches the size of the input column vector.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
outputname – name of the new column containing the generated random vector
quantity – name of the input column whose size determines the length of the random vector
seed – seed value for the random number generator; defaults to 42 if not set
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode Negate(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity)
This function creates a new column in the dataframe by applying element-wise negation to an existing quantity column.
- Template Parameters:
T – type of the input quantity values
- Parameters:
df – input dataframe
outputname – name of the new column
quantity – name of the existing column to be negated
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode Take(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const std::string &index_vector)
This function extracts values from the given quantity at the indices specified in a collection index. The order of the output values reflects the order of the indices. The function uses ROOT::VecOps::Take internally, leading to the following behavior:
```cpp
auto values = ROOT::RVec<float>({0.1, 0.2, 0.3, 0.4});
auto index = ROOT::RVec<int>({2, 3, 1});
auto result = ROOT::VecOps::Take(values, index);
result
// (ROOT::VecOps::RVec<float>) {0.3, 0.4, 0.2}
```
The column index_vector must contain the indices for which values should be extracted, and the quantity column must contain the values of the quantity.
Note that T is the type of the values stored in the RVec containers in the quantity column, e.g., if the column has type RVec<float>, you must use T = float.
Note
If the index is out of range, a default value of type T is returned.
- Template Parameters:
T – underlying type of the input column values
- Parameters:
df – input dataframe
outputname – name of the new column containing the extracted value
quantity – name of the column from which the value is retrieved
index_vector – index list for values to be extracted
- Returns:
a dataframe with the new column
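The out-of-range behavior mentioned in the note can be illustrated with a standard-library stand-in. This is a sketch under assumptions: `take_with_default` is a hypothetical name, it uses std::vector instead of RVec, and it treats negative indices as out of range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical std-only stand-in: pick values at the given indices and
// substitute a default-constructed T when an index is out of range.
template <typename T>
std::vector<T> take_with_default(const std::vector<T> &values,
                                 const std::vector<int> &indices) {
    std::vector<T> result;
    result.reserve(indices.size());
    for (const int idx : indices) {
        const bool in_range =
            idx >= 0 && static_cast<std::size_t>(idx) < values.size();
        result.push_back(in_range ? values[static_cast<std::size_t>(idx)] : T());
    }
    return result;
}
```

The output keeps the order of the indices, so indices {2, 3, 1} applied to four values return the third, fourth, and second value in that order.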
-
template<typename T>
inline ROOT::RDF::RNode Get(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const int &index)
This function extracts a value from the given column at a specified index.
Note
If the index is out of range, a default value of type T is returned.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
outputname – name of the new column containing the extracted value
quantity – name of the column from which the value is retrieved
index – fixed index position used to extract the value
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode Get(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const std::string &index_vector, const int &position)
This function extracts a value from the given column based on an index stored in another column.
Note
If the index is out of range, a default value of type T is returned.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
outputname – name of the new column containing the extracted value
quantity – name of the column from which the value is retrieved
index_vector – name of the column containing index values
position – position within the index vector used to retrieve the index
- Returns:
a dataframe with the new column
-
template<typename T>
ROOT::RDF::RNode GetGenJetForJet(ROOT::RDF::RNode df, const std::string &outputname, const std::string &genjet_quantity, const std::string &jet_genjet_index, const std::string &index_vector, const int &position)
This function gets a generator-level (gen.) jet quantity for a given jet. It finds the gen. jet associated with a reconstructed jet via the indices that are present in nanoAODs.
If the generator-level jet cannot be accessed, the function returns a default value.
Example: Let the column "good_jet_indices" contain the indices of selected AK4 jets. For the Jet collection, the column "Jet_genJetIdx" contains the index of the matched generator-level jet in the GenJet collection. To define the generator-level pT of the leading reconstructed AK4 jet, one needs to call (T cannot be deduced from the arguments, so it is given explicitly):
```cpp
event::quantity::GetGenJetForJet<float>(
    df, "jet_gen_pt_1", "GenJet_pt", "Jet_genJetIdx", "good_jet_indices", 0
)
```
- Template Parameters:
T – type of the input gen. jet column values
- Parameters:
df – input dataframe
outputname – name of the output column containing the gen. jet quantity value
genjet_quantity – name of the column containing the gen. jet quantity vector
jet_genjet_index – name of the column containing the association (via index) between the jet and the gen. jet collection
index_vector – name of the column containing the vector with the relevant jet indices
position – position in the index vector that specifies which jet in the jet vector should be used to get its associated gen. jet quantity
- Returns:
a dataframe with the new column
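The index chain described above (position in the index vector, then jet index, then gen. jet index) can be sketched with plain vectors. The function name and the use of `T()` as the default value are assumptions for illustration; the source only states that a default value is returned when the gen. jet cannot be accessed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the lookup chain:
//   index_vector[position] -> jet index
//   jet_genjet_index[jet]  -> gen. jet index (negative if unmatched)
//   genjet_quantity[genjet] -> value
template <typename T>
T genjet_quantity_for_jet(const std::vector<T> &genjet_quantity,
                          const std::vector<int> &jet_genjet_index,
                          const std::vector<int> &index_vector,
                          int position, T default_value = T()) {
    const auto valid = [](int i, std::size_t size) {
        return i >= 0 && static_cast<std::size_t>(i) < size;
    };
    if (!valid(position, index_vector.size()))
        return default_value; // no jet at this position
    const int jet = index_vector[position];
    if (!valid(jet, jet_genjet_index.size()))
        return default_value; // jet index outside the jet collection
    const int genjet = jet_genjet_index[jet];
    if (!valid(genjet, genjet_quantity.size()))
        return default_value; // unmatched or invalid gen. jet index
    return genjet_quantity[genjet];
}
```

Each step is bounds-checked separately, so a missing jet, an unmatched jet, and an invalid gen. jet index all fall back to the same default.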
-
template<typename T>
ROOT::RDF::RNode GetGenJetForObject(ROOT::RDF::RNode df, const std::string &outputname, const std::string &genjet_quantity, const std::string &jet_genjet_index, const std::string &object_jet_index, const std::string &object_index_vector, const int &position)
This function gets the gen. jet quantity for a given object. All objects are usually also reconstructed as jets. This function finds the corresponding jet and the associated gen. jet via indices that are present in nanoAODs.
If the generator-level jet cannot be accessed, the function returns a default value.
- Template Parameters:
T – type of the input gen. jet column values
- Parameters:
df – input dataframe
outputname – name of the output column containing the gen. jet quantity value
genjet_quantity – name of the column containing the gen. jet quantity vector
jet_genjet_index – name of the column containing the association (via index) between the jet and the gen. jet collection
object_jet_index – name of the column containing the association (via index) between the object and the jet collection
object_index_vector – name of the column containing the vector with the relevant object indices
position – position in the index vector that specifies which object in the object vector should be used to get its associated gen. jet quantity
- Returns:
a dataframe with the new column
-
template<typename T>
ROOT::RDF::RNode GetJetForObject(ROOT::RDF::RNode df, const std::string &outputname, const std::string &jet_quantity, const std::string &object_jet_index, const std::string &object_index_vector, const int &position)
This function gets the jet quantity for a given object. All objects are usually also reconstructed as jets. This function finds the corresponding jet via indices that are present in nanoAODs.
If the reconstruction-level jet cannot be accessed, the function returns a default value.
- Template Parameters:
T – type of the input jet column values
- Parameters:
df – input dataframe
outputname – name of the output column containing the jet quantity value
jet_quantity – name of the column containing the jet quantity vector
object_jet_index – name of the column containing the association (via index) between the object and the jet collection
object_index_vector – name of the column containing the vector with the relevant object indices
position – position in the index vector that specifies which object in the object vector should be used to get its associated jet quantity
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode Sum(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const T zero = T(0))
This function computes the sum of the elements in the quantity column for each event. If no elements are selected, a default value (provided by zero) is used as the sum for that event.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
outputname – name of the new column containing the summed values
quantity – name of the column containing the vector of values to be summed
zero – default value to use in ROOT::VecOps::Sum (default is T(0))
- Returns:
a dataframe with the new column
-
template<typename T>
inline ROOT::RDF::RNode Sum(ROOT::RDF::RNode df, const std::string &outputname, const std::string &quantity, const std::string &index_vector, const T zero = T(0))
This function computes the sum of the elements in the quantity column, selected by the indices from the index_vector column. The sum is computed per event, and a default value (provided by zero) is used if no elements are selected.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
outputname – name of the new column containing the summed values
quantity – name of the column containing the vector of values to be summed
index_vector – name of the column containing the indices used to select values from quantity
zero – default value to use in ROOT::VecOps::Sum (default is T(0))
- Returns:
a dataframe with the new column
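The index-selected sum can be sketched without ROOT as follows. `sum_selected` is a hypothetical name; starting the accumulation from `zero` mirrors the role of the `zero` parameter, and using `at()` (which throws on an invalid index) is a choice made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical std-only sketch: sum the values picked by `indices`,
// starting from `zero` so that an empty selection returns `zero`.
template <typename T>
T sum_selected(const std::vector<T> &values, const std::vector<int> &indices,
               const T zero = T(0)) {
    T total = zero;
    for (const int idx : indices)
        total += values.at(static_cast<std::size_t>(idx)); // throws if invalid
    return total;
}
```

With values {10, 20, 30} and indices {0, 2}, the sum is 40; with an empty index list, the result is the value of `zero`.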
-
template<typename ...Quantities>
inline ROOT::RDF::RNode ScalarSum(ROOT::RDF::RNode df, const std::string &outputname, Quantities... quantities)
This function calculates the scalar sum of an arbitrary set of quantities of type float.
- Template Parameters:
Quantities – variadic template parameter pack representing the quantity columns
- Parameters:
df – input dataframe
outputname – name of the output column containing the scalar sum
quantities – parameter pack of column names that contain the considered quantities
- Returns:
a dataframe with a new column
-
template<typename T>
inline ROOT::RDF::RNode Unroll(ROOT::RDF::RNode df, const std::vector<std::string> &outputnames, const std::string &quantity, const size_t &index = 0)
This function recursively unrolls a vector (std::vector<T>) from the quantity column into individual columns in the dataframe. Each element of the vector is stored in a separate column with names provided in the outputnames vector. The function works recursively to define a new column for each element in the vector.
Note
The function is recursive and will create one column for each element of the vector in quantity. If outputnames has fewer entries than the number of elements in the vector, the function will stop at the end of outputnames. The index should not be set outside this function.
Warning
The length of the quantity vector has to be the same for each event.
- Template Parameters:
T – type of the input column values
- Parameters:
df – input dataframe
outputnames – a vector of names for the new columns where the individual elements of the vector will be stored
quantity – name of the column containing the vector of values to unroll
index – index of the current element to unroll (defaults to 0).
- Returns:
a dataframe with the new columns containing each individual element of the vector from the quantity column
-
namespace reweighting
Functions
-
ROOT::RDF::RNode Pileup(ROOT::RDF::RNode df, correctionManager::CorrectionManager &correction_manager, const std::string &outputname, const std::string &true_pileup_number, const std::string &corr_file, const std::string &corr_name, const std::string &variation)
This function is used to correct Monte Carlo (MC) simulations for differences in the pileup distribution compared to the one measured in data. It retrieves a per-event weight from a correction file based on the true number of pileup interactions in an event.
The correction files are provided by the Luminosity POG and more information about the pileup reweighting can be found here: https://twiki.cern.ch/twiki/bin/view/CMS/PileupJSONFileforData
- Parameters:
df – input dataframe
correction_manager – correction manager responsible for loading the pileup weights file
outputname – name of the output column containing the pileup event weight
true_pileup_number – name of the column containing the true mean of the Poisson distribution from which the number of interactions per bunch crossing has been sampled for an event
corr_file – path to the file with the pileup weights
corr_name – name of the pileup correction in the file, e.g. “Collisions18_UltraLegacy_goldenJSON”
variation – name of the pileup weight variation, options are “nominal”, “up” and “down”
- Returns:
a new dataframe containing the new column
-
ROOT::RDF::RNode PUWeightROOT(ROOT::RDF::RNode df, const std::string &outputname, const std::string &truePUMean, const std::string &datafilename, const std::string &mcfilename, const std::string &histname)
This function reads pileup weights from ROOT files.
Note
This function is intended only for cases where the pileup weights are not available in the correction files.
- Parameters:
df – input dataframe
outputname – name of the derived weight
truePUMean – name of the column containing the true PU mean of simulated events
datafilename – path to the data rootfile
mcfilename – path to the MC rootfile
histname – name of the histogram stored in the rootfile
- Returns:
a new dataframe containing the new column
-
ROOT::RDF::RNode PartonShower(ROOT::RDF::RNode df, const std::string &outputname, const std::string &ps_weights, const float isr, const float fsr)
This function is used to evaluate the parton shower (PS) weight of an event. The weights are stored in the nanoAOD files and defined as \(w_{variation}\) / \(w_{nominal}\). The nominal weight is already applied, therefore, the main use of this function is to get the initial state radiation (ISR) and final state radiation (FSR) variations to the nominal PS weight.
Depending on the selected ISR and FSR value, a specific index has to be identified. The mapping between the index and the ISR and FSR values is:
ISR | FSR | index
2.0 | 1.0 | 0
1.0 | 2.0 | 1
0.5 | 1.0 | 2
1.0 | 0.5 | 3
Note
For some simulated samples this mapping might be defined differently, therefore, it is advisable to check the documentation of the PSWeight branch in the nanoAOD files of the samples if issues occur.
- Parameters:
df – input dataframe
outputname – name of the output column containing the ISR/FSR event weight
ps_weights – name of the column containing the parton shower (ISR/FSR) weights
isr – value of the ISR variation, possible values are 0.5, 1.0, 2.0
fsr – value of the FSR variation, possible values are 0.5, 1.0, 2.0
- Returns:
a new dataframe containing the new column
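The ISR/FSR index mapping listed above can be written as a small lookup. This is a sketch of the documented table only; `ps_weight_index` is a hypothetical name, and rejecting combinations that are not in the table (including ISR = FSR = 1.0) is an assumption, since the source does not say how the function handles them.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper mirroring the documented ISR/FSR -> index table.
// Exact comparison is safe here because 0.5, 1.0 and 2.0 are exactly
// representable as floats.
inline int ps_weight_index(float isr, float fsr) {
    if (isr == 2.0f && fsr == 1.0f) return 0;
    if (isr == 1.0f && fsr == 2.0f) return 1;
    if (isr == 0.5f && fsr == 1.0f) return 2;
    if (isr == 1.0f && fsr == 0.5f) return 3;
    // Assumption for this sketch: anything else is rejected.
    throw std::invalid_argument("ISR/FSR combination not in the mapping table");
}
```

The returned index would then be used to read the matching entry from the parton shower weight vector of the event.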
-
ROOT::RDF::RNode LHEscale(ROOT::RDF::RNode df, const std::string &outputname, const std::string &lhe_scale_weights, const float mu_r, const float mu_f)
This function is used to evaluate the LHE scale weight of an event. The weights are stored in the nanoAOD files and defined as \(w_{variation}\) / \(w_{nominal}\). The nominal weight is already applied, therefore, the main use of this function is to get the factorization and renormalization scale variations to the nominal scale weight.
Depending on the selected \(\mu_R\) and \(\mu_F\) value, a specific index has to be identified. The mapping between the index and the \(\mu_R\) and \(\mu_F\) values is:
mu_f | mu_r | index
0.5 | 0.5 | 0
1.0 | 0.5 | 1
2.0 | 0.5 | 2
0.5 | 1.0 | 3
1.0 | 1.0 | 4 (not always included)
2.0 | 1.0 | 5 (4)
0.5 | 2.0 | 6 (5)
1.0 | 2.0 | 7 (6)
2.0 | 2.0 | 8 (7)
Note
For some simulated samples this mapping might be defined differently, therefore, it is advisable to check the documentation of the LHEScaleWeight branch in the nanoAOD files of the samples if issues occur.
- Parameters:
df – input dataframe
outputname – name of the output column containing the LHE scale event weight
lhe_scale_weights – name of the column containing the LHE scale weights
mu_r – value of \(\mu_R\) variation, possible values are 0.5, 1.0, 2.0
mu_f – value of \(\mu_F\) variation, possible values are 0.5, 1.0, 2.0
- Returns:
a new dataframe containing the new column
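The scale-weight mapping above follows a regular pattern: with the steps 0.5, 1.0, 2.0 numbered 0, 1, 2, the index in the nine-entry layout is 3 · step(mu_r) + step(mu_f), and in the eight-entry layout (nominal entry missing) every index above 4 shifts down by one. The sketch below encodes this; the function names are hypothetical, and distinguishing the two layouts by the number of stored weights is an assumption about how one might implement it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: maps a scale factor to its step number (0, 1, 2).
inline int scale_step(float mu) {
    if (mu == 0.5f) return 0;
    if (mu == 1.0f) return 1;
    if (mu == 2.0f) return 2;
    throw std::invalid_argument("scale must be 0.5, 1.0 or 2.0");
}

// Hypothetical helper: index into the LHE scale weight vector.
// n_weights == 9: full layout; n_weights == 8: nominal entry (index 4) absent.
inline int lhe_scale_index(float mu_r, float mu_f, std::size_t n_weights) {
    const int index = 3 * scale_step(mu_r) + scale_step(mu_f);
    if (n_weights == 9)
        return index;
    if (n_weights == 8) {
        if (index == 4)
            throw std::invalid_argument("nominal entry is not stored");
        return index > 4 ? index - 1 : index;
    }
    throw std::invalid_argument("unexpected number of scale weights");
}
```

For mu_r = 1.0 and mu_f = 2.0 this gives index 5 in the nine-entry layout and index 4 in the eight-entry layout, matching the table.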
-
ROOT::RDF::RNode LHEpdf(ROOT::RDF::RNode df, const std::string &outputname, const std::string &lhe_pdf_weights, const std::string &variation)
This function is used to evaluate the LHE PDF weight of an event. The weights are stored in the nanoAOD files and defined as \(w_{variation}\) / \(w_{nominal}\). The nominal weight is already applied, therefore, the main use of this function is to get the variation of the PDF weights to the nominal PDF weight.
The PDF weights consist of 101 weights, where the first weight is the nominal weight and the remaining 100 weights correspond to alternative PDF sets.
Note
The proper procedure is to use each alternative PDF set as an independent systematic variation. However, in case of this function, a simplified approach is used to calculate a single PDF weight variation. The standard deviation of the 100 alternative PDF weights is calculated and used to define the up and down variations as follows: \(w_{up/down} = 1 \pm \sqrt{\sum_{i=1}^{100} (w_i - 1)^2}\)
- Parameters:
df – input dataframe
outputname – name of the output column containing the LHE PDF event weight
lhe_pdf_weights – name of the column containing the LHE PDF weights
variation – name of the variation that should be evaluated, possible values are “nominal”, “up”, “down”
- Returns:
a new dataframe containing the new column
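The formula in the note can be evaluated directly from the weight vector. The following sketch assumes that entry 0 is the nominal weight and that all remaining entries are alternative sets; `pdf_variation` is a hypothetical name, and returning 1 for "nominal" is an assumption based on the weights being defined relative to the nominal weight.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper implementing w_up/down = 1 +- sqrt(sum_i (w_i - 1)^2),
// where the sum runs over all entries after the nominal one (index 0).
inline double pdf_variation(const std::vector<float> &weights,
                            const std::string &variation) {
    if (variation == "nominal")
        return 1.0; // assumption: the nominal weight is already applied
    double sum_sq = 0.0;
    for (std::size_t i = 1; i < weights.size(); ++i) {
        const double diff = static_cast<double>(weights[i]) - 1.0;
        sum_sq += diff * diff;
    }
    const double delta = std::sqrt(sum_sq);
    if (variation == "up")
        return 1.0 + delta;
    if (variation == "down")
        return 1.0 - delta;
    throw std::invalid_argument("unknown variation: " + variation);
}
```

With a single alternative weight of 1.5, the up and down variations are 1.5 and 0.5.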
-
ROOT::RDF::RNode LHEalphaS(ROOT::RDF::RNode df, const std::string &outputname, const std::string &lhe_pdf_weights, const std::string &variation)
This function is used to evaluate the LHE \(\alpha_S\) weight of an event. The weights are stored in the nanoAOD files and defined as \(w_{variation}\) / \(w_{nominal}\). The nominal weight is already applied, therefore, the main use of this function is to get the variation of the \(\alpha_S\) weight to the nominal weight.
For some samples the \(\alpha_S\) weight is included in the PDF weights vector. In that case the full PDF weights vector is expected to contain 103 entries, where the first 101 entries are PDF weights and the last two entries correspond to the up and down varied \(\alpha_S\) weight.
- Parameters:
df – input dataframe
outputname – name of the output column containing the LHE \(\alpha_S\) event weight
lhe_pdf_weights – name of the column containing the LHE \(\alpha_S\) weights (it is part of the LHE PDF weights)
variation – name of the variation that should be evaluated, possible values are “nominal”, “up”, “down”
- Returns:
a new dataframe containing the new column
-
ROOT::RDF::RNode TopPt(ROOT::RDF::RNode df, const std::string &outputname, const std::string &genparticles_pdg_id, const std::string &genparticles_status_flags, const std::string &genparticles_pt)
This function is used to calculate an event weight to correct the top quark \(p_T\) mismodeling in simulated \(t\bar{t}\) events. The correction is provided by the Top POG, and the weight calculated by this function corrects NLO simulation (POWHEG+Pythia8) to data.
For reference: https://twiki.cern.ch/twiki/bin/viewauth/CMS/TopPtReweighting
The weight is calculated as \(w=\sqrt{SF(t)\cdot SF(\bar{t})}\)
with \(SF= \exp(0.0615-0.0005\cdot p_T)\)
Note
The Top POG also provides other reweighting functions, e.g. for NNLO to data or NLO to NNLO which could be preferred depending on the use case.
- Parameters:
df – input dataframe
outputname – name of the output column containing the derived event weight
genparticles_pdg_id – name of the column containing the PDG IDs of the generator particles
genparticles_status_flags – name of the column containing the status flags of the generator particles, where bit 13 contains the isLastCopy flag
genparticles_pt – name of the column containing the pt of the generator particles
- Returns:
a new dataframe containing the new column
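The weight formula and the status flag check described above can be sketched in plain C++ as follows. The helper names `TopPtScaleFactor`, `TopPtWeight`, and `IsLastCopy` are hypothetical and chosen for illustration. The sketch assumes that \(p_T\) is given in GeV and that the top and antitop quark have already been selected from the generator particles.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only, not the library implementation.

// Scale factor SF = exp(0.0615 - 0.0005 * pT) for a single top quark.
// Assumption: pt is given in GeV.
double TopPtScaleFactor(double pt) {
    return std::exp(0.0615 - 0.0005 * pt);
}

// Event weight w = sqrt(SF(t) * SF(tbar)).
double TopPtWeight(double top_pt, double antitop_pt) {
    return std::sqrt(TopPtScaleFactor(top_pt) * TopPtScaleFactor(antitop_pt));
}

// Checks bit 13 of the status flags, which holds the isLastCopy flag.
bool IsLastCopy(int status_flags) {
    return ((status_flags >> 13) & 1) != 0;
}
```

With this parametrization the scale factor equals 1 at \(p_T = 123\) GeV, is larger than 1 below that value, and smaller than 1 above it.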
-
ROOT::RDF::RNode ZBosonPt(ROOT::RDF::RNode df, correctionManager::CorrectionManager &correction_manager, const std::string &outputname, const std::string &gen_boson, const std::string &corr_file, const std::string &corr_name, const std::string &order, const std::string &variation)
This function is used to calculate an event weight to correct the Z boson \(p_T\). These corrections are recommended especially for LO Drell-Yan samples, where the \(p_T\) and mass of the Z boson are mismodeled compared to data. This function is defined for the corrections provided by the CMS HLepRare group. More details can be found here: https://cms-higgs-leprare.docs.cern.ch/htt-common/DY_reweight/.
Note
HLepRare only provides corrections for Run3. For Run2 see event::reweighting::ZPtMass.
- Parameters:
df – input dataframe
correction_manager – correction manager responsible for loading the correction file
outputname – name of the output column containing the derived event weight
gen_boson – name of the column containing the Lorentz vector of the generator-level boson
corr_file – path to the correction file containing the Z boson \(p_T\) corrections
corr_name – name of the correction in the json file
order – order of the used DY samples: “LO” for madgraph, “NLO” for amcatnlo, “NNLO” for powheg
variation – name of the variation that should be evaluated, options are “nom”, “up”, “down” or “upX”, “downX”. For “up” and “down” the uncertainty is defined by the envelope of all provided uncertainty sources in the correction file. Otherwise the specific uncertainty source “X” is used (where X is a number e.g. 1,2,3,…).
- Returns:
a new dataframe containing the new column
-
ROOT::RDF::RNode ZPtMass(ROOT::RDF::RNode df, const std::string &outputname, const std::string &gen_boson, const std::string &workspace_file, const std::string &functor_name, const std::string &argset)
This function is used to calculate an event weight based on Z boson \(p_T\) and mass corrections. These corrections are recommended especially for LO Drell-Yan samples, where the \(p_T\) and mass of the Z boson are mismodeled compared to data.
Note
The function is intended for Run 2 analyses. In Run 3, Z \(p_T\) corrections are handled through correctionlib, see event::reweighting::ZBosonPt.
Warning
This function is based on workspaces and functions that were derived for the legacy \(H(\tau\tau)\) analysis and are therefore no longer up to date for UL or Run3.
- Parameters:
df – input dataframe
outputname – name of the output column containing the derived event weight
gen_boson – name of the column containing the Lorentz vector of the generator-level boson
workspace_file – path to the file which contains the workspace that should be used
functor_name – name of the function in the workspace that should be used
argset – additional arguments that are needed for the function
- Returns:
a new dataframe containing the new column
-
ROOT::RDF::RNode Pileup(ROOT::RDF::RNode df, correctionManager::CorrectionManager &correction_manager, const std::string &outputname, const std::string &true_pileup_number, const std::string &corr_file, const std::string &corr_name, const std::string &variation)
-
template<typename ...Args>