mRDF: extend the RDF object

class mkShapesRDF.processor.framework.mRDF.mRDF

Bases: object

Private version of RDataFrame that allows defining new columns, dropping columns, and using Vary together with Snapshot.

Methods

Copy()

Copy the mRDF object

Count()

Count the number of events

Define(a, b[, excludeVariations])

Define a new column; if the column already exists, redefine it.

DropColumns(pattern[, includeVariations])

Drop columns that match the given pattern

Filter(string)

Filter the mRDF. The filter is sensitive to variations through the use of CUT; see the Notes.

GetVariationsForCol(column)

Get the list of variations for a given column

GetVariedColumns(columns)

Given a list of columns, return the varied columns for all variations and tags

GetVariedColumns_oneVariation(columns, ...)

Given a list of columns, return the varied columns for a given variation and tag

Snapshot(*args, **kwargs)

Produce a Snapshot of the mRDF and return it

Sum(string)

Sum the values of a column

Vary(colName, expression[, variationTags, ...])

Define variations for an existing column, given an expression

readRDF(*ar, **kw)

Read the RDataFrame object and create the special column CUT used to hold the different Filters

setNode(dfNode, cols, cols_d, variations)

Set internal variables of an mRDF object to the provided ones

variationNaming(variationName, variationTag)

Naming convention for variations.

GetColumnNames

GetVariations

Redefine

__init__()

Initialize the mRDF object.

df

df stores the RDataFrame object

cols

cols stores the list of columns

cols_d

cols_d stores the list of columns that were dropped

variations

variations stores the list of variations

static variationNaming(variationName, variationTag, col='')

Naming convention for variations.

Given a variation name and a tag it will return variationName_variationTag. If a column name is provided, it will return col__variationName_variationTag.

Parameters:
variationName : str

Variation name

variationTag : str

Variation tag

col : str, optional, default: ""

Column name that has the variation

Returns:
str

formatted string
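The convention described above can be sketched as a standalone function. This is a minimal illustration, not the library's code: the name `variation_naming` and the sample column `jet_pt` are hypothetical, and only the two output formats are taken from the description.

```python
def variation_naming(variation_name: str, variation_tag: str, col: str = "") -> str:
    """Hypothetical re-implementation of the documented naming convention."""
    suffix = f"{variation_name}_{variation_tag}"
    # With a column name, the documented form is col__variationName_variationTag.
    return f"{col}__{suffix}" if col else suffix


name_only = variation_naming("JER_0", "up")
with_column = variation_naming("JER_0", "up", col="jet_pt")
```

Here `name_only` follows the `variationName_variationTag` form and `with_column` follows the `col__variationName_variationTag` form.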

setNode(dfNode, cols, cols_d, variations)

Set internal variables of an mRDF object to the provided ones

Parameters:
dfNode : ROOT.RDataFrame

The RDataFrame object

cols : list of str

The list of columns of the RDataFrame

cols_d : list of str

The list of columns that were dropped

variations : dict

The dict of variations

Returns:
mRDF

The mRDF object with the internal variables set to the provided ones

Copy()

Copy the mRDF object

Returns:
mRDF

The copy of the mRDF object

readRDF(*ar, **kw)

Read the RDataFrame object and create the special column CUT used to hold the different Filters

Parameters:
ar : list

The list of arguments to be passed to the RDataFrame constructor

kw : dict

The dictionary of keyword arguments to be passed to the RDataFrame constructor

Returns:
mRDF

The mRDF object with the new RDataFrame object stored

Define(a, b, excludeVariations=[])

Define a new column; if the column already exists, redefine it.

Parameters:
a : str

The name of the new column

b : str

The expression to be evaluated to define the new column

excludeVariations : list of str, optional, default: []

List of patterns of variations to exclude. If * is used, all variations will be excluded and the defined column will be nominal only.

Returns:
mRDF

The mRDF object with the new column defined

Notes

If excludeVariations is [], the define expression (b) will be checked for all possible variations. If variations of the define expression are found, they will be defined for the new column as well (i.e. varied b will be defined as variations of a).
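The propagation described in this note can be illustrated with a toy model in plain Python. It is only a sketch under stated assumptions: the layout of the `variations` dict (variation name mapped to `"tags"` and `"variables"`), the glob-style matching of exclusion patterns, and the function name `varied_definitions` are all hypothetical and not confirmed by this page.

```python
import fnmatch
import re


def varied_definitions(name, expression, variations, exclude_variations=()):
    """Toy model: map each varied column name to its varied expression.

    Assumption: `variations` looks like
    {variation_name: {"tags": [...], "variables": [...]}}.
    """
    result = {}
    for var_name, info in variations.items():
        # Skip variations matching an exclusion pattern ("*" excludes all).
        if any(fnmatch.fnmatch(var_name, pat) for pat in exclude_variations):
            continue
        # Columns appearing in the expression that carry this variation.
        used = [c for c in info["variables"]
                if re.search(rf"\b{re.escape(c)}\b", expression)]
        if not used:
            continue
        token = re.compile(r"\b(" + "|".join(map(re.escape, used)) + r")\b")
        for tag in info["tags"]:
            # Replace each varied column by its varied name in one pass.
            varied = token.sub(lambda m: f"{m.group(1)}__{var_name}_{tag}", expression)
            result[f"{name}__{var_name}_{tag}"] = varied
    return result


variations = {"JER_0": {"tags": ["down", "up"], "variables": ["jet_pt"]}}
propagated = varied_definitions("ht", "jet_pt + lep_pt", variations)
nominal_only = varied_definitions("ht", "jet_pt + lep_pt", variations, ["*"])
```

In this sketch `propagated` holds one varied expression per tag, while `nominal_only` is empty because `*` excludes every variation.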

Redefine(a, b)

Vary(colName, expression, variationTags=['down', 'up'], variationName='')

Define variations for an existing column, given an expression

Parameters:
colName : str

Nominal column name

expression : str

A valid C++ expression that defines the variations

variationTags : list, optional, default: ["down", "up"]

List of tags to be used for the variations (the list must have length 2)

variationName : str, optional, default: ""

Name of the variation

Returns:
mRDF

The mRDF object with the variations defined

Notes

Vary calls Define internally to define a temporary variable that contains the varied expression. Because Define also calls Vary internally, take care not to end up in an infinite loop.

References

See the official RDataFrame::Vary() documentation. It is not used here because it is not compatible with Snapshot.

Examples

When defining the same variation twice for different nominal variables, the tags must be the same (order does not matter)

>>> df = df.Vary("var", "var + 1", ["down", "up"], "var_JER_0")
>>> df = df.Vary("var2", "var2 + 2", [ "up", "down"], "var_JER_0")
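
The rule stated for this example (same tags, any order) amounts to an order-insensitive comparison. A minimal sketch follows; the helper `tags_compatible` is hypothetical and is not part of mRDF.

```python
def tags_compatible(existing_tags, new_tags):
    """Return True if both lists hold the same two tags, ignoring order."""
    # The documentation requires exactly two tags per variation.
    return len(new_tags) == 2 and sorted(existing_tags) == sorted(new_tags)


same_tags = tags_compatible(["down", "up"], ["up", "down"])
different_tags = tags_compatible(["down", "up"], ["up", "nominal"])
```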

Filter(string)

Filter the mRDF. The filter is sensitive to variations through the use of CUT; see the Notes.

Parameters:
string : str

The filter expression

Returns:
mRDF

The mRDF object with the filter applied

Notes

If the filter expression contains a variable for which variations are already defined, the CUT will be varied accordingly. Only events that pass at least one of the varied CUTs (or the nominal one) will be considered.
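The selection rule in this note can be pictured with a toy example in plain Python. It assumes, for illustration only, that varied cuts are stored under names such as `CUT__JER_0_up`; neither those names nor the helper `passes_any_cut` come from the library.

```python
def passes_any_cut(event: dict) -> bool:
    """Keep the event if the nominal CUT or any varied CUT is true."""
    return any(
        value
        for name, value in event.items()
        if name == "CUT" or name.startswith("CUT__")
    )


events = [
    {"CUT": True, "CUT__JER_0_down": True, "CUT__JER_0_up": True},
    {"CUT": False, "CUT__JER_0_down": False, "CUT__JER_0_up": True},
    {"CUT": False, "CUT__JER_0_down": False, "CUT__JER_0_up": False},
]
kept = [event for event in events if passes_any_cut(event)]
```

The second event fails the nominal cut but is still kept because one varied cut passes; only the third event is discarded.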

GetColumnNames()

GetVariations()

GetVariationsForCol(column)

Get the list of variations for a given column

Parameters:
column : str

The nominal column name

Returns:
list

list of all variations defined for the given column

GetVariedColumns_oneVariation(columns, variationName, tag)

Given a list of columns, return the varied columns for a given variation and tag

Parameters:
columns : list of str

List of columns to search variations for

variationName : str

The variation name

tag : str

The variation tag

Returns:
list of str

List of varied columns for the given variation and tag

GetVariedColumns(columns)

Given a list of columns, return the varied columns for all variations and tags

Parameters:
columns : list of str

List of columns to search variations for

Returns:
list of str

List of varied columns for all variations and tags
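The relation between the two lookups (one variation and tag versus all of them) can be sketched as follows. This is a toy model, not the library's code: the layout of the `variations` dict and both function names are assumptions made for illustration.

```python
def varied_columns_one_variation(columns, variations, variation_name, tag):
    """Varied names of the given columns for one variation and one tag."""
    # Assumption: each variation lists the nominal columns it affects.
    affected = variations[variation_name]["variables"]
    return [f"{c}__{variation_name}_{tag}" for c in columns if c in affected]


def varied_columns(columns, variations):
    """Varied names of the given columns for every variation and tag."""
    result = []
    for variation_name, info in variations.items():
        for tag in info["tags"]:
            result.extend(
                varied_columns_one_variation(columns, variations, variation_name, tag)
            )
    return result


variations = {"JER_0": {"tags": ["down", "up"], "variables": ["jet_pt"]}}
one = varied_columns_one_variation(["jet_pt", "lep_pt"], variations, "JER_0", "up")
everything = varied_columns(["jet_pt", "lep_pt"], variations)
```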

DropColumns(pattern, includeVariations=True)

Drop columns that match the given pattern

Parameters:
pattern : str

The pattern to be matched

includeVariations : bool, optional, default: True

Whether to include variations or not

Returns:
mRDF

The mRDF object with the columns dropped

Notes

The columns from self.cols matching the pattern will be dropped and added to self.cols_d. If includeVariations is True, the variations of the dropped columns will be dropped as well.
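The bookkeeping described in this note can be modelled with plain lists. The sketch below makes several assumptions that this page does not confirm: that the pattern is a regular expression matched against the full nominal name, that varied columns sit in the same list and carry the `col__` prefix, and that a helper like `drop_columns` exists at all.

```python
import re


def drop_columns(cols, cols_d, pattern, include_variations=True):
    """Toy model: return (kept columns, updated list of dropped columns)."""
    regex = re.compile(pattern)
    # Nominal columns (no "__" in the name) that match the pattern.
    dropped = [c for c in cols if "__" not in c and regex.fullmatch(c)]
    if include_variations:
        # Also drop varied columns whose nominal part was dropped.
        dropped += [c for c in cols if "__" in c and c.split("__", 1)[0] in dropped]
    kept = [c for c in cols if c not in dropped]
    return kept, cols_d + dropped


cols = ["jet_pt", "jet_pt__JER_0_up", "jet_pt__JER_0_down", "lep_pt"]
kept_all, dropped_all = drop_columns(cols, [], r"jet_.*")
kept_nominal, dropped_nominal = drop_columns(cols, [], r"jet_.*", include_variations=False)
```

With `include_variations=False` only the nominal `jet_pt` moves to the dropped list, and its varied columns stay in place.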

Count()

Count the number of events

Returns:
Proxy<Long64_t>

The number of events (need to apply GetValue() to get the actual value)

Sum(string)

Sum the values of a column

Parameters:
string : str

The column name

Returns:
Proxy<Float_t>

The sum of the values of the column (need to apply GetValue() to get the actual value)

Snapshot(*args, **kwargs)

Produce a Snapshot of the mRDF and return it

Parameters:
*args : list

List of arguments to be passed to the RDataFrame::Snapshot method

**kwargs : dict

Dictionary of keyword arguments to be passed to the RDataFrame::Snapshot method

Returns:
Snapshot or Proxy<Snapshot>

The Snapshot object, or a Proxy<Snapshot> if lazy=True is passed as a keyword argument