Advanced Usage
The Expression Parser
A user can define new observables for their analyses. These new observables must be arithmetic expressions of already defined observables and/or parameters. Observables defined in this way benefit from the same optimizations as all built-in observables, including multi-threaded evaluation and caching. This section describes and illustrates the syntax of the expression parser, which interprets user-defined Python strings as new EOS observables.
The Construction of Expressions
The Rules
The following basic rules are used to parse the input string:
Spaces are ignored and can be added arbitrarily for readability.
The parser supports the usual arithmetic operations +, -, *, / and ^ as well as parenthesized expressions; the usual precedence rules of arithmetic apply.
<<...>> encapsulates the name of an EOS object, which must be the name of either a parameter or an observable. Any such name must therefore adhere to the restrictions of eos.QualifiedName, for example <<mass::mu>> or <<B_u->lnu::BR>>.
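To make the role of the <<...>> delimiters concrete, the following minimal sketch collects the names referenced in an expression string with a regular expression. It is not the EOS parser and does not check the names against eos.QualifiedName; the helper name referenced_names is hypothetical and only serves this illustration.

```python
import re

# Hypothetical helper, not part of EOS: find everything enclosed in << and >>.
_NAME = re.compile(r"<<(.+?)>>")

def referenced_names(expression):
    """Return the names between << and >> in order of appearance."""
    return _NAME.findall(expression)

expr = "(<<mass::B_d>>^2 - 4 * <<mass::mu>>^2) ^ 0.5"
print(referenced_names(expr))
```

The non-greedy pattern stops at the first >>, so a single > inside a name such as B_u->lnu::BR does not end the match.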
The following strings are valid observable expressions:
"(<<mass::B_d>>^2 - 4 * <<mass::mu>>^2) ^ 0.5"
"1.0 / <<B_u->lnu::BR;l=mu>>"
Aliasing
By default, the kinematic arguments of the observables are transferred to the expression.
For example, the expression 1.0 / <<B->pilnu::dBR/dq2>> will expect a kinematic specification for q2 (either through an eos.Kinematics object or indirectly in a plotting routine).
When more than one observable appears in the expression, it is useful to rename the kinematic variables.
This can be done via an alias specification <<...>>[...].
Two types of specification are supported:
The = operator fixes a kinematic variable to a given value. E.g. <<B->pilnu::BR>>[q2_min=0.1] only expects a specification for q2_max.
The => operator renames the kinematic variable on its left-hand side to the name on the right-hand side. E.g. <<B_u->lnu::BR;l=mu>>[q2=>q2_mu] / <<B_u->lnu::BR;l=e>>[q2=>q2_e] requires two kinematic specifications, q2_mu and q2_e.
Note that these specifications can be combined in a comma-separated list, for example [q2_min=1.0, q2_max=>q2_mu].
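The two alias operators can be told apart mechanically, as the following sketch shows. It is a plain-Python illustration under the rules stated above, not the code EOS uses internally; the function name parse_alias_spec is hypothetical, and the sketch assumes that fixed values are plain numbers.

```python
def parse_alias_spec(spec):
    """Split '[a=1.0, b=>c]' into (fixed, renamed) dictionaries."""
    fixed, renamed = {}, {}
    body = spec.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("alias specification must be enclosed in [...]")
    for item in body[1:-1].split(","):
        item = item.replace(" ", "")      # spaces carry no meaning
        if not item:
            continue
        if "=>" in item:                  # test '=>' first, since it contains '='
            old, new = item.split("=>", 1)
            renamed[old] = new
        elif "=" in item:
            name, value = item.split("=", 1)
            fixed[name] = float(value)    # assumption: values are numeric
        else:
            raise ValueError(f"cannot interpret {item!r}")
    return fixed, renamed

print(parse_alias_spec("[q2_min=1.0, q2_max=>q2_mu]"))
```

Variables in the first dictionary no longer need a kinematic specification; variables in the second must be supplied under their new names.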
The Insert Method
Once a new observable is defined via its expression string, it can be added to the list of observables via the insert method
eos.Observables().insert(name, latex, unit, options, expression)
where name, latex and unit are the qualified name, the LaTeX representation and the unit of the new observable; options takes an eos.Options object and allows specifying global options (i.e. options applied to all observables in the expression); and expression is the expression string to be parsed.
We conclude with a concrete example:
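import eos
# Register the ratio of two branching ratios; q2_max is fixed to 6 and q2_min is renamed per lepton flavour.
eos.Observables().insert('B->Kll::R_K_example', R'(R_K)', eos.Unit.Unity(), eos.Options(),
    '( <<B->Kll::BR;l=mu>>[q2_max=6, q2_min=>q2_mu_min] / <<B->Kll::BR;l=e>>[q2_max=6,q2_min=>q2_e_min] )')
# Create the new observable; the kinematics use the renamed variables q2_e_min and q2_mu_min.
R_K = eos.Observable.make('B->Kll::R_K_example', eos.Parameters.Defaults(), eos.Kinematics(q2_e_min=1.1, q2_mu_min=1.1), eos.Options(**{'tag':'BFS2004'}))
R_K.evaluate() # should be ~1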
The EOS Command-Line Interface
Although using EOS within an interactive Jupyter notebook on your personal computer or laptop is useful to prototype an analysis, this approach sometimes suffers from limited computing power. To circumvent this problem, you can alternatively
use EOS in Jupyter interactively on a remote workstation computer via an SSH tunnel (see the FAQ);
use EOS on remote workstations or compute clusters via the command-line interface.
In the following we document the command-line interface and the file format used in conjunction with it.
Note
The EOS command-line interface is completely optional and does not provide any functionality beyond that of the interactive Python interface.
The Analysis Description Format
EOS uses a YAML file to describe the individual steps of one or more statistical analyses. At the top level, the format includes the following YAML keys:
priors
(mandatory) — The list of priors within the analysis.
likelihoods
(mandatory) — The list of likelihoods within the analysis.
posteriors
(mandatory) — The list of posteriors within the analysis.
predictions
(optional) — The list of theory predictions within the analysis.
Describing Priors
The priors
key contains a list of named priors. Each prior has two mandatory keys:
name
(mandatory) — The unique name of this prior.
parameters
(mandatory) — The ordered list of parameters described by this prior.
The description of each individual parameter follows the prior description used in the Analysis constructor.
Describing Likelihoods
The likelihoods
key contains a list of named likelihoods. Each likelihood has two mandatory keys:
name
(mandatory) — The unique name of this likelihood.
constraints
(mandatory) — The ordered list of EOS constraint names that comprise this likelihood.
Describing Posteriors
The posteriors
key contains a list of named posteriors. Each posterior can contain several keys:
name
(mandatory) — The unique name of this posterior.
global_options
(optional) — A key/value map providing global options, i.e., options that apply to all observables used by this posterior.
prior
(mandatory) — The ordered list of named priors that are used as part of this posterior.
likelihood
(optional) — The ordered list of named likelihoods that are used as part of this posterior.
fixed_parameter
(optional) — A key/value map providing values for parameters that deviate from the default values.
Example
Example examples/cli/btopilnu.analysis
priors:
  - name: CKM
    parameters:
      - parameter: CKM::abs(V_ub)
        min: 2.0e-3
        max: 5.0e-3
        type: uniform
  - name: FF-BCL2008
    parameters:
      - parameter: B->pi::f_+(0)@BCL2008
        min: 0.2
        max: 0.4
        type: uniform
      - parameter: B->pi::b_+^1@BCL2008
        min: -20.0
        max: +20.0
        type: uniform
      - parameter: B->pi::b_+^2@BCL2008
        min: -20.0
        max: +20.0
        type: uniform
likelihoods:
  - name: theory
    constraints:
      - B->pi::f_+@IKMvD:2014A
  - name: BaBar
    constraints:
      - B^0->pi^+lnu::BR@BaBar:2010B
      - B^0->pi^+lnu::BR@BaBar:2012D
  - name: Belle
    constraints:
      - B^0->pi^+lnu::BR@Belle:2010A
      - B^0->pi^+lnu::BR@Belle:2013A
posteriors:
  - name: th+exp
    global_options:
      model: CKM
      form-factors: BCL2008
    prior:
      - CKM
      - FF-BCL2008
    likelihood:
      - theory
      - BaBar
      - Belle
predictions:
  - name: differential
    global_options:
      model: CKM
      form-factors: BCL2008
      l: e
    observables:
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 0.05 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 0.10 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 0.25 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 0.50 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 0.75 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 1.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 1.50 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 2.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 2.50 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 3.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 3.50 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 4.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 6.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 8.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 10.00 }
      - name: B->pilnu::dBR/dq2
        kinematics: { q2: 12.00 }
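Because posteriors refer to priors and likelihoods by name, a description can be checked for dangling references before any sampling starts. The following sketch performs such a check on a description that has already been loaded into a Python dictionary (for instance with a YAML library, which is an assumption here and not shown). The function check_analysis is hypothetical and not part of EOS.

```python
def check_analysis(description):
    """Return a list of problems found in an analysis description (a dict)."""
    problems = []
    for key in ("priors", "likelihoods", "posteriors"):
        if key not in description:
            problems.append(f"missing mandatory key '{key}'")
    prior_names = {p["name"] for p in description.get("priors", [])}
    likelihood_names = {l["name"] for l in description.get("likelihoods", [])}
    for posterior in description.get("posteriors", []):
        label = posterior.get("name", "<unnamed>")
        for name in posterior.get("prior", []):
            if name not in prior_names:
                problems.append(f"posterior '{label}': unknown prior '{name}'")
        for name in posterior.get("likelihood", []):
            if name not in likelihood_names:
                problems.append(f"posterior '{label}': unknown likelihood '{name}'")
    return problems

# Abbreviated form of the example file, with parameters and constraints left out.
analysis = {
    "priors": [{"name": "CKM"}, {"name": "FF-BCL2008"}],
    "likelihoods": [{"name": "theory"}, {"name": "BaBar"}, {"name": "Belle"}],
    "posteriors": [{"name": "th+exp",
                    "prior": ["CKM", "FF-BCL2008"],
                    "likelihood": ["theory", "BaBar", "Belle"]}],
}
print(check_analysis(analysis))
```

An empty list means that every named prior and likelihood used by a posterior is defined in the same description.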
The Command-Line Interface
The eos-analysis script provides several subcommands that
inspect the analysis file;
sample from a posterior density with Monte Carlo methods;
perform auxiliary tasks on intermediate results.
The output of these commands is stored on disk as directories filled with YAML files (for descriptions and small numerical datasets) and NumPy datafiles (for samples).
The datafiles can be accessed with the classes documented as part of the eos.data module.
usage: eos-analysis [-h] [-v] [-f ANALYSIS_FILE]
{list-priors,list-likelihoods,list-posteriors,list-predictions,sample-mcmc,sample-pmc,plot-samples,find-mode,find-clusters,predict-observables,run}
...
Named Arguments
- -v, --verbose
Increases the verbosity of the script
- -f, --analysis-file
The analysis file. Defaults to ‘.analysis.yaml’.
Sub-commands:
list-priors
Lists the named prior PDFs defined within the scope of this analysis file.
eos-analysis list-priors [-h]
list-likelihoods
Lists the named likelihoods defined within the scope of this analysis file.
eos-analysis list-likelihoods [-h] [-d]
Named Arguments
- -d, --display-details
Whether to display further details for each likelihood.
list-posteriors
Lists the named posterior PDFs defined within the scope of this analysis file.
eos-analysis list-posteriors [-h]
list-predictions
Lists the named prediction sets defined within the scope of this analysis file.
eos-analysis list-predictions [-h]
sample-mcmc
Samples from a named posterior PDF using Markov Chain Monte Carlo (MCMC) methods.
The output file will be stored in EOS_BASE_DIRECTORY/POSTERIOR/mcmc-IDX.
eos-analysis sample-mcmc [-h] [-N N] [-S STRIDE] [-p PRERUNS] [-n PRE_N]
[-b BASE_DIRECTORY]
POSTERIOR CHAIN-IDX
Positional Arguments
- POSTERIOR
The name of the posterior PDF from which to draw the samples.
- CHAIN-IDX
The index assigned to the Markov chain. This value is used to seed the RNG for a reproducible analysis.
Named Arguments
- -N, --number-of-samples
The number of samples to be stored in the output file.
- -S, --stride
The ratio of samples drawn over samples stored. For every S samples, S - 1 will be discarded.
- -p, --number-of-preruns
The number of prerun steps, which are used to adapt the MCMC proposal to the posterior.
- -n, --number-of-prerun-samples
The number of samples to be used for an adaptation in each prerun step. These samples will be discarded.
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
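Since each chain index seeds its own random number generator, several chains can be produced by invoking sample-mcmc once per index. The sketch below only assembles the argument lists from the options documented above and does not run anything. The helper names, the file name, the base directory and the choice to start counting chains at 0 are assumptions made for illustration.

```python
def mcmc_commands(analysis_file, posterior, n_chains, n_samples, stride, base_directory):
    """Build one 'eos-analysis sample-mcmc' argument list per chain index."""
    commands = []
    for idx in range(n_chains):           # assumption: chain indices start at 0
        commands.append([
            "eos-analysis", "-f", analysis_file,
            "sample-mcmc",
            "-N", str(n_samples),
            "-S", str(stride),
            "-b", base_directory,
            posterior, str(idx),
        ])
    return commands

def samples_drawn(n_samples, stride):
    """Samples drawn per chain after the preruns: only one in 'stride' is stored."""
    return n_samples * stride

for command in mcmc_commands("btopilnu.analysis", "th+exp", 2, 5000, 5, "./data"):
    print(" ".join(command))
print(samples_drawn(5000, 5))
```

Each list can be passed to subprocess.run or submitted as a separate cluster job; the chains are independent of each other.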
sample-pmc
Samples from a named posterior using Population Monte Carlo (PMC) methods.
The results of the find-clusters command are expected in EOS_BASE_DIRECTORY/POSTERIOR/clusters. The output file will be stored in EOS_BASE_DIRECTORY/POSTERIOR/pmc.
eos-analysis sample-pmc [-h] [-n STEP_N] [-s STEPS] [-t PERPLEXITY_THRESHOLD]
[-N FINAL_N] [-c] [-b BASE_DIRECTORY]
POSTERIOR
Positional Arguments
- POSTERIOR
The name of the posterior PDF from which to draw the samples.
Named Arguments
- -n, --number-of-adaptation-samples
The number of samples to be used in each adaptation step. These samples will be discarded.
- -s, --number-of-adaptation-steps
The number of adaptation steps, which are used to adapt the PMC proposal to the posterior.
- -t, --perplexity-threshold
The threshold for the perplexity in the last step after which further adaptation steps are to be skipped.
- -N, --number-of-final-samples
The number of samples to be stored in the output file.
- -c, --continue-sampling
Whether to continue sampling from the previous sample-pmc results, or start fresh from the proposal obtained using find-clusters.
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
plot-samples
Plots all samples obtained for a named posterior.
The results of either the sample-mcmc or the sample-pmc command are expected in EOS_BASE_DIRECTORY/POSTERIOR/mcmc-* or EOS_BASE_DIRECTORY/POSTERIOR/pmc, respectively. The plots will be stored as PDF files within the respective sample inputs.
eos-analysis plot-samples [-h] [-B BINS] [-b BASE_DIRECTORY] POSTERIOR
Positional Arguments
- POSTERIOR
The name of the posterior PDF from which to draw the samples.
Named Arguments
- -B, --bins
The number of bins per histogram.
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
find-mode
Finds the mode of the named posterior.
The optimization process can be initialized either with a provided parameter point, or by extracting the point with the largest posterior from among previously obtained MCMC samples. The output will be stored in EOS_BASE_DIRECTORY/POSTERIOR/mode.
eos-analysis find-mode [-h] [-p POINTS] [-i INIT_FILE] [--from-point POINT]
[--use-random-seed SEED] [-b BASE_DIRECTORY]
POSTERIOR
Positional Arguments
- POSTERIOR
The name of the posterior PDF that will be maximized.
Named Arguments
- -p, --starting-points
The number of parameter points from which maximization is started.
- -i, --init-from-file
The name of an MCMC data file from which the maximization is started.
- --from-point
The point from which the minimization is started.
- --use-random-seed
The seed used to generate the random starting point of the minimization.
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
find-clusters
Finds clusters among posterior MCMC samples, grouped by Gelman-Rubin R value, and creates a Gaussian mixture density.
Finding clusters and creating a Gaussian mixture density is a necessary intermediate step before using the sample-pmc subcommand. The input files are expected in EOS_BASE_DIRECTORY/POSTERIOR/mcmc-*. All MCMC input files present will be used in the clustering. The output files will be stored in EOS_BASE_DIRECTORY/POSTERIOR/clusters.
eos-analysis find-clusters [-h] [-t THRESHOLD] [-c K_G] [-b BASE_DIRECTORY]
POSTERIOR
Positional Arguments
- POSTERIOR
The name of the posterior PDF from which MCMC samples have previously been drawn.
Named Arguments
- -t, --threshold
The R value threshold. If two sample subsets have an R value larger than this threshold, they will be treated as two distinct clusters. (default: 2.0)
- -c, --clusters-per-group
The number of mixture components per cluster. (default: 1)
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
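To give a feeling for the R value threshold, the sketch below computes the textbook Gelman-Rubin statistic for one parameter from several chains of equal length. This is the standard definition; the exact variant that EOS applies when grouping chains may differ, so the numbers are illustrative only. The function name gelman_rubin_r is hypothetical.

```python
from statistics import mean, variance

def gelman_rubin_r(chains):
    """Textbook Gelman-Rubin R value for equally long chains of one parameter."""
    n = len(chains[0])
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same length")
    # Mean of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    # Sample variance of the chain means (between-chain variance divided by n).
    between_over_n = variance([mean(chain) for chain in chains])
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5

overlapping = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
separated = [[1, 2, 3, 4, 5], [101, 102, 103, 104, 105]]
print(gelman_rubin_r(overlapping), gelman_rubin_r(separated))
```

Chains that explore the same region give a value close to 1, whereas chains stuck in well-separated regions give a value far above the default threshold of 2.0 and would be treated as distinct clusters.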
predict-observables
Predicts a set of observables based on previously obtained PMC samples.
The input files are expected in EOS_BASE_DIRECTORY/POSTERIOR/pmc. The output files will be stored in EOS_BASE_DIRECTORY/POSTERIOR/pred-PREDICTION.
eos-analysis predict-observables [-h] [-B BEGIN] [-E END] [-b BASE_DIRECTORY]
POSTERIOR PREDICTION
Positional Arguments
- POSTERIOR
The name of the posterior PDF from which to draw the samples.
- PREDICTION
The name of the set of observables to predict.
Named Arguments
- -B, --begin-index
The index of the first sample to use for the predictions.
- -E, --end-index
The index beyond the last sample to use for the predictions.
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
run
Runs a list of subcommands.
eos-analysis run [-h] [-b BASE_DIRECTORY]
Named Arguments
- -b, --base-directory
The base directory for the storage of data files. Can also be set via the EOS_BASE_DIRECTORY environment variable.
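The input and output locations documented for the individual subcommands imply an order in which they are typically run: sample-mcmc, then find-clusters, then sample-pmc, then predict-observables. The following sketch spells out that order and the corresponding output directories. The helper output_directory and the base directory data are hypothetical and not part of EOS.

```python
from pathlib import PurePosixPath

# Order implied by the documented inputs: each step reads what the previous one wrote.
PIPELINE = ("sample-mcmc", "find-clusters", "sample-pmc", "predict-observables")

def output_directory(base_directory, posterior, subcommand, label=None):
    """Return the documented output location of a subcommand (hypothetical helper)."""
    root = PurePosixPath(base_directory) / posterior
    if subcommand == "sample-mcmc":
        return root / f"mcmc-{label}"     # label is the chain index
    if subcommand == "find-clusters":
        return root / "clusters"
    if subcommand == "sample-pmc":
        return root / "pmc"
    if subcommand == "predict-observables":
        return root / f"pred-{label}"     # label is the prediction name
    raise ValueError(f"no documented output directory for {subcommand!r}")

for step, label in zip(PIPELINE, (0, None, None, "differential")):
    print(step, "->", output_directory("data", "th+exp", step, label))
```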