API Reference¶
Model¶
- class hboms.model.HbomsModel(name: str, state: list[Variable], odes: str, init: str, params: list[Parameter], obs: list[Observation], dists: list[StanDist], trans_state: list[Variable] | None = None, transform: str | None = None, covariates: list[Covariate] | None = None, correlations: list[Correlation] | None = None, plugin_code: dict[str, str] | None = None, options: dict | None = None, compile_model: bool = True, optimize_code: bool = True, model_dir: str | None = None)¶
Bases: object
Object that represents an HBOMS model.
This serves as an interface to the actual Stan model.
- property fit: CmdStanMCMC | CmdStanVB | CmdStanPathfinder¶
Get access to the fitted Stan model. Make sure to call the method sample, variational or pathfinder first.
- Returns:
The fitted Stan model.
- Return type:
CmdStanMCMC | CmdStanVB | CmdStanPathfinder
- get_simulation_times(data: dict, n_sim: int | None = None) list[list[float]]¶
Make a list of time points used for simulating trajectories of the model. This is convenient for e.g. plotting the model fits.
- Parameters:
data (dict) – data dictionary. Must contain the key “Time”
n_sim (Optional[int]) – number of simulation time points. If not specified and the model has been fit, this number is inferred from the chain. The default is None.
- Returns:
SimTime – simulation time points for each unit
- Return type:
list[list[float]]
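To illustrate the shape of the return value, the following minimal sketch builds one evenly spaced grid of n_sim points per unit. It assumes that data["Time"] holds one list of observation times per unit and that each grid runs from 0 to the unit's last observation time. Both are assumptions made for this example, and make_sim_times is a hypothetical helper, not the HBOMS implementation.

```python
def make_sim_times(data: dict, n_sim: int = 100) -> list[list[float]]:
    """Hypothetical helper: one evenly spaced time grid per unit."""
    if n_sim < 2:
        raise ValueError("n_sim must be at least 2")
    sim_times = []
    for unit_times in data["Time"]:
        t_end = max(unit_times)          # assumed end point: last observation
        step = t_end / (n_sim - 1)       # assumed start point: time 0
        sim_times.append([i * step for i in range(n_sim)])
    return sim_times


# Two units with different observation times (made-up values)
data = {"Time": [[1.0, 2.0, 4.0], [0.5, 3.0]]}
grids = make_sim_times(data, n_sim=5)
```

Each inner list belongs to one unit, so the grids can be passed unit by unit to a plotting routine.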
- init_check(data: dict, state_var_names: list[str] | None = None, obs_names: list[str] | None = None, n_sim: int = 100, **kwargs) Figure¶
Check the initial parameter guess. Plot the given data alongside the trajectories based on the initial parameters.
- Parameters:
data (dict) – A dictionary with the data. Must contain fields “Time” and fields for the observations.
state_var_names (list[str] | None, optional) – Choose which trajectories to plot. If None, then plot all trajectories (both state and transformed state). The default is None.
obs_names (list[str] | None, optional) – Choose which observations to plot. If None, then plot all observations. The default is None.
n_sim (int, optional) – Number of time points used for the simulations. The default is 100.
**kwargs – Additional arguments passed to matplotlib.
- Returns:
fig – a figure with one panel per unit, showing data and trajectories.
- Return type:
matplotlib.pyplot.Figure
- property model_code: str¶
Return the Stan model code as a string.
- pathfinder(data: dict, n_sim: int = 100, **kwargs) None¶
- post_pred_check(data: dict, state_var_names: list[str] | None = None, obs_names: list[str] | None = None, **kwargs) Figure¶
Plot the estimated trajectories together with the data.
- property prior_fit: CmdStanMCMC¶
Get access to the prior predictive fit of the Stan model. Make sure to call the method sample_from_prior first. Despite the name, no data is fitted here: the result holds draws from the prior only.
- Returns:
The prior predictive fit of the Stan model.
- Return type:
CmdStanMCMC
- sample(data: dict, n_sim: int = 100, **kwargs) None¶
Generate posterior samples from the Stan model using HMC.
- sample_from_prior(data: dict, n_sim: int = 100, compile_prior_sampler: bool = True, **kwargs) None¶
Sample from the prior distribution of the model. This is done by compiling a model that ignores the observations and only simulates from the prior. To facilitate efficient sampling from the prior, we force all random parameters to be non-centered.
- Parameters:
data (dict) – A dictionary with the data. Must contain at least a field “Time” and fields for the observations and covariates.
n_sim (int, optional) – The number of time points for the simulation. The default is 100.
compile_prior_sampler (bool, optional) – If False, don’t compile the prior sampling code. This is used for debugging. The default is True.
**kwargs – Additional arguments passed to cmdstanpy.
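The general idea behind a non-centered parameterization can be sketched in a few lines: instead of drawing a random parameter directly from Normal(loc, scale), draw a standard-normal auxiliary variable and shift and rescale it. The sketch below only illustrates this idea with Python's standard library. The function name and the numeric values are hypothetical, and whether HBOMS applies the transform on a transformed scale for bounded parameters is not stated here.

```python
import random


def noncentered_draw(loc: float, scale: float, rng: random.Random) -> float:
    """Draw theta ~ Normal(loc, scale) via a standard-normal auxiliary variable."""
    z = rng.gauss(0.0, 1.0)   # z ~ Normal(0, 1), independent of loc and scale
    return loc + scale * z    # shifting and rescaling gives Normal(loc, scale)


rng = random.Random(1)
draws = [noncentered_draw(0.1, 0.05, rng) for _ in range(10_000)]
mean = sum(draws) / len(draws)
sd = (sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)) ** 0.5
```

Because z does not depend on loc or scale, the sampler explores a geometry that is independent of the population-level parameters, which is the usual motivation for this parameterization.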
- set_init(init_dict: dict[str, Any]) None¶
Set initial parameter guesses to provided values. Values for transformed parameters are not accepted, as these are updated automatically based on the regular parameters on which they depend. If the user passes a value for a transformed parameter, a warning is shown and the value is ignored.
- Parameters:
init_dict (dict[str, Any]) – dictionary of initial parameter values.
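The documented behavior (accept regular parameters, warn about and ignore transformed ones) can be mimicked as follows. This is a stand-alone sketch: filter_init, the parameter names "k" and "k_sq", and the warning text are all hypothetical and do not come from HBOMS.

```python
import warnings


def filter_init(init_dict: dict, transformed_names: set) -> dict:
    """Hypothetical helper: drop transformed parameters, warning for each one."""
    accepted = {}
    for name, value in init_dict.items():
        if name in transformed_names:
            warnings.warn(f"ignoring initial value for transformed parameter '{name}'")
            continue
        accepted[name] = value
    return accepted


# Record warnings so that the example can inspect them
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    accepted = filter_init({"k": 0.2, "k_sq": 0.04}, transformed_names={"k_sq"})
```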
- simulate(data: dict, num_simulations: int, compile_simulator: bool = True, output_dir: str | None = None, seed: int | None = None) list[tuple[dict, dict]]¶
Simulate data using the model. Returns a list of pairs of simulated data sets and corresponding random parameter draws. The simulated data sets can be used as-is to fit the model.
- Parameters:
data (dict) – dataset used for simulation. Requires at least a field “Time”, but for more complex models more data is needed (e.g. covariates).
num_simulations (int) – Determines the number of simulated datasets that will be returned. Should be at least 1.
compile_simulator (bool, optional) – If False, don’t compile the simulator code. This is used for debugging. The default is True.
output_dir (Optional[str], optional) – Determines the directory where the Stan files are stored. If None, use a temporary folder. The default is None.
seed (Optional[int], optional) – Seed passed to cmdstan to ensure reproducibility. If None, a random seed is used. The default is None.
- Returns:
A list of pairs. Each pair contains a data set and the random parameters used to create the data set.
- Return type:
list[tuple[dict, dict]]
- property simulator_code: str | None¶
Return the Stan simulator code as a string.
- stan_data_and_init(data: dict, n_sim: int = 100) tuple[dict, dict]¶
Make a dictionary required to fit the model. The input is the data required for the model. This method adds a number of required items, such as constants, the number of units, and the simulation time points. The method also returns a dictionary with initial parameter values.
- Parameters:
data (dict) – A dictionary with the data. Must contain at least a field “Time” and fields for the observations.
n_sim (int, optional) – The number of time points for the simulation. This is used to determine the number of time points for the ODE system. The default is 100.
- Returns:
A tuple with two dictionaries. The first dictionary contains the data required for the Stan model. The second dictionary contains the initial parameter values for the Stan model.
- Return type:
tuple[dict, dict]
- variational(data: dict, n_sim: int = 100, **kwargs) None¶
Model Ingredients¶
- class hboms.frontend.Variable(name: str, dim: int | None = None)¶
Representation of a model variable (state or transformed state). This can be a scalar or vector-valued variable, but is always real-valued, or has real-valued components.
- name¶
Name of the variable.
- Type:
str
- dim¶
Dimension of the variable (for vector-valued variables).
- Type:
int, optional
Examples
A state variable “S” representing susceptible individuals:
var = Variable(name="S")
A vector-valued variable “velocity” with dimension 3:
var = Variable(name="velocity", dim=3)
- class hboms.frontend.Parameter(name: str, value: float | list[float], par_type: str, scale: float | None = None, covariates: list[str] | None = None, cw_values: dict[str, float | list[float]] | None = None, space: str | None = None, lbound: float | None = 0.0, ubound: float | None = None, prior: StanPrior | None = None, loc_prior: StanPrior | None = None, scale_prior: StanPrior | None = None, level: str | None = None, level_type: str | None = None, level_scale: float | None = None, level_scale_prior: StanPrior | None = None, noncentered: bool | None = None)¶
Representation of a model parameter.
- name¶
Name of the parameter.
- Type:
str
- value¶
Initial value(s) of the parameter.
- Type:
float or list[float]
- par_type¶
Type of the parameter (e.g., "fixed", "random").
- Type:
str
- scale¶
Initial scale of the parameter (for random parameters).
- Type:
float, optional
- covariates¶
List of covariate names affecting the parameter.
- Type:
list[str], optional
- cw_values¶
Covariate-wise initial values for the parameter.
- Type:
dict[str, float or list[float]], optional
- space¶
Parameter space (e.g., "real", "vector").
- Type:
str, optional
- lbound¶
Lower bound of the parameter.
- Type:
float, optional
- ubound¶
Upper bound of the parameter.
- Type:
float, optional
- level¶
Hierarchical level of the parameter (for hierarchical parameters).
- Type:
str, optional
- level_type¶
Type of the hierarchical level (e.g., "fixed", "random").
- Type:
str, optional
- level_scale¶
Initial scale of the hierarchical level (for random levels).
- Type:
float, optional
- level_scale_prior¶
Prior for the scale of the hierarchical level (for random levels).
- Type:
StanPrior, optional
- noncentered¶
Whether to use non-centered parameterization (for random levels).
- Type:
bool, optional
Examples
A random parameter “k” with initial value 0.1, scale 0.05, lower bound 0.0, upper bound 1.0, affected by covariates “age” and “weight”, with a normal prior on its location and an exponential prior on its scale:
prior_loc = StanPrior(name="normal", params=[0.0, 1.0])
prior_scale = StanPrior(name="exponential", params=[1.0])
param_k = Parameter(
    name="k",
    value=0.1,
    par_type="random",
    scale=0.05,
    covariates=["age", "weight"],
    lbound=0.0,
    ubound=1.0,
    loc_prior=prior_loc,
    scale_prior=prior_scale
)
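For background on what bounds mean on the Stan side: Stan samples bounded scalars on an unconstrained scale, using a log transform for one-sided bounds and a scaled logit transform for two-sided bounds. The sketch below implements these documented Stan transforms in plain Python. The function names are hypothetical, and how HBOMS itself uses lbound and ubound internally is not specified in this reference.

```python
import math


def to_unconstrained(x, lbound=None, ubound=None):
    """Map a bounded scalar to the real line (Stan-style transforms)."""
    if lbound is not None and ubound is not None:
        u = (x - lbound) / (ubound - lbound)
        return math.log(u / (1.0 - u))      # logit of the rescaled value
    if lbound is not None:
        return math.log(x - lbound)
    if ubound is not None:
        return math.log(ubound - x)
    return x


def to_constrained(y, lbound=None, ubound=None):
    """Inverse of to_unconstrained."""
    if lbound is not None and ubound is not None:
        return lbound + (ubound - lbound) / (1.0 + math.exp(-y))
    if lbound is not None:
        return lbound + math.exp(y)
    if ubound is not None:
        return ubound - math.exp(y)
    return y


# Round trip for the example parameter k = 0.1 with bounds [0, 1]
y = to_unconstrained(0.1, lbound=0.0, ubound=1.0)
k = to_constrained(y, lbound=0.0, ubound=1.0)
```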
- class hboms.frontend.StanPrior(name: str, params: list[float])¶
Representation of a prior distribution in Stan.
Note that the name must correspond to a valid Stan distribution, or to a user-defined distribution in the Stan model code.
- name¶
Name of the Stan distribution.
- Type:
str
- params¶
Parameters of the distribution.
- Type:
list[float]
Examples
Standard normal prior:
prior = StanPrior(name="normal", params=[0, 1])
Exponential prior with rate 2:
prior = StanPrior(name="exponential", params=[2])
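Since the name must be a valid Stan distribution, a prior of this form corresponds to a Stan sampling statement such as k ~ normal(0, 1);. The following sketch shows that correspondence with a small string formatter. render_prior and the target name "k" are hypothetical, and the code HBOMS actually generates may look different.

```python
def render_prior(target: str, name: str, params: list) -> str:
    """Hypothetical helper: format a Stan sampling statement for a prior."""
    args = ", ".join(str(p) for p in params)
    return f"{target} ~ {name}({args});"


standard_normal = render_prior("k", "normal", [0, 1])
exponential = render_prior("k", "exponential", [2])
```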
- class hboms.frontend.DiracDeltaPrior(param: str | float)¶
Representation of a Dirac delta prior (point mass) for a parameter.
This can be used to fix e.g. a location or scale parameter to a specific value: possibly the location or scale of another parameter.
- param¶
The fixed value or the name of the parameter to which this prior is anchored.
- Type:
str or float
Examples
Fixing a parameter to a constant value:
prior = DiracDeltaPrior(param=5.0)
Anchoring a parameter to another parameter named “alpha”:
prior = DiracDeltaPrior(param="alpha")
Let a be a random parameter with scale scale_a, and fix b’s scale to a’s scale:
param_a = Parameter(name="a", value=0.0, par_type="random")
ddp = DiracDeltaPrior(param="scale_a")
param_b = Parameter(name="b", value=1.0, par_type="random", scale_prior=ddp)
- class hboms.frontend.Correlation(params: list[str], value: ndarray | None = None, intensity: float | None = None)¶
Representation of a correlation structure among parameters. The intensity attribute can be used to specify the a priori strength of the correlation. The intensity is used in the prior definition for the correlation matrix, e.g., in an LKJ prior.
- params¶
List of parameter names involved in the correlation.
- Type:
list[str]
- value¶
Initial correlation matrix. If None, the identity matrix is used.
- Type:
np.ndarray, optional
- intensity¶
Intensity of the correlation structure (used for the prior).
- Type:
float, optional
Examples
A correlation structure among parameters “alpha”, “beta”, and “gamma” with a specified correlation matrix:
import numpy as np

corr = Correlation(
    params=["alpha", "beta", "gamma"],
    value=np.array([
        [1.0, 0.5, 0.3],
        [0.5, 1.0, 0.2],
        [0.3, 0.2, 1.0]
    ]),
    intensity=1.0
)
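An initial correlation matrix must be symmetric, have a unit diagonal, and be positive definite. The sketch below checks the example matrix with a plain-Python Cholesky factorization and evaluates the unnormalized LKJ log density, (eta - 1) * log det(R). Treating intensity as the LKJ shape parameter eta is an assumption made for this example, and the helper names are hypothetical.

```python
import math


def cholesky(m):
    """Lower-triangular Cholesky factor; raises if m is not positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = m[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def lkj_log_density(r, eta):
    """Unnormalized LKJ log density: (eta - 1) * log det(r)."""
    n = len(r)
    for i in range(n):
        if r[i][i] != 1.0:
            raise ValueError("diagonal entries must be 1")
        for j in range(i):
            if r[i][j] != r[j][i]:
                raise ValueError("matrix must be symmetric")
    low = cholesky(r)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return (eta - 1.0) * log_det


r = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.2], [0.3, 0.2, 1.0]]
flat = lkj_log_density(r, eta=1.0)     # eta = 1 is uniform over correlation matrices
peaked = lkj_log_density(r, eta=2.0)   # eta > 1 favors matrices near the identity
```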
- class hboms.frontend.Covariate(name: str, cov_type: str = 'cont', categories: list[str] | None = None, dim: int | None = None)¶
Representation of a covariate variable.
- name¶
Name of the covariate.
- Type:
str
- cov_type¶
Type of the covariate ("cont" for continuous, "cat" for categorical).
- Type:
str
- categories¶
Categories for categorical covariates.
- Type:
list[str], optional
- dim¶
Dimension of the covariate (for vector-valued covariates).
- Type:
int, optional
Examples
A continuous covariate “age”:
cov = Covariate(name="age", cov_type="cont")
A categorical covariate “treatment” with categories “placebo” and “drug”:
cov = Covariate(name="treatment", cov_type="cat", categories=["placebo", "drug"])
A vector-valued covariate “biomarkers” with dimension 5:
cov = Covariate(name="biomarkers", cov_type="cont", dim=5)
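Stan has no string type for data, so categorical values are commonly passed as 1-based integer indices into the list of categories. The sketch below shows one such encoding for the "treatment" example. encode_categories is a hypothetical helper, and HBOMS may perform an equivalent conversion internally, in which case this step is unnecessary.

```python
def encode_categories(values: list, categories: list) -> list:
    """Hypothetical helper: map category labels to 1-based integer codes."""
    index = {cat: i + 1 for i, cat in enumerate(categories)}  # Stan indexing is 1-based
    unknown = sorted(set(values) - set(index))
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    return [index[v] for v in values]


# One treatment label per unit (made-up values)
codes = encode_categories(["drug", "placebo", "drug"], ["placebo", "drug"])
```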
- class hboms.frontend.Observation(name: str, data_type: str = 'real', censored: bool = False)¶
Representation of an observed variable. These are typically linked to the model state via StanDist objects.
- name¶
Name of the observation variable.
- Type:
str
- data_type¶
Data type of the observation (e.g., "real", "int").
- Type:
str
- censored¶
Whether the observation is censored.
- Type:
bool
Examples
A real-valued observation “y” that is not censored:
obs = Observation(name="y")
A left-censored viral load observation “VL” of type “real”:
obs = Observation(name="VL", data_type="real", censored=True)
A count variable “cases” of type “int”:
obs = Observation(name="cases", data_type="int")
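For a left-censored observation, the standard likelihood contribution is the probability of falling below the detection limit (the CDF at the limit) rather than the density at the recorded value. The sketch below shows this for a normal observation model using only the standard library. The function names are hypothetical, and how HBOMS expects censoring to be encoded in the data is not described in this reference.

```python
import math


def normal_logpdf(y: float, mu: float, sigma: float) -> float:
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(y: float, mu: float, sigma: float) -> float:
    z = (y - mu) / sigma
    return math.log(0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))


def loglik(y: float, mu: float, sigma: float, left_censored: bool) -> float:
    """Log-likelihood of one observation; y is the detection limit if censored."""
    if left_censored:
        return normal_logcdf(y, mu, sigma)   # P(true value <= limit)
    return normal_logpdf(y, mu, sigma)       # density at the observed value


# Log viral load with a detection limit of 2.0 (made-up values)
ll_censored = loglik(2.0, mu=2.0, sigma=1.0, left_censored=True)
ll_observed = loglik(2.0, mu=2.0, sigma=1.0, left_censored=False)
```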
- class hboms.frontend.StanDist(name: str, obs_name: str, params: list[str])¶
Representation of an observation distribution in Stan.
- name¶
Name of the Stan distribution.
- Type:
str
- obs_name¶
Name of the observation variable.
- Type:
str
- params¶
List of parameter names for the distribution.
- Type:
list[str]
Examples
A normal distribution for observation “y” with parameters “mu” and “sigma”:
dist = StanDist(name="normal", obs_name="y", params=["mu", "sigma"])