API Reference

Parameter Estimation & Identification

Parameter Estimation

Uncertainty Analysis

middoe.iden_uncert.uncert(resultpr, system, models, iden_opt, case=None)[source]

Perform comprehensive uncertainty and sensitivity analysis on parameter estimation results.

This function evaluates parameter uncertainties using variance-covariance matrices, computes confidence intervals, performs sensitivity analysis via finite-difference methods, and assesses model quality through multiple statistical metrics (R², MSE, likelihood). It supports Hessian-based, Jacobian-based, and bootstrap-based covariance estimation.

Parameters:
  • resultpr (dict[str, scipy.optimize.OptimizeResult]) –

    Parameter estimation results from parmest(). Each OptimizeResult should contain:
    • ’scpr’np.ndarray

      Optimized parameters.

    • ’hess_inv’np.ndarray, optional

      Inverse Hessian (for var-cov=’H’).

    • ’v’np.ndarray, optional

      Bootstrap covariance matrix (for var-cov=’B’).

    • ’fun’float

      Final objective function value.

  • system (dict) –

    System configuration including:
    • ’tvi’dict

      Time-variant input definitions.

    • ’tii’dict

      Time-invariant input definitions.

    • ’tvo’dict
      Time-variant output definitions:
      • ’meas’ : bool — include in analysis

      • ’unc’ : float — measurement standard deviation

    • ’tio’dict

      Time-invariant output definitions.

  • models (dict) –

    Model definitions:
    • ’can_m’list[str]

      Active model/solver names.

    • ’mutation’dict[str, list[bool]]

      Parameter masks (True=free, False=fixed).

    • ’theta’dict[str, list[float]]

      Nominal parameter values (updated in-place if case=None).

    • ’V_matrix’dict[str, np.ndarray]

      Covariance matrices (updated in-place if case=None).

    • ’LSA’dict[str, np.ndarray]

      Local sensitivity analysis matrices (updated in-place if case=None).

  • iden_opt (dict) –

    Identification options:
    • ’sens_m’str, optional
      Sensitivity method (from paper Table S3):
      • ’central’: Central finite difference (second-order accuracy,

        requires 2 evaluations per parameter)

      • ’forward’: Forward finite difference (first-order accuracy,

        requires 1 evaluation per parameter)

      Default: ‘central’.

    • ’var-cov’str, optional
      Covariance method (from paper Table S3):
      • ’H’: Hessian-based (from optimizer, local approximation)

      • ’J’: Jacobian-based (from sensitivity matrix, local approximation)

      • ’B’: Bootstrap-based (global, requires var-cov=’B’ in parmest)

      Default: ‘J’ if not specified.

    • ’eps’float or dict[str, float], optional

      Finite-difference step size. If None, automatically determined via mesh-independency test for each model.

    • ’log’bool, optional

      Enable verbose logging (default: False).

  • case (str, optional) –

    Analysis mode:
    • None: Update models dictionary in-place with results.

    • Any other value: Return results without modifying models.

Returns:

analysis_results

Dictionary with:
  • ’results’dict[str, dict]
    Detailed uncertainty analysis for each model/solver:
    • ’optimization_result’OptimizeResult

      Original parameter estimation result.

    • ’data’dict

      Experimental data (unchanged).

    • ’LS’float

      Least Squares objective.

    • ’WLS’float

      Weighted Least Squares objective.

    • ’MLE’float

      Maximum Likelihood Estimation objective.

    • ’MSE’float

      Mean Squared Error.

    • ’CS’float

      Chi-squared statistic.

    • ’LSA’np.ndarray, shape (n_observations, n_active_params)

      Local Sensitivity Analysis matrix (Jacobian).

    • ’Z’np.ndarray

      Normalized sensitivities (Z-scores).

    • ’V_matrix’np.ndarray, shape (n_active_params, n_active_params)

      Variance-covariance matrix.

    • ’CI’np.ndarray

      95% confidence intervals (half-widths).

    • ’t_values’np.ndarray

      t-statistics for parameter significance tests.

    • ’R2_responses’dict[str, float]

      R² for each response variable.

    • ’R2_total’float

      Overall R² across all responses.

    • ’M’np.ndarray

      Fisher Information Matrix.

    • ’t_m’dict

      Time vectors for each sheet.

    • ’tv_input_m’dict

      Time-variant inputs for each sheet.

    • ’ti_input_m’dict

      Time-invariant inputs for each sheet.

    • ’tv_output_m’dict

      Simulated time-variant outputs for each sheet.

    • ’ti_output_m’dict

      Simulated time-invariant outputs for each sheet.

    • ’estimations’np.ndarray

      Final parameter estimates (same as ‘scpr’).

    • ’found_eps’float

      Finite-difference step size used.

    • ’P’float

      Relative probability (model weight) based on Chi-squared.

  • ’obs’int

    Total number of observations used.

Return type:

dict

Notes

Sensitivity Analysis: The Local Sensitivity Analysis (LSA) matrix is computed via finite differences:

  • Forward: ( frac{partial y}{partial theta_i} approx frac{y(theta + h e_i) - y(theta)}{h} )

  • Central: ( frac{partial y}{partial theta_i} approx frac{y(theta + h e_i) - y(theta - h e_i)}{2h} )

Central differences are more accurate (second-order) but require twice as many simulations.

Variance-Covariance Estimation Methods (from paper Table S3):

  • ‘H’ (Hessian-based): Local uncertainty from optimizer’s Hessian matrix. [ mathbf{V} = sigma^2 mathbf{H}^{-1} ] where ( sigma^2 ) is estimated from residual variance. Advantages: Fast, automatic from optimizer Disadvantages: Assumes local linearity, may be inaccurate for ill-conditioned problems

  • ‘J’ (Jacobian-based): Local uncertainty from Fisher Information Matrix. [ mathbf{M} = mathbf{Q}^T mathbf{W} mathbf{Q}, quad mathbf{V} = mathbf{M}^{-1} ] where ( mathbf{Q} ) is the sensitivity matrix and ( mathbf{W} = text{diag}(1/sigma_i^2) ). Advantages: More robust than Hessian for parameter-sensitive models Disadvantages: Requires explicit sensitivity computation

  • ‘B’ (Bootstrap): Global uncertainty via residual resampling. Accounts for parameter bounds via truncated normal correction. Advantages: Most robust, accounts for non-linearity Disadvantages: Computationally expensive, requires running parmest with var-cov=’B’

Confidence Intervals: 95% confidence intervals are computed using t-distribution:

[ CI_i = t_{0.975, nu} cdot sqrt{V_{ii}} ]

where ( nu = n_{obs} - n_{params} ) is degrees of freedom.

t-Statistics: Parameter significance is tested via:

[ t_i = frac{theta_i}{sqrt{V_{ii}}} ]

Large ( |t_i| ) (typically > 1.96 for 95% confidence) indicates the parameter is significantly different from zero.

Normalized Sensitivities (Z-scores): Sensitivities are normalized by parameter uncertainties and measurement errors:

[ Z_{ij} = frac{partial y_i}{partial theta_j} cdot frac{sqrt{V_{jj}}}{sigma_i} ]

This quantifies the relative influence of parameter ( j ) on observable ( i ).

Model Weights (P-test): When multiple models are analyzed, relative probabilities are computed via P-test:

[ P_k = frac{exp(-chi_k^2 / 2)}{sum_m exp(-chi_m^2 / 2)} times 100% ]

This provides model comparison based on goodness-of-fit (higher probability → better model).

Mesh-Independency Test: If epsilon is not provided, it is automatically determined by varying ( h ) over ( [10^{-12}, 10^{-1}] ) and selecting the value that minimizes sensitivity matrix variance. Results are saved in ‘./meshindep_plots/’.

Ill-Conditioning Diagnostics: The function automatically logs conditioning metrics:

  • Rank of sensitivity matrix ( mathbf{Q} )

  • Condition numbers of ( mathbf{Q} ) and ( mathbf{M} )

  • Minimum eigenvalue of ( mathbf{M} )

  • Warnings if system is ill-conditioned (condition number > ( 10^8 ))

In-Place Updates (case=None): When case=None, the function updates models dictionary:

  • models[‘theta’]: Updated with final estimates

  • models[‘V_matrix’]: Updated with covariance matrices

  • models[‘LSA’]: Updated with sensitivity matrices

  • models[‘thetastart’]: Preserves pre-uncertainty estimates

References

See also

parmest

Parameter estimation (prerequisite for uncertainty analysis).

_uncert_metrics

Core uncertainty computation routine.

_fdm_mesh_independency

Epsilon selection via mesh-independency test.

Examples

>>> # After parameter estimation
>>> results_pe = parmest(system, models, iden_opt={'meth': 'SLSQP', 'ob': 'WLS'})
>>>
>>> # Uncertainty analysis with Jacobian-based covariance (recommended)
>>> iden_opt_ua = {'sens_m': 'central', 'var-cov': 'J', 'log': True}
>>> uncert_results = uncert(results_pe, system, models, iden_opt_ua)
>>>
>>> # Access results
>>> for model in uncert_results['results']:
...     res = uncert_results['results'][model]
...     print(f"Model {model}:")
...     print(f"  R² = {res['R2_total']:.4f}")
...     print(f"  Parameters: {res['estimations']}")
...     print(f"  CI (95%): {res['CI']}")
...     print(f"  t-values: {res['t_values']}")
...     print(f"  Significant (|t|>1.96): {np.abs(res['t_values']) > 1.96}")
>>> # Bootstrap-based uncertainty (requires bootstrap in parmest)
>>> results_boot = parmest(system, models,
...                       iden_opt={'meth': 'SLSQP', 'ob': 'WLS',
...                                 'var-cov': 'B', 'nboot': 500})
>>> uncert_boot = uncert(results_boot, system, models, {'var-cov': 'B'})
>>> print(f"Bootstrap covariance:\n{uncert_boot['results']['M1']['V_matrix']}")

Model Validation

Identification Utilities

middoe.iden_utils.Plot_estimability(round_data, path, solver)[source]

Plot the estimability of parameters across different rounds for a given model. Estimability plotter post analysis.

Parameters: round_data (dict): Data related to each round of experiments. path (str): Base path where the plot will be saved. model (str): The name of the model for which the estimability is being plotted.

Returns: None

class middoe.iden_utils.Plotting_FinalResults(round_data, winner_solver, selected_rounds)[source]

Bases: object

A class to handle final plotting and reporting of results, post-identification.

round_data

Data related to each round of identification (e.g. ‘Round 1’, ‘Round 2’, …).

Type:

dict

winner_solver

The name of the winning model/model (most probable).

Type:

str

selected_rounds

Round numbers the user wants to include. If empty, all rounds are used.

Type:

list of int

__init__(round_data, winner_solver, selected_rounds)[source]

Initialize Plotting_FinalResults with round data, winner model, and selected rounds.

Parameters:
  • round_data (dict) – Data keyed by ‘Round X’ strings.

  • winner_solver (str) – The name of the winning model/model.

  • selected_rounds (list of int) – The rounds to include. If empty, all rounds are considered.

accvsprec_plot(filename)[source]

Plot parameter estimates and confidence intervals for the selected rounds, skipping any with broad covariance.

Parameters:

filename (str) – Filename to save the accuracy vs. precision plot.

conf_plot(filename, ellipsoid_volume_filename='ellipsoid_volume_across_rounds.png', std_devs_filename='parameter_std_devs_across_rounds.png', parameter_estimates_filename='parameter_estimates_across_rounds.png')[source]

Plot parameter PDFs (on the diagonal) and confidence ellipses (off-diagonal) for the winning model across selected rounds. Then call conf_plot_metrics and accvsprec_plot.

Parameters:
  • filename (str) – Filename for the main confidence plot figure.

  • ellipsoid_volume_filename (str, optional) – Filename for the ellipsoid volume plot.

  • std_devs_filename (str, optional) – Filename for the standard deviations plot.

  • parameter_estimates_filename (str, optional) – Filename for the parameter estimates plot.

conf_plot_metrics(ellipsoid_volume_filename, std_devs_filename)[source]
Compute and plot:
  1. The volume of the 95% confidence ellipsoid.

  2. The parameter standard deviations across selected rounds,

excluding rounds whose covariance matrix is broad/non-positive-definite.

Parameters:
  • ellipsoid_volume_filename (str) – Filename to save the confidence ellipsoid volume plot.

  • std_devs_filename (str) – Filename to save the standard deviations plot.

pcomp_plot(save_path)[source]

Plot P values for each selected round and save each plot with the round number in the filename.

Parameters:

save_path (str) – Path where the plots will be saved.

Return type:

None

pt_plot(filename, roundc)[source]

Plot changes in P values, t values, and TRV across selected rounds, skipping any with broad covariance.

Parameters:
  • filename (str) – Filename to save the P/t plot.

  • roundc (dict) – Additional design info for each round (key=’Round X’), e.g. {‘Round 2’: {‘design_type’: ‘classic design’}, …}.

reporter(filename)[source]

Write an Excel file for each round and a summary text file for the last round. Excel files contain experimental data, model simulations, and input data. The text file contains scaled parameters, t-values, and R² metrics for the last round.

Parameters:

filename (str) – Base filename. Round numbers are appended to Excel files (e.g., “my_file_2.xlsx”). The text summary is written to a file with “_summary.txt” appended.

tcomp_plot(save_path)[source]

Plot t values for each selected round and save each plot with the round number in the filename.

Parameters:

save_path (str) – Path where the plots will be saved.

Return type:

None

class middoe.iden_utils.Plotting_Results(models, pltshow, round=None)[source]

Bases: object

A class to handle plotting of results while analysis.

Attributes: path (str): Base path for saving plots. case (str): Case identifier for the experiment. mutation_settings (dict): Settings related to mutation in the modelling process. modelling_folder (str): Folder name for storing modelling results.

__init__(models, pltshow, round=None)[source]

Initialize the Plotting_Results class with modelling settings.

Parameters: models (dict): Settings related to the modelling process. round (int, optional): Round number, if you want to tag subfolders.

conf_plot(parameters, cov_matrices, confidence_intervals)[source]

Generate and save a confidence ellipsoid plot for estimated parameters, across different rounds of identification for different models.

Parameters: parameters (dict): Model parameters. cov_matrices (dict): Covariance matrices for the estimated parameters confidence_intervals (dict): Confidence intervals for the parameters. round (int): The round number.

fit_plot(data, result, system)[source]

Generate and save the model fitting to experimental data plot for the given data and results.

Parameters: data (dict): Experimental data. result (dict): Modelling results. round (int): The round number. case (str): The case type (e.g., ‘doe’ or ‘classic’).

middoe.iden_utils.plot_rCC_vs_k(x_values, rCC_values, round, solver)[source]

Plot rCC values against k for a specific round and model. Estimability plotter while analysis.

Parameters: x_values (list): List of x values (k values). rCC_values (list): List of rCC values corresponding to x values. round (int): The round number. framework_settings (dict): Settings related to the framework, including paths and case information. model (str): The name of the model for which the plot is being generated.

Returns: None

middoe.iden_utils.plot_sobol_results(time_samples, sobol_analysis_results, sobol_problem, solver, response_key)[source]

Plot Sobol sensitivity analysis results for a given model

Parameters: time_samples (list): List of time for samples (time span) sobol_analysis_results (dict): Results of the Sobol sensitivity analysis. sobol_problem (dict): Problem definition for the Sobol analysis. model (str): The name of the model for which the plot is being generated. response_key (str): The response key for which the sensitivities are plotted. framework_settings (dict): User provided - Settings related to the framework, including paths and case information.

Returns: None

middoe.iden_utils.run_postprocessing(round_data, solvers, selected_rounds, plot_global_p_and_t=True, plot_confidence_spaces=True, plot_p_and_t_tests=True, export_excel_reports=True, plot_estimability=True)[source]

Run post-processing analysis for specified solvers and rounds with plot/excel export options.

Parameters:
  • round_data (dict) – Data containing results from parameter estimation rounds.

  • solvers (list of str) – Solvers to include in the post-analysis (e.g. [‘f20’, ‘f21’]).

  • selected_rounds (list of int) – Rounds to include in the plots (e.g. [1, 2, 3]).

  • plot_global_p_and_t (bool) – If True, generates global P and t comparison plots across all models and rounds.

  • plot_confidence_spaces (bool) – If True, generates confidence space, ellipsoid volume, std dev, and parameter estimate plots.

  • plot_p_and_t_tests (bool) – If True, generates P-value and t-value evolution plots over rounds per model.

  • export_excel_reports (bool) – If True, generates Excel files with experimental, simulation, and input data + summary.

  • plot_estimability (bool) – If True, generates rCC vs k estimability plots per model.

middoe.iden_utils.validation_R2(prediction_metric, validation_metric, reference_metric, case)[source]

Plot R² prediction, R² validation, and R² reference for each model as an envelope plot. Also, create a bar plot with average R² values for prediction, validation, and all the data for each model.

Parameters: prediction_R2 (dict): Dictionary containing R² prediction values for each fold and model. validation_R2 (dict): Dictionary containing R² validation values for each fold and model. R2_ref (dict): Dictionary containing R² reference values for each model.

Returns: None

middoe.iden_utils.validation_params(parameters, ref_params)[source]

Plot normalized parameters for each model in each fold, divided by the corresponding member in ref_params.

Parameters: parameters (dict): Dictionary containing parameter values for each fold and model. (one data out at a time based cross-validation) ref_params (dict): Dictionary containing reference parameter values for each model. (all data used estimation)

Returns: None

Model-Based Design of Experiments

MBDoE - Parameter Precision

MBDoE - Model Discrimination

middoe.des_md.mbdoe_md(des_opt: Dict[str, Any], system: Dict[str, Any], models: Dict[str, Any], round: int, num_parallel_runs: int = 1) Dict[str, Dict[str, Any] | float][source]

Perform Model-Based Design of Experiments for Model Discrimination (MBDoE-MD).

This function identifies experimental designs that maximise the ability to discriminate between competing models. Optimisation can run in parallel or single-core mode. Failed runs are discarded, and the best valid design is returned.

Parameters:
  • des_opt (dict) –

    Optimisation settings, including:
    • ’mdob’str
      Discrimination criterion:
      • ’HR’: Hunter-Reiner T-optimality

        Maximises sum of squared differences between model predictions across all time points and measured outputs. Suitable for discriminating models with significant structural differences.

      • ’BFF’: Buzzi-Ferraris-Forzatti T-optimality

        Alternative T-optimality criterion with different weighting scheme for model divergence computation. Accounts for signal magnitude in discrimination objective.

    • ’itr’dict
      Iteration controls:
      • ’maxmd’: int — maximum iterations

      • ’toldmd’: float — convergence tolerance

      • ’pps’: int — population size

    • ’eps’float

      Perturbation step size for numerical derivatives.

    • ’plt’bool

      Enable plotting of results.

    • ’meth’str, optional

      Optimisation method (‘DE’, ‘PS’, ‘DEPS’).

  • system (dict) –

    System model and constraints:
    • ’t_s’tuple[float, float]

      Start and end times (ti, tf).

    • ’t_d’tuple[float, float]

      Restricted initial/final intervals (dead time).

    • ’t_r’float

      Time resolution for simulation.

    • ’tvi’dict
      Time-variant input definitions with keys:
      • ’max’, ‘min’: float — bounds

      • ’stps’: int — number of switching segments

      • ’const’: str — constraint type (‘inc’, ‘dec’, ‘rel’)

      • ’offt’: float — minimum time offset between switches

      • ’offl’: float — minimum level offset

      • ’cvp’: str — control variable profile (‘CPF’ piecewise-constant,

        ’LPF’ piecewise-linear)

    • ’tii’dict

      Time-invariant input definitions (‘max’, ‘min’).

    • ’tvo’dict
      Time-variant output definitions:
      • ’meas’: bool — include in measurement

      • ’sp’: int — number of sampling points

      • ’offt’: float — minimum time between samples

      • ’samp_s’: str — sampling synchronisation group

      • ’samp_f’: list[float] — forced sampling times

      • ’unc’: float — measurement uncertainty (standard deviation)

    • ’tio’dict

      Time-invariant output definitions.

  • models (dict) –

    Model definitions and settings:
    • ’can_m’list[str]

      Competing models to discriminate (active solver names).

    • ’theta’dict[str, list[float]]

      Nominal parameter values for each model.

    • ’mutation’dict[str, list[bool]]

      Parameter masks indicating which parameters are free (True) vs fixed (False).

    • ’V_matrix’dict[str, np.ndarray], optional

      Parameter covariance matrices (default: diagonal with small values).

  • round (int) – Current experimental design round (used for logging/reporting).

  • num_parallel_runs (int, optional) – Number of optimisation runs in parallel (default: 1, single-core mode).

Returns:

results

Best design found:
  • ’tii’dict[str, float]

    Optimal time-invariant input values.

  • ’tvi’dict[str, Any]

    Optimal time-variant input profiles.

  • ’swps’dict[str, list[float]]

    Switching times for time-variant controls.

  • ’St’dict[str, np.ndarray]

    Sampling times for each output variable.

  • ’md_obj’float

    Model discrimination objective value (maximised).

  • ’t_values’list[float]

    Full simulation time vector.

Return type:

dict

Raises:

RuntimeError – If all parallel runs fail or if the single-core run encounters an exception.

Notes

MBDoE-MD aims to design experiments that enhance the ability to reject incorrect models by maximising discrimination criteria based on T-optimality.

The discrimination objective maximises the divergence between competing model predictions, enabling statistical tests (P-test, t-test) to identify the most representative model.

Workflow for Model Discrimination: 1. Calibrate all candidate models on available data 2. Design discriminative experiment using MBDoE-MD (HR or BFF criterion) 3. Conduct designed experiment 4. Re-calibrate models with new data 5. Compute model probabilities via P-test 6. Select model with highest posterior probability 7. Validate selected model

Supported constraints include:
  • Control level bounds and switching rules (increasing, decreasing, relaxed)

  • Minimum sampling intervals and total sample limits

  • Forced or forbidden sampling times

  • Synchronised sampling across outputs

  • Piecewise-constant (CPF) or piecewise-linear (LPF) control profiles

References

See also

_safe_run_md

Parallel optimisation wrapper with exception handling.

_run_single_md

Single-core optimisation routine.

_optimiser_md

Core optimisation problem setup and solver.

Examples

>>> import numpy as np
>>> from middoe import des_md
>>>
>>> # Define system with 4 competing models
>>> system = {
...     'tvo': {
...         'X': {'init': 0.5, 'meas': True, 'unc': 0.05, 'sp': 17},
...         'S': {'init': 15.0, 'meas': True, 'unc': 1.0, 'sp': 17},
...         'P': {'init': 0.0, 'meas': True, 'unc': 0.05, 'sp': 17}
...     },
...     'tvi': {
...         'T': {'min': 296.15, 'max': 306.15, 'stps': 5, 'cvp': 'CPF',
...               'const': 'rel', 'offt': 2.0, 'offl': 0.5}
...     },
...     'tii': {
...         'S0': {'min': 0.366, 'max': 0.65},
...         'X0': {'min': 0.19, 'max': 0.595}
...     },
...     't_s': (0.0, 20.0),
...     't_d': (1.0, 1.0)
... }
>>>
>>> # Define 4 competing models (after calibration)
>>> models = {
...     'can_m': ['M1', 'M2', 'M3', 'M4'],
...     'krt': {'M1': 'fermentation_model_1', 'M2': 'fermentation_model_2',
...             'M3': 'fermentation_model_3', 'M4': 'fermentation_model_4'},
...     'theta': {
...         'M1': [0.408, 0.22, 71.5, 0.28, 0.607, 0.1],
...         'M2': [0.450, 0.20, 65.0, 0.30, 0.550, 0.12],
...         'M3': [0.380, 0.24, 75.0, 0.26, 0.630, 0.09],
...         'M4': [0.420, 0.21, 70.0, 0.29, 0.580, 0.11]
...     },
...     'mutation': {
...         'M1': [True, True, True, False, False, False],
...         'M2': [True, True, True, False, False, False],
...         'M3': [True, True, True, False, False, False],
...         'M4': [True, True, True, False, False, False]
...     }
... }
>>>
>>> # Configure MBDoE-MD with Hunter-Reiner criterion
>>> des_opt = {
...     'mdob': 'HR',
...     'meth': 'DEPS',
...     'itr': {'maxmd': 5000, 'toldmd': 1e-8, 'pps': 50},
...     'eps': 0.01,
...     'plt': True
... }
>>>
>>> # Run MBDoE-MD (parallel mode with 4 cores)
>>> design = des_md.mbdoe_md(des_opt, system, models, round=2, num_parallel_runs=4)
>>>
>>> # Access results
>>> print(f"Optimal temperature profile: {design['tvi']['T']}")
>>> print(f"Optimal initial substrate: {design['tii']['S0']:.4f} mol/L")
>>> print(f"Optimal initial biomass: {design['tii']['X0']:.4f} mol/L")
>>> print(f"Sampling times: {design['St']['X']}")
>>> print(f"HR discrimination value: {design['md_obj']:.4e}")

Design Utilities

middoe.des_utils.configure_logger(name=None, level=20)[source]

Configure a logger with standard output formatting for MBDoE operations.

This function creates or reconfigures a logger with a consistent format and ensures output is directed to stdout (not stderr). Existing handlers are removed to prevent duplicate log messages.

Parameters:
  • name (str, optional) – Logger name. If None, returns the root logger (default: None).

  • level (int, optional) –

    Logging level (default: logging.INFO). Common levels:

    • logging.DEBUG (10): Detailed diagnostic information.

    • logging.INFO (20): General informational messages.

    • logging.WARNING (30): Warning messages.

    • logging.ERROR (40): Error messages.

    • logging.CRITICAL (50): Critical errors.

Returns:

logger – Configured logger instance.

Return type:

logging.Logger

Notes

Handler Configuration:
  • A StreamHandler is attached to sys.stdout (not sys.stderr).

  • Format: ‘%(levelname)s: %(message)s’ (e.g., “INFO: Optimisation started”).

Handler Clearing:

If the logger already has handlers, they are removed before adding a new one. This prevents duplicate messages when the function is called multiple times.

Propagation:

Logger propagation is disabled (logger.propagate = False) to prevent messages from being passed to parent loggers.

See also

logging.getLogger

Standard library logger retrieval.

logging.StreamHandler

Standard library stream handler.

Examples

>>> logger = configure_logger('mbdoe', level=logging.DEBUG)
>>> logger.info("Starting optimisation")
INFO: Starting optimisation
>>> logger.debug("Current objective value: 1.234e5")
DEBUG: Current objective value: 1.234e5

Sensitivity & Estimability Analysis

Global Sensitivity Analysis (Sobol)

middoe.sc_sensa.sample(settings, N, calc_second_order=True, scramble=True, skip_values=0, seed=None)[source]

Generate structured Sobol sample matrix for variance-based sensitivity analysis.

Creates a low-discrepancy quasi-random sample matrix using Sobol sequences, structured to enable efficient computation of first-order and total-order sensitivity indices. Supports grouped analysis and various sampling options.

Parameters:
  • settings (dict) –

    Sobol problem definition:
    • ’num_vars’ : int — Number of parameters

    • ’bounds’ : list[tuple] — [(min, max), …] for each parameter

    • ’dists’ : list[str], optional — Distribution types

    • ’groups’ : list[str], optional — Group assignments

  • N (int) – Base number of samples. Total simulations = N × (2D + 2) where D = num_vars.

  • calc_second_order (bool, optional) – Generate samples for second-order indices (default: True). If False, reduces to N × (D + 2) simulations.

  • scramble (bool, optional) – Apply Owen scrambling to Sobol sequence (default: True).

  • skip_values (int, optional) – Skip initial Sobol points (default: 0).

  • seed (int, optional) – Random seed for scrambling (default: None).

Returns:

sample_matrix – Structured sample matrix with interleaved A, AB, BA (optional), B matrices.

Return type:

np.ndarray, shape ((2D+2)*N, D) or ((D+2)*N, D)

Notes

Sample Matrix Structure: For each base sample i, generates:

[A_i, AB_i1, …, AB_iD, (BA_i1, …, BA_iD), B_i]

where AB_ij is A with column j replaced by B.

Sobol Sequences: Low-discrepancy quasi-random sequences with superior space-filling vs random:

  • Uniform coverage of parameter space

  • Faster convergence of Monte Carlo estimates

  • Deterministic (given seed)

Total Simulations:
  • With 2nd order: N × (2D + 2)

  • Without 2nd order: N × (D + 2)

Example for N=100, D=5:
  • With 2nd order: 1,200 simulations

  • Without: 700 simulations

Scrambling: Owen scrambling randomizes while preserving low-discrepancy properties. Enables confidence interval estimation via multiple scrambled sequences.

See also

_rescale_parameters

Applies bounds and distributions.

_generate_group_matrix

For grouped analysis.

sensa

Main function that calls this.

Examples

>>> problem = {
...     'num_vars': 3,
...     'bounds': [(0, 1), (100, 500), (1e3, 1e6)]
... }
>>> samples = sample(problem, N=100, seed=42)
>>> print(samples.shape)  # (800, 3) = 100*(2*3+2)
>>>
>>> # With distributions
>>> problem['dists'] = ['unif', 'norm', 'logunif']
>>> samples = sample(problem, N=100)
middoe.sc_sensa.sensa(gsa, models, system)[source]

Perform global Sobol sensitivity analysis to quantify parameter and variable importance.

This function implements variance-based global sensitivity analysis using Sobol indices, identifying which parameters and input variables most influence model predictions. It supports both single-core and multi-core parallel execution, making it suitable for computationally expensive models. The analysis generates time-varying sensitivity indices showing how parameter importance evolves during dynamic experiments.

Parameters:
  • gsa (dict) –

    Global sensitivity analysis configuration:
    • ’tii_n’list[float]

      Nominal values for time-invariant inputs.

    • ’tvi_n’list[float]

      Nominal values for time-variant inputs.

    • ’samp’int

      Number of Sobol samples (N). Total simulations = N*(2D+2) where D = num inputs.

    • ’var_s’bool

      Enable variable sensitivity analysis.

    • ’par_s’bool

      Enable parameter sensitivity analysis.

    • ’plt’bool

      Generate Sobol plots automatically.

    • ’var_d’float, optional

      Variable variation factor (e.g., 1.2 = ±20%). If not provided, uses bounds.

    • ’par_d’float, optional

      Parameter variation factor. If not provided, uses t_l and t_u bounds.

    • ’multi’False or float, optional

      Parallel execution: False=single-core, 0.0-1.0=fraction of CPUs to use.

  • models (dict) –

    Model definitions:
    • ’can_m’list[str]

      Active model/solver names.

    • ’theta’dict[str, list[float]]

      Nominal parameter values.

    • ’t_u’dict[str, list[float]]

      Parameter upper bounds.

    • ’t_l’dict[str, list[float]]

      Parameter lower bounds.

  • system (dict) –

    System configuration:
    • ’t_s’tuple[float, float]

      Time span.

    • ’tvi’dict

      Time-variant input definitions.

    • ’tii’dict

      Time-invariant input definitions with ‘min’ and ‘max’ bounds.

    • ’tvo’dict

      Time-variant output definitions.

Returns:

sobol_output

Complete Sobol analysis results:
  • ’analysis’dict[str, dict[str, list[dict]]]

    Nested structure: {solver: {response: [time_step_results]}} Each time_step_results contains:

    • ’S1’ : np.ndarray — First-order indices (main effects)

    • ’ST’ : np.ndarray — Total indices (main + interactions)

    • ’S1_conf’ : np.ndarray — S1 confidence intervals

    • ’ST_conf’ : np.ndarray — ST confidence intervals

  • ’problem’dict[str, dict]

    Sobol problem definitions: {solver: {‘num_vars’, ‘names’, ‘bounds’}}

Return type:

dict

Notes

Sobol Sensitivity Indices: Variance-based global sensitivity analysis decomposes output variance into contributions from individual inputs and their interactions.

First-Order Index (S1): Main effect of input i, excluding interactions:

[ S_i = frac{V_{X_i}[E_{X_{sim i}}(Y|X_i)]}{V(Y)} ]

Interpretation: Fraction of output variance due to varying input i alone.

Total Index (ST): Total effect including all interactions:

[ ST_i = frac{E_{X_{sim i}}[V_{X_i}(Y|X_{sim i})]}{V(Y)} ]

Interpretation: Fraction of variance that would remain if all inputs except i were fixed.

Computational Cost: For N samples, D inputs: N × (2D + 2) simulations required.

References

See also

sample

Generates Sobol quasi-random samples.

_sobol_analysis

Computes Sobol indices from model outputs.

Examples

>>> gsa = {
...     'tii_n': [1.5, 300],
...     'tvi_n': [320],
...     'samp': 1000,
...     'var_s': True,
...     'par_s': True,
...     'plt': True,
...     'multi': 0.8
... }
>>> results = sensa(gsa, models, system)
>>> ST = results['analysis']['M1']['CA'][50]['ST']

Estimability Analysis

Simulation & Experiments

Simulation Kernel

middoe.krnl_simula.simula(t, swps, uphi, uphisc, uphitsc, utsc, utheta, uthetac, cvp, uphit, model_name, system, models)[source]

Execute forward model simulation with flexible solver and interpolation support.

This is the main simulation kernel that orchestrates the complete forward modeling workflow. It handles model loading (internal, external Python, or gPROMS), variable scaling and interpolation, initial condition setup, and ODE integration. The function supports multiple control variable parameterization (CVP) methods and integrates seamlessly with parameter estimation and MBDoE routines.

Parameters:
  • t (list or np.ndarray) – Normalized time points for simulation [0, 1]. Will be scaled to physical time internally using utsc.

  • swps (dict[str, np.ndarray]) –

    Switching points for piecewise interpolation of time-variant inputs:
    • ’{var}t’np.ndarray

      Switching times (normalized [0, 1]).

    • ’{var}l’np.ndarray

      Switching levels (normalized [0, 1]).

  • uphi (dict[str, float]) – Time-invariant input variables (normalized [0, 1]).

  • uphisc (dict[str, float]) – Scaling factors (maximum values) for time-invariant inputs.

  • uphitsc (dict[str, float]) – Scaling factors (maximum values) for time-variant inputs.

  • utsc (float) – Time scaling factor (physical time = normalized time * utsc).

  • utheta (list[float]) – Normalized model parameters (typically [1, 1, …, 1]).

  • uthetac (list[float]) – Parameter scaling factors (true parameter values).

  • cvp (dict[str, str]) –

    Control variable parameterization methods for each time-variant input:
    • ’CPF’ : Constant piecewise function (zero-order hold).

    • ’LPF’ : Linear piecewise function (linear interpolation).

    • ’no_CVP’ : No control parameterization (constant value from uphit).

  • uphit (dict[str, float or np.ndarray]) – Initial or constant values for time-variant inputs when CVP=’no_CVP’.

  • model_name (str) – Name of the model to simulate. Must exist in models[‘krt’].

  • system (dict) –

    System configuration:
    • ’tvo’dict
      Time-variant output definitions with:
      • ’init’float or ‘variable’

        Initial condition (value or reference to uphi key).

    • ’tio’dict

      Time-invariant output definitions.

  • models (dict) –

    Model registry and settings:
    • ’krt’dict[str, str or callable]
      Model type/function mapping:
      • ’pym’ : Internal Python model (from krnl_models module)

      • ’pys’ : External Python script with solve_model()

      • ’gpr’ : gPROMS model via pygpas interface

      • callable : Direct function reference

    • ’src’dict[str, str], optional

      File paths for ‘pys’ and ‘gpr’ models.

    • ’creds’dict[str, str], optional

      Credentials for gPROMS models.

Returns:

  • tv_ophi (dict[str, np.ndarray]) –

    Time-variant output trajectories:
    • Keys: output variable names (e.g., ‘y1’, ‘y2’, ‘CA’, ‘T’)

    • Values: arrays of shape (len(t),) with simulation results

  • ti_ophi (dict[str, float]) – Time-invariant output values (currently empty dict, for future use).

  • phit (dict[str, np.ndarray]) –

    Scaled and interpolated time-variant input profiles:
    • Keys: input variable names

    • Values: arrays of shape (len(t),) with input trajectories

Raises:
  • KeyError – If model_name is not found in models[‘krt’], or if required keys are missing from swps or uphit for specified CVP methods.

  • ImportError – If required modules (krnl_models, pygpas) cannot be imported.

  • ValueError – If CVP method is unrecognized or if switching point arrays have mismatched lengths.

Notes

Simulation Workflow:
  1. Model Loading: Determine model type and load model function

  2. Variable Scaling: Convert normalized inputs to physical units

  3. Interpolation: Construct time-variant input profiles via CVP

  4. Initial Conditions: Extract from system[‘tvo’] definitions

  5. Integration: Call solver_selector() with appropriate backend

  6. Return: Scaled outputs and interpolated inputs

Model Types:
  • ‘pym’ (Python internal): Model function defined in middoe.krnl_models. Loaded via importlib and called directly.

  • ‘pys’ (Python external): Model defined in external .py file. Must contain solve_model(t, y0, tii, tvi, theta) function. Loaded dynamically via importlib.util.

  • ‘gpr’ (gPROMS): Model defined in gPROMS .gPJ file. Requires pygpas package for interface. Evaluated via gPAS client-server architecture.

  • callable: Model function passed directly in models[‘krt’][model_name]. Most flexible option for custom models.

Control Variable Parameterization (CVP):
  • CPF (Constant Piecewise): Zero-order hold interpolation. Input held constant between switching points (step changes). [ u(t) = u_k quad text{for } t in [t_k, t_{k+1}) ]

  • LPF (Linear Piecewise): Linear interpolation between switching points. Smooth transitions create linear ramps. [ u(t) = u_k + (u_{k+1} - u_k) frac{t - t_k}{t_{k+1} - t_k} ]

  • no_CVP: No control optimization. Input remains constant at uphit value.

Normalization Convention: All variables, parameters, and time are normalized internally:

  • Variables: ( phi_{physical} = phi_{norm} cdot phi_{max} )

  • Parameters: ( theta_{physical} = theta_{norm} cdot theta_c )

  • Time: ( t_{physical} = t_{norm} cdot t_{scale} )

This improves numerical conditioning and allows unified handling across different scales. Scaling is reversed internally before calling the physical model.

Initial Conditions: Each time-variant output can specify:

  • Numeric value: Used directly as ( y_0 )

  • ‘variable’: References uphi[‘{var}_0’] (e.g., ‘CA_0’ for initial concentration)

Integration Backend: solver_selector() handles ODE integration using:

  • SciPy solve_ivp (LSODA) for Python models

  • Custom solve_model() for external scripts

  • gPAS evaluate() for gPROMS models

Retry Logic: Simulation failures trigger automatic retries with small delays. This handles transient numerical issues or gPROMS connection problems.

References

See also

_backscal

Internal function that performs scaling and calls solver_selector.

_Piecewiser

Constructs piecewise interpolated input profiles.

solver_selector

Backend-specific integration routine.

expera

Uses this function for in-silico experiment generation.

parmest

Uses this function during parameter estimation.

Examples

>>> # Basic simulation with internal Python model
>>> t = np.linspace(0, 1, 101)
>>> swps = {'Tt': np.array([0, 0.5, 1]), 'Tl': np.array([300, 350, 300])}
>>> uphi = {'P': 0.8, 'CA_0': 0.5}
>>> uphisc = {'P': 5.0, 'CA_0': 2.0}
>>> uphitsc = {'T': 400}
>>> utheta, uthetac = [1, 1, 1], [1.5e5, 2.3, 0.8]
>>> cvp = {'T': 'LPF'}
>>> tv_out, ti_out, tv_in = simula(
...     t, swps, uphi, uphisc, uphitsc, 100, utheta, uthetac,
...     cvp, {}, 'CSTR_model', system, models
... )
>>> print(tv_out.keys())  # ['CA', 'CB', 'T']
>>> print(tv_in['T'])  # Temperature profile [300, ..., 350, ..., 300]
>>> # External Python model
>>> models = {
...     'krt': {'my_model': 'pys'},
...     'src': {'my_model': '/path/to/model.py'}
... }
>>> tv_out, ti_out, tv_in = simula(t, {}, uphi, uphisc, uphitsc,
...                                  100, utheta, uthetac, {}, uphit,
...                                  'my_model', system, models)
>>> # gPROMS model
>>> models = {
...     'krt': {'gproms_model': 'gpr'},
...     'src': {'gproms_model': '/path/to/model.gPJ'},
...     'creds': {'gproms_model': 'password'}
... }
>>> tv_out, ti_out, tv_in = simula(t, swps, uphi, uphisc, uphitsc,
...                                  100, utheta, uthetac, cvp, {},
...                                  'gproms_model', system, models)
middoe.krnl_simula.solver_selector(model, t, y0, phi, phit, theta, models, model_name, system)[source]

Select and execute appropriate ODE solver backend with automatic retry logic.

This function dispatches to the correct integration backend based on model type, handling SciPy solve_ivp, external Python scripts, and gPROMS models. It includes automatic retry logic to handle transient numerical failures or connection issues.

Parameters:
  • model (callable) –

    Model function to integrate. Signature depends on backend:
    • SciPy: dydt = model(t, y, phi, phit, theta, t_vec)

    • External: result = model(t, y0, phi, phit, theta)

  • t (list[float]) – Physical time points for evaluation.

  • y0 (list[float]) – Initial conditions for ODE states.

  • phi (dict[str, float]) – Time-invariant inputs (physical units).

  • phit (dict[str, np.ndarray]) – Time-variant input profiles (physical units).

  • theta (list[float]) – Model parameters (physical units).

  • models (dict) –

    Model registry:
    • ’krt’dict[str, str]

      Model types (‘pys’, ‘gpr’, or callable).

    • ’src’dict[str, str]

      File paths for external models.

    • ’creds’dict[str, str]

      Credentials for gPROMS models.

  • model_name (str) – Name of model being simulated.

  • system (dict) – System configuration with output variable definitions.

Returns:

  • tv_ophi (dict[str, np.ndarray]) – Time-variant output trajectories.

  • ti_ophi (dict[str, float]) – Time-invariant output values.

  • tvi (dict[str, np.ndarray]) – Time-variant input profiles (unchanged, passed through).

Notes

Solver Backends:
  1. ‘pys’ (External Python Script): - Dynamically loads .py file from models[‘src’][model_name] - Calls solve_model(t, y0, tii, tvi, theta) - Returns {‘tvo’: {…}, ‘tio’: {…}}

  2. ‘gpr’ (gPROMS): - Requires pygpas package - Opens .gPJ file via StartedConnected client - Sets inputs via client.set_input_value() - Evaluates model via evaluate(client) - Retrieves outputs via client.get_value()

  3. Default (SciPy solve_ivp): - Uses LSODA method (automatic stiffness detection) - Integrates from min(t) to max(t) - Evaluates at specified time points (t_eval=t) - Returns {‘y1’: sol[0], ‘y2’: sol[1], …}

Retry Logic: Maximum 1 retry attempt for each backend:

  • 0.5s delay between attempts for SciPy

  • 1.0s delay between attempts for gPROMS

  • Prints failure messages to console

SciPy Integration Settings:
  • Method: LSODA (stiff/non-stiff automatic switching)

  • rtol: 1e-8 (relative tolerance)

  • atol: 1e-10 (absolute tolerance)

  • t_span: [min(t), max(t)]

  • t_eval: t (evaluation points)

gPROMS Connection:
  • Random port selection (10000-90000) to avoid conflicts

  • Silent mode (stdout/stderr suppressed)

  • Automatic connection teardown via context manager

  • Requires valid credentials in models[‘creds’]

Error Handling: If all retries fail:

  • Prints error message with attempt count

  • Returns empty dicts (simulation failed)

  • Calling code should check for empty results

Output Naming Convention: For SciPy models, outputs are automatically named ‘y1’, ‘y2’, etc. based on position in state vector. For external/gPROMS models, output names must match keys in system[‘tvo’] and system[‘tio’].

References

See also

simula

Main entry point.

_backscal

Prepares inputs and calls this function.

scipy.integrate.solve_ivp

SciPy ODE solver.

Examples

>>> # Typically called internally by _backscal()
>>> # SciPy backend (default)
>>> def my_model(t_curr, y, tii, tvi, theta, t_vec):
...     dydt = [-theta[0] * y[0] * tii['CA'], theta[0] * y[0] * tii['CA']]
...     return dydt
>>> tv_out, ti_out, tv_in = solver_selector(
...     my_model, [0, 50, 100], [1.0, 0.0],
...     {'CA': 2.0}, {'T': T_profile}, [1.5e5, 2.3],
...     models, 'my_model', system
... )
>>> print(tv_out.keys())  # ['y1', 'y2']

Experiment Generation

middoe.krnl_expera.expera(system, models, insilicos, design_decisions, expr, swps=None)[source]

Execute in-silico experiment and generate synthetic measurement data with noise.

This function orchestrates the complete in-silico experimentation workflow: setting up simulation parameters, running the forward model, adding realistic measurement noise, and saving results to Excel. It supports both preliminary (classic) designs and optimized MBDoE designs. The function handles time-variant and time-invariant inputs/outputs, control variable parameterizations, and switching profiles.

Parameters:
  • system (dict) –

    System configuration defining model structure:
    • ’t_s’tuple[float, float]

      Time span (start, end).

    • ’t_r’float

      Time resolution for discretization.

    • ’tvi’dict
      Time-variant input definitions:
      • ’max’ : float — upper bound

      • ’cvp’ : str — control variable parameterization method

    • ’tii’dict

      Time-invariant input definitions.

    • ’tvo’dict
      Time-variant output definitions:
      • ’meas’ : bool — measurement flag

      • ’unc’ : float — measurement uncertainty (std dev)

      • ’sp’ : int — number of sampling points

    • ’tio’dict

      Time-invariant output definitions.

  • models (dict) – Model definitions (currently unused but passed for extensibility).

  • insilicos (dict) –

    In-silico experiment configuration:
    • ’tr_m’str

      True model name (ground truth for simulation).

    • ’theta’list[float]

      True parameter values (ground truth).

    • ’prels’dict

      Preliminary/classic design values.

    • ’errt’str, optional

      Error type: ‘abs’ (absolute) or ‘rel’ (relative, default: ‘abs’).

  • design_decisions (dict or None) –

    MBDoE design decisions. If None, runs preliminary design. If provided:
    • ’St’dict[str, np.ndarray]

      Sampling times for each output variable (unscaled).

    • ’t_values’np.ndarray

      Global time vector for simulation.

    • ’tii’dict[str, float]

      Time-invariant input values.

    • ’tvi’dict[str, np.ndarray]

      Time-variant input profiles.

  • expr (int) – Experiment number (used for Excel sheet naming).

  • swps (dict, optional) –

    Switching points for time-variant inputs (unscaled):
    • ’{var}t’ : np.ndarray — switching times

    • ’{var}l’ : np.ndarray — switching levels

Returns:

  • excel_path (Path) – Path to the Excel file containing experiment data (‘data.xlsx’).

  • df_combined (pd.DataFrame) –

    Combined DataFrame with all simulation results and metadata:
    • Per-variable sections: ‘MES_X:{var}’, ‘MES_Y:{var}’, ‘MES_E:{var}’

    • Global time vector: ‘X:all’

    • Input values: time-invariant and time-variant

    • Output values: time-invariant (noisy)

    • CVP metadata: ‘CVP:{var}’ (control parameterization methods)

    • Switching points: ‘{var}t’, ‘{var}l’ (if applicable)

Notes

Experimental Workflow:
  1. Setup: Extract scaling factors, construct variable/parameter dictionaries

  2. Time Discretization: Create global time vector based on sampling requirements

  3. Simulation: Call forward model (simula) with specified inputs

  4. Noise Addition: Add measurement noise according to error type

  5. Data Organization: Structure results in standardized DataFrame format

  6. Export: Save to Excel with experiment number as sheet name

Preliminary vs MBDoE Designs:
  • Preliminary (design_decisions=None):
    • Uses classic_des values from insilicos[‘prels’]

    • Uniform sampling in time for each output

    • Simple input profiles (constant or default CVP)

  • MBDoE (design_decisions provided):
    • Uses optimized sampling times from St

    • Optimized input profiles from tvi

    • Advanced CVP methods (step, ramp, etc.)

Measurement Noise: Two error models supported:

  • Absolute (‘abs’): ( y_{noisy} = y_{true} + mathcal{N}(0, sigma^2) )

  • Relative (‘rel’): ( y_{noisy} = y_{true} + mathcal{N}(0, (sigma cdot y_{true})^2) )

Negative values are clipped to zero (physical constraint).

DataFrame Structure: The output DataFrame contains multiple sections separated by blank columns:

  1. Per-variable measurements: Separate columns for each output variable with sampling times (MES_X), noisy values (MES_Y), and uncertainties (MES_E)

  2. Global simulation: Time vector (X:all), all inputs, time-invariant outputs

  3. CVP metadata: Control variable parameterization methods

  4. Switching points: Only for MBDoE designs with switching profiles

Excel Output: Results are appended to ‘data.xlsx’ in the current directory. Each experiment creates/updates a sheet named by the experiment number. This allows accumulation of multiple experiments in a single file for batch processing.

Time Scaling: All time values are normalized to [0, 1] internally, then scaled back to physical units (tsc = system[‘t_s’][1]) for output. This improves numerical conditioning.

References

See also

simula

Forward model simulation kernel.

_construct_var

Variable scaling and organization.

_construct_par

Parameter vector construction.

_inssimulator

Core simulation and noise addition routine.

Examples

>>> # Preliminary design (uniform sampling)
>>> excel_path, df = expera(system, models, insilicos, None, expr=1)
>>> print(f"Experiment saved to: {excel_path}")
>>> print(f"Measurements: {df.filter(like='MES_Y').columns.tolist()}")
>>> # MBDoE design (optimized sampling)
>>> design_decisions = {
...     'St': {'y1': np.array([0, 50, 100]), 'y2': np.array([0, 75, 100])},
...     't_values': np.array([0, 0.5, 0.75, 1.0]),
...     'tii': {'P': 1.5, 'T0': 298},
...     'tvi': {'T': np.array([300, 320, 310, 300])}
... }
>>> swps = {'Tt': np.array([0, 33, 67]), 'Tl': np.array([300, 320, 310])}
>>> excel_path, df = expera(system, models, insilicos, design_decisions,
...                          expr=2, swps=swps)

Data Management

Logging & I/O Utilities

middoe.log_utils.add_norm_par(modelling_settings)[source]

Initialize normalized parameter vectors for all models.

This utility creates ‘normalized_parameters’ entries (all ones) from the ‘theta’ parameter scaling factors. It’s used during setup to establish the normalization convention used throughout the package.

Parameters:

modelling_settings (dict) –

Model definitions with:
  • ’theta’dict[str, list[float]]

    Parameter scaling factors for each model.

Returns:

modelling_settings

Updated dictionary with added:
  • ’normalized_parameters’dict[str, list[float]]

    Normalized parameter vectors (all ones) for each model.

Return type:

dict

Raises:

KeyError – If ‘theta’ key is not present in modelling_settings.

Notes

Normalization Convention: Throughout the package, parameters are represented as:

[ theta_{physical} = theta_{normalized} cdot theta_c ]

where ( theta_{normalized} = [1, 1, …, 1] ) represents the baseline (true or initial) parameter values.

Why All Ones?: Using ones as the baseline simplifies:

  • Parameter estimation: Optimize ( theta_{normalized} ) around 1.0

  • Bounds specification: Bounds become multiplicative factors (e.g., [0.5, 2.0])

  • Interpretation: Values >1 indicate parameters larger than baseline

Usage Context: Called during model setup, typically early in the workflow before any estimation or design routines.

See also

_construct_par

Uses normalized parameters during simulation.

parmest

Estimates normalized parameters during optimization.

Examples

>>> models = {
...     'theta': {
...         'M1': [1.5e5, 2.3, 0.8],
...         'M2': [2.0e5, 1.8, 1.2]
...     },
...     'can_m': ['M1', 'M2']
... }
>>> models = add_norm_par(models)
>>> print(models['normalized_parameters'])
# {'M1': [1, 1, 1], 'M2': [1, 1, 1]}
middoe.log_utils.data_appender(df_combined, experiment_number, data_storage)[source]

Accumulate experimental data in memory during sequential experiment generation.

This function provides an alternative to Excel file accumulation for in-silico experiments. It’s particularly useful for programmatic workflows where data is generated and consumed entirely in Python without intermediate file I/O.

Parameters:
  • df_combined (pd.DataFrame) – DataFrame for current experiment with standardized structure (from expera()).

  • experiment_number (int or str) – Unique identifier for this experiment (converted to string internally).

  • data_storage (dict[str, pd.DataFrame]) – Accumulator dictionary. Updated in-place.

Returns:

data_storage – Updated storage dictionary with appended experiment.

Return type:

dict[str, pd.DataFrame]

Notes

Usage Context: Alternative to Excel-based workflow:

  • Excel workflow: expera() → data.xlsx → read_excel() → parmest()

  • Memory workflow: expera() → data_appender() → parmest(data=data_storage)

When to Use:
  • Automated batch experiments

  • High-throughput screening

  • Avoiding file I/O overhead

  • Testing and validation

DataFrame Structure: The df_combined DataFrame should match the format expected by read_excel(), with the same column naming conventions.

Memory Considerations: Storing many experiments in memory can consume significant RAM. For large campaigns (>100 experiments), consider periodic saving to disk.

See also

expera

Generates df_combined for each experiment.

read_excel

Excel-based alternative.

parmest

Accepts data from either source.

Examples

>>> # Initialize storage
>>> data_storage = {}
>>>
>>> # Generate multiple experiments
>>> for i in range(1, 6):
...     _, df = expera(system, models, insilicos, design_decisions, expr=i)
...     data_storage = data_appender(df, i, data_storage)
>>>
>>> print(f"Accumulated {len(data_storage)} experiments")
>>> # {'1': DataFrame, '2': DataFrame, ..., '5': DataFrame}
>>>
>>> # Use directly for estimation
>>> results = parmest(system, models, iden_opt, data=data_storage)
middoe.log_utils.fun_globalizer(func_name)[source]

Register a function in the kernel_simulator module namespace for dynamic access.

This utility enables runtime registration of functions in the kernel_simulator module, facilitating dynamic model loading and callback mechanisms. It’s primarily used for custom model functions that need to be accessible globally during simulation.

Parameters:

func_name (str) – Name under which to register the function in kernel_simulator’s namespace. The function with this name must exist in the current global scope.

Returns:

Function is registered as a side effect.

Return type:

None

Notes

Use Case: When defining custom ODE models in a script, this function allows them to be dynamically registered in the kernel_simulator module, making them accessible to the simulation engine without modifying the core package code.

Registration Mechanism: Uses setattr() to add a lambda wrapper that calls the local function:

[ text{kernel_simulator.func_name} = lambda(*args, **kwargs) rightarrow text{globals()[func_name]}(*args, **kwargs) ]

Namespace Management: The function must exist in globals() at the time of registration. This typically means it should be defined in the same script before calling fun_globalizer().

Warning: This modifies the kernel_simulator module’s namespace at runtime. Use with caution in production environments. Prefer direct model registration via models[‘krt’].

See also

simula

Uses registered functions for model execution.

Examples

>>> # Define a custom model
>>> def my_custom_model(t, y, tii, tvi, theta, t_vec):
...     dydt = [-theta[0] * y[0], theta[0] * y[0]]
...     return dydt
>>>
>>> # Register it globally
>>> fun_globalizer('my_custom_model')
>>>
>>> # Now accessible in kernel_simulator
>>> # (used internally by simulation engine)
middoe.log_utils.load_from_jac()[source]

Load previously saved identification and sensitivity analysis results.

This function attempts to deserialize .jac files from the current directory, providing easy access to archived results without re-running expensive computations. It handles missing files gracefully and reports status for each expected file.

Returns:

loaded_data

Dictionary with keys ‘iden’ and ‘sensa’:
  • loaded_data[‘iden’]: Identification results (round_data) or None

  • loaded_data[‘sensa’]: Sensitivity results (sobol_results) or None

Returns None if neither file is found.

Return type:

dict or None

Notes

Expected Files:
  • ‘iden_results.jac’: Identification/estimation results

  • ‘sensa_results.jac’: Sensitivity analysis results

Return Behavior:
  • If both files exist: Returns dict with both keys populated

  • If one file exists: Returns dict with one key populated, other is None

  • If neither exists: Returns None

Error Handling: If a file exists but cannot be loaded (corrupt, version mismatch, etc.), the function prints an error message and sets that key to None, but continues loading other files.

Usage Patterns:
  1. Resume workflow: Load previous rounds and continue

  2. Post-processing only: Load results for plotting/analysis

  3. Comparison: Load multiple result sets for comparison

Pickle Security: As with save_to_jac(), only load .jac files from trusted sources.

See also

save_to_jac

Companion function to save results.

run_postprocessing

Often called after loading results.

Examples

>>> # Load previous results
>>> loaded = load_from_jac()
>>> if loaded is not None:
...     if loaded['iden'] is not None:
...         print(f"Loaded {len(loaded['iden'])} estimation rounds")
...         round_data = loaded['iden']
...     if loaded['sensa'] is not None:
...         print("Loaded sensitivity analysis results")
...         sobol_results = loaded['sensa']
>>> else:
...     print("No previous results found, starting fresh")
>>>
>>> # Continue from last round
>>> if loaded and loaded['iden']:
...     last_round = max([int(k.split()[1]) for k in loaded['iden'].keys()])
...     print(f"Continuing from round {last_round}")
...     # Prepare for round last_round + 1
middoe.log_utils.read_excel()[source]

Read experimental data from ‘data.xlsx’ in the current working directory.

This is the standard data loading function used throughout the MBDoE package. It reads all sheets from an Excel file containing experimental measurements, input conditions, and metadata. The file must follow the standardized format with specific column naming conventions.

Returns:

data

Dictionary of DataFrames:
  • Keys: Sheet names (typically experiment numbers: ‘1’, ‘2’, ‘3’, …)

  • Values: DataFrames with standardized columns:
    • ’MES_X:{var}’: Measurement times for time-variant outputs

    • ’MES_Y:{var}’: Measured values

    • ’MES_E:{var}’: Measurement uncertainties (std dev)

    • ’X:all’: Full time vector for simulation

    • Input variable columns (e.g., ‘T’, ‘P’, ‘CA_0’)

    • ’CVP:{var}’: Control parameterization methods

    • ’{var}t’, ‘{var}l’: Switching points (if applicable)

Return type:

dict[str, pd.DataFrame]

Raises:

FileNotFoundError – If ‘data.xlsx’ does not exist in current working directory.

Notes

File Location: The function looks for ‘data.xlsx’ in Path.cwd() (current working directory). Ensure you run your script from the project root or navigate to the correct directory before calling this function.

Data Format: The Excel file should contain multiple sheets, each representing one experiment. Within each sheet:

  • Measurement columns: ‘MES_X:{var}’, ‘MES_Y:{var}’, ‘MES_E:{var}’

  • Time column: ‘X:all’ (full simulation time vector)

  • Input columns: Time-invariant (e.g., ‘P’, ‘T0’) and time-variant (e.g., ‘T’)

  • Metadata: ‘CVP:{var}’ for control parameterization info

  • Switching points: ‘{var}t’ (times) and ‘{var}l’ (levels)

Usage Context: Called by:

  • parmest(): To load data for parameter estimation

  • uncert(): To load data for uncertainty analysis

  • validation(): To split data for cross-validation

  • expera(): Writes to this file during in-silico experiments

Multiple Sheets: Each sheet represents an independent experiment. Parameter estimation typically uses all sheets simultaneously to estimate a single set of parameters. The sheet names are arbitrary but should be unique identifiers.

Pandas read_excel Options: Uses sheet_name=None to read all sheets at once, returning a dictionary. This is more efficient than reading sheets individually.

See also

expera

Writes data to this Excel file.

parmest

Primary consumer of this data.

data_appender

Alternative for in-memory data accumulation.

Examples

>>> # Load experimental data
>>> data = read_excel()
>>> print(f"Found {len(data)} experiments")
>>> print(f"Experiments: {list(data.keys())}")
>>>
>>> # Access first experiment
>>> exp1 = data['1']
>>> print(f"Columns: {exp1.columns.tolist()}")
>>>
>>> # Extract measurements for a specific variable
>>> y1_times = exp1['MES_X:y1'].dropna().values
>>> y1_values = exp1['MES_Y:y1'].dropna().values
>>> y1_errors = exp1['MES_E:y1'].dropna().values
>>> print(f"Measured {len(y1_values)} points for y1")
middoe.log_utils.save_rounds(round, result, design_type, round_data, models, iden_opt, obs, system, ranking=None, k_optimal_value=None, rCC_values=None, J_k_values=None, best_uncert_result=None)[source]

Save and organize results from a single experimental round for post-processing.

This function consolidates all results from one round of the MBDoE workflow (design → experiment → estimation → analysis) into a structured dictionary. It computes reference t-values, saves parameter estimates and uncertainties, generates plots, and prepares data for multi-round comparison.

Parameters:
  • round (int) – Round number (1, 2, 3, …) indicating experimental campaign progression.

  • result (dict[str, dict]) –

    Uncertainty analysis results for each model/solver containing:
    • ’estimations’: Estimated parameter values

    • ’V_matrix’: Covariance matrix

    • ’CI’: Confidence intervals

    • ’t_values’: t-statistics for parameter significance

    • ’P’: Model probability/weight

    • Plus all metrics from uncert()

  • design_type (str) – Design category: ‘classic’ (preliminary) or ‘DOE’ (MBDoE-optimized).

  • round_data (dict) – Cumulative storage dictionary for all rounds. Updated in-place.

  • models (dict) – Model definitions with parameter masks, names, and settings.

  • iden_opt (dict) –

    Identification options including plotting flags:
    • ’c_plt’: bool — Generate confidence plots

    • ’f_plt’: bool — Generate fit plots

    • ’plt_s’: str — Plotting style/folder

  • obs (int) – Total number of observations (measurements) across all experiments.

  • system (dict) – System configuration with variable definitions.

  • ranking (list[str], optional) – Parameter ranking from estimability analysis (most to least estimable).

  • k_optimal_value (int, optional) – Optimal number of parameters to estimate based on rCC analysis.

  • rCC_values (dict[str, list[float]], optional) – Revised Critical Ratio values for each model: {solver: [rCC_1, rCC_2, …]}.

  • J_k_values (dict[str, list[float]], optional) – Objective function values for each k: {solver: [J_1, J_2, …]}.

  • best_uncert_result (dict, optional) – If estimability analysis was performed, the uncertainty results for the optimal parameter subset. Overrides ‘result’ if provided.

Returns:

  • trv (dict[str, float]) –

    Reference t-values for 95% confidence level:

    [ t_{ref} = t_{0.975, nu} quad text{where } nu = n_{obs} - n_{params} ]

    Keys: solver names. Used to assess parameter significance.

  • Side Effects

  • ————

  • - Updates round_data dict with complete round information

  • - Generates confidence plots (if iden_opt[‘c_plt’]=True)

  • - Generates fit plots (if iden_opt[‘f_plt’]=True)

  • - Prints summary statistics (t-values, P-values) to console

Notes

Round Data Structure: round_data[‘Round i’] contains:

  • ‘ranking’: Parameter estimability ranking

  • ‘k_optimal_value’: Optimal number of parameters

  • ‘rCC_values’: Corrected Critical Ratios

  • ‘J_k_values’: Objective function values

  • ‘design_type’: ‘classic’ or ‘DOE’

  • ‘mutation’: Parameter activity masks

  • ‘original_positions’: Parameter reordering info

  • ‘trv’: Reference t-values

  • ‘scaled_params’: Estimated parameter values

  • ‘result’: Full uncertainty analysis results

  • ‘iden_opt’: Copy of identification options

  • ‘models’: Copy of model definitions

  • ‘system’: Copy of system configuration

  • ‘est_EA’: Estimability analysis results

Reference t-Values: Computed from Student’s t-distribution:

[ t_{ref} = t_{1 - alpha/2, nu} quad text{with } alpha = 0.05, nu = n_{obs} - n_{params} ]

Used to assess parameter significance:
  • If ( |t_{param}| > t_{ref} ): Parameter is significant

  • If ( |t_{param}| < t_{ref} ): Parameter is non-significant (poorly determined)

Plotting: Two types of plots are generated (if enabled):

  1. Confidence plots: 2D ellipsoids showing parameter correlations

  2. Fit plots: Model predictions vs experimental data

Plots are saved to folders named by round and solver.

Estimability Analysis Integration: If best_uncert_result is provided, it indicates that estimability analysis was performed to select a subset of parameters. This result is used instead of the full uncertainty analysis.

Mutation Tracking: Parameter activity masks (mutation) track which parameters are estimated (True) vs fixed (False) in each round. This is crucial for sequential parameter estimation strategies.

See also

uncert

Generates the ‘result’ dict.

run_postprocessing

Uses round_data for multi-round analysis.

Plotting_Results

Generates confidence and fit plots.

Examples

>>> # After completing estimation and uncertainty analysis
>>> round_data = {}
>>> trv = save_rounds(
...     round=1,
...     result=uncert_results,
...     design_type='classic',
...     round_data=round_data,
...     models=models,
...     iden_opt={'c_plt': True, 'f_plt': True, 'plt_s': './plots'},
...     obs=50,
...     system=system
... )
>>>
>>> print(f"Round 1 saved. Reference t-value: {trv['M1']:.3f}")
>>> print(f"Stored data: {list(round_data['Round 1'].keys())}")
>>>
>>> # Check parameter significance
>>> for i, t_val in enumerate(uncert_results['M1']['t_values']):
...     significant = "✓" if abs(t_val) > trv['M1'] else "✗"
...     print(f"  θ{i+1}: t={t_val:.2f} {significant}")
middoe.log_utils.save_to_jac(results, purpose)[source]

Serialize and save results to a binary .jac file using pickle.

This function provides persistent storage for estimation or sensitivity analysis results, enabling later retrieval without re-running expensive computations. The .jac extension is a custom convention for “Job Archive” files.

Parameters:
  • results (dict) –

    Results dictionary to serialize. Structure depends on purpose:
    • For ‘iden’: round_data dict with all estimation rounds

    • For ‘sensa’: Sobol sensitivity analysis results

  • purpose (str) –

    Result type, determines filename:
    • ’iden’: Saves to ‘iden_results.jac’ (identification results)

    • ’sensa’: Saves to ‘sensa_results.jac’ (sensitivity results)

Returns:

File is written to disk as a side effect.

Return type:

None

Raises:
  • ValueError – If purpose is neither ‘iden’ nor ‘sensa’.

  • Exception – If file writing fails (permissions, disk space, etc.).

Notes

Pickle Protocol: Uses pickle.HIGHEST_PROTOCOL for maximum compatibility with future Python versions while maintaining backward compatibility.

File Location: Files are saved in Path.cwd() (current working directory). Ensure you have write permissions in this directory.

Security Warning: Pickle files can execute arbitrary code during deserialization. Only load .jac files from trusted sources. For untrusted data, consider JSON or HDF5.

File Size: Results can be large (MB-GB) if they contain many rounds with extensive covariance matrices, LSA matrices, and simulation trajectories. Monitor disk space usage.

Use Cases:
  • Checkpoint: Save after each round for crash recovery

  • Archive: Preserve results for reproducibility

  • Sharing: Distribute results to collaborators

  • Post-processing: Load results for plotting without re-estimation

See also

load_from_jac

Companion function to load saved results.

save_rounds

Organizes data before saving.

Examples

>>> # After completing all estimation rounds
>>> save_to_jac(round_data, purpose='iden')
# [INFO] Results saved to: .../iden_results.jac
>>>
>>> # After sensitivity analysis
>>> save_to_jac(sobol_results, purpose='sensa')
# [INFO] Results saved to: .../sensa_results.jac
>>>
>>> # Invalid purpose
>>> save_to_jac(results, purpose='invalid')
# ValueError: Purpose must be 'iden' or 'sensa'
middoe.log_utils.save_to_xlsx(sensa, sobol_problem)[source]

Export Sobol sensitivity analysis results to Excel for visualization and reporting.

This function converts nested Sobol analysis results (time-varying sensitivity indices for multiple models and responses) into a well-organized Excel workbook with separate sheets for each model-response combination.

Parameters:
  • sensa (dict) –

    Sensitivity analysis results with structure:
    sensa[‘analysis’][model_name][response_name] = [

    {‘ST’: [s1, s2, …], ‘S1’: […], …}, # time point 1 {‘ST’: [s1, s2, …], ‘S1’: […], …}, # time point 2 …

    ]

  • sobol_problem (dict) –

    Sobol problem definition with parameter names:

    sobol_problem[model_name][‘names’] = [‘k1’, ‘k2’, ‘Ea’, …]

Returns:

  • None – Excel file is written to disk as a side effect.

  • Side Effects

  • ————

  • Creates ‘sobol_results.xlsx’ in current working directory.

Notes

Excel Structure: Each sheet contains:

  • Column 1: Time (or time point index)

  • Columns 2+: Total Sobol indices (ST) for each parameter

Sheet names follow the pattern: ‘{model}_{response}’ (truncated to 31 chars for Excel compatibility).

Total Sobol Index (ST): The function exports ST (total-order indices) rather than S1 (first-order) because ST captures both main effects and all interactions, providing the most comprehensive sensitivity information.

Time-Varying Sensitivity: Each row represents one time point. This captures how parameter importance evolves during a dynamic experiment:

  • Early times: Initial condition parameters dominate

  • Middle times: Kinetic parameters dominate

  • Late times: Equilibrium parameters dominate

Missing Data Handling:
  • Skips invalid or empty data

  • Uses default parameter names (‘Param_1’, ‘Param_2’, …) if names unavailable

  • Handles mismatched data gracefully

Visualization: The exported Excel file can be:

  • Plotted directly in Excel

  • Imported into plotting software

  • Used for publication-quality figures

See also

plot_sobol_results

Visualization function from iden_utils.

SALib.analyze.sobol

Sobol analysis computation.

Examples

>>> # After running Sobol analysis
>>> save_to_xlsx(sensa_results, sobol_problem)
# Sobol results saved to: .../sobol_results.xlsx
>>>
>>> # Result: Excel file with sheets like:
>>> # - M1_CA (Concentration A sensitivity for model 1)
>>> # - M1_CB (Concentration B sensitivity for model 1)
>>> # - M2_T (Temperature sensitivity for model 2)
>>> # Each with columns: Time, k1, k2, Ea, ...