Data

One of the key functionalities of the NLME module is to estimate the parameter values of a model based on observed data, which is typically represented as time series measurements for each individual, study arm, animal, or other experimental setup. Additionally, the relevant data is often linked to drug administration and may include both time-varying and constant independent variables (covariates) that can be incorporated into the model.

Communication between the data and the model is facilitated by compiling a dataset with a predefined structure, which can be categorized into three types of elements: time series, dosing events, and covariates.

Standardized dataset structure

Standardized datasets in tabulated format accepted by Simurg software are inspired by CDISC guidelines [1] and are compatible with other conventional software, such as Monolix (Lixoft, France) and NONMEM (Icon, USA).
Each line of the dataset should correspond to one inidivudal and one time point. Single line can desribe a measurement, or a dosing event, or both.

Time series

Mandatory columns:

  • ID - unique identificator of an individual/animal/study arm/experimental setup, typically characterized by unique combination of observations, dosing events and covariates. Can be numeric or character.
  • TIME - observation time. Numeric.
  • DV - observed value of a dependent variable. Numeric.
  • DVID - natural number corresponding to the identificator of a dependent variable.

Optional columns:

  • DVNAME - character name of a dependent variable. Should have single value per DVID.
  • MDV - missing dependent variable flag. Equals 0 by default. If equals 1 - observation in the corresponding line is ignored by the software.
  • CENS - censoring flag, can be empty, 0, -1 (for right censoring) and 1 (for left censoring). Value in DV column associated with CENS not equal to 0 servesa as lower limit of quantification for left censoring or upper limit of quantification for right censoring (relevant for M3 censoring method).
  • LIMIT - if CENS column is present, numerical value in LIMIT column will define lower or upper limit of the censored observations (relevant for M4 censoring method).

Dosing events

  • EVID - identificator of a dosing event. By default equals 0 which corresponds to an observation without any associated events (AMT, etc. are ignored). Other possible values include:
    • 1 - dosing event.
    • 2 - reset of the whole system to initial conditions, with or without dosing event.
    • 3 - reset of the associated DVID to the value in DV column, with or without dosing event.
  • CMT - dosing compartment - a natural number corresponding to the running number of a differential equation within a model.
  • ADM - manually assigned administration ID. Replaces CMT if present. Natural number.
  • AMT - dosing amount. Numeric.
  • II - time interval between the doses. Numeric.
  • ADDL - number of additional doses. Natural number.
  • TINF or DUR - duration of infusion. Numeric.
  • RATE - infusion rate. Numeric. Replaces TINF or DUR if present.

Covariates

Any additional column in a dataset can be considered as continuous (if numeric) or categorical (if character) covariate, either constant (if covariate value does not change over time within a single ID), or time-varying. Interpolation for the latter is performed via last observation carried forward approach.

Initialization of the dataset

A dataset can be uploaded into the environment by pressing button and selecting a file with the following extensions: .csv, .txt, .tsv, .xls, .xlsx, .sas7bdat, .xpt.

Once a dataset is uploaded, its content will appear in a form of a table on the main panel:

Modifications of the dataset are possible through the Simurg Data management module's Data tab.

Once uploaded, the dataset is recognized by the software and can be used for subsequent model development.

References

[1] https://www.cdisc.org/standards/foundational/adam/basic-data-structure-adam-poppk-implementation-guide-v1-0