Data
One of the key functionalities of the NLME module is to estimate the parameter values of a model based on observed data, which is typically represented as time series measurements for each individual, study arm, animal, or other experimental setup. Additionally, the relevant data is often linked to drug administration and may include both time-varying and constant independent variables (covariates) that can be incorporated into the model.
Communication between the data and the model is facilitated by compiling a dataset with a predefined structure, which can be categorized into three types of elements: time series, dosing events, and covariates.
Standardized dataset structure
Standardized datasets in tabulated format accepted by Simurg software are inspired by CDISC guidelines [1] and are compatible with other conventional software, such as Monolix (Lixoft, France) and NONMEM (Icon, USA).
Each line of the dataset should correspond to one inidivudal and one time point. Single line can desribe a measurement, or a dosing event, or both.
Time series
Mandatory columns:
ID
- unique identificator of an individual/animal/study arm/experimental setup, typically characterized by unique combination of observations, dosing events and covariates. Can be numeric or character.TIME
- observation time. Numeric.DV
- observed value of a dependent variable. Numeric.DVID
- natural number corresponding to the identificator of a dependent variable.
Optional columns:
DVNAME
- character name of a dependent variable. Should have single value perDVID
.MDV
- missing dependent variable flag. Equals 0 by default. If equals 1 - observation in the corresponding line is ignored by the software.CENS
- censoring flag, can be empty, 0, -1 (for right censoring) and 1 (for left censoring). Value inDV
column associated withCENS
not equal to 0 servesa as lower limit of quantification for left censoring or upper limit of quantification for right censoring (relevant for M3 censoring method).LIMIT
- ifCENS
column is present, numerical value inLIMIT
column will define lower or upper limit of the censored observations (relevant for M4 censoring method).
Dosing events
EVID
- identificator of a dosing event. By default equals 0 which corresponds to an observation without any associated events (AMT
, etc. are ignored). Other possible values include:- 1 - dosing event.
- 2 - reset of the whole system to initial conditions, with or without dosing event.
- 3 - reset of the associated
DVID
to the value inDV
column, with or without dosing event.
CMT
- dosing compartment - a natural number corresponding to the running number of a differential equation within a model.ADM
- manually assigned administration ID. ReplacesCMT
if present. Natural number.AMT
- dosing amount. Numeric.II
- time interval between the doses. Numeric.ADDL
- number of additional doses. Natural number.TINF
orDUR
- duration of infusion. Numeric.RATE
- infusion rate. Numeric. ReplacesTINF
orDUR
if present.
Covariates
Any additional column in a dataset can be considered as continuous (if numeric) or categorical (if character) covariate, either constant (if covariate value does not change over time within a single ID), or time-varying. Interpolation for the latter is performed via last observation carried forward approach.
Initialization of the dataset
A dataset can be uploaded into the environment by pressing button and selecting a file with the following extensions:
.csv
, .txt
, .tsv
, .xls
, .xlsx
, .sas7bdat
, .xpt
.
Once a dataset is uploaded, its content will appear in a form of a table on the main panel:
Modifications of the dataset are possible through the Simurg Data management module's Data tab.
Once uploaded, the dataset is recognized by the software and can be used for subsequent model development.
References
[1] https://www.cdisc.org/standards/foundational/adam/basic-data-structure-adam-poppk-implementation-guide-v1-0