ONP Analytics

Presentation and Functionalities

ONP Analytics is a component of the Onco Place Platform (ONP) that provides statistical analyses and data processing models based on data collected in ONP projects. Its purpose is to go beyond data collection, structuring and harmonization, and to support data transformation and mining on cohorts of patients selected within a project. ONP Analytics currently implements the following functionalities:

  • Descriptive statistics: computes first-order statistics (such as counts, percentages and means) to summarize the project database (DB). It gives an overview of the DB and can ease its quality check.

Example: number and percentage of patients receiving chemotherapy or mean treatment duration.
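As an illustration of the example above, here is a minimal sketch in plain Python. The record layout and the field names (`chemotherapy`, `treatment_days`) are hypothetical and do not reflect the actual ONP database schema or API.

```python
from statistics import mean

# Hypothetical patient records; field names are illustrative only,
# not the ONP schema.
patients = [
    {"id": 1, "chemotherapy": True, "treatment_days": 42},
    {"id": 2, "chemotherapy": False, "treatment_days": 30},
    {"id": 3, "chemotherapy": True, "treatment_days": 56},
    {"id": 4, "chemotherapy": True, "treatment_days": 35},
]


def count_and_percentage(records, field):
    """Return (count, percentage) of records where `field` is truthy."""
    count = sum(1 for r in records if r[field])
    return count, 100.0 * count / len(records)


n_chemo, pct_chemo = count_and_percentage(patients, "chemotherapy")
mean_duration = mean(r["treatment_days"] for r in patients)

print(f"Chemotherapy: {n_chemo} patients ({pct_chemo:.1f}%)")
print(f"Mean treatment duration: {mean_duration:.2f} days")
```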

  • Dose Volume Histogram (DVH): automatically computes (or reads) DVHs from DICOM images and DICOM RT data (contours and RT dose distribution) previously uploaded to the ONP platform.
  • Dose Indices: calculates dose indices from user-defined templates and previously computed DVHs. All dose indices can be used as new features in other modules.

Examples: Dmean, DV%, VDGy, Dnear min, Dmax, DSC, gEUD …
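To make these quantities concrete, the sketch below computes a cumulative DVH and a few of the indices listed above using their usual textbook definitions. It assumes the dose inside one structure is already available as a flat list of voxel doses in Gy, with all voxels of equal volume. It does not read DICOM data, and the function names are hypothetical rather than part of ONP Analytics.

```python
import math


def cumulative_dvh(doses, bin_width=1.0):
    """Return [(dose_level, % of volume receiving at least that dose), ...]."""
    n = len(doses)
    n_bins = int(math.ceil(max(doses) / bin_width)) + 1
    return [
        (i * bin_width,
         100.0 * sum(1 for d in doses if d >= i * bin_width) / n)
        for i in range(n_bins)
    ]


def d_mean(doses):
    """Dmean: mean dose over the structure."""
    return sum(doses) / len(doses)


def v_at_dose(doses, dose_gy):
    """V_DGy: percentage of the volume receiving at least `dose_gy`."""
    return 100.0 * sum(1 for d in doses if d >= dose_gy) / len(doses)


def d_at_volume(doses, volume_pct):
    """D_V%: minimum dose received by the hottest `volume_pct` % of the volume."""
    ranked = sorted(doses, reverse=True)
    k = max(1, math.ceil(len(ranked) * volume_pct / 100.0))
    return ranked[k - 1]


def geud(doses, a):
    """gEUD with parameter `a`, assuming equal-volume voxels."""
    return (sum(d ** a for d in doses) / len(doses)) ** (1.0 / a)


# Made-up voxel doses (Gy) for one structure.
voxel_doses = [10.0, 20.0, 20.0, 30.0, 40.0, 50.0, 50.0, 60.0, 60.0, 60.0]

dvh = cumulative_dvh(voxel_doses, bin_width=10.0)
print("DVH:", dvh)
print("Dmean:", d_mean(voxel_doses))
print("Dmax:", max(voxel_doses))
print("V50Gy (%):", v_at_dose(voxel_doses, 50.0))
print("D50% (Gy):", d_at_volume(voxel_doses, 50.0))
print("gEUD (a=8):", geud(voxel_doses, 8))
```

With a = 1, the gEUD reduces to the mean dose; larger values of a weight the high-dose voxels more heavily.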

  • Features extraction: defines and extracts new features (data) from the original project DB or from previously extracted or calculated features. Dose indices calculated by the Dose Indices module can also be used as input in this module. All newly extracted features can be used in other analysis modules.

Example: the DB contains weight, age and height. ONP Analytics can extract BMI = weight / height², and also obesity = YES if BMI ≥ 30.
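The BMI example can be sketched as follows. The units (kilograms and meters) and the field names are assumptions made for illustration. The sketch chains two derived features, the second computed from the first, mirroring the way extracted features can feed further extractions.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared (kg/m^2)."""
    return weight_kg / height_m ** 2


def add_derived_features(record):
    """Return a copy of `record` with BMI and an obesity flag added."""
    enriched = dict(record)
    enriched["bmi"] = bmi(record["weight_kg"], record["height_m"])
    # Second-level feature derived from the first one.
    enriched["obesity"] = "YES" if enriched["bmi"] >= 30 else "NO"
    return enriched


# Hypothetical records; field names are illustrative only.
cohort = [
    {"age": 61, "weight_kg": 95.0, "height_m": 1.75},
    {"age": 48, "weight_kg": 70.0, "height_m": 1.80},
]

enriched_cohort = [add_derived_features(r) for r in cohort]
for row in enriched_cohort:
    print(f"BMI = {row['bmi']:.1f}, obesity = {row['obesity']}")
```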

  • Survival curve: computes the survival curve of a cohort using the Kaplan–Meier method.
  • Concordance index: computes Harrell’s concordance index (C-index), which measures how consistently two risk scores rank the same subjects. It is typically used in survival analysis to compare a predicted risk-of-death score with the actual survival time.
  • Cox proportional hazards model: learns a Cox survival model on the selected cohort and evaluates it using cross-validation. The performance (C-index), the coefficients and the hazard ratios (HR) of the model are reported to help identify the variables associated with a higher or lower risk of death (or recurrence, etc.). All other learned parameters, and those used for learning, are also returned to allow for reproducibility and predictive applications.
  • Predictive ROC analysis: computes the receiver operating characteristic (ROC) curve, i.e. the sensitivity–specificity curve, from a given numeric predictive score and a binary reference variable to predict (disease vs. no disease). The predictive ROC (PROC) curve can also be computed to evaluate the reliability of the predictive score in a clinical context; it shows the positive predictive value (PPV) vs. the negative predictive value (NPV).
  • Custom Python program: integrates a user’s Python program into a secure ONP environment and executes it on selected features of a cohort. The purpose of this module is to allow researchers to run custom programs on their ONP project without extracting data from its DB.
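For readers unfamiliar with the survival quantities named in the list above, the following sketch shows a Kaplan–Meier estimate and Harrell’s C-index on a tiny made-up cohort. It is a plain-Python illustration of the standard definitions, not the ONP Analytics implementation. The function names are hypothetical, and pairs with tied survival times are simply skipped in the C-index, which is a simplifying assumption.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t)), ...] at each distinct observed event time.

    `events[i]` is 1 if the event (e.g. death) was observed at `times[i]`,
    0 if the subject was censored at that time.
    """
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs ranked consistently by the risk score.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before time j. It is concordant when i also has the higher
    risk score; ties in risk score count for one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Made-up cohort: follow-up time in months and event indicator.
times = [5, 8, 12, 12, 20, 24]
events = [1, 0, 1, 1, 0, 1]
risk = [0.9, 0.2, 0.6, 0.1, 0.3, 0.5]  # hypothetical predicted risk scores

km_curve = kaplan_meier(times, events)
c_index = harrell_c_index(times, events, risk)

for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
print(f"C-index = {c_index:.3f}")
```

A C-index of 1.0 means the score ranks every comparable pair correctly, 0.5 corresponds to random ranking, and 0.0 to a fully inverted ranking.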

Other functionalities are under development for future versions of ONP Analytics (logistic regression and other classifiers, clustering, descriptive histograms, etc.).
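Returning to the predictive ROC analysis module listed above, the sketch below shows how the points of a ROC curve (sensitivity, specificity) and of a PROC curve (PPV, NPV) can be derived from a numeric score and a binary reference variable. It assumes that a score at or above the threshold is classified as "disease", that both classes are present in the data, and that labels are coded 1 (disease) and 0 (no disease). The function names are hypothetical and are not taken from ONP Analytics.

```python
def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, fn, tn) when score >= threshold predicts disease."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    return tp, fp, fn, tn


def roc_and_proc_points(scores, labels):
    """Compute one ROC point and one PROC point per distinct threshold."""
    points = []
    for thr in sorted(set(scores)):
        tp, fp, fn, tn = confusion_counts(scores, labels, thr)
        points.append({
            "threshold": thr,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            # PPV/NPV are undefined when no subject is predicted
            # positive/negative; None marks that case.
            "ppv": tp / (tp + fp) if tp + fp else None,
            "npv": tn / (tn + fn) if tn + fn else None,
        })
    return points


# Made-up predictive scores and reference labels (1 = disease).
scores = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

points = roc_and_proc_points(scores, labels)
for p in points:
    print(p)
```

Unlike sensitivity and specificity, PPV and NPV depend on the prevalence of the disease in the cohort, so a PROC curve computed on one cohort may not transfer to a population with a different prevalence.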