---
title: "Changes from original"
bibliography: references.bib
vignette: >
  %\VignetteIndexEntry{Changes from original}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---

In this vignette, you'll find a description of the changes that have
been made to the OSDC algorithm since its [original
validation](https://doi.org/10.2147/CLEP.S407019). The osdc package uses
the latest version of the algorithm. This vignette also describes
potential changes to the algorithm itself, rather than to the specific
implementation and code details, that we might make in the future.
Whenever we make a change, we will also report validation metrics here
and track them across versions.

## Specific changes since the original validation (version from the paper)

### Version 1.0

1.  Purchases of GLP1-RA, dapagliflozin or empagliflozin are no longer
    used for inclusion or type classification.
    -   Because these drugs are increasingly used to treat conditions
        other than diabetes.
2.  For T1D classification, the window of 180 days to make a purchase of
    an insulin is now evaluated from the date of the first purchase of a
    glucose-lowering drug, rather than the date of inclusion.
    -   To simplify computations and increase robustness to noise and
        atypical cases.
3.  A purchase of insulin in the previous year is no longer required for
    T1D classification.
    -   Because we found that the vast majority of individuals
        classified as T2D due to this criterion reported that they had
        T1D in the data from Health in Central Denmark.
4.  The logic defining pregnancy index dates has been simplified to only
    use diagnoses of pregnancy endings (no longer uses data on maternal
    care visits).
    -   For the sake of simplicity, as we found no impact on
        classification accuracy in the Health in Central Denmark data.
5.  HbA1c samples taken on the same date are de-duplicated.
    -   To better align with recommended diagnostic practice [@WHO2011].
        In the original implementation, only samples taken at the exact
        same time were de-duplicated.
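
Two of the changes above (items 2 and 5) can be illustrated with a
minimal Python sketch. This is not the osdc implementation: the function
names, the input shapes, the inclusive window boundary, and the choice
to keep the earliest sample of each day are all assumptions made for
illustration only.

```python
from datetime import date, datetime, timedelta


def insulin_within_window(gld_purchases, window_days=180):
    """Check one T1D criterion for a single person (illustrative only).

    gld_purchases: list of (purchase_date, is_insulin) tuples covering all
    glucose-lowering drug purchases. The window starts at the first GLD
    purchase, not at the date of inclusion. Treating the boundary as
    inclusive is an assumption.
    """
    if not gld_purchases:
        return False
    first = min(d for d, _ in gld_purchases)
    end = first + timedelta(days=window_days)
    return any(is_insulin and first <= d <= end for d, is_insulin in gld_purchases)


def dedupe_hba1c_by_date(samples):
    """Keep one HbA1c sample per person per calendar date.

    samples: list of (person_id, sample_datetime, value) tuples.
    Keeping the earliest sample of the day is an assumption; the source
    only states that same-date samples are de-duplicated.
    """
    kept = {}
    for pid, ts, value in sorted(samples, key=lambda s: (s[0], s[1])):
        # setdefault stores only the first (earliest) sample seen per key.
        kept.setdefault((pid, ts.date()), (pid, ts, value))
    return list(kept.values())


# Hypothetical example data.
purchases = [(date(2015, 1, 1), False), (date(2015, 6, 1), True)]
has_early_insulin = insulin_within_window(purchases)

samples = [
    ("a", datetime(2019, 1, 1, 8, 0), 50),
    ("a", datetime(2019, 1, 1, 14, 0), 52),  # same date, dropped
    ("a", datetime(2019, 1, 2, 8, 0), 49),
]
unique_samples = dedupe_hba1c_by_date(samples)
```

Under the original implementation, the two samples from 1 January in
this example would both have been kept, because they were not taken at
the exact same time.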

## Validity

The validity of the OSDC algorithm is tested against self-reported
diabetes type in the Health in Central Denmark survey. The results are
reported as overall PPV (positive predictive value) and sensitivity for
each version of the algorithm and within subsets of the diabetes
population reporting onset of diabetes before or after age 40,
respectively, similar to tables 1 & 2 of the original validation paper
[@Isaksen2023].
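
As a reminder of how these metrics are defined, the sketch below
computes them for one diabetes type from paired labels. It is a generic
illustration, not the code used for the validation: the function name,
the label values, and the toy data are assumptions.

```python
def validity_metrics(reported, classified, target):
    """Compare algorithm output against self-report for one diabetes type.

    reported:   self-reported type per person (the reference standard).
    classified: type assigned by the algorithm, in the same order.
    target:     the type being evaluated, e.g. "T1D".
    """
    pairs = list(zip(reported, classified))
    tp = sum(r == target and c == target for r, c in pairs)
    fp = sum(r != target and c == target for r, c in pairs)
    fn = sum(r == target and c != target for r, c in pairs)
    tn = sum(r != target and c != target for r, c in pairs)

    def ratio(numerator, denominator):
        # Undefined metrics are returned as None instead of raising.
        return numerator / denominator if denominator else None

    return {
        "ppv": ratio(tp, tp + fp),          # share of classified cases that are true
        "sensitivity": ratio(tp, tp + fn),  # share of true cases that are found
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data, made up for illustration.
reported = ["T1D", "T1D", "T2D", "T2D", "T2D", "none"]
classified = ["T1D", "T2D", "T2D", "T2D", "T1D", "none"]
metrics = validity_metrics(reported, classified, "T1D")
# metrics == {"ppv": 0.5, "sensitivity": 0.5, "specificity": 0.75, "npv": 0.75}
```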

### Validity in 2019

This uses the same data as [the original validation
paper](https://doi.org/10.2147/CLEP.S407019) and provides a direct
comparison to the original implementation.

#### Stratified by diabetes type and age at onset

| Version | Diabetes type | PPV   | Sensitivity |
|---------|---------------|-------|-------------|
| Paper   | T1D           | 0.943 | 0.773       |
| Paper   | T1D \>40 yrs  | 0.708 | 0.378       |
| Paper   | T2D           | 0.875 | 0.944       |
| Paper   | T2D \<40 yrs  | 0.471 | 0.863       |

| Version | Diabetes type | PPV   | Sensitivity |
|---------|---------------|-------|-------------|
| 1.0     | T1D           | 0.944 | 0.783       |
| 1.0     | T1D \>40 yrs  | 0.708 | 0.378       |
| 1.0     | T2D           | 0.879 | 0.944       |
| 1.0     | T2D \<40 yrs  | 0.480 | 0.863       |

#### Bootstrapped metrics

These metrics correspond to supplementary table S3 of the validation
paper.

| Version | Diabetes type | Sensitivity | Specificity | PPV   | NPV   |
|---------|---------------|-------------|-------------|-------|-------|
| Paper   | T1D           | 0.774       | 0.999       | 0.951 | 0.997 |
| Paper   | T2D           | 0.943       | 0.989       | 0.878 | 0.995 |

| Version | Diabetes type | Sensitivity | Specificity | PPV   | NPV   |
|---------|---------------|-------------|-------------|-------|-------|
| 1.0     | T1D           | 0.788       | 0.999       | 0.947 | 0.997 |
| 1.0     | T2D           | 0.940       | 0.990       | 0.881 | 0.995 |
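
For readers unfamiliar with bootstrapping, the sketch below shows a
generic percentile bootstrap of a single metric. It is not the procedure
used in the validation paper, whose details are not described here: the
function names, the number of resamples, the seed, and the toy data are
all assumptions.

```python
import random


def ppv_t1d(pairs):
    """PPV for T1D from (reported, classified) pairs; None if undefined."""
    tp = sum(r == "T1D" and c == "T1D" for r, c in pairs)
    fp = sum(r != "T1D" and c == "T1D" for r, c in pairs)
    return tp / (tp + fp) if (tp + fp) else None


def bootstrap_metric(pairs, metric, n_resamples=1000, seed=1):
    """Resample persons with replacement and summarise the metric.

    Returns (mean, lower, upper), where lower and upper are the 2.5th and
    97.5th percentiles of the bootstrap estimates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(pairs, k=len(pairs))
        value = metric(resample)
        if value is not None:  # skip resamples where the metric is undefined
            estimates.append(value)
    if not estimates:
        raise ValueError("metric was undefined in every resample")
    estimates.sort()
    mean = sum(estimates) / len(estimates)
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return mean, lower, upper


# Toy data, made up for illustration: (self-reported, classified).
pairs = (
    [("T1D", "T1D")] * 40
    + [("T2D", "T1D")] * 5
    + [("T1D", "T2D")] * 10
    + [("T2D", "T2D")] * 100
)
mean, lower, upper = bootstrap_metric(pairs, ppv_t1d, n_resamples=200)
```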

### Validity in 2025

This section will contain metrics from validation performed in
subsequent survey waves of Health in Central Denmark, as this data
becomes available.

## Potential future changes

1.  Add support for using the medical birth register to define
    pregnancies when censoring gestational diabetes (GDM). This would
    allow censoring of glucose-lowering drug (GLD) purchases all the way
    back to 1995 (rather than only from 1997 onward, the period the
    obstetric codes are limited to) and would extend the window of valid
    dates of diagnosis to 1996 onward.
2.  Limit the historic scope of primary diagnoses used to evaluate the
    majority of diabetes-specific diagnoses in type classification
    (e.g., only evaluate the majority among the last five type-specific
    diabetes diagnoses).
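
The second idea can be sketched as follows. Because this is only a
potential change, everything here is hypothetical: the function name,
the input shape, and the rule that ties do not count as a T1D majority
are assumptions.

```python
from datetime import date


def t1d_majority(diagnoses, last_n=None):
    """Return True if T1D diagnoses outnumber T2D diagnoses.

    diagnoses: list of (diagnosis_date, type) tuples, type "T1D" or "T2D",
               for one person's type-specific primary diagnoses.
    last_n:    if given, only the last_n most recent diagnoses are used;
               None evaluates the full history.
    """
    ordered = sorted(diagnoses, key=lambda d: d[0])
    if last_n is not None:
        ordered = ordered[-last_n:]
    t1d = sum(1 for _, kind in ordered if kind == "T1D")
    t2d = sum(1 for _, kind in ordered if kind == "T2D")
    return t1d > t2d  # a tie is not a T1D majority (assumption)


# Hypothetical person: six early T1D codes, then mostly T2D codes.
history = [(date(2000 + i, 1, 1), "T1D") for i in range(6)] + [
    (date(2015, 1, 1), "T2D"),
    (date(2016, 1, 1), "T2D"),
    (date(2017, 1, 1), "T1D"),
    (date(2018, 1, 1), "T2D"),
    (date(2019, 1, 1), "T2D"),
]
full_history_t1d = t1d_majority(history)        # 7 T1D vs 4 T2D
recent_only_t1d = t1d_majority(history, last_n=5)  # 1 T1D vs 4 T2D
```

In this made-up example, the full history points to T1D while the five
most recent diagnoses point to T2D, which is the kind of case the
proposed change is meant to handle differently.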

## References
