4 Variables

This section describes the variable definitions that the Exeter Diabetes team uses for its datasets.

4.1 Codelist

The Exeter Diabetes codelists are available at Exeter-Diabetes/CPRD-Codelists.

This repository includes Medcodes (for use with the Observation table) and Prodcodes (for use with the Drug Issue table), as well as ICD10 and OPCS4 codelists for use with linked HES APC data. All codelists have been clinically reviewed, except for a small subset where review by a non-clinician was deemed acceptable (BMI, weight, height and blood pressure readings) or where PBCL biomarker codes were used (see Codelist generation process: https://github.com/Exeter-Diabetes/CPRD-Codelists/blob/main/readme.md#code-list-generation-process).

We have also included Read and SNOMED codelists created during medcode list development (see Codelist generation process); where available, these are located in each of the subfolders in the Medcodes folder.

All codelists are based on an October 2020 extract of CPRD Aurum; later versions may include extra Medcodes/Prodcodes not included here.
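As a rough illustration of how a Medcode codelist might be applied to Observation records, the sketch below loads a codelist and keeps only the matching rows. It is not taken from the repository: the tab-delimited layout, the column names (`medcodeid`, `patid`, `obsdate`) and the code values are all assumptions made for the example. Codes are kept as strings, since long numeric identifiers can lose precision if they are parsed as floating-point numbers.

```python
import csv
import io


def load_codelist(text, column="medcodeid"):
    """Return the set of codes in a tab-delimited codelist, kept as strings."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[column].strip() for row in reader}


def filter_observations(rows, codes, column="medcodeid"):
    """Keep only the observation rows whose code appears in the codelist."""
    return [row for row in rows if row[column] in codes]


# Toy data: these codes and rows are invented for illustration only.
codelist_text = (
    "medcodeid\tdescription\n"
    "100000000000001\tExample code A\n"
    "100000000000002\tExample code B\n"
)
observations = [
    {"patid": "1", "medcodeid": "100000000000001", "obsdate": "2019-03-01"},
    {"patid": "2", "medcodeid": "999999999999999", "obsdate": "2019-04-12"},
    {"patid": "3", "medcodeid": "100000000000002", "obsdate": "2020-01-20"},
]

codes = load_codelist(codelist_text)
matched = filter_observations(observations, codes)
```

The same pattern would apply to Prodcodes and the Drug Issue table by changing the `column` argument, again assuming the relevant column name.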

4.2 Variable definitions

The definitions and derivation algorithms for the variables used are available at Exeter-Diabetes/CPRD-Codelists.

This GitHub repository provides a detailed view of:

  1. Biomarker derivation algorithms (https://github.com/Exeter-Diabetes/CPRD-Codelists/tree/main?tab=readme-ov-file#biomarker-algorithms)

We have developed the R package EHRBiomarkr (available in the GitHub repository Exeter-Diabetes/EHRBiomarkr), which includes various functions for cleaning and processing biomarkers in electronic health records (EHR), especially CPRD Aurum. All functions can be used on local data (loaded into R) or on data stored in MySQL (by using the dbplyr package or another package which uses dbplyr).

The package includes two functions for cleaning biomarker values, which can be applied to a wide range of biomarkers. For more information, see https://github.com/Exeter-Diabetes/EHRBiomarkr?tab=readme-ov-file#biomarker-cleaning-functions
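To give a sense of what cleaning biomarker values can involve, the sketch below drops readings that are missing or fall outside a plausible range. It is a generic illustration, not the EHRBiomarkr implementation: the function name, the `testvalue` key and the limits shown are placeholders chosen for the example, and the package documentation should be consulted for the limits it actually applies.

```python
# Placeholder limits for illustration only; not the package's actual limits.
ILLUSTRATIVE_LIMITS = {
    "hba1c": (20.0, 195.0),  # assumed to be in mmol/mol
}


def clean_values(rows, biomarker, value_key="testvalue", limits=ILLUSTRATIVE_LIMITS):
    """Keep rows whose value is present and within the limits for the biomarker."""
    lower, upper = limits[biomarker]
    cleaned = []
    for row in rows:
        value = row.get(value_key)
        if value is None:
            continue  # drop missing readings
        if lower <= value <= upper:
            cleaned.append(row)
    return cleaned


# Toy readings, invented for the example.
readings = [
    {"patid": "1", "testvalue": 48.0},
    {"patid": "2", "testvalue": 5.0},    # below the lower limit
    {"patid": "3", "testvalue": None},   # missing
    {"patid": "4", "testvalue": 195.0},  # on the upper limit, kept
    {"patid": "5", "testvalue": 300.0},  # above the upper limit
]

kept = clean_values(readings, "hba1c")
```

Treating the limits as inclusive is a design choice made here for simplicity; a real pipeline would also need to check measurement units before applying any range.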

The R package also includes functions for calculating more complex variables.

  2. Comorbidity derivation algorithms (https://github.com/Exeter-Diabetes/CPRD-Codelists/tree/main?tab=readme-ov-file#comorbidity-algorithms)

  3. Diabetes derivation algorithms (https://github.com/Exeter-Diabetes/CPRD-Codelists/tree/main?tab=readme-ov-file#diabetes-algorithms)

  4. Sociodemographics derivation algorithms (https://github.com/Exeter-Diabetes/CPRD-Codelists/tree/main?tab=readme-ov-file#sociodemographics-algorithms)