Available data

The Simulacrum contains data about synthetic patients, such as age and sex, and data about synthetic tumours, such as staging and pathology information. As in real life, a synthetic patient can have more than one tumour.

The Simulacrum imitates data about tumours diagnosed in England between 2013 and 2017.

The vital status of each synthetic patient up to February 2019 has also been simulated so researchers can analyse survival using the Simulacrum data.

There is also data about chemotherapy treatments, known as Systemic Anti-Cancer Therapy (SACT) data, which records the types and number of treatments received. SACT data has been simulated up to March 2018 so researchers can analyse treatments following diagnosis.

The Simulacrum contains data on:

2,200,626 synthetic patients
2,371,281 synthetic tumours

Of the synthetic tumours, 1,402,070 have a cancer stage of 0-4 (this group includes non-melanoma skin cancers) and 1,137,676 have a stage of 1-4.

366,266 of the synthetic patients have chemotherapy data, covering 730,472 regimens, 2,442,037 cycles and 6,385,828 drug administrations.
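
The headline counts above can be combined into a few average ratios, which give a rough feel for the shape of the data. The following is a minimal sketch using only the figures quoted on this page; the variable names are illustrative and are not field names from the Simulacrum.

```python
# Counts quoted on this page for the Simulacrum (illustrative variable names)
patients = 2_200_626
tumours = 2_371_281
sact_patients = 366_266
regimens = 730_472
cycles = 2_442_037
drug_administrations = 6_385_828

# Simple averages derived from the headline counts
ratios = {
    "tumours per patient": tumours / patients,
    "share of patients with SACT data": sact_patients / patients,
    "regimens per SACT patient": regimens / sact_patients,
    "cycles per regimen": cycles / regimens,
    "drug administrations per cycle": drug_administrations / cycles,
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

These are averages over the whole dataset, so they say nothing about how tumours or treatments are distributed across individual synthetic patients.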

The top incidence counts over the five diagnosis years (2013 to 2017) are:

607,619 non-melanoma skin cancers (C44)
226,406 breast cancer (C50)
201,785 prostate cancer (C61)
169,118 lung cancer (C34)
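
To put these counts in context, they can be expressed as shares of all synthetic tumours. This is a rough sketch that assumes each incidence count is a count of synthetic tumours; the dictionary keys are illustrative labels, not Simulacrum field values.

```python
# Counts quoted on this page; ICD-10 codes are those given in the list above.
total_tumours = 2_371_281
top_sites = {
    "C44 non-melanoma skin": 607_619,
    "C50 breast": 226_406,
    "C61 prostate": 201_785,
    "C34 lung": 169_118,
}

# Assumption: each incidence count is a count of synthetic tumours,
# so dividing by the tumour total gives a share of all tumours.
shares = {site: count / total_tumours for site, count in top_sites.items()}
combined_share = sum(top_sites.values()) / total_tumours

for site, share in shares.items():
    print(f"{site}: {share:.1%}")
print(f"Four sites combined: {combined_share:.1%}")
```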

For a detailed description of all the data fields within the Simulacrum, please see our Simulacrum data dictionary.

Release comparison chart

Measure                                          | Simulacrum v1.1.0 (2015) | Simulacrum v1.2.0 (2017)
Diagnosis years                                  | 2013-2015                | 2013-2017
Synthetic patients                               | 1,322,100                | 2,200,626
Synthetic tumours                                | 1,402,817                | 2,371,281
Stages 0-4 (including non-melanoma skin cancers) | 821,454                  | 1,402,070
Stages 1-4                                       | 659,948                  | 1,137,676
Synthetic patients with SACT data                | 245,938                  | 366,266
SACT regimens                                    | 471,919                  | 730,472
SACT cycles                                      | 1,462,099                | 2,442,037
SACT drug administrations                        | 3,533,584                | 6,385,828
Non-melanoma skin cancers (C44)                  | 350,130                  | 607,619
Breast cancer (C50)                              | 133,907                  | 226,406
Prostate cancer (C61)                            | 119,347                  | 201,785
Lung cancer (C34)                                | 102,350                  | 169,118
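
The chart can also be read as relative growth between the two releases. The sketch below computes the percentage increase for a selection of rows, using figures copied from the chart; the `releases` dictionary and the `percent_increase` helper are illustrative names, not part of any Simulacrum tooling.

```python
# (v1.1.0, v1.2.0) counts copied from the release comparison chart
releases = {
    "Synthetic patients": (1_322_100, 2_200_626),
    "Synthetic tumours": (1_402_817, 2_371_281),
    "Synthetic patients with SACT data": (245_938, 366_266),
    "SACT regimens": (471_919, 730_472),
    "SACT cycles": (1_462_099, 2_442_037),
    "SACT drug administrations": (3_533_584, 6_385_828),
}


def percent_increase(old: int, new: int) -> float:
    """Return the increase from old to new as a percentage of old."""
    return (new - old) / old * 100


for label, (v110, v120) in releases.items():
    print(f"{label}: {percent_increase(v110, v120):.1f}% more in v1.2.0")
```

Because v1.2.0 covers five diagnosis years rather than three, most of this growth reflects the longer diagnosis window rather than any change in how the data was simulated.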