Available data

The Simulacrum contains data about synthetic patients, such as age and sex, and data about synthetic tumours, such as staging and pathology information. As in real life, a synthetic patient can have more than one tumour.

The Simulacrum imitates data about tumours diagnosed in England in 2013, 2014, and 2015.

The vital status of each synthetic patient up to the end of 2017 has also been simulated so researchers can analyse survival using the Simulacrum data.

The Simulacrum also includes data about chemotherapy treatments, known as Systemic Anti-Cancer Therapy (SACT) data, which records the types and number of treatments received. SACT data has been simulated up to June 2016 so researchers can analyse treatments following diagnosis.

The Simulacrum contains data on:

1,322,100 synthetic patients
1,402,817 synthetic tumours

Of the synthetic tumours, 821,454 have a cancer stage between 0 and 4, a figure that includes non-melanoma skin cancers; 659,948 have a stage between 1 and 4.

245,938 of the synthetic patients have chemotherapy data, covering 471,919 regimens, 1,462,099 cycles and 3,533,584 drug administrations.
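As a rough illustration of how these totals relate to one another, the short Python sketch below derives a few averages from the figures quoted in this section. The variable names are illustrative only and are not Simulacrum field names, and the results are simple ratios of the published totals rather than statistics computed from the data itself.

```python
# Totals quoted in this section (variable names are illustrative only,
# not Simulacrum field names).
patients = 1_322_100
tumours = 1_402_817
chemo_patients = 245_938
regimens = 471_919
cycles = 1_462_099
drug_administrations = 3_533_584


def ratio(numerator, denominator):
    """Return numerator / denominator as a float."""
    return numerator / denominator


# Simple ratios of the published totals; these are averages only and say
# nothing about how the values are distributed across patients.
summary = {
    "tumours per patient": ratio(tumours, patients),
    "share of patients with chemotherapy data": ratio(chemo_patients, patients),
    "regimens per chemotherapy patient": ratio(regimens, chemo_patients),
    "cycles per regimen": ratio(cycles, regimens),
    "drug administrations per cycle": ratio(drug_administrations, cycles),
}

for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

On these totals there are about 1.06 synthetic tumours per synthetic patient, and just under two regimens on average for each patient with chemotherapy data.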

The top incidence counts over the three years are:

350,130 non-melanoma skin cancers (C44)
133,907 breast cancers (C50)
119,347 prostate cancers (C61)
102,350 lung cancers (C34)
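
To put the counts above in proportion, the Python sketch below expresses each one as a share of all synthetic tumours. It assumes that the incidence counts are tumour counts directly comparable with the 1,402,817 total quoted earlier, which this page does not state explicitly; the dictionary layout and variable names are illustrative only.

```python
# Assumption: the incidence counts are tumour counts that can be compared
# with the total number of synthetic tumours quoted in this section.
total_tumours = 1_402_817

# ICD-10 code -> (description, incidence count over the three years)
top_sites = {
    "C44": ("non-melanoma skin cancer", 350_130),
    "C50": ("breast cancer", 133_907),
    "C61": ("prostate cancer", 119_347),
    "C34": ("lung cancer", 102_350),
}

# Share of all synthetic tumours accounted for by each site.
shares = {code: count / total_tumours for code, (_, count) in top_sites.items()}

# Combined count and share for the four sites listed.
top_four_total = sum(count for _, count in top_sites.values())
top_four_share = top_four_total / total_tumours

for code, (name, count) in top_sites.items():
    print(f"{code} {name}: {count:,} ({shares[code]:.1%})")
print(f"Top four combined: {top_four_total:,} ({top_four_share:.1%})")
```

Under that assumption, the four sites listed together account for roughly half of all synthetic tumours, with non-melanoma skin cancers alone making up about a quarter.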

For a detailed description of all the data fields within the Simulacrum, please see our Simulacrum data dictionary.