Table Structure

The Simulacrum is a collection of linked data tables that have the same structure as the tables in the original NCRAS data.

There are three sets of tables within the Simulacrum:

1. Cancer registration tables (SIM_AV)

2. Systemic Anti-Cancer Therapy (SACT) tables (SIM_SACT)

3. Lookup tables

The tables are linked as follows:

 

The SIM_AV_ tables represent the patient and tumour registration data, and the SIM_SACT_ tables represent the Systemic Anti-Cancer Therapy (SACT) data. The two sets of tables are linked through their patient tables by the Link_Number, which is the common identifier in both datasets.
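As a minimal sketch of this patient-level linkage, the example below builds two toy patient tables in an in-memory SQLite database and joins them on the Link_Number. The table names SIM_AV_PATIENT and SIM_SACT_PATIENT, the ID columns PATIENTID and MERGED_PATIENT_ID, the exact spelling LINK_NUMBER, and all of the data values are assumptions made for illustration; check them against the actual Simulacrum table definitions before reusing the query.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two toy patient tables.

    Table names, column names and values are hypothetical; only the
    idea of a shared Link_Number comes from the documentation.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE SIM_AV_PATIENT (
            PATIENTID   INTEGER PRIMARY KEY,
            LINK_NUMBER INTEGER
        );
        CREATE TABLE SIM_SACT_PATIENT (
            MERGED_PATIENT_ID INTEGER PRIMARY KEY,
            LINK_NUMBER       INTEGER
        );
        INSERT INTO SIM_AV_PATIENT VALUES (1, 1001), (2, 1002), (3, 1003);
        INSERT INTO SIM_SACT_PATIENT VALUES (10, 1001), (11, 1003);
        """
    )
    return conn


def link_patients(conn):
    """Return (AV patient ID, SACT patient ID, Link_Number) for every
    patient who appears in both datasets."""
    sql = """
        SELECT av.PATIENTID, sact.MERGED_PATIENT_ID, av.LINK_NUMBER
        FROM SIM_AV_PATIENT AS av
        JOIN SIM_SACT_PATIENT AS sact
          ON sact.LINK_NUMBER = av.LINK_NUMBER
        ORDER BY av.PATIENTID
    """
    return conn.execute(sql).fetchall()


linked = link_patients(build_demo_db())
print(linked)  # [(1, 10, 1001), (3, 11, 1003)]
```

The inner join keeps only patients present in both datasets; the patient with Link_Number 1002 has a registration record but no SACT record in this toy data and is therefore dropped.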

The SIM_AV_TUMOUR table contains most of the detailed information on the staging and pathology of each tumour. A patient can have multiple tumours in both the registration and the SACT datasets. Apart from the linkage by Link_Number, the SACT and AV tables are independent of each other, so the tumour IDs in the SIM_AV_TUMOUR table cannot be matched to the tumour IDs in the SIM_SACT_TUMOUR table. The SACT tables contain details of the treatments received by patients and can be used to identify patients who were treated with particular therapeutic regimens.
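The following sketch illustrates one way to identify patients treated with a given regimen and then look up their registered tumours. Because tumour IDs cannot be matched across the two datasets, the query goes from the SACT side to the AV side through the patient tables only. The tables SIM_AV_PATIENT, SIM_SACT_PATIENT and SIM_SACT_REGIMEN, every column name other than the Link_Number, and the regimen and site values are hypothetical and chosen only to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, heavily simplified schema for illustration only.
    CREATE TABLE SIM_AV_PATIENT   (PATIENTID INTEGER, LINK_NUMBER INTEGER);
    CREATE TABLE SIM_AV_TUMOUR    (TUMOURID INTEGER, PATIENTID INTEGER, SITE TEXT);
    CREATE TABLE SIM_SACT_PATIENT (MERGED_PATIENT_ID INTEGER, LINK_NUMBER INTEGER);
    CREATE TABLE SIM_SACT_REGIMEN (MERGED_PATIENT_ID INTEGER, REGIMEN TEXT);

    INSERT INTO SIM_AV_PATIENT   VALUES (1, 1001), (2, 1002);
    INSERT INTO SIM_AV_TUMOUR    VALUES (501, 1, 'C50'), (502, 1, 'C18'), (503, 2, 'C34');
    INSERT INTO SIM_SACT_PATIENT VALUES (10, 1001), (11, 1002);
    INSERT INTO SIM_SACT_REGIMEN VALUES (10, 'FEC'), (10, 'FEC'), (11, 'CARBOPLATIN');
    """
)


def registered_tumours_for_regimen(conn, regimen):
    """Return (Link_Number, AV tumour ID, site) for patients who received
    the given regimen, linking the datasets at patient level only."""
    sql = """
        SELECT DISTINCT av_p.LINK_NUMBER, av_t.TUMOURID, av_t.SITE
        FROM SIM_SACT_REGIMEN AS r
        JOIN SIM_SACT_PATIENT AS sact_p
          ON sact_p.MERGED_PATIENT_ID = r.MERGED_PATIENT_ID
        JOIN SIM_AV_PATIENT AS av_p
          ON av_p.LINK_NUMBER = sact_p.LINK_NUMBER
        JOIN SIM_AV_TUMOUR AS av_t
          ON av_t.PATIENTID = av_p.PATIENTID
        WHERE r.REGIMEN = ?
        ORDER BY av_p.LINK_NUMBER, av_t.TUMOURID
    """
    return conn.execute(sql, (regimen,)).fetchall()


fec_tumours = registered_tumours_for_regimen(conn, "FEC")
print(fec_tumours)  # [(1001, 501, 'C50'), (1001, 502, 'C18')]
```

Note that the result lists every registered tumour for each treated patient. With only the patient-level link available, the query cannot say which of those tumours the regimen was given for.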

Note: we have kept the field-naming convention used in PHE’s Cancer Analysis System (CAS) so that queries built using the Simulacrum data can be run directly on the CAS data, subject to the appropriate permissions and ethical approvals from PHE. The one exception is the CAS field “NHS_Number”, which we have renamed Link_Number in the simulated data. This avoids any risk that a casual viewer might assume that the NHS unique identifier has been included in the synthetic data, when it has not.