Artificial patient-like cancer data to help researchers gain insights

The Simulacrum is synthetic data that imitates cancer patient records held securely by the National Disease Registration Service (NDRS) at NHS England. 

The data in the Simulacrum is made up entirely of artificial patient records. It does not contain any real patient information, so it cannot be used to identify a real person.  

Simulacrum data is free to download and use, allowing anyone to analyse realistic record-level cancer data without risk of breaching patient confidentiality. 

The Simulacrum keeps the structure and statistical properties of the real data, so you can use it to write and test queries that can then be run on the real data.
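As a rough illustration of that workflow, the sketch below writes a query and tests it against a small stand-in table before it would be run anywhere else. It uses Python's built-in sqlite3 module. The table name `tumour`, the columns `patient_id`, `site_code` and `diagnosis_year`, and the rows themselves are all invented for this example and are not taken from the Simulacrum; check the data dictionary for the real table and column names.

```python
import sqlite3

# Hypothetical schema and invented rows, for illustration only.
# These are NOT real Simulacrum table names, column names or records.
ROWS = [
    (1, "C50", 2017),
    (2, "C50", 2018),
    (3, "C34", 2017),
    (4, "C18", 2018),
    (5, "C34", 2018),
]

# The query being developed: number of tumour records per site code.
QUERY = """
    SELECT site_code, COUNT(*) AS n_tumours
    FROM tumour
    GROUP BY site_code
    ORDER BY site_code
"""


def build_toy_database():
    """Create an in-memory database holding the stand-in table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tumour ("
        "patient_id INTEGER, site_code TEXT, diagnosis_year INTEGER)"
    )
    conn.executemany("INSERT INTO tumour VALUES (?, ?, ?)", ROWS)
    return conn


def count_by_site(conn):
    """Run the query and return a list of (site_code, n_tumours) tuples."""
    return conn.execute(QUERY).fetchall()


conn = build_toy_database()
counts = count_by_site(conn)
conn.close()
print(counts)
```

Because the query text is kept separate from the connection, the same query could be tested on synthetic data and later run unchanged against a database with the same structure.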


Find out what the Simulacrum is and why we made it.

Available data

Find out what data is in the Simulacrum.

Getting started

Find out how to use the Simulacrum.

Download the Simulacrum

Download the Simulacrum data and get started on your research.


Everything you want to know about the project, from who made it to how it was made and how to use it.