
The Simulacrum imitates some of the data held securely by Public Health England’s National Cancer Registration and Analysis Service.
The data in the Simulacrum is entirely artificial. It does not contain data about real patients, so users can never identify a real person. It is free to use and allows anyone who wants to use record-level cancer data to do so, safe in the knowledge that while the data feels like the real thing, there is no danger of breaching patient confidentiality.
We have kept the data model – the shape of the data – the same as that of the real data, so that the Simulacrum can be used to write and test queries that would also run on the real data.
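
As a rough illustration of that idea, the sketch below writes one query and runs it against a small stand-in table. The table name `tumour` and its columns are hypothetical, chosen only for this example and not taken from the Simulacrum's data dictionary, and the in-memory SQLite database is an assumption standing in for wherever the synthetic data is loaded.

```python
import sqlite3

# Hypothetical table and column names, for illustration only;
# they are not taken from the Simulacrum data dictionary.
QUERY = """
    SELECT diagnosis_year, COUNT(*) AS n_tumours
    FROM tumour
    GROUP BY diagnosis_year
    ORDER BY diagnosis_year
"""


def count_tumours_by_year(conn):
    """Run the same query text against whichever database is connected."""
    return conn.execute(QUERY).fetchall()


# Stand-in for the synthetic data: a tiny in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tumour (tumour_id INTEGER PRIMARY KEY, diagnosis_year INTEGER)"
)
conn.executemany(
    "INSERT INTO tumour (tumour_id, diagnosis_year) VALUES (?, ?)",
    [(1, 2013), (2, 2013), (3, 2014)],
)

rows = count_tumours_by_year(conn)
print(rows)
```

Because the query text depends only on the table and column names, a query developed this way against synthetic data should need no changes when it is later run on a database with the same structure.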

Updated version of Simulacrum released
We are pleased to announce the release of an updated version of the Simulacrum containing two additional years of diagnostic data (2013 – 2017 diagnoses) and treatment follow-up to March 2018. This new version of the Simulacrum contains almost twice as many synthetic...
Background
What the Simulacrum is and why we made it.
Available data
Find out what data is in the Simulacrum.
Getting started
Download the Simulacrum
FAQs
Everything you want to know about the project, from who made it to how it was made and how to use it.