
Research ready Lung Data
Overview
This data dictionary describes the contents of the research ready lung cancer data set. This data set is made up of all Queensland non-small cell lung cancer (NSCLC) diagnoses between 2000 and 2019. The fields have been arranged in four categories as follows:
- Demographic,
- Clinical,
- Treatment, and
- Administrative
A dataset sample containing 10 example records is also provided. For detailed reference tables containing ICD codes for clinical data items such as Primary Site, Morphology and Procedure codes, please see Reference Sets.
Population-wide data on stage is available for the first time, with the most recent years (2017-2019) containing TNM stage information. Multi-modal treatment data (Surgery, Radiation therapy, Intravenous Systemic Therapy) is available for patients who underwent treatment for their diagnosis of lung cancer, along with details of known multidisciplinary team (MDT) presentations.
All of the data within this collection comes from the Queensland Oncology Repository (QOR), a cancer patient database developed and maintained by the Queensland Cancer Control Analysis Team (QCCAT; Queensland Health) to support Queensland’s cancer control, safety, and quality assurance initiatives. QOR consolidates cancer patient information for the state and contains data on diagnoses and deaths, surgery, chemotherapy, and radiotherapy. For more information, visit our website.
Click here to request access
Demographic
Demographic fields in this dataset include the age, sex and Indigenous status of the person diagnosed with cancer, as well as geographic and socio-economic data based on the person’s place of residence at diagnosis.
Field title | Field name | Definition | Field values |
---|---|---|---|
Age at diagnosis | AgeAtDiagnosis | The age of the person in (completed) years at a specific point in time when first diagnosed | |
Age group at diagnosis (code) | AgeGroupFiveYearsKey | The five-year age group the person belonged to at a specific point in time when first diagnosed | |
Age group at diagnosis (description) | AgeGroupFiveYears | The five-year age group the person belonged to at a specific point in time when first diagnosed | |
Date of birth | BirthDate | The date on which an individual was born | |
Date of death | DeathDate | The date of death of the person | |
Hospital and Health Service (code) | HHSOfResidenceKey | Queensland Hospital and health service geographic region at diagnosis | |
Hospital and Health Service (description) | HHSOfResidence | Queensland Hospital and health service geographic region at diagnosis | Cairns and Hinterland Central Queensland Central West Darling Downs Gold Coast Mackay Metro North Metro South North West South West Sunshine Coast Torres and Cape Townsville West Moreton Wide Bay |
Indigenous status (code) | IndigenousStatusID | A measure of whether a person identifies as being of Aboriginal or Torres Strait Islander origin | |
Indigenous status (description) | IndigenousStatus | A measure of whether a person identifies as being of Aboriginal or Torres Strait Islander origin | Indigenous non-Indigenous Not Stated/Unknown |
Remoteness area | Remoteness | The remoteness of residence at time of diagnosis. | Major City Inner Regional Outer Regional Remote & Very Remote |
Sex (code) | SexID | The biological distinction between male and female | |
Sex (description) | Sex | The biological distinction between male and female | |
Socioeconomic status | DecileID | Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Area 2 (SA2). (See summary table) The Index of Relative Socio-economic Advantage and Disadvantage (IRSAD) is used. | 1 (Low) to 10 (High) |
Socioeconomic status | DecileName | Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Area 2 (SA2). (See summary table) The Index of Relative Socio-economic Advantage and Disadvantage (IRSAD) is used. | 1 2 3 4 5 6 7 8 9 10 |
Socioeconomic status (group) | SocioeconomicStatus | Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Area 2 (SA2). (See summary table) | Affluent Disadvantaged Middle Unknown |
Statistical Area 2 (SA2 Residence at Diagnosis) | ASGS_SA2Code | A designated region describing location and contact details that represents a medium-sized area built from a number of Statistical Area 1, as represented by a code. The aim is to represent a community that interacts together socially and economically | See ABS website for details |
Statistical Area 2 (SA2 Residence at Diagnosis) | ASGS_SA2Description | A designated region describing location and contact details that represents a medium-sized area built from a number of Statistical Area 1. The aim is to represent a community that interacts together socially and economically | See ABS website for details |
Clinical
The clinical fields in this dataset describe the cancer diagnosed, including the date of diagnosis and clinical features of the tumour such as size, morphology, nodal status and the presence of metastases. For the first time, clinical stage is included in this data. Full staging information is available only for patients diagnosed in 2017-2019, although Stage IV patients have been identified from 2000 onwards. This data is comprehensive and has been obtained from clinical audits conducted by clinicians and coders, leading to the development of automated algorithms used on prospective data to glean stage routinely for new diagnoses.
Field title | Field name | Definition | Field values |
---|---|---|---|
Cancer-related death | DeathEventCauseSpecific | Was the person’s death caused by their cancer? | 0 - No 1 - Yes |
Date of diagnosis | DiagnosisDate | The date a disease or condition is diagnosed | |
Differentiation (code) | DifferentiationKey | The histological grade of the cancer tissue in a person with cancer | 1 2 3 4 98 99 |
Differentiation (description) | Differentiation | The histological grade of the cancer tissue in a person with cancer | Well Differentiated / Low Grade / Grade 1 Moderately Differentiated / Intermediate Grade / Grade 2 Not Applicable Not Stated/Unknown Poorly Differentiated / High Grade / Grade 3 Undifferentiated / Anaplastic / Grade 4 |
Morphology of cancer (code) | MorphologyCode | The histological classification of the cancer tissue (histopathological type) in a person with cancer, and a description of the course of development that a tumour is likely to take: benign or malignant (behaviour), as represented by a code. | See Appendix A |
Morphology of cancer (description) | Morphology | The histological classification of the cancer tissue (histopathological type) in a person with cancer, and a description of the course of development that a tumour is likely to take: benign or malignant (behaviour). | See Appendix A |
Morphology of cancer (group) | MorphologyGroup | High level grouping of the morphologies | Adenocarcinomas Other Specific Carcinomas Squamous Carcinomas Unspecified Carcinomas (NOS) |
Most valid basis of diagnosis of cancer (code) | DiagnosisBasisKey | The most reliable basis of a cancer diagnosis | 0 1 2 3 4 5 6 7 8 9 10 11 12 9999 |
Most valid basis of diagnosis of cancer (description) | DiagnosisBasis | The most reliable basis of a cancer diagnosis | Clinical Investigations |
Number of comorbidities | ComorbidityCount | A grouping of clinical conditions that has the potential to significantly affect a cancer patient’s prognosis. | Numeric value |
Number of comorbidities (grouped) | ComorbidityCountGroup | A grouping of clinical conditions that has the potential to significantly affect a cancer patient’s prognosis. | 0 1 2+ |
Performance status (code) | PerformanceStatusCode | Performance status recorded in QOOL at time of MDT (code) | 0 1 2 3 4 99 |
Performance status (description) | PerformanceStatus | Performance status recorded in QOOL at time of MDT (description) | Fully active Ambulatory - capable of light work Bed < 50% - self caring - not working Bed > 50% - partially self caring Confined to bed or chair Unknown |
Primary site of cancer (description) | PrimarySite | The site of origin of the tumour, as opposed to the secondary or metastatic sites. | Bronchus or lung Lower lobe, bronchus or lung Main bronchus Middle lobe, bronchus or lung Overlapping lesion of bronchus and lung Trachea Upper lobe, bronchus or lung |
Primary site of cancer (group) | PrimarySiteGroup | High level grouping of the sites in which the tumour originated in a person with cancer | NSCLC |
Primary site of cancer (ICD-10-AM code) | PrimarySiteCode | The site of origin of the tumour, as opposed to the secondary or metastatic sites, as represented by an ICD-10-AM code. | C33 C340 C341 C342 C343 C348 C349 |
Underlying cause of death | CauseOfDeath | The cause of death of the person as represented by an ICD-10-AM code. | |
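
The grouped comorbidity field can be reproduced from the numeric count. The following is a minimal sketch, assuming `ComorbidityCount` is a non-negative integer; the function name `comorbidity_group` is illustrative and not part of the dataset.

```python
def comorbidity_group(count: int) -> str:
    """Collapse a comorbidity count into the groups '0', '1' and '2+'."""
    if count < 0:
        raise ValueError("comorbidity count cannot be negative")
    # Counts of two or more share the open-ended '2+' group.
    return str(count) if count < 2 else "2+"
```

For example, `comorbidity_group(3)` returns `"2+"`.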
Treatment
Treatment data items fall into four sub-categories: Multidisciplinary Team meetings (MDT), Surgical procedures, Radiation therapy (RT), and Intravenous systemic therapy (IVST). For detailed information regarding clinical data items, please see Reference Sets.
MULTIDISCIPLINARY TEAM MEETINGS (MDT)
MDT data is present from 2000 onwards. Initial MDT data is limited to data from a project run at The Prince Charles Hospital; data sourced from QOOL is available from 2009. MDTs also provide many of the data items required for staging.
Field title | Field name | Definition | Field values |
---|---|---|---|
Had MDT review | HadMDTReview | Record of MDT presentation as recorded by QOOL | No Yes |
SURGICAL DATA
Data items related to surgery and the admission during which the procedure was performed are outlined below. Surgical treatment for lung cancer is complex, and procedures fall into three broad groupings. For details of the procedure codes that fall under these categories, please see Appendix A.
Field title | Field name | Definition | Field values |
---|---|---|---|
ASA score (procedure) | ProcASAScore | A score used to assess and communicate a patient’s pre-anaesthesia medical co-morbidities. The classification system alone does not predict perioperative risks, but used with other factors (e.g. type of surgery, frailty, level of deconditioning) it can help to predict them. | 1 2 3 4 5 9 |
Date of admission | AdmissionDate | Date admitted to a facility for procedure | |
Date of discharge | DischargeDate | Date discharged from a facility after procedure | |
Date of procedure | ProcedureDate | The date on which a clinical intervention was performed during an inpatient episode of care | |
Death in hospital | DeathInpatient | Death in hospital following surgery | No Yes |
Elective status of admission | ProcElectiveStatus | Denotes if admission was elective or emergency | Elective admission Emergency admission Not assigned |
Facility capability score (procedure) | ProcedureFacilityCSCF | High level grouping of hospital capability for service delivery | 3 4 5 6 |
Facility peer group (procedure) | ProcFacPeerGrp | High level grouping of AIHW peer group that does not distinguish between public and private | |
Length of stay | ProcedureLOS | The number of days a patient was in hospital during the admission for their procedure | |
Patient received treatment for their cancer | HadTreatment | Did the patient receive treatment for their cancer | No Yes |
Patient travelled for surgery | TravelledForSurgery | Was the surgery performed in a facility within the same HHS that the person lives in | No Yes |
Patient underwent surgery | IsSurgery | Did the patient have surgery for their cancer | No Yes |
Procedure code | ProcedureCode | A clinical intervention represented by an ICD-10-AM 11th Edition code | 3843800 3843801 3843802 3844000 3844001 3844100 3844101 9016900 |
Procedure group name | ProcedureGroupName | A group of clinical interventions | Lobectomy Partial Resection Pneumonectomy |
Procedure name | ProcedureName | A description of the clinical intervention (ICD-10-AM 11th Edition) | Endoscopic wedge resection of lung Lobectomy of lung Pneumonectomy Radical lobectomy Radical pneumonectomy Radical wedge resection of lung Segmental resection of lung Wedge resection of lung |
RADIATION THERAPY (RT)
Field title | Field name | Definition | Field values |
---|---|---|---|
Date of radiotherapy (end) | FirstRTEndDate | Date radiation therapy was completed | |
Date of radiotherapy (start) | FirstRTStartDate | Date radiation therapy was first received | |
Death within 30 days of radiotherapy | RT30DaysToDeath | Did the patient receive radiotherapy in the 30 days prior to death | No Yes |
Facility type (radiotherapy) | FirstRTFacilityType | Facility type (public/private) where radiation therapy was delivered | |
Patient received adjuvant radiotherapy | HadPostProcRT | | No Yes |
Patient received RT | HadRT | Did the patient receive radiotherapy as treatment for their cancer | No Yes |
Patient received RT before surgery | HadPreProcRT | | No Yes |
Treatment intent (final RT) | LastRTIntent | The intent of the course of radiation therapy (curative/palliative) | Curative Palliative |
Treatment intent (radiotherapy) | FirstRTIntent | The intent of the course of radiation therapy (curative/palliative) | Curative Palliative |
IV SYSTEMIC THERAPY
Field title | Field name | Definition | Field values |
---|---|---|---|
Date of IVST (end) | FirstCTEndDate | Date IV systemic therapy completed | |
Date of IVST (start) | FirstCTStartDate | Date IV systemic therapy began | |
Death within 30 days of IVST | CT30DaysToDeath | Did the person receive IV systemic therapy in the 30 days prior to death | No Yes |
Patient received IVST | HadCT | Did the person receive IV systemic therapy for their cancer | No Yes |
Administrative
Field title | Field name | Definition | Field values |
---|---|---|---|
Censor date | SurvivalCensorDate | Patients followed up until this date | |
Patient identifier | UniqueID | Unique identifier for each person in the dataset | |
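
For survival analyses, follow-up time is typically measured from `DiagnosisDate` to `DeathDate`, or to `SurvivalCensorDate` for people with no recorded death. The sketch below assumes the dates have already been parsed into `datetime.date` objects, that a missing `DeathDate` is represented as `None`, and that deaths recorded after the censor date are treated as censored. These conventions, and the function name `follow_up`, are assumptions for illustration rather than part of the dataset specification.

```python
from datetime import date
from typing import Optional, Tuple


def follow_up(
    diagnosis_date: date,
    death_date: Optional[date],
    censor_date: date,
) -> Tuple[int, bool]:
    """Return (follow-up time in days, whether a death was observed)."""
    # Assumption: a death only counts as an event if it falls on or before
    # the censor date; otherwise the person is censored at that date.
    if death_date is not None and death_date <= censor_date:
        return (death_date - diagnosis_date).days, True
    return (censor_date - diagnosis_date).days, False
```

A person diagnosed on 1 January 2018 who died on 31 January 2018 would contribute 30 days of follow-up with the event observed.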
Sample data
This sample contains 10 sample records intended to show the type of data available and the format in which the data is presented. This will allow researchers to prepare load scripts and analysis programs in advance of downloading the full data set.
Click here to download the sample data in Excel.
Reference sets
AGE GROUPS
The five-year age group the person belonged to at a specific point in time when first diagnosed
Age group key | Age group |
---|---|
1 | 0-4 |
2 | 5-9 |
3 | 10-14 |
4 | 15-19 |
5 | 20-24 |
6 | 25-29 |
7 | 30-34 |
8 | 35-39 |
9 | 40-44 |
10 | 45-49 |
11 | 50-54 |
12 | 55-59 |
13 | 60-64 |
14 | 65-69 |
15 | 70-74 |
16 | 75-79 |
17 | 80-84 |
18 | 85+ |
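
The key in the table above follows a regular pattern, so it can be derived from `AgeAtDiagnosis` rather than looked up. This is a minimal sketch, assuming age is held as a whole number of completed years; the function names are illustrative.

```python
def age_group_key(age_years: int) -> int:
    """Return the five-year age group key (1-18) for an age in completed years."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # Keys 1-17 cover 0-4 through 80-84; key 18 is the open-ended 85+ group.
    return min(age_years // 5 + 1, 18)


def age_group_label(key: int) -> str:
    """Return the label shown in the reference table for a given key."""
    if not 1 <= key <= 18:
        raise ValueError("key must be between 1 and 18")
    if key == 18:
        return "85+"
    lower = (key - 1) * 5
    return f"{lower}-{lower + 4}"
```

For example, an age of 67 maps to key 14, whose label is `65-69`.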
SEX
The biological distinction between male and female.
Reference code | Short description | Long description |
---|---|---|
1 | MALE | MALE |
2 | FEMALE | FEMALE |
3 | OTHER | OTHER |
9 | NOT STATED/INADEQUATELY DESCRIBED | NOT STATED/INADEQUATELY DESCRIBED |
INDIGENOUS STATUS
A measure of whether a person identifies as being of Aboriginal or Torres Strait Islander origin.
Reference code | Short description | Long description |
---|---|---|
1 | Aboriginal but not Torres Strait Islander origin | Aboriginal but not Torres Strait Islander origin |
2 | Torres Strait Islander but not Aboriginal origin | Torres Strait Islander but not Aboriginal origin |
3 | Both Aboriginal and Torres Strait Islander origin | Both Aboriginal and Torres Strait Islander origin |
4 | Neither Aboriginal nor Torres Strait Is. Origin | Neither Aboriginal nor Torres Strait Islander origin |
9 | Not Stated / Unknown | Not Stated / Unknown |
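
The `IndigenousStatus` description field uses three values (Indigenous, non-Indigenous, Not Stated/Unknown), while the reference codes above distinguish five. One plausible roll-up is sketched below. The mapping of codes 1 to 3 onto "Indigenous" is an assumption made for illustration and should be checked against the sample data; the names `INDIGENOUS_STATUS_GROUP` and `indigenous_status_group` are hypothetical.

```python
# Assumed roll-up from reference code to the three-value description field.
INDIGENOUS_STATUS_GROUP = {
    1: "Indigenous",           # Aboriginal but not Torres Strait Islander origin
    2: "Indigenous",           # Torres Strait Islander but not Aboriginal origin
    3: "Indigenous",           # Both Aboriginal and Torres Strait Islander origin
    4: "non-Indigenous",       # Neither Aboriginal nor Torres Strait Islander origin
    9: "Not Stated/Unknown",
}


def indigenous_status_group(code: int) -> str:
    """Map a reference code to the grouped description (assumed mapping)."""
    # Codes outside the reference set are treated as unknown in this sketch.
    return INDIGENOUS_STATUS_GROUP.get(code, "Not Stated/Unknown")
```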
PRIMARY SITE
The site of origin of the tumour, as opposed to the secondary or metastatic sites, as represented by an ICD-10-AM code.
Primary site code | Primary site punctuated | Short description | Long description | Group |
---|---|---|---|---|
C33 | C33 | Trachea | Malignant neoplasm of trachea | Lung |
C340 | C34.0 | Main bronchus | Malignant neoplasm of main bronchus | Lung |
C341 | C34.1 | Upper lobe | Malignant neoplasm of upper lobe, bronchus or lung | Lung |
C342 | C34.2 | Middle lobe | Malignant neoplasm of middle lobe, bronchus or lung | Lung |
C343 | C34.3 | Lower lobe | Malignant neoplasm of lower lobe, bronchus or lung | Lung |
C348 | C34.8 | Overlapping lesion of lung | Overlapping malignant lesion of bronchus and lung | Lung |
C349 | C34.9 | Lung | Malignant neoplasm of bronchus or lung, unspecified | Lung |
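
The punctuated form in the table differs from `PrimarySiteCode` only by a decimal point after the third character. A minimal sketch of that conversion follows; the function name `punctuate_site_code` is illustrative.

```python
def punctuate_site_code(code: str) -> str:
    """Convert an unpunctuated site code such as 'C340' to 'C34.0'."""
    code = code.strip().upper()
    # Three-character codes such as 'C33' have no subdivision to punctuate.
    if len(code) <= 3:
        return code
    return f"{code[:3]}.{code[3:]}"
```

This is useful when joining the dataset to external ICD-10-AM listings that use the punctuated form.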
MORPHOLOGY
The histological classification of the cancer tissue (histopathological type) in a person with cancer, and a description of the course of development that a tumour is likely to take: benign or malignant (behaviour).
Morphology code | Short description | Group |
---|---|---|
81403 | Adenocarcinoma | Adenocarcinomas |
82003 | Adenoid cystic carcinoma | Adenocarcinomas |
82013 | Cribriform carcinoma | Adenocarcinomas |
82113 | Tubular adenocarcinoma | Adenocarcinomas |
82503 | Bronchiolo-alveolar adenocarcinoma | Adenocarcinomas |
82513 | Alveolar adenocarcinoma | Adenocarcinomas |
82523 | Bronchiolo-alveolar carcinoma, non-mucinous | Adenocarcinomas |
82533 | Bronchiolo-alveolar carcinoma, mucinous | Adenocarcinomas |
82543 | Bronchiolo-alveolar carcinoma, mixed mucinous and non-mucinous | Adenocarcinomas |
82553 | Adenocarcinoma with mixed subtypes | Adenocarcinomas |
82603 | Papillary adenocarcinoma | Adenocarcinomas |
82633 | Adenocarcinoma in tubulovillous adenoma | Adenocarcinomas |
83103 | Clear cell adenocarcinoma | Adenocarcinomas |
83233 | Mixed cell adenocarcinoma | Adenocarcinomas |
84303 | Mucoepidermoid carcinoma | Adenocarcinomas |
84803 | Mucinous adenocarcinoma | Adenocarcinomas |
84813 | Mucin-producing adenocarcinoma | Adenocarcinomas |
84903 | Signet ring cell carcinoma | Adenocarcinomas |
85503 | Acinar cell carcinoma | Adenocarcinomas |
85723 | Adenocarcinoma with spindle cell metaplasia | Adenocarcinomas |
85743 | Adenocarcinoma with neuroendocrine differentiation | Adenocarcinomas |
85763 | Hepatoid adenocarcinoma | Adenocarcinomas |
80303 | Giant cell and spindle cell carcinoma | Other Specific Carcinomas |
80313 | Giant cell carcinoma | Other Specific Carcinomas |
80323 | Spindle cell carcinoma | Other Specific Carcinomas |
80333 | Pseudosarcomatous carcinoma | Other Specific Carcinomas |
80463 | Non-small cell carcinoma | Other Specific Carcinomas |
82303 | Solid carcinoma | Other Specific Carcinomas |
82443 | Mixed adenoneuroendocrine carcinoma | Other Specific Carcinomas |
82453 | Adenocarcinoid tumour | Other Specific Carcinomas |
82463 | Neuroendocrine carcinoma | Other Specific Carcinomas |
85603 | Adenosquamous carcinoma | Other Specific Carcinomas |
80523 | Papillary squamous cell carcinoma | Squamous Carcinomas |
80703 | Squamous cell carcinoma | Squamous Carcinomas |
80713 | Squamous cell carcinoma, keratinising | Squamous Carcinomas |
80723 | Squamous cell carcinoma, large cell, nonkeratinising | Squamous Carcinomas |
80733 | Squamous cell carcinoma, small cell, nonkeratinising | Squamous Carcinomas |
80743 | Squamous cell carcinoma, spindle cell | Squamous Carcinomas |
80753 | Squamous cell carcinoma, adenoid | Squamous Carcinomas |
80763 | Squamous cell carcinoma, microinvasive | Squamous Carcinomas |
80833 | Basaloid squamous cell carcinoma | Squamous Carcinomas |
80843 | Squamous cell carcinoma, clear cell type | Squamous Carcinomas |
81233 | Basaloid carcinoma | Squamous Carcinomas |
80103 | Carcinoma | Unspecified Carcinomas (NOS) |
80123 | Large cell carcinoma | Unspecified Carcinomas (NOS) |
80133 | Large cell neuroendocrine carcinoma | Unspecified Carcinomas (NOS) |
80143 | Large cell carcinoma with rhabdoid phenotype | Unspecified Carcinomas (NOS) |
80203 | Carcinoma, undifferentiated | Unspecified Carcinomas (NOS) |
80213 | Carcinoma, anaplastic | Unspecified Carcinomas (NOS) |
80223 | Pleomorphic carcinoma | Unspecified Carcinomas (NOS) |
80503 | Papillary carcinoma | Unspecified Carcinomas (NOS) |
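
Each morphology code above combines a histological type and a behaviour. The sketch below assumes `MorphologyCode` follows the ICD-O convention of a four-digit histology code followed by a single behaviour digit (where 3 denotes a malignant primary tumour); this layout is an assumption about the field, and the function name is illustrative.

```python
from typing import Tuple


def split_morphology_code(code: str) -> Tuple[str, str]:
    """Split a five-digit morphology code into (histology, behaviour)."""
    code = str(code).strip()
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"expected a five-digit morphology code, got {code!r}")
    # Assumed layout: first four digits are histology, last digit is behaviour.
    return code[:4], code[4]
```

Under that assumption, `81403` (Adenocarcinoma) splits into histology `8140` and behaviour `3`.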
PROCEDURE CODES
The following surgical procedures are contained within this dataset.
Procedure code | Procedure name | Group |
---|---|---|
3843801 | Lobectomy of lung | Lobectomy of lung |
3844100 | Radical lobectomy | Lobectomy of lung |
3843800 | Segmental wedge resection of lung | Partial Resection |
3844000 | Wedge resection of lung | Partial Resection |
3844001 | Radical wedge resection of lung | Partial Resection |
9016900 | Endoscopic wedge resection of lung | Partial Resection |
3843802 | Pneumonectomy | Pneumonectomy |
3844101 | Radical pneumonectomy | Pneumonectomy |
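
The table above can be held as a simple lookup when grouping procedures during analysis. This is a sketch only: the group labels are copied from this reference table (the `ProcedureGroupName` field lists "Lobectomy" rather than "Lobectomy of lung", so labels in the extract may differ), and the names `PROCEDURE_GROUPS` and `procedure_group` are illustrative.

```python
from typing import Optional

# Procedure code -> group, transcribed from the reference table.
PROCEDURE_GROUPS = {
    "3843801": "Lobectomy of lung",
    "3844100": "Lobectomy of lung",
    "3843800": "Partial Resection",
    "3844000": "Partial Resection",
    "3844001": "Partial Resection",
    "9016900": "Partial Resection",
    "3843802": "Pneumonectomy",
    "3844101": "Pneumonectomy",
}


def procedure_group(code) -> Optional[str]:
    """Return the group for a procedure code, or None if it is not listed."""
    # Codes may arrive as integers or strings, so normalise to a string first.
    return PROCEDURE_GROUPS.get(str(code).strip())
```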
DIAGNOSIS BASIS
The most reliable basis of a cancer diagnosis.
Diagnosis basis code | Short description | Long description | Group |
---|---|---|---|
5 | Cytology or Haematology | Cytology: Examination of cells from a primary or secondary site, including fluids aspirated by endoscopy or needle; also includes the microscopic examination of peripheral blood and bone marrow aspirates | Histological |
6 | Histology of Metastasis | Histology of metastasis: Histological examination of tissue from a metastasis, including autopsy specimens | Histological |
7 | Histology of Primary Tumour | Histology of a primary tumour: Histological examination of tissue from primary tumour, however obtained, including all cutting techniques and bone marrow biopsies; also includes autopsy specimens of primary tumour | Histological |
8 | Histology (unknown if Primary or Metastasis) | Histology: either unknown whether of primary or metastatic site, or not otherwise specified | Histological |
0 | Death certificate only | Death certificate only: Information provided is from a death certificate | Other |
1 | Clinical Only | Clinical: Diagnosis made before death, but without any of the following (codes 2-7) | Other |
2 | Clinical Investigations | Clinical investigation: All diagnostic techniques, including x-ray, endoscopy, imaging, ultrasound, exploratory surgery (e.g. laparotomy), and autopsy, without a tissue diagnosis | Other |
4 | Specific Tumour Markers (Biochemical or Immunological Testing) | Specific tumour markers: Including biochemical and/or immunological markers that are specific for a tumour site | Other |
9 | Not Stated/Unknown | Unknown | Other |
SOCIOECONOMIC STATUS
Socioeconomic status is based on the Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Local Areas (SLA).
The ABS use SEIFA scores to rank regions into ten groups (deciles) numbered one to ten, with one being the most disadvantaged and ten being the most affluent group.
This ranking is useful at the national level, but the number of people in each decile often becomes too small for meaningful comparisons when applied to a subset of the population.
For this reason, this document further aggregates SEIFA deciles into 3 socioeconomic groups.
Socioeconomic status decile | Group | Percentage of population |
---|---|---|
1-2 | Disadvantaged | 20% |
3-8 | Middle | 60% |
9-10 | Affluent | 20% |
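
The aggregation in the table above can be expressed as a small function. This is a minimal sketch, assuming the decile is an integer from 1 to 10 and that a missing decile (represented here as `None`) corresponds to the "Unknown" value of `SocioeconomicStatus`; the handling of missing values and the function name are assumptions.

```python
from typing import Optional


def socioeconomic_group(decile: Optional[int]) -> str:
    """Collapse a SEIFA decile (1-10) into the three socioeconomic groups."""
    if decile is None:
        # Assumption: records without a decile fall into 'Unknown'.
        return "Unknown"
    if not 1 <= decile <= 10:
        raise ValueError("decile must be between 1 and 10")
    if decile <= 2:
        return "Disadvantaged"
    if decile <= 8:
        return "Middle"
    return "Affluent"
```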
Frequently Asked Questions (FAQ)
What data is included in the Research Ready Dataset (RRD)?
This dataset contains records for all persons diagnosed with Non-Small Cell Lung Cancer (NSCLC) in Queensland between 2000 and 2019. Demographic data such as sex, age group and location/remoteness of residence are included.
Is the data anonymised?
Yes. No names or addresses will be included in the data extracts. Unique identifiers will be attached to each person included in the file.
Does this dataset contain information on treatments the individual received?
Yes. Data on the following modes of treatment are included in the dataset: surgery (major resections), radiation therapy (RT), and intravenous systemic therapy (IVST).
Are there gaps in the data?
Treatment data is available for the duration of the 20-year period. Complete surgical data is available for the full timespan. Radiation therapy data sources were enhanced in 2007 and data from 2007-2019 has greater coverage. Similarly, chemotherapy sources such as iPharmacy and CHARM were progressively added, leading to full coverage from 2009. Chemotherapy data is limited to intravenous systemic therapy (IVST) only.
Staging data capture is limited to Stage IV only for 2000 to 2016 and has been inferred by the presence of clinical records indicating metastases. From 2017 onwards, complete 8th edition staging data has been inferred using TNM staging approaches via clinical data reviewed by specialist coders.
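
When filtering records for stage-based analyses, the coverage described above can be encoded as a simple check on the year of diagnosis. This is a sketch under the stated coverage rules; the function name and the returned labels are illustrative and do not correspond to a field in the dataset.

```python
def staging_coverage(diagnosis_year: int) -> str:
    """Describe which staging information is available for a diagnosis year."""
    if not 2000 <= diagnosis_year <= 2019:
        raise ValueError("the dataset covers diagnoses from 2000 to 2019")
    # Full TNM staging is described as available from 2017 onwards;
    # earlier years identify Stage IV only.
    return "Full TNM staging" if diagnosis_year >= 2017 else "Stage IV only"
```

For example, restricting an analysis of early-stage disease to years where `staging_coverage` returns `"Full TNM staging"` avoids treating unstaged earlier records as non-metastatic.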
How to apply
Details on how to apply for access to the full dataset can be found here
How do I cite data provided?
Use of the data in publications requires the acknowledgement and citation as outlined in the CAQ publication guidelines here
Contact details
For more information or to contact us directly, click here