Research ready Lung Data

Overview

This data dictionary describes the contents of the research ready lung cancer dataset, which comprises all Queensland non-small cell lung cancer (NSCLC) diagnoses between 2000 and 2019. The fields are arranged in four categories:

  • Demographic,
  • Clinical,
  • Treatment, and
  • Administrative.

A sample dataset containing 10 example records is also provided. For detailed reference tables containing ICD codes for clinical data items such as Primary Site, Morphology and Procedure codes, please see Reference sets.

Demographic

Demographic fields in this dataset include the age, sex and Indigenous status of the person diagnosed with cancer, as well as geographic and socio-economic data based on the person’s place of residence at diagnosis.

Field title Field name Definition Field values
Age at diagnosis AgeAtDiagnosis The age of the person in (completed) years at a specific point in time when first diagnosed  
Age group at diagnosis (code) AgeGroupFiveYearsKey The five-year age group the person belonged to at a specific point in time when first diagnosed  
Age group at diagnosis (description) AgeGroupFiveYears The five-year age group the person belonged to at a specific point in time when first diagnosed  
Date of birth BirthDate The date on which an individual was born  
Date of death DeathDate The date of death of the person  
Hospital and Health Service (code) HHSOfResidenceKey Queensland Hospital and health service geographic region at diagnosis  
Hospital and Health Service (description) HHSOfResidence Queensland Hospital and health service geographic region at diagnosis Cairns and Hinterland
Central Queensland
Central West
Darling Downs
Gold Coast
Mackay
Metro North
Metro South
North West
South West
Sunshine Coast
Torres and Cape
Townsville
West Moreton
Wide Bay
Indigenous status (code) IndigenousStatusID A measure of whether a person identifies as being of Aboriginal or Torres Strait Islander origin  
Indigenous status (description) IndigenousStatus A measure of whether a person identifies as being of Aboriginal or Torres Strait Islander origin Indigenous
non-Indigenous
Not Stated/Unknown
Remoteness area Remoteness The remoteness of residence at time of diagnosis. Major City
Inner Regional
Outer Regional
Remote & Very Remote
Sex (code) SexID The biological distinction between male and female  
Sex (description) Sex The biological distinction between male and female  
Socioeconomic status DecileID Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Local Areas (SLA). (See summary table.) The Index of Relative Socioeconomic Advantage and Disadvantage (IRSAD) is used. 1 (Low) to 10 (High)
Socioeconomic status DecileName Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Local Areas (SLA). (See summary table.) The Index of Relative Socioeconomic Advantage and Disadvantage (IRSAD) is used. 1
2
3
4
5
6
7
8
9
10
Socioeconomic status (group) SocioeconomicStatus Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Local Areas (SLA). (See summary table) Affluent
Disadvantaged
Middle
Unknown
Statistical Area level 2 (SA2 Residence at Diagnosis) ASGS_SA2Code A designated region describing location and contact details that represents a medium-sized area built from a number of Statistical Area 1, as represented by a code. The aim is to represent a community that interacts together socially and economically See ABS website for details
Statistical Area level 2 (SA2 Residence at Diagnosis) ASGS_SA2Description A designated region describing location and contact details that represents a medium-sized area built from a number of Statistical Area 1. The aim is to represent a community that interacts together socially and economically See ABS website for details

Clinical

The clinical fields in this dataset describe the cancer diagnosed. They include the date of diagnosis and clinical features of the tumour such as size, morphology, nodal status and the presence of metastases. For the first time, clinical stage is included in this data. Full staging information is available only for patients diagnosed 2017-2019, although Stage IV patients have been identified from 2000 onwards. This data is comprehensive and was obtained from clinical audits conducted by clinicians and coders, which led to the development of automated algorithms that are applied to prospective data to derive stage routinely for new diagnoses.

Field title Field name Definition Field values
Cancer-related death DeathEventCauseSpecific Was the person’s death caused by their cancer? 0 - No
1 - Yes
Date of diagnosis DiagnosisDate The date a disease or condition is diagnosed  
Differentiation (code) DifferentiationKey The histological grade of the cancer tissue in a person with cancer 1
2
3
4
98
99
Differentiation (description) Differentiation The histological grade of the cancer tissue in a person with cancer Moderately Differentiated / Intermediate Grade / Grade 2
Not Applicable
Not Stated/Unknown
Poorly Differentiated / High Grade / Grade 3
Undifferentiated / Anaplastic / Grade 4
Well Differentiated / Low Grade / Grade 1
Morphology of cancer (code) MorphologyCode The histological classification of the cancer tissue (histopathological type) in a person with cancer, and a description of the course of development that a tumour is likely to take: benign or malignant (behaviour), as represented by a code. See Appendix A
Morphology of cancer (description) Morphology The histological classification of the cancer tissue (histopathological type) in a person with cancer, and a description of the course of development that a tumour is likely to take: benign or malignant (behaviour). See Appendix A
Morphology of cancer (group) MorphologyGroup High level grouping of the morphologies Adenocarcinomas
Other Specific Carcinomas
Squamous Carcinomas
Unspecified Carcinomas (NOS)
Most valid basis of diagnosis of cancer (code) DiagnosisBasisKey The most reliable basis of a cancer diagnosis 0
1
2
3
4
5
6
7
8
9
10
11
12
9999
Most valid basis of diagnosis of cancer (description) DiagnosisBasis The most reliable basis of a cancer diagnosis Clinical Investigations
Clinical Only
Cytology or Haematology
Exploratory Surgery
Histology (unknown if Primary or Metastasis)
Histology of Metastasis
Histology of Primary Tumour
Not Stated/Unknown
Specific Tumour Markers (Biochemical or Immunological Testing)
Number of comorbidities ComorbidityCount A grouping of clinical conditions that has the potential to significantly affect a cancer patient’s prognosis. Numeric value
Number of comorbidities (grouped) ComorbidityCountGroup A grouping of clinical conditions that has the potential to significantly affect a cancer patient’s prognosis. 0
1
2+
Performance status (code) PerformanceStatusCode Performance status recorded in QOOL at time of MDT (code) 0
1
2
3
4
99
Performance status (description) PerformanceStatus Performance status recorded in QOOL at time of MDT (description) Fully active
Ambulatory - capable of light work
Bed < 50% - self caring - not working
Bed > 50% - partially self caring
Confined to bed or chair
Unknown
Primary site of cancer (description) PrimarySite The site of origin of the tumour, as opposed to the secondary or metastatic sites. Bronchus or lung
Lower lobe, bronchus or lung
Main bronchus
Middle lobe, bronchus or lung
Overlapping lesion of bronchus and lung
Trachea
Upper lobe, bronchus or lung
Primary site of cancer (group) PrimarySiteGroup High level grouping of the sites in which the tumour originated in a person with cancer NSCLC
Primary site of cancer (ICD-10-AM code) PrimarySiteCode The site of origin of the tumour, as opposed to the secondary or metastatic sites, as represented by an ICD-10-AM code. C33
C340
C341
C342
C343
C348
C349
Underlying cause of death CauseOfDeath The cause of death of the person as represented by an ICD-10-AM code.  

Treatment

Treatment data items fall into four sub-categories: Multidisciplinary Team meetings (MDT), Surgical procedures, Radiation therapy (RT), and Intravenous systemic therapy (IVST). For detailed information regarding clinical data items, please see Reference sets.

MULTIDISCIPLINARY TEAM MEETINGS (MDT)

MDT data is present from 2000 onwards. Early MDT data is limited to a project run at The Prince Charles Hospital; data sourced from QOOL is available from 2009. MDTs also provide many of the data items required for staging.

Field title Field name Definition Field values
Had MDT review HadMDTReview Record of MDT presentation as recorded by QOOL No
Yes

Administrative

Field title Field name Definition Field values
Censor date SurvivalCensorDate Patients followed up until this date  
Patient identifier UniqueID Unique identifier for each person in the dataset  
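
As an illustration of how the censor date might be combined with the date fields above, the following minimal Python sketch derives a follow-up time and an event flag for one record from DiagnosisDate, DeathDate and SurvivalCensorDate. The function name `follow_up` and the censoring rule are assumptions for illustration; the dictionary does not prescribe how survival time should be computed.

```python
from datetime import date
from typing import Optional, Tuple


def follow_up(diagnosis: date, death: Optional[date], censor: date) -> Tuple[int, int]:
    """Return (days of follow-up, event flag) for one record.

    Assumption (not stated in the dictionary): a missing DeathDate, or a
    DeathDate later than SurvivalCensorDate, is treated as censored.
    """
    if death is not None and death <= censor:
        return (death - diagnosis).days, 1  # death observed during follow-up
    return (censor - diagnosis).days, 0  # censored at the censor date


# Made-up dates, for illustration only.
observed = follow_up(date(2015, 3, 1), date(2016, 3, 1), date(2019, 12, 31))
censored = follow_up(date(2018, 6, 30), None, date(2019, 12, 31))
print(observed, censored)
```

An analysis of cancer-specific survival would presumably also use DeathEventCauseSpecific to decide which deaths count as events; that choice is left to the researcher.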

Sample data

This sample contains 10 records intended to show the type of data available and the format in which it is presented, so that researchers can prepare load scripts and analysis programs before downloading the full dataset.

Click here to download the sample data in Excel.

 

Reference sets

AGE GROUPS

The five-year age group the person belonged to at a specific point in time when first diagnosed

Age group key Age group
1 0-4
2 5-9
3 10-14
4 15-19
5 20-24
6 25-29
7 30-34
8 35-39
9 40-44
10 45-49
11 50-54
12 55-59
13 60-64
14 65-69
15 70-74
16 75-79
17 80-84
18 85+
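
The age group table follows a regular pattern (five-year bands, with key 18 open-ended), so AgeGroupFiveYearsKey can be checked against AgeAtDiagnosis with a short calculation. The sketch below is one way to express it in Python; the function names are hypothetical and it assumes age is supplied in completed years.

```python
def age_group_key(age: int) -> int:
    """Map an age in completed years to the five-year age group key (1-18)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # Keys start at 1 for ages 0-4; everything from 85 upwards shares key 18.
    return min(age // 5 + 1, 18)


def age_group_label(key: int) -> str:
    """Return the age group label for a key, e.g. 14 -> '65-69'."""
    if not 1 <= key <= 18:
        raise ValueError("key must be between 1 and 18")
    if key == 18:
        return "85+"
    low = (key - 1) * 5
    return f"{low}-{low + 4}"


print(age_group_key(67), age_group_label(age_group_key(67)))
```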

SEX

The biological distinction between male and female.

Reference code Short description Long description
1 MALE MALE
2 FEMALE FEMALE
3 OTHER OTHER
9 NOT STATED/INADEQUATELY DESCRIBED NOT STATED/INADEQUATELY DESCRIBED

INDIGENOUS STATUS

A measure of whether a person identifies as being of Aboriginal or Torres Strait Islander origin.

Reference code Short description Long description
1 Aboriginal but not Torres Strait Islander origin Aboriginal but not Torres Strait Islander origin
2 Torres Strait Islander but not Aboriginal origin Torres Strait Islander but not Aboriginal origin
3 Both Aboriginal and Torres Strait Islander origin Both Aboriginal and Torres Strait Islander origin
4 Neither Aboriginal nor Torres Strait Is. Origin Neither Aboriginal nor Torres Strait Islander origin
9 Not Stated / Unknown Not Stated / Unknown
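
The IndigenousStatus field lists three values (Indigenous, non-Indigenous, Not Stated/Unknown), whereas this reference set has five codes. The Python sketch below shows one plausible way the codes could be collapsed. The mapping is an assumption made for illustration and is not confirmed by this dictionary, as is the choice to treat unrecognised codes as Not Stated/Unknown.

```python
# Assumed grouping of reference codes into the three IndigenousStatus values.
INDIGENOUS_GROUP = {
    1: "Indigenous",          # Aboriginal but not Torres Strait Islander origin
    2: "Indigenous",          # Torres Strait Islander but not Aboriginal origin
    3: "Indigenous",          # Both Aboriginal and Torres Strait Islander origin
    4: "non-Indigenous",      # Neither Aboriginal nor Torres Strait Islander origin
    9: "Not Stated/Unknown",
}


def indigenous_group(code: int) -> str:
    """Collapse a reference code to a three-level status (assumed mapping)."""
    # Assumption: any code outside the reference set is treated as unknown.
    return INDIGENOUS_GROUP.get(code, "Not Stated/Unknown")


print(indigenous_group(2), indigenous_group(4), indigenous_group(9))
```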

PRIMARY SITE

The site of origin of the tumour, as opposed to the secondary or metastatic sites, as represented by an ICD-10-AM code.

Primary site code Primary site punctuated Short description Long description Group
C33 C33 Trachea Malignant neoplasm of trachea Lung
C340 C34.0 Main bronchus Malignant neoplasm of main bronchus Lung
C341 C34.1 Upper lobe Malignant neoplasm of upper lobe, bronchus or lung Lung
C342 C34.2 Middle lobe Malignant neoplasm of middle lobe, bronchus or lung Lung
C343 C34.3 Lower lobe Malignant neoplasm of lower lobe, bronchus or lung Lung
C348 C34.8 Overlapping lesion of lung Overlapping malignant lesion of bronchus and lung Lung
C349 C34.9 Lung Malignant neoplasm of bronchus or lung, unspecified Lung
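
The "Primary site punctuated" column differs from PrimarySiteCode only by a decimal point after the third character. A minimal Python sketch of that conversion follows; the function name is hypothetical, and it assumes codes arrive in the unpunctuated form shown in the table.

```python
def punctuate_site_code(code: str) -> str:
    """Convert an unpunctuated ICD-10-AM code (e.g. 'C340') to 'C34.0'."""
    code = code.strip().upper()
    # Three-character categories such as 'C33' have no subdivision to punctuate.
    if len(code) <= 3:
        return code
    return f"{code[:3]}.{code[3:]}"


for raw in ("C33", "C340", "C349"):
    print(raw, "->", punctuate_site_code(raw))
```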

MORPHOLOGY

The histological classification of the cancer tissue (histopathological type) in a person with cancer, and a description of the course of development that a tumour is likely to take: benign or malignant (behaviour).

Morphology code Short description Group
81403 Adenocarcinoma Adenocarcinomas
82003 Adenoid cystic carcinoma Adenocarcinomas
82013 Cribriform carcinoma Adenocarcinomas
82113 Tubular adenocarcinoma Adenocarcinomas
82503 Bronchiolo-alveolar adenocarcinoma Adenocarcinomas
82513 Alveolar adenocarcinoma Adenocarcinomas
82523 Bronchiolo-alveolar carcinoma, non-mucinous Adenocarcinomas
82533 Bronchiolo-alveolar carcinoma, mucinous Adenocarcinomas
82543 Bronchiolo-alveolar carcinoma, mixed mucinous and non-mucinous Adenocarcinomas
82553 Adenocarcinoma with mixed subtypes Adenocarcinomas
82603 Papillary adenocarcinoma Adenocarcinomas
82633 Adenocarcinoma in tubulovillous adenoma Adenocarcinomas
83103 Clear cell adenocarcinoma Adenocarcinomas
83233 Mixed cell adenocarcinoma Adenocarcinomas
84303 Mucoepidermoid carcinoma Adenocarcinomas
84803 Mucinous adenocarcinoma Adenocarcinomas
84813 Mucin-producing adenocarcinoma Adenocarcinomas
84903 Signet ring cell carcinoma Adenocarcinomas
85503 Acinar cell carcinoma Adenocarcinomas
85723 Adenocarcinoma with spindle cell metaplasia Adenocarcinomas
85743 Adenocarcinoma with neuroendocrine differentiation Adenocarcinomas
85763 Hepatoid adenocarcinoma Adenocarcinomas
80303 Giant cell and spindle cell carcinoma Other Specific Carcinomas
80313 Giant cell carcinoma Other Specific Carcinomas
80323 Spindle cell carcinoma Other Specific Carcinomas
80333 Pseudosarcomatous carcinoma Other Specific Carcinomas
80463 Non-small cell carcinoma Other Specific Carcinomas
82303 Solid carcinoma Other Specific Carcinomas
82443 Mixed adenoneuroendocrine carcinoma Other Specific Carcinomas
82453 Adenocarcinoid tumour Other Specific Carcinomas
82463 Neuroendocrine carcinoma Other Specific Carcinomas
85603 Adenosquamous carcinoma Other Specific Carcinomas
80523 Papillary squamous cell carcinoma Squamous Carcinomas
80703 Squamous cell carcinoma Squamous Carcinomas
80713 Squamous cell carcinoma, keratinising Squamous Carcinomas
80723 Squamous cell carcinoma, large cell, nonkeratinising Squamous Carcinomas
80733 Squamous cell carcinoma, small cell, nonkeratinising Squamous Carcinomas
80743 Squamous cell carcinoma, spindle cell Squamous Carcinomas
80753 Squamous cell carcinoma, adenoid Squamous Carcinomas
80763 Squamous cell carcinoma, microinvasive Squamous Carcinomas
80833 Basaloid squamous cell carcinoma Squamous Carcinomas
80843 Squamous cell carcinoma, clear cell type Squamous Carcinomas
81233 Basaloid carcinoma Squamous Carcinomas
80103 Carcinoma Unspecified Carcinomas (NOS)
80123 Large cell carcinoma Unspecified Carcinomas (NOS)
80133 Large cell neuroendocrine carcinoma Unspecified Carcinomas (NOS)
80143 Large cell carcinoma with rhabdoid phenotype Unspecified Carcinomas (NOS)
80203 Carcinoma, undifferentiated Unspecified Carcinomas (NOS)
80213 Carcinoma, anaplastic Unspecified Carcinomas (NOS)
80223 Pleomorphic carcinoma Unspecified Carcinomas (NOS)
80503 Papillary carcinoma Unspecified Carcinomas (NOS)
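
The five-digit morphology codes above appear to follow the ICD-O layout, in which the first four digits give the histological type and the fifth gives the behaviour (every code in this table ends in 3, malignant). Assuming that layout, the Python sketch below splits a code into its two parts and renders it in the common "8140/3" form. The function names are hypothetical.

```python
from typing import Tuple


def split_morphology(code) -> Tuple[str, str]:
    """Split a five-digit morphology code into (histology, behaviour).

    Assumes the ICD-O layout: four histology digits followed by one
    behaviour digit.
    """
    text = str(code).strip()
    if len(text) != 5 or not text.isdigit():
        raise ValueError(f"expected a five-digit code, got {code!r}")
    return text[:4], text[4]


def format_morphology(code) -> str:
    """Render a morphology code with a slash before the behaviour digit."""
    histology, behaviour = split_morphology(code)
    return f"{histology}/{behaviour}"


print(format_morphology("81403"))  # Adenocarcinoma
```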

PROCEDURE CODES

The following surgical procedures are contained within this dataset.

Procedure code Procedure name Group
3843801 Lobectomy of lung Lobectomy of lung
3844100 Radical lobectomy Lobectomy of lung
3843800 Segmental wedge resection of lung Partial Resection
3844000 Wedge resection of lung Partial Resection
3844001 Radical wedge resection of lung Partial Resection
9016900 Endoscopic wedge resection of lung Partial Resection
3843802 Pneumonectomy Pneumonectomy
3844101 Radical pneumonectomy Pneumonectomy

DIAGNOSIS BASIS

The most reliable basis of a cancer diagnosis.

Diagnosis basis code Short description Long description Group
5 Cytology or Haematology Cytology: Examination of cells from a primary or secondary site, including fluids aspirated by endoscopy or needle; also includes the microscopic examination of peripheral blood and bone marrow aspirates Histological
6 Histology of Metastasis Histology of metastasis: Histological examination of tissue from a metastasis, including autopsy specimens Histological
7 Histology of Primary Tumour Histology of a primary tumour: Histological examination of tissue from primary tumour, however obtained, including all cutting techniques and bone marrow biopsies; also includes autopsy specimens of primary tumour Histological
8 Histology (unknown if Primary or Metastasis) Histology: either unknown whether of primary or metastatic site, or not otherwise specified Histological
0 Death certificate only Death certificate only: Information provided is from a death certificate Other
1 Clinical Only Clinical: Diagnosis made before death, but without any of the following (codes 2-7) Other
2 Clinical Investigations Clinical investigation: All diagnostic techniques, including x-ray, endoscopy, imaging, ultrasound, exploratory surgery (e.g. laparotomy), and autopsy, without a tissue diagnosis Other
4 Specific Tumour Markers (Biochemical or Immunological Testing) Specific tumour markers: Including biochemical and/or immunological markers that are specific for a tumour site Other
9 Not Stated/Unknown Unknown Other

SOCIOECONOMIC STATUS

Socioeconomic status is based on the Socio-Economic Indexes for Areas (SEIFA), a census-based measure of social and economic well-being developed by the Australian Bureau of Statistics (ABS) and aggregated at the level of Statistical Local Areas (SLA).

The ABS use SEIFA scores to rank regions into ten groups (deciles) numbered one to ten, with one being the most disadvantaged and ten being the most affluent group.

This ranking is useful at the national level, but the number of people in each decile often becomes too small for meaningful comparisons when applied to a subset of the population.

For this reason, this document further aggregates SEIFA deciles into 3 socioeconomic groups.

Socioeconomic status decile Group Percentage of population
1-2 Disadvantaged 20%
3-8 Middle 60%
9-10 Affluent 20%
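
The grouping in the table can be written as a small lookup, which may be useful for checking SocioeconomicStatus against DecileID. This is a minimal Python sketch with a hypothetical function name. Mapping a missing decile to "Unknown" is an assumption based on the "Unknown" value listed for the SocioeconomicStatus field.

```python
from typing import Optional


def ses_group(decile: Optional[int]) -> str:
    """Map a SEIFA decile (1-10) to one of the three socioeconomic groups."""
    if decile is None:
        return "Unknown"  # assumed handling of a missing decile
    if 1 <= decile <= 2:
        return "Disadvantaged"
    if 3 <= decile <= 8:
        return "Middle"
    if 9 <= decile <= 10:
        return "Affluent"
    raise ValueError(f"decile out of range: {decile}")


print([ses_group(d) for d in range(1, 11)])
```

Because each decile holds one tenth of the national population, the groups cover two, six and two deciles respectively, which is where the population shares in the table come from.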