Appendix: source tables
Table: stem_table
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| domain_id | CHARACTER VARYING | ||
| person_id | INTEGER | ||
| visit_occurrence_id | INTEGER | ||
| provider_id | INTEGER | ||
| id | INTEGER | ||
| concept_id | INTEGER | ||
| source_value | CHARACTER VARYING | ||
| source_concept_id | INTEGER | ||
| type_concept_id | INTEGER | For condition: 32020 - EHR encounter diagnosis For meas/obs: derived from EHR | |
| start_date | DATE | ||
| start_datetime | DATETIME | ||
| end_date | DATE | ||
| end_datetime | DATETIME | ||
| verbatim_end_date | DATE | ||
| days_supply | INTEGER | ||
| dose_unit_source_value | CHARACTER VARYING | ||
| lot_number | CHARACTER VARYING | ||
| modifier_concept_id | INTEGER | ||
| modifier_source_value | CHARACTER VARYING | ||
| operator_concept_id | INTEGER | ||
| modifier_source_value | CHARACTER VARYING | ||
| quantity | INTEGER | ||
| range_high | FLOAT | ||
| range_low | FLOAT | ||
| refills | INTEGER | ||
| route_concept_id | INTEGER | ||
| route_source_value | CHARACTER VARYING | ||
| sig | CHARACTER VARYING | ||
| stop_reason | CHARACTER VARYING | ||
| unique_device_id | CHARACTER VARYING | ||
| unit_concept_id | INTEGER | ||
| unit_source_value | CHARACTER VARYING | ||
| value_as_concept_id | INTEGER | ||
| value_as_number | DECIMAL | ||
| value_as_string | CHARACTER VARYING | ||
| value_source_value | CHARACTER VARYING | ||
| anatomic_site_concept_id | INTEGER | ||
| disease_status_concept_id | INTEGER | ||
| specimen_source_id | INTEGER | ||
| anatomic_site_source_value | CHARACTER VARYING | ||
| disease_status_source_value | CHARACTER VARYING | ||
| condition_status_concept_id | CHARACTER VARYING | ||
| condition_status_source_value | INTEGER | ||
| qualifier_concept_id | INTEGER | ||
| qualifier_source_value | CHARACTER VARYING |
Table: baseline
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | Numerical |
| field | varchar | 728-0.0 | |
| value | varbinary | 0 |
Table: gp_registrations
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | |
| data_provider | int | 3 | Data system (Scotland, Wales, Englandx2). |
| reg_date | date | YYYY-MM-DD | entry |
| deduct_date | date | YYYY-MM-DD | exit date, can be missing, still active GP practices |
Table: gp_clinical
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | |
| data_provider | int | 3 | Used for understanding the data Relevant for ETL, but less for the data itself. |
| event_dt | date | Empty event date: In source, a placeholder is used (1900-01-01) | |
| read_code | varchar | 2469. | Deduced field, read_2 or read_3. Read v2 is subset of Read v3 |
| read_2 | varchar | ||
| read_3 | varchar | ||
| value1 | varchar | Some lookups, some corresponding with the one for Caliber. e.g. OPR = operator Lookups not documented For now focus on the numeric values. | |
| value2 | varchar | For now focus on the numeric values, ignore the lookups. | |
| value3 | varchar | Units only captured for one data_provider. MEAxxx = unit lookup (to be provided) |
Table: gp_prescriptions
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | |
| data_provider | int | 3 | |
| issue_date | date | ||
| read_2 | varchar | ||
| bnf_code | varchar | ||
| dmd_code | varchar | ||
| drug_name | varchar | ||
| quantity | varchar |
Table: hesin
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | |
| ins_index | int | 0 | Unique key within eid, identifying a visit. UKB provided field |
| admidate | date | ||
| disdate | date | ||
| dsource | char | HES | Data source. Lookups of other fields depend on this. |
| source | int | 6 | |
| epistart | date | Within an admission | |
| epiend | date | ||
| epidur | int | 0 | |
| bedyear | int | 0 | Only for data from England, year an episode counts for (administrative) |
| epistat | int | 3 | |
| epitype | int | 1 | |
| epiorder | int | 1 | |
| spell_index | int | 0 | If multiple admissions chained together. |
| spell_seq | int | 0 | |
| spelbgin | int | 2 | |
| spelend | char | Y | |
| speldur | int | 0 | |
| pctcode | char | Care site, group of hospitals (primary care trust) | |
| gpprpct | char | ||
| category | int | TBD | |
| elecdate | date | Date decided to be admitted to hospital (administrative, to capture delays) | |
| elecdur | int | ||
| admimeth_uni | int | 1001 | |
| admimeth | char | 11 | Way a patient was seen (emergency, planned, …) Coding depends on dsource http://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=41251 |
| admisorc_uni | int | 1000 | |
| admisorc | char | 19 | |
| firstreg | int | First registration (adminstrative) | |
| classpat_uni | int | 2000 | |
| classpat | char | 2 | |
| intmanag_uni | int | 200 | |
| intmanag | int | 2 | |
| mainspef_uni | int | 1350 | |
| mainspef | char | 100 | |
| tretspef_uni | int | 1490 | |
| tretspef | char | 300 | |
| operstat | int | 1 | |
| dismeth_uni | int | 1000 | |
| dismeth | int | 1 | |
| disdest_uni | int | 1000 | |
| disdest | char | 19 | |
| carersi | int | 99 |
Table: hesin_diag
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | |
| ins_index | int | 0 | |
| arr_index | int | 0 | Order in which diagnoses were recorded (administrative) |
| level | int | 2 | |
| diag_icd9 | char | NHS version of ICD standard | |
| diag_icd9_nb | char | Ignore | |
| diag_icd10 | char | I10 | NHS version of ICD standard |
| diag_icd10_nb | char | Used addendum to ICD10. Infrequent. |
Table: hesin_oper
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | 3045406 | |
| ins_index | int | 0 | |
| arr_index | int | 0 | |
| level | int | 2 | |
| opdate | date | ||
| oper3 | char | OPCS v3 | |
| oper3_nb | char | ||
| oper4 | char | X998 | OPCS v4 |
| oper4_nb | char | ||
| posopdur | int | Days spent in hospital before (administrative, can be derived from visit) | |
| preopdur | int | Days spent in hospital after (administrative, can be derived from visit) |
Table: covid
| Field | Type | Most freq. value | Comment |
|---|---|---|---|
| eid | int | XXX | |
| specdate | date | YYYY-MM-DD | |
| spectype | int | 2 | |
| laboratory | int | 44 | |
| origin | int | 1 | |
| result | int | 0 | 1=covid positive |