DATA DICTIONARY
2022-01-06
CHANGELOG
2022-01-06 Removed empty genomeSequences.*
2021-12-01 Added SGTF field
Unless stated otherwise, all fields are text strings.
[] next to a field indicates that it is a comma-separated
array.
* marks mandatory fields
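The array convention above can be read with a small helper. This is a minimal sketch, assuming that individual items never contain commas themselves (the dictionary does not say how such items would be escaped); parse_array is a hypothetical name, not part of any Global.health tooling.

```python
def parse_array(value):
    """Split a comma-separated array field into a list of trimmed items."""
    # Assumption: an empty or missing field means an empty array.
    if value is None or not value.strip():
        return []
    # Assumption: items never contain commas, so a plain split is enough.
    return [item.strip() for item in value.split(",")]


symptoms = parse_array("fever, cough")
print(symptoms)
```

Fields marked [] (for example symptoms.values) could be passed through this helper after reading a row.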
METADATA
1. _id *
Internal ID used by Global.health database.
This is not expected to be stable.
2. caseReference.additionalSources []
Additional sources (URLs) for this case
3. caseReference.sourceId *
Unique source ID for this case. Each case is ingested from
a specific source URL, which has a unique ID. This is
stable for a particular source.
4. caseReference.sourceUrl *
Data URL from which this case was ingested.
5. caseReference.uploadIds [] *
Subsequent uploads following the initial upload of a case can
change the data of a case (only in sources that provide a
unique ID in caseReference.sourceEntryId). This field records
the unique upload IDs that updated this case.
6. caseReference.verificationStatus *
Case verification status
Values: VERIFIED | UNVERIFIED | EXCLUDED
VERIFIED: Case was verified by a curator after ingestion
UNVERIFIED: Case was automatically ingested without verification
EXCLUDED: Case has been excluded from the line list
Most of our automated data ingestion is from authoritative
government datasets, with a few from volunteer-operated datasets.
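The three verification status values can be checked when reading a row. This is a minimal sketch, assuming each case is held as a dict keyed by the field names in this dictionary (for example from csv.DictReader); dropping EXCLUDED cases is an illustrative choice, and keep_case is a hypothetical helper.

```python
from enum import Enum


class VerificationStatus(Enum):
    """Allowed values of caseReference.verificationStatus."""
    VERIFIED = "VERIFIED"
    UNVERIFIED = "UNVERIFIED"
    EXCLUDED = "EXCLUDED"


def keep_case(row):
    """Return True unless the case is marked EXCLUDED."""
    # Enum lookup by value raises ValueError for any status not listed above.
    status = VerificationStatus(row["caseReference.verificationStatus"])
    # Assumption: an analysis would normally skip EXCLUDED cases.
    return status is not VerificationStatus.EXCLUDED
```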
DEMOGRAPHICS
When demographic and location information are not available in the
same dataset, we generally prefer to ingest the demographic information.
7. demographics.ageRange.end
Upper age range of individual (0 - 120)
8. demographics.ageRange.start
Lower age range of individual (0 - 120)
9. demographics.ethnicity
Ethnicity of individual
10. demographics.gender
Gender of individual (Male | Female | Non-binary/Third gender | Other)
11. demographics.nationalities []
All the nationalities of the individual
12. demographics.occupation
Occupation of the individual
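The age range bounds in fields 7 and 8 can be sanity-checked as follows. This is a minimal sketch; the requirement that start does not exceed end is an assumption not stated in this dictionary, and valid_age_range is a hypothetical helper.

```python
def valid_age_range(start, end):
    """Check 0 <= start <= end <= 120 for text or numeric inputs."""
    try:
        lower, upper = float(start), float(end)
    except (TypeError, ValueError):
        # Empty or non-numeric values are treated as invalid here.
        return False
    # Assumption: start must not be greater than end.
    return 0 <= lower <= upper <= 120
```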
EVENTS
All .date values are dates in YYYY-MM-DD format.
13. events.confirmed.date *
14. events.confirmed.value
Date of confirmation. If value is present, it indicates the
method of confirmation.
15. events.firstClinicalConsultation.date
First clinical consultation date
16. events.hospitalAdmission.date
17. events.hospitalAdmission.value
Hospital admission date, value (Yes | No)
18. events.icuAdmission.date
19. events.icuAdmission.value
Intensive Care Unit admission date, value (Yes | No)
20. events.onsetSymptoms.date
Date of onset of symptoms
21. events.outcome.date
22. events.outcome.value
Outcome date, values are
Death | Recovered | hospitalAdmission | icuAdmission | Unknown
23. events.selfIsolation.date
Date that individual started self-isolating
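The .date fields can be converted to date objects using the YYYY-MM-DD format stated above. This is a minimal sketch, assuming that an empty string means the date is missing; parse_event_date is a hypothetical helper.

```python
from datetime import date, datetime


def parse_event_date(value):
    """Parse a YYYY-MM-DD string into a date, or return None if empty."""
    # Assumption: optional date fields are empty when not recorded.
    if not value:
        return None
    # strptime raises ValueError when the text does not match the format.
    return datetime.strptime(value, "%Y-%m-%d").date()


confirmed = parse_event_date("2021-12-01")
print(confirmed.isoformat())
```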
LOCATION
24. location.administrativeAreaLevel1
Admin1 level location of individual (usually state or province)
25. location.administrativeAreaLevel2
Admin2 level location of individual (usually district)
26. location.administrativeAreaLevel3
Admin3 level location of individual (usually city)
27. location.country *
Country that case was reported in.
28. location.geoResolution *
Geo-resolution of location (how coarse the location is)
Country | Admin1 | Admin2 | Admin3 | Point
29. location.geometry.latitude *
Geolocated latitude (-90 to 90)
Positive values are North, negative values are South
30. location.geometry.longitude *
Geolocated longitude (-180 to 180)
Positive values are East, negative values are West
31. location.name
Full name of location
(example: Lyon, Auvergne-Rhône-Alpes, France)
32. location.place
Name of the place this location refers to
(example: Boston Children's Hospital)
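The latitude and longitude ranges and sign conventions in fields 29 and 30 can be expressed as two small checks. This is a minimal sketch; treating a value of exactly 0 as North or East is an arbitrary choice made here, and both function names are hypothetical.

```python
def valid_point(latitude, longitude):
    """Check that a coordinate pair lies within the documented ranges."""
    return -90 <= latitude <= 90 and -180 <= longitude <= 180


def hemispheres(latitude, longitude):
    """Return (N or S, E or W) from the sign of each coordinate."""
    # Assumption: zero is reported as N / E; the dictionary only
    # defines the meaning of positive and negative values.
    north_south = "N" if latitude >= 0 else "S"
    east_west = "E" if longitude >= 0 else "W"
    return north_south, east_west
```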
PATHOGENS
33. pathogens []
Pathogens other than SARS-CoV-2
PRE-EXISTING CONDITIONS
34. preexistingConditions.hasPreexistingConditions
Whether the patient has pre-existing conditions
Boolean: True | False
35. preexistingConditions.values []
List of pre-existing conditions
REVISION METADATA
36. revisionMetadata.creationMetadata.date
Date this case was first created
37. revisionMetadata.creationMetadata.notes
Notes added by the curator for this case
38. revisionMetadata.editMetadata.date
Date this case was last edited
39. revisionMetadata.editMetadata.notes
Notes added by the curator for last edit
40. revisionMetadata.revisionNumber
Revision number of the case (positive integer)
SGTF
41. SGTF
S-Gene Target failure (0 = no deletion, 1 = deletion (S-))
SYMPTOMS
42. symptoms.status
Symptom status (Asymptomatic | Symptomatic | Presymptomatic | null)
43. symptoms.values []
List of symptoms
TRANSMISSION
How this case was infected and, if known, by whom
44. transmission.linkedCaseIds []
UUIDs of related cases in the system
45. transmission.places []
Places where transmission occurred
46. transmission.routes []
Routes of transmission
TRAVEL HISTORY
47. travelHistory.travel.dateRange.end
48. travelHistory.travel.dateRange.start
Start and end dates for travel history
49. travelHistory.travel.location.administrativeAreaLevel1 []
50. travelHistory.travel.location.administrativeAreaLevel2 []
51. travelHistory.travel.location.administrativeAreaLevel3 []
52. travelHistory.travel.location.country []
53. travelHistory.travel.location.geoResolution []
These have the same meaning as in LOCATION, except that these
pertain to travel history of the individual. Unlike the fields in
location, the fields here are all comma-separated arrays, with each
item corresponding to a travel location in the last 30 days.
54. travelHistory.travel.location.geometry.coordinates []
Comma-separated tuples of latitude and longitude. If the individual
visited latitude m1 and longitude n1 this would be represented as
"(m1, n1)". If there was another travel coordinate (m2, n2), then
this would be represented as "(m1, n1),(m2, n2)"
55. travelHistory.travel.location.name []
56. travelHistory.travel.location.place []
Same as LOCATION, except these are arrays
57. travelHistory.travel.methods []
Corresponding travel methods (such as air, ship, rail ...)
58. travelHistory.travel.purpose []
Purpose of travel
59. travelHistory.traveledPrior30Days
Whether the patient has travelled in the past 30 days
Boolean: True | False
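The "(m1, n1),(m2, n2)" layout described for field 54 cannot be split on commas alone, because each tuple also contains a comma. This is a minimal sketch using a regular expression, assuming coordinates are written as plain decimal numbers; parse_coordinates is a hypothetical helper.

```python
import re

# One "(latitude, longitude)" tuple with optional spaces and minus signs.
# Assumption: values are plain decimals, not scientific notation.
_PAIR = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def parse_coordinates(value):
    """Return a list of (latitude, longitude) float tuples."""
    return [(float(lat), float(lon)) for lat, lon in _PAIR.findall(value or "")]


print(parse_coordinates("(45.76, 4.84),(42.36, -71.06)"))
```

Each tuple in the result corresponds, by position, to one travel location in the other travelHistory.travel.location array fields.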
VACCINES
60. vaccines.0.batch
First vaccine batch
61. vaccines.0.date
Date of first vaccine
62. vaccines.0.name
Name of first vaccine
63. vaccines.0.sideEffects []
List of side-effects experienced after vaccine
64. vaccines.1.batch
65. vaccines.1.date
66. vaccines.1.name
67. vaccines.1.sideEffects []
68. vaccines.2.batch
69. vaccines.2.date
70. vaccines.2.name
71. vaccines.2.sideEffects []
72. vaccines.3.batch
73. vaccines.3.date
74. vaccines.3.name
75. vaccines.3.sideEffects []
Same as before, for subsequent vaccines taken by the same individual
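The numbered vaccines.0 to vaccines.3 columns can be gathered into one list per case. This is a minimal sketch, assuming each case is a dict keyed by the field names above and that unused vaccine columns are empty or absent; collect_vaccines and the example values are hypothetical.

```python
VACCINE_KEYS = ("batch", "date", "name", "sideEffects")


def collect_vaccines(row):
    """Gather the vaccines.N.* columns of one case into a list of dicts."""
    doses = []
    for index in range(4):  # fields 60 to 75 cover vaccines 0 to 3
        dose = {key: row.get(f"vaccines.{index}.{key}", "") for key in VACCINE_KEYS}
        # Assumption: a vaccine with all four columns empty was not recorded.
        if any(dose.values()):
            doses.append(dose)
    return doses


example_row = {"vaccines.0.name": "ExampleVax", "vaccines.0.date": "2021-03-01"}
print(collect_vaccines(example_row))
```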
VARIANT OF CONCERN
76. variantOfConcern
Variant of concern that was detected. This uses the Pango lineage.