🔍 WANTED: We are looking for data and data curators #69
Comments
One way might be to crowdsource the search for data. There are many COVID-19 and SARS-CoV-2-related projects on the web. Some of them may contain data, APIs or just interesting ideas that can help us make our application better. Here are some examples:
https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide seems like a good data source? Example data: https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-geographic-disbtribution-worldwide-2020-03-20.xlsx The data seems to be global and well structured. It only counts cases and deaths, though (no hospitalized, ICU, or recovered counts).
Would this be enough? The data is refreshed daily at 9am EST: https://www.tableau.com/covid-19-coronavirus-data-resources
There's a fair bit of data available at https://coronadatascraper.com/ as well.
For Spain, this is a good data source, containing national and regional cases, deaths, ICU and recovered, updated on a daily basis: https://github.com/datadista/datasets/tree/master/COVID%2019
I have a finished pull request for the ECDC dataset pending now, replacing the WHO data and parser.
There is an amazing amount of data on that API, but I guess it is not an official source. It should be easy to write a parser for it, if required.
Good point! I have checked a few of their scrapers; they all seem to be directed at government pages, e.g. https://www.health.gov.au/news/health-alerts/novel-coronavirus-2019-ncov-health-alert/coronavirus-covid-19-current-situation-and-case-numbers for Australia. If we were to go down this road, it shouldn't take too long to vet each source, I guess.
Spain's data: neherlab/covid19_scenarios_data#11
Hey all, I have been working on https://coronadatascraper.com/ (aka https://github.com/lazd/coronadatascraper) in my own time, and am also a Sanity user both professionally and personally. We've been building scrapers over there only from official sources. No news, no aggregates, just governments directly (yes, this is a pain, since many governments like to publish free-text press releases, sometimes with useful numbers written out like ...). If there are any sources on there that aren't primary sources (government departments), please raise an issue on that GitHub repo and we'll work to sort it out. There's a Slack for that project too, if anyone wants to jump on and chat with those of us working on it.
I have now written a first parser for coronadatascraper.com (in my forked repo). In the latest version, it should also contain correct entries for regions such as USA-OK-Love County. Everything is stored in a global .tsv (and json as well). Re: source quality of coronadatascraper: Germany's numbers are pulled from the app of a tabloid... I don't think it will be possible to vet the sources of such an API, as they can change things as they see fit.
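As a rough illustration of the "global .tsv (and json as well)" storage mentioned in the comment above, the sketch below serializes a few records to both formats using only the Python standard library. The column names (`location`, `date`, `cases`, `deaths`), the helper names `to_tsv` and `to_json`, and the sample numbers are assumptions invented for this example; they are not the actual schema or data of the parser in the fork.

```python
import csv
import io
import json

# Hypothetical column layout; the real parser's schema may differ.
FIELDS = ["location", "date", "cases", "deaths"]


def to_tsv(records):
    """Serialize a list of dicts to a tab-separated string with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_json(records):
    """Group the same records by location and return them as a JSON string."""
    grouped = {}
    for rec in records:
        entry = {key: rec[key] for key in FIELDS if key != "location"}
        grouped.setdefault(rec["location"], []).append(entry)
    return json.dumps(grouped, indent=2)


# Invented sample rows, for illustration only (not real case counts).
records = [
    {"location": "USA-OK-Love County", "date": "2020-03-20", "cases": 1, "deaths": 0},
    {"location": "USA-OK-Love County", "date": "2020-03-21", "cases": 2, "deaths": 0},
]

tsv_text = to_tsv(records)
json_text = to_json(records)
print(tsv_text)
```

Keeping one flat row per location and date in the TSV makes the file easy to diff between updates, while the grouped JSON is more convenient for per-region lookups.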
Hi, thank you all for the work.
I have collected a lot of data for Germany (https://github.com/ManuelB/covid-19-vis/tree/gh-pages/germany). It is used to run a full simulation for 417 districts in Germany and runs on the command line. Details of what I am doing are described here: If the data were integrated into the data repository, the select box would show more than 400 items. I think this would be too many.
That's really cool @ManuelB. Thanks for sharing.
I can provide you with an API that gives COVID-19 data for all countries. It also gets updated frequently.
Hope it will help you guys.
For Brazil, I saw that the data available at https://brasil.io/dataset/covid19/ have been used. Great!
Hi @pauloangelo, thanks for highlighting this. The data needs to be updated manually by the maintainers of this project, and that has just not been done in the last 3 days. I am sure they will do this soon!
Thank you @noleti. I'm available to help, if needed. Thank you all for this remarkable initiative!
Hey @pauloangelo, sorry for the delay. We will update the data now and re-release soon!
Thank you @nnoll!
If you compile population sizes for Brazilian regions and their hospital capacities, we can add them as presets.
Hi all, The counts for "BRA-Distrito Federal" include cases from other regions that were detected in Distrito Federal. The Brasil.io dataset registers external cases as "Importados/Indefinidos". I suggest counting only the local cases. For example, for 29-May-2020 there are 142 local deaths, while the TSV counts 154. Best regards, PA
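The suggestion above, counting only local cases, could be sketched roughly as follows. The only detail taken from the comment is the "Importados/Indefinidos" label; the field names (`state`, `city`, `deaths`), the helper names, and the toy numbers are assumptions made up for this example and may not match the actual Brasil.io columns or the project's parser.

```python
# Label that, according to the comment above, Brasil.io uses for external cases.
EXTERNAL_LABEL = "Importados/Indefinidos"


def sum_all_deaths(rows, state):
    """Sum deaths over every row of a state, including external cases."""
    return sum(row["deaths"] for row in rows if row["state"] == state)


def sum_local_deaths(rows, state):
    """Sum deaths for a state, skipping rows attributed to external cases."""
    return sum(
        row["deaths"]
        for row in rows
        if row["state"] == state and row["city"] != EXTERNAL_LABEL
    )


# Toy rows with invented numbers (not real data); field names are hypothetical.
rows = [
    {"state": "DF", "city": "Brasília", "deaths": 10},
    {"state": "DF", "city": EXTERNAL_LABEL, "deaths": 2},
    {"state": "SP", "city": "São Paulo", "deaths": 30},
]

total = sum_all_deaths(rows, "DF")
local = sum_local_deaths(rows, "DF")
print(total, local)  # 12 10
```

With a filter like this, the per-state total would drop by exactly the number of deaths recorded under the external label, which is the kind of gap described in the comment (154 versus 142).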
Hi @rneher, I will have a look at it. For the hospital capacities, unfortunately, we don't have reliable data; the government keeps changing this information. For the population sizes, R0, etc., I believe we can provide values, at least for "BRA-Distrito Federal". Below is the link/data that we have been using in our weekly report. Weekly reports created by our observatory (parameters are also motivated here)
@pauloangelo I created a separate issue for this, let's continue there.
We are currently looking for case count data and other statistical information from different countries, as well as for people who can maintain this data (add, curate, update).
The entire process should be automated as much as possible. The README in the directory covid19_scenarios/data contains some information on how to get started: https://github.com/neherlab/covid19_scenarios/data
It also contains the preprocessed data, ready for consumption by the build system of the app.
If you think you may know where to find the relevant data for a country, please let us know either in this thread, or open an issue. If you are ready to contribute, feel free to open a pull request.
Don't hesitate to ask if you have any questions or if you need something to get started!
cc @nnoll @rneher