Bulk of improvement suggestions #855

mtsfer · 2024-09-25T22:15:24Z

#822 is pretty important and suggests great improvements to the project. However, these improvements require a lot of collective effort, and they are close to impossible to make given the current structure of the repository and the existing issues in the dataset.

When I say collective effort, I'm including people who do not know how to work with SQL. JSON is a much more readable format for laypeople. If contributions could be made in the JSON files, more people would be able to help add new places to the dataset and fix inconsistencies.

Here are some suggestions for the project (in order of urgency, from my point of view):

1. Allow contributions in JSON (actually, this should be the default way): Updating records in an inlined SQL insert statement is pretty counter-productive. I would suggest moving contributions to JSON. From there, you could easily create the SQL insert statements and generate the dataset in other file formats too.
2. Reduce the size of the repository: There is almost 2 GB of data in this repo, and cloning it to contribute is a huge pain, which partially explains the low number of contributors. Most of the data is repeated across multiple file formats, and repeated again with permutations of place types (e.g., countries+cities, countries+states, countries+states+cities). I'm quite sure that individual files would be more than sufficient. Making the data available only in the most used file formats would also help.
3. Normalize the database: There are some inconsistencies in the dataset caused by inadvertent denormalization. With the database normalized, it would also be easier to identify problems and fix them.
4. Include translations for cities and states.
5. Introduce more specific places to the dataset, as suggested in Incoherence between the data across tables #822.
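
To illustrate the JSON-first suggestion, here is a minimal sketch of how SQL insert statements could be generated from JSON records. The table name `cities`, the field names (`id`, `name`, `country_code`), and the helper functions are assumptions made up for the example, not the repository's actual schema or tooling.

```python
import json


def sql_literal(value):
    """Render a Python value as a SQL literal (simplified, illustrative only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape single quotes by doubling them, as in standard SQL strings.
    return "'" + str(value).replace("'", "''") + "'"


def records_to_inserts(table, records):
    """Build one INSERT statement per JSON record (hypothetical helper)."""
    statements = []
    for record in records:
        columns = ", ".join(record.keys())
        values = ", ".join(sql_literal(v) for v in record.values())
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return statements


# Hypothetical sample data; a real script would read the JSON files instead.
SAMPLE_JSON = """
[
  {"id": 1, "name": "Sao Paulo", "country_code": "BR"},
  {"id": 2, "name": "Coeur d'Alene", "country_code": "US"}
]
"""

if __name__ == "__main__":
    for statement in records_to_inserts("cities", json.loads(SAMPLE_JSON)):
        print(statement)
```

With something like this, contributors would only edit the JSON, and the SQL dump (or any other export format) would be a generated artifact. The quoting here is only good enough for producing dump files from trusted data; anything that runs queries against a live database should use parameterized queries instead.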

@dr5hn Thanks a lot for this project, it's a gem.

dosubot bot added the enhancement New feature or request label Sep 25, 2024

mtsfer changed the title ~~Bulk of improvement suggestions to the project~~ Bulk of improvement suggestions Sep 25, 2024
