Remove duplicates from lists #80

Open
DavideViolante opened this issue Jul 30, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@DavideViolante
Owner

DavideViolante commented Jul 30, 2023

Examples: Andrea, Juan, etc.

For a duplicated name, only the most frequent entry should be considered.

related #79

@Micosilent
Contributor

Micosilent commented Jul 31, 2023

After checking all the maps, I found duplicates in every list. According to my code (gender-detection-from-name/duplicateDeletion), there are a few more duplicates than we originally thought.

There are 4082 duplicated names:

  • 3542 in the enMap
  • 2561 in the itMap
  • 3570 in the esMap

If we exclude the esMap, there are 2021 duplicates between the itMap and the enMap (of which 67 have different genders).
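For anyone who wants to reproduce this kind of count without checking out the branch, here is a minimal sketch of how duplicates between two maps could be split by whether their genders agree. It is not the code on my branch: the `Gender` and `NameMap` types, the `findDuplicates` helper and the sample data are all assumptions made up for illustration, and the real maps may be shaped differently.

```typescript
// Hypothetical types: the real maps in the repo may use a different shape.
type Gender = "male" | "female";
type NameMap = Map<string, Gender>;

interface DuplicateReport {
  sameGender: string[];      // names present in both maps with the same gender
  differentGender: string[]; // names present in both maps with different genders
}

// Compare two name maps and split the shared names by whether their genders agree.
function findDuplicates(a: NameMap, b: NameMap): DuplicateReport {
  const report: DuplicateReport = { sameGender: [], differentGender: [] };
  for (const [name, gender] of a) {
    const other = b.get(name);
    if (other === undefined) continue; // name only exists in `a`
    if (other === gender) {
      report.sameGender.push(name);
    } else {
      report.differentGender.push(name);
    }
  }
  return report;
}

// Toy data, not taken from the real lists.
const itSample: NameMap = new Map<string, Gender>([
  ["andrea", "male"],
  ["maria", "female"],
  ["luca", "male"],
]);
const enSample: NameMap = new Map<string, Gender>([
  ["andrea", "female"],
  ["maria", "female"],
  ["john", "male"],
]);

const report = findDuplicates(itSample, enSample);
console.log(report.sameGender.length, report.differentGender.length);
```

Keeping the two groups separate is what makes the "same gender" and "different genders" counts easy to report side by side.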

EDIT: I missed an actionable point: we could start by removing all the duplicates that have the same gender across the maps, and think about a solution for the differing genders later.

PS: to verify this yourself, check out my branch and run npm run name_lint
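As a rough sketch of that first step, assuming one map is treated as the primary and the duplicate is dropped from the other (that priority rule is my assumption, nothing has been decided here), same-gender duplicates could be removed like this. The `removeSameGenderDuplicates` helper, the types and the sample data are hypothetical.

```typescript
// Hypothetical types: the real maps in the repo may use a different shape.
type Gender = "male" | "female";
type NameMap = Map<string, Gender>;

// Delete from `secondary` every name that `primary` already lists with the same gender.
// Names whose genders differ are left untouched in both maps and returned for later review.
function removeSameGenderDuplicates(primary: NameMap, secondary: NameMap): string[] {
  const conflicts: string[] = [];
  for (const [name, gender] of primary) {
    const other = secondary.get(name);
    if (other === undefined) continue; // not a duplicate
    if (other === gender) {
      secondary.delete(name); // safe duplicate: same gender in both maps
    } else {
      conflicts.push(name); // differing genders: decide later
    }
  }
  return conflicts;
}

// Toy data, not taken from the real lists.
const enSample: NameMap = new Map<string, Gender>([
  ["andrea", "female"],
  ["maria", "female"],
  ["john", "male"],
]);
const itSample: NameMap = new Map<string, Gender>([
  ["andrea", "male"],
  ["maria", "female"],
  ["luca", "male"],
]);

const conflicts = removeSameGenderDuplicates(enSample, itSample);
console.log(conflicts, [...itSample.keys()]);
```

With this toy data, "maria" is removed from the secondary map while "andrea" stays in both and is reported as a conflict.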

@DavideViolante DavideViolante added the enhancement New feature or request label Aug 1, 2023
@DavideViolante DavideViolante changed the title Remove duplicates from ES list Remove duplicates from lists Aug 11, 2023

2 participants