MSC4247: User Pronouns #4247

Open
wants to merge 15 commits into
base: main

@everypizza1 everypizza1 commented Dec 27, 2024

Member

Implementation requirements:

  • Client
  • Server

Conduwuit already supports this by supporting arbitrary fields in #4133 so I'd assume this MSC only requires a client implementation that can read/write the field?

With tulir/gomuks#574, Gomuks supports rendering pronouns (setting them is not implemented in the UI yet, but is supported on the backend).

@turt2live added the proposal, client-server, kind:feature, and needs-implementation labels Dec 27, 2024
everypizza1 and others added 3 commits December 27, 2024 11:04
Co-authored-by: Travis Ralston <[email protected]>
Co-authored-by: Travis Ralston <[email protected]>
Member

@uhoreg uhoreg left a comment

Seems pretty straightforward. A few mostly-nitpicky comments.

}
```
The example uses it/its pronouns followed by she/her pronouns, both in English.
The array is ordered by preference, `language` should be a language code, and
Member

"language code" needs to be defined more precisely, I don't think anything in the current spec uses languages. #3554 seems to reference BCP-47

Author

Would ISO 639 language codes work?

Member

I think BCP-47 uses ISO 639, but also allows optional subtags for specifying regions (so both en and en-US are allowed in BCP-47, while ISO 639 only has the concept of en)

Author

@everypizza1 everypizza1 Dec 29, 2024

I'll switch to BCP-47 then, so there's more flexibility.
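
To illustrate what switching to BCP-47 could mean for a client, here is a minimal sketch of picking a pronoun entry for the viewer's locale. It uses the progressive-truncation idea from the RFC 4647 "lookup" scheme (`en-US` falls back to `en`). The function names `lookup_candidates` and `pick_entry` are hypothetical, and the entry shape (`language`, `summary`) is assumed from the proposal text quoted in this thread; nothing here is specified by the MSC.

```python
from typing import Optional


def lookup_candidates(tag: str) -> list[str]:
    """Progressively truncate a language tag, e.g. 'en-US' -> ['en-us', 'en']."""
    parts = tag.lower().split("-")
    candidates = []
    while parts:
        candidates.append("-".join(parts))
        parts.pop()
        # RFC 4647 lookup also drops a trailing single-character subtag
        # (such as the "x" in "en-x-foo") together with what followed it.
        if parts and len(parts[-1]) == 1:
            parts.pop()
    return candidates


def pick_entry(entries: list[dict], locale: str) -> Optional[dict]:
    """Return the best entry for `locale`; `entries` is ordered by preference."""
    for candidate in lookup_candidates(locale):
        for entry in entries:
            language = entry.get("language")
            if isinstance(language, str) and language.lower() == candidate:
                return entry
    # Assumption: with no language match, show the user's most-preferred entry.
    return entries[0] if entries else None


# Hypothetical profile data mirroring the example discussed in this thread.
entries = [
    {"language": "en", "summary": "it/its"},
    {"language": "en", "summary": "she/her"},
]
best = pick_entry(entries, "en-GB")
```

Note that truncation only shortens the viewer's locale: an entry tagged `en-US` would not match a viewer whose locale is plain `en`. Whether that is the desired behaviour is a design decision the MSC would need to make.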

Comment on lines 22 to 26
"subject": "it",
"object": "it",
"possessive_determiner": "its",
"possessive_pronoun": "its",
"reflexive": "itself",
Member

I'm not sure if these fields are a good idea, since they seem to be specific to English and probably aren't compatible with any other language. Also, I'd guess most implementations will simply use summary for displaying in the profile and won't even try to apply the other fields.

It might be better to just have the freeform field. Maybe also an enum (non-freeform string) for preferred grammatical gender, although even that could get complicated.

Is there any prior art or research into user-definable pronouns that support internationalization?

If the fields are kept as-is, each of them needs to be defined separately, it's not enough to have them in the example. Keeping the fields may also require an implementation actually using them to show they're useful.

Member

Thinking more about the enum: it should at least have values masculine, feminine and neuter. The potential complications I can think of are:

  • Some languages don't have a real neuter. Maybe that's not a problem though, because clients have to be able to fall back to gender being unknown anyway?
  • Some languages have animate and inanimate neuter forms (singular they vs it in English). Does the enum need to have those separately? (neuter_inanimate)
  • There might be other types in some weird languages. Are there any, and if there are, do they need to be options too? (and then how do other languages handle those options?)

On the implementation side: pronoun information is generally only needed for rendering state events like profile changes ("X changed his/her/their name"), but those state events don't include this profile info. Fetching the full profile for each profile change state event seems like a bad idea.

In any case, a freeform field should exist to display when viewing someone's profile. If the other fields can't be used effectively, then it may be best to narrow this down to only have the freeform field and nothing else.

Author

I think we may want to throw out subject, object, etc and do the masculine, feminine, and neuter with the animate and inanimate and for languages that don't support some of those, fall back to the closest to neutral.
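
A rough sketch of how the enum-with-fallback idea from this thread might look in a client, using the "X changed his/her/their name" case mentioned above. The enum values are the ones floated in the discussion (`masculine`, `feminine`, `neuter`, `neuter_inanimate`); the `unknown` member, the English word table, and the function names are assumptions made for illustration, not part of the proposal.

```python
from enum import Enum


class GrammaticalGender(Enum):
    MASCULINE = "masculine"
    FEMININE = "feminine"
    NEUTER = "neuter"
    NEUTER_INANIMATE = "neuter_inanimate"
    UNKNOWN = "unknown"  # assumed fallback for missing or unrecognised values


def parse_gender(value: object) -> GrammaticalGender:
    """Map a raw profile value to the enum, falling back to UNKNOWN."""
    if not isinstance(value, str):
        return GrammaticalGender.UNKNOWN
    try:
        return GrammaticalGender(value)
    except ValueError:
        return GrammaticalGender.UNKNOWN


# English possessive determiners per gender (an illustrative mapping).
EN_POSSESSIVE = {
    GrammaticalGender.MASCULINE: "his",
    GrammaticalGender.FEMININE: "her",
    GrammaticalGender.NEUTER: "their",
    GrammaticalGender.NEUTER_INANIMATE: "its",
    GrammaticalGender.UNKNOWN: "their",
}


def render_name_change(display_name: str, raw_gender: object) -> str:
    """Render a profile-change notice using the closest known form."""
    possessive = EN_POSSESSIVE[parse_gender(raw_gender)]
    return f"{display_name} changed {possessive} name"


message = render_name_change("Alice", "feminine")
```

A language without a neuter would supply its own table and map `neuter` to whatever its closest neutral form is. The sketch does not address the concern raised above that state events do not carry this profile information.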

@tcpipuk tcpipuk Jan 20, 2025

I've tried out a few options across a few languages. For languages like Russian, which modify other words in the sentence depending on the pronouns, I'd probably use something like this: you specify the grammatical pattern once per profile, so software knows which word forms to use around your pronouns, and you can optionally specify pronouns per language (or just English, if you prefer):

{
    "m.pronouns": [
        {
            "grammatical_pattern": "feminine",
            "forms": {
                "en": {
                    "subject": "she",
                    "object": "her",
                    "possessive_determiner": "her",
                    "possessive_pronoun": "hers",
                    "reflexive": "herself",
                    "dependent": "her"
                },
                "de": {
                    "subject": "sie",
                    "object": "sie",
                    "possessive_determiner": "ihre",
                    "possessive_pronoun": "ihres",
                    "reflexive": "sich",
                    "dependent": "sie"
                },
                "ja": {
                    "subject": "彼女",
                    "object": "彼女",
                    "possessive_determiner": "彼女の",
                    "possessive_pronoun": "彼女のもの",
                    "reflexive": "自分",
                    "dependent": "彼女"
                },
                "zh": {
                    "subject": "她",
                    "object": "她",
                    "possessive_determiner": "她的",
                    "possessive_pronoun": "她的",
                    "reflexive": "自己",
                    "dependent": "她"
                },
                "fi": {
                    "subject": "hän",
                    "object": "häntä",
                    "possessive_determiner": "hänen",
                    "possessive_pronoun": "hänen",
                    "reflexive": "itsensä",
                    "dependent": "hänen"
                }
            },
            "display": {
                "en": "she/her",
                "de": "sie/ihre",
                "ja": "彼女/彼女の",
                "zh": "她/她的",
                "fi": "hän/häntä"
            }
        }
    ]
}

This is just an example, I haven't factored in multiple genders/pronouns per language, etc... also in some languages we end up with duplicates because they don't distinguish between the two pronoun forms, but including all grammatical categories is probably simpler in the long run... what do you think?

(also, some languages like Japanese vary the content based on the "formality" so you'd likely need to put some wording in that it's out of scope for this MSC, as public pronouns can't really specify whether this particular interaction has a particular level of formality)
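
For reference, a minimal sketch of how a client might read the `forms` and `display` structure suggested above. The helper names `form_for` and `display_for` are hypothetical. Missing or empty forms fall back to a caller-supplied string (for example the user's display name), since not every language will fill every slot.

```python
def form_for(entry: dict, lang: str, role: str, fallback: str) -> str:
    """Return the pronoun form for `role` in `lang`, or `fallback` if absent or empty."""
    forms = entry.get("forms", {}).get(lang, {})
    value = forms.get(role)
    return value if isinstance(value, str) and value else fallback


def display_for(entry: dict, lang: str) -> str:
    """Return the short display string for `lang`, or an empty string if none is set."""
    value = entry.get("display", {}).get(lang)
    return value if isinstance(value, str) else ""


# Trimmed copy of the structure from the comment above.
entry = {
    "grammatical_pattern": "feminine",
    "forms": {
        "en": {"subject": "she", "object": "her", "possessive_determiner": "her"},
        "xx": {"subject": ""},  # hypothetical language with an unfilled slot
    },
    "display": {"en": "she/her"},
}

possessive = form_for(entry, "en", "possessive_determiner", "their")
subject = form_for(entry, "xx", "subject", "Alice")
```

Because the lookup is by key name, extra case forms can be added per language without changing the reader. The cost is that a client cannot know in advance which keys a given language defines.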

Running the test against more languages, Hungarian and Tamil would be a challenge... they have loads of forms that would be quite difficult to describe without supporting all of them 🤔

{
    "grammatical_pattern": "feminine",
    "forms": {
        "hu": {
            "subject": "ő", // Nominative: She
            "object": "őt", // Accusative: Her (direct object)
            "possessive_determiner": "ővé", // Possessive determiner: Her
            "possessive_pronoun": "övé", // Possessive pronoun: Hers
            "reflexive": "magának", // Reflexive: Herself
            "instrumental": "vele", // Instrumental: With her
            "dative": "neki", // Dative: To/for her
            "locative": "nála", // Locative: At her place
            "ablative": "tőle", // Ablative: From her
            "allative": "hozzá", // Allative: To her
            "elative": "belőle", // Elative: Out of her
            "illative": "bele", // Illative: Into her
            "superessive": "rajta", // Superessive: On her
            "sublative": "alá", // Sublative: Under her
            "delative": "róla", // Delative: From (off) her
            "terminative": "őig", // Terminative: As far as her
            "essive": "őként", // Essive: As her
            "translative": "ővé", // Translative: Becoming her
            "causal-final": "miatta", // Causal-final: Because of her
            "temporal": "őtől fogva", // Temporal: Since her
            "dependent": "őt" // General dependent form: Her
        }
    },
    "display": {
        "hu": "ő/őt"
    }
},
{
    "grammatical_pattern": "feminine",
    "forms": {
        "ta": {
            "subject": "அவள்", // Nominative: She
            "object": "அவளை", // Accusative: Her (direct object)
            "possessive_determiner": "அவளுடைய", // Possessive determiner: Her
            "possessive_pronoun": "அவளுடையது", // Possessive pronoun: Hers
            "reflexive": "தன்னால்", // Reflexive: Herself
            "instrumental": "அவளால்", // Instrumental: With/by her
            "dative": "அவளுக்கு", // Dative: To/for her
            "locative": "அவளிடம்", // Locative: At her place
            "ablative": "அவளிடமிருந்து", // Ablative: From her
            "genitive": "அவளுடைய", // Genitive: Belonging to her
            "vocative": "அவளே", // Vocative: Calling her
            "dependent": "அவளை" // General dependent form: Her
        }
    },
    "display": {
        "ta": "அவள்/அவளை"
    }
}

That said, if the main purpose is to just tell people some pronouns, and there's no expectation that clients will have different messages/etc based on the pronoun data, perhaps it could just be stated in the MSC that many of these are not needed for an online chat medium so have been intentionally left out?

It seems to me that the purpose of this MSC is to help people know what pronouns to use when talking to another person, not to entirely describe how a given language's grammatical system works.

array. These fields can be fetched through the
[profile API endpoints](https://spec.matrix.org/unstable/client-server-api/#profiles).
Clients should use these instead of they/them where possible. All fields
within `m.pronouns` are optional, exluding `"language"` and `"summary"`.
Contributor

Suggested change
- within `m.pronouns` are optional, exluding `"language"` and `"summary"`.
+ within `m.pronouns` are optional, excluding `"language"` and `"summary"`.

(since there is only one more field, it might be better to reverse the wording of this sentence and say that all are required except grammatical gender)
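
As a sketch of the rule under discussion (`language` and `summary` required, `grammatical_gender` optional), a client or server could check an `m.pronouns` value roughly like this. The function name and the error wording are made up for illustration, and the check assumes the array-of-objects shape shown in the proposal excerpt.

```python
def validate_pronouns(value: object) -> list[str]:
    """Return a list of problems found in an `m.pronouns` value (empty if none)."""
    if not isinstance(value, list):
        return ["m.pronouns must be an array"]
    errors = []
    for index, entry in enumerate(value):
        if not isinstance(entry, dict):
            errors.append(f"entry {index} must be an object")
            continue
        # Required fields: must be present and non-empty strings.
        for key in ("language", "summary"):
            field = entry.get(key)
            if not isinstance(field, str) or not field:
                errors.append(f"entry {index} is missing required field '{key}'")
        # Optional field: only checked when present.
        if "grammatical_gender" in entry and not isinstance(
            entry["grammatical_gender"], str
        ):
            errors.append(f"entry {index} has a non-string 'grammatical_gender'")
    return errors


ok = validate_pronouns(
    [{"language": "en", "summary": "it/its", "grammatical_gender": "inanimate"}]
)
bad = validate_pronouns([{"summary": "she/her"}])
```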

"avatar_url": "…", "displayname": "…",
"m.pronouns": [
{
"grammatical_gender": "inanimate",
Contributor

@sumnerevans sumnerevans Jan 4, 2025

What grammatical genders are available? I think they need to be enumerated. I don't know what grammatical gender "they/them" would be, for example.

I also think they need to be prefixed (m.male for example), and each value needs a definition of what it means for how UIs should adapt their rendering. This will also allow for future grammatical genders from other languages or linguistic innovations to be defined later (and use unstable prefixes).

Some languages distinguish between "animate" and "inanimate" so @everypizza1 may want to consider whether it's important to consider bot accounts as "inanimate" or whether we just anthropomorphise every user, but from brief research into a few African, Asian, and European languages, I think this should cover most of them:

{
    "m.pronouns": [
        {
            "grammatical_pattern": "masculine",
            "forms": {
                "en": {
                    "subject": "he",
                    "object": "him",
                    "possessive_determiner": "his",
                    "possessive_pronoun": "his",
                    "reflexive": "himself",
                    "dependent": "him"
                }
            },
            "display": { "en": "he/him" }
        },
        {
            "grammatical_pattern": "feminine",
            "forms": {
                "en": {
                    "subject": "she",
                    "object": "her",
                    "possessive_determiner": "her",
                    "possessive_pronoun": "hers",
                    "reflexive": "herself",
                    "dependent": "her"
                }
            },
            "display": { "en": "she/her" }
        },
        {
            "grammatical_pattern": "neuter",
            "forms": {
                "en": {
                    "subject": "it",
                    "object": "it",
                    "possessive_determiner": "its",
                    "possessive_pronoun": "its",
                    "reflexive": "itself",
                    "dependent": "it"
                }
            },
            "display": { "en": "it/its" }
        },
        {
            "grammatical_pattern": "common",
            "forms": {
                "sv": {
                    "subject": "hen",
                    "object": "hen",
                    "possessive_determiner": "hens",
                    "possessive_pronoun": "hens",
                    "reflexive": "sig",
                    "dependent": "hen"
                }
            },
            "display": { "sv": "hen/hens" }
        },
        {
            "grammatical_pattern": "plural",
            "forms": {
                "en": {
                    "subject": "they",
                    "object": "them",
                    "possessive_determiner": "their",
                    "possessive_pronoun": "theirs",
                    "reflexive": "themself",
                    "dependent": "them"
                }
            },
            "display": { "en": "they/them" }
        },
        {
            "grammatical_pattern": "epicene",
            "forms": {
                "en": {
                    "subject": "ze",
                    "object": "zir",
                    "possessive_determiner": "zir",
                    "possessive_pronoun": "zirs",
                    "reflexive": "zirself",
                    "dependent": "zir"
                }
            },
            "display": { "en": "ze/zir" }
        },
        {
            "grammatical_pattern": "none",
            "forms": {
                "fi": {
                    "subject": "hän",
                    "object": "häntä",
                    "possessive_determiner": "hänen",
                    "possessive_pronoun": "hänen",
                    "reflexive": "itsensä",
                    "dependent": "hänen"
                }
            },
            "display": { "fi": "hän/häntä" }
        }
    ]
}

There are extra forms not covered here, e.g. some languages have "noble" versions, but this MSC might want to just state an assumption that a chat protocol does not necessarily need or want to record the nobility level of the user 🙂

Author

> may want to consider whether it's important to consider bot accounts as "inanimate" or whether we just anthropomorphise every user

It's not just bot accounts that would have that - some users also use inanimate pronouns. The extended classes of pronouns will probably end up being removed, because most users will just want the summary and grammatical gender.

Labels
client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. proposal A matrix spec change proposal
8 participants