Name-based demographic inference and the unequal distribution of misrecognition

tags
Algorithmic Discrimination

Notes

data generally lack demographic variables like gender, race/ethnicity, class, age, and religion that are at the core of traditional social research and marketing applications.

NOTER_PAGE: (2 0.5289902280130293 . 0.20067453625632375)

a survey of 19,924 authors of social science journal articles,

NOTER_PAGE: (3 0.22475570032573292 . 0.2790893760539629)

very specific sample (authors of social science journal articles), so probably a better-than-typical scenario for the tools

overall error rate for gender prediction of 4.6%

NOTER_PAGE: (3 0.32638436482084693 . 0.3128161888701518)

drastic differences in the error rate by subgroup.

NOTER_PAGE: (3 0.3446254071661238 . 0.16779089376053963)

By definition, automated gender inference is wrong for all 139 nonbinary scholars

NOTER_PAGE: (3 0.36156351791530944 . 0.20320404721753793)

wrong 3.5 times more often for women than men,

NOTER_PAGE: (3 0.3811074918566775 . 0.3819561551433389)

some subgroups like Chinese women have error rates over 43%.

NOTER_PAGE: (3 0.3817589576547231 . 0.5556492411467117)
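
Own sketch, not from the paper: a minimal way to see how a low overall error rate can coexist with very high subgroup error rates. The function name `error_rates`, the group labels, and the toy records are all hypothetical and invented for illustration; none of the counts come from the survey. The third group mirrors the point that a binary predictor is wrong by construction for anyone whose self-reported gender is outside its label set.

```python
from collections import defaultdict


def error_rates(records):
    """Return the misclassification rate per subgroup.

    records: iterable of (subgroup, self_reported, predicted) tuples.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for subgroup, self_reported, predicted in records:
        totals[subgroup] += 1
        if self_reported != predicted:
            errors[subgroup] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Toy records, invented for illustration only (NOT data from the paper).
toy = (
    [("group_a", "man", "man")] * 9
    + [("group_a", "man", "woman")] * 1
    + [("group_b", "woman", "woman")] * 6
    + [("group_b", "woman", "man")] * 4
    # A binary predictor can only output "man" or "woman",
    # so every record in this group counts as an error.
    + [("group_c", "nonbinary", "woman")] * 2
)

rates = error_rates(toy)
overall = sum(1 for _, actual, pred in toy if actual != pred) / len(toy)

for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1%}")
print(f"overall: {overall:.1%}")
```

The overall figure is dominated by whichever groups are largest and easiest to classify, which is why a single headline error rate says little about who bears the misclassification.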

Disparities in error rates are fundamental problems with the information content of names and the cultural construction of gendered and racialized groups. Thus they cannot be eliminated with more data or better statistics.

NOTER_PAGE: (3 0.4631921824104235 . 0.1753794266441821)

First, in cases where name-based demographic inference may not be theoretically or ethically justified, we urge critical refusal.

NOTER_PAGE: (3 0.7107491856677525 . 0.34317032040472173)

resist these associations by choosing ambiguous names for their children [16, 17], or by changing their own names later in life.

NOTER_PAGE: (4 0.20521172638436483 . 0.49662731871838106)

What name-based demographic imputation tools measure, then, is not the “ground truth” of a person’s or name’s gender or race (which does not exist), but rather the cultural “consensus estimates of how each name is gendered” or racialized.

NOTER_PAGE: (4 0.25667752442996744 . 0.35666104553119726)

Some people are invested in having their race ‘correctly’ identified by others, some are deeply invested in passing

NOTER_PAGE: (6 0.0964169381107492 . 0.32377740303541314)