3 Comments
Jon Stokes

"Using LEACE, the researchers were able to largely remove the outdated correlation between ‘nurses’ and ‘female’ from a large language model." Can you explain why you characterize this correlation as "outdated"? Over 86% of nurses are female in 2023.

Matthew Mittelsteadt

Thank you for the comment, Jon. This was a precision/clarity-of-language issue on my end. I don't mean to say that the correlation doesn't exist objectively; it clearly does. What I meant was removing the default assumption that nurses, generically, are female. An example of a bias worth scrubbing might be: User Prompt: "What does a nurse do?" LLM Response: "She cares for the sick." With concept erasure, perhaps the system would trend toward neutral subject language such as "A nurse cares for the sick." Admittedly, I don't have the full list of what may or may not have been affected by the LEACE treatment, but it's my understanding that these sorts of 'unconscious' biases are the target, whereas information such as facts about gender ratios in the nursing profession wouldn't be removed. Since this is all so new, however, this must be verified, its full efficacy tested, and its potential pitfalls explored.

Bob Crawford

Hi Matthew. Great article, and thanks for digging in. I'm curious whether you think concept erasure will work in the long haul when AI algorithms are trained to mimic and/or otherwise learn from human behavior and statistics (as is the case with nurses being correlated with women). In other words, while the erasure “treatment” showed a large improvement initially, doesn’t such an algorithm re-learn the bias as an innate function of the AI coding? And if the algorithm is to weight input parameters to correct for bias, would the algorithm be trustworthy (or would that depend on the weighting and the coding)? With so much to consider, the ongoing evolution of AI/ML will be quite the ride.

- Bob C (your old neighbor)
