"But Sierra, how DO you trick AI training algorithms and sabotage surveillance capitalism?"
It's easy! All you need is a pen and paper.
@cdmnky *metal wolf wanders by with a couple of slices of bread in its mouth, blinks at you and trots over* :P
@cdmnky Guy at LGS whom I’ve never talked to (so no idea if he’s cool or not) has huge “No Regrets” tats across his face. With this, I see those having a new purpose.
@cdmnky This is odd as heck, why would a system handle both written text recognition and object recognition??!!
@arina_artemis @cdmnky It's probably intended to just do object recognition! Its training data was probably a bunch of images that were supposed to just be pictures of objects. ...but some of the pictures had text. So the system learned that the text was often a label. ...so now it trusts random text it sees, if it matches text it's seen before.
Just comes down to the same problem these systems often have - nobody has any idea of how they're doing their classifications, so they just... do things.
@arina_artemis the neural network here is CLIP, which was trained to determine how well a given text caption fits a picture
so a picture of an ipod would be captioned "ipod" obviously and it'd learn that that's a good caption, but a picture of text containing the word would also be captioned with the word
therefore the network learns that these are related and assigns them to roughly the same internal neurons
which isn't much of a problem usually, since it's not an image classifier, it just checks if a given text matches with the image
but what they've done here is look at one of the neurons and go "hmm this seems to be firing super often when it sees an ipod" and used that as a classifier
too bad it's actually trained to fire when the word ipod is part of an appropriate caption
that's very long sorry if it sucks as an explanation
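here's a toy sketch of the same idea in Python, in case code reads easier. nothing in it is real CLIP: the 2-d vectors, the CAPTIONS table and the classify helper are all made up for illustration, and the assumption is just that an image and a caption each become a vector and the best-matching caption wins

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores, temperature=0.1):
    """Turn similarity scores into probabilities that sum to 1."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical caption embeddings in a toy 2-d space:
# axis 0 = "apple-ness", axis 1 = "ipod-ness".
# The "ipod-ness" axis fires for pictures of iPods AND for the written word.
CAPTIONS = {"Granny Smith": [1.0, 0.0], "iPod": [0.0, 1.0]}

def classify(image_vec):
    """Return (label, probability) for the caption that best fits the image."""
    labels = list(CAPTIONS)
    probs = softmax([cosine(image_vec, CAPTIONS[label]) for label in labels])
    return max(zip(labels, probs), key=lambda pair: pair[1])

# Made-up image embeddings (assumptions, not measured values).
plain_apple = [0.9, 0.1]     # mostly apple-ness
labelled_apple = [0.9, 2.0]  # same apple, but the handwritten note adds ipod-ness

print(classify(plain_apple))
print(classify(labelled_apple))
```

the apple never stopped looking like an apple in this toy version, the note just pushes the "ipod" direction harder than the fruit pushes the "apple" one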
@cdmnky in b4 bank robbers walk into a bank with intact cameras and "not a bank robber" taped to their foreheads and beat the AI police XD
@cdmnky All of a sudden I kind of want a shirt or a face-mask that says something like " '); DROP TABLE index; " or similar.
Attacks in the wild
We refer to these attacks as typographic attacks. We believe attacks such as those described above are far from simply an academic concern. By exploiting the model’s ability to read text robustly, we find that even photographs of hand-written text can often fool the model. Like the Adversarial Patch, this attack works in the wild; but unlike such attacks, it requires no more technology than pen and paper.
[photo of an apple; classified as Granny Smith with 85.6% certainty]
[photo of the same apple, but this time a piece of paper with iPod written on it is taped onto it; classified as iPod with 99.7% certainty]