Creators say we shouldn’t worry about being replaced yet
There is a new breed of bot accounts coming to Twitter, but these aren’t put there by Russia or the CIA or whoever else is trying to influence an election. They’re novelty accounts, posting large quantities of tweets that mimic the style of existing users.
Twitter user @kingdomakrillic runs one of these accounts. He asked to be identified only by his Twitter handle. His parody account, @dril_gpt2, sends out a new tweet in the style of @dril several times a day. @dril is a somewhat mysterious, absurdist comedy account that posts its jokes from behind the pseudo-anonymity of a profile image of an incredibly blurry Jack Nicholson. @kingdomakrillic explains his reasoning for choosing @dril to imitate.
“I wanted to do a GPT-2 bot of someone who was both famous and whose voice on Twitter was near-exclusively comedic,” he says. “If I did, say, a Trump bot, the only humor would come from the novelty of a bot generating Trump-like tweets.”
These imitation @dril tweets can be shockingly on-brand yet original at times. It’s not uncommon to see replies wondering if the tweets from the account are still created by a bot.
@kingdomakrillic assures me the tweets are bot-written but hand-selected.
“Curating the tweets is like DJing. I pace the content out, placing tweets I’m sure are funny next to ones I’m more uncertain about,” @kingdomakrillic says. “Sometimes I screw up. It’s a skill, not 1/10th of the skill that goes into actually writing tweets like dril’s, but it’s still something I need to improve on. There’s no excuse to post duds when you can output infinite text.”
That infinite text doesn’t come from nowhere. It comes from GPT-2, a language model created by OpenAI, a research group with a focus on machine learning.
Sherrene Bogle is a computer science professor at HSU with experience using machine learning. Conceptually, teaching an algorithm how to do something is a lot like teaching a person. Bogle uses the example of teaching an algorithm to recognize whether a bird is in the foreground or the background of an image. First the algorithm is given a set of bird pictures that are already labeled by whether the bird is in the foreground or background, allowing it to figure out the differences. Then it’s given unlabeled bird images, where it looks for those same differences. The difference between a human and a machine doing this task is that the machine doesn’t actually understand what it’s doing. The machine simply recognizes patterns.
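The labeled-then-unlabeled process Bogle describes can be sketched in a few lines of Python. This is only a toy illustration, not how a real image classifier works: it assumes each picture has already been reduced to a single made-up number (the fraction of the frame the bird fills), and the data, function names and values are all hypothetical.

```python
# Hypothetical toy data: each image is reduced to one made-up feature,
# the fraction of the frame the bird occupies (0.0 to 1.0), plus a label.
LABELED = [
    (0.62, "foreground"),
    (0.55, "foreground"),
    (0.48, "foreground"),
    (0.08, "background"),
    (0.12, "background"),
    (0.05, "background"),
]


def train(examples):
    """Learn one 'pattern' per label: the average feature value."""
    totals, counts = {}, {}
    for value, label in examples:
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def predict(averages, value):
    """Label a new, unlabeled value by the closest learned average."""
    return min(averages, key=lambda label: abs(averages[label] - value))


averages = train(LABELED)
print(predict(averages, 0.50))  # a bird filling half the frame
print(predict(averages, 0.10))  # a bird filling a tenth of the frame
```

The program never knows what a bird is. It only compares numbers against the averages it computed from the labeled examples, which is the sense in which the machine "simply recognizes patterns."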
Instead of looking for where birds are in pictures, GPT-2 looks for patterns in text. Its job is to predict not just the next word, but the next couple of paragraphs. GPT-2 is so good at this task that it can make paragraphs of human-readable text after being given only a handful of words. The output text can be about anything, but in order to generate text that mimics the style of a Twitter user, programmers need to retrain the model.
@kingdomakrillic says he retrained GPT-2 on 9,500 tweets, totaling about 750 kilobytes. This narrows the original GPT-2 model, which was trained on almost 40 gigabytes of text, to a simpler task. The simpler a task, the better an AI can imitate it. Imitating tweets is simple, and with GPT-2’s vast capabilities, imitation yields good results.
There is also @kingdomakrillic’s curation, which gives many of his followers the impression that the AI is better than it really is.
Max Woolf is a data scientist at BuzzFeed, and the person responsible for making these Twitter bots so easy to create. He built a tool, called GPT-2 Simple, to easily retrain GPT-2 with any new data, such as tweets, and wrote an accompanying tutorial. Some people think AI is a threat to humanity, but Woolf says otherwise.
“The potential for harm is less than current human bad actors,” Woolf says.
@kingdomakrillic agrees with this sentiment.
“Some people get freaked out at the fact that GPT-2 can produce sentences that have humanlike coherence, but are made with no meaning or intent on the bot’s part besides to imitate how humans write,” he says. “Markov chains, Madlibs, autocompletors, exquisite corpses: they’re also capable of creating coherent text with the illusion of intent. They’re just not mysterious black box programs like GPT-2.”
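To see how little machinery one of those older techniques needs, here is a minimal sketch of a word-level Markov chain in Python. It is not the code behind @dril_gpt2 or GPT-2, and the sample sentence and function names are invented for illustration. The program records which words follow which in a source text, then builds new text by repeatedly picking a random recorded follower.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=None):
    """Walk the chain from a start word, picking a random follower each step."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical sample text, made up for this example.
SAMPLE = "the bot writes the tweet and the bot posts the tweet"
chain = build_chain(SAMPLE)
print(generate(chain, "the", 8, seed=1))
```

Every pair of neighboring words in the output appeared somewhere in the source, so the result tends to look locally sensible even though the program has no idea what it is saying. Unlike GPT-2, every step here can be read straight off the table of followers.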