🤖 AI in the Public Sphere: A 2024 Look Ahead
3 Questions + Analysis on Artificial Intelligence
Hope everyone had a Merry Christmas and Happy Holidays! I’m so grateful that you’ve joined me here at Statecraft in its inaugural year - your support makes this newsletter possible and it truly means the world to me. I hope you’ve enjoyed my articles so far and I’m excited to bring you even more in the new year. Hope you enjoy this subscriber-exclusive post + video!
Scale almost always finds a way of amplifying preexisting issues and laying bare some unforeseen challenges with new technology. 2023 was the year that “ChatGPT”, “LLMs”, and even “AI” truly entered the mainstream vernacular. And as such, I want to outline some questions and musings about AI going into the new year.
Please feel free to comment your thoughts if you feel so inclined. Maybe as the new year goes by, we’ll revisit some of these questions in isolation. Now, let’s get into it:
🍪 Will AI-Generated Content Eat Itself? And How - or If - Will Platforms Adapt?
I know this is 2 questions but they’re intertwined so I’m taking my liberties. I came across NotByAI.fyi a few months back and that really kickstarted this train of thought. The site poses an interesting question that reminds me of that game “Snake” on Nokia phones:
Snake on a Nokia phone - rendering. Courtesy: Nokia Phones
Basically, the thought process goes:
AI has lowered the bar for creating tremendously detailed or realistic content that achieves “viral” status (think of the Pope’s Balenciaga jacket or Trump’s less-than-peaceful arrest). This lower bar allows a critical mass of AI-generated images, text, and video to flood the Internet at a much higher rate than new human-generated content.
AI Generated images of the Pope in a Balenciaga puffer jacket
But this AI-generated content (“AIGC”) is imperfect despite its realism. See the Pope’s right hand in the left image above and his left hand in the right image above.
Because models are trained on content and its associated metadata (e.g. alt text, captions) scraped from around the internet, models will incorporate AIGC into newer training data - assuming there is no reliable, systematized way to tell whether a piece of content was generated by DALL-E, Midjourney, Stability, Gemini, ChatGPT, LLaVA, Bard, etc.
Those new training data will incorporate imperfections, or echoes from the uncanny valley, more and more as the concentration of AIGC on the internet grows.
An empirical study of the uncanny valley
Tying back to Snake: this is a feedback loop where the quality of model output degrades as the concentration of AI-generated content grows.
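The feedback loop above can be illustrated with a toy simulation. This is a hedged sketch of the idea (often called “model collapse” in the research literature), not how image or text models actually train: here each “generation” fits a simple Gaussian model to samples produced by the previous generation, and the distribution, sample size, and generation count are arbitrary assumptions for illustration.

```python
import numpy as np

# Toy "model collapse" simulation: each generation fits a Gaussian to
# samples drawn from the PREVIOUS generation's fitted Gaussian - i.e.,
# models training on their predecessors' output instead of human content.
rng = np.random.default_rng(0)

def run_generations(n_samples=20, n_generations=300):
    mu, sigma = 0.0, 1.0          # the "true" human-content distribution
    history = [(mu, sigma)]
    for _ in range(n_generations):
        # "Content" this generation's model produces
        samples = rng.normal(mu, sigma, n_samples)
        # The next model is fit only to that synthetic content
        mu, sigma = samples.mean(), samples.std()
        history.append((mu, sigma))
    return history

history = run_generations()
print(f"initial std: {history[0][1]:.3f}, final std: {history[-1][1]:.3f}")
```

In this toy setting the fitted standard deviation tends to shrink across generations - the model’s output loses diversity and drifts from the original distribution, which is the “Snake eating its own tail” dynamic in miniature.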
I’ll be happier if this process doesn’t play out or if training data becomes more curated to preempt this.
AIGC’s explosion in 2023 has given rise to changing aesthetics and policies on content platforms. Here’s how a few platforms have responded:
DeviantArt: Worked with Stability AI, then effectively opted artists in by default to having their images used as AI training data for DeviantArt’s in-house image generator (DreamUp), then reversed that policy - all while the artist community was divided by both the initial cooperation with Stability and the propagation of AI on the platform
Medium: Changed its incentive structure to reward “high quality human writing”, explicitly requiring Medium membership for monetization and adjusting monetization metrics to weight engagement more heavily
TikTok: Added a “Label as AI-Generated Content” button (anecdotally this isn’t used very often and seems like it’s just there to preempt EU / US inquiry into AIGC on the platform)
As a content creator, I’ll be listening to platforms on what even constitutes “AIGC” in their view and how - or if - they prioritize human-generated content.
🔐 Will Privacy and Copyright Haunt AI?
Convoluted privacy settings, opaque training data, personal info leaks, and lawsuits breed trust crises. All of these have been discussed in the context of AI to some degree. Privacy researchers have pointed out the labyrinthine process of requesting to remove personal data from AI training datasets, as well as how prodding AI for personal info may actually yield PII. Although I mostly see these concerns coming from the AI / privacy practitioners of the world, I’ll be curious to see if the concerns will spill over to the general public (e.g. will “OpenAI sees all your data on the internet no matter what” become the new “Facebook listens to your private conversations to serve ads”).
I’ve been acutely aware of legislation restricting AI use working its way through statehouses and cities. They generally seem to have privacy and data protection (for both consumers - “Private Sector Use” - and governments - “Government Use”) at their core:
Additionally, the New York Times has sued OpenAI for copyright infringement, trademark dilution, and unfair competition. The Times specifically calls out its articles’ prominence in the Common Crawl dataset as well as ChatGPT’s ability to seemingly bypass NYT’s paywalls. It’s a case that will be highly consequential for the viability of training sets in commercial AI products and the future of journalism:
Questions of ownership - of both copyright and personal data - will continue into the new year and probably well beyond.
💡 How Will Ideological Splits Affect AI?
We could probably use a whole newsletter / video just for this one. One of the reasons I get particularly hyped when talking about AI is that it raises a lot of questions - fundamental questions - about us, humans. “What is art? What is intelligence, really? Who owns what? Is Artificial General Intelligence just around the corner or 50 years away?” And because of that, the vision for where AI will be in the next decade-plus varies wildly depending on who you’re talking to.
This split is partially captured by the ousting and swift reinstatement of OpenAI’s Sam Altman.
Perhaps the most intriguing schism I’ll be following is between the Effective Altruist (EA) community and academic researchers. There seem to be fundamental disagreements here about which issues to even focus on. Whereas academic researchers tend to focus on issues like bias, disinformation / hate speech, privacy, and child sexual abuse material (CSAM) that urgently need to be safeguarded against in the now, the time horizon for EAs is perceptibly longer-term. EAs focus on the benefits of Artificial General Intelligence as an end that justifies moving quickly without hindrances (e.g. safety guardrails) and point to human-extinction-level events as the primary safety concerns to focus on.
Those are my top three going into the new year, but that’s just me! They’re far from the only storylines. Here are a couple more topic-specific stories that I’ll be monitoring as well - we’ll call these “Honorable Mentions”:
Labor movements and labor commodification
Urbanism, self-driving, and the fight for the future of cities
If you enjoyed this article or video, please consider subscribing below. It’s a free way to support the content, we’ll never send you spam, and you can always change your mind later. Thanks for reading/watching!