Let’s Stop Pretending AI Is New
I’ve been thinking a lot about generative AI lately. It’s kind of hard not to with the latest ChatGPT announcement. The technologies we’re witnessing are powerful, impressive, and developing fast. Everything you’ve been seeing on screen, for example, is footage from OpenAI’s new video-generating model, Sora. But let’s peel back the algorithmically patterned wallpaper for a moment and take a hard look at the structure behind it.
The AI of the 2020s isn’t new. But its consequences are. If you’re watching this, they’ve already affected you. So how should we, the public, respond to tools that rely upon more data than we could ever fathom? How can they change our relationship to work? And…do we need to panic?
A lot of people are both excited and scared about the state of AI right now, and rightfully so. One of my goals with this channel, though, is to provide you with reasons to remain optimistic. Today, I’m going to try to put the recent explosion of interest in AI into context.
Before we get into it, I want to be clear. When I use the word “AI,” I’m specifically referring to generative AI. That includes large language models, or LLMs, like ChatGPT, and image generators like Midjourney.
Basically, these programs are meant to perform specific tasks. And to describe the way they work as simply as possible: they identify patterns. When they find patterns in a given input that match the data they’ve been trained on, they use that data as a springboard to form a new output. Or at least that’s the idea.[1]
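To make that “pattern” idea concrete, here’s a deliberately tiny sketch in Python of the underlying intuition: record which word tends to follow which in some training text, then generate new text by continuing those patterns. Real LLMs use neural networks, billions of parameters, and vastly more data, but the “learn the patterns, then extend them” loop is the same in spirit. Everything in this snippet is invented for illustration.

```python
import random
from collections import defaultdict

# Toy "language model": record which word follows which in the
# training text, then generate by continuing those patterns.
training_text = "the cat sat on the mat and the cat slept on the rug"
words = training_text.split()

next_words = defaultdict(list)
for current, following in zip(words, words[1:]):
    next_words[current].append(following)

def generate(start, max_words=8):
    """Continue a prompt by repeatedly sampling a word that
    followed the current word in the training data."""
    output = [start]
    for _ in range(max_words):
        options = next_words.get(output[-1])
        if not options:  # no pattern learned for this word
            break
        output.append(random.choice(options))
    return " ".join(output)

print(generate("the"))  # e.g. "the cat sat on the mat and the cat"
```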
What’s key is that these tools are not examples of artificial general intelligence (AGI), or the Marvins and HALs of sci-fi spaceships. They’re far narrower than that. Overeager or not, tech companies do recognize that AGI is still a goal.[2][3] So keep that Oliver Cheatham choreography in your back pocket for now.
My main goal with this video is to contribute nuance to larger conversations about AI as a whole. Which is why I want to start by reminding you…
AI Isn’t New
I know that statement might seem obvious to some and confusing to others, so let me clarify. Actually, I have a couple of friends who can help me with that.
“Most people don’t realize this, but AI has been around for a long time, and it helps solve all kinds of problems across all kinds of industries. Before starting my channel, I spent eight years as a rocket scientist at MIT, and part of my job was deploying machine learning algorithms to help space telescopes and long range radars detect really small and fast moving objects. We ended up building a few neural networks and training them to understand what to look for and what they can ignore.” — Alex Divinsky – Ticker Symbol You
“I was a software engineer at Salesforce, and we had this product called Einstein and it brought AI to your data. It was a lot like Netflix’s recommendations. When you watch one show, it’ll tell you, hey, you’ll like this show as well. But it was largely pattern based.” — Ricky Roy – Two Bit Da Vinci
For reference, Einstein launched in 2016.[4] But we can go even further back than the 2000s. Researchers have been picking at what we now know as generative AI for way longer than you might think. Let me tell you about the time I first met ELIZA.
Our first family computer was a Commodore 64. Yup…64KB of RAM with no disk or hard drive of any kind. My brother, Sean, and I would spend hours sitting in our little upstairs playroom nook plugging in lines of programming code from a book of BASIC.
ELIZA is one program that’s stuck with me all these years. It would ask you questions and then follow up on your answers in the style of a Rogerian psychologist. This was important to the illusion because Rogerians encourage therapy patients to do most of the talking.[5][6] The technological trick behind the scenes was that ELIZA searched for “keys” in your sentence. In other words, it was looking for patterns.
ELIZA: “What did you do today?”
Me: “I played with a Hot Wheels car.”
ELIZA: “Tell me more about the Hot Wheels car.”
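For fun, here’s a minimal sketch of that keyword trick, in modern Python rather than the BASIC we typed in. The real ELIZA used a much richer script of decomposition and reassembly rules, so treat the patterns and canned replies below as invented stand-ins.

```python
import re

# A few ELIZA-style rules: look for a "key" pattern in the input
# and reflect part of the sentence back as a follow-up question.
RULES = [
    (re.compile(r"i played with (.+)", re.I), "Tell me more about {0}."),
    (re.compile(r"i feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"my (\w+)", re.I), "Tell me about your {0}."),
]

def respond(sentence):
    """Return a canned follow-up based on the first matching key."""
    cleaned = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(*match.groups())
    return "Please, go on."  # default when no key matches

print(respond("I played with a Hot Wheels car."))  # Tell me more about a Hot Wheels car.
print(respond("It was fun."))                      # Please, go on.
```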
For a little kid in the 1980s, this was mind-blowing, and it felt like you were talking to something alive inside the computer … until you turned it off.
Sound familiar? In any case, ELIZA is just one of many in a long line of precursors to the chatbots we know today. And if you observe collective reactions to these types of programs across history, you’ll notice that people’s tendency to anthropomorphize AI helps perpetuate false ideas about its capacities.
We can look at the persona of the chatbot known as “Eugene Goostman” for another example. You’ve probably heard of Turing tests, which are basically an interpretation of a concept famously discussed by mathematician Alan Turing. In a formative 1950 paper, he proposed a theoretical “imitation game” to determine a machine’s ability to exhibit behavior indistinguishable from a human’s.[7][8] Since then, various groups have organized competitions with panels of judges to evaluate the “humanness” of chatbots — though it’s important to know that Turing tests don’t have universally agreed upon rules, and not everyone finds this form of assessment valuable.[9]
When it comes to Goostman, its creators sought to give the bot a “personality” by establishing a backstory. He…I mean _it_… is meant to act like a 13-year-old Ukrainian boy with a pet guinea pig…so you can probably see how this might have made the bot more convincing during Turing tests. I mean, when have middle school conversations not been awkward and clunky?[10]
So, is this cast of characters all that removed from what we’re contending with now? Yes and no. Yes in the sense that, speaking broadly, these Bots from Before operated within systems that directly involved human hands, whether through programming languages or mimicking inputs from crowd-sourced conversations.[11]
This is unlike the popular large language models of today, which use machine learning. And more specifically, it’s the “deep” kind of learning: a.k.a. neural networks. The whole point of these networks is to loosely imitate how the brain’s neurons work, thereby allowing AI systems to “learn” with less intervention.[12][13]
The chatbots that have set the past few years abuzz are built upon different foundations, yes. But those foundations themselves are just as old.[14] In the U.S., the groundwork goes back to 1943, when scientists Warren McCulloch and Walter Pitts laid out the mathematics for an algorithm to classify input data.[15] You know…the same sort of task you perform every time a website asks you to complete a CAPTCHA to prove your humanity.
Then, in 1957, psychologist Frank Rosenblatt further advanced what would become the basis of neural networks through what he called “the perceptron.” He then married math to metal by building a “Mark I” version of the machine. Its purpose? To recognize images.[16][17]
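If you’re curious what that looks like in practice, here’s a minimal sketch of the perceptron learning rule in Python instead of 1950s hardware: a single unit nudges its weights whenever it guesses wrong, until it learns to separate two classes. The task (the logical AND function) and the numbers are just a toy example, not a recreation of the Mark I.

```python
# A single perceptron learning the logical AND function.
# Weights are nudged toward the correct answer whenever the
# prediction is wrong, which is the core of the perceptron rule.
training_data = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.1

def predict(x):
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

for _ in range(20):  # a few passes over the data are enough here
    for x, target in training_data:
        error = target - predict(x)
        weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
        bias += learning_rate * error

print([predict(x) for x, _ in training_data])  # -> [0, 0, 0, 1]
```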
Hey, why don’t we read some news? Here are a few quotes from the introduction to a New York Times piece on machine learning:
“Computer scientists, taking clues from how the brain works, are developing new kinds of computers that seem to have the uncanny ability to learn by themselves.
…The new computers are called neural networks because they contain units that function roughly like the intricate network of neurons in the brain. Early experimental systems, some of them eerily human-like, are inspiring predictions of amazing advances.”
Oh wait, hang on. This paper is dated…1987. Right around the time I was punching ELIZA code into my Commodore 64.[18]
To give you a more recent peek into how long we’ve tinkered with machine learning, I can point to my own career. Once upon a time, I worked on competitive multiplayer games. You could win prizes by beating other players, so there was a huge incentive for people to cheat. To counter that, the development team created bot detection systems that let us analyze move history data from previous matches, which revealed the subtle differences between how humans and cheat programs play. It was pretty effective.
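I can’t share the actual system, so here’s a purely hypothetical sketch of the kind of signal move history can expose: humans vary their timing a lot, while simple cheat scripts tend to act at suspiciously regular intervals. The real systems looked at far more than timing; the function, numbers, and threshold below are all invented for illustration.

```python
import statistics

def looks_like_a_bot(move_times_ms, min_variation_ms=40):
    """Flag a player whose time between moves barely varies.

    Humans are noisy: they hesitate, get distracted, and rush.
    Simple cheat scripts often fire at near-constant intervals.
    The threshold here is invented purely for illustration.
    """
    gaps = [b - a for a, b in zip(move_times_ms, move_times_ms[1:])]
    return statistics.stdev(gaps) < min_variation_ms

human = [0, 850, 2100, 2600, 4900, 5400]   # irregular gaps
script = [0, 500, 1000, 1500, 2000, 2500]  # metronome-like gaps

print(looks_like_a_bot(human))   # False
print(looks_like_a_bot(script))  # True
```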
But we needed human data to make a comparison. And like the chatterbots of yore, our modern Bards and Copilots fundamentally rely upon human data to operate. Be it a quirky conversation partner in the 80s or an aspiring assistant in the 2010s, AI systems interpret massive amounts of information and make their best guesses as to what to do with it. Without all the data that we produce, they can’t do much. And that’s part of the problem.
Why Are People So Concerned?
“The question I always have is, where does that data and training come from? It does come from human art, right? Whether it’s writers or artists, painters, or videographers. So I do worry, are we using our creativity to train AI to basically replace us?” -Ricky Roy – Two Bit Da Vinci
As a YouTube creator, I think it’s for the best that I start with the AI-generated elephant in the room. OpenAI has profited off my work. OpenAI has profited off of every YouTuber’s work. OpenAI has profited off of any work that’s ever been published on the Internet. And we know this because the company ran out of online text to scrape, so it went out of its way to develop a transcription program to capture every sound on the Internet that it could. Every video, every podcast, every audiobook. It’s already done, and none of us has seen a cent for it, or so much as an acknowledgement that we had a part in it.[19] Companies now want your forum replies and blog posts, too, while they’re at it.[20][21][22]
Late last year, Ed Newton-Rex, a musician who uses AI himself, pointed out that there’s no existing social contract for generative AI training. Meaning, you can’t justify the mass consumption of virtually all the communications published on the internet by comparing the practice to how humans learn. As he wrote in a tweet:[23]
“Every creator who wrote a book, or painted a picture, or composed a song, did so knowing that others would learn from it. That was priced in. This is definitively not the case with AI. Those creators did not create and publish their work in the expectation that AI systems would learn from it and then be able to produce competing content at scale. The social contract has never been in place for the act of AI training.”
Don’t get me wrong: OpenAI is not the only one doing this. And that’s another thing. The act of hijacking people’s voices, art styles, and identities without their consent is already being legitimized because of how easy gen AI makes it. Just a few weeks ago, someone trained a model on Marques Brownlee’s reviews to build a product recommendation tool using his likeness. Did he have anything to do with it, or any idea it was even being created? No.[24]
Another reason for the negative response toward the spike in AI advancement is, well, the suddenness of it all. I know I just said that this stuff isn’t anything new. But what I mean is that up until very recently, the average person didn’t interact with AI…in a way that they were immediately aware of. What’s changed is that companies are now presenting AI as a consumer product for everyone. It’s leapt from research computers to social media and smartphone apps. In other words, it’s more accessible than ever before.
“So my experience with machine learning before 2020 was pretty minimal, mostly playing games against the computer, which was some kind of form of machine learning or rule based system, though I didn’t really know it at the time. Currently, day to day though, I use it a lot more. I use it in coding during my PhD, and also when I’m exploring broad topics, both in the PhD and during YouTube video research.” — Ryan Hughes – Ziroth
Then there are even more big-picture problems that threaten both livelihoods and lives, and a lot of them come down to transparency. For years, tech giants have deliberately obscured the human labor they exploit to reinforce incorrect assumptions that AI has reached major milestones.[25][11] In essence, these systems have been behaving more like a Mechanical Turk. By mid-2022, over a thousand workers working remotely from India were reviewing transactions for Amazon’s “Just Walk Out” shopping system. Those workers make the magic happen, not fully autonomous “deep learning techniques.”[26][27][28][29] In Amazon’s words, though, they’re a vague group of “associates” keeping things accurate.[28]
Similarly, the ChatGPT we know wouldn’t exist without Kenyan workers. In late 2021, OpenAI partnered with the data labeling company Sama to outsource the excruciating process of identifying graphic content — that way, it could train GPT-3 to not reproduce it. After reading up to hundreds of passages depicting violent topics like suicide and sexual abuse in explicit detail for nine hours a day, Kenya-based Sama employees would take home less than $2 an hour for their trouble.[30][31][32][33]
Another major issue is that the mechanics of neural networks still aren’t entirely understood.[34][35] That lends itself to a host of complicated consequences that are best summed up by the concept of the “black box.” The black box is the opaque middle of a hypothetical system. You know your input and you know your output…but you can’t see the process that got you from point A to point B.[36]
But if you can’t decipher the internal workings of a tool that is being used to make decisions, how do you ensure that it’s working properly? How do you prevent it from furthering biases that cause harm? These questions are not just the stuff of dystopian stories. Algorithms determining the “riskiness” of human beings have already been around for a while. Steven Spielberg’s movie adaptation of “Minority Report” came out in 2002, but England and Wales had already begun implementation of the Offender Assessment System (or OASys) in 2001.[37] It’s still in use today.
Again, what’s changed is the public availability and mass appeal of the technology, not so much the actual systems. The innovations that at least seemed incremental are now overpowering in their speed, scale, and scope. It’s like we can’t catch our collective breath. Developers are continuing to concentrate more and more resources into AI, businesses are rushing to brand themselves as “AI-first,” and every month there’s another eye-popping spectacle…that might really just be a dumpster fire.[38][39]
Hey, remember those Sora clips I showed earlier? Yeah, about that…the Toronto-based video production company Shy Kids actually used Sora to produce its short film “Air Head.” The ratio of footage the team generated versus what actually made it into the final minute-and-a-half cut was about 300:1. And there was a lot of “we’ll fix that in post.”[40] I’d suggest you read the fine print before you use generators, but I doubt it would be legible.
What Does This All Mean?
Well, you’ve heard from my peers already. What do I think about all this? Overall, I’d say I feel torn. AI is amazing, but the origins of the current suite of products are unethical for a number of reasons. And most critically, the damage has already been done. We’ve already explored that angle, so let’s move into the positives.
The number of use cases for these tools is dizzying, so I’ll stick to talking about the applications that I can vouch for, ones that are workable right now — not what’s plausible, promised, or someday possible. All that could be its own video.
If you haven’t noticed before, I use an AI tool to dub my videos in languages other than English. It’s been a little hit or miss, but offering multiple audio tracks helps me reach more viewers across the world. We’ve received some pretty good feedback (and…some bad). It’s kind of trippy to hear my own voice speaking a language that I can’t. You can check it out on this video.
Then there’s what’s available in Notion, which is the platform I use to plan my videos. Since it introduced AI, I’ve been able to make the video production process more convenient. I’ve set up a system that automatically pulls online articles relevant to topics I cover, then summarizes them into a short paragraph. This makes it super easy to comb through countless headlines.
I also use a lot of Photoshop’s AI tools when making my videos’ thumbnails. I don’t generate images from scratch, but oftentimes I like an existing photo that’s been shot vertically. That won’t work for the aspect ratio I need, so I scale the canvas up, click content-aware fill, and bam … instant landscape orientation. And I’m not alone — other YouTubers do this, too.
This is barely scraping the surface, considering all the tedious tasks we could automate, and all the discoveries that can be sped up using AI.[41][42][43][44] New drugs, improved battery chemistries, nuclear fusion calculations…the world is truly our oyster, sitting on an ocean rockbed. Sunshine is streaming through the water and reflecting off its surface. The water is shallow and light blue…
But we can’t get ahead of ourselves here. AI is still reliant on humans. You don’t push a toddler on a tricycle down a steep hill, so we shouldn’t expect proficiency from technology that is quite literally still in training.[45][46] Over and over again, businesses have placed too much confidence in an AI tool and regretted the decision immediately.
“…one thing that I learned from actually using these tools every day is just how important people are to the generative AI process. It still takes a lot of work to get the outputs that you want in the quality that you need, so humans aren’t going anywhere.” — Alex Divinsky – Ticker Symbol You
On top of all the other problems I’ve mentioned, the amount of resources required to train and use these models can’t be ignored.[47] It’s not just electricity, but water for cooling and space for data centers.[48] According to a 2023 study, Google, Microsoft, and Meta withdrew about 2.2 billion cubic meters’ worth of water in 2022…which is twice the total annual water use of all of Denmark.[49] “Not sustainable” is an understatement.
What Do We Do?
I’ve given you a lot to digest so far, and even then my points are far from exhaustive. But I’d like to come back to the question I posed earlier: should we be freaking out? I don’t think we need to panic. I think we need to hold these tech companies accountable for how they handled the training data…and be prepared for where this tech is heading. AI and machine learning may not be new, but these new AI tools are already changing our world … and fast. So, how will you move forward? Will you change your relationship to social media? Will you advocate for regulation? Will you prioritize doing the inconvenient thing — supporting human creators like me?
1. What is Generative AI?
2. What is artificial general intelligence (AGI)?
3. The hype around DeepMind’s new AI model misses what’s actually cool about it
4. Salesforce forms research group, launches Einstein A.I. platform that works with Sales Cloud, Marketing Cloud
5. ELIZA–A Computer Program For the Study of Natural Language Communication Between Man and Machine
6. Person-Centered Therapy
7. Turing test
8. I.—COMPUTING MACHINERY AND INTELLIGENCE
9. Minds, brains, and programs
10. Bot with boyish personality wins biggest Turing test
11. Facebook’s Perfect, Impossible Chatbot
12. What is ML?
13. AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the difference?
14. Large Margin Classification Using the Perceptron Algorithm
15. A logical calculus of the ideas immanent in nervous activity
16. The perceptron: A probabilistic model for information storage and organization in the brain.
17. Electronic Neural Network, Mark I Perceptron
18. MORE HUMAN THAN EVER, COMPUTER IS LEARNING TO LEARN
19. A.I.’s Original Sin
20. Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT
21. Exclusive: Reddit in AI content licensing deal with Google
22. Tumblr and WordPress to Sell Users’ Data to Train AI Tools
23. People often say AI should be able to train on everything, without consent, because that’s how humans learn.
24. Reply to “Introducing AskMKBHD:”
25. The Exploited Labor Behind Artificial Intelligence
26. How Amazon’s Big Bet On ‘Just Walk Out’ Stumbled
27. Does Amazon’s cashless Just Walk Out technology rely on 1,000 workers in India?
28. An update on Amazon’s plans for Just Walk Out and checkout-free technology
29. Theo Wayt
30. Inside Facebook’s African Sweatshop
31. Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic
32. I’m paid $14 an hour to rate AI-generated Google search results. Subcontractors like me do key work but don’t get fair wages or benefits
33. The Trauma Floor
34. Scientists Increasingly Can’t Explain How AI Works
35. Large language models can do jaw-dropping things. But nobody knows exactly why.
36. What is a black box? A computer scientist explains what it means when the inner workings of AIs are hidden
37. A ‘black box’ AI system has been influencing criminal justice decisions for over two decades—it’s time to open it up
38. Your ‘AI first’ company is giving me AI fatigue
39. OpenAI’s Sam Altman doesn’t care how much AGI will cost: Even if he spends $50 billion a year, some breakthroughs for mankind are priceless
40. Actually Using SORA
41. How AI might speed up the discovery of new drugs
42. New chef dataset brings AI to cooking
43. A model that generates complex recipes from images of available ingredients
44. Accelerating computational materials discovery with artificial intelligence and cloud high-performance computing: from large-scale screening to experimental validation
45. AI hype is built on high test scores. Those tests are flawed.
46. Scientists warn of AI collapse
47. The overlooked climate consequences of AI
48. The surging demand for data is guzzling Virginia’s water
49. Making AI Less “Thirsty”: Uncovering and Addressing the Secret Water Footprint of AI Models