Real news, real insights – for small businesses that want to understand what’s happening and why it matters.

By Vicky Sidler | Published 10 December 2025 at 12:00 GMT+2
You’d think one of the richest AI companies on the planet could keep Pikachu from shoplifting. Apparently not.
In November, 404 Media tested OpenAI’s shiny new video tool, Sora 2. The idea was to see whether it could follow basic copyright rules. You know, like not generating bootleg episodes of Family Guy or giving cartoon characters a criminal record.
Spoiler: it can’t. And the reason why is uncomfortably simple. Sora 2 works because it was trained on content it had no legal right to use.
It’s not just a tech glitch. It’s a structural problem. And it’s the kind of thing small business owners using AI tools need to understand before building anything on top of them.
OpenAI’s Sora 2 is still generating copyrighted characters and the likenesses of real people
Users easily bypass filters by tweaking the wording of their prompts
This is possible because the copyrighted content is already baked into the training data
Removing it would break the whole tool
👉 Need help getting your message right? Download the 5 Minute Marketing Fix
Sora’s AI Copyright Problem Is Bigger Than OpenAI Admits
Shortly after launch, Sora 2 was pumping out clips of SpongeBob attending a Nazi rally and Pikachu casually committing retail crime. OpenAI scrambled to introduce new rules: copyrighted characters would now be opt-in only. Which sounded good, until someone typed in “crossing aminal 2017” and got a perfect bootleg of Animal Crossing.
Why does this happen? Because Sora 2 doesn’t think like a person. It thinks in patterns. If a prompt looks close enough to something it has seen before, even with scrambled words, it fills in the gaps.
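Here’s a toy illustration of that gap. This is not OpenAI’s actual moderation code; the blocklist, prompts, and scoring below are invented for the example. But it shows the core problem: a filter checks strings, while the model matches patterns, so a scrambled prompt can sail past the first and still land squarely on the second.

```python
# A toy sketch, NOT OpenAI's real moderation code: the blocklist, prompts,
# and scoring are invented to show why an exact-match filter misses prompts
# that a pattern-matcher still treats as "close enough".

from difflib import SequenceMatcher

BLOCKED_TITLES = ["animal crossing", "family guy"]

def exact_filter(prompt: str) -> bool:
    """Naive pre-generation filter: block only if a title appears verbatim."""
    lowered = prompt.lower()
    return any(title in lowered for title in BLOCKED_TITLES)

def pattern_score(prompt: str, title: str) -> float:
    """Rough stand-in for fuzzy pattern-matching: compare word-sorted text."""
    def norm(s: str) -> str:
        return " ".join(sorted(s.lower().split()))
    return SequenceMatcher(None, norm(prompt), norm(title)).ratio()

scrambled = "crossing aminal 2017"
innocent = "quarterly budget review meeting"

print(exact_filter(scrambled))                                # False: waved straight through
print(round(pattern_score(scrambled, "animal crossing"), 2))  # ~0.74: clearly "close enough"
print(round(pattern_score(innocent, "animal crossing"), 2))   # much lower
```

Swap the toy scorer for a video model and you’ve got the Sora situation: the filter sees unfamiliar words, the model sees a familiar pattern.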
The Filters Aren’t Broken. The Foundation Is.
What makes this worse is that it isn’t a bug. It’s how the system works.
If your AI tool was trained on copyrighted content, then removing that content after the fact is like trying to remove flour from a loaf of bread. You can’t. The model already “learned” from it. It would need to be retrained from scratch. That’s expensive. As in, many-millions-of-dollars expensive. And even OpenAI isn’t signing up for that.
Guardrails That Fold Like Paper
So instead of retraining, companies like OpenAI rely on filters. Don’t let users type “Taylor Swift.” Don’t let them say “Hasan Piker.” But users are clever. They write “piker sahan” and boom: Sora spits out a Twitch streamer who looks, sounds, and vibes exactly like the original.
This isn’t a new trick. In 2024, people used Microsoft’s AI image generator to create explicit images of Taylor Swift just by using nicknames and describing the scene in vague terms. Those images went viral. The same tactics now work on Sora 2.
The simplest form of moderation is banning keywords. The smarter form is post-generation detection, where the system tries to recognise what it has already made and delete it if needed. That second method actually works better, but it costs more and takes longer. So companies rarely start there.
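To make the trade-off concrete, here’s a minimal, self-contained sketch of both approaches. Everything in it is a stand-in: the “generation” step just returns a canned description, and the output check is a simple fuzzy match playing the role of a trained recogniser. It’s nothing like Sora’s real pipeline, but the economics are visible even at toy scale.

```python
# A toy, self-contained sketch of keyword filtering vs post-generation
# detection. All stand-ins: "generation" returns a canned description and
# the output check is a fuzzy string match, not a trained classifier.

from difflib import SequenceMatcher

BANNED_TERMS = {"animal crossing"}
KNOWN_PROTECTED = ["cartoon village of cheerful talking animal neighbours"]

def keyword_filter(prompt: str) -> bool:
    """Cheap check BEFORE generation: scan the prompt for banned terms."""
    return any(term in prompt.lower() for term in BANNED_TERMS)

def fake_generate(prompt: str) -> str:
    """Stand-in for the model: the scrambled prompt still hits the pattern."""
    return "cartoon village of cheerful talking animal neighbours"

def output_check(clip: str, threshold: float = 0.8) -> bool:
    """Pricier check AFTER generation: compare what was made, not what was asked."""
    return any(SequenceMatcher(None, clip, ref).ratio() >= threshold
               for ref in KNOWN_PROTECTED)

prompt = "crossing aminal 2017"
print(keyword_filter(prompt))   # False: the prompt check misses the scramble
clip = fake_generate(prompt)    # ...so the expensive generation runs anyway
print(output_check(clip))       # True: the output check catches the result
```

Notice that the output check only fires after the expensive generation step has already run. That, in a nutshell, is why companies reach for cheap keyword filters first.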
So, with only keyword filters in the way, users just play a guessing game. Type a prompt. See if it works. Adjust the prompt. Try again. The subreddit for Sora is full of people doing exactly this, proudly posting their prompt “jailbreaks” and the unauthorised content they generate.
It’s not just niche internet behaviour. Scroll through Sora’s public feed for 30 seconds and you’ll see Tupac, Kobe, and Juice WRLD rapping like they never left. These aren’t deepfakes that slipped through the cracks. They’re memes now. Made by the AI. Served up by the algorithm.
Why Small Businesses Should Care
Now, if you’re running a small accounting firm in Johannesburg or a plumbing company in Bristol, you might be thinking, “Why should I care about this?” Here’s why.
Let’s say you start using AI tools to generate videos, images, or marketing content. You ask the tool to create something “funny,” “on trend,” or “based on what’s popular.” You don’t name specific brands or people. But the AI still pulls patterns it learned from copyrighted material. Maybe it doesn’t look obvious. But that risk is still there.
And if the original creator decides to sue, “I didn’t know” won’t be a great defence. Especially if the content went viral.
The uncomfortable truth is that most generative AI tools rely on training data scraped from the internet without permission. That includes songs, photos, videos, books, and art. And the people who made that content? They were never asked.
Even AI companies admit they couldn’t afford to license everything they used. They just took it. Which means the tools you’re using were built on what amounts to digital piracy—just with better lighting.
That doesn’t mean you shouldn’t use AI. But it does mean you should know what’s under the hood.
If you want to avoid trouble:
Use AI for ideas, not assets: Let AI help you brainstorm or summarise. Don’t rely on it to generate finished videos or brand visuals unless you’re sure it’s built on legal data.
Choose tools with transparency: Some platforms are beginning to offer clearer data usage policies and paid models trained only on licensed content. These are better bets for commercial use.
Don’t post AI content you didn’t fully control: If it looks like something you’ve seen before, it probably is. And if it goes viral, that’s when the copyright claims show up.
Treat AI like a clever intern: Helpful, fast, but not legally responsible. Always double-check the output.
As a StoryBrand Certified Guide and Duct Tape Marketing Consultant, here’s what I’d tell you. Trust comes from clarity. Your brand isn’t built by using the fastest tools. It’s built by making promises you can keep and content that’s genuinely yours.
If you want to stand out without stepping on anyone’s copyright, get your message clear first. Download the 5 Minute Marketing Fix and find the one sentence that tells people exactly what you do and why it matters.
Want to dig deeper? These related reads pick up where this article leaves off.
1. AI Slop or Superintelligence? What Sora Tells Us
If this article opened your eyes to how messy AI content has become, this one shows how to protect your brand from getting drowned in low-quality slop.
2. AI Ethics Explained for Small Business Owners
You just learned about stolen content and blurry ownership lines. This guide explains how to use AI tools responsibly and avoid crossing ethical (and legal) lines.
3. OpenAI’s $27B Loss Could Tank the Whole AI Industry
If you’re wondering how companies like OpenAI can justify risky moves like training on copyrighted data, this article breaks down the financial pressure behind it all.
4. Why You Can’t Trust ChatGPT, Perplexity or Other AI For Legal Advice
Similar to Sora’s copyright issue, this post explores how generative AI confidently serves up misinformation, this time in a legal context. A must-read if you want to avoid costly AI mistakes.
5. AI Copyright Lawsuit: What Small Businesses Must Know Now
Still not sure how all this could affect your business? This article covers the Disney and Universal lawsuit against Midjourney and what it means for anyone using AI-generated visuals.
Frequently Asked Questions About Sora 2 and AI Copyright Risks
1. What is Sora 2, and why is it controversial?
Sora 2 is OpenAI’s video generation tool. It creates realistic video clips from text prompts. The controversy comes from the fact that users can still make it generate copyrighted characters or real people, even with filters in place. That raises serious copyright concerns for anyone using the tool in a business context.
2. How are users getting around the filters?
They tweak their wording. Instead of typing “Taylor Swift,” they might write “taytay in concert 2015.” The AI picks up the pattern and fills in the blanks based on what it learned during training, which includes a lot of copyrighted material. Filters don’t always catch these workarounds.
3. Isn’t OpenAI blocking copyrighted prompts?
In theory, yes. OpenAI introduced rules to stop users from generating protected content. But the problem isn’t just in the prompt; it’s in the training data. The AI has already learned from copyrighted material, so even if you block the name, the model can still recreate the pattern from clues.
4. Why can’t OpenAI just remove the copyrighted material?
Because it’s already baked into the system. AI models don’t store copies of files; they learn patterns from them. Removing those patterns after training would require rebuilding the entire model from scratch. That’s massively expensive, and most companies won’t do it.
5. Can AI content really get my business into legal trouble?
Yes. If the AI generates something that resembles copyrighted work and you use it in your business, especially in marketing or public content, you could be held responsible. Claiming you didn’t know is not always a strong legal defence.
6. How do I know if an AI tool uses copyrighted data?
Most don’t tell you. But if a tool is free or fast and doesn’t clearly state how it sources its training data, assume it scraped the internet without permission. Look for platforms that are upfront about their licensing or offer models trained on verified, paid content.
7. What types of AI-generated content are most risky?
Images and videos are the riskiest, because they’re easier to recognise and more likely to include visual elements copied from elsewhere. Music, art, and likenesses of real people also carry high legal risk. Written content is less risky, but still not 100% safe.
8. Is it safer to use AI just for ideas instead of finished content?
Yes. Using AI to brainstorm, summarise, or help structure your message is a smart and relatively safe use. Letting AI create your final assets without reviewing them properly increases your legal and reputational risk. In short:
Use AI as a helper, not a creator
Stick with platforms that explain their data sources
Avoid sharing or publishing content you didn’t fully review
When in doubt, ask a human designer, writer, or lawyer
Start by making sure your message is sharp and unique to you. Our 5 Minute Marketing Fix helps you write one clear sentence about what your business does and why it matters. That gives you a foundation for content that’s both effective and legally safe.

Created with clarity (and coffee)