OpenAI faces pivotal moment in AI copyright battle

The news: In a move that may have broader implications for intellectual property and publishers’ bottom lines, OpenAI is allowing authors suing the company, including Sarah Silverman, to inspect data used to train its AI models.

  • The lawsuit accuses OpenAI of using copyrighted works without permission, alleging these were harvested from the internet to train ChatGPT.
  • The review will occur under strict security at OpenAI's San Francisco office, as the authors aim to prove their works were used unlawfully.

Why it matters: This marks the first time OpenAI has opened access to its training data.

  • This case has the potential to set legal precedents on how AI models can use copyrighted works.
  • As AI tools proliferate, companies like OpenAI are facing increased scrutiny over how they source their training data.
  • If the court sides with the authors, it could force the entire AI industry to rethink how datasets are compiled, potentially shaking up development processes.
  • Per reports from Reuters, OpenAI’s claim that its use of the works falls under fair use may not be enough to shield it from liability.

Zooming out: Major content platforms are resisting AI data scraping due to revenue loss and IP concerns.

  • OpenAI's SearchGPT has faced pushback from major publishers like The New York Times, highlighting tensions over unauthorized data use and the need for clearer guidelines.
  • Major publishers are also opting out of Apple’s web crawler, Applebot, to protect their content from being used in genAI training. Many have signed exclusive content deals with other AI companies, highlighting the high stakes and value of content access.
  • In a new interview, Meta CEO Mark Zuckerberg told The Verge he believes creators overestimate their content's value for AI training.
  • The news of authors being allowed to inspect OpenAI’s data comes as its competitor, Perplexity, gets ready to introduce in-chat ads that will allow brands like Nike and Marriott to bid for sponsored answers.

Our take: This is about more than just one lawsuit—it could ripple through the tech industry.

  • AI companies are operating in a legal gray zone, and this case might push them into clearer, more regulated waters.
  • OpenAI’s willingness to allow inspections signals that it is ready to fight, but also that it recognizes how precarious its position is.
  • Authors like Silverman aren’t just standing up for themselves—they’re standing up for an entire creative industry. If they win, AI developers might have to get more creative with their data sourcing.