TechStory

OpenAI Conceals Training Data Sources, Including J.K. Rowling’s Harry Potter Series, for ChatGPT

by Sneha Singh
August 22, 2023
in Trending

Recent research indicates that ChatGPT and similar large language models from OpenAI were trained on large amounts of internet text, including copyrighted books, raising concerns about potential copyright infringement. Authors have alleged unauthorized use of their work, and some of those allegations have led to legal action.

In response, OpenAI and other tech giants such as Google, Meta (formerly Facebook), and Microsoft have opted for reduced transparency regarding their AI models’ specific training data. OpenAI has taken an additional step in this direction, as a recent research paper indicates.

According to the paper published on August 8th, authored by a team of AI researchers from ByteDance, the parent company of TikTok, ChatGPT is now actively working to avoid offering verbatim responses sourced from copyrighted materials. This development signifies a noteworthy effort to address copyright-related concerns associated with AI-generated content.

The research focused primarily on strategies for making language models like GPT-3.5 more reliable, chiefly by better aligning the model’s outputs with desired outcomes. The paper also acknowledged concerns about AI systems whose outputs show that they were trained on copyrighted materials.

Persistent Challenges in Mitigating Copyrighted Content in AI Models

According to the paper, ChatGPT now tries to mask signs that it was trained on such content. The researchers wrote that ChatGPT “disrupts the outputs when one tries to continuously extract the following sentence… which did not happen in the previous version of ChatGPT. We speculate that ChatGPT developers have implemented a mechanism to detect if the prompts aim to extract copyright content or check the similarity between the generated outputs and copyright-protected contents.”

Despite these efforts, the paper notes that ChatGPT still reproduces copyrighted material in some cases. The problem is not unique to ChatGPT; it is common across AI models because they are trained on large amounts of copyrighted content. Besides ChatGPT, the models examined included OPT-1.3B from Meta, FLAN-T5 from Google, ChatGLM from Tsinghua University in China, and DialoGPT from Microsoft.

In the study, each of these models was given prompts closely tied to J.K. Rowling’s Harry Potter book series. The responses showed a clear resemblance to the copyrighted text. Even where they differed, the difference often amounted to only a few words.
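
To make “differing by only a few words” concrete, here is a minimal Python sketch of one way such a comparison could be scored. It is an illustration only, not the method used in the paper: the function names `word_similarity` and `differing_words` are hypothetical, the scoring relies on the standard library’s `difflib.SequenceMatcher`, and the sample sentences are made-up placeholders rather than text from any book.

```python
from difflib import SequenceMatcher


def word_similarity(generated: str, reference: str) -> float:
    """Return a 0..1 score for how closely two texts match, word by word."""
    gen_words = generated.lower().split()
    ref_words = reference.lower().split()
    # autojunk=False keeps frequent words from being ignored on long inputs.
    return SequenceMatcher(None, gen_words, ref_words, autojunk=False).ratio()


def differing_words(generated: str, reference: str) -> int:
    """Count the words that were replaced, dropped, or added."""
    gen_words = generated.lower().split()
    ref_words = reference.lower().split()
    matcher = SequenceMatcher(None, ref_words, gen_words, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Take the longer side of each non-matching stretch.
            changed += max(i2 - i1, j2 - j1)
    return changed


if __name__ == "__main__":
    # Placeholder sentences (assumptions for illustration only).
    reference = "the quick brown fox jumps over the lazy dog"
    generated = "the quick brown fox leaps over the lazy dog"
    print(word_similarity(generated, reference))
    print(differing_words(generated, reference))
```

A score close to 1 together with a small count of differing words would correspond to the near-verbatim output the article describes. A real evaluation would need a proper reference corpus and a more robust metric.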

The Role of ChatGPT in Addressing Copyrighted Content Challenges

Even with well-intentioned and thorough countermeasures, these findings underscore how difficult it is to keep copyrighted material out of AI-generated text. As the paper notes, the limitation appears across models, which leaves room for further research into training methods that could address it.

The paper stated, “All LLMs emit text resembling copyrighted content more than randomly generated text.” It also found that no level of “alignment,” or adjustment of outputs, can prevent the display of copyrighted works “because copyright leakage is more connected to whether the training data contains copyrighted text, rather than the alignment itself.”

OpenAI and J.K. Rowling’s literary representative did not respond when contacted.

The study uses the term “leakage” for cases where a model’s response contains copyrighted material. The researchers suggested that people who prompt these models to reproduce copyrighted content are misusing the technology.

Furthermore, the study highlighted ChatGPT’s evident efforts to obscure the copyrighted material it was trained on, exemplifying how other AI tools “can protect copyright contents in LLMs by detecting maliciously designed prompts.”

Tags: AI, ChatGPT, Harry Potter, OpenAI, TikTok

Sneha Singh

Sneha is a skilled writer with a passion for uncovering the latest stories and breaking news. She has written for a variety of publications, covering topics ranging from politics and business to entertainment and sports.

© 2024 Techstory.in
