CityLife

The Power of AI Models

OpenAI Launches GPTBot for Data Collection

By Vicky Stavropoulou

Aug 8, 2023

OpenAI has introduced GPTBot, a web crawler designed to collect data for training the company’s next generation of AI systems. Together with the company’s trademark application for “GPT-5,” the launch suggests that a new release from OpenAI is forthcoming. Functioning much like the crawlers behind popular search engines such as Google and Bing, GPTBot will gather publicly available data from websites. Website owners who prefer not to have their content included in the dataset can add a “disallow” rule for GPTBot to their site’s robots.txt file, the standard file that tells crawlers which pages they may visit. OpenAI says that the collected data will be filtered to remove personally identifiable information and content that violates the company’s policies.
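For site owners who want to confirm that such a rule behaves as intended, the short Python sketch below may help. It is an illustration only, not an official OpenAI tool: it assumes the user-agent token is “GPTBot” and that the rule takes the usual “Disallow: /” form, and it uses the placeholder address example.com and a hypothetical helper named is_allowed. It relies only on Python’s standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: blocks the crawler that identifies itself
# as "GPTBot" from the whole site and says nothing about other crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: report whether robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/post.html"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", page))
```

With the rule above, the check reports that GPTBot is blocked while crawlers with no matching rule remain allowed. Note that robots.txt is a voluntary convention: it only works when the crawler chooses to honor it.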

GPTBot’s opt-out approach has raised concerns among technology ethicists, who argue that consent issues remain unresolved. Some users on Hacker News, however, defend OpenAI’s decision, asserting that current data is necessary to keep AI models up to date. Earlier this year, the company updated its privacy policies in response to criticism of its data scraping practices, particularly with regard to training the large language models behind ChatGPT. The trademark application for GPT-5 further signals OpenAI’s commitment to training its next model.

In contrast to OpenAI’s focus on gathering extensive data for its models, Meta, the social media giant, has adopted a different strategy. Meta provides an open-source language model but restricts its usage by competitors and large businesses. Although Meta does not disclose the datasets it utilizes or the information it collects, it allows users to fine-tune the model with their own data.

OpenAI’s ChatGPT remains widely used, and the company’s partnership with Microsoft has bolstered Bing’s capabilities. As OpenAI continues to lead advances in AI, the expansion of internet data collection is raising concerns about copyright and consent. Striking a balance between transparency, ethics, and capabilities will be a complex challenge as AI systems become increasingly sophisticated.
