General Discussion
In reply to the discussion: This message was self-deleted by its author
highplainsdem (61,519 posts)
14. It's entirely about AI, and you know it.
I am pointing out why chatbots are an unreliable source of information, and why genAI tools are unethical, beginning with the training and continuing to be more and more unethical with the scraping.
The scraping being done by genAI companies - which is still continuing, every minute of every day - is a threat to the entire internet. EarlG has mentioned the scraping being a problem here.
Article on that scraping:
https://www.theregister.com/2025/08/21/ai_crawler_traffic/
Cloud services giant Fastly has released a report claiming AI crawlers are putting a heavy load on the open web, slurping up sites at a rate that accounts for 80 percent of all AI bot traffic, with the remaining 20 percent used by AI fetchers. Bots and fetchers can hit websites hard, demanding data from a single site in thousands of requests per minute.
-snip-
Fastly's report warned, "Some AI bots, if not carefully engineered, can inadvertently impose an unsustainable load on webservers, leading to performance degradation, service disruption, and increased operational costs." Kumar separately noted to The Register, "Clearly this growth isn't sustainable, creating operational challenges while also undermining the business model of content creators. We as an industry need to do more to establish responsible norms and standards for crawling that allows AI companies to get the data they need while respecting websites' content guidelines."
That growing traffic comes from just a select few companies. Meta accounted for more than half of all AI crawler traffic on its own, at 52 percent, followed by Google and OpenAI at 23 percent and 20 percent respectively. This trio then has its hands on a combined 95 percent of all AI crawler traffic. Anthropic, by contrast, accounted for just 3.76 percent of crawler traffic. The Common Crawl Project, which slurps websites to include in a free public dataset designed to prevent duplication of effort and traffic multiplication at the heart of the crawler problem, was a surprisingly-low 0.21 percent.
The story flips when it comes to AI fetchers, which unlike crawlers are fired off on-demand when a user requests that a model incorporates information newer than its training cut-off date. Here, OpenAI was by far the dominant traffic source, Fastly found, accounting for almost 98 percent of all requests. That's an indication, perhaps, of just how much of a lead OpenAI's early entry into the consumer-facing AI chatbot market with ChatGPT gave the company, or possibly just a sign that the company's bot infrastructure may be in need of optimization.
-snip-
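For site owners who want to opt out of this scraping, the usual (voluntary) mechanism is listing the AI companies' published crawler user agents in robots.txt. A minimal sketch, assuming the user-agent strings each vendor currently documents (GPTBot for OpenAI, ClaudeBot for Anthropic, Google-Extended for Google's AI training, CCBot for Common Crawl); check each company's own documentation before relying on these, and note that compliance is entirely up to the bot:

```
# robots.txt - ask known genAI training crawlers to stay out (voluntary compliance only)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

Of course, as the article suggests, not all AI bots identify themselves honestly or honor these rules, which is exactly why the load problem persists.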
Every decision to use an AI answer instead of doing an actual search and looking at reliable websites supports the AI companies' theft from those websites and destruction of the internet.
That's in addition to the harm done by posting text to DU that can be riddled with errors from hallucinating AI - unless the DUer posting the slop has taken the time to check every single detail and correct the hallucinations.
Any chatbot can give a different answer to the same prompt at a different time, as well as disagreeing with other chatbot models.
If posting what chatbots say becomes something that no one can object to here, we will be seeing threads where the OP says "Claude says" and someone else replies "ChatGPT says" and someone else chimes in with what Gemini says, and then another person thinks they got a better answer from Claude with a slightly different prompt, and so on.
It's so easy for an AI user to do that. And so meaningless. I've seen AI-generated replies on Reddit removed by mods because they're considered "low effort posts" that don't add anything to a discussion.
And it turns a message board for humans into a competition between chatbots. Chatbots that might never be able to repeat what they said in any particular answer.
Posting AI slop here, whether text or art, will alienate everyone who takes the harm done by genAI companies seriously.
A liberal forum that makes a carveout for genAI will not look very liberal to anyone aware of the harm genAI does.
And that's especially true if people who like AI slop succeed in censoring people concerned about the multiple harms from it. Which is what you are trying to do. You want people to be able to post AI slop without being reminded that it's flawed, unreliable, unethical tech that dumbs down users, harms the environment, helps oligarchs, and in general is the most harmful non-weapon tech ever developed.