
highplainsdem

(60,014 posts)
Tue May 13, 2025, 09:46 AM

ChatGPT Blows Mapmaking 101: A Comedy of Errors (Gary Marcus, Substack). And AI models flunk finance, too.

https://garymarcus.substack.com/p/chatgpt-blows-mapmaking-101

People keep telling me ChatGPT is smart. Is it really?

I once again put it to the test. It was very good at some things, but not others. It was very good at giving me bullshit, my dog-ate-my-homework excuses, offering me a bar graph after I asked for a map, falsely claiming that it didn’t know how to make maps.

A minute later, as I turned to a different question, I discovered that it turns out ChatGPT does know how to draw maps. Just not very well.

-snip-

The recent vals.ai financial benchmarks showed comparable failures reading basic financial reports. (On questions like “Which Geographic Region has Airbnb (NASDAQ: ABNB) experienced the most revenue growth from 2022 to 2024?” performance was near zero.)

-snip-


See the Substack article for the maps, errors, and chatbot responses.


The April 2025 financial benchmarks testing results Gary mentioned are at

https://www.vals.ai/benchmarks/finance_agent-04-22-2025

and they're dismal.

-snip-

The foundation models are currently ill-suited to perform open-ended questions expected of entry-level finance analysts

A majority of models struggled with tool use in general, and more specifically for information retrieval, leading to inaccurate answers — most notably the small models like Llama 4 Scout or Mistral Small 3.1.

Models on average performed best in the simple quantitative (37.57% average accuracy) and qualitative retrieval (30.79% average accuracy) tasks. These tasks are easy but time-intensive for finance analysts.

On our hardest tasks, the models perform much worse. Ten models scored 0% on the Trends task, and the best performance on this task was only 28.6% by Claude Sonnet 3.7.

-snip-



The report concluded that "none of the existing AI models exceed 50% accuracy" and they "still have a long way to go before they can be deployed reliably and trusted in the finance industry."

No kidding.

But according to the AI bros and those duped by them, we should

• keep using generative AI models like this in every possible way,
• change laws and regulations to give the AI companies whatever they want,
• allow every bit of intellectual property to be used for free to train the AI,
• accept degradation of education and the environment and the internet by genAI,
• and approve and applaud hundreds of billions of dollars invested in AI

because it's AI!!!!
ChatGPT Blows Mapmaking 101: A Comedy of Errors (Gary Marcus, Substack). And AI models flunk finance, too. (Original Post) highplainsdem May 2025 OP
Toxic crap should be illegal imo. SheltieLover May 2025 #1
I really wish it was. highplainsdem May 2025 #4
I really do, too! SheltieLover May 2025 #7
I've been hearing the "You're simply not using it enough!" refrain for a few years now. Hugin May 2025 #2
Yes. Years of lies and hype from the AI bros. Have to keep that investment money flowing... highplainsdem May 2025 #5
It sucks.. I just asked it to show current measles cases confirmed in each of 50 states and it hlthe2b May 2025 #3
I've heard from teachers on Twitter that they're encountering students so brainwashed by the hype highplainsdem May 2025 #6

Hugin

(37,440 posts)
2. I've been hearing the "You're simply not using it enough!" refrain for a few years now.
Tue May 13, 2025, 09:56 AM

“It’ll get better! It’s leeearning!”

No, not really. Besides blaming the victim, that refrain ignores the fact that most users aren’t even interacting with the “teaching” cycles, other than having their scraped texts, artwork, and personal data used as source material. That’s an entirely different part of the process, and one that is usually not interactive.

highplainsdem

(60,014 posts)
5. Yes. Years of lies and hype from the AI bros. Have to keep that investment money flowing...
Tue May 13, 2025, 09:16 PM

hlthe2b

(112,818 posts)
3. It sucks.. I just asked it to show current measles cases confirmed in each of 50 states and it
Tue May 13, 2025, 10:02 AM

took from newspaper reports only. I know this because I know the case tallies in Colorado, which were not even included, even though they have been reported to CDC and confirmed. Even the May 9 data on the CDC page is more up-to-date. (31 states had reported cases by May 9, and ChatGPT reported only the 10 states with the most cases as of (supposedly) May 13.) Worthless.

SUCKS... Students that do not do their own research and depend on AI are going to fail (all of us), miserably.

highplainsdem

(60,014 posts)
6. I've heard from teachers on Twitter that they're encountering students so brainwashed by the hype
Tue May 13, 2025, 09:20 PM

that even when the teachers can point to reliable sources proving the chatbot is wrong, the kids will still assume the chatbot is correct. Which is both sad and scary.
