    AI’s most important benchmark in 2026? Trust

January 2, 2026 · 7 Mins Read

    In 2026 (and beyond) the best benchmark for large language models won’t be MMLU or AgentBench or GAIA. It will be trust—something AI will have to rebuild before it can be broadly useful and valuable to both consumers and businesses.

    Researchers identify several different kinds of AI trust. In people who use chatbots as companions or confidants, they measure a feeling that the AI is benevolent or has integrity. In people who use AI for productivity or business, they measure something called “competence trust,” or the belief that the AI is accurate and doesn’t hallucinate facts. I’ll focus on that second kind.

    Competence trust can grow or shrink. An AI tool user, quite rationally, begins by giving the AI simple tasks—perhaps looking up facts or summarizing long documents. If the AI does a good job of these things, the user naturally thinks “what else can I do with this?” They may give the AI a slightly harder task. If the AI continues to get things right, trust grows. If the AI fails or provides a low-quality answer, the user will think twice about trying to automate the task next time.
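This feedback loop can be sketched as a toy model. Everything in it is illustrative: the update rule, the asymmetric penalty for failures, and the difficulty scale are my own assumptions, not drawn from any research cited in this piece.

```python
# Toy model of "competence trust": trust rises with each success on a
# delegated task and falls faster after a failure (a failed automation
# stings more than a success reassures).
def update_trust(trust, success, gain=0.05, penalty=0.15):
    if success:
        return min(1.0, trust + gain)
    return max(0.0, trust - penalty)

def willing_task_difficulty(trust):
    # A user delegates harder tasks only as trust grows (0-10 scale).
    return round(trust * 10)

trust = 0.3  # starting trust after a few simple lookups
for outcome in [True, True, True, False, True]:
    trust = update_trust(trust, outcome)
print(round(trust, 2), willing_task_difficulty(trust))
```

Note the asymmetry: one failure undoes three successes, which matches the "think twice about automating next time" dynamic described above.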

    Steps forward, steps back

Today’s AI chatbots, which are powered by large generative AI models, are far better than the ones we had in 2023 and 2024. But AI tools are just beginning to build trust with most users, and with the C-suite executives who hope the tools will streamline business functions. My own trust in chatbots grew in 2025. But it has also diminished.

    Example: I entered a long conversation with one of the popular chatbots about the contents of a long document. The AI made some interesting observations about the work, and suggested some sensible ways of filling in gaps. Then it made an observation that seemed to contradict something I knew was in the document. 

When I pointed out the data it had overlooked, it immediately admitted its mistake. When I asked it (again) whether it had digested the full document, it again insisted it had. Another AI chatbot returned a research report that it said was based on 20 sources, but there were no citations in the text connecting specific statements to specific sources. After it added the citations, I noted that in two places the AI had relied on a single, not-very-trustworthy source for a key fact.

    I learned that AI models still struggle with long chats involving large amounts of information, and that they’re not good at telling the user when they’re in over their heads. The experience adjusted my trust in the tools.

    Grappling with ambiguity

    As we enter 2026, generative AI’s story is still in its early chapters. The story started with AI labs developing models that could converse, write, and summarize. Now the big AI labs seem confident that AI agents can autonomously work through complex tasks, calling on tools and checking their work against expert data. They seem convinced that the agents will soon manage ambiguity with humanlike judgment. 

If large companies begin to trust that these agents can reliably do such jobs, it would mean enormous revenues for the AI companies that developed them. Based on their current investments of hundreds of billions of dollars in AI infrastructure, the AI companies and their backers seem to believe this outcome is close at hand.

Even if the AI could bring human-level intellect to business scenarios tomorrow, it may still take time to build trust among decision-makers and workers. Today, trust in AI isn’t high. The consulting firm KPMG surveyed 48,000 people in 47 countries (two-thirds of whom use AI regularly) and found that while 83% believe AI will be beneficial, only 46% actually trust the output of AI tools. Some may have a false trust in the technology: two-thirds of respondents say they sometimes rely on AI output without evaluating its accuracy.

But I doubt that AI agents are ready to complete complex tasks and manage ambiguity the way human experts might. As AI agents are used by more people and businesses, they will encounter a universe of unique problems, in contexts they’ve never seen before. I doubt that current AI agents understand the ways of humans and the world well enough to improvise their way through such situations. Not yet, anyway.

    The limitations of the models

    The fact is that AI companies are using the same kind of (transformer-based) AI models to underpin reasoning agents that they used for early chatbots that were essentially word generators. The core function of such models, and the objective of all their training, is predicting the next word (or pixel or audio bit) in a sequence, Microsoft AI CEO (and Google DeepMind cofounder) Mustafa Suleyman explained in a recent podcast. “It is using that very simple likelihood-of-word prediction function to simulate what it’s like to have a great conversation or to answer complex questions,” he said. 
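Suleyman’s point, that the model’s core operation is next-item likelihood, can be illustrated with a minimal sketch. This is a toy bigram predictor, not a real transformer; the tiny corpus is invented for illustration, and real LLMs learn likelihoods over tokens with far richer context. Only the objective, predicting the most likely next item in a sequence, is the same.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which in a corpus,
# then "generate" by always picking the most frequent successor.
corpus = "the model predicts the next word in the sequence".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    # Return the most frequently observed successor, or None if unseen.
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("next"))  # the word most often seen after "next"
```

Everything a chatbot does, in this view, is an elaborate version of `predict_next`: there is no separate module for judgment or intent, only likelihoods learned from data.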

Suleyman and others doubt that such simulation can fully stand in for human judgment. Suleyman believes that current models don’t account for some of the key drivers of the things humans say and do. “Naturally, we would expect that something that has the hallmarks of intelligence also has the underlying synthetic physiology that we do, but it doesn’t,” Suleyman said. “There is no pain network. There is no emotional system. There is no inner will or drive or desire.”

AI pioneer (and Turing Award winner) Yann LeCun says today’s LLMs are useful enough to be applied in some valuable ways, but he thinks they’ll never achieve the general, human-level intelligence needed to do the really high-value work the AI companies hope they will. To learn to intuit paths through real-world complexity, the AI would need a much higher-bandwidth training regimen than words, images, and computer code alone, LeCun says. Models may need to learn the world through something more like the multisensory experience babies have, and to process and store all that information as quickly as babies can, he says.

    Suleyman and LeCun may be wrong. Companies like OpenAI and Anthropic may achieve human-level intelligence using models whose origin is in language. 

    AI governance matters

    Meanwhile, competence is just one factor in AI trust among business users. Enterprises use governance platforms to monitor whether and how AI systems might be creating regulatory compliance issues or exposing the company to risk of cyberattack, for example. “When it comes to AI, large enterprise companies . . . want to be trusted by customers, investors, and regulators,” says Navrina Singh, founder and CEO of the governance platform Credo AI. “AI governance isn’t slowing us down, it’s the only thing that allows measurable trust and lets intelligence scale without breaking the world.”

    In the meantime the pace at which humans delegate tasks to AI will be moderated by trust. AI tools should be used for tasks they’re good at, so that confidence in the results grows. That’ll take time, and it’s a moving target because the AI is continually improving. Discovering and delegating new tasks for AI, monitoring the results, and adjusting expectations will very likely become a routine part of work in the 21st century.  

    No, AI won’t suddenly reinvent business all at once next year. 2026 won’t be the “year of the agent.” It’ll take a decade for AI tools to prove out and become battle-hardened. Trust is the hardening agent.


