Close Menu
Global News HQ
    What's Hot

    Italy’s Newest Cult-Favorite Wine Is a Chianti (Yes, a Chianti)

    December 15, 2025

    Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET

    December 15, 2025

    This Caribbean Island Has 6 National Parks, White-sand Beaches, and a Gorgeous Luxury Resort

    December 15, 2025
    Recent Posts
    • Italy’s Newest Cult-Favorite Wine Is a Chianti (Yes, a Chianti)
    • Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET
    • This Caribbean Island Has 6 National Parks, White-sand Beaches, and a Gorgeous Luxury Resort
    • S&P 500: The December Inflection (Technical Analysis) (SP500)
    • Inside a Modern Swiss Chalet That Takes Design Cues From Old-World Local Architecture
    Facebook X (Twitter) Instagram YouTube TikTok
    Trending
    • Italy’s Newest Cult-Favorite Wine Is a Chianti (Yes, a Chianti)
    • Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET
    • This Caribbean Island Has 6 National Parks, White-sand Beaches, and a Gorgeous Luxury Resort
    • S&P 500: The December Inflection (Technical Analysis) (SP500)
    • Inside a Modern Swiss Chalet That Takes Design Cues From Old-World Local Architecture
    • Memecoins will rise from the dead, but in a new form: Crypto exec
    • Designers Agree: These 4 Smart Appliance Trends Will Define Homes in 2026
    • Hallmark Holiday Movie Fans are Flocking to Connecticut’s Quaint Filming Locations
    Global News HQ
    • Technology & Gadgets
    • Travel & Tourism (Luxury)
    • Health & Wellness (Specialized)
    • Home Improvement & Remodeling
    • Luxury Goods & Services
    • Home
    • Finance & Investment
    • Insurance
    • Legal
    • Real Estate
    • More
      • Cryptocurrency & Blockchain
      • E-commerce & Retail
      • Business & Entrepreneurship
      • Automotive (Car Deals & Maintenance)
    Global News HQ
    Home - Technology & Gadgets - AI bots now play Mafia with each other on public website, and almost all of them are terrible at it
    Technology & Gadgets

    AI bots now play Mafia with each other on public website, and almost all of them are terrible at it

    Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp VKontakte Email
    AI bots now play Mafia with each other on public website, and almost all of them are terrible at it
    Share
    Facebook Twitter LinkedIn Pinterest Email



    A developer named “Guzus” has created a website where a selection of AI Language Learning Models (LLMs) can play the classic social deduction game Mafia with one another.

    Not only can you see the results of who won each match, you can also view a complete transcript of each game played. This culminates in a full ranking for each LLM, to crown who might be the best at fulfilling every role played in Mafia.

    To those unfamiliar, the concept of Mafia is simple. A group of villagers has two members of the Mafia hiding among them, in addition to a doctor. The villiagers (including two undercover members of the Mafia) must deduce who the Mafia members are each day, culminating in a vote. Then, as night falls, the doctor can choose to protect a villager of their choosing, and the members of the mafia can choose to kill a member of the villagers.

    If the Mafia members are successfully outed, the villagers win, if the Mafia members manage to kill every innocent villager, they win.

    Within the confines of this ruleset, the LLMs engage in social warfare, and it’s surprisingly entertaining to read. In one example, the LLMs were all introduced to each other, and agreed to share their roles with one another. This is where the Gryphe/Mythomax-l2-13b model tripped over itself.

    “As Mafia, my primary goal is to protect myself and eliminate the other Mafia member.”

    Wow. Way to blow it, Gryphe/Mythomax-l2-13b. But, the exclamation didn’t go unnoticed by Claude-3.7-sonnet, who exclaimed: “This is either a huge slip-up revealing their true role, or an extremely strange strategy.”

    Get Tom’s Hardware’s best news and in-depth reviews, straight to your inbox.

    But, the trainwreck doesn’t stop there, as when Mythomax was eventually kicked out of the game, it dragged its fellow compatriot, Hermes-3-llama-3-1-405b, under the bus by naming them as their partner.

    “My best chance now is to act shocked and horrified,” the model said, desperately trying to divert attention away from itself by making dramatic proclamations of unity to the rest of the AI players. It’s really quite a sight to see LLMs behave in this way, even if almost all models are awful at social deduction.

    Claude 3.7 Sonnet bucks the trend

    But, out of every LLM listed, there’s one clear winner in the tests so far, Claude 3.7 Sonnet. Anthropic’s latest thinking model boasts a 100% win rate as a Mafia member, in addition to having the highest Villager win rate of 45%.

    Something about Anthropic’s model is giving it a distinct advantage over the others tested, even if none of the models quite understand how to play the role of the doctor.

    github repository revealing soon. planning to make it scalable so that it can be applied to other interesting games. could be developed to generate a movie script somedayMarch 3, 2025

    Author Guzus claims to soon be making the Github repository for the game open to all, so that the basic logic might also be applied to other kinds of games.

    He also shares that the simulations were not run using local LLMs, instead having to rely on the Openrouter API to function. But, it’s possible that once the repository is public, that the project could be forked to work on local LLM clusters, if you have the hardware to run a game with several language models concurrently.

    There’s likely a significant token cost of running a game like Mafia with AI models, meaning its usefulness is perhaps limited to being a new reasoning benchmark for AI developers to play with.





    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp Email
    Previous Article2025 Dodge Hornet’s starting price slashed to $31,590
    Next Article Tiny Independent Agency Punches DOGE In The Nose – Above the Law

    Related Posts

    Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET

    December 15, 2025

    Absynth is back and weirder than ever after 16 years

    December 15, 2025

    I Wrote This While Trotting On a Dozen Different Walking Pads

    December 14, 2025

    NYT Connections hints and answers for December 14, Tips to solve ‘Connections’ #917.

    December 14, 2025
    Leave A Reply Cancel Reply

    ads
    Don't Miss
    Luxury Goods & Services
    6 Mins Read

    Italy’s Newest Cult-Favorite Wine Is a Chianti (Yes, a Chianti)

    This story is from an installment of The Oeno Files, our weekly insider newsletter to the world…

    Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET

    December 15, 2025

    This Caribbean Island Has 6 National Parks, White-sand Beaches, and a Gorgeous Luxury Resort

    December 15, 2025

    S&P 500: The December Inflection (Technical Analysis) (SP500)

    December 15, 2025
    Top
    Luxury Goods & Services
    6 Mins Read

    Italy’s Newest Cult-Favorite Wine Is a Chianti (Yes, a Chianti)

    This story is from an installment of The Oeno Files, our weekly insider newsletter to the world…

    Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET

    December 15, 2025

    This Caribbean Island Has 6 National Parks, White-sand Beaches, and a Gorgeous Luxury Resort

    December 15, 2025
    Our Picks
    Luxury Goods & Services
    6 Mins Read

    Italy’s Newest Cult-Favorite Wine Is a Chianti (Yes, a Chianti)

    This story is from an installment of The Oeno Files, our weekly insider newsletter to the world…

    Technology & Gadgets
    2 Mins Read

    Today's NYT Strands Hints, Answer and Help for Dec. 15 #652 – CNET

    Looking for the most recent Strands answer? Click here for our daily Strands hints, as well…

    Pages
    • About Us
    • Contact Us
    • Disclaimer
    • Homepage
    • Privacy Policy
    Facebook X (Twitter) Instagram YouTube TikTok
    • Home
    © 2025 Global News HQ .

    Type above and press Enter to search. Press Esc to cancel.

    Go to mobile version