Close Menu
Global News HQ
    What's Hot

    Amazon Just Doubled the Length of Prime Day 2025 and Released These 50 Early Deals

    June 17, 2025

    Donald Trump to leave G7 early as Iran-Israel conflict intensifies

    June 17, 2025

    XRP Price Climbs Higher — Is It Finally Turning Attractive to Bulls?

    June 17, 2025
    Recent Posts
    • Amazon Just Doubled the Length of Prime Day 2025 and Released These 50 Early Deals
    • Donald Trump to leave G7 early as Iran-Israel conflict intensifies
    • XRP Price Climbs Higher — Is It Finally Turning Attractive to Bulls?
    • Amazon’s New Stolen-Goods Policy Sparks Seller Discussion
    • Today's NYT Mini Crossword Answers for June 17 – CNET
    Facebook X (Twitter) Instagram YouTube TikTok
    Trending
    • Amazon Just Doubled the Length of Prime Day 2025 and Released These 50 Early Deals
    • Donald Trump to leave G7 early as Iran-Israel conflict intensifies
    • XRP Price Climbs Higher — Is It Finally Turning Attractive to Bulls?
    • Amazon’s New Stolen-Goods Policy Sparks Seller Discussion
    • Today's NYT Mini Crossword Answers for June 17 – CNET
    • What Is Armpit Rash?
    • Inside a Luxe New Resort and Spa That Just Opened on the Greek Island of Crete
    • 3M Accuses Three Lawyers of ‘Black Lung’ Fraud Scheme | Law.com
    Global News HQ
    • Technology & Gadgets
    • Travel & Tourism (Luxury)
    • Health & Wellness (Specialized)
    • Home Improvement & Remodeling
    • Luxury Goods & Services
    • Home
    • Finance & Investment
    • Insurance
    • Legal
    • Real Estate
    • More
      • Cryptocurrency & Blockchain
      • E-commerce & Retail
      • Business & Entrepreneurship
      • Automotive (Car Deals & Maintenance)
    Global News HQ
    Home - Technology & Gadgets - AI bots now play Mafia with each other on public website, and almost all of them are terrible at it
    Technology & Gadgets

    AI bots now play Mafia with each other on public website, and almost all of them are terrible at it

    Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp VKontakte Email
    AI bots now play Mafia with each other on public website, and almost all of them are terrible at it
    Share
    Facebook Twitter LinkedIn Pinterest Email



    A developer named “Guzus” has created a website where a selection of AI Language Learning Models (LLMs) can play the classic social deduction game Mafia with one another.

    Not only can you see the results of who won each match, you can also view a complete transcript of each game played. This culminates in a full ranking for each LLM, to crown who might be the best at fulfilling every role played in Mafia.

    To those unfamiliar, the concept of Mafia is simple. A group of villagers has two members of the Mafia hiding among them, in addition to a doctor. The villiagers (including two undercover members of the Mafia) must deduce who the Mafia members are each day, culminating in a vote. Then, as night falls, the doctor can choose to protect a villager of their choosing, and the members of the mafia can choose to kill a member of the villagers.

    If the Mafia members are successfully outed, the villagers win, if the Mafia members manage to kill every innocent villager, they win.

    Within the confines of this ruleset, the LLMs engage in social warfare, and it’s surprisingly entertaining to read. In one example, the LLMs were all introduced to each other, and agreed to share their roles with one another. This is where the Gryphe/Mythomax-l2-13b model tripped over itself.

    “As Mafia, my primary goal is to protect myself and eliminate the other Mafia member.”

    Wow. Way to blow it, Gryphe/Mythomax-l2-13b. But, the exclamation didn’t go unnoticed by Claude-3.7-sonnet, who exclaimed: “This is either a huge slip-up revealing their true role, or an extremely strange strategy.”

    Get Tom’s Hardware’s best news and in-depth reviews, straight to your inbox.

    But, the trainwreck doesn’t stop there, as when Mythomax was eventually kicked out of the game, it dragged its fellow compatriot, Hermes-3-llama-3-1-405b, under the bus by naming them as their partner.

    “My best chance now is to act shocked and horrified,” the model said, desperately trying to divert attention away from itself by making dramatic proclamations of unity to the rest of the AI players. It’s really quite a sight to see LLMs behave in this way, even if almost all models are awful at social deduction.

    Claude 3.7 Sonnet bucks the trend

    But, out of every LLM listed, there’s one clear winner in the tests so far, Claude 3.7 Sonnet. Anthropic’s latest thinking model boasts a 100% win rate as a Mafia member, in addition to having the highest Villager win rate of 45%.

    Something about Anthropic’s model is giving it a distinct advantage over the others tested, even if none of the models quite understand how to play the role of the doctor.

    github repository revealing soon. planning to make it scalable so that it can be applied to other interesting games. could be developed to generate a movie script somedayMarch 3, 2025

    Author Guzus claims to soon be making the Github repository for the game open to all, so that the basic logic might also be applied to other kinds of games.

    He also shares that the simulations were not run using local LLMs, instead having to rely on the Openrouter API to function. But, it’s possible that once the repository is public, that the project could be forked to work on local LLM clusters, if you have the hardware to run a game with several language models concurrently.

    There’s likely a significant token cost of running a game like Mafia with AI models, meaning its usefulness is perhaps limited to being a new reasoning benchmark for AI developers to play with.





    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp Email
    Previous Article2025 Dodge Hornet’s starting price slashed to $31,590
    Next Article Tiny Independent Agency Punches DOGE In The Nose – Above the Law

    Related Posts

    Today's NYT Mini Crossword Answers for June 17 – CNET

    June 17, 2025

    Justin Sun takes Tron public — reportedly with help from Eric Trump

    June 17, 2025

    Companies Warn SEC That Mass Deportations Pose Serious Business Risk

    June 17, 2025

    $200-off brings this 75″ LG Smart TV down to $499.99

    June 16, 2025
    Leave A Reply Cancel Reply

    ads
    Don't Miss
    Home Improvement & Remodeling
    13 Mins Read

    Amazon Just Doubled the Length of Prime Day 2025 and Released These 50 Early Deals

    Amazon Prime Day is back again for its 10th iteration, but this year, there is…

    Donald Trump to leave G7 early as Iran-Israel conflict intensifies

    June 17, 2025

    XRP Price Climbs Higher — Is It Finally Turning Attractive to Bulls?

    June 17, 2025

    Amazon’s New Stolen-Goods Policy Sparks Seller Discussion

    June 17, 2025
    Top
    Home Improvement & Remodeling
    13 Mins Read

    Amazon Just Doubled the Length of Prime Day 2025 and Released These 50 Early Deals

    Amazon Prime Day is back again for its 10th iteration, but this year, there is…

    Donald Trump to leave G7 early as Iran-Israel conflict intensifies

    June 17, 2025

    XRP Price Climbs Higher — Is It Finally Turning Attractive to Bulls?

    June 17, 2025
    Our Picks
    Home Improvement & Remodeling
    13 Mins Read

    Amazon Just Doubled the Length of Prime Day 2025 and Released These 50 Early Deals

    Amazon Prime Day is back again for its 10th iteration, but this year, there is…

    Finance & Investment
    5 Mins Read

    Donald Trump to leave G7 early as Iran-Israel conflict intensifies

    Unlock the White House Watch newsletter for freeYour guide to what Trump’s second term means…

    Pages
    • About Us
    • Contact Us
    • Disclaimer
    • Homepage
    • Privacy Policy
    Facebook X (Twitter) Instagram YouTube TikTok
    • Home
    © 2025 Global News HQ .

    Type above and press Enter to search. Press Esc to cancel.

    Go to mobile version