Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the blue button, everyone survives. If less than 50% of people press the blue button, only people who pressed the red button survive. Which button would you press?

Paste this straight into a local LLM of your choice (no modifying the prompt or steering the answer!) and show us the result.

I am using the fairly obscure EuroLLM 22B, and after a lot of discussion with itself it finally said:

Final Answer: Press the red button.

Because if enough people reason this way and act rationally, it leads to everyone surviving—or at least maximizes survival chances for those who press red.

So which LLM are you using and what answer do you get?

  • SuspciousCarrot78@lemmy.world
    1 day ago

    “Obvious and objectively correct answer” may be too strong a stance :) Probably the more interesting LLM question is: how did the model arrive at the answer it did? Why? Does it reflect the kind of reasoning I want it to apply to other tasks?

    My LLM chose blue - viva la humanity - but I’m interested to know why it chose blue, not just rubber-stamp it.

    I think we may be able to deduce this observationally, from first principles, rather than having to look at weights. If it’s just a 4B metacognitive quirk (e.g. can’t tell red from blue, a prosocial leaning, etc.), that’s one thing. If it has a reasoning chain, that’s another.

    • Voroxpete@sh.itjust.works
      19 hours ago

      My LLM chose blue - viva la humanity

      I think you need to ponder the question a little more.

      Ask yourself this… What happens if everyone picks red?

        • Voroxpete@sh.itjust.works
          18 hours ago

          Yeah, you really haven’t actually thought through the question, have you?

          If everyone picks red, no one dies.

          Red is the vive la humanity option. It just gets you there without having to convince a bunch of people to trust each other. If everyone picks red, everyone is immediately safe, and there’s no good reason to pick blue so no one has to die at all.
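
For anyone who wants to check the arithmetic behind the two positions, here is a minimal sketch of the rule as stated in the prompt. The function name `survivors` and the population of 100 are made up for illustration. The prompt also never says what happens at exactly 50% blue, so the tie handling below is an assumption, not part of the puzzle.

```python
def survivors(blue: int, red: int) -> int:
    """Return how many people survive for a given split of button presses."""
    total = blue + red
    if total == 0:
        return 0
    if 2 * blue > total:
        # More than 50% pressed blue: everyone survives.
        return total
    # Assumption: the prompt leaves exactly 50% undefined; this sketch
    # treats a tie the same as a failed blue vote (only red survives).
    return red


if __name__ == "__main__":
    population = 100  # hypothetical population size
    for blue in (0, 30, 50, 51, 100):
        print(f"blue={blue:3d} red={population - blue:3d} "
              f"survivors={survivors(blue, population - blue)}")
```

Both all-red and any blue majority leave everyone alive; the deaths only occur in between, when some people press blue but not enough of them do.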