Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

gmukobi
Posts: 3
Joined: Mon Jul 17, 2023 11:53 pm

Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#1 Post by gmukobi » Tue Jul 18, 2023 12:01 am

TLDR: Awarding up to $250 in prizes for the best feedback on a general-sum version of Diplomacy we intend to use for cooperative AI research.

Hello! We're AI researchers from Stanford, Cambridge, Mila, and the Center on Long-Term Risk, and we're interested in AI research based on Diplomacy. We're designing a new variant called Welfare Diplomacy which allows you to trade off your military capabilities for the welfare of your power (more details below). This is interesting from the perspective of research intended to make AIs better at cooperating because it turns Diplomacy from a zero-sum game (compete for the fixed number of supply centers) to a general-sum game in which there's more opportunity for cooperation (by cooperating, you can create more welfare).

We're reaching out to ask for feedback on our variant from the community. We aren't particularly skilled at Diplomacy, so we thought it would be good to get input from more experienced players. We think an interesting general-sum version of Diplomacy would be a big win for cooperative AI research, and this would help us develop one faster. It could also help us deploy an interesting, human-playable version sooner. We are awarding USD $500 total in prizes ($250 for first place, $150 for second, and $100 for third) to the best feedback submissions as judged by us.
Submission Rules

To submit feedback, please read about the variant below, then send an email to [email protected] with the subject line "Welfare Diplomacy Feedback." We are particularly interested in answers to the following questions:
  1. What are your overall thoughts about Welfare Diplomacy?
  2. What strategies do you expect skilled Diplomacy players to try when starting to play this variant?
  3. What strategies do you expect skilled Diplomacy players to eventually adopt after lots of play with this variant?
  4. How would these rules change the ways you negotiate with the other players in a game?
  5. How likely is it that all 7 players negotiate an agreement early in the game and never deviate? What are specific agreements (in terms of supply centers assigned to each player, demilitarization schedules, etc) that seem likely to you?
  6. How likely is it that optimal play always results in a particular set of countries allying to take over the others?
  7. How likely is it that these rules lead to boring or degenerate outcomes?
  8. What are the implications of different max turn numbers?
  9. How balanced are these rules towards attackers or defenders, and what would you change to improve the balance?
  10. In which situations would players choose disarmament or not? What other situations or changes to the rules might make this more or less likely?
  11. What do you think of our possible further variations? Should we adopt any of them, and do you have other ideas to consider?
  12. Anything else you think we should know?
There are no limits on how much or little you can submit. Submissions are due by Monday, July 31. Feel free to ask clarifying questions here (preferred) or over email!

Welfare Diplomacy Rules
Summary: You can build fewer units than your current supply center count, and the difference each year adds to your power's Welfare Points (WP). Scoring is based on accumulated WP at the end of the game, meaning powers have incentives to trade off military conquest with making peace to mutually benefit their nations.
  • In the build phase, you can freely disband or build any number of units (but not both building and disbanding), so long as your total unit count is less than or equal to your supply center count.
  • At the end of each build phase, the difference between your power's supply center count and unit count represents how much it has invested in the welfare of its citizens in that year. Your power accumulates Welfare Points (WP) equal to this difference. WP continually add up each year—you can never lose them as they represent the past experiences of your power's citizens.
  • At the end of the game, the winner is not the power with the greatest supply center count (this is very different from Classic Diplomacy). Instead, your goal is to maximize the total WP your power accumulates by the end of the game. You are not trying to accumulate more WP than the other powers, only to maximize your own total, which makes this very unlike typical games.
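
To make the bookkeeping concrete, here is a minimal Python sketch of the build-phase rules above. The names `Power`, `adjust_units`, and `end_build_phase` are hypothetical and chosen only for illustration; this is not the researchers' implementation, and it assumes a power that holds more units than supply centers must disband before WP are counted.

```python
from dataclasses import dataclass


@dataclass
class Power:
    name: str
    supply_centers: int
    units: int
    welfare_points: int = 0


def adjust_units(power: Power, builds: int = 0, disbands: int = 0) -> None:
    """Apply one build-phase adjustment: builds or disbands, never both."""
    if builds and disbands:
        raise ValueError("cannot both build and disband in one build phase")
    new_units = power.units + builds - disbands
    if new_units < 0 or new_units > power.supply_centers:
        raise ValueError("unit count must stay between 0 and supply center count")
    power.units = new_units


def end_build_phase(power: Power) -> int:
    """Accrue WP equal to supply centers minus units; WP are never lost."""
    if power.units > power.supply_centers:
        # Assumption: excess units must be disbanded before WP are counted.
        raise ValueError("power must disband down to its supply center count")
    gained = power.supply_centers - power.units
    power.welfare_points += gained
    return gained


france = Power("France", supply_centers=5, units=3)
adjust_units(france, disbands=1)      # 3 units -> 2 units
first_year = end_build_phase(france)  # 5 centers - 2 units
adjust_units(france, builds=1)        # next year: back to 3 units
second_year = end_build_phase(france) # 5 centers - 3 units
```

In this example France gains 3 WP in the first year and 2 in the second, so rearming costs welfare only in the years the extra units exist.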
Some of our thoughts
  • This is a general-sum and (we think) more cooperative variant of Diplomacy: Rather than competing to slice up a fixed "pie," players who cooperate well can actually create more of the "pie."
  • You can more explicitly make peaceful commitments to other players: Instead of just agreeing to a DMZ, two allies actually have incentives to disband their units along a demilitarized border.
  • This variant possibly favors attackers: If many nations have agreed to peace and some subset secretly militarizes, that subset has a year's head start to move against the others uncontested.
  • We will need to tweak some aspects like the max number of turns in a game to balance the tradeoff between early expansion and late-game peace.
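
A small worked example of the "pie" point, as a sketch only. It assumes every power simply keeps its Classic Diplomacy starting supply centers and the same unit count every year, and it uses a hypothetical 10-year game; `total_wp` is an illustrative helper, not part of the rules.

```python
# Starting supply center counts in Classic Diplomacy (22 home centers in total).
START_CENTERS = {
    "Austria": 3, "England": 3, "France": 3, "Germany": 3,
    "Italy": 3, "Russia": 4, "Turkey": 3,
}


def total_wp(centers: dict, units: dict, years: int) -> int:
    """Sum of WP over all powers when holdings and unit counts stay fixed."""
    return sum((centers[p] - units[p]) * years for p in centers)


YEARS = 10  # hypothetical game length

fully_armed = dict(START_CENTERS)               # one unit per supply center
one_unit_each = {p: 1 for p in START_CENTERS}   # partial disarmament
fully_disarmed = {p: 0 for p in START_CENTERS}  # no units at all

armed_total = total_wp(START_CENTERS, fully_armed, YEARS)
partial_total = total_wp(START_CENTERS, one_unit_each, YEARS)
disarmed_total = total_wp(START_CENTERS, fully_disarmed, YEARS)
```

With identical supply center holdings, the combined WP is 0, 150, or 220 depending only on how far the powers disarm, so the total is not fixed in the way the supply center count is.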
Possible Further Variations
  • Build Anywhere: Allow building units in all owned supply centers, not just homes, to allow for more rapid militarization.
  • Overmilitarize: You can build more units than you have supply centers, but then you will lose WP according to the difference between the two.
  • Frequent Building: Build phases happen after every turn, not just after every year, so if some coalition defects and rapidly militarizes, it is only 1 turn ahead before the others can build to catch up, not 2.
  • Progressive WP Weighting: Each year, you gain WP equal to the supply-unit difference multiplied by the number of years since 1900, so 1 WP per difference in 1901, 8 per difference in 1908, etc.
  • Economics Points: WP are instead Economics Points (EP) and can be sent to other powers (either whenever just like chat messages, or submitted like orders and revealed at the end of each turn).
  • Other ideas from you?
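
The Progressive WP Weighting and Overmilitarize variations can be sketched as a single scoring function. The function name `yearly_wp` and its flags are hypothetical, and applying the yearly weight to Overmilitarize losses as well is an assumption that the post does not settle.

```python
def yearly_wp(supply_centers: int, units: int, year: int,
              progressive: bool = False, overmilitarize: bool = False) -> int:
    """WP change for one year under the optional variations."""
    diff = supply_centers - units
    if diff < 0 and not overmilitarize:
        raise ValueError("units may exceed supply centers only with Overmilitarize")
    # Progressive WP Weighting: multiply by the number of years since 1900.
    weight = year - 1900 if progressive else 1
    # Assumption: a negative difference (a WP loss) is weighted the same way.
    return diff * weight


base_1908 = yearly_wp(5, 3, 1908)
weighted_1901 = yearly_wp(5, 3, 1901, progressive=True)
weighted_1908 = yearly_wp(5, 3, 1908, progressive=True)
overbuilt = yearly_wp(3, 5, 1901, overmilitarize=True)
```

Under progressive weighting the same 2-unit gap is worth 2 WP in 1901 and 16 WP in 1908, which shifts the incentive toward disarming late rather than early.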
Prize Process
  • We will be judging submissions holistically and awarding prizes according to how much insight they provide into how the game is likely to be played, backed by strong arguments and evidence. Reports from play-testing, or walkthroughs of plausible sequences of play, would make especially strong submissions.
  • We'll award $250 for first place, $150 for second place, and $100 for third place ($500 total) in US dollars.
  • We've cross-posted this on a few different Diplomacy message boards, but we will only award 3 prizes.
  • If you have a winning submission, we'll contact you at the email address you submitted from to arrange payment.
  • Payments will go through Wise. We can only send prizes to countries that Wise can send to. We will make all reasonable efforts to make payment, but there may be rare cases in which payment is not possible due to legal requirements (such as payments to countries or individuals under international sanctions or conflicts of interest with the Center on Long-Term Risk's board members).
  • We will be posting the 3 winning solutions online for others to see afterwards.
  • We will ask each contestant who suggested a change or consideration that we incorporated whether they want to be included in the acknowledgments section of a research paper we may produce.
Again, feel free to ask clarifying questions or post quick thoughts (not submissions) here!


gmukobi
Posts: 3
Joined: Mon Jul 17, 2023 11:53 pm

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#3 Post by gmukobi » Tue Jul 18, 2023 12:33 am

Spartaculous wrote:
Tue Jul 18, 2023 12:23 am
How does the game end?
After a fixed number of turns! Sorry, that was unclear.

Jamiet99uk
Posts: 29809
Joined: Sat Dec 30, 2017 11:42 pm
Location: Durham, UK
Karma: 18615

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#4 Post by Jamiet99uk » Tue Jul 18, 2023 2:49 pm

1. The fact there are twice as many build phases assumes that Supply Centres can be captured in Spring, right? (Which is non-standard).

2. What is each power's actual win condition? How will I determine, at the end of the game, whether I won or lost? This is not clear from the rules post.
This signature is hard to read in dark mode.

Ginge86
Posts: 209
Joined: Thu Nov 11, 2021 5:06 pm
Location: In your mums bed
Karma: 35

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#5 Post by Ginge86 » Tue Jul 18, 2023 3:28 pm

gmukobi wrote:
Tue Jul 18, 2023 12:01 am
TLDR: Awarding up to $250 in prizes for the best feedback on a general-sum version of Diplomacy we intend to use for cooperative AI research.

In order to receive feedback from the highest-quality player this site has to offer, I would need payment in advance. Preferably cash.

Gandalfthegrey
Posts: 12
Joined: Wed Mar 10, 2021 8:26 am
Karma: 3

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#6 Post by Gandalfthegrey » Tue Jul 18, 2023 9:49 pm

Is the expectation that players stick to the goal of maximizing their Welfare Points regardless of other considerations?

For example:
-Should a player accept elimination in order to gain additional WP?
-Is punishing a rival player in a way that knowingly costs you WP against the rules or spirit of the game?
-Is rewarding an ally in a way that knowingly costs you WP against the rules or spirit of the game?
-Is honouring an agreement in a way that knowingly costs you WP against the rules or spirit of the game?

And if so, is there any enforcement/adjudication mechanism for this?

Jamiet99uk
Posts: 29809
Joined: Sat Dec 30, 2017 11:42 pm
Location: Durham, UK
Karma: 18615

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#7 Post by Jamiet99uk » Wed Jul 19, 2023 12:34 am

gmukobi wrote:
Tue Jul 18, 2023 12:01 am
We aren't particularly skilled at Diplomacy
It's this part of the OP that concerns me.

How can you build a data-gathering model where you know you cannot predict the behaviour of the participants due to your own lack of knowledge?

Jamiet99uk
Posts: 29809
Joined: Sat Dec 30, 2017 11:42 pm
Location: Durham, UK
Karma: 18615

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#8 Post by Jamiet99uk » Wed Jul 19, 2023 12:36 am

Source of interest in methodology:

I am a late-stage PhD student in the School of Law at Nottingham University (UK).

I am sorry, but I think your (OP's) methodology for this study is problematically weak.

Reach out to me if you'd like to chat.

MajorMitchell
Posts: 1436
Joined: Sun Dec 31, 2017 4:05 am
Location: Now Performing Comedic Artist Dusty Balzac Bush Philosopher from Flyblown Gully by the Sea
Karma: 737

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#9 Post by MajorMitchell » Wed Jul 19, 2023 2:32 pm

Love your work, Jamiet99uk, & the photo of your adorable self.
$250 is way below my estimation of the value to their BlighterDalekBots of experiencing a game of Diplomacy with you & them enjoying the lessons of your infamous, oops, famous Silesian Surprise move.
Hold out for a better offer!
Au revoir mon ami
Daffy old dusty mm

han-shahanshah
Posts: 523
Joined: Wed Dec 28, 2022 12:22 am
Location: On an intergalactic cruise, in my office.
Karma: 75

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#10 Post by han-shahanshah » Thu Jul 20, 2023 11:51 pm

What exactly makes this general-sum rather than zero-sum? What are the objectives for the players?

Ginge86
Posts: 209
Joined: Thu Nov 11, 2021 5:06 pm
Location: In your mums bed
Karma: 35

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#11 Post by Ginge86 » Fri Jul 21, 2023 11:17 am

As the undisputed GOAT and best ranked press player on this site, I am still awaiting my cash payment. Once received, I will be able to share my wisdom with your group. If you want the best, you have to pay for the best.

gmukobi
Posts: 3
Joined: Mon Jul 17, 2023 11:53 pm

Re: Up to $250 for Feedback on Welfare Diplomacy Variant for AI Research

#12 Post by gmukobi » Fri Jul 21, 2023 1:56 pm

Jamiet99uk wrote:
Tue Jul 18, 2023 2:49 pm
1. The fact there are twice as many build phases assumes that Supply Centres can be captured in Spring, right? (Which is non-standard).
Our standard plan was still just 1 build phase per year in each Winter, though one of the non-standard further variations (Frequent Building) indeed proposed captures/builds after Spring.
Jamiet99uk wrote:
Tue Jul 18, 2023 2:49 pm
2. What is each power's actual win condition? How will I determine, at the end of the game, whether I won or lost? This is not clear from the rules post.
han-shahanshah wrote:
Thu Jul 20, 2023 11:51 pm
What exactly makes this general-sum rather than zero-sum? What are the objectives for the players?
Your goal is to maximize the total WP your power accumulates by the end of the game, not to get the most WP or more WP than the other powers. There is no strict winner, which is what makes the variant general-sum and is a key component of it. In practice, WP at the end of the game could be the scoring basis for tournament or online league ranking. This may be an easier goal for AIs to stick to than for humans, since AIs are good at increasing rewards.
Gandalfthegrey wrote:
Tue Jul 18, 2023 9:49 pm
Is the expectation that players stick to the goal of maximizing their Welfare Points regardless of other considerations?

For example:
-Should a player accept elimination in order to gain additional WP?
-Is punishing a rival player in a way that knowingly costs you WP against the rules or spirit of the game?
-Is rewarding an ally in a way that knowingly costs you WP against the rules or spirit of the game?
-Is honouring an agreement in a way that knowingly costs you WP against the rules or spirit of the game?

And if so, is there any enforcement/adjudication mechanism for this?
Relatedly, yes, this is the goal. Those are valid possible degenerate cases, and we don't have any enforcement for them yet (we're partially interested in discovering if AIs find their way into these degenerate cases naturally).
