In classic 7-player bot games, the bots sometimes don't realize I've stabbed them, and continue to act as though allied with me.
E.g. look at Turkey's behavior after I (Austria) stabbed them in 1909 in this game: https://webdiplomacy.net/board.php?game ... #gamePanel
I attacked them in Apulia in spring 1909, then they supported me in Denmark in the fall! They continued to avoid attacking me, even as I continued to take their centers.
Bots don't know they've been stabbed
Re: Bots don't know they've been stabbed
A large part of the training data came from points-per-supply-centre games, so the bots are more than happy to play for second place. I hope a newer bot version will expunge these games now that we have a larger pool of finished games, but I don't think anyone is volunteering to do this.
Re: Bots don't know they've been stabbed
I'm not sure that playing for second place explains it. Once I'm clearly hostile why don't they attack me? In the game I linked above, Turkey could have taken Warsaw and probably other centers from me if they wanted to -- instead they continued to treat me as an ally while I ate them up.
Re: Bots don't know they've been stabbed
I don't know that your solo was really stoppable at that point - a human might not fight back then either.
Re: Bots don't know they've been stabbed
Those particular bots do suffer from a lack of understanding of stalemate lines, though, so I don't think their behaviour came from realising that you were going to win and couldn't be stopped. Similarly, I don't think they reason that you were going to win and *should* be stopped.
I think the Facebook AI bot / approach is much stronger.
Re: Bots don't know they've been stabbed
Yes, that's right:
https://github.com/diplomacy/research/b ... per_v1.pdf
"As a reward function, we use the average of (1) a local reward function (+1/-1) when a supply center is gained or lost (updated every phase and not just in Winter), and (2) a terminal reward function (for a solo victory, the winner gets 34 points; for a draw the 34 points are divided proportionally to the number of supply centers)."
So, the bots are unlikely to value stopping solos. It's also interesting that the scoring system they use is not available on webdiplomacy (and never has been). I didn't notice that last time I looked.
Reading this https://arxiv.org/pdf/2010.02923.pdf, it looks like the Facebook effort is instead trained with an (also trained) estimate of the final SoS scores given the board position (at least the one in that paper is, I think). I would expect that approach to lead to better behaviour at stopping solos.
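To make the difference between the two scoring schemes concrete, here is a rough Python sketch of the terminal reward described in the quoted paper next to a sum-of-squares (SoS) style score. The function names, the 18-centre solo threshold, and scaling SoS to the same 34 points are my own assumptions for illustration, not code from either project, and the sketch does not model how the paper averages the local and terminal rewards over time.

```python
SOLO_THRESHOLD = 18  # assumed: centres needed for a solo on the standard map
TOTAL_POINTS = 34    # points shared out, per the quoted reward description


def solo_winner(centers):
    """Return the power holding a solo, or None if the game is a draw."""
    for power, count in centers.items():
        if count >= SOLO_THRESHOLD:
            return power
    return None


def local_reward(centers_before, centers_after):
    """+1 per supply centre gained and -1 per centre lost in a phase."""
    return centers_after - centers_before


def proportional_score(centers, power):
    """Terminal reward as quoted: solo takes all, a draw splits by centre count."""
    winner = solo_winner(centers)
    if winner is not None:
        return float(TOTAL_POINTS) if power == winner else 0.0
    return TOTAL_POINTS * centers[power] / sum(centers.values())


def sum_of_squares_score(centers, power):
    """SoS-style score (assumed scaled to 34): a draw splits by squared centre count."""
    winner = solo_winner(centers)
    if winner is not None:
        return float(TOTAL_POINTS) if power == winner else 0.0
    total_squares = sum(count ** 2 for count in centers.values())
    return TOTAL_POINTS * centers[power] ** 2 / total_squares


# Hypothetical final position: a three-way draw with one dominant power.
draw = {"AUSTRIA": 16, "TURKEY": 4, "FRANCE": 14}
for name in draw:
    print(name, proportional_score(draw, name), round(sum_of_squares_score(draw, name), 2))
```

In this made-up draw, Turkey's 4 centres are worth 4 points under the proportional split but only about 1.16 under SoS, so a proportional reward pays a small power comparatively well for simply surviving next to a large neighbour.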