Old Forum - webDiplomacy

Forum

A place to discuss topics/games with other webDiplomacy players.

Page 918 of 1419


smcbride1983 (517)
13 May 12 UTC

Thread is too old to reply to

I, Claudius Discussion Thread

Howdy, Webdiplo. I am coming to the end of The Satanic Verses and about to pick up I,Claudius by Robert Graves. If anyone would like to discuss the book with me feel free to pick up a copy and read along.

34 replies

(S)

)
23 May 12 UTC

Thread is too old to reply to

Ghost Rating update

http://tournaments.webdiplomacy.net/theghost-ratingslist

By Category link added

Page 4 of 5


Sargmacher (0 )
25 May 12 UTC

"I want to see how quickly I can crudely assess your skill [...] I can do that better than the current GR system"

Oh yes. MadMarx's infinite modesty strikes again.

Draugnar (0 )
25 May 12 UTC

He said it would be a crude assessment. I can do that right now. Everyone here (myself included) is a fucktard and a gaming geek.

MadMarx (36299 )
25 May 12 UTC

Sarg, that is malicious, you have some nerve!

Sargmacher (0 )
25 May 12 UTC

Drats, caught again!

The Hanged Man (4160 )
25 May 12 UTC

"This discussion makes me wish I had stuck with being a math major in college"

Haha, not me. I finished high school and said "Woohoo! I'm never taking math again!" I should have quit even earlier.

@CSteinhardt: If you're going to make it two-dimensional, why not three-dimensional, with time as the third component? Wouldn't it be better/more reliable to have a rating that includes recency of results? If somebody kicked ass against good competition five years ago but hasn't played anybody since, including 90% of the current pool, should that be devalued? Some rating systems require the top echelon to periodically re-prove their skill, such as requiring some result (a solo? a certain percentage of low-participant draws?) against some competition (top X of current GR list?) within a set period of time (past year?), or they slip in the rankings. In other words, you can't just rest on your laurels if you want to be top of the ladder.

Just brainstorming...

Frickin'Zeus (85 )
25 May 12 UTC

If I am interpreting CSteinhardt correctly then "two-dimensional" might be a confusing term to use, maybe "parameter" would be more appropriate.

With the GR we only get a single score. This score represents what the algorithm believes our "true diplomacy talent" to be. What CSteinhardt is talking about is adding another parameter. One number signifies what the algorithm believes your "true diplomacy talent" to be, and then the other is the uncertainty, which takes into account the random factors of diplomacy and also how inconsistent of a player you are.

The point of GR and CSteinhardt's idea is to use past data to discover how good someone is at the game, and to give them a number that explains their talent. If it were a requirement to play a certain number of a certain type of game against a certain bunch of people, then it seems to be less about discovering a person's true diplomacy skill and more about creating something similar to (D).

Also wouldn't it be cool to compare a present player to someone that left the site a while ago? That wouldn't be possible if such requirements were necessary to "keep" your skill value. The algorithms are about discovering something that is already defined, namely "diplomacy skill".

orathaic (1009 )
25 May 12 UTC

Hanged Man - interestingly, the system inherently takes this into account, though I'm not sure it was intentional.

There is inflation within the rating system. The pool on average gains in rating because some players drop out after losing some rating to the pool - the average dropout is slightly below the initial rating of 100 - thus over time your deflated score will not rank you as highly as before... (on the all-time list at least)

Yonni (136 )
25 May 12 UTC

Note: I ran through all of this and then thought – my god, this must have been done to death already – a quick google brought me here: http://www.diplom.org/Zine/S1998R/Nichols/ratings2.html. It’s been done before but I’m sure with our great minds we can come up with some appropriate tweaks.

The Elo rating has been a very successful and widely used system, so I’m not sure why we’d try to reinvent the wheel. My idea is to try to figure out how best to fit the Elo system to diplomacy. I’m first only going to consider WTA and then give a few words on how I’d probably adjust it for PPSC.

First, let’s look at what the formula is for the Elo rating:

R_i’ = R_i + K * (S_i – E_i)

R_i is the rating of player i
‘K-value’ is a variable that affects the rate at which ratings change
S_i is the score of player i during a period (single game, month, tournament, etc.)
E_i is the expected score of the player i during that period

In chess, the score of a player is defined as 1 for a win, 0.5 for a draw, and 0 for a loss. In diplomacy the appropriate analogy, imho, would be 1/n where n is the number of players in the final draw/solo. Of course, you could come up with a different weighting where solos and smaller draws are more highly rated, as long as the scores of all players sum to 1. However, for the sake of this argument, let’s leave it at 1/n.
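To make the update rule and the 1/n scoring concrete, here is a minimal Python sketch. The function names, the starting rating of 1500 and K = 32 are illustrative assumptions, not values anyone in this thread has proposed.

```python
def draw_score(in_draw: bool, draw_size: int) -> float:
    """S_i: 1/n for each of the n players in the final draw (n = 1 is a solo), else 0."""
    return 1.0 / draw_size if in_draw else 0.0


def elo_update(rating: float, score: float, expected: float, k: float = 32.0) -> float:
    """R_i' = R_i + K * (S_i - E_i). K = 32 is an assumed placeholder."""
    return rating + k * (score - expected)


# Hypothetical example: a player rated 1500 shares a 3-way draw in a game
# where the expected score was 1/7 (seven evenly matched players).
new_rating = elo_update(1500.0, draw_score(True, 3), 1.0 / 7.0)
```

Since the player scored more than expected, the rating goes up; an eliminated player (score 0) with the same expectation would lose K/7 points.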

The next step is to determine what an expected score of two differently rated players would be. This can be arbitrarily chosen. In chess, a spread of 200 results in an expected score of .75 (i.e. if you played two games, a player rated 200 higher than another would be expected to win one and draw one). This can be tweaked to adjust the spread but, for the sake of simplicity, let’s just leave it there, although an expected score of .75 in diplomacy is very high.

Writing formulae in the forum isn’t pretty so it may be best to follow the wiki page to understand what the hell I’m trying to type. http://en.wikipedia.org/wiki/Elo_rating_system#Mathematical_details

For Elo rating, E_a is given by

E_a = 1/{ 1 + 10 ^ [ ( R_b – R_a) / 400] }

Where a is the player we’re evaluating and b is the opponent.
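The formula above is easier to read as code. This is a small Python sketch of the two-player expected score; the function name and the sample ratings are assumptions for illustration only.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Two-player Elo expectation: E_a = 1 / (1 + 10 ** ((R_b - R_a) / 400))."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


# A player rated 200 above the opponent (sample ratings are made up).
stronger = expected_score(1700.0, 1500.0)
weaker = expected_score(1500.0, 1700.0)
```

With a 200-point gap this comes out at roughly 0.76 for the stronger player, which matches the ".75" quoted above, and the two expectations always add up to 1.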

This is where the analogy breaks down and we need to make a decision. In diplomacy there are, of course, 6 opponents and not 1. There are two ways apparent to me for how to deal with this.

Option 1:

R_b = (R_b1 + R_b2 + … R_b6) / 6

Where b1 through 6 are your opponents

Option 2:

E_a = ( 1/{ 1 + 10 ^ [ ( R_b1 – R_a) / 400] } + 1/{ 1 + 10 ^ [ ( R_b2 – R_a) / 400] } … ) / 6

Strangely, I don’t own a pen or pencil in my house (Jesus that’s pretty weird, eh?) so I can’t work through this any further to deduce anything from the math but I imagine, at least, that in both cases E_a+E_b1+..+E_b6 still equals 1.
I see no obvious advantage to either one but maybe with some more digging there’ll be something. At very least, I’d lean towards the first option because of the simplicity.
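Without pen and paper, the guess about the expected scores summing to 1 can be checked numerically. The Python sketch below implements both options as written; the function names and the seven sample ratings are made up for illustration. In this sketch neither option sums to 1 as it stands: Option 2 always totals n/2 (each pair of players contributes exactly 1, and there are n(n-1)/2 pairs averaged over n-1 opponents), so multiplying by 2/n would fix it, while Option 1 could simply be divided by its total.

```python
def pairwise(r_a: float, r_b: float) -> float:
    """Two-player Elo expectation of a against b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def expected_option1(ratings: list, i: int) -> float:
    """Option 1: play against one opponent rated at the average of the others."""
    others = ratings[:i] + ratings[i + 1:]
    return pairwise(ratings[i], sum(others) / len(others))


def expected_option2(ratings: list, i: int) -> float:
    """Option 2: average the pairwise expectations against each opponent."""
    others = ratings[:i] + ratings[i + 1:]
    return sum(pairwise(ratings[i], r) for r in others) / len(others)


# Made-up ratings for a seven-player game.
ratings = [1400.0, 1450.0, 1500.0, 1500.0, 1550.0, 1600.0, 1700.0]
n = len(ratings)

opt1 = [expected_option1(ratings, i) for i in range(n)]
opt2 = [expected_option2(ratings, i) for i in range(n)]
total1 = sum(opt1)
total2 = sum(opt2)  # always n/2 (3.5 here), up to floating-point rounding

# One possible rescaling so that the expected scores sum to 1.
opt1_normalised = [e / total1 for e in opt1]
opt2_normalised = [e * 2.0 / n for e in opt2]
```

The rescaling matters because the actual scores (1/n for each player in the draw) sum to 1, so the expected scores need to as well if the average rating is to stay put.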

The next and final step is choosing K. K determines how quickly ratings can fluctuate. A K that is too large will cause ratings to fluctuate wildly and unpredictably, and a K that is too low will make it take forever for players to reach their true ratings. Furthermore, K does not have to be a constant. In chess, K often depends on a player's rating. The thinking is that high rated chess players don’t change as much while weaker players can alter their skill level quite significantly. Another way is to make K dependent on the number of games played. I think the latter would work best for diplomacy. We could make it exceptionally large for the first few games and then bring it back down to Earth after, say, 5 games or so. Really, this is something that has plenty of room for great discussion.
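A games-played K could be as simple as a step function. This Python sketch uses the 5-game cut-off mentioned above; the two K values (64 and 16) and the function name are pure assumptions and would need tuning.

```python
def k_factor(games_played: int,
             provisional_games: int = 5,
             k_provisional: float = 64.0,
             k_settled: float = 16.0) -> float:
    """Large K for a player's first few games, smaller K afterwards.

    All three defaults are illustrative assumptions, not tuned values.
    """
    return k_provisional if games_played < provisional_games else k_settled


# K used for a player's 1st, 5th and 6th game (0, 4 and 5 games already played).
ks = [k_factor(g) for g in (0, 4, 5)]
```

A smoother alternative would be to shrink K gradually with each game instead of dropping it in one step, but the step version is the easiest to explain to players.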

Other ideas that could be floated around are:

- Vaft can easily calculate the expected scores of each power compared to the others. We could use these ratings to tweak the expected scores.
- A means for injecting or removing points to maintain a constant average rating
- A rating floor
- How to incorporate this into a combined rating system with FP, GB, etc. Adjusting the K value would be the obvious answer.
- A penalty for resigns. You could count them as double losses or something along that line.
- Figuring out an equivalent expected score for PPSC. I expect that it would be difficult to find a range of values that would exactly match WTA expected scores.
- Keeping provisional scores (<n games) off the leaderboard

Meh, that’s all for now. Looks like I Obi’d all over the page.

Draugnar (0 )
26 May 12 UTC

We'll call you Yonniwanyonniwan. :-)

Alderian (2425 )
26 May 12 UTC

@Yonni, umm, have you read the ghost rating FAQ on the main ghost rating page? That ELO formula is the basis for the ghost ratings.

Yonni (136 )
26 May 12 UTC

Right, but in name only...

Yonni (136 )
26 May 12 UTC

The GR system is very much not the Elo system, no?

Alderian (2425 )
26 May 12 UTC

I'm not sure what you mean. It does follow this formula:
R_i’ = R_i + K * (S_i – E_i)

But then calculating the E_i is where things differ, since there are 6 opponents instead of 1. For WTA this is fairly simple: your rating divided by the total of everyone's ratings. If everyone has a GR of 100, everyone has a 1/7 expected chance.

S_i for WTA is 1 for a win, 0 for eliminated/resigned/survived, 1/n for a draw of n players.
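As a rough Python sketch of the WTA description above: expected score is your share of the total rating in the game, and the actual score is 1, 1/n or 0. The function names and the K value are assumptions; the real ghost rating program may scale the change differently.

```python
def gr_expected(ratings: list, i: int) -> float:
    """E_i for WTA as described: player i's rating over the sum of all ratings."""
    return ratings[i] / sum(ratings)


def wta_score(outcome: str, draw_size: int = 1) -> float:
    """S_i for WTA: 1 for a solo, 1/n for an n-way draw, 0 otherwise."""
    if outcome == "solo":
        return 1.0
    if outcome == "draw":
        return 1.0 / draw_size
    return 0.0  # eliminated, resigned or survived


def gr_update(ratings: list, i: int, score: float, k: float) -> float:
    """R_i' = R_i + K * (S_i - E_i), with K left as a free (assumed) parameter."""
    return ratings[i] + k * (score - gr_expected(ratings, i))


# Seven players all at the initial rating of 100; player 0 solos.
ratings = [100.0] * 7
e0 = gr_expected(ratings, 0)
after_solo = gr_update(ratings, 0, wta_score("solo"), k=10.0)
```

Because the expected scores are shares of the same total, they sum to 1 by construction, which is the property the pairwise Elo options need extra scaling to get.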

Yonni (136 )
26 May 12 UTC

The calculation (S_i - E_i) is a large part of the formula.

The Ghost rating formula is:

R_a'=R_a+K * (R_b1+..R_b6)

Yonni (136 )
26 May 12 UTC

It doesn't capture the difference between losing to a lot of strong players and losing to a bunch of weak players. Elo does because E_i depends on the difference between your rating and your opponents'.

Yonni (136 )
26 May 12 UTC

It's actually

R_a'=[ R_a*(1-K)+K * (R_b1+..R_b6) ] * (s / n)

where s=1 if you win and 0 if you lose and n is the number of people in the draw.

Draugnar (0 )
26 May 12 UTC

The stronger the opponent the more the opponent contributes to the pool so it does reflect it to a degree. But considering Diplomacy isn't purely tactical and strategic skill and the best player in the world can get his assets kicked by 6 determined enemies, you don't want to alter the percentages based on the relative strengths.

Yonni (136 )
26 May 12 UTC

"But considering Diplomacy isn't purely tactical and strategic skill and the best player in the world can get his assets kicked by 6 determined enemies, you don't want to alter the percentages based on the relative strengths."

I don't really get what you mean by that.
You like that beating a strong opponent gains you more points but you would dislike that losing to a weak opponent makes you lose more points?

I would think that one would just be an extension of the other.

Alderian (2425 )
26 May 12 UTC

Yonni, I'm going to have to get back to you later tonight when I'm home and can run a few scenarios through the ghost rating program.

Draugnar (0 )
26 May 12 UTC

No, I like that everyone contributes an equal percentage of their GR so the best player contributes more real GR but the same relative to their total. So someone with a 100 GR contributes 1/5th what someone with a 500 GR contributes. See what I am saying?

Dejan0707 (1608 )
26 May 12 UTC

What I personally don't like is that most people believe that by taking out the (in their words) "GR heavy player" they will earn themselves more points. It is not uncommon to see that behavior in the games lately.

Draugnar (0 )
26 May 12 UTC

I don't think everyone who targets the fat cats thinks they are getting more GR, but that they are improving their chances. It's like taking out your opponent's starting quarterback or star receiver in American football. It's an odds evener.

Lando Calrissian (100 )
26 May 12 UTC

Yeah because intentionally injuring opposing players is something that is so highly regarded in American Football......

Yonni (136 )
26 May 12 UTC

Lando, maybe a better analogy would be breaking the ankles of your opposing commie forwards:
http://www.youtube.com/watch?v=qOMJsJhHlyM

But, back to GR...

Draug, I see what you're saying. And that is why GR works. I'm not saying it doesn't - just that there are better systems out there.

I guess what we have to start with is what do we want in a ranking system? Without giving it too much thought I would suggest that we're looking for a ranking that "best reflects the relative skills of each player."

The best way to do that is to have some sort of expectation of how a strong player will do versus a weaker player. I think it is fair to say that a strong player will score better (on average) in a game with weaker players and vice versa. Now what if the high-ranked player in fact loses a bunch of games to the weaker player and the weaker player wins a bunch? You need some sort of mechanism to adjust the expectations - which is what Elo attempts to do a little more elegantly than GR does.

You really don't think that a stronger player will lose *less often* to a group of weak players than he would to a group of strong players? Yes, there is a lot more 'luck' in diplomacy but there still is a level of predictability.

Draugnar (0 )
26 May 12 UTC

@Lando - I didn't say other was good from or I agreed with it. I just explained why it happens and how taking out the big dogs aren't always a case of idiots thinking they get more if they kill the big dogs first. Personally, I like to keep them in play and pretend I am swayed by their arguments then stab before they do.

So try reading for content and context instead of putting your personal feelings into ship and you might actually find we agree and you don't Ned to be such a smartest fucktard to me.

Draugnar (0 )
26 May 12 UTC

* I didn't say it was good form.

Draugnar (0 )
26 May 12 UTC

I hate dumbphones...

The Hanged Man (4160 )
26 May 12 UTC

Ned's dead, baby.

MadMarx (36299 )
28 May 12 UTC

bump

Yonni (136 )
30 May 12 UTC

Anything exciting in store for next month's update?


139 replies

)
30 May 12 UTC

Thread is too old to reply to

Live Gunboat - 220

9 replies

Victorious (768)
30 May 12 UTC

Thread is too old to reply to

The Winter Gunboat Tournament has ended.

Congrats Manas.

9 replies

)
30 May 12 UTC

Thread is too old to reply to

Define "Work"

It's what I do between WebDiplomacy posts...

8 replies

(G)

)
18 May 12 UTC

Thread is too old to reply to

The Galaxians

OK, game 1 & 5 are over and here are your positions:-

50 replies

jwalters93 (288)
30 May 12 UTC

Thread is too old to reply to

Is it just me...

Or does it seem like less and less people actually understand sarcasm?

14 replies

)
30 May 12 UTC

Thread is too old to reply to

Austria has been banned in Spring 1901

No moves yet played. 5 days for talking (with the default 24 hour extension). Standard map. Bet of five. Anyone welcome.
http://webdiplomacy.net/board.php?gameID=89765

0 replies

Diplomat33 (243 (B))
30 May 12 UTC

Thread is too old to reply to

New Word Association Thread

Because they are fun! Lets make the first word

"banana"

5 replies

redhouse1938 (429)
26 May 12 UTC

Thread is too old to reply to

THE BEST PLAYERS

Who's the best player you ever played against on this site? Why? Share here.

78 replies

SacredDigits (102)
30 May 12 UTC

Thread is too old to reply to

Lost a player, need a replacement Russia

Not gonna lie, it's not the best position...AT seem pretty hellbent on him. But it's not the most horrible either.

gameID=89589

2 replies

(G)

)
30 May 12 UTC

Thread is too old to reply to

Do Americans deserve Mitt Romney?

A Mormon or a black guy for President, no matter what you think of US politics, surely you wouldn't wish Romney on them. That's just mean.....

8 replies

thatwasawkward (4690 (B))
30 May 12 UTC

Thread is too old to reply to

Help me put my brain back together.

High pot gunboat time. I just got back from a long weekend and have no games going... Anyone interested in something along the lines of 1500 point buy-ins, 25 hour turns, WTA?

2 replies

(G)

)
30 May 12 UTC

Thread is too old to reply to

are you Flame-proof?

http://www.bbc.co.uk/news/technology-18238326

Flame, I'm gonna live for ever, I'm gonna learn how to spy

0 replies

)
29 May 12 UTC

Thread is too old to reply to

Short, concise messages, or long lasting novel in Full Press?

More details inside

21 replies

(B)

)
30 May 12 UTC

Thread is too old to reply to

EoG: wta gunboat-165

I knew they would get bored eventually...

3 replies

Mod

(P)

)
28 May 12 UTC

Thread is too old to reply to

Downtime

Apologies for the downtime. The web server's logs were taking up a huge amount of space, so the database had nowhere to write to.
24 hours have been added to all games, except short-phase games which will have been paused.

21 replies

)
30 May 12 UTC

Thread is too old to reply to

We Need To Talk About Kevin

Watching the film right now. Has anyone else seen it or read the book? What do you think?

1 reply

)
29 May 12 UTC

Thread is too old to reply to

Useful pre-game press

What have other players on this site wanted to learn before games start? I'm curious. Some of my favorite info-seeking bits are below.

16 replies

)
29 May 12 UTC

Thread is too old to reply to

anyone want to play a MED Game?

http://webdiplomacy.net/board.php?gameID=89936

0 replies

(G)

)
28 May 12 UTC

Thread is too old to reply to

august rush

Final password sent, will start tonight or Tuesday, player list inside:

http://webdiplomacy.net/board.php?gameID=89627

9 replies

(G)

)
29 May 12 UTC

Thread is too old to reply to

Disgusting Double-Standards from the UK government

http://www.bbc.co.uk/news/uk-18245780

19 replies

KingJohnII (1575 (B))
29 May 12 UTC

Thread is too old to reply to

Masters of the World

Starting a new game - Masters of the World http://webdiplomacy.net/board.php?gameID=90127

Looking for all good players - it's a 101 bet game. Should be fantastic if we can get the required players. Hope you want to join.

0 replies

KingJohnII (1575 (B))
29 May 12 UTC

Thread is too old to reply to

Contacting the Moderators

The game 'Masters of Europe' got paused during the recent problem, and we need to get it re-started by the moderator. There are 2 inactive players who won't vote to resume, and we never voted to pause it. Anyone know how to contact the moderator, or perhaps they will see this?
thanks.

2 replies

)
27 May 12 UTC

Thread is too old to reply to

mapleleaf is turning FIFTY this week!

Happy birthday to ME. In honour of myself and my tireless beneficial contributions to this Forum, I hereby present another installment of mapleleaf's Greatest Hits.

57 replies

James Cartwright (400)
29 May 12 UTC

Thread is too old to reply to

Can a mod please draw "We are Back in Black"

gameID=90088

Turkey is stalling and waiting for one of the rest of us to go offline. He has no tactical/strategic reason not to draw at this point.

4 replies

abgemacht (1076 (G))
24 May 12 UTC

Thread is too old to reply to

TV no longer ubiquitous?

So, I'm currently moving to a new apartment and I'm probably not going to pay for TV service. Has anyone else taken this approach? It's so expensive for such a poor service, I'm having trouble justifying the cost.

49 replies

Diplomat33 (243 (B))
22 May 12 UTC

Thread is too old to reply to

Can a tactically skilled player succeed?

Would a player who is highly tactical and has great and innovative moving abilities, but lacks on some of the finer Diplomacy skills, manage to succeed? Basically, a debate on tactics vs diplomacy and level of importance.

16 replies

JimTheGrey (968 (S))
22 May 12 UTC

Thread is too old to reply to

22nd Annual World Diplomacy Championship

The 22nd Annual World Dip Con is coming to Chicago this summer. The dates are Aug. 10-12. Make your plans to be there!

9 replies

CSteinhardt (9560 (B))
25 May 12 UTC

Thread is too old to reply to

Sitter possibly needed

Looking for a sitter for a maximum of one turn in nopress games.

6 replies
