
Winter Marathon 0 inaccuracies game

@ #27
Right, you accepted that (1) is useful.
As for (2), it can go either way; I agree this depends on the game.

But I don't like you calling it a very rough guide.
@ #17

I ran a gauntlet of test matches pitting Stockfish 7 beta 2 against Houdini 4, Sf6, Gull 3, and Strelka 5.5, to measure Sf7 beta 2's winning percentage.

Sf7 beta 2 used a TC of 30s + 100ms inc; its opponents used a TC of 60s + 100ms inc. In this sample Sf7 beta 2 played 549 games, and that is the data behind the chart at the following link. The plot is not very smooth because the sample is small. The estimated average depth of Sf7 beta 2 is around 16.

http://www.mediafire.com/convkey/b5bb/m755yphdcqzc43vzg.jpg

At a 0.5 pawn advantage Sf7 beta 2 scored around 65%, which translates to a 110 Elo difference per table 8.1a of the "FIDE Rating Regulations effective from 1 July 2014":
http://www.fide.com/fide/handbook.html?id=172&view=article
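
As a rough cross-check of the score-to-Elo conversion, here is a minimal Python sketch using the common logistic Elo curve. This is not the FIDE table itself, and the function name is my own. For a 65% score the logistic curve gives a slightly smaller difference than the 110 read from table 8.1a.

```python
import math

def elo_diff_from_score(p):
    """Rating difference implied by an expected score p (0 < p < 1)
    under the logistic Elo curve: p = 1 / (1 + 10 ** (-d / 400))."""
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))

# 65% score, as read from the plot at a 0.5 pawn advantage
print(round(elo_diff_from_score(0.65), 1))  # about 107.5
```

The small gap to the table value is expected, since the published table is not generated from this exact formula.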

Here is the result summary from Ordo, with Sf6 anchored at 3204 per the CEGT 40/4 rating list.

# PLAYER              :  RATING  ERROR  POINTS  PLAYED    (%)  CFS(next)
1 Sf6                 :  3204.0   ----    70.0     136  51.5%         70
2 stockfish_7_beta_2  :  3193.6   38.9   314.0     549  57.2%         99
3 H4                  :  3149.1   54.2    59.5     136  43.8%         84
4 Gull_3_x64          :  3121.9   55.4    56.0     140  40.0%         84
5 Strelka_v5.5_64BIT  :  3093.1   56.1    49.5     137  36.1%        ---
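
The summary can be sanity-checked with a few lines of Python. This is only a sketch: the dictionary simply copies the POINTS and PLAYED columns, and the variable names are my own. In a gauntlet every game involves Sf7 beta 2, so the opponents' games should add up to its 549, and its points should be whatever the opponents did not score.

```python
# (points, games played) per engine, copied from the Ordo summary
results = {
    "Sf6": (70.0, 136),
    "stockfish_7_beta_2": (314.0, 549),
    "H4": (59.5, 136),
    "Gull_3_x64": (56.0, 140),
    "Strelka_v5.5_64BIT": (49.5, 137),
}

def score_pct(points, played):
    """Score percentage, counting a draw as half a point."""
    return 100.0 * points / played

opponents = {k: v for k, v in results.items() if k != "stockfish_7_beta_2"}
gauntlet_games = sum(played for _, played in opponents.values())
opponent_points = sum(points for points, _ in opponents.values())
sf7_points = gauntlet_games - opponent_points  # each game is worth 1 point

for name, (points, played) in results.items():
    print(f"{name:20s} {score_pct(points, played):5.1f}%")
print(gauntlet_games, sf7_points)
```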

Has there been a study translating Sf score into the Glicko-2 rating system? I guess I will attempt to translate this data into Glicko-2.
I went ahead and calculated the relationship with Glicko-2 for Sf7 beta 2.
Assumption: all players, Sf7 beta 2 and the others, start at
Rating, RD, Volatility = [1500, 350, 0.06]. Sf7 beta 2's record is updated after every period, while each opponent is taken at the starting values.

Procedure:
1. Period 1 starts vs Sf6
2. Period 2 vs Houdini 4
3. Period 3 vs Gull 3
4. Period 4 vs Strelka

Starting record of player Sf7_beta_2
Old Rating: 1500
Old Rating Deviation: 350.0
Old Volatility: 0.06

(1) Result vs Sf6 at [1500, 350, 0.06]
New Rating: 1484.97006018
New Rating Deviation: 44.1723185519
New Volatility: 0.0598281196872
-------------------------------------

Old Rating: 1484
Old Rating Deviation: 44.1723185519
Old Volatility: 0.0598281196872

(2) Result vs Houdini 4 at [1500, 350, 0.06]
New Rating: 1525.7224422
New Rating Deviation: 31.793551785
New Volatility: 0.0601385057921
-------------------------------------

Old Rating: 1525
Old Rating Deviation: 31.793551785
Old Volatility: 0.0601385057921

(3) Result vs Gull 3 at [1500, 350, 0.06]
New Rating: 1554.49568549
New Rating Deviation: 26.6290272103
New Volatility: 0.0603330874708
-------------------------------------

Old Rating: 1554
Old Rating Deviation: 26.6290272103
Old Volatility: 0.0603330874708

(4) Result vs Strelka 5.5 at [1500, 350, 0.06]
New Rating: 1580.95205877
New Rating Deviation: 24.0890745516
New Volatility: 0.0603841536455
-------------------------------------

Summary:
Player: Sf7_beta_2
Player RD: 24
Final Rating: 1580 +/- 48 (95% confidence = 2*RD)
Rating Change: +80
Actual overall score from 549 games: 57.2%
Advantage at which the plot shows a 57.2% score: approximately 0.1 pawn
So for Sf7 beta 2 a 10cp advantage corresponds to approximately 80 Glicko-2 rating points.
Including RD, the range is [80-48, 80+48] = [+32, +128] at 95% confidence.
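
The interval arithmetic can be reproduced with a short Python sketch. It starts from the unrounded Period 4 rating and RD listed above; the 2*RD half-width is the approximation used in the summary, and the variable names are my own. With unrounded inputs the interval comes out slightly different from the one obtained by rounding to 80 and 48 first.

```python
START_RATING = 1500.0
final_rating = 1580.95205877   # New Rating after period (4)
final_rd = 24.0890745516       # New Rating Deviation after period (4)

gain = final_rating - START_RATING
half_width = 2.0 * final_rd    # roughly 95% confidence, as in the summary

low, high = gain - half_width, gain + half_width
print(f"gain = {gain:.1f}, interval = [{low:.1f}, {high:.1f}]")
```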

This result is based only on the test matches from post #32.

The Glicko-2 formulas used are based on the Python source, which can be found here:
http://www.glicko.net/glicko.html
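
For readers who want to retrace a period, here is a minimal Python sketch of one Glicko-2 rating-period update, written from the published Glicko-2 description. It is not the Python source linked above. The function name, the system constant tau = 0.5 and the win/loss split are assumptions: the posts only give totals, so Period 1 is modelled as 66 wins and 70 losses for Sf7 beta 2 (its 66 points out of 136 against Sf6). When every opponent has the same rating and RD, only the total score enters the update.

```python
import math

SCALE = 173.7178  # conversion factor between Glicko and Glicko-2 scales

def glicko2_update(rating, rd, vol, games, tau=0.5, eps=1e-6):
    """One rating-period update.
    games: list of (opp_rating, opp_rd, score), score in {0, 0.5, 1}.
    tau is an assumed system constant; the posts do not state it."""
    mu = (rating - 1500.0) / SCALE
    phi = rd / SCALE
    v_inv = 0.0   # inverse of the estimated variance v
    total = 0.0   # sum of g * (score - expected score)
    for opp_rating, opp_rd, score in games:
        mu_j = (opp_rating - 1500.0) / SCALE
        phi_j = opp_rd / SCALE
        g = 1.0 / math.sqrt(1.0 + 3.0 * phi_j ** 2 / math.pi ** 2)
        e = 1.0 / (1.0 + math.exp(-g * (mu - mu_j)))
        v_inv += g * g * e * (1.0 - e)
        total += g * (score - e)
    v = 1.0 / v_inv
    delta = v * total

    # New volatility: solve f(x) = 0 with the Illinois (regula falsi) method.
    a = math.log(vol ** 2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta ** 2 - phi ** 2 - v - ex)
        den = 2.0 * (phi ** 2 + v + ex) ** 2
        return num / den - (x - a) / tau ** 2

    lo = a
    if delta ** 2 > phi ** 2 + v:
        hi = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        hi = a - k * tau
    f_lo, f_hi = f(lo), f(hi)
    while abs(hi - lo) > eps:
        mid = lo + (lo - hi) * f_lo / (f_hi - f_lo)
        f_mid = f(mid)
        if f_mid * f_hi <= 0:
            lo, f_lo = hi, f_hi
        else:
            f_lo /= 2.0
        hi, f_hi = mid, f_mid
    new_vol = math.exp(lo / 2.0)

    # New deviation and rating, converted back to the Glicko scale.
    phi_star = math.sqrt(phi ** 2 + new_vol ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + v_inv)
    new_mu = mu + new_phi ** 2 * total
    return 1500.0 + SCALE * new_mu, SCALE * new_phi, new_vol

# Period 1 vs Sf6 at [1500, 350, 0.06], hypothetical split: 66 wins, 70 losses
period1 = [(1500.0, 350.0, 1.0)] * 66 + [(1500.0, 350.0, 0.0)] * 70
print(glicko2_update(1500.0, 350.0, 0.06, period1))
```

Under these assumptions the rating and RD should land close to the 1484.97 and 44.17 reported for Period 1. The volatility depends on tau and on the solver used, so it may differ in the later digits.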
I noticed something interesting:
Against weaker or equal-strength players I actually have a number of 0 0 0 wins, but when I beat stronger players it's usually not a good game (like here: http://en.lichess.org/MpYMVnRr/white#77).
That's somewhat counterintuitive, since you would expect higher ratings = better game quality.

One reason might be that a strong player puts more pressure on opponents and forces mistakes, but I don't know.
@ #34
That is normal; you can play perfectly if your opponent is weak. Note that only you played perfectly, not your opponent.

Better game quality implies good-looking moves from both sides, hence lower BMI counts for both sides.

BMI = Blunder/Mistake/Inaccuracy.
