lichess.org

Introducing Maia, a human-like neural network chess engine

@FC-in-the-UK
Please keep an open mind and don't restrict yourself to thinking this is only about playing a bot.
It can be a very helpful analysis tool, for example for predicting on which move you would blunder.
It can be great alongside SF 13.
When I read this on lichess blog my heart just jumped.
OMG
So I was thinking more about my question (#52) and the following occurred to me. Suppose we take a large pool of chess players and of chess positions, and we ask each player what they would play in each position. This way we could determine, for each position and each suggested move:
1) the difference in win probability compared with the optimal move (by evaluating it with SF or any strong engine)
2) the probability that a human plays this move (by seeing what proportion of the players suggested it)
Averaging these data, we could plot a curve like the one in the last graph of the article. Arguably this curve would represent something 'objective' about human reasoning (though a priori we would get different results with a pool of 1100 players than with a pool of 2500 players...). Would that 'explain' the similarities in the various curves produced by Maia and Leela?
I would be interested to have the opinion of @ashtonanderson @reidmcy and @sidsen.
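For concreteness, here is a minimal sketch of the experiment described above, assuming the answers are collected as (position, move) pairs and that an engine has already supplied a win probability for every suggested move. The function names, the data layout and the bin width are all hypothetical, chosen only for illustration; this is not how the article's graph was produced.

```python
from collections import Counter, defaultdict


def move_statistics(responses, win_prob):
    """Return (win_prob_loss, human_probability) for every suggested move.

    responses: iterable of (position, move) pairs, one per player answer.
    win_prob:  dict mapping (position, move) -> engine win probability in [0, 1]
               (assumed to contain every suggested move).
    """
    counts = defaultdict(Counter)
    for position, move in responses:
        counts[position][move] += 1

    stats = []
    for position, move_counts in counts.items():
        total = sum(move_counts.values())
        # Best win probability among the evaluated moves for this position.
        best = max(p for (pos, _), p in win_prob.items() if pos == position)
        for move, n in move_counts.items():
            loss = best - win_prob[(position, move)]  # point 1) above
            stats.append((loss, n / total))           # point 2) above
    return stats


def binned_curve(stats, bin_width):
    """Average the human probability inside each win-probability-loss bin."""
    bins = defaultdict(list)
    for loss, prob in stats:
        bins[int(loss / bin_width)].append(prob)
    return {b: sum(ps) / len(ps) for b, ps in sorted(bins.items())}


# Made-up toy data: two positions, four answers each.
win_prob = {
    ("p1", "a"): 0.75, ("p1", "b"): 0.5, ("p1", "c"): 0.25,
    ("p2", "x"): 0.5, ("p2", "y"): 0.25,
}
responses = [
    ("p1", "a"), ("p1", "a"), ("p1", "b"), ("p1", "c"),
    ("p2", "x"), ("p2", "x"), ("p2", "x"), ("p2", "y"),
]
stats = move_statistics(responses, win_prob)
curve = binned_curve(stats, bin_width=0.25)
print(curve)
```

The keys of `curve` are bin indices (bin 0 holds the smallest losses). Running the same code on pools of different strength would give one curve per pool, which is what would make them comparable.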
So basically Leela is Optimus Prime (optimal moves) and Maia is Bumblebee.
@mars69ha, @TheRisottoVariation The Maia bots only accept timed games. If the opponent leaves during a correspondence game, it causes issues for the backend.

@Eyvazova_2009, @jrepo Thanks for the heads up. We had some issues with Lichess's DDoS protection, so the bots timed out in some games. It should be fixed now.

@WilkBardzoZly There's been another year of games played on Lichess since we made these models. So we're optimistic that stronger (and weaker) ones can be made. We did do some experiments with higher skill models last year, and they were not good.

@FC-in-the-UK, you make a good point. I think part of why small mistakes are harder to predict is that there can be more than one of them. We don't have a good theory to explain why, though, which is one of the things we're working on right now.
@reidmcy thank you for your reply. My question was a bit more specific than 'small mistakes are harder to predict', though: if we look at the curves, they are all roughly increasing (that's expected), but we also see that they all have three local maxima and three local minima, which seem to 'propagate' and become more accentuated as we move from Leela 1000 to Maia 1900, and seem to converge to some limit curve.
This phenomenon is very surprising and interesting to me, and my only hypothesis for now is that the local extrema correspond to some 'objective' stuff happening that is detected more and more accurately as we move from Leela 1000 to Maia 1900.
Maybe (but I am really improvising here) the three minima could correspond respectively to small mistakes when we calculate at depth 1, 2, 3, or something similar?
This bot doesn't accept custom games (which include boards set up from specific positions). I think this is a shame, since you can't practice openings against it.

Is there a reason why this is ignored?
@FC-in-the-UK Thanks for clarifying. I don't think we should read too much into the shape of the curves. We are using Stockfish plus a rough approximation of win probability to judge each position, so there's a lot of error that can build up. Looking at trends and relative results should be the main focus.

I think you could be right about us being able to pick up on different levels of thinking depth, but it would require a much more cleanly constructed test.
@reidmcy I agree that with this kind of problem one should not read too much into the curves, as there are several layers of approximation. However, what I find really troubling is that all the curves, from Leela 1000 to Maia 1900, have the same characteristics. If it were just an artefact, we would not expect to see such consistency, would we?
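
To illustrate how an engine score turns into a win probability, here is a minimal sketch using a logistic curve, which is one common way to make that conversion. It is not necessarily the approximation used for the article, and the scale constant `K` is a hypothetical value picked for illustration.

```python
import math

# Hypothetical scale constant, chosen for illustration only (an assumption,
# not a value taken from the article).
K = 0.004


def win_probability(centipawns, k=K):
    """Map an engine evaluation in centipawns to a win probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-k * centipawns))


def win_probability_loss(cp_best, cp_played, k=K):
    """Win probability given up by playing a move instead of the best one."""
    return win_probability(cp_best, k) - win_probability(cp_played, k)


# The same 100-centipawn drop costs less win probability when already far ahead.
loss_when_equal = win_probability_loss(50, -50)
loss_when_winning = win_probability_loss(600, 500)
print(loss_when_equal, loss_when_winning)
```

Because the mapping is non-linear, any noise in the engine score is stretched or squeezed depending on where the position sits on the curve, which is one way approximation error can accumulate.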
