-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interest in super-ko? #13
Comments
Hi RohanM, I'm wondering how you would implement this Zobrist hashing. This codebase is largely reliant on numpy/scipy and I would prefer if your zobrist hashing used numpy/scipy if possible. Also I had a similar experience as you and my solution was to just end the games after a max number of steps. Have you considered that? |
I've considered the same solution (and that's what I'm using in prototypes at the moment). But I thought that tracking super ko would provide more information from each game (an actual winner rather than a draw), and also keep the games shorter, saving computation. In terms of the implementation, Zobrist hashing is very simple - generate a 64 bit random integer for each piece in each position, and XOR it in and out of the hash as the board position changes. As such, it's easy to implement with just numpy, and no other dependencies would be required. Anyway, thanks for getting back! I'll do a little more work on my prototype and post a link so you can take a look and see what would be involved. |
I might not have the best understanding of Zobrist hashing but since you're working with small boards anyway. Why not just compare the board states directly? I doubt it would take significant memory and I feel like it would be just as fast? |
I'm using this gym to experiment with an AlphaZero-like algorithm, starting with a very small board (2x2 or 3x3). In that context it's very easy to have games that result in multi-step cycles, which the current ko protection code doesn't detect.
As part of that project, I'm working on implementing super-ko, which would prevent such illegal moves. I'm using Zobrist hashing, so the implementation should be efficient. Would there be interest in incorporating that feature into this gym? If so, I'm happy to write it up as a PR (with tests etc), but I thought I'd check in first to see if that's the case.
The text was updated successfully, but these errors were encountered: