Best 100-episode average reward was -0.99 ± 0.01. (Go9x9-v0 does not have a specified reward threshold at which it's considered solved.)
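As a rough illustration of how a figure like this can be derived from a list of per-episode rewards, the sketch below scans every contiguous 100-episode window and keeps the one with the highest mean. It assumes the ± value is the standard error of the mean over that window, which the evaluation does not state. The function name `best_window_average` is hypothetical and not part of any Gym API.

```python
import math
from typing import Sequence, Tuple


def best_window_average(rewards: Sequence[float], window: int = 100) -> Tuple[float, float]:
    """Return (mean, standard error) for the best contiguous window of episodes.

    Assumption: the reported "±" is the standard error of the mean,
    computed from the sample standard deviation within the window.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to estimate a standard error")
    if len(rewards) < window:
        raise ValueError("need at least `window` episodes")

    best_mean = -math.inf
    best_sem = 0.0
    # Slide the window over the episode history one episode at a time.
    for start in range(len(rewards) - window + 1):
        chunk = rewards[start:start + window]
        mean = sum(chunk) / window
        if mean > best_mean:
            # Sample variance (n - 1 denominator), then standard error of the mean.
            variance = sum((r - mean) ** 2 for r in chunk) / (window - 1)
            best_mean = mean
            best_sem = math.sqrt(variance / window)
    return best_mean, best_sem
```

With win/loss rewards of +1 and -1, an average near -0.99 over 100 episodes would mean the agent lost almost every game in even its best window.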
Algorithm
This evaluation was generated by running alg_kInVxyv9RqqwZqDnKCcgdw.