Superhuman AI for heads-up no-limit poker: Libratus beats top professionals


This talk gives a high-level explanation of Libratus, the first AI to defeat top humans in no-limit poker. A paper on the AI was published in Science in 2017.

No-limit Texas hold’em is the most popular form of poker. Despite AI successes in perfect-information games, the private information and massive game tree have made no-limit poker difficult to tackle. We present Libratus, an AI that, in a 120,000-hand competition, defeated four top human specialist professionals in heads-up no-limit Texas hold’em, the leading benchmark and long-standing challenge problem in imperfect-information game solving. Our game-theoretic approach features application-independent techniques: an algorithm for computing a blueprint for the overall strategy, an algorithm that fleshes out the details of the strategy for subgames that are reached during play, and a self-improver algorithm that fixes potential weaknesses that opponents have identified in the blueprint strategy.
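
The blueprint mentioned in the abstract is computed with counterfactual regret minimization (CFR). As a rough illustration of how that family of algorithms works, here is a minimal, self-contained vanilla-CFR implementation for Kuhn poker, a three-card toy game; this sketches the algorithm family only, not the paper's abstracted, Monte Carlo version.

```python
# Vanilla CFR on Kuhn poker: an illustrative sketch of the algorithm family
# used to compute blueprint strategies, not Libratus's actual implementation.
import random
from collections import defaultdict

ACTIONS = ["p", "b"]  # p = pass/check/fold, b = bet/call

class Node:
    """Cumulative regrets and strategy sums for one information set."""
    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def current_strategy(self, reach_weight):
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        strat = [p / total for p in positive] if total > 0 else [0.5, 0.5]
        for i in range(2):
            self.strategy_sum[i] += reach_weight * strat[i]
        return strat

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]

nodes = defaultdict(Node)

def cfr(cards, history, p0, p1):
    """Returns expected utility for the player to act; updates regrets."""
    player = len(history) % 2
    opponent = 1 - player
    # Terminal payoffs for Kuhn poker.
    if len(history) > 1:
        higher = cards[player] > cards[opponent]
        if history[-1] == "p":
            if history == "pp":            # check-check: showdown for 1 chip
                return 1 if higher else -1
            return 1                       # bet then fold: the bettor wins 1
        if history[-2:] == "bb":           # bet-call: showdown for 2 chips
            return 2 if higher else -2
    info_set = str(cards[player]) + history
    node = nodes[info_set]
    strat = node.current_strategy(p0 if player == 0 else p1)
    util, node_util = [0.0, 0.0], 0.0
    for i, a in enumerate(ACTIONS):
        if player == 0:
            util[i] = -cfr(cards, history + a, p0 * strat[i], p1)
        else:
            util[i] = -cfr(cards, history + a, p0, p1 * strat[i])
        node_util += strat[i] * util[i]
    # Regrets are weighted by the opponent's reach probability.
    for i in range(2):
        node.regret_sum[i] += (p1 if player == 0 else p0) * (util[i] - node_util)
    return node_util

def train(iterations=20000):
    cards = [1, 2, 3]
    for _ in range(iterations):
        random.shuffle(cards)
        cfr(cards, "", 1.0, 1.0)

train()
for info_set in sorted(nodes):
    print(info_set, [round(p, 2) for p in nodes[info_set].average_strategy()])
```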

Source: YouTube


10 thoughts on “Superhuman AI for heads-up no-limit poker: Libratus beats top professionals”

  1. You mention that Libratus will sometimes, for example, raise 95% of the time and fold 5% of the time. If it went only on regret, the AI would always pick the action with the most positive value. How does it decide when to use a split strategy?
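
A minimal sketch of regret matching may clarify this: the current strategy plays each action in proportion to its positive cumulative regret, so mixing falls out of the accumulated regrets rather than being an explicit choice (and it is the average strategy over all iterations, not the final one, that converges toward equilibrium). The numbers below are illustrative, not from the talk.

```python
# Illustrative regret matching, not Libratus's actual code.
def regret_matching(regrets):
    """Map cumulative per-action regrets to a mixed strategy."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)  # uniform fallback
    return [p / total for p in positive]

# Hypothetical cumulative regrets for (raise, fold, call):
print(regret_matching([19.0, 1.0, 0.0]))  # -> [0.95, 0.05, 0.0]
```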

  2. Is there software or an app where I can play against a bot and learn from it, or that will teach this?

  3. This was fascinating – thank you for such a complete talk. What were the bot's opening-range percentages from the SB and BB? Did they change according to the opponent's bet sizing/ranges or remain nearly the same throughout the match? Finally, did it provide or suggest anything about the actual value of being in position?

  4. This is a fantastic presentation!

    In the video you mention exploitability. How is that calculated? It sounds like you first determine how a worst-case opponent would perform and then measure performance relative to that. What I don't understand is how you estimated the equilibrium strategy's performance to compare your strategy against.

    Also, have you heard of, and tried, double neural CFR? I recently read a paper still in review that shows some really promising results in terms of training speed (it converges in smaller games like rih in ~1k iterations) – https://openreview.net/forum?id=Bkeuz20cYm
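
On the exploitability question above: in a two-player zero-sum game you never need the equilibrium strategy itself. You compute a best response against each player's strategy; at an exact equilibrium the two best-response values sum to zero, so their gap directly measures distance from equilibrium. A minimal matrix-game sketch follows (poker requires a tree-based best-response computation, but the principle is the same; the example game and numbers are illustrative).

```python
# Exploitability in a two-player zero-sum matrix game: the average of the two
# best-response values, which is zero exactly at a Nash equilibrium.
import numpy as np

def exploitability(A, x, y):
    """A: row player's payoff matrix; x, y: mixed strategies."""
    br_row = np.max(A @ y)        # best-response value for the row player vs y
    br_col = np.max(-(x @ A))     # best-response value for the column player vs x
    return (br_row + br_col) / 2  # zero iff (x, y) is a Nash equilibrium

# Rock-paper-scissors: uniform play is the equilibrium.
A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])
uniform = np.ones(3) / 3
print(exploitability(A, uniform, uniform))                 # 0.0
print(exploitability(A, np.array([1.0, 0, 0]), uniform))   # 0.5: pure rock is exploitable
```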

  5. So when you solve a subgame, given the constraint that your opponent must not be able to exploit you by increasing the frequency with which they reach the subgame with certain hands, does that mean you don't necessarily find an equilibrium for the subgame, i.e., the subgame strategy is not necessarily a best-response or equilibrium strategy? It seems counterintuitive to solve a subgame and have the resulting strategy not be the best possible in that subgame. Perhaps it is finding an equilibrium strategy, but if a hand sometimes takes action x and sometimes action y, the percentages have to be fine-tuned to achieve safe subgame solving.

    Also, is a subgame any subtree of the entire game tree, or does the root of the subgame need to be an information set with a single node?
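
On the first question in comment 5: that intuition is right. Safe subgame solving deliberately constrains the re-solved strategy so that, for every hand the opponent could bring into the subgame, they do no better than the value the blueprint already guaranteed them; maxmargin-style variants maximize the smallest such margin. A minimal sketch of the safety check follows, with hypothetical hand values (the function name and numbers are illustrative, not from the paper).

```python
# Illustrative safety check for subgame re-solving: the refined strategy is
# safe if no opponent hand gains value relative to the blueprint's guarantee.
def is_safe(blueprint_value, resolved_value, eps=1e-9):
    """Both dicts map opponent hand -> expected value for the opponent."""
    margins = {h: blueprint_value[h] - resolved_value[h] for h in blueprint_value}
    return all(m >= -eps for m in margins.values()), margins

# Hypothetical two-hand example: 'AKs' would gain by steering into the
# subgame, so this refinement would NOT be safe against that deviation.
safe, margins = is_safe({"AKs": 1.2, "72o": -0.4},
                        {"AKs": 1.5, "72o": -0.6})
print(safe, margins)  # False {'AKs': -0.3..., '72o': 0.2...}
```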
