• October 4th, 2018

Gambling Agents

Paper, Order, or Assignment Requirements

For reasons that baffle much of the population, televising poker tournaments, such as the World Series of Poker, has mysteriously become a staple of sports network programming. It seems unlikely that athletic grace and prowess are the main attractions of this “sport,” so perhaps there is something about the nature of the competitive task, and the complexity of the environment in which the task is conducted, that rivets the attention of the adoring public.
In part to see if you can discover whether this is the case, this problem set consists of questions about the World Series of Poker and about the design of artificial agents to make decisions in this context. The version of poker played is known as “Texas Hold ’em.” Full rules for the game can be found at https://www.partypoker.com/how-to-play/texas-holdem.html , and you should be familiar with how the game works before attempting this assignment (how the initial hand is dealt, the difference between a hand and a game, and the betting rules for staying in the game). A quick Google search for “how to play texas holdem” turns up many resources.
Just to be clear about terminology in this assignment, a hand is assumed to involve the dealing of cards to the players, followed by rounds of betting and revelation of other cards until one of the players wins the money that has been bet. A fresh deck of cards is used for each hand, and the dealer has no possibility of cheating to benefit any player.
We will assume that a full game consists of a table of players who repeatedly play hands until one of them has won all of the money from the others.
Games take place in the context of a tournament. In the larger tournament, winners from different tables then play in games against each other; for the purposes of this assignment, assume that each game brings together a group of players that are unfamiliar with each other. This continues until there is one tournament winner. Assume that winnings are not carried over between games; every player in every game in every round of the tournament starts with the same amount of money.
1. Specify a game of poker in terms of each element of the PEAS description. Assume that the agent just plays poker; it does not need to move between tables, socialize, order drinks, etc.
2. A characterization of one poker environment has been presented in the book on page 45 (Image attached). Using this as a starting point, characterize a poker game environment as described in this assignment. For each dimension of the characterization, write a few sentences justifying your choice.
3. While in Vegas, you observe your very drunk friend play poker at a table. You notice that they sometimes seem to make the wrong decision. For example, your friend folded on a [2♣, 4♣] but the communal cards turned out to be [3♣, 9♦, J♠, 5♣, 6♣], which would have given them a straight flush and would have won them that hand! What can we say about the rationality of your friend (when it comes to playing poker, not drinking or making life decisions in Vegas)? Explain your answer.
4. Consider a hypothetical agent that decides whether to fold or call based solely on the value of the largest card in its pair. Is this agent rational? Why or why not?
5. We’ve completely forgotten one factor of the game and the whole reason why people gamble: the bets. Betting different amounts is an important part of playing poker and can be advantageous for a player. Regardless of what type of agent it is (table lookup, simple reflex, etc.), the agent we implement doesn’t have actuators that allow it to do things like go “all-in” on a bluff to scare the other players into folding. With this in mind, is it still possible for this agent to be considered rational? Explain your answer.
6. Now let’s consider a learning agent. The agent watches an expert who is playing poker, and observes the expert’s initial (pre-flop) hand. The agent watches the expert’s decision as to whether to check, bet, or fold based on only these two cards. The agent uses this information to update its performance element to match the behavior of the expert.
If the agent’s performance element does table lookup, the agent will find the appropriate entry in the table (corresponding to the pair held by the expert) and record the expert’s decision for that particular pair of cards.

If the agent’s performance element uses a reflex-threshold architecture, it instead does the following. It uses statistical information to rank all of the possible pairs by their likelihood of ultimately winning. Given this ranking, what it needs to learn are the thresholds that map the pairs to actions: for example, bet if a pair is in the top third, check if it is in the middle third, and fold if it is in the bottom third. By watching the expert, the agent can improve these thresholds. If the expert folds on the pair ranked in the center of the ranking, for example, the agent might adjust its thresholds to fold on the bottom half rather than the bottom third.

For each of these two types of performance elements, describe how well the agent will do when it has finished learning and must play on its own. In particular, how good will its decisions be when it is dealt pairs that it never saw the expert receive, and how will the quality of the decisions depend on how many different observations of the expert it got?
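
To make the two performance elements in question 6 concrete, here is a minimal Python sketch of how each might be structured. It is an illustration under stated assumptions, not a reference solution. The class names, the `toy_strength` scoring function, and the threshold-update rule are all hypothetical; in particular, `toy_strength` is a made-up stand-in for the "statistical information" the assignment mentions and does not reflect real win probabilities.

```python
RANKS = "23456789TJQKA"
EPS = 1e-9  # small nudge used when a "check" contradicts a threshold


def all_starting_hands():
    """Return the 169 two-card starting-hand classes, e.g. 'AA', 'AKs', 'AKo'."""
    hands = []
    for hi in range(len(RANKS)):
        for lo in range(hi + 1):
            if hi == lo:
                hands.append(RANKS[hi] * 2)              # pocket pair
            else:
                hands.append(RANKS[hi] + RANKS[lo] + "s")  # suited
                hands.append(RANKS[hi] + RANKS[lo] + "o")  # offsuit
    return hands


def toy_strength(hand):
    """ASSUMPTION: a made-up score standing in for real win statistics."""
    hi, lo = RANKS.index(hand[0]), RANKS.index(hand[1])
    score = 2 * hi + lo
    if hi == lo:
        score += 20  # reward pocket pairs
    if hand.endswith("s"):
        score += 2   # small bonus for suited hands
    return score


# Weakest hand first; ties are broken alphabetically so the order is deterministic.
RANKED = sorted(all_starting_hands(), key=lambda h: (toy_strength(h), h))
# Position in the ranking, scaled to [0, 1].
PERCENTILE = {h: i / (len(RANKED) - 1) for i, h in enumerate(RANKED)}


class TableLookupAgent:
    """Stores the expert's decision for each starting hand it has seen."""

    def __init__(self):
        self.table = {}

    def observe(self, hand, action):
        self.table[hand] = action

    def act(self, hand):
        # Returns None when the expert was never seen holding this hand.
        return self.table.get(hand)


class ThresholdAgent:
    """Starts with thirds (fold / check / bet) and shifts the two cut points."""

    def __init__(self):
        self.fold_at = 1 / 3  # fold when percentile <= fold_at
        self.bet_at = 2 / 3   # bet when percentile >= bet_at

    def observe(self, hand, action):
        # ASSUMPTION: one simple update rule among many possible ones.
        p = PERCENTILE[hand]
        if action == "fold":
            self.fold_at = max(self.fold_at, p)
            self.bet_at = max(self.bet_at, self.fold_at)
        elif action == "bet":
            self.bet_at = min(self.bet_at, p)
            self.fold_at = min(self.fold_at, self.bet_at)
        elif action == "check":
            if p <= self.fold_at:
                self.fold_at = p - EPS
            if p >= self.bet_at:
                self.bet_at = p + EPS

    def act(self, hand):
        p = PERCENTILE[hand]
        if p <= self.fold_at:
            return "fold"
        if p >= self.bet_at:
            return "bet"
        return "check"
```

In this sketch, `TableLookupAgent.act` has nothing to return for a hand the expert never held, while `ThresholdAgent.act` always produces an action because one observation moves a cut point that applies to every hand on the same side of it. Observing the expert fold the hand at the center of the ranking (`RANKED[84]`) moves `fold_at` from one third to one half, which mirrors the example in the question.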
