Rise of the machines

Most people who learn to play Monopoly are taught by someone: their parents, friends, even their children. Unfortunately, I had no one to teach me, so I learned by studying the actual rules, the ones most players have never read firsthand. The most difficult and complicated part of the game is trading, and there is precious little in the rules about how it should be conducted. Of course, it is a good thing that the rules do not limit trading in any way, but that makes it hard to understand the value of the properties. Having at least an outline of trading strategies would really help. I tried to look up advice and Monopoly trading guides online, but the best piece of information I could find was limited to a single line: “Don’t forget you can also trade your properties!” Not very helpful.

Then I tried playing against bots. I tried the EA / Hasbro Monopoly games and could consistently beat the bots on the highest difficulty without losing a single game (playing against 3 bots). The bots clearly had no clue about strategy, nor did they understand the value of the properties. They almost never trade, and when they do, they hand cards to each other for practically nothing. So you are really playing one against 3 bots who share properties and money any way they like, but because they play so poorly, you still have no trouble beating them 100% of the time. Many people reported that the dice were rigged too: bots would almost always get a perfect roll, while a human player would get random rolls at best. I learned nothing about trading that way.

There were also a few Monopoly apps created by independent developers. One of them, Monopolist Businessman, is worth mentioning. Although it inherited most of the issues above, it made a very basic distinction between the price bots would pay for the second card in a set and the price for the card that would complete their set. The dice did not look rigged either. Unfortunately, the game had lots of bugs that significantly affected gameplay, and half of the Monopoly rules were not supported. On the highest difficulty the bots’ actions were merely valid moves with no strategy behind them: they built houses only occasionally and a few at a time, never mortgaged to raise money to improve other properties, and traded with each other at preferential prices compared to trading with a human. Again, you are playing one against 3. Bots could refuse a trade regardless of what you proposed in exchange (you could offer all your cards and all your money for 1 card and the trade might still be rejected), so there is a slight chance of being very unlucky and losing, but a good player could still easily achieve a 90%-95% win rate, even without exploiting the multiple bugs that allow infinite movement across the board and landing without paying rent by crashing the app at the right moment. But at least the developers tried to implement something that could help shape trading, by pricing cards differently depending on how many cards of the set the other party holds. It was the only app that did so at the time, and it was free, so despite all its issues it was the best Monopoly app available.

That was it; I cannot credit anyone else with helping me understand Monopoly trading tactics. How would one go about creating intelligent bots to play Monopoly, with trading being the main (if not the only) test of intelligence, and with almost nothing to start from? It might look like a task with an infinite number of variables (and it is), but don’t rush to pronounce the problem unsolvable. Thanks to semi-modern technology and developments in applied mathematics, the answer is simple: Machine Learning.

Sounds easy? Not so fast. Machine Learning algorithms need a lot of training data to analyse, to “train” your model the same way a child’s brain gets “trained” by its environment. I did not have the data and had no way of obtaining it. Even if I could dig up footage of 100 Monopoly tournament games, it would take me half a year to go through and analyse it manually, and it would still be a tiny portion of the data I needed. Instead, I decided to build the AIs, let them play against each other and learn by doing so. Some of the tricks they would be able to “invent” themselves, and I would be there to analyse the results of thousands of games in which those “inventions” were used. Those that work would be added to the regular AI’s arsenal, those that don’t would become a rule to avoid or a trap to lure opponents into, and those showing mixed performance would be studied in more detail to determine when they are best applied. When the game goes live on the Google Play Store, improving the AIs becomes easier: during the game every action is logged, and when the game is finished, the log, stripped of any personally identifiable information, is uploaded to Amazon S3 (the log is extremely compact and averages less than 6Kb per game). Once or twice a week I would run a Redshift cluster to analyse those logs: selecting the games with the worst outcomes for the AIs, grouping the sets of actions that led to an AI’s demise, and feeding those scenarios to my models, teaching them to respond to new tactics used against them. So, in a way, every player who beats the AIs in more than 25% of their games (playing WITH 3 AIs, not against them, as every AI treats a human player the same way as another AI) would be contributing to improving the AIs in the next update.
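The 25% figure is simply the share of wins each of 4 equally strong players would expect. As a rough illustration of the selection step, here is a minimal Python sketch (not the actual Redshift analysis). The `GameLog` record, its fields and the `"human"` label are hypothetical names invented for this example; the real log format is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameLog:
    """Hypothetical, heavily simplified summary of one uploaded game log."""
    game_id: str
    winner: str   # assumed labels: "human" or an AI name such as "ai_1"
    rounds: int   # number of rounds the game lasted


def human_win_rate(logs: List[GameLog]) -> float:
    """Fraction of games won by the human player (0.0 for an empty list)."""
    if not logs:
        return 0.0
    return sum(1 for g in logs if g.winner == "human") / len(logs)


def games_to_study(logs: List[GameLog], baseline: float = 0.25) -> List[GameLog]:
    """Return the games the AIs lost, but only for players who beat the
    1-in-4 baseline expected from four equally strong participants."""
    if human_win_rate(logs) <= baseline:
        return []
    return [g for g in logs if g.winner == "human"]
```

A player with 2 wins out of 4 games has a win rate of 0.5, so both of those wins would be selected for closer study; a player who never wins contributes nothing.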

Right now, a game with 4 AIs takes between 3 and 12 rounds to complete. The AIs are very good at price negotiations. They have no idea what the other party (be it an AI or a human player) would agree to, so they start with the best possible deal for themselves, which may or may not be the lowest deal they would accept. On top of that, the reserve price for every deal is slightly different each time an AI is asked to evaluate a trade. However, to make the AI look more “real” and to prevent players from getting the best possible deal by asking the AI to evaluate it repeatedly, each evaluation raises the AI’s annoyance level, which adds a small extra to its evaluation of the opponent’s hand. This annoyance works the same way against other AIs: there is not a single line of code in the game that penalizes a human or gives an AI priority with other AIs in any way, regardless of the difficulty level.
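To make the interplay between the varying reserve price and annoyance concrete, here is a minimal Python sketch. It is not the game’s code: the `Negotiator` class, the jitter range and the size of the annoyance step are all assumptions chosen for illustration, and annoyance is modelled simply as a growing premium on the reserve price.

```python
import random


class Negotiator:
    """Illustrative model: a reserve price that wobbles a little on every
    evaluation and creeps upward as the AI gets annoyed by repeated asks."""

    def __init__(self, base_reserve, jitter=0.05, annoyance_step=0.02, seed=None):
        self.base_reserve = base_reserve      # assumed "true" minimum price
        self.jitter = jitter                  # assumed +/- 5% random variation
        self.annoyance_step = annoyance_step  # assumed +2% per earlier evaluation
        self.annoyance = 0                    # number of evaluations so far
        self._rng = random.Random(seed)

    def evaluate(self, offer):
        """Return True if the offer meets the reserve price for this evaluation."""
        noise = self._rng.uniform(-self.jitter, self.jitter)
        premium = self.annoyance * self.annoyance_step
        reserve = self.base_reserve * (1 + noise + premium)
        self.annoyance += 1  # every evaluation, accepted or not, annoys the AI
        return offer >= reserve
```

Under these assumptions, repeatedly submitting the same borderline offer makes acceptance less likely over time rather than more, because the premium grows with every call to `evaluate`. The same object would be used whether the other party is a human or another AI.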

The AIs can accurately evaluate any deal thrown at them and, if it is not acceptable in its initial state, decide what would need to change to make it worth accepting. They can also initiate any possible deal themselves, with up to 4 cards plus money on each side of the table. The logic is extremely complex and evolves daily.
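The shape of such a deal can be sketched as a small data structure. The following Python is only an illustration of the “up to 4 cards plus money on each side” limit; the class names, field names and validation rules are hypothetical and not taken from the game.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_CARDS_PER_SIDE = 4  # from the text: up to 4 cards plus money on each side


@dataclass(frozen=True)
class TradeSide:
    """What one party puts on the table (hypothetical structure)."""
    cards: Tuple[str, ...] = ()
    money: int = 0

    def __post_init__(self):
        if len(self.cards) > MAX_CARDS_PER_SIDE:
            raise ValueError("at most 4 cards per side")
        if len(set(self.cards)) != len(self.cards):
            raise ValueError("a card cannot be listed twice")
        if self.money < 0:
            raise ValueError("money cannot be negative")


@dataclass(frozen=True)
class TradeOffer:
    """A proposed deal: what the proposer gives and what they receive."""
    gives: TradeSide
    receives: TradeSide

    def __post_init__(self):
        # The same card cannot be on both sides of the table.
        if set(self.gives.cards) & set(self.receives.cards):
            raise ValueError("a card cannot appear on both sides")
```

For example, `TradeOffer(TradeSide(("Park Place",), 100), TradeSide(("Boardwalk",)))` describes giving one card plus $100 in exchange for one card, while a side listing five cards is rejected at construction time.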

Apart from trading, every other aspect of the AIs’ behavior is straightforward. They use the best possible order of property development, mortgaging, unmortgaging and demolishing, which every professional player would find intuitive. For a beginner, watching the AIs’ actions is a great way to learn. Everything they do at the Monopolist level, except trading, is done in the most effective way possible. This only applies to the highest difficulty levels though, as the other difficulties have a very significant chance of forgetting to take the next action on a given turn, as well as being restricted to the most basic tactics. An additional feature of the game, something that has never been done by anyone else before even though the rules clearly state it is a valid move, is the ability to build, mortgage, unmortgage and demolish during someone else’s turn. The AIs are more than capable of using it too: just try proposing to sell a property to an AI on whose hotel you have just landed, and watch it build up its other properties instead of buying your card.

With the introduction of reduced income in prison comes another important decision: whether to sit in prison and be safe, or pay $50 and enjoy full rent on your properties. The AIs base this decision on how many properties are unoccupied, how many houses they have themselves, how many houses their opponents have built, and the percentage of rent they would receive while sitting in prison.
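A decision of this kind can be sketched as a simple comparison of what staying in prison costs against what leaving risks. The Python below is a hedged illustration only: the four inputs come from the paragraph above, but the function name, the linear scoring and every weight in it are assumptions made up for this example, not the game’s actual logic.

```python
def should_pay_to_leave_prison(
    unoccupied_properties,
    own_houses,
    opponent_houses,
    prison_rent_share,
    fine=50,
    rent_per_own_house=20,       # hypothetical weight
    risk_per_opponent_house=15,  # hypothetical weight
    value_per_unoccupied=10,     # hypothetical weight
):
    """Return True if paying the fine looks better than staying in prison.

    prison_rent_share is the fraction of normal rent (0.0 to 1.0) that the
    player still collects while in prison.
    """
    # Rent lost by staying in: only the share NOT collected in prison counts.
    lost_income = own_houses * rent_per_own_house * (1 - prison_rent_share)
    # Properties still up for grabs are a reason to get moving again.
    opportunity = unoccupied_properties * value_per_unoccupied
    # A developed board makes moving around dangerous.
    danger = opponent_houses * risk_per_opponent_house
    return lost_income + opportunity > danger + fine
```

With these made-up weights, an early-game position (10 unoccupied properties, 4 own houses, no opponent houses, 50% rent in prison) favors paying the $50, while a late-game position with 12 opponent houses and nothing left to buy favors staying put.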

In the next post I will describe the main factors that affect the price of a property. The AIs use a much more detailed version of it, with many additions that make the logic more dynamic and human-like, including irrationalities such as biases, sets of which form “personalities”. With thousands of possible personalities to choose from, the chance that 2 AIs would value a property identically, even with all other parameters being equal, is practically non-existent; as a result, the “reserve” price one AI uses for a given property will differ from the price another AI would use. However, understanding the basics from the Trading Guide would help to estimate the price ranges applicable in each situation.
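One way to picture a “personality” as a set of biases is as a handful of independent multipliers applied to a base price. The Python sketch below is purely illustrative: the bias names, the four levels per bias and the multiplicative model are all invented for this example. It does show how a few biases are enough to produce thousands of combinations (six biases with four levels each give 4^6 = 4096).

```python
import random
from math import prod

# Hypothetical biases, each with four hypothetical strength levels.
BIAS_LEVELS = {
    "set_completion": (0.9, 1.0, 1.1, 1.2),
    "cash_hoarding": (0.9, 1.0, 1.1, 1.2),
    "railroad_fondness": (0.9, 1.0, 1.1, 1.2),
    "risk_aversion": (0.9, 1.0, 1.1, 1.2),
    "cheap_street_dislike": (0.9, 1.0, 1.1, 1.2),
    "impatience": (0.9, 1.0, 1.1, 1.2),
}


def personality_count():
    """Number of distinct personalities: the product of the level counts."""
    return prod(len(levels) for levels in BIAS_LEVELS.values())


def random_personality(rng):
    """Pick one level per bias; the resulting dict is one personality."""
    return {name: rng.choice(levels) for name, levels in BIAS_LEVELS.items()}


def reserve_price(base_price, personality):
    """Scale a base price by every bias multiplier in the personality."""
    return base_price * prod(personality.values())
```

Two AIs created with `random_personality(random.Random())` start from the same base price but will usually end up with different reserve prices, which is the effect described above.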