Starcraft 2 updating blizzard update agent

Over the previous two decades, StarCraft I and II have been pioneering and enduring e-sports, with millions of casual and highly competitive professional players. Defeating top human players therefore becomes a meaningful and measurable long-term objective. PySC2, part of a collaboration between DeepMind and Blizzard to develop StarCraft II into a rich environment for RL research, provides an interface for RL agents to interact with StarCraft II, getting observations and rewards and sending actions. Playing the whole game is quite an ambitious goal that is currently only within the reach of scripted agents.
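The observation, reward and action cycle mentioned above can be sketched with a toy environment. This is only an illustration of the interaction pattern: `ToyEnv`, `run_episode` and the policy functions are hypothetical names invented for this sketch and are not part of the PySC2 API, whose real environment returns much richer observations.

```python
class ToyEnv:
    """Hypothetical stand-in environment (NOT the PySC2 API).

    The agent moves along a line and is rewarded for reaching a target cell.
    """

    def __init__(self, target=3, max_steps=10):
        self.target = target
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Apply the action (-1 or +1), then report observation, reward, done.
        self.position += action
        self.steps += 1
        reached = self.position == self.target
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self.position, reward, done


def run_episode(env, policy):
    """Run one episode: observe, choose an action, send it, collect the reward."""
    obs = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, done = env.step(action)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    print(run_episode(ToyEnv(), lambda obs: 1))   # policy that always moves right
    print(run_episode(ToyEnv(), lambda obs: -1))  # policy that always moves left
```

An RL agent would replace the fixed lambda policies with a learned mapping from observations to actions, while the surrounding loop keeps the same shape.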

In typical real-time strategy games, players build armies and vie for control of the battlefield. Moreover, we want to experiment with the reward system to see how several changes may influence the behaviour of the agent. That's why we can define our objectives by:

Before starting to train the SC2 agents, we went through a series of tutorials that implement different RL algorithms in TensorFlow and apply them to the OpenAI Gym environment.

A3C works by updating the policy using the so-called advantage. The advantage is an estimate of how much increasing the probability of executing an action in a given state would improve or worsen the long-term reward.
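As a rough illustration of the advantage, the sketch below uses one common estimator: the discounted n-step return minus the critic's value estimate for the same state. The function names, the discount factor `gamma` and the sample numbers are assumptions made for this example; they are not taken from the tutorials or from any specific A3C implementation.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted return for every step of a rollout.

    bootstrap_value is the critic's estimate for the state after the last
    step (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage estimate: observed discounted return minus predicted value."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


if __name__ == "__main__":
    # Illustrative numbers only: three steps, critic predicting 0.5 everywhere.
    print(advantages([1.0, 0.0, 1.0], [0.5, 0.5, 0.5], 0.0, gamma=0.5))
```

A positive advantage means the action did better than the critic expected, so the policy update raises its probability; a negative advantage lowers it.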

