ML-Agents PettingZoo Wrapper

Setup

python

#@title Install Rendering Dependencies { display-mode: "form" }
#@markdown (You only need to run this code when using Colab's hosted runtime)

import os
from IPython.display import HTML, display

def progress(value, max=100):
    return HTML("""
        <progress
            value='{value}'
            max='{max}'
            style='width: 100%'
        >
            {value}
        </progress>
    """.format(value=value, max=max))

pro_bar = display(progress(0, 100), display_id=True)

try:
  import google.colab
  INSTALL_XVFB = True
except ImportError:
  INSTALL_XVFB = 'COLAB_ALWAYS_INSTALL_XVFB' in os.environ

if INSTALL_XVFB:
  with open('frame-buffer', 'w') as writefile:
    writefile.write("""#taken from https://gist.github.com/jterrace/2911875
XVFB=/usr/bin/Xvfb
XVFBARGS=":1 -screen 0 1024x768x24 -ac +extension GLX +render -noreset"
PIDFILE=./frame-buffer.pid
case "$1" in
  start)
    echo -n "Starting virtual X frame buffer: Xvfb"
    /sbin/start-stop-daemon --start --quiet --pidfile $PIDFILE --make-pidfile --background --exec $XVFB -- $XVFBARGS
    echo "."
    ;;
  stop)
    echo -n "Stopping virtual X frame buffer: Xvfb"
    /sbin/start-stop-daemon --stop --quiet --pidfile $PIDFILE
    rm $PIDFILE
    echo "."
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  *)
        echo "Usage: /etc/init.d/xvfb {start|stop|restart}"
        exit 1
esac
exit 0
    """)
  pro_bar.update(progress(5, 100))
  !apt-get install daemon >/dev/null 2>&1
  pro_bar.update(progress(10, 100))
  !apt-get install wget >/dev/null 2>&1
  pro_bar.update(progress(20, 100))
  !wget http://security.ubuntu.com/ubuntu/pool/main/libx/libxfont/libxfont1_1.5.1-1ubuntu0.16.04.4_amd64.deb >/dev/null 2>&1
  pro_bar.update(progress(30, 100))
  !wget --output-document xvfb.deb http://security.ubuntu.com/ubuntu/pool/universe/x/xorg-server/xvfb_1.18.4-0ubuntu0.12_amd64.deb >/dev/null 2>&1
  pro_bar.update(progress(40, 100))
  !dpkg -i libxfont1_1.5.1-1ubuntu0.16.04.4_amd64.deb >/dev/null 2>&1
  pro_bar.update(progress(50, 100))
  !dpkg -i xvfb.deb >/dev/null 2>&1
  pro_bar.update(progress(70, 100))
  !rm libxfont1_1.5.1-1ubuntu0.16.04.4_amd64.deb
  pro_bar.update(progress(80, 100))
  !rm xvfb.deb
  pro_bar.update(progress(90, 100))
  !bash frame-buffer start
  os.environ["DISPLAY"] = ":1"
pro_bar.update(progress(100, 100))

Installing ml-agents

python

try:
  import mlagents
  print("ml-agents already installed")
except ImportError:
  !git clone -b main --single-branch https://github.com/Unity-Technologies/ml-agents.git
  !python -m pip install -q ./ml-agents/ml-agents-envs
  !python -m pip install -q ./ml-agents/ml-agents
  print("Installed ml-agents")

Run the Environment

List of available environments:

Basic
ThreeDBall
ThreeDBallHard
GridWorld
Hallway
VisualHallway
CrawlerDynamicTarget
CrawlerStaticTarget
Bouncer
SoccerTwos
PushBlock
VisualPushBlock
WallJump
Tennis
Reacher
Pyramids
VisualPyramids
Walker
FoodCollector
VisualFoodCollector
StrikersVsGoalie
WormStaticTarget
WormDynamicTarget

Start Environment with PettingZoo Wrapper

python

# -----------------
# This code is used to close an env that might not have been closed before
try:
  env.close()
except Exception:  # env is undefined or already closed; nothing to clean up
  pass
# -----------------

import numpy as np
from mlagents_envs.envs import StrikersVsGoalie # import unity environment
env = StrikersVsGoalie.env()
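
The cell above hard-codes one environment in the import line. To switch among the listed environments by name, a lookup helper along the following lines may be convenient. This is only a sketch: it assumes every listed name is exposed as an attribute of `mlagents_envs.envs` with an `env()` factory, which this notebook only demonstrates for StrikersVsGoalie. The names `AVAILABLE_ENVS`, `resolve_env_name`, and `make_env` are hypothetical and not part of ML-Agents.

```python
import importlib

# Names copied from the "List of available environments" above.
AVAILABLE_ENVS = (
    "Basic", "ThreeDBall", "ThreeDBallHard", "GridWorld", "Hallway",
    "VisualHallway", "CrawlerDynamicTarget", "CrawlerStaticTarget",
    "Bouncer", "SoccerTwos", "PushBlock", "VisualPushBlock", "WallJump",
    "Tennis", "Reacher", "Pyramids", "VisualPyramids", "Walker",
    "FoodCollector", "VisualFoodCollector", "StrikersVsGoalie",
    "WormStaticTarget", "WormDynamicTarget",
)


def resolve_env_name(name):
    """Return `name` unchanged if it is in the list above, else raise ValueError."""
    if name not in AVAILABLE_ENVS:
        raise ValueError(f"Unknown environment {name!r}; expected one of {AVAILABLE_ENVS}")
    return name


def make_env(name):
    """Hypothetical helper: build a PettingZoo-wrapped environment by name.

    Assumption: `mlagents_envs.envs` exposes each listed name as an attribute
    with an `env()` factory, as the import of StrikersVsGoalie suggests.
    """
    # Imported lazily so this cell can be defined before ml-agents is installed.
    envs_pkg = importlib.import_module("mlagents_envs.envs")
    return getattr(envs_pkg, resolve_env_name(name)).env()
```

With ml-agents installed, `env = make_env("StrikersVsGoalie")` would then stand in for the two lines at the end of the cell above.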

Stepping the environment

The following is an example of interacting with the environment in a basic RL loop. It follows the interface described on the PettingZoo API page.

python

num_cycles = 10

env.reset()
for agent in env.agent_iter(env.num_agents * num_cycles):
    prev_observe, reward, done, info = env.last()
    if isinstance(prev_observe, dict) and 'action_mask' in prev_observe:
        action_mask = prev_observe['action_mask']
    if done:
        action = None
    else:
        action = env.action_spaces[agent].sample() # sample a random action as a placeholder policy
    env.step(action)
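
The loop above reads `action_mask` from the observation but then samples without consulting it. One way to respect a mask is sketched below. It assumes a flat mask with one entry per discrete action, where a truthy entry means the action is allowed. The wrapper's actual layout may differ (for example one mask per action branch, or the opposite truth convention), so check it before relying on this. `sample_allowed_action` is a hypothetical helper, not part of ML-Agents or PettingZoo.

```python
import random


def sample_allowed_action(mask, rng=random):
    """Return a uniformly random index i for which mask[i] is truthy.

    Assumption: `mask` is a flat sequence with one entry per discrete
    action, and a truthy entry marks that action as allowed.
    """
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("The mask does not allow any action")
    return rng.choice(allowed)


# Usage with a hand-written mask: only actions 0 and 3 are allowed.
example_mask = [1, 0, 0, 1]
example_action = sample_allowed_action(example_mask, rng=random.Random(0))
print("Sampled action:", example_action)
```

Passing a seeded `random.Random` instance keeps the sampling reproducible between runs.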

Additional Environment API

All of the APIs described in the Additional Environment API section of the PettingZoo API page are supported. A few examples are shown below.

python

# `agents`: a list of the names of all current agents
print("Agent names:", env.agents)

python

# `agent_selection`: the agent that is currently selected to take an action.
print("Current agent:", env.agent_selection)

python

# `observation_spaces`: a dict of the observation spaces of every agent, keyed by name.
print("Observation space of current agent:", env.observation_spaces[env.agent_selection])

python

# `action_spaces`: a dict of the action spaces of every agent, keyed by name.
print("Action space of current agent:", env.action_spaces[env.agent_selection])

Close the Environment to free the port it is using

python

env.close()