Getting Started
First steps
Run an experiment by launching a manager
python manager/manager.py
You can monitor the progress of your experiment in real-time with a tensorboard
tensorboard --logdir results
Experiment Configuration
An experiment is parametrized by a configuration dictionary.
{
"total_timesteps": 1_000_000, # experiment is run until this amount of experience samples is collected
"test_interval": 10_000, # samples after which the agent performance is tested
"number_tests": 100, # amount of episodes to be run for a test
"her_ratio": 0., # amount of data to be generate with hindsight experience replay relative to amount of collected samples. The generated data is added on top of the collected data.
"number_processes": 1, # amount of parallel processes in which environments are launched
"number_threads": 1, # amount of environments launched per process
"agent_config": {...}, # agent configuration, see seperate subchapter
"env_config": {...} # environment configuration, see seperate subchapter
}
The agent configuration agent_config
is specific to the agent you want to use. Please refer to the chapter Agents for specifics.
Likewise, the environment configuration agent_config
is specific to the environment you want to use. Please refer to the chapter Environments for specifics.
This framework uses Pytorch applied to Pybullet simulation environments composed of a robot and a task which can be arbitrarily combined. Run the minimal configuration with an algorithm and task/robot combination of your choice. Soft-Actor-Critic with reach task on pandas robot should converge after about 5 million steps to a successful policy. See architecture doc for further description of the components and contributing if you'd like to contribute a feature, robot, task or new agent.