# Dual Critic Reinforcement Learning under Partial Observability

This repository contains the source code for the paper "Dual Critic Reinforcement Learning under Partial Observability".

## Installation
Make sure you have Python 3.9+ installed. Install dependencies:
```
pip3 install -r requirements.txt
```
- We use `MiniGrid`, a Python-based environment package, for the MiniGrid games. Please see [https://github.com/Farama-Foundation/MiniGrid](https://github.com/Farama-Foundation/MiniGrid) for more details.
- We use `MiniWorld`, a Python-based environment package, for the MiniWorld games. Please see [https://github.com/Farama-Foundation/Miniworld](https://github.com/Farama-Foundation/Miniworld) for more details.
- We use `PyTorch` for training neural networks. Please see [https://pytorch.org/](https://pytorch.org/) for more details.
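
After installing, it can be useful to confirm that the three packages listed above are importable. The following is a minimal sketch, not part of this repository; the helper name `missing_packages` is hypothetical, and the import names `minigrid`, `miniworld`, and `torch` are assumed to match the installed distributions.

```python
import importlib.util


def missing_packages(modules=("minigrid", "miniworld", "torch")):
    """Return the names in `modules` that cannot be found in this environment.

    The default module names are assumptions about how the packages are imported.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All packages found.")
```

If any names are reported missing, re-running `pip3 install -r requirements.txt` in the active environment is the first thing to try.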

## Training
The training script is `train.py`. Run it by specifying the method (`DCRL`, `Recurrent Actor-Critic`, `Asymmetric Actor-Critic`, `Oracle Guiding`, `Unbiased Asymmetric Actor-Critic`) and the algorithm (`a2c` or `ppo`).

For example, to run a DCRL (A2C) experiment in the `MiniGrid-LavaCrossingS9N2-v0` environment:
```
python3 -m train \
    --method DCRL \
    --algo a2c \
    --seed 0 \
    --env-name MiniGrid-LavaCrossingS9N2-v0 \
    --total-frames 2000000
```
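
To repeat the same experiment over several seeds, the command can be assembled programmatically. The sketch below is an illustration rather than a script shipped with this repository: `build_train_command` is a hypothetical helper, the choice of three seeds is arbitrary, and only the flags shown in the example above are used.

```python
import shlex


def build_train_command(method, algo, seed, env_name, total_frames):
    """Return the argument list for one training run (hypothetical helper).

    Only the flags that appear in the README example are used.
    """
    return [
        "python3", "-m", "train",
        "--method", method,
        "--algo", algo,
        "--seed", str(seed),
        "--env-name", env_name,
        "--total-frames", str(total_frames),
    ]


if __name__ == "__main__":
    # Print one command per seed. The seed range is an arbitrary choice.
    for seed in range(3):
        cmd = build_train_command(
            "DCRL", "a2c", seed, "MiniGrid-LavaCrossingS9N2-v0", 2_000_000
        )
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run directly.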

To choose where wandb and local data are stored, set the following environment variables:
```
export WANDB_DIR=your_wandb_dir && export RL_STORAGE=your_local_dir
```
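
When launching runs from Python instead of a shell, the same two variables can be set on a copy of the environment. This is a minimal sketch under stated assumptions: `storage_env` is a hypothetical helper, and creating the directories up front is a convenience added here, not something the repository is known to require.

```python
import os
import tempfile
from pathlib import Path


def storage_env(wandb_dir, rl_storage, base_env=None):
    """Return a copy of the environment with WANDB_DIR and RL_STORAGE set.

    Directories are created if missing (an assumption made for convenience).
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in (("WANDB_DIR", wandb_dir), ("RL_STORAGE", rl_storage)):
        path = Path(value).expanduser().resolve()
        path.mkdir(parents=True, exist_ok=True)
        env[name] = str(path)
    return env


if __name__ == "__main__":
    # Demonstrate with a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        env = storage_env(Path(tmp) / "wandb", Path(tmp) / "storage")
        print(env["WANDB_DIR"])
        print(env["RL_STORAGE"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` so that the variables apply only to the launched process.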

For further training options, see the documentation printed by the `--help` flag:
```
python3 -m train --help
```

## Visualization
By default, all training metrics (such as average returns) are recorded with `TensorBoard` in the `runs` folder. To visualize them, run:
```
tensorboard --logdir runs
```
You can also track experiments with `Wandb` using the following commands:
```
wandb login # only required for the first time
python3 -m XXX \
    --track \
    --wandb-project-name proj
```

## Acknowledgment
This code implementation is largely based on the RL Starter Files library ([https://github.com/lcswillems/rl-starter-files](https://github.com/lcswillems/rl-starter-files)).