<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->

--------------------------------------------------------------------------------

This code validates the advantages of flipping-based methods over traditional methods on the OmniSafe platform. It includes code for training the policy network, loading the trained results to obtain the dataset, and processing the dataset; the code for the same work with flipping-based methods; and the code for comparing the results of the two approaches. It also includes the policy network parameters used for training and the hyperparameter configurations used to obtain them.

--------------------------------------------------------------------------------

### Table of Contents  <!-- omit in toc --> <!-- markdownlint-disable heading-increment -->

- [Quick Start](#quick-start)
  - [Installation](#installation)
    - [Prerequisites](#prerequisites)
    - [Install from source](#install-from-source)
    - [Install from PyPI](#install-from-pypi)
- [Used Algorithms](#used-algorithms)
  - [Examples](#examples)
    - [Used Environments](#used-environments)
    - [Try with CLI](#try-with-cli)
- [License](#license)
- [Data process code](#data-process-code)

--------------------------------------------------------------------------------

## Quick Start

### Installation

We recommend setting up the experimental environment in three steps: install PyTorch, OmniSafe, and SciPy.

#### Prerequisites

Requires Python 3.8+ and PyTorch 1.10+.

> We support and test Python 3.8, 3.9, and 3.10 on Linux. macOS on M1 and M2 is also supported. We accept PRs related to Windows but do not officially support it.
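
The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of this repository: the helper names `version_tuple` and `check_prerequisites` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`.

```python
import re
import sys


def version_tuple(version: str) -> tuple:
    """Return the leading (major, minor) numbers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_prerequisites() -> list:
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < (3, 8):
        problems.append("Python 3.8+ is required")
        return problems  # importlib.metadata below needs Python 3.8+

    from importlib import metadata

    try:
        # Read the installed version without importing torch itself.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if version_tuple(torch_version) < (1, 10):
            problems.append(f"PyTorch 1.10+ is required, found {torch_version}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Comparing `(major, minor)` tuples rather than raw strings avoids the pitfall where `"1.9"` sorts after `"1.10"` as text.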

#### Install from source

```bash
# Clone the repo
git clone https://github.com/PKU-Alignment/omnisafe.git
cd omnisafe

# Create a conda environment
conda env create --file conda-recipe.yaml
conda activate omnisafe

# Install omnisafe
pip install -e .
```

#### Install from PyPI

OmniSafe is hosted on [![PyPI](https://img.shields.io/pypi/v/omnisafe?label=pypi&logo=pypi)](https://pypi.org/project/omnisafe) / ![Status](https://img.shields.io/pypi/status/omnisafe?label=status).

```bash
pip install omnisafe
```

## Used Algorithms

- **[ICML 2017]** [Constrained Policy Optimization (CPO)](https://proceedings.mlr.press/v70/achiam17a)
- **[ICLR 2020]** [Projection-Based Constrained Policy Optimization (PCPO)](https://openreview.net/forum?id=rke3TJrtPS)

--------------------------------------------------------------------------------

### Examples

```bash
cd examples
python main.py --algo CPO --env-id SafetyPointGoal2-v0 --parallel 1 --total-steps 10000000 --device cpu --vector-env-nums 1 --torch-threads 1
```
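
To run the same configuration for each algorithm listed under Used Algorithms, the command above can be generated programmatically. This is a minimal sketch rather than a script shipped with the repository: `build_command` is a hypothetical helper, and it assumes `PCPO` is accepted as an `--algo` value in the same way as `CPO`. It only prints the commands; launching them is left to the reader.

```python
import shlex


def build_command(algo, env_id="SafetyPointGoal2-v0", total_steps=10_000_000):
    """Build the argument list for examples/main.py with the flags used above."""
    return [
        "python", "main.py",
        "--algo", algo,
        "--env-id", env_id,
        "--parallel", "1",
        "--total-steps", str(total_steps),
        "--device", "cpu",
        "--vector-env-nums", "1",
        "--torch-threads", "1",
    ]


if __name__ == "__main__":
    # Assumption: both algorithms are trained with identical settings.
    for algo in ("CPO", "PCPO"):
        print(shlex.join(build_command(algo)))
```

Each printed line can be pasted into a shell from the `examples` directory, or the argument list can be passed to `subprocess.run` with `cwd="examples"`.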

#### Used Environments

The experiments use the `SafetyPointGoal2-v0` environment from [Safety-Gymnasium](https://www.safety-gymnasium.com).

#### Try with CLI

```bash
pip install omnisafe

omnisafe --help  # Ask for help

omnisafe benchmark --help  # 'benchmark' can also be replaced with 'eval', 'train', or 'train-config'

# Quick benchmarking for your research. Just specify:
# 1. exp_name
# 2. num_pool (how many processes run concurrently)
# 3. path of the config file (refer to omnisafe/examples/benchmarks for the format)

# An example is provided in ./tests/saved_source.
# You can write your own benchmark_config.yaml by following it.
omnisafe benchmark test_benchmark 2 ./tests/saved_source/benchmark_config.yaml

# Quickly evaluate and render your trained policy. Just specify:
# 1. the path of the algorithm you trained
omnisafe eval ./tests/saved_source/PPO-{SafetyPointGoal2-v0} --num-episode 1

# Quickly train an algorithm to validate your ideas
# Note: use `key1:key2` to select a hyperparameter nested inside another config group, and use `--custom-cfgs` to pass custom configs via the CLI
omnisafe train --algo CPO --total-steps 10000000 --vector-env-nums 1 --custom-cfgs algo_cfgs:steps_per_epoch --custom-cfgs 1024

# Quickly train an algorithm from a saved config file; the format is the same as the default format
omnisafe train-config ./tests/saved_source/train_config.yaml
```
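
The `key1:key2` notation in the `omnisafe train` command addresses a hyperparameter nested inside a config group. The sketch below illustrates the idea only and is not OmniSafe's own parsing code: `nest` and `merge` are hypothetical helpers, and the values in `defaults` are made up for the example.

```python
def nest(key_path, value):
    """Turn 'a:b:c' and a value into {'a': {'b': {'c': value}}}."""
    nested = value
    for key in reversed(key_path.split(":")):
        nested = {key: nested}
    return nested


def merge(base, override):
    """Recursively merge override into a copy of base, leaving base untouched."""
    merged = dict(base)
    for key, val in override.items():
        if isinstance(val, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], val)
        else:
            merged[key] = val
    return merged


# Hypothetical default values, for illustration only.
defaults = {"algo_cfgs": {"steps_per_epoch": 20000, "gamma": 0.99}}

# Equivalent of: --custom-cfgs algo_cfgs:steps_per_epoch --custom-cfgs 1024
custom = nest("algo_cfgs:steps_per_epoch", 1024)
print(merge(defaults, custom))
# {'algo_cfgs': {'steps_per_epoch': 1024, 'gamma': 0.99}}
```

Only the addressed key changes; sibling keys in the same group, such as `gamma` here, keep their default values.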

--------------------------------------------------------------------------------

## License

OmniSafe is released under Apache License 2.0.

## Data process code

The data processing is implemented as standalone scripts so that the logic behind each processing step is clear and visible.