# Scaling Certified Adversarial Robustness with Generated Data

We investigate how certified adversarial robustness scales with generated data by training four existing models (l-infty-dist Net, SortNet, LOT, and GloroNet) that currently achieve state-of-the-art results. We adapted their original code releases so that each data loader also supports additional generated data. The generated data is stored in `.npz` files and taken from the data release of [Wang et al.](https://github.com/wzekai99/DM-Improves-AT).

| Original Implementation                                    | Ours      |
| ---------------------------------------------------------- | --------- |
| https://github.com/zbh2047/SortNet                         | sortnet-x |
| https://github.com/ai-secure/layerwise-orthogonal-training | lot-x     |
| https://github.com/klasleino/gloro                         | gloro-x   |

Several of the smaller datasets were created by subsampling larger ones: the 50k, 100k, 200k, 500k, and 1m datasets for CIFAR-10, and the 1m and 5m datasets for CIFAR-100. The two scripts `subsample_cifar10.py` and `subsample_cifar100.py` may be used to recreate these subsampled datasets. Subsampling was performed on compute nodes with 2x AMD Rome 7502 @2.5 GHz CPUs and 512 GB RAM.
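As a rough illustration of what such a subsampling step involves, the following sketch draws a fixed-size random subset from arrays of images and labels while keeping them aligned. It is not the code of `subsample_cifar10.py`. The function name `subsample`, the seed handling, and the array layout are assumptions made for this example, as are the `.npz` keys `image` and `label` mentioned in the comments.

```python
import numpy as np


def subsample(images, labels, n, seed=0):
    """Draw n samples without replacement, keeping images and labels aligned."""
    if n > len(images):
        raise ValueError("cannot draw more samples than the source contains")
    rng = np.random.default_rng(seed)
    # Sorting the indices preserves the original ordering of the kept samples.
    idx = np.sort(rng.choice(len(images), size=n, replace=False))
    return images[idx], labels[idx]


# Tiny in-memory stand-in for a generated dataset
# (assumed layout: N x 32 x 32 x 3, uint8).
images = np.zeros((1000, 32, 32, 3), dtype=np.uint8)
labels = np.arange(1000) % 10

sub_images, sub_labels = subsample(images, labels, n=100)
print(sub_images.shape, sub_labels.shape)

# With a real file one would load and save along these lines
# (the key names are an assumption, check them via np.load(path).files):
#   data = np.load("1m.npz")
#   x, y = subsample(data["image"], data["label"], n=500_000)
#   np.savez("500k.npz", image=x, label=y)
```

Fixing the seed makes the subset reproducible, which matters when several runs are meant to share the same subsampled dataset.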

## Model Training

As our original training scripts are specific to each compute environment, we summarize the Python calls used to train each individual model below. For required dependencies, we refer to each individual repository. See the paper and its appendix for configuration details.

To launch **l-infty-dist Net** with parameters `$1` (amount of generated data: None, 50k, 100k, 200k, 500k, 1m, 5m, 10m), `$2` (fraction of generated data, default 0.7), and `$3` (number of epochs, default 800):

```bash
python main.py --dataset CIFAR10 --auxiliary-dir "$WORK/aux/cifar10/" --auxiliary $1 --fraction $2 --model "SortMLPModel(depth=6,width=5120,identity_val=10.0,scalar=False,dropout=1.0)" --loss 'hinge' --p-start 8 --p-end 1000 --epochs 0,0,$(($3/8)),$(($3-$3/16)),$3 --eps-test 0.03137 --eps-train 0.1569 -b 512 --lr 0.02 --gpu 0
```

To launch **SortNet** with parameters `$1` (amount of generated data: None, 50k, 100k, 200k, 500k, 1m, 5m, 10m), `$2` (fraction of generated data, default 0.7), `$3` (dropout rate), and `$4` (number of epochs, default 3000):

```bash
python main.py --dataset CIFAR10 --auxiliary-dir "$WORK/aux/cifar10/" --auxiliary $1 --fraction $2 --model "SortMLPModel(depth=6,width=5120,scalar=True,dropout=$3)" --loss "mixture(lam0=0.2,lam_end=0.002)" --p-start 8 --p-end 1000 --epochs 0,0,$(($4/15)),$(($4-$4/60)),$4 --eps-test 0.03137 --eps-train 0.09411 -b 512 --lr 0.02 --wd 0.02 --gpu 0
```
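The `--epochs` argument in the two commands above is built with shell integer arithmetic, which can be hard to read at a glance. The short sketch below reproduces that arithmetic in Python for the default epoch counts. The helper name `epoch_schedule` and its parameter names are made up for this illustration; what each of the five milestones means is defined by the training code in the SortNet repository, not here.

```python
def epoch_schedule(total, first_div, last_div):
    """Mirror the shell expression 0,0,$((T/a)),$((T-T/b)),T (integer division)."""
    return [0, 0, total // first_div, total - total // last_div, total]


# l-infty-dist Net command: divisors 8 and 16, default 800 epochs.
linf_dist = epoch_schedule(800, 8, 16)
# SortNet command: divisors 15 and 60, default 3000 epochs.
sortnet = epoch_schedule(3000, 15, 60)

print(",".join(map(str, linf_dist)))  # 0,0,100,750,800
print(",".join(map(str, sortnet)))    # 0,0,200,2950,3000
```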

To launch **LOT** with parameters `$1` (amount of generated data: None, 50k, 100k, 200k, 500k, 1m, 5m, 10m), `$2` (fraction of generated data, default 0.7), `$3` (learning-rate scheduler: multi-step or one-cycle), `$4` (number of epochs, default 200), and `$5` (number of blocks):

```bash
python train_robust.py --auxiliary-dir "$WORK/aux/cifar10/" --auxiliary $1 --fraction $2 --scheduler $3 --lr-max 0.1 --epochs $4 --conv-layer lot --activation hh1 --block-size $5 --dataset cifar10 --gamma 0.5 --opt-level O0 --residual
```

To launch **GloroNet** with parameters `$1` (amount of generated data: None, 50k, 100k, 200k, 500k, 1m, 5m, 10m), `$2` (fraction of generated data, default 0.7), `$3` (number of epochs, default 800), `$4` (depth), and `$5` (width):

```bash
python train.py --auxiliary-dir "$WORK/aux/cifar10/" --auxiliary $1 --fraction $2 --config='configs/cifar10.yaml' --epochs $3 --depth $4 --width $5
```
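All four commands above take a `--fraction` argument for the share of generated data. One common way to realize such a fraction is to split every mini-batch between original and generated samples. The sketch below shows that arithmetic only. It assumes per-batch mixing and rounding down, neither of which is confirmed here for the adapted data loaders, and `split_batch` is a hypothetical helper.

```python
def split_batch(batch_size, fraction):
    """Split a batch into (original, generated) counts for a given fraction.

    Assumption: the generated share is rounded down and the remainder
    is filled with original training samples.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    n_generated = int(batch_size * fraction)
    return batch_size - n_generated, n_generated


# Batch size 512 as in the SortNet commands, default fraction 0.7.
n_original, n_generated = split_batch(512, 0.7)
print(n_original, n_generated)  # 154 358
```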

To summarize multiple runs of a model in a `.csv` file you may use the `summarize_sortnet.py`, `summarize_lot.py`, and `summarize_gloro.py` scripts. Each takes exactly one argument `-d` pointing to the directory containing the respective result files or folders.

## Figures & Tables

The folder `cert-robust` includes scripts for reproducing the figures and tables given in the paper. It also ships with all result files required to generate the figures and tables, most importantly `sortnet-results.csv`, `lot-results.csv`, and `gloro-results.csv`. To print tables and create figures (as both PNG and PDF), execute the following:

```bash
python table.py
python plot.py
```

To install required dependencies, call `pip install -r requirements.txt` or manually install the packages `pandas`, `numpy`, `scikit-learn`, `matplotlib`, and `seaborn`.