# Earthformer

## Installation
### Mandatory Packages
```bash
python3 -m pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
python3 -m pip install pytorch_lightning==1.5.6 einops omegaconf opencv-python
```
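To confirm that the environment matches the pins above, a small check along these lines may help. It is a minimal sketch using only the standard library; the `PINNED` table and the `check_pins` helper are illustrative names, not part of Earthformer, and local build tags such as `+cu111` are ignored when comparing.

```python
from importlib import metadata

# Versions copied from the pip commands above (hypothetical helper table).
PINNED = {
    "torch": "1.8.1",
    "torchvision": "0.9.1",
    "pytorch-lightning": "1.5.6",
}


def check_pins(pinned):
    """Map each package name to a (wanted, found, matches) tuple."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        # Drop a local build tag like "+cu111" before comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_pins(PINNED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found} [{status}]")
```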

### Apex Installation
Install [NVIDIA Apex](https://github.com/NVIDIA/apex) to use its DDP implementation, which allows gradient checkpointing to be combined with DDP training.
If you skip Apex, disable `ApexDDPPlugin` and set `strategy="ddp"` in the `pytorch_lightning` `Trainer()` instead.
```bash
CUDA_HOME=/usr/local/cuda pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" pytorch-extension git+https://github.com/NVIDIA/apex.git
```
See the detailed instructions in the [official GitHub repo](https://github.com/NVIDIA/apex).
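The fallback to `strategy="ddp"` can be chosen at runtime by checking whether Apex is importable. The sketch below uses only the standard library; `apex_installed` and `ddp_overrides` are hypothetical helper names, and how the training script wires in `ApexDDPPlugin` is not shown here.

```python
import importlib.util


def apex_installed() -> bool:
    """Return True if the `apex` package can be imported."""
    return importlib.util.find_spec("apex") is not None


def ddp_overrides() -> dict:
    """Extra Trainer keyword arguments to apply when Apex is missing."""
    if apex_installed():
        # Apex is available: keep the existing ApexDDPPlugin setup unchanged.
        return {}
    # Apex is missing: fall back to the built-in DDP strategy.
    return {"strategy": "ddp"}


if __name__ == "__main__":
    # Intended use (assumes pytorch_lightning is installed):
    #   trainer = pytorch_lightning.Trainer(gpus=1, **ddp_overrides())
    print(ddp_overrides())
```

When Apex is missing, remember to also remove `ApexDDPPlugin` from the plugins passed to `Trainer()`.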

### Earthformer
```bash
cd ROOT_DIR/Earthformer
python3 -m pip install -U -e . --no-build-isolation
```

## Generate N-Body MNIST Dataset
```bash
cd ROOT_DIR/Earthformer
python ./scripts/generate_nbody_dataset.py
```

## Earthformer Training
Run the following command to train Earthformer. To change the training configuration, edit [cfg.yaml](./scripts/cfg.yaml).
```bash
cd ROOT_DIR/Earthformer
MASTER_ADDR=localhost MASTER_PORT=10001 python ./scripts/train_cuboid_nbody.py --gpus 1 --cfg ./scripts/cfg.yaml --ckpt_name last.ckpt --save tmp_train
```
Run the following TensorBoard command to upload the experiment records:
```bash
cd ROOT_DIR/Earthformer
tensorboard dev upload --logdir ./experiments/tmp_train/lightning_logs --name 'tmp_train'
```
