The code is in three directories:


1. adult

Logistic regression (with bounded loss) on the UCI Adult dataset,
corresponding to Sections 6 (Optimization) and 7.1 (Optimization
experiments) and Appendix G.1 (training details).

Download the files "adult.data" and "adult.test" from
	https://archive.ics.uci.edu/ml/machine-learning-databases/adult/
and run all the cells in the two Jupyter notebooks. The
adult_linear_regresssion notebook contains experiments with perfect joins for
different values of epsilon and different numbers of hash buckets. The
adult_linear_regresssion_different_join_sizes notebook contains experiments
at fixed epsilon = 1 with different join sizes and numbers of hash buckets.
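
Before running the notebooks, it can help to check that the two downloaded
files parse cleanly. The following is a minimal sketch, not the loading code
used in the notebooks. It assumes the standard UCI layout: comma-separated
fields with the 15 columns listed in the dataset's adult.names file, a non-data
first line in adult.test, and a trailing period on the labels in adult.test.
The helper names parse_adult and load_adult are hypothetical.

```python
import csv

# Column names as listed in the dataset's adult.names file.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]


def parse_adult(lines):
    """Parse lines from adult.data or adult.test into a list of dicts.

    Hypothetical helper for a quick sanity check; values are kept as strings.
    """
    rows = []
    for fields in csv.reader(lines, skipinitialspace=True):
        # Skip blank lines and the non-data first line of adult.test,
        # neither of which has the expected number of fields.
        if len(fields) != len(COLUMNS):
            continue
        row = dict(zip(COLUMNS, (field.strip() for field in fields)))
        # Labels in adult.test end with a period (">50K."); remove it so
        # both files use the same label strings.
        row["income"] = row["income"].rstrip(".")
        rows.append(row)
    return rows


def load_adult(path):
    """Read one of the downloaded files from the given path."""
    with open(path, newline="") as f:
        return parse_adult(f)
```

For example, load_adult("adult.data") returns one dict per record; the path is
an assumption and depends on where the files were saved.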

Each notebook creates a pickled log file containing the results collected
during training.
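
A pickled log can be inspected outside the notebooks with the standard pickle
module. The sketch below is an illustration only: the file name and the
structure of the stored object are not specified here, so load_log is a
hypothetical helper that returns whatever object the notebook saved. Unpickling
can execute arbitrary code, so only load log files you produced yourself.

```python
import pickle


def load_log(path):
    """Return the object stored in a pickled log file.

    Hypothetical helper; the type of the returned object depends on what
    the notebook wrote (for example a dict or a list of results).
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj):
    """Give a one-line summary of a loaded log object."""
    if isinstance(obj, dict):
        return "dict with keys: " + ", ".join(sorted(map(str, obj)))
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} with {len(obj)} entries"
    return type(obj).__name__
```

For example, describe(load_log("results.pkl")) summarizes the contents, where
"results.pkl" is a placeholder for the file name the notebook actually uses.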


2. EMNIST

Classifying handwritten digits. Corresponds to Sections 6 (Optimization), 7.1
(Optimization experiments), 7.2 (Comparison with other sketch implementations),
and Appendix G.2 (training details).

Run all the cells in the three Jupyter notebooks. The EMNIST dataset is
downloaded automatically.

The EMNIST_perfect_joins notebook contains experiments with perfect joins for
different values of epsilon and different numbers of hash buckets. The
EMNIST_different_join_sizes notebook contains experiments at fixed epsilon = 1
with different join sizes and numbers of hash buckets.

The Baseline_Zhao notebook compares our method with that of Zhao et al. Run it
only after EMNIST_perfect_joins has finished, because Baseline_Zhao needs the
results of EMNIST_perfect_joins to plot the figure comparing the two methods.
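
If you prefer to run the notebooks non-interactively, the required order can be
enforced with a small script. This is a sketch under assumptions, not part of
the repository: it assumes the notebook files are named as above with an
".ipynb" extension, that they live in the EMNIST directory, and that the
"jupyter nbconvert" command is installed. The names nbconvert_command and
run_all are hypothetical.

```python
import subprocess

# Assumed file names: the notebook names from this README plus ".ipynb".
# EMNIST_perfect_joins comes before Baseline_Zhao, which needs its results.
EMNIST_NOTEBOOKS = [
    "EMNIST_perfect_joins.ipynb",
    "EMNIST_different_join_sizes.ipynb",
    "Baseline_Zhao.ipynb",
]


def nbconvert_command(notebook):
    """Build the command that executes a notebook and saves it in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_all(notebooks=EMNIST_NOTEBOOKS, cwd="EMNIST"):
    """Execute the notebooks one after another, stopping at the first failure.

    The working directory "EMNIST" is an assumption about the layout.
    """
    for notebook in notebooks:
        subprocess.run(nbconvert_command(notebook), cwd=cwd, check=True)
```

Calling run_all() from the repository root executes the three notebooks in
order; because of check=True, a failure in EMNIST_perfect_joins prevents
Baseline_Zhao from running on missing results.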

Each notebook creates a pickled log file containing the results collected
during training.


3. joint_distribution

General code implementing our techniques. In particular, this directory
contains an experiment that reconstructs a joint distribution, corresponding to
Section 5 (Linear queries) and Appendix F (Experiments with linear queries). To
reproduce that experiment, follow Steps 1 and 2 in the README file in the
joint_distribution directory.
