Correlated-Sequential-Halving

Finds the medoid of n points efficiently, approximately O(n log^2 n): brute force takes O(n^2) time.

This is a codebase to reproduce all the figures and numerical results in the paper titled - "Ultra Fast Medoid Identification via Correlated Sequential Halving". Please reach out to me at tavorb "at" stanford.edu if you have any questions on how to run the code or replicate results.

All the figures can be viewed and generated via ipython notebooks in 'figure' folder
The above figures are generated from experiments, which can be re-generated using the following code

python algorithm_rand.py --dataset *** --num_exp 1000 --num_jobs 32 --verbose False
python algorithm_brute.py --dataset *** --num_exp 1 --num_jobs 1 --verbose False
python algorithm_meddit.py --dataset *** --num_exp 1000 --num_jobs 32 --verbose False
python algorithm_correlated.py --dataset *** --num_exp 1000 --num_jobs 32 --verbose False
- budget can be modified by changing valRange in algorithm_correlated.py. Can perform 'doubling trick' to search and find a good budget.

With the following options:

dataset - name of the dataset (rnaseq20k, netflix20k, netflix100k, mnist)
- netflix dataset not included on github due to size, downloadable at https://www.kaggle.com/netflix-inc/netflix-prize-data
- MNIST command deprecated
num_exp - Number of total experiments
num_jobs - Number of experiments run parallely

Dependencies: tables and h5py are used in loading rnaseq data, can be installed with python -m pip install h5py. Code is compatible with both Python 2 and 3.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
datasets		datasets
figure		figure
scripts		scripts
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Correlated-Sequential-Halving

Finds the medoid of n points efficiently, approximately O(n log^2 n): brute force takes O(n^2) time.

About

Releases

Packages

Contributors 2

Languages

TavorB/Correlated-Sequential-Halving

Folders and files

Latest commit

History

Repository files navigation

Correlated-Sequential-Halving

Finds the medoid of n points efficiently, approximately O(n log^2 n): brute force takes O(n^2) time.

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages