This repository has been archived by the owner on Jul 27, 2024. It is now read-only.

add reference to Facets Overview spark project
jameswex authored Sep 4, 2018
1 parent 8f80597 commit 71f63d2
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions facets_overview/README.md
@@ -39,6 +39,11 @@ import pandas as pd
df = pd.DataFrame({'num' : [1, 2, 3, 4], 'str' : ['a', 'a', 'b', None]})
proto = GenericFeatureStatisticsGenerator().ProtoFromDataFrames([{'name': 'test', 'table': df}])
```

## Large Datasets

The Python code in this repository for generating feature stats only works on datasets small enough to fit in memory on your local machine. For distributed generation of feature stats on large datasets, see the independently developed [Facets Overview Spark project](https://github.com/gopro/facets-overview-spark).
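
If a distributed job is more than you need, one possible workaround is to down-sample the data until it fits in memory. The sketch below is only an illustration under stated assumptions: `sample_csv`, its parameters, and the in-memory CSV are hypothetical and not part of this repository. It reads a CSV in chunks with pandas and keeps a random fraction of each chunk.

```python
import io

import numpy as np
import pandas as pd


def sample_csv(source, frac=0.1, chunksize=1000, seed=0):
    """Hypothetical helper: read a CSV in chunks and keep a random
    fraction of the rows in each chunk, so the full file is never
    held in memory at once."""
    rng = np.random.RandomState(seed)  # shared state, so chunks are sampled differently
    sampled = [
        chunk.sample(frac=frac, random_state=rng)
        for chunk in pd.read_csv(source, chunksize=chunksize)
    ]
    return pd.concat(sampled, ignore_index=True)


# Stand-in for a large file on disk (assumption: a real path would go here).
csv_text = "num,str\n" + "\n".join(
    f"{i},{'a' if i % 2 else 'b'}" for i in range(10_000)
)
df = sample_csv(io.StringIO(csv_text), frac=0.1, chunksize=1000)
```

The resulting `df` can then be passed to `ProtoFromDataFrames` in the same way as the small DataFrame in the example above. Stats computed on a sample are approximations of the stats for the full dataset.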

# Visualization

A proto can easily be visualized in a Jupyter notebook using the installed nbextension.