This project contains an example application that reads a dataset from HDFS and presents it to the user in graphical form.
Consider the following flow:
- A dataset is uploaded into the platform through the data catalog. The file is stored on HDFS.
- A data scientist analyzes it using ATK. The result is also stored on HDFS.
- An application developer uploads the dataset-reader application into the platform and binds it to the file.
- Dataset-reader presents the dataset as a set of charts.
- Clone this repository
git clone https://github.com/trustedanalytics/dataset-reader-sample.git
- Compile it using Maven
mvn compile
- (optional) Run it locally, passing the path to the file
FILE=<path_to_the_file> mvn spring-boot:run -Dspring.profiles.active=local
- Build the Java package
mvn package
- Log in and target the proper organization and space
cf api <platform API address>
cf login
cf target -o <organization name> -s <space name>
- (optional) If necessary, change the application name and host name in manifest.yml
name: <your application name>
host: <application host name>
ℹ️ For example, if you set host to "dataset-reader" and your platform URL is "example.com", the application will be hosted at "dataset-reader.example.com".
- Push dataset-reader to the platform
cf push
- The application will start but won't show anything, because it doesn't know which file to serve. To fix that, pass the path to the file on HDFS as an environment variable called "FILE":
cf set-env <application name> FILE <path to file on HDFS>
cf restart <application name>
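
The last step can be wrapped in a small shell helper. This is only a sketch: `bind_file` and the `DRY_RUN` switch are hypothetical names introduced here for illustration and are not part of this repository. The helper runs the same `cf set-env` and `cf restart` commands shown above.

```shell
#!/bin/sh
# Sketch: point a deployed dataset-reader instance at a file on HDFS.
# bind_file and DRY_RUN are hypothetical, not part of the repository.
# With DRY_RUN=1 the cf commands are printed instead of executed.

bind_file() {
    app_name=${1:-}
    hdfs_path=${2:-}

    # Both arguments are required; fail early with a usage message.
    if [ -z "$app_name" ] || [ -z "$hdfs_path" ]; then
        echo "usage: bind_file <application name> <path to file on HDFS>" >&2
        return 1
    fi

    # In dry-run mode, prefix each command with echo so it is only printed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run=echo
    else
        run=
    fi

    # Set the FILE variable, then restart so the application picks it up.
    $run cf set-env "$app_name" FILE "$hdfs_path" &&
    $run cf restart "$app_name"
}
```

After sourcing the file, call `bind_file <application name> <path to file on HDFS>`. Setting `DRY_RUN=1` first lets you check the commands before running them against the platform.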