mohsenasm/Python-Spark-Log-Parser

Spark-Log-Parser

Setup

For Debian-based systems (such as Ubuntu):

sudo apt-get install graphviz
pip3 install -r requirements.txt

For macOS:

brew install graphviz
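
A quick way to confirm that the Graphviz install worked is to check that its `dot` executable is on the PATH. The snippet below is only a sketch: it assumes the parser relies on the `dot` binary to render the stage DAGs, which this README does not state explicitly, and the function name `graphviz_available` is a hypothetical helper, not part of this repository.

```python
import shutil


def graphviz_available(executable: str = "dot") -> bool:
    """Return True if the given Graphviz executable can be found on PATH."""
    # shutil.which returns the full path of the executable, or None if absent.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if graphviz_available():
        print("Graphviz found at:", shutil.which("dot"))
    else:
        print("Graphviz 'dot' was not found on PATH; install it with apt-get or brew.")
```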

Usage

python3 main.py <log_dir>

The reports and stage DAGs are then stored in the output directory.
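
If the parser needs to be called from another Python script, the command above can be wrapped with `subprocess`. This is a minimal sketch under assumptions: `build_command` and `run_parser` are hypothetical helper names, the script is assumed to be run from the repository root so that `main.py` resolves, and the check that the log directory exists is an added convenience rather than documented behavior of the parser.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_command(log_dir: str, script: str = "main.py") -> List[str]:
    """Build the argument list equivalent to `python3 main.py <log_dir>`."""
    # sys.executable reuses the interpreter that is running this wrapper.
    return [sys.executable, script, str(Path(log_dir))]


def run_parser(log_dir: str, script: str = "main.py") -> int:
    """Run the parser on one log directory and return its exit code."""
    if not Path(log_dir).is_dir():
        raise NotADirectoryError(f"{log_dir} is not a directory")
    return subprocess.run(build_command(log_dir, script), check=False).returncode
```

For example, `run_parser("spark-history-directory")` would run the parser on that directory and return the exit code of the process.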

Use with Docker

First, change to the parent directory of spark-history-directory, then run:

alias spark-parser='docker run -ti --rm -v `pwd`:/files mohsenasm/python-spark-log-parser'
spark-parser spark-history-directory

Then, to fix the permissions on the output files, run:

sudo chown -R $USER:$USER parser_output/
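
Shell aliases are usually not expanded in non-interactive scripts, so it can help to build the same `docker run` invocation programmatically. The sketch below mirrors the flags and image name from the alias above; `build_docker_command` and `run_in_docker` are hypothetical helper names. Note that the `-ti` flags are kept as in the alias and expect an interactive terminal.

```python
import os
import subprocess
from typing import List

# Image name taken from the alias shown above.
IMAGE = "mohsenasm/python-spark-log-parser"


def build_docker_command(history_dir: str, workdir: str = "") -> List[str]:
    """Build the `docker run` argument list that the alias expands to."""
    # The alias mounts the current directory (`pwd`) at /files in the container.
    host_dir = workdir or os.getcwd()
    return [
        "docker", "run", "-ti", "--rm",
        "-v", f"{host_dir}:/files",
        IMAGE,
        history_dir,
    ]


def run_in_docker(history_dir: str, workdir: str = "") -> int:
    """Run the parser container and return its exit code."""
    return subprocess.run(build_docker_command(history_dir, workdir)).returncode
```

Calling `run_in_docker("spark-history-directory")` from the parent directory of the history directory corresponds to the `spark-parser spark-history-directory` command.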

A Python script for parsing Spark logs.