AzFinSim is a simple Python application for synthetic risk simulation. It is forked from https://github.com/mkiernan/azfinsim, with all Azure infrastructure code removed, so that it is structured as a standard Python application for risk analysis.
AzFinSim is Python-based and requires Python 3.8 or newer on the workstation. It can be installed using pip. A virtual environment is recommended to avoid clobbering your Python environment. Alternatively, you can use the Docker-based approach described at the end of this document.
# clone repository
git clone https://github.com/utkarshayachit/azfinsim.git
cd azfinsim
# create virtual environment
python3 -m venv env0
# activate virtual environment
source env0/bin/activate
# upgrade pip
python3 -m pip install --upgrade pip
# install azfinsim (-e is optional)
python3 -m pip install -e .
# validate installation
python3 -m azfinsim.azfinsim --help
# this should generate output as follows:
usage: azfinsim [-h] [--config CONFIG] [--verbose VERBOSE] ...
...
# to exit virtual environment use the following
deactivate
Note: Don't forget to activate the virtual environment before trying any of the tools with the commands described in the following sections.
The azfinsim package includes the following tools as executable Python modules:
azfinsim.generator: a tool to generate synthetic trade data; the generated data can be stored in a redis cache or on disk.
azfinsim.split: a tool to split the synthetic data generated by the generator. It can read the data from a redis cache or disk and generate partitioned datasets on disk.
azfinsim.concat: a simple tool to concatenate multiple files into one.
azfinsim.azfinsim: a tool to process trades from disk or a redis cache and optionally generate synthetic data results.
The azfinsim.generator tool can be used to generate a synthetic dataset. In a real-world setup, this step shouldn't be
necessary since the data will be coming from some real data source.
To generate trades, use the following command. You can populate a redis cache or save the trades
to a file on the filesystem. Additional cache types can easily be added by modifying cache.py.
# populate redis cache
python3 -m azfinsim.generator \
--cache-type "redis" \
--cache-name <redis url> \
--cache-key <redis key> \
--start-trade <start trade number> \
--trade-window <total number of trades to generate>
# populate file on disk
python3 -m azfinsim.generator \
--cache-type "filesystem" \
--cache-path <filename> \
--start-trade <start trade number> \
--trade-window <total number of trades to generate>
The generator will populate the cache using multiple threads concurrently.
This step emulates the data processing stage in a financial data processing workflow.
# process trades from/to a redis cache
python3 -m azfinsim.azfinsim \
--cache-type "redis" \
--cache-name <redis url> \
--cache-key <redis key> \
--start-trade <start trade number> \
--trade-window <total number of trades to process>
# process trades from/to disk
python3 -m azfinsim.azfinsim \
--cache-type "filesystem" \
--cache-path <filename> \
--start-trade <start trade number> \
--trade-window <total number of trades to process>
This will process the trades and produce the results of the risk analysis. If using a redis cache, the results
will be stored in the same cache. If using the filesystem, the results will be stored in a file at the same location
as the input file, but with .results added before the extension. For example, if the input file is trades.csv,
the results will be stored in trades.results.csv. The output directory can be overridden using the optional
--output-path parameter.
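The naming rule above can be sketched in Python, for instance to locate result files from a driver script. This is a minimal sketch, not code from azfinsim: the helper name `results_path` is hypothetical, and it assumes that `--output-path` changes only the directory while the file name keeps the `.results` convention.

```python
from pathlib import Path


def results_path(input_file, output_dir=None):
    """Return the path where results for `input_file` would be expected.

    Hypothetical helper; it mirrors the naming rule described above.
    """
    src = Path(input_file)
    # Insert ".results" before the extension: trades.csv -> trades.results.csv
    name = f"{src.stem}.results{src.suffix}"
    # Assumption: --output-path replaces only the directory, not the file name.
    parent = Path(output_dir) if output_dir is not None else src.parent
    return parent / name


print(results_path("/tmp/demo1/trades.csv"))
```

The same rule applied to a split file such as trades.0.csv gives trades.0.results.csv.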
When using --cache-type filesystem, the --start-trade and --trade-window parameters are optional. If not specified,
the entire file will be processed.
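When only part of the data should be processed per invocation, the two parameters can be computed up front. The sketch below assumes that --start-trade is the first trade number of a slice and --trade-window is the number of trades in it, which matches the placeholder descriptions in the commands above but is not confirmed against the implementation. The helper name `trade_windows` is hypothetical.

```python
def trade_windows(start_trade, total_trades, trade_window):
    """Yield (start_trade, trade_window) pairs that cover total_trades trades."""
    if trade_window <= 0:
        raise ValueError("trade_window must be positive")
    end = start_trade + total_trades
    for start in range(start_trade, end, trade_window):
        # The last window is shortened so it does not run past the end.
        yield start, min(trade_window, end - start)


for start, window in trade_windows(0, 25000, 10000):
    print(f"--start-trade {start} --trade-window {window}")
```

Each printed line gives the arguments for one invocation of azfinsim.azfinsim under that assumption.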
azfinsim.split and azfinsim.concat are simple tools to split and merge files, respectively. These are useful when
modelling a real-world scenario where the data is split across multiple files, needs to be processed in parallel, and the
results are then merged back together.
To split a trade file into multiple files, use the following command:
# split trade file
python3 -m azfinsim.split \
--cache-path <filename> \
--output-path <output directory> \
--trade-window <total number of trades to process>
This will split the input file into multiple files in the output directory. The number of files will be equal to the
number of trades in the input file divided by the trade-window parameter. For example, if the input file has 1000
trades and the trade-window is 100, then the output directory will have 10 files. The output files will be named
trades.0.csv, trades.1.csv, ..., trades.9.csv and placed in the output directory specified by the output-path
parameter. If the output-path parameter is not specified, the output files will be placed in the same directory
as the input file.
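The arithmetic above can be sketched as follows, which may help when scripting around the split step. The helper `expected_split_files` is hypothetical and not part of azfinsim. How a trade count that is not a multiple of the trade-window is handled is not stated here, so the sketch assumes the remainder goes into one extra file.

```python
import math
from pathlib import Path


def expected_split_files(input_file, n_trades, trade_window, output_dir=None):
    """List the file names the split step is expected to produce."""
    src = Path(input_file)
    parent = Path(output_dir) if output_dir is not None else src.parent
    # Assumption: a final partial window still produces a file (round up).
    count = math.ceil(n_trades / trade_window)
    # Files are numbered from 0: trades.0.csv, trades.1.csv, ...
    return [parent / f"{src.stem}.{i}{src.suffix}" for i in range(count)]


files = expected_split_files("/tmp/trades/trades.csv", 1000, 100)
print(len(files), files[0].name, files[-1].name)
```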
To merge the split files back into a single file, use the following command:
# merge trade/results files
python3 -m azfinsim.concat \
--cache-path <input files glob pattern> \
--output-path <output filename>
For concat, the cache-path parameter is a glob pattern that matches the input files.
For example, if the input files are named trades.0.csv, trades.1.csv, ..., trades.9.csv and are placed in
the directory /tmp/trades, then the following command can be used to merge them back into a single file:
python3 -m azfinsim.concat \
--cache-path "/tmp/trades/trades.[0-9]*.csv" \
--output-path "/tmp/trades/trades.csv"
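Before running concat, it can be worth checking which file names a pattern selects. The sketch below uses Python's `fnmatch` module and assumes the tool applies Python-style glob matching, which is not confirmed here. The file names are made up for illustration. Because `*` matches any run of characters, the pattern from the example also matches per-file results such as trades.0.results.csv if they sit in the same directory.

```python
from fnmatch import fnmatchcase

PATTERN = "trades.[0-9]*.csv"

# Illustrative file names only; not read from disk.
names = [
    "trades.csv",
    "trades.0.csv",
    "trades.9.csv",
    "trades.10.csv",
    "trades.0.results.csv",
    "trades.results.csv",
]

# Keep the names the glob pattern would select.
matched = [name for name in names if fnmatchcase(name, PATTERN)]
print(matched)
```

If trade files and results files share a directory, a narrower pattern such as "trades.[0-9]*.results.csv" for results avoids mixing the two.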
To better understand how the tools can be used, here are some example workflows.
#!/bin/bash
# create directory to store trades / results
mkdir -p /tmp/demo1/
# generate trades
python3 -m azfinsim.generator \
--cache-path "/tmp/demo1/trades.csv" \
--start-trade 0 \
--trade-window 10000
# process trades
python3 -m azfinsim.azfinsim \
--cache-path "/tmp/demo1/trades.csv"
# view results
head /tmp/demo1/trades.results.csv
#!/bin/bash
# create redis cache, if not already running
docker run -d --name redis -p 6379:6379 redis
# generate trades
python3 -m azfinsim.generator \
--cache-name "localhost" \
--cache-port 6379 \
--start-trade 0 \
--trade-window 10000
# process trades (using pv algorithm)
python3 -m azfinsim.azfinsim \
--cache-name "localhost" \
--cache-port 6379 \
--start-trade 0 \
--trade-window 10000 \
--algorithm pvonly
# results are stored back in the same cache
The generator stores each trade generated in the redis cache with a key of the form trade:<trade number>. The results
are stored in the same cache with a key of the form <algorithm>:<trade number>, where <algorithm> is the name of the
algorithm used to process the trade. For example, if the pvonly algorithm is used, the results will be stored in the
cache with a key of the form pvonly:<trade number>.
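The key scheme can be sketched in Python for anyone who wants to look up entries directly. The helper names are hypothetical, and the format of the stored values is not described here, so the sketch returns them unmodified. `fetch_result` accepts any object with a `get` method; a client from the redis-py package (for example `redis.Redis(host="localhost", port=6379)`) would be one option, assuming that package is installed.

```python
def trade_key(trade_number):
    """Key under which the generator stores a trade."""
    return f"trade:{trade_number}"


def result_key(algorithm, trade_number):
    """Key under which the result for a trade is stored."""
    return f"{algorithm}:{trade_number}"


def fetch_result(client, algorithm, trade_number):
    """Look up a result using any client that exposes get(key)."""
    # The value is returned as stored; its encoding is not assumed here.
    return client.get(result_key(algorithm, trade_number))


print(trade_key(42), result_key("pvonly", 42))
```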
#!/bin/bash
# create directory to store trades / results
mkdir -p /tmp/demo2/
# generate 100,000 trades
python3 -m azfinsim.generator \
--cache-path "/tmp/demo2/trades.csv" \
--trade-window 100000
# split trades into multiple files each with 10,000 trades
python3 -m azfinsim.split \
--cache-path "/tmp/demo2/trades.csv" \
--trade-window 10000
# process trades from each file
for i in {0..9}; do
# process all trades in each file
# results are stored back in file `.../trades.${i}.results.csv`
python3 -m azfinsim.azfinsim \
--cache-path "/tmp/demo2/trades.${i}.csv"
done
# merge results back into a single file
python3 -m azfinsim.concat \
--cache-path "/tmp/demo2/trades.[0-9]*.results.csv" \
--output-path "/tmp/demo2/trades.results.csv"
# view results
head /tmp/demo2/trades.results.csv
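The loop in the script above processes the split files one after another. A possible parallel variant is sketched below in Python: it builds the same command for each file and runs the commands concurrently as subprocesses. The helpers `build_commands` and `run_all` are hypothetical, the worker count is an arbitrary choice, and the sketch assumes azfinsim is installed in the active environment.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_commands(directory, n_files):
    """Build one azfinsim.azfinsim command per split file."""
    return [
        [sys.executable, "-m", "azfinsim.azfinsim",
         "--cache-path", f"{directory}/trades.{i}.csv"]
        for i in range(n_files)
    ]


def run_all(commands, max_workers=4):
    """Run the commands concurrently and return their exit codes in order."""
    def run(cmd):
        # check=False: report the exit code instead of raising on failure.
        return subprocess.run(cmd, check=False).returncode

    # Threads suffice here because each one only waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

Calling `run_all(build_commands("/tmp/demo2", 10))` would stand in for the `for` loop; a non-zero entry in the returned list indicates a file whose processing failed.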
The application can be configured to send telemetry data to Azure Application Insights. To enable this, you'll need to create an Azure Application Insights resource in Azure. Once it is created, you'll need the connection string for the resource. To get it, go to the resource in the Azure portal and copy the "Connection String" from the "Overview" page. The connection string will look something like this:
# Connection String
InstrumentationKey=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx;IngestionEndpoint=https://eastus-8.in.applicationinsights.azure.com/;LiveEndpoint=https://eastus.livediagnostics.monitor.azure.com/
All tools under azfinsim support this. To enable telemetry, pass the connection string to the --app-insights
command line option. For example, to enable telemetry for the azfinsim.generator tool, you can use the following command:
python3 -m azfinsim.generator \
--cache-path "/tmp/demo2/trades.csv" \
--app-insights "InstrumentationKey=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx;IngestionEndpoint=https://eastus-8.in.applicationinsights.azure.com/;LiveEndpoint=https://eastus.livediagnostics.monitor.azure.com/"
Once enabled, the application will send telemetry data to the Azure Application Insights resource. This includes logs,
exceptions, and metrics. There is a short delay before the data is available in the Azure portal, so you may need to wait
a few minutes before you can see the data. Logs are stored as traces, exceptions are stored as exceptions, and metrics are
stored as customMetrics. You can inspect each of these in the Azure portal by going to the resource and clicking on
the "Logs" viewer.
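Since the connection string is a semicolon-separated list of key=value pairs, a quick local check before passing it to --app-insights can catch copy-and-paste mistakes. The sketch below is a hypothetical helper, not part of azfinsim or any Azure SDK, and it only checks the shape of the string, not whether the key is valid.

```python
def parse_connection_string(value):
    """Split a 'Key=Value;Key=Value' connection string into a dict."""
    parts = {}
    for item in value.split(";"):
        if not item.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in segment: {item!r}")
        parts[key.strip()] = val.strip()
    return parts


sample = (
    "InstrumentationKey=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx;"
    "IngestionEndpoint=https://eastus-8.in.applicationinsights.azure.com/;"
    "LiveEndpoint=https://eastus.livediagnostics.monitor.azure.com/"
)
print(sorted(parse_connection_string(sample)))
```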
Instead of installing the application locally, you can build and use a container. For that, you'll need Docker installed on your workstation.
# clone repository
git clone https://github.com/utkarshayachit/azfinsim.git
cd azfinsim
# to build the container image
docker build -t azfinsim:latest .
# test the container
docker run -it azfinsim:latest -m azfinsim.azfinsim --help
# now, you can run the application using the following instead of
# `python3` (as described earlier)
# generate trades
docker run -it azfinsim:latest -m azfinsim.generator \
--cache-name <redis url> \
--cache-key <redis key> \
--start-trade <start trade number> \
--trade-window <total number of trades to generate>
# process trades
docker run -it azfinsim:latest -m azfinsim.azfinsim \
--cache-name <redis url> \
--cache-key <redis key> \
--start-trade <start trade number> \
--trade-window <total number of trades to process>