Adds cml train yml action #33

Open
wants to merge 41 commits into
base: staging
41 commits
ae0b077
Merge pull request #21 from Amharic-STT/staging
Azariagmt Aug 4, 2021
b796d3c
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
jedisam Aug 6, 2021
a58de09
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
jedisam Aug 6, 2021
c262e1d
data:track
jedisam Aug 6, 2021
631ae5c
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
jedisam Aug 6, 2021
23c1278
adds initial pickle objects
jedisam Aug 7, 2021
7cca17f
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
Azariagmt Aug 7, 2021
bac1d0e
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
Azariagmt Aug 7, 2021
7a5712e
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
Azariagmt Aug 7, 2021
a9b09be
saved the trained model
rasyosef Aug 7, 2021
be181de
saved the character encoder
rasyosef Aug 7, 2021
0e4263d
sets up cml training on own server
Azariagmt Aug 7, 2021
54d80f9
Merge branch 'staging' into dev-azaria
Azariagmt Aug 7, 2021
2e3ed8f
Merge branch 'staging' of https://github.com/Amharic-STT/STT-engine i…
Azariagmt Aug 7, 2021
fa85a9b
Merge branch 'dev-yosef' of https://github.com/Amharic-STT/STT-engine…
rasyosef Aug 8, 2021
4bd688d
data loading and preprocessing modules
rasyosef Aug 8, 2021
6661467
encoding module and saved encoder
rasyosef Aug 8, 2021
14b08a6
data loading and preprocessing modules
rasyosef Aug 8, 2021
ca5fd46
modified feature extraction functions
rasyosef Aug 8, 2021
682aff2
moved model creation to models,py
rasyosef Aug 8, 2021
97e6b63
modular version of Amharic_STT
rasyosef Aug 8, 2021
085c304
mlflow parameter logging
rasyosef Aug 8, 2021
1b80b44
mlflow parameter logging
rasyosef Aug 8, 2021
352c459
Merge branch 'dev-yosef' into integration
Azariagmt Aug 9, 2021
174d174
modified preprocessing functions
rasyosef Aug 9, 2021
9cf2cf8
modified training notebook
rasyosef Aug 9, 2021
31ab1b6
modified requirements.txt
rasyosef Aug 9, 2021
34bc7ed
modified readme
rasyosef Aug 9, 2021
fbc29cb
.
rasyosef Aug 9, 2021
e3e82c1
added batch training
rasyosef Aug 9, 2021
ede46fd
added mel spectrogram saving function
rasyosef Aug 9, 2021
cd221db
added model training python file
rasyosef Aug 10, 2021
87e7ccd
better model
rasyosef Aug 10, 2021
3b9b95f
fixes configuration to run on server
Azariagmt Aug 11, 2021
eb52f05
untracks .pyc files
Azariagmt Aug 11, 2021
c85655c
Merge branch 'dev-yosef' into dev-azaria
Azariagmt Aug 11, 2021
7672431
Merge branch 'integration' into dev-azaria
Azariagmt Aug 11, 2021
1b2e776
setsup dvc pipeline
Azariagmt Aug 11, 2021
1116fd5
modifies requirements and untracks .pyc files
Azariagmt Aug 11, 2021
1107474
setsup github actions for training
Azariagmt Aug 11, 2021
69b689d
removes tensorboard plugin from requirements
Azariagmt Aug 11, 2021
20 changes: 0 additions & 20 deletions .github/workflows/cml.yml

This file was deleted.

28 changes: 28 additions & 0 deletions .github/workflows/train.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
name: train-model

on:
  push:
    branches:
      - "train"

jobs:
  train:
    runs-on: [self-hosted, cml, gpu]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: iterative/setup-cml@v1
      - name: Train model
        env:
          REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GDRIVE_CREDENTIALS_DATA: ${{ secrets.GDRIVE_CREDENTIALS_DATA }}
        # env and run belong to one step; a step needs `run` or `uses`
        run: |
          pip install -r requirements.txt
          dvc repro

          # fetch remote branches so the metrics diff has a staging baseline
          git fetch --prune
          dvc metrics diff --show-md staging > report.md

          cml-send-comment report.md
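The last commands of the run block compare this branch's metrics with `staging` and post the result as a comment. As a rough illustration of what such a report contains, the sketch below builds a Markdown table from two flat metric dictionaries. It is not DVC's implementation: the function name `metrics_diff_md` and the sample metric name `loss` are hypothetical.

```python
def metrics_diff_md(old, new):
    """Render a Markdown table comparing two flat dicts of numeric metrics."""
    lines = ["| Metric | Old | New | Change |", "|---|---|---|---|"]
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before is None or after is None:
            # The metric exists on only one side, so no numeric change applies.
            change = "-"
        else:
            change = f"{after - before:+.4f}"
        shown_before = "-" if before is None else before
        shown_after = "-" if after is None else after
        lines.append(f"| {key} | {shown_before} | {shown_after} | {change} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(metrics_diff_md({"loss": 2.0}, {"loss": 1.5}))
```

Writing the table to `report.md` would give a file of the same general shape as the one the workflow hands to `cml-send-comment`.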
2 changes: 2 additions & 0 deletions .gitignore
@@ -1 +1,3 @@
/data
*.pyc
/prediction.txt
26 changes: 5 additions & 21 deletions README.md
@@ -6,25 +6,9 @@

<p>Our responsibility was to build a deep learning model capable of transcribing speech to text in the Amharic language. The model we produce should be accurate and robust against background noise.</p>

## Code
The code of our analysis can be found in the **notebooks** folder. The data preprocessing, visualization, and model training parts can be found in the **Amharic_STT_preprocessing.ipynb** Jupyter notebook. This notebook can be run in Google Colab. **Amharic_Speech_To_Text.ipynb** contains a modularized version of the first notebook. The **scripts** folder contains the data loading and preprocessing functions. The trained models will be stored in the **models** folder.

Structure
├── logs
├── modules
├── notebooks
├── tests
└── Dockerfile

# Contributors

* [Azaria Tamrat](https://github.com/Azariagmt)
* [Bethelhem Sisay](https://github.com/Bethelsis)
* [Daniel Zelalem](https://github.com/daniEL2371)
* [Dorothy Cheruiyot](https://github.com/Doro97)
* [Eliphaz Niyodusenga]()
* [Elizabeth Nanjala]()
* [Natneal Teshome](https://github.com/Natty-star)
* [UWASE Rachel](https://github.com/ntabanarachel)
* [Yosef Alemneh](https://github.com/mozartofmath)



## Dependencies
To install the necessary dependencies, execute the command
```$ pip install -r requirements.txt```
6 changes: 3 additions & 3 deletions data.dvc
@@ -1,5 +1,5 @@
 outs:
-- md5: bc8108ef9de98e206b7cfcc971293348.dir
-  size: 387974316
-  nfiles: 1958
+- md5: 0482446076c63ee87c4a2fd58e77dcc9.dir
+  size: 20618614073
+  nfiles: 38001
   path: data
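The `size` and `nfiles` fields change substantially in this hunk. The short sketch below converts the two byte counts to GiB and computes the growth factors; the numbers are copied from the diff, while the helper names `summarize` and `growth` are hypothetical.

```python
GIB = 1024 ** 3  # bytes per gibibyte

# Values copied from the old and new data.dvc entries in this diff.
old = {"size": 387_974_316, "nfiles": 1_958}
new = {"size": 20_618_614_073, "nfiles": 38_001}


def summarize(entry):
    """Return a human-readable size and file count for one .dvc entry."""
    return f"{entry['size'] / GIB:.2f} GiB in {entry['nfiles']} files"


def growth(before, after):
    """Return (size factor, file-count factor) between two entries."""
    return after["size"] / before["size"], after["nfiles"] / before["nfiles"]


if __name__ == "__main__":
    size_factor, file_factor = growth(old, new)
    print("old:", summarize(old))
    print("new:", summarize(new))
    print(f"size x{size_factor:.1f}, files x{file_factor:.1f}")
```

The tracked data grows from roughly a third of a GiB to about 19 GiB, which is worth keeping in mind for `dvc pull` time on the self-hosted runner.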
26 changes: 26 additions & 0 deletions dvc.lock
@@ -0,0 +1,26 @@
schema: '2.0'
stages:
  install_requirements:
    cmd: pip3 install -r requirements.txt
    deps:
    - path: requirements.txt
      md5: 355a4d492eb3ce554c1b42ff2dbea391
      size: 251
  train:
    cmd:
    - python3 ./scripts/train.py
    deps:
    - path: ./data
      md5: 0482446076c63ee87c4a2fd58e77dcc9.dir
      size: 20618614073
      nfiles: 38001
    - path: train.py
      md5: aa3a029d753ed048f62c49c6290ae70a
      size: 3935
    outs:
    - path: metrics.json
      md5: 6a43012e4745d0dafa01cac604836f22
      size: 12
    - path: prediction.txt
      md5: 865865e14d992c8caf58125f6fdad107
      size: 158
15 changes: 15 additions & 0 deletions dvc.yaml
@@ -0,0 +1,15 @@
stages:
  install_requirements:
    cmd: pip3 install -r requirements.txt
    deps:
      - requirements.txt
  train:
    cmd: ["python3 ./scripts/train.py"]
    deps:
      - train.py
      - ./data
    outs:
      - prediction.txt
    metrics:
      - metrics.json:
          cache: false
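The `train` stage declares `prediction.txt` as an output and `metrics.json` as a metric, so the training command is expected to create both files in the repository root. A minimal sketch of that contract follows. It is not the project's `scripts/train.py`: the helper `write_stage_outputs`, the metric name `loss`, and the sample values are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_stage_outputs(metrics, prediction, out_dir="."):
    """Write the two files the `train` stage declares in dvc.yaml."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    metrics_path = out / "metrics.json"
    prediction_path = out / "prediction.txt"
    # metrics.json holds a flat JSON object so `dvc metrics diff` can read it.
    metrics_path.write_text(json.dumps(metrics), encoding="utf-8")
    # prediction.txt holds the transcription produced by the model.
    prediction_path.write_text(prediction + "\n", encoding="utf-8")
    return metrics_path, prediction_path


if __name__ == "__main__":
    # Write into a temporary directory so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        m, p = write_stage_outputs({"loss": 0.5}, "sample transcription", tmp)
        print(m.read_text(encoding="utf-8"))
        print(p.read_text(encoding="utf-8"))
```

If either file is missing after the command finishes, `dvc repro` reports the stage as failed, so writing both at the end of training keeps the pipeline reproducible.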
Binary file added models/amharic_stt_mfcc.h5
Binary file not shown.
Binary file added models/encoder.pkl
Binary file not shown.
1,229 changes: 1,229 additions & 0 deletions notebooks/.ipynb_checkpoints/Amharic_Speech_To_Text-checkpoint.ipynb

Large diffs are not rendered by default.

1,229 changes: 1,229 additions & 0 deletions notebooks/Amharic_Speech_To_Text.ipynb

Large diffs are not rendered by default.

15 changes: 15 additions & 0 deletions notebooks/mlruns/0/00861c5cb9864b799eee65ee9a9fe6b7/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/00861c5cb9864b799eee65ee9a9fe6b7/artifacts
end_time: 1628537815666
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 00861c5cb9864b799eee65ee9a9fe6b7
run_uuid: 00861c5cb9864b799eee65ee9a9fe6b7
source_name: ''
source_type: 4
source_version: ''
start_time: 1628537815526
status: 3
tags: []
user_id: HP
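The `start_time` and `end_time` fields in this file are epoch timestamps in milliseconds, and `status` and `source_type` are integer codes. The sketch below decodes them for this run. The two lookup tables are assumed from MLflow's `RunStatus` and `SourceType` enums, and the helper `describe_run` is hypothetical.

```python
from datetime import datetime, timezone

# Assumed integer codes, taken from MLflow's RunStatus and SourceType enums.
RUN_STATUS = {1: "RUNNING", 2: "SCHEDULED", 3: "FINISHED", 4: "FAILED", 5: "KILLED"}
SOURCE_TYPE = {1: "NOTEBOOK", 2: "JOB", 3: "PROJECT", 4: "LOCAL", 1000: "UNKNOWN"}


def describe_run(meta):
    """Turn the raw numeric fields of a meta.yaml mapping into readable values."""
    start = datetime.fromtimestamp(meta["start_time"] / 1000, tz=timezone.utc)
    return {
        "started_utc": start.strftime("%Y-%m-%d %H:%M:%S"),
        "duration_ms": meta["end_time"] - meta["start_time"],
        "status": RUN_STATUS.get(meta["status"], "UNKNOWN"),
        "source_type": SOURCE_TYPE.get(meta["source_type"], "UNKNOWN"),
    }


if __name__ == "__main__":
    # Field values copied from the meta.yaml above.
    meta = {
        "start_time": 1628537815526,
        "end_time": 1628537815666,
        "status": 3,
        "source_type": 4,
    }
    print(describe_run(meta))
```

Under those assumptions, this run lasted 140 ms and finished normally, which suggests it only logged values rather than wrapping a full training loop.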
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628537815662 64.27766418457031 0
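The single line above appears to follow the layout MLflow's local file store uses for metric history: a millisecond timestamp, the metric value, and the step, separated by spaces. The metric name is the file name, which this page does not show. A small parser under that assumption (the names `MetricPoint` and `parse_metric_line` are hypothetical):

```python
from typing import NamedTuple


class MetricPoint(NamedTuple):
    timestamp_ms: int
    value: float
    step: int


def parse_metric_line(line):
    """Parse one 'timestamp value step' record into a MetricPoint."""
    timestamp, value, step = line.split()
    return MetricPoint(int(timestamp), float(value), int(step))


if __name__ == "__main__":
    # Record copied from the diff above.
    print(parse_metric_line("1628537815662 64.27766418457031 0"))
```

Reading every line of such a file this way yields the full history of one metric for the run.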
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/135adce6eccf4914828f2b74d9f3f9a1/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/135adce6eccf4914828f2b74d9f3f9a1/artifacts
end_time: 1628386596867
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 135adce6eccf4914828f2b74d9f3f9a1
run_uuid: 135adce6eccf4914828f2b74d9f3f9a1
source_name: ''
source_type: 4
source_version: ''
start_time: 1628386596789
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628386596864 17.531429290771484 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/27b953f4e24f4d7b859a6ea39d434997/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/27b953f4e24f4d7b859a6ea39d434997/artifacts
end_time: 1628383852112
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 27b953f4e24f4d7b859a6ea39d434997
run_uuid: 27b953f4e24f4d7b859a6ea39d434997
source_name: ''
source_type: 4
source_version: ''
start_time: 1628383851398
status: 4
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/2f4eac0c16be44bf845c9b8ac0c6e4b1/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/2f4eac0c16be44bf845c9b8ac0c6e4b1/artifacts
end_time: 1628387236792
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 2f4eac0c16be44bf845c9b8ac0c6e4b1
run_uuid: 2f4eac0c16be44bf845c9b8ac0c6e4b1
source_name: ''
source_type: 4
source_version: ''
start_time: 1628387236683
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628387236789 9.323555946350098 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/30683cc6a0394bb9b474f766e49aec65/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/30683cc6a0394bb9b474f766e49aec65/artifacts
end_time: 1628385821276
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 30683cc6a0394bb9b474f766e49aec65
run_uuid: 30683cc6a0394bb9b474f766e49aec65
source_name: ''
source_type: 4
source_version: ''
start_time: 1628385820280
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628385821268 74.62281036376953 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/78c7f5d081f147b7baa2daefac5bc025/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/78c7f5d081f147b7baa2daefac5bc025/artifacts
end_time: 1628534263380
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 78c7f5d081f147b7baa2daefac5bc025
run_uuid: 78c7f5d081f147b7baa2daefac5bc025
source_name: ''
source_type: 4
source_version: ''
start_time: 1628534263057
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628534263370 61.04814147949219 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/79af4a77f9c1453fae086a0cd7902a67/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/79af4a77f9c1453fae086a0cd7902a67/artifacts
end_time: 1628539288958
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 79af4a77f9c1453fae086a0cd7902a67
run_uuid: 79af4a77f9c1453fae086a0cd7902a67
source_name: ''
source_type: 4
source_version: ''
start_time: 1628539288605
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628539288906 28.640695571899414 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/7de992db615a479fb798710d4d820703/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/7de992db615a479fb798710d4d820703/artifacts
end_time: 1628533475566
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 7de992db615a479fb798710d4d820703
run_uuid: 7de992db615a479fb798710d4d820703
source_name: ''
source_type: 4
source_version: ''
start_time: 1628533475184
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628533475559 65.5356216430664 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/88ed94ae85814a1692f24f83e35bd90b/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/88ed94ae85814a1692f24f83e35bd90b/artifacts
end_time: 1628386005574
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 88ed94ae85814a1692f24f83e35bd90b
run_uuid: 88ed94ae85814a1692f24f83e35bd90b
source_name: ''
source_type: 4
source_version: ''
start_time: 1628386005355
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628386005570 59.76906967163086 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
LOCAL
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
HP
15 changes: 15 additions & 0 deletions notebooks/mlruns/0/8cebc9aac6574721af5a5de54016cd9c/meta.yaml
@@ -0,0 +1,15 @@
artifact_uri: file:///C:/Users/HP/Desktop/10academy/Week-4/AmharicSTT/notebooks/mlruns/0/8cebc9aac6574721af5a5de54016cd9c/artifacts
end_time: 1628537718299
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
name: ''
run_id: 8cebc9aac6574721af5a5de54016cd9c
run_uuid: 8cebc9aac6574721af5a5de54016cd9c
source_name: ''
source_type: 4
source_version: ''
start_time: 1628537717895
status: 3
tags: []
user_id: HP
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1628537718292 66.45594024658203 0
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
C:\Users\HP\Anaconda3\lib\site-packages\ipykernel_launcher.py