Using mkdocs to serve docs from our markdown files #62
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
name: Deploy MkDocs to GitHub Pages | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
|
||
jobs: | ||
deploy: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
persist-credentials: false | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: '3.x' | ||
|
||
- name: Install Dependencies | ||
run: | | ||
pip install mkdocs-material | ||
|
||
- name: Build Site | ||
run: mkdocs build --clean | ||
|
||
- name: Deploy to GitHub Pages | ||
uses: peaceiris/actions-gh-pages@v3 | ||
with: | ||
github_token: ${{ secrets.GITHUB_TOKEN }} | ||
publish_dir: ./site |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
# microdata-tools | ||
Tools for the [microdata.no](https://www.microdata.no/) platform | ||
|
||
## Installation | ||
`microdata-tools` can be installed from PyPI using pip: | ||
``` | ||
pip install microdata-tools | ||
``` | ||
|
||
## Usage | ||
Once you have your metadata and data files ready to go, they should be named and stored like this: | ||
``` | ||
my-input-directory/ | ||
MY_DATASET_NAME/ | ||
MY_DATASET_NAME.csv | ||
MY_DATASET_NAME.json | ||
``` | ||
The CSV file is optional in some cases. | ||
|
||
### Package dataset | ||
The `package_dataset()` function will encrypt and package your dataset as a tar archive. The process is as follows: | ||
|
||
1. Generate the symmetric key for a dataset. | ||
2. Encrypt the dataset data (CSV) using the symmetric key and store the encrypted file as `<DATASET_NAME>.csv.encr` | ||
3. Encrypt the symmetric key using the asymmetric RSA public key `microdata_public_key.pem` | ||
and store the encrypted file as `<DATASET_NAME>.symkey.encr` | ||
4. Gather the encrypted CSV, encrypted symmetric key and metadata (JSON) file in one tar file. | ||
|
||
### Unpackage dataset | ||
The `unpackage_dataset()` function will untar and decrypt your dataset using the `microdata_private_key.pem` | ||
RSA private key. | ||
|
||
The packaged file has to have the `<DATASET_NAME>.tar` extension. Its contents should be as follows: | ||
|
||
```<DATASET_NAME>.json``` : Required metadata file. | ||
|
||
```<DATASET_NAME>.csv.encr``` : Optional encrypted dataset file. | ||
|
||
```<DATASET_NAME>.symkey.encr``` : Optional encrypted file containing the symmetric key used to decrypt the dataset file. Required if the `.csv.encr` file is present. | ||
|
||
Decryption uses the RSA private key located at ```RSA_KEY_DIR```. | ||
|
||
The packaged file is then stored in `output_dir/archive/unpackaged` after a successful run or `output_dir/archive/failed` after an unsuccessful run. | ||
|
||
## Example | ||
Python script that uses an RSA public key named `microdata_public_key.pem` and packages a dataset: | ||
|
||
```py | ||
from pathlib import Path | ||
from microdata_tools import package_dataset | ||
|
||
RSA_KEYS_DIRECTORY = Path("tests/resources/rsa_keys") | ||
DATASET_DIRECTORY = Path("tests/resources/input_package/DATASET_1") | ||
OUTPUT_DIRECTORY = Path("tests/resources/output") | ||
|
||
package_dataset( | ||
rsa_keys_dir=RSA_KEYS_DIRECTORY, | ||
dataset_dir=DATASET_DIRECTORY, | ||
output_dir=OUTPUT_DIRECTORY, | ||
) | ||
``` | ||
|
||
### Validation | ||
|
||
Once you have your metadata and data files ready to go, they should be named and stored like this: | ||
``` | ||
my-input-directory/ | ||
MY_DATASET_NAME/ | ||
MY_DATASET_NAME.csv | ||
MY_DATASET_NAME.json | ||
``` | ||
Note that the filename may contain only upper case letters A-Z, digits 0-9 and underscores. | ||
|
||
|
||
Import microdata-tools in your script and validate your files: | ||
```py | ||
from microdata_tools import validate_dataset | ||
|
||
validation_errors = validate_dataset( | ||
"MY_DATASET_NAME", | ||
input_directory="path/to/my-input-directory" | ||
) | ||
|
||
if not validation_errors: | ||
print("My dataset is valid") | ||
else: | ||
print("Dataset is invalid :(") | ||
# You can print your errors like this: | ||
for error in validation_errors: | ||
print(error) | ||
``` | ||
|
||
For a more in-depth explanation of usage visit [the usage documentation](/microdata-tools/USAGE). | ||
|
||
### Data format description | ||
A dataset as defined in microdata consists of one data file, and one metadata file. | ||
|
||
The data file is a CSV file separated by semicolons. A valid example would be: | ||
```csv | ||
000000000000001;123;2020-01-01;2020-12-31; | ||
000000000000002;123;2020-01-01;2020-12-31; | ||
000000000000003;123;2020-01-01;2020-12-31; | ||
000000000000004;123;2020-01-01;2020-12-31; | ||
``` | ||
|
||
The metadata files should be in JSON format. The requirements for the metadata are best described through the [Pydantic model](https://github.com/statisticsnorway/microdata-tools/blob/main/microdata_tools/validation/model/metadata.py) and [the examples](https://github.com/statisticsnorway/microdata-tools/tree/main/docs/examples). |
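The naming rule and the semicolon-separated data format described in the README above can be illustrated with a small standard-library sketch. The helpers `is_valid_dataset_name` and `read_rows` are hypothetical names introduced here for illustration; they are not part of `microdata-tools`, and the library's own `validate_dataset()` may well apply further checks.

```py
import csv
import io
import re

# Hypothetical helper, not part of microdata-tools: mirrors the README rule
# that names may contain only upper case A-Z, digits 0-9 and underscores.
DATASET_NAME_PATTERN = re.compile(r"[A-Z0-9_]+")


def is_valid_dataset_name(name: str) -> bool:
    # fullmatch requires the whole string to match, so "" and "my-dataset" fail.
    return DATASET_NAME_PATTERN.fullmatch(name) is not None


def read_rows(text: str) -> list[list[str]]:
    # The data file is separated by semicolons, so pass delimiter=";".
    return list(csv.reader(io.StringIO(text), delimiter=";"))


# Two rows copied from the README's example data file.
sample = (
    "000000000000001;123;2020-01-01;2020-12-31;\n"
    "000000000000002;123;2020-01-01;2020-12-31;\n"
)
rows = read_rows(sample)

print(is_valid_dataset_name("MY_DATASET_NAME"))
print(is_valid_dataset_name("my-dataset"))
print(len(rows), len(rows[0]))
```

Because each example row ends with a semicolon, `csv.reader` yields a trailing empty field, so a parser written this way sees five fields per row rather than four.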
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
[data-md-color-scheme="mdata"] { | ||
--md-primary-fg-color: #104050; | ||
--md-accent-fg-color: #e94f35; | ||
|
||
} | ||
|
||
[data-md-color-scheme="slate"] { | ||
--md-primary-fg-color: #104050; | ||
--md-accent-fg-color: #e94f35; | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
site_name: Microdata-tools | ||
|
||
site_url: https://statisticsnorway.github.io/microdata-tools/ | ||
repo_url: https://github.com/statisticsnorway/microdata-tools | ||
|
||
theme: | ||
name: material | ||
palette: | ||
- scheme: mdata | ||
toggle: | ||
icon: material/weather-night | ||
name: Switch to dark mode | ||
- scheme: slate | ||
toggle: | ||
icon: material/weather-sunny | ||
name: Switch to light mode | ||
font: | ||
text: Source Sans Pro | ||
code: Source Code Pro | ||
logo: assets/microdata.png | ||
favicon: assets/favicon.ico | ||
features: | ||
- navigation.instant | ||
- navigation.external | ||
- content.code.copy | ||
- content.code.select | ||
|
||
pygments_style: default | ||
|
||
extra_css: | ||
- stylesheets/extra.css | ||
|
||
nav: | ||
- Getting Started: index.md | ||
- The Metadata model: metadata-model.md | ||
- Usage: USAGE.md | ||
- Report an Issue: | ||
- Issue template EN: issue_templates/issue_template_en.md | ||
- Issue template NO: issue_templates/issue_template_no.md | ||
- Releases: https://github.com/statisticsnorway/microdata-tools/releases | ||
|
||
docs_dir: docs | ||
|
||
plugins: | ||
- search | ||
|
||
markdown_extensions: | ||
- pymdownx.highlight: | ||
anchor_linenums: true | ||
line_spans: __span | ||
pygments_lang_class: true | ||
- pymdownx.inlinehilite | ||
- pymdownx.snippets | ||
- pymdownx.superfences |
Could this be a simpler way to publish:
https://github.com/statisticsnorway/nais-system/blob/main/.github/workflows/publish-gh-pages.yml
https://www.mkdocs.org/user-guide/deploying-your-docs/
Good suggestion!