BigQuery Autogen Schema

This tool converts JSON objects stored in .json files into Python lists that can be used as BigQuery schemas.

It's not completely foolproof: key/value pairs whose value is a whitespace-delimited string fail to parse properly. Patches are welcome (or I'll just patch it sometime).

Example:

JSON input file

cat testcase/test.json
{"elements_0_role": "PERSON", "elements_0_roleAssignee": "urn:li:linkedinapi", "elements_0_state": "APPR", "elements_0_organizationalTarget": "company", "paging_count": 10, "paging_start": 0, "fake_field": {"name": "Sephiroth", "age": 99}, "final_record": 42 }

Output:

python3 bigquery_schema_gen.py testcase/test.json
schema = [
            {
                "mode": "NULLABLE",
                "name": "elements_0_role",
                "type": "STRING"
            },
            {
                "mode": "NULLABLE",
                "name": "elements_0_roleAssignee",
                "type": "STRING"
            },
            {
                "mode": "NULLABLE",
                "name": "elements_0_state",
                "type": "STRING"
            },
            {
                "mode": "NULLABLE",
                "name": "elements_0_organizationalTarget",
                "type": "STRING"
            },
            {
                "mode": "NULLABLE",
                "name": "paging_count",
                "type": "INTEGER"
            },
            {
                "mode": "NULLABLE",
                "name": "paging_start",
                "type": "INTEGER"
            },
            {
                "mode": "REPEATED",
                "name": "fake_field",
                "type": "RECORD",
                "fields": [
                    {
                        "mode": "NULLABLE",
                        "name": "name",
                        "type": "STRING"
                    },
                    {
                        "mode": "NULLABLE",
                        "name": "age",
                        "type": "INTEGER"
                    },
                ]
            },
            {
                "mode": "NULLABLE",
                "name": "final_record",
                "type": "INTEGER"
            },
]
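
For readers who want to see the idea behind the output above, here is a minimal sketch of this kind of type inference. It is not the code in bigquery_schema_gen.py: the names `bq_type` and `infer_schema` are made up for illustration, and the choices it makes are assumptions. In particular, it decodes the input with the standard `json` module (so string values containing whitespace are handled), it marks a single nested object as NULLABLE and reserves REPEATED for JSON arrays (unlike the output shown above), and it guesses the element type of an array from its first element.

```python
import json


def bq_type(value):
    """Map a decoded JSON scalar to a BigQuery type name (hypothetical helper)."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    # Strings, and null as a fallback assumption, are treated as STRING.
    return "STRING"


def infer_schema(record):
    """Build a list of BigQuery-style field dicts from one decoded JSON object."""
    schema = []
    for name, value in record.items():
        mode = "NULLABLE"
        if isinstance(value, list):
            # Arrays become REPEATED; the first element decides the type.
            mode = "REPEATED"
            value = value[0] if value else ""
        field = {"mode": mode, "name": name, "type": bq_type(value)}
        if isinstance(value, dict):
            # Nested objects become RECORD fields with their own sub-schema.
            field["type"] = "RECORD"
            field["fields"] = infer_schema(value)
        schema.append(field)
    return schema


if __name__ == "__main__":
    text = '{"paging_count": 10, "note": "two words", "fake_field": {"name": "Sephiroth", "age": 99}}'
    print(json.dumps(infer_schema(json.loads(text)), indent=4))
```

Because only one record is inspected, a field that is an integer in this record but a float or null elsewhere would be typed incorrectly, which is one reason the generated schema still needs a manual review.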

Important

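Most of the parsing is handled for basic fields, but the resulting Python schema must be looked over for inconsistencies. For example, in the output above the single nested object fake_field is given the mode REPEATED, which BigQuery uses for arrays; a lone nested record would normally be NULLABLE.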
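
Part of that review can be automated. The following is a rough sketch of a checker, not something shipped with this repository: `review_schema` and `VALID_MODES` are hypothetical names, and the rules it applies (the three BigQuery modes, RECORD fields needing nested fields, and field names being unique regardless of case) are only a starting point.

```python
VALID_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def review_schema(schema, prefix=""):
    """Return a list of warnings for a schema given as a list of field dicts."""
    problems = []
    seen = set()
    for field in schema:
        name = field.get("name", "<missing name>")
        path = prefix + name
        # BigQuery column names are case-insensitive, so compare in lower case.
        if name.lower() in seen:
            problems.append(f"{path}: duplicate field name")
        seen.add(name.lower())
        if field.get("mode") not in VALID_MODES:
            problems.append(f"{path}: unexpected mode {field.get('mode')!r}")
        if field.get("type") == "RECORD":
            if not field.get("fields"):
                problems.append(f"{path}: RECORD has no nested fields")
            else:
                # Check nested fields recursively, using a dotted path.
                problems.extend(review_schema(field["fields"], path + "."))
        elif "fields" in field:
            problems.append(f"{path}: only RECORD fields may have nested fields")
    return problems


if __name__ == "__main__":
    generated = [
        {"mode": "NULLABLE", "name": "paging_count", "type": "INTEGER"},
        {"mode": "REPEATED", "name": "fake_field", "type": "RECORD", "fields": []},
    ]
    for warning in review_schema(generated):
        print(warning)
```

A checker like this only catches structural problems. Whether a field really is an array, or whether an INTEGER should be a FLOAT, still has to be judged against the data.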

Brutally hacked together on a Friday afternoon by AlysonBelle (or AlysonNgonyama, depending on who's asking). alyson.belle7@gmail.com

AlysonNumberFIVE/Python_BigQuery_Schema_Autogen
