First class files #657

Closed
wants to merge 48 commits into from
48 commits
043b81e
added initial file class structure
nevoodoo Jan 15, 2024
292e39f
feat(analysis): Added initial File model definition
nevoodoo Jan 23, 2024
6daed68
feat(analysis): Draft migration script for the analysis model
nevoodoo Jan 23, 2024
e860302
chore(fixed linting and added TODO):
nevoodoo Jan 23, 2024
5d5b101
Address missing author on analysis-runner by using audit_logs (#648)
illusional Jan 11, 2024
43eb18c
Update md5 creating script for requester pays buckets (#651)
EddieLF Jan 16, 2024
7f0ddf6
chore: fixed linting
nevoodoo Jan 23, 2024
b82c2c7
feat(migration): added support for JSON output structure in existing …
nevoodoo Jan 23, 2024
73f7849
feat(migration): migration script for analysis output
nevoodoo Jan 24, 2024
c6f7da2
chore(migration): fixed linting on script
nevoodoo Jan 24, 2024
1ce8f9c
removed FileInternal
nevoodoo Jan 24, 2024
5fd3f60
changed file to a many-many relationship model
nevoodoo Jan 29, 2024
f75d27d
added output file querying, mutation via tables
nevoodoo Jan 31, 2024
7c9eb6d
added system versioning for analysis_file
nevoodoo Jan 31, 2024
0d65777
updated tests
nevoodoo Jan 31, 2024
74925ee
deprecating output field on analysis model
nevoodoo Feb 2, 2024
de4e465
merged from dev
nevoodoo Feb 2, 2024
292387c
working file class implementation
nevoodoo Feb 12, 2024
cf88350
Merge branch 'dev' of github.com:populationgenomics/metamist into fir…
nevoodoo Feb 12, 2024
2ea75dd
Merged origin/dev into first-class-files-v2
nevoodoo Feb 16, 2024
3b6b1ab
Merged origin/dev into first-class-files-v2
nevoodoo Feb 16, 2024
2975d7f
Merged origin/dev into first-class-files-v2
nevoodoo Feb 20, 2024
f5b7289
Fixed existing tests to use outputs
nevoodoo Feb 22, 2024
d69ce87
added FileInternal use for reconstructing json
nevoodoo Feb 22, 2024
975ef5e
Merged origin/dev into first-class-files-v2
nevoodoo Feb 22, 2024
64d0311
fixed indentation caused by isort
nevoodoo Feb 22, 2024
cb426ee
fixed breaking gitbutler changes
nevoodoo Feb 22, 2024
0ea3164
removed gitbutler
nevoodoo Feb 22, 2024
9b7d379
fixed indentation caused by gitbutler
nevoodoo Feb 22, 2024
4c4f9e2
Merge branch 'first-class-files-v2' of github.com:populationgenomics/…
nevoodoo Feb 22, 2024
412c030
updated front-end to use outputs
nevoodoo Feb 22, 2024
8429d59
added output files tests
nevoodoo Feb 23, 2024
eeacee7
added fake gcs server
nevoodoo Feb 26, 2024
2b1432d
patching requirements for cloudpathlib
nevoodoo Feb 26, 2024
a2d9860
add local env declaration to tests
nevoodoo Feb 26, 2024
afa6778
fixed fileinternal from_db
nevoodoo Feb 26, 2024
9f1f1bb
add validator for field
nevoodoo Feb 26, 2024
633259a
added logging to confirm fakegcs setup
nevoodoo Feb 26, 2024
3e34921
added parse_sql_bool
nevoodoo Feb 26, 2024
fbed3e8
added parse_sql_bool
nevoodoo Feb 26, 2024
9442aa7
update table name and fix str output
nevoodoo Feb 28, 2024
78fd910
refactored file to output, added better typing annotations
nevoodoo Feb 28, 2024
22a7b98
updated analysis table call from dev change
nevoodoo Feb 28, 2024
b875b7e
removed testbase comments, fixed migration file
nevoodoo Feb 28, 2024
f68bb63
added more tests to capture file str
nevoodoo Mar 5, 2024
dced3d6
fixed file behaviour with gs str
nevoodoo Mar 12, 2024
fd021ad
Merge branch 'dev' of github.com:populationgenomics/metamist into fir…
nevoodoo Mar 12, 2024
abe04fd
removed comments
nevoodoo Mar 12, 2024
1 change: 1 addition & 0 deletions .github/workflows/test.yaml
@@ -66,6 +66,7 @@ jobs:
- name: "Run unit tests"
id: runtests
run: |
export SM_ENVIRONMENT=local
coverage run -m unittest discover -p 'test*.py' -s '.'
rc=$?
coverage xml
6 changes: 6 additions & 0 deletions .gitignore
@@ -60,3 +60,9 @@ web/src/__generated__

# pulumi config files
Pulumi*.yaml

# pnpm package manager
pnpm-lock.yaml

# env
.env
8 changes: 5 additions & 3 deletions api/graphql/schema.py
@@ -235,27 +235,29 @@ class GraphQLAnalysis:
id: int
type: str
status: strawberry.enum(AnalysisStatus)
- output: str | None
timestamp_completed: datetime.datetime | None = None
active: bool
meta: strawberry.scalars.JSON

+ output: strawberry.scalars.JSON
+ outputs: strawberry.scalars.JSON
@staticmethod
def from_internal(internal: AnalysisInternal) -> 'GraphQLAnalysis':
return GraphQLAnalysis(
id=internal.id,
type=internal.type,
status=internal.status,
- output=internal.output,
timestamp_completed=internal.timestamp_completed,
active=internal.active,
meta=internal.meta,
+ output=internal.output,
+ outputs=internal.outputs,
)

@strawberry.field
async def sequencing_groups(
self, info: Info, root: 'GraphQLAnalysis'
) -> list['GraphQLSequencingGroup']:

loader = info.context[LoaderKeys.SEQUENCING_GROUPS_FOR_ANALYSIS]
sgs = await loader.load(root.id)
return [GraphQLSequencingGroup.from_internal(sg) for sg in sgs]
10 changes: 5 additions & 5 deletions api/routes/analysis.py
@@ -1,7 +1,7 @@
import csv
import io
from datetime import date
- from typing import Any
+ from typing import Any, Optional, Union

from fastapi import APIRouter
from fastapi.params import Body, Query
@@ -45,7 +45,7 @@ class AnalysisModel(BaseModel):
type: str
status: AnalysisStatus
meta: dict[str, Any] | None = None
- output: str | None = None
+ outputs: Optional[Union[str, dict]] = None
active: bool = True
# please don't use this, unless you're the analysis-runner,
# the usage is tracked ... (Ծ_Ծ)
@@ -56,7 +56,7 @@ class AnalysisUpdateModel(BaseModel):
"""Update analysis model"""

status: AnalysisStatus
- output: str | None = None
+ outputs: str | None = None
meta: dict[str, Any] | None = None
active: bool | None = None

@@ -73,7 +73,7 @@ class AnalysisQueryModel(BaseModel):
type: str | None = None
status: AnalysisStatus | None = None
meta: dict[str, Any] | None = None
- output: str | None = None
+ outputs: str | None = None
active: bool | None = None

def to_filter(self, project_id_map: dict[str, int]) -> AnalysisFilter:
@@ -130,7 +130,7 @@ async def update_analysis(
"""Update status of analysis"""
atable = AnalysisLayer(connection)
await atable.update_analysis(
- analysis_id, status=analysis.status, output=analysis.output, meta=analysis.meta
+ analysis_id, status=analysis.status, outputs=analysis.outputs, meta=analysis.meta
)
return True
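
Note on the `outputs` field: `AnalysisModel.outputs` accepts either a plain string or a (possibly nested) dict. The sketch below shows one way a caller might walk both shapes and produce `(json_structure, value)` pairs for the `analysis_outputs` table. The helper name `iter_output_rows` and the dotted-key format are assumptions made for illustration only; the flattening logic this PR actually uses is not part of the diff shown here.

```python
from typing import Iterator, Optional, Union

# Same shape as the `outputs` annotation on AnalysisModel.
Outputs = Optional[Union[str, dict]]


def iter_output_rows(
    outputs: Outputs, prefix: str = ''
) -> Iterator[tuple[Optional[str], str]]:
    """
    Hypothetical helper: yield (json_structure, value) pairs.

    A bare string yields a single pair with no json_structure.
    A dict is walked recursively, and the keys leading to each
    string are joined with '.' (an assumed format).
    Values that are neither str nor dict are skipped.
    """
    if isinstance(outputs, str):
        yield (prefix or None, outputs)
    elif isinstance(outputs, dict):
        for key, value in outputs.items():
            child = f'{prefix}.{key}' if prefix else str(key)
            yield from iter_output_rows(value, child)


# Usage: both accepted shapes go through the same function.
single = list(iter_output_rows('gs://bucket/file.cram'))
nested = list(
    iter_output_rows(
        {'cram': {'basename': 'gs://bucket/a.cram'}, 'qc': 'gs://bucket/qc.json'}
    )
)
```

Keeping the key path next to each value is what would let the original nested JSON be rebuilt later from flat rows, which appears to be the purpose of the `json_structure` column.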

86 changes: 86 additions & 0 deletions db/project.xml
@@ -1107,4 +1107,90 @@
<sql>ALTER TABLE sequencing_group_assay CHANGE author author VARCHAR(255) NULL;</sql>
<sql>ALTER TABLE sequencing_group_external_id CHANGE author author VARCHAR(255) NULL;</sql>
</changeSet>
<changeSet id="2024-01-23_output_file" author="yash.pankhania">
<sql>SET @@system_versioning_alter_history = 1;</sql>
<createTable tableName="output_file">
<column name="id" type="INT" autoIncrement="true">
<constraints primaryKey="true" nullable="false" />
</column>
<column name="path" type="VARCHAR(255)">
<constraints
nullable="false"
unique="true"
/>
</column>
<column name="basename" type="VARCHAR(255)">
<constraints
nullable="false"
/>
</column>
<column name="dirname" type="VARCHAR(100)">
<constraints
nullable="false"
/>
</column>
<column name="nameroot" type="VARCHAR(255)">
<constraints
nullable="false"
/>
</column>
<column name="nameext" type="VARCHAR(25)">
<constraints
nullable="true"
/>
</column>
<column name="file_checksum" type="VARCHAR(255)">
<constraints
nullable="true"
/>
</column>
<column name="size" type="BIGINT">
<constraints
nullable="false"
/>
</column>
<column name="meta" type="VARCHAR(255)">
<constraints
nullable="true"
/>
</column>
<column name="valid" type="BOOLEAN" />
<column name="parent_id" type="INT">
<constraints
nullable="true"
foreignKeyName="FK_SECONDARY_FILE_PARENT_ID"
references="output_file(id)"
/>
</column>
</createTable>
<createTable tableName="analysis_outputs">
<column name="analysis_id" type="INT">
<constraints
nullable="false"
foreignKeyName="FK_ANALYSIS_OUTPUTS_ANALYSIS_ID"
references="analysis(id)"
/>
</column>
<column name="file_id" type="INT">
<constraints
nullable="true"
foreignKeyName="FK_ANALYSIS_OUTPUTS_FILE_ID"
references="output_file(id)"
/>
</column>
<column name="output" type="VARCHAR(255)">
<constraints
nullable="true"
/>
</column>
<column name="json_structure" type="VARCHAR(255)">
<constraints
nullable="true"
/>
</column>
</createTable>
<sql>ALTER TABLE `output_file` ADD SYSTEM VERSIONING;</sql>
<sql>ALTER TABLE `analysis_outputs` ADD SYSTEM VERSIONING;</sql>
<sql>ALTER TABLE `analysis_outputs` ADD CONSTRAINT `chk_file_id_output` CHECK ((file_id IS NOT NULL AND output IS NULL) OR (file_id IS NULL AND output IS NOT NULL));</sql>
</changeSet>
</databaseChangeLog>
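
Note on `chk_file_id_output`: the constraint requires exactly one of `file_id` and `output` to be set on each `analysis_outputs` row. The sketch below reproduces that rule against an in-memory SQLite database so the behaviour can be checked quickly. SQLite is a stand-in assumed here for convenience (the changeset itself targets a database with system versioning), the foreign keys and versioning are left out, and `try_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Simplified copy of the analysis_outputs table: same column names and
# the same CHECK expression, without foreign keys or system versioning.
conn.execute(
    """
    CREATE TABLE analysis_outputs (
        analysis_id INTEGER NOT NULL,
        file_id INTEGER,
        output TEXT,
        json_structure TEXT,
        CONSTRAINT chk_file_id_output CHECK (
            (file_id IS NOT NULL AND output IS NULL)
            OR (file_id IS NULL AND output IS NOT NULL)
        )
    )
    """
)


def try_insert(file_id, output):
    """Hypothetical helper: True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            'INSERT INTO analysis_outputs (analysis_id, file_id, output) '
            'VALUES (?, ?, ?)',
            (1, file_id, output),
        )
        return True
    except sqlite3.IntegrityError:
        return False


file_only = try_insert(10, None)            # row points at an output_file record
string_only = try_insert(None, 'free text')  # row stores a plain string output
both_set = try_insert(10, 'free text')      # rejected: both columns set
neither_set = try_insert(None, None)        # rejected: neither column set
```

In this sketch the first two inserts are accepted and the last two are rejected, so a row is either a reference to an `output_file` record or a plain string, never both and never neither.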
1 change: 1 addition & 0 deletions db/python/connect.py
@@ -26,6 +26,7 @@
'sequencing_group',
'assay',
'sequencing_group_assay',
'analysis_outputs',
'analysis_sequencing_group',
'analysis_sample',
'assay_external_id',