
Decide whether to use Flatterer or Flatten Tool for converting from JSON to CSV format #14

Closed
duncandewhurst opened this issue Aug 9, 2022 · 8 comments
Labels
CSV format This issue relates to the CSV publication format Tooling This issue relates to tooling

Comments

@duncandewhurst
Collaborator

duncandewhurst commented Aug 9, 2022

We need to decide whether to use Flatterer or Flatten Tool for converting from JSON to CSV format.

Some things to consider:

@duncandewhurst duncandewhurst added the Tooling This issue relates to tooling label Aug 9, 2022
@duncandewhurst duncandewhurst changed the title Use Flatterer or Flatten Tool? Decide whether to use Flatterer or Flatten Tool for converting from JSON to CSV format Aug 9, 2022
@duncandewhurst duncandewhurst added the CSV format This issue relates to the CSV publication format label Aug 9, 2022
@stevesong
Contributor

The number one question for me is what is the CSV use case? Perhaps if the records are capturing the kilometre length of links, it would be useful to open in a spreadsheet. Other scenarios?

@duncandewhurst
Collaborator Author

duncandewhurst commented Aug 11, 2022

Edit: @stevesong, since your question is about the CSV format in general, rather than conversion tooling, I've copied it to the discussion on publication formats and moved my reply there.

@duncandewhurst duncandewhurst added this to the Alpha milestone Aug 12, 2022
@lgs85
Contributor

lgs85 commented Aug 18, 2022

Noting here that this is also dependent on any decision made in #17.

@duncandewhurst
Collaborator Author

@kindly if we were to use Flatterer's output as the CSV format for OFDS data, how much work would it be to author a conversion script so that CSV data could be converted to JSON format for validation in CoVE?

If it's a lot of work, could we instead use Flatterer to generate a datapackage.csv that could be used to validate CSV data using frictionless?

@kindly

kindly commented Sep 9, 2022

@duncandewhurst flatterer is really intended just for flattening, as it creates _link fields to express the join conditions, and those link fields would be difficult to create by hand.

Nonetheless, flatterer recently gained a feature called pushdown, which can copy all id fields down to the child tables of one-to-many relationships, prefixing each id field with the name of the table where it originated. This is similar to the id_field feature in flattentool. Using this feature, I think it would be possible to do the unflattening in a script, or potentially to convert the headings to ones that flatten-tool expects. I have not thought about this deeply, though, so I am not 100% sure how feasible this is.
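
To make the idea concrete, here is a rough sketch of how a script might rebuild a one-to-many relationship from pushed-down id columns. It uses only plain Python and does not call flatterer itself. The table names, the sample rows and the `networks_id` column name are all hypothetical; the exact prefix format that pushdown produces should be checked against flatterer's real output.

```python
from collections import defaultdict


def unflatten(parent_rows, child_rows, parent_id, pushed_id, child_key):
    """Nest child rows under their parent using a pushed-down parent id column."""
    children = defaultdict(list)
    for row in child_rows:
        row = dict(row)  # copy so the caller's rows are not mutated
        # Remove the pushed-down id from the child and group by its value.
        children[row.pop(pushed_id)].append(row)
    return [
        {**parent, child_key: children.get(parent[parent_id], [])}
        for parent in parent_rows
    ]


# Hypothetical rows, shaped like what csv.DictReader would yield.
networks = [{"id": "n1", "name": "Example network"}]
nodes = [
    {"networks_id": "n1", "id": "a", "name": "Node A"},  # "networks_id" is an assumed heading
    {"networks_id": "n1", "id": "b", "name": "Node B"},
]

nested = unflatten(networks, nodes, "id", "networks_id", "nodes")
```

This only handles a single level of nesting. Deeper hierarchies would need the same grouping applied from the innermost table outwards.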

@duncandewhurst
Collaborator Author

Thanks, @kindly

Would using a data package schema be an alternative to unflattening for CSV validation?


@duncandewhurst duncandewhurst self-assigned this Sep 12, 2022
@duncandewhurst
Collaborator Author

For the Alpha at least, we've decided to use Flatten Tool because it offers support for round-tripping and has better organisational support. I've opened an issue about support for streaming in Flatten Tool: OpenDataServices/flatten-tool#399
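
For readers unfamiliar with the term, round-tripping here means that flattening JSON to tabular form and then unflattening it returns the original data. The sketch below illustrates the property in plain Python, with slash-separated path headings as an assumed convention. It is not Flatten Tool's implementation, and the function names and sample data are hypothetical.

```python
def flatten_paths(value, prefix=""):
    """Flatten nested dicts and lists into {path: scalar} using slash-separated paths."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}/{key}" if prefix else str(key)
        flat.update(flatten_paths(child, path))
    return flat


def _listify(node):
    """Turn dicts whose keys are all digits back into lists, recursively."""
    if not isinstance(node, dict):
        return node
    converted = {key: _listify(child) for key, child in node.items()}
    if converted and all(key.isdigit() for key in converted):
        return [converted[key] for key in sorted(converted, key=int)]
    return converted


def unflatten_paths(flat):
    """Rebuild the nested structure from {path: scalar}."""
    root = {}
    for path, value in flat.items():
        node = root
        parts = path.split("/")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return _listify(root)


# Hypothetical sample data.
original = {"id": "n1", "nodes": [{"id": "a"}, {"id": "b"}]}
flat = flatten_paths(original)
round_tripped = unflatten_paths(flat)
```

The sketch has known gaps: empty lists and objects are dropped on flattening, keys containing `/` are not supported, and an object whose keys are all digits would come back as a list.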

@duncandewhurst duncandewhurst modified the milestones: Alpha, Beta Sep 15, 2022
@duncandewhurst
Collaborator Author

We've not heard any further feedback on this issue so I'm going to close it for now.

4 participants