Promote "Metadata in Table Schema" recipe to the specs. #961

Open
wants to merge 31 commits into
base: main
Changes from 30 commits
Commits
ebfc90f
Schema Metadata spec proposal
pierrecamilleri Apr 25, 2024
cdbd1c9
🔵 wording and typos
pierrecamilleri Apr 25, 2024
2631462
taking into account review comments
pierrecamilleri Jun 11, 2024
83396b7
Remove `path` property
pierrecamilleri Jun 24, 2024
d9221c6
Remove `updated` property
pierrecamilleri Jun 24, 2024
ded47a6
inherits "if distribted in a data package descriptor"
pierrecamilleri Jun 24, 2024
2161fb0
resource -> example resource
pierrecamilleri Jun 24, 2024
f7ffacf
`name` "as for Data Package
pierrecamilleri Jun 24, 2024
e59b7d9
Remove sources and build profile
pierrecamilleri Jun 24, 2024
213ef87
`examples` are `Data Resource`s
pierrecamilleri Jun 26, 2024
5fdd322
Attempt at removing "schema" property from table schema example
pierrecamilleri Jun 26, 2024
9a32b79
Revert "Attempt at removing "schema" property from table schema example"
pierrecamilleri Jun 26, 2024
1088c70
Revert "`examples` are `Data Resource`s"
pierrecamilleri Jun 26, 2024
229bbf6
Clear-up examples documentation (ref to data files)
pierrecamilleri Jun 26, 2024
eddb54c
fix: created stores only creation date
pierrecamilleri Jun 26, 2024
063ba61
changelog
pierrecamilleri Jul 5, 2024
02c8482
fix: link
pierrecamilleri Jun 26, 2024
fac7c7f
fix: unintended change
pierrecamilleri Jul 5, 2024
e640063
fix: link
pierrecamilleri Jul 5, 2024
c5ced5b
Update doc with Peters suggestions + fix links
pierrecamilleri Jul 12, 2024
800593f
"partial Data Resource" ?
pierrecamilleri Jul 12, 2024
3f80e97
Use title case for Data Package
peterdesmet Jul 15, 2024
5773c6b
Correct typo + align Data Package
peterdesmet Jul 15, 2024
935cbf5
Require name for examples
peterdesmet Jul 15, 2024
ad023bf
Update profiles/source/dictionary/common.yaml
pierrecamilleri Jul 26, 2024
eb1117e
Update profiles/source/dictionary/schema.yaml
pierrecamilleri Jul 26, 2024
dbd4ebd
Update profiles/source/dictionary/schema.yaml
pierrecamilleri Jul 26, 2024
79d915e
Update profiles/source/dictionary/schema.yaml
pierrecamilleri Jul 26, 2024
615a635
Update : example title to example name
pierrecamilleri Jul 26, 2024
a3233f6
Rephrase #961 changes
peterdesmet Sep 19, 2024
8288aa7
Change order cf. Data Package
peterdesmet Sep 19, 2024
38 changes: 38 additions & 0 deletions content/docs/overview/changelog.md
@@ -6,6 +6,44 @@ sidebar:

This document includes all meaningful changes made to the **Data Package standard**. It does not cover changes made to other documents like Recipes or Guides.

## v2.1-draft

##### `schema.name` (new)

[`name`](/standard/table-schema/#name) allows to specify a name for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.title` (new)

[`title`](/standard/table-schema/#title) allows to specify a title for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.description` (new)

[`description`](/standard/table-schema/#description) allows to specify a description for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.homepage` (new)

[`homepage`](/standard/table-schema/#homepage) allows to specify a homepage for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.keywords` (new)

[`keywords`](/standard/table-schema/#keywords) allows to specify keywords for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.examples` (new)

[`examples`](/standard/table-schema/#examples) allows to specify a list of illustrative data resources that use a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.created` (new)

[`created`](/standard/table-schema/#created) allows to specify when a schema was created ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.version` (new)

[`version`](/standard/table-schema/#version) allows to specify a version for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

##### `schema.contributors` (new)

[`contributors`](/standard/table-schema/#contributors) allows to specify contributors for a schema ([#961](https://github.com/frictionlessdata/datapackage/pull/961)).

## v2.0

> June 26, 2024
2 changes: 1 addition & 1 deletion content/docs/standard/data-package.md
@@ -219,7 +219,7 @@ An Array of string keywords to assist users searching for the package in catalog

### `contributors`

The people or organizations who contributed to this Data Package. It `MUST` be an array. Each entry is a Contributor and `MUST` be an `object`. A Contributor `MUST` have at least one property. A Contributor is `RECOMMENDED` to have `title` property and `MAY` contain `givenName`, `familyName`, `path`, `email`, `roles`, and `organization` properties:
The people or organizations that contributed to this Data Package. It `MUST` be an array. Each entry is a Contributor and `MUST` be an `object`. A Contributor `MUST` have at least one property. A Contributor is `RECOMMENDED` to have `title` property and `MAY` contain `givenName`, `familyName`, `path`, `email`, `roles`, and `organization` properties:

- `title`: A string containing a name of the contributor.
- `givenName`: A string containing the name a person has been given, if the contributor is a person.
42 changes: 40 additions & 2 deletions content/docs/standard/table-schema.md
@@ -189,7 +189,7 @@ In contrast with `field.constraints.unique`, `uniqueKeys` allows to define uniqu

#### `foreignKeys` {#foreignKeys}

A foreign key is a reference where values in a field (or fields) on the table ('resource' in data package terminology) described by this Table Schema connect to values a field (or fields) on this or a separate table (resource). They are directly modelled on the concept of foreign keys in SQL.
A foreign key is a reference where values in a field (or fields) on the table ('resource' in Data Package terminology) described by this Table Schema connect to values a field (or fields) on this or a separate table (resource). They are directly modelled on the concept of foreign keys in SQL.

The `foreignKeys` property, if present, `MUST` be an Array. Each entry in the array `MUST` be a `foreignKey`. A `foreignKey` `MUST` be a `object` and `MUST` have the following properties:

@@ -198,7 +198,7 @@ The `foreignKeys` property, if present, `MUST` be an Array. Each entry in the ar
key. The structure of the array is as per `primaryKey` above.
- `reference` - `reference` `MUST` be a `object`. The `object`
- `MUST` have a property `fields` which is an array of strings of the same length as the outer `fields`, describing the field (or fields) references on the destination resource. The structure of the array is as per `primaryKey` above.
- `MAY` have a property `resource` which is the name of the resource within the current data package, i.e. the data package within which this Table Schema is located. For referencing another data resource the `resource` property `MUST` be provided. For self-referencing, i.e. references between fields in this Table Schema, the `resource` property `MUST` be omitted.
- `MAY` have a property `resource` which is the name of the resource within the current Data Package, i.e. the Data Package within which this Table Schema is located. For referencing another Data Resource the `resource` property `MUST` be provided. For self-referencing, i.e. references between fields in this Table Schema, the `resource` property `MUST` be omitted.

Here's an example:

@@ -266,6 +266,44 @@ If the value of the `foreignKey.reference.resource` property is an empty string
Data consumer MUST support the `foreignKey.fields` and `foreignKey.reference.fields` properties in a form of a single string e.g. `"fields": "a"` which was a part of the `v1.0` of the specification.
:::
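
The backward-compatibility note above can be illustrated with a small normalization step. This is only a sketch of one way a data consumer might accept the `v1.0` single-string form; the helper names `_as_list` and `normalize_foreign_key` are hypothetical and not part of the standard or of any library.

```python
from copy import deepcopy
from typing import Any


def _as_list(value: Any) -> list[str]:
    """Wrap a single field name in a list; pass arrays through as lists."""
    if isinstance(value, str):
        return [value]
    return list(value)


def normalize_foreign_key(foreign_key: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a foreignKey with both `fields` values as arrays."""
    normalized = deepcopy(foreign_key)
    normalized["fields"] = _as_list(normalized["fields"])
    reference = normalized["reference"]
    reference["fields"] = _as_list(reference["fields"])
    return normalized


# v1.0-style key: single strings instead of arrays.
legacy_key = {"fields": "a", "reference": {"resource": "other", "fields": "id"}}
print(normalize_foreign_key(legacy_key))
```

Normalizing once at load time would let the rest of a consumer assume the array form only.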

#### `name`

A simple name or identifier for the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#name)).
Member

Suggested change
A simple name or identifier for the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#name)).
A simple name or identifier for the schema.


#### `title`

A string providing a title or one sentence description for the schema.

#### `description`

A description of the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#description)).
Member

Suggested change
A description of the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#description)).
A description of the schema.


#### `homepage`

A URL for the home on the web that is related to the schema.

#### `keywords`

An array of string keywords to assist users searching for the schema in catalogs.
Member

@pierrecamilleri would you argue this inherits from Data Package or not? If so, please change to:

Suggested change
An array of string keywords to assist users searching for the schema in catalogs.
An array of string keywords to assist users searching for the schema in catalogs. If not specified, the schema inherits from the Data Package if distributed in a Data Package descriptor.


#### `examples`

A list of Data Resources that use and illustrate the schema.

If present, it `MUST` be a non-empty array of objects. Each object is a [Data Resource](https://datapackage.org/standard/data-resource/) that `MUST` at least have the `name` and `path` property. The `path` must be a URL.
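
As a rough illustration of the rule above, the following sketch checks an `examples` value against the prose requirements (non-empty array, each entry an object with `name` and `path`, `path` a URL). The function name `check_examples` is hypothetical, and treating "URL" as an `http`/`https` URL with a host is an assumption. Note that the YAML profile later in this diff sets `minItems: 0`; the sketch follows the prose.

```python
from typing import Any
from urllib.parse import urlparse


def check_examples(examples: Any) -> list[str]:
    """Return a list of problems found in a schema's `examples` value."""
    if not isinstance(examples, list) or not examples:
        return ["`examples` must be a non-empty array"]
    problems = []
    for index, example in enumerate(examples):
        if not isinstance(example, dict):
            problems.append(f"examples[{index}] must be an object")
            continue
        for key in ("name", "path"):
            if not isinstance(example.get(key), str):
                problems.append(f"examples[{index}] is missing a string `{key}`")
        path = example.get("path")
        if isinstance(path, str):
            parsed = urlparse(path)
            # Assumption: "URL" means an http(s) URL with a host.
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append(f"examples[{index}].path must be a URL")
    return problems


print(check_examples([{"name": "valid-data", "path": "http://example.com/valid-data.csv"}]))
```

An empty result means no problems were found; a relative path such as `data.csv` would be reported because it is not a URL.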

#### `created`

The datetime on which the schema was created (cf. [Data Package](https://datapackage.org/standard/data-package/#created)).
Member

Suggested change
The datetime on which the schema was created (cf. [Data Package](https://datapackage.org/standard/data-package/#created)).
The datetime on which the schema was created.

Member
peterdesmet Sep 19, 2024

I think this could inherit from Data Package. Would add:

If not specified, the schema inherits from the Data Package if distributed in a Data Package descriptor.


#### `version`

A version string identifying the version of the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#version)). If not specified, the schema inherits from the Data Package if distributed in a Data Package descriptor.
Member

Suggested change
A version string identifying the version of the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#version)). If not specified, the schema inherits from the Data Package if distributed in a Data Package descriptor.
A version string identifying the version of the schema. If not specified, the schema inherits from the Data Package if distributed in a Data Package descriptor.


#### `contributors`

The people or organizations that contributed to the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#contributors)). If not specified the schema inherits from the Data Package if distributed in a Data Package descriptor.
Member

Suggested change
The people or organizations that contributed to the schema (cf. [Data Package](https://datapackage.org/standard/data-package/#contributors)). If not specified the schema inherits from the Data Package if distributed in a Data Package descriptor.
The people or organizations that contributed to the schema. If not specified the schema inherits from the Data Package if distributed in a Data Package descriptor.
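
The inheritance wording used for `version` and `contributors` above could be read as the following lookup. This is a sketch under assumptions, not the standard's algorithm: the function name `resolve_schema_metadata` and the `INHERITED` tuple are hypothetical, and the example descriptors are invented for illustration.

```python
from typing import Any, Optional

# Assumption: only these two properties inherit, as in the text above.
INHERITED = ("version", "contributors")


def resolve_schema_metadata(
    schema: dict[str, Any], package: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Return a copy of the schema with inherited values filled in."""
    resolved = dict(schema)
    if package is None:
        # Standalone schema: there is no Data Package to inherit from.
        return resolved
    for key in INHERITED:
        if key not in resolved and key in package:
            resolved[key] = package[key]
    return resolved


package = {
    "name": "example-package",
    "version": "1.2.0",
    "contributors": [{"title": "Joe Bloggs"}],
}
schema = {
    "name": "example-schema",
    "title": "Example schema",
    "fields": [{"name": "id", "type": "integer"}],
    "version": "0.1.0",
}
resolved = resolve_schema_metadata(schema, package)
print(resolved["version"])       # the schema's own version is kept
print(resolved["contributors"])  # taken from the Data Package
```

If the review suggestions for `keywords` and `created` were accepted, extending `INHERITED` would be the only change this sketch needs.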


### Field

A field descriptor `MUST` be a JSON `object` that describes a single field. The descriptor provides additional human-readable documentation for a field, as well as additional information that can be used to validate the field or create a user interface for data entry.
4 changes: 2 additions & 2 deletions profiles/source/dictionary/common.yaml
@@ -72,7 +72,7 @@ example:
}
homepage:
title: Home Page
description: The home on the web that is related to this data package.
description: The home on the web that is related to this descriptor.
type: string
format: uri
examples:
@@ -149,7 +149,7 @@ created:
}
keywords:
title: Keywords
description: A list of keywords that describe this package.
description: A list of keywords that describe this descriptor.
type: array
minItems: 1
items:
45 changes: 45 additions & 0 deletions profiles/source/dictionary/schema.yaml
@@ -81,6 +81,24 @@ tableSchema:
}
missingValues:
"$ref": "#/definitions/tableSchemaMissingValues"
name:
"$ref": "#/definitions/name"
title:
"$ref": "#/definitions/title"
description:
"$ref": "#/definitions/description"
homepage:
"$ref": "#/definitions/homepage"
keywords:
"$ref": "#/definitions/keywords"
examples:
"$ref": "#/definitions/tableSchemaExamples"
created:
"$ref": "#/definitions/created"
version:
"$ref": "#/definitions/version"
contributors:
Member

Four properties from Data Package are not carried over to Schema. I'm fine with waiting for use cases, but wanted to know your opinion on these @pierrecamilleri:

  • id: potentially useful. In my use case, we opted for path instead (the URL where the schema is available), which could be useful to add.
  • licenses: likely a bit tricky to inherit, would not add for now
  • image: probably seldom useful
  • sources: not really applicable here

"$ref": "#/definitions/contributors"
examples:
- |
{
@@ -263,6 +281,33 @@ tableSchemaMissingValues:
{
"missingValues": []
}
tableSchemaExamples:
title: Examples
description: A list of Data Resources that use and illustrate the schema.
type: array
minItems: 0
items:
"$ref": "#/definitions/tableSchemaExample"
examples:
- |
{
"examples": [
{
"name": "valid-data",
"path": "http://example.com/valid-data.csv"
}
]
}
tableSchemaExample:
title: Example
description: A Data Resource that uses and illustrates the schema.
type: object
properties:
name:
"$ref": "#/definitions/name"
path:
"$ref": "#/definitions/path"
required: ["name", "path"]
tableSchemaFieldString:
type: object
title: String Field
174 changes: 171 additions & 3 deletions profiles/target/2.0/datapackage.json
@@ -57,7 +57,7 @@
"homepage": {
"propertyOrder": 60,
"title": "Home Page",
"description": "The home on the web that is related to this data package.",
"description": "The home on the web that is related to this resource.",
Member

Two notes:

  1. These files should be rebuilt based on the changes above.
  2. Is it a good idea to rebuild these files? I think we should have a new directory target/2.1-draft

Author

I am not familiar with the build process; in particular, what is the difference between "build/profiles" and "profiles/target"?

When I run "build.js" with "VERSION=2.1-draft", it does not create (or, for that matter, populate) a target/2.1-draft directory 🤔

Member

@peterdesmet
@pierrecamilleri
I need to set up the branching process for new versions so it's OK if this PR just includes actual changes to the standard and I'll handle the rest

"type": "string",
"format": "uri",
"examples": [
@@ -153,7 +153,7 @@
"keywords": {
"propertyOrder": 90,
"title": "Keywords",
"description": "A list of keywords that describe this package.",
"description": "A list of keywords that describe this descriptor.",
"type": "array",
"minItems": 1,
"items": {
@@ -348,7 +348,7 @@
"homepage": {
"propertyOrder": 70,
"title": "Home Page",
"description": "The home on the web that is related to this data package.",
"description": "The home on the web that is related to this resource.",
Member

Suggested change
"description": "The home on the web that is related to this resource.",
"description": "The home on the web that is related to this data package.",

@roll If these files are rebuilt, it should say descriptor (general) or Data Package (specific) here.

"type": "string",
"format": "uri",
"examples": [
@@ -3093,6 +3093,174 @@
"{\n \"missingValues\": [\n \"-\",\n \"NaN\",\n \"\"\n ]\n}\n",
"{\n \"missingValues\": []\n}\n"
]
},
"name": {
"title": "Name",
"description": "An identifier string.",
"type": "string",
"context": "This is ideally a url-usable and human-readable name. Name `SHOULD` be invariant, meaning it `SHOULD NOT` change when its parent descriptor is updated.",
"examples": [
"{\n \"name\": \"my-nice-name\"\n}\n"
]
},
"title": {
"title": "Title",
"description": "A human-readable title.",
"type": "string",
"examples": [
"{\n \"title\": \"My Package Title\"\n}\n"
]
},
"description": {
"title": "Description",
"description": "A text description. Markdown is encouraged.",
"type": "string",
"examples": [
"{\n \"description\": \"# My Package description\\nAll about my package.\"\n}\n"
]
},
"homepage": {
"title": "Home Page",
"description": "The home on the web that is related to this resource.",
"type": "string",
"format": "uri",
"examples": [
"{\n \"homepage\": \"http://example.com/\"\n}\n"
]
},
"keywords": {
"title": "Keywords",
"description": "A list of keywords that describe this descriptor.",
"type": "array",
"minItems": 1,
"items": {
"type": "string"
},
"examples": [
"{\n \"keywords\": [\n \"data\",\n \"fiscal\",\n \"transparency\"\n ]\n}\n"
]
},
"examples": {
"title": "Examples",
"description": "Links to example data files",
"type": "array",
"minItems": 0,
"items": {
"title": "Example",
"description": "Link to an example data file",
"type": "object",
"properties": {
"title": {
"title": "Title",
"description": "A human-readable title.",
"type": "string",
"examples": [
"{\n \"title\": \"My Package Title\"\n}\n"
]
},
"path": {
"title": "Path",
"description": "A fully qualified URL, or a POSIX file path.",
"type": "string",
"pattern": "^((?=[^./~])(?!file:)((?!\\/\\.\\.\\/)(?!\\\\)(?!:\\/\\/).)*|(http|ftp)s?:\\/\\/.*)$",
"examples": [
"{\n \"path\": \"file.csv\"\n}\n",
"{\n \"path\": \"http://example.com/file.csv\"\n}\n"
],
"context": "Implementations need to negotiate the type of path provided, and dereference the data accordingly."
}
},
"required": [
"title",
"path"
]
},
"examples": [
"{\n \"examples\": [\n {\n \"title\": \"Valid data\",\n \"path\": \"http://example.com/valid-data.csv\"\n }\n ]\n}\n"
]
},
"created": {
"title": "Created",
"description": "The datetime on which this descriptor was created.",
"context": "The datetime must conform to the string formats for datetime as described in [RFC3339](https://tools.ietf.org/html/rfc3339#section-5.6)",
"type": "string",
"format": "date-time",
"examples": [
"{\n \"created\": \"1985-04-12T23:20:50.52Z\"\n}\n"
]
},
"version": {
"title": "Version",
"description": "A unique version number for this descriptor.",
"type": "string",
"examples": [
"{\n \"version\": \"0.0.1\"\n}\n",
"{\n \"version\": \"1.0.1-beta\"\n}\n"
]
},
"contributors": {
"title": "Contributors",
"description": "The contributors to this descriptor.",
"type": "array",
"minItems": 1,
"items": {
"title": "Contributor",
"description": "A contributor to this descriptor.",
"properties": {
"title": {
"title": "Title",
"description": "A human-readable title.",
"type": "string",
"examples": [
"{\n \"title\": \"My Package Title\"\n}\n"
]
},
"path": {
"title": "Path",
"description": "A fully qualified URL, or a POSIX file path.",
"type": "string",
"pattern": "^((?=[^./~])(?!file:)((?!\\/\\.\\.\\/)(?!\\\\)(?!:\\/\\/).)*|(http|ftp)s?:\\/\\/.*)$",
"examples": [
"{\n \"path\": \"file.csv\"\n}\n",
"{\n \"path\": \"http://example.com/file.csv\"\n}\n"
],
"context": "Implementations need to negotiate the type of path provided, and dereference the data accordingly."
},
"email": {
"title": "Email",
"description": "An email address.",
"type": "string",
"format": "email",
"examples": [
"{\n \"email\": \"example@example.com\"\n}\n"
]
},
"givenName": {
"type": "string"
},
"familyName": {
"type": "string"
},
"organization": {
"title": "Organization",
"description": "An organizational affiliation for this contributor.",
"type": "string"
},
"roles": {
"type": "array",
"minItems": 1,
"items": {
"type": "string"
}
}
},
"minProperties": 1,
"context": "Use of this property does not imply that the person was the original creator of, or a contributor to, the data in the descriptor, but refers to the composition of the descriptor itself."
},
"examples": [
"{\n \"contributors\": [\n {\n \"title\": \"Joe Bloggs\"\n }\n ]\n}\n",
"{\n \"contributors\": [\n {\n \"title\": \"Joe Bloggs\",\n \"email\": \"joe@example.com\",\n \"role\": \"author\"\n }\n ]\n}\n"
]
}
},
"examples": [