Create extraction interface #3

rumkin · 2019-04-03T04:00:09Z

Create a content extraction interface. It should receive HTML and return an object with:

title
content
entry point

The list could be extended.

tamb · 2019-08-23T20:58:43Z

I agree, but I think it should follow the Response API more closely. https://developer.mozilla.org/en-US/docs/Web/API/Response

So maybe instead of title we have headers.

headers should contain an object holding any metadata. All of this should be sanitized to strip functions, so it should be stringified to prevent XSS. Idk the best method to do this.

body : contains the response body

and then entry point, container, slot, saddle, or whatever term we pick for where the content will be loaded.
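A minimal sketch of the shape described above, assuming TypeScript. The names `Extracted`, `sanitizeMeta`, and the `container` field are hypothetical and not part of pill. It only shows one way to drop functions and stringify the remaining metadata values; stringifying alone does not escape HTML, so values would still need escaping wherever they are written into markup.

```typescript
// Hypothetical Response-like result of extraction (names are assumptions).
type Extracted = {
  headers: Record<string, string>; // sanitized metadata, strings only
  body: string;                    // the response body
  container: string;               // where the content will be loaded, e.g. a selector
};

// Drop callables and other non-serializable values, stringify everything else.
function sanitizeMeta(meta: Record<string, unknown>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [key, value] of Object.entries(meta)) {
    const kind = typeof value;
    if (kind === "function" || kind === "symbol" || kind === "undefined") {
      continue; // never carry these into the extracted headers
    }
    safe[key] = kind === "string" ? (value as string) : JSON.stringify(value);
  }
  return safe;
}

// Assemble the Response-like object from raw pieces.
function makeExtracted(
  meta: Record<string, unknown>,
  body: string,
  container: string
): Extracted {
  return { headers: sanitizeMeta(meta), body, container };
}
```

With this sketch, `makeExtracted({ title: "Home", onload: () => 1 }, "<p>Hi</p>", "#content")` would keep `title` and silently drop `onload`.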

rumkin · 2019-08-25T09:47:59Z

In its current form I was trying to avoid collisions between the HTML of two pages and minimize memory usage; that's why it pulls only the title and the container's data.

Now I think it should be improved to make it possible to load pages with a different structure, for injecting things like the viewport meta tag (<meta name="viewport" ...) and OS-specific links (like <link rel="apple-touch-icon" ...). So I think there will be head and content properties represented as strings, plus an expires property.

While the expires property could require information from the response headers, the headers will be passed to the extraction callback as an argument and used once to produce static data. Maybe it would also be right to allow defining custom properties for the extracted content. For that we need extraction and prerendering callbacks. I think the interface could look like this:

type Page = {
  head: String,
  content: String,
  expires: Date,
  props: CustomProps,
}

type CustomProps = {[key:String]: Boolean|Number|String|Object|Array}

type ExtractCallback = (url: String, headers: Map, body: Uint8Array) => Page
type PrerenderCallback = (page: Page) => Page
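As an illustration only, here is how an extraction callback matching these types might look in TypeScript. The function `extractPage`, the regex-based slicing of `<head>` and `<body>`, and the fallback of treating a missing or unparsable `Expires` header as already expired are all assumptions made for this sketch; a real implementation would more likely use a proper HTML parser.

```typescript
// Types restated from the proposal with TypeScript primitives.
type CustomProps = {
  [key: string]: boolean | number | string | object | unknown[];
};

type Page = {
  head: string;
  content: string;
  expires: Date;
  props: CustomProps;
};

type ExtractCallback = (
  url: string,
  headers: Map<string, string>,
  body: Uint8Array
) => Page;

// Hypothetical extractor: naive regex slicing, for illustration only.
const extractPage: ExtractCallback = (url, headers, body) => {
  const html = new TextDecoder("utf-8").decode(body);
  const head = /<head(?:\s[^>]*)?>([\s\S]*?)<\/head>/i.exec(html)?.[1] ?? "";
  const content = /<body(?:\s[^>]*)?>([\s\S]*?)<\/body>/i.exec(html)?.[1] ?? "";

  // Response headers are read once, here, to produce static data.
  const rawExpires = headers.get("expires");
  const parsed = rawExpires ? new Date(rawExpires) : new Date(NaN);
  // Assumption: no usable Expires header means "already expired".
  const expires = Number.isNaN(parsed.getTime()) ? new Date(0) : parsed;

  return {
    head: head.trim(),
    content: content.trim(),
    expires,
    props: { url }, // custom data the developer chooses to keep
  };
};
```

The header lookup assumes lowercase keys in the `Map`; the proposal does not say how header names are normalized.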

The Page type is pill's internal representation of the document. It should be convertible to JSON so it can be placed into history. It also has props, where the developer could put data to be used at the prerendering step.
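One detail of JSON conversion is that a `Date` does not survive a round trip: `JSON.stringify` writes it as an ISO string and `JSON.parse` leaves it a string. A rough sketch of how that could be handled, assuming TypeScript; `StoredPage`, `toStored`, and `fromStored` are hypothetical names, not part of pill.

```typescript
type CustomProps = {
  [key: string]: boolean | number | string | object | unknown[];
};

type Page = { head: string; content: string; expires: Date; props: CustomProps };

// JSON-safe form of Page: the Date is kept as an ISO 8601 string.
type StoredPage = { head: string; content: string; expires: string; props: CustomProps };

function toStored(page: Page): StoredPage {
  // toISOString throws on an invalid Date, so expires must be valid here.
  return { ...page, expires: page.expires.toISOString() };
}

function fromStored(stored: StoredPage): Page {
  return { ...stored, expires: new Date(stored.expires) };
}
```

The stored form can then go through `JSON.stringify` and `JSON.parse` unchanged, provided whatever the developer puts into props is itself JSON-safe.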

Will it cover your needs?

tamb · 2019-08-25T13:13:05Z

So the head would be replaced entirely? That would be fine with me. Technically it's a new page so that would be what's expected.

The issue with custom props, etc. is that you'd have to remove them on unload.

This is tricky. Turbolinks merges the head. But I think that's overkill. I think it's probably safe to assume that the head will be similar across pages.

Maybe the extraction interface should just contain more data, and let the dev do what they want with it. Keep the content loading fairly unopinionated.

rumkin · 2019-09-01T21:41:00Z

CustomProps is just a key-value store which the developer could use at the rendering stage. So only the developer decides what will be stored and how it will be rendered. I think this overlaps with your suggestion to let the developer decide what to do. I just want the extracted data to be well structured. In this case the developer could store everything stringified in the head and content props, or as structured data in custom props.
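To make that concrete, a prerendering callback could read a value from props and use it when producing the final content. This is only a sketch in TypeScript: the `theme` prop and the `applyTheme` function are invented for illustration, and the wrapper markup is an arbitrary choice.

```typescript
type CustomProps = {
  [key: string]: boolean | number | string | object | unknown[];
};

type Page = { head: string; content: string; expires: Date; props: CustomProps };

type PrerenderCallback = (page: Page) => Page;

// Hypothetical prerender step: wrap content in a themed container
// when the extractor stored a "theme" prop.
const applyTheme: PrerenderCallback = (page) => {
  const theme = page.props["theme"];
  // Only accept simple class-name-like values; anything else is ignored.
  if (typeof theme !== "string" || !/^[a-z0-9-]+$/i.test(theme)) {
    return page;
  }
  return {
    ...page,
    content: `<div class="theme-${theme}">${page.content}</div>`,
  };
};
```

The callback returns a new Page rather than mutating its input, so the extracted data stays unchanged if it is stored elsewhere.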

rumkin added the enhancement New feature or request label Sep 1, 2019
