Getting started

timrdf edited this page Feb 15, 2012 · 74 revisions

What do you want to do?

I want to see results!

Head to http://aquarius.tw.rpi.edu/datafaqs to see the 332 lodcloud datasets (those in the LOD Cloud diagram).

(We're in the process of transitioning to http://aquarius.tw.rpi.edu/projects/datafaqs.)

I want to know why you're making DataFAQs

What is quality?

We don't know -- and want to find out. Although we have some ideas about [what makes up data quality](Data Quality), we're sure that others have different and better ideas. That's why DataFAQs is designed so that others can share their views on what makes "good data".

We're kicking around some ideas for using DataFAQs, and jotting notes here:

I want to evaluate others' datasets with others' existing services

Do you want to get your feet wet?

We hope you do!

This is the simplest route, since you don't have to worry about publishing data or writing an evaluation service. After you finish this, you'll probably want to move on to analyze your own datasets or write an evaluation service to reflect an aspect that you think is important for others to know about.

Take this route:

  • DataFAQs runs server side, so you'll have to [install it](Installing DataFAQs).
  • Prototype deployment details walks through the steps we took to set everything up.
  • DATAFAQS environment variables are used to specify some directory locations and processing options.
  • Once installed, you can create a FAqT Brick to run an analysis with a default configuration.
  • A list of Errors and fixes might get you along a little faster.

I want to evaluate my own datasets with others' existing services

If you're publishing data, would you like to know what your audience thinks about it? Would you like to get status updates on how well your published data is doing? Would you like concrete, actionable analysis that leads you towards publishing better data?

We do too.

Take this route:

  • Listing your dataset at CKAN is a quick and easy way to announce your dataset. This will let more people find it. Plus, a bunch of systems are built to pull from CKAN's listings (including DataFAQs). So it's a win-win.
  • If you have a pile of datasets and want to avoid manually entering them into CKAN, CKAN has a pretty simple API. Unfortunately, if you want to get into the LOD Cloud, you have to jump through some extra hoops and use some barely documented conventions. We think it'd be nicer to describe your datasets using [RDF to begin with](CKAN lodcloud RDF vocabulary), and let some thingamawidget submit it to CKAN for you.
  • If you use some thingamawidget to submit your datasets to CKAN, you'll need to supply an API key; see Missing CKAN API Key if you run into that error.
  • DCAT Data Catalog Vocabulary - another convention from which one can find out about datasets.
  • LOD Cloud - the subset of Linked Data that is in the lodcloud CKAN group.
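
To give a feel for what "a pretty simple API" can look like, here is a minimal sketch that builds (but does not send) a dataset-creation request. It assumes CKAN's current Action API (`/api/3/action/package_create` with the API key in an `Authorization` header), which may differ from the API version in use when this page was written. The function name, the example URL, the key, and the dataset values are all hypothetical, and this is not how DataFAQs itself submits datasets.

```python
import json
import urllib.request


def build_ckan_create_request(ckan_url, api_key, name, title, groups=()):
    """Build (but do not send) a CKAN Action API `package_create` request.

    Assumption: the target CKAN instance exposes the v3 Action API and
    accepts the API key in the `Authorization` header.
    """
    payload = {
        "name": name,    # URL-safe dataset identifier
        "title": title,  # human-readable title
        # Groups are passed as a list of {"name": ...} dicts; adding a
        # dataset to a group normally requires permission on that group.
        "groups": [{"name": g} for g in groups],
    }
    return urllib.request.Request(
        url=ckan_url.rstrip("/") + "/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only; nothing is sent.
    req = build_ckan_create_request(
        "https://ckan.example.org",
        "my-api-key",
        "my-dataset",
        "My Dataset",
        groups=["lodcloud"],
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually submit it. Keeping construction separate from sending makes it easy to inspect the payload first, and a missing or wrong key only shows up once the request is sent.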

I want to tell people how much I like/dislike their dataset

Are you trying to use other people's data? Are they making it harder than it needs to be for you to use it? Want to let them know? After you go through the hassle of telling them, would you like it if other data publishers heeded your feedback without you having to lift a finger?

We do too.

Take this route:

I want to know how to analyze the results

We do too.

Take this route:

I want nuts and bolts

I want some background

Related work:

  • Pedantic Web Group
  • Integration tools
  • Validation tools
  • Testing apparatuses
  • frbr:lebo2012datafaqs