Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define schema for sensor data table #112

Open
JSegrave-IBM opened this issue Sep 9, 2020 · 10 comments
Open

Define schema for sensor data table #112

JSegrave-IBM opened this issue Sep 9, 2020 · 10 comments
Assignees

Comments

@JSegrave-IBM
Copy link
Collaborator

JSegrave-IBM commented Sep 9, 2020

Proposed Schema below.

Key points:

  • There are two tables:
    • firefighter_sensor_log where each row is the raw/unmodified sensor readings associated with firefighter_id during the minute timestamp_mins - we never modify this data
    • firefighter_status_analytics - where each row the same data as the sensor log, plus all the analytic information about firefighter_id during the minute timestamp_mins. This is a kind of one-stop-shop for data analysis - everything about the firefighter in one place. Making it a separate table to firefighter_sensor_log allows us to keep an untouched record of what was sent. It also removes a few potential contention points between the processes that read and write information.
  • Keys for both tables are (timestamp_mins, firefighter_id) - note the ID change from previous SensorID
  • timestamp_mins is a minute-quantized timestamp. Important that these keys be minute-quantized - e.g. '10/02/2020 09:40' not '10/02/2020 09:40:34'. Also note the name change - because plain 'ol timestamp is an SQL keyword
  • device_timestamp is the unmodified original device timestamp.
  • The sensors and all time-weighted averages (TWAs) are floats because the ranges vary a lot (from 10^3 to 10^-3)
  • The gauges are all ints because they're the percentage over/under the limit - we don't need float-level detail.
  • Example dataset from Feb (with real CO and mocked-up NO2, but no other gases) is on box here.
  • Using the gas name, because while 'CO' and 'NO2' have become familiar, CH2O, C3H4O and C6H3(CH3)3 ... not so much :-) not a strong opinion though (it's a database, only technical people will see it!), so feel free to change this.
  • Normally the device_status_LED and analytics_status_LED (i.e. the LED colors) will be the same, but in a disconnected scenario, they may be different. We're capturing both. Associated values are { 1 = green ; 2 = yellow ; 3 = red } and for this release, 'yellow' is a configurable percentage, currently defaulting to 80% (i.e. any gas reaching 80% of any AEGL-2 time-period limit will cause the LED to go yellow)
  • DEFAULT CHARSET is set to UTF8mb4; as per discussion with Dan about internationalization.

See Table definition for details below

@JSegrave-IBM
Copy link
Collaborator Author

JSegrave-IBM commented Sep 10, 2020

@lidderupk @mrodrise - table definitions for the sensor log and analytics output...

Key points:

  • There are two tables:
    • firefighter_sensor_log where each row is the raw/unmodified sensor readings associated with firefighter_id during the minute timestamp_mins - we never modify this data
    • firefighter_status_analytics - where each row the same data as the sensor log, plus all the analytic information about firefighter_id during the minute timestamp_mins. This is a kind of one-stop-shop for data analysis - everything about the firefighter in one place. Making it a separate table to firefighter_sensor_log allows us to keep an untouched record of what was sent. It also removes a few potential contention points between the processes that read and write information.
  • Keys for both tables are (timestamp_mins, firefighter_id) - note the ID change from previous SensorID
  • timestamp_mins is a minute-quantized timestamp. Whereasdevice_timestamp is the unmodified original device timestamp.
  • temp, humidity and all gauge percentages are SMALLINT
  • sensors and all time-weighted averages - are FLOAT
  • Normally the device_status_LED and analytics_status_LED (i.e. the LED colors) will be the same, but in a disconnected scenario, they may be different. We're capturing both. Associated values are { 1 = green ; 2 = yellow ; 3 = red } and for this release, 'yellow' is a configurable percentage, currently defaulting to 80% (i.e. any gas reaching 80% of any AEGL-2 time-period limit will cause the LED to go yellow)
  • DEFAULT CHARSET is set to UTF8mb4; as per discussion with Dan about internationalization.
CREATE TABLE `firefighter_status_analytics` (
  `timestamp_mins` timestamp NOT NULL,
  `firefighter_id` VARCHAR(20) NOT NULL,
  `device_id` VARCHAR(20) DEFAULT NULL,
  `device_battery_level` FLOAT DEFAULT NULL,
  `temperature` SMALLINT DEFAULT NULL,
  `humidity` SMALLINT DEFAULT NULL,
  `device_timestamp` timestamp DEFAULT NULL,
  `device_status_LED` SMALLINT DEFAULT NULL,
  `analytics_status_LED` SMALLINT DEFAULT NULL,
  `carbon_monoxide` FLOAT DEFAULT NULL,
  `carbon_monoxide_twa_10min` FLOAT DEFAULT NULL,
  `carbon_monoxide_twa_30min` FLOAT DEFAULT NULL,
  `carbon_monoxide_twa_60min` FLOAT DEFAULT NULL,
  `carbon_monoxide_twa_240min` FLOAT DEFAULT NULL,
  `carbon_monoxide_twa_480min` FLOAT DEFAULT NULL,
  `carbon_monoxide_gauge_10min` SMALLINT DEFAULT NULL,
  `carbon_monoxide_gauge_30min` SMALLINT DEFAULT NULL,
  `carbon_monoxide_gauge_60min` SMALLINT DEFAULT NULL,
  `carbon_monoxide_gauge_240min` SMALLINT DEFAULT NULL,
  `carbon_monoxide_gauge_480min` SMALLINT DEFAULT NULL,
  `nitrogen_dioxide` FLOAT DEFAULT NULL,
  `nitrogen_dioxide_twa_10min` FLOAT DEFAULT NULL,
  `nitrogen_dioxide_twa_30min` FLOAT DEFAULT NULL,
  `nitrogen_dioxide_twa_60min` FLOAT DEFAULT NULL,
  `nitrogen_dioxide_twa_240min` FLOAT DEFAULT NULL,
  `nitrogen_dioxide_twa_480min` FLOAT DEFAULT NULL,
  `nitrogen_dioxide_gauge_10min` SMALLINT DEFAULT NULL,
  `nitrogen_dioxide_gauge_30min` SMALLINT DEFAULT NULL,
  `nitrogen_dioxide_gauge_60min` SMALLINT DEFAULT NULL,
  `nitrogen_dioxide_gauge_240min` SMALLINT DEFAULT NULL,
  `nitrogen_dioxide_gauge_480min` SMALLINT DEFAULT NULL,
  `formaldehyde` FLOAT DEFAULT NULL,
  `formaldehyde_twa_10min` FLOAT DEFAULT NULL,
  `formaldehyde_twa_30min` FLOAT DEFAULT NULL,
  `formaldehyde_twa_60min` FLOAT DEFAULT NULL,
  `formaldehyde_twa_240min` FLOAT DEFAULT NULL,
  `formaldehyde_twa_480min` FLOAT DEFAULT NULL,
  `formaldehyde_gauge_10min` SMALLINT DEFAULT NULL,
  `formaldehyde_gauge_30min` SMALLINT DEFAULT NULL,
  `formaldehyde_gauge_60min` SMALLINT DEFAULT NULL,
  `formaldehyde_gauge_240min` SMALLINT DEFAULT NULL,
  `formaldehyde_gauge_480min` SMALLINT DEFAULT NULL,
  `acrolein` FLOAT DEFAULT NULL,
  `acrolein_twa_10min` FLOAT DEFAULT NULL,
  `acrolein_twa_30min` FLOAT DEFAULT NULL,
  `acrolein_twa_60min` FLOAT DEFAULT NULL,
  `acrolein_twa_240min` FLOAT DEFAULT NULL,
  `acrolein_twa_480min` FLOAT DEFAULT NULL,
  `acrolein_gauge_10min` SMALLINT DEFAULT NULL,
  `acrolein_gauge_30min` SMALLINT DEFAULT NULL,
  `acrolein_gauge_60min` SMALLINT DEFAULT NULL,
  `acrolein_gauge_240min` SMALLINT DEFAULT NULL,
  `acrolein_gauge_480min` SMALLINT DEFAULT NULL,
  `benzene` FLOAT DEFAULT NULL,
  `benzene_twa_10min` FLOAT DEFAULT NULL,
  `benzene_twa_30min` FLOAT DEFAULT NULL,
  `benzene_twa_60min` FLOAT DEFAULT NULL,
  `benzene_twa_240min` FLOAT DEFAULT NULL,
  `benzene_twa_480min` FLOAT DEFAULT NULL,
  `benzene_gauge_10min` SMALLINT DEFAULT NULL,
  `benzene_gauge_30min` SMALLINT DEFAULT NULL,
  `benzene_gauge_60min` SMALLINT DEFAULT NULL,
  `benzene_gauge_240min` SMALLINT DEFAULT NULL,
  `benzene_gauge_480min` SMALLINT DEFAULT NULL,
  PRIMARY KEY (`timestamp_mins`, `firefighter_id`)
) ENGINE=InnoDB DEFAULT CHARSET= UTF8mb4;

CREATE TABLE `firefighter_sensor_log` (
  `timestamp_mins` timestamp NOT NULL,
  `firefighter_id` VARCHAR(20) NOT NULL,
  `device_id` VARCHAR(20) DEFAULT NULL,
  `device_battery_level` FLOAT DEFAULT NULL,
  `temperature` SMALLINT DEFAULT NULL,
  `humidity` SMALLINT DEFAULT NULL,
  `carbon_monoxide` FLOAT DEFAULT NULL,
  `nitrogen_dioxide` FLOAT DEFAULT NULL,
  `formaldehyde` FLOAT DEFAULT NULL,
  `acrolein` FLOAT DEFAULT NULL,
  `benzene` FLOAT DEFAULT NULL,
  `device_timestamp` timestamp DEFAULT NULL,
  `device_status_LED` SMALLINT DEFAULT NULL,
  PRIMARY KEY (`timestamp_mins`, `firefighter_id`)
) DEFAULT CHARSET= UTF8mb4;

@krook
Copy link
Member

krook commented Sep 10, 2020

Thanks @JSegrave-IBM!

Yes, the database and each table should be UTF-8. We may have to dump data and recreate it:

CREATE DATABASE IF NOT EXISTS prometeo character set UTF8mb4 collate utf8mb4_unicode_ci;

@mrodrise
Copy link
Collaborator

Thanks for all, great work.

I consider that we should add a field for the device id that is sending the data

Maybe we can add fields for the status color that gives the algorithm in that moment, only to have the history

We can change the name of the table... I never liked metrics... we can rename to firefighters_status_readings... or whatever...

@JSegrave-IBM
Copy link
Collaborator Author

Thanks @mrodrise - table name is now 'firefighter_status_analytics' - is that OK? (mix of calculated values and 'cleaned' sensor values - e.g. dropout-refilled, minute-aligned ). Also added 'device_id' and 'device_battery_level' so all the screen display data is in one place.

@JSegrave-IBM
Copy link
Collaborator Author

CREATE DATABASE IF NOT EXISTS prometeo character set UTF8mb4 collate utf8mb4_unicode_ci;

Thanks @krook - I presume all the CREATE TABLE statements should have DEFAULT CHARSET=UTF8mb4; in kind?

@krook
Copy link
Member

krook commented Sep 11, 2020

@JSegrave-IBM correct. But it may automatically cascade as the default. Looking at another database I have:

image

@JSegrave-IBM
Copy link
Collaborator Author

Changes: (FYI - @mrodrise @lidderupk )

  • There are two tables:
    • firefighter_sensor_log where each row is the raw/unmodified sensor readings associated with firefighter_id during the minute timestamp_mins - we never modify this data
    • firefighter_status_analytics - where each row the same data as the sensor log, plus all the analytic information about firefighter_id during the minute timestamp_mins. This is a kind of one-stop-shop for data analysis - everything about the firefighter in one place. Making it a separate table to firefighter_sensor_log allows us to keep an untouched record of what was sent. It also removes a few potential contention points between the processes that read and write information.
  • Added device_timestamp - the unmodified original device timestamp (not minute-quantized)

@JSegrave-IBM
Copy link
Collaborator Author

JSegrave-IBM commented Sep 18, 2020

Normally the device_status_LED and analytics_status_LED (i.e. the LED colors) will be the same, but in a disconnected scenario, they may be different. We'll capture both. Hence these two updates above:

  • firefighter_sensor_log - added device_status_LED
  • firefighter_status_analytics - added device_status_LED and analytics_status_LED

For both _status_LED { 1 = green ; 2 = yellow ; 3 = red } and for this release, 'yellow' is a configurable percentage, currently defaulting to 80% (i.e. any gas reaching 80% of any AEGL-2 time-period limit will cause the LED to go yellow)

@mrodrise - is 80% OK for the default? (it can easily be configured to 75% or any other percentage of an AEGL-2 limit).

@JSegrave-IBM
Copy link
Collaborator Author

Pull request: Pyrrha-Platform/Pyrrha-Database#3

@JSegrave-IBM
Copy link
Collaborator Author

Modified DB field names to make them less sensitive to configuration changes. Fields for the different time-window TWAs are now named the after the number of minutes in that window (stable) and not by the display name (less stable and prone to SQL issues).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants