Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

PVF: add landlock sandboxing #7303

Merged
merged 19 commits into from
Jul 5, 2023
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 7 additions & 0 deletions node/core/pvf/common/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -22,5 +22,12 @@ sc-executor-wasmtime = { git = "https://github.com/paritytech/substrate", branch
sp-core = { git = "https://github.com/paritytech/substrate", branch = "master" }
sp-tracing = { git = "https://github.com/paritytech/substrate", branch = "master" }

[target.'cfg(target_os = "linux")'.dependencies]
landlock = "0.2.0"

[dev-dependencies]
assert_matches = "1.4.0"
tempfile = "3.3.0"

[build-dependencies]
substrate-build-script-utils = { git = "https://github.com/paritytech/substrate", branch = "master" }
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@

//! Functionality common to both prepare and execute workers.

pub mod security;

use crate::LOG_TARGET;
use cpu_time::ProcessTime;
use futures::never::Never;
Expand Down Expand Up @@ -203,7 +205,7 @@ pub mod thread {
};

/// Contains the outcome of waiting on threads, or `Pending` if none are ready.
#[derive(Clone, Copy)]
#[derive(Debug, Clone, Copy)]
pub enum WaitOutcome {
Finished,
TimedOut,
Expand All @@ -224,8 +226,9 @@ pub mod thread {
Arc::new((Mutex::new(WaitOutcome::Pending), Condvar::new()))
}

/// Runs a thread, afterwards notifying the threads waiting on the condvar. Catches panics and
/// resumes them after triggering the condvar, so that the waiting thread is notified on panics.
/// Runs a worker thread. Will first enable security features, and afterwards notify the threads waiting on the
/// condvar. Catches panics during execution and resumes the panics after triggering the condvar, so that the
/// waiting thread is notified on panics.
pub fn spawn_worker_thread<F, R>(
name: &str,
f: F,
Expand All @@ -237,9 +240,18 @@ pub mod thread {
F: Send + 'static + panic::UnwindSafe,
R: Send + 'static,
{
thread::Builder::new()
.name(name.into())
.spawn(move || cond_notify_on_done(f, cond, outcome))
thread::Builder::new().name(name.into()).spawn(move || {
cond_notify_on_done(
|| {
#[cfg(target_os = "linux")]
let _ = crate::worker::security::landlock::try_restrict_thread();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, instead of simply really doing nothing, I think we should issue a warning that landlock is not available and for maximum security encourage the operator to upgrade their Kernel/make sure landlock is available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for non Linux - a warning that the landlock sandboxing is not available would be a good idea, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can emit a warning (once on host startup I think), but the way the validator guide is written right now, I'm not sure how much control a validator can have on their kernel version/configuration:

"The most common way for a beginner to run a validator is on a cloud server running Linux. You may choose whatever VPS provider that your prefer."

I guess we could recommend "try to find a VPS with landlock enabled", but unless we give specific recommendations, it might be hard to find this info.

For people not running on a VPS, I found a guide for enabling landlock, but I haven't tested it, and I'm not a Linux expert. I'm not sure we should be officially recommending random guides from the internet...

Advice would be really appreciated. Let me know what you think!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can emit a warning (once on host startup I think)

Hmm, so given the above, I'm not sure how actionable a warning would be. It could just be noise for most operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found out that landlock is enabled by default on most distros including Ubuntu 22.04 LTS (what we recommend). And I don't see any reason why a VPS would explicitly opt-out of a security feature. So, we can simply emit a warning like this:

WARNING: Could not enable landlock, a Linux kernel security feature. Running validation of PVF code may compromise this machine. Consider upgrading the kernel version for maximum security.

This comment was marked as duplicate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if we consider this the recommended way of running the validator then we should refuse to start if the feature is not present and build in an explicit cmdline argument to bypass this check.

That's a good idea, I've added it to #7073 (see description). I don't think we can mandate it right now because like half of validators are on too-old kernel versions (according to the visible telemetry data). We already need to announce that "secure mode" change in advance, maybe we can also mention that validators should upgrade to kernel 5.13+ to avoid running in "insecure mode". I will bug Will again to see if we can get that announcement going.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, makes sense then. So, keep the warning for now and make it mandatory at a later date.

Copy link
Member

@eskimor eskimor Jun 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this makes sense, I would nevertheless suggest a soft launch:

  1. Enable the feature and print warning on startup if not available, hinting that this feature will be mandatory on some future release XY and that the operator should upgrade the machine. In addition this should obviously be part of the release notes as they are less likely to be ignored.
  2. Release version XY which makes it mandatory, exiting with a warning that the check can be bypassed by that command line flag.

Updating the wiki would be good as well, but is least likely to be read by already operating operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eskimor I missed your comment before merging. Here's what we have now, let me know what needs a follow-up:

mrcnski marked this conversation as resolved.
Show resolved Hide resolved

f()
},
cond,
outcome,
)
})
}

/// Runs a worker thread with the given stack size. See [`spawn_worker_thread`].
Expand Down
141 changes: 141 additions & 0 deletions node/core/pvf/common/src/worker/security.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.

// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.

//! Functionality for securing workers.
//!
//! This is needed because workers are used to compile and execute untrusted code (PVFs).

/// The [landlock] docs say it best:
///
/// > "Landlock is a security feature available since Linux 5.13. The goal is to enable to restrict
/// ambient rights (e.g., global filesystem access) for a set of processes by creating safe security
/// sandboxes as new security layers in addition to the existing system-wide access-controls. This
/// kind of sandbox is expected to help mitigate the security impact of bugs, unexpected or
/// malicious behaviors in applications. Landlock empowers any process, including unprivileged ones,
/// to securely restrict themselves."
///
/// [landlock]: https://docs.rs/landlock/latest/landlock/index.html
#[cfg(target_os = "linux")]
pub mod landlock {
// Export for checking the status.
pub use landlock::RulesetStatus;

use landlock::{Access, AccessFs, Ruleset, RulesetAttr, RulesetError, ABI};

/// Version of landlock ABI. Use the latest version supported by our reference kernel version.
///
/// - Reference kernel version: 5.16+
/// - ABI V1: 5.13
/// - ABI V2: 5.19
///
/// Please update these docs if they are out-of-date, and bump the ABI version to the minimum
/// ABI supported by the reference kernel version.
pub const LANDLOCK_ABI: ABI = ABI::V1;

// TODO: <https://github.com/landlock-lsm/rust-landlock/issues/36>
/// Returns to what degree landlock is enabled on the current Linux environment.
pub fn get_status() -> Result<RulesetStatus, Box<dyn std::error::Error>> {
match std::thread::spawn(|| try_restrict_thread()).join() {
Ok(Ok(status)) => Ok(status),
Ok(Err(ruleset_err)) => Err(ruleset_err.into()),
Err(_err) => Err("a panic occurred in try_restrict_thread".into()),
}
}

/// Returns a single bool indicating whether landlock is fully enabled on the current Linux
/// environment.
pub fn is_fully_enabled() -> bool {
matches!(get_status(), Ok(RulesetStatus::FullyEnforced))
}

/// Tries to restrict the current thread with the following landlock access controls:
///
/// 1. all global filesystem access
/// 2. ... more may be supported in the future.
///
/// If landlock is not supported in the current environment this is simply a noop.
///
/// # Returns
///
/// The status of the restriction (whether it was fully, partially, or not-at-all enforced).
pub fn try_restrict_thread() -> Result<RulesetStatus, RulesetError> {
let status = Ruleset::new()
.handle_access(AccessFs::from_all(LANDLOCK_ABI))?
.create()?
.restrict_self()?;
Ok(status.ruleset)
}

#[cfg(test)]
mod tests {
use super::*;
use std::{fs, io::ErrorKind, thread};

#[test]
fn restricted_thread_cannot_access_fs() {
// TODO: This would be nice: <https://github.com/rust-lang/rust/issues/68007>.
if !is_fully_enabled() {
return
}

// Restricted thread cannot read from FS.
let handle = thread::spawn(|| {
// Write to a tmp file, this should succeed before landlock is applied.
let text = "foo";
let tmpfile = tempfile::NamedTempFile::new().unwrap();
let path = tmpfile.path();
fs::write(path, text).unwrap();
let s = fs::read_to_string(path).unwrap();
assert_eq!(s, text);

let status = try_restrict_thread().unwrap();
if !matches!(status, RulesetStatus::FullyEnforced) {
panic!("Ruleset should be enforced since we checked if landlock is enabled");
}

// Try to read from the tmp file after landlock.
let result = fs::read_to_string(path);
assert!(matches!(
result,
Err(err) if matches!(err.kind(), ErrorKind::PermissionDenied)
));
});

assert!(handle.join().is_ok());

// Restricted thread cannot write to FS.
let handle = thread::spawn(|| {
let text = "foo";
let tmpfile = tempfile::NamedTempFile::new().unwrap();
let path = tmpfile.path();

let status = try_restrict_thread().unwrap();
if !matches!(status, RulesetStatus::FullyEnforced) {
panic!("Ruleset should be enforced since we checked if landlock is enabled");
}

// Try to write to the tmp file after landlock.
let result = fs::write(path, text);
assert!(matches!(
result,
Err(err) if matches!(err.kind(), ErrorKind::PermissionDenied)
));
});

assert!(handle.join().is_ok());
}
}
}
29 changes: 29 additions & 0 deletions node/core/pvf/src/host.rs
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,7 @@ struct ExecutePvfInputs {
}

/// Configuration for the validation host.
#[derive(Debug)]
pub struct Config {
/// The root directory where the prepared artifacts can be stored.
pub cache_path: PathBuf,
Expand Down Expand Up @@ -189,6 +190,11 @@ impl Config {
/// In that case all pending requests will be canceled, dropping the result senders and new ones
/// will be rejected.
pub fn start(config: Config, metrics: Metrics) -> (ValidationHost, impl Future<Output = ()>) {
gum::debug!(target: LOG_TARGET, ?config, "starting PVF validation host");

// Run checks for supported security features once per host startup.
warn_if_no_landlock();

let (to_host_tx, to_host_rx) = mpsc::channel(10);

let validation_host = ValidationHost { to_host_tx };
Expand Down Expand Up @@ -854,6 +860,29 @@ fn pulse_every(interval: std::time::Duration) -> impl futures::Stream<Item = ()>
.map(|_| ())
}

/// Check if landlock is supported and emit a warning if not.
fn warn_if_no_landlock() {
#[cfg(target_os = "linux")]
{
use polkadot_node_core_pvf_common::worker::security::landlock;
let status = landlock::get_status();
if !matches!(status, Ok(landlock::RulesetStatus::FullyEnforced)) {
mrcnski marked this conversation as resolved.
Show resolved Hide resolved
let abi = landlock::LANDLOCK_ABI as u8;
gum::warn!(
target: LOG_TARGET,
?status,
%abi,
"Cannot fully enable landlock, a Linux kernel security feature. Running validation of PVF code has a higher risk of compromising this machine. Consider upgrading the kernel version for maximum security."
mrcnski marked this conversation as resolved.
Show resolved Hide resolved
);
}
}
#[cfg(not(target_os = "linux"))]
gum::warn!(
target: LOG_TARGET,
"Cannot enable landlock, a Linux kernel security feature. Running validation of PVF code has a higher risk of compromising this machine. Consider running on Linux with landlock support for maximum security."
mrcnski marked this conversation as resolved.
Show resolved Hide resolved
);
}

#[cfg(test)]
pub(crate) mod tests {
use super::*;
Expand Down
1 change: 1 addition & 0 deletions roadmap/implementers-guide/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@
- [Utility Subsystems](node/utility/README.md)
- [Availability Store](node/utility/availability-store.md)
- [Candidate Validation](node/utility/candidate-validation.md)
- [PVF Host and Workers](node/utility/pvf-host-and-workers.md)
- [Provisioner](node/utility/provisioner.md)
- [Network Bridge](node/utility/network-bridge.md)
- [Gossip Support](node/utility/gossip-support.md)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,86 +44,10 @@ Once we have all parameters, we can spin up a background task to perform the val
* The collator signature is valid
* The PoV provided matches the `pov_hash` field of the descriptor

For more details please see [PVF Host and Workers](pvf-host-and-workers.md).

### Checking Validation Outputs

If we can assume the presence of the relay-chain state (that is, during processing [`CandidateValidationMessage`][CVM]`::ValidateFromChainState`) we can run all the checks that the relay-chain would run at the inclusion time thus confirming that the candidate will be accepted.

### PVF Host

The PVF host is responsible for handling requests to prepare and execute PVF
code blobs.

One high-level goal is to make PVF operations as deterministic as possible, to
reduce the rate of disputes. Disputes can happen due to e.g. a job timing out on
one machine, but not another. While we do not yet have full determinism, there
are some dispute reduction mechanisms in place right now.

#### Retrying execution requests

If the execution request fails during **preparation**, we will retry if it is
possible that the preparation error was transient (e.g. if the error was a panic
or time out). We will only retry preparation if another request comes in after
15 minutes, to ensure any potential transient conditions had time to be
resolved. We will retry up to 5 times.

If the actual **execution** of the artifact fails, we will retry once if it was
a possibly transient error, to allow the conditions that led to the error to
hopefully resolve. We use a more brief delay here (1 second as opposed to 15
minutes for preparation (see above)), because a successful execution must happen
in a short amount of time.

We currently know of the following specific cases that will lead to a retried
execution request:

1. **OOM:** The host might have been temporarily low on memory due to other
processes running on the same machine. **NOTE:** This case will lead to
voting against the candidate (and possibly a dispute) if the retry is still
not successful.
2. **Artifact missing:** The prepared artifact might have been deleted due to
operator error or some bug in the system.
3. **Panic:** The worker thread panicked for some indeterminate reason, which
may or may not be independent of the candidate or PVF.

#### Preparation timeouts

We use timeouts for both preparation and execution jobs to limit the amount of
time they can take. As the time for a job can vary depending on the machine and
load on the machine, this can potentially lead to disputes where some validators
successfuly execute a PVF and others don't.

One dispute mitigation we have in place is a more lenient timeout for
preparation during execution than during pre-checking. The rationale is that the
PVF has already passed pre-checking, so we know it should be valid, and we allow
it to take longer than expected, as this is likely due to an issue with the
machine and not the PVF.

#### CPU clock timeouts

Another timeout-related mitigation we employ is to measure the time taken by
jobs using CPU time, rather than wall clock time. This is because the CPU time
of a process is less variable under different system conditions. When the
overall system is under heavy load, the wall clock time of a job is affected
more than the CPU time.

#### Internal errors

In general, for errors not raising a dispute we have to be very careful. This is
only sound, if we either:

1. Ruled out that error in pre-checking. If something is not checked in
pre-checking, even if independent of the candidate and PVF, we must raise a
dispute.
2. We are 100% confident that it is a hardware/local issue: Like corrupted file,
etc.

Reasoning: Otherwise it would be possible to register a PVF where candidates can
not be checked, but we don't get a dispute - so nobody gets punished. Second, we
end up with a finality stall that is not going to resolve!

There are some error conditions where we can't be sure whether the candidate is
really invalid or some internal glitch occurred, e.g. panics. Whenever we are
unsure, we can never treat an error as internal as we would abstain from voting.
So we will first retry the candidate, and if the issue persists we are forced to
vote invalid.

[CVM]: ../../types/overseer-protocol.md#validationrequesttype
Loading