
Node Reminders #84

Merged 9 commits into lightningdevkit:main on Sep 18, 2024

Conversation

@arik-so (Contributor) commented on Jul 2, 2024:

Send reminders for node announcements, as we have been doing with channel updates.

Given that we now have similar logic for channels and nodes, there are some preparatory steps we take in this PR prior to its focal commit.

We need to move the prune interval to a config variable, extract the logic for determining reminder inclusion into its own method, and rename some fields whose role will now subtly change.

The reminder logic itself has the following breakdown:

  • If the pre-sync update was more than 6 days ago, serialize in full.
  • Otherwise:
    • If the last mutation occurred after the last sync, serialize the mutated properties.
    • Otherwise:
      • If the last mutation occurred more than 6 days ago, serialize as a reminder.
      • Otherwise, don't serialize at all.
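The decision tree above can be sketched as a small Rust function. This is a simplified illustration with hypothetical names (`Strategy`, `choose_strategy`); the PR's actual types and field layout differ.

```rust
/// Reminder age from the PR: six days, in seconds.
const REMINDER_AGE_SECS: u64 = 6 * 24 * 60 * 60;

#[derive(Debug, PartialEq)]
enum Strategy {
    /// Serialize the announcement in full.
    Full,
    /// Serialize only the mutated properties.
    Mutated,
    /// Serialize a bare reminder telling the client its data is still current.
    Reminder,
    /// Serialize nothing at all.
    Skip,
}

/// All arguments are UNIX timestamps in seconds.
fn choose_strategy(pre_sync_update: u64, last_mutation: u64, last_sync: u64, now: u64) -> Strategy {
    if now.saturating_sub(pre_sync_update) > REMINDER_AGE_SECS {
        // The pre-sync update was more than 6 days ago: serialize in full.
        Strategy::Full
    } else if last_mutation > last_sync {
        // The last mutation occurred after the last sync: serialize the mutations.
        Strategy::Mutated
    } else if now.saturating_sub(last_mutation) > REMINDER_AGE_SECS {
        // The last mutation is old enough that the client needs a reminder.
        Strategy::Reminder
    } else {
        Strategy::Skip
    }
}
```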

This PR replaces #83.

@@ -23,6 +23,9 @@ pub(crate) const MAX_SNAPSHOT_SCOPE: u32 = 3600 * 24 * 21; // three weeks
/// updates in both directions.
pub(crate) const CHANNEL_REMINDER_AGE: Duration = Duration::from_secs(6 * 24 * 60 * 60);

/// The interval after which graph data gets pruned after it was first seen
Contributor:

Something about how this must match LDK's behavior would be nice.

src/lib.rs Outdated
@@ -187,11 +187,11 @@ async fn calculate_delta<L: Deref + Clone>(network_graph: Arc<NetworkGraph<L>>,
// for announcement-free incremental-only updates, chain hash can be skipped

let mut delta_set = DeltaSet::new();
lookup::fetch_channel_announcements(&mut delta_set, network_graph, &client, last_sync_timestamp, snapshot_reference_timestamp, logger.clone()).await;
lookup::fetch_channel_announcements(&mut delta_set, network_graph.clone(), &client, last_sync_timestamp, snapshot_reference_timestamp, logger.clone()).await;
Contributor:

nit: prefer Arc::clone(&network_graph), as it's clear what's going on, whereas this reads as if we're actually cloning the network graph.
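The reviewer's point can be shown with a minimal sketch: both spellings do the same cheap reference-count bump, but `Arc::clone` is unambiguous at the call site. The `clone_both_ways` helper here is illustrative, not from the PR.

```rust
use std::sync::Arc;

/// Both calls below duplicate only the Arc handle, never the underlying Vec;
/// the explicit `Arc::clone` form simply reads unambiguously.
fn clone_both_ways(graph: &Arc<Vec<u8>>) -> (Arc<Vec<u8>>, Arc<Vec<u8>>) {
    // Reads as if the underlying Vec were being deep-copied:
    let implicit = graph.clone();
    // Makes clear that only the reference count is incremented:
    let explicit = Arc::clone(graph);
    (implicit, explicit)
}
```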

ORDER BY public_key ASC, seen DESC
", [last_sync_timestamp_float]).await.unwrap();
", params).await.unwrap();
Contributor:

Shouldn't the handling of this query's response now never insert into the map, because we pre-populate it? Further, can't we drop this query entirely, because we can just fill in the current values from the graph?

Contributor (author):

Unfortunately, the network graph doesn't give us the seen timestamps of the latest values, so we still need to look those up.

src/lookup.rs Outdated
features: details.features().clone(),
addresses: details.addresses().into_iter().cloned().collect(),
}
} else {
println!("nope, here we go");
Contributor:

heh

src/lib.rs Outdated
if node_delta.has_address_set_changed {
node_address_update_count += 1;

if matches!(strategy, NodeSerializationStrategy::Mutated(MutatedNodeProperties { addresses: true, .. }) | NodeSerializationStrategy::Full) {
Contributor:

`if matches!` is super weird... this is what `match` is for (and it avoids forgetting cases if the enum changes).
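As a sketch of what the reviewer is suggesting, the `matches!` check could be written as an exhaustive `match`. The types below are simplified stand-ins for the PR's `NodeSerializationStrategy` and `MutatedNodeProperties`, reduced to one field for illustration.

```rust
/// Simplified stand-ins for the PR's types.
struct MutatedNodeProperties { addresses: bool }

enum NodeSerializationStrategy {
    Full,
    Mutated(MutatedNodeProperties),
    Reminder,
    Skip,
}

/// An exhaustive `match` handles every variant explicitly, so adding a new
/// variant later becomes a compile error here rather than a silent `false`.
fn serializes_addresses(strategy: &NodeSerializationStrategy) -> bool {
    match strategy {
        NodeSerializationStrategy::Full => true,
        NodeSerializationStrategy::Mutated(MutatedNodeProperties { addresses }) => *addresses,
        NodeSerializationStrategy::Reminder | NodeSerializationStrategy::Skip => false,
    }
}
```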

src/lookup.rs Outdated
@@ -583,31 +589,50 @@ pub(super) async fn fetch_node_updates<L: Deref>(network_graph: Arc<NetworkGraph
let current_node_delta = delta_set.entry(node_id).or_insert(NodeDelta::default());
let address_set: HashSet<SocketAddress> = unsigned_node_announcement.addresses.into_iter().collect();

// determine mutations
if !is_previously_processed_node_id {
Contributor:

Honestly, I find this kind of thing totally unreadable. Why do we need a variable to store Some(node_id) == previous_node_id? It's just indirection I have to go look up when trying to review.

features: unsigned_node_announcement.features,
addresses: address_set,
});
if current_seen_timestamp >= last_sync_timestamp {
Contributor:

Shouldn't this be checked before setting has_address_set_changed/has_feature_set_changed? Otherwise, e.g., a node could update its feature set last week, then update its address set every hour, and once an hour we'll include its feature set.

Contributor (author):

Don't we only serialize the mutations that have occurred? So if the features had changed since the last value the client had seen, we would include them, but if they hadn't, we would only serialize the address mutation?

// this is the highest timestamp value, so set the seen timestamp accordingly
current_node_delta.latest_details.as_mut().map(|mut d| d.seen.replace(current_seen_timestamp));
}

if let Some(last_seen_update) = current_node_delta.last_details_before_seen.as_ref() {
Contributor:

All this logic is fine, but honestly it's pretty weird that it's done repeatedly for each update. We aren't really "deciding" what the serialization path is for a node in it, despite the way it reads. A comment could be helpful, but you might also write it for the previous node in the above if block.

Contributor (author):

I don't think I quite follow what you mean. We need to check for each update pairwise whether it has diverged from the last value the client would have seen. We can only do so if we know the last value a client had seen, so I'm not quite sure what it has to do with whether it's a new node ID iteration or not.

@arik-so arik-so mentioned this pull request Jul 10, 2024
@arik-so arik-so marked this pull request as ready for review July 18, 2024 08:55
Create a config variable representing the time interval whereafter
graph data gets pruned. This value should be used to limit lookup
time frames.
If the latest node details have already been seen by a client, we
still need to store them for correctly detecting reminder necessity
in the future.
@TheBlueMatt (Contributor) left a comment:

Okay, I think this LGTM. I'm still quite annoyed at the level of code duplication (and difference!) across node and channel handling when they're logically doing basically the same thing, but I'm not sure it's worth rewriting this again and again.

src/lookup.rs (comment resolved)
Cargo.toml Outdated
@@ -4,18 +4,18 @@ version = "0.1.0"
edition = "2021"

[dependencies]
bitcoin = "0.30"
bitcoin = "0.31.2"
Contributor:

Wanna update this?

Contributor (author):

dog_ptsd

src/lookup.rs Outdated
/// Fetch all the channel announcements that are presently in the network graph, regardless of
/// whether they had been seen before.
/// Also include all announcements for which the first update was announced
/// after `last_sync_timestamp`
pub(super) async fn fetch_channel_announcements<L: Deref>(delta_set: &mut DeltaSet, network_graph: Arc<NetworkGraph<L>>, client: &Client, last_sync_timestamp: u32, snapshot_reference_timestamp: Option<u64>, logger: L) where L::Target: Logger {
pub(super) async fn fetch_channel_announcements<L: Deref + Clone>(delta_set: &mut DeltaSet, network_graph: Arc<NetworkGraph<L>>, client: &Client, last_sync_timestamp: u32, snapshot_reference_timestamp: Option<u64>, logger: L) where L::Target: Logger {
Contributor:

You really can pass references, I promise they won't bite.

@TheBlueMatt (Contributor):

Looks like 1.63 CI is failing.

let details: NodeDetails = if let Some(details) = node_info.announcement_info.as_ref() {
NodeDetails {
seen: 0,
features: details.features().clone(),
Contributor:

This commit fails to compile because features is a field, not a method.

src/tests/mod.rs Outdated
@@ -118,7 +118,8 @@ fn generate_update(scid: u64, direction: bool, timestamp: u32, expiry_delta: u16
chain_hash: genesis_hash(),
short_channel_id: scid,
timestamp,
flags: 0 | flag_mask,
message_flags: 0,
channel_flags: 0 | flag_mask,
Contributor:

lol

Suggested change
channel_flags: 0 | flag_mask,
channel_flags: flag_mask,

@TheBlueMatt (Contributor):

Please make sure all individual commits compile on their own and the 1.63 build passes then we can land this!

@TheBlueMatt (Contributor):

Looks like postgres-types still needs to be pinned back one.

We want to ignore any node announcements that have already been
pruned. To do so, we extract all the node IDs from the network graph,
and use those to filter our queries.
We will need to determine whether or not a snapshot should include
reminders for both channel and node update messages. To prepare for
that, we extract the decision logic into its own method.
With the addition of reminders, we may encounter scenarios where either a bit flip may suffice, instructing the client to look up its latest data, or we may need to serialize all announcement details anew if the client may have already purged the old data.

To better distinguish between these scenarios, we introduce a
serialization strategy enum that allows serializing either the full
announcement, just the mutations, or serve solely as a reminder and
serialize nothing at all.
This covers the following part of our serialization logic:

If the pre-sync update was more than 6 days ago, serialize in full.
@TheBlueMatt (Contributor) left a comment:

Thanks!

@TheBlueMatt TheBlueMatt merged commit dcb1d49 into lightningdevkit:main Sep 18, 2024
4 checks passed