Update Segment identify calls to remove email, add domain #405

Merged
noeldevelops merged 4 commits into main from ncothren/emails-segment-identify on Oct 22, 2024

Conversation

noeldevelops
Member

@noeldevelops noeldevelops commented Oct 18, 2024

Summary of Changes

  • Instead of sending the full email to Segment, extract the domain and send that.
  • Created a new function dedicated to sending a Segment Identify call; it handles parsing user data from the Session or UserInfo object.
  • Added a call to this new function during the authentication step of extension activation, to catch existing sessions.
  • ⚠️ "Files changed" looks bigger because I refactored to split getTelemetryLogger off from the other functions to make testing easier (mocking module-level functions is still a challenge).

Any additional details or context that should be provided?

  • We found that the VS Code TelemetryLogger class redacts any email-looking string via a regex. We'd like to filter our final analytics data by internal (Confluent) users while still keeping that PII out of telemetry, hence the new domain property in place of email.
  • Here's the event as it appears in Segment:
    [Screenshot: the event as it appears in Segment, 2024-10-22 2:54:32 PM]
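
For readers without the diff open, here is a minimal sketch of the idea described above: an identify helper that forwards only the domain, never the full email. The names `UserInfoLike`, `IdentifyClient`, `domainOf`, and `sendIdentify` are hypothetical and not taken from the PR; the client interface is a stand-in, not the real Segment SDK type.

```typescript
// Hypothetical shapes: the real Session/UserInfo objects in the extension may differ.
interface UserInfoLike {
  id?: string;
  username?: string; // assumed to usually be an email address
}

interface IdentifyPayload {
  userId: string;
  traits: { domain?: string };
}

// Minimal stand-in for an analytics client; only the one method used here.
interface IdentifyClient {
  identify(payload: IdentifyPayload): void;
}

// Returns the lowercased text after the last "@", or undefined if the value is not email-shaped.
function domainOf(username: string | undefined): string | undefined {
  if (!username) return undefined;
  const at = username.lastIndexOf("@");
  if (at < 1 || at === username.length - 1) return undefined;
  return username.slice(at + 1).toLowerCase();
}

// Sends an identify call carrying the domain only; the full email is never put in the payload.
function sendIdentify(client: IdentifyClient, user: UserInfoLike): boolean {
  if (!user.id) return false; // nothing to identify without a stable user id
  client.identify({ userId: user.id, traits: { domain: domainOf(user.username) } });
  return true;
}
```

Taking the client as a parameter is one way to sidestep the module-level mocking problem mentioned in the summary: a test can pass in a fake object and inspect what it received.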

Pull request checklist

Please check if your PR fulfills the following (if applicable):

Tests
  • Added new
  • Updated existing
  • Deleted existing
Other
  • Does anything in this PR need to be mentioned in the user-facing CHANGELOG or README?
  • Have you validated this change locally by packaging and installing the extension .vsix file?
    gulp clicktest

@noeldevelops noeldevelops requested a review from a team as a code owner October 18, 2024 16:01
@noeldevelops noeldevelops marked this pull request as draft October 18, 2024 16:01
@noeldevelops noeldevelops changed the title Fix emails segment identify Update Segment identify calls to remove email, add domain Oct 18, 2024
@rhauch rhauch added this to the Telemetry improvements milestone Oct 18, 2024
@noeldevelops noeldevelops marked this pull request as ready for review October 22, 2024 20:56
Contributor

@MSeal MSeal left a comment

Some minor questions, good for merge independently imo.

let domain: string | undefined;
if (username) {
  // email is redacted by VSCode TelemetryLogger, but we extract domain for Confluent analytics use
  const emailRegex = /@[a-zA-Z0-9-]+\.[a-zA-Z0-9-]+/;
Contributor

I'd have maybe just checked whether @ is present, then split on it and grabbed [1], since we don't care about other properties and expect it to always be an email entry anyway.

Member Author

Does sound simpler! I wasn't sure we could be 100% certain this will be an email every time. Happy to update in a follow-up if it makes sense to do so.
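
To make the trade-off in this thread concrete, here is a rough side-by-side of the two approaches. The regex is copied from the snippet under review; how the PR consumes the match is not visible here, so dropping the leading "@" is an assumption, and the function names `domainViaRegex` and `domainViaSplit` are hypothetical.

```typescript
// Regex copied from the snippet under review: "@", one label, a dot, then one more label.
const emailRegex = /@[a-zA-Z0-9-]+\.[a-zA-Z0-9-]+/;

// Regex approach: take the match and drop the leading "@" (assumed usage, not confirmed by the diff).
function domainViaRegex(username: string): string | undefined {
  const match = username.match(emailRegex);
  return match ? match[0].slice(1) : undefined;
}

// Split approach suggested in review: if "@" is present, take the part right after the first "@".
function domainViaSplit(username: string): string | undefined {
  if (username.indexOf("@") === -1) return undefined;
  const domain = username.split("@")[1];
  return domain ? domain : undefined;
}
```

The two agree on a plain address such as `dev@example.com`. They differ at the edges: the regex stops after the second label, so `dev@mail.example.co.uk` yields `mail.example`, while the split keeps `mail.example.co.uk`. The split also accepts dotless values such as `dev@localhost`, which the regex rejects.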

});
// We don't want to send the user traits or identify prop in the following Track call
delete data.identify;
delete data.user;
}
analytics?.track({
  userId,
  anonymousId: data?.["common.vscodesessionid"] || segmentAnonId,
Contributor

Why this change?

Member Author

Just setting us up to handle logout in the near future. We're already sending vscodesessionid with the common properties in data, and segmentAnonId is stable within each instance of the TelemetryLogger, so it won't make much difference now. In later updates, though, I'll be able to reset the anon id (during the logout flow, or in other cases where the user id might be new for an existing VS Code session).
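
A rough sketch of how that could look, assuming a resettable holder for the anonymous id. `AnonIdHolder` and `resolveAnonymousId` are hypothetical names, and the reset-on-logout hook is the future work described above, not something this PR implements.

```typescript
// Hypothetical holder for the anonymous id; the diff keeps a segmentAnonId per TelemetryLogger instance.
class AnonIdHolder {
  private current: string;
  private readonly generate: () => string;

  constructor(generate: () => string) {
    this.generate = generate;
    this.current = generate();
  }

  get value(): string {
    return this.current;
  }

  // Assumed future hook: call on logout so the next user is not tied to the old anonymous id.
  reset(): string {
    this.current = this.generate();
    return this.current;
  }
}

// Mirrors the fallback in the diff: prefer the VS Code session id from the event data, else the fallback.
// Unlike the bare `||` in the diff, this version only accepts a non-empty string.
function resolveAnonymousId(data: Record<string, unknown> | undefined, fallback: string): string {
  const sessionId = data?.["common.vscodesessionid"];
  return typeof sessionId === "string" && sessionId.length > 0 ? sessionId : fallback;
}
```

The generator is injected so a test can supply deterministic ids; in the extension it would presumably be a UUID generator.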

@noeldevelops noeldevelops merged commit 632e5e4 into main Oct 22, 2024
1 check passed
@noeldevelops noeldevelops deleted the ncothren/emails-segment-identify branch October 22, 2024 22:04
3 participants