Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Priority Bidder Ejection #2952

Merged
merged 23 commits into from
Sep 19, 2023

Conversation

AlexBVolcy
Copy link
Contributor

@AlexBVolcy AlexBVolcy commented Jul 19, 2023

This PR addresses the Priority Bidder Ejection portion of this issue: #2173

The goal of this PR is to change how we eject UIDs from a cookie when a cookie is too full.

Currently, we simply eject the oldest UID from the cookie

The Ejector Object
In this PR, we introduce an ejector object with a Choose() method that can be implemented in various ways to adopt different strategist for how we Choose which UID to eject.

For the PriorityEjector, the host company can now define a list of priority bidders at this path usersync.priority_groups. The typing of this list is [][]string, where the last group of the list, represents the lowest priority elements. So this priority ejection strategy ejects the uid that is the lowest priority and the oldest element. Uids that are non priority elements will be ejected first before anything else. We utilize the OldestEjector to carry out the ejection in the cases where there are non priority elements present, or there are multiple uids that are of the lowest priority. The PriorityEjector will first filter out the elements in these cases, and pass just the relevant uids to the OldestEjector

OldestEjector simply eliminates the oldest uid from the uids it is given.

The actual ejection used to take place in usersync/cookie.go/SetCookieOnResponse(), but after the refactor, it now takes place in usersync/cookie.go/PrepareCookieForWrite()

pbs/usersync.go Outdated Show resolved Hide resolved
Copy link
Contributor

@guscarreon guscarreon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request rolls back the optimization in your previous PR #2830. p.PriorityGroups is a 2D matrix, so let c be its number of columns and r its number of rows. Let M = rc be the total number of elements in p.PriorityGroups and define N as the number of entries in cookie.uids. Under these assumptions, I think the time complexity of Choose() is T(N*M + N + M) if we exit in line 46 below:

35 func (p *PriorityBidderEjector) Choose(uids map[string]UIDEntry) (string, error) {
36     p.OldestEjector.nonPriorityKeys = getNonPriorityKeys(uids, p.PriorityGroups) // T(N*M)
37     if len(p.OldestEjector.nonPriorityKeys) > 0 {
38         return p.OldestEjector.Choose(uids)
39     }
40
41     if isSyncerPriority(p.SyncerKey, p.PriorityGroups) { // T(M)
42         // Eject Oldest Element from Lowest Priority Group and Update Priority Group
43         lowestPriorityGroup := p.PriorityGroups[len(p.PriorityGroups)-1]
44         oldestElement := getOldestElement(lowestPriorityGroup, uids) // T(N*M)
45         if oldestElement == "" {
46             return p.FallbackEjector.Choose(uids) // T(N)
47         }
48
49         updatedPriorityGroup := removeElementFromPriorityGroup(lowestPriorityGroup, oldestElement)
50         if updatedPriorityGroup == nil {
51             p.PriorityGroups = p.PriorityGroups[:len(p.PriorityGroups)-1]
52         } else {
53             p.PriorityGroups[len(p.PriorityGroups)-1] = updatedPriorityGroup
54         }
55
56         return oldestElement, nil
57     }
58     return "", errors.New("syncer key " + p.SyncerKey + " is not in priority groups")
59 } // T(N*M + M + N*M + N) ~> O(N*M)
usersync/ejector.go

And given that Choose() runs N times inside a for loop in PrepareCookieForWrite(), I believe we're looking at O(N^2*M)

59 // PrepareCookieForWrite ejects UIDs as long as the cookie is too full
60 func (cookie *Cookie) PrepareCookieForWrite(cfg *config.HostCookie, encoder Encoder, ejector Ejector) (string, error) {
61     for len(cookie.uids) > 0 { // T(N)
62         encodedCookie, err := encoder.Encode(cookie)
63         if err != nil {
64             return encodedCookie, nil
65         }
66
67         // Convert to HTTP Cookie to Get Size
68         httpCookie := &http.Cookie{
69             Name:    uidCookieName,
70             Value:   encodedCookie,
71             Expires: time.Now().Add(cfg.TTLDuration()),
72             Path:    "/",
73         }
74         cookieSize := len([]byte(httpCookie.String()))
75
76         isCookieTooBig := cookieSize > cfg.MaxCookieSizeBytes && cfg.MaxCookieSizeBytes > 0
77         if !isCookieTooBig {
78             return encodedCookie, nil
79         }
80
81         uidToDelete, err := ejector.Choose(cookie.uids) // T(N*M + M + N*M + N)
82         if err != nil {
83             return encodedCookie, err
84         }
85         delete(cookie.uids, uidToDelete)
86     }
87     return "", nil
88 } // T(N*(N*M + M + N*M + N)) -> T(N^2*M + N*M + N^2*M + N*M)) ~> O(N^2*M) 

If we reintroduce sort() and arrange PriorityGroups into an array of maps that allow for constant lookup time, I beleive we could cut this down to O(N^2 + M) at the expense of O(N+M) space:

60   func (cookie *Cookie) PrepareCookieForWrite(cfg *config.HostCookie, encoder Encoder, ejector Ejector) (string, error) {
   +     sortedUuidKeys := sortUIDs(cookie.uids) // T(N*logN), Space(N)
   +     
   +     priorityGroupsPutIntoHashTable := make([]map[string]struct{}, len(ejector.PriorityGroups) // Space(M)
   +     for i, group := ejector.PriorityGroups { // T(r)
   +         priorityGroupsPutIntoHashTable[i] = make(map[string]struct{}, len(group))
   +         for _, bidder := range group { // T(c)
   +             priorityGroupsPutIntoHashTable[i][bidder] = struct{}{}
   +         }
   +     } // T(r*c) -> T(M)
   +     
   +     i := 0
61       for len(cookie.uids) > 0 { // T(N)
62           encodedCookie, err := encoder.Encode(cookie)
63           if err != nil {
64               return encodedCookie, nil
65           }
66   
67           // Convert to HTTP Cookie to Get Size
68           httpCookie := &http.Cookie{
69               Name:    uidCookieName,
70               Value:   encodedCookie,
71               Expires: time.Now().Add(cfg.TTLDuration()),
72               Path:    "/",
73           }
74           cookieSize := len([]byte(httpCookie.String()))
75   
76           isCookieTooBig := cookieSize > cfg.MaxCookieSizeBytes && cfg.MaxCookieSizeBytes > 0
77           if !isCookieTooBig {
78               return encodedCookie, nil
79           }
80   
81 -         uidToDelete, err := ejector.Choose(cookie.uids)
   +         hashedGroupWithLessPriority = priorityGroupsPutIntoHashTable[len(priorityGroupsPutIntoHashTable) - 1]
   +
   +         // If we pass i, sortedUuidKeys, and the hashedGroupWithLessPriority, I think we could refactor `Choose()` to a time
   +         // complexity of O(N) because in N time we check each element of cookie.uids against the contents of hashedGroupWithLessPriority. 
   +         //   > If an element of cookie.uids matches against hashedGroupWithLessPriority, return the match.
   +         //   > If no element of cookie.uids is found in hashedGroupWithLessPriorit, we go back to simply returning the oldest entry, which is  uuidKeys[i]
   +         uidToDelete, err := ejector.Choose(cookie.uids, i, sortedUuidKeys, hashedGroupWithLessPriority) // T(N)
82           if err != nil {
83               return encodedCookie, err
84           }
85           delete(cookie.uids, uidToDelete)
   +         
   +         i++         
86       }
87       return "", nil
   +     // New time complexity: sorting + hashing the priority groups + traversing cookie.uids in this for loop times traversing cookie.uids in Choose()
   +     // T(N*logN + M + N^2) ~> O(N^2 + M)
88   }
usersync/cookie.go

Does this make sense? Am I missing something? Consider this pseudocode that's just trying to draft an idea and by no means encompasses all the scenarios here. Please let me know your thoughts.

usersync/ejector.go Show resolved Hide resolved
usersync/ejector_test.go Outdated Show resolved Hide resolved
@AlexBVolcy
Copy link
Contributor Author

@guscarreon Just wanted to respond to your proposal of optimizing the ejection process through sorting uids and converting the priority groups into a hash table. I was able to follow all of your suggestions, but my one question was when we set hashedGroupWithLessPriority = priorityGroupsPutIntoHashTable[len(priorityGroupsPutIntoHashTable) - 1], how do we know that this value is actually the lowest priority group? Wouldn't the conversion to a hash map, make the ordering of the groups now random?

Thank you for taking the time to write out this suggestion, very helpful! :)

@AlexBVolcy
Copy link
Contributor Author

After attempting to implement @guscarreon Choose() runtime optimization changes, I came to the conclusion that it would be too messy to the current code to introduce this update.

To introduce this change the following would occur:

  • The Choose() function signature would now need to have four elements passed to it, and each ejector would utilize a different set of these elements which I didn't like since I prefer that all elements passed to a function be used by that function.
  • sortUIDs would be reintroduced
  • We'd have to convert priority groups into a list of maps, which would then lead to there being a confusing juxtaposition between the PriorityGroups attached to the ejectors vs. the PriortyGroupsMap that is utilized within the Choose() method
  • removeElementFromPriorityGroup would have to be refactored to support the changes that come from priority groups now being a map
  • getOldestElement would no longer be able to be shared between the ejectors, and instead each ejector would have to have it's own version of a getElement() function
  • The Ejector interface would now need a GetPriorityGroups function in order for an ejectors PriorityGroups to be available within PrepareCookieForWrite
  • An overhaul to the current test framework that is accompanying this PR

usersync/ejector.go Outdated Show resolved Hide resolved
@AlexBVolcy
Copy link
Contributor Author

The latest push removes the FallbackEjector, and updates the OldestEjector to act as the fallback when the PriorityEjector either has non priority keys to eject, or there are multiple uids of the same priority that need to be ejected using the oldest strategy.

usersync/ejector.go Outdated Show resolved Hide resolved
usersync/ejector.go Outdated Show resolved Hide resolved
usersync/ejector.go Outdated Show resolved Hide resolved
usersync/ejector_test.go Outdated Show resolved Hide resolved
usersync/ejector.go Outdated Show resolved Hide resolved
usersync/ejector.go Show resolved Hide resolved
usersync/ejector.go Outdated Show resolved Hide resolved
usersync/ejector.go Outdated Show resolved Hide resolved
usersync/ejector_test.go Outdated Show resolved Hide resolved
usersync/ejector.go Outdated Show resolved Hide resolved
hhhjort
hhhjort previously approved these changes Aug 21, 2023
Copy link
Contributor

@guscarreon guscarreon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In one of my comments below I mention the assumption that in practice, the majority of times we won't need to eject entries from the cookie.uids map. But, to be honest, we don't have any metric for this. Do you think it'll be complicated to add a metric that counts the number of setuid requests that make us remove at least one element from cookie.uids?

If you think this PR is big enough, maybe we dshould do it in a separate PR if we decide this would be a worthy metric to follow.

endpoints/setuid.go Show resolved Hide resolved
hhhjort
hhhjort previously approved these changes Aug 24, 2023
guscarreon
guscarreon previously approved these changes Aug 25, 2023
Copy link
Contributor

@guscarreon guscarreon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AlexBVolcy AlexBVolcy dismissed stale reviews from guscarreon and hhhjort via 2e84e37 August 29, 2023 16:35
hhhjort
hhhjort previously approved these changes Aug 29, 2023
guscarreon
guscarreon previously approved these changes Aug 31, 2023
Copy link
Contributor

@guscarreon guscarreon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SyntaxNode
Copy link
Contributor

Manually tested several scenarios, and it looks good. I didn't find any holes in the implementation. The only question I have is how the server should respond if the uid cannot fit in the cookie and isn't a priority. Right now, a 400 is returned. Does this match PBS-Java?

@AlexBVolcy AlexBVolcy dismissed stale reviews from guscarreon and hhhjort via 958fcdc September 12, 2023 18:57
endpoints/setuid.go Outdated Show resolved Hide resolved
hhhjort
hhhjort previously approved these changes Sep 18, 2023
Copy link
Collaborator

@hhhjort hhhjort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@guscarreon guscarreon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hhhjort hhhjort merged commit b96e2f7 into prebid:master Sep 19, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants