The sequence to be ensured here is:
t0=Backend observes a failure to report load to leader; when a report fails, the tick is not incremented in the next report
t1.a1=Failures continue to happen, leader marks backend as defunct, stops assigning new process groups to it
t1.a2=Reports succeed, back to state before t0
t2.a1.b1=Failures continue to happen, backend marks self as defunct and sends 5xx for /poll requests from recorders already assigned to it. Ensure t2>t1, i.e. backend marks self as defunct only after leader has marked it as defunct
t3.a1.b1=Backend starts sending exploratory reports (tick=0)
t4.a1.b1=Recorder gives up on backend since it keeps receiving 5xx and calls /associate to get a new backend; leader assigns a new backend to recorder's process group and de-associates the defunct backend
t5.a1.b1=N successive exploratory ticks succeed, backend marks self as available again, sends usual reports to leader
t6.a1.b1=Reports succeed, eventually leader marks backend as available again
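The backend side of the sequence above can be sketched as a small state machine. This is only an illustration under stated assumptions, not the project's implementation: the class name, field names, and threshold values are all hypothetical, and resuming the tick from 0 after recovery is an assumption the sequence does not spell out.

```python
from dataclasses import dataclass


@dataclass
class BackendHealth:
    """Backend-side view of its reporting link to the leader (hypothetical sketch)."""

    defunct_after: int = 3  # assumed: consecutive failed reports before self-marking defunct
    recover_after: int = 3  # assumed: N successive exploratory successes needed to recover
    tick: int = 0
    failures: int = 0
    successes: int = 0
    defunct: bool = False

    def next_report_tick(self) -> int:
        # While defunct, every report is exploratory and carries tick=0 (t3).
        return 0 if self.defunct else self.tick

    def on_report_result(self, ok: bool) -> None:
        if ok:
            self.failures = 0
            if self.defunct:
                # Count successive exploratory successes (t5).
                self.successes += 1
                if self.successes >= self.recover_after:
                    self.defunct = False
                    self.successes = 0
            else:
                # The tick advances only after a successful report (t0, t1.a2).
                self.tick += 1
        else:
            self.successes = 0
            if not self.defunct:
                self.failures += 1
                if self.failures >= self.defunct_after:
                    # Backend marks self as defunct (t2).
                    self.defunct = True
                    self.tick = 0  # assumption: regular ticks restart from 0 after recovery
```

To respect t2>t1, `defunct_after` would have to be chosen so that the backend's threshold is reached only after the leader's own timeout for marking the backend as defunct has elapsed.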
Description:
If backend is not able to talk to leader (e.g. because of a network partition) for some number of successive load reports, it should mark self as defunct and respond to all /poll requests with 5xx. /profile calls should not return errors.
This is required so that recorder detects that something is wrong with backend and calls /association again to get a healthy backend via leader.
Backend should mark self as available again if it can communicate with leader for some successive report intervals.
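The request handling described above could look roughly like the following. This is a minimal sketch: the `respond` function is hypothetical, and 503 is an assumed choice, since the description only asks for some 5xx status.

```python
from http import HTTPStatus


def respond(path: str, backend_defunct: bool) -> HTTPStatus:
    """Pick the status a backend returns for a request (hypothetical helper)."""
    # Only /poll is failed while the backend considers itself defunct, so that
    # recorders notice the problem and ask the leader for another backend.
    if path == "/poll" and backend_defunct:
        return HTTPStatus.SERVICE_UNAVAILABLE  # assumed 5xx; any 5xx would do
    # /profile (and /poll on a healthy backend) is served as usual.
    return HTTPStatus.OK
```

Leaving /profile untouched means profile data already being sent is still accepted while the backend is defunct.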