AV-214603 Changes for accepting Gateway with some valid and some invalid listener #1550

Merged: 2 commits, Oct 11, 2024

Contributor

@pkoshtavmware pkoshtavmware commented Oct 4, 2024

This PR contains the following changes:

  • AV-214603 Changes for accepting Gateway with some valid and some invalid listener
  • Execution of Gateway and HTTPRoute in any order
  • Creating map for storing HTTPRoute status during its validation to be used later for processing
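To make the terms used in the test matrix concrete, here is a minimal sketch of how "valid", "partially valid", and "invalid" could be derived from per-listener validation results. The `Acceptance` type and the `classify` function are hypothetical names for illustration only and are not taken from the AKO code base.

```go
package main

import "fmt"

// Acceptance is a hypothetical three-way classification used only in this sketch.
type Acceptance string

const (
	Valid          Acceptance = "valid"
	PartiallyValid Acceptance = "partially valid"
	Invalid        Acceptance = "invalid"
)

// classify takes one validity flag per listener and summarises the Gateway.
func classify(listenerValid []bool) Acceptance {
	validCount := 0
	for _, ok := range listenerValid {
		if ok {
			validCount++
		}
	}
	switch {
	case validCount == 0:
		return Invalid // also covers a Gateway with no listeners
	case validCount == len(listenerValid):
		return Valid
	default:
		return PartiallyValid
	}
}

func main() {
	// One valid and one invalid listener: the Gateway is still accepted in part.
	fmt.Println(classify([]bool{true, false}))
}
```

Under this reading, a partially valid Gateway keeps serving its valid listeners while the invalid ones are skipped.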

Testing status:
Added unit test cases and updated the existing ones to cover this change.
Manually tested this PR end to end.
The following transition cases work as expected without a reboot:

1. One gateway with two routes partially valid to valid 
2. One gateway with two routes invalid to valid
3. One gateway with two routes invalid to partially valid
4. One gateway with one route valid to partially valid (the port is not updated; this will be taken up separately by @arihantg)
5. One gateway with one route partially valid to valid 
6. One gateway with one route invalid to valid
7. One gateway with one route invalid to partially valid

The following transition cases work with a reboot:

1. One gateway with two routes valid to partially valid
2. One gateway with two routes valid to invalid
3. One gateway with two routes partially valid to invalid
4. One gateway with one route partially valid to invalid
5. One gateway with one route valid to invalid

TODO:
Run FTs

@pkoshtavmware pkoshtavmware force-pushed the AV-214603-master branch 11 times, most recently from 118cf9f to 27b4a10 Compare October 7, 2024 14:03
@pkoshtavmware
Contributor Author

UT run result:

        github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests               coverage: 0.0% of statements
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/graphlayer    435.151s        coverage: 18.5% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/ingestion     220.658s        coverage: 2.8% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/npltests      60.356s coverage: 16.0% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/status        320.931s        coverage: 12.9% of statements in ./...

@pkoshtavmware pkoshtavmware added the open-for-review Pull request is up for review label Oct 8, 2024

gwStatus := akogatewayapiobjects.GatewayApiLister().GetGatewayToGatewayStatusMapping(gateway.Namespace + "/" + gateway.Name)
for i, listener := range gateway.Spec.Listeners {
if gwStatus.Listeners[i].Conditions[0].Type == string(gatewayv1.ListenerConditionAccepted) && gwStatus.Listeners[i].Conditions[0].Status == "False" {
Contributor

We should check that len(gwStatus.Listeners) >= i before this to avoid an index-out-of-range error. Also consider moving this check into a function, since it is called in multiple places.

Contributor

In the ingestion layer, while each listener is validated, its status update is enqueued to the status layer queue. If that status update takes time or does not go through, we might end up processing invalid listeners.
Should we introduce a retry when the status for a listener is not there? Or should we remove the dependency on the status update and introduce an internal structure to track valid and invalid listeners?

Contributor Author

We have already introduced an internal structure; the GetGatewayToGatewayStatusMapping function reads its value from that map.
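As an illustration of the idea discussed here, the sketch below keeps listener validity in an in-memory map and reports whether an entry is known at all, so a caller could choose to requeue instead of treating a missing entry as valid. The `listenerStore` type and its methods are hypothetical; the actual AKO structure behind GetGatewayToGatewayStatusMapping may look quite different.

```go
package main

import (
	"fmt"
	"sync"
)

// listenerStore is a hypothetical in-memory record of listener validity,
// keyed by the Gateway's "namespace/name".
type listenerStore struct {
	mu    sync.RWMutex
	valid map[string][]bool
}

func newListenerStore() *listenerStore {
	return &listenerStore{valid: make(map[string][]bool)}
}

// Set records the validation result of every listener of one Gateway.
func (s *listenerStore) Set(gwKey string, valid []bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.valid[gwKey] = append([]bool(nil), valid...) // store a copy
}

// IsValid returns the recorded validity and whether an entry exists.
// known is false for an unknown Gateway or an out-of-range index.
func (s *listenerStore) IsValid(gwKey string, idx int) (valid, known bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	flags, ok := s.valid[gwKey]
	if !ok || idx < 0 || idx >= len(flags) {
		return false, false
	}
	return flags[idx], true
}

func main() {
	store := newListenerStore()
	store.Set("default/gateway-01", []bool{true, false})

	valid, known := store.IsValid("default/gateway-01", 1)
	fmt.Println(valid, known)
}
```

Returning the second `known` value keeps the "not validated yet" case distinct from "validated and invalid", which is the ambiguity the review comment raises.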

Contributor Author

Made the len(gwStatus.Listeners) >= i change.

Contributor

@akshayhavile akshayhavile left a comment

@pkoshtavmware: In DequeueIngestion, we have this logic:
'''
if !IsHTTPRouteValid(key, httpRoute) {
return
}
'''
If a route changes from valid to partially invalid, how are we deleting the existing objects?

return
}
listRoutes, err := validateReferredHTTPRoute(key, allowedRoutesAll, gw)
if err != nil {
utils.AviLog.Errorf("Validation of Referred HTTPRoutes Failed due to error : %s", err.Error())
Contributor

Nit: Failed --> failed

Contributor Author

Done

@@ -242,8 +250,8 @@ func isValidListener(key string, gateway *gatewayv1.Gateway, gatewayStatus *gate
}
name := string(certRef.Name)
cs := utils.GetInformers().ClientSet
secretObj, err := cs.CoreV1().Secrets(gateway.ObjectMeta.Namespace).Get(context.TODO(), name, metav1.GetOptions{})
if err != nil || secretObj == nil {
_, err := cs.CoreV1().Secrets(gateway.ObjectMeta.Namespace).Get(context.TODO(), name, metav1.GetOptions{})
Contributor

We can avoid creating the cs variable.

Contributor Author

Done

ako-gateway-api/nodes/gateway_model_rel.go
}

func TestMultipleHttpRoutesWithValidAndInvalidGatewayListeners(t *testing.T) {
gatewayName := "gateway-hr-09"
Contributor

Use an incremental naming convention.

Contributor Author

Done

}, 25*time.Second).Should(gomega.Equal(true))

integrationtest.CreateSVC(t, DEFAULT_NAMESPACE, svcName, corev1.ProtocolTCP, corev1.ServiceTypeClusterIP, false)
integrationtest.CreateEP(t, DEFAULT_NAMESPACE, svcName, false, false, "1.2.3")
Contributor

Use CreateEPorEPS and DelEPorEPS for the rest of the unit test cases.

Contributor Author

Done

}
nodes := aviModel.(*avinodes.AviObjectGraph).GetAviEvhVS()
return len(nodes[0].EvhNodes)
}, 50*time.Second).Should(gomega.Equal(2))
Contributor

Use retry timeout everywhere

Contributor Author

Done

akogatewayapitests.UpdateGateway(t, gatewayName, DEFAULT_NAMESPACE, gatewayClassName, nil, listeners)

g.Eventually(func() bool {
gateway, err := akogatewayapitests.GatewayClient.GatewayV1().Gateways(DEFAULT_NAMESPACE).Get(context.TODO(), gatewayName, metav1.GetOptions{})
Contributor

We are using the informer client in the code, so shouldn't we use the informer client here as well? The informer may be out of date, but these cases will still pass because the Kubernetes client is up to date.

Contributor Author

As discussed, it will be taken up later

//apimeta.FindStatusCondition(gateway.Status.Conditions, string(gatewayv1.GatewayConditionAccepted)) != nil
}, 30*time.Second).Should(gomega.Equal(true))

gateway, _ := akogatewayapitests.GatewayClient.GatewayV1().Gateways(DEFAULT_NAMESPACE).Get(context.TODO(), gatewayName, metav1.GetOptions{})
Contributor

same here

Contributor Author

As discussed, it will be taken up later

}
return len(nodes[0].EvhNodes[1].VHMatches)
}, 30*time.Second).Should(gomega.Equal(2))
time.Sleep(20 * time.Second)
Contributor

Why are we using a constant sleep here?

Contributor Author

Done
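For reference, the polling pattern requested in place of a constant sleep can be sketched with the standard library alone. The `eventually` helper below is a hypothetical stand-in for gomega's Eventually, written only to show the shape of the technique: poll a condition at a fixed interval and stop as soon as it holds or the timeout expires.

```go
package main

import (
	"fmt"
	"time"
)

// eventually polls cond every interval until it returns true or the timeout
// elapses. It reports whether cond became true in time.
func eventually(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	start := time.Now()
	// The condition becomes true after roughly 50ms, so the helper returns
	// well before the one-second timeout instead of always waiting for it.
	ok := eventually(time.Second, 10*time.Millisecond, func() bool {
		return time.Since(start) > 50*time.Millisecond
	})
	fmt.Println(ok)
}
```

Compared with a fixed `time.Sleep`, the test finishes as soon as the expected state appears and fails with a clear timeout when it never does.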


hostnames := []gatewayv1.Hostname{"foo-8080.com", "foo-8081.com"}
akogatewayapitests.SetupHTTPRoute(t, httpRouteName, DEFAULT_NAMESPACE, parentRefs, hostnames, rules)
time.Sleep(10 * time.Second)
Contributor

Use an Eventually block instead of a constant sleep.

Contributor Author

This sleep cannot be replaced: this step does not create any object that could be checked in an Eventually block.

listeners[1].Protocol = "HTTPS"
akogatewayapitests.UpdateGateway(t, gatewayName, DEFAULT_NAMESPACE, gatewayClassName, nil, listeners)
g.Eventually(func() bool {
gateway, err := akogatewayapitests.GatewayClient.GatewayV1().Gateways(DEFAULT_NAMESPACE).Get(context.TODO(), gatewayName, metav1.GetOptions{})
Contributor

Use the informer client, since a gateway update may update etcd but not the informer cache.

Contributor Author

As discussed, it will be taken up later

parentRefs = akogatewayapitests.GetParentReferencesV1([]string{gatewayName}, DEFAULT_NAMESPACE, []int32{ports[1]})
akogatewayapitests.SetupHTTPRoute(t, httpRoute2Name, DEFAULT_NAMESPACE, parentRefs, hostnames, rules)
listeners[0].Protocol = "HTTPS"
time.Sleep(10 * time.Second)
Contributor

Avoid time.Sleep; use an Eventually block instead.

Contributor Author

This sleep cannot be replaced: this step does not create any object that could be checked in an Eventually block.
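One possible alternative for a step that should create nothing is to assert that a condition keeps holding for a whole time window, which is what gomega's Consistently is meant for. Whether that fits these tests is an assumption; the PR itself keeps the sleep. The `consistently` helper below is a hypothetical standard-library sketch of the idea.

```go
package main

import (
	"fmt"
	"time"
)

// consistently checks cond repeatedly for the whole window and returns false
// as soon as cond fails once. It returns true only if cond held every time.
func consistently(window, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(window)
	for time.Now().Before(deadline) {
		if !cond() {
			return false
		}
		time.Sleep(interval)
	}
	return cond() // final check at the end of the window
}

func main() {
	// createdObjects stands in for "number of objects built from the invalid
	// listener"; in this sketch nothing ever creates one.
	createdObjects := 0
	ok := consistently(100*time.Millisecond, 10*time.Millisecond, func() bool {
		return createdObjects == 0
	})
	fmt.Println(ok)
}
```

This still waits for the full window, like the sleep does, but it fails early and explicitly if an unexpected object shows up.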

conditionMap := make(map[string][]metav1.Condition)

for _, port := range ports {
conditions := make([]metav1.Condition, 0, 1)
Contributor

Instead of make and append, this can be refactored to:
conditions := []metav1.Condition{ {Type: ..., Reason...} }
conditionMap[fmt.Sprintf("%s-%d", gatewayName, port)] = conditions

Contributor Author

Done
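The suggested refactor can be sketched as follows. `Condition` is a local stand-in for metav1.Condition with only a few fields, and the field values and the `buildConditionMap` name are placeholders chosen for this example; only the map key format comes from the review comment.

```go
package main

import "fmt"

// Condition is a local stand-in for metav1.Condition, reduced to the fields
// this sketch needs.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// buildConditionMap creates one single-element condition slice per port using
// a composite literal, with no make-then-append step.
func buildConditionMap(gatewayName string, ports []int32) map[string][]Condition {
	conditionMap := make(map[string][]Condition, len(ports))
	for _, port := range ports {
		conditionMap[fmt.Sprintf("%s-%d", gatewayName, port)] = []Condition{{
			Type:    "Accepted", // placeholder values for illustration
			Status:  "True",
			Reason:  "Accepted",
			Message: "placeholder message",
		}}
	}
	return conditionMap
}

func main() {
	m := buildConditionMap("gateway-01", []int32{8080, 8081})
	fmt.Println(len(m), m["gateway-01-8080"][0].Type)
}
```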

@pkoshtavmware
Contributor Author

pkoshtavmware commented Oct 10, 2024

@pkoshtavmware: In DequeueIngestion, we have logic ''' if !IsHTTPRouteValid(key, httpRoute) { return } ''' If route changes from valid to partial invalid, how are we deleting existing objects?

@akshayhavile, it returns only when the HTTPRoute transitions from valid or partially valid to completely invalid. We were not handling this case before this PR either. Deletion of the existing objects happens on reboot.

@pkoshtavmware
Contributor Author

UT run results:

        github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests               coverage: 0.0% of statements
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/graphlayer    435.184s        coverage: 18.5% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/ingestion     220.697s        coverage: 2.8% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/npltests      60.468s coverage: 16.0% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/status        300.915s        coverage: 12.9% of statements in ./...

@pkoshtavmware
Contributor Author

build ako

@pkoshtavmware
Contributor Author

pkoshtavmware commented Oct 10, 2024

[Seven screenshots attached, dated 2024-10-10 to 2024-10-11]

Contributor

@akshayhavile akshayhavile left a comment

@pkoshtavmware: Do we have a test case that creates the HTTPRoute first and then the Gateway?

@@ -75,10 +76,20 @@ func (o *AviObjectGraph) BuildGatewayParent(gateway *gatewayv1.Gateway, key stri
return parentVsNode
}

func IsListenerInvalid(gwStatus *gatewayv1.GatewayStatus, listenerIndex int) bool {
if len(gwStatus.Listeners) >= int(listenerIndex) && gwStatus.Listeners[listenerIndex].Conditions[0].Type == string(gatewayv1.ListenerConditionAccepted) && gwStatus.Listeners[listenerIndex].Conditions[0].Status == "False" {
Contributor

The condition should be > instead of >= to avoid an out-of-bounds error. The index passed by the caller will always be in range, but we should still change it in case the function gets reused elsewhere.
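A minimal sketch of the check with the strict bound is shown below. The struct types are local stand-ins for the gatewayv1 status types, and `isListenerInvalid` is a hypothetical variant, not the merged code. It also looks the Accepted condition up by type instead of assuming it sits at Conditions[0], which is an extra assumption beyond what the review asks for.

```go
package main

import "fmt"

// Local stand-ins for the gatewayv1 status types, reduced to what is needed.
type Condition struct{ Type, Status string }
type ListenerStatus struct{ Conditions []Condition }
type GatewayStatus struct{ Listeners []ListenerStatus }

// Same string value as gatewayv1.ListenerConditionAccepted.
const listenerConditionAccepted = "Accepted"

// isListenerInvalid reports whether the listener at listenerIndex carries an
// Accepted condition with status "False". A missing status, an out-of-range
// index, or a missing Accepted condition all yield false.
func isListenerInvalid(gwStatus *GatewayStatus, listenerIndex int) bool {
	// Valid indexes are 0..len-1, so the index must be strictly less than len.
	if gwStatus == nil || listenerIndex < 0 || listenerIndex >= len(gwStatus.Listeners) {
		return false
	}
	for _, c := range gwStatus.Listeners[listenerIndex].Conditions {
		if c.Type == listenerConditionAccepted {
			return c.Status == "False"
		}
	}
	return false
}

func main() {
	status := &GatewayStatus{Listeners: []ListenerStatus{
		{Conditions: []Condition{{Type: "Accepted", Status: "True"}}},
		{Conditions: []Condition{{Type: "Accepted", Status: "False"}}},
	}}
	fmt.Println(isListenerInvalid(status, 0), isListenerInvalid(status, 1), isListenerInvalid(status, 2))
}
```

With `>=` in the length check, an index equal to `len(gwStatus.Listeners)` would pass the guard and then panic on the slice access; the strict comparison rules that out.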

@pkoshtavmware pkoshtavmware force-pushed the AV-214603-master branch 2 times, most recently from 8c52b6b to c96eb45 Compare October 11, 2024 11:25
@pkoshtavmware
Contributor Author

Fresh UT run results:

        github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests               coverage: 0.0% of statements
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/graphlayer    449.294s        coverage: 18.5% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/ingestion     220.692s        coverage: 2.8% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/npltests      60.386s coverage: 16.0% of statements in ./...
ok      github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/tests/gatewayapitests/status        300.974s        coverage: 12.9% of statements in ./...

Contributor

@arihantg arihantg left a comment

LGTM

@akshayhavile akshayhavile merged commit 100e36d into vmware:master Oct 11, 2024
2 checks passed
@pkoshtavmware pkoshtavmware deleted the AV-214603-master branch October 15, 2024 06:08