Functional Proposal - Tenant Workload Scheduling Loci Relative To Persistent Storage Node Controls - Akash Provider System #248
arnoldoree started this conversation in Ideas · Replies: 1 comment
-
(for the devs) I think the following node affinity can be leveraged in the
-
I am writing to put forward this proposed priority functional development for the Akash Provider System.
Overview
The proposed development is to add mechanisms to ensure that Tenant pods / workloads subscribed to persistent storage services are only ever scheduled on Kubernetes nodes that are in local proximity to, or have storage-speed network connectivity with, the bare-metal host on which Rook-Ceph persistent storage has been enabled.
Value Proposition
The value on offer from what should, in technical terms, be a limited effort hugely outweighs any architecture, design, and implementation costs.
As a non-exhaustive itemisation, I put forward the following benefits:
1. Reductions in:
1.a. hardware costs,
1.b. power consumption costs, and
1.c. requisite Provider datacentre network complexity.
2.a. Servers within the cluster dedicated to persistent storage can be designed, implemented, and engineered at a premium to meet the in all probability higher continuity requirements of Tenants who subscribe to persistent storage, thus providing persistent storage Tenants the service levels they require for the premium they are prepared to pay, whilst ensuring ephemeral storage Tenants can keep their costs reflective of the levels of continuity they actually require.
Implementation Strategy
The two identified implementation strategies for this functional proposal are:
The Kubernetes taints and tolerations mechanism
The taints and tolerations approach offers the opportunity to automate much of the logical operation, particularly where condition-based Kubernetes taints are applied. This approach would be very good for safeguarding the operation of tenant workloads, as Kubernetes could be employed to systematically ensure that Tenant persistent storage pods / workloads are only ever deployed to nodes where they have fast read and write access to the persistent storage backing to which they are subscribed.
The taints and tolerations mechanism does, however, come with considerably more technical overhead, as it would be necessary to define and measure the targeted and excluded conditions in order for Kubernetes to make its compliant scheduling decisions.
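For illustration only, a minimal sketch of this route follows. The taint key, values, and node names are hypothetical placeholders rather than an existing Akash or Kubernetes convention.

```yaml
# Hypothetical sketch of the taints and tolerations route (key, value, and node
# names are illustrative, not an existing Akash convention).
#
# Nodes without fast read/write access to the Rook-Ceph storage host are tainted,
# either by hand or by a controller watching the relevant node conditions:
#
#   kubectl taint nodes worker-07 storage.locality/remote=true:NoSchedule
#
# Persistent-storage-subscribed pods carry no matching toleration, so the scheduler
# keeps them off such nodes. Ephemeral workloads are given the toleration and can
# still consume the remote capacity:
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-tenant-workload
spec:
  tolerations:
    - key: "storage.locality/remote"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: app
      image: nginx:1.25
```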
The Kubernetes labels and affinity / anti-affinity mechanism
The labels and affinity / anti-affinity mechanism will be far simpler to implement, yet offers less scope for automation than taints and tolerations. Labelling nodes for persistent storage is already a manual engineer action in the Persistent Storage Enablement procedure, which is an annex to the general Akash Provider Build procedure.
The gain, aside from greater simplicity of implementation, is that more control and choice may be placed in the Provider-Engineer's hands.
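A minimal sketch of how this mechanism pairs up follows; the label key and value are placeholders, not an agreed convention.

```yaml
# Hypothetical sketch of the labels + affinity route (label key and values are
# placeholders only).
#
# The Provider-Engineer labels a suitable worker node by hand, much as is already
# done during persistent storage enablement:
#
#   kubectl label node worker-03 storage.locality=local
#
# The provider then adds a required node-affinity rule to pods of Tenants who
# have subscribed to persistent storage:
apiVersion: v1
kind: Pod
metadata:
  name: persistent-storage-tenant-workload
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: storage.locality
                operator: In
                values: ["local"]
  containers:
    - name: app
      image: nginx:1.25
```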
Strategic Choice
In my mind it is necessary to recognise that both mechanisms will, in the medium to long term, need to be applied to realise and develop this proposed functionality, especially given that persistent storage is only the first domain where this functionality finds its relevance. GPU tenant subscription is another obvious domain where Tenant workload locality within the Provider cluster will be important, if not critical, with or without high-speed networking.
As the work required of this proposed functionality grows, it will be necessary to apply the full range of tools to realise its simple and reliable performance across multiple domains.
<ADDITION | Hypervisor high-availability contexts are another setting that will require intelligent application of both mechanisms, given that a hypervisor may of necessity change the bare-metal host running the virtual machine that hosts a given Tenant persistent-storage-subscribed pod / workload. In this context it will therefore be necessary to apply a material degree of dynamic condition recognition and accounting in order to realise scheduling that complies with Kubernetes Tenant workload locality policy. Such dynamic recognition and accounting is found in the condition-based taint mechanism.>
Further to the above, it will in my view be the right course to implement the simpler labels + affinity/anti-affinity mechanism in a first iteration.
Implementation Design
In my view the labels + affinity/anti-affinity implementation should be closely related to, but independent of, the present beta1/beta2/beta3 labelling for persistent-storage-enabled nodes. The reason is that there will be considerable variance in Provider-Engineer architectures and networks, and specific variables need to be provided to allow Provider-Engineers to tailor this proposed functionality to their specific provider context.
I would therefore propose a four-label system consisting of:
storage.node.enabled
~ Persistent storage is enabled on this worker node
storage.node.local
~ Persistent storage is not enabled on this node, but this node is based on a virtual machine on the same local host as the target storage node
storage.node.network
~ Persistent storage is accessible over a high speed storage network
storage.node.remote
~ Persistent storage is on the other side of a low speed management or other network
I hope the above label variables articulate clearly enough how the placement of Tenant persistent-storage-subscribed pods / workloads could be granularly tailored by the Provider-Engineer to suit their particular provider context. They could ensure that persistent-storage-subscribed Tenant workloads are guaranteed low-latency locality to their target persistent storage node, whilst at the same time ensuring that non-persistent-storage Tenant workloads are not ill-placed so as to occupy capacity in close proximity to persistent storage nodes that are irrelevant to them.
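A hedged sketch of such tailoring follows, using the four proposed label keys; everything else in the manifests (pod names, images, weights, the preference ordering) is illustrative and would be a Provider-Engineer design choice.

```yaml
# Sketch only: the label keys are those proposed above, all other details are illustrative.
# Persistent-storage-subscribed Tenant workload: must land on an enabled, local, or
# storage-network node (the nodeSelectorTerms are ORed), with a preference here for
# a node on the same bare-metal host as the storage node.
apiVersion: v1
kind: Pod
metadata:
  name: persistent-storage-tenant
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: storage.node.enabled
                operator: In
                values: ["true"]
          - matchExpressions:
              - key: storage.node.local
                operator: In
                values: ["true"]
          - matchExpressions:
              - key: storage.node.network
                operator: In
                values: ["true"]
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: storage.node.local
                operator: In
                values: ["true"]
  containers:
    - name: app
      image: nginx:1.25
---
# Ephemeral-storage Tenant workload: kept off storage-enabled and storage-local nodes
# so that capacity close to the storage node is not occupied by workloads that do not
# need it (NotIn also matches nodes that carry none of these labels).
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-tenant
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: storage.node.enabled
                operator: NotIn
                values: ["true"]
              - key: storage.node.local
                operator: NotIn
                values: ["true"]
  containers:
    - name: app
      image: nginx:1.25
```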
Arnold Opio Oree
arnoldoree@parallaxgo.com
Parallax Cloud Compute™
Quick note: it would be great if there were a "Feature & Function Decision(s)" discussion category, where features and functions could be put forward together with discussion of interested-party appetite and key implementation decision points. "Ideas" feels too vague for this intended focused, collaborative feature and function development discussion.