How to do dns based pooling? #9
Hey! First off, you'll probably want to stick to `next`. Well, as far as I know, none of the existing memcached proxies allowed for DNS-based pool definitions, so we definitely don't support that out of the box. If you give a hostname to a backend, it'll pick the first DNS response and stick to that forever. That said, it's doable, but you'll have to set up a connector to load your server list. There are a few options:
It's pretty easy to mix a cron file with routelib, which you do by passing multiple start scripts (in order). This sounds a little complicated, but because the system is flexible about where the server list comes from, it's designed to plug into any kind of server discovery already. The user might just need to do a little work to get the server list into the thing :) If you can give some hints as to which direction would work best for you, I can help come up with a more complete framework example to save you some time. |
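As a rough sketch of the cron approach: an external discovery job writes one host:port per line to a file, the config reads it at load time, and a cron reloads the config only when the contents change. The file path and the `load_backends` helper below are illustrative assumptions, not routelib conventions; `pools`, `routes`, `route_direct`, `mcp.register_cron`, and `mcp.schedule_config_reload` are the same calls used in the configs later in this thread.

```lua
-- Sketch: keep the backend list in a file maintained by external discovery,
-- and reload the proxy config whenever that file changes.
local SERVER_LIST = "/etc/memcached/servers.txt" -- assumed path

local function load_backends(path)
    local backends = {}
    local f = io.open(path, "r")
    if f == nil then
        return { "localhost:11211" } -- fall back so the proxy still starts
    end
    for line in f:lines() do
        if line ~= "" then
            table.insert(backends, line) -- one "host:port" per line
        end
    end
    f:close()
    table.sort(backends) -- stable order between reloads
    return backends
end

local backend_list = load_backends(SERVER_LIST)

pools{
    main = { backends = backend_list },
}

routes{
    cmap = {},
    default = route_direct{ child = "main" },
}

-- Poll the file every 5 seconds; only reload when the contents change.
mcp.register_cron("filewatch", {
    every = 5,
    func = function()
        local fresh = load_backends(SERVER_LIST)
        if table.concat(fresh, ",") ~= table.concat(backend_list, ",") then
            mcp.schedule_config_reload()
        end
    end,
})
```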
oh also: the reason why |
Thanks for the quick reply! So far the proxycron is the most appealing. Question though: why do I see the proxy hitting some of the instances? If what you said was true
Is it because there are 4 threads? It seems the IP address order is randomized when it's queried; am I just seeing the result of some of the threads getting a different address? |
Just in case it wasn't obvious from the above: you could dump your whole routelib-focused config into the cron file as well. If it's a big enough thing we can look into adding some internal DNS utilities. |
This, yup. Each worker thread has its own connection to the backend servers (unless you flip an option to turn on a consolidating IO thread). So you're seeing the result of each thread's DNS lookup. |
Definitely easier from inside the conf.lua; I want as few dependencies as possible, as far as I am willing to push it.
The DNS utils inside memcached sound great! I'm gonna try following the cron template and use luasocket to see what I can come up with. |
Sounds good. Please don't suffer too long though; if something is frustrating that's probably a pain point for me to clarify, document, or patch if necessary. Though we probably wouldn't see direct DNS utilities for a while. So if something is going sideways you can show me what you have so far and I'll adjust the framework or add an example config to the routelib repo or something. The proxy's in pretty heavy use but not that many total users so far, so I've been focusing on polish this year to help more varied use cases. Good luck! Thanks for checking it out. |
Ok, made some progress:

package.loaded["socket.core"] = nil
local s = require("socket.core")
verbose(true)
debug(true)
-- to be set via env later
local port = "11211" --string
local function dns_to_ips(address, socket)
local info = socket.dns.getaddrinfo(address)
if info == nil then
print("NO DNS IPs!")
return {}
end
local ips = {}
for index, value in ipairs(info) do
ips[index] = value.addr..':'..port
end
return ips
end
settings{
active_request_limit = 100,
backend_connect_timeout = 3,
}
pools{
myself = {
backends = {
"localhost:11211",
}
},
other = {
backends = dns_to_ips("memcached", s)
},
}
-- using cmap for the future and to see what different routing policies do
routes{
cmap = {
[mcp.CMD_GET] = route_allfastest{
children = { "other" },
},
[mcp.CMD_SET] = route_allfastest{
children = { "other" },
},
},
default = route_direct{
child = "myself",
},
}
--variance setup so that multiple instances don't get synced and spike the DNS server.
math.randomseed(os.time())
local variance = math.random(3, 6)
mcp.register_cron("dns",
{ every = variance,
func = function ()
mcp.schedule_config_reload()
end })

The hardest part was actually installing and figuring out luasocket. Sadly, luasocket's DNS capabilities are very, very basic. I can also see it reloading, so that's nice!
The IPs get shuffled every reload. A different problem: I am probably doing it wrong, but none of the existing route handlers are giving me the functionality I'm after.
What is the conceptual reasoning behind pools? I probably missed it on the wiki. Should I be creating a pool-per-backend? Or am I missing something crucial? |
Congrats! Well, I was hoping saying "pool" everywhere would be pretty clear, but this is actually the second time this week it's confused someone :) I'll see about adding some extra examples/docs somewhere.
The primary use case of memcached is "hash keys against a list of servers, and store/fetch that key from exactly one server", thus adding servers to a "pool" increases available memory. In the proxy a pool is exactly this (there are configuration options to control the key hashing).
The routes are for directing/copying keys between pools. So "allfastest" will route a key to all pools. It doesn't know/care if a pool is a single server or 100. The concept of copying keys between pools was originally for things like "Datacenter A and B" or "availability zones 1/2", or racks/cages/etc.
So if you just have a small list of memcached servers and you want keys copied to all of them, yes, you need to create one pool per backend. You can create a "pool set" to make the configuration a bit easier. There's an example here: #7 (comment) |
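For illustration only (addresses and pool names are made up), the one-pool-per-backend shape in routelib terms might look roughly like this, using the same route handlers that appear later in this thread:

```lua
pools{
    cache_a = { backends = { "10.0.0.10:11211" } },
    cache_b = { backends = { "10.0.0.11:11211" } },
    cache_c = { backends = { "10.0.0.12:11211" } },
}

routes{
    cmap = {
        -- copy every set to all pools (here: all single-server pools)
        [mcp.CMD_SET] = route_allsync{
            children = { "cache_a", "cache_b", "cache_c" },
        },
        -- fan the get out to every pool and take the fastest answer
        [mcp.CMD_GET] = route_allfastest{
            children = { "cache_a", "cache_b", "cache_c" },
        },
    },
    default = route_allfastest{
        children = { "cache_a", "cache_b", "cache_c" },
    },
}
```

As I understand the thread, a pool set just bundles those per-backend pools under one name so the route children don't have to list them individually.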
Fwiw you should see if you can move more of the logic into the cron so you're only reloading the configuration if the backends actually change. It's designed to be reloaded very frequently but it is a bit wasteful on CPU. |
Agreed.

-- imports
package.loaded["socket.core"] = nil
local s = require("socket.core")
-- env
local port = os.getenv("BACKEND_PORT") or "11211" --string
local dnsname = os.getenv("BACKEND_HOSTNAME") or "memcached" --string
-- setup
verbose(true)
debug(true)
function DNS2IPS(address, socket)
local info = socket.dns.getaddrinfo(address)
if info == nil then
-- say("DNS returned no IPs!")
return {}
end
local ips = {}
for index, value in ipairs(info) do
ips[index] = value.addr..':'..port
end
-- dsay("Got DNS: "..dump(ips))
return ips
end
-- https://stackoverflow.com/a/54140176
function SIMPLE_TABLE_COMPARE(tA, tB)
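-- note: concat equality assumes both lists are sorted; it could also give false
-- positives if entries ran together ambiguously, which "ip:port" strings won't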
return table.concat(tA) == table.concat(tB)
end
----
-- settings
settings{
active_request_limit = 100,
backend_connect_timeout = 3,
}
-- dns
DNS_IPS = DNS2IPS(dnsname, s)
table.sort(DNS_IPS) -- performance is 1s per 1M entries
-- pools
pools{
myself = {
backends = {
"localhost:11211",
}
},
other = {
backends = DNS_IPS
},
}
-- routes
routes{
cmap = {
[mcp.CMD_GET] = route_allfastest{
children = { "other" },
},
[mcp.CMD_SET] = route_direct{
child = "other",
},
},
default = route_direct{
child = "myself",
},
}
-- cron
math.randomseed(os.time())
local variance = math.random(3, 6)
mcp.register_cron("dns",
{ every = variance,
func = function ()
new_ips = DNS2IPS(dnsname, s)
table.sort(new_ips)
if not SIMPLE_TABLE_COMPARE(new_ips, DNS_IPS) then
mcp.schedule_config_reload()
end
end })
Unsure why, but |
I'm going to see what the zoned config can do. |
Using zpools:

package.loaded["socket.core"] = nil
local s = require("socket.core")
local port = os.getenv("BACKEND_PORT") or "11211" --string
local dnsname = os.getenv("BACKEND_HOSTNAME") or "memcached" --string
verbose(true)
debug(true)
local_zone("zlocal") -- we NEED to define a local zone
function GET_DNS(address, socket)
local info = socket.dns.getaddrinfo(address)
if info == nil then
-- say("NO DNS IPs!")
return {}
end
return info
end
-- IPP = IP:PORT
function DNS2IPPS(address, socket)
local info = GET_DNS(address,socket)
local ipps = {}
for index, value in ipairs(info) do
ipps[index] = value.addr..':'..port
end
-- dsay("Got DNS: "..dump(ips))
return ipps
end
function IPPS2POOLS(ipps)
local pools = {}
for index, ipp in ipairs(ipps) do
pools[ipp] = { ["backends"] = { [1] = ipp } }
end
-- dsay("Generated pool: "..dump(pools))
return pools
end
-- https://stackoverflow.com/a/54140176
function SIMPLE_TABLE_COMPARE(tA, tB)
return table.concat(tA) == table.concat(tB)
end
settings{
active_request_limit = 100,
backend_connect_timeout = 3,
}
DNS_IPPS = DNS2IPPS(dnsname, s)
table.sort(DNS_IPPS) -- performance is 1s per 1M entries
local bmyself = {
backends = {
"localhost:11211",
}
}
local szmain = IPPS2POOLS(DNS_IPPS)
szmain["zlocal"] = bmyself
-- pools
pools{
pmyself = bmyself,
set_zmain = szmain
}
-- routes
routes{
cmap = {
[mcp.CMD_GET] = route_zfailover{
children = "set_zmain",
stats = true,
miss = true,
},
[mcp.CMD_SET] = route_allsync{
children = "set_zmain",
},
},
default = route_direct{
child = "pmyself",
},
}
math.randomseed(os.time())
local variance = math.random(3, 6)
mcp.register_cron("dns",
{ every = variance,
func = function ()
candidate_ipps = DNS2IPPS(dnsname, s)
table.sort(candidate_ipps)
if not SIMPLE_TABLE_COMPARE(candidate_ipps, DNS_IPPS) then
mcp.schedule_config_reload()
end
end })
The idea is to make the
Now that I am sorting the order of the DNS responses, let me try the original settings again. Maybe it'll hash/route correctly now. I would prefer it to function the way you described pools. Currently it's copying all the keys, which is cool for HA, but less cool for scaling. You mentioned there are distribution options available? I saw this in the Wiki:
How do I set that up? |
Ok cool. Doing this now works fine:

-- pools
pools{
pmyself = {
backends = {
"localhost:11211",
}
},
pother = {
backends = DNS_IPPS
},
}
-- routes
routes{
cmap = {
[mcp.CMD_GET] = route_direct{
child = "pother",
},
[mcp.CMD_SET] = route_direct{
child = "pother",
},
},
default = route_direct{
child = "pmyself",
},
}

I can always
Sadly, if the IP table changes, chances are high that the new table will hash differently, which feels wrong to me. The key is in the pool, just not where you expect it to be. Depending on your use case this could be a non-issue: short-lived keys (5s to 1m) would practically be unaffected; long-lived and "permanent" keys would start showing issues. I might do something like this:

local bmyself = {
backends = {
"localhost:11211",
}
}
local szmain = IPPS2POOLS(DNS_IPPS)
szmain["zlocal"] = bmyself
-- pools
pools{
pmyself = bmyself,
pother = {
backends = DNS_IPPS
},
set_zmain = szmain
}
-- routes
routes{
cmap = {
[mcp.CMD_GET] = route_zfailover{
children = "set_zmain",
stats = true,
miss = true,
},
[mcp.CMD_SET] = route_direct{
child = "pother",
},
},
default = route_direct{
child = "pmyself",
},
}

That guarantees I set a key somewhere once, and I can find it wherever it is. I could do:

[mcp.CMD_SET] = route_zfailover{
children = "set_zmain",
stats = true,
shuffle = true,
},

That would always set zlocal first. |
Ok I figured out something a little goofy.

-- pool sets
local szmain = IPPS2POOLS(DNS_IPPS)
szmain["zlocal"] = {
backends = DNS_IPPS
}
-- pools
pools{
pmyself = {
backends = {
"localhost:11211",
}
},
set_zmain = szmain
}
-- routes
routes{
cmap = {
[mcp.CMD_GET] = route_zfailover{
children = "set_zmain",
stats = true,
miss = true,
},
[mcp.CMD_SET] = route_zfailover{
children = "set_zmain",
stats = true,
shuffle = true,
},
},
default = route_direct{
child = "pmyself",
},
}

Basically I am setting:

set_zmain:
"10.0.0.110:11211":
backends: ["10.0.0.110:11211"]
"10.0.0.111:11211":
backends: ["10.0.0.111:11211"]
"10.0.0.112:11211":
backends: ["10.0.0.112:11211"]
zlocal:
backends:
- "10.0.0.110:11211"
- "10.0.0.111:11211"
- "10.0.0.112:11211" Now i get the best of both :^).
This does mean that zombie keys could happen. Consider a situation where a key is set with a 5m TTL:
So in this situation we got old data, sadly, and a client expecting fresh data won't know. :( Regardless, I should definitely do:

[mcp.CMD_DELETE] = route_allsync{
children = "set_zmain",
},

Deletes should go everywhere. Maybe I should just do the set route-all and hope the amplification isn't too bad? Maybe a stop job? If an instance is going away, maybe it should issue a global delete for every key it contains? |
How often are instances being added/removed from your setup here? Sorry you wrote a bunch here; can you back up and maybe clearly write something short about what your ultimate goal is? :) Then maybe I can clarify what's going on here. A few random questions:
On the hashing stuff, TL;DR:
I can probably give you some clear ideas based on your goals once you clarify. Usually people's lists of memcached servers are fairly static; they get added to or removed from very rarely. Thus people usually take a "one time hit" when a server dies and is replaced, and they have extra misses for a while. If this isn't acceptable, we use the proxy to make extra copies of keys across pools. It kinda looks like you're currently sitting halfway between these two ideas. :) |
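To make the "extra copies of keys across pools" idea concrete, a minimal sketch (pool names and addresses are invented; route_allsync and route_failover are the handlers already used elsewhere in this thread):

```lua
pools{
    copy_a = { backends = { "10.0.1.10:11211", "10.0.1.11:11211" } },
    copy_b = { backends = { "10.0.2.10:11211", "10.0.2.11:11211" } },
}

routes{
    cmap = {
        -- write every key to both copies
        [mcp.CMD_SET] = route_allsync{
            children = { "copy_a", "copy_b" },
        },
        -- read from the first copy; on failure (or miss) try the second
        [mcp.CMD_GET] = route_failover{
            children = { "copy_a", "copy_b" },
            miss = true,
        },
    },
    default = route_failover{
        children = { "copy_a", "copy_b" },
    },
}
```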
Looks like I can make some tweaks to routelib to make this easier, but let's see what we're trying to accomplish here first. I definitely need to set up ordered pool sets and add more examples for overriding pool options. This is all good stuff for understanding what I need to document still, thanks! :) |
What I am doing is practising deploying workloads on k8s, in this case scaling memcached. k8s does have zone primitives, but actually getting them into the applications is a little undefined. I want to get the simple multi-server deployment correct before I start problem-solving the zones.
The problem space is dealing with the very dynamic structure of deployments (especially in cloud environments): in general you can expect at minimum one pod (memcached server instance) to get deleted and replaced once every day. K8s has a way to inject the IP addresses of all the deployed instances into a DNS record available in-cluster (it's called a headless service), but that means the client needs to deal with balancing over all the instances (not guaranteed to be the case). I wanted a simple gateway into the distributed memcached deployment, and the proxy seemed the best way to do it.
At this point it is very close to working as I originally intended. Testing with instances dropping and coming back, requests are routed correctly. I'm going to do some more in-depth testing and report my findings here on how the |
Hey, was out last week. You're definitely swimming upstream a little. Hopefully you can make a small adjustment to make things easier (and take a look at a new example I just uploaded).
To restate your problem: you have a list of servers that come back from a DNS entry as IP addresses, i.e.:
Since k8s is complete chaos, your list can change at random, like:
This doesn't play well with how the proxy (or any memcached client) works. You need to stabilize the list order, then things will get easier. Let me walk through this for jump hash or ring hash.

Jump hash
This is actually an array of servers, 1/2/3. Your system has a desired count of servers (3 in this case), and if a server is replaced it needs to go back into the same slot. I.e.: if 10.0.0.2 dies and is replaced with 10.0.0.5, the array should update to:
Now jump hash is perfectly happy: all entries that originally mapped to 10.0.0.2 now go to 10.0.0.5, and no keys move positions. If you want to add a new server to the list, you add it to the end, i.e.:
... this works well with jump hash. If servers are only added to or removed from the end of the list, a minimal number of keys end up rehashed.

Ring hash
Ring hash is more resilient to the server list reordering: this is because internally it is a hash map keyed on the host/ip/port of the server. Unfortunately, with the options it has today it can't look only at a backend label, so in the previous example of
Both
Ensuring a dead server comes back with the same IP can help simplify things in both cases. Ensuring the length of the list of servers doesn't change for no reason helps a lot for the stability of the cache system. It's not designed to have the list randomly contract and expand: the number of servers should be a deliberate calculation.

Gutter example
In this example: https://github.com/memcached/memcached-proxylibs/blob/main/lib/routelib/examples/failover-gutter.lua we show how to handle the temporary loss of a server with less impact to clients. In short: if a backend is down we fail over to another pool, or to a remapped backend list within the same pool. At the same time we adjust the TTL of set commands so these failed-over cache entries won't stay around for long periods, which improves bad-cache scenarios.
This can bridge a gap: if you can get k8s to keep a stable list of servers, but they still die and are replaced with some frequency, and it can take some time (minutes/etc.) to update the server list, the gutter cache can help keep up your hit rate. |
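To make the jump-hash "same slot" advice above concrete, here is a plain-Lua sketch of how a config could keep slot positions stable as the resolved set of IPs changes. The `update_slots` helper and the addresses are made up for illustration; this is bookkeeping inside the config, not a routelib feature:

```lua
-- slots is the source of truth for backend order; its length is the desired
-- server count and should only change deliberately.
local slots = {
    "10.0.0.1:11211",
    "10.0.0.2:11211",
    "10.0.0.3:11211",
}

-- Given a freshly resolved address list, keep surviving backends in their
-- existing slots and drop newcomers into the slots whose old address vanished.
local function update_slots(slots, fresh)
    local fresh_set, current = {}, {}
    for _, addr in ipairs(fresh) do fresh_set[addr] = true end
    for _, addr in ipairs(slots) do current[addr] = true end

    -- addresses we haven't seen before, in resolution order
    local newcomers = {}
    for _, addr in ipairs(fresh) do
        if not current[addr] then table.insert(newcomers, addr) end
    end

    -- fill dead slots in place; if there's no replacement yet, keep the old
    -- address so the list length (and every other slot) stays put
    for i, addr in ipairs(slots) do
        if not fresh_set[addr] then
            slots[i] = table.remove(newcomers, 1) or addr
        end
    end
    return slots
end
```

With something like that in place, only keys that hashed to the replaced slot move, which matches the jump hash behavior described above.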
I got some time to check on it today. When it's working, it works great! But I am experiencing multiple problems at the moment.

Gutter
This snippet:
children = { "foo", route_ttl{ ttl = 300, child = "gutter" } },
isn't working for me. IDK if I am doing something wrong, but even the code does not look like it's expecting that: memcached-proxylibs/lib/routelib/routelib.lua lines 1216 to 1232 in b0c73a7 and memcached-proxylibs/lib/routelib/routelib.lua lines 901 to 928 in b0c73a7 don't seem interface-able. Right now I am just using it like so:
children = { "foo", "gutter" },

Crons
Something regressed. Crons are broken for me. It works about 80% of the time on startup, then it executes the first reload and it always exits with
The 80% I can't quite tell if it's this issue or something else, but I am going to assume it is. When I remove the cron stanza it's fine. I've tried running it once with a

Valgrind
Setup with
Seems to happen if you run

Minimal Config
If you want to test it out, this is a minimal config to cause it to happen:

-- optional verbose and debug. Does not affect outcome
verbose(true)
debug(true)
pools{
pmyself = {
backends = {
"localhost:11211",
}
},
}
routes{
cmap = {},
default = route_direct{
child = "pmyself",
},
}
mcp.register_cron("test",
{ every = 1,
func = function ()
print("reloaded")
mcp.schedule_config_reload()
end })
Bad Version
Memcached Tested:

Result
Testing it broken yielded positive results so far! |
Hey, I'll look into the cron failure, thanks for reporting! Can you give a more complete example for what "doesn't work" with routelib is calling the Thanks! |
Hey, I've pushed a fix for the segfault. I'll add the missing unit tests soon then maybe cut a bugfix release. Thanks for the report! My test suite didn't catch it because many of the standard tests are using an older API that didn't use the builtin router object. :/ |
Awesome!

Cron
Skip Hash (Ketama)
I'm testing with skip hash and I think I might have discovered another issue.
It doesn't SEGFAULT, but I think it's affecting the consistency of the ketama; when I was testing, it would sometimes freak out and then stabilize.

Route TTL
Not sure if I am doing something wrong.
Failed to execute mcp_config_routes: invalid argument to new_handle |
Can you please include the configs you're using that aren't working, like you did for the cron issue? I don't have a lot of time to go guessing :( Thanks! |
Apologies:

Skip hash
I am definitely doing something weird. Not sure what is causing it to happen; trying some minimal stuff does not reproduce it. I'll do some more in-depth stuff later. Ignore this issue for now.

Route TTL
Minimal config to get it to happen:

local stat_list = {
"10.244.0.84:11211",
"10.244.0.86:11211",
"10.244.0.85:11211",
}
pools{
main = {
backends = stat_list ,
},
gutter_same = {
options = { seed = "failover" },
backends = stat_list,
},
}
routes{
cmap = {
[mcp.CMD_SET] = route_failover{
children = { "main", route_ttl{ ttl = 10, child = "gutter_same" } },
stats = true,
miss = false,
},
},
default = route_failover{
children = { "main", "gutter_same" },
},
} |
@WesselAtWork apologies for the delay. Just pushed a fix that should make the gutter example work again. Had a one-character typo. I haven't further tested the example yet, but I'll try to do that myself when I get a minute. It should work fine now. I did test that it starts; I just didn't try to push traffic. |
@WesselAtWork did you end up doing anything with this? I might keep the issue open as a reference, or I'll distill it into another issue (for support on working with DNS pools). |
Currently trying to do a simple replication proxy by using a DNS name that contains all the other memcached IP addresses. Using the DNS name works, kinda...

Setup
memcached contains 3 IPs, and the order is randomized every time I invoke it: nslookup memcached
memcached -vv --port=22122 -o proxy_config=test.lua
test.lua is simple.lua in the same dir with a small modification (changed the o to a c). Startup looks like:

Results
Setting a unique key in each instance separately, I can kind of get them from the proxy, but it's inconsistent.
10.0.0.11 is the most consistent: requesting the key that's located here, I (almost) always get a response.
10.0.0.10 is less consistent; I get it about 40% to 50% of the time.
Weirdly, 10.0.0.12 is never queried! The proxy never hits this one!
Also of note: the 4 "setting up a zoneless flat pool" lines. I would expect it to only output 3 because there are only three IPs, but it called mcp_config_routes 4 times? Maybe there is a bad address in the list?
There is obviously some kind of resolution going on inside of memcached, but it's not transferring correctly to the proxy, or my env is messing with the setup. I don't mind digging into the Lua myself, but if possible can you send me on the right track? Where do I need to focus?
As a start I should probably install luasocket and then hack around the DNS module, but if I can avoid that, it would be ideal.
Additionally, I wonder about freshness. How does the resolution function? Is it a one-and-done, or does it query each time?