
Using aedes-persistence-redis with Aedes results in slow client connections #10

Open
rasalasantosh opened this issue Jan 6, 2017 · 67 comments

@rasalasantosh

When aedes-persistence-redis is used as the persistence layer with Aedes, client connections are very slow: for 20k client connections it was taking around 10 minutes. But without persistence configured, connections are very fast, around 50k connections in less than a minute.

Please find below the code used to run the Aedes server.

var redis = require('mqemitter-redis');
var aedesPersistenceRedis = require('aedes-persistence-redis');

var mq = redis({
  port: 6379,
  host: '172.31.38.96',
  db: 12
});

var persistence = aedesPersistenceRedis({
  port: 6379,
  host: '172.31.38.96'
});

var aedes = require('aedes')({
  mq: mq,
  persistence: persistence
})

var server = require('net').createServer(aedes.handle)
var port = 1883

server.listen(port, function () {
  console.log('server listening on port', port)
})

@mcollina
Collaborator

mcollina commented Jan 6, 2017

Can you upload a Node.js script to reproduce the problem?

@rasalasantosh
Author

rasalasantosh commented Jan 6, 2017

I don't have a Node.js script; I tried with a Java-based client program. In the program there is a while loop which creates a new client, connects it to the server and, after connecting, subscribes to a topic whose value is the same as the clientId.

@mcollina
Collaborator

mcollina commented Jan 6, 2017

Anything that can reproduce the problem is good; otherwise it will be extremely hard to fix.

@mcollina
Collaborator

mcollina commented Jan 6, 2017

cc @GavinDmello

@GavinDmello
Collaborator

Yea. This is an issue. We have a lot of anonymous functions in the persistence which are preventing optimizations, I think. Also, ResultsHolder from fastseries always seems to be un-optimized.

@mcollina
Collaborator

mcollina commented Jan 7, 2017

@GavinDmello removing the anonymous functions should be feasible, do you want to send a PR? They are mostly forEach/reduce etc. that need to be changed into straight for loops.

ResultsHolder should be optimized if the rest is optimized. There might be a small issue somewhere in there too.

At any rate, we need a script to verify this problem is fixed.
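
For illustration only (this is not code from the actual persistence, and topicsFromSubs is a made-up helper), a minimal sketch of the kind of change being discussed: replacing a forEach with an anonymous callback by a plain for loop.

  function topicsFromSubs (subs) {
    // Before: subs.forEach(function (sub) { topics.push(sub.topic) })
    // A plain for loop avoids allocating a closure per call, which older
    // V8 versions optimized more reliably.
    var topics = []
    for (var i = 0; i < subs.length; i++) {
      topics.push(subs[i].topic)
    }
    return topics
  }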

@GavinDmello
Collaborator

@mcollina Yea. That would be cool. I'll make a PR.
@rasalasantosh Could you share that Java client script? It would be helpful.

@rasalasantosh
Author

@mcollina, kindly share the client script that you used to test 900k connections, so that I can modify it according to my use case and test.

@mcollina
Collaborator

mcollina commented Jan 9, 2017

@rasalasantosh as said elsewhere, I did not do that test myself.

@mcollina
Collaborator

@rasalasantosh can you try version v2.6.1 and see if it improves things?

@rasalasantosh
Author

@mcollina, is it Redis v2.6.1? If so, I will try and let you know.

@GavinDmello
Collaborator

@rasalasantosh v2.6.1 of this module

@rasalasantosh
Author

@GavinDmello, I installed aedes using npm i aedes aedes-persistence-redis@2.6.1 --save and tried again.

It is still taking around 11 minutes for 25k connections, but without persistence it was taking less than a minute for 50k connections.

I also tried mosca with persistence; it was taking around 3 minutes for 25k connections.

@mcollina
Collaborator

@rasalasantosh what options are you connecting with? Are you also subscribing to a topic?

@rasalasantosh
Author

@mcollina, yes, after the connection is successful I am subscribing to a topic as well.

@mcollina
Collaborator

@rasalasantosh you should check v2.6.3, it has Lua script caching, which should improve the throughput.
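
For context, one common way to get Lua script caching with ioredis (a hedged illustration only; this is not necessarily how v2.6.3 implements it, and getRetained is a made-up command name): defineCommand loads the script once and then issues EVALSHA, so the script body is not resent on every call.

  var Redis = require('ioredis')
  var db = new Redis()

  db.defineCommand('getRetained', {
    numberOfKeys: 1,
    lua: "return redis.call('get', KEYS[1])"
  })

  // Subsequent calls reuse the cached script via its SHA1 (EVALSHA).
  db.getRetained('retained:some/topic', function (err, value) {
    console.log(err || value)
  })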

@rasalasantosh
Author

@mcollina, I have tested again, but it is giving the same results, i.e. 11 minutes for 25k connections.

@mcollina
Collaborator

@rasalasantosh what options are you connecting with? Are you also subscribing to a topic?
From our test it takes just 30 seconds to make 20k connections, so maybe something else is happening.

@rasalasantosh
Author

rasalasantosh commented Jan 16, 2017

@mcollina, I am connecting with clean session = true, QoS = 1 and keepalive = 60 sec. After connecting we subscribe to a topic which is the same as the clientId.

Please also share the test script, so I can verify with your script on my side as well.
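
For reference, a minimal Node.js sketch of a client using the options described above (this uses the mqtt package with a made-up broker address and clientId; it is not the Java client from this thread): it connects with clean session, a 60-second keepalive, and subscribes at QoS 1 to a topic equal to its clientId.

  var mqtt = require('mqtt')

  var clientId = 'client-0001'
  var client = mqtt.connect('mqtt://broker.example.com:1883', {
    clientId: clientId,
    clean: true,    // clean session = true
    keepalive: 60   // seconds
  })

  client.on('connect', function () {
    // Subscribe at QoS 1 to a topic whose name equals the clientId.
    client.subscribe(clientId, { qos: 1 }, function (err) {
      if (err) console.error('subscribe failed', err)
    })
  })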

@GavinDmello
Collaborator

@rasalasantosh Could you try it with this? https://gist.github.com/GavinDmello/dda612661df4db11ba84b41097470c95
It's not currently subscribing to a topic, and clean is true.

@rasalasantosh
Author

rasalasantosh commented Jan 16, 2017

@GavinDmello, Aedes is taking 142 sec to connect 50k connections without subscriptions, but 1360 sec with subscriptions.
I tested the same with mosca: it is taking 190 sec without subscriptions and 205 sec with subscriptions.

Please find below the script used to test subscriptions.

client-sub.txt

@mcollina
Collaborator

@rasalasantosh that is with Redis configured as the persistence, correct?

I think createRetainedStream is where the problem is: https://github.com/mcollina/aedes-persistence-redis/blob/master/persistence.js#L98-L105. I'll try to send a PR to improve things.

@mcollina
Collaborator

@rasalasantosh check v2.6.4.

@rasalasantosh
Author

@mcollina with v2.6.4 I am getting the same results.

@GavinDmello
Collaborator

@rasalasantosh Try removing all unwanted keys from Redis. Also, is Redis running on the same box? Maybe you're running out of memory.

@rasalasantosh
Author

@GavinDmello, I tried after flushing Redis, but I am getting the same results. Moreover, Redis, Aedes and the client are running on separate instances.

@mcollina
Collaborator

Mosca is using a hashtable for the retained messages, while we are just using plain keys and SCAN here: https://github.com/mcollina/mosca/blob/master/lib/persistence/redis.js#L241-L267.

Maybe that's the approach we should take here as well. I thought using SCAN was smart.
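
As a hedged illustration of the two storage strategies being compared (the key names below are made up, and this is not the actual persistence code): keeping all retained messages in a single hash lets you read them back with one HGETALL, whereas one key per topic forces a SCAN over the keyspace to enumerate them.

  var Redis = require('ioredis')
  var db = new Redis()

  // Hash approach (Mosca-style): one round trip, no keyspace scan.
  function storeRetainedHash (packet, cb) {
    db.hset('retained', packet.topic, JSON.stringify(packet), cb)
  }
  function readAllRetainedHash (cb) {
    db.hgetall('retained', cb)
  }

  // Per-key approach: needs SCAN with a MATCH pattern to find the topics.
  function storeRetainedKey (packet, cb) {
    db.set('retained:' + packet.topic, JSON.stringify(packet), cb)
  }
  function readAllRetainedKeys (cb) {
    var keys = []
    var stream = db.scanStream({ match: 'retained:*', count: 1000 })
    stream.on('data', function (batch) { keys = keys.concat(batch) })
    stream.on('end', function () { cb(null, keys) })
  }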

@toddtreece
Contributor

@mcollina I'm wondering if this is due to the Lua script. We eventually run into issues in production once the subscription count hits a certain point. I missed this section in the Redis docs when testing the Lua changes: ...while the script is running no other client can execute commands.

from the redis docs:

Redis uses the same Lua interpreter to run all the commands. Also Redis guarantees that a script is executed in an atomic way: no other script or Redis command will be executed while a script is being executed. This semantic is similar to the one of MULTI / EXEC. From the point of view of all the other clients the effects of a script are either still not visible or already completed.

However this also means that executing slow scripts is not a good idea. It is not hard to create fast scripts, as the script overhead is very low, but if you are going to use slow scripts you should be aware that while the script is running no other client can execute commands.
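
A small hedged sketch of the effect described in that quote (ioredis calls, illustrative only; the busy-loop script exists purely to be slow and should not be run against a production Redis): any command issued while the script runs is not answered until the script finishes.

  var Redis = require('ioredis')
  var db = new Redis()

  // A deliberately slow script: ~1e8 Lua iterations.
  var slowScript = 'local i = 0 while i < 1e8 do i = i + 1 end return i'
  db.eval(slowScript, 0, function (err, res) {
    console.log('script finished:', err || res)
  })

  // This PING is only answered after the script completes (and if a script
  // exceeds lua-time-limit, Redis starts replying BUSY to other clients).
  db.ping(function (err, pong) {
    console.log('ping answered after the script:', err || pong)
  })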

@mcollina
Collaborator

@toddtreece that is probably the reason.
So, do you think we should revert #3?

@mcollina
Collaborator

mcollina commented Jun 18, 2017

I'm ok with changes, as long as they are ported to the other persistences as well.

@mcollina mcollina reopened this Jun 18, 2017
@mcollina
Collaborator

(sorry, I was on mobile)

@GavinDmello
Collaborator

GavinDmello commented Jun 18, 2017

@behrad How many subscriptions are you dealing with? I did not use any SCAN/HSCAN implementation of Redis because, if COUNT is not set to an optimal value, the performance isn't good. Also, the Redis server was doing a lot of heavy lifting in terms of CPU when many Aedes instances are connected.

@behrad
Contributor

behrad commented Jun 18, 2017

How many subscriptions are you dealing with?

Maybe a few million in production; however, the current implementation of aedes-persistence-redis will raise issues around a few hundred thousand.

I did not use any SCAN/HSCAN implementation of Redis because

Please check Mosca's Redis implementation here: https://github.com/mcollina/mosca/blob/master/lib/persistence/redis.js#L137

We can't set an upper limit on the client subscription count.

Also, the Redis server was doing a lot of heavy lifting in terms of CPU when many Aedes instances are connected.

This may be an issue with the core architecture/design of multiple processes/instances around a single shared Redis. I believe aedes-persistence-redis should support clustered Redis internally (however, we can test it against a transparent proxy like twemproxy or codis).

Do you think your concern is the same as mcollina/mqemitter-child-process#1, @GavinDmello?

@GavinDmello
Collaborator

I think you can fetch all keys in a list with a -1 arg (LRANGE key 0 -1). My concern is using SCAN in hot paths. The COUNT of 25000 which is present in Mosca may be good enough for a few million keys, but as the subscriptions increase the number will have to be bumped.
Nutcracker can't be used with MULTI at the moment.

@behrad
Contributor

behrad commented Jun 18, 2017

I think you can fetch all keys in a list with a -1 arg

You mean LRANGE-ing a million-sized list is faster than SCAN?
If yes, how much faster? Is it worth the redundancy? (In large deployments Redis memory is valuable.)
And why not a SET? Why a LIST? There would also have to be some dedup logic when using a list; currently it is duplicating subs on each handleSubscribe.

The COUNT of 25000 which is present in Mosca may be good enough for a few million keys, but as the subscriptions increase the number will have to be bumped.

That is the chunk buffer size, and my tests show that it should be lowered, even to 5000 or below.

Nutcracker can't be used with MULTI at the moment.

Regarding MULTI (MULTI is heavy for Redis): what if we sacrifice a bit of consistency for clusterability/performance and remove MULTI or make it optional? @mcollina
We can at least provide a configurable option for this, since subscriptions are idempotent and subsequent client subscriptions can fix any inconsistencies.
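
A hedged sketch of the trade-off raised here (made-up key names, ioredis calls; not the actual persistence code): a MULTI/EXEC block applies the writes atomically but is heavier for Redis and unsupported by Nutcracker, while a plain pipeline batches the same commands without atomicity.

  var Redis = require('ioredis')
  var db = new Redis()

  // Atomic variant (the current behaviour being questioned).
  db.multi()
    .sadd('sub:client', 'client-1')
    .hset('client:client-1', 'some/topic', '{"qos":1}')
    .exec(function (err) { /* executed atomically, with no other commands interleaved */ })

  // Non-atomic variant (the proposed option): same commands, only batched.
  // Since subscriptions are idempotent, a later subscribe can repair a
  // partially applied write.
  db.pipeline()
    .sadd('sub:client', 'client-1')
    .hset('client:client-1', 'some/topic', '{"qos":1}')
    .exec(function (err) { /* no atomicity guarantee across the two writes */ })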

@mcollina
Collaborator

I'm 👍 on removing MULTI, I do not think it would cause any issue in practice.

@GavinDmello
Collaborator

And why not a SET?

A SET would've been a good option. I wasn't aware of the handleSubscribe issue.

That is the chunk buffer size, and my tests show that it should be lowered, even to 5000 or below.

I'm okay with this if Nutcracker support is added.

@GavinDmello
Collaborator

GavinDmello commented Jun 21, 2017

@behrad SCAN is not supported on Nutcracker either.

https://github.com/twitter/twemproxy/blob/master/notes/redis.md

SSCAN, ZSCAN and HSCAN are supported though.

@behrad
Contributor

behrad commented Jun 23, 2017

SCAN is not supported on Nutcracker either.

Unfortunately yes, and it seems we SHOULD have a container for all subs/clients:

  1. A SET is preferred since it does the dedup for topics.
  2. For storage/memory optimization I think we should store only one index, either subs or clients, and calculate/implement the needed abstract persistence interface methods on top of it.

Currently this module is storing both, e.g. a client's subscription is stored both inside the client's hash and inside the topic hash (only the clientId would suffice in the latter).

SSCAN, ZSCAN and HSCAN are supported though.

How do you think SSCAN will compare to SMEMBERS for a large (over a million entries) set, @GavinDmello?
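
For reference, a small hedged sketch (ioredis calls, assuming a set key such as 'sub:client') of the two reads being compared: SMEMBERS returns the whole set in one reply, while SSCAN walks it in COUNT-sized batches so no single huge reply has to be built.

  var Redis = require('ioredis')
  var db = new Redis()

  // One big reply: simple, but memory and latency grow with the set size.
  db.smembers('sub:client', function (err, members) {
    console.log('smembers returned', members.length, 'entries')
  })

  // Incremental walk: COUNT is only a hint to Redis about the batch size.
  var total = 0
  var stream = db.sscanStream('sub:client', { count: 5000 })
  stream.on('data', function (batch) { total += batch.length })
  stream.on('end', function () { console.log('sscan streamed', total, 'entries') })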

@behrad
Contributor

behrad commented Jun 29, 2017

To test further, I migrated my current actual subscription data (under Mosca) to the Aedes format under Redis. When I launch Aedes, memory grows until the process halts. I noticed that _setup has some issues. However, when I took the same code that is inside _setup (to hgetallBuffer all topic keys, but without the processing part) and ran it in isolation in a single Node.js file, it worked! It took 67606.250ms to hgetall all topics.

// Dependencies assumed by this snippet:
var through = require('through2')
var throughv = require('throughv')
var pump = require('pump')
var Redis = require('ioredis')
var ioredisClient = new Redis()

console.time('@@@@@@@@@@@@@@@@@@ READY');

// Push every key returned by SMEMBERS as an individual chunk.
function split (keys, enc, cb) {
  for (var i = 0, l = keys.length; i < l; i++) {
    this.push(keys[i])
  }
  cb()
}
var splitStream = through.obj(split);

var count = 0; var result = 0;

// Issue an HGETALL (buffer variant) for every topic key, in parallel.
var hgetallStream = throughv.obj(function getStream (chunk, enc, cb) {
  ioredisClient.hgetallBuffer(chunk, function (e, r) { result++; cb(); });
  count++;
}, function emitReady (cb) {
  console.timeEnd('@@@@@@@@@@@@@@@@@@ READY');
  console.log('count ', count);
  cb()
});

console.time('smembers');
ioredisClient.smembers('sub:client', function smembersResult (err, results) {
  console.timeEnd('smembers');
  setInterval(function () { console.log('Count=' + count, ' Result=' + result) }, 1000);
  if (err) {
    splitStream.emit('error', err)
  } else {
    splitStream.write(results);
    splitStream.end()
  }
});

pump(splitStream, hgetallStream, function pumpStream (err) {})

Surprisingly, when I moved the same code above (which is not processing the keys) back inside _setup, it didn't work again!
What could be introducing a memory leak here with the same code, guys? @mcollina @GavinDmello

@behrad
Contributor

behrad commented Jun 29, 2017

My heap dump shows it's inside SingleCaller -> throughv's fastparallel usage.

@mcollina
Collaborator

You don't need to use streams here. You will get some speed bump from removing them.

@GavinDmello
Collaborator

@mcollina Could it be Qlobber? We're adding all the decoded packets to Qlobber rather than just the topics.

@mcollina
Collaborator

@GavinDmello not sure, but I don't think so.

@behrad
Contributor

behrad commented Jun 29, 2017

So strange!!! I figured it out: NODE_ENV=development
The difference between the two runs was NODE_ENV! When it's set to development, two things were happening:

  1. throughv was introducing a memory leak (as I mentioned, inside its fastparallel SingleCaller); changing it to through fixed that.
  2. A 10x speed reduction!!! I can't figure this out yet. Some module's development-mode inspection (maybe ioredis) must be causing this.

Could it be Qlobber?

No, the Qlobber part of the code was commented out.

@behrad
Contributor

behrad commented Jun 29, 2017

You don't need to use streams here. You will get some speed bump from removing them.

By removing the streams and putting hgetallBuffer inside a for loop it gets faster; however, it won't scale memory-wise: we need a huge memory footprint on Aedes startup, and the process freezes until the persistence is ready.

@mcollina
Collaborator

By removing the streams and putting hgetallBuffer inside a for loop it gets faster; however, it won't scale memory-wise: we need a huge memory footprint on Aedes startup, and the process freezes until the persistence is ready.

You can use something like http://npm.im/fastq.

@behrad
Contributor

behrad commented Jun 29, 2017

My benchmarks ran just now:

through (stream)             70 secs
for loop                     86 secs
fastq (concurrency=1)       126 secs
fastq (concurrency=100)      67 secs
fastq (concurrency=1000)     84 secs
fastq (concurrency=10000)    91 secs

So it seems changing to fastq won't help that much :)

Before I go and test with the Qlobber code activated, there are some optimizations I want to mention:

  1. By loading from a topic index, we are hgetall-ing a clientId multiple times.
    Instead of a topic index, we'd better load by a client index (as Mosca does) and fetch fewer buffers. I'll try to optimize the Redis key schema for this.

  2. How much would it help if we spawned multiple child processes, each hgetall-ing a subset of the set and returning the topic buffers back to the main setup process? (LRANGE or SSCAN can be used for this.) My idea is to use multiple cores for loading large subscription data; would Redis be saturated then, @mcollina?

  3. Aedes opens its port before the persistence is ready; the persistence queues all callbacks until it is ready, and I don't want a user's offline messages to be forwarded only after 70 seconds. I would prefer that a non-ready instance has no port listening (like Mosca), so that my load balancer relays users to a running instance (see the sketch below). Can we change this to behave the way Mosca does, @mcollina?
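
A minimal hedged sketch of that behaviour, assuming the persistence instance emits a 'ready' event once its setup completes (whether the persistence API actually exposes this is checked later in the thread): the server only starts listening after the persistence reports it is ready.

  var persistence = require('aedes-persistence-redis')({ host: '127.0.0.1' })
  var aedes = require('aedes')({ persistence: persistence })
  var server = require('net').createServer(aedes.handle)

  persistence.on('ready', function () {
    server.listen(1883, function () {
      console.log('persistence ready, now accepting MQTT connections')
    })
  })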

@mcollina
Collaborator

On your results, can I see your code?

On 1: go ahead!

On 2: I'm really against going multi-process. This is very deployment specific.

On 3: Aedes does not listen automatically. That part of the logic in Mosca was super complicated, so it's up to you. If you want to send an update to the README, go ahead. If we are missing some events for this, please send a PR and we'll add them.

@behrad
Contributor

behrad commented Jun 29, 2017

On your results, can I see your code?

  1. The stream code is the same as the snippet above (taken from the current aedes-persistence-redis code).
  2. The for loop version removes through and calls hgetallBuffer inside the smembers callback.
  3. fastq is something like this:
  var queue = require('fastq')(worker, 100)
  queue.drain = function () {
    console.timeEnd('@@@@@@@@@@@@@@@@@@ READY')
    console.log('count ', count)
  }
  function worker (arg, cb) {
    that._db.hgetallBuffer(arg, function (e, r) { result++; cb(); })
    count++;
  }
  this._db.smembers(subscriptionsKey, function smembersResult (err, results) {
    for (var i = 0, l = results.length; i < l; i++) {
      queue.push(results[i])
    }
  })

On 2: I'm really against going multi-process. This is very deployment specific.

So I'll suspend this and go for higher-priority tasks like number 1.

On 3: Aedes does not listen automatically. That part of the logic in Mosca was super complicated, so it's up to you. If you want to send an update to the README, go ahead. If we are missing some events for this, please send a PR and we'll add them.

Great. I should check whether the persistence API already exposes ready.

@mcollina
Collaborator

I think 1. might be the path that leads to the best benefits.

@behrad
Contributor

behrad commented Jun 29, 2017

Yes, sure.

Can you also clarify these extra details for me, @mcollina?

  1. Why are you using throughv instead of through2? The same could've been achieved with through2.

  2. Can we play with the concurrency of throughv.obj by setting a highWaterMark?

  3. Why are you using pump, and how does it help?

@mcollina
Collaborator

throughv parallelizes, through2 does things in series.

You can play with throughv concurrency by setting a highWaterMark.

pump makes sure all streams are closed if one errors.
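
A short hedged sketch of that second answer, assuming throughv keeps the through2-style .obj([options], transform) signature: the stream's highWaterMark is what bounds how many chunks are transformed at the same time.

  var Redis = require('ioredis')
  var throughv = require('throughv')
  var ioredisClient = new Redis()

  // Up to roughly highWaterMark HGETALLs are in flight at once
  // (assumption about throughv's semantics; illustrative only).
  var hgetallStream = throughv.obj({ highWaterMark: 128 }, function (topicKey, enc, cb) {
    ioredisClient.hgetallBuffer(topicKey, cb)
  })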

@behrad
Contributor

behrad commented Jun 29, 2017

thank you man 👍

And have you got any clues as to why fastparallel's SingleCaller is eating memory when NODE_ENV=development?

@behrad
Contributor

behrad commented Jul 9, 2017

This issue should be resolved by #31

@robertsLando
Member

@mcollina @behrad can this be closed?
