-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track-state functionality grinds CIDER to a halt on a project with many namespaces and heavy :refer :all usage #806
Comments
Thanks for looking into this!
Btw, you can address this partially by setting a big GC threshold in Emacs. It's going to reduce the number of GC pauses, but they are going to be slower. Unfortunately Emacs's GC is quite primitive.
I don't think they are needed. Probably just an oversight.
The font-locking is this data's top user for sure. If I recall correctly we use this for the dynamic indentation as well. One other usage is tracking the REPL type after evaluating code, although this can easily be extracted into a different middleware (or we can just try to do it client-side). One relatively simple workaround would be to put some limit on the data size or to make this middleware something optional that the clients have to initialize manually. We just have to handle the REPL type independently then. |
@Malabarba Any thoughts on this one? |
I've actually made some progress locally (through rebinding vars), I will run it like that for a while longer and then will try to produce a patch for the upstream. One thing that I immediately see is that the metadata of a Var is repeated in every possible namespace that interns that Var. We use interning heavily which is likely the main reason why things blow so out of hand. Droping |
Btw, here we should probably just whitelist the relevant meta so we can reduce the payload to the minimal possible size. Generally we can change the format however we want, as CIDER is the only editor using this currently. |
Probably you've seen this already, but all the relevant usage of the metadata is here https://github.com/clojure-emacs/cider/blob/0c99b0718e1825d020115e0da736ddbcecabb910/cider-mode.el#L757 If we just return the type of the vars instead in the response things like arglists won't be needed. |
There are probably no magic potions here. :( |
Issue moved to cider-nrepl |
@alexander-yakushev curious, do you / other people still have this problem? Would it be possible to have an example of the problematic ns pattern? (Just the basics would suffice - I could programatically create 1000 like those, for instance) |
I currently don't have access to the project where this has triggered. But I can give you the overall shape. So, there are around ~20 "common" namespaces, each having say ~20-30 functions. Then, there is ~1000 "consumer" namespaces, each doing I don't quite remember if the slowdown was all the time or just on fresh REPL initialization. I think it was the latter. Once the results for |
What I have is the hack that I used to workaround this issue: ;; Patch track-state middleware to reduce the amount of data transferred.
(let [meta-ns (try-req 'cider.nrepl.middleware.util.meta)
misc-ns (some-> meta-ns (.lookupAlias 'misc))
track-ns (try-req 'cider.nrepl.middleware.track-state)
rmk-var (some-> meta-ns (ns-resolve 'relevant-meta-keys))]
(when (and meta-ns misc-ns track-ns rmk-var)
(let [relevant-meta-keys (HashSet. (remove #{:doc :arglists :fn} @rmk-var))
relevant-meta-keys-with-fn (HashSet. (remove #{:doc :arglists} @rmk-var))
vmwf (ns-resolve track-ns 'var-meta-with-fn)
relevant-meta (fn [m, ^HashSet rmk]
(let [m (reduce-kv (fn [acc k v]
(if (and v (.contains rmk k))
(assoc acc k
(cond (true? v) "true"
(number? v) (str v)
:else (pr-str v)))
acc))
{} m)]
(when (> (count m) 0)
m)))
clojure-core (try-req 'clojure.core)
v (ns-resolve track-ns 'filter-core-and-get-meta)]
;; Remove :doc and :arglists for everyone, including clojure.core.
(.bindRoot (ns-resolve track-ns 'clojure-core-map)
{:aliases {}
:interns (reduce-kv (fn [acc k v]
(if (var? v)
(assoc acc k
(relevant-meta (vmwf v) relevant-meta-keys-with-fn))
acc))
{} (ns-map clojure-core))})
;; For others, remove :fn too.
(.bindRoot v (fn [refers]
(persistent!
(reduce-kv (fn [acc sym the-var]
(if-let [meta (when (var? the-var)
(let [the-meta (meta the-var)
the-ns (.ns ^Var the-var)]
(when-not (identical? the-ns clojure-core)
(relevant-meta the-meta relevant-meta-keys))))]
(assoc! acc sym meta)
acc))
(transient {}) refers))))))) I'm a bit hazy to say outright what this does, but if you ask me about the particular part I can probably try to remember. |
Thanks! Given that you don't face this problem first-hand anymore, would you mind if we simply close it? tbh, it sounds a lot to me like a "don't do that" kind of problem.
Having 1000 namespaces also is very much non-modular. It's increasingly accepted as better to have some sort of Polylith-like pattern. That's not to say that by closing it we won't ever work on this, but it seems a sensible prioritization measure. For all we know, the project in question may have been abandoned or have no CIDER users at the moment. Surely if it wasn't the case this thread would have get bumped from time to time. |
I understand the part about prioritization, that's fine. I don't mind this being closed. Having said that:
It is quite important to do it in that project. Those 1000 namespaces are indeed split into modules, almost Polylith style, but during testing, it is necessary to have all modules loaded at the same time. It is one of those projects that absolutely depend on being written in Clojure/other Lisp. Like, absolute most of the software can be written in anything else and Clojure is just a nice bonus there, but here it is vital. But that is also the reason why the common guidelines does not apply to it.
Oh no, it is still going strong and growing, and uses CIDER exclusively. Probably has 1.5k namespaces now. It's just me who don't have the access right now because of the armed forces and stuff. Also, the reason why nobody comes complaining is probably because the workaround still works as intended. So, sometime in the future I'd love to fix this. I don't like when tooling puts limits on things where the foundation does not. Like, Clojure itself is fine having those thousands of namespaces loaded. But for now this can be closed. |
Thanks for the insights!
I'm not sure I understand this part - why is :refer so important? Most likely there are tools that will help you refactor them them to Very likely, the patch could be simplified to removing I cannot guarantee that cider will properly work when :doc/etc are removed though. Maybe stuff will work slower, with one nrepl request per usage.
My understanding is that part of the problem has to do with (e.g. Clojure also probably supports 10KLOC namespaces, or 2KLOC defns, but little support can be expected from most tooling authors there). |
In the same way why
Yeah, that's why I didn't want to submit it back then. Don't want to break anything when I don't know exhaustively who needs this middleware and what for.
I agree with you in the situation when pushing the line further towards the "insanity" involves some sort of a tradeoff. Then, sure. But if the current limitation is only caused by bloat/redundancy, then I don't see why not remove it even though the run-of-the-mill usecases will never see the impact of it. |
We are using CIDER with a really big project (1000+ namespaces). Most of those namespaces have a common "prelude" form that requires and refers many other namespaces. With a project like this, CIDER consistently slows down the application and the Emacs. We were able to pinpoint the problem to
track-state
middleware, particularly the part that sends the namespaces that were changed since the last time. When the cache is empty, it sends all namespaces from cider-nrepl to cider. This actually creates three problems:Clojure side (on cider-nrepl) spins like crazy and spends a lot of time in
cider.nrepl.middleware.track-state/filter-core-and-get-meta
function.The huge result (~50Mb) gets transferred to the client. This takes ages when doing any remote development.
Emacs caches the received result and keeps it in memory all the time which makes Emacs GC last a few seconds. Most Emacs operations become ridiculously slow.
We were able to reduce this problem a bit by blacklisting
:doc
and:arglists
from the data that gets transferred from cider-nrepl to CIDER. Apparently, it doesn't break anything, so it is not obvious why they are there in the first place.In any case, it looks like this way of tracking changes (transferring huge maps on every change) doesn't scale too well. I could try to suggest a less strenuous way to do this, but I have to know where else the information from track-state is used. So far, I found only that font-locking relies on it.
The text was updated successfully, but these errors were encountered: