.libPaths() can conflict with user/system installed R/RStudio #37
Comments
Thanks for starting this discussion @ebolyen and sharing your research. I also get frustrated that the conda R does not ignore the user's personal R directory. To me this defeats one of the main reasons I use conda, which is to create reproducible computational environments that I can share with colleagues via an Also, this has bitten me before on Ubuntu, e.g. this bug where the user directory caused conda packages to be installed in the wrong location. However, I don't think we want to make the change unilaterally here. conda-forge is the community collection of conda packages, and I don't think we would want R to behave differently whether it was installed from the conda-forge channel or the defaults channels (the defaults recipe is hosted at AnacondaRecipes/r-base-feedstock). @mingwandroid What is your opinion on modifying the R installed by conda to ignore the user's local package directory? |
I am in two minds about this. Making such changes can annoy upstreams and lead to claims that we're fragmenting the ecosystem. Also documentation is no longer valid. But while we're at it how about re-routing install.packages() through conda in the first instance? I've been tempted to try to do this for a while. I'd probably hide it behind an env var that's off by default though. With R upstream providing binaries for Windows and now macOS it makes us no longer fully compatible. Will the same thing happen on Linux? I think I'd like there to be a manylinux2 style approach for binaries there but I'd like it to be based on our compilers. |
Exactly! We distribute a bioinformatics framework called QIIME 2, and our installation mechanism is a conda environment which has worked amazingly for the most part. The only exception is R, as many bioinformaticians use RStudio and have some of the very same libraries we use, so we run into this reasonably often.
That makes sense, and it got me thinking, "well if Python is virtualized by conda, why shouldn't R be as well? Maybe this should go in defaults!" So I double checked how conda treated Python's So I suppose conda isn't treating R any differently than Python. You can escape the environment in either language. It's just that a lot more people use RStudio which uses the user-package directory than there are people that know about that user site-packages for Python.
That would be pretty incredible.
Doesn't this already happen with libc? I know we're stuck using Centos 5 for the older libc, but it's probably only a matter of time before the forwards-compatibility breaks in a way that matters. I wonder if maybe the right way to go about this would be to create a patch-only package which alters the way conda integrates with R. Then if you want to ship a reproducible R environment you can include that package in your environment list. Otherwise R acts like it "should" and will prefer your user-packages, binary incompatible though they may be. That would work well for our purposes, and would make it easy for anyone to opt-in to the same behavior. Technically the same thing could be done for Python as well (though I've really never seen the user site-packages until I looked for them today). |
@mingwandroid I agree this is a big concern. However, I still think it makes sense to do this. The entire reason for the user directory is that the user needs a location where they have write access. One of the benefits of using conda is that it's a local installation. In that sense, when an R user installs R via conda, they are already breaking from the traditional ecosystem. Using the system
That would be a cool feature, but I think it is less urgent than fixing the current issue with the user directory.
@ebolyen That's an interesting proposal. I'd be willing to take that compromise. My worry though is that new conda users would be unlikely to know about this alternative R conda package. This comes down to what the purpose of installing R via conda is. When I just want to use R to do some exploratory analysis, I use the system version of R and R packages installed via
@ebolyen Also, I wanted to note that this isn't an RStudio feature. As you noted in your original post, the location of the user directory is determined by the configuration files shipped with R. |
Not to complicate things further, but there's also the root environment, which I've always thought of as a replacement for a system-install. In the root env you could argue that user-package directories make sense again (but why even use conda??). |
@ebolyen I'd argue that mixing conda and system packages is a bad idea even in the root environment. In the long run, mixing and matching always ends in frustration. So many users get frustrated with conda because they'll install a few compiled packages via conda, but then also set a custom |
@mingwandroid Another complication with trying to override
|
I like this line of argument. So then should the Anaconda distribution of Python also be patching |
We did sort out some issues early on to ignore |
Sure thing! I've already tested that the |
@ebolyen I agree. I'm surprised that this also affects local Python packages. Here's the test I performed to complement your test:
|
I had no idea |
I think I'm going to be working on a "patch" package since there doesn't seem to be a lot of traction over on conda's issue tracker. My current plan is to use post-link/pre-unlink scripts. They do say not to do what I'm about to do:
But I don't see a better option here (short of changing how the language installs itself in an environment). Does anyone have a better way to make this conda package? |
@ebolyen Could you please elaborate? How are you planning to use these scripts to modify the library paths? My idea would be to include a patch that deletes the R_LIBS_USER definitions from https://github.com/wch/r-source/blob/af7f52f70101960861e5d995d3a4bec010bc89e6/etc/Renviron.in#L43 This would cover the most common use case, where a user is installing local R packages to the default R_LIBS_USER for their OS. It wouldn't be able to prevent the situation in which a user has defined a custom R_LIBS_USER in their There may be a better way to do this (e.g. passing an option to one of the commands in build.sh), but I couldn't find an obvious solution after skimming R Installation and Administration. |
Ah, yeah that is essentially my plan as well. Using My plan was to avoid creating a different version of R by instead making a recipe that modified your conda environment directly (instead of the R package before installation into the conda-environment). YAML:
|
Basically, there isn't actually any source to install, the package would just be a vehicle for executing the environment modifications (similar to distributing shared environment variables with post-activate hooks). |
OK. That makes sense. Please let me know when you have a prototype that I can test out.
What were you planning on calling this new package? |
Will do!
¯\_(ツ)_/¯ |
I was wondering why conda's The documentation at https://stat.ethz.ch/R-manual/R-devel/library/base/html/libPaths.html says the following:
As such, if a user has a local While for the sake of true installation-isolation I would not suggest to do so, conda's UPDATE: However, this might still create problems if a user has set |
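Has there been any progress with this?
# Move any library path containing "conda" to the front of the search order.
prioritize_conda <- function(lib_tree) {
  cpath <- grep("conda", lib_tree, value = TRUE, ignore.case = TRUE)
  if (length(cpath) == 0) {
    return(lib_tree)
  }
  # unique() keeps the first occurrence, so the conda paths stay in front.
  unique(c(cpath, lib_tree))
}
new_tree <- prioritize_conda(lib_tree = .libPaths())
.libPaths(new_tree)
|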
Our solution was to just set the env vars in a post-activate hook: qiime2/qiime2#395 We haven't had an issue with this since, but it would be very nice if upstream considered changing the default behavior w.r.t. user-packages. |
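For anyone who wants to try the hook approach, here is a minimal sketch of what such a script might look like. It is not the actual QIIME 2 hook: the function names, the backup variable, and the choice to point R_LIBS_USER at the environment's own library are all assumptions for illustration. conda sources any `*.sh` file placed in `$CONDA_PREFIX/etc/conda/activate.d/` on activation and in `deactivate.d/` on deactivation.

```shell
# Hypothetical activate/deactivate helpers; names are illustrative only.

isolate_r_activate() {
    # Remember a pre-existing R_LIBS_USER so deactivation can restore it.
    if [ -n "${R_LIBS_USER+set}" ]; then
        export _CONDA_BACKUP_R_LIBS_USER="$R_LIBS_USER"
    fi
    # Assumption: pointing the user library at the env's own library
    # keeps ~/R/... out of .libPaths().
    export R_LIBS_USER="${CONDA_PREFIX}/lib/R/library"
    # Python's documented switch for skipping the user site-packages dir.
    export PYTHONNOUSERSITE=1
}

isolate_r_deactivate() {
    if [ -n "${_CONDA_BACKUP_R_LIBS_USER+set}" ]; then
        export R_LIBS_USER="$_CONDA_BACKUP_R_LIBS_USER"
        unset _CONDA_BACKUP_R_LIBS_USER
    else
        unset R_LIBS_USER
    fi
    # Simplification: this drops PYTHONNOUSERSITE even if the user had set it.
    unset PYTHONNOUSERSITE
}
```

The activate.d file would end with a call to `isolate_r_activate`, and the matching deactivate.d file with `isolate_r_deactivate`. Because this only sets environment variables, it does nothing when R is launched without activating the environment first.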
Yes. Please see PR #65. Any feedback would be much appreciated!
|
For what it's worth, I may have a similar issue with Singularity. My container has an R installation, but if the user of the container also has local packages, R run in the container will try to load things from the user's local packages. It's quite annoying that merely having local packages makes the container unusable. |
I've read this thread several times, but I still don't fully understand what the problem is. Can someone write up a reproducible example and explain specifically what the problem is? |
conda-installed R will use the R packages installed in the user's personal directory instead of the versions installed via conda because the user directory is listed first in
This can be tricky because it depends on the setup on your machine. Here is an example I ran on Ubuntu. I have R3 installed, and I installed R4 with conda. If you have R4 installed on your machine, install R3 with conda to achieve the same effect.
Here's what I see:
The above demonstrates the issue when the user has explicitly set a value for Please see PR #65 for additional discussion and a proposed solution. |
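A quick way to check whether a given environment is affected is to list R's library paths and flag any that fall outside the environment prefix. The helper below is only a sketch with a hypothetical name. It assumes the paths arrive one per line on stdin, for example from `Rscript -e 'cat(.libPaths(), sep = "\n")'`.

```shell
# Hypothetical helper: label each library path read from stdin as
# inside ("conda") or outside ("OTHER") the given environment prefix.
flag_outside_prefix() {
    prefix="$1"
    while IFS= read -r path; do
        case "$path" in
            "$prefix"/*) printf 'conda  %s\n' "$path" ;;
            *)           printf 'OTHER  %s\n' "$path" ;;
        esac
    done
}
```

With an environment active, the output of the Rscript command above could be piped into `flag_outside_prefix "$CONDA_PREFIX"`. Any line marked OTHER is a location from which R may load packages that were not installed by conda.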
For me as a long-time R user this is the expected and appropriate behavior. You specified If you don't set
That is a problem for the reason I explain in #169 , but that doesn't seem to be the thing people are objecting to here.
and indeed
That isn't environment-specific, but it is conda and R version specific, which will avoid the version incompatibility in your example, and that directory won't be used by default anyway. |
I agree that R is doing what it is documented to do in regards to the user library. I and others use conda environments to create isolated computational environments. I don't want them to be affected by what I happened to have installed in my user library for use with my system-wide installation of R. When I give a collaborator an
That is a new development. That wasn't the case when this issue was originally created. This fixes the default situation on Linux, as I had noted in #65 (comment) But conda-installed R is still not isolated if 1) the user has manually specified |
I think that is the key fact I missed, makes much more sense now! Thanks for helping me understand that.
It is isolated if the user sets
I just checked on Windows, and indeed R installed in conda environments shares the same |
It's a problem on Mac. Much googling has led me here. :). What is the recommended solution for a mac user? |
@kevinpauli Assuming you want R to ignore non-conda packages, put this in your scripts before any |
Another potential solution: you can force conda-installed R to ignore an explicitly set Warning though: this will not save you from user-installed packages installed in the default user-library on Windows and macOS. |
Hello,
I think I've finally found the issue and the right place to post it, but apologies if I'm mistaken.
It appears that when you have R installed from conda-forge and a system R (say RStudio), the conda-installed R will include the user-package directory used by the system R in its search path (.libPaths()).
This means if you have an R package installed with conda in your environment and the same package installed in RStudio, your conda environment will use the RStudio package (from your user directory) instead of the conda package. If there were any built extensions in that package (which usually there are) you'll end up with a cryptic segfault or memory not mapped error as the conda R is not binary compatible with the system R.
The reason the user-directory is included in .libPaths() is because of this section of the Renviron file.
If you install r-base and then comment out that section in ~/.conda/envs/<whatever>/lib/R/etc/Renviron the issue with .libPaths() disappears and your library() calls will only ever see your conda environment's packages, which I believe is kind of the entire point of conda.
I think this repo is where one would provide a patch file to drop that section of the Renviron file to make R installations isolated under conda. Assuming I'm not missing something important, would it make sense to provide a PR to fix this for everyone?
Cross referencing this issue: qiime2/q2-dada2#68 (where we have been trying to figure this out for a while).
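To make the manual workaround concrete, here is a rough sketch of commenting out the active R_LIBS_USER assignment in an Renviron file. The function name is hypothetical, and it assumes the relevant lines begin with `R_LIBS_USER=`, as they do in the upstream Renviron.in template. It writes to a temporary file instead of using `sed -i`, since that flag behaves differently on GNU and BSD sed.

```shell
# Hypothetical helper: disable the default user library by commenting out
# every line that starts with "R_LIBS_USER=" in the given Renviron file.
comment_out_r_libs_user() {
    renviron="$1"
    tmp="${renviron}.tmp"
    # '&' re-inserts the matched text, so the line is kept but prefixed by '#'.
    # Lines that are already commented do not match and are left untouched.
    sed 's/^R_LIBS_USER=/#&/' "$renviron" > "$tmp" && mv "$tmp" "$renviron"
}
```

It would be called with the path to the environment's Renviron file as its only argument. This only addresses the default set inside Renviron; an R_LIBS_USER exported in the shell or set in a user-level startup file would still take effect.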