Dropping CentOS 6 & Moving to CentOS 7 #1436
Comments
This came up in a numpy issue uncovered by testing the rc's of 1.21.0 for conda-forge - in particular, a test fails due to a bug in glibc 2.12 (not present anymore in 2.17). There would be a patch to work around the bug, but @rgommers asked:
I brought this comment into the feedstock PR, where @xhochy noted:
Hence moving it here. |
Also xref #1432 |
Was this issue discussed further at recent core meetings (I've occasionally seen public hackmd notes, but no idea where to find a collection of them)? Any statistics or arguments that go against doing this? Assuming it should be done, this probably needs a migrator (for adding |
It was. Nothing conclusive yet. We collect the meeting notes here. Informally we know there are still some CentOS 6 users (the long tail of support). That said, we do lack statistics either way. So this is something we discussed, namely how best to collect them. Yeah, I think we need to decide this is something we want to do first, which we haven’t done yet. |
I understand that some people are stuck on EOL'd OSes, but IMO the case to hold back based on that is really tenuous. If you're on an EOL OS, you eventually get no software updates anymore - why should conda-forge go out of its way to still service those users? I have to agree with @rgommers' statement (I quoted) above - stuff like numpy/numpy#19192 has a real cost. It probably tied up 10-20h of maintainer (resp. core contributor) time in total, and would have been completely avoided without an ancient glibc. |
Another datapoint: I now have a staged-recipes PR that cannot build because the GPU-build only has glibc 2.12 (pytorch >=1.8 needs 2.17), and the CentOS7 build doesn't start: conda-forge/staged-recipes#16306 |
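As a rough illustration of the mismatch described above (a build environment offering glibc 2.12 while a package needs 2.17), the sketch below compares glibc version strings numerically. The helper names `parse_glibc_version` and `glibc_at_least` are hypothetical and not part of any conda-forge tooling; `platform.libc_ver()` is a standard-library call and may return empty strings on non-glibc systems.

```python
import platform


def parse_glibc_version(text):
    """Turn a string such as '2.17' into a comparable tuple (2, 17)."""
    return tuple(int(part) for part in text.split("."))


def glibc_at_least(found, required):
    """Return True if the `found` glibc version satisfies `required`."""
    return parse_glibc_version(found) >= parse_glibc_version(required)


if __name__ == "__main__":
    # platform.libc_ver() reports the libc the running interpreter is linked
    # against; on non-glibc platforms the version string may be empty.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(f"glibc {version}; >= 2.17: {glibc_at_least(version, '2.17')}")
    else:
        print("glibc version not detected")
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.9" would sort after "2.17".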
That's not a datapoint. We've documented this in our docs on how to use CentOS7. |
I know how to do it per-feedstock, but the above package cannot currently make it through staged recipes, or at least I'll need help to pull it off. Someone could also merge it and I fix things once the feedstock is created. But it's um... suboptimal... and definitely related to CentOS6, so I'd still call it a datapoint. |
Have you tried doing the same in staged-recipes? It should work. |
It does work on staged-recipes, see here for an example (CentOS 6 fails as expected but the CentOS 7 based job passes and the feedstock is generated correctly thanks to the That said, I am noticing more and more places where CentOS 6 issues are appearing and moving a feedstock to CentOS 7 causes the downstream feedstocks to also need to be changed causing yet more manual intervention to be needed. |
In the last few weeks, I've probably spent upwards of 15h chasing down bugs that ended up being resolved by moving to CentOS 7. This is a real cost. Same for less experienced contributors running into cryptic resolution errors for trying to package something that (now) needs a
Can we quantify this? CentOS 6 has been EOL for a year now. Why are we so beholden to that long tail? Are those parties contributing to conda-forge somehow (infra costs or packaging effort)? If not, why are we providing free support longer than even RedHat? More to the point: why do we accept them externalizing their costs for not dealing with 10+ year old software to conda-forge?
If it takes X months to collect those statistics, that is a bad trade-off IMO. |
@conda-forge/core Does anyone have any objections to changing the default sysroot to CentOS 7? If not I'll make PRs to change it early next week. |
I know of users this will impact. What exactly is the problem with our current setup? |
I also know users who this will affect, including myself. I also know people using CentOS 5-like systems with conda, who will continue to do so for at least the next decade, so we can't wait until nobody is using CentOS 6 anymore.
Over the last 6 months hundreds of hours must have been spent dealing with these issues and I'm not convinced hundreds more should be spent over the next six months. For people really stuck on CentOS 6 we could add a global label (like |
Global labels don't get repo data patching which at this point will render the channel likely wrong. |
100% agree with what @chrisburr wrote. There are also some pretty gnarly bugs in the trigonometry functions of glibc < 2.17 that have bitten me at least 3 times already.
And they can keep using old packages, or use paid support for their ancient platforms. I empathise that there are some people between a rock and a hard place, but again:
Those 100s of hours Chris is mentioning might be "free" but they come at the cost of other things not being improved or fixed or packaged, and barring strong countervailing reasons, that's IMO a horrible trade-off to make against the ecosystem in favour of an unspecified handful of people who cannot manage to run less-than-decade-old software, yet need the newest packages. |
Many folks stuck on an older centos are not there by choice. They are constrained by the lack of upgrades on big systems run by government labs, etc. The idea that they can simply pay for support is a non-starter to anyone who works in or understands how those organizations work. I am bringing this up because the remedies for using cos6 that folks keep bringing up here are not really available to the people that need cos6. We are making a choice to leave them behind when a majority of the software we build does not require cos6 at all. I suspect a much better path would be to further improve our support for cos7 in smithy or our bots. |
If you are referring to DOE labs, last time I heard the BES office demanded a thorough upgrade from its facilities due to cybersecurity concerns (cc: @mrakitin) and I assume that similar mandates should also be posted by other offices. |
@beckermr the legacy software on the legacy systems will keep running even if conda forge starts building on CentOS7. CentOS 6 was released literally 10 years ago. Government labs running inefficient HW and SW stacks is not something anyone should encourage or promote. That hurts the economy, research and the environment. Those systems cost everyone time and money (along with conda forge people and contributors). My understanding is that both build performance and the performance of the built libs is different on CentOS 6 vs 7, isn't this true? |
Thanks for the responses everyone! I don't see anyone addressing directly the points I raised. The cost here is the time for folks who need cos7 and don't know it when they are building a package. They see an odd error and it costs them time to track down. I 100% agree that this cost is real. Moving the default to cos7 is one way to reduce this cost. However it is not the only way. My premise is that given the headache this will cause for cos6 users in general, and the fact that cos7 is not required the majority of the time, we're better off improving the tooling around cos7 so that maintainers can better use it. |
Good point, I forgot about this. Hopefully the
This might be an option, but I'm not sure it's easy to do the "right" thing and it might not even be possible. How do you see this working? I have two ideas and I think I would lean towards option 1 for simplicity. Option 1: The bot automatically migrates downstream feedstocks as soon as an upstream feedstock moves to be CentOS 7-only. Option 2: Try to be smarter and use solvability as a constraint, i.e.
I'm not sure how stable it will be and I suspect there are a lot of unstable edge cases. In particular what happens if both CentOS 6 and CentOS 7 are unsolvable? |
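Option 1 amounts to a transitive walk over the dependency graph: once one feedstock becomes CentOS 7-only, everything downstream of it has to follow. The sketch below shows that idea only; the function name `feedstocks_needing_cos7`, the graph layout, and the package names are hypothetical and are not taken from the conda-forge bot.

```python
from collections import deque


def feedstocks_needing_cos7(dependents, moved):
    """Return every feedstock downstream of `moved`, excluding `moved` itself.

    `dependents` maps a feedstock name to the feedstocks that depend on it
    directly.
    """
    seen = set()
    queue = deque([moved])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            # Skip already-visited nodes so cycles cannot loop forever.
            if child not in seen and child != moved:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical graph, for illustration only.
example_graph = {
    "upstream-lib": ["package-a", "package-b"],
    "package-a": ["package-c"],
    "package-b": ["package-c"],
}
print(feedstocks_needing_cos7(example_graph, "upstream-lib"))
```

A breadth-first walk keeps the rule simple and predictable, which is the stated reason for leaning towards option 1; it does not attempt the solvability checks that option 2 would require.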
Option 3: Change the default docker image to be cos7 for all feedstocks, but keep the sysroot to be cos6. This would remove the solver errors. |
I'm running into a problem with llvm openmp 16 that looks like it might be testing the limits of what our current setup can handle. openmp needs a glibc newer than 2.12 for its assumptions about what's in Perhaps @isuruf has another ace up his sleeve though? Just wanted to note that openmp >= 16 currently looks unbuildable both with and without |
I tried to push some changes that force cos7 for openmp 16. FWIW, opencv just moved to COS7 with the release of 4.7.0: conda-forge/opencv-feedstock#346. I think we are well beyond the life cycle of COS6, and more and more of the packages I've seen attempt to use newer features. |
We've been putting this off for a long time. I'd advocate we continue to do so and not switch until absolutely necessary. We should understand what the exact issue is here before we proceed. |
I would like to ask for guidance on what to do about Should we add the |
@beckermr zstd seems to be hitting the need to update to cos7: conda-forge/zstd-feedstock#71. While we could likely patch things away, that seems like busy work. In my present environment, the following packages depend on zstd
notably, llvm seems like it would get bumped to cos7... Do we feel like it is finally time? |
This may be the end indeed. Let's talk it over at the next dev meeting. |
I'm all for bumping to cos7, but the |
I'm in favour of bumping (and never thought I'd provide arguments to the contrary 😋), but two more reasons make this less of a make-or-break situation:
I agree that we shouldn't try to patch around libraries to try to keep them compatible; that's IMO a Sisyphean task with lots of risk and little reward. But linking an additional library should still be manageable. So I think that day is coming, I don't mind if it does, but it doesn't have to be this zstd-issue that breaks the camel's back. PS. As I noted in the other issue, one of the trickier things about llvm-vs-glibc will be that libcxx 17 will require glibc >=2.24. |
Right. The whole "redhat not available to alma" adds some additional color to this. |
Next issue: The upcoming OpenSSL 3.2 requires The saving grace in this case is that OpenSSL 3.x is ABI-compatible, so we could keep the pinning at 3.1 while allowing compatible clients to pull in OpenSSL 3.2 at runtime. Still, it's one more bandaid to have to keep in mind... |
I'm pretty inclined to try to get to a consensus on how to deal with operating systems that have been end-of-lifed by their original companies. I feel like we are repeating many of the points we made for OSX. I know formulas are "bad", but it would be great to have something like NEP 29 or SPEC 0 where rules are centralized. |
FWIW, OpenSSL 3.2 restored compatibility with glibc <2.14.
💯 |
It would be good if we could get more stakeholders to buy into those rules (like the Python distribution and wheels). That would make it easier to move forward as a united front. Another advantage of rules is that they make planning for large organizations easier: they can see how long they can use something and when they might need to plan for changes. |
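To make the idea of a centralized, formula-style rule concrete, here is a minimal sketch that derives a support cutoff from an upstream end-of-life date plus a fixed grace period. The function name `support_end_date` and the two-year grace period are purely illustrative assumptions; this is not a policy that conda-forge, NEP 29, or SPEC 0 defines for operating systems.

```python
from datetime import date


def support_end_date(upstream_eol, grace_years):
    """Return the date `grace_years` after the upstream end-of-life date."""
    try:
        return upstream_eol.replace(year=upstream_eol.year + grace_years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return upstream_eol.replace(year=upstream_eol.year + grace_years, day=28)


# CentOS 6 reached end of life on 30 November 2020; the two-year grace
# period below is an illustrative assumption, not an agreed rule.
centos6_eol = date(2020, 11, 30)
print(support_end_date(centos6_eol, grace_years=2))
```

With a rule of this shape, the cutoff for any platform can be computed ahead of time, which is the planning benefit mentioned above.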
As we've actually dropped cos6 now, I am thinking we should close this. Comments @conda-forge/core? |
Sounds good to me in principle.
(I can't really look into it now but hope to get to look into conda-forge & co. stuff in 1-2 weeks.) |
Also this item from the cos6 PR: conda-forge/conda-forge-pinning-feedstock#6070 (comment) |
And we may want to drop the current repodata hacks. |
plus this discussion: conda-forge/linux-sysroot-feedstock#63 |
Raising this issue to track and discuss when we want to drop CentOS 6 and move to CentOS 7 as the new default. This also came up in the core meeting this week
cc @conda-forge/core