Non-PCI IRQs not balanced. #326
Comments
It's probably simpler to map IRQ_OTHER to the cache level universally. You can try this patch:
If it works for you, feel free to open a PR for it. That said, this isn't really a problem of picking the best balance level. It's more an inherent problem of identifying what these platform devices are (or more specifically, what their interrupt volume is going to be), and there's no good solution for that, save for a policy script. Take a look at your debug output. You have about 100 IRQs there. Two of them have over 100k events on them (likely Ethernet devices), two have a few thousand (probably some GPIO bus or some such), and three have a few hundred (maybe serial ports or other low-throughput I/O). The rest have no volume at all. What irqbalance should do is place the first two on their own cores, place the second two at the cache level, and honestly, depending on the use case, just leave the rest of them alone or affine them to the package level, because selecting anything more specific is effectively a no-op, as they're not impacting CPU usage at all. Even if the patch above makes your life a bit easier, you're going to want to write a policy script that isolates the high-volume interrupts above and either balances them at the appropriate level, or masks them from irqbalance entirely and manually sets their affinity to the core you want, depending on how specific you want to be.
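The triage described above can be sketched roughly as follows. This is an illustration in Python, not irqbalance code: the two thresholds and the function names are assumptions chosen to match the event counts quoted in this comment, and the sample text is invented in the layout of /proc/interrupts.

```python
# Hypothetical thresholds, loosely matching the volumes discussed above.
HIGH_VOLUME = 100_000   # e.g. busy Ethernet devices
MEDIUM_VOLUME = 1_000   # e.g. a GPIO bus


def parse_interrupts(text):
    """Return {irq_number: total_count} from /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())           # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():             # skip rows such as IPI0 or Err
            continue
        counts = rest.split()[:ncpus]       # one counter per CPU
        totals[int(label)] = sum(int(c) for c in counts)
    return totals


def suggest_level(count):
    """Map an event count to a balance level name (assumed policy)."""
    if count >= HIGH_VOLUME:
        return "core"
    if count >= MEDIUM_VOLUME:
        return "cache"
    return "package"


# Invented sample data, not taken from the reporter's system.
SAMPLE = """\
           CPU0       CPU1
 25:     120000      30000   GICv2  eth0
 26:       2500          0   GICv2  gpio
 27:        300          0   GICv2  ttyS0
 28:          0          0   GICv2  unused
IPI0:      1000       1000   Rescheduling interrupts
"""

if __name__ == "__main__":
    for irq, count in parse_interrupts(SAMPLE).items():
        print(irq, count, suggest_level(count))
```

A real policy would need to look at event rates over time instead of lifetime totals, since a counter that is large at one moment says nothing about which port is active later.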
This will not solve the issue, since (as shown in the above output) there is only one cache domain.
It doesn't really matter what they are. Since this system has only one package and cache level,
Agreed.
I don't want to manually set the affinity, because the final usage is unknown. It's entirely possible for the user to connect with different Ethernet ports than what I used in the example above. In that case, a static affinity assignment would be counterproductive, since it could inadvertently place the "active" IRQs on the same cores.
Right, what you need is some mechanism to identify the type of device that interrupt is connected to, and platform devices on ARM give you nothing or very little to make that determination. What you want is a script like this:
Just selecting a default of "first topology level that has more than one element" seems like it might be reasonable in this situation, but I'm having a hard time seeing how it makes anything less confusing. You have lots of interrupts whose profile can't be categorized; that's the real problem. I get that having them all balanced at the level of PACKAGE is a bit confusing, but I would find it more confusing to see a high-volume interrupt balanced at some level other than CORE. I get that in your scenario it works out to be equivalent, since you only have N cores that share a single cache domain, but in other scenarios that may not be the case.
Why do you need this? What does it matter if a device is "high-volume" or not?
Because setting affinity for IRQs that don't need to be balanced at so fine-grained a level takes up space in the APIC table on lots of architectures. Lots of systems out there have thousands of interrupts that get almost no assertions, and you can only affine so many IRQs to a CPU.
OK, but on this device there is no APIC table. Every interrupt can be individually enabled for any particular core using dedicated registers. And the default Linux behavior with the affinity set to all cores is to just pick a core to send all the interrupts to [1]. So there is not really a point in trying to determine the function of the interrupt, since we can just balance them all for "free". [1] As I understand it, the GIC's interrupt distribution capabilities are not very good.
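To make "balance them all" concrete, here is a minimal sketch of spreading a list of IRQs across cores and producing the hexadecimal CPU bitmask that Linux accepts in /proc/irq/<n>/smp_affinity. The round-robin policy and the function names are assumptions for illustration only. irqbalance itself places interrupts by measured load, not round-robin, and the IRQ numbers below are made up.

```python
def cpu_mask(cpu):
    """Hex bitmask with only the given CPU's bit set (small core counts)."""
    return format(1 << cpu, "x")


def spread_round_robin(irqs, ncpus):
    """Assign each IRQ to one CPU in turn; returns {irq: hex_mask}."""
    return {irq: cpu_mask(i % ncpus) for i, irq in enumerate(irqs)}


if __name__ == "__main__":
    # Hypothetical IRQ numbers on a four-core system.
    for irq, mask in spread_round_robin([40, 41, 42, 43, 44], 4).items():
        # A real tool would write `mask` to /proc/irq/<irq>/smp_affinity.
        print(f"irq {irq} -> mask {mask}")
```

The sketch only computes the masks. Writing them requires root, and the kernel may reject a mask for interrupts whose affinity cannot be changed.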
Yes, on your system. Irqbalance runs on a wide range of systems, so I'm really hesitant to do things like this that will work for you but may have a significant negative impact on other systems. I'll tell you what. I have a wide variety of systems here to test on (though I don't currently have an ARM-based one on hand). If you want to write a PR for this, I'll happily test it on what I have here. If there is no negative impact, I'll pull it in. It should be fairly straightforward to do, I think:
I am running irqbalance on an embedded system, where the vast majority of interrupts are from platform devices. irqbalance makes an attempt to classify IRQs based on their name (guess_arm_irq_hints). However, this is generally ineffective because:
[...] /sys/devices/platform/ (such as if it is on another bus)
[...] net4, net5, net6, net7)
Because of all these reasons, the heuristics used by irqbalance generally just result in the default of IRQ_TYPE_LEGACY/IRQ_OTHER. Although this does not correctly identify the type of interrupt, it might have been fine except that all CPUs are in a single package and cache domain. Because IRQ_OTHER interrupts default to BALANCE_PACKAGE, no balancing is performed at all! This may be remedied using a script, but the default behavior remains useless and confusing (since you could get the same result without running irqbalance at all).
I think the best way to fix this would be to make the balance level for IRQ_OTHER default to the highest balance level with things to balance between. E.g. on a system with one package and two cache domains with four CPUs each, it would default to BALANCE_CACHE. On a system with one package, one cache domain, and three CPUs it would default to BALANCE_CORE.
I think it also would be nice to be able to specify the balance level for IRQ_OTHER interrupts via a command-line option, but this would still leave undesirable default behavior.
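The proposed default can be sketched as a small selection function. This is Python pseudologic for the rule stated above, not a patch against irqbalance: the function name and its three count parameters are hypothetical, and returning BALANCE_NONE for a single-CPU system is an assumption the proposal does not cover.

```python
def default_other_level(packages, cache_domains, cpus):
    """Pick the coarsest topology level that has more than one object.

    Arguments are system-wide counts of packages, cache domains and CPUs.
    """
    if packages > 1:
        return "BALANCE_PACKAGE"
    if cache_domains > 1:
        return "BALANCE_CACHE"
    if cpus > 1:
        return "BALANCE_CORE"
    return "BALANCE_NONE"   # assumed fallback: nothing to balance between


if __name__ == "__main__":
    # The two examples from the proposal above.
    print(default_other_level(packages=1, cache_domains=2, cpus=8))
    print(default_other_level(packages=1, cache_domains=1, cpus=3))
```

On a multi-package system this rule gives the same result as the existing BALANCE_PACKAGE default, so the behavior would only change on systems where the package level has a single member.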