PEP: Zero initialisation support in boot.c #51

Open
m8pple opened this issue Apr 24, 2018 · 1 comment
m8pple commented Apr 24, 2018

This goes slightly against what I said in #37, but bulk zero-initialisation would be
quite useful. There are lots of zero-holes in memory maps, either because a whole
data section is zero-initialised, or because there is a long run of zeros within a
sparsely initialised data section.

An extra boot command that can zero initialise arbitrary segments of RAM using
a single packet would reduce the amount of message traffic needed at start-up,
particularly when we have MBs of data that is mostly zeros.

e.g. something like this:

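else if (cmd == StoreZeroCmd) {
        // Store zeros to data memory, starting at the current address register
        uint32_t n = msgIn->args[0];  // Size ***in bytes*** to zero (saves an instruction)
        uint32_t addrEnd = addrReg + n;
        // Writes whole words: if n is not a multiple of 4, the final word is zeroed in full
        while (addrReg < addrEnd) {
          *(uint32_t*) addrReg = 0;
          addrReg += 4;
        }
}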

I estimate that this adds a total burden of 10-ish instructions to the bootloader,
and that it should be able to fill at about 1 word per 5-ish instructions - presumably
it would end up being DRAM bandwidth limited.
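
To make the behaviour of the proposed loop easy to check off-target, here is a minimal host-testable sketch of the same fill. The helper name `zero_words` is hypothetical and not part of boot.c. It assumes a word-aligned start address and makes explicit what the loop above does implicitly: a byte count that is not a multiple of 4 still zeroes the whole final word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of boot.c: zero n_bytes starting at base,
 * one 32-bit word at a time. A size that is not a multiple of 4 is rounded
 * up to the next whole word, matching the `addrReg < addrEnd` loop.
 * Returns the address one past the last zeroed word (the new addrReg). */
static uint32_t *zero_words(uint32_t *base, uint32_t n_bytes)
{
    /* Assumption: the fill always starts on a word boundary. */
    assert(((uintptr_t)base % sizeof(uint32_t)) == 0);

    /* Round up without risking overflow in (n_bytes + 3). */
    uint32_t *end = base + n_bytes / 4u + (n_bytes % 4u != 0u);

    while (base < end) {
        *base++ = 0;
    }
    return end;
}
```

For example, calling `zero_words(buf, 6)` on a four-word buffer clears the first two words and leaves the other two untouched.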

This is assuming that:

  • DRAM is not already zero-initialised: I assume it isn't?

  • Bandwidth from host to boards is much less than total bandwidth to DRAMs: we've
    got one PCI Express link at ~1GB/sec, but even with Aesop we have 6 DRAMs
    which offer 12GB/s * 6 = 72 GB/s.

So for a system which is loading multi-GB sections on to DRAM this could
reduce the serial cost quite a bit.
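
A rough back-of-envelope model of that saving, using the ~1 GB/s link and 72 GB/s aggregate DRAM figures above. The function name, the image size, and the zero fraction are all hypothetical. The model ignores packet overhead and the 5-ish instructions per word, and it pessimistically assumes the zero fill does not overlap with the link transfer.

```c
#include <assert.h>

/* Hypothetical estimate of load time in seconds.
 *   image_gb      total size of the image to load
 *   zero_fraction share of the image that is zeros (0.0 to 1.0)
 *   link_gb_s     host-to-board bandwidth
 *   dram_gb_s     aggregate DRAM write bandwidth
 *   use_zero_cmd  non-zero if zeros are filled on-board instead of sent */
static double load_time_s(double image_gb, double zero_fraction,
                          double link_gb_s, double dram_gb_s,
                          int use_zero_cmd)
{
    assert(zero_fraction >= 0.0 && zero_fraction <= 1.0);
    assert(link_gb_s > 0.0 && dram_gb_s > 0.0);

    if (!use_zero_cmd) {
        /* Every byte, zero or not, crosses the host link. */
        return image_gb / link_gb_s;
    }

    double zero_gb = image_gb * zero_fraction;
    double data_gb = image_gb - zero_gb;

    /* Non-zero data crosses the link; zeros are written at DRAM speed. */
    return data_gb / link_gb_s + zero_gb / dram_gb_s;
}
```

With a made-up 4 GB image that is 75% zeros, `load_time_s(4.0, 0.75, 1.0, 72.0, 0)` gives 4 seconds, while the same call with `use_zero_cmd` set gives roughly 1.04 seconds, almost all of which is the 1 GB of non-zero data crossing the link.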

Note that I'm aware that a lot can already be done to support faster loading,
e.g. using multiple threads per DRAM to load, and packing multiple words
into each packet. However, a memset-style command would be easy to integrate
into the existing hostlink loaders without adding much complexity, and would
also make more sophisticated loaders faster.
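
On the host side, the integration could be as small as a scan for zero runs before sending. The sketch below is an assumption about how such a loader might plan its traffic, not the existing hostlink code: `load_plan`, `plan_load`, and the `min_run` threshold are all hypothetical names. It only counts what would be sent, and does not model setting the address register or any limit on the size field of a single command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the traffic needed to load an image. */
typedef struct {
    size_t data_words;  /* words sent over the link as ordinary stores */
    size_t zero_cmds;   /* zero-fill commands issued for long zero runs */
} load_plan;

/* Walk the image and replace every run of at least min_run zero words
 * with a single zero-fill command. Shorter zero runs are sent as data,
 * since a command packet is not worth it for a handful of words. */
static load_plan plan_load(const uint32_t *image, size_t n_words,
                           size_t min_run)
{
    assert(min_run > 0);

    load_plan plan = {0, 0};
    size_t i = 0;

    while (i < n_words) {
        /* Measure the zero run starting at word i (may be empty). */
        size_t run = 0;
        while (i + run < n_words && image[i + run] == 0) {
            run++;
        }

        if (run >= min_run) {
            plan.zero_cmds++;        /* one command covers the whole run */
            i += run;
        } else if (run > 0) {
            plan.data_words += run;  /* short run: send the zeros as data */
            i += run;
        } else {
            plan.data_words++;       /* ordinary non-zero word */
            i++;
        }
    }
    return plan;
}
```

For the eight-word image `{5, 0, 0, 0, 0, 7, 0, 9}` with `min_run` set to 3, the plan is four data words plus one zero-fill command, instead of eight data words.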

Flagrantly not using the PEP system I literally only just proposed because I don't
have time right now - this is more a reminder to turn this into one if it makes sense.

@m8pple m8pple changed the title Zero initialisation support in boot.c PEP: Zero initialisation support in boot.c Apr 24, 2018

mn416 commented Nov 11, 2019

This is a good suggestion, and would probably give a nice boost to graph download times.
