When I run the example simulation with the command described in simulation.md:
./ci/blackbox.sh --clusters=1 --cores=4 --warps=4 --threads=4 --driver=opae --app=sgemm
I get:
Verify result PASSED!
PERF: core0: instrs=75388, cycles=93187, IPC=0.808997
PERF: core1: instrs=75388, cycles=93189, IPC=0.808980
PERF: core2: instrs=75388, cycles=93188, IPC=0.808988
PERF: core3: instrs=75387, cycles=93347, IPC=0.807600
PERF: instrs=301551, cycles=93347, IPC=3.230431
However, the expected output shown in simulation.md is:
Verify result PASSED!
PERF: core0: instrs=90802, cycles=52776, IPC=1.720517
PERF: core1: instrs=90693, cycles=53108, IPC=1.707709
PERF: core2: instrs=90849, cycles=53107, IPC=1.710678
PERF: core3: instrs=90836, cycles=50347, IPC=1.804199
PERF: instrs=363180, cycles=53108, IPC=6.838518
The documented IPC is more than double what I measure with the same configuration parameters. Am I missing a detail here?
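To put a number on the gap, here is a minimal sketch that parses the aggregate PERF line from each run and compares them. The helper name `parse_total` and the regular expression are my own assumptions about the log format, based only on the lines quoted above; this is not part of the project's tooling.

```python
import re

# Matches only the aggregate line ("PERF: instrs=..."), not the per-core
# lines ("PERF: core0: instrs=..."). The format is assumed from the logs above.
PERF_RE = re.compile(r"PERF: instrs=(\d+), cycles=(\d+), IPC=([\d.]+)")


def parse_total(log: str):
    """Return (instrs, cycles, ipc) from the aggregate PERF line of a log."""
    m = PERF_RE.search(log)
    if m is None:
        raise ValueError("no aggregate PERF line found")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


# Aggregate lines copied from the two outputs quoted in this issue.
measured = "PERF: instrs=301551, cycles=93347, IPC=3.230431"
documented = "PERF: instrs=363180, cycles=53108, IPC=6.838518"

m_instrs, m_cycles, m_ipc = parse_total(measured)
d_instrs, d_cycles, d_ipc = parse_total(documented)

print(f"IPC ratio (documented / measured): {d_ipc / m_ipc:.2f}")
print(f"cycle ratio (measured / documented): {m_cycles / d_cycles:.2f}")
print(f"instruction ratio (documented / measured): {d_instrs / m_instrs:.2f}")
```

The instruction counts differ between the two runs as well as the cycle counts, which may mean the documented numbers came from a different build of the kernel or toolchain rather than only a different timing configuration.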
There is one more thing:
When I use --driver=opae, I get overflowed values at the start, and then my cores execute instructions one at a time. The output looks something like this: