PAPI event multiplexing yields wrong data, possible fix? #20

Open
jscarretero opened this issue May 26, 2016 · 4 comments
@jscarretero

Dear Admin,

I suspect that PAPI event multiplexing is not a widely used IPM feature and that it might be experimental.
I have been toying with it and enabled it by uncommenting /* #define USE_PAPI_MULTIPLEXING */ and by setting MAXNUM_PAPI_EVENTS to 48, MAXNUM_PAPI_COUNTERS to 32, MAXSIZE_PAPI_EVTNAME to 45, and MAXSIZE_ENVKEY to 2048.
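
For reference, the build-time changes amount to something like this (the header these macros live in varies across IPM versions, so the file name below is only a guess):

    /* Build-time configuration described above; the header name
     * (e.g. ipm_sizes.h) is an assumption. */
    #define USE_PAPI_MULTIPLEXING       /* was commented out */
    #define MAXNUM_PAPI_EVENTS    48
    #define MAXNUM_PAPI_COUNTERS  32
    #define MAXSIZE_PAPI_EVTNAME  45
    #define MAXSIZE_ENVKEY        2048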

When I try to profile a simple MPI matrix multiplication test program (https://github.com/mperlet/matrix_multiplication) with IPM_HPM=UOPS_EXECUTED_PORT:PORT_0,PAPI_TOT_CYC,DTLB_LOAD_MISSES:MISS_CAUSES_A_WALK,AVX:ALL,MEM_UOPS_RETIRED:ALL_LOADS,UOPS_EXECUTED:CORE,PAGE-FAULTS:u=0,PAPI_L3_TCM (some randomly selected events), I get negative counts for MEM_UOPS_RETIRED:ALL_LOADS and UOPS_EXECUTED:CORE.

I traced the problem to the call rv = PAPI_set_multiplex(papi_evtset[comp].evtset); inside ipm_papi_start in mod_papi.c. I then modified the code guarded by the USE_PAPI_MULTIPLEXING define (in the ipm_papi_start function) to look like this:

#ifdef USE_PAPI_MULTIPLEXING
      /* Bind the event set to its component first: PAPI_set_multiplex()
       * fails on an event set that has no component assigned yet. */
      rv = PAPI_assign_eventset_component(papi_evtset[comp].evtset, comp);
      if (rv != PAPI_OK) {
        IPMDBG("PAPI: [comp %d] Error calling assign_eventset_component\n", comp);
      }

      rv = PAPI_set_multiplex(papi_evtset[comp].evtset);
      if (rv != PAPI_OK) {
        IPMDBG("PAPI: [comp %d] Error calling set_multiplex\n", comp);
      }
#endif

With this change it seems to work and returns plausibly correct values. Does this look right to you? Have you seen similar results before?
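
For anyone reproducing this outside of IPM, here is a minimal standalone sketch of the call order that works for me (PAPI >= 5.x and the CPU component at index 0 are assumed; error handling is abbreviated and the event names are just examples):

    /* Minimal sketch of the PAPI multiplexing call order. */
    #include <stdio.h>
    #include <papi.h>

    int main(void) {
        int evtset = PAPI_NULL;
        long long counts[2];

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) return 1;
        if (PAPI_multiplex_init() != PAPI_OK) return 1;  /* enable multiplexing support */
        if (PAPI_create_eventset(&evtset) != PAPI_OK) return 1;

        /* Bind the empty event set to the CPU component (index 0)
         * BEFORE PAPI_set_multiplex(); this is the step IPM was skipping. */
        if (PAPI_assign_eventset_component(evtset, 0) != PAPI_OK) return 1;
        if (PAPI_set_multiplex(evtset) != PAPI_OK) return 1;

        PAPI_add_named_event(evtset, "PAPI_TOT_CYC");
        PAPI_add_named_event(evtset, "PAPI_TOT_INS");

        PAPI_start(evtset);
        /* ... region of interest ... */
        PAPI_stop(evtset, counts);
        printf("cycles=%lld instructions=%lld\n", counts[0], counts[1]);
        return 0;
    }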

Thank you very much for your help and for this great tool.

Javi Carretero

@pmucci

pmucci commented May 26, 2016

Greetings folks,

One needs to be very careful when using multiplexing in libraries such as IPM. In PAPI, multiplexing can be done in user space, using alarm signals, or in kernel space, using kernel timers. We default to the latter, as it is a built-in part of the perf events subsystem.

Multiplexing only really makes sense when the granularity of the measurements is on the order of one second or greater; in other words, when you are instrumenting long-running sections of code.

If one takes measurements over very small time quanta, one needs to be aware that, depending on the number of counters in use, some of them may not have been scheduled at all during that interval. I believe the default rotation frequency in perf events may be 100 Hz, but one would need to check the kernel to be sure.
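
If you want to check the rotation interval on a live system, the perf subsystem exposes it per PMU in sysfs; a quick sketch (the path assumes the core PMU is registered as "cpu", and the knob exists since Linux 3.12):

    /* Print the perf-events counter rotation interval for the core PMU. */
    #include <stdio.h>

    int main(void) {
        FILE *f = fopen(
            "/sys/bus/event_source/devices/cpu/perf_event_mux_interval_ms", "r");
        int ms;
        if (f && fscanf(f, "%d", &ms) == 1)
            printf("perf mux interval: %d ms\n", ms);
        if (f)
            fclose(f);
        return 0;
    }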

Phil

Apologies for brevity and errors as this was sent from my mobile device.


@jscarretero

Hey Phil,
I was actually profiling a run of several seconds, and I think PAPI multiplexes by default every 0.1 seconds (100000 microseconds).
By the way, I have seen that you guys at minimalmetrics are contributing to PAPIEx, which is great! I am actually taking a look at it right now and trying to install it. :)

@pmucci

pmucci commented May 26, 2016

Hi there,

Believe it or not, I'm actually the original author of PAPIEx and PAPI from many years ago. I was hoping that by now some smart graduate students would have long since rewritten everything, but alas, that hasn't happened yet. :-)

One can in fact program the multiplexing interval inside of PAPI, and it will either use that interval internally (for user-space multiplexing) or pass it along to perf events.
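
For example, something along these lines requests a specific interval for a multiplexed event set (a sketch; the exact option semantics can differ between PAPI versions, so check PAPI_set_opt(3) for yours):

    /* Sketch: request a multiplexing interval (in ns) for an event set.
     * Field names assume a PAPI 5.x papi.h. */
    #include <string.h>
    #include <papi.h>

    int set_mpx_interval(int evtset, int interval_ns) {
        PAPI_option_t opt;
        memset(&opt, 0, sizeof(opt));
        opt.multiplex.eventset = evtset;
        opt.multiplex.ns = interval_ns;
        opt.multiplex.flags = PAPI_MULTIPLEX_DEFAULT;
        return PAPI_set_opt(PAPI_MULTIPLEX, &opt);  /* PAPI_OK on success */
    }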

I'm glad you're interested in PAPIEx. Note that we have a private tree on Bitbucket with a number of fixes that have not yet been pushed up to GitHub. We are in the process of merging the two, so please send along any experiences you have.

To the IPM authors, sorry about the off-topic response and keep up the good work!

Apologies for brevity and errors as this was sent from my mobile device.


@nerscadmin

Hi Phil,

I echo Javi's support. Multiplexing is hard but likely a nut we need to
crack.

For IPM generally, we're interested in job-level metrics from long runs on
many nodes. There is an opportunity to leverage sampling both across time
and potentially across cores.

-David

