BUG due to a perf probe

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

BUG due to a perf probe

Torsten Polle
Hi,

when I used perf probes, I ran into the following bug:

[24411.722409] BUG: using smp_processor_id() in preemptible [00000000] code: l1cache/1326
[24411.730414] caller is debug_smp_processor_id+0x18/0x24
[24411.735668] CPU: 0 PID: 1326 Comm: l1cache Tainted: G           O 3.14.79-08370-gde93e13 #1
[24411.744056] Backtrace:
[24411.746618] [<80011c98>] (dump_backtrace) from [<80011ea4>] (show_stack+0x18/0x1c)
[24411.754272]  r7:8028c978 r6:00000000 r5:200d0013 r4:00000000
[24411.760050] [<80011e8c>] (show_stack) from [<805428b4>] (dump_stack+0x98/0xd4)
[24411.767365] [<8054281c>] (dump_stack) from [<8028c930>] (check_preemption_disabled+0xe0/0x110)
[24411.776062]  r7:8028c978 r6:00000000 r5:da54a008 r4:da54a000
[24411.781832] [<8028c850>] (check_preemption_disabled) from [<8028c978>] (debug_smp_processor_id+0x18/0x24)
[24411.791485]  r9:da54bfb0 r8:7f8016e8 r7:7f98e140 r6:7f8016e8 r5:7f988158 r4:da54bef4
[24411.799410] [<8028c960>] (debug_smp_processor_id) from [<7f7fa048>] (stap_perf_read_handler_0+0x18/0x70 [taptrek])
[24411.809913] [<7f7fa030>] (stap_perf_read_handler_0 [taptrek]) from [<7f7f7b44>] (stapiu_probe_prehandler+0xec/0x304 [taptrek])
[24411.821400]  r4:7f98f288
[24411.824012] [<7f7f7a58>] (stapiu_probe_prehandler [taptrek]) from [<800d6618>] (uprobe_notify_resume+0x410/0x834)
[24411.834366]  r9:00000000 r8:d6d16c00 r7:7f7f7a58 r6:7f8016e8 r5:da54bfb0 r4:da54a020
[24411.842253] [<800d6208>] (uprobe_notify_resume) from [<80011798>] (do_work_pending+0x84/0xc4)
[24411.850865]  r10:da54a000 r9:da54a000 r8:00000000 r7:da54bfb0 r6:da54a000 r5:00000480
[24411.858826]  r4:00000000
[24411.861404] [<80011714>] (do_work_pending) from [<8000e180>] (work_pending+0xc/0x20)
[24411.869231]  r7:00000000 r6:80547b18 r5:00000000 r4:00010540
[25240.284991] taptrek (taptrek_run_azit.stp): systemtap: 4.0/0.170, base: 7f999000, memory: 1644data/80text/242ctx/2063net/2369alloc kb, probes: 14

I've fixed the code generation temporarily by using preempt_disable()/preempt_enable(). I'm wondering whether this only covers a more serious problem or whether I’m simply doing something wrong.

void
uprobe_derived_probe::emit_perf_read_handler (systemtap_session &s,
                                              unsigned idx)
{
 if (perf_counter_refs.size())
   {
     unsigned ref_idx = 0;
     s.op->newline() << "static void stap_perf_read_handler_" << idx
                      << "(long *values) {";
     s.op->indent(1);
     s.op->newline() << "preempt_disable();";

     for (auto pcii = perf_counter_refs.begin();
           pcii != perf_counter_refs.end();
           pcii++)
       {
          // Find the associated perf.counter probe
          unsigned i = 0;
          for (auto it=s.perf_counters.begin() ;
               it != s.perf_counters.end();
               it++, i++)
            {
              if ((*it).first == (*pcii))
                {
                  s.op->newline() << "values[" << ref_idx
                                  << "] = _stp_perf_read(smp_processor_id(),"
                                  << i << ");";
                  ref_idx++;
                  break;
                }
            }
        }
     s.op->newline() << "preempt_enable();";
     s.op->newline() << "return;";
     s.op->newline(-1) << "}";
   }
}

These are the probes used.

probe perf.hw.stalled_cycles_frontend.counter("frontend") {}
probe perf.hw.stalled_cycles_backend.counter("backend") {}
probe perf.hw_cache.l1i.read.miss.counter("l1i_read_miss") {}
probe perf.hw_cache.l1d.read.miss.counter("l1d_read_miss") {}
probe process("/bin/l1cache").statement("*@/home/tpolle/work/issues/2018-09-13-bus-load/l1cache.c:20")
{
        printf("[%s]d4:%u:%u:%u:%u\n", get_clks(),
               @perf("frontend"),
               @perf("backend"),
               @perf("l1i_read_miss"),
               @perf("l1d_read_miss"));
}


Kind Regards,
Torsten