[Fwd: [RFC][PATCH 0/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1]


Masami Hiramatsu
Hi,

I posted the latest djprobe patch to the linux-kernel mailing list,
but it seems to have been rejected by the spam filter on this list,
so I am sending it again.

-------- Original Message --------
Subject: [RFC][PATCH 0/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1
Date: Mon, 31 Oct 2005 20:04:04 +0900
From: Masami Hiramatsu <[hidden email]>
To: [hidden email]
CC: Satoshi Oshima <[hidden email]>, Hideo Aoki <[hidden email]>,  Yumiko Sugita <[hidden email]>,  [hidden email],  [hidden email]

Hello,

I would like to propose djprobe (Direct Jump Probe) for low-overhead
probing.
Djprobe is useful for performance analysis and for kernel flight
recording, which constantly traces events in the kernel, because such
functions should influence performance as little as possible.
Djprobe is an in-kernel probe mechanism like kprobes.
It has the following features:
- Jump-instruction-based probing, which is fast.
- No interrupt is raised when a probe is hit.
- Safe code insertion on SMP.
- Lockless probing once a probe is registered.
I have attached a detailed djprobe document to this mail. Please see it
if you need more information.

Djprobe is NOT a replacement for kprobes. Djprobe and kprobes
have complementary qualities (e.g. djprobe's overhead is low, while
kprobes can be inserted anywhere).
You can use both kprobes and djprobe as the situation demands.

I measured the overhead of djprobe on a 3.06GHz Pentium 4 PC using
gtodbench (*). The result was about 100ns. In terms of performance,
I think djprobe is the best probe method. What do you think?

(*) gtodbench is a micro-benchmark included in the published
djprobe source package. You can download it from LKST's web site:
http://prdownloads.sourceforge.net/lkst/djprobe-20050713.tar.bz2

The following three patches introduce djprobe (Direct Jump Probe)
to linux-2.6.14-rc5-mm1.
patch 1:    Introduce an instruction slot management structure to
            handle slots of different sizes. (a patch for kprobes)
patch 2:    Djprobe core (arch-independent) patch.
patch 3:    Djprobe i386 (arch-dependent) patch.

Please give djprobe a try.

Any comments or suggestions are welcome.

Best regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Djprobe Documentation
authors: Satoshi Oshima ([hidden email])
         Masami Hiramatsu ([hidden email])

INDEX

1. Djprobe concepts
2. How djprobe works
3. Further Considerations
4. Djprobe Features
5. Architectures Supported
6. Configuring Djprobe
7. API Reference
8. TODO
9. FAQ

1. Djprobe concepts

The basic idea of djprobe is to hook kernel function entry points
dynamically and collect debugging or performance analysis information
non-disruptively. The functionality of djprobe is very similar to that of
kprobes or jprobes. The difference is that djprobe uses a jump instruction
instead of a breakpoint instruction, which reduces the overhead of each
probe.

Developers can trap at almost any kernel function entry point, specifying
a handler routine to be invoked when the jump instruction is executed.


2. How Djprobe works

A breakpoint instruction is easy to insert on most architectures.
For example, the breakpoint instruction on the i386 and x86_64
architectures is 1 byte long. A 1-byte replacement takes place in a single
step, and replacement with a breakpoint instruction is guaranteed to be
SMP safe.

A jump instruction, on the other hand, is not easy to insert. The jump
instruction on i386 is 5 bytes long. A 5-byte replacement cannot be
executed in a single step, and dynamic code modification is subject to
some complicated restrictions.

To explain the djprobe mechanism, we introduce some terminology.
Imagine a byte sequence that consists of a 2-byte instruction, another
2-byte instruction and a 3-byte instruction.

         IA
         |
[-2][-1][0][1][2][3][4][5][6][7]
        [ins1][ins2][  ins3 ]
        [<-     DCR       ->]
           [<- JTPR ->]

ins1: 1st Instruction
ins2: 2nd Instruction
ins3: 3rd Instruction
IA:  Insertion Address
JTPR: Jump Target Prohibition Region
DCR: Detoured Code Region
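
The layout above can be turned into a small calculation. The following is
only a sketch, not code from the patches: dcr_size() and in_jtpr() are
hypothetical helper names, and the instruction lengths are assumed to be
known already (the patch expects the caller to supply the DCR size).

```c
#include <assert.h>
#include <stddef.h>

#define JMP_REL32_SIZE 5	/* i386 jump: 1 opcode byte + 4 displacement bytes */

/*
 * Return the DCR size in bytes: whole instructions starting at IA are
 * added until they cover the 5 bytes the jump will overwrite.
 * insn_len[] holds the instruction lengths (hypothetical input).
 * Returns 0 if the given instructions do not cover the jump.
 */
static size_t dcr_size(const unsigned char *insn_len, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		assert(insn_len[i] > 0);
		total += insn_len[i];
		if (total >= JMP_REL32_SIZE)
			return total;
	}
	return 0;
}

/* The JTPR is IA+1 .. IA+4: every byte of the jump except the first. */
static int in_jtpr(unsigned long ia, unsigned long addr)
{
	return addr > ia && addr < ia + JMP_REL32_SIZE;
}
```

For the 2-byte, 2-byte, 3-byte example, dcr_size() gives 7, which matches
the DCR drawn from [0] to [6], and in_jtpr() is true for [1] to [4] only.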


The djprobe replacement procedure has 6 steps:

(1) copying the instruction(s) in the DCR
(2) putting a breakpoint instruction at the IA
(3) scheduling work on each CPU
(4) executing the CPU safety check in each work item
(5) replacing the original instruction(s) with the jump instruction, except
for its first byte, and serializing the code
(6) replacing the breakpoint instruction with the first byte of the jump
instruction

Further explanation is given below.

(1) copying the instruction(s) in the DCR

Djprobe copies the instruction(s) to be replaced into a region that djprobe
allocates. The copied instructions must include the instruction that
contains the byte at IA+4. Therefore the size of the DCR must be 5 bytes or
more. The size of the DCR must be given by the djprobe user.

(2) putting a breakpoint instruction at the IA

Djprobe writes a breakpoint instruction at the insertion address. After
this replacement, the djprobe acts like a kprobe.

(3) scheduling work on each CPU

Djprobe schedules a work item that executes the CPU safety check on each
CPU, and waits until all of these work items have finished.

(4) executing the CPU safety check in each work item

Djprobe currently assumes that a context switch never occurs while an
interrupt is being handled; that is, every interrupt handler returns before
a context switch takes place.

Therefore, the fact that a scheduled work item is executed is itself the
proof that no interrupt stack (and no process stack) contains an address
in the JTPR.

The last CPU to execute the safety check work wakes up the waiting process.

(5) replacing the original instruction(s) with the jump instruction, except
for its first byte, and serializing the code

After all safety check work items have been scheduled and run, djprobe can
replace the code in the JTPR safely, because no CPU is executing the JTPR
at this point. Even if a CPU tries to execute the instructions at the top
of the DCR again, it is trapped by the kprobe and led to execute the copied
instructions instead. So no CPU touches the instructions in the JTPR.

Djprobe overwrites the bytes from IA+1 to IA+4 with the jump instruction
minus its first byte (that is, the destination part of the jump).
It then serializes the modified code on every CPU.

(6) replacing the breakpoint instruction with the first byte of the jump
instruction

Djprobe replaces the breakpoint instruction that it wrote in step (2) with
the first byte of the jump instruction.
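
The write ordering in steps (2), (5) and (6) can be tried out on an
ordinary byte buffer. This is a userspace sketch only: the function names
are made up for illustration, and the safety check of steps (3) and (4) and
the serialization in step (5) are left out. It assumes the standard i386
encodings, 0xCC for the breakpoint and 0xE9 followed by a 32-bit relative
displacement for the jump, on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCC	/* i386 breakpoint instruction */
#define JMP_OPCODE  0xE9	/* i386 jmp rel32 */
#define JMP_SIZE    5

/* A rel32 displacement is counted from the address after the jump. */
static int32_t jmp_rel32(uint32_t ia, uint32_t target)
{
	return (int32_t)(target - (ia + JMP_SIZE));
}

/* Step (2): only the first byte changes, so this is a 1-byte write. */
static void write_breakpoint(uint8_t *code)
{
	code[0] = INT3_OPCODE;
}

/* Step (5): fill IA+1..IA+4 while the breakpoint still guards IA. */
static void write_jump_tail(uint8_t *code, int32_t rel)
{
	assert(code[0] == INT3_OPCODE);
	memcpy(code + 1, &rel, sizeof(rel));
}

/* Step (6): a second 1-byte write turns the breakpoint into the jump. */
static void write_jump_head(uint8_t *code)
{
	code[0] = JMP_OPCODE;
}
```

At every intermediate state the byte at IA is either the original opcode,
the breakpoint, or the complete jump, which is the point of splitting the
5-byte write into these three stages.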


3. Further considerations

There are many difficulties in implementing djprobe. In this section,
we discuss the restrictions of djprobe in order to explain these
difficulties.

3.1 Confirming the safety of the DCR (dynamic analysis)

Djprobe replaces code that consists of one or more instructions.
This replacement usually changes the boundaries between instructions.
Therefore djprobe must ensure that no other CPU is executing the DCR and
that no stack contains an address in the JTPR.

3.2 Confirming the safety of the DCR (static analysis)

Djprobe must also ensure that the JTPR is not the target of any jump or
call instruction. In general this is extremely difficult to verify.
But some locations, such as function entry points, can be expected not to
be the target of a jump or call instruction (because a function entry point
has a fixed form that follows the code convention).

4. Djprobe Features

- Djprobe can probe entries of almost all functions without any interruption.

5. Architectures Supported

- i386


6. Configuring Djprobe
When configuring the kernel using make menuconfig/xconfig/oldconfig, ensure
that CONFIG_DJPROBE is set to "y".  Under "Instrumentation Support",
look for "Direct Jump probe". You may also have to enable "Kprobes" and
*DISABLE* "Preemptible Kernel".

7. API Reference
The djprobe API consists of the register_djprobe() and
unregister_djprobe() functions. Here are the specifications of these
functions and of the associated probe handler.

7.1 register_djprobe

#include <linux/djprobe.h>
int register_djprobe(struct djprobe *djp, void *addr, int size);

Inserts a jump instruction at the address addr. size is the size of the
DCR in bytes (see section 2) and must be supplied by the caller. When the
jump is hit, djprobe calls djp->handler.

register_djprobe() returns 0 on success, or a negative errno otherwise.

User's probe handler (djp->handler):
#include <linux/djprobe.h>
#include <linux/ptrace.h>
void handler(struct djprobe *djp, struct pt_regs *regs);

Called with djp pointing to the djprobe associated with the probe point,
and regs pointing to the struct containing the registers saved when
the probe point was hit.

7.2 unregister_djprobe

#include <linux/djprobe.h>
void unregister_djprobe(struct djprobe *djp);

Removes the specified probe.  The unregister function can be called
at any time after the probe has been registered.
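
The calling pattern can be sketched as follows. To keep the example
compilable outside the kernel, everything in the first half is a userspace
stand-in and NOT the kernel definition: the struct layouts, the mock
register_djprobe()/unregister_djprobe() bodies, the assumed lower size
bound of 5, and mock_hit() are all hypothetical. Only the function
signatures and the negative errno convention are taken from the text
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ---- userspace stand-ins, NOT the kernel definitions ---- */
struct pt_regs { unsigned long unused; };
struct djprobe {
	void (*handler)(struct djprobe *, struct pt_regs *);
	void *addr;		/* hypothetical field, used only by this mock */
};

static int register_djprobe(struct djprobe *djp, void *addr, int size)
{
	if (size < 5)		/* assumed bound: the jump itself is 5 bytes */
		return -EINVAL;
	djp->addr = addr;
	return 0;
}

static void unregister_djprobe(struct djprobe *djp)
{
	djp->addr = NULL;
}

/* Mock of the probed address being executed. */
static void mock_hit(struct djprobe *djp)
{
	struct pt_regs regs = { 0 };

	assert(djp->addr != NULL);
	djp->handler(djp, &regs);
}

/* ---- what a caller would write ---- */
static unsigned long hit_count;

/* Probe handler: counts how often the probe point was reached. */
static void count_hits(struct djprobe *djp, struct pt_regs *regs)
{
	(void)djp;
	(void)regs;
	hit_count++;
}

static struct djprobe my_probe = { .handler = count_hits };

/* Returns 0 on success or a negative errno, like register_djprobe(). */
static int attach(void *addr, int dcr_size)
{
	return register_djprobe(&my_probe, addr, dcr_size);
}
```

A caller sets djp->handler before registering, checks the return value
for a negative errno, and calls unregister_djprobe() when the probe is no
longer needed.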


8. TODO

(1) support for an architecture-transparent interface
   (djprobe interface emulated by kprobes)
(2) bulk registration interface support
(3) kprobes interoperability (coexistence at the same address)
(4) support for other architectures

9. FAQ
Direct Jump Probe Q&A

Q: What is the Direct Jump Probe (Djprobe)?
A: Djprobe is a low-overhead probe method for the Linux kernel.

Q: How is it different from kprobes?
A: The main difference is that djprobe uses a jump instruction instead of
a breakpoint instruction. This reduces the overhead of probing, especially
when the probes are hit frequently.

Q: How does djprobe work?
A: First, djprobe copies the instructions that will be overwritten by the
jump instruction into the middle of a stub code buffer. Next, it overwrites
those instructions with a jump instruction whose destination is the top of
the stub code buffer. At the top of the stub code buffer there is a call
instruction that calls the probe function. At the bottom of the stub code
buffer there is a jump instruction whose destination is the instruction
following the overwritten instructions.
 Kprobes, on the other hand, copies only the one instruction that will be
overwritten by the breakpoint instruction, and overwrites it with a
breakpoint instruction. When handling the breakpoint exception, it executes
the copied instruction with the trap flag set. When handling the resulting
trap, it corrects the IP(*) so that execution returns to the kernel code.
 So djprobe's sequence is "jump", "probe", "execute copies" and "jump",
whereas kprobes' sequence is "break", "probe", "execute copies" and
"trap".

(*)Instruction Pointer
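
The stub layout described in this answer (call at the top, copied
instructions in the middle, jump back at the bottom) can be sketched as a
byte-buffer builder. This is a simplified illustration, not the stub from
the i386 patch: build_stub() and emit_rel32() are hypothetical names,
register saving around the call is omitted, and addresses are plain 32-bit
numbers. It uses the standard i386 encodings 0xE8 (call rel32) and 0xE9
(jmp rel32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CALL_OPCODE     0xE8	/* i386 call rel32 */
#define JMP_OPCODE      0xE9	/* i386 jmp rel32 */
#define REL32_INSN_SIZE 5

/* Emit "opcode rel32" located at address 'from' and targeting 'to'. */
static size_t emit_rel32(uint8_t *buf, uint8_t opcode,
			 uint32_t from, uint32_t to)
{
	int32_t rel = (int32_t)(to - (from + REL32_INSN_SIZE));

	buf[0] = opcode;
	memcpy(buf + 1, &rel, sizeof(rel));
	return REL32_INSN_SIZE;
}

/*
 * Fill 'stub', which is assumed to live at address stub_addr, with
 * [call handler][copy of the DCR][jmp to IA + DCR size].
 * Returns the number of bytes written.
 */
static size_t build_stub(uint8_t *stub, uint32_t stub_addr,
			 uint32_t handler_addr,
			 const uint8_t *dcr, size_t dcr_len, uint32_t ia)
{
	size_t off = 0;

	assert(dcr_len >= REL32_INSN_SIZE);
	/* top: call the probe function */
	off += emit_rel32(stub + off, CALL_OPCODE,
			  (uint32_t)(stub_addr + off), handler_addr);
	/* middle: the instructions displaced by the jump */
	memcpy(stub + off, dcr, dcr_len);
	off += dcr_len;
	/* bottom: resume at the instruction after the DCR */
	off += emit_rel32(stub + off, JMP_OPCODE,
			  (uint32_t)(stub_addr + off),
			  (uint32_t)(ia + dcr_len));
	return off;
}
```

Because the copied instructions run at a different address inside the
stub, this layout only works when none of them depends on the instruction
pointer, which is assumption ii) in the insertion rules below.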

Q: Does djprobe need to modify the kernel source code?
A: No. Djprobe is a dynamic probe. It can be inserted into a running
kernel.

Q: Can djprobe work with CPU hotplug?
A: Yes, djprobe locks CPU hotplug during its critical section.

Q: Where can djprobe be inserted?
A: Djprobe can be inserted in almost all kernel code, including the head of
almost all kernel functions. The insertion area must satisfy the
assumptions described below.

(In i386 architecture)
         IA
         |
[-2][-1][0][1][2][3][4][5][6][7]
        [ins1][ins2][  ins3 ]
        [<-     DCR       ->]
           [<- JTPR ->]

ins1: 1st Instruction
ins2: 2nd Instruction
ins3: 3rd Instruction
IA:  Insertion Address
DCR (Detoured Code Region): The area containing the instructions whose
first byte lies within 5 bytes (the size of the jump instruction) of the
insertion address. These instructions are copied into the middle of a stub
code buffer.
JTPR (Jump Target Prohibition Region): The area overwritten by the jump
instruction that djprobe writes, except for its first byte.

Assumptions:
i) The insertion address points to the first byte of an instruction.
  This is to avoid a bad instruction exception.
ii) There are no instructions in the DCR that refer to the IP (e.g. a
    relative jmp).
  EIP is different when the copied instruction is executed.
iii) There are no instructions in the DCR that cause a context switch
     (e.g. call schedule()).
  If a context switch occurs in the DCR, the address of the next
 instruction (e.g. the address of "ins2") is stored on the call stack of
 the previous thread. After that, djprobe overwrites the instruction with
 the jump instruction. When the previous thread is switched back in, it
 resumes execution from the stored address, which causes a bad instruction
 exception.
iv) No jump or call instruction has a destination address in the JTPR.
  This is also to avoid a bad instruction exception.

Q: Can several djprobes be inserted at the same address?
A: Yes. Several djprobes inserted at the same address are aggregated and
share one instance.
NOTE: When a new djprobe's insertion address is in another djprobe's JTPR
(described above), or the other djprobe's insertion address is in the new
djprobe's JTPR, register_djprobe() fails to register the new djprobe and
returns the -EEXIST error code.
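
The rule in this NOTE amounts to a distance check between two insertion
addresses. The sketch below follows the wording of the FAQ, not the lookup
code in the patch, and probe_conflict() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>

#define JMP_SIZE 5	/* i386 jump: the JTPR is IA+1 .. IA+4 */

/*
 * Return 0 if a new probe at new_ia can coexist with an existing probe
 * at old_ia (same address, or far enough apart), or -EEXIST if either
 * insertion address falls inside the other probe's JTPR.
 */
static int probe_conflict(unsigned long new_ia, unsigned long old_ia)
{
	unsigned long dist = new_ia > old_ia ? new_ia - old_ia
					     : old_ia - new_ia;

	if (dist == 0)
		return 0;	/* same address: probes share one instance */
	if (dist < JMP_SIZE)
		return -EEXIST;	/* 1..4 bytes apart: inside a JTPR */
	return 0;
}
```

The check is symmetric because both JTPRs have the same length, so one
distance comparison covers both halves of the NOTE.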

Q: Can djprobe be used with kprobes at the same address?
A: No, djprobe currently cannot coexist with kprobes at the same address.
But we will support this feature as soon as possible.

Q: Should the jump instruction be within a page boundary to avoid access
 violations and page faults?
A: No. x86 processors handle non-aligned instructions correctly. We can
see many non-aligned instructions in the kernel binary. Also, there are no
page faults on kernel code: kernel code pages are always mapped in the
kernel page table.
So there is no need to care about page boundaries on the x86 architecture.

Q: How does djprobe resolve the problems of self/cross-modifying code?
 On the Pentium series, unsynchronized cross-modifying code operations on
 anything other than the first byte of an instruction can cause unexpected
 instruction execution results.
A: Djprobe uses a trick to resolve these problems. It modifies the
 instructions as follows.
1) Register a special handler as a kprobe handler. (kprobes writes a
  breakpoint instruction on the first byte at the insertion address.)
2) Check safety (this is described in the answer to the next question).
3) Write only the destination address part of the jump instruction into
  the kernel code. (This operation is not synchronized.)
4) Call "cpuid" on each processor for synchronization.
5) Write the first byte of the jump instruction. (This operation is
  synchronized automatically.)

Q: How does djprobe guarantee that no thread and no processor is executing
 the area being modified? The IP of that area may be stored in the stack
 memory of those threads.
A: The problem can arise for three reasons:
 i) Problem caused by multiprocessor systems
   Another processor may be executing the area that is to be overwritten by
  the jump instruction. Djprobe must guarantee that no processor is
  executing those instructions when it modifies them.
 ii) Problem caused by interrupts
   An interrupt might have occurred in the area that is going to be
  overwritten by the jump instruction. Djprobe must guarantee that all
  interrupts that occurred in the area have finished.
 iii) Problem caused by a fully preemptible kernel
   Problem (iii) is described in the answer to the next question.

Djprobe uses the workqueue to resolve problems (i) and (ii). The
solution is described below:
1) Copy the entire DCR (described above) into the middle of a stub
 code buffer.
2) Register a special handler as a kprobe handler. This special handler
 changes the kprobe's resume point to the stub code buffer.
3) Clear the safety flags of all processors.
4) Queue a safety-checking work item on the workqueue of each processor,
 and wait until all of these work items have been scheduled.
5) When the keventd thread is scheduled on a processor, it executes the
 work item. At this time, the processor is not executing the area that is
 to be overwritten by the jump instruction, and it has finished handling
 all interrupts, because on a voluntary-preemption or non-preemptive
 kernel a context switch does not occur while an interrupt is being
 handled.
6) Once all work items have been scheduled, djprobe writes the jump
 instruction.
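
The "wait until every processor has run its work item" part of steps 3) to
6) is a countdown. The patch implements it with atomic_dec_and_test() and
wake_up_all(). The sketch below mimics only the counting logic in
userspace with C11 atomics; the struct and function names are
hypothetical, and a flag stands in for waking the waiting process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct safety_check {
	atomic_int pending;	/* processors that have not run their work yet */
	atomic_bool all_safe;	/* stand-in for waking the waiting process */
};

/* Arm the check for ncpus processors before queueing the work items. */
static void safety_check_start(struct safety_check *sc, int ncpus)
{
	assert(ncpus > 0);
	atomic_store(&sc->pending, ncpus);
	atomic_store(&sc->all_safe, false);
}

/*
 * Body of the per-processor work item. Returns true only on the
 * processor that finishes last.
 */
static bool safety_check_work(struct safety_check *sc)
{
	/* atomic_fetch_sub() returns the old value; 1 means we reached 0 */
	if (atomic_fetch_sub(&sc->pending, 1) == 1) {
		atomic_store(&sc->all_safe, true);
		return true;
	}
	return false;
}
```

Only the final decrement reports completion, so the jump is never written
while some processor has yet to pass through the scheduler.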

Q: Can djprobe work with full kernel preemption?
A: No, but you can still use the djprobe interface. When full kernel
 preemption is enabled, we cannot ensure that no thread is executing the
 modified area; an address in that area may be stored on the stack of a
 preempted thread. In this case, the djprobe interface is emulated using
 kprobes.
 The latest Linux kernel supports not only full preemption but also
 voluntary preemption. With voluntary preemption, threads are scheduled
 only from a limited set of addresses, so it is easy to check that
 preemption cannot occur in the modified area.



[Fwd: [RFC][PATCH 1/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1]

Masami Hiramatsu
Hi,

Here is the second mail which I posted to LKML.

---
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

-------- Original Message --------
Subject: [RFC][PATCH 1/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1
Date: Mon, 31 Oct 2005 20:07:45 +0900
From: Masami Hiramatsu <[hidden email]>
To: [hidden email]
CC: Satoshi Oshima <[hidden email]>, Hideo Aoki <[hidden email]>,        Yumiko Sugita <[hidden email]>, [hidden email],        [hidden email]
References: <[hidden email]>

Hi,

This patch enables get_insn_slot() to handle slots of different
sizes.
Djprobe requires this patch in order to work on machines which
support "NX bit".
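
As a rough illustration of a slot allocator whose slot size is chosen per
pool, here is a userspace sketch. It is not the kprobes code: the names
are hypothetical, malloc() stands in for an executable page, sizes are in
bytes rather than kprobe_opcode_t units, and a C99 flexible array member
is used for the per-slot flags.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_BYTES 4096		/* stand-in for PAGE_SIZE */

struct slot_pool {
	unsigned char *mem;	/* backing storage, split into slots */
	size_t slot_size;	/* bytes per slot, fixed per pool */
	size_t nslots;
	size_t nused;
	char used[];		/* one flag per slot, sized at allocation */
};

static struct slot_pool *pool_create(size_t slot_size)
{
	struct slot_pool *p;
	size_t nslots;

	assert(slot_size > 0 && slot_size <= POOL_BYTES);
	nslots = POOL_BYTES / slot_size;
	p = calloc(1, sizeof(*p) + nslots);	/* flags start cleared */
	if (!p)
		return NULL;
	p->mem = malloc(POOL_BYTES);
	if (!p->mem) {
		free(p);
		return NULL;
	}
	p->slot_size = slot_size;
	p->nslots = nslots;
	return p;
}

/* Return the first free slot, or NULL if the pool is full. */
static void *pool_get(struct slot_pool *p)
{
	for (size_t i = 0; i < p->nslots; i++) {
		if (!p->used[i]) {
			p->used[i] = 1;
			p->nused++;
			return p->mem + i * p->slot_size;
		}
	}
	return NULL;
}

/* Release a slot; the index is recovered from the slot address. */
static void pool_put(struct slot_pool *p, void *slot)
{
	size_t i = (size_t)((unsigned char *)slot - p->mem) / p->slot_size;

	assert(i < p->nslots && p->used[i]);
	p->used[i] = 0;
	p->nused--;
}

static void pool_destroy(struct slot_pool *p)
{
	free(p->mem);
	free(p);
}
```

Keeping the slot size in the pool rather than in a compile-time constant
is what lets one allocator serve both small kprobes slots and larger
djprobe stubs.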

---
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Signed-off-by: Masami Hiramatsu <[hidden email]>

 include/linux/kprobes.h |    5 ++++
 kernel/kprobes.c        |   58 +++++++++++++++++++++++++++++++-----------------
 2 files changed, 43 insertions(+), 20 deletions(-)
diff -Narup linux-2.6.14-rc5-mm1/include/linux/kprobes.h linux-2.6.14-rc5-mm1.djp.1/include/linux/kprobes.h
--- linux-2.6.14-rc5-mm1/include/linux/kprobes.h 2005-10-25 11:29:02.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.1/include/linux/kprobes.h 2005-10-25 13:11:26.000000000 +0900
@@ -147,6 +147,11 @@ struct kretprobe_instance {
  struct task_struct *task;
 };

+struct kprobe_insn_page_list {
+ struct hlist_head list;
+ int insn_size; /* size of an instruction slot */
+};
+
 #ifdef CONFIG_KPROBES
 extern spinlock_t kretprobe_lock;
 extern int arch_prepare_kprobe(struct kprobe *p);
diff -Narup linux-2.6.14-rc5-mm1/kernel/kprobes.c linux-2.6.14-rc5-mm1.djp.1/kernel/kprobes.c
--- linux-2.6.14-rc5-mm1/kernel/kprobes.c 2005-10-25 11:29:02.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.1/kernel/kprobes.c 2005-10-25 13:13:58.000000000 +0900
@@ -58,44 +58,50 @@ static DEFINE_PER_CPU(struct kprobe *, k
  * stepping on the instruction on a vmalloced/kmalloced/data page
  * is a recipe for disaster
  */
-#define INSNS_PER_PAGE (PAGE_SIZE/(MAX_INSN_SIZE * sizeof(kprobe_opcode_t)))
+#define INSNS_PER_PAGE(size) (PAGE_SIZE/(size * sizeof(kprobe_opcode_t)))

 struct kprobe_insn_page {
  struct hlist_node hlist;
  kprobe_opcode_t *insns; /* Page of instruction slots */
- char slot_used[INSNS_PER_PAGE];
  int nused;
+ char slot_used[1];
 };

-static struct hlist_head kprobe_insn_pages;
+static struct kprobe_insn_page_list kprobe_insn_pages = {
+ HLIST_HEAD_INIT, MAX_INSN_SIZE
+};

 /**
- * get_insn_slot() - Find a slot on an executable page for an instruction.
+ * __get_insn_slot() - Find a slot on an executable page for an instruction.
  * We allocate an executable page if there's no room on existing ones.
  */
-kprobe_opcode_t __kprobes *get_insn_slot(void)
+kprobe_opcode_t
+ __kprobes * __get_insn_slot(struct kprobe_insn_page_list *pages)
 {
  struct kprobe_insn_page *kip;
  struct hlist_node *pos;
+ int ninsns = INSNS_PER_PAGE(pages->insn_size);

- hlist_for_each(pos, &kprobe_insn_pages) {
+ hlist_for_each(pos, &pages->list) {
  kip = hlist_entry(pos, struct kprobe_insn_page, hlist);
- if (kip->nused < INSNS_PER_PAGE) {
+ if (kip->nused < ninsns) {
  int i;
- for (i = 0; i < INSNS_PER_PAGE; i++) {
+ for (i = 0; i < ninsns; i++) {
  if (!kip->slot_used[i]) {
  kip->slot_used[i] = 1;
  kip->nused++;
- return kip->insns + (i * MAX_INSN_SIZE);
+ return kip->insns +
+    (i * pages->insn_size);
  }
  }
  /* Surprise!  No unused slots.  Fix kip->nused. */
- kip->nused = INSNS_PER_PAGE;
+ kip->nused = ninsns;
  }
  }

- /* All out of space.  Need to allocate a new page. Use slot 0.*/
- kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL);
+ /* All out of space.  Need to allocate a new page. Use slot 0. */
+ kip = kmalloc(sizeof(struct kprobe_insn_page) +
+    sizeof(char) * (ninsns - 1), GFP_ATOMIC);
  if (!kip) {
  return NULL;
  }
@@ -111,23 +117,25 @@ kprobe_opcode_t __kprobes *get_insn_slot
  return NULL;
  }
  INIT_HLIST_NODE(&kip->hlist);
- hlist_add_head(&kip->hlist, &kprobe_insn_pages);
- memset(kip->slot_used, 0, INSNS_PER_PAGE);
+ hlist_add_head(&kip->hlist, &pages->list);
+ memset(kip->slot_used, 0, ninsns);
  kip->slot_used[0] = 1;
  kip->nused = 1;
  return kip->insns;
 }

-void __kprobes free_insn_slot(kprobe_opcode_t *slot)
+void __kprobes __free_insn_slot(struct kprobe_insn_page_list *pages,
+ kprobe_opcode_t * slot)
 {
  struct kprobe_insn_page *kip;
  struct hlist_node *pos;
+ int ninsns = INSNS_PER_PAGE(pages->insn_size);

- hlist_for_each(pos, &kprobe_insn_pages) {
+ hlist_for_each(pos, &pages->list) {
  kip = hlist_entry(pos, struct kprobe_insn_page, hlist);
  if (kip->insns <= slot &&
-    slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) {
- int i = (slot - kip->insns) / MAX_INSN_SIZE;
+    slot < kip->insns + (ninsns * pages->insn_size)) {
+ int i = (slot - kip->insns) / pages->insn_size;
  kip->slot_used[i] = 0;
  kip->nused--;
  if (kip->nused == 0) {
@@ -138,10 +146,10 @@ void __kprobes free_insn_slot(kprobe_opc
  * next time somebody inserts a probe.
  */
  hlist_del(&kip->hlist);
- if (hlist_empty(&kprobe_insn_pages)) {
+ if (hlist_empty(&pages->list)) {
  INIT_HLIST_NODE(&kip->hlist);
  hlist_add_head(&kip->hlist,
- &kprobe_insn_pages);
+       &pages->list);
  } else {
  module_free(NULL, kip->insns);
  kfree(kip);
@@ -152,6 +160,16 @@ void __kprobes free_insn_slot(kprobe_opc
  }
 }

+kprobe_opcode_t __kprobes *get_insn_slot(void)
+{
+ return __get_insn_slot(&kprobe_insn_pages);
+}
+
+void __kprobes free_insn_slot(kprobe_opcode_t * slot)
+{
+ __free_insn_slot(&kprobe_insn_pages, slot);
+}
+
 /* We have preemption disabled.. so it is safe to use __ versions */
 static inline void set_kprobe_instance(struct kprobe *kp)
 {




[Fwd: [RFC][PATCH 2/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1]

Masami Hiramatsu
Hi,

Here is the third mail which I posted to LKML.

---
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]


-------- Original Message --------
Subject: [RFC][PATCH 2/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1
Date: Mon, 31 Oct 2005 20:08:50 +0900
From: Masami Hiramatsu <[hidden email]>
To: [hidden email]
CC: Satoshi Oshima <[hidden email]>, Hideo Aoki <[hidden email]>,        Yumiko Sugita <[hidden email]>, [hidden email],        [hidden email]
References: <[hidden email]> <[hidden email]>

Hi,

This patch is the architecture-independent part of djprobe.
Djprobe replaces kernel code (the target code) in order to insert
a jump instruction.
But the target code may be running on other processors, so djprobe
must ensure that no other processor is running the target code.
First, djprobe builds a bypass route from a copy of the target code,
and inserts a kprobe at the top address of the target code. Other
processors can then detour around the target code by using the bypass
route.
Next, djprobe runs work items on the other processors and waits until
all of them have run. After that, it is certain that no other processor
is running the target code.

So it can safely replace the target code with a jump instruction.

---
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Signed-off-by: Masami Hiramatsu <[hidden email]>

 include/linux/djprobe.h |   80 +++++++++++++++
 include/linux/kprobes.h |    4
 kernel/Makefile         |    1
 kernel/djprobe.c        |  253 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/kprobes.c        |    8 +
 5 files changed, 345 insertions(+), 1 deletion(-)
diff -Narup linux-2.6.14-rc5-mm1.djp.1/include/linux/djprobe.h linux-2.6.14-rc5-mm1.djp.2/include/linux/djprobe.h
--- linux-2.6.14-rc5-mm1.djp.1/include/linux/djprobe.h 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.2/include/linux/djprobe.h 2005-10-26 15:52:22.000000000 +0900
@@ -0,0 +1,80 @@
+#ifndef _LINUX_DJPROBE_H
+#define _LINUX_DJPROBE_H
+/*
+ *  Kernel Direct Jump Probe (Djprobe)
+ *  include/linux/djprobe.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+#include <linux/config.h>
+#include <linux/list.h>
+#include <linux/smp.h>
+#include <linux/kprobes.h>
+#include <asm/djprobe.h>
+
+struct djprobe;
+/* djprobe's instance (internal use)*/
+struct djprobe_instance {
+ struct list_head plist; /* list of djprobes for multiprobe support */
+ struct arch_djprobe_stub stub;
+ struct kprobe kp;
+ struct hlist_node hlist; /* list of djprobe_instances */
+};
+#define DJPI_EMPTY(djpi)  (list_empty(&djpi->plist))
+
+struct djprobe;
+typedef void (*djprobe_handler_t) (struct djprobe *, struct pt_regs *);
+/*
+ * Direct Jump probe interface structure
+ */
+struct djprobe {
+ /* list of djprobes */
+ struct list_head plist;
+
+ /* probing handler (pre-executed) */
+ djprobe_handler_t handler;
+
+ /* pointer for instance */
+ struct djprobe_instance *inst;
+};
+
+#ifdef CONFIG_DJPROBE
+extern int arch_prepare_djprobe_instance(struct djprobe_instance *djpi,
+ unsigned long size);
+extern int djprobe_pre_handler(struct kprobe *, struct pt_regs *);
+extern void djprobe_post_handler(struct kprobe *, struct pt_regs *,
+ unsigned long);
+extern void arch_install_djprobe_instance(struct djprobe_instance *djpi);
+extern void arch_uninstall_djprobe_instance(struct djprobe_instance *djpi);
+struct djprobe_instance *__kprobes get_djprobe_instance(void *addr, int size);
+
+int register_djprobe(struct djprobe *p, void *addr, int size);
+void unregister_djprobe(struct djprobe *p);
+#else /* CONFIG_DJPROBE */
+static inline int register_djprobe(struct djprobe *p, void *addr, int size)
+{
+ return -ENOSYS;
+}
+static inline void unregister_djprobe(struct djprobe *p)
+{
+}
+#endif /* CONFIG_DJPROBE */
+#endif /* _LINUX_DJPROBE_H */
diff -Narup linux-2.6.14-rc5-mm1.djp.1/include/linux/kprobes.h linux-2.6.14-rc5-mm1.djp.2/include/linux/kprobes.h
--- linux-2.6.14-rc5-mm1.djp.1/include/linux/kprobes.h 2005-10-25 13:11:26.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.2/include/linux/kprobes.h 2005-10-25 13:32:59.000000000 +0900
@@ -163,10 +163,14 @@ extern int arch_init_kprobes(void);
 extern void show_registers(struct pt_regs *regs);
 extern kprobe_opcode_t *get_insn_slot(void);
 extern void free_insn_slot(kprobe_opcode_t *slot);
+extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_page_list *pages);
+extern void __free_insn_slot(struct kprobe_insn_page_list *pages,
+     kprobe_opcode_t * slot);

 /* Get the kprobe at this addr (if any) - called under a rcu_read_lock() */
 struct kprobe *get_kprobe(void *addr);
 struct hlist_head * kretprobe_inst_table_head(struct task_struct *tsk);
+int in_kprobes_functions(unsigned long addr);

 /* kprobe_running() will just return the current_kprobe on this CPU */
 static inline struct kprobe *kprobe_running(void)
diff -Narup linux-2.6.14-rc5-mm1.djp.1/kernel/djprobe.c linux-2.6.14-rc5-mm1.djp.2/kernel/djprobe.c
--- linux-2.6.14-rc5-mm1.djp.1/kernel/djprobe.c 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.2/kernel/djprobe.c 2005-10-27 11:59:10.000000000 +0900
@@ -0,0 +1,253 @@
+/*
+ *  Kernel Direct Jump Probe (Djprobe)
+ *  kernel/djprobe.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+#include <linux/djprobe.h>
+#include <linux/hash.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/moduleloader.h>
+#include <asm-generic/sections.h>
+#include <asm/cacheflush.h>
+#include <asm/errno.h>
+
+#include <linux/cpu.h>
+#include <linux/percpu.h>
+#include <asm/semaphore.h>
+
+/*
+ * The djprobe do not refer instances list when probe function called.
+ * This list is operated on registering and unregistering djprobe.
+ */
+#define DJPROBE_BLOCK_BITS 6
+#define DJPROBE_BLOCK_SIZE (1 << DJPROBE_BLOCK_BITS)
+#define DJPROBE_HASH_BITS 8
+#define DJPROBE_TABLE_SIZE (1 << DJPROBE_HASH_BITS)
+#define DJPROBE_TABLE_MASK (DJPROBE_TABLE_SIZE - 1)
+
+/* djprobe instance hash table */
+static struct hlist_head djprobe_inst_table[DJPROBE_TABLE_SIZE];
+
+#define hash_djprobe(key) \
+ (((unsigned long)(key) >> DJPROBE_BLOCK_BITS) & DJPROBE_TABLE_MASK)
+
+static DECLARE_MUTEX(djprobe_mutex);
+static DEFINE_PER_CPU(struct work_struct, djprobe_works);
+static DECLARE_WAIT_QUEUE_HEAD(djprobe_wqh);
+static atomic_t djprobe_count = ATOMIC_INIT(0);
+
+/* Instruction pages for djprobe's stub code */
+static struct kprobe_insn_page_list djprobe_insn_pages = {
+ HLIST_HEAD_INIT, 0
+};
+
+static inline void __free_djprobe_instance(struct djprobe_instance *djpi)
+{
+ hlist_del(&djpi->hlist);
+ if (djpi->kp.addr) {
+ unregister_kprobe(&(djpi->kp));
+ }
+ if (djpi->stub.insn)
+ __free_insn_slot(&djprobe_insn_pages, djpi->stub.insn);
+ kfree(djpi);
+}
+
+static inline
+    struct djprobe_instance *__create_djprobe_instance(struct djprobe *djp,
+       void *addr, int size)
+{
+ struct djprobe_instance *djpi;
+ /* allocate a new instance */
+ djpi = kcalloc(1, sizeof(struct djprobe_instance), GFP_ATOMIC);
+ if (djpi == NULL) {
+ goto out;
+ }
+ /* allocate stub */
+ djpi->stub.insn = __get_insn_slot(&djprobe_insn_pages);
+ if (djpi->stub.insn == NULL) {
+ __free_djprobe_instance(djpi);
+ djpi = NULL;
+ goto out;
+ }
+
+ /* attach */
+ djp->inst = djpi;
+ INIT_LIST_HEAD(&djpi->plist);
+ list_add_rcu(&djp->plist, &djpi->plist);
+ djpi->kp.addr = addr;
+ djpi->kp.pre_handler = djprobe_pre_handler;
+ djpi->kp.post_handler = djprobe_post_handler;
+ arch_prepare_djprobe_instance(djpi, size);
+
+ INIT_HLIST_NODE(&djpi->hlist);
+ hlist_add_head(&djpi->hlist, &djprobe_inst_table[hash_djprobe(addr)]);
+      out:
+ return djpi;
+}
+
+static struct djprobe_instance *__kprobes __get_djprobe_instance(void *addr,
+ int size)
+{
+ struct djprobe_instance *djpi;
+ struct hlist_node *node;
+ unsigned long idx, eidx;
+
+ idx = hash_djprobe(addr - ARCH_STUB_INSN_MAX);
+ eidx = ((hash_djprobe(addr + size) + 1) & DJPROBE_TABLE_MASK);
+ do {
+ hlist_for_each_entry(djpi, node, &djprobe_inst_table[idx],
+     hlist) {
+ if (((long)addr <
+     (long)djpi->kp.addr + DJPI_ARCH_SIZE(djpi))
+    && ((long)djpi->kp.addr < (long)addr + size)) {
+ return djpi;
+ }
+ }
+ idx = ((idx + 1) & DJPROBE_TABLE_MASK);
+ }while (idx != eidx);
+
+ return NULL;
+}
+
+struct djprobe_instance *__kprobes get_djprobe_instance(void *addr, int size)
+{
+ struct djprobe_instance *djpi;
+ down(&djprobe_mutex);
+ djpi = __get_djprobe_instance(addr, size);
+ up(&djprobe_mutex);
+ return djpi;
+}
+
+/* This work function invoked while djprobe_mutex is locked. */
+static void __kprobes __work_check_safety(void *data)
+{
+ if (atomic_dec_and_test(&djprobe_count)) {
+ wake_up_all(&djprobe_wqh);
+ }
+}
+
+static void __kprobes __check_safety(void)
+{
+ int cpu;
+ struct work_struct *wk;
+ lock_cpu_hotplug();
+ atomic_set(&djprobe_count, num_online_cpus() - 1);
+ for_each_online_cpu(cpu) {
+ if (cpu == smp_processor_id())
+ continue;
+ wk = &per_cpu(djprobe_works, cpu);
+ INIT_WORK(wk, __work_check_safety, NULL);
+ schedule_delayed_work_on(cpu, wk, 0);
+ }
+ wait_event(djprobe_wqh, (atomic_read(&djprobe_count) == 0));
+ unlock_cpu_hotplug();
+}
+
+int __kprobes register_djprobe(struct djprobe *djp, void *addr, int size)
+{
+ struct djprobe_instance *djpi;
+ struct kprobe *kp;
+ int ret = 0, i;
+
+ BUG_ON(in_interrupt());
+
+ if (size > ARCH_STUB_INSN_MAX || size < ARCH_STUB_INSN_MIN)
+ return -EINVAL;
+
+ if ((ret = in_kprobes_functions((unsigned long)addr)) != 0)
+ return ret;
+
+ down(&djprobe_mutex);
+ INIT_LIST_HEAD(&djp->plist);
+ /* check for conflicts with other djprobes */
+ djpi = __get_djprobe_instance(addr, size);
+ if (djpi) {
+ if (djpi->kp.addr == addr) {
+ djp->inst = djpi; /* add to another instance */
+ list_add_rcu(&djp->plist, &djpi->plist);
+ } else {
+ ret = -EEXIST; /* other djprobes were inserted */
+ }
+ goto out;
+ }
+ djpi = __create_djprobe_instance(djp, addr, size);
+ if (djpi == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ /* check for conflicts with kprobes */
+ for (i = 0; i < size; i++) {
+ kp = get_kprobe((void *)((long)addr + i));
+ if (kp != NULL) {
+ ret = -EEXIST; /* a kprobe was inserted */
+ goto fail;
+ }
+ }
+ ret = register_kprobe(&djpi->kp);
+ if (ret < 0) {
+       fail:
+ djpi->kp.addr = NULL;
+ djp->inst = NULL;
+ list_del_rcu(&djp->plist);
+ __free_djprobe_instance(djpi);
+ } else {
+ __check_safety();
+ arch_install_djprobe_instance(djpi);
+ }
+       out:
+ up(&djprobe_mutex);
+ return ret;
+}
+
+void __kprobes unregister_djprobe(struct djprobe *djp)
+{
+ struct djprobe_instance *djpi;
+
+ BUG_ON(in_interrupt());
+
+ down(&djprobe_mutex);
+ djpi = djp->inst;
+ if (djp->plist.next == djp->plist.prev) {
+ arch_uninstall_djprobe_instance(djpi); /* this requires irq enabled */
+ list_del_rcu(&djp->plist);
+ djp->inst = NULL;
+ __check_safety();
+ __free_djprobe_instance(djpi);
+ } else {
+ list_del_rcu(&djp->plist);
+ djp->inst = NULL;
+ }
+ up(&djprobe_mutex);
+}
+
+static int __init init_djprobe(void)
+{
+ djprobe_insn_pages.insn_size = ARCH_STUB_SIZE;
+ return 0;
+}
+
+__initcall(init_djprobe);
+
+EXPORT_SYMBOL_GPL(register_djprobe);
+EXPORT_SYMBOL_GPL(unregister_djprobe);
diff -Narup linux-2.6.14-rc5-mm1.djp.1/kernel/kprobes.c linux-2.6.14-rc5-mm1.djp.2/kernel/kprobes.c
--- linux-2.6.14-rc5-mm1.djp.1/kernel/kprobes.c 2005-10-25 13:13:58.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.2/kernel/kprobes.c 2005-10-26 15:53:05.000000000 +0900
@@ -37,6 +37,7 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/moduleloader.h>
+#include <linux/djprobe.h>
 #include <asm-generic/sections.h>
 #include <asm/cacheflush.h>
 #include <asm/errno.h>
@@ -467,7 +468,7 @@ static inline void cleanup_aggr_kprobe(s
  spin_unlock_irqrestore(&kprobe_lock, flags);
 }

-static int __kprobes in_kprobes_functions(unsigned long addr)
+int __kprobes in_kprobes_functions(unsigned long addr)
 {
  if (addr >= (unsigned long)__kprobes_text_start
  && addr < (unsigned long)__kprobes_text_end)
@@ -483,6 +484,11 @@ int __kprobes register_kprobe(struct kpr

  if ((ret = in_kprobes_functions((unsigned long) p->addr)) != 0)
  return ret;
+#ifdef CONFIG_DJPROBE
+ if (p->pre_handler != djprobe_pre_handler &&
+    get_djprobe_instance(p->addr, 1) != NULL)
+ return -EEXIST;
+#endif /* CONFIG_DJPROBE */
  if ((ret = arch_prepare_kprobe(p)) != 0)
  goto rm_kprobe;

diff -Narup linux-2.6.14-rc5-mm1.djp.1/kernel/Makefile linux-2.6.14-rc5-mm1.djp.2/kernel/Makefile
--- linux-2.6.14-rc5-mm1.djp.1/kernel/Makefile 2005-10-25 11:29:02.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.2/kernel/Makefile 2005-10-25 13:22:27.000000000 +0900
@@ -27,6 +27,7 @@ obj-$(CONFIG_STOP_MACHINE) += stop_machi
 obj-$(CONFIG_AUDIT) += audit.o
 obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
 obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_DJPROBE) += djprobe.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/




[Fwd: [RFC][PATCH 3/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1]

Masami Hiramatsu
Hi,

Here is the fourth mail that I posted to LKML.

-------- Original Message --------
Subject: [RFC][PATCH 3/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1
Date: Mon, 31 Oct 2005 20:10:50 +0900
From: Masami Hiramatsu <[hidden email]>
To: [hidden email]
CC: Satoshi Oshima <[hidden email]>, Hideo Aoki <[hidden email]>,        Yumiko Sugita <[hidden email]>, [hidden email],        [hidden email]
References: <[hidden email]> <[hidden email]> <[hidden email]>

Hi,

This patch contains the i386 architecture-dependent code of djprobe.
I heard that on i386 we need to synchronize the caches of each
processor when we execute self-modifying code.
So this patch synchronizes the caches by using CPUID and smp_call_function.
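
To make the jump patching concrete, here is a minimal user-space sketch of how a 5-byte "jmp rel32" is encoded. It mirrors the arithmetic of __set_jmp_op() in this patch (displacement = target - (source + 5), operand written before the opcode). The helper names encode_rel_jmp and decode_rel_jmp_target, and the byte-buffer interface, are assumptions made for illustration only. The real code writes into kernel text and serializes the other CPUs between the two writes, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

#define RELATIVEJUMP_INSTRUCTION 0xe9 /* opcode of jmp rel32, as in the patch */
#define REL_JMP_SIZE 5                /* 1 opcode byte + 4 displacement bytes */

/*
 * Hypothetical helper: encode "jmp rel32" into buf as if buf were located at
 * 32-bit address `from` and the jump should land at address `to`.
 * The displacement is relative to the end of the jmp instruction.
 */
static void encode_rel_jmp(uint8_t buf[REL_JMP_SIZE], uint32_t from, uint32_t to)
{
	uint32_t rel = to - (from + REL_JMP_SIZE); /* wraps modulo 2^32 */

	/* Write the little-endian operand first and the opcode last. */
	buf[1] = (uint8_t)(rel & 0xff);
	buf[2] = (uint8_t)((rel >> 8) & 0xff);
	buf[3] = (uint8_t)((rel >> 16) & 0xff);
	buf[4] = (uint8_t)((rel >> 24) & 0xff);
	buf[0] = RELATIVEJUMP_INSTRUCTION;
}

/* Hypothetical helper: recover the target of a jmp rel32 located at `from`. */
static uint32_t decode_rel_jmp_target(const uint8_t buf[REL_JMP_SIZE],
				      uint32_t from)
{
	uint32_t rel;

	assert(buf[0] == RELATIVEJUMP_INSTRUCTION);
	rel = (uint32_t)buf[1] |
	      ((uint32_t)buf[2] << 8) |
	      ((uint32_t)buf[3] << 16) |
	      ((uint32_t)buf[4] << 24);
	return from + REL_JMP_SIZE + rel;
}
```

Writing the opcode byte last means that a CPU fetching the probe point sees either the old first byte or a complete jump, assuming the operand bytes are already visible, which is why the patch places mb() and the serialization step before the final store.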

---
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Signed-off-by: Masami Hiramatsu <[hidden email]>

 arch/i386/Kconfig               |    8 +
 arch/i386/kernel/Makefile       |    1
 arch/i386/kernel/djprobe.c      |  172 ++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/stub_djprobe.S |   77 +++++++++++++++++
 include/asm-i386/djprobe.h      |   56 +++++++++++++
 5 files changed, 314 insertions(+)
diff -Narup linux-2.6.14-rc5-mm1.djp.2/arch/i386/Kconfig linux-2.6.14-rc5-mm1.djp.3/arch/i386/Kconfig
--- linux-2.6.14-rc5-mm1.djp.2/arch/i386/Kconfig 2005-10-25 11:28:49.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.3/arch/i386/Kconfig 2005-10-27 11:26:55.000000000 +0900
@@ -1317,6 +1317,14 @@ config KPROBES
   a probepoint and specifies the callback.  Kprobes is useful
   for kernel debugging, non-intrusive instrumentation and testing.
   If in doubt, say "N".
+
+config DJPROBE
+        bool "Direct Jump probe"
+ depends on KPROBES && !PREEMPT
+ help
+ Djprobe allows you to dynamically hook any kernel function
+ entry point and collect debugging or performance analysis
+ information non-disruptively.
 endmenu

 source "arch/i386/Kconfig.debug"
diff -Narup linux-2.6.14-rc5-mm1.djp.2/arch/i386/kernel/Makefile linux-2.6.14-rc5-mm1.djp.3/arch/i386/kernel/Makefile
--- linux-2.6.14-rc5-mm1.djp.2/arch/i386/kernel/Makefile 2005-10-25 11:28:49.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.3/arch/i386/kernel/Makefile 2005-10-25 14:39:12.000000000 +0900
@@ -29,6 +29,7 @@ obj-$(CONFIG_KEXEC) += machine_kexec.o
 obj-$(CONFIG_X86_NUMAQ) += numaq.o
 obj-$(CONFIG_X86_SUMMIT_NUMA) += summit.o
 obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_DJPROBE) += stub_djprobe.o djprobe.o
 obj-$(CONFIG_MODULES) += module.o
 obj-y += sysenter.o vsyscall.o
 obj-$(CONFIG_ACPI_SRAT) += srat.o
diff -Narup linux-2.6.14-rc5-mm1.djp.2/arch/i386/kernel/djprobe.c linux-2.6.14-rc5-mm1.djp.3/arch/i386/kernel/djprobe.c
--- linux-2.6.14-rc5-mm1.djp.2/arch/i386/kernel/djprobe.c 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.3/arch/i386/kernel/djprobe.c 2005-10-28 17:52:29.000000000 +0900
@@ -0,0 +1,172 @@
+/*
+ *  Kernel Direct Jump Probe (Djprobes)
+ *  arch/i386/kernel/djprobe.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+
+#include <linux/config.h>
+#include <linux/djprobe.h>
+#include <linux/ptrace.h>
+#include <linux/spinlock.h>
+#include <linux/preempt.h>
+#include <asm/cacheflush.h>
+#include <asm/kdebug.h>
+#include <asm/desc.h>
+#include <asm/processor.h>
+
+
+/*
+ * When kernel full preemption is enabled, we can't ensure that no threads
+ * are executing the modified code. It may be stored in the stack of the
+ * threads. In this case, the djprobe interfaces are emulated by using
+ * kprobe.
+ * When kernel full preemption is disabled, threads are scheduled
+ * from only limited addresses. So it is easy to check whether the
+ * preemption can occur in the modified code.
+ */
+
+/*
+ * On the Pentium series, unsynchronized cross-modifying code
+ * operations can cause unexpected instruction execution results.
+ * So after the code is modified, we should synchronize each processor.
+ */
+static void __local_serialize_cpu(void * info)
+{
+ serialize_cpu();
+}
+
+static inline void smp_serialize_cpus(void)
+{
+ on_each_cpu(__local_serialize_cpu, NULL, 1,1);
+}
+
+/* jmp code manipulators */
+struct __arch_jmp_op {
+ char op;
+ long raddr;
+} __attribute__((packed));
+/* insert jmp code */
+static inline void __set_jmp_op(void *from, void *to, int sync)
+{
+ struct __arch_jmp_op *jop;
+ jop = (struct __arch_jmp_op *)from;
+ jop->raddr=(long)(to) - ((long)(from) + 5);
+ mb();
+ if (sync) smp_serialize_cpus();
+ jop->op = RELATIVEJUMP_INSTRUCTION;
+}
+/* switch back to the kprobe */
+static inline void __set_breakpoint_op(void *dest, void *orig)
+{
+ struct __arch_jmp_op *jop = (struct __arch_jmp_op *)dest,
+ *jop2 = (struct __arch_jmp_op *)orig;
+
+ jop->op = BREAKPOINT_INSTRUCTION;
+ jop->raddr = jop2->raddr;
+ mb();
+ smp_serialize_cpus();
+}
+
+/* djprobe call back function: called from stub code */
+static void asmlinkage djprobe_callback(struct djprobe_instance * djpi,
+ struct pt_regs *regs)
+{
+ struct djprobe *djp;
+ rcu_read_lock();
+ list_for_each_entry_rcu(djp, &djpi->plist, plist) {
+ if (djp->handler)
+ djp->handler(djp, regs);
+ }
+ rcu_read_unlock();
+}
+
+/*
+ * Copy post processing instructions
+ * Target instructions MUST be relocatable.
+ */
+int __kprobes arch_prepare_djprobe_instance(struct djprobe_instance *djpi,
+  unsigned long size)
+{
+ kprobe_opcode_t *stub;
+ stub = djpi->stub.insn;
+ djpi->stub.size = size;
+
+ /* copy arch-dep-instance from template */
+ memcpy((void*)stub, (void*)&arch_tmpl_stub_entry, ARCH_STUB_SIZE);
+
+ /* set probe information */
+ *((long*)(stub + ARCH_STUB_VAL_IDX)) = (long)djpi;
+ /* set probe function */
+ *((long*)(stub + ARCH_STUB_CALL_IDX)) = (long)djprobe_callback;
+
+ /* copy instructions into the middle of the djprobe instance */
+ memcpy((void*)(stub + ARCH_STUB_INST_IDX),
+       (void*)djpi->kp.addr, size);
+
+ /* set returning jmp instruction at the tail of the djprobe instance */
+ __set_jmp_op(stub + ARCH_STUB_INST_IDX + size,
+     (void*)((long)djpi->kp.addr + size), 0);
+
+ return 0;
+}
+
+/* Insert "jmp" instruction into the probing point. */
+void __kprobes arch_install_djprobe_instance(struct djprobe_instance *djpi)
+{
+ __set_jmp_op((void*)djpi->kp.addr, (void*)djpi->stub.insn, 1);
+}
+
+/* Write back original instructions & kprobe */
+void __kprobes arch_uninstall_djprobe_instance(struct djprobe_instance *djpi)
+{
+ kprobe_opcode_t *stub;
+ stub = &djpi->stub.insn[ARCH_STUB_INST_IDX];
+ __set_breakpoint_op((void*)djpi->kp.addr, (void*)stub);
+}
+
+static DEFINE_SPINLOCK(djprobe_handler_lock);
+
+/* djprobe handler : switch to a bypass code */
+int __kprobes djprobe_pre_handler(struct kprobe * kp, struct pt_regs * regs)
+{
+ struct djprobe_instance *djpi =
+ container_of(kp,struct djprobe_instance, kp);
+ kprobe_opcode_t *stub = djpi->stub.insn;
+
+ spin_lock(&djprobe_handler_lock);
+ if (DJPI_EMPTY(djpi)) {
+ kp->ainsn.insn[0] = kp->opcode;
+ return 0;
+ } else {
+ regs->eip = (unsigned long)stub;
+ regs->eflags |= TF_MASK;
+ regs->eflags &= ~IF_MASK;
+ kp->ainsn.insn[0] = RETURN_INSTRUCTION;
+ return 1; /* already prepared */
+ }
+}
+
+void __kprobes djprobe_post_handler(struct kprobe * kp, struct pt_regs * regs,
+    unsigned long flags)
+{
+ spin_unlock(&djprobe_handler_lock);
+}
diff -Narup linux-2.6.14-rc5-mm1.djp.2/arch/i386/kernel/stub_djprobe.S linux-2.6.14-rc5-mm1.djp.3/arch/i386/kernel/stub_djprobe.S
--- linux-2.6.14-rc5-mm1.djp.2/arch/i386/kernel/stub_djprobe.S 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.3/arch/i386/kernel/stub_djprobe.S 2005-10-25 14:39:12.000000000 +0900
@@ -0,0 +1,77 @@
+/*
+ *  linux/arch/i386/stub_djprobe.S
+ *
+ *  Copyright (C) HITACHI,LTD. 2005
+ *  Created by Masami Hiramatsu <[hidden email]>
+ */
+
+#include <linux/config.h>
+
+# jmp into this function from other functions.
+.global arch_tmpl_stub_entry
+arch_tmpl_stub_entry:
+ nop
+ subl $8, %esp # reserve pt_regs slots for esp and xss
+ pushf
+ subl $20, %esp # reserve slots for xcs, eip, orig_eax, xes and xds
+ pushl %eax
+ pushl %ebp
+ pushl %edi
+ pushl %esi
+ pushl %edx
+ pushl %ecx
+ pushl %ebx
+
+ movl %esp, %eax
+ pushl %eax
+ addl $60, %eax
+ movl %eax, 56(%esp)
+.global arch_tmpl_stub_val
+arch_tmpl_stub_val:
+ movl $0xffffffff, %eax
+ pushl %eax
+.global arch_tmpl_stub_call
+arch_tmpl_stub_call:
+ movl $0xffffffff, %eax
+ call *%eax
+ addl $8, %esp
+
+ popl %ebx
+ popl %ecx
+ popl %edx
+ popl %esi
+ popl %edi
+ popl %ebp
+ popl %eax
+ addl $20, %esp
+ popf
+ addl $8, %esp
+.global arch_tmpl_stub_inst
+arch_tmpl_stub_inst:
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+.global arch_tmpl_stub_end
+arch_tmpl_stub_end:
diff -Narup linux-2.6.14-rc5-mm1.djp.2/include/asm-i386/djprobe.h linux-2.6.14-rc5-mm1.djp.3/include/asm-i386/djprobe.h
--- linux-2.6.14-rc5-mm1.djp.2/include/asm-i386/djprobe.h 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-rc5-mm1.djp.3/include/asm-i386/djprobe.h 2005-10-25 14:39:12.000000000 +0900
@@ -0,0 +1,56 @@
+#ifndef _ASM_DJPROBE_H
+#define _ASM_DJPROBE_H
+/*
+ *  Kernel Direct Jump Probe (Djprobe)
+ *  include/asm-i386/djprobe.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+
+#define RELATIVEJUMP_INSTRUCTION 0xe9
+#define RETURN_INSTRUCTION 0xc3
+
+#ifndef CONFIG_PREEMPT
+#define ARCH_SUPPORTS_DJPROBES
+#endif /* CONFIG_PREEMPT */
+
+/* stub template code */
+extern kprobe_opcode_t arch_tmpl_stub_entry;
+extern kprobe_opcode_t arch_tmpl_stub_val;
+extern kprobe_opcode_t arch_tmpl_stub_call;
+extern kprobe_opcode_t arch_tmpl_stub_inst;
+extern kprobe_opcode_t arch_tmpl_stub_end;
+
+#define ARCH_STUB_VAL_IDX ((long)&arch_tmpl_stub_val - (long)&arch_tmpl_stub_entry + 1)
+#define ARCH_STUB_CALL_IDX ((long)&arch_tmpl_stub_call - (long)&arch_tmpl_stub_entry + 1)
+#define ARCH_STUB_INST_IDX ((long)&arch_tmpl_stub_inst - (long)&arch_tmpl_stub_entry)
+#define ARCH_STUB_SIZE ((long)&arch_tmpl_stub_end - (long)&arch_tmpl_stub_entry)
+
+#define ARCH_STUB_INSN_MAX 20
+#define ARCH_STUB_INSN_MIN 5
+
+struct arch_djprobe_stub {
+ kprobe_opcode_t *insn;
+ int size;
+};
+#define DJPI_ARCH_SIZE(djpi) (djpi->stub.size)
+
+#endif /* _ASM_DJPROBE_H */



Re: [Fwd: [RFC][PATCH 3/3]Djprobe (Direct Jump Probe) for 2.6.14-rc5-mm1]

Masami Hiramatsu
Hi,

Masami Hiramatsu wrote:

> +
> +static DEFINE_SPINLOCK(djprobe_handler_lock);
> +
> +/* djprobe handler : switch to a bypass code */
> +int __kprobes djprobe_pre_handler(struct kprobe * kp, struct pt_regs * regs)
> +{
> + struct djprobe_instance *djpi =
> + container_of(kp,struct djprobe_instance, kp);
> + kprobe_opcode_t *stub = djpi->stub.insn;
> +
> + spin_lock(&djprobe_handler_lock);
> + if (DJPI_EMPTY(djpi)) {
> + kp->ainsn.insn[0] = kp->opcode;
> + return 0;
> + } else {
> + regs->eip = (unsigned long)stub;
> + regs->eflags |= TF_MASK;
> + regs->eflags &= ~IF_MASK;
> + kp->ainsn.insn[0] = RETURN_INSTRUCTION;
> + return 1; /* already prepared */
> + }
> +}
> +
> +void __kprobes djprobe_post_handler(struct kprobe * kp, struct pt_regs * regs,
> +    unsigned long flags)
> +{
> + spin_unlock(&djprobe_handler_lock);
> +}

I learned from kretprobe that I can remove both this spinlock and
the instruction-buffer trick. Here is the new pseudocode.

djprobe_pre_handler()
{
        if (!DJPI_EMPTY(djpi)) {
                regs->eip = (unsigned long)djpi->stub.insn;
                reset_current_kprobe();
                preempt_enable_no_resched();
                return 1;
        }
        return 0;
}

I will send the fixed patches again. Please review them.

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]


[Patch 0/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Masami Hiramatsu
Hi,

I updated the djprobe patches against linux-2.6.14-mm1.
I also removed all spinlocks from the djprobe patches, as described below.

Masami Hiramatsu wrote:

>
> I learned from kretprobe that I can remove both this spinlock and
> the instruction-buffer trick. Here is the new pseudocode.
>
> djprobe_pre_handler()
> {
> if (!DJPI_EMPTY(djpi)) {
> regs->eip = (unsigned long)djpi->stub.insn;
> reset_current_kprobe();
> preempt_enable_no_resched();
> return 1;
> }
> return 0;
> }
>

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

[Patch 1/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Masami Hiramatsu
Hi,

Here is a patch that enables get_insn_slot() to handle slots
of different sizes.
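
As a rough illustration of the idea, the following user-space sketch keeps one pool per slot size and sizes the per-page "slot used" array at allocation time instead of at compile time. It is only a sketch under assumptions: malloc() stands in for the kernel's executable-page allocator, a C99 flexible array member replaces the slot_used[1] idiom, and the names insn_page_list, get_slot and free_slot are made up for this example. Page reclamation is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BYTES 4096 /* assumed page size for this sketch */

struct insn_page {
	struct insn_page *next;
	unsigned char *insns; /* page of instruction slots */
	int nused;
	char slot_used[];     /* one flag per slot, sized when allocated */
};

struct insn_page_list {
	struct insn_page *head;
	int insn_size;        /* size of one slot in bytes */
};

static int slots_per_page(const struct insn_page_list *pages)
{
	/* The slot size must be set before first use (avoids dividing by 0). */
	assert(pages->insn_size > 0 && pages->insn_size <= PAGE_BYTES);
	return PAGE_BYTES / pages->insn_size;
}

/* Find a free slot, allocating a new page when every page is full. */
static unsigned char *get_slot(struct insn_page_list *pages)
{
	int n = slots_per_page(pages);
	struct insn_page *p;

	for (p = pages->head; p; p = p->next) {
		if (p->nused == n)
			continue;
		for (int i = 0; i < n; i++) {
			if (!p->slot_used[i]) {
				p->slot_used[i] = 1;
				p->nused++;
				return p->insns + (size_t)i * pages->insn_size;
			}
		}
	}

	/* All out of space: allocate a zeroed header plus n flags, use slot 0. */
	p = calloc(1, sizeof(*p) + (size_t)n);
	if (!p)
		return NULL;
	p->insns = malloc(PAGE_BYTES);
	if (!p->insns) {
		free(p);
		return NULL;
	}
	p->next = pages->head;
	pages->head = p;
	p->slot_used[0] = 1;
	p->nused = 1;
	return p->insns;
}

/* Mark the slot as free again; pages are never released in this sketch. */
static void free_slot(struct insn_page_list *pages, unsigned char *slot)
{
	int n = slots_per_page(pages);
	uintptr_t s = (uintptr_t)slot;

	for (struct insn_page *p = pages->head; p; p = p->next) {
		uintptr_t base = (uintptr_t)p->insns;

		if (base <= s && s < base + (size_t)n * pages->insn_size) {
			int i = (int)((s - base) / (size_t)pages->insn_size);

			if (p->slot_used[i]) {
				p->slot_used[i] = 0;
				p->nused--;
			}
			return;
		}
	}
}
```

Note that the djprobe pool in patch 2/3 is declared with a slot size of 0 and only receives ARCH_STUB_SIZE in init_djprobe(), so a pool like this must not be used before its slot size has been set.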

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

 include/linux/kprobes.h |    5 ++++
 kernel/kprobes.c        |   58 +++++++++++++++++++++++++++++++-----------------
 2 files changed, 43 insertions(+), 20 deletions(-)
diff -Narup linux-2.6.14-mm1/include/linux/kprobes.h linux-2.6.14-mm1.djp.1/include/linux/kprobes.h
--- linux-2.6.14-mm1/include/linux/kprobes.h 2005-11-08 11:51:03.000000000 +0900
+++ linux-2.6.14-mm1.djp.1/include/linux/kprobes.h 2005-11-08 11:52:46.000000000 +0900
@@ -147,6 +147,11 @@ struct kretprobe_instance {
  struct task_struct *task;
 };

+struct kprobe_insn_page_list {
+ struct hlist_head list;
+ int insn_size; /* size of an instruction slot */
+};
+
 #ifdef CONFIG_KPROBES
 extern spinlock_t kretprobe_lock;
 extern int arch_prepare_kprobe(struct kprobe *p);
diff -Narup linux-2.6.14-mm1/kernel/kprobes.c linux-2.6.14-mm1.djp.1/kernel/kprobes.c
--- linux-2.6.14-mm1/kernel/kprobes.c 2005-11-08 11:51:04.000000000 +0900
+++ linux-2.6.14-mm1.djp.1/kernel/kprobes.c 2005-11-08 11:52:46.000000000 +0900
@@ -58,44 +58,50 @@ static DEFINE_PER_CPU(struct kprobe *, k
  * stepping on the instruction on a vmalloced/kmalloced/data page
  * is a recipe for disaster
  */
-#define INSNS_PER_PAGE (PAGE_SIZE/(MAX_INSN_SIZE * sizeof(kprobe_opcode_t)))
+#define INSNS_PER_PAGE(size) (PAGE_SIZE/(size * sizeof(kprobe_opcode_t)))

 struct kprobe_insn_page {
  struct hlist_node hlist;
  kprobe_opcode_t *insns; /* Page of instruction slots */
- char slot_used[INSNS_PER_PAGE];
  int nused;
+ char slot_used[1];
 };

-static struct hlist_head kprobe_insn_pages;
+static struct kprobe_insn_page_list kprobe_insn_pages = {
+ HLIST_HEAD_INIT, MAX_INSN_SIZE
+};

 /**
- * get_insn_slot() - Find a slot on an executable page for an instruction.
+ * __get_insn_slot() - Find a slot on an executable page for an instruction.
  * We allocate an executable page if there's no room on existing ones.
  */
-kprobe_opcode_t __kprobes *get_insn_slot(void)
+kprobe_opcode_t
+ __kprobes * __get_insn_slot(struct kprobe_insn_page_list *pages)
 {
  struct kprobe_insn_page *kip;
  struct hlist_node *pos;
+ int ninsns = INSNS_PER_PAGE(pages->insn_size);

- hlist_for_each(pos, &kprobe_insn_pages) {
+ hlist_for_each(pos, &pages->list) {
  kip = hlist_entry(pos, struct kprobe_insn_page, hlist);
- if (kip->nused < INSNS_PER_PAGE) {
+ if (kip->nused < ninsns) {
  int i;
- for (i = 0; i < INSNS_PER_PAGE; i++) {
+ for (i = 0; i < ninsns; i++) {
  if (!kip->slot_used[i]) {
  kip->slot_used[i] = 1;
  kip->nused++;
- return kip->insns + (i * MAX_INSN_SIZE);
+ return kip->insns +
+    (i * pages->insn_size);
  }
  }
  /* Surprise!  No unused slots.  Fix kip->nused. */
- kip->nused = INSNS_PER_PAGE;
+ kip->nused = ninsns;
  }
  }

- /* All out of space.  Need to allocate a new page. Use slot 0.*/
- kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL);
+ /* All out of space.  Need to allocate a new page. Use slot 0. */
+ kip = kmalloc(sizeof(struct kprobe_insn_page) +
+    sizeof(char) * (ninsns - 1), GFP_ATOMIC);
  if (!kip) {
  return NULL;
  }
@@ -111,23 +117,25 @@ kprobe_opcode_t __kprobes *get_insn_slot
  return NULL;
  }
  INIT_HLIST_NODE(&kip->hlist);
- hlist_add_head(&kip->hlist, &kprobe_insn_pages);
- memset(kip->slot_used, 0, INSNS_PER_PAGE);
+ hlist_add_head(&kip->hlist, &pages->list);
+ memset(kip->slot_used, 0, ninsns);
  kip->slot_used[0] = 1;
  kip->nused = 1;
  return kip->insns;
 }

-void __kprobes free_insn_slot(kprobe_opcode_t *slot)
+void __kprobes __free_insn_slot(struct kprobe_insn_page_list *pages,
+ kprobe_opcode_t * slot)
 {
  struct kprobe_insn_page *kip;
  struct hlist_node *pos;
+ int ninsns = INSNS_PER_PAGE(pages->insn_size);

- hlist_for_each(pos, &kprobe_insn_pages) {
+ hlist_for_each(pos, &pages->list) {
  kip = hlist_entry(pos, struct kprobe_insn_page, hlist);
  if (kip->insns <= slot &&
-    slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) {
- int i = (slot - kip->insns) / MAX_INSN_SIZE;
+    slot < kip->insns + (ninsns * pages->insn_size)) {
+ int i = (slot - kip->insns) / pages->insn_size;
  kip->slot_used[i] = 0;
  kip->nused--;
  if (kip->nused == 0) {
@@ -138,10 +146,10 @@ void __kprobes free_insn_slot(kprobe_opc
  * next time somebody inserts a probe.
  */
  hlist_del(&kip->hlist);
- if (hlist_empty(&kprobe_insn_pages)) {
+ if (hlist_empty(&pages->list)) {
  INIT_HLIST_NODE(&kip->hlist);
  hlist_add_head(&kip->hlist,
- &kprobe_insn_pages);
+       &pages->list);
  } else {
  module_free(NULL, kip->insns);
  kfree(kip);
@@ -152,6 +160,16 @@ void __kprobes free_insn_slot(kprobe_opc
  }
 }

+kprobe_opcode_t __kprobes *get_insn_slot(void)
+{
+ return __get_insn_slot(&kprobe_insn_pages);
+}
+
+void __kprobes free_insn_slot(kprobe_opcode_t * slot)
+{
+ __free_insn_slot(&kprobe_insn_pages, slot);
+}
+
 /* We have preemption disabled.. so it is safe to use __ versions */
 static inline void set_kprobe_instance(struct kprobe *kp)
 {


[Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Masami Hiramatsu
Hi,

This patch is the architecture-independent part of djprobe.
I removed djprobe_post_handler.
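
For readers who want to see how the instance lookup works, here is a small user-space sketch of the two pieces of arithmetic it relies on: the bucket hash (one bucket per 64-byte block of addresses) and the half-open range overlap test used to detect conflicting probes. The constants are taken from the patch. The function ranges_overlap and the use of uintptr_t are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define DJPROBE_BLOCK_BITS 6
#define DJPROBE_HASH_BITS  8
#define DJPROBE_TABLE_SIZE (1u << DJPROBE_HASH_BITS)
#define DJPROBE_TABLE_MASK (DJPROBE_TABLE_SIZE - 1)

/* Same arithmetic as the hash_djprobe() macro: one bucket per 64-byte block. */
static unsigned int hash_djprobe(uintptr_t addr)
{
	return (unsigned int)((addr >> DJPROBE_BLOCK_BITS) & DJPROBE_TABLE_MASK);
}

/*
 * Hypothetical helper: do the half-open ranges [a, a + alen) and
 * [b, b + blen) share at least one byte? This is the same condition that
 * __get_djprobe_instance() checks for each instance in a bucket.
 */
static int ranges_overlap(uintptr_t a, unsigned int alen,
			  uintptr_t b, unsigned int blen)
{
	assert(alen > 0 && blen > 0);
	return a < b + blen && b < a + alen;
}
```

Because an existing probe may start up to ARCH_STUB_INSN_MAX bytes before the requested address, it can live in the previous bucket. That is why the lookup in the patch starts at hash_djprobe(addr - ARCH_STUB_INSN_MAX) and walks forward to the bucket just past hash_djprobe(addr + size).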

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

 include/linux/djprobe.h |   78 ++++++++++++++
 include/linux/kprobes.h |    4
 kernel/Makefile         |    1
 kernel/djprobe.c        |  252 ++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/kprobes.c        |    8 +
 5 files changed, 342 insertions(+), 1 deletion(-)
diff -Narup linux-2.6.14-mm1.djp.1/include/linux/djprobe.h linux-2.6.14-mm1.djp.2/include/linux/djprobe.h
--- linux-2.6.14-mm1.djp.1/include/linux/djprobe.h 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-mm1.djp.2/include/linux/djprobe.h 2005-11-08 11:56:43.000000000 +0900
@@ -0,0 +1,78 @@
+#ifndef _LINUX_DJPROBE_H
+#define _LINUX_DJPROBE_H
+/*
+ *  Kernel Direct Jump Probe (Djprobe)
+ *  include/linux/djprobe.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+#include <linux/config.h>
+#include <linux/list.h>
+#include <linux/smp.h>
+#include <linux/kprobes.h>
+#include <asm/djprobe.h>
+
+struct djprobe;
+/* djprobe's instance (internal use)*/
+struct djprobe_instance {
+ struct list_head plist; /* list of djprobes for multiprobe support */
+ struct arch_djprobe_stub stub;
+ struct kprobe kp;
+ struct hlist_node hlist; /* list of djprobe_instances */
+};
+#define DJPI_EMPTY(djpi)  (list_empty(&djpi->plist))
+
+struct djprobe;
+typedef void (*djprobe_handler_t) (struct djprobe *, struct pt_regs *);
+/*
+ * Direct Jump probe interface structure
+ */
+struct djprobe {
+ /* list of djprobes */
+ struct list_head plist;
+
+ /* probing handler (pre-executed) */
+ djprobe_handler_t handler;
+
+ /* pointer for instance */
+ struct djprobe_instance *inst;
+};
+
+#ifdef CONFIG_DJPROBE
+extern int arch_prepare_djprobe_instance(struct djprobe_instance *djpi,
+ unsigned long size);
+extern int djprobe_pre_handler(struct kprobe *, struct pt_regs *);
+extern void arch_install_djprobe_instance(struct djprobe_instance *djpi);
+extern void arch_uninstall_djprobe_instance(struct djprobe_instance *djpi);
+struct djprobe_instance *__kprobes get_djprobe_instance(void *addr, int size);
+
+int register_djprobe(struct djprobe *p, void *addr, int size);
+void unregister_djprobe(struct djprobe *p);
+#else /* CONFIG_DJPROBE */
+static inline int register_djprobe(struct djprobe *p, void *addr, int size)
+{
+ return -ENOSYS;
+}
+static inline void unregister_djprobe(struct djprobe *p)
+{
+}
+#endif /* CONFIG_DJPROBE */
+#endif /* _LINUX_DJPROBE_H */
diff -Narup linux-2.6.14-mm1.djp.1/include/linux/kprobes.h linux-2.6.14-mm1.djp.2/include/linux/kprobes.h
--- linux-2.6.14-mm1.djp.1/include/linux/kprobes.h 2005-11-08 11:52:46.000000000 +0900
+++ linux-2.6.14-mm1.djp.2/include/linux/kprobes.h 2005-11-08 11:58:50.000000000 +0900
@@ -163,6 +163,10 @@ extern int arch_init_kprobes(void);
 extern void show_registers(struct pt_regs *regs);
 extern kprobe_opcode_t *get_insn_slot(void);
 extern void free_insn_slot(kprobe_opcode_t *slot);
+extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_page_list *pages);
+extern void __free_insn_slot(struct kprobe_insn_page_list *pages,
+     kprobe_opcode_t * slot);
+extern int in_kprobes_functions(unsigned long addr);

 /* Get the kprobe at this addr (if any) - called with preemption disabled */
 struct kprobe *get_kprobe(void *addr);
diff -Narup linux-2.6.14-mm1.djp.1/kernel/Makefile linux-2.6.14-mm1.djp.2/kernel/Makefile
--- linux-2.6.14-mm1.djp.1/kernel/Makefile 2005-11-08 11:51:04.000000000 +0900
+++ linux-2.6.14-mm1.djp.2/kernel/Makefile 2005-11-08 11:56:43.000000000 +0900
@@ -27,6 +27,7 @@ obj-$(CONFIG_STOP_MACHINE) += stop_machi
 obj-$(CONFIG_AUDIT) += audit.o
 obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
 obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_DJPROBE) += djprobe.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
diff -Narup linux-2.6.14-mm1.djp.1/kernel/djprobe.c linux-2.6.14-mm1.djp.2/kernel/djprobe.c
--- linux-2.6.14-mm1.djp.1/kernel/djprobe.c 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-mm1.djp.2/kernel/djprobe.c 2005-11-08 11:56:43.000000000 +0900
@@ -0,0 +1,252 @@
+/*
+ *  Kernel Direct Jump Probe (Djprobe)
+ *  kernel/djprobes.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+#include <linux/djprobe.h>
+#include <linux/hash.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/moduleloader.h>
+#include <asm-generic/sections.h>
+#include <asm/cacheflush.h>
+#include <asm/errno.h>
+
+#include <linux/cpu.h>
+#include <linux/percpu.h>
+#include <asm/semaphore.h>
+
+/*
+ * Djprobe does not refer to the instance list when a probe function is called.
+ * The list is only used when registering and unregistering a djprobe.
+ */
+#define DJPROBE_BLOCK_BITS 6
+#define DJPROBE_BLOCK_SIZE (1 << DJPROBE_BLOCK_BITS)
+#define DJPROBE_HASH_BITS 8
+#define DJPROBE_TABLE_SIZE (1 << DJPROBE_HASH_BITS)
+#define DJPROBE_TABLE_MASK (DJPROBE_TABLE_SIZE - 1)
+
+/* djprobe instance hash table */
+static struct hlist_head djprobe_inst_table[DJPROBE_TABLE_SIZE];
+
+#define hash_djprobe(key) \
+ (((unsigned long)(key) >> DJPROBE_BLOCK_BITS) & DJPROBE_TABLE_MASK)
+
+static DECLARE_MUTEX(djprobe_mutex);
+static DEFINE_PER_CPU(struct work_struct, djprobe_works);
+static DECLARE_WAIT_QUEUE_HEAD(djprobe_wqh);
+static atomic_t djprobe_count = ATOMIC_INIT(0);
+
+/* Instruction pages for djprobe's stub code */
+static struct kprobe_insn_page_list djprobe_insn_pages = {
+ HLIST_HEAD_INIT, 0
+};
+
+static inline void __free_djprobe_instance(struct djprobe_instance *djpi)
+{
+ hlist_del(&djpi->hlist);
+ if (djpi->kp.addr) {
+ unregister_kprobe(&(djpi->kp));
+ }
+ if (djpi->stub.insn)
+ __free_insn_slot(&djprobe_insn_pages, djpi->stub.insn);
+ kfree(djpi);
+}
+
+static inline
+    struct djprobe_instance *__create_djprobe_instance(struct djprobe *djp,
+       void *addr, int size)
+{
+ struct djprobe_instance *djpi;
+ /* allocate a new instance */
+ djpi = kcalloc(1, sizeof(struct djprobe_instance), GFP_ATOMIC);
+ if (djpi == NULL) {
+ goto out;
+ }
+ /* allocate stub */
+ djpi->stub.insn = __get_insn_slot(&djprobe_insn_pages);
+ if (djpi->stub.insn == NULL) {
+ __free_djprobe_instance(djpi);
+ djpi = NULL;
+ goto out;
+ }
+
+ /* attach */
+ djp->inst = djpi;
+ INIT_LIST_HEAD(&djpi->plist);
+ list_add_rcu(&djp->plist, &djpi->plist);
+ djpi->kp.addr = addr;
+ djpi->kp.pre_handler = djprobe_pre_handler;
+ arch_prepare_djprobe_instance(djpi, size);
+
+ INIT_HLIST_NODE(&djpi->hlist);
+ hlist_add_head(&djpi->hlist, &djprobe_inst_table[hash_djprobe(addr)]);
+      out:
+ return djpi;
+}
+
+static struct djprobe_instance *__kprobes __get_djprobe_instance(void *addr,
+ int size)
+{
+ struct djprobe_instance *djpi;
+ struct hlist_node *node;
+ unsigned long idx, eidx;
+
+ idx = hash_djprobe(addr - ARCH_STUB_INSN_MAX);
+ eidx = ((hash_djprobe(addr + size) + 1) & DJPROBE_TABLE_MASK);
+ do {
+ hlist_for_each_entry(djpi, node, &djprobe_inst_table[idx],
+     hlist) {
+ if (((long)addr <
+     (long)djpi->kp.addr + DJPI_ARCH_SIZE(djpi))
+    && ((long)djpi->kp.addr < (long)addr + size)) {
+ return djpi;
+ }
+ }
+ idx = ((idx + 1) & DJPROBE_TABLE_MASK);
+ }while (idx != eidx);
+
+ return NULL;
+}
+
+struct djprobe_instance *__kprobes get_djprobe_instance(void *addr, int size)
+{
+ struct djprobe_instance *djpi;
+ down(&djprobe_mutex);
+ djpi = __get_djprobe_instance(addr, size);
+ up(&djprobe_mutex);
+ return djpi;
+}
+
+/* This work function is invoked while djprobe_mutex is locked. */
+static void __kprobes __work_check_safety(void *data)
+{
+ if (atomic_dec_and_test(&djprobe_count)) {
+ wake_up_all(&djprobe_wqh);
+ }
+}
+
+static void __kprobes __check_safety(void)
+{
+ int cpu;
+ struct work_struct *wk;
+ lock_cpu_hotplug();
+ atomic_set(&djprobe_count, num_online_cpus() - 1);
+ for_each_online_cpu(cpu) {
+ if (cpu == smp_processor_id())
+ continue;
+ wk = &per_cpu(djprobe_works, cpu);
+ INIT_WORK(wk, __work_check_safety, NULL);
+ schedule_delayed_work_on(cpu, wk, 0);
+ }
+ wait_event(djprobe_wqh, (atomic_read(&djprobe_count) == 0));
+ unlock_cpu_hotplug();
+}
+
+int __kprobes register_djprobe(struct djprobe *djp, void *addr, int size)
+{
+ struct djprobe_instance *djpi;
+ struct kprobe *kp;
+ int ret = 0, i;
+
+ BUG_ON(in_interrupt());
+
+ if (size > ARCH_STUB_INSN_MAX || size < ARCH_STUB_INSN_MIN)
+ return -EINVAL;
+
+ if ((ret = in_kprobes_functions((unsigned long)addr)) != 0)
+ return ret;
+
+ down(&djprobe_mutex);
+ INIT_LIST_HEAD(&djp->plist);
+ /* check for conflicts with other djprobes */
+ djpi = __get_djprobe_instance(addr, size);
+ if (djpi) {
+ if (djpi->kp.addr == addr) {
+ djp->inst = djpi; /* add to another instance */
+ list_add_rcu(&djp->plist, &djpi->plist);
+ } else {
+ ret = -EEXIST; /* other djprobes were inserted */
+ }
+ goto out;
+ }
+ djpi = __create_djprobe_instance(djp, addr, size);
+ if (djpi == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ /* check for conflicts with kprobes */
+ for (i = 0; i < size; i++) {
+ kp = get_kprobe((void *)((long)addr + i));
+ if (kp != NULL) {
+ ret = -EEXIST; /* a kprobe was inserted */
+ goto fail;
+ }
+ }
+ ret = register_kprobe(&djpi->kp);
+ if (ret < 0) {
+       fail:
+ djpi->kp.addr = NULL;
+ djp->inst = NULL;
+ list_del_rcu(&djp->plist);
+ __free_djprobe_instance(djpi);
+ } else {
+ __check_safety();
+ arch_install_djprobe_instance(djpi);
+ }
+       out:
+ up(&djprobe_mutex);
+ return ret;
+}
+
+void __kprobes unregister_djprobe(struct djprobe *djp)
+{
+ struct djprobe_instance *djpi;
+
+ BUG_ON(in_interrupt());
+
+ down(&djprobe_mutex);
+ djpi = djp->inst;
+ if (djp->plist.next == djp->plist.prev) {
+ arch_uninstall_djprobe_instance(djpi); /* this requires irq enabled */
+ list_del_rcu(&djp->plist);
+ djp->inst = NULL;
+ __check_safety();
+ __free_djprobe_instance(djpi);
+ } else {
+ list_del_rcu(&djp->plist);
+ djp->inst = NULL;
+ }
+ up(&djprobe_mutex);
+}
+
+static int __init init_djprobe(void)
+{
+ djprobe_insn_pages.insn_size = ARCH_STUB_SIZE;
+ return 0;
+}
+
+__initcall(init_djprobe);
+
+EXPORT_SYMBOL_GPL(register_djprobe);
+EXPORT_SYMBOL_GPL(unregister_djprobe);
diff -Narup linux-2.6.14-mm1.djp.1/kernel/kprobes.c linux-2.6.14-mm1.djp.2/kernel/kprobes.c
--- linux-2.6.14-mm1.djp.1/kernel/kprobes.c 2005-11-08 11:52:46.000000000 +0900
+++ linux-2.6.14-mm1.djp.2/kernel/kprobes.c 2005-11-08 11:56:43.000000000 +0900
@@ -37,6 +37,7 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/moduleloader.h>
+#include <linux/djprobe.h>
 #include <asm-generic/sections.h>
 #include <asm/cacheflush.h>
 #include <asm/errno.h>
@@ -467,7 +468,7 @@ static inline void cleanup_aggr_kprobe(s
  spin_unlock_irqrestore(&kprobe_lock, flags);
 }

-static int __kprobes in_kprobes_functions(unsigned long addr)
+int __kprobes in_kprobes_functions(unsigned long addr)
 {
  if (addr >= (unsigned long)__kprobes_text_start
  && addr < (unsigned long)__kprobes_text_end)
@@ -483,6 +484,11 @@ int __kprobes register_kprobe(struct kpr

  if ((ret = in_kprobes_functions((unsigned long) p->addr)) != 0)
  return ret;
+#ifdef CONFIG_DJPROBE
+ if (p->pre_handler != djprobe_pre_handler &&
+    get_djprobe_instance(p->addr, 1) != NULL)
+ return -EEXIST;
+#endif /* CONFIG_DJPROBE */
  if ((ret = arch_prepare_kprobe(p)) != 0)
  goto rm_kprobe;


[Patch 3/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Masami Hiramatsu
In reply to this post by Masami Hiramatsu
Hi,

This patch contains the i386 architecture-dependent code of djprobe.
I removed djprobe_post_handler and djprobe_handler_lock.
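
For readers following the arch code: __set_jmp_op() in this patch stores a displacement computed as to - (from + 5) and then writes the opcode RELATIVEJUMP_INSTRUCTION (0xe9). The user-space sketch below illustrates only that arithmetic. The helper names and the use of plain 32-bit addresses are assumptions made for illustration, and the sketch does not model the write ordering or the CPU serialization that the patch performs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_OPCODE 0xe9 /* same value as RELATIVEJUMP_INSTRUCTION */
#define JMP_REL32_SIZE   5    /* 1 opcode byte + 4 displacement bytes */

/*
 * The displacement is relative to the address of the instruction that
 * follows the jmp, i.e. from + 5. Unsigned arithmetic wraps, which gives
 * the two's-complement encoding for backward jumps.
 */
static uint32_t jmp_rel32_disp(uint32_t from, uint32_t to)
{
	return to - (from + JMP_REL32_SIZE);
}

/* Encode "jmp to" as it would appear at address 'from' (hypothetical helper). */
static void encode_jmp_rel32(uint8_t *buf, uint32_t from, uint32_t to)
{
	uint32_t disp = jmp_rel32_disp(from, to);

	assert(buf != NULL);
	buf[0] = JMP_REL32_OPCODE;
	memcpy(&buf[1], &disp, sizeof(disp)); /* host byte order; little-endian on x86 */
}

/* Recover the jump target from an encoded instruction located at 'from'. */
static uint32_t decode_jmp_rel32(const uint8_t *buf, uint32_t from)
{
	uint32_t disp;

	assert(buf != NULL && buf[0] == JMP_REL32_OPCODE);
	memcpy(&disp, &buf[1], sizeof(disp));
	return from + JMP_REL32_SIZE + disp;
}
```

A jump to the very next instruction has displacement 0, and a jump back to its own first byte has displacement -5 (0xfffffffb), which is why ARCH_STUB_INSN_MIN is 5: the probed region must be able to hold the whole instruction.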

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

 arch/i386/Kconfig               |    8 ++
 arch/i386/kernel/Makefile       |    1
 arch/i386/kernel/djprobe.c      |  147 ++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/stub_djprobe.S |   77 ++++++++++++++++++++
 include/asm-i386/djprobe.h      |   51 +++++++++++++
 5 files changed, 284 insertions(+)
diff -Narup linux-2.6.14-mm1.djp.2/arch/i386/Kconfig linux-2.6.14-mm1.djp.3/arch/i386/Kconfig
--- linux-2.6.14-mm1.djp.2/arch/i386/Kconfig 2005-11-08 11:50:46.000000000 +0900
+++ linux-2.6.14-mm1.djp.3/arch/i386/Kconfig 2005-11-08 12:01:19.000000000 +0900
@@ -1010,6 +1010,14 @@ config KPROBES
   a probepoint and specifies the callback.  Kprobes is useful
   for kernel debugging, non-intrusive instrumentation and testing.
   If in doubt, say "N".
+
+config DJPROBE
+        bool "Direct Jump probe (EXPERIMENTAL)"
+ depends on KPROBES && !PREEMPT
+ help
+ Djprobe allows you to dynamically hook any kernel function
+ entry point and collect debugging or performance analysis
+ information non-disruptively.
 endmenu

 source "arch/i386/Kconfig.debug"
diff -Narup linux-2.6.14-mm1.djp.2/arch/i386/kernel/Makefile linux-2.6.14-mm1.djp.3/arch/i386/kernel/Makefile
--- linux-2.6.14-mm1.djp.2/arch/i386/kernel/Makefile 2005-11-08 11:50:46.000000000 +0900
+++ linux-2.6.14-mm1.djp.3/arch/i386/kernel/Makefile 2005-11-08 12:00:50.000000000 +0900
@@ -29,6 +29,7 @@ obj-$(CONFIG_KEXEC) += machine_kexec.o
 obj-$(CONFIG_X86_NUMAQ) += numaq.o
 obj-$(CONFIG_X86_SUMMIT_NUMA) += summit.o
 obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_DJPROBE) += stub_djprobe.o djprobe.o
 obj-$(CONFIG_MODULES) += module.o
 obj-y += sysenter.o vsyscall.o
 obj-$(CONFIG_ACPI_SRAT) += srat.o
diff -Narup linux-2.6.14-mm1.djp.2/arch/i386/kernel/djprobe.c linux-2.6.14-mm1.djp.3/arch/i386/kernel/djprobe.c
--- linux-2.6.14-mm1.djp.2/arch/i386/kernel/djprobe.c 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-mm1.djp.3/arch/i386/kernel/djprobe.c 2005-11-08 12:00:50.000000000 +0900
@@ -0,0 +1,147 @@
+/*
+ *  Kernel Direct Jump Probe (Djprobes)
+ *  arch/i386/kernel/djprobe.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+
+#include <linux/config.h>
+#include <linux/djprobe.h>
+#include <linux/ptrace.h>
+#include <linux/spinlock.h>
+#include <linux/preempt.h>
+#include <asm/cacheflush.h>
+#include <asm/kdebug.h>
+#include <asm/desc.h>
+#include <asm/processor.h>
+
+/*
+ * On Pentium-series processors, unsynchronized cross-modifying code
+ * operations can cause unexpected instruction execution results.
+ * So after modifying code, we should synchronize every processor.
+ */
+static void __local_serialize_cpu(void * info)
+{
+ serialize_cpu();
+}
+
+static inline void smp_serialize_cpus(void)
+{
+ on_each_cpu(__local_serialize_cpu, NULL, 1,1);
+}
+
+/* jmp code manipulators */
+struct __arch_jmp_op {
+ char op;
+ long raddr;
+} __attribute__((packed));
+/* insert jmp code */
+static inline void __set_jmp_op(void *from, void *to, int sync)
+{
+ struct __arch_jmp_op *jop;
+ jop = (struct __arch_jmp_op *)from;
+ jop->raddr=(long)(to) - ((long)(from) + 5);
+ mb();
+ if (sync) smp_serialize_cpus();
+ jop->op = RELATIVEJUMP_INSTRUCTION;
+}
+/* switch back to the kprobe */
+static inline void __set_breakpoint_op(void *dest, void *orig)
+{
+ struct __arch_jmp_op *jop = (struct __arch_jmp_op *)dest,
+ *jop2 = (struct __arch_jmp_op *)orig;
+
+ jop->op = BREAKPOINT_INSTRUCTION;
+ jop->raddr = jop2->raddr;
+ mb();
+ smp_serialize_cpus();
+}
+
+/* djprobe call back function: called from stub code */
+static void asmlinkage djprobe_callback(struct djprobe_instance * djpi,
+ struct pt_regs *regs)
+{
+ struct djprobe *djp;
+ rcu_read_lock();
+ list_for_each_entry_rcu(djp, &djpi->plist, plist) {
+ if (djp->handler)
+ djp->handler(djp, regs);
+ }
+ rcu_read_unlock();
+}
+
+/*
+ * Copy post processing instructions
+ * Target instructions MUST be relocatable.
+ */
+int __kprobes arch_prepare_djprobe_instance(struct djprobe_instance *djpi,
+  unsigned long size)
+{
+ kprobe_opcode_t *stub;
+ stub = djpi->stub.insn;
+ djpi->stub.size = size;
+
+ /* copy arch-dep-instance from template */
+ memcpy((void*)stub, (void*)&arch_tmpl_stub_entry, ARCH_STUB_SIZE);
+
+ /* set probe information */
+ *((long*)(stub + ARCH_STUB_VAL_IDX)) = (long)djpi;
+ /* set probe function */
+ *((long*)(stub + ARCH_STUB_CALL_IDX)) = (long)djprobe_callback;
+
+ /* copy instructions into the middle of djprobe instance */
+ memcpy((void*)(stub + ARCH_STUB_INST_IDX),
+       (void*)djpi->kp.addr, size);
+
+ /* set returning jmp instruction at the tail of djprobe instance */
+ __set_jmp_op(stub + ARCH_STUB_INST_IDX + size,
+     (void*)((long)djpi->kp.addr + size), 0);
+
+ return 0;
+}
+
+/* Insert "jmp" instruction into the probing point. */
+void __kprobes arch_install_djprobe_instance(struct djprobe_instance *djpi)
+{
+ __set_jmp_op((void*)djpi->kp.addr, (void*)djpi->stub.insn, 1);
+}
+
+/* Write back original instructions & kprobe */
+void __kprobes arch_uninstall_djprobe_instance(struct djprobe_instance *djpi)
+{
+ __set_breakpoint_op((void*)djpi->kp.addr,
+    (void*)&djpi->stub.insn[ARCH_STUB_INST_IDX]);
+}
+
+/* djprobe handler : switch to a bypass code */
+int __kprobes djprobe_pre_handler(struct kprobe * kp, struct pt_regs * regs)
+{
+ struct djprobe_instance *djpi =
+ container_of(kp,struct djprobe_instance, kp);
+
+ if (!DJPI_EMPTY(djpi)) {
+ regs->eip = (unsigned long)djpi->stub.insn;
+ reset_current_kprobe();
+ preempt_enable_no_resched();
+ return 1; /* already prepared */
+ }
+ return 0;
+}
diff -Narup linux-2.6.14-mm1.djp.2/arch/i386/kernel/stub_djprobe.S linux-2.6.14-mm1.djp.3/arch/i386/kernel/stub_djprobe.S
--- linux-2.6.14-mm1.djp.2/arch/i386/kernel/stub_djprobe.S 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-mm1.djp.3/arch/i386/kernel/stub_djprobe.S 2005-11-08 12:00:50.000000000 +0900
@@ -0,0 +1,77 @@
+/*
+ *  linux/arch/i386/kernel/stub_djprobe.S
+ *
+ *  Copyright (C) HITACHI,LTD. 2005
+ *  Created by Masami Hiramatsu <[hidden email]>
+ */
+
+#include <linux/config.h>
+
+# jmp into this function from other functions.
+.global arch_tmpl_stub_entry
+arch_tmpl_stub_entry:
+ nop
+ subl $8, %esp # reserve pt_regs slots for esp and xss
+ pushf
+ subl $20, %esp # skip xcs, eip, orig_eax, xes and xds
+ pushl %eax
+ pushl %ebp
+ pushl %edi
+ pushl %esi
+ pushl %edx
+ pushl %ecx
+ pushl %ebx
+
+ movl %esp, %eax
+ pushl %eax
+ addl $60, %eax
+ movl %eax, 56(%esp)
+.global arch_tmpl_stub_val
+arch_tmpl_stub_val:
+ movl $0xffffffff, %eax
+ pushl %eax
+.global arch_tmpl_stub_call
+arch_tmpl_stub_call:
+ movl $0xffffffff, %eax
+ call *%eax
+ addl $8, %esp
+
+ popl %ebx
+ popl %ecx
+ popl %edx
+ popl %esi
+ popl %edi
+ popl %ebp
+ popl %eax
+ addl $20, %esp
+ popf
+ addl $8, %esp
+.global arch_tmpl_stub_inst
+arch_tmpl_stub_inst:
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+.global arch_tmpl_stub_end
+arch_tmpl_stub_end:
diff -Narup linux-2.6.14-mm1.djp.2/include/asm-i386/djprobe.h linux-2.6.14-mm1.djp.3/include/asm-i386/djprobe.h
--- linux-2.6.14-mm1.djp.2/include/asm-i386/djprobe.h 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.14-mm1.djp.3/include/asm-i386/djprobe.h 2005-11-08 12:00:50.000000000 +0900
@@ -0,0 +1,51 @@
+#ifndef _ASM_DJPROBE_H
+#define _ASM_DJPROBE_H
+/*
+ *  Kernel Direct Jump Probe (Djprobe)
+ *  include/asm-i386/djprobe.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) Hitachi, Ltd. 2005
+ *
+ * 2005-Aug Created by Masami HIRAMATSU <[hidden email]>
+ * Initial implementation of Direct jump probe (djprobe)
+ *              to reduce overhead.
+ */
+
+#define RELATIVEJUMP_INSTRUCTION 0xe9
+
+/* stub template code */
+extern kprobe_opcode_t arch_tmpl_stub_entry;
+extern kprobe_opcode_t arch_tmpl_stub_val;
+extern kprobe_opcode_t arch_tmpl_stub_call;
+extern kprobe_opcode_t arch_tmpl_stub_inst;
+extern kprobe_opcode_t arch_tmpl_stub_end;
+
+#define ARCH_STUB_VAL_IDX ((long)&arch_tmpl_stub_val - (long)&arch_tmpl_stub_entry + 1)
+#define ARCH_STUB_CALL_IDX ((long)&arch_tmpl_stub_call - (long)&arch_tmpl_stub_entry + 1)
+#define ARCH_STUB_INST_IDX ((long)&arch_tmpl_stub_inst - (long)&arch_tmpl_stub_entry)
+#define ARCH_STUB_SIZE ((long)&arch_tmpl_stub_end - (long)&arch_tmpl_stub_entry)
+
+#define ARCH_STUB_INSN_MAX 20
+#define ARCH_STUB_INSN_MIN 5
+
+struct arch_djprobe_stub {
+ kprobe_opcode_t *insn;
+ int size;
+};
+#define DJPI_ARCH_SIZE(djpi) (djpi->stub.size)
+
+#endif /* _ASM_DJPROBE_H */

RE: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Zhang, Yanmin
In reply to this post by Masami Hiramatsu
>>-----Original Message-----
>>From: [hidden email] [mailto:[hidden email]]
>>On Behalf Of Masami Hiramatsu
>>Sent: 2005/11/8 21:26
>>To: [hidden email]
>>Cc: Satoshi Oshima; Yumiko Sugita; Hideo Aoki
>>Subject: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1
>>
>>Hi,
>>
>>This patch is the architecture independent part of djprobe.
>>+static inline
>>+    struct djprobe_instance *__create_djprobe_instance(struct djprobe *djp,
>>+       void *addr, int size)
>>+{
>>+ struct djprobe_instance *djpi;
>>+ /* allocate a new instance */
>>+ djpi = kcalloc(1, sizeof(struct djprobe_instance), GFP_ATOMIC);
>>+ if (djpi == NULL) {
>>+ goto out;
>>+ }
>>+ /* allocate stub */
>>+ djpi->stub.insn = __get_insn_slot(&djprobe_insn_pages);
>>+ if (djpi->stub.insn == NULL) {
[YM] If we get here, djpi->plist has not been initialized, so __free_djprobe_instance => hlist_del will cause a panic. How about moving the INIT_LIST_HEAD(&djpi->plist) to just after kcalloc?




>>+ __free_djprobe_instance(djpi);
>>+ djpi = NULL;
>>+ goto out;
>>+ }
>>+
>>+ /* attach */
>>+ djp->inst = djpi;
>>+ INIT_LIST_HEAD(&djpi->plist);
>>+ list_add_rcu(&djp->plist, &djpi->plist);
>>+ djpi->kp.addr = addr;
>>+ djpi->kp.pre_handler = djprobe_pre_handler;
>>+ arch_prepare_djprobe_instance(djpi, size);
>>+
>>+ INIT_HLIST_NODE(&djpi->hlist);
>>+ hlist_add_head(&djpi->hlist,
>>&djprobe_inst_table[hash_djprobe(addr)]);
>>+      out:
>>+ return djpi;
>>+}
>>+
>>+static struct djprobe_instance *__kprobes __get_djprobe_instance(void
>>*addr,
>>+ int size)
>>+{
>>+ struct djprobe_instance *djpi;
>>+ struct hlist_node *node;
>>+ unsigned long idx, eidx;
>>+
>>+ idx = hash_djprobe(addr - ARCH_STUB_INSN_MAX);
>>+ eidx = ((hash_djprobe(addr + size) + 1) & DJPROBE_TABLE_MASK);
>>+ do {
>>+ hlist_for_each_entry(djpi, node, &djprobe_inst_table[idx],
>>+     hlist) {
>>+ if (((long)addr <
>>+     (long)djpi->kp.addr + DJPI_ARCH_SIZE(djpi))
>>+    && ((long)djpi->kp.addr < (long)addr + size)) {
>>+ return djpi;
>>+ }
>>+ }
>>+ idx = ((idx + 1) & DJPROBE_TABLE_MASK);
>>+ }while (idx != eidx);
>>+
>>+ return NULL;
>>+}
>>+
>>+struct djprobe_instance *__kprobes get_djprobe_instance(void *addr, int
>>size)
>>+{
>>+ struct djprobe_instance *djpi;
>>+ down(&djprobe_mutex);
>>+ djpi = __get_djprobe_instance(addr, size);
>>+ up(&djprobe_mutex);
>>+ return djpi;
>>+}
>>+
>>+/* This work function invoked while djprobe_mutex is locked. */
>>+static void __kprobes __work_check_safety(void *data)
>>+{
>>+ if (atomic_dec_and_test(&djprobe_count)) {
>>+ wake_up_all(&djprobe_wqh);
>>+ }
>>+}
>>+
>>+static void __kprobes __check_safety(void)
>>+{
>>+ int cpu;
>>+ struct work_struct *wk;
>>+ lock_cpu_hotplug();
>>+ atomic_set(&djprobe_count, num_online_cpus() - 1);
>>+ for_each_online_cpu(cpu) {
>>+ if (cpu == smp_processor_id())
>>+ continue;
>>+ wk = &per_cpu(djprobe_works, cpu);
>>+ INIT_WORK(wk, __work_check_safety, NULL);
>>+ schedule_delayed_work_on(cpu, wk, 0);
>>+ }
>>+ wait_event(djprobe_wqh, (atomic_read(&djprobe_count) == 0));
>>+ unlock_cpu_hotplug();
>>+}
>>+
>>+int __kprobes register_djprobe(struct djprobe *djp, void *addr, int size)
>>+{
>>+ struct djprobe_instance *djpi;
>>+ struct kprobe *kp;
>>+ int ret = 0, i;
>>+
>>+ BUG_ON(in_interrupt());
>>+
>>+ if (size > ARCH_STUB_INSN_MAX || size < ARCH_STUB_INSN_MIN)
>>+ return -EINVAL;
>>+
>>+ if ((ret = in_kprobes_functions((unsigned long)addr)) != 0)
>>+ return ret;
>>+
>>+ down(&djprobe_mutex);
>>+ INIT_LIST_HEAD(&djp->plist);
>>+ /* check confliction with other djprobes */
>>+ djpi = __get_djprobe_instance(addr, size);
>>+ if (djpi) {
>>+ if (djpi->kp.addr == addr) {
>>+ djp->inst = djpi; /* add to another instance */
>>+ list_add_rcu(&djp->plist, &djpi->plist);
>>+ } else {
>>+ ret = -EEXIST; /* other djprobes were inserted */
>>+ }
>>+ goto out;
>>+ }
>>+ djpi = __create_djprobe_instance(djp, addr, size);
>>+ if (djpi == NULL) {
>>+ ret = -ENOMEM;
>>+ goto out;
>>+ }
>>+
>>+ /* check confliction with kprobes */
>>+ for (i = 0; i < size; i++) {
>>+ kp = get_kprobe((void *)((long)addr + i));
[YM] There is a race between get_kprobe and register_kprobe because kprobe_lock is not held. Could register_kprobe check whether the address is in the JTPR of a registered djprobe? I think djprobe and kprobe could share the same spin_lock, namely kprobe_lock.



>>+ if (kp != NULL) {
>>+ ret = -EEXIST; /* a kprobes were inserted */
>>+ goto fail;
>>+ }
>>+ }
>>+ ret = register_kprobe(&djpi->kp);
>>+ if (ret < 0) {
>>+       fail:
>>+ djpi->kp.addr = NULL;
>>+ djp->inst = NULL;
>>+ list_del_rcu(&djp->plist);
>>+ __free_djprobe_instance(djpi);
>>+ } else {
>>+ __check_safety();
>>+ arch_install_djprobe_instance(djpi);
>>+ }
>>+       out:
>>+ up(&djprobe_mutex);
>>+ return ret;

Re: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Masami Hiramatsu
Hi,

Thank you for your review!

Zhang, Yanmin wrote:

>>>-----Original Message-----
>>>From: [hidden email] [mailto:[hidden email]]
>>>On Behalf Of Masami Hiramatsu
>>>Sent: 2005/11/8 21:26
>>>To: [hidden email]
>>>Cc: Satoshi Oshima; Yumiko Sugita; Hideo Aoki
>>>Subject: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1
>>>
>>>Hi,
>>>
>>>This patch is the architecture independent part of djprobe.
>>>+static inline
>>>+    struct djprobe_instance *__create_djprobe_instance(struct djprobe *djp,
>>>+       void *addr, int size)
>>>+{
>>>+ struct djprobe_instance *djpi;
>>>+ /* allocate a new instance */
>>>+ djpi = kcalloc(1, sizeof(struct djprobe_instance), GFP_ATOMIC);
>>>+ if (djpi == NULL) {
>>>+ goto out;
>>>+ }
>>>+ /* allocate stub */
>>>+ djpi->stub.insn = __get_insn_slot(&djprobe_insn_pages);
>>>+ if (djpi->stub.insn == NULL) {
>
> [YM] If coming here, djpi->plist is not initiated.
> So __free_djprobe_instance=>hlist_del will cause panic.
> How about to move the INIT_LIST_HEAD(&djpi->plist) just after kcalloc?

Thanks for finding that. I will fix it that way.
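
To illustrate the failure mode discussed above: __free_djprobe_instance() calls hlist_del(&djpi->hlist), and on this error path the node has never been added to a hash list, so its pprev pointer is still the zero left by kcalloc. The user-space sketch below uses simplified copies of the hlist structures and a hypothetical helper, hlist_del_if_hashed(), to show one way a cleanup path can tolerate a never-hashed node (similar in spirit to the kernel's hlist_del_init()). It is an illustration under those assumptions, not the fix that was applied to the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space copies of the kernel's hlist types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* A node that was never added (or was zero-filled) has pprev == NULL. */
static int hlist_unhashed(const struct hlist_node *n)
{
	return n->pprev == NULL;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Hypothetical helper: unlink the node only if it is on a list.
 * A plain unlink would dereference n->pprev, which is NULL for a
 * zero-filled node, and crash.
 */
static void hlist_del_if_hashed(struct hlist_node *n)
{
	assert(n != NULL);
	if (hlist_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}
```

With a guard like this, the error path can free a half-constructed instance regardless of how far construction got.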

>>>+int __kprobes register_djprobe(struct djprobe *djp, void *addr, int size)
>>>+{
>>>+ struct djprobe_instance *djpi;
>>>+ struct kprobe *kp;
>>>+ int ret = 0, i;
>>>+
>>>+ BUG_ON(in_interrupt());
>>>+
>>>+ if (size > ARCH_STUB_INSN_MAX || size < ARCH_STUB_INSN_MIN)
>>>+ return -EINVAL;
>>>+
>>>+ if ((ret = in_kprobes_functions((unsigned long)addr)) != 0)
>>>+ return ret;
>>>+
>>>+ down(&djprobe_mutex);
>>>+ INIT_LIST_HEAD(&djp->plist);
>>>+ /* check confliction with other djprobes */
>>>+ djpi = __get_djprobe_instance(addr, size);
>>>+ if (djpi) {
>>>+ if (djpi->kp.addr == addr) {
>>>+ djp->inst = djpi; /* add to another instance */
>>>+ list_add_rcu(&djp->plist, &djpi->plist);
>>>+ } else {
>>>+ ret = -EEXIST; /* other djprobes were inserted */
>>>+ }
>>>+ goto out;
>>>+ }
>>>+ djpi = __create_djprobe_instance(djp, addr, size);
>>>+ if (djpi == NULL) {
>>>+ ret = -ENOMEM;
>>>+ goto out;
>>>+ }
>>>+
>>>+ /* check confliction with kprobes */
>>>+ for (i = 0; i < size; i++) {
>>>+ kp = get_kprobe((void *)((long)addr + i));
>
> [YM] There is a race between get_kprobe and register_kprobe without
> locking kprobe_lock. Could register_kprobe to check if the address is
> in a JTPR of registered djprobe? I think djprobe and kprobe could
> share the same spin_lock, namely kprobe_lock.

Hmm, but __check_safety() may sleep, so holding a spinlock there would cause a deadlock.
I think the race condition can be avoided with the following two changes.

1) Delay the conflict check, as below.

       /* First, register as a kprobe. If there is a competitor,
          this waits until it has registered. */
        ret = register_kprobe(&djpi->kp);
        if (ret < 0) {
       fail:
                djpi->kp.addr = NULL;
                djp->inst = NULL;
                list_del_rcu(&djp->plist);
                __free_djprobe_instance(djpi);
        } else {
                /* next, check for conflicts with kprobes */
                for (i = 0; i < size; i++) {
                        kp = get_kprobe((void *)((long)addr + i));
                        if (kp != NULL && kp != &djpi->kp) {
                                ret = -EEXIST;  /* other kprobes were inserted */
                                goto fail;
                        }
                }
                __check_safety();
                arch_install_djprobe_instance(djpi);
        }


2) Share the djprobe mutex with kprobes, as below.

int __kprobes register_kprobe(struct kprobe *p)
{
        int ret = 0;
        unsigned long flags = 0;
        struct kprobe *old_p;

        if ((ret = in_kprobes_functions((unsigned long) p->addr)) != 0)
                return ret;
#ifdef CONFIG_DJPROBE
        down(&djprobe_mutex);
        if (p->pre_handler != djprobe_pre_handler &&
            get_djprobe_instance(p->addr, 1) != NULL)
                return -EEXIST;
#endif /* CONFIG_DJPROBE */
        if ((ret = arch_prepare_kprobe(p)) != 0)
                goto rm_kprobe;

        p->nmissed = 0;
        spin_lock_irqsave(&kprobe_lock, flags);
        old_p = get_kprobe(p->addr);
        if (old_p) {
                ret = register_aggr_kprobe(old_p, p);
                goto out;
        }

        arch_copy_kprobe(p);
        INIT_HLIST_NODE(&p->hlist);
        hlist_add_head_rcu(&p->hlist,
                       &kprobe_table[hash_ptr(p->addr, KPROBE_HASH_BITS)]);

        arch_arm_kprobe(p);

out:
        spin_unlock_irqrestore(&kprobe_lock, flags);
rm_kprobe:
#ifdef CONFIG_DJPROBE
        up(&djprobe_mutex);
#endif /* CONFIG_DJPROBE */
        if (ret == -EEXIST)
                arch_remove_kprobe(p);
        return ret;
}

What do you think about this?

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]


RE: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Zhang, Yanmin
In reply to this post by Masami Hiramatsu
>>-----Original Message-----
>>From: Masami Hiramatsu [mailto:[hidden email]]
>>Sent: 2005/11/12 3:20
>>To: Zhang, Yanmin
>>Cc: [hidden email]; Satoshi Oshima; Yumiko Sugita; Hideo Aoki;
>>Keshavamurthy, Anil S
>>Subject: Re: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1
>>
>>Hi,
>>
>>Thank you for your review!
>>
>>Zhang, Yanmin wrote:
>>>>>-----Original Message-----
>>>>>From: [hidden email]
>>[mailto:[hidden email]]
>>>>>On Behalf Of Masami Hiramatsu
>>>>>Sent: 2005/11/8 21:26
>>>>>To: [hidden email]
>>>>>Cc: Satoshi Oshima; Yumiko Sugita; Hideo Aoki
>>>>>Subject: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1
>>>>>
>>>>>Hi,
>>>>>
>>>>>This patch is the architecture independent part of djprobe.
>>>>>+static inline
>>>>>+    struct djprobe_instance *__create_djprobe_instance(struct djprobe
>>*djp,
>>>>>+       void *addr, int size)
>>>>>+{
>>>>>+ struct djprobe_instance *djpi;
>>>>>+ /* allocate a new instance */
>>>>>+ djpi = kcalloc(1, sizeof(struct djprobe_instance), GFP_ATOMIC);
>>>>>+ if (djpi == NULL) {
>>>>>+ goto out;
>>>>>+ }
>>>>>+ /* allocate stub */
>>>>>+ djpi->stub.insn = __get_insn_slot(&djprobe_insn_pages);
>>>>>+ if (djpi->stub.insn == NULL) {
>>>
>>> [YM] If coming here, djpi->plist is not initiated.
>>> So __free_djprobe_instance=>hlist_del will cause panic.
>>> How about to move the INIT_LIST_HEAD(&djpi->plist) just after kcalloc?
>>
>>Thanks for finding that. I will fix it so.
>>
>>>>>+int __kprobes register_djprobe(struct djprobe *djp, void *addr, int size)
>>>>>+{
>>>>>+ struct djprobe_instance *djpi;
>>>>>+ struct kprobe *kp;
>>>>>+ int ret = 0, i;
>>>>>+
>>>>>+ BUG_ON(in_interrupt());
>>>>>+
>>>>>+ if (size > ARCH_STUB_INSN_MAX || size < ARCH_STUB_INSN_MIN)
>>>>>+ return -EINVAL;
>>>>>+
>>>>>+ if ((ret = in_kprobes_functions((unsigned long)addr)) != 0)
>>>>>+ return ret;
>>>>>+
>>>>>+ down(&djprobe_mutex);
>>>>>+ INIT_LIST_HEAD(&djp->plist);
>>>>>+ /* check confliction with other djprobes */
>>>>>+ djpi = __get_djprobe_instance(addr, size);
>>>>>+ if (djpi) {
>>>>>+ if (djpi->kp.addr == addr) {
>>>>>+ djp->inst = djpi; /* add to another instance */
>>>>>+ list_add_rcu(&djp->plist, &djpi->plist);
>>>>>+ } else {
>>>>>+ ret = -EEXIST; /* other djprobes were inserted */
>>>>>+ }
>>>>>+ goto out;
>>>>>+ }
>>>>>+ djpi = __create_djprobe_instance(djp, addr, size);
>>>>>+ if (djpi == NULL) {
>>>>>+ ret = -ENOMEM;
>>>>>+ goto out;
>>>>>+ }
>>>>>+
>>>>>+ /* check confliction with kprobes */
>>>>>+ for (i = 0; i < size; i++) {
>>>>>+ kp = get_kprobe((void *)((long)addr + i));
>>>
>>> [YM] There is a race between get_kprobe and register_kprobe without
>>> locking kprobe_lock. Could register_kprobe to check if the address is
>>> in a JTPR of registered djprobe? I think djprobe and kprobe could
>>> share the same spin_lock, namely kprobe_lock.
>>
>>hmm, but __check_safety() may sleep. So spin-lock will cause dead-lock.
>>I think it can avoid race condition by following two changes.
>>
>>1) delay checking confliction like below.
>>
>>       /* first, register as a kprobe.
>> if there is another competitor, this waits until it registered */
>>        ret = register_kprobe(&djpi->kp);
>>        if (ret < 0) {
>>       fail:
>>                djpi->kp.addr = NULL;
>>                djp->inst = NULL;
>>                list_del_rcu(&djp->plist);
>>                __free_djprobe_instance(djpi);
>>        } else {
>>                /* next, check confliction with kprobes */
>>                for (i = 0; i < size; i++) {
>>                        kp = get_kprobe((void *)((long)addr + i));
>>                        if (kp != NULL && kp != &djpi->kp) {
>>                                ret = -EEXIST;  /* other kprobes were
>>inserted */
>>                                goto fail;
>>                        }
>>                }
>>                __check_safety();
>>                arch_install_djprobe_instance(djpi);
>>        }
>>
>>
>>2) share the mutex of djprobe with kprobes like below.
>>
>>int __kprobes register_kprobe(struct kprobe *p)
>>{
>>        int ret = 0;
>>        unsigned long flags = 0;
>>        struct kprobe *old_p;
>>
>>        if ((ret = in_kprobes_functions((unsigned long) p->addr)) != 0)
>>                return ret;
>>#ifdef CONFIG_DJPROBE
>>        down(&djprobe_mutex);
>>        if (p->pre_handler != djprobe_pre_handler &&
>>            get_djprobe_instance(p->addr, 1) != NULL)
>>                return -EEXIST;
>>#endif /* CONFIG_DJPROBE */
>>        if ((ret = arch_prepare_kprobe(p)) != 0)
>>                goto rm_kprobe;
>>
>>        p->nmissed = 0;
>>        spin_lock_irqsave(&kprobe_lock, flags);
>>        old_p = get_kprobe(p->addr);
>>        if (old_p) {
>>                ret = register_aggr_kprobe(old_p, p);
>>                goto out;
>>        }
>>
>>        arch_copy_kprobe(p);
>>        INIT_HLIST_NODE(&p->hlist);
>>        hlist_add_head_rcu(&p->hlist,
>>                       &kprobe_table[hash_ptr(p->addr,
>>KPROBE_HASH_BITS)]);
>>
>>        arch_arm_kprobe(p);
>>
>>out:
>>        spin_unlock_irqrestore(&kprobe_lock, flags);
>>rm_kprobe:
>>#ifdef CONFIG_DJPROBE
>>        up(&djprobe_mutex);
>>#endif /* CONFIG_DJPROBE */
>>        if (ret == -EEXIST)
>>                arch_remove_kprobe(p);
>>        return ret;
>>}
[YM] That's reasonable. In function register_kprobe:
1) get_djprobe_instance should be __get_djprobe_instance if djprobe_mutex is held.
2) Release djprobe_mutex before "return -EEXIST".
3) The size parameter passed to get_djprobe_instance is always 1 here. How about changing it to ARCH_STUB_INSN_MAX?
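
As background for point 3, the conflict test in __get_djprobe_instance() is a half-open interval overlap check between the queried range and each instance's range. The sketch below restates that predicate in user space; the function name ranges_overlap() and the example addresses are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when [a, a + a_len) and [b, b + b_len) share at least one byte.
 * This mirrors the two comparisons used in __get_djprobe_instance().
 */
static bool ranges_overlap(uintptr_t a, unsigned int a_len,
			   uintptr_t b, unsigned int b_len)
{
	assert(a_len > 0 && b_len > 0);
	return a < b + b_len && b < a + a_len;
}
```

With a query size of 1, the check matches only when the queried address itself lies inside an existing djprobe range. With a larger query size it also matches ranges that begin within size - 1 bytes after the queried address, which is the difference the suggested change would make.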

One more comment on your 3rd patch: how about changing
+#define ARCH_STUB_SIZE ((long)&arch_tmpl_stub_end - (long)&arch_tmpl_stub_entry)
to
+#define ARCH_STUB_SIZE (((long)&arch_tmpl_stub_end - (long)&arch_tmpl_stub_entry)/sizeof(kprobe_opcode_t))

On ia32, sizeof(kprobe_opcode_t) is equal to 1, but on other platforms it might not be. This change just makes the intent clearer.
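
The distinction can be seen in a small user-space sketch. Here wide_opcode_t is a hypothetical 4-byte opcode type standing in for kprobe_opcode_t on some other platform, and the helper names are likewise assumptions; the point is only that a byte difference and an element count diverge once the opcode type is wider than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t wide_opcode_t; /* hypothetical 4-byte opcode type */

/* Distance between two addresses in bytes, as the original macro computes. */
static size_t span_bytes(const void *start, const void *end)
{
	return (size_t)((const char *)end - (const char *)start);
}

/* Distance in opcode elements, as the suggested macro computes. */
static size_t span_opcodes(const wide_opcode_t *start, const wide_opcode_t *end)
{
	assert(end >= start);
	return span_bytes(start, end) / sizeof(wide_opcode_t);
}
```

For a stub of six such opcodes, span_bytes() yields 24 while span_opcodes() yields 6; with a 1-byte opcode type the two values coincide, which is why the difference is invisible on ia32.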


Re: [Patch 2/3][Djprobe] Djprobe update for linux-2.6.14-mm1

Masami Hiramatsu
Hi, Zhang

Sorry for the late reply.

Zhang, Yanmin wrote:
> [YM] It's reasonable. In function register_kprobe,

Thanks.

> 1) get_djprobe_instance should be __get_djprobe_instance if djprobe_mutex is used.

Exactly. I missed it.

> 2) Release djprobe_mutex before " return -EEXIST".

Thanks for finding that!
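
Points 1) and 2) together amount to a common locking pattern: call only the lock-free lookup while the mutex is held, and route every exit through a single unlock. The sketch below shows the pattern in user space with a toy lock that asserts on misuse. All names here (toy_mutex, toy_register, and so on) are hypothetical and stand in for djprobe_mutex and register_kprobe; this is not the code that was eventually posted.

```c
#include <assert.h>
#include <errno.h>

/* Toy non-recursive lock that asserts on misuse (user space only). */
struct toy_mutex { int held; };

static void toy_down(struct toy_mutex *m)
{
	assert(!m->held); /* taking it twice would self-deadlock */
	m->held = 1;
}

static void toy_up(struct toy_mutex *m)
{
	assert(m->held);
	m->held = 0;
}

/* Lookup variant that expects the caller to already hold the lock. */
static int toy_lookup_locked(const struct toy_mutex *m, int conflict)
{
	assert(m->held);
	return conflict;
}

static int toy_register(struct toy_mutex *m, int conflict)
{
	int ret = 0;

	toy_down(m);
	if (toy_lookup_locked(m, conflict)) {
		ret = -EEXIST;
		goto out; /* do not return with the lock still held */
	}
	/* ... the real registration work would go here ... */
out:
	toy_up(m);
	return ret;
}
```

Both the conflict path and the success path leave the lock released, so a later registration can proceed.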

> 3) Parameter size of call to get_djprobe_instance is always 1 here. How about to change it to ARCH_STUB_INSN_MAX?
> One more comment on your 3rd patch, how about to change:
> +#define ARCH_STUB_SIZE ((long)&arch_tmpl_stub_end - (long)&arch_tmpl_stub_entry)
> to
> +#define ARCH_STUB_SIZE (((long)&arch_tmpl_stub_end - (long)&arch_tmpl_stub_entry)/sizeof(kprobe_opcode_t))
>
> On ia32, sizeof(kprobe_opcode_t) is equal to 1, but on other platform, it might not be. Just to make it clearer.
>

OK, I will change it that way.

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]