[RFC][Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

[RFC][Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi,

I am publishing a couple of kprobe-booster patches in the
following mails.
 With the kprobe-booster patch, kprobes execute the copied
instruction directly and (if needed) jump back to the original code.
 This direct execution happens only when the kprobe has neither
a post_handler nor a break_handler, and the copied instruction
can be executed directly.

Which kinds of instructions can be executed directly, and which cannot?
- Call instructions are NG (no good). We must correct the return
  address pushed onto the top of the stack.
- Indirect instructions other than absolute indirect jumps
  are NG. Those instructions change EIP unpredictably. We must
  check EIP and correct it.
- Instructions that change EIP beyond the range of the
  instruction buffer are NG.
- Instructions that change EIP into the last 5 bytes of the
  instruction buffer (the size of a jump instruction) are NG.
  We must write a jump instruction there that returns to the
  original kernel code.
- The breakpoint instruction is NG. We must not touch EIP and
  must pass control to the other handlers.
- Absolute direct/indirect jumps are OK.
- Other instructions are OK, but they need jump-back code.

Kprobe-booster checks instructions in the resume_execution()
function. If an instruction can be executed directly, it sets the
"boostable" flag.
In kprobe_handler(), kprobes check the "boostable" flag.
If the flag is set, the current kprobe is reset and the
instruction buffer is executed directly instead of being single-stepped.

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

[RFC][Patch 1/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi,

This patch cleans up kprobes' resume_execution() for the i386 arch.
Before applying kprobe-booster, I'd like to clean up the code.

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

 kprobes.c |   24 +++++++++---------------
 1 files changed, 9 insertions(+), 15 deletions(-)

diff -Narup linux-2.6.14-mm1/arch/i386/kernel/kprobes.c linux-2.6.14-mm1.opt/arch/i386/kernel/kprobes.c
--- linux-2.6.14-mm1/arch/i386/kernel/kprobes.c 2005-11-08 16:44:21.000000000 +0900
+++ linux-2.6.14-mm1.opt/arch/i386/kernel/kprobes.c 2005-11-18 21:27:54.000000000 +0900
@@ -345,10 +345,10 @@ static void __kprobes resume_execution(s
  struct pt_regs *regs, struct kprobe_ctlblk *kcb)
 {
  unsigned long *tos = (unsigned long *)&regs->esp;
- unsigned long next_eip = 0;
  unsigned long copy_eip = (unsigned long)&p->ainsn.insn;
  unsigned long orig_eip = (unsigned long)p->addr;

+ regs->eflags &= ~TF_MASK;
  switch (p->ainsn.insn[0]) {
  case 0x9c: /* pushfl */
  *tos &= ~(TF_MASK | IF_MASK);
@@ -358,9 +358,9 @@ static void __kprobes resume_execution(s
  case 0xcb:
  case 0xc2:
  case 0xca:
- regs->eflags &= ~TF_MASK;
+ case 0xea: /* jmp absolute -- eip is correct */
  /* eip is already adjusted, no more changes required*/
- return;
+ goto no_change;
  case 0xe8: /* call relative - Fix return addr */
  *tos = orig_eip + (*tos - copy_eip);
  break;
@@ -368,27 +368,21 @@ static void __kprobes resume_execution(s
  if ((p->ainsn.insn[1] & 0x30) == 0x10) {
  /* call absolute, indirect */
  /* Fix return addr; eip is correct. */
- next_eip = regs->eip;
  *tos = orig_eip + (*tos - copy_eip);
+ goto no_change;
  } else if (((p->ainsn.insn[1] & 0x31) == 0x20) || /* jmp near, absolute indirect */
    ((p->ainsn.insn[1] & 0x31) == 0x21)) { /* jmp far, absolute indirect */
  /* eip is correct. */
- next_eip = regs->eip;
+ goto no_change;
  }
- break;
- case 0xea: /* jmp absolute -- eip is correct */
- next_eip = regs->eip;
- break;
  default:
  break;
  }

- regs->eflags &= ~TF_MASK;
- if (next_eip) {
- regs->eip = next_eip;
- } else {
- regs->eip = orig_eip + (regs->eip - copy_eip);
- }
+ regs->eip = orig_eip + (regs->eip - copy_eip);
+
+       no_change:
+ return ;
 }

 /*


[RFC][Patch 2/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi,

This patch adds the kprobe-booster function for the i386 arch.

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

 arch/i386/kernel/kprobes.c |   46 +++++++++++++++++++++++++++++++++++++++++----
 include/asm-i386/kprobes.h |    4 +++
 2 files changed, 46 insertions(+), 4 deletions(-)

diff -Narup linux-2.6.14-mm1.opt/arch/i386/kernel/kprobes.c linux-2.6.14-mm1.booster/arch/i386/kernel/kprobes.c
--- linux-2.6.14-mm1.opt/arch/i386/kernel/kprobes.c 2005-11-18 21:27:54.000000000 +0900
+++ linux-2.6.14-mm1.booster/arch/i386/kernel/kprobes.c 2005-11-18 21:26:07.000000000 +0900
@@ -41,6 +41,18 @@ void jprobe_return_end(void);
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
 DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);

+/* insert a jmp code */
+static inline void set_jmp_op(void *from, void *to)
+{
+ struct __arch_jmp_op {
+ char op;
+ long raddr;
+ } __attribute__((packed)) *jop;
+ jop = (struct __arch_jmp_op *)from;
+ jop->raddr = (long)(to) - ((long)(from) + 5);
+ jop->op = RELATIVEJUMP_INSTRUCTION;
+}
+
 /*
  * returns non-zero if opcode modifies the interrupt flag.
  */
@@ -65,6 +77,7 @@ void __kprobes arch_copy_kprobe(struct k
 {
  memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
  p->opcode = *p->addr;
+ p->ainsn.boostable = 0;
 }

 void __kprobes arch_arm_kprobe(struct kprobe *p)
@@ -235,6 +248,13 @@ static int __kprobes kprobe_handler(stru
  /* handler has already set things up, so skip ss setup */
  return 1;

+ if (p->ainsn.boostable && !p->post_handler && !p->break_handler ) {
+ /* Boost up -- we can execute copied instructions directly */
+ reset_current_kprobe();
+ regs->eip = (unsigned long)&p->ainsn.insn;
+ return 1;
+ }
+
 ss_probe:
  prepare_singlestep(p, regs);
  kcb->kprobe_status = KPROBE_HIT_SS;
@@ -340,6 +360,8 @@ int __kprobes trampoline_probe_handler(s
  * 2) If the single-stepped instruction was a call, the return address
  * that is atop the stack is the address following the copied instruction.
  * We need to make it the address following the original instruction.
+ *
+ * This function also prepares direct execution.
  */
 static void __kprobes resume_execution(struct kprobe *p,
  struct pt_regs *regs, struct kprobe_ctlblk *kcb)
@@ -367,22 +389,38 @@ static void __kprobes resume_execution(s
  case 0xff:
  if ((p->ainsn.insn[1] & 0x30) == 0x10) {
  /* call absolute, indirect */
- /* Fix return addr; eip is correct. */
+ /* Fix return addr; eip is correct
+   But this is not boostable */
  *tos = orig_eip + (*tos - copy_eip);
- goto no_change;
+ return ;
  } else if (((p->ainsn.insn[1] & 0x31) == 0x20) || /* jmp near, absolute indirect */
    ((p->ainsn.insn[1] & 0x31) == 0x21)) { /* jmp far, absolute indirect */
- /* eip is correct. */
+ /* eip is correct. And this is boostable */
  goto no_change;
  }
+ /* other indirect instructions can not be executed directly */
+ case BREAKPOINT_INSTRUCTION:
+ /* breakpoint instruction can not be executed directly */
+ break;
  default:
+ if (p->ainsn.boostable == 0 && regs->eip > copy_eip &&
+    (regs->eip - copy_eip) + 5 < MAX_INSN_SIZE) {
+ /* these instructions can be executed directly if it
+   jumps back to correct address. */
+ set_jmp_op((void *)regs->eip,
+   (void *)orig_eip + (regs->eip - copy_eip));
+ p->ainsn.boostable = 1;
+ }
  break;
  }

  regs->eip = orig_eip + (regs->eip - copy_eip);
+ return ;

        no_change:
- return ;
+ /* If eip is already adjusted, it can be executed directly. */
+ p->ainsn.boostable = 1;
+ return;
 }

 /*
diff -Narup linux-2.6.14-mm1.opt/include/asm-i386/kprobes.h linux-2.6.14-mm1.booster/include/asm-i386/kprobes.h
--- linux-2.6.14-mm1.opt/include/asm-i386/kprobes.h 2005-11-08 16:45:06.000000000 +0900
+++ linux-2.6.14-mm1.booster/include/asm-i386/kprobes.h 2005-11-17 02:48:49.000000000 +0900
@@ -31,6 +31,7 @@ struct pt_regs;

 typedef u8 kprobe_opcode_t;
 #define BREAKPOINT_INSTRUCTION 0xcc
+#define RELATIVEJUMP_INSTRUCTION 0xe9
 #define MAX_INSN_SIZE 16
 #define MAX_STACK_SIZE 64
 #define MIN_STACK_SIZE(ADDR) (((MAX_STACK_SIZE) < \
@@ -47,6 +48,9 @@ void kretprobe_trampoline(void);
 struct arch_specific_insn {
  /* copy of the original instruction */
  kprobe_opcode_t insn[MAX_INSN_SIZE];
+ /* If this flag is not 0, this kprobe can be boost when its
+   post_handler and break_handler is not set. */
+ int boostable;
 };

 struct prev_kprobe {


Re: [RFC][Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Frank Ch. Eigler
In reply to this post by Masami Hiramatsu

Masami Hiramatsu wrote:

> I publish a couple of patches of kprobe-booster in next mails.
> [...]

Can you describe what kinds of tests you ran on this exciting
optimization, and their results?

- FChE

[RFC][Patch 0/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
In reply to this post by Masami Hiramatsu
Hi,

I'm sorry, I found some mistakes in the previous patches.
I have fixed them and am resending the patches in the following mails.

Masami Hiramatsu wrote:

> What kinds of instructions can be executed directly or not?
> - Call instructions are NG. We should correct the return
>   address pushed into top of stack.
> - Indirect instructions except for absolute indirect-jumps
>   are NG. Those instructions changes EIP randomly. We should
>   check EIP and correct it.
> - Instructions that change EIP beyond the range of the
>   instruction buffer are NG.
> - Instructions that change EIP to tail 5 bytes of the
>   instruction buffer (it is the size of a jump instruction).
>   We must write a jump instruction which backs to original
>   kernel code in the instruction buffer.
> - Break point instruction is NG. We should not touch EIP and
>   pass to other handlers.
> - Absolute direct/indirect jumps are OK.

- Conditional jumps are NG.
- Halt and software interrupts are NG, because execution would stay
  in the kprobes instruction buffer.
- Prefixes are NG.
- Unknown/reserved opcodes are NG.
- Other 1-byte instructions are OK, but they need jump-back code.
- 2-byte instructions are mapped sparsely, so in this release
  this patch does not boost them.

From the IA-32 opcode map described in the IA-32 Intel
Architecture Software Developer's Manual Vol. 2B,
I determined that the following opcodes are not boostable.
- 0FH (2byte escape)
- 70H - 7FH (Jump on condition)
- 9AH (Call) and 9CH (Pushf)
- C0H-C1H (Grp 2: includes reserved opcode)
- C6H-C7H (Grp11: includes reserved opcode)
- CCH-CEH (Software-interrupt)
- D0H-D3H (Grp2: includes reserved opcode)
- D6H (Reserved)
- D8H-DFH (Coprocessor)
- E0H-E3H (loop/conditional jump)
- E8H (Call)
- F0H-F3H (Prefixes and reserved)
- F4H (Halt)
- F6H-F7H (Grp3: includes reserved opcode)
- FEH-FFH(Grp4,5: includes reserved opcode)

 Kprobe-booster checks whether the target instruction can
be boosted (executed directly) in the arch_copy_kprobe()
function. If the target instruction can be boosted, it
clears the "boostable" flag (sets it to 0). If not, it sets
the "boostable" flag to -1, which is the disabled status.
 In the resume_execution() function, if the "boostable" flag
is 0, kprobe-booster measures the size of the target
instruction and sets the "boostable" flag to 1.

In kprobe_handler(), kprobes check the "boostable" flag.
If the flag is 1, the current kprobe is reset and the
instruction buffer is executed directly instead of being single-stepped.

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

[RFC][Patch 1/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi,

This patch cleans up kprobes' resume_execution() for the i386 arch.
Before applying kprobe-booster, I'd like to clean up the code;
it simplifies the booster patch.

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Signed-off-by: Masami Hiramatsu <[hidden email]>

 kprobes.c |   26 ++++++++++----------------
 1 files changed, 10 insertions(+), 16 deletions(-)
diff -Narup a/arch/i386/kernel/kprobes.c b/arch/i386/kernel/kprobes.c
--- a/arch/i386/kernel/kprobes.c 2005-11-08 11:50:46.000000000 +0900
+++ b/arch/i386/kernel/kprobes.c 2005-11-28 11:43:10.000000000 +0900
@@ -345,10 +345,10 @@ static void __kprobes resume_execution(s
  struct pt_regs *regs, struct kprobe_ctlblk *kcb)
 {
  unsigned long *tos = (unsigned long *)&regs->esp;
- unsigned long next_eip = 0;
  unsigned long copy_eip = (unsigned long)&p->ainsn.insn;
  unsigned long orig_eip = (unsigned long)p->addr;

+ regs->eflags &= ~TF_MASK;
  switch (p->ainsn.insn[0]) {
  case 0x9c: /* pushfl */
  *tos &= ~(TF_MASK | IF_MASK);
@@ -358,9 +358,9 @@ static void __kprobes resume_execution(s
  case 0xcb:
  case 0xc2:
  case 0xca:
- regs->eflags &= ~TF_MASK;
- /* eip is already adjusted, no more changes required*/
- return;
+ case 0xea: /* jmp absolute -- eip is correct */
+ /* eip is already adjusted, no more changes required */
+ goto no_change;
  case 0xe8: /* call relative - Fix return addr */
  *tos = orig_eip + (*tos - copy_eip);
  break;
@@ -368,27 +368,21 @@ static void __kprobes resume_execution(s
  if ((p->ainsn.insn[1] & 0x30) == 0x10) {
  /* call absolute, indirect */
  /* Fix return addr; eip is correct. */
- next_eip = regs->eip;
  *tos = orig_eip + (*tos - copy_eip);
+ goto no_change;
  } else if (((p->ainsn.insn[1] & 0x31) == 0x20) || /* jmp near, absolute indirect */
    ((p->ainsn.insn[1] & 0x31) == 0x21)) { /* jmp far, absolute indirect */
  /* eip is correct. */
- next_eip = regs->eip;
+ goto no_change;
  }
- break;
- case 0xea: /* jmp absolute -- eip is correct */
- next_eip = regs->eip;
- break;
  default:
  break;
  }

- regs->eflags &= ~TF_MASK;
- if (next_eip) {
- regs->eip = next_eip;
- } else {
- regs->eip = orig_eip + (regs->eip - copy_eip);
- }
+ regs->eip = orig_eip + (regs->eip - copy_eip);
+
+       no_change:
+ return ;
 }

 /*


[RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi,

This patch adds the kprobe-booster function for the i386 arch.
I have fixed the mistakes in the previous patch.

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Signed-off-by: Masami Hiramatsu <[hidden email]>

 arch/i386/kernel/kprobes.c |   78 +++++++++++++++++++++++++++++++++++++++++++--
 include/asm-i386/kprobes.h |    4 ++
 2 files changed, 80 insertions(+), 2 deletions(-)
diff -Narup a/arch/i386/kernel/kprobes.c b/arch/i386/kernel/kprobes.c
--- a/arch/i386/kernel/kprobes.c 2005-11-28 11:43:10.000000000 +0900
+++ b/arch/i386/kernel/kprobes.c 2005-11-28 15:57:39.000000000 +0900
@@ -41,6 +41,49 @@ void jprobe_return_end(void);
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
 DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);

+/* insert a jmp code */
+static inline void set_jmp_op(void *from, void *to)
+{
+ struct __arch_jmp_op {
+ char op;
+ long raddr;
+ } __attribute__((packed)) *jop;
+ jop = (struct __arch_jmp_op *)from;
+ jop->raddr = (long)(to) - ((long)(from) + 5);
+ jop->op = RELATIVEJUMP_INSTRUCTION;
+}
+
+/*
+ * returns non-zero if opcodes can be boosted.
+ */
+static inline int can_boost(kprobe_opcode_t opcode)
+{
+ switch (opcode & 0xf0 ) {
+ case 0x70:
+ return 0; /* can't boost conditional jump */
+ case 0x90:
+ /* can't boost call and pushf */
+ return opcode != 0x9a && opcode != 0x9c;
+ case 0xc0:
+ /* can't boost undefined opcodes and soft-interruptions */
+ return (0xc1 < opcode && opcode < 0xc6) ||
+ (0xc7 < opcode && opcode < 0xcc) || opcode == 0xcf;
+ case 0xd0:
+ /* can boost AA* and XLAT */
+ return (opcode == 0xd4 || opcode == 0xd5 || opcode == 0xd7);
+ case 0xe0:
+ /* can boost in/out and (may be) jmps */
+ return (0xe3 < opcode && opcode != 0xe8);
+ case 0xf0:
+ /* clear and set flags can be boost */
+ return (opcode == 0xf5 || (0xf7 < opcode && opcode < 0xfe));
+ default:
+ /* currently, can't boost 2 bytes opcodes */
+ return opcode != 0x0f;
+ }
+}
+
+
 /*
  * returns non-zero if opcode modifies the interrupt flag.
  */
@@ -65,6 +108,11 @@ void __kprobes arch_copy_kprobe(struct k
 {
  memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
  p->opcode = *p->addr;
+ if (can_boost(p->opcode)) {
+ p->ainsn.boostable = 0;
+ } else {
+ p->ainsn.boostable = -1;
+ }
 }

 void __kprobes arch_arm_kprobe(struct kprobe *p)
@@ -235,6 +283,14 @@ static int __kprobes kprobe_handler(stru
  /* handler has already set things up, so skip ss setup */
  return 1;

+ if (p->ainsn.boostable == 1 &&
+    !p->post_handler && !p->break_handler ) {
+ /* Boost up -- we can execute copied instructions directly */
+ reset_current_kprobe();
+ regs->eip = (unsigned long)&p->ainsn.insn;
+ return 1;
+ }
+
 ss_probe:
  prepare_singlestep(p, regs);
  kcb->kprobe_status = KPROBE_HIT_SS;
@@ -340,6 +396,8 @@ int __kprobes trampoline_probe_handler(s
  * 2) If the single-stepped instruction was a call, the return address
  * that is atop the stack is the address following the copied instruction.
  * We need to make it the address following the original instruction.
+ *
+ * This function also checks instruction size for preparing direct execution.
  */
 static void __kprobes resume_execution(struct kprobe *p,
  struct pt_regs *regs, struct kprobe_ctlblk *kcb)
@@ -360,6 +418,7 @@ static void __kprobes resume_execution(s
  case 0xca:
  case 0xea: /* jmp absolute -- eip is correct */
  /* eip is already adjusted, no more changes required */
+ p->ainsn.boostable = 1;
  goto no_change;
  case 0xe8: /* call relative - Fix return addr */
  *tos = orig_eip + (*tos - copy_eip);
@@ -367,18 +426,33 @@ static void __kprobes resume_execution(s
  case 0xff:
  if ((p->ainsn.insn[1] & 0x30) == 0x10) {
  /* call absolute, indirect */
- /* Fix return addr; eip is correct. */
+ /* Fix return addr; eip is correct
+   But this is not boostable */
  *tos = orig_eip + (*tos - copy_eip);
  goto no_change;
  } else if (((p->ainsn.insn[1] & 0x31) == 0x20) || /* jmp near, absolute indirect */
    ((p->ainsn.insn[1] & 0x31) == 0x21)) { /* jmp far, absolute indirect */
- /* eip is correct. */
+ /* eip is correct. And this is boostable */
+ p->ainsn.boostable = 1;
  goto no_change;
  }
  default:
  break;
  }

+ if (p->ainsn.boostable == 0) {
+ if ( regs->eip > copy_eip &&
+     (regs->eip - copy_eip) + 5 < MAX_INSN_SIZE) {
+ /* these instructions can be executed directly if it
+ jumps back to correct address. */
+ set_jmp_op((void *)regs->eip,
+   (void *)orig_eip + (regs->eip - copy_eip));
+ p->ainsn.boostable = 1;
+ } else {
+ p->ainsn.boostable = -1;
+ }
+ }
+
  regs->eip = orig_eip + (regs->eip - copy_eip);

        no_change:
diff -Narup a/include/asm-i386/kprobes.h b/include/asm-i386/kprobes.h
--- a/include/asm-i386/kprobes.h 2005-11-08 11:51:02.000000000 +0900
+++ b/include/asm-i386/kprobes.h 2005-11-24 21:41:30.000000000 +0900
@@ -31,6 +31,7 @@ struct pt_regs;

 typedef u8 kprobe_opcode_t;
 #define BREAKPOINT_INSTRUCTION 0xcc
+#define RELATIVEJUMP_INSTRUCTION 0xe9
 #define MAX_INSN_SIZE 16
 #define MAX_STACK_SIZE 64
 #define MIN_STACK_SIZE(ADDR) (((MAX_STACK_SIZE) < \
@@ -47,6 +48,9 @@ void kretprobe_trampoline(void);
 struct arch_specific_insn {
  /* copy of the original instruction */
  kprobe_opcode_t insn[MAX_INSN_SIZE];
+ /* If this flag is not 0, this kprobe can be boost when its
+   post_handler and break_handler is not set. */
+ int boostable;
 };

 struct prev_kprobe {


Re: [RFC][Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
In reply to this post by Frank Ch. Eigler
Hi, Frank

Frank Ch. Eigler wrote:
> Masami Hiramatsu wrote:
>
>>I publish a couple of patches of kprobe-booster in next mails.
>>[...]
>
> Can you describe what kinds of tests you ran on this exciting
> optimization, and their results?

I made a test program, attached to this mail, and checked
the correctness of kprobe-booster (take2) with it.

I copied the booster functions (*) from arch/i386/kernel/kprobes.c
into this test program.
(*) can_boost, arch_copy_kprobe, and resume_execution.
This program generates "1-byte opcode" maps indicating which
opcodes will be boosted by kprobe-booster. In the opcode maps,
the row number gives the top 4 bits of the opcode, and the
column number gives the bottom 4 bits.

Here is an example of the results.

opcode2:21h, offset:1 byte(s)
   0h 1h 2h 3h 4h 5h 6h 7h 8h 9h ah bh ch dh eh fh
0h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 -1
1h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
2h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
3h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
4h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
5h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
6h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
7h -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
8h  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
9h  1  1  1  1  1  1  1  1  1  1 -1  1 -1  1  1  1
ah  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
bh  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
ch -1 -1  1  1  1  1 -1 -1  1  1  1  1 -1 -1 -1  1
dh -1 -1 -1 -1  1  1 -1  1 -1 -1 -1 -1 -1 -1 -1 -1
eh -1 -1 -1 -1  1  1  1  1 -1  1  1  1  1  1  1  1
fh -1 -1 -1 -1 -1  1 -1 -1  1  1  1  1  1  1 -1  1

In the result, there is a description line like
 "opcode2:21h, offset:1 byte(s)"
This means "the 2nd byte of the opcode is 0x21, and after
single-step execution EIP has moved 1 byte."
After that comes an opcode map. In this map,
"1" means that the opcode can be boosted,
and "-1" means that the opcode is not boosted.
What do you think about it?
Do you have a better idea for testing?

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

#include <stdio.h>
#include <string.h>

typedef unsigned char kprobe_opcode_t;
#define MAX_INSN_SIZE 16
#define TF_MASK 1
#define IF_MASK 2
struct kprobe {
        struct {
                kprobe_opcode_t insn[MAX_INSN_SIZE];
                int boostable;
        } ainsn;
        kprobe_opcode_t opcode;
        kprobe_opcode_t *addr;
};

struct pt_regs {
        long eip;
        long esp;
        long eflags;
};

struct kprobe_ctlblk {
        long kprobe_old_eflags;
};

void set_jmp_op(void *a, void *b){ }

#define __kprobes

// copy code from kprobe.c

/*
 * returns non-zero if opcodes can be boosted.
 */
static inline int can_boost(kprobe_opcode_t opcode)
{
        switch (opcode & 0xf0 ) {
        case 0x70:
                return 0; /* can't boost conditional jump */
        case 0x90:
                /* can't boost call and pushf */
                return opcode != 0x9a && opcode != 0x9c;
        case 0xc0:
                /* can't boost undefined opcodes and soft-interruptions */
                return (0xc1 < opcode && opcode < 0xc6) ||
                        (0xc7 < opcode && opcode < 0xcc) || opcode == 0xcf;
        case 0xd0:
                /* can boost AA* and XLAT */
                return (opcode == 0xd4 || opcode == 0xd5 || opcode == 0xd7);
        case 0xe0:
                /* can boost in/out and (may be) jmps */
                return (0xe3 < opcode && opcode != 0xe8);
        case 0xf0:
                /* clear and set flags can be boost */
                return (opcode == 0xf5 || (0xf7 < opcode && opcode < 0xfe));
        default:
                /* currently, can't boost 2 bytes opcodes */
                return opcode != 0x0f;
        }
}

static void __kprobes resume_execution(struct kprobe *p,
                struct pt_regs *regs, struct kprobe_ctlblk *kcb)
{
        unsigned long *tos = (unsigned long *)&regs->esp;
        unsigned long copy_eip = (unsigned long)&p->ainsn.insn;
        unsigned long orig_eip = (unsigned long)p->addr;

        regs->eflags &= ~TF_MASK;
        switch (p->ainsn.insn[0]) {
        case 0x9c: /* pushfl */
                *tos &= ~(TF_MASK | IF_MASK);
                *tos |= kcb->kprobe_old_eflags;
                break;
        case 0xc3: /* ret/lret */
        case 0xcb:
        case 0xc2:
        case 0xca:
        case 0xea: /* jmp absolute -- eip is correct */
                /* eip is already adjusted, no more changes required */
                p->ainsn.boostable = 1;
                goto no_change;
        case 0xe8: /* call relative - Fix return addr */
                *tos = orig_eip + (*tos - copy_eip);
                break;
        case 0xff:
                if ((p->ainsn.insn[1] & 0x30) == 0x10) {
                        /* call absolute, indirect */
                        /* Fix return addr; eip is correct
                           But this is not boostable */
                        *tos = orig_eip + (*tos - copy_eip);
                        goto no_change;
                } else if (((p->ainsn.insn[1] & 0x31) == 0x20) || /* jmp near, absolute indirect */
                           ((p->ainsn.insn[1] & 0x31) == 0x21)) { /* jmp far, absolute indirect */
                        /* eip is correct. And this is boostable */
                        p->ainsn.boostable = 1;
                        goto no_change;
                }
        default:
                break;
        }

        if (p->ainsn.boostable == 0) {
                if ( regs->eip > copy_eip &&
                     (regs->eip - copy_eip) + 5 < MAX_INSN_SIZE) {
                        /* these instructions can be executed directly if it
                         jumps back to correct address. */
                        set_jmp_op((void *)regs->eip,
                                   (void *)orig_eip + (regs->eip - copy_eip));
                        p->ainsn.boostable = 1;
                } else {
                        p->ainsn.boostable = -1;
                }
        }

        regs->eip = orig_eip + (regs->eip - copy_eip);

       no_change:
        return ;
}

void __kprobes arch_copy_kprobe(struct kprobe *p)
{
        memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
        p->opcode = *p->addr;
        if (can_boost(p->opcode)) {
                p->ainsn.boostable = 0;
        } else {
                p->ainsn.boostable = -1;
        }
}

int validate(int opcode, int opcode2, long offset)
{
        struct kprobe kp;
        struct pt_regs regs;
        kprobe_opcode_t dummy_code[MAX_INSN_SIZE];
        long dummy_stack[10];
        struct kprobe_ctlblk kcb = {0};
       
        dummy_code[0] = (kprobe_opcode_t)opcode;
        dummy_code[1] = (kprobe_opcode_t)opcode2;
        regs.esp = (long)dummy_stack;
        kp.addr = (void*)dummy_code;

        arch_copy_kprobe(&kp);
       
        regs.eip = (long)kp.ainsn.insn + offset;
       
        resume_execution(&kp, &regs, &kcb);
       
        return kp.ainsn.boostable;
}

void validate_map(int op2, long offs)
{
        int i, j, ret;
        printf("opcode2:%xh, offset:%ld byte(s)\n", op2, offs);
        printf("   ");
        for (j = 0; j <= 0xf; j++) printf("%1xh ", j);
        printf("\n");
        for (i = 0; i <= 0xf; i++) {
                printf("%1xh ", i);
                for (j = 0; j <= 0xf; j++) {
                        ret = validate(i<<4|j, op2, offs);
                        printf("%2d ", ret);
                }
                printf("\n");
        }
}

int main(void)
{
        validate_map(0,1);
        validate_map(0,-1);
        validate_map(0,MAX_INSN_SIZE-4);
        validate_map(0x10,1);
        validate_map(0x20,1);
        validate_map(0x21,1);

        return 0;
}

Re: [RFC][Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Frank Ch. Eigler
Hi -

> > Can you describe what kinds of tests you ran on this exciting
> > optimization, and their results?
>
> I made a test program attached to this mail.
> I checked correctness of kprobe-booster (take2) by using
> this program. [...]

OK.  Would you be able to try a larger test, such as running any
existing kprobes or especially systemtap pass-5 tests on a kernel with
this modification?

- FChE

Re: [RFC][Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Keshavamurthy, Anil S
In reply to this post by Masami Hiramatsu
On Fri, Nov 25, 2005 at 05:12:59AM -0800, Masami Hiramatsu wrote:
>
>    Hi,
>
>    I publish a couple of patches of kprobe-booster in next mails.
>     With kprobe-booster patch, kprobes execute a copied
>    instruction directly and (if need) jump back to original code.
>     This direct execution is executed when the kprobe don't have
>    both post_handler and break_handler, and the copied instruction
>    can be executed directly.
Cool!! Before I look at your patch (which I will do later today), I have
a question for you. With this patch, does kprobes still work with CONFIG_PREEMPT=y?
In other words, do you expect preemption to be disabled for your patch to work?

Cheers,
Anil Keshavamurthy


Re: [Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
In reply to this post by Frank Ch. Eigler
Hi, Frank

Frank Ch. Eigler wrote:

>>>Can you describe what kinds of tests you ran on this exciting
>>>optimization, and their results?
>>
>>I made a test program attached to this mail.
>>I checked correctness of kprobe-booster (take2) by using
>>this program. [...]
>
> OK.  Would you be able to try a larger test, such as running any
> existing kprobes or especially systemtap pass-5 tests on a kernel with
> this modification?

Yes, I tested it.
For kprobes, I ran a small test script on the
boosted kernel.
The script disassembles the sys_symlink function,
inserts a kprobe at each instruction, and runs
ln commands which exercise one success path and
two failure paths.

for i in `./disym.sh -a sys_symlink | grep ^c | cut -f1 -d\: `;do
(echo $i | grep ^c > /dev/null ) || continue;
sudo /sbin/insmod ./noop_kprobe.ko addr=0x$i; echo install $i
(echo OK; ln -s /dev/null /tmp; rm /tmp/null;
 echo eperm; ln -s /dev/null /usr;
 echo enoent; ln -s /dev/null /hoge/huga/ )
sudo /sbin/rmmod noop_kprobe
done

For SystemTap, I ran my original tapsets which I
demonstrated in Beaverton.

I confirmed that both tests worked correctly.

I will also try to run the test suite, which can be
obtained from sourceware.org via CVS.

Best Regards.
--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Re: [Patch 0/2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
In reply to this post by Keshavamurthy, Anil S
Hi,

Keshavamurthy Anil S wrote:
> On Fri, Nov 25, 2005 at 05:12:59AM -0800, Masami Hiramatsu wrote:
> Cool!! Before I look at your patch(which I will do later today), I had
> a question for you. With this patch, does kprobes still works with CONFIG_PREEMPT=y?
> Or in other words, do you expect preeempt to be disabled for your patch to work?

Currently it may not work perfectly with a preemptible kernel.
I would like to discuss this.
The problem is the timing of releasing the buffer. If an
interrupt occurs while a process is directly executing the
copied code, the address of the instruction buffer is pushed
onto the stack. After that, if the process is scheduled out
(preempted), we cannot release the instruction buffer.
I am considering ideas to avoid this problem.


By the way, I think the original kprobes has an SMP race
problem.
On an SMP machine, one CPU (CPU0) can unregister a kprobe
and release its instruction buffer even while another CPU
(CPU1) is executing kprobe_handler(). In this case, CPU1
will execute released instructions.
I found that kprobes just uses the synchronize_sched()
function, and I guess it uses this function to resolve the
problem. If that is correct, how does this function resolve
the problem?

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]


Re: SMP race [was: kprobe: kprobe-booster against 2.6.14-mm1 for i386]

Jim Keniston
On Tue, 2005-11-29 at 06:14, Masami Hiramatsu wrote:

>
> By the way, I think the original kprobes has a smp racing
> problem.
> On SMP machine, a CPU (CPU0) can unregister that kprobe
> and release its instruction buffer, even if another CPU
> (CPU1) is executing kprobe_handler(). In this case, CPU1
> will execute released instructions.
> I found the kprobe just uses synchronize_sched() function.
> And I guess the kprobes uses this function to resolve the
> problem. If it is correct, how does this function resolve
> the problem?
>
Yes, I think there's a race, although I expect that we would see a
problem only very rarely.  I've created bugzilla #1947 to track this
problem; it contains my analysis so far.

Thanks for pointing this out.

Jim


Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
In reply to this post by Masami Hiramatsu
Hi,

I forgot to re-enable preemption in the previous booster
patch. With this fix, the booster can work safely on a
preemptible kernel.
For safety, the booster is enabled only if the kprobe's
pre-handler is called from a context where preemption is
already prohibited.

Masami Hiramatsu wrote:
> @@ -235,6 +283,14 @@ static int __kprobes kprobe_handler(stru
>   /* handler has already set things up, so skip ss setup */
>   return 1;
>
> + if (p->ainsn.boostable == 1 &&

#ifdef CONFIG_PREEMPT
            preempt_count() != 1 &&
#endif

> +    !p->post_handler && !p->break_handler ) {
> + /* Boost up -- we can execute copied instructions directly */
> + reset_current_kprobe();
> + regs->eip = (unsigned long)&p->ainsn.insn;

                preempt_enable_no_resched();

> + return 1;
> + }
> +
>  ss_probe:
>   prepare_singlestep(p, regs);
>   kcb->kprobe_status = KPROBE_HIT_SS;



--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]

Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Keshavamurthy, Anil S
In reply to this post by Masami Hiramatsu
On Mon, Nov 28, 2005 at 06:32:01AM -0800, Masami Hiramatsu wrote:
>
>    Hi,
>
>    This patch adds kprobe-booster function for i386 arch.
>    I fixed mistakes in previous patch.
Hi Masami,
        As promised, I finished reviewing your patch. At the moment
I see two issues with it. I think both of the issues mentioned
below are very hard to fix, and I don't have a good solution to
suggest at the moment.

My comments inline.

>    @@ -235,6 +283,14 @@ static int __kprobes kprobe_handler(stru
>                    /* handler has already set things up, so skip ss setup
>    */
>                    return 1;
>
>    +       if (p->ainsn.boostable == 1 &&
>    +           !p->post_handler && !p->break_handler ) {
>    +                 /*  Boost  up  -- we can execute copied instructions
>    directly */
>    +               reset_current_kprobe();
>    +               regs->eip = (unsigned long)&p->ainsn.insn;
>    +               return 1;
Issue No 1:
        In the above code path, once you call reset_current_kprobe() and
preempt_enable_no_resched() (I see preempt_enable_no_resched()
in your fix to this code), you are not supposed to rely on
&p->ainsn.insn. The reason is that, on another cpu, this memory
can technically get freed by an unregister_kprobe() call.
In other words, by the time this cpu tries to execute the
instruction at regs->eip, that memory might be gone and the
system is bound to crash.

       

>    @@ -367,18 +426,33 @@ static void __kprobes resume_execution(s
>    +       if (p->ainsn.boostable == 0) {
>    +               if ( regs->eip > copy_eip &&
>    +                    (regs->eip - copy_eip) + 5 < MAX_INSN_SIZE) {
>    +                       /* these instructions can be executed directly
>    if it
>    +                        jumps back to correct address. */
>    +                       set_jmp_op((void *)regs->eip,
>    +                                      (void *)orig_eip + (regs->eip -
>    copy_eip));
Issue No 2:
        Since kprobes is now highly parallel (thanks to the RCU changes),
kprobe_handler() can be executing on multiple cpus.
Because of this parallel execution, it is also possible that some
other cpu is single-stepping the copied instruction at
&p->ainsn.insn; if, at that very instant, set_jmp_op() in the
above call makes changes to that location, I am not sure how
badly the processor can behave.

(In the above call to set_jmp_op(), I am assuming that regs->eip
is the same as &p->ainsn.insn, and modifying this instruction
without knowing where the other processors are is very dangerous.)


Cheers,
Anil Keshavamurthy

Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi, Anil

Thank you for your review.

Keshavamurthy Anil S wrote:
> As promised I finished reviewing your patch, at the moment
> I see two issues w.r.t your patch. I think both the issues that
> I have mentioned below is very hard to fix and I don;t have the
> good solution to suggest at the moment.

OK, I hope the issues can be solved.

> My comments inline.
>>   @@ -235,6 +283,14 @@ static int __kprobes kprobe_handler(stru
>>                   /* handler has already set things up, so skip ss setup
>>   */
>>                   return 1;
>>
>>   +       if (p->ainsn.boostable == 1 &&
>>   +           !p->post_handler && !p->break_handler ) {
>>   +                 /*  Boost  up  -- we can execute copied instructions
>>   directly */
>>   +               reset_current_kprobe();
>>   +               regs->eip = (unsigned long)&p->ainsn.insn;
>>   +               return 1;
>
> Issue No 1:
> In the above code patch, once you call reset_current_kprobes() and
> preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
> in your fix to this code), you are not suppose to relay on
> &p->ainsn.insn, the reason is that, on the other cpu technically
> this memory can get freed due to unregister_kprobe() call.
> In other words, by the time this cpu tries to execte the instruction
> at regs->eip, that memory might have gone and system is bound to crash.

I think the old djprobe has the same issue, so it can be solved
by using a safety-check routine after removing the breakpoint.
Current kprobes uses synchronize_sched() to wait for RCU
processing in unregister_kprobe(). In my opinion, the solution is
to introduce the same safety check as djprobe in
unregister_kprobe(), instead of synchronize_sched().

The safety-check routine of djprobe waits until all cpus have
executed works, which are run from keventd.
So when a work is executed, we can ensure the following on that cpu:
- interrupt handlers have finished on the cpu.
- p->ainsn.insn is not being executed on the cpu.

Also, on a preemptible kernel, the booster is not enabled where
kernel preemption is enabled, so there are no preempted threads
sitting inside p->ainsn.insn.

>>   @@ -367,18 +426,33 @@ static void __kprobes resume_execution(s
>>   +       if (p->ainsn.boostable == 0) {
>>   +               if ( regs->eip > copy_eip &&
>>   +                    (regs->eip - copy_eip) + 5 < MAX_INSN_SIZE) {
>>   +                       /* these instructions can be executed directly
>>   if it
>>   +                        jumps back to correct address. */
>>   +                       set_jmp_op((void *)regs->eip,
>>   +                                      (void *)orig_eip + (regs->eip -
>>   copy_eip));
>
> Issue No 2:
> Since Kprobes is now highly parallel(thanks to RCU changes),
> the kprobe_handler() can be in execution on multiple cpu's.
> Due this parallel kprobe_handler() execution nature, it is also possible
> to have some other cpu trying to single step on the copied instruction
> at &p->ainsn.insn, at this very instant if set_jmp_op() in the above
> call makes changs to that location, I am not sure how badly processor
> can behave.

Please look at this condition (copy_eip == p->ainsn.insn):

>>   +               if ( regs->eip > copy_eip &&
                                 ^^^
Thus the kprobe-booster never writes the jmp opcode over the 1st
instruction of p->ainsn.insn. Single-step execution starts at the
1st byte of p->ainsn.insn, and only that 1st instruction is ever
executed by single-stepping.
So I think the second issue you pointed out is not a problem.

Best Regards,

--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]


Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Masami Hiramatsu
Hi, Anil

Masami Hiramatsu wrote:

>>Issue No 1:
>> In the above code patch, once you call reset_current_kprobes() and
>>preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
>>in your fix to this code), you are not suppose to relay on
>>&p->ainsn.insn, the reason is that, on the other cpu technically
>>this memory can get freed due to unregister_kprobe() call.
>>In other words, by the time this cpu tries to execte the instruction
>>at regs->eip, that memory might have gone and system is bound to crash.
>
> I think old djprobe has the same issue. So, it will be solved by using
> safety check routine after removing breakpoint. Current kprobe uses
> synchronize_sched() for waiting rcu processing in unregister_kprobe().
> In my opinion, the solution is that introducing safety check same as
> djprobe instead of synchronize_sched() in unregister_kprobe().

I found that the rcu_barrier() function works similarly to
djprobe's safety-check routine.

> The safety check routine of djprobe waits until all cpus execute the works,
> and the works are executed from the keventd.

The rcu_barrier() function waits until all cpus have gone
through a quiescent state:

- the CPU performs a process switch;
- the CPU starts executing in user mode;
- the CPU executes the idle loop.

In each of these cases, we say that the CPU has gone through
a quiescent state.

If we call rcu_barrier() after the "int3" is removed, we can
ensure the following once rcu_barrier() has returned:
- interrupt handlers have finished on all CPUs;
- p->ainsn.insn is not being executed on any CPU.
And then we can remove a boosted kprobe safely.

Current kprobes uses synchronize_sched(). I think this function
checks safety of only the current CPU, so it may not ensure the
above conditions. But rcu_barrier() checks safety of all CPUs,
so it can ensure them.
What do you think about this?

> And also, on preemptible kernel, the booster is not enabled where the
> kernel preemption is enabled. So, there are no preempted threads on
> p->ainsn.insn.


--
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: [hidden email]


Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Satoshi Oshima
Hi, Anil and Masami

Masami Hiramatsu wrote:

> Hi, Anil
>
> Masami Hiramatsu wrote:
>
>>>Issue No 1:
>>> In the above code patch, once you call reset_current_kprobes() and
>>>preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
>>>in your fix to this code), you are not suppose to relay on
>>>&p->ainsn.insn, the reason is that, on the other cpu technically
>>>this memory can get freed due to unregister_kprobe() call.
>>>In other words, by the time this cpu tries to execte the instruction
>>>at regs->eip, that memory might have gone and system is bound to crash.
>>
>>I think old djprobe has the same issue. So, it will be solved by using
>>safety check routine after removing breakpoint. Current kprobe uses
>>synchronize_sched() for waiting rcu processing in unregister_kprobe().
>>In my opinion, the solution is that introducing safety check same as
>>djprobe instead of synchronize_sched() in unregister_kprobe().
>
> I found the rcu_barrier() function that works similar to djprobe's
> safety check routine.

I had not understood how RCU works, but I have now learned it.

rcu_barrier() registers callbacks on all CPUs;
synchronize_sched() registers a callback on the current CPU only.
The safety guarantee of RCU is the same for rcu_barrier() and
synchronize_sched().

So synchronize_sched() is enough for the safety checking of the
kprobe-booster; you don't have to use rcu_barrier() instead of
synchronize_sched().

>
>>The safety check routine of djprobe waits until all cpus execute the works,
>>and the works are executed from the keventd.
>
>
> The rcu_barrier() function waits until all cpus have gone through a
> quiescent state.
>
> - The CPU performs a process switch.
> - The CPU starts executing in User Mode.
> - The CPU executes the idle loop
> In each of these cases, We say that the CPU has gone through
> a quiescent state.
>
> If we call rcu_barrier() after "int3" is removed, we can ensure
> followings after the rcu_barrier() called.
> - interrupt handlers are finished on all CPUs.
> - p->ainsn.insn is not executed on all CPUs.
> And we can remove a boosted kprobe safely.
>
> Current kprobes uses synchronize_sched(). I think this function
> checks safety of only current CPU. So, it may not ensure above
> conditions. But rcu_barrier() checks safety of all CPUs. So, it
> can ensure above conditions.
> How would you think about this?
>
>
>>And also, on preemptible kernel, the booster is not enabled where the
>>kernel preemption is enabled. So, there are no preempted threads on
>>p->ainsn.insn.
>
>
>


Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Keshavamurthy, Anil S
In reply to this post by Masami Hiramatsu
On Tue, Dec 20, 2005 at 10:21:14PM +0900, Masami Hiramatsu wrote:

> Hi, Anil
>
> Masami Hiramatsu wrote:
> >>Issue No 1:
> >> In the above code patch, once you call reset_current_kprobes() and
> >>preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
> >>in your fix to this code), you are not suppose to relay on
> >>&p->ainsn.insn, the reason is that, on the other cpu technically
> >>this memory can get freed due to unregister_kprobe() call.
> >>In other words, by the time this cpu tries to execte the instruction
> >>at regs->eip, that memory might have gone and system is bound to crash.
> >
> > I think old djprobe has the same issue. So, it will be solved by using
> > safety check routine after removing breakpoint. Current kprobe uses
> > synchronize_sched() for waiting rcu processing in unregister_kprobe().
> > In my opinion, the solution is that introducing safety check same as
> > djprobe instead of synchronize_sched() in unregister_kprobe().
>
> I found the rcu_barrier() function that works similar to djprobe's
> safety check routine.
>
> > The safety check routine of djprobe waits until all cpus execute the works,
> > and the works are executed from the keventd.
>
> The rcu_barrier() function waits until all cpus have gone through a
> quiescent state.
>
> - The CPU performs a process switch.
> - The CPU starts executing in User Mode.
> - The CPU executes the idle loop
> In each of these cases, We say that the CPU has gone through
> a quiescent state.
>
> If we call rcu_barrier() after "int3" is removed, we can ensure
> followings after the rcu_barrier() called.
> - interrupt handlers are finished on all CPUs.
> - p->ainsn.insn is not executed on all CPUs.
> And we can remove a boosted kprobe safely.
>
> Current kprobes uses synchronize_sched(). I think this function
> checks safety of only current CPU. So, it may not ensure above
> conditions. But rcu_barrier() checks safety of all CPUs. So, it
> can ensure above conditions.
> How would you think about this?

You are correct, we need to ensure safety of all CPUs.
rcu_barrier() will work.

>
> > And also, on preemptible kernel, the booster is not enabled where the
> > kernel preemption is enabled. So, there are no preempted threads on
> > p->ainsn.insn.
Yes, this is also true: you have to disable the booster when
kernel preemption is enabled. So in effect the booster will work
only where kernel preemption is disabled.


>
>
> --
> Masami HIRAMATSU
> 2nd Research Dept.
> Hitachi, Ltd., Systems Development Laboratory
> E-mail: [hidden email]
>
- Anil

Re: [RFC][Patch 2/2][take2]kprobe: kprobe-booster against 2.6.14-mm1 for i386

Ananth N Mavinakayanahalli-2
On Tue, Dec 20, 2005 at 11:01:39AM -0800, Keshavamurthy Anil S wrote:

> On Tue, Dec 20, 2005 at 10:21:14PM +0900, Masami Hiramatsu wrote:
> > Hi, Anil
> >
> > Masami Hiramatsu wrote:
> > >>Issue No 1:
> > >> In the above code patch, once you call reset_current_kprobes() and
> > >>preempt_enable_no_resched() ( I see preeempt_enable_no_resched()
> > >>in your fix to this code), you are not suppose to relay on
> > >>&p->ainsn.insn, the reason is that, on the other cpu technically
> > >>this memory can get freed due to unregister_kprobe() call.
> > >>In other words, by the time this cpu tries to execte the instruction
> > >>at regs->eip, that memory might have gone and system is bound to crash.
> > >
> > > I think old djprobe has the same issue. So, it will be solved by using
> > > safety check routine after removing breakpoint. Current kprobe uses
> > > synchronize_sched() for waiting rcu processing in unregister_kprobe().
> > > In my opinion, the solution is that introducing safety check same as
> > > djprobe instead of synchronize_sched() in unregister_kprobe().
> >
> > I found the rcu_barrier() function that works similar to djprobe's
> > safety check routine.
> >
> > > The safety check routine of djprobe waits until all cpus execute the works,
> > > and the works are executed from the keventd.
> >
> > The rcu_barrier() function waits until all cpus have gone through a
> > quiescent state.
> >
> > - The CPU performs a process switch.
> > - The CPU starts executing in User Mode.
> > - The CPU executes the idle loop
> > In each of these cases, We say that the CPU has gone through
> > a quiescent state.
> >
> > If we call rcu_barrier() after "int3" is removed, we can ensure
> > followings after the rcu_barrier() called.
> > - interrupt handlers are finished on all CPUs.
> > - p->ainsn.insn is not executed on all CPUs.
> > And we can remove a boosted kprobe safely.
> >
> > Current kprobes uses synchronize_sched(). I think this function
> > checks safety of only current CPU. So, it may not ensure above
> > conditions. But rcu_barrier() checks safety of all CPUs. So, it
> > can ensure above conditions.
> > How would you think about this?
>
> You are correct, we need to ensure safety of all CPUs.
> rcu_barrier() will work.

Unless I've understood RCU incorrectly, rcu_barrier() is not needed in
the kprobes case. As I read the code, you need rcu_barrier() in cases
where you have to run a completion routine on each online cpu.

All we care about in the kprobes case is that the other cpus don't hold
a reference to the just-removed kprobe, so that we can complete the
kprobe unregistration. Return from synchronize_sched() ensures that all
the other cpus have passed through a quiescent state (one of the
conditions Masami listed above).

Also remember that we don't define our own callback function either.

Ananth