segfault in ffi_data_to_code_pointer

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

segfault in ffi_data_to_code_pointer

DJ Delorie-2

In src/closures.c, ffi_data_to_code_pointer() calls segment_holding()
to get a pointer to the code segment for a data segment.  It doesn't
check for a NULL return, and I've got a test case where I run Ruby's
test suite (on a non-selinux aarch64 machine, if that matters) and
segment_holding() returns NULL and much hilarity ensues.

The following patch fixes the segfault, but I don't know if
segment_holding() returning NULL is an expected case, or a symptom of
problems elsewhere?

> diff -rup a/src/closures.c b/src/closures.c
> --- a/src/closures.c 2019-06-25 21:21:06.738743440 -0400
> +++ b/src/closures.c 2019-06-25 21:22:00.769716129 -0400
> @@ -621,7 +621,10 @@ void *
>  ffi_data_to_code_pointer (void *data)
>  {
>    msegmentptr seg = segment_holding (gm, data);
> -  return add_segment_exec_offset (data, seg);
> +  if (seg)
> +    return add_segment_exec_offset (data, seg);
> +  else
> +    return data;
>  }
Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

Anthony Green-5

Hi DJ,

DJ Delorie <[hidden email]> writes:
> In src/closures.c, ffi_data_to_code_pointer() calls segment_holding()
> to get a pointer to the code segment for a data segment.  It doesn't
> check for a NULL return, and I've got a test case where I run Ruby's
> test suite (on a non-selinux aarch64 machine, if that matters) and
> segment_holding() returns NULL and much hilarity ensues.

This suggests that the ffi_closure object used to invoke
ffi_data_to_code_pointer wasn't allocated by ffi_closure_alloc().  Is
that something you can check?

Thanks,

AG

>
> The following patch fixes the segfault, but I don't know if
> segment_holding() returning NULL is an expected case, or a symptom of
> problems elsewhere?
>
>> diff -rup a/src/closures.c b/src/closures.c
>> --- a/src/closures.c 2019-06-25 21:21:06.738743440 -0400
>> +++ b/src/closures.c 2019-06-25 21:22:00.769716129 -0400
>> @@ -621,7 +621,10 @@ void *
>>  ffi_data_to_code_pointer (void *data)
>>  {
>>    msegmentptr seg = segment_holding (gm, data);
>> -  return add_segment_exec_offset (data, seg);
>> +  if (seg)
>> +    return add_segment_exec_offset (data, seg);
>> +  else
>> +    return data;
>>  }

--
Anthony Green <[hidden email]>
Senior Principal Solutions Architect, Financial Services
+1 647 477-3809
Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

DJ Delorie-2
> DJ Delorie <[hidden email]> writes:
>> In src/closures.c, ffi_data_to_code_pointer() calls segment_holding()
>> to get a pointer to the code segment for a data segment.  It doesn't
>> check for a NULL return, and I've got a test case where I run Ruby's
>> test suite (on a non-selinux aarch64 machine, if that matters) and
>> segment_holding() returns NULL and much hilarity ensues.

Anthony Green <[hidden email]> writes:
> This suggests that the ffi_closure object used to invoke
> ffi_data_to_code_pointer wasn't allocated by ffi_closure_alloc().  Is
> that something you can check?

It seems to be so:

 [root ruby-2.6.3]# ./miniruby ./tool/runruby.rb -Itest/lib -r 'test/unit' "./test/fiddle/test_import.rb"
 < 0000ffffaa9b0000
 < 0000ffffaa900000
 Run options:

The '<' lines indicate calls to ffi_data_to_code_pointer that had a NULL
segment, and the non-existing '>' lines indicated non-existing calls to
ffi_closure_allocate.

However, a simple NULL check may be the right fix anyway:

* it fixes the ruby build/test problem
* it does not re-introduce the illegal instruction bug

Which means that the non-code-seg'd closure isn't the one that triggers
the illegal instruction... but I don't know why still ;-)
Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

DJ Delorie-2

Further debugging has disclosed that the ruby code calls
ffi_prep_closure() passing a pointer returned from mmap() - not one from
ffi_closure_alloc().

Is that allowed?
Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

Anthony Green-5
DJ Delorie <[hidden email]> writes:

> Further debugging has disclosed that the ruby code calls
> ffi_prep_closure() passing a pointer returned from mmap() - not one from
> ffi_closure_alloc().
>
> Is that allowed?

No, they should be using ffi_closure_alloc().

Can you point me at the ruby code that does this?

Thanks,

AG


Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

DJ Delorie-2
Anthony Green <[hidden email]> writes:
> No, they should be using ffi_closure_alloc().
>
> Can you point me at the ruby code that does this?

ruby 2.6.3
ext/fiddle/closure.c

In the failing case, USE_FFI_CLOSURE_ALLOC is not set

static VALUE
allocate(VALUE klass)
{
    fiddle_closure * closure;

    VALUE i = TypedData_Make_Struct(klass, fiddle_closure,
            &closure_data_type, closure);

fprintf (stderr, "DJ: allocate\n");
#if USE_FFI_CLOSURE_ALLOC
    closure->pcl = ffi_closure_alloc(sizeof(ffi_closure), &closure->code);
#else
    closure->pcl = mmap(NULL, sizeof(ffi_closure), PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_PRIVATE, -1, 0);
#endif

    return i;
}

initialize()
{
 . . .
#if USE_FFI_CLOSURE_ALLOC
    result = ffi_prep_closure_loc(pcl, cif, callback,
                (void *)self, cl->code);
#else
    result = ffi_prep_closure(pcl, cif, callback, (void *)self);
    cl->code = (void *)pcl;
    i = mprotect(pcl, sizeof(*pcl), PROT_READ | PROT_EXEC);
    if (i) {
        rb_sys_fail("mprotect");
    }
#endif
 . . .
}
Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

Anthony Green-5
DJ Delorie <[hidden email]> writes:

> Anthony Green <[hidden email]> writes:
>> No, they should be using ffi_closure_alloc().
>>
>> Can you point me at the ruby code that does this?
>
> ruby 2.6.3
> ext/fiddle/closure.c

Thanks.  This looks right to me.  Perhaps they wrote this before
ffi_closure_alloc() existed?  I don't remember when every bit was
introduced...   Looks like you need to follow up with the ruby people.

Thanks DJ,

AG


> In the failing case, USE_FFI_CLOSURE_ALLOC is not set
>
> static VALUE
> allocate(VALUE klass)
> {
>     fiddle_closure * closure;
>
>     VALUE i = TypedData_Make_Struct(klass, fiddle_closure,
>             &closure_data_type, closure);
>
> fprintf (stderr, "DJ: allocate\n");
> #if USE_FFI_CLOSURE_ALLOC
>     closure->pcl = ffi_closure_alloc(sizeof(ffi_closure), &closure->code);
> #else
>     closure->pcl = mmap(NULL, sizeof(ffi_closure), PROT_READ | PROT_WRITE,
>         MAP_ANON | MAP_PRIVATE, -1, 0);
> #endif
>
>     return i;
> }
>
> initialize()
> {
>  . . .
> #if USE_FFI_CLOSURE_ALLOC
>     result = ffi_prep_closure_loc(pcl, cif, callback,
>                 (void *)self, cl->code);
> #else
>     result = ffi_prep_closure(pcl, cif, callback, (void *)self);
>     cl->code = (void *)pcl;
>     i = mprotect(pcl, sizeof(*pcl), PROT_READ | PROT_EXEC);
>     if (i) {
>         rb_sys_fail("mprotect");
>     }
> #endif
>  . . .
> }

Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

DJ Delorie-2
Anthony Green <[hidden email]> writes:
> Thanks.  This looks right to me.

By "right" do you mean "wrong" ?

> Perhaps they wrote this before ffi_closure_alloc() existed?  I don't
> remember when every bit was introduced...  Looks like you need to
> follow up with the ruby people.

So... no need/desire/reason to change anything in libffi then? ;-)
Reply | Threaded
Open this post in threaded view
|

Re: segfault in ffi_data_to_code_pointer

Florian Weimer-5
In reply to this post by DJ Delorie-2
* DJ Delorie:

> In src/closures.c, ffi_data_to_code_pointer() calls segment_holding()
> to get a pointer to the code segment for a data segment.  It doesn't
> check for a NULL return, and I've got a test case where I run Ruby's
> test suite (on a non-selinux aarch64 machine, if that matters) and
> segment_holding() returns NULL and much hilarity ensues.
>
> The following patch fixes the segfault, but I don't know if
> segment_holding() returning NULL is an expected case, or a symptom of
> problems elsewhere?
>
>> diff -rup a/src/closures.c b/src/closures.c
>> --- a/src/closures.c 2019-06-25 21:21:06.738743440 -0400
>> +++ b/src/closures.c 2019-06-25 21:22:00.769716129 -0400
>> @@ -621,7 +621,10 @@ void *
>>  ffi_data_to_code_pointer (void *data)
>>  {
>>    msegmentptr seg = segment_holding (gm, data);
>> -  return add_segment_exec_offset (data, seg);
>> +  if (seg)
>> +    return add_segment_exec_offset (data, seg);
>> +  else
>> +    return data;
>>  }

I think you also need to fix the aarch64 code to avoid a null pointer
dereference, like below.  Also submitted as
<https://github.com/libffi/libffi/pull/499>.

I have verified that this fixes the Ruby build failure.

Thanks,
Florian

diff --git a/src/aarch64/ffi.c b/src/aarch64/ffi.c
index 6f6aac4..69e04d6 100644
--- a/src/aarch64/ffi.c
+++ b/src/aarch64/ffi.c
@@ -777,7 +777,8 @@ ffi_prep_closure_loc (ffi_closure *closure,
 
   /* Also flush the cache for code mapping.  */
   unsigned char *tramp_code = ffi_data_to_code_pointer (tramp);
-  ffi_clear_cache (tramp_code, tramp_code + FFI_TRAMPOLINE_SIZE);
+  if (tramp_code != NULL)
+    ffi_clear_cache (tramp_code, tramp_code + FFI_TRAMPOLINE_SIZE);
 #endif
 
   closure->cif = cif;
diff --git a/src/closures.c b/src/closures.c
index 4d7f945..18d3913 100644
--- a/src/closures.c
+++ b/src/closures.c
@@ -925,6 +925,8 @@ void *
 ffi_data_to_code_pointer (void *data)
 {
   msegmentptr seg = segment_holding (gm, data);
+  if (seg == NULL)
+    return NULL;
   return add_segment_exec_offset (data, seg);
 }