[PATCH 0/2] Improvements to Python Register Descriptor API

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH 0/2] Improvements to Python Register Descriptor API

Andrew Burgess
I was thinking about my recent Python API additions to provide access
to register and register group information, and I thought that the
implementation of the API could be improved.

Currently for both registers and register groups, you ask the
architecture for an iterator, and the iterator then returns
descriptors.  Each descriptor returned is a new object, so if you have
two iterators, and ask the the first register you'll get two unique
objects that both describe the same register.

I thought it might be a nice improvement if the objects returned were
the same from every iterator, so in the above example you'd end up
with two references to the same object.  In this way you can compare
the objects for equality.

I don't know if this will be useful, but this seems like a better
implementation, so I'd like to change things to work this way
(assuming nobody disagrees).

Thanks,
Andrew

---

Andrew Burgess (2):
  gdb/python: Reuse gdb.RegisterDescriptor objects where possible
  gdb/python: Reuse gdb.RegisterGroup objects where possible

 gdb/ChangeLog                                 |  19 ++++
 gdb/python/py-registers.c                     | 103 ++++++++++++++----
 gdb/testsuite/ChangeLog                       |   8 ++
 .../gdb.python/py-arch-reg-groups.exp         |  19 ++++
 .../gdb.python/py-arch-reg-names.exp          |  19 ++++
 5 files changed, 146 insertions(+), 22 deletions(-)

--
2.25.4

Reply | Threaded
Open this post in threaded view
|

[PATCH 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Andrew Burgess
Instead of having the gdb.RegisterDescriptorIterator creating new
gdb.RegisterDescriptor objects for each regnum, instead cache
gdb.RegisterDescriptor objects on the gdbarch object and reuse these.

This means that for every gdbarch/regnum pair there is a single unique
gdb.RegisterDescriptor, this feels like a neater implementation than
the existing one.

It is possible for a user to see (in Python code) that the descriptors
are now identical, but as the descriptors are read-only this should
make no real difference.

There should be no other user visible changes.

gdb/ChangeLog:

        * python/py-registers.c (gdbpy_register_object_data): New static
        global.
        (gdbpy_register_object_data_init): New function.
        (gdbpy_new_register_descriptor): Renamed to...
        (gdbpy_get_register_descriptor): ...this, and update to reuse
        existing register descriptors where possible.
        (gdbpy_register_descriptor_iter_next): Update.
        (gdbpy_initialize_registers): Register new gdbarch data.

gdb/testsuite/ChangeLog:

        * gdb.python/py-arch-reg-names.exp: Additional tests.
---
 gdb/ChangeLog                                 | 11 ++++
 gdb/python/py-registers.c                     | 63 +++++++++++++++----
 gdb/testsuite/ChangeLog                       |  4 ++
 .../gdb.python/py-arch-reg-names.exp          | 19 ++++++
 4 files changed, 85 insertions(+), 12 deletions(-)

diff --git a/gdb/python/py-registers.c b/gdb/python/py-registers.c
index db0fe37eecb..fec84e5d5d4 100644
--- a/gdb/python/py-registers.c
+++ b/gdb/python/py-registers.c
@@ -24,6 +24,9 @@
 #include "reggroups.h"
 #include "python-internal.h"
 
+/* Token to access per-gdbarch data related to register descriptors.  */
+static struct gdbarch_data *gdbpy_register_object_data = NULL;
+
 /* Structure for iterator over register descriptors.  */
 typedef struct {
   PyObject_HEAD
@@ -81,6 +84,18 @@ typedef struct {
 extern PyTypeObject reggroup_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("reggroup_object");
 
+/* Associates a vector of register_descriptor_object pointers with GDBARCH
+   as gdbarch_data via the gdbarch post init registration mechanism
+   (gdbarch_data_register_post_init).  */
+
+static void *
+gdbpy_register_object_data_init (struct gdbarch *gdbarch)
+{
+  std::vector<const register_descriptor_object *> *vec
+    = new (std::vector<const register_descriptor_object *>);
+  return (void *) vec;
+}
+
 /* Create a new gdb.RegisterGroup object wrapping REGGROUP.  */
 
 static PyObject *
@@ -117,20 +132,41 @@ gdbpy_reggroup_name (PyObject *self, void *closure)
   return gdbpy_reggroup_to_string (self);
 }
 
-/* Create an return a new gdb.RegisterDescriptor object.  */
+/* Return a gdb.RegisterDescriptor object for REGNUM from GDBARCH.  For
+   each REGNUM (in GDBARCH) only one descriptor is ever created, which is
+   then cached on the GDBARCH.  */
+
 static PyObject *
-gdbpy_new_register_descriptor (struct gdbarch *gdbarch,
+gdbpy_get_register_descriptor (struct gdbarch *gdbarch,
        int regnum)
 {
-  /* Create a new object and fill in its details.  */
-  register_descriptor_object *reg
-    = PyObject_New (register_descriptor_object,
-    &register_descriptor_object_type);
-  if (reg == NULL)
-    return NULL;
-  reg->regnum = regnum;
-  reg->gdbarch = gdbarch;
-  return (PyObject *) reg;
+  std::vector<const register_descriptor_object *> *vec
+    = (std::vector<const register_descriptor_object *> *) gdbarch_data
+    (gdbarch, gdbpy_register_object_data);
+
+  /* Ensure that we have enough entries in the vector.  */
+  if (vec->size () <= regnum)
+    vec->resize ((regnum + 1), nullptr);
+
+  /* If we don't already have a descriptor for REGNUM in GDBARCH then
+     create one now.  */
+  if (vec->at (regnum) == nullptr)
+    {
+      register_descriptor_object *reg
+ = PyObject_New (register_descriptor_object,
+ &register_descriptor_object_type);
+      if (reg == NULL)
+ return NULL;
+      reg->regnum = regnum;
+      reg->gdbarch = gdbarch;
+      vec->at (regnum) = reg;
+    }
+
+  /* Grab the register descriptor from the vector and increment its
+     reference count before returning it.  */
+  PyObject *obj = (PyObject *) vec->at (regnum);
+  Py_XINCREF (obj);
+  return obj;
 }
 
 /* Convert the register descriptor to a string.  */
@@ -281,7 +317,7 @@ gdbpy_register_descriptor_iter_next (PyObject *self)
       iter_obj->regnum++;
 
       if (name != nullptr && *name != '\0')
- return gdbpy_new_register_descriptor (gdbarch, regnum);
+ return gdbpy_get_register_descriptor (gdbarch, regnum);
     }
   while (true);
 }
@@ -291,6 +327,9 @@ gdbpy_register_descriptor_iter_next (PyObject *self)
 int
 gdbpy_initialize_registers ()
 {
+  gdbpy_register_object_data
+    = gdbarch_data_register_post_init (gdbpy_register_object_data_init);
+
   register_descriptor_object_type.tp_new = PyType_GenericNew;
   if (PyType_Ready (&register_descriptor_object_type) < 0)
     return -1;
diff --git a/gdb/testsuite/gdb.python/py-arch-reg-names.exp b/gdb/testsuite/gdb.python/py-arch-reg-names.exp
index 14bc0a822a4..8dd34ef5fd2 100644
--- a/gdb/testsuite/gdb.python/py-arch-reg-names.exp
+++ b/gdb/testsuite/gdb.python/py-arch-reg-names.exp
@@ -85,3 +85,22 @@ for { set i 0 } { $i < [llength $regs] } { incr i } {
     }
 }
 gdb_assert { $found_non_match == 0 } "all registers match"
+
+# Check that we get the same register descriptors from two different
+# iterators.
+gdb_py_test_silent_cmd \
+    "python iter1 = arch.registers ()" \
+    "get first all register iterator" 0
+gdb_py_test_silent_cmd \
+    "python iter2 = arch.registers ()" \
+    "get second all register iterator" 0
+gdb_py_test_silent_cmd \
+    [multi_line_input \
+ "python" \
+ "for r1, r2 in zip(iter1, iter2):" \
+ "  if (r1.name != r2.name):"\
+ "    raise gdb.GdbError (\"miss-matched names\")" \
+ "  if (r1 != r2):" \
+ "    raise gdb.GdbError (\"miss-matched objects\")" \
+ "\004" ] \
+    "check names and objects match" 1
--
2.25.4

Reply | Threaded
Open this post in threaded view
|

[PATCH 2/2] gdb/python: Reuse gdb.RegisterGroup objects where possible

Andrew Burgess
In reply to this post by Andrew Burgess
Only create one gdb.RegisterGroup Python object for each of GDB's
reggroup objects.

I could have added a field into the reggroup object to hold the Python
object pointer for each reggroup, however, as reggroups are never
deleted within GDB, and are global (not per-architecture) a simpler
solution seemed to be just to hold a single global map from reggroup
pointer to a Python object representing the reggroup.  Then we can
reuse the objects out of this map.

After this commit it is possible for a user to tell that two
gdb.RegisterGroup objects are now identical when previously they were
unique, however, as both these objects are read-only I don't think
this should be a problem.

There should be no other user visible changes after this commit.

gdb/ChangeLog:

        * python/py-registers.c : Add 'unordered_map' include.
        (gdbpy_new_reggroup): Renamed to...
        (gdbpy_get_reggroup): ...this.  Update to only create register
        group descriptors when needed.
        (gdbpy_reggroup_iter_next): Update.

gdb/testsuite/ChangeLog:

        * gdb.python/py-arch-reg-groups.exp: Additional tests.
---
 gdb/ChangeLog                                 |  8 ++++
 gdb/python/py-registers.c                     | 40 ++++++++++++++-----
 gdb/testsuite/ChangeLog                       |  4 ++
 .../gdb.python/py-arch-reg-groups.exp         | 19 +++++++++
 4 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/gdb/python/py-registers.c b/gdb/python/py-registers.c
index fec84e5d5d4..5d0f5386533 100644
--- a/gdb/python/py-registers.c
+++ b/gdb/python/py-registers.c
@@ -23,6 +23,7 @@
 #include "disasm.h"
 #include "reggroups.h"
 #include "python-internal.h"
+#include <unordered_map>
 
 /* Token to access per-gdbarch data related to register descriptors.  */
 static struct gdbarch_data *gdbpy_register_object_data = NULL;
@@ -96,18 +97,37 @@ gdbpy_register_object_data_init (struct gdbarch *gdbarch)
   return (void *) vec;
 }
 
-/* Create a new gdb.RegisterGroup object wrapping REGGROUP.  */
+/* Return a gdb.RegisterGroup object wrapping REGGROUP.  The register
+   group objects are cached, and the same Python object will always be
+   returned for the same REGGROUP pointer.  */
 
 static PyObject *
-gdbpy_new_reggroup (struct reggroup *reggroup)
+gdbpy_get_reggroup (struct reggroup *reggroup)
 {
-  /* Create a new object and fill in its details.  */
-  reggroup_object *group
-    = PyObject_New (reggroup_object, &reggroup_object_type);
-  if (group == NULL)
-    return NULL;
-  group->reggroup = reggroup;
-  return (PyObject *) group;
+  /* Map from GDB's internal reggroup objects to the Python representation.
+     GDB's reggroups are global, and are never deleted, so using a map like
+     this is safe.  */
+  static std::unordered_map<struct reggroup *,const reggroup_object *>
+    gdbpy_reggroup_object_map;
+
+  /* If there is not already a suitable Python object in the map then
+     create a new one, and add it to the map.  */
+  if (gdbpy_reggroup_object_map[reggroup] == nullptr)
+    {
+      /* Create a new object and fill in its details.  */
+      reggroup_object *group
+ = PyObject_New (reggroup_object, &reggroup_object_type);
+      if (group == NULL)
+ return NULL;
+      group->reggroup = reggroup;
+      gdbpy_reggroup_object_map[reggroup] = group;
+    }
+
+  /* Fetch the Python object wrapping REGGROUP from the map, increment its
+     reference count and return it.  */
+  PyObject *obj = (PyObject *) gdbpy_reggroup_object_map[reggroup];
+  Py_INCREF (obj);
+  return obj;
 }
 
 /* Convert a gdb.RegisterGroup to a string, it just returns the name of
@@ -219,7 +239,7 @@ gdbpy_reggroup_iter_next (PyObject *self)
     }
 
   iter_obj->reggroup = next_group;
-  return gdbpy_new_reggroup (iter_obj->reggroup);
+  return gdbpy_get_reggroup (iter_obj->reggroup);
 }
 
 /* Return a new gdb.RegisterGroupsIterator over all the register groups in
diff --git a/gdb/testsuite/gdb.python/py-arch-reg-groups.exp b/gdb/testsuite/gdb.python/py-arch-reg-groups.exp
index ea9aa77b0fa..093de2e7390 100644
--- a/gdb/testsuite/gdb.python/py-arch-reg-groups.exp
+++ b/gdb/testsuite/gdb.python/py-arch-reg-groups.exp
@@ -85,3 +85,22 @@ for { set i 0 } { $i < [llength $groups] } { incr i } {
     }
 }
 gdb_assert { $found_non_match == 0 } "all register groups match"
+
+# Check that we get the same register group descriptors from two
+# different iterators.
+gdb_py_test_silent_cmd \
+    "python iter1 = arch.register_groups ()" \
+    "get first all register group iterator" 0
+gdb_py_test_silent_cmd \
+    "python iter2 = arch.register_groups ()" \
+    "get second all register group iterator" 0
+gdb_py_test_silent_cmd \
+    [multi_line_input \
+ "python" \
+ "for r1, r2 in zip(iter1, iter2):" \
+ "  if (r1.name != r2.name):"\
+ "    raise gdb.GdbError (\"miss-matched names\")" \
+ "  if (r1 != r2):" \
+ "    raise gdb.GdbError (\"miss-matched objects\")" \
+ "\004" ] \
+    "check names and objects match" 1
--
2.25.4

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Tom Tromey-2
In reply to this post by Andrew Burgess
>>>>> "Andrew" == Andrew Burgess <[hidden email]> writes:

Andrew> * python/py-registers.c (gdbpy_register_object_data): New static
Andrew> global.
Andrew> (gdbpy_register_object_data_init): New function.
Andrew> (gdbpy_new_register_descriptor): Renamed to...
Andrew> (gdbpy_get_register_descriptor): ...this, and update to reuse
Andrew> existing register descriptors where possible.
Andrew> (gdbpy_register_descriptor_iter_next): Update.
Andrew> (gdbpy_initialize_registers): Register new gdbarch data.

This looks reasonable to me.

Andrew> +static void *
Andrew> +gdbpy_register_object_data_init (struct gdbarch *gdbarch)
Andrew> +{
Andrew> +  std::vector<const register_descriptor_object *> *vec
Andrew> +    = new (std::vector<const register_descriptor_object *>);
Andrew> +  return (void *) vec;

It seems a bit strange to use "const" here, since the refcounts do
change, and the code has to cast away const elsewhere.

Also I wonder if a vector of PyObject* or even gdb_pyref<> would be
better.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/2] gdb/python: Reuse gdb.RegisterGroup objects where possible

Tom Tromey-2
In reply to this post by Andrew Burgess
>>>>> "Andrew" == Andrew Burgess <[hidden email]> writes:

Andrew> gdb/ChangeLog:

Andrew> * python/py-registers.c : Add 'unordered_map' include.
Andrew> (gdbpy_new_reggroup): Renamed to...
Andrew> (gdbpy_get_reggroup): ...this.  Update to only create register
Andrew> group descriptors when needed.
Andrew> (gdbpy_reggroup_iter_next): Update.

The idea looks good to me.

Andrew> +  /* Map from GDB's internal reggroup objects to the Python representation.
Andrew> +     GDB's reggroups are global, and are never deleted, so using a map like
Andrew> +     this is safe.  */
Andrew> +  static std::unordered_map<struct reggroup *,const reggroup_object *>
Andrew> +    gdbpy_reggroup_object_map;

The same comment about "const", etc, applies here as in the previous
patch.

Tom
Reply | Threaded
Open this post in threaded view
|

[PATCHv2 0/2] Improvements to Python Register Descriptor API

Andrew Burgess
In reply to this post by Andrew Burgess
Changes from v1:

  - Data structures in both patches now hold gdbpy_ref<> as suggested
    by Tom.

Thanks,
Andrew

---

Andrew Burgess (2):
  gdb/python: Reuse gdb.RegisterDescriptor objects where possible
  gdb/python: Reuse gdb.RegisterGroup objects where possible

 gdb/ChangeLog                                 |  19 ++++
 gdb/python/py-registers.c                     | 102 +++++++++++++-----
 gdb/testsuite/ChangeLog                       |   8 ++
 .../gdb.python/py-arch-reg-groups.exp         |  19 ++++
 .../gdb.python/py-arch-reg-names.exp          |  19 ++++
 5 files changed, 143 insertions(+), 24 deletions(-)

--
2.25.4

Reply | Threaded
Open this post in threaded view
|

[PATCHv2 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Andrew Burgess
Instead of having the gdb.RegisterDescriptorIterator creating new
gdb.RegisterDescriptor objects for each regnum, instead cache
gdb.RegisterDescriptor objects on the gdbarch object and reuse these.

This means that for every gdbarch/regnum pair there is a single unique
gdb.RegisterDescriptor, this feels like a neater implementation than
the existing one.

It is possible for a user to see (in Python code) that the descriptors
are now identical, but as the descriptors are read-only this should
make no real difference.

There should be no other user visible changes.

gdb/ChangeLog:

        * python/py-registers.c (gdbpy_register_object_data): New static
        global.
        (gdbpy_register_object_data_init): New function.
        (gdbpy_new_register_descriptor): Renamed to...
        (gdbpy_get_register_descriptor): ...this, and update to reuse
        existing register descriptors where possible.
        (gdbpy_register_descriptor_iter_next): Update.
        (gdbpy_initialize_registers): Register new gdbarch data.

gdb/testsuite/ChangeLog:

        * gdb.python/py-arch-reg-names.exp: Additional tests.
---
 gdb/ChangeLog                                 | 11 ++++
 gdb/python/py-registers.c                     | 61 +++++++++++++++----
 gdb/testsuite/ChangeLog                       |  4 ++
 .../gdb.python/py-arch-reg-names.exp          | 19 ++++++
 4 files changed, 82 insertions(+), 13 deletions(-)

diff --git a/gdb/python/py-registers.c b/gdb/python/py-registers.c
index db0fe37eecb..8e22a919d87 100644
--- a/gdb/python/py-registers.c
+++ b/gdb/python/py-registers.c
@@ -24,6 +24,9 @@
 #include "reggroups.h"
 #include "python-internal.h"
 
+/* Token to access per-gdbarch data related to register descriptors.  */
+static struct gdbarch_data *gdbpy_register_object_data = NULL;
+
 /* Structure for iterator over register descriptors.  */
 typedef struct {
   PyObject_HEAD
@@ -81,6 +84,17 @@ typedef struct {
 extern PyTypeObject reggroup_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("reggroup_object");
 
+/* Associates a vector of gdb.RegisterDescriptor objects with GDBARCH as
+   gdbarch_data via the gdbarch post init registration mechanism
+   (gdbarch_data_register_post_init).  */
+
+static void *
+gdbpy_register_object_data_init (struct gdbarch *gdbarch)
+{
+  std::vector<gdbpy_ref<>> *vec = new (std::vector<gdbpy_ref<>>);
+  return (void *) vec;
+}
+
 /* Create a new gdb.RegisterGroup object wrapping REGGROUP.  */
 
 static PyObject *
@@ -117,20 +131,38 @@ gdbpy_reggroup_name (PyObject *self, void *closure)
   return gdbpy_reggroup_to_string (self);
 }
 
-/* Create an return a new gdb.RegisterDescriptor object.  */
-static PyObject *
-gdbpy_new_register_descriptor (struct gdbarch *gdbarch,
+/* Return a gdb.RegisterDescriptor object for REGNUM from GDBARCH.  For
+   each REGNUM (in GDBARCH) only one descriptor is ever created, which is
+   then cached on the GDBARCH.  */
+
+static gdbpy_ref<>
+gdbpy_get_register_descriptor (struct gdbarch *gdbarch,
        int regnum)
 {
-  /* Create a new object and fill in its details.  */
-  register_descriptor_object *reg
-    = PyObject_New (register_descriptor_object,
-    &register_descriptor_object_type);
-  if (reg == NULL)
-    return NULL;
-  reg->regnum = regnum;
-  reg->gdbarch = gdbarch;
-  return (PyObject *) reg;
+  auto vec = (std::vector<gdbpy_ref<>> *) gdbarch_data
+    (gdbarch, gdbpy_register_object_data);
+
+  /* Ensure that we have enough entries in the vector.  */
+  if (vec->size () <= regnum)
+    vec->resize ((regnum + 1), nullptr);
+
+  /* If we don't already have a descriptor for REGNUM in GDBARCH then
+     create one now.  */
+  if (vec->at (regnum) == nullptr)
+    {
+      gdbpy_ref <register_descriptor_object> reg
+ (PyObject_New (register_descriptor_object,
+       &register_descriptor_object_type));
+      if (reg == NULL)
+ return NULL;
+      reg->regnum = regnum;
+      reg->gdbarch = gdbarch;
+      vec->at (regnum) = gdbpy_ref<> ((PyObject *) reg.release ());
+    }
+
+  /* Grab the register descriptor from the vector, the reference count is
+     automatically incremented thanks to gdbpy_ref.  */
+  return vec->at (regnum);
 }
 
 /* Convert the register descriptor to a string.  */
@@ -281,7 +313,7 @@ gdbpy_register_descriptor_iter_next (PyObject *self)
       iter_obj->regnum++;
 
       if (name != nullptr && *name != '\0')
- return gdbpy_new_register_descriptor (gdbarch, regnum);
+ return gdbpy_get_register_descriptor (gdbarch, regnum).release ();
     }
   while (true);
 }
@@ -291,6 +323,9 @@ gdbpy_register_descriptor_iter_next (PyObject *self)
 int
 gdbpy_initialize_registers ()
 {
+  gdbpy_register_object_data
+    = gdbarch_data_register_post_init (gdbpy_register_object_data_init);
+
   register_descriptor_object_type.tp_new = PyType_GenericNew;
   if (PyType_Ready (&register_descriptor_object_type) < 0)
     return -1;
diff --git a/gdb/testsuite/gdb.python/py-arch-reg-names.exp b/gdb/testsuite/gdb.python/py-arch-reg-names.exp
index 14bc0a822a4..8dd34ef5fd2 100644
--- a/gdb/testsuite/gdb.python/py-arch-reg-names.exp
+++ b/gdb/testsuite/gdb.python/py-arch-reg-names.exp
@@ -85,3 +85,22 @@ for { set i 0 } { $i < [llength $regs] } { incr i } {
     }
 }
 gdb_assert { $found_non_match == 0 } "all registers match"
+
+# Check that we get the same register descriptors from two different
+# iterators.
+gdb_py_test_silent_cmd \
+    "python iter1 = arch.registers ()" \
+    "get first all register iterator" 0
+gdb_py_test_silent_cmd \
+    "python iter2 = arch.registers ()" \
+    "get second all register iterator" 0
+gdb_py_test_silent_cmd \
+    [multi_line_input \
+ "python" \
+ "for r1, r2 in zip(iter1, iter2):" \
+ "  if (r1.name != r2.name):"\
+ "    raise gdb.GdbError (\"miss-matched names\")" \
+ "  if (r1 != r2):" \
+ "    raise gdb.GdbError (\"miss-matched objects\")" \
+ "\004" ] \
+    "check names and objects match" 1
--
2.25.4

Reply | Threaded
Open this post in threaded view
|

[PATCHv2 2/2] gdb/python: Reuse gdb.RegisterGroup objects where possible

Andrew Burgess
In reply to this post by Andrew Burgess
Only create one gdb.RegisterGroup Python object for each of GDB's
reggroup objects.

I could have added a field into the reggroup object to hold the Python
object pointer for each reggroup, however, as reggroups are never
deleted within GDB, and are global (not per-architecture) a simpler
solution seemed to be just to hold a single global map from reggroup
pointer to a Python object representing the reggroup.  Then we can
reuse the objects out of this map.

After this commit it is possible for a user to tell that two
gdb.RegisterGroup objects are now identical when previously they were
unique, however, as both these objects are read-only I don't think
this should be a problem.

There should be no other user visible changes after this commit.

gdb/ChangeLog:

        * python/py-registers.c : Add 'unordered_map' include.
        (gdbpy_new_reggroup): Renamed to...
        (gdbpy_get_reggroup): ...this.  Update to only create register
        group descriptors when needed.
        (gdbpy_reggroup_iter_next): Update.

gdb/testsuite/ChangeLog:

        * gdb.python/py-arch-reg-groups.exp: Additional tests.
---
 gdb/ChangeLog                                 |  8 ++++
 gdb/python/py-registers.c                     | 41 ++++++++++++++-----
 gdb/testsuite/ChangeLog                       |  4 ++
 .../gdb.python/py-arch-reg-groups.exp         | 19 +++++++++
 4 files changed, 61 insertions(+), 11 deletions(-)

diff --git a/gdb/python/py-registers.c b/gdb/python/py-registers.c
index 8e22a919d87..9396498cc34 100644
--- a/gdb/python/py-registers.c
+++ b/gdb/python/py-registers.c
@@ -23,6 +23,7 @@
 #include "disasm.h"
 #include "reggroups.h"
 #include "python-internal.h"
+#include <unordered_map>
 
 /* Token to access per-gdbarch data related to register descriptors.  */
 static struct gdbarch_data *gdbpy_register_object_data = NULL;
@@ -95,18 +96,36 @@ gdbpy_register_object_data_init (struct gdbarch *gdbarch)
   return (void *) vec;
 }
 
-/* Create a new gdb.RegisterGroup object wrapping REGGROUP.  */
+/* Return a gdb.RegisterGroup object wrapping REGGROUP.  The register
+   group objects are cached, and the same Python object will always be
+   returned for the same REGGROUP pointer.  */
 
-static PyObject *
-gdbpy_new_reggroup (struct reggroup *reggroup)
+static gdbpy_ref<>
+gdbpy_get_reggroup (struct reggroup *reggroup)
 {
-  /* Create a new object and fill in its details.  */
-  reggroup_object *group
-    = PyObject_New (reggroup_object, &reggroup_object_type);
-  if (group == NULL)
-    return NULL;
-  group->reggroup = reggroup;
-  return (PyObject *) group;
+  /* Map from GDB's internal reggroup objects to the Python representation.
+     GDB's reggroups are global, and are never deleted, so using a map like
+     this is safe.  */
+  static std::unordered_map<struct reggroup *,gdbpy_ref<>>
+    gdbpy_reggroup_object_map;
+
+  /* If there is not already a suitable Python object in the map then
+     create a new one, and add it to the map.  */
+  if (gdbpy_reggroup_object_map[reggroup] == nullptr)
+    {
+      /* Create a new object and fill in its details.  */
+      gdbpy_ref<reggroup_object> group
+ (PyObject_New (reggroup_object, &reggroup_object_type));
+      if (group == NULL)
+ return NULL;
+      group->reggroup = reggroup;
+      gdbpy_reggroup_object_map[reggroup]
+ = gdbpy_ref<> ((PyObject *) group.release ());
+    }
+
+  /* Fetch the Python object wrapping REGGROUP from the map, increasing
+     the reference count is handled by the gdbpy_ref class.  */
+  return gdbpy_reggroup_object_map[reggroup];
 }
 
 /* Convert a gdb.RegisterGroup to a string, it just returns the name of
@@ -215,7 +234,7 @@ gdbpy_reggroup_iter_next (PyObject *self)
     }
 
   iter_obj->reggroup = next_group;
-  return gdbpy_new_reggroup (iter_obj->reggroup);
+  return gdbpy_get_reggroup (iter_obj->reggroup).release ();
 }
 
 /* Return a new gdb.RegisterGroupsIterator over all the register groups in
diff --git a/gdb/testsuite/gdb.python/py-arch-reg-groups.exp b/gdb/testsuite/gdb.python/py-arch-reg-groups.exp
index ea9aa77b0fa..093de2e7390 100644
--- a/gdb/testsuite/gdb.python/py-arch-reg-groups.exp
+++ b/gdb/testsuite/gdb.python/py-arch-reg-groups.exp
@@ -85,3 +85,22 @@ for { set i 0 } { $i < [llength $groups] } { incr i } {
     }
 }
 gdb_assert { $found_non_match == 0 } "all register groups match"
+
+# Check that we get the same register group descriptors from two
+# different iterators.
+gdb_py_test_silent_cmd \
+    "python iter1 = arch.register_groups ()" \
+    "get first all register group iterator" 0
+gdb_py_test_silent_cmd \
+    "python iter2 = arch.register_groups ()" \
+    "get second all register group iterator" 0
+gdb_py_test_silent_cmd \
+    [multi_line_input \
+ "python" \
+ "for r1, r2 in zip(iter1, iter2):" \
+ "  if (r1.name != r2.name):"\
+ "    raise gdb.GdbError (\"miss-matched names\")" \
+ "  if (r1 != r2):" \
+ "    raise gdb.GdbError (\"miss-matched objects\")" \
+ "\004" ] \
+    "check names and objects match" 1
--
2.25.4

Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv2 0/2] Improvements to Python Register Descriptor API

Tom Tromey-2
In reply to this post by Andrew Burgess
>>>>> "Andrew" == Andrew Burgess <[hidden email]> writes:

Andrew> Andrew Burgess (2):
Andrew>   gdb/python: Reuse gdb.RegisterDescriptor objects where possible
Andrew>   gdb/python: Reuse gdb.RegisterGroup objects where possible

These look fine to me.  Thank you.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv2 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Pedro Alves-2
In reply to this post by Andrew Burgess
Hi Andrew,

Some nits below.

On 7/14/20 6:14 PM, Andrew Burgess wrote:

>  
> +/* Token to access per-gdbarch data related to register descriptors.  */
> +static struct gdbarch_data *gdbpy_register_object_data = NULL;
> +
>  /* Structure for iterator over register descriptors.  */
>  typedef struct {
>    PyObject_HEAD
> @@ -81,6 +84,17 @@ typedef struct {
>  extern PyTypeObject reggroup_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("reggroup_object");
>  
> +/* Associates a vector of gdb.RegisterDescriptor objects with GDBARCH as
> +   gdbarch_data via the gdbarch post init registration mechanism
> +   (gdbarch_data_register_post_init).  */
> +
> +static void *
> +gdbpy_register_object_data_init (struct gdbarch *gdbarch)
> +{
> +  std::vector<gdbpy_ref<>> *vec = new (std::vector<gdbpy_ref<>>);
> +  return (void *) vec;

This could just be:

  return new std::vector<gdbpy_ref<>>;


> +}
> +
>  /* Create a new gdb.RegisterGroup object wrapping REGGROUP.  */
>  
>  static PyObject *
> @@ -117,20 +131,38 @@ gdbpy_reggroup_name (PyObject *self, void *closure)
>    return gdbpy_reggroup_to_string (self);
>  }
>  
> -/* Create an return a new gdb.RegisterDescriptor object.  */
> -static PyObject *
> -gdbpy_new_register_descriptor (struct gdbarch *gdbarch,
> +/* Return a gdb.RegisterDescriptor object for REGNUM from GDBARCH.  For
> +   each REGNUM (in GDBARCH) only one descriptor is ever created, which is
> +   then cached on the GDBARCH.  */
> +
> +static gdbpy_ref<>
> +gdbpy_get_register_descriptor (struct gdbarch *gdbarch,
>         int regnum)
>  {
> -  /* Create a new object and fill in its details.  */
> -  register_descriptor_object *reg
> -    = PyObject_New (register_descriptor_object,
> -    &register_descriptor_object_type);
> -  if (reg == NULL)
> -    return NULL;
> -  reg->regnum = regnum;
> -  reg->gdbarch = gdbarch;
> -  return (PyObject *) reg;
> +  auto vec = (std::vector<gdbpy_ref<>> *) gdbarch_data
> +    (gdbarch, gdbpy_register_object_data);

Wrap the rhs in parens so the second line is properly aligned,
per the GNU standard's "so that emacs aligns it automatically"
rule:

  auto vec = ((std::vector<gdbpy_ref<>> *) gdbarch_data
              (gdbarch, gdbpy_register_object_data));

Or you could put the = in the next line, so that it aligns
without the parens:

  auto vec
    = (std::vector<gdbpy_ref<>> *) gdbarch_data (gdbarch,
                                                 gdbpy_register_object_data);

Note, I think it's more idiomatic to write the * for pointers:

  auto *vec = ....

But since vec can't be NULL, I'd write a reference instead:

  auto &vec
    = *(std::vector<gdbpy_ref<>> *) gdbarch_data (gdbarch,
                                                  gdbpy_register_object_data);

> +
> +  /* Ensure that we have enough entries in the vector.  */
> +  if (vec->size () <= regnum)
> +    vec->resize ((regnum + 1), nullptr);
> +
> +  /* If we don't already have a descriptor for REGNUM in GDBARCH then
> +     create one now.  */
> +  if (vec->at (regnum) == nullptr)

Is there a reason you're using at?  Was it to avoid

 (*vec)[regnum]

or was it really about out-of-range errors?

If such as exception were thrown, it would be a logic
bug in GDB.  And note that we don't catch it anywhere, so
it would bring down GDB.

There's a school of thought that claims that at should not
really exist, and I agree with it.  :-)

If you go the reference route, then you can write the natural:

  if (vec[regnum] = nullptr)

> +    {
> +      gdbpy_ref <register_descriptor_object> reg
> + (PyObject_New (register_descriptor_object,
> +       &register_descriptor_object_type));
> +      if (reg == NULL)
> + return NULL;
> +      reg->regnum = regnum;
> +      reg->gdbarch = gdbarch;
> +      vec->at (regnum) = gdbpy_ref<> ((PyObject *) reg.release ());
> +    }
> +
> +  /* Grab the register descriptor from the vector, the reference count is
> +     automatically incremented thanks to gdbpy_ref.  */
> +  return vec->at (regnum);

Ditto in these two other spots.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv2 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Andrew Burgess
* Pedro Alves <[hidden email]> [2020-07-22 14:10:17 +0100]:

> Hi Andrew,
>
> Some nits below.
>
> On 7/14/20 6:14 PM, Andrew Burgess wrote:
>
> >  
> > +/* Token to access per-gdbarch data related to register descriptors.  */
> > +static struct gdbarch_data *gdbpy_register_object_data = NULL;
> > +
> >  /* Structure for iterator over register descriptors.  */
> >  typedef struct {
> >    PyObject_HEAD
> > @@ -81,6 +84,17 @@ typedef struct {
> >  extern PyTypeObject reggroup_object_type
> >      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("reggroup_object");
> >  
> > +/* Associates a vector of gdb.RegisterDescriptor objects with GDBARCH as
> > +   gdbarch_data via the gdbarch post init registration mechanism
> > +   (gdbarch_data_register_post_init).  */
> > +
> > +static void *
> > +gdbpy_register_object_data_init (struct gdbarch *gdbarch)
> > +{
> > +  std::vector<gdbpy_ref<>> *vec = new (std::vector<gdbpy_ref<>>);
> > +  return (void *) vec;
>
> This could just be:
>
>   return new std::vector<gdbpy_ref<>>;
>
>
> > +}
> > +
> >  /* Create a new gdb.RegisterGroup object wrapping REGGROUP.  */
> >  
> >  static PyObject *
> > @@ -117,20 +131,38 @@ gdbpy_reggroup_name (PyObject *self, void *closure)
> >    return gdbpy_reggroup_to_string (self);
> >  }
> >  
> > -/* Create an return a new gdb.RegisterDescriptor object.  */
> > -static PyObject *
> > -gdbpy_new_register_descriptor (struct gdbarch *gdbarch,
> > +/* Return a gdb.RegisterDescriptor object for REGNUM from GDBARCH.  For
> > +   each REGNUM (in GDBARCH) only one descriptor is ever created, which is
> > +   then cached on the GDBARCH.  */
> > +
> > +static gdbpy_ref<>
> > +gdbpy_get_register_descriptor (struct gdbarch *gdbarch,
> >         int regnum)
> >  {
> > -  /* Create a new object and fill in its details.  */
> > -  register_descriptor_object *reg
> > -    = PyObject_New (register_descriptor_object,
> > -    &register_descriptor_object_type);
> > -  if (reg == NULL)
> > -    return NULL;
> > -  reg->regnum = regnum;
> > -  reg->gdbarch = gdbarch;
> > -  return (PyObject *) reg;
> > +  auto vec = (std::vector<gdbpy_ref<>> *) gdbarch_data
> > +    (gdbarch, gdbpy_register_object_data);
>
> Wrap the rhs in parens so the second line is properly aligned,
> per the GNU standard's "so that emacs aligns it automatically"
> rule:
>
>   auto vec = ((std::vector<gdbpy_ref<>> *) gdbarch_data
>      (gdbarch, gdbpy_register_object_data));
>
> Or you could put the = in the next line, so that it aligns
> without the parens:
>
>   auto vec
>     = (std::vector<gdbpy_ref<>> *) gdbarch_data (gdbarch,
> gdbpy_register_object_data);
>
> Note, I think it's more idiomatic to write the * for pointers:
>
>   auto *vec = ....
>
> But since vec can't be NULL, I'd write a reference instead:
>
>   auto &vec
>     = *(std::vector<gdbpy_ref<>> *) gdbarch_data (gdbarch,
>  gdbpy_register_object_data);
>
> > +
> > +  /* Ensure that we have enough entries in the vector.  */
> > +  if (vec->size () <= regnum)
> > +    vec->resize ((regnum + 1), nullptr);
> > +
> > +  /* If we don't already have a descriptor for REGNUM in GDBARCH then
> > +     create one now.  */
> > +  if (vec->at (regnum) == nullptr)
>
> Is there a reason you're using at?  Was it to avoid
>
>  (*vec)[regnum]
>
> or was it really about out-of-range errors?
>
> If such as exception were thrown, it would be a logic
> bug in GDB.  And note that we don't catch it anywhere, so
> it would bring down GDB.
>
> There's a school of thought that claims that at should not
> really exist, and I agree with it.  :-)
>
> If you go the reference route, then you can write the natural:
>
>   if (vec[regnum] = nullptr)
>
> > +    {
> > +      gdbpy_ref <register_descriptor_object> reg
> > + (PyObject_New (register_descriptor_object,
> > +       &register_descriptor_object_type));
> > +      if (reg == NULL)
> > + return NULL;
> > +      reg->regnum = regnum;
> > +      reg->gdbarch = gdbarch;
> > +      vec->at (regnum) = gdbpy_ref<> ((PyObject *) reg.release ());
> > +    }
> > +
> > +  /* Grab the register descriptor from the vector, the reference count is
> > +     automatically incremented thanks to gdbpy_ref.  */
> > +  return vec->at (regnum);
>
> Ditto in these two other spots.

Pedro,

Thanks for your feedback.  As this patch was already merged, the patch
below applies on top of current master to address these issues.

If I don't hear anything I'll apply this in a couple of days.

Thanks,
Andrew

---

commit ed3d57aca93bfe3e0eaa5888486b9fc84d06a0c6
Author: Andrew Burgess <[hidden email]>
Date:   Wed Jul 22 14:57:55 2020 +0100

    gdb/python: Use reference not pointer in py-registers.c
   
    Pedro's review comments arrived after I'd already committed this
    change:
   
      commit f7306dac19c502232f766c3881313857915f330d
      Date:   Tue Jul 7 15:00:30 2020 +0100
   
          gdb/python: Reuse gdb.RegisterDescriptor objects where possible
   
    See:
   
      https://sourceware.org/pipermail/gdb-patches/2020-July/170726.html
   
    There should be no user visible changes after this commit.
   
    gdb/ChangeLog:
   
            * python/py-registers.c (gdbpy_register_object_data_init): Remove
            redundant local variable.
            (gdbpy_get_register_descriptor): Extract descriptor vector as a
            reference, not pointer, update code accordingly.

diff --git a/gdb/python/py-registers.c b/gdb/python/py-registers.c
index 9396498cc34..f64ca3c401b 100644
--- a/gdb/python/py-registers.c
+++ b/gdb/python/py-registers.c
@@ -92,8 +92,7 @@ extern PyTypeObject reggroup_object_type
 static void *
 gdbpy_register_object_data_init (struct gdbarch *gdbarch)
 {
-  std::vector<gdbpy_ref<>> *vec = new (std::vector<gdbpy_ref<>>);
-  return (void *) vec;
+  return new std::vector<gdbpy_ref<>>;
 }
 
 /* Return a gdb.RegisterGroup object wrapping REGGROUP.  The register
@@ -158,16 +157,17 @@ static gdbpy_ref<>
 gdbpy_get_register_descriptor (struct gdbarch *gdbarch,
        int regnum)
 {
-  auto vec = (std::vector<gdbpy_ref<>> *) gdbarch_data
-    (gdbarch, gdbpy_register_object_data);
+  auto &vec
+    = *(std::vector<gdbpy_ref<>> *) gdbarch_data (gdbarch,
+  gdbpy_register_object_data);
 
   /* Ensure that we have enough entries in the vector.  */
-  if (vec->size () <= regnum)
-    vec->resize ((regnum + 1), nullptr);
+  if (vec.size () <= regnum)
+    vec.resize ((regnum + 1), nullptr);
 
   /* If we don't already have a descriptor for REGNUM in GDBARCH then
      create one now.  */
-  if (vec->at (regnum) == nullptr)
+  if (vec[regnum] == nullptr)
     {
       gdbpy_ref <register_descriptor_object> reg
  (PyObject_New (register_descriptor_object,
@@ -176,12 +176,12 @@ gdbpy_get_register_descriptor (struct gdbarch *gdbarch,
  return NULL;
       reg->regnum = regnum;
       reg->gdbarch = gdbarch;
-      vec->at (regnum) = gdbpy_ref<> ((PyObject *) reg.release ());
+      vec[regnum] = gdbpy_ref<> ((PyObject *) reg.release ());
     }
 
   /* Grab the register descriptor from the vector, the reference count is
      automatically incremented thanks to gdbpy_ref.  */
-  return vec->at (regnum);
+  return vec[regnum];
 }
 
 /* Convert the register descriptor to a string.  */
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv2 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Pedro Alves-2
On 7/22/20 3:05 PM, Andrew Burgess wrote:
> * Pedro Alves <[hidden email]> [2020-07-22 14:10:17 +0100]:
>
>> Hi Andrew,
>>

Hi Andrew,

> Thanks for your feedback.  As this patch was already merged, the patch
> below applies on top of current master to address these issues.

Sorry, I didn't realize it was already in.

>
> If I don't hear anything I'll apply this in a couple of days.

It looks great.  Please merge it.

Thanks!
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv2 1/2] gdb/python: Reuse gdb.RegisterDescriptor objects where possible

Andrew Burgess
* Pedro Alves <[hidden email]> [2020-07-22 15:32:40 +0100]:

> On 7/22/20 3:05 PM, Andrew Burgess wrote:
> > * Pedro Alves <[hidden email]> [2020-07-22 14:10:17 +0100]:
> >
> >> Hi Andrew,
> >>
>
> Hi Andrew,
>
> > Thanks for your feedback.  As this patch was already merged, the patch
> > below applies on top of current master to address these issues.
>
> Sorry, I didn't realize it was already in.

Not a problem.  Always happy to get feedback on how to improve
patches.

>
> >
> > If I don't hear anything I'll apply this in a couple of days.
>
> It looks great.  Please merge it.

Done.

Thanks,
Andrew