[patch v2] Fix cyrillic symbols show in tui

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

[patch v2] Fix cyrillic symbols show in tui

Sourceware - gdb-patches mailing list
From: Vahagn Vardanyan <[hidden email]>

Cyrillic symbols are truncated when assigned to signed char
and are shown as special symbols in tui.
The patch uses unsigned char to allow 256 bit encoding.

gdb/ChangeLog

        * tui/tui-winsource.c (tui_copy_source_line):
        Change type from char to unsigned char.
        * tui/tui-winsource.c (show_source_line):
        Use mvwaddstr instead of tui_puts.
---
 gdb/tui/tui-winsource.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gdb/tui/tui-winsource.c b/gdb/tui/tui-winsource.c
index b5ba59e2f7..598b632e25 100644
--- a/gdb/tui/tui-winsource.c
+++ b/gdb/tui/tui-winsource.c
@@ -87,7 +87,7 @@ tui_copy_source_line (const char **ptr, int line_no, int first_col,
     }
 
   int column = 0;
-  char c;
+  unsigned char c;
   do
     {
       int skip_bytes;
@@ -260,8 +260,8 @@ tui_source_window_base::show_source_line (int lineno)
   if (line->is_exec_point)
     tui_set_reverse_mode (handle.get (), true);
 
-  wmove (handle.get (), lineno, TUI_EXECINFO_SIZE);
-  tui_puts (line->line.c_str (), handle.get ());
+  mvwaddstr (handle.get(), lineno, 4,
+               (char *) line->line.c_str ());
   if (line->is_exec_point)
     tui_set_reverse_mode (handle.get (), false);
 
--
2.21.1 (Apple Git-122.3)

Reply | Threaded
Open this post in threaded view
|

Re: [patch v2] Fix cyrillic symbols show in tui

Simon Marchi-4
On 2020-03-31 5:16 a.m., vaag--- via Gdb-patches wrote:
> From: Vahagn Vardanyan <[hidden email]>
>
> Cyrillic symbols are truncated when assigned to signed char
> and are shown as special symbols in tui.
> The patch uses unsigned char to allow 256 bit encoding.

Hi,

I was able to reproduce using french accented characters.

Unfortunately, your patch fixes the special characters, but unfortunately it breaks
the coloring.  See the before/after screenshots.

Simon

after.png (4K) Download Attachment
before.png (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [patch v2] Fix cyrillic symbols show in tui

Tom Tromey-2
In reply to this post by Sourceware - gdb-patches mailing list
>>>>> ">" == vaag--- via Gdb-patches <[hidden email]> writes:

Thanks for the patch.

>> gdb/ChangeLog

>>         * tui/tui-winsource.c (tui_copy_source_line):
>>         Change type from char to unsigned char.
>>         * tui/tui-winsource.c (show_source_line):
>>         Use mvwaddstr instead of tui_puts.

This is in bugzilla -- see
https://sourceware.org/bugzilla/show_bug.cgi?id=25342

You should put this at the top of your ChangeLog entry (see ChangeLog
for examples):

        PR tui/25342

>> +  unsigned char c;

I tend to think the approach in the bug -- of using ISCNTRL -- is
preferable.

>>    do
>>      {
>>        int skip_bytes;
>> @@ -260,8 +260,8 @@ tui_source_window_base::show_source_line (int lineno)
>>    if (line->is_exec_point)
>>      tui_set_reverse_mode (handle.get (), true);
 
>> -  wmove (handle.get (), lineno, TUI_EXECINFO_SIZE);
>> -  tui_puts (line->line.c_str (), handle.get ());
>> +  mvwaddstr (handle.get(), lineno, 4,
>> +               (char *) line->line.c_str ());

As Simon points out, this will lose colorizing support.

Also there's the issue of how this will handle the case where a
multi-byte character gets cut off in the middle.

Perhaps this code needs convert to wchar_t and back again.  That seems
pretty unfortunate though.  The other alternative would be if there's a
way to change tui_puts so that over-long lines are (optionally)
truncated rather than overflowing.  I don't know if curses offers that
or not.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: [patch v2] Fix cyrillic symbols show in tui

Tom Tromey-2
>>>>> "Tom" == Tom Tromey <[hidden email]> writes:

Tom> Perhaps this code needs convert to wchar_t and back again.  That seems
Tom> pretty unfortunate though.  The other alternative would be if there's a
Tom> way to change tui_puts so that over-long lines are (optionally)
Tom> truncated rather than overflowing.  I don't know if curses offers that
Tom> or not.

I researched this a little.

Curses doesn't directly support control over this.  Bleah curses.

However, it seems it can maybe be accomplished anyway.  A few
approaches.

1. Use waddnstr or mvaddnstr to limit how many characters are emitted.
   To do this, I suppose that do_tui_putc would have to be reimplemented
   (to fix the \t hack), and also tui_puts_internal would have to be
   changed so that it does not emit a single character at a time.
   Instead, tui_puts_internal would be changed to emit strings of
   characters (limited to the remaining horizontal space), then check
   the column to see where the cursor ended up.

2. Use winsstr to insert characters rather than append them.  In theory
   this should cause characters to the right to be pushed off and
   disappear -- not wrap.  (The ncurses docs say this but as with all
   things curses, one must try it.)  For this to work we'd have to clear
   to the end of the line before emitting things.

3. Use pads instead of windows, and make sure the pads are sized so that
   wrapping cannot occur.  This would be a somewhat larger change.  I
   suppose it would have to be limited to the source window, since it
   wouldn't really be possible for the assembly window.

I'm not sure if #1 or #2 is more desirable.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: [patch v2] Fix cyrillic symbols show in tui

Tom Tromey-2
>>>>> "Tom" == Tom Tromey <[hidden email]> writes:

Tom> 1. Use waddnstr or mvaddnstr to limit how many characters are emitted.
Tom>    To do this, I suppose that do_tui_putc would have to be reimplemented
Tom>    (to fix the \t hack), and also tui_puts_internal would have to be
Tom>    changed so that it does not emit a single character at a time.
Tom>    Instead, tui_puts_internal would be changed to emit strings of
Tom>    characters (limited to the remaining horizontal space), then check
Tom>    the column to see where the cursor ended up.

Maybe we don't need the do_tui_putc change.
I'm not totally sure yet.

Could you try this?

Tom

diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index b5ee2a2b6b6..92a10420a5c 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -438,42 +438,90 @@ tui_write (const char *buf, size_t length)
 static void
 tui_puts_internal (WINDOW *w, const char *string, int *height)
 {
-  char c;
   int prev_col = 0;
   bool saw_nl = false;
+  /* Once we run off the end of the window, we keep processing escape
+     sequences, but otherwise suppress output.  */
+  bool suppressing = false;
+  int max_col = getmaxx (w);
 
-  while ((c = *string++) != 0)
+  while (true)
     {
-      if (c == '\n')
- saw_nl = true;
+      const char *next = strpbrk (string, "\n\1\2\033\t");
 
-      if (c == '\1' || c == '\2')
+      /* Print the plain text prefix, stopping at the window
+ boundary.  */
+      size_t n_chars = next == nullptr ? strlen (string) : next - string;
+      if (!suppressing && n_chars > 0)
  {
-  /* Ignore these, they are readline escape-marking
-     sequences.  */
- }
-      else
- {
-  if (c == '\033')
+  int col = getcurx (w);
+  waddnstr (w, string, std::min (n_chars, (size_t) (max_col - col)));
+  if (getcurx (w) == 0)
     {
-      size_t bytes_read = apply_ansi_escape (w, string - 1);
-      if (bytes_read > 0)
- {
-  string = string + bytes_read - 1;
-  continue;
- }
+      /* We wrapped.  */
+      suppressing = true;
     }
-  do_tui_putc (w, c);
+ }
 
-  if (height != nullptr)
+      if (height != nullptr)
+ {
+  int col = getcurx (w);
+  if (col <= prev_col)
+    ++*height;
+  prev_col = col;
+ }
+
+      /* We finished.  */
+      if (next == nullptr)
+ break;
+
+      switch (*next)
+ {
+ case '\1':
+ case '\2':
+  /* Ignore these, they are readline escape-marking
+     sequences.  */
+  ++next;
+  break;
+
+ case '\n':
+  saw_nl = true;
+  if (suppressing)
     {
-      int col = getcurx (w);
-      if (col <= prev_col)
- ++*height;
-      prev_col = col;
+      /* The cursor is already on the next line and there's
+ nothing to do.  */
+      suppressing = false;
     }
+  else
+    do_tui_putc (w, '\n');
+  ++next;
+  break;
+
+ case '\t':
+  do_tui_putc (w, '\t');
+  ++next;
+  break;
+
+ case '\033':
+  {
+    size_t bytes_read = apply_ansi_escape (w, next);
+    if (bytes_read > 0)
+      next += bytes_read;
+    else
+      {
+ /* Just drop the escape.  */
+ ++next;
+      }
+  }
+  break;
+
+ default:
+  gdb_assert_not_reached ("missing case in tui_puts_internal");
  }
+
+      string = next;
     }
+
   if (TUI_CMD_WIN != nullptr && w == TUI_CMD_WIN->handle.get ())
     update_cmdwin_start_line ();
   if (saw_nl)
diff --git a/gdb/tui/tui-winsource.c b/gdb/tui/tui-winsource.c
index b5ba59e2f7a..d2bdd9ed1f2 100644
--- a/gdb/tui/tui-winsource.c
+++ b/gdb/tui/tui-winsource.c
@@ -39,6 +39,7 @@
 #include "tui/tui-source.h"
 #include "tui/tui-disasm.h"
 #include "gdb_curses.h"
+#include "safe-ctype.h"
 
 /* Function to display the "main" routine.  */
 void
@@ -131,7 +132,7 @@ tui_copy_source_line (const char **ptr, int line_no, int first_col,
  {
   /* Nothing.  */
  }
-      else if (c < 040 && c != '\t')
+      else if (ISCNTRL (c) && c != '\t')
  {
   result.push_back ('^');
   result.push_back (c + 0100);