       mbsprint: improve printing output when it has invalid UTF data - sacc - sacc(omys), simple console gopher client (mirror)
  HTML git clone https://git.parazyd.org/sacc
   DIR commit edab539b23594219bbfc83729822da917a18a243
   DIR parent c416c8c73d0a33eb8c428b1a9b9eaaffc098ee5b
  HTML Author: Hiltjo Posthuma <hiltjo@codemadness.org>
       Date:   Tue,  5 Jan 2021 21:21:03 +0100
       
       mbsprint: improve printing output when it has invalid UTF data
       
       Reset the decode state when mbtowc returns -1. The OpenBSD mbtowc(3)
       man page says: "If a call to mbtowc() resulted in an undefined internal
       state, mbtowc() must be called with s set to NULL to reset the internal
       state before it can safely be used again."
       
       Print the Unicode replacement character (U+FFFD) in place of an
       invalid or incomplete sequence and continue printing the line
       (instead of stopping at the first bad byte).
       
       Remove the check for a 0 return value: mbtowc only returns 0 for a
       NUL byte, which can't happen because the loop is already bounded by
       the string length.
       
       Diffstat:
         M sacc.c                              |      12 +++++++++---
       
       1 file changed, 9 insertions(+), 3 deletions(-)
       ---
   DIR diff --git a/sacc.c b/sacc.c
       @@ -110,12 +110,18 @@ mbsprint(const char *s, size_t len)
        
                slen = strlen(s);
                for (i = 0; i < slen; i += rl) {
       -                if ((rl = mbtowc(&wc, s + i, slen - i < 4 ? slen - i : 4)) <= 0)
       -                        break;
       +                rl = mbtowc(&wc, s + i, slen - i < 4 ? slen - i : 4);
       +                if (rl == -1) {
       +                        mbtowc(NULL, NULL, 0); /* reset state */
       +                        fputs("\xef\xbf\xbd", stdout); /* replacement character */
       +                        col++;
       +                        rl = 1;
       +                        continue;
       +                }
                        if ((w = wcwidth(wc)) == -1)
                                continue;
                        if (col + w > len || (col + w == len && s[i + rl])) {
       -                        fputs("\xe2\x80\xa6", stdout);
       +                        fputs("\xe2\x80\xa6", stdout); /* ellipsis */
                                col++;
                                break;
                        }
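
The reset-and-replace pattern from this commit can be tried outside sacc with a small sketch like the one below. The helper name sanitize_mbs and its buffer-based interface are hypothetical and not part of sacc.c. They are used here only so the result can be inspected instead of being written to stdout. The sketch assumes the caller has already selected a UTF-8 LC_CTYPE locale with setlocale(). In other locales mbtowc may accept or reject different bytes.

```c
#include <locale.h>	/* for the caller's setlocale(LC_CTYPE, ...) */
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in sacc): copy s into out, replacing every byte
 * that mbtowc() cannot decode with U+FFFD encoded as UTF-8.
 * Returns the number of bytes written, excluding the terminating NUL.
 * Output is truncated if it does not fit in outsz bytes.
 */
size_t
sanitize_mbs(const char *s, char *out, size_t outsz)
{
	static const char repl[] = "\xef\xbf\xbd"; /* U+FFFD */
	size_t i, slen, o = 0;
	wchar_t wc;
	int rl;

	if (outsz == 0)
		return 0;
	slen = strlen(s);
	for (i = 0; i < slen; i += rl) {
		rl = mbtowc(&wc, s + i, slen - i < 4 ? slen - i : 4);
		if (rl == -1) {
			mbtowc(NULL, NULL, 0); /* reset the decode state */
			if (o + 3 >= outsz)
				break;
			memcpy(out + o, repl, 3);
			o += 3;
			rl = 1; /* skip one offending byte and resync */
			continue;
		}
		/* rl cannot be 0 here: s[i] is never NUL while i < slen */
		if (o + (size_t)rl >= outsz)
			break;
		memcpy(out + o, s + i, (size_t)rl);
		o += (size_t)rl;
	}
	out[o] = '\0';
	return o;
}
```

As in the patched mbsprint, only a single byte is skipped after a failed decode, so the loop can resynchronize on the next valid lead byte rather than dropping the rest of the line. A caller would typically run setlocale(LC_CTYPE, "") once at startup before using it.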