Hangs on ctrl-t with utf-8 input #34

Open
weinholt opened this issue Aug 8, 2018 · 4 comments

weinholt commented Aug 8, 2018

Hello,

Linenoise hangs when transposing two utf-8 characters with ctrl-t.

To reproduce, run "make example; ./example" and enter "åä" so your screen looks like this:

こんにちは> åä

Move the cursor to ä (positioning it between å and ä). Press ctrl-t.

rain-1 (Owner) commented Aug 8, 2018

Confirmed. I looked at lldb backtraces; what they have in common is linenoiseUtf8NextCharLen.

        case CTRL_T:    /* ctrl-t, swaps current character with previous. */
            if (l.pos > 0 && l.pos < l.len) {
                int aux = buf[l.pos-1];
                buf[l.pos-1] = buf[l.pos];
                buf[l.pos] = aux;
                if (l.pos != l.len-1) l.pos++;
                refreshLine(&l);
            }
            break;

The ctrl-t code that was merged does not work with the new Unicode API. It swaps two single bytes (buf[l.pos-1] and buf[l.pos]), so transposing multi-byte UTF-8 characters creates invalid data.

So there are 2 bugs here:

  • ctrl-t implementation
  • invalid unicode data causes infinite loop

And actually a third:

  • pasting in unicode text is very unpredictable and buggy
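
For illustration, here is a minimal sketch of a transpose that swaps whole UTF-8 characters instead of single bytes. With "åä" stored as C3 A5 C3 A4 and the cursor at byte 2, the byte-wise swap above yields C3 C3 A5 A4, which is not valid UTF-8. The helper names (utf8_char_len, utf8_prev_char_len, utf8_transpose) are hypothetical and are not part of linenoise; this is not necessarily how the project fixed it, and it handles single code points only, not combining sequences.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character starting at buf[pos], judged by its
 * lead byte. Returns 1 for invalid or truncated sequences so that callers
 * always make progress. (Hypothetical helper, not linenoise API.) */
static size_t utf8_char_len(const char *buf, size_t len, size_t pos) {
    unsigned char c = (unsigned char)buf[pos];
    size_t n;
    if (c < 0x80) return 1;
    else if (c >= 0xC2 && c <= 0xDF) n = 2;
    else if (c >= 0xE0 && c <= 0xEF) n = 3;
    else if (c >= 0xF0 && c <= 0xF4) n = 4;
    else return 1;                      /* stray continuation or bad lead byte */
    if (pos + n > len) return 1;        /* sequence runs past the buffer */
    for (size_t i = 1; i < n; i++)
        if (((unsigned char)buf[pos + i] & 0xC0) != 0x80) return 1;
    return n;
}

/* Byte length of the character that ends just before pos (pos must be > 0).
 * Steps back over at most three continuation bytes (10xxxxxx). */
static size_t utf8_prev_char_len(const char *buf, size_t pos) {
    size_t start = pos;
    do {
        start--;
    } while (start > 0 && pos - start < 4 &&
             ((unsigned char)buf[start] & 0xC0) == 0x80);
    return pos - start;
}

/* Swap the character before pos with the character at pos.
 * Returns the byte offset of the boundary between the two swapped
 * characters, or pos unchanged if there is nothing to swap. */
static size_t utf8_transpose(char *buf, size_t len, size_t pos) {
    if (pos == 0 || pos >= len) return pos;
    size_t prev = utf8_prev_char_len(buf, pos);
    size_t cur = utf8_char_len(buf, len, pos);
    char tmp[4];
    memcpy(tmp, buf + pos - prev, prev);        /* save previous character */
    memmove(buf + pos - prev, buf + pos, cur);  /* current character moves left */
    memcpy(buf + pos - prev + cur, tmp, prev);  /* previous character follows it */
    return pos - prev + cur;
}
```

The caller would still decide where to leave the cursor afterwards; the original code advances it unless it is on the last character.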

rain-1 (Owner) commented Aug 8, 2018

other functions which make use of buf directly:

        case CTRL_U: /* Ctrl+u, delete the whole line. */
            buf[0] = '\0';
            l.pos = l.len = 0;
            refreshLine(&l);
            break;
        case CTRL_K: /* Ctrl+k, delete from current to end of line. */
            buf[l.pos] = '\0';
            l.len = l.pos;
            refreshLine(&l);
            break;

I believe both are Unicode safe, though: 0 and l.pos should always be on a code point boundary.
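
That assumption could be checked with a small predicate like the sketch below. The name utf8_is_boundary is hypothetical (not linenoise API), and the check is only meaningful if the buffer already holds valid UTF-8: in that case a byte offset is a code point boundary exactly when it is the start or end of the buffer, or the byte there is not a continuation byte (10xxxxxx).

```c
#include <stddef.h>

/* Returns 1 if byte offset pos is a code point boundary in buf[0..len),
 * assuming buf holds valid UTF-8; returns 0 otherwise, including when
 * pos is past the end. (Hypothetical helper, not linenoise API.) */
static int utf8_is_boundary(const char *buf, size_t len, size_t pos) {
    if (pos == 0 || pos >= len) return pos <= len;
    return ((unsigned char)buf[pos] & 0xC0) != 0x80;
}
```

Something like assert(utf8_is_boundary(buf, l.len, l.pos)) before the CTRL_K truncation would catch a cursor that has ended up in the middle of a character.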

rain-1 (Owner) commented Aug 8, 2018

The loop is caused by utf8BytesToCodePoint returning 0 on invalid UTF-8 text. I'm not sure what the best thing to do in a case like that is.

fa0de5c improves the behavior on invalid text, stopping infinite loops.
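
As an illustration only (not necessarily what fa0de5c does), one common way to guarantee progress is a decoder that never reports a length of 0 for non-empty input: an invalid byte is consumed as a single unit and reported as U+FFFD. The names utf8_decode_lossy and utf8_count_chars are hypothetical, and this sketch does not reject overlong encodings or surrogates.

```c
#include <stddef.h>

/* Decode one character from s[0..len). Returns the number of bytes
 * consumed, which is at least 1 whenever len > 0. On an invalid or
 * truncated sequence it consumes one byte and sets *cp to U+FFFD.
 * (Hypothetical helper, not linenoise API.) */
static size_t utf8_decode_lossy(const char *s, size_t len, unsigned long *cp) {
    const unsigned char *p = (const unsigned char *)s;
    size_t n;
    unsigned long v;
    *cp = 0xFFFD;
    if (len == 0) return 0;
    if (p[0] < 0x80) { *cp = p[0]; return 1; }
    else if ((p[0] & 0xE0) == 0xC0) { n = 2; v = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { n = 3; v = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { n = 4; v = p[0] & 0x07; }
    else return 1;                       /* stray continuation or bad lead byte */
    if (n > len) return 1;               /* truncated sequence */
    for (size_t i = 1; i < n; i++) {
        if ((p[i] & 0xC0) != 0x80) return 1;
        v = (v << 6) | (p[i] & 0x3F);
    }
    *cp = v;
    return n;
}

/* Count characters in s[0..len). The index advances by at least one byte
 * per iteration, so the loop terminates even on invalid input. */
static size_t utf8_count_chars(const char *s, size_t len) {
    size_t count = 0, i = 0;
    while (i < len) {
        unsigned long cp;
        i += utf8_decode_lossy(s + i, len - i, &cp);
        count++;
    }
    return count;
}
```

With this approach the corrupted buffer produced by the byte-wise ctrl-t (C3 C3 A5 A4) is walked as three units: a bad byte, å, and another bad byte.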

6d043da fixes the transpose bug for åä and こんにちは. I just realized after pushing that it may not solve the issue when we have more complex clusters(?).

rain-1 (Owner) commented Aug 8, 2018

The https://github.com/yhirose/linenoise/tree/utf8-support repo has this same issue. I need to contact him.
