Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🦸 [Need Help] Operator-precedence parser 🦸 #71

Open
BaseMax opened this issue Jun 1, 2021 · 32 comments
Open

🦸 [Need Help] Operator-precedence parser 🦸 #71

BaseMax opened this issue Jun 1, 2021 · 32 comments
Labels
documentation Good First Issue Good for newcomers Help Wanted Extra attention is needed High Priority Question Further information is requested Sample Code Example code for better learning and how to function a process. tests

Comments

@BaseMax
Copy link
Member

BaseMax commented Jun 1, 2021

Hi, I took the initial phase of the parser forward and it works well and detects tokens.

Now I need help arranging operations and operators. Does anyone want to help? (Operator-precedence parser)
for e.g:

expression ::= equality-expression
equality-expression ::= additive-expression ( ( '==' | '!=' ) additive-expression ) *
additive-expression ::= multiplicative-expression ( ( '+' | '-' ) multiplicative-expression ) *
multiplicative-expression ::= primary ( ( '*' | '/' ) primary ) *
primary ::= '(' expression ')' | NUMBER | VARIABLE | '-' primary
@BaseMax BaseMax added the Good First Issue Good for newcomers label Jun 1, 2021
@ghost
Copy link

ghost commented Jun 4, 2021

here is cpp's operator precedence https://en.cppreference.com/w/cpp/language/operator_precedence im pretty sure its common sense using bidmas i can help if need be.

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

LexAndYacc.pdf
OperatorPrecedenceParsing.pdf
unknown

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Thank you for your message.

Okay, honestly I not sure about some special operators. such as >> << and etc.
I aware Operator-precedence is different in every language. So I not sure how we can go up.
I think we can figure out how PHP works by thinking about its Operator-precedence parser.

@jbampton jbampton added Help Wanted Extra attention is needed Question Further information is requested labels Jun 4, 2021
@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

I checked its parser. They did not divide the phrase into several groups.
And he manages these using BISON. https://github.com/php/php-src/blob/master/Zend/zend_language_parser.y#L1041
Since we are writing Parser by hand, we need to group and know more details.

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Dear @snapogod;
#71 (comment)

Can you help us complete the BNF grammar? https://github.com/One-Language/One/blob/master/grammar.BNF
We need to know grammar of all operators.

@BaseMax BaseMax assigned ghost Jun 4, 2021
@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Another nice implementation of the expression parser. But it does not help us much.
https://github.com/maxicombina/c-calculator/blob/master/calculator.c#L170

int get_op_precedence(char op)
{
    int res = 0;
    if (op == '+' || op == '-')
    {
        res = 9;
    }
    else if (op == '*' || op == '/' || op == '%')
    {
        res = 10;
    }
    else if (op == '^')
    {
        res = 11;
    }
    return res;
}

static int is_higher_or_equal_precedence(char op1, char op2)
{
    return get_op_precedence(op1) &gt;= get_op_precedence(op2);
}

static int is_higher_precedence(char op1, char op2)
{
    return get_op_precedence(op1) &gt; get_op_precedence(op2);
}

static int is_left_assoc_operator(char op)
{
    return (op == '+' || op == '-' || op == '*' || op == '/' || op == '%');
}

static int is_right_assoc_operator(char op)
{
    return (op == '^');
}

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Some useful references I find yesterday:
https://depositonce.tu-berlin.de/bitstream/11303/2630/1/Dokument_46.pdf
https://cdn2.hubspot.net/hubfs/3881487/ExprParser_User_Guide.pdf?t=1517563829446

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Another nice implementation of the expression parser:
https://github.com/programoworm/Expression_Recogniser/blob/master/expression.c

This is something I extracted from the above code but it is still incomplete and has a problem:

parseFactor = "+" parseFactor # need Fix
                    = "-" parseFactor # need Fix
                    = "(" INT ")"
                    = INT

parseTerm = parseFactor
                  = parseFactor "*" parseFactor
                  = parseFactor "/" parseFactor

parseExpressions = parseTerm
                              = parseTerm "+" parseTerm
                              = parseTerm "-" parseTerm

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

I found some online tools that can test grammar.
This is very interesting and it is better to test the grammars in the same way.

More about this subject:

But I'm looking for something that can draw a picture and show it with a figure ... Do you know anything?

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Another nice example from PEG.JS:

// Simple Arithmetics Grammar
// ==========================
//
// Accepts expressions like "2 * (3 + 4)" and computes their value.

Expression
  = head:Term tail:(_ ("+" / "-") _ Term)* {
      return tail.reduce(function(result, element) {
        if (element[1] === "+") { return result + element[3]; }
        if (element[1] === "-") { return result - element[3]; }
      }, head);
    }

Term
  = head:Factor tail:(_ ("*" / "/") _ Factor)* {
      return tail.reduce(function(result, element) {
        if (element[1] === "*") { return result * element[3]; }
        if (element[1] === "/") { return result / element[3]; }
      }, head);
    }

Factor
  = "(" _ expr:Expression _ ")" { return expr; }
  / Integer

Integer "integer"
  = _ [0-9]+ { return parseInt(text(), 10); }

_ "whitespace"
  = [ \t\n\r]*

https://pegjs.org/online

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

The above codes are summarized as follows:

Expression   = Term ( ("+" / "-")  Term)*
                         | Term

Term  = Factor ( ("*" / "/") Factor)*

Factor   = "(" Expression ")"
               | [0-9]+

The problem is that this grammar is not complete and does not include dozens of operators!

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

A tiny YACC/Bison example:

exp: exp OPERATOR_PLUS exp
   | exp OPERATOR_MINUS exp
   | exp OPERATOR_MULT exp
   | exp OPERATOR_DIV exp
   | LPAREN exp RPAREN
   | NUMBER

https://www.cs.uic.edu/~spopuri/ibison.html

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

Okay, so again we need "Precedence of operators in BNF grammar" with all operators not only + - * /.

@BaseMax
Copy link
Member Author

BaseMax commented Jun 4, 2021

And finally, I find BNF of c language:
https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm

And also another nice documentation for BNF:
https://www.radford.edu/~nokie/classes/380/syntax.html

@jbampton jbampton added Sample Code Example code for better learning and how to function a process. tests labels Jun 5, 2021
@jbampton jbampton changed the title [Need Help] Operator-precedence parser 🦸 [Need Help] Operator-precedence parser 🦸 Jun 5, 2021
@BaseMax
Copy link
Member Author

BaseMax commented Jun 5, 2021

=============== INPUT ===============
package main
i8 test1 {
    _ 4
    _ 5+1
}
=============== TREE ===============
Package: main

[FUNC] test1
   [STMT] PRINT
      [EXPRESSIONS] 1 expression(s)
         [EXPR] VALUE 4
   [STMT] PRINT
      [EXPRESSIONS] 1 expression(s)
         [EXPR] + 
            Left:
               [EXPR] VALUE 5
            Right:
               [EXPR] VALUE 1
=============== VM ===============
Package: main
[FUNC] test1
[STMT] PRINT
[STMT] PRINT

Currently + - operators supported. But we need to add many more operators.

A question, The input is _ 5+1-2+6
generate this TREE:

   [STMT] PRINT
      [EXPRESSIONS] 1 expression(s)
         [EXPR] + 
            Left:
               [EXPR] VALUE 5
            Right:
               [EXPR] - 
                  Left:
                     [EXPR] VALUE 1
                  Right:
                     [EXPR] + 
                        Left:
                           [EXPR] VALUE 2
                        Right:
                           [EXPR] VALUE 6

It's correct? can someone check this?

@async-costelo async-costelo removed their assignment Jun 5, 2021
@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

A message from Dear snappyagggodypingable:

a = b
a += b
a -= b
a *= b
a /= b
a %= b
a &= b
a |= b
a ^= b
a <<= b
a >>= b

++a
--a
a++
a--

+a
-a
a + b
a - b
a * b
a / b
a % b
~a
a & b
a | b
a ^ b
a << b
a >> b

!a
a && b
a || b

a == b
a != b
a < b
a > b
a <= b
a >= b
a <=> b

a[b]
a
&a
a->b
a.b
a->b
a.*b

all those right?

@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

ruby:

<=> == === != =~ !~
= %= { /= -= += |= &= >>= <<= *= &&= ||= **=

And again message from Dear snappyagggodypingable:

<expression>   ::= <term> "+" <expression>
                 | <term> "-" <expression>
                 | <term>

<term>         ::= <factor> "*" <term>
                 | <factor> "/" <term>
                 | <factor>
<power>        ::= <factor> ^ <power>
                 | <factor>


<factor>       ::= "(" <expression> ")"
                 | <const>

<const>        ::= NUMBER | STRING | IDENTIFIER

@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

Hooray, It's mostly done and now works. 👍
I did used C operators table: https://en.cppreference.com/w/c/language/operator_precedence

@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

A = (2 + 15 * ( 25 - 13 ) / 1 - 4) + 10 / 2

AST TREE of The above mathematic expression:

 [EXPR] + 
    Left:
       [EXPR] + 
          Left:
             [EXPR] VALUE 2
          Right:
             [EXPR] * 
                Left:
                   [EXPR] VALUE 15
                Right:
                   [EXPR] / 
                      Left:
                         [EXPR] - 
                            Left:
                               [EXPR] VALUE 25
                            Right:
                               [EXPR] VALUE 13

                      Right:
                         [EXPR] - 
                            Left:
                               [EXPR] VALUE 1
                            Right:
                               [EXPR] VALUE 4




    Right:
       [EXPR] / 
          Left:
             [EXPR] VALUE 10
          Right:
             [EXPR] VALUE 2

I'm not sure it's correct or no.
It's wrong. but what is correct? so....
A = (2 + 15 * (( 25 - 13 ) / 1) - 4) + 10 / 2
A = (2 + (15 * ( 25 - 13 ) / 1) - 4) + 10 / 2
or ?

@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

Correct answer is:

= (2 + ((15 * (25 - 13)) / 1) - 4) + (10 / 2)

@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

Below is the full implementation of EvaluateExpression function and its helpers in C#.

int EvaluateExpression(char[] exp)
{

    Stack<int> vStack = new Stack<int>();
    Stack<char> opStack = new Stack<char>();

    opStack.Push('('); // Implicit opening parenthesis

    int pos = 0;
    while (pos <= exp.Length)
    {
        if (pos == exp.Length || exp[pos] == ')')
        {
            ProcessClosingParenthesis(vStack, opStack);
            pos++;
        }
        else if (exp[pos] >= '0' && exp[pos] <= '9')
        {
            pos = ProcessInputNumber(exp, pos, vStack);
        }
        else
        {
            ProcessInputOperator(exp[pos], vStack, opStack);
            pos++;
        }

    }

    return vStack.Pop(); // Result remains on values stacks

}

void ProcessClosingParenthesis(Stack<int> vStack,
                                Stack<char> opStack)
{

    while (opStack.Peek() != '(')
        ExecuteOperation(vStack, opStack);

    opStack.Pop(); // Remove the opening parenthesis

}

int ProcessInputNumber(char[] exp, int pos,
                        Stack<int> vStack)
{

    int value = 0;
    while (pos < exp.Length &&
            exp[pos] >= '0' && exp[pos] <= '9')
        value = 10 * value + (int)(exp[pos++] - '0');

    vStack.Push(value);

    return pos;

}

void ProcessInputOperator(char op, Stack<int> vStack,
                            Stack<char> opStack)
{

    while (opStack.Count > 0 &&
            OperatorCausesEvaluation(op, opStack.Peek()))
        ExecuteOperation(vStack, opStack);

    opStack.Push(op);

}

bool OperatorCausesEvaluation(char op, char prevOp)
{

    bool evaluate = false;

    switch (op)
    {
        case '+':
        case '-':
            evaluate = (prevOp != '(');
            break;
        case '*':
        case '':
            evaluate = (prevOp == '*' || prevOp == '');
            break;
        case ')':
            evaluate = true;
            break;
    }

    return evaluate;

}

void ExecuteOperation(Stack<int> vStack,
                        Stack<char> opStack)
{

    int rightOperand = vStack.Pop();
    int leftOperand = vStack.Pop();
    char op = opStack.Pop();

    int result = 0;
    switch (op)
    {
        case '+':
            result = leftOperand + rightOperand;
            break;
        case '-':
            result = leftOperand - rightOperand;
            break;
        case '*':
            result = leftOperand * rightOperand;
            break;
        case '':
            result = leftOperand  rightOperand;
            break;
    }

    vStack.Push(result);

}

@BaseMax
Copy link
Member Author

BaseMax commented Jun 8, 2021

Correct answer/tree is:
photo_2021-06-09_01-35-12

Thanks from https://github.com/carlos-chaguendo/arboles-binarios

@jbampton
Copy link
Member

jbampton commented Jun 8, 2021

Great work @BaseMax !! 🥇

@BaseMax
Copy link
Member Author

BaseMax commented Jun 13, 2021

_ (2 + 15 * ( 25 - 13 ) / 1 - 4) + 10 / 2

Tree Output of the expression:

└──Statement: PRINT
    └──Expressions (1)
        └──Expression +
            └──Expression Left
                └──Expression -
                    └──Expression Left
                        └──Expression +
                            └──Expression Left
                                └──Expression Direct: 2
                            └──Expression Right
                                └──Expression *
                                    └──Expression Left
                                        └──Expression Direct: 15
                                    └──Expression Right
                                        └──Expression /
                                            └──Expression Left
                                                └──Expression -
                                                    └──Expression Left
                                                        └──Expression Direct: 25
                                                    └──Expression Right
                                                        └──Expression Direct: 13
                                            └──Expression Right
                                                └──Expression Direct: 1
                    └──Expression Right
                        └──Expression Direct: 4
            └──Expression Right
                └──Expression /
                    └──Expression Left
                        └──Expression Direct: 10
                    └──Expression Right
                        └──Expression Direct: 2

I not sure it's correct or no.

@BaseMax
Copy link
Member Author

BaseMax commented Jun 13, 2021

That was wrong. I fixed bug 4c07353.

and finally it's AST output:

            └──Statement: PRINT
                └──Expressions (1)
                    └──Expression +
                        └──Expression Left
                            └──Expression -
                                └──Expression Left
                                    └──Expression +
                                        └──Expression Left
                                            └──Expression Direct: 2
                                        └──Expression Right
                                            └──Expression /
                                                └──Expression Left
                                                    └──Expression *
                                                        └──Expression Left
                                                            └──Expression Direct: 15
                                                        └──Expression Right
                                                            └──Expression -
                                                                └──Expression Left
                                                                    └──Expression Direct: 25
                                                                └──Expression Right
                                                                    └──Expression Direct: 13
                                                └──Expression Right
                                                    └──Expression Direct: 1
                                └──Expression Right
                                    └──Expression Direct: 4
                        └──Expression Right
                            └──Expression /
                                └──Expression Left
                                    └──Expression Direct: 10
                                └──Expression Right
                                    └──Expression Direct: 2

@BaseMax
Copy link
Member Author

BaseMax commented Jun 17, 2021

Sorry, I don't have much time to translate this to English:

سلام شب همگی بخیر؛ سوالی داشتم لطفا راهنمایی ام کنید. اینکه یک اپراتور چپ به راست هست یا راست به چپ چه مفهومی داره؟

مثلا جمع و تفریق بنظر چپ به راست هست.
5 + 4
5 - 4
ضرب و تقسیم هم همینطور چپ به راست هست.
5 * 4

اما مساوی راست به چپ هست.
a = 5+4
حتی کوچک تر / بزرگتر/ کوچکتر مساوی / بزرگتر مساوی هم چپ به راست هست.
5 > 4
5 < 4

یکسری مثال خوب و شفاف پیدا کردم میفرستم شاید بدرد شما هم بخوره.
سعی میکنم تحلیل کنم نتیجه برسم

Example: 5 - 4 - 3
(5 - 4) - 3 = -2 // left association is correct
5 - (4 - 3) = 4 // right is incorrect

Example: x = y = z
x = (y = z) // right association is correct, sets x and y
(x = y) = z // left is incorrect, does not set y

این دقیقا بحث operator associativity هست که نمیدونم فارسی اش چیه:

https://en.wikipedia.org/wiki/Operator_associativity

سلام.
درسته. مثلا این عبارت:
100 / 10 * 10
برابر هست با:
(100/10) * 10
= 10 * 10

چون بنظر عملگر ضرب و تقسیم هر دو چپ به راست هستند.
اما وقتی عبارت پیچیده میشه گیج میشم نمیفهمم چی به چی باید بشه از کجا شروع کنم.

:
5 + 100 / 10 * 10 = 10/2 * 5 - 1

اول اینکه جمع و تفریق چپ به راست هست.
ضرب تقسیم هم چپ به راست هست.
ولی تساوی راست به چپ هست. علت اش رو نمیدونم ولی با این فرض سعی میکنم ادامه دهم:

105 / 10 * 10 = 10/2 * 5 -1
در مرحله بعدی سعی میکنم تقسیم سمت چپ رو حل کنم. اصلا چرا باید از سمت چپ شروع کنیم؟ چرا از سمت راست شروع نمیکنیم؟ مگر ما اینجا تساوی هم نداریم تساوی از راست به چپ هست. ولی بنظر ما باید طبق قانونی که معلوم نیست چیه از چپ شروع به تحلیل کنیم.

10.5 * 10 = 10/2 * 5 -1
و حالا اگر دوباره ادامه دهیم و ضرب سمت چپ رو هم حساب کنیم:
105 = 10/2 * 5 -1
حالا به اینجا رسیدیم که سمت چپ تساوی یک جواب نهایی داریم. و حالا که به تساوی برخورد میکنیم. میرویم که طرف دیگه رو تحلیل کنیم:

105 = 5 * 5 -1
و حالا ضرب سمت راست رو تحلیل کنیم:
105 = 25 -1
و حالا تفریق سمت راست:
105 = 24
و این شد جواب اخر. درسته این دو تا برابر نیستند لزومی هم نبوده برابر باشند. صرفا یه عبارت ریاضی هست که ممکنه درست باشه یا نه.

توی این تحلیل من اصلا هیچ موردی توی ذهنم رو برای عملگر مساوی رعایت نکردم چون عملگر مساوی راست به چپ هست اما تغییری برام ایجاد نکرد یا حداقل بلد نیستم چه تغییری ایجاد کرده که نفهمیدم.

These thoughts were in my mind.

@BaseMax
Copy link
Member Author

BaseMax commented Jun 17, 2021

یه چیزی یاد گرفتم گفتم اینجا هم بگم شاید برای برخی هم سوال باشه. چپ به راست بودن یا راست به چپ بودن عملگر تنها در شرایطی مهم میشه که حداقل دو تا عملگر هم جنس یا تکراری داشته باشیم. در شرایطی که یک عملگر باشه هیچ فرقی نمیکنه که از چپ شروع بشه یا از راست.

عبارت ریاضی:

x = y = z

دو تحلیل مختلف که میشود داشت:
یکی از راست:

x = (y = z)
یکی از چپ:
(x = y) = z
اینجا چون حداقل دو تا عملگر = داشتیم این مهم شد که کدوم تحلیل درسته. اما اگه فقط یدونه = میداشتیم اصلا این نیاز حس نمیشد.

اها. الان بیشتر متوجه میشوم. مثلا اگه همچین عبارتی داشته باشیم:

!+-X
این رو چطور تحلیل کنیم؟ اینجا بیشتر از یک اپراتور از یک جنس داریم. طبیعتا دو نوع تحلیل میشه داشت. تحلیل از راست به چپ:

! ( + ( -(X) ) )

تحلیل از چپ به راست: نمیدونم چطوری میشه. گیج شدم باز.

خب مشخص هست که هنوز نفهمیدم چون گیر میکنم.
الان چه دلیلی داره که میگویند عملگر ! که قبل از چیزی میاد راست به چپ هست. و چرا چپ به راست نیست؟
نمیدونم حالتی که چپ به راست میشه رو چطور و چی باید نوشت

@BaseMax
Copy link
Member Author

BaseMax commented Jun 17, 2021

========= نیم ساعت بعد =========

سعی کردم یه عبارت ریاضی سخت تر و پیچیده بسازم:

5 * !+-X / 2
خب روی همین عبارت گیر کردم. سوالی که برای ادم پیش میاد این هست که الان منفی ایکس رو تقسیم بر دو بکنم و بعد ادامه بده یا اینکه نه همه رو حساب بکنه اخر سر تقسیم بر دو بکنه. با اینکه گیج میزنم سعی میکنم دامه بدهم ببینم به چی و چه مرحله ای میرسم:
5 * !(+(-(x))) / 2
این یکی از حالت ها هست. حالت دیگه هم احتمالا این بشه:
5 * !(+(-(x / 2) ) )

@jbampton
Copy link
Member

Hey @BaseMax what is the status of this issue ?

@BaseMax
Copy link
Member Author

BaseMax commented Dec 27, 2021

It's fixed in new version, I have to push new version and focus on the core.

@jbampton
Copy link
Member

jbampton commented Jun 4, 2022

Do you have an update on this ? Is core ready in the new branch ?

@BaseMax
Copy link
Member Author

BaseMax commented Aug 28, 2022

A new update: almost every operator work well in the new version. We are on a roll!
Happy ^_^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Good First Issue Good for newcomers Help Wanted Extra attention is needed High Priority Question Further information is requested Sample Code Example code for better learning and how to function a process. tests
Projects
None yet
Development

No branches or pull requests

8 participants