Thread: Potential of a compiler that creates the executable at once

  1. #16
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by Schol-R-LEA-2 View Post
    I've come up with a partial token grammar for the language as you've described it, in EBNF notation, though some of it will need to be expanded upon or changed as the language design progresses. This should give you a leg up on defining the lexical analyzer when the time comes to implement it.

    Code:
    token ::= <keyword> | <number> | <char> | <string> | <identifier> |
              <assignment-op> | <arithmetic-op> | <bitwise-op> | <logical-op> | 
              <paren> | <colon> | <right-arrow> | <angle-bracket> | <brace> | 
              <comma> | <period> 
    keyword ::= <base-type> | "if" | "elif" | "else" | 
                "loop" | "while" | "once" | "for" | 
                "fn" | "let" | "mut" | "return" | "object" | 
                "import" | "as" | "alias" |
                "and" | "or" | "not"
    base-type ::= "i8"| "i16" | "i32" | "i64" |"f32" | "f64" | 
                  "bool" | "char" | "string" | "ptr" | "Array" | "Vector"
    identifier ::= <alpha><alphanum>*
    alpha ::= "A" | "a" | "B" | "b" | "C" | "c" | ... | "Z" | "z"
    alphanum ::= <alpha> | <digit>
    bit ::= "0" | "1"
    octal-digit ::= <bit> | "2" | "3" | "4" | "5" | "6" | "7" 
    digit ::= <octal-digit> | "8" | "9"
    hex-digit ::= <digit> | "A" | "a" |"B" | "b" | "C" | "c" | "D" | "d" | "E" | "e" | "F" | "f"
    number ::= <integer> | <float>
    integer ::= <digit>+ | "0b" <bit>+ | "0" <octal-digit>+ | "0x" <hex-digit>+ 
    float ::= <digit>+ <period> <digit>+ {("E"|"e") <digit>+}
    char ::= <quote> {<printable-character>} <quote>
    string ::= <double-quote> {(<printable-character> | '\"')}* <double-quote>
    assignment-op ::= "="
    arithmetic-op ::= "+" | "-" | "*" | "/"
    bitwise-op ::= "<<" | ">>" | "&" | "|"
    logical-op ::= "==" | "<" | ">" | "<=" | ">="
    paren ::= <lparen> | <rparen>
    lparen ::= "("
    rparen ::= ")"
    angle-bracket ::= <langle> | <rangle>
    langle ::= "<"
    rangle ::= ">"
    brace ::= <lbrace> | <rbrace>
    lbrace ::= "{"
    rbrace ::= "}"
    colon ::= ":"
    right-arrow ::= "->"
    comma ::= ","
    period ::= "."
    quote ::= "'"
    double-quote ::= '"'
    Wow! Just wow, man! You made all this from nothing? I really don't know what to say. Seriously! I will first read the book, then learn assembly, and then learn machine language. This may take years, but it will be fun! Do you want me to hit you up when that happens and I'm ready to build something? Is there a chance you would be interested in helping so we can make something together? Of course things can change between now and then, but I would like to hear whether you have even a little interest in working on a project like that. How do you feel about that?

  2. #17
    Registered User
    Join Date
    Feb 2022
    Posts
    45
    I made two significant changes to the token grammar: I had forgotten that it makes sense to break the various keywords and operators into their own token terminals, as that makes the job of writing the parser easier later on.
    Code:
    keyword ::= <base-type> | <if-clause> | <elif-clause> | <else-clause> | 
                <unbounded-iteration> | <indefinite-iteration> | <reverse-indefinite-iteration> | <definite-iteration> | 
                <fn-decl> | <var-decl> | <mutable-modifier> | <object-decl> |
                <return-statement> |
                <import-statement> | <as-op> | <alias-op> |
                <and-op> | <or-op> | <not-op>
    base-type ::= "i8"| "i16" | "i32" | "i64" |"f32" | "f64" | 
                  "bool" | "char" | "string" | "ptr" | "Array" | "Vector"
    if-clause ::= "if" 
    elif-clause ::= "elif" 
    else-clause ::= "else"
    unbounded-iteration ::= "loop"
    indefinite-iteration ::= "while"
    reverse-indefinite-iteration ::= "once"
    definite-iteration ::= "for"
    fn-decl ::= "fn" 
    var-decl ::= "let"
    mutable-modifier ::= "mut"
    object-decl ::= "object"
    return-statement ::= "return"
    import-statement ::= "import"
    as-op ::= "as"
    alias-op ::= "alias"
    and-op ::= "and"
    or-op ::= "or"
    not-op ::= "not"
    and

    Code:
    arithmetic-op ::= <add-op> | <sub-op> | <mul-op> | <div-op>
    add-op ::= "+" 
    sub-op ::= "-" 
    mul-op ::=  "*" 
    div-op ::=  "/"
    bitwise-op ::= <left-shift-op> | <right-shift-op> |<bitwise-and-op> |<bitwise-or-op>
    left-shift-op ::= "<<" 
    right-shift-op ::= ">>" 
    bitwise-and-op ::= "&" 
    bitwise-or-op ::= "|"
    logical-op ::= <equals-op> | <not-equal-op> | <less-than-op> | <greater-than-op> | <lt-eq-op> | <gt-equal-op>
    equals-op ::= "=="
    not-equal-op ::= "!=" 
    less-than-op ::= "<" 
    greater-than-op ::= ">" 
    lt-eq-op ::=  "<=" 
    gt-equal-op ::=  ">="
    There are still going to be more changes needed, but it is a starting point. I'll see what I can do about a syntax grammar for the parser.
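
    To illustrate why one terminal per keyword helps, here is a minimal sketch in Python (my choice of language, not something settled for the project) of a lookup table that turns a scanned word into its own token kind. The names KEYWORD_TOKENS, BASE_TYPES, and classify_word are hypothetical and exist only for this example.

```python
# Hypothetical helper names; only the keyword spellings and the
# token names on the right come from the grammar above.
KEYWORD_TOKENS = {
    "if": "if-clause",
    "elif": "elif-clause",
    "else": "else-clause",
    "loop": "unbounded-iteration",
    "while": "indefinite-iteration",
    "once": "reverse-indefinite-iteration",
    "for": "definite-iteration",
    "fn": "fn-decl",
    "let": "var-decl",
    "mut": "mutable-modifier",
    "object": "object-decl",
    "return": "return-statement",
    "import": "import-statement",
    "as": "as-op",
    "alias": "alias-op",
    "and": "and-op",
    "or": "or-op",
    "not": "not-op",
}

BASE_TYPES = {"i8", "i16", "i32", "i64", "f32", "f64",
              "bool", "char", "string", "ptr", "Array", "Vector"}


def classify_word(word):
    """Return the token kind for a word the lexer has already scanned."""
    if word in KEYWORD_TOKENS:
        return KEYWORD_TOKENS[word]
    if word in BASE_TYPES:
        return "base-type"
    return "identifier"
```

    With this in place the parser can ask "is the next token an if-clause?" directly instead of receiving a generic keyword and comparing strings again.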
    Last edited by Schol-R-LEA-2; 02-15-2022 at 03:17 PM.

  3. #18
    Registered User
    Join Date
    Feb 2022
    Posts
    45
    I realized that the existing naming would present some problems when defining the syntax grammar, as it would be useful for the token terminals not to conflict with those for the syntax. To this end, I've reworded some of the existing grammar:

    Code:
    token ::= <keyword> | <number> | <char> | <string> | <identifier> |
              <assignment-token> | <arithmetic-token> | <bitwise-token> | <logical-token> | 
              <paren> | <colon> | <right-arrow> | <angle-bracket> | <brace> |  <bracket> |
              <comma> | <period> 
    keyword ::= <base-type> | <if-token> | <elif-token> | <else-token> | 
                <unbounded-iteration-token> | <indefinite-iteration-token> | <reverse-indefinite-iteration-token> | <definite-iteration-token> | 
                <fn-token> | <var-decl-token> | <mutable-modifier-token> | <object-token> |
                <return-token> |
                <import-token> | <as-token> | <alias-token> |
                <and-token> | <or-token> | <not-token>
    base-type ::= "i8"| "i16" | "i32" | "i64" |"f32" | "f64" | "bool" | "char" | 
                  "string" | "ptr" | "Array" | "Vector"
    if-token ::= "if" 
    elif-token ::= "elif" 
    else-token ::= "else"
    unbounded-iteration-token ::= "loop"
    indefinite-iteration-token ::= "while"
    reverse-indefinite-iteration-token ::= "once"
    definite-iteration-token ::= "for"
    fn-token ::= "fn" 
    var-decl-token ::= "let"
    mutable-modifier-token ::= "mut"
    object-token ::= "object"
    return-token ::= "return"
    import-token ::= "import"
    as-token ::= "as"
    alias-token ::= "alias"
    and-token ::= "and"
    or-token ::= "or"
    not-token ::= "not"
    identifier ::= <alpha><alphanum>*
    alpha ::= "A" | "a" | "B" | "b" | "C" | "c" | ... | "Z" | "z"
    alphanum ::= <alpha> | <digit>
    bit ::= "0" | "1"
    octal-digit ::= <bit> | "2" | "3" | "4" | "5" | "6" | "7" 
    digit ::= <octal-digit> | "8" | "9"
    hex-digit ::= <digit> | "A" | "a" |"B" | "b" | "C" | "c" | "D" | "d" | "E" | "e" | "F" | "f"
    number ::= <integer> | <float>
    integer ::= <digit>+ | "0b" <bit>+ | "0" <octal-digit>+ | "0x" <hex-digit>+ 
    float ::= <digit>+ <period> <digit>+ {("E"|"e") <digit>+}
    char ::= <quote> {<printable-character>} <quote>
    string ::= <double-quote> {(<printable-character> | '\"')}* <double-quote>
    assignment-token ::= "="
    arithmetic-token ::= <add-token> | <sub-token> | <mul-token> | <div-token>
    add-token ::= "+" 
    sub-token ::= "-" 
    mul-token ::=  "*" 
    div-token ::=  "/"
    bitwise-token ::= <left-shift-token> | <right-shift-token> |<bitwise-and-token> |<bitwise-or-token>
    left-shift-token ::= "<<" 
    right-shift-token ::= ">>" 
    bitwise-and-token ::= "&" 
    bitwise-or-token ::= "|"
    logical-token ::= <equals-token> | <not-equal-token> | <less-than-token> | <greater-than-token> | <lt-eq-token> | <gt-equal-token>
    equals-token ::= "=="
    not-equal-token ::= "!=" 
    less-than-token ::= "<" 
    greater-than-token ::= ">" 
    lt-eq-token ::=  "<=" 
    gt-equal-token ::=  ">="
    paren ::= <lparen> | <rparen>
    lparen ::= "("
    rparen ::= ")"
    angle-bracket ::= <langle> | <rangle>
    langle ::= "<"
    rangle ::= ">"
    brace ::= <lbrace> | <rbrace>
    lbrace ::= "{"
    rbrace ::= "}"
    bracket ::= <lbracket> | <rbracket>
    lbracket ::= "["
    rbracket ::= "]"
    colon ::= ":"
    right-arrow ::= "->"
    comma ::= ","
    period ::= "."
    quote ::= "'"
    double-quote ::= '"'
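
    As a rough idea of how this token grammar could drive a lexical analyzer, here is a small regex-based sketch in Python. It is only one possible approach under several assumptions that the grammar does not state: whitespace is skipped, binary and hex literals take one or more digits, a string may contain only the \" escape, and "<" and ">" are always reported as comparison tokens, leaving it to the parser to treat them as angle brackets where a type is expected. The names OPERATORS, KEYWORDS, TOKEN_RE, and tokenize are hypothetical.

```python
import re

# Punctuation and operators, keyed by lexeme. "<" and ">" are reported as
# comparison tokens here (an assumption); the parser can reinterpret them.
OPERATORS = {
    "->": "right-arrow", "<<": "left-shift-token", ">>": "right-shift-token",
    "==": "equals-token", "!=": "not-equal-token",
    "<=": "lt-eq-token", ">=": "gt-equal-token",
    "=": "assignment-token", "+": "add-token", "-": "sub-token",
    "*": "mul-token", "/": "div-token",
    "&": "bitwise-and-token", "|": "bitwise-or-token",
    "<": "less-than-token", ">": "greater-than-token",
    "(": "lparen", ")": "rparen", "{": "lbrace", "}": "rbrace",
    "[": "lbracket", "]": "rbracket",
    ":": "colon", ",": "comma", ".": "period",
}

KEYWORDS = {
    "if": "if-token", "elif": "elif-token", "else": "else-token",
    "loop": "unbounded-iteration-token", "while": "indefinite-iteration-token",
    "once": "reverse-indefinite-iteration-token", "for": "definite-iteration-token",
    "fn": "fn-token", "let": "var-decl-token", "mut": "mutable-modifier-token",
    "object": "object-token", "return": "return-token",
    "import": "import-token", "as": "as-token", "alias": "alias-token",
    "and": "and-token", "or": "or-token", "not": "not-token",
}

BASE_TYPES = {"i8", "i16", "i32", "i64", "f32", "f64",
              "bool", "char", "string", "ptr", "Array", "Vector"}

# Longest operators first, so "<=" is not split into "<" and "=".
operator_pattern = "|".join(
    re.escape(op) for op in sorted(OPERATORS, key=len, reverse=True))

# Alternatives are tried in order: float before integer, so "3.14" is one token.
TOKEN_RE = re.compile("|".join([
    r"(?P<space>\s+)",
    r"(?P<float>\d+\.\d+(?:[Ee]\d+)?)",
    r"(?P<integer>0b[01]+|0x[0-9A-Fa-f]+|\d+)",
    r'(?P<string>"(?:\\"|[^"\\])*")',
    r"(?P<char>'[^']')",
    r"(?P<word>[A-Za-z][A-Za-z0-9]*)",
    r"(?P<operator>" + operator_pattern + ")",
]))


def tokenize(source):
    """Yield (kind, lexeme) pairs; raise ValueError on an unknown character."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind == "space":
            continue
        if kind == "operator":
            kind = OPERATORS[lexeme]
        elif kind == "word":
            kind = KEYWORDS.get(lexeme) or (
                "base-type" if lexeme in BASE_TYPES else "identifier")
        yield kind, lexeme
```

    For example, list(tokenize("let mut x = 0x1F")) gives the var-decl-token, mutable-modifier-token, identifier, assignment-token, and integer tokens in that order.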
    As for helping, I would be glad to take part. I will send you my email address in DM.
    Last edited by Schol-R-LEA-2; 02-15-2022 at 04:45 PM.

  4. #19
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by Schol-R-LEA-2 View Post
    I realized that the existing naming would present some problems when defining the syntax grammar, as it would be useful for the token terminals not to conflict with those for the syntax. To this end, I've reworded some of the existing grammar:

    As for helping, I would be glad to take part. I will send you my email address in DM.
    Thank you, man!!! Yes, it may need a couple more changes (I will probably add more stuff to the language anyway), but this is a great start, and I hope you at least enjoyed making it! This will really come in handy as I learn how to properly write a frontend. I got your DM and replied, though the site doesn't show me my sent message, so I don't know if something went wrong. In any case, I will contact you when I have something working: you should be helping, not doing the heavy and dirty work, because that would not be fair and because I need to learn to work hard.

  5. #20
    Registered User
    Join Date
    Feb 2022
    Posts
    45
    I noticed that there is a checkbox down below the edit window which sets whether your PM is saved in your outbox. This may be why you can't find your earlier message.

    Do you have a public repo for the project yet? I know you haven't settled on a name for the language yet, but you could call it 'New Language Project' or something like that and then rename it later. It would help even this early in the project, as we could collaborate through the repo. Putting the grammars in the repo's doc folder would allow us both to edit them.

    I did start a syntax grammar, but given what you said, I'll simply give you what I've done so far, and let you proceed as you need to until you can work on it yourself. Let me know if you need any help.

    Code:
    program-file ::= <statement>+
    statement ::= <import-statement> | 
                  <fn-decl> | <var-decl> | <object-decl> | 
                  <assignment> | <cond-statement> | <iteration-statement>
    import-statement ::= <import-token> <identifier> {<as-token> <identifier>}
    fn-decl ::= <fn-token> <identifier> {<parameter-list>} {<right-arrow> <type>} <colon> <block>
    parameter-list ::= <lparen> <parameter> {<comma> <parameter>}* <rparen>
    parameter ::= {<ref-token>} {<type>} <identifier>
    arg-list ::= <lparen> <expression> {<comma> <expression>}* <rparen>
    type ::= <base-type> | <parameterized-type> {<langle> <type> <rangle>} | <identifier>
    assignment ::= <identifier> <assignment-token> <expression>
    value ::= <number> | <char> | <string> | <bool> | <identifier>
    expression ::= <value> | <fn-call> | <arith-expression> | <logical-expression>
    fn-call ::= <identifier> {<arg-list>}
    logical-expression ::= <value> <logical-token> (<value> | <logical-expression>) 
    arith-expression ::= <arith-term> | 
                         <value> (<bitwise-and-token> | <bitwise-or-token> | <xor-token>) <arith-expression>
    arith-term ::=  <term> |
                    <value> (<add-token> | <sub-token>) <term>
    term ::= <value> (<mul-token> | <div-token>) <arith-expression>
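
    To show where the arithmetic rules are headed, here is a small recursive-descent sketch in Python. It is not a direct transcription of the draft: as written, term always continues into another arith-expression, so this sketch assumes a bare value is also a valid term, and it uses the conventional layering in which multiplication and division bind tighter than addition and subtraction, with loops that group to the left. The bitwise operators are left out. It takes (kind, lexeme) pairs named as in the token grammar; parse_arith and its helpers are hypothetical names.

```python
def parse_arith(tokens):
    """Parse a flat list of (kind, lexeme) pairs into a nested tuple tree."""
    pos = 0

    def peek():
        # Kind of the next token, or None at end of input.
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def value():
        # Assumed base case: a number or an identifier.
        if peek() not in ("integer", "float", "identifier"):
            raise SyntaxError(f"expected a value at token {pos}")
        return advance()[1]

    def term():
        # value followed by any number of "* value" or "/ value"
        node = value()
        while peek() in ("mul-token", "div-token"):
            op = advance()[1]
            node = (op, node, value())
        return node

    def arith_term():
        # term followed by any number of "+ term" or "- term"
        node = term()
        while peek() in ("add-token", "sub-token"):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = arith_term()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


# Tokens for the input: 1 + 2 * x
demo = [("integer", "1"), ("add-token", "+"), ("integer", "2"),
        ("mul-token", "*"), ("identifier", "x")]
print(parse_arith(demo))  # ('+', '1', ('*', '2', 'x'))
```

    Each grammar rule becomes one function, which is the main reason to keep the syntax grammar's rule names distinct from the token names.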
    Last edited by Schol-R-LEA-2; 02-16-2022 at 11:25 AM.
