Thread: Finite State Machine

  1. #1
    Registered User
    Join Date
    Mar 2007
    Posts
    25

    Finite State Machine

    Create a program that will read tokens written to match the following syntax and print the type and value of each token found. The statements will be found in a file that will be passed to the program via a command line argument.
    Valid tokens: EVAL, VALUE, EXP, Variable (in the form of Vnn where nn ranges from 00 to 99), Constant (decimal digits), +, -, *, /, (, ), ;


    Hello everyone. That's my current lab assignment, and I have to write a state machine to do it. I've found simple examples of simple state machines on the web, but it has too many switch statements. I've also seen examples of 2-dimensional state machine and hierarchical state machine.

    I still can't structure my state machine for it to do what I need to do for this lab, mainly because I'm still confused about the whole concept of State and Event. Do I make each Token a State and each character and number an Event?

    State machines are supposed to make things simpler, but every way I've tried to solve the problem it seems that state machines are just as complicated as a nested switch/if statement.

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    You're just writing the parser, right? You don't actually have to know what all the stuff means. So I'm guessing your states are going to be: looking for the start of a token, reading a token, and out of input. I'm hoping you can guess how to switch from one to another.

  3. #3
    Registered User
    Join Date
    Mar 2007
    Posts
    25
    Yes, it's just the parser. Later on he will tell us what he wants us to do with them.

  4. #4
    Registered User
    Join Date
    Mar 2007
    Posts
    25
    If I make each state a class then I end up with many classes. What if I decide to write a parser for c++; that's too many classes to keep track of. I guess that explains why we don't have c++ refactoring tools.

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by ArlexBee-871RBO View Post
    If I make each state a class then I end up with many classes. What if I decide to write a parser for c++; that's too many classes to keep track of. I guess that explains why we don't have c++ refactoring tools.
    Wait -- what? We have three states, and unless I'm on Discworld all of a sudden, "three" does not correspond to "many". And for that matter, you don't really have to write anything for the third state, because "done with input" doesn't require you to do anything (except stop). Reading the token "EVAL" and the token "EXP" aren't different states; the output will be different tokens, obviously, but so what.

    If you really want to make the most work for yourself, this is the largest number of states I can come up with:
    (1) No current token
    (2) Current token starts with "E"
    (3) Current token starts with "V"
    (4) Current token starts with a digit
    (5) Current token starts with punctuation
    (6) Current token is invalid (if you want to get fancy)

    You won't stay in state 5 for very long; since all your punctuation marks are one character only, you'll immediately spit out the token and go to state 1.

  6. #6
    Registered User
    Join Date
    Mar 2007
    Posts
    25
    Well I'm confused for a reason I guess.

    I'll be reading a file that may contain anything. So I have a stream of characters that's coming in and I'll have to extract those tokens. So if I have "BEVAL DBRG" coming in then I have no tokens, but in "Jk EXP uIP * BBgU"I have the EXP token and the asterisk token. I think the technical term for this is lexical analysis .

    so far for EVAL, VALUE, and EXP I have 5 states, E, VAL, XP, V, ALUE. I'll need two more states for unknown chars and white spaces.

    Have I got it right so far?

  7. #7
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Well, I don't think so. Why would those be states? Those are input/output things.

    Let me show you what I'm thinking with your example of "Jk EXP uIP * BBgU" and we'll see:

    start in state (1)
    next character is J. No valid token starts with J, so J takes me from state 1 to state 6
    [You seem to be using space-separated tokens, so:]
    in state (6)
    read until a space, throwing everything away.
    now in state (1)
    next character is E; E takes me from state 1 to state 2
    now in state (2)
    [either we will read "XP" and be done, or "VAL" and be done, or something else and go to 6]
    read XP, so output token "EXP" and move to state 1
    now in state (1)
    next character is u
    no valid token starts with u, so u takes me from state 1 to state 6
    in state (6)
    read until a space, throwing everything away.
    now in state (1)
    next character is *, so * takes me from state 1 to state 5
    now in state (5)
    [punctuation always one character long]
    output token "*" and go to state 1
    now in state (1)
    next character is B, which takes me from state 1 to state 6
    now in state (6)
    read until a space, throwing everything away
    now in state (1)
    next character is EOF, so go to state (7)
    now in state (7)
    stop

    In other words, the letters and numbers aren't states, they're the inputs that cause transitions between states. Normally I wouldn't separate E and V like that, but since they're the only valid starting letters, I thought why not.

    Edit to add: I guess, looking at this again, if your formalism requires that everything be a state (in other words, state 2 has to go to state 2a when you see an X (going for EXP) and to state 2b when you see a V (going for EVAL) and to state 6 otherwise), then yes, you'll have a bunch more states, although not too many more.
    Last edited by tabstop; 07-13-2008 at 09:49 PM.

  8. #8
    Registered User
    Join Date
    Mar 2007
    Posts
    25
    The stream of characters coming in aren't states, I know, they are messages (events if you want to call it that). The kind of message I get determines which state i'm in. And the tokens I'm looking for in the stream determine the kinds of states that I'm going to have.

    If I don't break up the words like that then I end up with too many if and switch statements, and the code becomes unmanageable.

  9. #9
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    I think perhaps it might make the problem easier if you considered only one character at a time, not a word at a time.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. State machine
    By sergio in forum General Discussions
    Replies: 3
    Last Post: 07-03-2009, 09:03 AM
  2. state machine using function pointers
    By meena_selvam in forum C Programming
    Replies: 1
    Last Post: 10-25-2008, 02:09 PM
  3. NAQ: Everything you never wanted to know about CPP
    By evildave in forum C Programming
    Replies: 21
    Last Post: 12-12-2005, 10:56 AM
  4. Finite State Machine Project Help
    By ryanbradley in forum C++ Programming
    Replies: 4
    Last Post: 03-06-2004, 10:23 AM
  5. Designing State Machine
    By axon in forum Tech Board
    Replies: 3
    Last Post: 11-06-2003, 12:13 PM