Thread: Tiny thing in a simple parser - C++

  1. #16
    Registered User
    Join Date
    Dec 2017
    Posts
    1,626
    1. sin >> line won't do anything since sin has nothing in it, so nothing will be put into line. Maybe you meant sin << line, but that is not legal for an istringstream. You should try it out.

    BTW, now that I see the syntax highlighting, sin is perhaps not the best name since <cmath> has a sin function and even without using namespace std it is often exposed as just sin. I changed the name to iss in my code.

    -----------------------------

    2. The return value of sin >> ch is sin itself. That's why you can chain the calls like sin >> a >> b >> c, which is parsed as (((sin >> a) >> b) >> c). Each operator<< call returns sin so that it can be used in the next call.

    When tested in a boolean context it return !sin.fail(), which is why it can be used to control a loop like while (sin >> ch). So in the if statements we're testing if the input failed (e.g., the input was letters or a number out of the range of the variable) before we test the more limited range.

    -----------------------------

    3(?) sin >> ch reads the first non-whitespace character it encounters. So if sin contained "John" it would read 'J".

    sin.putback(ch) is just putting ch back into the stream so that the next sin>>?? call can read it again. It basically replaces your use of peek, except that peek will read the next char even if it's whitespace, whereas sin>>ch reads the next non-whitespace character, which we then put back so it can be read again.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  2. #17
    Registered User
    Join Date
    Nov 2019
    Posts
    135
    @john.c - I have asked a member staff regarding the issue of empty lines - if there are empty lines between non-empty lines -> invalid input.
    Besides that, they say, there can be an empty line at the end of the file, as Unix automatically (?) adds a new-line by itself...

    Weird, but I think you last version captures all these things and we don't have to change anything.

    1. Yea, I meant in the opposite direction. Weird that's not legal...

    2. Oh, in a boolean context it behaves really differently. It fails if we try assigning a non-integer value into an integer value?

    3. So, sin >> ch - "Take the first non-EOF char and transfer it into ch". While sin.putback(ch) - "Take the char inside ch and transfer it into the beginning of sin". Yes?

    - Another thing, I have also to validate that the path (can be relative/absolute) received is of a CSV format. My validation here is sufficient?

    Code:
    std::string filepath = argv[1];
    
    
        if (filepath.substr(filepath.find_last_of(EXTDOT) + 1) != VALIDEXT)
        {
            std::cerr << invalidErr;
            exit(EXIT_FAILURE);
        }
    
        std::ifstream in(filepath);
    Thank you!

  3. #18
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by HelpMeC
    Can you only explain how would you use this regex to define Maxima?
    What do you mean?

    Quote Originally Posted by HelpMeC
    I don't think I'll use this solution because as I said, I'm not so friendly with regexes
    Then it would be a good time to learn

    Quote Originally Posted by HelpMeC
    nor with things like stoi (which demands us to handle exceptions)
    A great deal of modern C++ involves exceptions, if not throwing them, then at least handling them. In this case, an exception thrown would really be unusual since the argument to stoi will definitely be a single digit numeric string, which is why I wasn't bothered with demonstrating how to handle a possible exception.

    Quote Originally Posted by HelpMeC
    but I'll be glad to understand the syntax and semantics of this line (what's going on there?)
    matches.str(2) returns the second subexpression match (i.e., the second set of stuff in parentheses in the regex pattern). stoi converts that numeric string to int, and the result is used to initialise dim.

    Quote Originally Posted by HelpMeC
    Another thing, I think in my case of using a stack and a clone method - I can't replace the push method by something like emplace_back.
    The right member function for std::stack would be emplace, but yes, since you're deferring construction to a simple factory, you cannot use that. The idea of emplace is to create the object of the known type in-place in the container; if the actual type is not known to the container because you're using polymorphism, then you cannot use emplace.

    Note that if you're using std::stack<Fractal *> and are using dynamic memory allocation, then you also should handle deallocation manually. Consequently, it would be better to use a std::stack<std::unique_ptr<Fractal>> or std::stack<std::shared_ptr<Fractal>> so that memory deallocation is managed for you.

    BTW, I have also to validate that the path (can be relative/absolute) received is of a CSV format.
    My validation here is sufficient?
    You aren't validating that the path is "of a CSV format"; you're validating that the path has a CSV file name extension. The file format refers to the content of the file; the file name extension refers to a convention concerning the file name that may or may not accurately indicate the file format.

    As for your validation itself: you should check that filepath.find_last_of(EXTDOT) does not return std::string::npos before using its return value. Currently, what you have mostly works, except that if the path is equal to VALIDEXT, validation will pass when arguably it shouldn't since a path consisting entirely of VALIDEXT technically does not have a file name extension. Furthermore, you may need to consider if you want validation to be case-sensitive or not.

    If you're compiling with respect to C++17, you should be aware that std::filesystem::path and its extension() member function could be an option.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  4. #19
    Registered User
    Join Date
    Dec 2017
    Posts
    1,626
    blank_lines:
    I wouldn't say that *nix (the operating system) adds a newline at the end of a text file, although it is standard practice in *nix text editors to ensure that the last line ends in a newline. This tends to mean that if you supply a newline at the end of the last line then you will get an extra newline (a blank line) at the end. I believe that Windows doesn't do this, but I'm not sure.

    But if you are only allowed one blank line at the end then the current code is not correct since it allows any number of blank lines at the end.

    2. In any context it enters the failed state if you try to read "hello" (or an out-of-range value) into an integer. But in a boolean context it returns !in.fail() instead of the istream itself like it usually does.

    3. No, it takes the first NON-WHITESPACE char and puts it into ch. There is no such thing as an "EOF char". The EOF value is a special value that is returned by some functions when the end-of-file is reached. But it is not a character in the file. Recall that in C you need to read a char from getchar() into an int, not a char. That's because the EOF value is a so-called "out-of-band" value that cannot possibly correspond to a valid char.

    csv file: (Here I'm repeating some of what laserlight has already said.)
    For validating that a filename ends in ".csv" you need to handle the case where the filename doesn't contain a period at all.

    Code:
    #include <iostream>
    #include <string>
    #include <cstdlib>
    using namespace std;
     
    int main(int argc, char** argv)
    {
        if (argc < 2)
        {
            cerr << "Usage: ...\n";
            exit(EXIT_FAILURE);
        }
     
        std::string filepath = argv[1];
     
        auto pos = filepath.find_last_of(".");
        if (pos == filepath.npos || filepath.substr(pos + 1) != "csv")
        {
            cerr << "Usage: ...\n";
            exit(EXIT_FAILURE);
        }
    }
    A little inaccuracy saves tons of explanation. - H.H. Munro

  5. #20
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    Wow, just wow.... how many years have you been programming in C++? That's an insane amount of new knowledge that I came across when reading the last few posts on this thread.
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  6. #21
    Registered User
    Join Date
    Nov 2019
    Posts
    135
    @laserlight -
    Actually the staff said we mustn't use regex.

    As for Maxima - I meant, if you can show me how to change the pattern to also define maximums/minimums.

    As for exceptions - Yea, I'm pretty sure the our next class will cover the topic of exceptions in CPP. And I understood, in this case you verified at first you have the correct pattern which involves the digits - so nothing is gonna be wrong.

    As for emplace - We haven't learnt yet those smart pointers (unique_ptr/shared_ptr).
    But, to my understanding, I have nothing to delete here manually (and through all the project in general, because all the allocated data involved here is created inside a vector/stack). These containers receive the pointer to the right Fractal - and they have to destruct it at the end.
    No?


    File extension - thanks for the remarks!

    @john.c -

    blank_lines: Yea - I won't change anything to make it fit to only one empty line - this parsing is really overrated. Thanks for the remarks!

    2. So you actually add '!' (!(sin >> dim)) to make it be like: sin.fail() and not !sin.fail() as it automatically becomes.

    3. So, I understood correctly the process (?) - except for using the EOF word incorrectly.


    So, this is the final version:

    Code:
    #include <iostream>
    #include "Fractal.h"
    #include <fstream>
    #include <stack>
    #include <sstream>
    
    const char *usgErr = "Usage: FractalDrawer <file path>\n";
    const char *invalidErr = "Invalid input\n";
    const char *VALIDEXT = "csv";
    const char EXTDOT = '.';
    const char COMMA = ',';
    const char MINTYPE = 1;
    const char MAXTYPE = 3;
    const int MINDIM = 1;
    const int MAXDIM = 6;
    const int NUBEROFARGS = 2;
    
    int main(int argc, char *argv[])
    {
    
        if (argc != NUBEROFARGS)
        {
            std::cerr << usgErr;
            std::exit(EXIT_FAILURE);
        }
    
        std::string filepath = argv[1];
    
        auto pos = filepath.find_last_of(EXTDOT);
        if (pos == std::string::npos || filepath.substr(pos + 1) != VALIDEXT)
        {
            std::cerr << invalidErr;
            exit(EXIT_FAILURE);
        }
    
        std::ifstream in(filepath);
    
        std::stack<Fractal *> resToPrint;
    
        bool blank_line_seen = false;
        int lineno = 0;
        for (std::string line; std::getline(in, line);  )
        {
            std::istringstream sin(line);
            ++lineno;
    
            int type = 0, dim = 0;
            char ch = 0;
            if (!(sin >> ch)) // Check if there is some non-whitespace in the current line
            {
                blank_line_seen = true;
                continue;
            }
            else if (blank_line_seen) // If the current line is non-empty and the previous one was an empty one
            {
                std::cerr << invalidErr;
                exit(EXIT_FAILURE);
            }
    
            sin.putback(ch);
    
            if (!(sin >> type) || type < 1 || type > 3) // Take a numeric digit value in legal range
            {
                std::cerr << invalidErr;
                exit(EXIT_FAILURE);
            }
    
            if (!(sin >> ch) || ch != COMMA) // The middle char must be a comma
            {
                std::cerr << invalidErr;
                exit(EXIT_FAILURE);
            }
    
            if (!(sin >> dim) || dim < 1 || dim > 6) // Take a numeric digit value in legal range
            {
                std::cerr << invalidErr;
                exit(EXIT_FAILURE);
            }
    
            if (sin >> ch) // If there is some extra data in the line
            {
                std::cerr << invalidErr;
                exit(EXIT_FAILURE);
            }
    
            resToPrint.push(FractalFactory::factoryMethod(type, dim));
        }
    
        while (!resToPrint.empty())
        {
            std::cout << *(resToPrint.top()) << std::endl;
            resToPrint.pop();
        }
    
        in.close();
    
        return 0;
    }

    BTW, question regarding this line:

    Code:
    for(std::string line; std::getline(in, line);  )
    Here, we only declare on a string, but not allocate some memory for it.
    So, how getline succeeds to input something into line?
    Last edited by HelpMeC; 01-06-2020 at 05:57 AM.

  7. #22
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by HelpMeC
    As for Maxima - I meant, if you can show me how to change the pattern to also define maximums/minimums.
    Generally, it usually isn't worth the effort. The regex pattern validates textual patterns, not number values, so you can only validate specific numeric ranges if you express them as a textual pattern. It usually is easier and more readable to just express a more generic numeric range as a textual pattern (e.g., 1 to N number of digits) so that you can exclude input that is obviously invalid, then validate the actual number as being within the range when parsing it.

    Quote Originally Posted by HelpMeC
    As for emplace - We haven't learnt yet those smart pointers (unique_ptr/shared_ptr).
    emplace and smart pointers are two different concepts.

    Quote Originally Posted by HelpMeC
    But, to my understanding, I have nothing to delete here manually (and through all the project in general, because all the allocated data involved here is created inside a vector/stack). These containers receive the pointer to the right Fractal - and they have to destruct it at the end.
    No?
    Yes, the container will destroy the pointers that are its elements. That's absolutely true. So the pointer is gone... but the memory that the now non-existent pointer pointed to remains allocated. If the process doesn't terminate soon after, or if you're not running the program on a modern OS that will reclaim the memory after the process terminates, then you have a classic memory leak.

    The takeaway here is that these containers do resource management for you, but only if you instantiate them with types that do their own resource management (and memory is perhaps the most common resource). Well designed object types may be designed to do that, and so would a number of smart pointer types, but not your ordinary pointers as they don't manage the memory of what they point to; they merely point to the memory.

    Quote Originally Posted by HelpMeC
    Here, we only declare on a string, but not allocate some memory for it.
    So, how getline succeeds to input something into line?
    std::string manages its own memory, and this version of getline is able to cause std::string to allocate sufficient memory to store the string read (assuming there is sufficient contiguous memory that can be allocated).
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  8. #23
    Registered User
    Join Date
    Nov 2019
    Posts
    135
    @laserlight -
    Regarding the destruction - I'm pretty confused.
    You agree that the container will destroy each pointer pushed into it - that is, if we push to it (new SierpinskyFractal(some_dim)) - at the end of the stack-frame its destructor will invoke SierpinskyFractal's destructor (which is the default one - but it knows how to destroy its instances I guess) on this cell and delete it. So everything is supposed to be fine. Isn't it?
    What you actually describe me is something like: the container only nullifies its pointers and that's it (weird).

    Well designed object types may be designed to do that, and so would a number of smart pointer types, but not your ordinary pointers as they don't manage the memory of what they point to; they merely point to the memory.
    This sentence is also some of opaque for me too, maybe you can explain it more in-depth.



    As for std::string -
    When I only declare some variable this way:
    Code:
    std::string s;
    Some memory has been allocated for that at all? I guess no, it's only a declaration...


    There is a lot to learn from both of you, thanks a lot.

  9. #24
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by HelpMeC
    You agree that the container will destroy each pointer pushed into it - that is, if we push to it (new SierpinskyFractal(some_dim)) - its destructor will invoke SierpinskyFractal's destructor (which is the default one - but it knows how to destroy its instances I guess) on this cell and delete it. So everything is supposed to be fine. Isn't it?
    When it is destroyed, a container (or in this case container adapter) of type T will destroy the elements of type T that it contains. For types that have a destructor, this means invoking the T destructor for each element.

    You have a std::stack<Fractal *>. This means that T is a Fractal*. A pointer to Fractal. So, what the container destroys are pointers to Fractal. The destructor for whatever Fractal class or derived class is never invoked unless you do it yourself.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  10. #25
    Registered User
    Join Date
    Nov 2019
    Posts
    135
    @laserlight -
    It's an ultra-important point.

    What does it mean to only destroy a pointer itself? What is the process of destroying pointers? Just nullifying them?

    So, I tried just a min ago to fix that problem, adding this delete line to the last loop:

    Code:
    while (!resToPrint.empty())
        {
            std::cout << *(resToPrint.top()) << std::endl;
            delete *(resToPrint.top());
            resToPrint.pop();
        }
    But the compiler complains:
    It thinks I'm trying to delete a Fractal instance which is not possible as Fractal is an abstract class.
    How can I overcome this obstacle?...

    error: type ‘class Fractal’ argument given to ‘delete’, expected pointer delete *(resToPrint.top());
    That is, for some reason when invoking delete - Polymorphism is not performed, though the destructor is virtual...
    Last edited by HelpMeC; 01-06-2020 at 07:10 AM.

  11. #26
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by HelpMeC
    What does it mean to only destroy a pointer itself? What is the process of destroying pointers? Just nullifying them?
    Imagine you have a std::vector<int>. Destroying elements that happen to be pointers is pretty much the same as destroying elements that happen to be ints: the memory for them is deallocated. They probably aren't even zeroed.

    Quote Originally Posted by HelpMeC
    It thinks I'm trying to delete a Fractal instance which is not possible as Fractal is an abstract class.
    Actually, it's not possible because Fractal is an object type, not a pointer type. That's what the error message is trying to tell you. You need to invoke delete on the pointer:
    Code:
    delete resToPrint.top();
    Strictly speaking, this is somewhat careless: a more careful version is to copy the pointer to a local pointer variable, call pop(), then use delete on the local pointer variable. The reason is that otherwise you have a stack in a partially invalid state (that you have no way of determining) before the pop, so in theory some maintainer mistakenly adding code between the delete and the pop could ruin things.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  12. #27
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    By the way, why did you choose a stack rather than a vector? There doesn't seem to be any last-in-first-out characteristics involved here.

    Another thing: you should check that a file was successfully opened before reading from it (or writing to it).

    I note that your main function is some 80 lines. You should consider breaking it up into smaller functions that each do one thing and does it well.

    Oh, and while this looks like it declares a constant, it doesn't:
    Code:
    const char *VALIDEXT = "csv";
    You need to make the pointer itself const:
    Code:
    const char * const VALIDEXT = "csv";
    Last edited by laserlight; 01-06-2020 at 07:35 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  13. #28
    Registered User
    Join Date
    Nov 2019
    Posts
    135
    @laserlight -
    This one: resToPrint.top() - gives us a pointer to a Fractal. But I don't want to delete the pointer itself (As the container is in charge of doing that) but what it points to - the Fractal object itself.

    This is how I see the things.

    I understand from what you say - it can't delete it as it's not a pointer. Is delete required to be performed only on pointers? Probably no.

    And still with your suggestion:

    Code:
        while (!resToPrint.empty())
        {
            std::cout << *(resToPrint.top()) << std::endl;
            delete resToPrint.top();
            resToPrint.pop();
        }
    Compiler says:
    error: ‘virtual Fractal::~Fractal()’ is protected within this context delete resToPrint.top();

    I guess I have to declare the destructor in Fractal base class in a public section, yea?


    -------------------------------------------------------------------------------------------------------------
    Your second comment:

    We can assume the file is opened successfully. Also, we have to print the fractals in reverse-order it's presented in the CSV file. Thanks for the const remark - it really can be bypassed in my version the const expected behavior.

    BTW, why the compiler warns about declaring const like that:

    Code:
    const std::string invalidErr = "Invalid input\n";
    Some exception can be thrown because static duration...
    Last edited by HelpMeC; 01-06-2020 at 07:41 AM.

  14. #29
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by HelpMeC
    resToPrint.top() - gives us a pointer to a Fractal. But I don't want to delete the pointer itself (As the container is in charge of doing that) but what it points to - the Fractal object itself.

    This is how I see the things.
    delete p; for some pointer p destroys the lone object that p points to (or does nothing if p is a null pointer). It does not destroy p itself.

    Quote Originally Posted by HelpMeC
    Is delete required to be performed only on pointers?
    Yes.

    Quote Originally Posted by HelpMeC
    I guess I have to declare the destructor in Fractal base class in a public section, yea?
    Yes.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  15. #30
    Registered User
    Join Date
    Nov 2019
    Posts
    135
    @laserlight -
    - delete issue - Very important and good to know.

    And here is a related question to the destruction issue -


    This is a piece of a major method inside an implementation of a class (I won't reveal here the whole code to avoid copying by students from my class - it would lead to disqualification):


    Code:
    std::vector<std::string> Fractal::toLines() const
    {
    // Base case:
    if (m_Dim == 0)
    {
    return std::vector<std::string>(1, FRACTALCELL);
    }
    
    // Here we create a new fractal of a specific type - Polymorphism is performed here as _smallerClone is a pure virtual method of the Fractal base class
    Fractal *smallerOne = _smallerClone(m_Dim - 1);
    
    // Here we save each line of the final sub-fractal
    std::vector<std::string> lines = smallerOne->toLines();
    
    // The final output board
    std::vector<std::string> output;
    At the end of such method - do I have to delete manually the smallerOne variable before the return?
    I thought that at the end of the method - everything is destroyed, including instances of the related class.
    Am I wrong?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Simple parser - K&R ex6-2
    By lukasjob in forum C Programming
    Replies: 3
    Last Post: 11-16-2011, 08:46 AM
  2. Simple parser
    By lruc in forum C Programming
    Replies: 5
    Last Post: 11-19-2009, 12:19 AM
  3. Thread safety for tiny simple functions
    By CodeMonkey in forum C++ Programming
    Replies: 16
    Last Post: 12-31-2008, 12:20 AM
  4. not able to understand this tiny tiny method
    By noobcpp in forum C++ Programming
    Replies: 5
    Last Post: 10-20-2008, 10:42 AM
  5. Open Source Tiny Simple TFTP Server
    By Geolingo in forum C++ Programming
    Replies: 1
    Last Post: 03-28-2004, 03:27 PM

Tags for this Thread