Thread: properlyFormatScript()

  1. #1
    Programming Ninja In-T...
    Join Date
    May 2009
    Posts
    827

    properlyFormatScript()

    I wrote a function for properly formatting a script extracted from an HTML file (i.e. by inserting newlines where appropriate). Here is the definition:

    Code:
    void C_script_operations::properlyFormatScript(string& scriptStr) {
    
        cout<< "Just entered properlyFormatScript()" <<endl;
        cout<< "scriptStr.size() is: " << scriptStr.size() <<endl;
        //cin.get();
        string temp_str = scriptStr;
        for (size_t i = 0; i < scriptStr.size(); i++) {
            cout<< "i is: " << i << endl;
            if ((scriptStr.at(i) == '{') && (!charIsInLiteralString(scriptStr, scriptStr.at(i), i))) {
                if (i < scriptStr.size() - 2) {
                    if (scriptStr.at(i+1) != '\n' && scriptStr.at(i+2) != '\n') {
                        temp_str.insert(i+1, "\n\n");
                    }
                }
                else
                    temp_str.append("\n\n");
    
            }
    
            else if ((scriptStr.at(i) == ';') && (!charIsInLiteralString(scriptStr, scriptStr.at(i), i))) {
                if (i < scriptStr.size() - 1) {
                    if (scriptStr.at(i+1) != '\n') {
                        temp_str.insert(i + 1, "\n");
                    }
                }
                else
                    temp_str.append("\n\n");
            }
    
            else if ((scriptStr.at(i) == '}') && !(charIsInLiteralString(scriptStr, scriptStr.at(i), i))) {
                if (scriptStr.at(i-1) != '\n' && scriptStr.at(i-2) != '\n') {
                    scriptStr.insert(i - 1, "\n\n");
                }
            }
        }
        scriptStr = temp_str;
    
    }
    This function works in some cases but not in others. I looked at a script my program had written to a file, and it had inserted a newline right after one of the semicolons (as it's supposed to), but for a few semicolons several characters further into the script string, it inserted the newline before the semicolon instead, and I don't know why.
    Note that I added the output lines only for debugging purposes.

    Please look over the code and tell me what you think. Thanks.
    Last edited by Programmer_P; 01-15-2011 at 04:47 PM.
    I'm an alien from another world. Planet Earth is only my vacation home, and I'm not liking it.

  2. #2
    tabstop, and the Hat of Guessing
    Join Date
    Nov 2007
    Posts
    14,336
    Why are you checking things in scriptStr? As soon as you insert something (anything) into temp_str, your indices will be off between the two strings.
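
    One way to keep reading from the original while writing into a copy is to carry a running offset that counts how many characters have been inserted so far. A minimal sketch, assuming a hypothetical free function `insertNewlineAfterSemicolons` (not from the thread), handling only semicolons and leaving out the charIsInLiteralString check:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): scan `original`, but write into a
// copy, translating each index by the number of characters inserted so far.
std::string insertNewlineAfterSemicolons(const std::string& original) {
    std::string copy = original;
    std::size_t offset = 0;  // characters inserted into `copy` so far
    for (std::size_t i = 0; i < original.size(); ++i) {
        if (original[i] != ';') continue;
        const bool followedByNewline =
            (i + 1 < original.size()) && original[i + 1] == '\n';
        if (!followedByNewline) {
            // Index i in `original` corresponds to index i + offset in `copy`.
            copy.insert(i + offset + 1, "\n");
            ++offset;
        }
    }
    assert(copy.size() == original.size() + offset);
    return copy;
}
```

    For the input "a;b;c" this returns "a;\nb;\nc": the second insert lands at index 5 of the copy rather than index 4, because one character had already been added.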

  3. #3
    Programming Ninja In-T...
    Join Date
    May 2009
    Posts
    827
    Quote Originally Posted by tabstop
    Why are you checking things in scriptStr? As soon as you insert something (anything) into temp_str, your indices will be off between the two strings.
    That's a carry-over from when I was using just scriptStr and no temp_str. For some reason it never exited the loop (I guess because scriptStr.size() kept changing), so I figured I could use a copy instead, which appeared to fix it except for the minor detail that prompted this thread.
    As for the indexes thing, I considered that, but thinking about it, it seems like the indexes aren't actually what's changing at each iteration of the loop. i is, and the underlying data accessed by i. So even though the data at index i+1 (for the first two outermost if/else if statements) and index i - 1 (for the last outermost else if statement) will change, the next iteration accesses the same index that string::insert() modified, which now holds what was inserted there (namely a newline character). Those if statements therefore won't be entered, and it will move on to the next index, which is the index of the character that originally held the position the inserted newline now holds, and it will check it again...hmm. Never mind. I think I just solved it.

    Thanks. Your post caused me to think harder.
    Last edited by Programmer_P; 01-15-2011 at 06:17 PM.

  4. #4
    Programming Ninja In-T...
    Join Date
    May 2009
    Posts
    827
    Hmm...I thought for sure that this code would work, but it still does nearly the same thing with the semicolons and new-lines:

    Code:
    void C_script_operations::properlyFormatScript(string& scriptStr) {
    
        cout<< "Just entered properlyFormatScript()" <<endl;
        cout<< "scriptStr.size() is: " << scriptStr.size() <<endl;
        //cin.get();
        string temp_str = scriptStr;
        bool this_char_was_already_checked = false;
        for (size_t i = 0; i < scriptStr.size(); i++) {
            cout<< "i is: " << i << endl;
            if ((scriptStr.at(i) == '{') && (!charIsInLiteralString(scriptStr, scriptStr.at(i), i))
               && (!this_char_was_already_checked)) {
                if (i < scriptStr.size() - 2) {
                    if (scriptStr.at(i+1) != '\n' && scriptStr.at(i+2) != '\n') {
                        temp_str.insert(i+1, "\n\n");
                        this_char_was_already_checked = true;
                    }
                }
                else
                    temp_str.append("\n\n");
    
            }
    
            else if ((scriptStr.at(i) == ';') && (!charIsInLiteralString(scriptStr, scriptStr.at(i), i))
                    && (!this_char_was_already_checked)) {
                if (i < scriptStr.size() - 1) {
                    if (scriptStr.at(i+1) != '\n') {
                        temp_str.insert(i + 1, "\n");
                        this_char_was_already_checked = true;
                    }
                }
                else
                    temp_str.append("\n\n");
            }
    
            else if ((scriptStr.at(i) == '}') && !(charIsInLiteralString(scriptStr, scriptStr.at(i), i))
                    && (!this_char_was_already_checked)) {
                if (scriptStr.at(i-1) != '\n' && scriptStr.at(i-2) != '\n') {
                    scriptStr.insert(i - 1, "\n\n");
                    this_char_was_already_checked = true;
                }
            }
    
            if (this_char_was_already_checked == true)
                this_char_was_already_checked = false; //reset it to false
        }
        scriptStr = temp_str;
    
    }
    This was based on the idea that the same char shouldn't be checked twice (i.e. if it changed indexes due to a call to scriptStr.insert()), so I thought that guarding against that would have the desired effect, even though I'm using a copy of scriptStr for my insert() operations and not the original string. But I guess I'm missing something.
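
    In the code above, this_char_was_already_checked is set inside a branch and reset at the bottom of the same iteration, so it is always false when the next iteration tests it and cannot block anything. When a single string is modified in place, another way to avoid looking at freshly inserted characters is to advance the index past them. A minimal sketch, assuming a hypothetical function `breakAfterSemicolons` that handles only semicolons and omits the literal-string check:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-place variant (not from the thread): after inserting a
// newline, step the index over it so the inserted text is not re-examined.
void breakAfterSemicolons(std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const bool needsNewline =
            s[i] == ';' && (i + 1 == s.size() || s[i + 1] != '\n');
        if (needsNewline) {
            s.insert(i + 1, "\n");
            ++i;  // skip the newline that was just inserted
        }
    }
}
```

    The loop still terminates although s.size() grows, because each semicolon triggers at most one insert and i keeps moving forward.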
    Last edited by Programmer_P; 01-15-2011 at 06:43 PM.

  5. #5
    tabstop, and the Hat of Guessing
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by Programmer_P
    As for the indexes thing, I considered that, but thinking about it, it seems like the indexes aren't actually what's changing at each iteration of the loop. i is,
    This is the stupidest sentence you've ever posted, and for you that's saying something. What do you think i is but the index?

    Think for five seconds:
    Code:
    123456789
    abc{123;  (Original)
    abc{123;  (Copy)
    
    abc{123;   (Original)
    abc{n123;  (Copy)
    (n stands for the new-line you inserted.) So now when your original tells you to do something at space 8, where does that land in your copy? You must search in temp to figure out where to insert things in temp.
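
    The same drift can be reproduced in a few lines. This sketch reuses the `abc{123;` example, but with 0-based indices, so the semicolon starts at index 7 instead of position 8. The function names are made up for the illustration, and both assume the input contains a '{' and a ';':

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustration only: where does ';' sit in the copy once a newline has been
// inserted right after the '{'?
std::size_t semicolonIndexAfterInsert(const std::string& original) {
    const std::size_t brace = original.find('{');
    assert(brace != std::string::npos);
    std::string copy = original;
    copy.insert(brace + 1, "\n");  // the copy is now one character longer
    return copy.find(';');
}

// Reproduces the pattern from the first post: the index is computed from
// `original`, but the insert is applied to the already-modified copy.
std::string insertUsingStaleIndex(const std::string& original) {
    const std::size_t brace = original.find('{');
    const std::size_t semi = original.find(';');
    assert(brace != std::string::npos && semi != std::string::npos);
    std::string copy = original;
    copy.insert(brace + 1, "\n");
    copy.insert(semi + 1, "\n");  // stale: `semi` is an index into `original`
    return copy;
}
```

    With "abc{123;" the semicolon moves from index 7 to index 8 in the copy, and the stale index produces "abc{\n123\n;", i.e. the newline ends up in front of the semicolon.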

  6. #6
    Programming Ninja In-T...
    Join Date
    May 2009
    Posts
    827
    Quote Originally Posted by tabstop
    This is the stupidest sentence you've ever posted, and for you that's saying something. What do you think i is but the index?
    I didn't think i was anything else. I'm just stating the obvious, though I may not have made it obvious enough...
    EDIT: nevermind
    Last edited by Programmer_P; 01-15-2011 at 07:17 PM.

  7. #7
    tabstop, and the Hat of Guessing
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by Programmer_P
    Yeah, but the problem with doing that is it changes temp_str.size() by making it a higher number, which means my loop could go on forever. That's what it was doing before when I was using just the original string for all operations, and that's why I started using temp_str to begin with.
    Except, of course, it does no such thing. (I took out the InStringLiteral thing rather than write my own. I hope you don't mind.)

    Code:
    #include <iostream>
    #include <string>
    
    using namespace std;
    void properlyFormatScript(string& scriptStr) {
    
        cout<< "Just entered properlyFormatScript()" <<endl;
        cout<< "scriptStr.size() is: " << scriptStr.size() <<endl;
        //cin.get();
        for (size_t i = 0; i < scriptStr.size(); i++) {
            cout<< "i is: " << i << endl;
            if ((scriptStr.at(i) == '{')) {
                if (i + 2 < scriptStr.size()) { // same test, but cannot wrap around when size() < 2
                    if (scriptStr.at(i+1) != '\n' && scriptStr.at(i+2) != '\n') {
                        scriptStr.insert(i+1, "\n\n");
                    }
                }
                else
                    scriptStr.append("\n\n");
    
            }
    
            else if ((scriptStr.at(i) == ';')) {
                if (i < scriptStr.size() - 1) {
                    if (scriptStr.at(i+1) != '\n') {
                        scriptStr.insert(i + 1, "\n");
                    }
                }
                else
                    scriptStr.append("\n\n");
            }
    
            else if ((scriptStr.at(i) == '}')) {
                // at(i-2) needs two preceding characters; insert directly before the brace
                if (i >= 2 && scriptStr.at(i-1) != '\n' && scriptStr.at(i-2) != '\n') {
                    scriptStr.insert(i, "\n\n");
                }
            }
        }
    
    }
    
    int main() {
        string foo("Something{withsome;bracesand;semicolons}init;");
        cout << foo << endl;
        properlyFormatScript(foo);
        cout << foo << endl;
        return 0;
    }
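
    The body of charIsInLiteralString never appears in this thread, so anyone wanting to run the example with such a check has to supply a stand-in. The following is only a guess at what such a helper might do, under a made-up name (`isInsideQuotedLiteral`) and with the same three-argument shape as the calls above. It tracks single and double quotes with backslash escapes and knows nothing about comments or regex literals:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in (the thread's real helper is not shown): reports
// whether index `pos` of `text` lies inside a '...' or "..." literal.
// The middle parameter is unused and only mirrors the call shape in the thread.
bool isInsideQuotedLiteral(const std::string& text, char /*ch*/, std::size_t pos) {
    assert(pos <= text.size());
    char openQuote = '\0';  // '\0' means "currently outside any literal"
    for (std::size_t i = 0; i < pos; ++i) {
        const char c = text[i];
        if (openQuote != '\0') {
            if (c == '\\') {
                ++i;  // skip the escaped character
            } else if (c == openQuote) {
                openQuote = '\0';  // closing quote found
            }
        } else if (c == '"' || c == '\'') {
            openQuote = c;  // opening quote found
        }
    }
    return openQuote != '\0';
}
```

    Each call rescans the text from the start, so using it once per character makes the formatter quadratic in the script length, which may matter for large scripts.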

  8. #8
    Programming Ninja In-T...
    Join Date
    May 2009
    Posts
    827
    Quote Originally Posted by tabstop
    Except, of course, it does no such thing. (I took out the InStringLiteral thing rather than write my own. I hope you don't mind.)

    I haven't tested your code yet (though I know for a fact that a loop just like yours was taking a long time to run on my computer; I terminated the program after waiting about 30 seconds and changed the code to use temp_str instead). Anyway, I just fixed my problem by using this code instead:

    Code:
    void C_script_operations::properlyFormatScript(string& scriptStr) {
    
        cout<< "Just entered properlyFormatScript()" <<endl;
        cout<< "scriptStr.size() is: " << scriptStr.size() <<endl;
        //cin.get();
        string temp_str = scriptStr;
        bool this_char_was_already_checked = false;
        for (size_t i = 0; i < scriptStr.size(); i++) {
            cout<< "i is: " << i << endl;
            if ((temp_str.at(i) == '>') && (!charIsInLiteralString(temp_str, temp_str.at(i), i))
               && (!this_char_was_already_checked)) {
                if (i < temp_str.size() - 1) {
                    if (temp_str.at(i+1) != '\n' && (i + 2 >= temp_str.size() || temp_str.at(i+2) != '\n')) { //guard at(i+2): i+1 may be the last index
                        temp_str.insert(i+1, "\n");
                        this_char_was_already_checked = true;
                    }
                }
                else
                    temp_str.append("\n\n");
    
            }
    
            else if ((temp_str.at(i) == '{') && (!charIsInLiteralString(temp_str, temp_str.at(i), i))
               && (!this_char_was_already_checked)) {
                if (i < temp_str.size() - 2) {
                    if (temp_str.at(i+1) != '\n' && temp_str.at(i+2) != '\n') {
                        temp_str.insert(i+1, "\n\n");
                        this_char_was_already_checked = true;
                    }
                }
                else
                    temp_str.append("\n\n");
    
            }
    
            else if ((temp_str.at(i) == ';') && (!charIsInLiteralString(temp_str, temp_str.at(i), i))
                    && (!this_char_was_already_checked)) {
                if (i < temp_str.size() - 1) {
                    if (temp_str.at(i+1) != '\n') {
                        temp_str.insert(i + 1, "\n");
                        this_char_was_already_checked = true;
                    }
                }
                else
                    temp_str.append("\n\n");
            }
    
            else if ((temp_str.at(i) == '}') && (!charIsInLiteralString(temp_str, temp_str.at(i), i))
                    && (!this_char_was_already_checked) && (i >= 2)) {
                if (temp_str.at(i-1) != '\n' && temp_str.at(i-2) != '\n') {
                    temp_str.insert(i, "\n\n"); //insert directly before the brace, not before the preceding char
                    this_char_was_already_checked = true;
                }
            }
    
            else if ((temp_str.at(i) == '<') && !(charIsInLiteralString(temp_str, temp_str.at(i), i))
                    && (!this_char_was_already_checked) && (i > 0)) {
                if (temp_str.at(i-1) != '\n') {
                    temp_str.insert(i, "\n"); //inserting at i - 1 would leave '<' preceded by a non-newline, so every later iteration would insert again
                    this_char_was_already_checked = true;
                }
            }
    
            if (this_char_was_already_checked == true)
                this_char_was_already_checked = false; //reset it to false
        }
        scriptStr = temp_str;
    
    }
    Before your post, I hadn't thought of controlling the loop with scriptStr.size() while iterating through and operating on temp_str, so thanks.
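
    One caveat with a loop bounded by scriptStr.size() while temp_str grows: the bound is the length before any insertion, so whatever ends up past that index in temp_str is never examined. A different approach that avoids index bookkeeping altogether is to append to a fresh output string in a single pass. A minimal sketch, assuming a hypothetical free function `formatScript` that covers only '{', ';' and '}' and omits the charIsInLiteralString check:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical single-pass formatter (not from the thread): read `in` once and
// append to `out`, so no index ever refers to a string that is being modified.
std::string formatScript(const std::string& in) {
    std::string out;
    out.reserve(in.size() + in.size() / 4);
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char c = in[i];
        // Put a blank line before '}' unless the output already ends a line.
        if (c == '}' && !out.empty() && out.back() != '\n') {
            out += "\n\n";
        }
        out += c;
        const bool nextIsNewline = (i + 1 < in.size()) && in[i + 1] == '\n';
        if (!nextIsNewline) {
            if (c == '{') {
                out += "\n\n";  // blank line after an opening brace
            } else if (c == ';') {
                out += '\n';    // one statement per line
            }
        }
    }
    assert(out.size() >= in.size());  // characters are only ever added
    return out;
}
```

    Applied to tabstop's test string "Something{withsome;bracesand;semicolons}init;", this yields "Something{\n\nwithsome;\nbracesand;\nsemicolons\n\n}init;\n". Appending is also linear in the input size, whereas repeated insert() calls shift the tail of the string each time.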
    Last edited by Programmer_P; 01-15-2011 at 08:13 PM. Reason: updated code
