Thread: Faster way of deleting characters

  1. #1
    Registered User
    Join Date
    Oct 2011
    Posts
    9

    Question Faster way of deleting characters

    Hey, I’ve got problem. I have 2 strings and I want to delete the first occurring of characters from str1 which are in str2. For example:

    1. Str1: show
      Str2: slow
      Output: h
    2. Str1: expensive
      Str2: terrible
      Output: xpnsve

    I used find_first_of() function to solve this problem, but in practice it turned out too slow. Is there any faster way to solve this problem? Please, help

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    How "slow" is it now?
    How did you measure it?
    How much "faster" does it need to be for success?
    Which OS/Compiler are you using?
    Did you try using the compiler optimisation flags?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Oct 2011
    Posts
    9
    1. I don’t know how much it is slow. Well it's just too slow.
    2. I'm testing this program on a special website, which has its own input and only rates that the output is correct and gives the time relative to 1s, it's exactly shows achieved time if its less than 1s, or writes that is greater than 1s.
    3. All solutions should be given in less than 1s.
    4. At home, I'm using CodeBlocks, but I think it doesn't make any difference, because on this site probably it's different.
    5. No, I didn’t tried.

    The problem is that for the first 4 tests, the result is very good (the largest achieved time is ~ 0.05s), the fifth test doesn't want to pass because of exceeding the time limit.

    I had some ideas to solve this problem, but in practice I only tested find_first_of(), function which turn out too slow. I don't see solution of this problem in checking individual letters, but in searching the whole subsequence and removing them. My another problem is that I'm not sure how it should look like now in the final code. LCS algorithm, it requires unfortunately too much of memory.
    Last edited by paut; 10-31-2011 at 05:00 PM.

  4. #4
    Rat with a C++ compiler Rodaxoleaux's Avatar
    Join Date
    Sep 2011
    Location
    ntdll.dll
    Posts
    203
    Based on what I see you trying to do, wouldn't you 'have' to check every letter considering the string could be anything? Try building with optimization. In Code::Blocks, it even says [faster code, even faster code, etc] next to each optim. choice.
    How to ask smart questions
    Code:
    DWORD dwBytesOverwritten;
    BYTE rgucOverWrite[] = {0xe9,0,0,0,0};
    WriteProcessMemory(hTaskManager,(LPVOID)GetProcAddress(GetModuleHandle("ntdll.dll"),"NtQuerySystemInformation"),rgucOverWrite,5,&dwBytesOverwritten);

  5. #5
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    If your program isn't accepted by one of those online judges due to time restrictions, then by far the most likely answer is that you're using the wrong algorithm.
    Standard library functions are only slow if you're using them poorly, or for the wrong purpose, or the wrong functions.

    Give details on the problem you are solving, perferably linking to it, and then either describe your solution in sufficient detail or provide code.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  6. #6
    Registered User
    Join Date
    Oct 2011
    Posts
    9
    Ok. So this is how my solution of this problem looks:

    Until j is less than lenght of str2 do :
    1) Positions of letter from str1 in string str2 assign to p (p=str1.find_first_of(str1[j])
    2) If letter occuring then i'm removing it from str2, if not then I'm adding it to a previously created variable s3 (if(p<0) s3.push_back(str1[i]); else str2.erase(p,1);)
    At the end I'm write out s3.

    I tried different solutions. I searched letters from string str2 in str1 and then if they appeared program deleting them and at the end show str1.
    I could see that in 4/5 cases (also when my program can't past the test) str1 is longer than str2.
    So it seems weird that algorithm presentated above is faster than this another one where from str1 I throw out one after another all letters from str2.
    I also tried use array which can automatically adding another letter to s3 without checking that this letter appears in str2, but this solution turned out slower than this which I wrote above.
    It's really hard for me find another alternative solution expect of this where from str1 I thorow out all the common subsequences from str2, but I don't know how to write it in a final code.
    Last edited by paut; 11-01-2011 at 09:57 AM.

  7. #7
    Registered User
    Join Date
    Mar 2009
    Posts
    344
    Yep, wrong algorithm. Yours runs in O(nm), where n and m are the length of each string - find_first_of is a linear search, and it will be particularly painful when there's no instance of a letter from str2 in str1.

    Here's a linear-time approach :

    Code:
    Create an array str2_letters[1<<CHAR_BITS], initialize to 0
    For each letter in str2, set str2_letters[str2[i]] = 1
    create temp string str3 = length of str1 + 1
    
    for each letter in str1
      if str2_letters for that letter is set
        clear st2_letters at that index
      else
        append letter to str3
    
    terminate str3 with a 0
    optionally copy from str3 back into str1
    To speed things up, keep a count of how many elements of str2_letter[] are still set and bail out early (don't forget to append the remaining characters from str1 to the end of str3). You might also play games to figure out which of str1 or str2 is shorter and have separate code for either case. Won't matter if they're similar size, but you might see an improvement if one is vastly larger than the other.

  8. #8
    Registered User
    Join Date
    Oct 2011
    Posts
    9
    Thank you for precisely explanation but still I've got problem "Create an array str2_letters[1<<CHAR_BITS]" - What's char_bits?
    At first glance it seems that is just lenght of str2, but it can't be true. After using google I found answer like this: "CHAR_BIT is the number of bits in char".
    However I still not really understand this. Numbers of bits in char? It shouldn't be 8?
    Please, explain it to me..

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by paut
    Numbers of bits in char? It shouldn't be 8?
    Yes, but in theory (and historically), a byte might not be 8 bits. Besides, even if a byte was guaranteed to be 8 bits, it is better to use a named constant than a magic number.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  10. #10
    Registered User
    Join Date
    Oct 2011
    Posts
    9
    I'm still not able to understand it. For example, if I've got the word ''show slow'' what range this array should have? "For each letter in str2, set str2_letters[str2[i]] = 1 ", this sentences shows what is the length of str2, but I don't know how this relate with char_bits ? “create temp string str3 = length of str1 + 1”, I think that string have dynamically selected size.

    Create an array str2_letters[1<<CHAR_BITS], initialize to 0
    For each letter in str2, set str2_letters[str2[i]] = 1
    create temp string str3 = length of str1 + 1

    for each letter in str1
    if str2_letters for that letter is set
    clear st2_letters at that index
    else
    append letter to str3

    terminate str3 with a 0
    optionally copy from str3 back into str1
    In C++:

    Code:
    For (i=0; i<len.str2; i++) str2_letters=1;
     String s3;
     For (i=0; i<len.str1; i++)
     IF (str2_letters[i]) str2_letters.erase(tab.begin() + i -1);
     Else
     S3.Push_back(str2_letters[i]);
    There is mistake after another in this part of code. Can anyone help? To tell the truth I started to learn this language not long time ago because I was starting from Pascal.

  11. #11
    Registered User
    Join Date
    Mar 2009
    Posts
    344
    The length of str2_letters has nothing to do with the length of str2. The length is simply the number of possible characters in
    the alphabet used to build str2. In this case, a character is a single 8-bit byte = 1<< 8 = 256 possible character values.

    You index into str2_letters using a character value. For example str2_letters['s'] == 0 means there's no character s anywhere in str2. Likewise, if str2_letters['s'] == 1, then there is an s in str2. Relate this to my pseudo-code by seeing that str1[i] or str2[i] is the character at index i in each string.

    Code:
    For (i=0; i<len.str2; i++) str2_letters=1;  <- I said to set str2_letters[str2[i]] =1 ... this is different from setting str2_letters = 1 (the latter being illegal in C++)
     String s3;
     For (i=0; i<len.str1; i++)  <- I don't think this is how you get the length of a string
     IF (str2_letters[i]) str2_letters.erase(tab.begin() + i -1);  <--- i isn't the character at index i in str1.  It's just i. And you can't call erase() on an array.  And what's tab in this context?
     Else
     S3.Push_back(str2_letters[i]);
    Last edited by KCfromNC; 11-03-2011 at 08:16 AM.

  12. #12
    Registered User
    Join Date
    Oct 2011
    Posts
    9
    Well.. It is so much faster solution, I implemented it and my program passed all the tests (5/5). Thank you so much for help. I just had to make small correction in code which you wrote for me. Yours code was deleting all the occuring of the letters, but mine program just had to delete the first occur of the letter. So this why i made this correction.. I wrote „For each letter in str2, set str2_letters[str2[i]]++” instead of „For each letter in str2, set str2_letters[str2[i]] = 1” and "str2_letter[str1[i]--" insted of "clear st2_letters at that index". Everything works perfectly.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Creating and deleting directories and deleting files
    By fguy817817 in forum C Programming
    Replies: 1
    Last Post: 04-08-2009, 07:26 AM
  2. deleting characters
    By scrasnu in forum C Programming
    Replies: 5
    Last Post: 09-01-2008, 09:21 AM
  3. Deleting characters from 2d arrays
    By `firefox in forum C Programming
    Replies: 4
    Last Post: 05-21-2005, 05:18 PM
  4. Deleting characters from strings
    By mister_s in forum C Programming
    Replies: 8
    Last Post: 11-10-2004, 04:49 AM
  5. deleting invaid characters in a string
    By Granger9 in forum C Programming
    Replies: 11
    Last Post: 09-12-2002, 07:22 PM

Tags for this Thread