Faster way of deleting characters

**paut** · 10-30-2011

Hey, I’ve got problem. I have 2 strings and I want to delete the first occurring of characters from str1 which are in str2. For example:

Str1: show
Str2: slow
Output: h
Str1: expensive
Str2: terrible
Output: xpnsve

I used find_first_of() function to solve this problem, but in practice it turned out too slow. Is there any faster way to solve this problem? Please, help

**Salem** · 10-30-2011

How "slow" is it now?
How did you measure it?
How much "faster" does it need to be for success?
Which OS/Compiler are you using?
Did you try using the compiler optimisation flags?

**paut** · 10-31-2011

1. I don’t know how much it is slow. Well it's just too slow.
2. I'm testing this program on a special website, which has its own input and only rates that the output is correct and gives the time relative to 1s, it's exactly shows achieved time if its less than 1s, or writes that is greater than 1s.
3. All solutions should be given in less than 1s.
4. At home, I'm using CodeBlocks, but I think it doesn't make any difference, because on this site probably it's different.
5. No, I didn’t tried.

The problem is that for the first 4 tests, the result is very good (the largest achieved time is ~ 0.05s), the fifth test doesn't want to pass because of exceeding the time limit.

I had some ideas to solve this problem, but in practice I only tested find_first_of(), function which turn out too slow. I don't see solution of this problem in checking individual letters, but in searching the whole subsequence and removing them. My another problem is that I'm not sure how it should look like now in the final code. LCS algorithm, it requires unfortunately too much of memory.

**Rodaxoleaux** · 10-31-2011

Based on what I see you trying to do, wouldn't you 'have' to check every letter considering the string could be anything? Try building with optimization. In Code::Blocks, it even says [faster code, even faster code, etc] next to each optim. choice.

**iMalc** · 11-01-2011

If your program isn't accepted by one of those online judges due to time restrictions, then by far the most likely answer is that you're using the wrong algorithm.
Standard library functions are only slow if you're using them poorly, or for the wrong purpose, or the wrong functions.

Give details on the problem you are solving, perferably linking to it, and then either describe your solution in sufficient detail or provide code.

**paut** · 11-01-2011

Ok. So this is how my solution of this problem looks:

Until j is less than lenght of str2 do :
1) Positions of letter from str1 in string str2 assign to p (p=str1.find_first_of(str1[j])
2) If letter occuring then i'm removing it from str2, if not then I'm adding it to a previously created variable s3 (if(p<0) s3.push_back(str1[i]); else str2.erase(p,1);)
At the end I'm write out s3.

I tried different solutions. I searched letters from string str2 in str1 and then if they appeared program deleting them and at the end show str1.
I could see that in 4/5 cases (also when my program can't past the test) str1 is longer than str2.
So it seems weird that algorithm presentated above is faster than this another one where from str1 I throw out one after another all letters from str2.
I also tried use array which can automatically adding another letter to s3 without checking that this letter appears in str2, but this solution turned out slower than this which I wrote above.
It's really hard for me find another alternative solution expect of this where from str1 I thorow out all the common subsequences from str2, but I don't know how to write it in a final code.

**KCfromNC** · 11-01-2011

Yep, wrong algorithm. Yours runs in O(nm), where n and m are the length of each string - find_first_of is a linear search, and it will be particularly painful when there's no instance of a letter from str2 in str1.

Here's a linear-time approach :

Code:

Create an array str2_letters[1<<CHAR_BITS], initialize to 0
For each letter in str2, set str2_letters[str2[i]] = 1
create temp string str3 = length of str1 + 1

for each letter in str1
  if str2_letters for that letter is set
    clear st2_letters at that index
  else
    append letter to str3

terminate str3 with a 0
optionally copy from str3 back into str1

To speed things up, keep a count of how many elements of str2_letter[] are still set and bail out early (don't forget to append the remaining characters from str1 to the end of str3). You might also play games to figure out which of str1 or str2 is shorter and have separate code for either case. Won't matter if they're similar size, but you might see an improvement if one is vastly larger than the other.

**paut** · 11-01-2011

Thank you for precisely explanation

but still I've got problem "Create an array str2_letters[1<<CHAR_BITS]" - What's char_bits?
At first glance it seems that is just lenght of str2, but it can't be true. After using google I found answer like this: "CHAR_BIT is the number of bits in char".
However I still not really understand this. Numbers of bits in char? It shouldn't be 8?
Please, explain it to me..

**laserlight** · 11-01-2011

Originally Posted by paut

Numbers of bits in char? It shouldn't be 8?

Yes, but in theory (and historically), a byte might not be 8 bits. Besides, even if a byte was guaranteed to be 8 bits, it is better to use a named constant than a magic number.

**paut** · 11-02-2011

I'm still not able to understand it. For example, if I've got the word ''show slow'' what range this array should have? "For each letter in str2, set str2_letters[str2[i]] = 1 ", this sentences shows what is the length of str2, but I don't know how this relate with char_bits ? “create temp string str3 = length of str1 + 1”, I think that string have dynamically selected size.

Create an array str2_letters[1<<CHAR_BITS], initialize to 0
For each letter in str2, set str2_letters[str2[i]] = 1
create temp string str3 = length of str1 + 1

for each letter in str1
if str2_letters for that letter is set
clear st2_letters at that index
else
append letter to str3

terminate str3 with a 0
optionally copy from str3 back into str1

In C++:

Code:

For (i=0; i<len.str2; i++) str2_letters=1;
 String s3;
 For (i=0; i<len.str1; i++)
 IF (str2_letters[i]) str2_letters.erase(tab.begin() + i -1);
 Else
 S3.Push_back(str2_letters[i]);

There is mistake after another in this part of code. Can anyone help? To tell the truth I started to learn this language not long time ago because I was starting from Pascal.

**KCfromNC** · 11-03-2011

The length of str2_letters has nothing to do with the length of str2. The length is simply the number of possible characters in
the alphabet used to build str2. In this case, a character is a single 8-bit byte = 1<< 8 = 256 possible character values.

You index into str2_letters using a character value. For example str2_letters['s'] == 0 means there's no character s anywhere in str2. Likewise, if str2_letters['s'] == 1, then there is an s in str2. Relate this to my pseudo-code by seeing that str1[i] or str2[i] is the character at index i in each string.

Code:

For (i=0; i<len.str2; i++) str2_letters=1;  <- I said to set str2_letters[str2[i]] =1 ... this is different from setting str2_letters = 1 (the latter being illegal in C++)
 String s3;
 For (i=0; i<len.str1; i++)  <- I don't think this is how you get the length of a string
 IF (str2_letters[i]) str2_letters.erase(tab.begin() + i -1);  <--- i isn't the character at index i in str1.  It's just i. And you can't call erase() on an array.  And what's tab in this context?
 Else
 S3.Push_back(str2_letters[i]);

**paut** · 11-03-2011

Well.. It is so much faster solution, I implemented it and my program passed all the tests (5/5). Thank you so much for help. I just had to make small correction in code which you wrote for me. Yours code was deleting all the occuring of the letters, but mine program just had to delete the first occur of the letter. So this why i made this correction.. I wrote „For each letter in str2, set str2_letters[str2[i]]++” instead of „For each letter in str2, set str2_letters[str2[i]] = 1” and "str2_letter[str1[i]--" insted of "clear st2_letters at that index". Everything works perfectly.

Thread: Faster way of deleting characters

Thread Tools

Search Thread

Display

Faster way of deleting characters

Similar Threads

Creating and deleting directories and deleting files

deleting characters

Deleting characters from 2d arrays

Deleting characters from strings

deleting invaid characters in a string

Tags for this Thread