eliminating \n and \r from string [boost::iostreams]


  1. #1
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630

    eliminating \n and \r from string [boost::iostreams]

    Hello

    I use a boost::iostreams filtering input stream to filter the input file.

    Here's the example code:
    Code:
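    #include <exception>
    #include <iostream>
    #include <string>
    #include <vector>
    
    #include <boost/iostreams/device/file.hpp>
    #include <boost/iostreams/filter/line.hpp>
    #include <boost/iostreams/filtering_stream.hpp>
    #include <boost/ref.hpp>
    
    using namespace std;
    using namespace boost::iostreams;
    namespace io = boost::iostreams;
    
    // Counts every line and collects the lines shorter than max_length.
    class long_line_counter : public boost::iostreams::line_filter {
    public:
     explicit long_line_counter(int max_length = 80)
      : max_(max_length), count_(0) { }
    
     int count() const { return count_; }
    
     vector<string> m_list;
    
    private:
     // Called by line_filter once for each line of input.
     std::string do_filter(const std::string& line) {
      if (line.size() < static_cast<std::string::size_type>(max_)) m_list.push_back(line);
      ++count_;
      return line;
     }
    
     int max_;
     int count_;
    };
    
    int main() {
     try {
      filtering_istream fin;
      long_line_counter llc(80);
    
      file_source fss("outest.fil", ios::in | ios::binary);
    
      fin.push(boost::ref(llc));
      fin.push(boost::ref(fss));
    
      // Drain the stream so that every line passes through the filter.
      char buf[4012];
      while ( fin.read(buf, 4012) ) {
      }
    
      std::cout << llc.count() << "\n";
      std::cout << llc.m_list.size() << "\n";  // number of short lines collected
     }
     catch (const exception &e) {
      cout << "exception: " << e.what() << std::endl;
     }
    
     return 0;
    }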
    Now the problem is that the boost::iostreams line_filter passes the line (string) to the do_filter() function with the \r and \n characters still attached. So if it reads the line "customer: Steve Porter\r\n", it passes that same string to the function instead of the string without '\r\n'.

    I've been wondering what the most efficient way is to remove the '\r' and '\n' characters from this std::string.

    Should I check character by character, or maybe use a regex? Is there a more efficient way, since I have to do this many thousands of times in a short period?

    Thanks for help!

  2. #2
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,893
    Seems like you want a trim_right. String_Algo has one.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law
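
For reference, the right-trim suggested here can be sketched with the standard library alone. The helper name strip_line_ending is hypothetical and not part of Boost; with Boost String_Algo, boost::algorithm::trim_right_if(line, boost::algorithm::is_any_of("\r\n")) should have the same effect. A minimal sketch, assuming the only characters to drop are trailing '\r' and '\n':

```cpp
#include <string>

// Hypothetical helper (not part of Boost): removes any trailing '\r' or '\n'
// characters from `line` in place, without building a new string.
inline void strip_line_ending(std::string& line) {
    const std::string::size_type last = line.find_last_not_of("\r\n");
    if (last == std::string::npos) {
        line.clear();          // the line held nothing but CR/LF characters
    } else {
        line.erase(last + 1);  // drop everything after the last kept character
    }
}
```

Because do_filter() receives the line as a const reference, it would need to apply this to a copy of the line. A '\r' in the middle of a line is left untouched.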

  3. #3
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,230
    I'm confused. From my brief reading of the Boost documentation, it looks like it should be removing the line terminator character already. So maybe you just have the line termination mode set incorrectly.

  4. #4
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630
    Quote (originally posted by brewbuck):
    "I'm confused. From my brief reading of the Boost documentation, it looks like it should be removing the line terminator character already. So maybe you just have the line termination mode set incorrectly."

    You are right.

    That's how I open the stream:

    Code:
    line_filter m_line_filter; //filter class that derives from line_filter
    boost::iostreams::filtering_istream m_filter_stream;
    m_filter_stream.push(boost::ref(m_line_filter));
    m_filter_stream.push(boost::iostreams::file_source("file.txt", std::ios::in | std::ios::binary));
    Not sure why it doesn't remove the line terminator '\r' even though I open the file in binary mode. Any idea?

    Thanks a lot!

  5. #5
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,230
    In binary mode the line terminator is just '\n' not "\r\n". Multi-character line terminators are a feature of NON-binary access modes.

  6. #6
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630
    Quote (originally posted by brewbuck):
    "In binary mode the line terminator is just '\n' not "\r\n". Multi-character line terminators are a feature of NON-binary access modes."

    I meant that with the code above, it still passes "someline\r" instead of "someline" to the do_filter() function.

  7. #7
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,893
    Yes, because you're in binary mode.

  8. #8
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630
    Quote (originally posted by CornedBee):
    "Yes, because you're in binary mode."

    How should I open it, then, to get "someline" instead of "someline\r"?

  9. #9
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,893
    In text mode, which is the default.

  10. #10
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630
    Quote (originally posted by CornedBee):
    "In text mode, which is the default."

    As far as I know, won't that affect performance?

  11. #11
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,893
    No. The scanning for LF already happens because your filter goes line by line. Kicking out a CR ought to be quite unnoticeable.
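
To illustrate why dropping the CR is cheap: once a line has been split at '\n', removing a trailing '\r' is a single check on the last character. The sketch below uses std::getline on a plain std::istream rather than boost::iostreams, and read_lines_without_cr is a hypothetical name chosen for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` line by line and drops one trailing '\r'
// from each line, so "a\r\n" and "a\n" both yield "a".
inline std::vector<std::string> read_lines_without_cr(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {      // getline already scans for '\n'
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);  // constant-time check and removal
        }
        lines.push_back(line);
    }
    return lines;
}
```

The same last-character check could be done at the top of do_filter(), on a copy of the line, if the file has to stay in binary mode.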

  12. #12
    Confused Magos's Avatar
    Join Date
    Sep 2001
    Location
    Sweden
    Posts
    3,145
    Opening a text file in binary mode can have undesirable side effects if there is a BOM (Byte Order Mark) present.
    MagosX.com

    Give a man a fish and you feed him for a day.
    Teach a man to fish and you feed him for a lifetime.

  13. #13
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630
    I have always thought that the right way on Windows is to open a text file in binary mode, because of performance and other issues.

  14. #14
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,893
    I've yet to see a narrow text mode that cares about the BOM in any way.

    l2u: Well, you can always trim the thing yourself.

  15. #15
    l2u
    Registered User
    Join Date
    May 2006
    Posts
    630
    Which way do you think is better: letting text mode do the job, or doing it yourself?

    What's the flag to open the file (file_source) in text mode? I couldn't find it in the boost::iostreams documentation (I tried ios::text).
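
As far as the standard library goes, there is no ios::text flag: text mode is what you get when std::ios::binary is left out of the open mode. The sketch below shows this with std::ifstream. It assumes (not confirmed in this thread) that boost::iostreams::file_source takes the same std::ios_base::openmode argument, so omitting ios::binary there would likewise select text mode. The names text_mode, binary_mode and slurp are hypothetical. Whether text mode actually translates "\r\n" to "\n" depends on the platform: it does on Windows, while on POSIX systems both modes behave the same.

```cpp
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

// There is no std::ios::text flag. Text mode is simply the absence of
// std::ios::binary in the open mode.
const std::ios_base::openmode text_mode   = std::ios::in;
const std::ios_base::openmode binary_mode = std::ios::in | std::ios::binary;

// Hypothetical helper: reads a whole file using the given open mode.
// Returns an empty string if the file cannot be opened.
inline std::string slurp(const std::string& path, std::ios_base::openmode mode) {
    std::ifstream in(path.c_str(), mode);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling slurp("file.txt", text_mode) and slurp("file.txt", binary_mode) on a file with CRLF line endings and comparing the sizes is a quick way to see whether the platform translates line endings at all.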
