C Board  

Old 09-13-2003, 07:49 PM   #1
Registered User
 
Join Date: Oct 2002
Posts: 27
better c string functions

I was looking for a different string library to use than the standard C one, or maybe just some simple wrappers for some of the C string functions. I did some searches and came up with a few possibilities.

Firestring http://freshmeat.net/projects/firestring/?topic_id=809
Better String Library http://bstring.sourceforge.net/

I was wondering if anyone uses these or anything similar. What would you recommend? Which one do you think is best/easiest to use?

Thanks,
samps005 is offline   Reply With Quote
Old 09-14-2003, 11:32 AM   #2
Obsessed with C
 
chrismiceli's Avatar
 
Join Date: Jan 2003
Posts: 501
what is wrong with the standard c string library?
chrismiceli is offline   Reply With Quote
Old 09-14-2003, 02:06 PM   #3
Code Goddess
 
Prelude's Avatar
 
Join Date: Sep 2001
Posts: 9,661
>what is wrong with the standard c string library?
A lot, actually. C doesn't support very good string handling functionality.

>What would you recommend?
Perl
__________________
My best code is written with the delete key.
Prelude is offline   Reply With Quote
Old 09-14-2003, 03:52 PM   #4
Registered User
 
Join Date: Oct 2002
Posts: 27
I'm not looking for a different language to use; if I were, I would just use Python. The thing is, I've gotten so used to the simple string operations in Python that now that I have to write some C code, I'm just looking for an easy way out.
samps005 is offline   Reply With Quote
Old 09-14-2003, 06:19 PM   #5
Been here, done that.
 
Join Date: May 2003
Posts: 1,036
If you don't like something, you really need to explain what you want. "Different string library" does not explain what you are trying to accomplish that the standard library can't handle. Mentioning Python does not explain what you want any better.

Since C is not an interpreted language like Python, *you* have control over what it does when it comes to strings. It's not a high-level language like Basic or Fortran; it exists in the no-man's land between those and Assembler. Python would be above even those mentioned.

Let me know what functions you want C to handle and I can build a string manipulation library for you that does exactly what you want. You'll need to decide whether it would be worth it to you.
__________________
There are only 10 types of people in the world -- those that use binary, and those that don't
WaltP is offline   Reply With Quote
Old 11-03-2003, 04:45 AM   #6
qed
Registered User
 
Join Date: Nov 2003
Posts: 7
Hi folks!

I am the author of the better string library (Bstrlib). As to the question of "Which [string library] do you think is best/easiest to use?" I want the answer to be the better string library. If you think it's not, remember that it is an open source project and if there is some criticism that you have, don't hesitate to send it my way.

One thing about it is that although I believe Bstrlib to be a very easy library to use (even easier than the C standard library, I would claim), the fact that I designed it probably gives me a ridiculous advantage in understanding how it works. So if you have questions about how to do something, I can add explanations for them to the documentation.

I am also reasonably familiar with the Python programming language. Realizing how simple and powerful Python is was part of what motivated me to write the better string library. Simple functionality like automatic memory management, extracting substrings, splitting/joining, buffer overflow and alias safety are built in.
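To make "automatic memory management" and "buffer overflow safety" concrete, here is a minimal sketch of a length-tracked, growable string in plain C. The `dstr` type and its functions are hypothetical names invented for illustration; this is not Bstrlib's code or API, and unlike Bstrlib it makes no attempt at alias safety.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-tracked string; NOT Bstrlib's actual type or API. */
typedef struct {
    char  *data;  /* always '\0' terminated so it can be passed to C functions */
    size_t len;   /* bytes in use, excluding the terminator */
    size_t cap;   /* bytes allocated, including room for the terminator */
} dstr;

/* Initialize an empty string. Returns 0 on success, -1 on allocation failure. */
int dstr_init(dstr *s) {
    assert(s != NULL);
    s->len = 0;
    s->cap = 16;
    s->data = malloc(s->cap);
    if (s->data == NULL) return -1;
    s->data[0] = '\0';
    return 0;
}

/* Append n bytes from src, growing the buffer first so the copy cannot
   overflow. src must not point into s->data (no alias safety here).
   Returns 0 on success, -1 on failure (the string is left unchanged). */
int dstr_append(dstr *s, const char *src, size_t n) {
    assert(s != NULL && src != NULL);
    if (n > SIZE_MAX - s->len - 1) return -1;      /* size arithmetic would wrap */
    size_t need = s->len + n + 1;
    if (need > s->cap) {
        size_t cap = s->cap;
        while (cap < need) {
            if (cap > SIZE_MAX / 2) { cap = need; break; }
            cap *= 2;                               /* geometric growth */
        }
        char *p = realloc(s->data, cap);
        if (p == NULL) return -1;
        s->data = p;
        s->cap = cap;
    }
    memcpy(s->data + s->len, src, n);
    s->len += n;
    s->data[s->len] = '\0';
    return 0;
}

/* Release the buffer and reset the fields. */
void dstr_free(dstr *s) {
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

Because the length is stored, appending never has to rescan the existing contents the way repeated strcat() calls do, and the caller never has to guess a buffer size up front.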

However, there is also functionality in Bstrlib that goes beyond the core of Python's string functions and more naturally maps to the C language. Improvements for most C standard library string functions are available to make sure that Bstrlib can meet any need that the C library can meet. Read-only attributes and static strings have been added to allow for robust and safe intermingling of stack based and heap based strings. One can make a purely reference-based substring of another string with very low overhead, for example. Abstract stream consumption functions have been added to mate bstrings to file I/O (or other kinds of I/O) in a way that isn't arbitrary and ad hoc like the C standard library. One can also access a '\0' terminated char * buffer version of the string for complete compatibility with ordinary C strings.
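The "purely reference-based substring" idea can be sketched in plain C as a pointer plus a length that borrows the parent's storage instead of copying it. The `strview` type and function names below are hypothetical and are not Bstrlib's own; the sketch only shows why such a substring costs almost nothing to create.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical non-owning substring; NOT Bstrlib's actual type.
   It is only valid while the parent string is alive and unmodified. */
typedef struct {
    const char *ptr;  /* points into the parent string, not '\0' terminated */
    size_t      len;  /* number of bytes in the view */
} strview;

/* View of up to n bytes of the '\0' terminated string s, starting at pos.
   Out-of-range requests are clamped rather than reading past the end. */
strview strview_mid(const char *s, size_t pos, size_t n) {
    assert(s != NULL);
    size_t total = strlen(s);
    strview v = { s + total, 0 };        /* empty view at the end by default */
    if (pos < total) {
        v.ptr = s + pos;
        v.len = (n < total - pos) ? n : total - pos;
    }
    return v;
}

/* Compare a view against a '\0' terminated literal. Returns 1 if equal. */
int strview_eq(strview v, const char *lit) {
    assert(lit != NULL);
    size_t n = strlen(lit);
    return v.len == n && memcmp(v.ptr, lit, n) == 0;
}
```

No allocation or copy happens in `strview_mid`; the trade-off is that the view is read-only and must not outlive its parent.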

And just for icing on the cake -- many of the Bstrlib functions are also asymptotically much faster than their analogues in the standard C library's string functions (often by a massive margin.)

As to the comment of "Let me know what functions you want C to handle and I can build a string manipulation library for you ...", this has been done over and over by no end of other people who are only too willing to reinvent the wheel (I am one of them!). However, delivering the total functionality of Bstrlib, and providing a transparent regression test is not usually high on the list of priorities for other string libraries. By covering a superset of C standard library functionality in the area of strings, it is also going to be a much better starting point for an application specific string manipulation library.

It's easy to decide to make your own string library, and it's no problem to find ways of improving over the joke that is the standard C library. But what happens when you need to ask "Does this library have performance anomalies?", "Is this library interoperable with ordinary char * functionality to support backward compatibility with other libraries?", "Does this library help reduce buffer overflow problems?", "Is this library portable to other platforms?", "What is the learning curve for this library?", "Is the resulting code going to be maintainable?", "Is the library thread safe?". I think Bstrlib does very well on these questions.

But of course, since I am the author, you did not just receive an unbiased opinion.
qed is offline   Reply With Quote
Old 11-04-2003, 10:23 AM   #7
Yes, my avatar is stolen
 
anonytmouse's Avatar
 
Join Date: Dec 2002
Posts: 2,544
The bstring library looks excellent. Its only downside is the lack of Unicode support (this seems to be shared by most other string libraries). Is the bstring library relatively stable? I was thinking of attempting to port it to Unicode at some point instead of developing my own solutions.

Do you think there are any major hurdles in changing it to Unicode? As far as I can tell, its internal storage needs to be changed to wchar_t and all the functions that take 'char *'s need a corresponding function that takes 'wchar_t *'s. The 'wchar_t *' versions would be the native functions while the 'char *' versions would convert their argument to wide char and call the wide character functions. Any thoughts?
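The wrapper scheme described above (a native 'wchar_t *' function plus a 'char *' version that converts and delegates) might look roughly like this in standard C. Both function names are made up for the example and have nothing to do with Bstrlib; the conversion relies on mbstowcs() and btowc(), so the result depends on the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical "native" wide function: count occurrences of wc in ws. */
size_t wcount_char(const wchar_t *ws, wchar_t wc) {
    assert(ws != NULL);
    size_t n = 0;
    for (; *ws != L'\0'; ++ws)
        if (*ws == wc) ++n;
    return n;
}

/* Narrow wrapper: convert the argument to wide characters using the
   current locale, then call the wide version.
   Returns (size_t)-1 if conversion or allocation fails. */
size_t count_char(const char *s, char c) {
    assert(s != NULL);
    /* The wide form never needs more elements than the narrow form has bytes. */
    size_t cap = strlen(s) + 1;
    wchar_t *ws = malloc(cap * sizeof *ws);
    if (ws == NULL) return (size_t)-1;

    size_t result = (size_t)-1;
    wint_t wc = btowc((unsigned char)c);          /* WEOF if c is not a valid single byte */
    if (wc != WEOF && mbstowcs(ws, s, cap) != (size_t)-1)
        result = wcount_char(ws, (wchar_t)wc);

    free(ws);
    return result;
}
```

Every narrow call pays for an allocation and a conversion, which is one reason a port along these lines is more than a mechanical search-and-replace.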
anonytmouse is offline   Reply With Quote
Old 11-04-2003, 12:59 PM   #8
Registered User
 
Frobozz's Avatar
 
Join Date: Dec 2002
Posts: 546
I would think it would be more than just a few simple changes. You'd have to look up what the standards are and see if you can make it comply. In other words... good luck.
Frobozz is offline   Reply With Quote
Old 11-04-2003, 01:28 PM   #9
qed
Registered User
 
Join Date: Nov 2003
Posts: 7
As to the question of stability:

Bstrlib is *very* usable. Prior to implementing the regression test, Bstrlib was very reliable with no known problems under "normal" usage. In writing the main test for bstring, I made a "best effort" attempt to hit each function on every corner, factoring in write-protection vs. static vs. NULL vs. empty bstrings as well as other more typical scenarios. The regression test pointed out a number of errors that could arise from seriously "on the fringe" usage. I fixed them, and recompiled all my projects without encountering a single problem. So I would say that Bstrlib is as bug-free as I can make it.

That said, an examination of the CVS tree would show you that I've been checking in non-trivial changes at a rate of about 1 per month for the past year. However, this has not come from API-bloating of the core functions. It's mostly been bug fixes, efficiency improvements, slight refactorings, documentation updates, etc. I.e., the code has always been *converging* towards a target.

I would say that I am more than happy with the current implementation of the core functions in bstrlib.c. If I were to add anything it might be a split function that acts on a stream (certain applications like reading large CSV database files would benefit from this), but otherwise, I would consider the core API closed.

Now that doesn't speak for the contents of bstraux.c. That module was supplied intentionally for non-core utility "bonus" functions. Besides bug fixes, I have been periodically adding a function or two here or there into bstraux.c. Basically any function I don't think necessarily belongs in the core, because it's not general enough or is of somewhat marginal utility, gets stuck in there.

As for the C++ stuff ... well, you have the advantage there that I am not much of a C++ person, and have barely touched that module.

So depending on how conservative you want to be ... you might like to wait a month to see if I do anything major to it before you decide if it's stable or not ... or you could trust me when I tell you "it recently hit a major milestone and is very stable now".

As to the question of wchar_t / UNICODE:

Of course, Bstrlib does not have any internationalization support. One of the things Bstrlib is really useful for is manipulating blocks of binary non-string data (i.e., stuff that can include non-ASCII, and '\0' in its contents.) If you were to simply re-implement bstring on top of wchar_t then such a property would be lost. The abstracted stream-based functions would also all of a sudden become a lot less useful.
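The point about binary data containing '\0' is easy to demonstrate with the standard library alone: strlen() stops at the first zero byte, whereas a buffer that carries its own length sees everything. The `blob` type below is a hypothetical stand-in for any counted buffer and is not Bstrlib's actual structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted buffer; NOT Bstrlib's actual type. */
typedef struct {
    const unsigned char *data;
    size_t               len;   /* explicit length, so '\0' is just another byte */
} blob;

/* Count zero bytes by walking the stored length instead of looking
   for a terminator. */
size_t blob_count_zeros(blob b) {
    assert(b.data != NULL || b.len == 0);
    size_t n = 0;
    for (size_t i = 0; i < b.len; ++i)
        if (b.data[i] == 0) ++n;
    return n;
}
```

For the seven bytes 'a', 'b', 0, 'c', 'd', 0, 'e', strlen() reports 2, while the counted buffer still holds all 7 bytes and `blob_count_zeros` finds both zeros.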

That said, the lack of international character set support is a definite weakness of Bstrlib. It was my intention to write a separate "Better Universal String Library" on the type "bustring" which would be a UCS-4 (what wchar_t is supposed to be) implementation minus the streaming functions, but then adding in conversion functions from the various UTF formats.

But a reading of some of the UNICODE documentation suggests that this would barely scratch the surface of what one would want from UNICODE string manipulation. String collating, and even something as simple as determining a character position (as opposed to a code point position) is non-trivial. The complexities boggle the mind (or at least they boggled mine) and were discouraging to say the least.
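A small illustration of the character-versus-code-point problem, assuming the input is valid UTF-8 (the function name is invented for this example): counting code points is simple, but it still does not tell you how many characters a reader sees.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a '\0' terminated string that is assumed to be
   valid UTF-8. Continuation bytes look like 10xxxxxx; every other byte
   starts a new code point. */
size_t utf8_code_points(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80) ++n;
    return n;
}
```

The precomposed "\xC3\xA9" (U+00E9) is one code point, while "e\xCC\x81" (the letter e followed by the combining acute accent U+0301) is two, yet both display as the same single character. Mapping code points to character positions needs the Unicode segmentation rules, which this sketch does not attempt.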

So what about relying on the underlying C compiler's library for support? Unfortunately, not all compilers support wchar_t and functions like wcscoll(). And WATCOM C/C++ (my compiler of choice) has pulled the seriously nasty trick of defining wchar_t as 16 bits for C, and 32 bits for C++. WATCOM C/C++ also seems to think that strcoll and strcmp are the same thing.
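Both portability concerns can be probed with a few lines of standard C. The helper names below are hypothetical; the sketch reports how wide wchar_t is on the compiler at hand and whether strcoll() and strcmp() order two strings the same way in the current locale. In the "C" locale the two are expected to agree; in other locales they may or may not, depending on what the platform provides.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Width of wchar_t in bits on this compiler (commonly 16 or 32). */
unsigned wchar_bits(void) {
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Reduce a comparison result to -1, 0, or 1 so results can be compared. */
int sign_of(int x) {
    return (x > 0) - (x < 0);
}

/* Returns 1 if strcoll() and strcmp() order a and b the same way in the
   current LC_COLLATE locale, 0 otherwise. */
int coll_matches_cmp(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return sign_of(strcoll(a, b)) == sign_of(strcmp(a, b));
}
```

A library that wants identical behavior everywhere cannot rely on either value, which is the portability problem described above.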

For me, portability is not something I want to negotiate on, and I don't want it to be a mess of conditional compilation. So that would leave me with the task of actually implementing a string collation function for UNICODE myself.

But maybe you see the problem more clearly than I do, and maybe you don't care about portability to older compilers as much. If you would like to take on the task yourself, then I would certainly like to see what you come up with. Since it's under the BSD license, you are under no obligation to share whatever you do, but it would be nice if you did.
qed is offline   Reply With Quote


All times are GMT -6.

