Style Points

**jason_m** · 05-27-2008

Hello,

I am beginning work on a mathematical library and am getting stuck designing the interface. Some people's code just seems to be more elegant than others'. Their library interfaces give you everything you need and make it a pleasure to use their library. Others are clearly missing functionality, are inconsistent, or provide unnecessary features. A person can know the language, the syntax, etc. just as well as another, but I think style points go a long way in program design. I was hoping to lay out a small piece of my project and see what style points I can gain from any feedback.

The starting point for my library is a table of numbers. This "table" is really a "box" in the sense that it has three dimensions (a collection of 2D tables). The dimensions are unknown ahead of time. The starting point for me is:

Code:

struct foo_table {
  int x;                   /* x dimension */
  int y;                   /* y dimension */
  int z;                   /* z dimension */
  float ***table;
};

Easy enough...I hope I haven't messed it up too badly yet. For the table to be of any use, it should probably hold its dimensions, otherwise how would the library know where it ended?

I would like to also give the table an identifier, so I'll add a name in there too:

Code:

struct foo_table {
  char *name;
  int x;
  int y;
  int z;
  float ***table;
};

The library will then go on to define the "core" mathematical functions that can be performed on the table. That isn't too hard...that is the whole point of the library. But I'm left with ambiguity around how to handle some of the finer points. Namely, how many "convenience" functions do I define in the library? A user of the library could use the structure as is just fine if they really wanted to. However, I could probably make it a bit easier for them with a well-designed interface.

I think it would be nice to provide something that returns a pointer to newly allocated space for an instance of the structure, so I'll make a "new" function for that. But there are a lot of functions to choose from. Four such candidates are:

Code:

struct foo_table *foo_table_new(void);                                                     /* candidate A */
struct foo_table *foo_table_new(int x, int y, int z);                                      /* candidate B */
struct foo_table *foo_table_new(int x, int y, int z, float ***table);                   /* candidate C */
struct foo_table *foo_table_new(char *name, int x, int y, int z, float ***table); /* candidate D */

Candidate A barely saves the user any time. It doesn't initialize or allocate space for any of the member variables. All of this will need to be done at some point. Maybe a function is provided like foo_table_table_set() that takes a struct foo_table *, the three dimensions and a float ***. The function allocates enough space as indicated by the dimension parameters and copies over the data in the passed in float*** up to the dimensions indicated by x, y, and z.

Candidate B can initialize the member variables indicating the size of the table held in the structure and maybe allocate the appropriate amount of space for the float ***. Some time after a call to candidate B, the user could call a function like foo_table_table_set() and pass it a struct foo_table * and a float ***. The function would work like the one mentioned above, but rely on the dimension members x, y, z of the structure, rather than parameters passed in via the function.

Candidate C saves the user a call to another helper function that they are almost certainly going to want to call at some point, and can provide the same "consistency" feature by allocating its own table and copying over values from the passed-in table up to its dimensions. Assuming the user uses the library interface rather than trying to do it all on his/her own, B and C ensure a level of consistency that A by itself cannot.
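For illustration, here is a minimal sketch of what candidate C might look like with the float *** layout. It assumes the constructor makes its own deep copy of the caller's table, that a NULL source just leaves the table zero-filled, and that failure is reported by returning NULL. The matching foo_table_free() and those conventions are assumptions, not something settled in the thread.

```c
#include <stdlib.h>

struct foo_table {
  int x, y, z;
  float ***table;
};

/* Free a table built by foo_table_new(), including a partly built one. */
void foo_table_free(struct foo_table *t)
{
  if (t == NULL)
    return;
  if (t->table != NULL) {
    for (int i = 0; i < t->x; i++) {
      if (t->table[i] == NULL)
        continue;
      for (int j = 0; j < t->y; j++)
        free(t->table[i][j]);          /* free(NULL) is a no-op */
      free(t->table[i]);
    }
    free(t->table);
  }
  free(t);
}

/* Candidate C: allocate a private x*y*z table and copy src into it.
   A NULL src leaves the table zero-filled. Returns NULL on failure. */
struct foo_table *foo_table_new(int x, int y, int z, float ***src)
{
  if (x <= 0 || y <= 0 || z <= 0)
    return NULL;

  struct foo_table *t = malloc(sizeof *t);
  if (t == NULL)
    return NULL;
  t->x = x;
  t->y = y;
  t->z = z;

  t->table = malloc((size_t)x * sizeof *t->table);
  if (t->table == NULL) {
    free(t);
    return NULL;
  }
  for (int i = 0; i < x; i++)
    t->table[i] = NULL;                /* lets cleanup skip unbuilt planes */

  for (int i = 0; i < x; i++) {
    t->table[i] = malloc((size_t)y * sizeof *t->table[i]);
    if (t->table[i] == NULL) {
      foo_table_free(t);
      return NULL;
    }
    for (int j = 0; j < y; j++)
      t->table[i][j] = NULL;           /* lets cleanup skip unbuilt rows */

    for (int j = 0; j < y; j++) {
      float *row = malloc((size_t)z * sizeof *row);
      if (row == NULL) {
        foo_table_free(t);
        return NULL;
      }
      for (int k = 0; k < z; k++)
        row[k] = (src != NULL) ? src[i][j][k] : 0.0f;
      t->table[i][j] = row;
    }
  }
  return t;
}
```

Because the constructor always allocates the inner table itself, the free function can release everything it finds without guessing who allocated what.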

Candidate D provides all of the features of C, plus takes care of the name right away too. However, I see a distinction between the features of Candidates B and C versus D. The improvements that B and C made over A helped ensure consistency in the data structure. No consistency is gained by using candidate D, and requiring the user to pass in additional information can become burdensome.

I could provide them all, but C doesn't provide function overloading, so naming them well could become challenging.

Further complicating this is if I were to provide a function to free the space used by a struct foo_table. Undefined behavior happens if I try to free something that wasn't allocated. If, say, candidate A above were used to allocate space for my data, then it will be up to the user to know if they also allocated any space for the name or the table of floats, and they will have to take care of freeing that manually. It doesn't sound like my free function will be helpful at all then. In fact, unless its name is no more than four characters, it will end up taking more effort from the user to free up the memory!

In contrast, candidates B and C give me some level of assurance that space was allocated for the internal table, and I can free that as well as the data structure itself. However, only if the user used candidate D can I assume that I can free everything: the space used for the name, the internal table, and the structure itself. But candidate D was the "new" function I felt may be encroaching on being burdensome.

Then what other helpers do I provide? Is it necessary to define a function that given a struct foo_table *, returns the x dimension? The user could just access that directly through the structure rather than ask a function for it. What that does do, however, is indicate to the user what they should/should not be messing around with. Since C doesn't allow for private data members AFAIK, there is nothing to stop the user from (re)setting say the member variable holding the x dimension of the table to something beyond what it should be. However, if say I provide "get" functions for the dimensions, but not "set" functions, that implies to the user how it was intended to be used. Maybe the only "set" function bundles together the dimensions and a table?

What about the name? I feel like everywhere I look, I see people saying use the "n" series of string functions (strncmp, strncpy, etc.) over their "inferior" counterparts that don't provide any measure of "bounds checking". However, I don't know that I feel it is the place of my library to be defining the maximum length that a name can be. I could leave it all up to the user to take care of, or I suppose I could provide a function like

Code:

foo_table_name_set(char *name, int max_length) /* whatever the return type may be */

which may give the user a friendly reminder that they should be doing some sort of bound checking.
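One alternative to a max_length parameter, sketched here under assumptions: the library keeps its own copy of the name, sized from the string it is given, so there is no fixed buffer to overflow and no maximum length for the library to dictate. The struct parameter, the int return value, and the "NULL clears the name" rule are hypothetical choices for this sketch.

```c
#include <stdlib.h>
#include <string.h>

struct foo_table {
  char *name;                          /* owned copy, or NULL */
  /* dimensions and data omitted for brevity */
};

/* Replace the table's name with a private copy of `name`.
   Passing NULL clears the name. Returns 0 on success, -1 on failure. */
int foo_table_name_set(struct foo_table *t, const char *name)
{
  char *copy = NULL;

  if (t == NULL)
    return -1;
  if (name != NULL) {
    size_t len = strlen(name) + 1;     /* include the terminator */
    copy = malloc(len);
    if (copy == NULL)
      return -1;                       /* old name is left untouched */
    memcpy(copy, name, len);
  }
  free(t->name);                       /* free(NULL) is a no-op */
  t->name = copy;
  return 0;
}
```

The copy is made before the old name is freed, so a failed allocation leaves the table unchanged, and passing the table's current name back in is also safe.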

So, there you have it. I apologize if this was too general/vague and hope it doesn't spark any kind of flame war since this may boil down to personal preference. I feel that I know *how* to program this up, I'm just not sure *what* I should be programming. If the interface is poorly defined, then the user isn't going to use it, and there was no point in making one in the first place. Just create the necessary data types and let the user be on their way.

However, if the interface is well designed, hopefully the user will use it, and ideally gain some level of confidence that they are using the data structures as intended and their data is consistent.

I know using a different language like C++ for example could offer some help. I could make private the data members I didn't want the user messing around with and then would be forced to provide the "get" and "set" functions I needed. Also, I could overload the constructor to allow for any of the candidates above. Using another language is a real alternative I will consider, but first I was hoping to see what suggestions you all might have.

Any thoughts, about anything I outlined specifically or just in general? Good rules of thumb? Standards of practice?

Thanks in advance,
Jason

**laserlight** · 05-27-2008

One way to do this is like what SQLite does: provide an opaque structure, and then require the user to just use the library interface. The GMP library does likewise, but goes a step further by providing a typedef of the pointer. In both cases, library users are discouraged from directly accessing the internals, and do so at their own risk.
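A rough sketch of the opaque-structure idea, shown in one file for brevity. The first part is what might go in a public header and the second part would stay in the library's source file. The function names foo_table_new() and foo_table_get_x/y/z() and the flat float * storage are assumptions made for the example.

```c
#include <stdlib.h>

/* ---- what would go in the public header, e.g. foo_table.h ---- */

typedef struct foo_table foo_table;    /* incomplete type: members hidden */

foo_table *foo_table_new(int x, int y, int z);
void       foo_table_free(foo_table *t);
int        foo_table_get_x(const foo_table *t);
int        foo_table_get_y(const foo_table *t);
int        foo_table_get_z(const foo_table *t);

/* ---- what would stay private in the library source file ---- */

struct foo_table {
  int x, y, z;
  float *data;                         /* x*y*z values; layout is hidden */
};

foo_table *foo_table_new(int x, int y, int z)
{
  if (x <= 0 || y <= 0 || z <= 0)
    return NULL;

  foo_table *t = malloc(sizeof *t);
  if (t == NULL)
    return NULL;

  t->data = malloc((size_t)x * (size_t)y * (size_t)z * sizeof *t->data);
  if (t->data == NULL) {
    free(t);
    return NULL;
  }
  t->x = x;
  t->y = y;
  t->z = z;
  return t;
}

void foo_table_free(foo_table *t)
{
  if (t != NULL) {
    free(t->data);
    free(t);
  }
}

/* Read-only accessors: there are deliberately no matching setters. */
int foo_table_get_x(const foo_table *t) { return t->x; }
int foo_table_get_y(const foo_table *t) { return t->y; }
int foo_table_get_z(const foo_table *t) { return t->z; }
```

Code that includes only the header can hold a foo_table * but cannot dereference it, so the dimensions can be read through the getters and never reset directly.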

Borrowing from the GMP interface but without a pointer typedef, I might suggest:

Code:

void foo_table_init(foo_table *result, int x, int y, int z, const char *name);
void foo_table_set(foo_table *result, const foo_table *original);
void foo_table_init_set(foo_table *result, const foo_table *original);
void foo_table_free(foo_table *result);

foo_table_init() would be the default constructor, in C++ parlance. It initialises the foo_table to the given dimensions by allocating space for the internal table. If name is not NULL, it allocates space for the name and then copies over the string. Since freeing a null pointer is a no-op, this makes life easier when implementing foo_table_free(). Incidentally, you may wish to use size_t instead of int for specifying (and storing) the dimensions.

Likewise, foo_table_set() would be the copy assignment operator and foo_table_init_set() would be the copy constructor. foo_table_free() would be the destructor, and it would free both the internal table and the name.
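The four functions above might be sketched roughly as follows. Several things here are assumptions rather than part of the suggestion: the functions return int (0 on success, -1 on failure) instead of void so that allocation failure can be reported, the struct is visible to the caller as in GMP, storage is one flat float block to keep the example short, and dup_or_null() is a hypothetical private helper.

```c
#include <stdlib.h>
#include <string.h>

typedef struct foo_table {
  char *name;                          /* owned copy, or NULL */
  int x, y, z;
  float *data;                         /* assumption: flat x*y*z block */
} foo_table;

/* Hypothetical helper: duplicate a string; NULL in gives NULL out. */
static char *dup_or_null(const char *s)
{
  if (s == NULL)
    return NULL;
  size_t n = strlen(s) + 1;
  char *p = malloc(n);
  if (p != NULL)
    memcpy(p, s, n);
  return p;
}

/* "Default constructor": on failure, result is left empty but valid. */
int foo_table_init(foo_table *result, int x, int y, int z, const char *name)
{
  result->name = NULL;
  result->data = NULL;
  result->x = result->y = result->z = 0;
  if (x <= 0 || y <= 0 || z <= 0)
    return -1;

  result->data = calloc((size_t)x * (size_t)y * (size_t)z,
                        sizeof *result->data);
  if (result->data == NULL)
    return -1;
  if (name != NULL) {
    result->name = dup_or_null(name);
    if (result->name == NULL) {
      free(result->data);
      result->data = NULL;
      return -1;
    }
  }
  result->x = x;
  result->y = y;
  result->z = z;
  return 0;
}

/* "Destructor": safe on an empty table because free(NULL) is a no-op. */
void foo_table_free(foo_table *result)
{
  free(result->name);
  free(result->data);
  result->name = NULL;
  result->data = NULL;
  result->x = result->y = result->z = 0;
}

/* "Copy constructor": result must not already hold allocations. */
int foo_table_init_set(foo_table *result, const foo_table *original)
{
  if (foo_table_init(result, original->x, original->y, original->z,
                     original->name) != 0)
    return -1;
  memcpy(result->data, original->data,
         (size_t)original->x * (size_t)original->y * (size_t)original->z
             * sizeof *result->data);
  return 0;
}

/* "Copy assignment": build the copy first so result survives a failure. */
int foo_table_set(foo_table *result, const foo_table *original)
{
  foo_table tmp;

  if (result == original)
    return 0;
  if (foo_table_init_set(&tmp, original) != 0)
    return -1;
  foo_table_free(result);
  *result = tmp;
  return 0;
}
```

Since the name is always either NULL or a copy the library made, foo_table_free() can release it unconditionally.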

You could change foo_table_init() to:

Code:

void foo_table_init(foo_table *result, int x, int y, int z, const float ***table, const char *name);

If the library user wishes to delay populating the table, NULL can be passed as an argument.

Other possible library functions include:

Code:

void foo_table_set_name(foo_table *result, const char *name);
void foo_table_set_value(foo_table *result, int x, int y, int z, float value);
void foo_table_set_values(foo_table *result, const float ***table);
const char* foo_table_get_name(foo_table *table);
float foo_table_get_value(foo_table *table, int x, int y, int z);

Since the user is not expected to directly access the internals of the foo_table, you could do things like change the internal table from a float*** to a float* and then compute the offsets.
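As a sketch of the offset computation, assuming row-major order (the last index varies fastest) and size_t dimensions: the indices are named i, j, k here to keep them distinct from the dimensions x, y, z, and foo_table_offset() is a hypothetical private helper.

```c
#include <stddef.h>

typedef struct foo_table {
  size_t x, y, z;                      /* dimensions */
  float *data;                         /* x*y*z values in one block */
} foo_table;

/* Row-major offset of element (i, j, k); k varies fastest. */
static size_t foo_table_offset(const foo_table *t,
                               size_t i, size_t j, size_t k)
{
  return (i * t->y + j) * t->z + k;
}

/* No bounds checking here: callers must pass valid indices. */
float foo_table_get_value(const foo_table *t, size_t i, size_t j, size_t k)
{
  return t->data[foo_table_offset(t, i, j, k)];
}

void foo_table_set_value(foo_table *t, size_t i, size_t j, size_t k,
                         float value)
{
  t->data[foo_table_offset(t, i, j, k)] = value;
}
```

With this layout the whole table is a single allocation, so creating, copying, and freeing it each take one call instead of a nested loop.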

**Sander** · 05-28-2008

First of all, my compliments on your approach. You are entirely on the right track by keeping these things in mind, and the fact that you even noticed how some libraries are much more "natural" to use than others probably means that whatever choices you make will at least be "reasonable".

I would certainly follow laserlight's advice and go for an opaque struct with "handlers" in the library interface. Since your struct will contain pointers, you should try to prevent users from copying it around and leaving dangling pointers.

One thing I would change is to add return values to the foo_table_xyz functions so you can signal the user that they're initializing a foo_table which was already initialized (for instance).

Your library looks like it will be a joy to use :-)

--
Computer Programming: An Introduction for the Scientifically Inclined

**QuantumPete** · 05-28-2008

> One thing I would change is to add return values to the foo_table_xyz functions so you can signal the user that they're initializing a foo_table which was already initialized (for instance).

I agree. Most, if not all, functions should return an int, indicating some sort of error code. Even if your function doesn't have any error conditions yet, it may do in the future and that way you will keep the signature of your functions in sync.
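A small sketch of the error-code convention, under assumptions: the FOO_* constants, the out-parameter for the value, and foo_status_str() are all hypothetical names, and the flat storage is used only to keep the example short.

```c
#include <stddef.h>

/* Hypothetical status codes; 0 means success, as is conventional in C. */
enum {
  FOO_OK = 0,
  FOO_ERR_NULL,                        /* a required pointer was NULL */
  FOO_ERR_RANGE                        /* an index was out of bounds */
};

typedef struct foo_table {
  size_t x, y, z;
  float *data;                         /* x*y*z values, row-major */
} foo_table;

/* The value comes back through `out`, freeing the return value for
   the status code. */
int foo_table_get_value(const foo_table *t, size_t i, size_t j, size_t k,
                        float *out)
{
  if (t == NULL || t->data == NULL || out == NULL)
    return FOO_ERR_NULL;
  if (i >= t->x || j >= t->y || k >= t->z)
    return FOO_ERR_RANGE;
  *out = t->data[(i * t->y + j) * t->z + k];
  return FOO_OK;
}

/* Map a status code to a message the caller can print. */
const char *foo_status_str(int status)
{
  switch (status) {
  case FOO_OK:        return "ok";
  case FOO_ERR_NULL:  return "null pointer argument";
  case FOO_ERR_RANGE: return "index out of range";
  default:            return "unknown error";
  }
}
```

New error conditions can later be added as further constants without changing any function signature.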

QuantumPete

**jason_m** · 05-28-2008

Thank you for the replies.

> One way to do this is like what SQLite does: provide an opaque structure, and then require the user to just use the library interface. The GMP library does likewise, but goes a step further by providing a typedef of the pointer. In both cases, library users are discouraged from directly accessing the internals, and do so at their own risk.

I'll spend some time taking a look at the SQLite code. Yet another reason open source projects rule.

> Since the user is not expected to directly access the internals of the foo_table, you could do things like change the internal table from a float*** to a float* and then compute the offsets.

Is this a general preference? I guess I am mostly indifferent. I may lean a little more towards using the float *** representation because then my code could still use array notation when accessing table elements rather than doing pointer arithmetic. Might just be a tad easier to read.
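The two need not be mutually exclusive. As a sketch, assuming a compiler that supports C99 variably modified types (optional since C11, but available in GCC and Clang), a single flat block can be viewed through a pointer to a two-dimensional array so that [i][j][k] notation still works. The function name foo_fill_demo() and the fill pattern are made up for the example, and overflow of x*y*z is not checked.

```c
#include <stdlib.h>

/* Allocate one contiguous x*y*z block and fill it using [i][j][k]
   notation. Returns the block as a flat float pointer (the caller
   frees it), or NULL on failure. */
float *foo_fill_demo(size_t x, size_t y, size_t z)
{
  if (x == 0 || y == 0 || z == 0)
    return NULL;

  float *data = malloc(x * y * z * sizeof *data);
  if (data == NULL)
    return NULL;

  /* View the same memory as x planes, each of y rows of z floats. */
  float (*box)[y][z] = (float (*)[y][z])data;

  for (size_t i = 0; i < x; i++)
    for (size_t j = 0; j < y; j++)
      for (size_t k = 0; k < z; k++)
        box[i][j][k] = (float)(100 * i + 10 * j + k);

  return data;
}
```

The element written as box[i][j][k] is the same one found at flat offset (i * y + j) * z + k, so library internals can use whichever notation reads better.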

> One thing I would change is to add return values to the foo_table_xyz functions so you can signal the user that they're initializing a foo_table which was already initialized (for instance).

I agree.

I appreciate your suggestions.

Regards,
Jason
