Hello,
I am beginning work on a mathematical library and am getting stuck designing the interface. Some people's code just seems to be more elegant than others'. Their library interfaces give you everything you need and make it a pleasure to use their library. Others are clearly missing functionality, are inconsistent, or provide unnecessary features. A person can know the language, the syntax, etc. just as well as another, but I think style points go a long way in program design. I was hoping to lay out a small piece of my project and see what style points I can gain from any feedback.
The starting point for my library is a table of numbers. This "table" is really a "box" in the sense that it is three-dimensional (a collection of 2D tables). The dimensions are unknown ahead of time. What I have so far is:
Code:
struct foo_table {
    int x;          /* x dimension */
    int y;          /* y dimension */
    int z;          /* z dimension */
    float ***table;
};
Easy enough... I hope I haven't messed it up too badly yet. For the table to be of any use, it should probably hold its dimensions; otherwise, how would the library know where it ended?
I would like to also give the table an identifier, so I'll add a name in there too:
Code:
struct foo_table {
    char *name;
    int x;
    int y;
    int z;
    float ***table;
};
The library will then go on to define the "core" mathematical functions that can be performed on the table. That isn't too hard... that is the whole point of the library. But I'm left with ambiguity around how to handle some of the finer points. Namely, how many "convenience" functions do I define in the library? A user of the library could use the structure as is just fine if they really wanted to. However, I could probably make it a bit easier for them with a well-designed interface.
I think it would be nice to provide something that returns a pointer to newly allocated space for an instance of the structure, so I'll make a "new" function for that. But there are a lot of functions to choose from. Four such candidates are:
Code:
struct foo_table *foo_table_new(void);                                            /* candidate A */
struct foo_table *foo_table_new(int x, int y, int z);                             /* candidate B */
struct foo_table *foo_table_new(int x, int y, int z, float ***table);             /* candidate C */
struct foo_table *foo_table_new(char *name, int x, int y, int z, float ***table); /* candidate D */
Candidate A barely saves the user any time. It doesn't initialize or allocate space for any of the member variables. All of this will need to be done at some point. Maybe a function is provided like foo_table_table_set() that takes a struct foo_table *, the three dimensions, and a float ***. The function allocates as much space as the dimension parameters indicate and copies over the data in the passed-in float *** up to the dimensions indicated by x, y, and z.
Candidate B can initialize the member variables indicating the size of the table held in the structure and maybe allocate the appropriate amount of space for the float ***. Some time after a call to candidate B, the user could call a function like foo_table_table_set() and pass it a struct foo_table * and a float ***. The function would work like the one mentioned above, but rely on the dimension members x, y, z of the structure, rather than parameters passed in via the function.
Candidate C saves the user a call to another helper function that they are almost certainly going to want to call at some point, and can provide the same "consistency" feature by allocating its own table and copying over values from the passed-in table up to its dimensions. Assuming the user uses the library interface rather than trying to do it all on his/her own, B and C ensure a level of consistency that A by itself cannot.
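As a rough sketch of how candidate B and foo_table_table_set() might fit together: the names foo_table_new_sized, cells_alloc, and cells_free are made up for illustration, and the int return code on foo_table_table_set is an assumption, not something settled above. The point it tries to show is that once the library allocates the table itself, the stored dimensions and the storage can never disagree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct foo_table {
    char *name;
    int x, y, z;
    float ***table;
};

/* Hypothetical helper: free a float *** whose slots are either valid or NULL. */
static void cells_free(float ***cells, int x, int y)
{
    if (cells == NULL)
        return;
    for (int i = 0; i < x; i++) {
        if (cells[i] == NULL)
            continue;
        for (int j = 0; j < y; j++)
            free(cells[i][j]);          /* free(NULL) is a no-op */
        free(cells[i]);
    }
    free(cells);
}

/* Hypothetical helper: allocate an x*y*z block of floats, all set to 0. */
static float ***cells_alloc(int x, int y, int z)
{
    float ***cells = malloc((size_t)x * sizeof *cells);
    if (cells == NULL)
        return NULL;
    for (int i = 0; i < x; i++)
        cells[i] = NULL;                /* so cleanup can run at any point */
    for (int i = 0; i < x; i++) {
        cells[i] = malloc((size_t)y * sizeof *cells[i]);
        if (cells[i] == NULL)
            goto fail;
        for (int j = 0; j < y; j++)
            cells[i][j] = NULL;
        for (int j = 0; j < y; j++) {
            cells[i][j] = malloc((size_t)z * sizeof *cells[i][j]);
            if (cells[i][j] == NULL)
                goto fail;
            for (int k = 0; k < z; k++)
                cells[i][j][k] = 0.0f;
        }
    }
    return cells;
fail:
    cells_free(cells, x, y);
    return NULL;
}

/* Candidate B under a distinct (made-up) name: dimensions in, owned table out. */
struct foo_table *foo_table_new_sized(int x, int y, int z)
{
    if (x <= 0 || y <= 0 || z <= 0)
        return NULL;
    struct foo_table *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->name = NULL;
    t->x = x;
    t->y = y;
    t->z = z;
    t->table = cells_alloc(x, y, z);
    if (t->table == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

/* Copy src into the table's own storage, trusting t's stored dimensions. */
int foo_table_table_set(struct foo_table *t, float ***src)
{
    if (t == NULL || src == NULL)
        return -1;
    assert(t->table != NULL);           /* guaranteed by foo_table_new_sized */
    for (int i = 0; i < t->x; i++)
        for (int j = 0; j < t->y; j++)
            memcpy(t->table[i][j], src[i][j], (size_t)t->z * sizeof(float));
    return 0;
}
```

Candidate C would then be little more than foo_table_new_sized() followed by foo_table_table_set(), which is one argument for writing B first and layering C on top of it.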
Candidate D provides all of the features of C, plus takes care of the name right away too. However, I see a distinction between the features of candidates B and C versus D. The improvements that B and C made over A helped ensure consistency in the data structure. No consistency is gained by using candidate D, and requiring the user to pass in additional information can become burdensome.
I could provide them all, but C doesn't provide function overloading, so naming them well could become challenging.
Further complicating this is whether I should provide a function to free the space used by a struct foo_table. Calling free() on a pointer that didn't come from an allocation function (and isn't NULL) is undefined behavior. If, say, candidate A above were used to allocate space for my data, then it will be up to the user to know whether they also allocated any space for the name or the table of floats, and they will have to take care of freeing that manually. It doesn't sound like my free function will be helpful at all then. In fact, unless its name is no more than four characters long, it will end up taking more effort from the user to free up the memory!
In contrast, candidates B and C give me some level of assurance that space was allocated for the internal table, and I can free that as well as the data structure itself. However, only if the user called candidate D can I assume that I can free up everything - the space used for the name, the internal table, and the structure itself. But candidate D was the "new" function I felt might be encroaching on being burdensome.
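One possible way around this, sketched under two assumptions that are not stated above: every "new" function sets unused members to NULL/0, and the library only ever stores memory it allocated itself (for example through its own set functions). Because free(NULL) is defined to do nothing, a single foo_table_free() can then clean up after any of the candidates without knowing which one was called.

```c
#include <assert.h>
#include <stdlib.h>

struct foo_table {
    char *name;
    int x, y, z;
    float ***table;
};

/* Candidate A: nothing is allocated for the members, but all of them
 * start in a known state. */
struct foo_table *foo_table_new(void)
{
    struct foo_table *t = malloc(sizeof *t);
    if (t != NULL) {
        t->name = NULL;
        t->x = t->y = t->z = 0;
        t->table = NULL;
    }
    return t;
}

/* Free everything the structure owns, whichever "new" function built it. */
void foo_table_free(struct foo_table *t)
{
    if (t == NULL)
        return;
    /* A table without positive dimensions means the struct was edited
     * by hand rather than through the library. */
    assert(t->table == NULL || (t->x > 0 && t->y > 0));
    if (t->table != NULL) {
        for (int i = 0; i < t->x; i++) {
            if (t->table[i] == NULL)
                continue;
            for (int j = 0; j < t->y; j++)
                free(t->table[i][j]);
            free(t->table[i]);
        }
        free(t->table);
    }
    free(t->name);      /* free(NULL) is a no-op, so an unset name is fine */
    free(t);
}
```

The catch is the ownership rule: if a user points t->name at a string literal or a stack buffer, foo_table_free() would invoke undefined behavior, so this only works if the documentation (or the interface itself) steers users toward the library's own set functions.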
Then what other helpers do I provide? Is it necessary to define a function that, given a struct foo_table *, returns the x dimension? The user could just access that directly through the structure rather than ask a function for it. What that does do, however, is indicate to the user what they should/should not be messing around with. Since C doesn't allow for private data members AFAIK, there is nothing to stop the user from (re)setting, say, the member variable holding the x dimension of the table to something beyond what it should be. However, if, say, I provide "get" functions for the dimensions but not "set" functions, that implies to the user how it was intended to be used. Maybe the only "set" function bundles together the dimensions and a table?
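For what it's worth, C can get fairly close to private members with an incomplete ("opaque") struct type: the public header only declares struct foo_table, and the members are defined in the implementation file, so user code cannot touch them and has to go through the "get" functions. A minimal single-file sketch follows; the file names in the comments, the accessor names foo_table_x/y/z, and foo_table_new_sized are all hypothetical, and allocation of the table itself is left out to keep the focus on the accessors.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a public header (say, foo_table.h) could expose ---- */
struct foo_table;   /* incomplete type: users only ever hold a pointer */

struct foo_table *foo_table_new_sized(int x, int y, int z);
void foo_table_free(struct foo_table *t);
int foo_table_x(const struct foo_table *t);
int foo_table_y(const struct foo_table *t);
int foo_table_z(const struct foo_table *t);

/* ---- what the implementation file (say, foo_table.c) would hold ---- */
struct foo_table {
    char *name;
    int x, y, z;
    float ***table;
};

struct foo_table *foo_table_new_sized(int x, int y, int z)
{
    if (x <= 0 || y <= 0 || z <= 0)
        return NULL;
    struct foo_table *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->name = NULL;
    t->x = x;
    t->y = y;
    t->z = z;
    t->table = NULL;    /* table allocation omitted in this sketch */
    return t;
}

void foo_table_free(struct foo_table *t)
{
    free(t);            /* nothing else is allocated in this sketch */
}

/* "get" functions with no matching "set": dimensions are read-only. */
int foo_table_x(const struct foo_table *t) { assert(t != NULL); return t->x; }
int foo_table_y(const struct foo_table *t) { assert(t != NULL); return t->y; }
int foo_table_z(const struct foo_table *t) { assert(t != NULL); return t->z; }
```

The trade-off is that users can no longer declare a struct foo_table on the stack or read t->x directly; every instance has to come from a "new" function, which also happens to settle the question of who allocated what.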
What about the name? I feel like everywhere I look, I see people saying use the "n" series of string functions (strncmp, strncpy, etc.) over their "inferior" counterparts that don't provide any measure of "bounds checking". However, I don't know that I feel it is the place of my library to be defining the maximum length that a name can be. I could leave it all up to the user to take care of, or I suppose I could provide a function like
Code:
foo_table_name_set(char *name, int max_length) /* whatever the return type may be */
which may give the user a friendly reminder that they should be doing some sort of bounds checking.
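A hedged sketch of what such a function could look like. The extra struct foo_table * parameter, the size_t type for max_length, and the int return code are assumptions added for illustration. It copies at most max_length characters into storage the library allocates itself, so the stored name is always terminated and always safe for the library to free later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct foo_table {
    char *name;
    int x, y, z;
    float ***table;
};

/* Store a private copy of name, truncated to at most max_length characters.
 * Returns 0 on success, -1 on failure (the old name is kept on failure). */
int foo_table_name_set(struct foo_table *t, const char *name, size_t max_length)
{
    assert(t != NULL);
    if (name == NULL)
        return -1;

    /* Measure the name without reading past max_length bytes. */
    size_t len = 0;
    while (len < max_length && name[len] != '\0')
        len++;

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    copy[len] = '\0';       /* always terminated, unlike strncpy */

    free(t->name);          /* drop the previous name; free(NULL) is a no-op */
    t->name = copy;
    return 0;
}
```

With this shape the library never imposes a maximum of its own; the caller picks the bound on each call, and a name longer than the bound is silently truncated rather than overflowing anything.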
So, there you have it. I apologize if this was too general/vague and hope it doesn't spark any kind of flame war since this may boil down to personal preference. I feel that I know *how* to program this up, I'm just not sure *what* I should be programming. If the interface is poorly defined, then the user isn't going to use it, and there was no point in making one in the first place. Just create the necessary data types and let the user be on their way.
However, if the interface is well designed, hopefully the user will use it, and ideally gain some level of confidence that they are using the data structures as intended and their data is consistent.
I know using a different language like C++ for example could offer some help. I could make private the data members I didn't want the user messing around with and then would be forced to provide the "get" and "set" functions I needed. Also, I could overload the constructor to allow for any of the candidates above. Using another language is a real alternative I will consider, but first I was hoping to see what suggestions you all might have.
Any thoughts, about anything I outlined specifically or just in general? Good rules of thumb? Standards of practice?
Thanks in advance,
Jason