changing internal representation of a struct library

Posted in the C Programming forum, part of the General Programming Boards category.

  1. #1
    novice fisheromen1031's Avatar
    Join Date
    Jul 2005
    Location
    Lone Star State/ Rocket City
    Posts
    13

    changing internal representation of a struct library

    I have been reading Bruce Eckel's Thinking in C++ Volume 1. I can usually figure out what he is talking about, but in his discussion leading from C struct libraries into C++ objects, he has a part about member variables I don't follow. It has to do with using a function to retrieve a variable instead of using the member selection operator. (i.e. get_variable(&STRUCTURE) instead of STRUCTURE.variable) Eckel claims that "if you wanted to change the internal representation of CStash and thus the way the count was calculated, the function call interface allows the necessary flexibility." [quote and following code from "Chapter 4: Data Abstraction" of his book]

    My question is, does his statement apply to making changes from the library user's point of view or from the library developer's point of view? If from the user's, how is that possible? I can see that this gives the developer the option of totally changing how the number of elements in the CStash is represented without requiring the user to change code.

    Did I just answer my own question?

    Thanks,

    Fisher



    Here is the structure header file:
    Code:
    typedef struct CStashTag {
      int size;      // Size of each space
      int quantity;  // Number of storage spaces
      int next;      // Next empty space
      // Dynamically allocated array of bytes:
      unsigned char* storage;
    } CStash;
    
    void initialize(CStash* s, int size);
    void cleanup(CStash* s);
    int add(CStash* s, const void* element);
    void* fetch(CStash* s, int index);
    int count(CStash* s);
    void inflate(CStash* s, int increase);
    The function definitions are as follows:
    Code:
    #include "CLib.h"
    #include <iostream>
    #include <cassert> 
    using namespace std;
    // Quantity of elements to add
    // when increasing storage:
    const int increment = 100;
    
    void initialize(CStash* s, int sz) {
      s->size = sz;
      s->quantity = 0;
      s->storage = 0;
      s->next = 0;
    }
    
    int add(CStash* s, const void* element) {
      if(s->next >= s->quantity) //Enough space left?
        inflate(s, increment);
      // Copy element into storage,
      // starting at next empty space:
      int startBytes = s->next * s->size;
      unsigned char* e = (unsigned char*)element;
      for(int i = 0; i < s->size; i++)
        s->storage[startBytes + i] = e[i];
      s->next++;
      return(s->next - 1); // Index number
    }
    
    void* fetch(CStash* s, int index) {
      // Check index boundaries:
      assert(0 <= index);
      if(index >= s->next)
        return 0; // To indicate the end
      // Produce pointer to desired element:
      return &(s->storage[index * s->size]);
    }
    I omitted the functions that don't reference next.

  2. #2
    cas
    Registered User
    Join Date
    Sep 2007
    Posts
    979
    Code:
    Did I just answer my own question?
    Yes.

    The point is that the layout of your struct should be “secret”. Not that you care if the user knows what it looks like, but you do care if he tries to poke around in there. If the user directly accesses a member of a struct, then you cannot change how that member works, or else the user code will fail to build (or perhaps it will build but give strange results).

    As long as you provide a get_variable() function, you can change the struct to your heart's content, and from the user's point of view it's all the same. Think of the standard library's FILE*, for example. You get the offset by using ftell(fp) instead of fp->_offset (or whatever). Your C library maintainer can manage the offset however he wants, and change it when it becomes necessary.

    It also helps for binary compatibility. Accessing a struct member necessarily means building into the binary the offset of that member from the beginning of the struct. Change the order/number of members, and you have to bump your shared library version; otherwise previously built code will be accessing junk. Instead, if you just give the user a pointer to your struct, and you do all the initialization/access in the library, you can change things and previously linked binaries will still work.
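
    A minimal sketch of the "just give the user a pointer" idea, in the style of FILE*. The names Stash, stash_create, stash_destroy and stash_count are hypothetical and are not from Eckel's book. In a real library the first part would live in the public header and the struct definition would stay in the library's own source file; both are shown in one file here only to keep the sketch compact.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What the public header would expose (hypothetical names) --- */
typedef struct Stash Stash;            /* incomplete type: layout is hidden */
Stash *stash_create(int element_size); /* returns NULL on allocation failure */
void   stash_destroy(Stash *s);
int    stash_count(const Stash *s);

/* --- What would stay private in the library's .c file --- */
struct Stash {
    int size;                /* size of each element in bytes */
    int next;                /* next empty slot, i.e. the element count */
    unsigned char *storage;  /* dynamically allocated bytes */
};

Stash *stash_create(int element_size) {
    assert(element_size > 0);
    Stash *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->size = element_size;
    s->next = 0;
    s->storage = NULL;
    return s;
}

void stash_destroy(Stash *s) {
    if (s != NULL) {
        free(s->storage);    /* free(NULL) is a no-op */
        free(s);
    }
}

int stash_count(const Stash *s) {
    return s->next;          /* only the library knows how this is stored */
}
```

    User code that sees only the header part cannot write s->next, because the compiler does not know the members of an incomplete type. No member offsets end up compiled into the user's binary, so members can be reordered or added without breaking previously built programs.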

  3. #3
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    I was curious, so I searched for the book; it is available online. I quote:

    "count( ) may look a bit strange at first to a seasoned C programmer. It seems like a lot of trouble to go through to do something that would probably be a lot easier to do by hand. If you have a struct CStash called intStash, for example, it would seem much more straightforward to find out how many elements it has by saying intStash.next instead of making a function call (which has overhead), such as count(&intStash). However, if you wanted to change the internal representation of CStash and thus the way the count was calculated, the function call interface allows the necessary flexibility. But alas, most programmers won’t bother to find out about your “better” design for the library. They’ll look at the struct and grab the next value directly, and possibly even change next without your permission. If only there were some way for the library designer to have better control over things like this! (Yes, that’s foreshadowing.)"

    If that is the passage you are referring to, then what he is saying is that you have this:
    Code:
    int count(CStash* s) {
      return s->next;  // Elements in CStash
    }
    Why have this? Someone could simply read the s->next variable like this
    Code:
    CStash st;
    int c = st.next;
    rather than
    Code:
    CStash st;
    int c = count(&st);
    and avoid the overhead of a function call.

    The problem is that the library developer might change CStash so that, for example, next is no longer a member variable, or next only counts every 5 elements. If your code reads st.next directly (the first approach), you would then have to change the main code. If it calls count() (the second approach), you would not need to change the main code; only the body of count() changes.
    The idea is that when the developer of the library updates CStash, the code of a programmer using it should not have to change. Thus, there has to be some form of abstraction, which functions provide.
    Even worse, somebody could alter the next member directly, even though it is only supposed to count elements. Thus, it would be nice if it were "private", which hints at public/private members in C++.
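
    To make that concrete, here is a rough sketch of one possible revised CStash. The used_bytes member is an assumption made up for illustration, not something from the book: it replaces next and tracks occupied bytes instead of occupied slots. The signature of count() is the same as in the original header.

```c
#include <assert.h>

/* Hypothetical revised layout: `next` is gone, the library now tracks bytes. */
typedef struct CStashTag {
    int size;        /* size of each space, in bytes */
    int quantity;    /* number of storage spaces */
    int used_bytes;  /* bytes currently occupied (assumed replacement for next) */
    unsigned char *storage;
} CStash;

/* Same signature as before; only the calculation inside has changed. */
int count(CStash *s) {
    assert(s->size > 0);
    return s->used_bytes / s->size;
}
```

    A caller that wrote count(&st) compiles and behaves as before. A caller that wrote st.next no longer compiles against this header, because the member does not exist any more.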

    The whole thing is a comparison of a C struct with a C++ class and how one led to the other.

  4. #4
    novice fisheromen1031's Avatar
    Join Date
    Jul 2005
    Location
    Lone Star State/ Rocket City
    Posts
    13
    Code:
    Did I just answer my own question?
    Yes.
    I appreciate y'all's additional comments.

    For some reason, every time I read that section of the book before I always thought he was referring to some sort of casting operation done by the user when the function was called.
    Last edited by fisheromen1031; 11-13-2009 at 05:16 PM.

  5. #5
    Registered User
    Join Date
    Jun 2005
    Posts
    6,252
    He's just saying that it's better to use:
    Code:
    value = get_value(the_struct);
    over (in C++)
    Code:
    value = the_struct.value();
    or (in C)
    Code:
    value = the_struct.value;   //  no member function call
    This works both from a user and a library-implementer point of view.

    One reason is that the only way to introduce a new member function for a class is to modify the class definition (i.e. the class declaration that is in a header file). If a library-writer does that, all user and library code that use that class have to be rebuilt from source. For the library-user, that means all applications must be rebuilt from source in order to use an updated version of the library.

    If, instead, the library-writer provides the get_value(structure) form, then any function can be modified without ever touching the class definition. For the library-writer to rebuild the library, only the modified files need to be recompiled, rather than recompiling every source file that depends on the class definition. If the class library is provided to the user as a runtime library (e.g. a DLL under Windows, a .so file under Unix), then the impact on the user is minimal: all that is needed is to properly install the library, and applications which use it will run as before but get the behaviour of the new version of the library. If the user links the library statically into applications, then the worst case for the user (once the library itself is rebuilt, of course) is a need to relink applications against the new library.

    The other benefit is that a user can also extend the library in the same way. The only way to add a member function to a class is to modify the class definition (e.g. the header file that defines the class). A user can only do that by becoming, in effect, a library-writer. However, the user can add a new "get_funky_value(structure_type)" style of function without changing the library at all. The user also does not need to remember which features are provided as members and which as standalone functions: it is not necessary for the user to remember that get_funky_value() is a non-member function while value() is a struct member (in C or C++) or get_value() is a member function (in C++).

    For example, a user can do this:
    Code:
       value = get_value(some_structure);
       funky_value = get_funky_value(some_structure);
    rather than having to remember to do this:
    Code:
       value = some_structure.get_value();   // In C this would be   some_structure.value
       funky_value = get_funky_value(some_structure);   // see, different syntax here
    There is the slight advantage, in C++, that the library implementer can prevent the user employing the member form (e.g. by making data members private, and simply not providing a member function).
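
    A small sketch of the "user can extend the library" point, in plain C. The Thing type and the formula inside get_funky_value() are invented for illustration; the sketch also passes a pointer rather than the struct by value, which is just one reasonable choice.

```c
#include <assert.h>

/* "Library" side (hypothetical): a type plus one accessor function. */
typedef struct Thing {
    int value;
} Thing;

int get_value(const Thing *t) {
    return t->value;
}

/* "User" side: a new accessor built only on the library's public function.
   Neither the struct definition nor the library source has to be touched. */
int get_funky_value(const Thing *t) {
    assert(t != 0);
    return 2 * get_value(t) + 1;
}
```

    Both calls then look the same at the call site: get_value(&t) and get_funky_value(&t). If the library later changes how value is stored, get_funky_value() keeps working as long as get_value() keeps its signature.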
    Right 98% of the time, and don't care about the other 3%.
