Request for peer review

This is a discussion on Request for peer review within the C++ Programming forums, part of the General Programming Boards category.

  1. #1
    Registered User
    Join Date
    Dec 2005
    Posts
    24

    Request for peer review

    I've been working on an alternative to virtual pointers for achieving runtime polymorphism, with increased space efficiency being the primary concern. While this was simple enough, I found that when inlining was brought into consideration, my method was unable to benefit fully because functions containing switch statements were refused inlining. While this doesn't amount to much of a speed difference in this simple test, I sought to improve upon it and came up with a method involving what I believe is called 'macro magic', which incidentally happens to offer a noticeable speed advantage over virtual pointers in many cases.

    What I'm asking for is info on any flaws which could limit its use.

    Code:
    #include <iostream>
    #include <ctime>
    using namespace std;
    // This next bit is the macro that serves as a 'solution' to functions containing switches being refused inlining.
    #define FUNC(ob) {  \
      switch(ob->get_type()) {  \
        case 1:         \
          ob->m1::koo();    \
          break;         \
        case 2:           \
          reinterpret_cast <m2 *> (ob)->m2::koo(); \
          break;                      \
      }            \
    }
    
    // This determines which version of the program is compiled. Define MACROS for the macro version, or leave it undefined for the function-calling version.
    //#define MACROS
    
    class v1 { // The v-type classes use virtual pointers for runtime polymorphism.
    public:
      int i;
      int j;
      inline virtual void foo()
      {
        i = 4;
      }
    };
    
    class v2 : public v1 {
    public:
      inline virtual void foo()
      {
        j = 6;
      }
    };
    
    class m1 { // The m-types use my alternative method for runtime polymorphism.
    protected:
      int type;
    public:
      int k;
      int l;
      
      inline m1()
      {
        type = 1;
      }
      
      inline void goo();
      
      inline void koo()
      {
        k = 3;
      }
      
      int get_type()
      {
        return type;
      }
    };
    
    class m2 : public m1 {
    public:
      
      inline m2()
      {
        type = 2;
      }
      
      inline void goo();
      
      inline void koo()
      {
        l = 5;
      }
    };
    
    inline void m1::goo()
    {
      switch(type) {
        case 1:
          this->m1::koo();
          break;
        case 2:
          reinterpret_cast <m2 *> (this)->m2::koo();
          break;
      }
    }
    
    inline void m2::goo()
    {
      switch(type) {
        case 1:
          this->m1::koo();
          break;
        case 2:
          reinterpret_cast <m2 *> (this)->m2::koo();
          break;
      }
    }
    
    int main()
    {
      v1 ob1;
      v2 ob2;
      m1 ob3;
      m2 ob4;
      v1 *ptr1 = &ob1;
      v1 *ptr2 = &ob2;
      m1 *ptr3 = &ob3;
      m1 *ptr4 = &ob4;
      clock_t first1;
      clock_t first2;
      clock_t hold1;
      clock_t hold2;
      clock_t hold3;
      clock_t hold4;
      first1 = clock();
      for(int i=0;i<10000000;++i) ptr1->foo();
      first2 = clock();
      hold1 = (first2 - first1);
      // Place a breakpoint here and start stepping through the assembly instructions to compare differences.
      first1 = clock();
      for(int i=0;i<10000000;++i) ptr2->foo();
      first2 = clock();
      hold2 = (first2 - first1);
    #ifdef MACROS
      first1 = clock();
      for(int i=0;i<10000000;++i) FUNC(ptr3);
      first2 = clock();
      hold3 = (first2 - first1);
      first1 = clock();
      for(int i=0;i<10000000;++i) FUNC(ptr4);
      first2 = clock();
      hold4 = (first2 - first1);
    #else
      first1 = clock();
      for(int i=0;i<10000000;++i) ptr3->goo();
      first2 = clock();
      hold3 = (first2 - first1);
      first1 = clock();
      for(int i=0;i<10000000;++i) ptr4->goo();
      first2 = clock();
      hold4 = (first2 - first1);
    #endif
      cout << hold1 << '\n' << hold2 << '\n' << hold3 << '\n' << hold4 << '\n';
      return 0;
    }

    P.S. I know something about the lack of type safety and the possible problem of programmer logic (which could be helped by thinking of the macro as a global function call rather than a member function). However since I may not fully understand the potential effects, posts about them would not be considered...unwelcome/unnecessary/redundant.

  2. #2
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,556
    I want to throw up.
    <puke>

    > increased space efficiency being the primary concern
    So stop inlining everything and introducing function-like macros which only add to code bloat.

    > what I believe is called 'macro magic
    magic - hell - it's a fine line.
    This kind of macro abuse falls into the HELL category in any language. In C++, it's totally unforgivable.

    > functions containing switch statements being refused inlining.
    That's your compiler issue, not a language issue.

    > which incidentally happens to offer a noticable speed advantage over virtual pointers in many cases.
    You seem to have a severe case of premature optimisation disease.
    Whilst the statement may be true, you've got no idea whether it will actually be important or not in the complete program. Not unless your code is riddled with thousands of uselessly small classes which add almost no functionality and the code spends all its time performing table lookups as you describe.

    A clean and simple implementation of a clean and simple design is SO much better for
    - writing
    - testing
    - profiling
    - adding carefully considered improvements based on profiling results.

    "It's easier to optimise a working program than to make an optimised program work".
    You're basically trying to do this the hard way - optimise first, then make it work.

  3. #3
    Registered User
    Join Date
    Dec 2005
    Posts
    24
    > increased space efficiency being the primary concern
    So stop inlining everything and introducing function-like macros which only add to code bloat.
    Maybe I should have explained differently. I'm referring to the amount of space required for the information used for runtime polymorphism, not executable size (besides, I probably ought to learn how much this affects executable size before implementing it in a larger program). In a 32-bit environment, each object will contain a 32-bit virtual pointer which will be used to determine the object's type at runtime, while my method uses a (most likely) 8-bit value to do the same thing, with the added bonus of staying the same size regardless of environment. I'm sorry I failed to make m1::type an 8-bit value, which might have made this more obvious.

    > what I believe is called 'macro magic
    magic - hell - it's a fine line.
    This kind of macro abuse falls into the HELL category in any language. In C++, it's totally unforgivable.
    I used the term 'macro magic' because I saw it on one of the message boards (OK, it was in one of the threads) and assumed that it was a piece of slang appropriate for most uses of macros (I have yet to learn to sufficiently research terms before using them, unfortunately). Is there a reason other than code bloat (which I assume is a reference to executable size), unreadable code (which didn't seem too bad in this implementation) or alternative method (such as a compiler that supports inlining of functions with switches) that this is 'totally unforgivable', bearing in mind that these are only meant to be used for determining an object type through a base class pointer?

    > which incidentally happens to offer a noticable speed advantage over virtual pointers in many cases.
    You seem to have a severe case of premature optimisation disease.
    Whilst the statement may be true, you've got no idea whether it will actually be important or not in the complete program. Not unless your code is riddled with thousands of uselessly small classes which add almost no functionality and the code spends all its time performing table lookups as you describe.
    This is a test case independent of any program, just to see if I can apply the general idea successfully. The 'uselessly small classes with almost no functionality' are that way for simplicity. I didn't think I said anything about table lookups 'all the time', nor did I mean to imply it. If you assumed this from the inlining, I just did that to get a fair idea of the best-case performance of each method, with the switch-statement-containing functions (which ironically were refused inlining) being the intended targets in any... realistic use (since they are used to determine the type of an object). While the performance boost might not be that noticeable in a proper program, if it's an optimisation that I can see and implement then I don't see why I shouldn't (unless it gets TOO complicated). Also, I just noticed I forgot to remove m2::goo(), which is of no use and probably misleading, sorry again.

    A clean and simple implementation of a clean and simple design is SO much better for
    - writing
    - testing
    - profiling
    - adding carefully considered improvements based on profiling results.
    Since this is something which I believe could be applied to multiple programs (plus the fact that it is being done with a use rather than a program in mind), I think that this is of less relevance in this case (though not typically).

    "It's easier to optimise a working program than to make an optimised program work".
    You're basically trying to do this the hard way - optimise first, then make it work.
    I'm familiar with the concept, though the saying is new.

    I want to throw up.
    <puke>
    Shouting for ruthgrin?
    Good luck cleaning under the keyboard keys. Don't forget the monitor, especially the corners. Don't forget to wash your hands after typing in your last meal.
    To avoid this in future, try aiming into the C (wait, doesn't C discourage spaghetti code?).

  4. #4
    Registered User
    Join Date
    Aug 2002
    Location
    Hermosa Beach, CA
    Posts
    446
    Maybe I'm missing something, but isn't an "inline virtual" function impossible by definition? Because "inline" means that no function call exists--the function body is inlined at compile time. And a virtual function means that the correct function is selected at runtime based on the actual object type. So how does the compiler inline something at compile time that it won't even really know until runtime??
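    For what it's worth, 'inline virtual' is legal C++: inline is only a request, and the compiler can honour it whenever the dynamic type is statically known. A minimal sketch (reusing the thread's class names, but simplified so the result can be checked):

    ```cpp
    #include <cassert>

    struct v1 {
        virtual int foo() { return 1; }
        virtual ~v1() {}
    };

    struct v2 : v1 {
        int foo() { return 2; }
    };

    int main() {
        v2 ob;
        // Dynamic type known at compile time: an optimizing compiler may
        // devirtualize and inline this call.
        assert(ob.foo() == 2);

        v1 *ptr = &ob;
        // Through a base-class pointer the call normally goes via the vtable;
        // inlining is only possible if the optimizer can track the type.
        assert(ptr->foo() == 2);
        return 0;
    }
    ```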

    Also, I agree with Salem. You really are worried about all the wrong stuff. Memory is plentiful, and processors are lightning fast. What you are doing fits well with the saying, "Penny-wise, pound foolish". You are getting so into saving a little bit of space, or running a little bit faster, that you aren't even paying attention to the heap of .......... that you'll end up with in the end.
    The crows maintain that a single crow could destroy the heavens. Doubtless this is so. But it proves nothing against the heavens, for the heavens signify simply: the impossibility of crows.

  5. #5
    aoeuhtns
    Join Date
    Jul 2005
    Posts
    581
    Quote Originally Posted by ruthgrin
    Maybe I should have explained differently. I'm referring to the amount of space required for the information used for runtime polymorphism, not executable size (besides, I probably ought to learn how much this affects executable size before implementing it in a larger program). In a 32-bit environment, each object will contain a 32-bit virtual pointer which will be used to determine the object's type at runtime, while my method uses a (most likely) 8-bit value to do the same thing, with the added bonus of staying the same size regardless of environment.
    Oh? An 8-bit value takes up less space? No it won't, the way you're using it, because the compiler will align values along four-byte boundaries. (And if it didn't, accessing values from memory would be slower.) Do a test with sizeof.
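    The suggested sizeof test can be sketched like this (the struct name is invented for illustration; exact sizes depend on the target):

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <iostream>

    // With default alignment, the 1-byte tag is padded out so the ints
    // that follow stay aligned, and the "8-bit" saving largely disappears.
    struct TaggedObject {
        std::uint8_t type;  // the 1-byte type tag
        int k;
        int l;
    };

    int main() {
        // Typically 12 bytes, not 9: three bytes of padding follow 'type'.
        std::cout << "sizeof(TaggedObject) = " << sizeof(TaggedObject) << '\n';
        // The struct size is rounded up to a multiple of int's alignment.
        assert(sizeof(TaggedObject) % alignof(int) == 0);
        return 0;
    }
    ```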

    And is your macro insanity really an optimization? Will it decrease the running time of a larger program? Probably the opposite. Any significant use of macros would indeed cause executable size to increase. This can slow down the program; your program will have more cache misses. Your current example might end up running faster, but this is because your program is simple, and your member functions are one instruction long.

  6. #6
    Registered User
    Join Date
    Dec 2005
    Posts
    24
    Maybe I'm missing something, but isn't an "inline virtual" function impossible by definition? Because "inline" means that no function call exists--the function body is inlined at compile time. And a virtual function means that the correct function is selected at runtime based on the actual object type. So how does the compiler inline something at compile time that it won't even really know until runtime??
    The idea is 'runtime polymorphism' without virtual pointers or (by implication) virtual functions. Instead, each class object is given a type value. When a function is called through a base-class pointer (the base class exists to ensure the inclusion of type, and so that the pointer can point to objects of derived classes), the base class goes through a switch statement using the type value (at runtime). Once it finds a match, it has determined the object's type (be it the base class or a derived class) and can then call the appropriate function (making type a const variable could be an idea; the important thing is that each object has its own copy). I hope that is enough information to answer your question.
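    The scheme described above can be distilled into a short standalone sketch (names are invented for illustration; note that static_cast suffices for a downcast within the hierarchy and is safer than reinterpret_cast):

    ```cpp
    #include <cassert>

    enum Tag { TAG_M1 = 1, TAG_M2 = 2 };

    struct M1 {
        unsigned char type;  // per-object type tag instead of a vtable pointer
        int k;
        M1() : type(TAG_M1), k(0) {}
        void koo() { k = 3; }
    };

    struct M2 : M1 {
        int l;
        M2() : l(0) { type = TAG_M2; }
        void koo() { l = 5; }  // hides, rather than overrides, M1::koo
    };

    // Free-function dispatcher: thinking of it as a global call rather than
    // a member function, as the original poster suggests.
    inline void koo(M1 *ob) {
        switch (ob->type) {
            case TAG_M1: ob->M1::koo(); break;
            case TAG_M2: static_cast<M2 *>(ob)->M2::koo(); break;
        }
    }

    int main() {
        M2 d;
        M1 *p = &d;   // base-class pointer to a derived object
        koo(p);       // dispatches on the tag, calls M2::koo
        assert(d.l == 5);
        return 0;
    }
    ```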

    You really are worried about all the wrong stuff. Memory is plentiful, and processors are lightning fast
    First, this isn't likely to have all the computer's resources to itself; second, my computer isn't as modern as some specs I've seen mentioned (600 MHz Intel Celeron, 128 MB RAM, 16 MB Voodoo 2000 graphics card), so I want to minimize strain from individual aspects of a program where possible.

    that you aren't even paying attention to the heap of .......... that you'll end up with in the end.
    Something like that has happened to me before (nothing I've brought up here, though), so all I can say is if so, then it's not because I haven't been trying to avoid it.

  7. #7
    Registered User
    Join Date
    Dec 2005
    Posts
    24
    In reply to Rashakil Fol, I compile with single-byte alignment so my variables don't have any padding, though I was unaware of the performance implications.

    As for 'macro insanity'/inlining and its effect on speed due to executable size, I currently lack experience as to how much is considered 'significant usage' and what the performance hit for large executables is (incidentally, could that be helped by, say, importing a dll with the required info?).

  8. #8
    Registered User
    Join Date
    Aug 2002
    Location
    Hermosa Beach, CA
    Posts
    446
    I was just pointing out that the following scenario that you gave (below) is impossible for the compiler to do. I imagine that the compiler just ignores your request for inlining.

    Code:
    class v1 { // The v-type classes use virtual pointers for runtime polymorphism.
    public:
      int i;
      int j;
      inline virtual void foo()
      {
        i = 4;
      }
    };
    
    class v2 : public v1 {
    public:
      inline virtual void foo()
      {
        j = 6;
      }
    };
    The crows maintain that a single crow could destroy the heavens. Doubtless this is so. But it proves nothing against the heavens, for the heavens signify simply: the impossibility of crows.

  9. #9
    Registered User
    Join Date
    Dec 2005
    Posts
    24
    I was just pointing out that the following scenario that you gave (below) is impossible for the compiler to do.
    My fault for not explaining. This was a stand-alone example to determine the performance differences, so it consisted of, in effect, two classes each done two different ways. The v classes are demonstrating the conventional usage of virtual pointers, while the m classes are my implementation. In actual usage, the v classes wouldn't exist.

    Since it may not be obvious, the line

    Code:
    #define MACROS
    is commented out by default. Unless your compiler supports inlining of functions containing switches, this will result in the slower method. Uncomment the line to see the effect of the macro (it worked on my machine, so any problems you may experience are probably due to your compiler or its settings).

    *EDIT* Sorry, my eyes 'glazed' over my code and I answered a question you didn't ask. I think I remember reading something about virtual functions not being inlined, but I guess it didn't sink in. Thanks for the reminder.
    Last edited by ruthgrin; 01-23-2006 at 01:36 AM. Reason: Just realized I interpreted post incorrectly

  10. #10
    aoeuhtns
    Join Date
    Jul 2005
    Posts
    581
    Quote Originally Posted by ruthgrin
    In reply to Rashakil Fol, I compile with single-byte alignment so my variables don't have any padding, though I was unaware of the performance implications.
    To be specific, memory access on typical systems is aligned to four-byte units. Here's a drawing of a memory capable of storing eight bytes.

    [ byte 0, byte 1, byte 2, byte 3 | byte 4, byte 5, byte 6, byte 7 ]

    Many systems can retrieve the data stored in [ byte 0, byte 1, byte 2, byte 3 ] in one operation, or the data stored in [ byte 4, byte 5, byte 6, byte 7 ] in one operation. But they can't retrieve a value spanning, say, [ byte 2, byte 3, byte 4, byte 5 ] as quickly. The processor has to retrieve bytes 0-3 and bytes 4-7, and then combine the relevant parts. This uses two memory accesses instead of one. The actual slowdown is probably not large enough to be serious.
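    The padding trade-off can be observed directly. In this sketch (struct names invented) the packed variant matches what single-byte alignment produces; #pragma pack is a non-standard but widely supported extension (GCC, Clang, MSVC):

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <iostream>

    struct Padded {             // natural alignment: tag is padded to 4 bytes
        std::uint8_t tag;
        std::uint32_t value;
    };

    #pragma pack(push, 1)
    struct Packed {             // single-byte alignment, as the poster compiles
        std::uint8_t tag;
        std::uint32_t value;    // may straddle an alignment boundary
    };
    #pragma pack(pop)

    int main() {
        std::cout << sizeof(Padded) << ' ' << offsetof(Padded, value) << '\n';
        std::cout << sizeof(Packed) << ' ' << offsetof(Packed, value) << '\n';
        return 0;
    }
    ```

    On a typical system the padded struct is 8 bytes with value at offset 4, while the packed one is 5 bytes with value at the (misaligned) offset 1: smaller, but each access to value may cost extra memory operations.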

    As for 'macro insanity'/inlining and its effect on speed due to executable size, I currently lack experience as to how much is considered 'significant usage' and the performance hit for large executables (incidentally, could that be helped by, say, importing a dll with the required info)?
    Required info? Huh? It's not the large executable, it's the burning through large amounts of code (and memory) if you inline everything.

    Compilers are much better at quickly figuring things like this out than humans. If you actually need optimization, you should use a profiler to learn what parts of your program need the optimization.

  11. #11
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,556
    Basically, it boils down to this.
    Why are you using macros inside C++ to reinvent C++?

    I mean, dumb macro tricks are sometimes necessary to get some sense of 'OO' into C programs, but C++???? - that's just too horrid to think about.

  12. #12
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    I believe that trying some coding algorithm or scheme is worthwhile. Even if it doesn't pan out, it adds knowledge to the coder and the programming community. Elements of it may be useful for other problems. For example, ideas in this scheme may be useful for someone wishing to implement run-time polymorphism in C, or something as unexpected as implementing polymorphism in a scripting engine. I'm sure that there are several useful algorithms and patterns out there that didn't prove useful for the problem they were originally developed for. I hope the forum isn't too harsh on people presenting their ideas.

    That being said, I'm a little skeptical of the performance improvement claimed for this scheme. This scheme adds a per-call lookup. As I understand it, a typical C++ virtual function call looks (expressed in C) like this:
    Code:
    ob->pVtbl->koo(ob);
    whereas your function call looks like this:
    Code:
      switch(ob->type) {
        case 1:
          ob->koo(ob); 
          break;
        case 2:
          ((m2 *) ob)->koo(ob);
          break;
      }
    I suspect that the former will be minutely faster than the latter. If you are seeing a performance improvement, I suspect it is because of the compiler inlining. As Rashakil said, this is not done for the former, but for your scheme looks like this:
    Code:
      switch(ob->type) {
        case 1:
          ob->k = 3;
          break;
        case 2:
          ob->l = 5;
          break;
      }
    This may be practical for two one-line functions, but let's consider a more realistic case of four classes with thirty lines of code each. In that case, each call site inlines no less than 120 lines of code. This may be problematic.

    I could also comment on the implementation of your idea but this is something that can be refined and changed (for example, with the C++ template system) so I kept to the core idea.
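    The "refined with the C++ template system" remark can be illustrated with std::variant (a facility from the much later C++17 standard, not something available to the thread's participants): it stores the type tag for you and generates the dispatch switch automatically, with no macros. A sketch, with invented names:

    ```cpp
    #include <cassert>
    #include <variant>

    struct M1 { int k = 0; void koo() { k = 3; } };
    struct M2 { int l = 0; void koo() { l = 5; } };

    // The variant carries a discriminant (the "type tag") internally.
    using M = std::variant<M1, M2>;

    void koo(M &m) {
        // std::visit performs the tag-based dispatch the macro hand-codes.
        std::visit([](auto &obj) { obj.koo(); }, m);
    }

    int main() {
        M a = M1{};
        M b = M2{};
        koo(a);
        koo(b);
        assert(std::get<M1>(a).k == 3);
        assert(std::get<M2>(b).l == 5);
        return 0;
    }
    ```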

  13. #13
    Registered User
    Join Date
    Feb 2003
    Posts
    62

    Good for you

    Don't listen to Salem; well done for trying, I give you credit for that. Keep up the good work and don't let people put you down, as they do a lot here.

    Chris

  14. #14
    Registered User
    Join Date
    Dec 2005
    Posts
    24
    To be specific, memory access on typical systems is aligned to four-byte units. Here's a drawing of a memory capable of storing eight bytes...
    I'll try to remember those points.

    Basically, it boils down to this.
    Why are you using macros inside C++ to reinvent C++?
    My original intention was simply to make a more space-efficient method of achieving runtime polymorphism, even if it was not quite as fast (since none of my current projects are time-critical, if I'm using that term in the correct context). When my timing tests showed that both methods were about the same speed, I figured that perhaps there was unplanned optimisation going on, so I tried inlining everything to even things up a bit. Since my compiler refused to inline the functions containing switches (which I figured was probably common among compilers), and I was unwilling to change compilers just for an experiment which I wouldn't be able to use with my preferred compiler, I tried to achieve the same effect with a function-like macro. Since I gathered that it wasn't a common practice, I tried to work out possible problems that might come up as a result, and when I could only come up with a couple I started this thread.

    I suspect that the former will be minutely faster than the latter...
    I just tested that. The virtual pointer method only took about 3/4 of the time. I guess this won't be useful in scenarios where speed is of the essence then.

    Don't listern to Salem, well done for trying, I give you credit for that. Keep up the good work and don't let people put you down as they do a lot here.
    Thanks. I don't think the people here are that inclined to 'put you down', unless I've been reading all the wrong posts. In addition, I gather I posted this at a time when (at least on one board) they had received a number of irritating posts in a short span, so I assume they figured this to be one more.

    Thanks for the help, again.
