Thread: How does C++ itself work?

  1. #1
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137

    How does C++ itself work?

    I'm familiar with C. I'm also familiar with x86-64 assembly. I'm not super familiar with C++, and I view it as a giant language whose features are hard to wrap my head around, but seeing as I know C# pretty well, I get the OOP stuff.

    That said, I've wanted to know, how does C++ itself actually work? For example, say I'm a C programmer and I want to make C++... Which C constructs would I use to implement classes, operator overloads, the STL and etc??? Are these ultimately all just pointers to structs which contain more pointers to more structs? Or is there something else going on under the hood? Are there any resources or recommendations which would explain this stuff? Thanks.

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    That said, I've wanted to know, how does C++ itself actually work?
    A lot like a C compiler, actually. Only the earliest C++ compiler (Cfront) turned C++ into C first and compiled that. A modern compiler translates C++ into assembly or object code itself, and the resulting object files are then linked into an executable.

    The C++ compilation process

    Which C constructs would I use to implement classes, operator overloads, the STL and etc???
    There are entire books on how compilers work -- and to be frank I haven't read them, but here is an older one; there may be something newer out -- but rather than looking at it the way that you are, where some magic chunk of code will make all of the operator overloads work or whatever, you have to deeply understand C++'s grammar and be able to write good C to parse it.

    To be frank, I wouldn't wish parsing C++ on my worst enemy. It's not even context free, and it shouldn't be where the faint of heart start to learn about parsers.

    Code:
    A B(C);
    Is B a function declaration or an object? The answer is context dependent.

    If you wanted to invent the simplest compiler, it would help to look at it from a bird's eye view. Access labels like public, private and protected can be enforced by static analysis -- essentially looking at all the files and figuring out whether the programmer tried to use a thing where he shouldn't. The same goes for reporting things like unused, uninitialized, or shadowed variables.

    Then once you have everything parsed, and assuming that no rules were broken, you would work on writing the assembly. I imagine that there is some magic to this too. Thinking about what the pre-processor does, and what a compiler should figure out, there is a minimum of new, freshly generated assembly compared to prerendered, or skeleton assembly programming.
    Last edited by whiteflags; 09-12-2017 at 02:43 AM.

  3. #3
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Are you trying to use C to recreate C++ functionality, or are you trying to implement a C++ compiler?
    What can this strange device be?
    When I touch it, it gives forth a sound
    It's got wires that vibrate and give music
    What can this thing be that I found?

  4. #4
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    Quote Originally Posted by Elkvis
    Are you trying to use C to recreate C++ functionality, or are you trying to implement a C++ compiler?
    The former. Just trying to recreate some C++ functionality or model it out at least in C code. However, the idea of a compiler is interesting as well, although I'm sure with the compiler, it has no need to "express it in C code" as it just writes opcodes.

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    A google search for "oo in c" finds other attempts at the same thing.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Quote Originally Posted by Asymptotic
    The former. Just trying to recreate some C++ functionality or model it out at least in C code.
    In my opinion, this is a waste of time. If you want C++'s functionality, just use C++, if you can. You'll have a much better overall experience. I can understand the desire to model it for educational or entertainment purposes, but in practice, you're better off using the language the way it's intended to be used, rather than trying to make the square peg of C fit the round hole of C++.

  7. #7
    Registered User
    Join Date
    Dec 2010
    Location
    Trinidad, CO (log cabin in middle of nowhere)
    Posts
    148
    All good points, but I'll try to add something. In the 1990s Microsoft was endeavoring to create component architectures, and out of that came their COM/OLE (Component Object Model - Object Linking and Embedding) technologies. In a sense, it's an alternative to the object model C++ commonly uses, which emphasizes object reuse through inheritance. The object architecture or memory layout they finally adopted was one all C++ compilers generate when the class implements pure abstract base classes. Which is abstract in itself and hard to understand. COM eventually became a critical part of Microsoft Operating Systems and many folks struggled to learn it. Most hated it. Few understood it. In 2006 Jeff Glatt wrote an article over at CodeProject entitled "COM In Plain C"...

    COM in plain C - CodeProject

    ...that has been tremendously well received. It kind of goes to the yearning Asymptotic expressed. I know in my case learning how objects are constructed in C has improved my understanding of OOP and C++ tremendously. Just sayin...

  8. #8
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    Freddie, thanks, that's awesome and good point about COM I hadn't really thought of that.

    Elkvis, don't read too much into my question, I'm not actually trying to implement this in production code and I have no need for OOP right now, but this is indeed for educational/research purposes. One thing you gotta know about me is that I'm technically a software researcher so please don't assume I'm trying to implement everything I ask about in production code :P Well, you can if you want to think I'm nuts like most ppl probably already do
    Anyway, thanks for info.

  9. #9
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    By the way Freddie,

    Structs actually just arrays? In that article, Glatt states:

    But let's say we don't want to store the function pointers directly inside of IExample. Instead, we'd rather have an array of function pointers. For example, let's define a second struct whose sole purpose is to store our two function pointers. We'll call this a IExampleVtbl struct, and define it as so:
    He then goes on to show a struct containing two function pointers which he calls an "array." So are structs basically glorified arrays with "members" for addressing reasons? For example, I could create an array and only use the first 4 bytes to represent an integer, the next 2 bytes to represent a short, padding, another byte to represent a char, etc... And then call this a "struct." Is that what he's getting at?

  10. #10
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Asymptotic
    Structs actually just arrays?
    No, struct objects are struct objects; arrays are arrays. A struct can contain members of different types; all the elements of an array are of the same type.

    Quote Originally Posted by Asymptotic
    He then goes on to show a struct containing two function pointers which he calls an "array." So are structs basically glorified arrays with "members" for addressing reasons?
    From what you have recounted, that's a wrong conclusion. Rather, arrays are basically glorified structs with "elements" for addressing reasons

    Quote Originally Posted by Asymptotic
    For example, I could create an array and only use the first 4 bytes to represent an integer, the next 2 bytes to represent a short, padding, another byte to represent a char, etc... And then call this a "struct." Is that what he's getting at?
    What you would be doing is allocating a contiguous chunk of memory by creating an array of bytes, and then manually treating parts of the memory allocated as objects of different types. This is not the same thing as saying that structs are actually just arrays. Rather, both struct objects and arrays reside in memory, so if you force them to be as if they were just memory, you could do things that make it look as if they were the same thing, i.e., memory. But from the perspective of C and C++, there's also the issue of type and the operations permitted on various types, hence structs and arrays differ.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  11. #11
    Registered User
    Join Date
    Dec 2010
    Location
    Trinidad, CO (log cabin in middle of nowhere)
    Posts
    148
    As Laserlight said, structs are structs and arrays are arrays. However, in the case of virtual function tables, which I believe are a fairly standard way of constructing a C++ object by a C++ compiler, all the elements of the struct, which actually contain function pointers, resolve to the same size, i.e., 4 bytes for 32 bit systems or 8 bytes for 64 bit systems. When one constructs a C++ object in C, one typically uses a struct to contain the function pointers. It could be done with an array, and the memory block could be thought of in that way - and indeed it is oftentimes beneficial to think of it in that way, but structs are typically used because use of a struct allows for type checking, and related to that intellisense type behaviour if using an IDE becomes possible. It wouldn't be possible if an array was used.

    I would say the main thing C++ provides in terms of the creation of C++ objects over C is that with C the syntax is horrible. One has to allocate blocks of memory to hold instance variables, then allocate memory for virtual function tables (and that relates to our discussion of structs versus arrays above). Then one has to declare the member functions of the object as ordinary functions, then place their addresses within the virtual function tables. And finally, calling the member functions through function pointer syntax becomes rather error prone and miserable. It's doable and I've done it a lot just for experimentation and learning purposes, but you would never want to do that in a production app.

  12. #12
    Registered User
    Join Date
    Dec 2010
    Location
    Trinidad, CO (log cabin in middle of nowhere)
    Posts
    148
    Concrete examples should help. In Jeff Glatt's example he created an object containing only one interface. I consider that an oversimplified case in that it's not intuitively obvious how it extends to classes supporting multiple interfaces - at least it wasn't to me. It took me a lot of figuring to get it right. And for classes supporting multiple interfaces, the issue comes up of how to cast the object pointer in such a way as to make the various interfaces accessible. But enough of that. Below is an object built according to Microsoft's COM specification - but built in C. The class is named CB (Class B), and it supports an IX and an IY interface. Unlike a COM object contained in a DLL and loaded through Microsoft's COM APIs (CoInitialize(), CoGetClassObject(), and IClassFactory, etc.), the class is simply instantiated in the code below. It shows how to build a COM compliant object. Later I'll show essentially the same thing using C++, but this example should be compiled as *.C. I do command line compiling so here is the full code and output...

    Code:
    #include <objbase.h>   /* CB3.C */
    #include <stdio.h>
    #include <stdlib.h>    /* malloc(), free() */
    #include <string.h>    /* memcmp() */
    const IID IID_IX = {0x20000000,0x0000,0x0000,{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x08}};
    const IID IID_IY = {0x20000000,0x0000,0x0000,{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x09}};
    typedef HRESULT  (__stdcall* PFNQUERYINTERFACE)    (size_t, const IID*, void**); //For Calling QueryInterface()
    typedef ULONG    (__stdcall* PFNADDREF)            (size_t);                     //For Calling AddRef()
    typedef ULONG    (__stdcall* PFNRELEASE)           (size_t);                     //For Calling Release()
    typedef void     (__stdcall* PIFN)                 (size_t,int);                 //For Calling Interface Functions
    
    
    typedef struct                   // IX
    {
     struct IXVTbl* lpIXVTbl;
    }IX;
    
    struct IXVTbl                    // Here is where pointers to interface functions are installed into VTable, which
    {                                // could be described alternately as an 'array of function pointers'.
     HRESULT (__stdcall* QueryInterface) (IX*, const IID*, void**);
     ULONG   (__stdcall* AddRef)         (IX*                    );
     ULONG   (__stdcall* Release)        (IX*                    );
     HRESULT (__stdcall* Fx1)            (IX*, int               );
     HRESULT (__stdcall* Fx2)            (IX*, int               );
    };
    
    
    typedef struct                   //IY
    {
     struct IYVTbl* lpIYVTbl;
    }IY;
    
    struct IYVTbl
    {
     HRESULT (__stdcall* QueryInterface) (IY*,const IID* , void**);
     ULONG   (__stdcall* AddRef)         (IY*                    );
     ULONG   (__stdcall* Release)        (IY*                    );
     HRESULT (__stdcall* Fy1)            (IY*, int               );
     HRESULT (__stdcall* Fy2)            (IY*, int               );
    };
    
    typedef struct IXVTbl IXVTbl;
    typedef struct IYVTbl IYVTbl;
    
    typedef struct                   //CB   // CB is actually, what we would term in C++, an object.
    {
     IXVTbl* lpIXVTbl;
     IYVTbl* lpIYVTbl;
     long  m_cRef;
    }CB;
    
    
    HRESULT __stdcall IX_QueryInterface(IX* This, const IID* iid, void** ppv)
    {
     *ppv=0;
     if(!memcmp(iid,&IID_IUnknown,16))
     {
        printf("IX_QueryInterface() For IUnknown!\n");
        *ppv=This;
     }
     if(!memcmp(iid,&IID_IX,16))
     {
        printf("IX_QueryInterface() For IX! - This=%p\t*This=%p\n",(void*)This,(void*)This->lpIXVTbl);
        //printf("IX_QueryInterface() For IX!\n");
        *ppv=This;
     }
     if(!memcmp(iid,&IID_IY,16))
     {
        IY* pIY = (IY*)This;      // the IX VTable pointer is the first slot of CB...
        pIY++;                    // ...so the IY VTable pointer is the next pointer-sized slot
        printf("IX_QueryInterface() For IY! - This=%p\tpIY=%p\n",(void*)This,(void*)pIY);
        *ppv=pIY;                 // return the interface pointer itself, not the VTable address it holds
     }
     if(*ppv)
     {
        //This->lpIXVTbl->AddRef(This);
        return S_OK;
     }
    
     return(E_NOINTERFACE);
    }
    
    
    HRESULT __stdcall IY_QueryInterface(IY* This, const IID* iid, void** ppv)
    {
     *ppv=0;
     if(!memcmp(iid,&IID_IUnknown,16))
     {
        printf("IY_QueryInterface() For IUnknown!\n");
        *ppv=This;
     }
     if(!memcmp(iid,&IID_IY,16))
     {
        printf("IY_QueryInterface() For IY!\n");
        *ppv=This;
     }
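     if(!memcmp(iid,&IID_IX,16))
     {
        IX* pIX = (IX*)This;      // the IY VTable pointer is the second slot of CB...
        pIX--;                    // ...so the IX VTable pointer is one pointer-sized slot back
        printf("IY_QueryInterface() For IX! - This=%p\tpIX=%p\n",(void*)This,(void*)pIX);
        *ppv=pIX;                 // return the interface pointer itself, not the VTable address it holds
     }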
     if(*ppv)
     {
        //This->lpIYVTbl->AddRef(This);
        return S_OK;
     }
    
     return(E_NOINTERFACE);
    }
    
    
    ULONG __stdcall IX_AddRef(IX* This)
    {
     CB* pCB=NULL;
    
     pCB=(CB*)This;
     pCB->m_cRef++;
     printf("Called IX_AddRef() - pCB->m_cRef = %u\n",pCB->m_cRef);
    
     return pCB->m_cRef;
    }
    
    
    ULONG __stdcall IY_AddRef(IY* This)
    {
     CB* pCB=NULL;
    
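     pCB=(CB*)((IX*)This-1);   // This points at the second slot of CB (lpIYVTbl); step back one slot to reach the object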
     pCB->m_cRef++;
     printf("Called IY_AddRef() - pCB->m_cRef = %u\n",pCB->m_cRef);
    
     return pCB->m_cRef;
    }
    
    
    
    ULONG __stdcall IX_Release(IX* This)
    {
     CB* pCB=NULL;
    
     pCB=(CB*)This;
     pCB->m_cRef--;
     printf("Called IX_Release() - pCB->m_cRef = %u\n",pCB->m_cRef);
    
     return pCB->m_cRef;
    }
    
    
    ULONG __stdcall IY_Release(IY* This)
    {
     CB* pCB=NULL;
    
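     pCB=(CB*)((IX*)This-1);   // This points at the second slot of CB (lpIYVTbl); step back one slot to reach the object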
     pCB->m_cRef--;
     printf("Called IY_Release() - pCB->m_cRef = %u\n",pCB->m_cRef);
    
     return pCB->m_cRef;
    }
    
    
    HRESULT __stdcall Fx1(IX* This,int iNum)
    {
     printf("Called Fx1()  :  iNum = %u\n",iNum);
     return S_OK;
    }
    
    
    HRESULT __stdcall Fx2(IX* This,int iNum)
    {
     printf("Called Fx2()  :  iNum = %u\n",iNum);
     return S_OK;
    }
    
    
    HRESULT __stdcall Fy1(IY* This,int iNum)
    {
     printf("Called Fy1()  :  iNum = %u\n",iNum);
     return S_OK;
    }
    
    
    HRESULT __stdcall Fy2(IY* This,int iNum)
    {
     printf("Called Fy2()  :  iNum = %u\n",iNum);
     return S_OK;
    }
    
    
    static IXVTbl ixVTable=
    {
     IX_QueryInterface,
     IX_AddRef,
     IX_Release,
     Fx1,
     Fx2
    };
    
    
    static IYVTbl iyVTable=
    {
     IY_QueryInterface,
     IY_AddRef,
     IY_Release,
     Fy1,
     Fy2
    };
    
    
    int main(void)
    {
     PFNQUERYINTERFACE    ptrQueryInterface = NULL;
     PFNADDREF            ptrAddRef         = NULL;
     PFNRELEASE           ptrRelease        = NULL;
     PIFN                 pIFn              = NULL;
     size_t*              pVTbl             = NULL;
     size_t*              VTbl              = NULL;
     void*                pIUnk             = NULL;
     CB*                  pCB               = NULL;
     IX*                  pIX               = NULL;
     IY*                  pIY               = NULL;
     IX*                  pIX1              = NULL;
     IY*                  pIY1              = NULL;
     HRESULT              hr                = 0;
     size_t               i                 = 0;
    
     printf("sizeof(CB)     = %u\n",(unsigned)sizeof(CB));
     pCB=(CB*)malloc(sizeof(CB));
     if(pCB)
     {
        pCB->lpIXVTbl=&ixVTable;
        pCB->lpIYVTbl=&iyVTable;
        pCB->m_cRef=0;                                 // malloc() does not zero the reference count
        printf("pCB            = %p\n",(void*)pCB);   // %p prints full pointers on 32 and 64 bit builds
        printf("pCB->lpIXVTbl  = %p\n",(void*)pCB->lpIXVTbl);
        printf("pCB->lpIYVTbl  = %p\n",(void*)pCB->lpIYVTbl);
        pCB->lpIXVTbl->QueryInterface((IX*)&pCB->lpIXVTbl,&IID_IY,(void**)&pIY1);
        pCB->lpIXVTbl->Fx1(pIX,1);
        pCB->lpIXVTbl->Fx2(pIX,1);
        pCB->lpIYVTbl->Fy1(pIY,2);
        pCB->lpIYVTbl->Fy2(pIY,2);
        pVTbl=(size_t*)pCB;
        printf("\n");
        printf("&pVTbl[i]\t&VTbl[j]\t\tVTbl[j]\t\t\tpFn()\n");
        printf("==================================================================================================================================\n");
        for(i=0; i<2; i++)
        {
            VTbl=(size_t*)pVTbl[i];                                           //Call...
            printf("%p\t\t%p\t\t%p\t\t",(void*)&pVTbl[i],(void*)&VTbl[0],(void*)VTbl[0]);
            ptrQueryInterface=(PFNQUERYINTERFACE)VTbl[0];                     //QueryInterface()
            if(i==0)
               ptrQueryInterface((size_t)&pVTbl[i],&IID_IX,&pIUnk);
            else
               ptrQueryInterface((size_t)&pVTbl[i],&IID_IY,&pIUnk);
            printf("%p\t\t%p\t\t%p\t\t",(void*)&pVTbl[i],(void*)&VTbl[1],(void*)VTbl[1]);
            ptrAddRef=(PFNADDREF)VTbl[1];                                     //AddRef()
            ptrAddRef((size_t)&pVTbl[i]);                                     //size_t, not int: an int truncates 64 bit addresses
            printf("%p\t\t%p\t\t%p\t\t",(void*)&pVTbl[i],(void*)&VTbl[2],(void*)VTbl[2]);
            ptrRelease=(PFNRELEASE)VTbl[2];                                   //Release()
            ptrRelease((size_t)&pVTbl[i]);
            printf("%p\t\t%p\t\t%p\t\t",(void*)&pVTbl[i],(void*)&VTbl[3],(void*)VTbl[3]);
            pIFn=(PIFN)VTbl[3];                                               //Fx1() / Fy1()
            pIFn((size_t)&pVTbl[i],(int)i);
            printf("%p\t\t%p\t\t%p\t\t",(void*)&pVTbl[i],(void*)&VTbl[4],(void*)VTbl[4]);
            pIFn=(PIFN)VTbl[4];                                               //Fx2() / Fy2()
            pIFn((size_t)&pVTbl[i],(int)i);
            printf("\n");
        }
        free(pCB);
     }
    
     return 0;
    }
    After the code is the output from my run, which shows the addresses of everything involved in the object. Note this is Microsoft code. It should work with MinGW, but it uses Microsoft-specific headers. I built with VC9 from Visual Studio 2008, which is actually VC15 if one believes the compiler version. It's a 64 bit build. Should build x86 though.
    Last edited by freddie; 09-14-2017 at 11:24 AM. Reason: add a bit more info

  13. #13
    Registered User
    Join Date
    Dec 2010
    Location
    Trinidad, CO (log cabin in middle of nowhere)
    Posts
    148
    The above is a whole lot to digest, Asymptotic. I worked on stuff like that for maybe 6 months back when I was first learning COM and object architectures. I've got whole hard drives full of that kind of stuff. Now I'll try to post something essentially the same in C++. It shows how much easier it is to do in C++, because C++ builds the whole thing for you transparently. There are also comments that try to explain it. In fact, it might be better to try this one, i.e., CA (Class A), before the above, as it might be more comprehensible...

    Code:
    /*
      CA.cpp
    
      This program attempts to show and describe the memory footprint of a COM object.  One of the fundamental ideas of COM
      is the separation of interface from implementation.  So when a COM object is instantiated there will be memory
      allocations in multiple places.  Specifically, there will be a memory allocation for the object itself, and in that memory
      block will be found instance data variables that will maintain 'state' for an instantiation of an object.  Also in that
      memory block will be pointers to structures known as virtual function tables or interfaces.  An object can support any
      number of interfaces, and for each interface supported there will be what is known as a virtual function table pointer,
      i.e., Vtbl*.
    
      So, for an object that supports two interfaces (as in the example below) there will be a block of memory allocated for
      the object itself, and then two separate additional allocations for each of the two VTables or interfaces.  If each
      interface contains five functions/methods, in a 32 bit operating system, each interface will then require a 5 X 4 = 20
      byte allocation to hold function pointers to the implementations of the interface functions.  From this then you can
      see that a virtual function table or interface can also be likened to an array of function pointers, with each member
      of the array holding a pointer to a function.  In actual practice though arrays are seldom used to construct virtual
      function tables, because with present day compilers and IDEs type and parameter checking doesn't work as well as when
      structs are used.
    
      In a certain sense one could really say there are three categories of memory needing to be allocated for a COM object,
      i.e., 1) the base allocation for the object itself; 2) the VTables or interfaces; and 3) the interface functions
      themselves the addresses of which are stored in the VTables.  However, the writer/creator of a COM object only needs
      to concern him/herself with the first two; the compiler takes care of the third (that's why most of us program in a
      high/mid level language as opposed to assembler or machine code).
    
      So let's take a look at the code below.  It's a very simplified version of a COM object whose only purpose is to elucidate
      the memory footprint of a real COM object.  To begin with, note the interface #define.  An interface is just a specific
      kind of struct.  This program doesn't include ObjBase.h, so to use the interface keyword I had to include the define...
    
      #define interface struct
    
      which actually is in ObjBase.h.  Secondly, all COM interfaces inherit a COM System defined interface known as IUnknown.
      This is tremendously significant.  Whenever you have a pointer to a COM object - no matter what it is, you have an
      IUnknown*, and can query that object for other interfaces. You'll note that the IUnknown VTable contains three
      functions, which in C++ syntax are described as pure virtual functions the real declarations of which are in Unknwn.h.
      And they actually resolve to function pointers themselves.  If you've gotten this far it might have dawned on you that
      COM memory layout isn't introductory material.  You'll need some understanding of pointers, memory allocations, and
      function pointers to grasp it.  But if you are interested in this material and are having a hard time, please read on.
      I am going to do my best to explain at least some part of it, and it is my hope that if you see the actual memory
      'dumped' as I'm going to do in this example, you will begin to grasp what is going on, and see the actual elegance of
      it!
    
      There is another issue I might mention too.  C++ in some ways obscures an understanding of what is really going on in
      memory with a COM object.  To really fully understand it and see its brutally naked structure you almost have to do
      it in C, and I might just do that for you!  But for now let's continue with C++, pure virtual functions, and the
      works!
    
      You see we have the IUnknown interface/struct which in a 32 bit operating system is going to create a memory block of
      12 bytes, i.e., four bytes for each function pointer.  Then we have an IX interface/struct and an IY interface/struct
      each of which inherits IUnknown.  What that means is that when a COM object is instantiated which supports interfaces
      IX and IY, the virtual function tables for each of these interfaces is going to have to supply room for five function
      pointers each - three for the IUnknown and two more for the supported functions Fx1(), Fx2(), or Fy1() and Fy2().  So
      the VTable for each will look like this...
    
      struct IX            //Offset from base allocation of VTable allocation
      {
       QueryInterface();   // 0
       AddRef();           // 4
       Release();          // 8
       Fx1();              // 12
       Fx2();              // 16
      };                   // =======
                           // 20 bytes
      or
    
      struct IY            //Offset from base allocation of VTable allocation
      {
       QueryInterface();   // 0
       AddRef();           // 4
       Release();          // 8
       Fy1();              // 12
       Fy2();              // 16
      };                   // =======
                           // 20 bytes
    
      Finally, we have a class 'Class A', i.e., 'CA', that publicly and multiply inherits interfaces IX and IY.  Since we
      hope to create an instance of Class A in this program we're going to have to implement all the pure virtual functions of
      the interfaces, and we do that by simply providing stdio puts or printf statements that they were called.  Take a look
      at class CA.  Having done that we can now instantiate class CA in main and locate the class on the heap so as to obtain
      a pointer to it, e.g., pCA...
    
      CA* pCA=new CA;
    
      Of course, we could call the IX and IY interface functions like this if we wanted to...
    
      pCA->Fx1();
      pCA->Fx2();
      pCA->Fy1();
      pCA->Fy2();
    
      And if we did we'd get this output...
    
      Called Fx1()
      Called Fx2()
      Called Fy1()
      Called Fy2()
    
      But what we want to do is a lot cuter than that!  We want to see the actual memory addresses and blocks of memory where
      these arcane creations really live, and for that we are going to have to get pretty cute ourselves!  And with pointers
      and function pointers involving several levels of indirection to boot!  Are we having fun yet?  If so, let's continue by
      creating our CA object and seeing how big it is.  The second statement in main outputs the sizeof(CA) and we see it's 8
      bytes....
    
      sizeof(CA)              = 8       : An IX VTBL Ptr And A IY VTBL Ptr
    
      The above line is the first line of output from the program below.  Please examine that output, for it is that we are
      going to explain.  The reason the sizeof CA is eight bytes is that the C++ compiler, when creating the object, saw that
      it contained no instance data, but that the class multiply inherited from two abstract base classes, i.e., the interface
      classes, and therefore it created two memory slots of four bytes each for virtual function table pointers to the
      instantiations of the respective interfaces/vtables.  So that I could iterate through these various structures/classes/vtables
      whatever and output addresses I declared two variables as unsigned int pointers, i.e.,
    
      unsigned int* pVTbl=0;  // pointer to VTable
      unsigned int* VTbl=0;   // VTable
    
      which would allow me to treat everything more or less as an int or int pointer, which is useful in for loop iterations
      or address output statements.  So first off we set our pVtbl pointer equal to the base address of our 8 byte CA
      instantiation and output that address...
    
     pVTbl=(unsigned int*)pCA;
     printf("pVTbl\t\t\t= %u : Ptr To IX VTBL\n",(unsigned int)pVTbl);
    
     And the output...
    
     pVTbl = 7744160 : Ptr To IX VTBL
    
     So our object CA starts at 7744160 and ends at 7744167, i.e., 8 bytes later.  To repeat, at the first four bytes of that
     8 byte memory block is a pointer to the 20 byte allocation for the IX VTable, and at the last four bytes of that allocation
     is a pointer to the 20 byte allocation for the IY VTable.  Using array subscript notation we have then this for the memory
     run below....
    
     &pVTbl[0] = 7744160
     &pVtbl[1] = 7744164
    
     If you look at our tabular output below you'll see those numbers listed in the first column of output.  These numbers are
     referred to as VTable or interface pointers.  They are not VTables or interfaces but rather point to VTable\interfaces.
     In the code in main() you'll see two for loops; the j loop nested within the outer i loop.  The outer i loop simply
     iterates between 0 and 1 to catch the two VTable pointers listed above, i.e., our IX VTable pointer at 7744160 (&pVtbl[0])
     and our IY VTable pointer at 7744164 (&pVtbl[1]).  Within the body of the outer i for loop and just above the j loop you'll
     see this statement...
    
     VTbl=(unsigned int*)pVTbl[i];
    
     This VTbl number will actually be the starting address or base allocation for each of the two VTables/interfaces.  When i
     equals zero it will be the base allocation for the IX VTable, and when i is one it will be the base allocation for the IY
     VTable.  What the j for loop does then is treat this address as an int pointer and use array subscript notation to iterate
     through the five function pointer slots in each VTable, and output 1) the address within the VTable ( column 2 ); 2) the
     function pointer address held at that aforementioned location ( column 3 ); and 3) the stdio output statement generated by
     a function pointer call through pFn at the addresses in column 3.  These latter function pointer calls are seen in column 4.
     A function pointer is declared in main like so...
    
     void (__stdcall* pFn)(int);
    
     What that means is that pFn is a symbol standing for a function pointer which can be used to call a function through its
     address rather than through its name, and the function that it can call must use __stdcall stack frame setup, must return
     nothing, and must take one int parameter.
    
    */
    #include <stdio.h>
    #define interface struct
    
    interface IUnknown
    {
     virtual void __stdcall QueryInterface() = 0;
     virtual void __stdcall AddRef()         = 0;
     virtual void __stdcall Release()        = 0;
    };
    
    interface IX : IUnknown
    {
     virtual void __stdcall Fx1()            = 0;
     virtual void __stdcall Fx2()            = 0;
    };
    
    interface IY : IUnknown
    {
     virtual void __stdcall Fy1()            = 0;
     virtual void __stdcall Fy2()            = 0;
    };
    
    class CA : public IX, public IY
    {
     public:
     virtual ~CA(){}
     virtual void __stdcall QueryInterface(){puts("Called QueryInterface()");}
     virtual void __stdcall AddRef(){puts("Called AddRef()");}
     virtual void __stdcall Release(){puts("Called Release()");}
     virtual void __stdcall Fx1(){printf("Called Fx1()\n");}
     virtual void __stdcall Fx2(){printf("Called Fx2()\n");}
     virtual void __stdcall Fy1(){printf("Called Fy1()\n");}
     virtual void __stdcall Fy2(){printf("Called Fy2()\n");}
    };
    
    int main(void)
    {
     void (__stdcall* pFn)(int);
     size_t* pVTbl=0;
     size_t* VTbl=0;
     unsigned int i=0,j=0;
     CA* pCA=0;
    
     pCA=new CA;
     printf("sizeof(CA)\t\t= %u\t  : An IX VTBL Ptr And A IY VTBL Ptr\n",(unsigned int)sizeof(CA));
     pVTbl=(size_t*)pCA;
     printf("pVTbl\t\t\t= %u : Ptr To IX VTBL\n",(unsigned int)(size_t)pVTbl);
     printf("&pVTbl[0]=%u\t< at this address is ptr to IX VTable\n",(unsigned int)(size_t)&pVTbl[0]);
     printf("&pVTbl[1]=%u\t< at this address is ptr to IY VTable\n",(unsigned int)(size_t)&pVTbl[1]);
     printf("\n");
     printf("&pVTbl[i]\t&VTbl[j]\t\tpFn=VTbl[j]\t\tpFn() Function Pointer Call\n");
     printf("===========================================================================================\n");
     for(i=0;i<2;i++)
     {
         VTbl=(size_t*)pVTbl[i];
         for(j=0;j<5;j++)
         {
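             /* Cast the same way as the printf calls above so each argument matches %u in 32 and 64 bit builds */
             printf
             (
              "%u\t\t%u\t\t%u\t\t",
              (unsigned int)(size_t)&pVTbl[i],
              (unsigned int)(size_t)&VTbl[j],
              (unsigned int)(size_t)VTbl[j]
             );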
             pFn=( void (__stdcall*)(int))VTbl[j];
             pFn(i);
         }
         printf("\n");
     }
     delete pCA;
     getchar();
    
     return 0;
    }
    
    
    /*
    C:\Code\VStudio\VC++6\Projects\COM\CB\CB3>CL CA.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    CA.cpp
    Microsoft (R) Incremental Linker Version 9.00.21022.08
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    /out:CA.exe
    CA.obj
    
    C:\Code\VStudio\VC++6\Projects\COM\CB\CB3>CA
    sizeof(CA)              = 16      : An IX VTBL Ptr And A IY VTBL Ptr
    pVTbl                   = 5070912 : Ptr To IX VTBL
    &pVTbl[0]=5070912       < at this address is ptr to IX VTable
    &pVTbl[1]=5070920       < at this address is ptr to IY VTable
    
    &pVTbl[i]       &VTbl[j]                pFn=VTbl[j]             pFn() Function Pointer Call
    ===========================================================================================
    5070912         1073788024              1073746496              Called QueryInterface()
    5070912         1073788032              1073746528              Called AddRef()
    5070912         1073788040              1073746560              Called Release()
    5070912         1073788048              1073746592              Called Fx1()
    5070912         1073788056              1073746624              Called Fx2()
    
    5070920         1073787976              1073746976              Called QueryInterface()
    5070920         1073787984              1073746960              Called AddRef()
    5070920         1073787992              1073746992              Called Release()
    5070920         1073788000              1073746656              Called Fy1()
    5070920         1073788008              1073746688              Called Fy2()
    */
    Note that when I wrote up the tutorial part above I was using 32 bit builds, where 4 bytes is enough for a pointer. The output above is from a 64 bit build, where pointers are 8 bytes, which is why sizeof(CA) shows 16 rather than 8 and the addresses step by 8. I changed the unsigned int pointers to size_t pointers so the code can be built as either 32 bit or 64 bit.
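
    If you want to try the same i/j table walk without leaning on how a particular compiler lays out a CA object, here is a minimal sketch that builds the two tables by hand. The names FakeCA, MakeFakeCA, WalkVTables and the Fn typedef are made up for illustration and are not from the code above, and the functions return their names instead of printing so the result is easy to check. Nothing in it is compiler generated, so it only illustrates the layout idea; it is not a claim about what any real compiler emits.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for pFn: takes one int, returns a label instead of printing.
typedef const char* (*Fn)(int);

static const char* QueryInterface(int) { return "QueryInterface"; }
static const char* AddRef(int)         { return "AddRef"; }
static const char* Release(int)        { return "Release"; }
static const char* Fx1(int)            { return "Fx1"; }
static const char* Fx2(int)            { return "Fx2"; }
static const char* Fy1(int)            { return "Fy1"; }
static const char* Fy2(int)            { return "Fy2"; }

// Two hand-built "VTables" with five slots each, arranged like the IX and IY tables.
static const Fn IX_VTbl[5] = { QueryInterface, AddRef, Release, Fx1, Fx2 };
static const Fn IY_VTbl[5] = { QueryInterface, AddRef, Release, Fy1, Fy2 };

// The "object" holds nothing but two VTable pointers, like CA in the tutorial.
struct FakeCA
{
    const Fn* pVTbl[2];
};

FakeCA MakeFakeCA()
{
    FakeCA obj = { { IX_VTbl, IY_VTbl } };
    return obj;
}

// Same walk as the nested loops in main(): i picks the VTable pointer, j picks the slot.
std::string WalkVTables(const FakeCA& obj)
{
    std::string out;
    for (std::size_t i = 0; i < 2; ++i)
    {
        const Fn* VTbl = obj.pVTbl[i];        // base address of table i
        for (std::size_t j = 0; j < 5; ++j)
        {
            Fn pFn = VTbl[j];                 // function pointer held in slot j
            out += pFn(static_cast<int>(i));  // call through the address, not the name
            out += ' ';
        }
    }
    return out;
}
```

    Calling WalkVTables(MakeFakeCA()) visits the five slots of the IX table and then the five slots of the IY table, in the same order as the two groups of rows in the output above.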
    Last edited by freddie; 09-14-2017 at 11:41 AM.

  14. #14
    Old Fashioned
    Join Date
    Nov 2016
    Posts
    137
    laserlight and Freddie,

    Thank you both very much for all of that information. I am feeling very grateful right now, to say the least. I also happen to be working through the book "Data Structures in C" and this is all starting to make a lot more sense now. Very interesting and fun stuff!

    Freddie thanks also for your overview and then demo of the difference between the code in C++ vs. C. Going to go through all that code tomorrow and will let ya know if I have any more questions. Already got a much clearer idea now though!

  15. #15
    Registered User
    Join Date
    Dec 2010
    Location
    Trinidad, CO (log cabin in middle of nowhere)
    Posts
    148
    In looking at what I posted yesterday, either the forum software messed up (unlikely), or I somehow failed to include the console run output from the CB example above. Here it is, i.e., what you'll get if you build and run it...

    Code:
    C:\Code\VStudio\VC++6\Projects\COM\CB\CB3>CL CB3.C
    Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    CB3.C
    Microsoft (R) Incremental Linker Version 9.00.21022.08
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    /out:CB3.exe
    CB3.obj
    
    C:\Code\VStudio\VC++6\Projects\COM\CB\CB3>CB3
    sizeof(CB)     = 24
    pCB            = 3235904
    pCB->lpIXVTbl  = 1073795616
    pCB->lpIYVTbl  = 1073795656
    IX_QueryInterface() For IY! - This=3235904      pIY=3235912
    Called Fx1()  :  iNum = 1
    Called Fx2()  :  iNum = 1
    Called Fy1()  :  iNum = 2
    Called Fy2()  :  iNum = 2
    
    &pVTbl[i]       &VTbl[j]                VTbl[j]                 pFn()
    ==================================================================================================================================
    3235904         1073795616              1073745920              IX_QueryInterface() For IX! - This=3235904      *This=1073795616
    3235904         1073795624              1073746544              Called IX_AddRef() - pCB->m_cRef = 1
    3235904         1073795632              1073746736              Called IX_Release() - pCB->m_cRef = 0
    3235904         1073795640              1073746928              Called Fx1()  :  iNum = 0
    3235904         1073795648              1073746976              Called Fx2()  :  iNum = 0
    
    3235912         1073795656              1073746240              IY_QueryInterface() For IY!
    3235912         1073795664              1073746640              Called IY_AddRef() - pCB->m_cRef = 604404914
    3235912         1073795672              1073746832              Called IY_Release() - pCB->m_cRef = 604404913
    3235912         1073795680              1073747024              Called Fy1()  :  iNum = 1
    3235912         1073795688              1073747072              Called Fy2()  :  iNum = 1
    Years ago over in Jose Roca's Forum I posted this example made into a COM object in a Dll. I included all the Registry code so it could be registered in the Windows Registry. When that's done the object can be loaded through standard COM services (SCM - Service Control Manager), i.e., call CoInitialize(), then CoGetClassObject(), etc. Alternately, one can do something known as 'Registry Free COM', where one bypasses the standard COM\OLE system services of Windows and does everything oneself to load and create the object. That isn't hard at all. One simply calls LoadLibrary() on the dll, then DllGetClassObject() to get a pointer to the IClassFactory interface, from which the CB object can be instantiated.

    It might interest you to know (since you seem, like me, to enjoy this stuff), that if the object is created with C code as above, and it exists within a dll, the client utilizing the object in the dll needn't be aware it was created with C code. That is, standard C++ headers describing the IX and IY interfaces could be used. As far as a C++ client would be concerned, it would look like a C++ object. It gets even better.

    Since an object constructed in that manner follows the COM standard in terms of the binary layout of the object, it could be accessed using other languages besides C or C++. I frequently use PowerBASIC to access COM\OLE objects.

    In fact, you mentioned C#. That's absolutely doable. It would be a console app, but you would be able to load it and use it in C#. Of course, it doesn't do much, but that's another issue.

    I'm not sure the intellisense would work though in .NET. When I posted that code years ago I don't believe I compiled a type library into the object. That is likely necessary for intellisense to work. Easy enough to do, but I don't believe my example did it.
    Last edited by freddie; 09-15-2017 at 11:32 AM.
