Thread: how pointer really works

  1. #1
    Registered User
    Join Date
    Jan 2008
    Posts
    32

    how pointer really works

    How is the compiler able to know how many bytes a particular pointer points to?

    I believe my question will be clear from the following example:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int printit(void *b, int elementid) {
    	printf("%d\n\n\n", *((int *)b + elementid));
    	return 1;
    }
    
    int main(void) {
    	int a[100] = {0};
    	if (1 == printit(a, 200))
    		printf("THE ELEMENT WAS PRESENT IN THE ARRAY AND PRINTED\n");
    	else
    		printf("ELEMENT ID OUT  OF LIMIT\n");
    	return 0;
    }
    How can I detect that the element id passed to the function printit is out of bounds, so that
    ELEMENT ID OUT OF LIMIT gets printed?

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    That code is unlikely to "work as expected". Referencing element 200 of a 100-element array is undefined behaviour. In this particular case it would reference memory well beyond the end of the array, possibly outside the stack altogether, and would thus likely crash the application. But this depends on the compiler, processor and OS implementations, so it may "do anything".

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Jan 2008
    Posts
    32
    I know that it is undefined behaviour. My only reason for writing the snippet was to ask what I can do when a person enters a subscript that is greater than the size of the array, or than the number of bytes the pointer points to.

    I need to know how a function called _msize (Windows-specific) is able to find out the number of bytes the pointer points to:

    Code:
    #include <stdio.h>
    #include <malloc.h>
    #include <stdlib.h>
    
    int main(void) {
    	int *a = malloc(123);
    	printf("%lu\n\n", (unsigned long)_msize(a));  /* _msize returns size_t */
    	return 0;
    }

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    _msize uses the internal data structures that malloc uses to keep track of the memory allocation. In almost all implementations, malloc() will add a small amount to the actual requested amount, and it will put its own data inside this extra amount, in front of the data block that you get back. In MS's implementation, part of what is stored there is the size of the memory allocation.

    There is no (simple) way that you can know in any particular function whether an index is within range or not OTHER THAN by knowing the original size of the array/memory block - so usually this means passing a size to the function.
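
    A minimal sketch of "passing a size to the function", applied to the printit example from the first post. The names printit_checked and ARRAY_LEN are made up for illustration. ARRAY_LEN only works where a is still a real array (for example in main); inside the callee it has decayed to a plain pointer, which is why the count has to travel along as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Element count of a real array; gives a wrong answer for a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Prints arr[index] and returns 1 if index is in range, otherwise
   touches nothing and returns 0. */
int printit_checked(const int *arr, size_t count, size_t index)
{
    assert(arr != NULL);
    if (index >= count)
        return 0;
    printf("%d\n", arr[index]);
    return 1;
}
```

    With int a[100] declared in main, printit_checked(a, ARRAY_LEN(a), 200) returns 0, so the caller can print ELEMENT ID OUT OF LIMIT instead of reading past the array.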


  5. #5
    Registered User
    Join Date
    Jan 2008
    Posts
    32
    I would love to know this even if it's not "simple".

    And another thing: when we make a malloc call, does it allocate more memory than we asked for, i.e. to store all this added information about the data block?

  6. #6
    Registered User hk_mp5kpdw
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Quote Originally Posted by technosavvy
    and another thing..when we make a malloc call ..does it allocate memory more than we have asked for ...i.e to store all this added information about the data block..
    Quote Originally Posted by matsp
    In almost all implementations, malloc() will add a small amount to the actual requested amount, and it will put its own data inside this extra amount, in front of the data block that you get back. In MS's implementation, part of what is stored there is the size of the memory allocation.
    He already answered that.
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    What I mean by "no simple way" is that there is no way that I can think of that would work for a reasonable situation[1]. There are ways, like the one you mentioned, using _msize(), but that ONLY works for malloc-created memory blocks, nothing else, and it requires that you have the original pointer that malloc returned [at least the MSDN info doesn't say otherwise], so the following code wouldn't work:
    Code:
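    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>   /* _msize() is Microsoft-specific */
    
    void func(int *p, int index)
    {
        /* Only meaningful if p is exactly the pointer malloc() returned. */
        if (index < 0 || (size_t)index >= _msize(p) / sizeof(int)) {
            fprintf(stderr, "index out of range\n");
            exit(EXIT_FAILURE);
        }
        p[index] = 0;   /* ... use p[index] */
    }
    
    int main(void)
    {
        int *block = malloc(sizeof(int) * 40);
        if (block == NULL)
            return EXIT_FAILURE;
        func(&block[20], 10);   /* points into the middle of the block, so _msize() is invalid */
        free(block);
        return 0;
    }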
    Yes, malloc commonly allocates "a little bit extra". Something like this would implement malloc and _msize, for example:
    Code:
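    /* Bookkeeping header kept in front of each block (layout is illustrative). */
    struct mdata {
        size_t size;
        // perhaps other data...
    };
    
    void *malloc(size_t size)
    {
        struct mdata *p;
        size_t sz = size + sizeof(struct mdata);
        p = low_level_allocation(sz);   // placeholder for the real allocator
        if (p == NULL)
            return NULL;
        p->size = size;
        // perhaps other data to be filled in... 
        return (void*)&p[1];
    }
    
    size_t _msize(void *pp)
    {
       struct mdata *p = &((struct mdata *)pp)[-1];  // step back to the header; not portable
       return p->size;
    }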
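
    For anyone who wants to try the header idea without replacing the real malloc, here is a rough sketch layered on top of it. The names sized_malloc, sized_msize and sized_free are invented for this example, and the header layout is an assumption, not what any particular C library does. The union with max_align_t (C11) is only there so that the user data behind the header stays suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every block. */
union header {
    size_t      size;   /* number of bytes the caller asked for */
    max_align_t align;  /* pads the header to the strictest alignment */
};

/* Allocates size bytes plus room for the header; returns the user area. */
void *sized_malloc(size_t size)
{
    union header *h;
    if (size > SIZE_MAX - sizeof *h)   /* header + size would overflow */
        return NULL;
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                      /* first byte after the header */
}

/* Valid only for a pointer returned by sized_malloc, not an interior one. */
size_t sized_msize(const void *user)
{
    assert(user != NULL);
    return ((const union header *)user - 1)->size;
}

/* Steps back to the header so the real free() sees the original pointer. */
void sized_free(void *user)
{
    if (user != NULL)
        free((union header *)user - 1);
}
```

    After void *p = sized_malloc(123), sized_msize(p) gives back 123. Passing a pointer into the middle of the block would make it read unrelated bytes as a header, which is the same limitation described above for _msize.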
    [1] If you have access to OS-internal data, then you could potentially check whether an access would cause a memory violation, but knowing that it wouldn't doesn't mean that the access is correct. For example, if we are in a call stack 15 calls deep, with each call using an average of 100 bytes, this would be a "valid memory access":
    Code:
    void func()
    {
       int a[5];
       a[300] = 7;
    }
    It may well crash the system when some function about 12 levels below is returning, but it would not be detectable by checking if the memory is valid.

    The only other way to solve this problem, which is how for example Pascal compilers often do range checking, is to pass the size of the data implicitly from one function to the next and check against that size on every access. But this is slow if it has to be done on every function call and every access, and it relies on the compiler to pass the sizes of arrays properly.
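
    C won't do this for you, but the same idea can be approximated by hand. Below is a small sketch, assuming you are willing to pass a struct instead of a bare pointer; struct int_slice, slice_from and slice_get are hypothetical names, not a standard API. The length travels with the address, and taking a view from the middle of a block adjusts the length accordingly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the address travels together with its length. */
struct int_slice {
    int   *data;
    size_t len;
};

/* View of s starting at element from; empty if from is past the end. */
struct int_slice slice_from(struct int_slice s, size_t from)
{
    struct int_slice r = { NULL, 0 };
    if (s.data != NULL && from <= s.len) {
        r.data = s.data + from;
        r.len  = s.len - from;
    }
    return r;
}

/* Checked read: stores s.data[i] in *out and returns 1, or returns 0
   without touching memory when i is out of range. */
int slice_get(struct int_slice s, size_t i, int *out)
{
    assert(out != NULL);
    if (i >= s.len)
        return 0;
    *out = s.data[i];
    return 1;
}
```

    With a 40-element block wrapped in a slice, slice_from(whole, 20) plays the role of &block[20] from the earlier example but still knows that only 20 elements remain, so slice_get on it accepts index 10 and rejects index 20. The cost is the extra comparison on every access, as noted above.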


  8. #8
    Registered User
    Join Date
    Jan 2008
    Posts
    32
    Thanks for the help!!
