Thread: A pointer problem from a high school programming contest

  1. #16
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robatino View Post
    Here's a long and detailed thread on the subject:

    http://groups.google.com/group/comp....18f6cccd56427d
    Here's the relevant passage from that discussion:

    The Rationale says the Committee considered defining
    the effects at both ends (bilateral dispensations?), but
    rejected it for efficiency reasons. Consider an array of
    large elements -- structs of 32KB size, say. A system
    that actually performed hardware checking of pointer values
    could accommodate the one-past-the-end rule by allocating
    just one extra byte after the end of the array, a byte that
    the special pointer value could point at without setting off
    the hardware's alarms. But one-before-the-beginning would
    require an extra 32KB, just to hold data that could never
    be used ...
    I can understand the benefit of an architecture with fine-grained access control to memory. Yes, I absolutely can. But this is still ridiculous. What harm is there in POSSESSING a pointer which points out of bounds? If the architecture really is capable of fine-grained access control, then attempting to use this pointer will result in a fault ANYWAY. This is a completely arbitrary restriction that gives no real benefit whatsoever, while disallowing lots of completely reasonable code.

    And get this: the standard specifically ALLOWS an uninitialized pointer. So somehow, you're going to have to tell the bounds-checking hardware "Hey, ignore this one particular pointer -- it's okay according to the C standard." Which of course forces language-specific concerns into the hardware, which is completely idiotic.

  2. #17
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    And get this: the standard specifically ALLOWS an uninitialized pointer. So somehow, you're going to have to tell the bounds-checking hardware "Hey, ignore this one particular pointer -- it's okay according to the C standard." Which of course forces language-specific concerns into the hardware, which is completely idiotic.
    It would be relatively easy for a compiler to make uninitialized pointers point to NULL or something if the hardware didn't allow uninitialized pointers, but it would be difficult to do the same sort of thing for one-before-the-start pointers. Just a thought.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  3. #18
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Initialisation requires no prior knowledge of the old (undefined) contents.

    So saying
    p = NULL;
    neither loads p into a machine register, nor attempts to dereference it.

    p--;
    on the other hand has the potential for all sorts of problems (of the undefined variety), if p steps off the beginning of an array, or in this case no longer points at a valid object.

    What if said decrement resulted in address underflow, and an immediate trap without even trying to dereference it? The "it hasn't happened to me yet" defence isn't good enough.

    You could argue that in this specific program that there won't be a problem, but it still remains undefined (remember, undefined includes producing the correct answer). But one specific example isn't good enough to extrapolate every single instance, in every single program, compiled with every single compiler for any real or abstract machine.

    Simply having an uninitialised pointer (or anything else for that matter) in scope isn't undefined behaviour. What you do next however is, if that thing isn't initialisation.
    It would be just as dumb to assume that some floating point operation using an uninitialised float would be just as harmless.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #19
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Salem View Post
    What if said decrement resulted in address underflow, and an immediate trap without even trying to dereference it? The "it hasn't happened to me yet" defence isn't good enough.
    I know of no architecture that would do that. So I admit that the code I write is actually a more restrictive version of C/C++ than the standard would indicate -- the code wouldn't work on such an architecture. But I really don't care. It's no different than writing programs that call getch(). Obviously, such programs are restricted to platforms where getch() is available. Whereas my code is restricted to platforms that don't fault when an invalid pointer value is loaded into a variable.

    As I've stated in the past, portability is a spectrum. I understand that some of the code I write is not strictly standard C. But the reality is, it works everywhere. And I'm not talking "I tested it on Windows and Linux." A platform where it didn't work is a platform I would avoid. A system where simply loading an invalid pointer value into a register causes a fault is just plain dumb design.

    It would be just as dumb to assume that some floating point operation using an uninitialised float would be just as harmless.
    If you look at the newsgroup thread that spawned this discussion, there are others who agree with me. This particular facet of the standard is just plain stupid and largely irrelevant. Don't assume that because I choose to ignore this particular thing that I have no clue what portability means.

  5. #20
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    I understand the need to sometimes compromise portability for functionality or efficiency. But can you give an example of a situation where allowing invalid pointer assignments is necessary for this? Loops can normally be trivially modified to avoid them with no loss of efficiency.

    Edit: A typical example is a backwards loop. Most people would write it like
    Code:
    for (p=c.end()-1; p >= c.begin(); --p) {
      // code
    }
    but this generates an invalid iterator at the last iteration. In fact, if the container has zero length, even the initial assignment is invalid. But it can be rewritten as
    Code:
    for (p=c.end(); p != c.begin();) {
      --p;
      // code
    }
    The trick is that the decrement has to be at the beginning of the body, not the end, so --p has to be separate from the rest of the for statement. But it's just as efficient as the other version.

    Edit: Actually, I think only random-access iterators can be compared with >=, and only bidirectional or random-access iterators can be decremented. But the point holds when using arrays and ordinary pointers.
    Last edited by robatino; 04-25-2007 at 07:15 PM.

  6. #21
    Registered User verbity's Avatar
    Join Date
    Nov 2006
    Posts
    101
    It's nice to see that my ignorance can bring all of you into a constructive argument about all this!!!!!


    In the end this is what I did:

    Code:
    void Case::print()
    {
    	char* templwr = new char[m_length + 1];
    	strcpy(templwr, m_buf);
    
    	m_lower = templwr;
    
    	cout << m_name << " \""<< m_buf << "\"  String Length: " << getLength() << endl;
    	csis << m_name << " \""<< m_buf << "\"  String Length: " << getLength() << endl;
    	
    	cout << "Lower Case: " << strlwr(m_lower) <<  endl;
    	csis << "Lower Case: " << strlwr(m_lower) <<  endl;
    
    	char* tempupr = new char[m_length + 1];
    	strcpy(tempupr, m_buf);
    	m_upper = tempupr;
    
    	cout << "Upper Case: "<< strupr(m_upper)<< endl;
    
    	csis << "Upper Case: "<< strupr(m_upper)<< endl;
    
    
    	delete[]templwr;
    	delete[]tempupr;
    
    }
    And the variables were protected because I needed to access them in this class which is a derived class of String.....
    Last edited by Dave_Sinkula; 04-25-2007 at 07:35 PM. Reason: The other magic of putting [code][/code] tags only around code.

  7. #22
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robatino View Post
    I understand the need to sometimes compromise portability for functionality or efficiency. But can you give an example of a situation where allowing invalid pointer assignments is necessary for this? Loops can normally be trivially modified to avoid them with no loss of efficiency.
    Here is the basic idiom:

    Code:
    char *start, *end;
    start = array - 1; /* Oh no, the HORROR! */
    end = array + length;
    while(++start < end)
    {
        /* Do something */
    }
    This is called "backward biasing," at least where I come from. Why do it this way? Because the value of "start" is the same, both inside the condition and inside the loop. Why does it matter? Reduction of confusion. Try writing a few hundred thousand lines of code and it will become clear. When this same pattern occurs hundreds or thousands (literally) of times in a code base, you aren't going to go through and change everything because of some "hypothetical" architecture where it could be problematic.

    There is some funny guy on this board who likes to claim that "walking a pointer is more efficient than indexing an array." This did, in fact, used to be true. And so you see the above pattern in legacy code. All... over... the... place.

    It's not like I'm referencing an undefined pointer. It's a syntactic convenience for crap's sake. I'm not giving it up because the standard says so.

  8. #23
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
    for( start = array,  end = array + length; start < end; start++ )
    To each his own I suppose.


    Quzah.
    Hope is the first step on the road to disappointment.

  9. #24
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by quzah View Post
    Code:
    for( start = array,  end = array + length; start < end; start++ )
    To each his own I suppose.


    Quzah.
    I agree. But I didn't write every line of code I'm responsible for. The fact is, this pattern is everywhere. New code I write usually just uses a for-loop over the index. But I sure as heck am not wasting two years "fixing" all the instances of this.

  10. #25
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    There is some funny guy on this board who likes to claim that "walking a pointer is more efficient than indexing an array." This did, in fact, used to be true. And so you see the above pattern in legacy code. All... over... the... place.
    Actually, I think that discussion was deleted when we had the recent rollback of the forums, heheh.

    The fact is, this pattern is everywhere.
    I had the impression that the for loop version was clearer and more obvious, even excluding the begin/end idiom in the standard library. Then again, maybe my youth (i.e., less exposure to weird "legacy" stuff) has given me a bias to see the loop as clearer and more obvious.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  11. #26
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
    #define while(x) for( ; x ; )
    That'll get you started. I've done all the hard work, I'll leave the rest up do you!


    Quzah.
    Hope is the first step on the road to disappointment.

  12. #27
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    I've never seen backward biasing - to me it looks far more confusing than quzah's version, which is basically how I've always seen forward loops done, and which doesn't generate invalid pointers. And if the increment is inside a for loop, instead of a while loop, it gets done at the end of the body instead of the beginning, which means that start has to be initialized to array instead of array-1.

    Edit: Then again, I've had the luxury of not having to maintain legacy code. But it's one thing to not want to change thousands of lines of other people's legacy code, and another to write such code yourself.
    Last edited by robatino; 04-25-2007 at 10:06 PM.

  13. #28
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robatino View Post
    I've never seen backward biasing - to me it looks far more confusing than quzah's version, which is basically how I've always seen forward loops done, and which doesn't generate invalid pointers.
    It wasn't invented for clarity's sake I don't claim to have invented it, but I deal with code that uses it on a daily basis. I am not keen on fixing what isn't broke. To me it reads naturally, but as Quzah said, to each his own.

    Also, consider that a lot of legacy C code was "hand translated" from other languages during the 80's and early 90's. So a lot of this stuff is a result of directly translating idioms from one language to another, which ends up looking funny. But this particular pattern isn't rare by any means.

  14. #29
    Registered User
    Join Date
    Apr 2007
    Posts
    10
    Well I am quite new to the entire pointer C++ thing, so this seemed like good practice, can someone tell me if I am following through the program right?
    Code:
    void g (int i) { //Passes a copy of the inputted number?
       int*p; //Creates a pointer
       p=&i; //p Points to the address of i (which is a copy of the original number?);
       *p--; //Points to a non-existing address;
    }
    
    void h (int*p) { //Takes a reference to the inputted varible
       *p--; //*Doesnt* deincrement variable since -- is resolved before *
    }
    
    void i (int*p) { //Takes a reference to the inputted varible
       int i; //Creates a local varible i
       i=*p; //Assigns i the value of the inputted varible (why was it assigned to p in the first place?)
       i--; //decrements the local copy
    }
    
    int main() {
       int index=100;
       for(;index>0;index--) {  //Standard for loop.
          printf("*"); 
          f(index);
          g(index);
          h(&index);
          i(&index); 
       }
       //Prints * 100 times since none of the functions do anything
    //Talk about a dumb program
    }

  15. #30
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    which is basically how I've always seen forward loops done, and which doesn't generate invalid pointers
    What about reverse loops? I think that a naive implementation of reverse iterators for certain standard containers could end up doing something like:
    Code:
    for (rbegin = begin + length - 1, rend = begin - 1; rbegin != rend; --rbegin)
    Prints * 100 times since none of the functions do anything
    That's correct. None of the four functions (main() excluded) has any net effect, assuming that this invalid pointer thing also has no net effect.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. pointer to pointer realloc problem
    By prakash0104 in forum C Programming
    Replies: 14
    Last Post: 04-06-2009, 08:53 PM
  2. Replies: 5
    Last Post: 04-04-2009, 03:45 AM
  3. A problem with pointer initialization
    By zyklon in forum C Programming
    Replies: 5
    Last Post: 01-17-2009, 12:42 PM
  4. Problem with function's pointer!
    By Tirania in forum C Programming
    Replies: 5
    Last Post: 11-28-2008, 04:50 AM
  5. towers of hanoi problem
    By aik_21 in forum C Programming
    Replies: 1
    Last Post: 10-02-2004, 01:34 PM