If you're referring to my post: you can use the pointers to access the rows directly, and you can determine the current length of each row by counting the chars between the pointer target and the next '\n' (not '\0') to verify that a cursor position is valid. The memory address to the right of the cursor would be row_ptr[row] + column.
There is no deeper connection between the cursor and the data. You only need to know where the new letters will end up in memory, and what's behind (to the right of) the cursor. Both are easy enough to access. Using a secondary buffer for changes ensures that you don't have to realloc one large buffer all the time, and that you don't clutter your heap with 178,898... mini-mallocs if the file is a little larger than usual.
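To make the row-pointer idea concrete, here is a minimal sketch, assuming a single buffer with embedded '\n' separators and an array of row pointers into it. The function names (row_len, cursor_addr) are mine, not from the thread:

```c
#include <assert.h>
#include <string.h>

/* Length of the current row: count the chars between the pointer
 * target and the next '\n' (or the terminating '\0' on the last row). */
static size_t row_len(const char *row_start)
{
    const char *nl = strchr(row_start, '\n');
    return nl ? (size_t)(nl - row_start) : strlen(row_start);
}

/* The memory address to the right of the cursor is simply
 * row_ptr[row] + column, once the column is known to be valid. */
static char *cursor_addr(char **row_ptr, size_t row, size_t column)
{
    assert(column <= row_len(row_ptr[row]));
    return row_ptr[row] + column;
}
```

So validating the cursor and finding the insertion address are both one-liners once the row pointers exist.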
main() { int O[!0<<~-!0]; (!0<<!0)[O]+= ~0 +~(!0|!0<<!0); printf("a function calling "); }
This is WRONG. Particularly the last sentence. The newline HAS NOTHING TO DO with making it easy for the computer to store rows of text. The newline is a formatting convention. The x and y position of the cursor, the appearance of the text on the screen*, etc., HAVE NOTHING TO DO WITH THE NEWLINE.
As you can see, the text here is organized in rows and columns, but I will tell you right now that the rows are NOT created with newlines!!! The text is wrapped by the display; there is no software (e.g., phpBulletin) that inserted newlines or anything else to make the rows appear.
With blocks of text, a newline indicates a break that could occur ANYWHERE in a physically displayed row, NOT necessarily at the end.
If you have a 1500-byte string, you could represent the rows as they will appear on the screen by determining the screen width and keeping an array of pointers at that interval. YOU DO NOT INSERT NEWLINES INTO THE TEXT. NO EDITOR** DOES THIS. ONLY THE USER INSERTS A NEWLINE. You do not have to manually wrap text. The wrap occurs automatically, determined by the width of the display area.
* beyond the user-determined intermittent line breaks, WHICH HAVE NO REGULAR INTERVAL.
** except for an email client.
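Here's a rough sketch of what "keeping an array of pointers at that interval" could look like. The name wrap_rows is mine, and for brevity this ignores user-typed newlines; the point is that no '\n' is ever written into the text, so a display-width change just means rebuilding the pointer array:

```c
#include <string.h>

/* Record a row pointer into `text` every `width` characters.
 * The "rows" exist only in the pointer array; the text itself is
 * never modified. Returns the number of rows recorded. */
static size_t wrap_rows(char *text, size_t width, char **rows,
                        size_t max_rows)
{
    size_t len = strlen(text), n = 0;
    for (size_t off = 0; off < len && n < max_rows; off += width)
        rows[n++] = text + off;
    return n;
}
```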
Last edited by MK27; 09-04-2009 at 03:32 AM.
C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1 -- #2 -- #3
cpwiki -- our wiki on sourceforge
I know that, but in an editor the user sees the text in an x-y grid and moves around and edits that way, so storing the text that way in the editor makes editing more efficient, without having to tediously translate between 1 and 2 dimensions.
All you have to do is take y*width + x and that will give you the character position in the string. So if you are on the 3rd line, 10th column of an 80 character display, your y is 2 (starting at 0) and your x=9, so you are at 2*80+9 = string[169]. Wow was that tedious. Of course, this does not take into account the newline, which is where the pointer array would come in. You position the pointers every 80 bytes OR after the next newline, and map the y coordinate to them. So a better algorithm would be to use ptr[2] + x, presuming you have a simple function that can assign such a pointer array to a string.
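The two mappings just described can be sketched in a few lines; the function names here are mine. flat_index() is the naive y*width + x arithmetic, and grid_addr() is the pointer-array version, assuming line_ptr[] has been assigned every 80 bytes or after each newline as described:

```c
#include <stddef.h>

/* Naive mapping: character position in a flat string. */
static size_t flat_index(size_t y, size_t x, size_t width)
{
    return y * width + x;   /* e.g. y=2, x=9, width=80 -> 169 */
}

/* Pointer-array mapping: line_ptr[y] already accounts for any
 * newlines, so the address is just an offset from the row start. */
static char *grid_addr(char **line_ptr, size_t y, size_t x)
{
    return line_ptr[y] + x;
}
```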
What would really be tedious is if you spared yourself such an immense complication by using a 2D array, so you could say: look, it's simple, now I am at array[2][9]! True, but if you have to keep shifting the text in that array, you will be performing significantly more operations for every insertion and deletion, because with a string all you have to do is split it at the appropriate point, add or remove the appropriate text, and put it back together. Now picture doing that to the first paragraph of a 20-page document that has been split into 800 separate strings. Of course, you only have to shift the data up to the next incomplete row (a line ending with a newline, like the end of a paragraph) -- so if that was, say, 10 lines (meaning shifting the data in all ten following rows), then after that all you have to do is renumber the 785 remaining array elements*. Hmm.
In short, I promise that using a 2D array in that manner will mean:
- Having to write MORE code, not less
- The final product will be wasteful of resources, i.e., slow and inefficient
Nyda is talking about the same idea as I am, except that s/he has a slightly different approach to managing the memory. I don't like the secondary buffer idea, allocating 1K at a time is plenty for general use. Make it 4k even -- big deal.
* as Dino pointed out, a linked list will spare you that last part, but not the fun with the ten shifted lines. I still cannot comprehend how you think either of those options could possibly be easier, more optimal, or less tedious than just using a text string.
Last edited by MK27; 09-04-2009 at 09:10 AM.
Here's what I would do.
1. Each line is in its own buffer.
2. The document is represented as an array of buffer pointers. This array is dynamic and can grow as more lines are added. I would not bother ever shrinking the array when a line is deleted.
3. The buffers are drawn from buffer pools of fixed sizes. There would be a pool for lengths 16, 32, 64, 128, 256, etc., in progression.
Line buffers are NOT resized at every insertion/deletion. Only when the fill of the buffer falls to 1/2 would the buffer be swapped for the next smaller size, or when the buffer fills it would be swapped for the next larger size. When the buffer pool is depleted more buffers are allocated for the pool.
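The sizing rules above can be sketched as two small helpers (the names pool_size_for and next_capacity are mine): pick the smallest pool class that holds a line, and only swap buffers at the fill-drops-to-half or buffer-full thresholds, which gives hysteresis so typing a character at a boundary doesn't thrash between pools:

```c
#include <stddef.h>

/* Smallest pool class (16, 32, 64, ...) that can hold `len` bytes. */
static size_t pool_size_for(size_t len)
{
    size_t cap = 16;
    while (cap < len)
        cap <<= 1;
    return cap;
}

/* Swap rule: grow when the buffer fills; shrink only once the fill
 * has fallen to half the current capacity. Otherwise keep the buffer. */
static size_t next_capacity(size_t cap, size_t fill)
{
    if (fill >= cap)                  /* full: next larger size       */
        return cap << 1;
    if (cap > 16 && fill <= cap / 2)  /* half empty: next smaller size */
        return cap >> 1;
    return cap;
}
```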
Code://try //{ if (a) do { f( b); } while(1); else do { f(!b); } while(1); //}
Did you read the entire paragraph?!??
The width is a fixed value returned by getmaxyx() (or whatever). Again, Nyda said pretty much the same thing.
brewbuck's idea is not a bad one, since I believe by "line" he means a string ending in '\n', not a physical line on the screen. But I still don't think it will give you any advantage unless you are dealing with ridiculously large files, which text files never are, and it will still require more coding (for nothing).
Last edited by MK27; 09-04-2009 at 11:23 AM.
That sure seems like a huge waste of memory, considering the huge variation in line length in the average source code file.
In the attachment I have a diagram of my idea:
Something else I haven't seen mentioned is SGI's "rope" data type: rope<T, Alloc>
It is specifically geared for string manipulation tasks and has some interesting efficiency properties.
Right, now think about it this way:
As mentioned, you need a function to go thru the string and assign the pointers (and possibly expand the array)*. So now you can modify the string (easy) and reassign the pointers, also easy.
Code:char string[4096] = "hello\nworld"; char **ptrs = malloc(3 * sizeof(char *)); ptrs[0] = &string[0]; ptrs[1] = &string[6];
BTW: The approach that uses the least memory is, by definition, the single string.
* so maybe go thru the string, record the positions where the pointers should go, and then you will know how many (more) you need before the actual assignment. Of course, "string" should be realloc'able as well.
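A sketch of that two-pass approach (all names here are mine): first go thru the string and count where row pointers should go -- every `width` chars, or right after a '\n' -- then allocate exactly that many and assign them:

```c
#include <stdlib.h>

/* Pass 1 counts the rows; pass 2 assigns a pointer at each row start.
 * A row starts at the beginning, after a '\n', or when `width`
 * characters have been consumed since the last row start. */
static char **index_rows(char *s, size_t width, size_t *out_n)
{
    size_t n = 0, col = 0;
    for (size_t i = 0; s[i]; i++) {
        if (col == 0)
            n++;                      /* a new row starts at s + i */
        col = (s[i] == '\n' || col + 1 == width) ? 0 : col + 1;
    }
    char **rows = malloc(n * sizeof *rows);
    if (rows) {
        size_t r = 0;
        col = 0;
        for (size_t i = 0; s[i]; i++) {
            if (col == 0)
                rows[r++] = s + i;
            col = (s[i] == '\n' || col + 1 == width) ? 0 : col + 1;
        }
        *out_n = n;
    }
    return rows;
}
```

After any edit you just rerun this over the (possibly realloc'd) string and the pointer array is fresh again.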
Last edited by MK27; 09-04-2009 at 12:18 PM.
Always do it the same way: split the string at the insertion point*. Concat the insertion onto the first string. Concat the three strings together and assign to the original pointer.
So with a really big file, this could become a performance hit vs. using an array. However, I think a file that big will be an exception, not the rule. "The rule" will be that all the extra operations necessary with a list or array will be a performance hit vs. the string. In other words, the string is optimal in most cases.
You could easily test this by timing strcat (or just strcpy) on a couple of, say, 500 KB strings (and think: when was the last time you saw a 500 KB text file? That's like, the bible...). But I don't think it will be anything noticeable.
Now, unless gcc optimizes to recognize that 99 of those might as well be no-ops (which I doubt) I would say it took 0.0 seconds on my computer to strcpy the bible a hundred times.Code:char one[500000], two[500000]; memset(two, 'x', sizeof(two) - 1); two[sizeof(two) - 1] = '\0'; for (int i = 0; i < 100; i++) strcpy(one, two);
* by which I mean: if the insertion point is [666], copy [666] through the end into another string, insert at 666, add '\0', and concat the copy onto the end (allocating another KB if necessary). The array version is WAY more work to no advantage.
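A sketch of that splice, done in place with memmove rather than a temporary copy (a variation on the footnote, not a quote of it; the name insert_at is mine). `cap` is the allocated size of `s`; a real editor would realloc another KB here instead of failing:

```c
#include <string.h>

/* Shift the tail (including its '\0') right by the insertion length,
 * then drop the new text into the gap. Returns -1 if `s` is too small
 * or `pos` is out of range. */
static int insert_at(char *s, size_t cap, size_t pos, const char *ins)
{
    size_t len = strlen(s), add = strlen(ins);
    if (pos > len || len + add + 1 > cap)
        return -1;                    /* caller must grow the buffer */
    memmove(s + pos + add, s + pos, len - pos + 1);
    memcpy(s + pos, ins, add);
    return 0;
}
```

memmove is used instead of memcpy for the shift because the source and destination overlap.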
Last edited by MK27; 09-04-2009 at 12:38 PM.
Unless you are grossly wasting memory I wouldn't worry about it. Even very large text files are usually only a few megabytes. If your memory wastage is a full 100% you're still only talking about a few megabytes.