Multi-dimensional arrays Professionally

**MGrancey** · 12-18-2008

I just finished an intro to programming class with C++. It wasn't my first, but my earlier ones weren't accepted by the college.

The teacher, who admits he's not a programmer, had us visualize two-dimensional arrays as var array[y][x];, where x and y are what you would use on a graph. Or, as on a spreadsheet table: go down, then go across.

I could follow when I was in class with him teaching, but on my own, I do it [x][y]. Is one used more professionally? I would think that my way is more popular, as that is how math does it and math is universal. The book supported his method, but I still feel that it's wrong somehow.

**laserlight** · 12-18-2008

You might want to read this excerpt from a book: memory layout. In particular, take note of the discussion of caching and accessing a large multi-dimensional array.

**brewbuck** · 12-18-2008

Originally Posted by MGrancey

I could follow when I was in class with him teaching, but on my own, I do it [x][y]. Is one used more professionally? I would think that my way is more popular, as that is how math does it and math is universal. The book supported his method, but I still feel that it's wrong somehow.

Neither method is wrong. It's not justified to ascribe X or Y to either dimension as if it was some intrinsic thing. Typically the last dimension is called the "row" but this is only because the last dimension is the one which is laid out contiguously in memory.

You might say "Okay, then the last dimension is naturally the X dimension," but that assumes that X measures the distance along a row. That's purely conventional. What matters is that iteration through the last dimension is contiguous iteration, meaning it has good cache behavior.

So if your algorithm is primarily focused on column-wise manipulation, then you would make the "columns" be the last dimension. But if you are processing primarily row-wise, you might want to consider the last dimension to be the "rows" for cache reasons.

In practice, clarity will often trump performance. This is domain-specific.
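To make the contiguity point concrete, here is a minimal sketch. The array name `grid`, its 3 by 4 size, and the helper names are made up for illustration and are not from the thread. It measures how far apart two elements sit in memory, counted in `int`s.

```cpp
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
constexpr std::size_t kOuter = 3;  // size of the first dimension
constexpr std::size_t kInner = 4;  // size of the last dimension

// Byte distance from base to p; both must point into the same object.
std::ptrdiff_t byteOffset(const void* base, const void* p) {
    return static_cast<const unsigned char*>(p) -
           static_cast<const unsigned char*>(base);
}

// Distance of grid[i][j] from grid[0][0], counted in ints.
// Assumes i < kOuter and j < kInner (no bounds checking in this sketch).
std::ptrdiff_t elementOffset(std::size_t i, std::size_t j) {
    static const int grid[kOuter][kInner] = {};
    return byteOffset(&grid[0][0], &grid[i][j]) /
           static_cast<std::ptrdiff_t>(sizeof(int));
}
```

Stepping the last index by one moves one element (`elementOffset(0, 1)` is 1), while stepping the first index by one skips a whole inner array (`elementOffset(1, 0)` is 4). Whether you call that inner array a "row" or a "column" is up to you.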

**core_cpp** · 12-18-2008

Originally Posted by brewbuck

Typically the last dimension is called the "row" but this is only because the last dimension is the one which is laid out contiguously in memory.

Huh? Yes, the last dimension is the one which is laid out contiguously in memory. So the last subscript refers to the column within that row.

int arr[ROWS][COLS];

Now I know this is what you meant, I just don't see how the statement "typically the last dimension is called the row" makes any sense. Semantics? Perhaps. Sure has me scratching my head; still, even.

**MGrancey** · 12-18-2008

His examples were mainly for visualization. My question was mostly aimed at the idea of one being standard for industry.

I know that the last dimension added is the one that is laid out contiguously in memory, which is what helps with the cache. I was just confused over his way of doing it.

For example - "Which of the following is the statement that declares a multiple dimensional character array that contains 10 words with 40 characters in each word?"

The correct answer on the test is: char X[40][10];
I selected char X[10][40];

All of the array questions ended up like that.

**Adak** · 12-18-2008

I know the graphics functions typically use column major ordering: A[col][row], which matches up with things like gotoxy(x,y).

For our own arrays, I've always used A[row][col], and since it's worked, I don't quite see how it could have really been A[col][row].

I'm going to check with my K&R, but I'm sure I got the idea of A[row][col] being correct, right from them. I realize this is a conceptual thing, and that our memory is not physically laid out (and thus changed), when we change our array parameters. But conceptually is what we're talking about, not soldering on new wires in our RAM.

In this short program to parse words out of text, I use Words[row][col], for example. I use a for loop like this:

Code:

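/* Assume the word is new until a match is found. */
notFound = 1;
for(i = 0; i < MaxArrayRows; i++)  {
   /* Words[i] is row i: one whole word, stored contiguously. */
   if(strcmp(newlyFoundWord, Words[i]) == 0)  {
      notFound =  0;
      break;
   }
}

/* No match: copy the new word into the next free row. */
if(notFound)
   strcpy(Words[countOfWordsFoundSoFar], newlyFoundWord);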

Which puts newly found words, into the Words[][] array. If Words[i] and Words[countOfWordsFoundSoFar], did not correspond conceptually, with a pointer to a row in the Words[][] array, then why does this code work without a hitch?

The Words array is sized at Words[1000][25]. So it's not like it's a convenient square array, that would work column major or row major.

Looks like row major ordering to me.

*countOfWordsFoundSoFar is just named long for clarity, here. In the program, it's just "count".

**The book link posted above showed only "The book could not be viewed..."

**R.Stiltskin** · 12-18-2008

For what it's worth, in every math reference I've ever seen, an m x n matrix has m rows and n columns. Also, in image files - .bmp, .pgm, etc. - the pixel data is laid out row by row. If you read a text file into an array, it seems more natural to read the line of text into the first row, the 2nd line into the 2nd row, etc. I would find it confusing to deal with these things column-first.

I've often wondered why screen resolutions and photos are usually described the other way: 640 x 480, 1600 x 1200 and so on. Is that just a relic from days gone by?

Also, in Fortran, memory is allocated to arrays column-wise rather than row-wise as in the C family. Don't know if there's a particular reason for that or if that's just the way it is. But even in Fortran, whatever code I've seen (admittedly not very much) had the arrays written with the row dimension first. But in loops, they iterate through them with the column on the outer loop and the row on the inner loop.
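The difference between the C family and Fortran comes down to which index is multiplied when a 2-D position is mapped to a flat memory offset. A rough sketch with zero-based indices (the function names are invented for this example):

```cpp
#include <cstddef>

// Row-major (C and C++): elements of one row are adjacent in memory,
// so the row index is scaled by the number of columns.
constexpr std::size_t rowMajorOffset(std::size_t row, std::size_t col,
                                     std::size_t numCols) {
    return row * numCols + col;
}

// Column-major (Fortran): elements of one column are adjacent in memory,
// so the column index is scaled by the number of rows.
constexpr std::size_t columnMajorOffset(std::size_t row, std::size_t col,
                                        std::size_t numRows) {
    return col * numRows + row;
}
```

For a 3 x 4 matrix, element (row 1, column 2) lands at offset 6 in row-major order and at offset 7 in column-major order. This is also why Fortran loops tend to put the row index on the inner loop: that is the index that walks memory in address order there.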

**R.Stiltskin** · 12-18-2008

Originally Posted by laserlight

You might want to read this excerpt from a book: memory layout. In particular, take note of the discussion of caching and accessing a large multi-dimensional array.

That link is no good. Please give a title/author/page reference.

**laserlight** · 12-18-2008

Originally Posted by R.Stiltskin

That link is no good. Please give a title/author/page reference.

Section 7.4.3 (pages 358 to 360) of Programming Language Pragmatics by Michael Lee Scott. I believe most introductory texts on computer organisation/architecture would have a word to say about this, so the material is probably not new to you.

**Daved** · 12-19-2008

>> The correct answer on the test is: char X[40][10];
>> I selected char X[10][40];

This is interesting. A matrix's rows and columns can probably go either way, but a string of characters has a well-defined use in C and C++. For example, if you wanted to use strcpy (or strncpy) to copy one of the ten strings you would have to use your version:

Code:

#include <iostream>
#include <cstring>

int main()
{
    using namespace std;
    char X[10][40] = { "Hello World!" };

    char dest[40];
    strncpy(dest, X[0], 40);

    cout << dest << '\n';
}

Try to do the same thing with his version and it doesn't work. I think your answer is correct there.
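Another way to see the difference between the two declarations is to ask the compiler how big one "slot" is. This is only a sketch: the names `byWord`, `swapped`, and `storeWord` are made up here, and the sizes come from the quiz question.

```cpp
#include <cstddef>
#include <cstring>

// Sizes taken from the quiz question: 10 words, 40 characters each.
constexpr std::size_t kWords = 10;
constexpr std::size_t kChars = 40;

char byWord[kWords][kChars];   // 10 slots of 40 chars each
char swapped[kChars][kWords];  // 40 slots of 10 chars each

// The first subscript selects a slot; its size is set by the last dimension.
static_assert(sizeof(byWord[0]) == 40, "each slot holds a 40-char buffer");
static_assert(sizeof(swapped[0]) == 10, "each slot holds only 10 chars");

// Copies word into slot index of byWord if it fits, terminator included.
bool storeWord(std::size_t index, const char* word) {
    if (index >= kWords || std::strlen(word) >= kChars) {
        return false;  // bad slot, or word too long for a 40-char buffer
    }
    std::strcpy(byWord[index], word);
    return true;
}
```

With `char X[40][10]`, `X[0]` is a 10-character buffer, so any word longer than 9 characters plus the terminator would not fit in it.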

**anon** · 12-19-2008

I am always having enormous trouble with it. (When I do a game, why is everything drawn sideways?) As a beginner I got confused with this quite often, and it's quite possible that my greatest evar program contains functions that are supposed to be called both ways:

Code:

int a(int x, int y);
int b(int y, int x);

It was quite hard to wire it all together (though the behaviour is usually easily observable), particularly since I didn't even discover the inconsistency early enough (start to compile, see what happens, fix a call, repeat).

Just decide on a way and stick with it. (y, x) or (row, column) seems to make more sense to me.

(I guess what confused me most was that sometimes I used x and y, and other times row and column. Now I'd mostly just use row and column.)
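One way to "decide on a way and stick with it" is to let the compiler enforce the choice. The following is just one possible sketch, not something anyone in the thread proposed: `Row`, `Col`, and `Board` are hypothetical types that make a swapped call fail to compile.

```cpp
#include <cstddef>

// Thin wrapper types so a row index cannot be passed where a column is expected.
struct Row { std::size_t value; };
struct Col { std::size_t value; };

// Hypothetical board size, for illustration only.
constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 5;

struct Board {
    int cells[kRows][kCols] = {};

    // The only accessor: always (row, column), in that order.
    int& at(Row r, Col c) { return cells[r.value][c.value]; }
};

// Writes through the typed accessor and reads back through the raw array.
int demoBoard() {
    Board b;
    b.at(Row{2}, Col{4}) = 7;
    // b.at(Col{4}, Row{2}) = 7;  // would not compile: argument types swapped
    return b.cells[2][4];
}
```

It is more typing than plain `int` parameters, so whether it is worth it depends on how often the mix-up bites you.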

**VirtualAce** · 12-19-2008

I don't know that there is a 'standard'. Video memory is normally y * pitch + x, or y * width + x. Direct3D is y * (pitch >> 2) + x which means surfaces are in row,column order.

But when it comes to 4x4 matrices which are essentially similar to 4x4 arrays Direct3D uses left handed by default and OGL uses right handed by default. So the transpose of a Direct3D matrix is the correct matrix for OGL and vice versa. But the docs tell you exactly what order the rows/columns are expected to be in so it's not much of an issue.

Basically as anon said...decide on a way and stick with it. I rarely use 2 dimensional arrays but when I do it makes more sense to me for them to be row,column.
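The `y * (pitch >> 2) + x` formula can be sketched as follows. The `Surface` struct and helper names are hypothetical and not part of any graphics API. The sketch assumes 32-bit pixels and a pitch given in bytes, which is why the pitch is divided by 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical surface: pitchBytes is the byte length of one row in memory,
// which may be larger than width * 4 because of padding.
struct Surface {
    std::size_t width;
    std::size_t height;
    std::size_t pitchBytes;
    std::vector<std::uint32_t> pixels;  // one 32-bit value per pixel slot
};

// Allocates a zeroed surface; assumes pitchBytes is a multiple of 4.
Surface makeSurface(std::size_t width, std::size_t height,
                    std::size_t pitchBytes) {
    return Surface{width, height, pitchBytes,
                   std::vector<std::uint32_t>((pitchBytes / 4) * height, 0)};
}

// Index of pixel (x, y): skip y whole rows, then move x pixels along the row.
std::size_t pixelIndex(const Surface& s, std::size_t x, std::size_t y) {
    return y * (s.pitchBytes >> 2) + x;
}
```

For a surface 5 pixels wide with a 32-byte pitch, each row occupies 8 pixel slots, so pixel (x = 2, y = 1) is at index 10. The y coordinate is the slow-moving one, which is the row,column order described above.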

**brewbuck** · 12-19-2008

Originally Posted by core_cpp

Huh? Yes, the last dimension is the one which is laid out contiguously in memory. So the last subscript refers to the column within that row.

You're making some weird assumptions:

1) That contiguous memory addresses are somehow "horizontal." What if you turn your computer sideways?

2) That a "row" of something is also "horizontal."

The only thing that matters about a multidimensional array is that the elements along the final dimension are contiguous and therefore iterating along this dimension is more efficient than along any other dimension. So the meaning of the individual dimensions should be based on the manner in which you process the data.

In raster imaging, for instance, many operations are carried out horizontally but they could just as well be vertical. Conventionally, the contiguous dimension is horizontal. But imagine a monitor screen turned sideways. Now the "rows" are actually columns and processing that raster buffer "row"-wise is going to be slower than going by "column." And I keep quoting everything because the terms are meaningless on their own -- they only distinguish between two orthogonal dimensions.

Imagine a 1-dimensional array. Is that a "row" of values or a "column"?

This reminds me of the endian debates. Some people get confused trying to think about "left" and "right" when the real matter is the address ordering of words within a multi-byte object.
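The point about iterating along the final dimension can be sketched with two loops over the same array. The names and the 3 by 4 size are made up for this example. Both functions compute the same sum; only the order in which memory is touched differs.

```cpp
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
constexpr std::size_t kOuter = 3;
constexpr std::size_t kInner = 4;

// Fills grid[i][j] with its flat position i * kInner + j (0, 1, 2, ...).
void fill(int (&grid)[kOuter][kInner]) {
    for (std::size_t i = 0; i < kOuter; ++i)
        for (std::size_t j = 0; j < kInner; ++j)
            grid[i][j] = static_cast<int>(i * kInner + j);
}

// Inner loop walks the last index: memory is visited in address order.
long sumLastIndexInner(const int (&grid)[kOuter][kInner]) {
    long total = 0;
    for (std::size_t i = 0; i < kOuter; ++i)
        for (std::size_t j = 0; j < kInner; ++j)
            total += grid[i][j];
    return total;
}

// Inner loop walks the first index: each step jumps kInner elements ahead.
long sumFirstIndexInner(const int (&grid)[kOuter][kInner]) {
    long total = 0;
    for (std::size_t j = 0; j < kInner; ++j)
        for (std::size_t i = 0; i < kOuter; ++i)
            total += grid[i][j];
    return total;
}

// Builds a small grid and sums it with the requested loop order.
long demoSum(bool lastIndexInner) {
    int grid[kOuter][kInner];
    fill(grid);
    return lastIndexInner ? sumLastIndexInner(grid) : sumFirstIndexInner(grid);
}
```

On an array this small the difference is not measurable. It only starts to matter once the array is much larger than the cache, and neither loop cares whether you think of the last index as "x", "y", "row", or "column".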

**core_cpp** · 12-19-2008

If the terms are meaningless by themselves (and I would tend to agree), then why did you make the comment that "Typically the last dimension is called the 'row'"?

That one sentence alone is the only thing I take issue with.

Edit: And yes, I am making the weird assumption that a row is horizontal, only in the context where rows AND columns are both mentioned. I do not think this is such a weird assumption.

From Merriam-Webster:

row (noun)
4b: a horizontal arrangement of items

C is row major. I don't think there's any argument there. So to say that the second dimension is the row is exactly contrary. Even if this is all meaningless, which it is.

**shoutatchickens** · 12-19-2008

The argument wasn't about the definition of row, but rather the assumption that it's stored in a nice little grid in your system's memory.
