
system calls in linux



kris.c
10-25-2006, 05:37 AM
I am working on a relational database project that involves tweaking the file storage. I need to load a file representing a relation into memory. I thought using mmap() would help, but I am not able to interpret the output. This is what I did:




int main()
{
int fp;
int sz=100;
unsigned char *op;

fp=open("file.c");

op = mmap (NULL, sz, PROT_WRITE, MAP_FILE | MAP_SHARED, fp,0);

printf ("mmap (out) returned %08lx\n", (long)op);

return 0;
}




Now, the man pages say that the call to mmap() returns a pointer to the location where the mapping was placed. So, my questions are:
1) Is this a pointer to a location in main memory? And how do I use this pointer to perform update operations?
2) I got a seg fault when I tried to read the value stored in the pointer.
3) Is mmap() the appropriate sys call to use here, or is there another call that does the job I am looking for?

maxorator
10-25-2006, 05:51 AM
Platform specific... Linux

kris.c
10-25-2006, 05:53 AM
yeah, linux..
Should I post it in some linux forum?
Duh! It's still a C program

Perspective
10-25-2006, 08:51 AM
Does your DBMS support "containers"? You can map a DBMS relation to a physical part of the disk (in this case, the file).

kris.c
10-26-2006, 01:21 AM
No, we are trying to create a mini relational database project from scratch, using C, to support a set of operations.

Salem
10-26-2006, 05:27 AM
> 2)I got a seg fault when I tried to read the value stored in the pointer.
Erm, maybe you need
PROT_READ | PROT_WRITE

> 3)Is mmap() the appropriate sys call to use here or is there any other call that does the job that I am looking for?
Well that depends.
I'd make sure everything worked properly just using fopen / fread / fwrite / fseek.

kris.c
10-26-2006, 12:41 PM
>PROT_READ | PROT_WRITE
yeah, I did that too. I still get the seg fault when I try to read the value stored.
Now that you have brought this up, fopen() also actually maps the file into memory, right? Though not in as explicit a manner as mmap().
And, a slightly different question: can I implement DMA through a C program? Which system call will I need for that?

jim mcnamara
10-27-2006, 08:20 AM
No, fopen doesn't map the file. stdio usually has a buffer that contains a given number of bytes, normally for Linux it is 4096. It isn't populated until your code calls one of the stdio read functions: fgets fread etc.

You can malloc a buffer, using the filesize (+ 1) you got from a stat call, then call fread to read the whole file into a buffer yourself. fclose() the file. If you plan on writing, malloc more space than the filesize, and keep track of what you add to the buffer. Then fopen & fwrite back to the file when you're done. The disadvantage with this is that if the program bombs you lose changes up to the point where it is written back to disk, plus this is a single process only approach.

mmap and msync let multiple processes play with the memory mapped file - BUT read the man pages because MAP_SHARED comes with a lot of "unspecified" behaviors you need to check out first.
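
A minimal sketch of the stat + malloc + fread approach described above. The function name read_whole_file and its return convention are made up for illustration, not taken from any library:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: reads all of `path` into a malloc'd buffer.
 * On success returns the buffer (the caller frees it) and stores the
 * byte count in *len_out; on failure returns NULL. */
char *read_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    struct stat st;
    if (stat(path, &st) != 0)
        return NULL;

    size_t len = (size_t)st.st_size;
    char *buf = malloc(len + 1);          /* +1 for a terminating '\0' */
    if (buf == NULL)
        return NULL;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        free(buf);
        return NULL;
    }

    size_t got = fread(buf, 1, len, fp);  /* one read for the whole file */
    fclose(fp);
    if (got != len) {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

If you plan to append, you would allocate more than len + 1 bytes and track the used length yourself, as described above.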

kris.c
10-28-2006, 10:01 PM
I am afraid that still isn't working.
I have specified PROT_WRITE. Also, with respect to the difference in behaviour between different flavours, the man pages don't elaborate much on that.
And I get the seg fault whether I specify the MAP_SHARED flag or not.

Is there any restriction on the size parameter? Should I make that equal to the block size?
This brings me to two more questions:

1) How do I find out what the block size is for my machine?
and more importantly,
2) How do I access a particular block?

Salem
10-29-2006, 01:15 AM
> fp=open("file.c");
Maybe because open itself also takes a bunch of other parameters as well.

Meh, checking the return result would be good as well. You probably don't even open the file properly given the lack of correct parameters.

Do you get warnings when you compile, or are you just ignoring them?

kris.c
10-29-2006, 06:50 AM
HURRAY!!!! It's working!!!!!




#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

int main()
{
int fp;
int sz=100;
unsigned char *op;

fp=open("file.c",O_RDWR);

op = mmap (NULL, sz, PROT_WRITE|PROT_READ, MAP_FILE | MAP_SHARED, fp,0);

printf ("mmap (out) returned %08lx\n", (long)op);

printf("%s \n",op);
return 0;

}




SALEM, thanks for identifying the flaw. I have become a bit lazy and have stopped error checking in my programs. I needed that.


It's printing out the entire file now.
I still have a few issues regarding detecting the end of the file, but I think I can handle it from here.

Thanks a lot!!!
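
For anyone reading later, here is a sketch of the same idea with the error checking put back in, using fstat() to get the real file length so that the end of the file is known. The helper name map_file is made up for illustration, and this is just one way to arrange it:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps all of `path` read/write and shared.
 * Returns the mapping and stores its length in *len_out, or returns
 * NULL after printing which call failed. */
unsigned char *map_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDWR);
    if (fd == -1) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) == -1) {
        perror("fstat");
        close(fd);
        return NULL;
    }
    if (st.st_size == 0) {               /* mmap rejects a zero length */
        fprintf(stderr, "%s is empty\n", path);
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                           /* the mapping outlives the descriptor */
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    *len_out = (size_t)st.st_size;
    return p;
}
```

With the length in hand, fwrite(op, 1, len, stdout) prints exactly the file contents without relying on a terminating '\0', and munmap(op, len) releases the mapping when you are done.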

kris.c
11-04-2006, 11:41 AM
I realised that I still haven't found the answer to these questions:
1) How do I find out what the block size is for my machine?
2) How do I access a particular block?

Any idea what I should be doing?

Salem
11-04-2006, 01:16 PM
Well having mapped the file, accessing op[pos] is just like seek(pos),getch()

kris.c
11-04-2006, 01:48 PM
no, I am not talking about memory here.. I am referring to blocks on my disk. I need a pointer to each block containing my data file for indexing purposes.

Salem
11-04-2006, 01:53 PM
op is a pointer to memory

If you want to do
result = memcmp( &op[pos], reference, 100 );
Then just do it.

The disk is irrelevant at this point.

kris.c
11-04-2006, 05:25 PM
well, the thing is, lets say I am using primary indexing method. That means, my index file contains two fields.
1) the primary key field value.
2) The pointer to the block containing the record corresponding to the value.
Both my index file and my data file are sorted with respect to the primary key field value.
My data file is stored over many disc blocks.
So, if I need to retrieve a record whose primary key value is known, I need to do a binary search on the index file to find out which disk block can hold the record that I am interested in. For this, I need the pointer to that block. Once I have the pointer, I can load that block into memory and retrieve the tuple.

So, the disk is of relevance here, as I can't store my entire data file in memory. I can only load parts of it as and when required.
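
A "pointer to a block" stored in an index file is normally a block number or byte offset within the data file, not a memory address. Assuming that, loading one block on demand could be sketched with pread(). The helper name load_block and the fixed block_size parameter are assumptions for illustration:

```c
#include <assert.h>
#include <fcntl.h>    /* open(), used by callers to obtain fd */
#include <unistd.h>

/* Hypothetical helper: reads block number block_no (counting from 0)
 * of block_size bytes from the open descriptor fd into buf.
 * Returns the number of bytes read (less than block_size for a final,
 * partial block; 0 past the end of the file) or -1 on error. */
ssize_t load_block(int fd, long block_no, size_t block_size, void *buf)
{
    assert(buf != NULL && block_no >= 0 && block_size > 0);

    off_t offset = (off_t)block_no * (off_t)block_size;
    return pread(fd, buf, block_size, offset);
}
```

Only the block you ask for is copied into your buffer, so the data file can be much larger than the memory you set aside for it.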

Salem
11-04-2006, 07:08 PM
What exactly do you think mmap actually does for you?

If your file is really so big that it would not fit in your process's address space, then mmap would fail.

It's really no different to allocating a big array yourself, then reading the whole file into memory. Once there, you can treat it as any other array in memory, and if it is sorted for example, then you can search it with bsearch().

Maybe practice doing this with regular files before thinking about trying to use mmap.
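
As a sketch of the bsearch() suggestion: assuming the index is already in memory as an array sorted by key, with one entry per record, a lookup might look like this. The struct index_entry layout and the name lookup_block are hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical index entry: one per record, sorted by key. */
struct index_entry {
    int  key;        /* primary key value */
    long block_no;   /* number of the block that holds the record */
};

/* Comparison callback for bsearch(): orders entries by key. */
static int cmp_entry(const void *a, const void *b)
{
    const struct index_entry *x = a, *y = b;
    return (x->key > y->key) - (x->key < y->key);
}

/* Returns the block number for `key`, or -1 if the key is not indexed. */
long lookup_block(const struct index_entry *index, size_t count, int key)
{
    assert(index != NULL || count == 0);

    struct index_entry probe = { key, 0 };
    const struct index_entry *hit =
        bsearch(&probe, index, count, sizeof *index, cmp_entry);
    return hit ? hit->block_no : -1;
}
```

bsearch() only finds exact matches, so an index that stores just the first key of each block would need a hand-written binary search for "largest key not above the target" instead.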

kris.c
11-08-2006, 05:43 AM
Ok, for the time being forget about the mmap() question that I had initially asked. Is there a system call that allows me to detect the block size on my disc?

Salem
11-08-2006, 06:44 AM
Probably (or maybe), but again, what does the physical (or logical) block size have to do with anything?

Your data has a record size of 's'
you get to the 'n'th record by doing fseek( fp, s * n, SEEK_SET );

Then you read the record by doing
fread( &rec, 1, sizeof rec, fp );

Voila.

Just let the OS worry about which disk block(s) your record maps to.
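
The fseek / fread steps above, put together as a sketch. The struct record layout and the name read_record are hypothetical; any fixed-size record would work the same way:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed-size record layout. */
struct record {
    int  key;
    char name[28];
};

/* Reads record number n (counting from 0) into *out.
 * Returns 1 on success, 0 on a seek error or a short read. */
int read_record(FILE *fp, long n, struct record *out)
{
    assert(fp != NULL && out != NULL && n >= 0);

    /* Record n starts n * sizeof(record) bytes into the file. */
    if (fseek(fp, n * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof *out, 1, fp) == 1;
}
```

Asking for a record past the end of the file simply returns 0, because fread() comes up short.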

kris.c
11-08-2006, 07:17 AM
Well, the requirement is that the node size in my B+ tree is the size of the block. So, I need to get the block size. Also, I am compelled to work on Windows, so I have to find out if there is such a call for Windows.

Salem
11-08-2006, 10:54 AM
Unless the size is already pretty close to say 512 or 4096, it sounds like an awful waste of space.

> also, I am compelled to work on windows,
So why is this posted in the Linux forum, and why are you using mmap(), which isn't available on windows AFAIK.

kris.c
11-08-2006, 11:23 AM
>Unless the size is already pretty close to say 512 or 4096,
That's what I was speculating too. But my prof told me so. He said I have to use a system call to ascertain the actual block size. Do you think MS publishes a list of the system calls that are available to developers? Are you aware of any such links?

>it sounds like an awful waste of space.
Really? I can have many data pointers in one node right? So, instead of having a large number of small nodes, I have few nodes that are as large as my disc block

>So why is this posted in the Linux forum, and why are you using mmap(), which isn't available on windows AFAIK.
Well, I had started off on Linux, but I had to switch as my teammates aren't that familiar with Linux.

Salem
11-08-2006, 11:33 AM
Random example which has something to do with disk sizes
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wmisdk/wmi/win32_logicaldisk.asp

kris.c
11-08-2006, 11:44 AM
well, thanks for that ..
Is it true that I need to have the Cygwin library installed to execute system calls from within my C program on Windows?

kris.c
11-08-2006, 12:01 PM
It does have a data type of "uint64" for the blocksize.
Can you tell me how I actually use this? I have not scripted on Windows before.