plzzzzzz... i need a help in natural merge sort

**Kanshu** · 05-09-2009

Hmm. Let me restate that. The myfile.read() did not trigger the myfile.eof() condition for the last record.

**Kanshu** · 05-09-2009

I've read about the merge sort algorithm at wikipedia.org:

Merge sort - Wikipedia, the free encyclopedia

I wanted to confirm that the algorithm uses recursion. It does. So, now I'm wondering about your implementation using 3 global binary files. Usually, the later recursive calls require private storage for successively smaller bits of data. The contents are discarded once the call terminates. You, on the other hand, are using common or global storage (i.e., binary files) to store data during recursive calls. How is that even possible?

You may want to use global arrays first to test your implementation. This will eliminate a lot of lines related to reading from and writing to the binary files. If the global arrays work, it should be simple enough to duplicate using files.

Over the top of my head, I think you should use temporary binary files to write the result of a call to merge/sort. This will be similar to local arrays during recursive calls to merge/sort.

**iMalc** · 05-10-2009

The method being used is okay. It's called External Sorting, and it's what we use when the data to be sorted doesn't all fit in RAM at once.
The files are supposed to contain decreasing numbers of runs of ever increasing lengths as time goes by.

You shouldn't have to get as high as 90 to cause a problem. You need to try it with much lower numbers and run it several times until it fails, then hard code the exact strings it gave you to make it reproduceable.
If there's a bug then it's highly likely that it'll occur with fewer than 10 items.

**Kanshu** · 05-10-2009

Do you mean to say that one source file and two working files are enough to implement the algorithm? (From the looks of it, he is using the source file as a working file also.)

I'm skeptical, but interested in the technique.

--------------------
By the way, can you comment/confirm/verify my earlier post on the myfile.read(). When I ran it, it was not triggering immediately the myfile.eof() for the last record. Since he is using myfile.read() all over, this may be affecting the solution.

**iMalc** · 05-11-2009

Originally Posted by Kanshu

Do you mean to say that one source file and two working files are enough to implement the algorithm?

Yes absolutely. The merge process has the extra complication that the streams of data being merged occasionally have items out of order, but you simply check for this every time you advance each data stream. The number of comparisons approximately doubles, but it's still on the order of O(n * log(n))

By the way, can you comment/confirm/verify my earlier post on the myfile.read(). When I ran it, it was not triggering immediately the myfile.eof() for the last record. Since he is using myfile.read() all over, this may be affecting the solution.

Yes that sounds right. Eof isn't true until you're tried reading past it, iirc. I actually very very seldom read from files directly, surprisingly.

**sam-j89** · 05-11-2009

Hello … thank you guyz a lot for helping me through the past two weeks
Finally .. my program worked successfully without any problems even with great values for Nmax , and here is an explanation to my problem :
In the fucnction copy_sequence:

Code:

while ((!f1.eof())&&(w1 >= w2))
{
	f2.write(w1,6);
	w2 = w1;
	f1.read(w1,6);
}

In case of two successive identical words , I consider that the preceding word is less than the succeeding one . that is , the c c c , for example, is a sequence according to the above assumption.. but on the contrary , in this segment:

Code:

if (( wrd >= w1)||(f1.eof()))
{
	wrd = w2;
	myfile.write(w2,6);
	copy_sequence(f2,myfile,w2,wrd);
	end_sequence = true;
}

When it comes to an identical-words sequence , I consider that the condition (wrd>=w) means this is not a sequence so I move to the next file to read from.
that is why dropping the (=) in (wrd >= w1) solved the problem!!!!!
so .. when Nmax gets bigger it is far more likely to have a successive identical words , that would cause the prolem!!
so thank u again iMalc for your useful advices you gave to me ... see you later bro

**Kanshu** · 05-11-2009

Originally Posted by iMalc

Yes absolutely. The merge process has the extra complication that the streams of data being merged occasionally have items out of order, but you simply check for this every time you advance each data stream. The number of comparisons approximately doubles, but it's still on the order of O(n * log(n))

Hmm. I should study this algorithm. Good thing it's finally fixed.

Originally Posted by iMalc

Yes that sounds right. Eof isn't true until you're tried reading past it, iirc. I actually very very seldom read from files directly, surprisingly.

And this one also.

**Salem** · 05-11-2009

> while (!myfile.eof())
Why you shouldn't do this (and the observation of what happens if you do) is in the FAQ.

**iMalc** · 05-11-2009

Well done solving it yourself!

See you around

Thread: plzzzzzz... i need a help in natural merge sort

Thread Tools

Search Thread

Display

Similar Threads

Merge Sort Help

threaded merge sort

merge sort tutorial from cprogramming.com

Natural Merge Sort

help with merge sort