# Thread: Peterson's Algorithm on Producer/Single Consumer Question

1. ## Peterson's Algorithm on Producer/Single Consumer Question

I'm trying to make some code I've written run faster. I have a single produer and single consumer at the moment. I am using semafores and mutexes to control the critical sections. I have been reading and this does take quite a bit of overhead.

I stumbled across this from wikipedia, https://en.wikipedia.org/wiki/Peterson%27s_algorithm.

Has anyone used it before? I read some conflicting things like C99 can't implement a full peterson lock. Once I get a better idea I'll post some code.

Do the while loops have any sort of Sleep or Delay? Or do they run as fast as they can for a short duration.

2. > Has anyone used it before?
Yes.

> I read some conflicting things like C99 can't implement a full peterson lock.
Where?

> Once I get a better idea I'll post some code.
The algorithm is simple, why not just try it?

3. Did you read the "Note" on the wiki article?

A bounded SPSC queue isn't that hard once you understand the instructions. A bounded MPMC queue isn't that bad either - if you know where to look: Bounded MPMC queue - 1024cores

gg

4. I gave this a try on this very simple example. From what I can tell, the mutex version might even be slightly faster. Did I implement this right and is this too simple of an example to really test this out on?

Code:
```#include <windows.h>#include <stdio.h>
#include <stdbool.h>
#include <time.h>

#ifdef MUTEX_FLAG
HANDLE ghMutex;
#else
bool flag[2] = { false, false };
int turn;
#endif

{
int i = 0;
while (i<100)
{
#ifdef MUTEX_FLAG
WaitForSingleObject(
ghMutex,    // handle to mutex
INFINITE);  // no time-out interval
#else
flag[0] = true;
turn = 1;
while (flag[1] && turn == 1)
{
#ifdef LOOP_DELAY
Sleep(1);
#endif
}
#endif // MUTEX_FLAG
// critical section
printf("Thread1 i = %d\n", --*(int *)lpArg);
// end of critical section

#ifdef MUTEX_FLAG
ReleaseMutex(ghMutex);
#else
flag[0] = false;
#endif
i++;
}
return NULL;
}

{
int i = 0;
while (i<100)
{
#ifdef MUTEX_FLAG
WaitForSingleObject(
ghMutex,    // handle to mutex
INFINITE);  // no time-out interval
#else
flag[1] = true;
turn = 0;
while (flag[0] && turn == 0)
{
#ifdef LOOP_DELAY
Sleep(1);
#endif
}
#endif
// critical section
printf("Thread2 i = %d\n", ++*(int *)lpArg);
// end of critical section
#ifdef MUTEX_FLAG
ReleaseMutex(ghMutex);
#else
flag[1] = false;
#endif
i++;
}
return NULL;
}

int main()
{
int lpArgPtr;
HANDLE hHandles[2];
clock_t t0;

t0 = clock();
lpArgPtr = 10;

#ifdef MUTEX_FLAG
ghMutex = CreateMutex(
NULL,              // default security attributes
FALSE,             // initially not owned
NULL);             // unnamed mutex

if (ghMutex == NULL)
{
printf("CreateMutex error: %d\n", GetLastError());
return 1;
}
#endif

if (hHandles[0] == NULL) {
exit(0);
}
else

if (hHandles[1] == NULL) {
exit(0);
}
else

for (int i = 0; i < 2; i++)
{
WaitForSingleObject(hHandles[i], INFINITE);
}
#ifdef MUTEX_FLAG
CloseHandle(ghMutex);
#endif

printf("program runtime %.4f\n", 1.0 * (clock() - t0) / CLOCKS_PER_SEC);

system("Pause");

return 0;
}```

5. Forget about Peterson's. You need to use proper synchronization at the hardware level - ie atomic instruction intrinsics provided by your compiler. Otherwise, you'll just get burned either by the compiler or the hardware (or both).

gg

6. Originally Posted by Codeplug
Forget about Peterson's. You need to use proper synchronization at the hardware level - ie atomic instruction intrinsics provided by your compiler. Otherwise, you'll just get burned either by the compiler or the hardware (or both).

gg

First question, with mutex am I guranteed not to have any issues with two threads writing the same memory at the same time?

Two, Are the two variables I need to worry about with atomic instructions turn and lpArgPtr?

This article has proven to be helpful. Am I searching for a C99 solution on a windows 7 machine.

7. Yes, a mutex will provide mutually exclusive access to a resource. On Windows, you should be using a CriticalSection instead. A Windows Mutex is used when multiple process's need to synchronize with each other.

When two or more threads access the same memory location, and at least one of them writes to that location, then you must provide synchronization.

>> I'm trying to make some code I've written run faster.
Let us help with that by explaining or showing some of your code. Rolling your own synchronization is going to the far extreme of trying to achieve that goal. It's also very difficult to get right.

gg

8. I'm going to look at this a different way. If one thread only produces and the other only consumes do I only need to worry that the compiler doesn't do any optimization to reorder instructions?

9. You always have to worry about compiler and hardware re-ordering.

First, why don't you try out the queue_MT class in this post: ReadDirectoryChangesW question

Be sure to re-code your consumer to use the dequeue_all() interface. See if you get satisfactory speed up with that.

gg