Thread: Process vs Thread

  1. #1
    Registered User
    Join Date
    Mar 2009
    Posts
    37

    Process vs Thread

    Hello guys,

    I'm new to Linux programming, but I'm very excited to be a pro in this field.
    I do have programming experience in Windows, so I'm familiar with programming in C/C++ and even in C#.

    I've read in books and on the internet that forking child processes is more common than creating threads; I don't know if that is still so. But wouldn't it be a problem if the parent process is a big piece of software and you fork numerous child processes from it? According to the man page and other articles, fork() creates an exact copy of the executable image of the parent (I also read that function libraries and the like are NOT copied).

    So isn't it more useful to use threads instead, since you just want a small portion of code running on another thread? Otherwise it's like running OpenOffice Writer a hundred times, and that can't be the purpose, right?

    This question came up while I was playing with handling a lot of client TCP/IP connections using fork().

    Thank you.

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    There is significantly more overhead associated with creating processes under Windows than under (most) flavours of Unix. Yes, there are benefits to using threads instead of processes under Unix, but those benefits are - relatively speaking - much smaller than under Windows.

    As to why that is so: that's the way Unix operating systems were originally designed. Unix, architecturally, is designed to support filtering (small programs that each do a small thing, with output from one process fed as input to the next in a chain). That relies on command shells (or other programs) being able to launch multiple programs with relatively little overhead, and chain the outputs to the inputs. A fork()/exec() is the way that is done (the command shell forks a copy of itself, and the new process overlays itself with the program to be executed). Accordingly, the basic architecture of Unix is designed to ensure the overhead of fork() and exec() is relatively small.
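
    A minimal sketch of the fork()/exec() chaining described above, roughly what a shell does for "left | right". The helper name run_pipeline is hypothetical, error handling is kept short, and it assumes both programs can be found on PATH:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "left | right" the way a shell would.
 * Returns the exit status of the right-hand program, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];                          /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (lpid == 0) {                     /* first child: stdout goes into the pipe */
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);           /* only returns if exec failed */
        _exit(127);
    }

    pid_t rpid = fork();
    if (rpid == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {                     /* second child: stdin comes from the pipe */
        if (dup2(fds[0], STDIN_FILENO) == -1)
            _exit(126);
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Called with the argument vectors {"echo", "hello", NULL} and {"grep", "-q", "hello", NULL}, this should return grep's exit status. Each stage is a fresh fork() followed immediately by an exec, which is the case the design keeps cheap.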

    fork() also does not create an exact copy of the executable; it creates a copy of the calling process and its state (with minor exceptions, such as the return value from fork() being different for the child and the parent). In practice, parts of the copying are usually deferred, so the memory pages of the parent process are not actually copied to the child unless the child actually needs them - this reduces the overhead of process creation for a lot of practical cases, such as when the child process just performs an exec() function call to execute another executable.
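
    To illustrate the point about fork() copying process state, here is a small sketch (the function name is made up for the example). The child changes a variable, and the parent's copy stays untouched; whether the kernel defers the page copy until that write is an implementation detail the program cannot observe directly:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's view of `counter` after the
 * child has modified its own copy, or -1 on error. */
int counter_after_child_write(void)
{
    int counter = 1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: fork() returned 0. This write lands in the child's own
         * copy of the data (typically copied lazily, on first write). */
        counter = 42;
        _exit(counter == 42 ? 0 : 1);
    }

    /* Parent: fork() returned the child's PID. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return counter;   /* the parent's copy was never touched by the child */
}
```

    In the parent the function returns 1, not 42, because after fork() the two processes no longer share that variable.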

    There are also concerns with some older versions of Unix (or, more accurately, with some versions of threading libraries) in that a multi-threaded program cannot necessarily launch a process because the process of copying a process - with fork() - carries significant overhead, such as the need to synchronise all threads in the parent before creating a copy of the process. That has mostly been addressed in modern versions of Unix - a thread calling fork() only duplicates the calling thread. There is a whole set of little gotchas like that, with the end result that multithreaded programs can work differently between different versions or flavours of Unix. For that reason, a lot of programmers simply don't bother with threads under Unix - launching processes is simpler, and more likely to work consistently between systems.
    Right 98% of the time, and don't care about the other 3%.


  3. #3
    Registered User
    Join Date
    Mar 2009
    Posts
    37
    Hi grumpy,

    Thank you very much for your clear explanation.
    Well, I guess I'll start with forking processes then.

  4. #4
    Registered User
    Join Date
    Mar 2009
    Posts
    37
    Ok, I still have one question.

    How many child processes can a process fork? Suppose I want to make a server application that needs to handle a few thousand clients at the same time; can Linux handle that? For example, passing clients' GPS information to each other and maybe other non-CPU-intensive processing.

    Thank you.

  5. #5
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Linux can handle up to 2^15-1 processes in theory - but it will probably slow to a crawl long before that. A better way to handle this many clients is to fork just a few processes (or threads - it doesn't make that much of a difference in Linux) - a good number is the number of hardware threads the machine supports, or twice that - and have each process handle many clients, using multiplexing techniques like select() and/or asynchronous I/O.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  6. #6
    Registered User
    Join Date
    Mar 2009
    Posts
    37
    Thanks for your reply, Corned Beef,

    I totally understand except "hardware threads supported by the machine". Since when do we have hardware threads? Do you mean multicore processors?


    Thank you.

    Sorry for my stupid questions.

  7. #7
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Andaluz
    I totally understand except "hardware threads supported by the machine". Since when do we have hardware threads? Do you mean multicore processors?

    Most CPUs have some amount of hardware support for thread/process-level task switching. It may be cheaper, in some cases, to task-switch between threads than between processes. In fact, this is usually the case, since a thread context switch does not require any change to the VM environment, as opposed to a process switch, where the entire set of page tables changes, the TLB is invalidated, etc.

    On Linux/x86, the kernel actually does not use the hardware-accelerated task switching, because doing it in software is actually faster. Go figure. The situation is unique to each platform.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  8. #8
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    A hardware thread is a path of execution that the CPU can execute independently of all others, without a task switch. In less complicated terms, number of hardware threads == total number of cores on all CPUs. (More with HyperThreading, though.)
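
    One way to ask Linux for that number from C, as a small sketch. The function name is made up, and _SC_NPROCESSORS_ONLN is a glibc/Linux extension rather than strict POSIX:

```c
#include <unistd.h>

/* Hypothetical helper: how many worker processes (or threads) to start.
 * _SC_NPROCESSORS_ONLN counts the logical CPUs currently online, so
 * HyperThreading siblings are included in the total. */
long suggested_worker_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;    /* fall back to one worker if the query fails */
}
```

    The result (or twice it) can then be used as the number of worker processes to fork at startup.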

  9. #9
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by CornedBee
    A hardware thread is a path of execution that the CPU can execute independently of all others, without a task switch. In less complicated terms, number of hardware threads == total number of cores on all CPUs. (More with HyperThreading, though.)
    Yeah, I didn't read what you wrote carefully enough.

  10. #10
    Registered User
    Join Date
    Mar 2009
    Posts
    37
    Thanks for your professional posts,

    So the bottom line is: if I want to write a simple server on Linux, it's easier to use processes, at least for me as a beginner in Linux programming. And if I want to use threads, then it's more for communicating with a device over a serial port or something like that.

    Thank you.
    If you have any tips for me, I'd be thankful if you posted some.

  11. #11
    Registered User Mortissus's Avatar
    Join Date
    Dec 2004
    Location
    Brazil, Porto Alegre
    Posts
    152
    Is there any good reading on the state of the art (kernel 2.6) on this subject?
    I would like to know a little more about the details.

    Thanks ;D

  12. #12
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895

  13. #13
    Registered User Mortissus's Avatar
    Join Date
    Dec 2004
    Location
    Brazil, Porto Alegre
    Posts
    152
    Thanks! I will start my reading right away!

    Last Post: 12-15-2004, 08:50 AM