C Board  

Old 02-08-2010, 10:48 AM   #1
Registered User
 
Join Date: Oct 2006
Posts: 338
Processes in uninterruptible sleep

I am having a periodic issue with a server program: it goes into uninterruptible sleep (D state) for no reason that I can discern. I am running the release version of openSUSE 10.3, with kernel version 2.6.22.5-31-default x86-64. My server does basic file, socket, and MySQL I/O, and the MySQL server does not go into D state when my program does. My program is a typical multi-process Unix-style server that forks to handle client connections. The interesting thing is that the server runs on 3 ports to handle connections from various locations with differing network and firewall configurations, and only the server on one port (port 80, but not HTTP) goes into D state. The other two continue to function normally. Even stranger, the parent and ALL of its children go into D state at the same time. I've been googling for over an hour looking for known bugs in the openSUSE 10.3 default kernel and haven't found anything useful. Do any of you know of anything that might cause a parent process and all of its children to go into D state simultaneously on this system? I'm just looking for more things to rule out before I start looking at hardware.

I know it's customary to show source code that exhibits the problem, but since I can't reproduce the problem on command, I don't know if source code would really be useful in this case.
Elkvis
Old 02-08-2010, 11:36 AM   #2
Registered User
 
Codeplug's Avatar
 
Join Date: Mar 2003
Posts: 3,956
Anything interesting in "dmesg"?

gg
Old 02-08-2010, 11:59 AM   #3
dat is, vast staat
 
MK27's Avatar
 
Join Date: Jul 2008
Location: SE Queens
Posts: 6,612
I think this is usually considered to be a hardware I/O issue. It happens when the kernel is reading from or writing to some hardware (e.g., disk or the network card) and gets no reply.

I guess that implies something could be broken -- you need to resolve what kind of hardware access is causing that to happen.
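A minimal sketch for narrowing down which processes are stuck and where, assuming a Linux /proc filesystem. It scans /proc/<pid>/stat for tasks in state D and prints the wchan entry (the kernel function the task is sleeping in, when the kernel exposes it). The function names are hypothetical, not part of any tool mentioned in this thread.

```python
import os


def parse_stat_line(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, state, ppid).

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split around the LAST ')' rather than on whitespace.
    """
    left, right = line.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    fields = right.split()
    return int(pid_text), comm, fields[0], int(fields[1])


def find_d_state(proc_root="/proc"):
    """Return a list of (pid, ppid, comm, wchan) for every task in D state."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat_line(f.read())
            if state != "D":
                continue
            # wchan may be "0" or unreadable depending on kernel and privileges.
            try:
                with open(os.path.join(proc_root, entry, "wchan")) as f:
                    wchan = f.read().strip()
            except OSError:
                wchan = "?"
            blocked.append((pid, ppid, comm, wchan))
        except (OSError, ValueError):
            # The process exited (or the line was malformed) while reading; skip it.
            continue
    return blocked


if __name__ == "__main__":
    for pid, ppid, comm, wchan in find_d_state():
        print(f"pid={pid} ppid={ppid} comm={comm} wchan={wchan}")
```

If the parent and all of its children report the same wchan value, that points at one shared resource rather than independent failures.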
__________________
C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
GDB tutorial #1 -- gnu debugger tutorials -- GDB tutorial #2
cpwiki -- our wiki on sourceforge
Old 02-08-2010, 12:01 PM   #4
dat is, vast staat
 
MK27's Avatar
 
Join Date: Jul 2008
Location: SE Queens
Posts: 6,612
Quote:
Originally Posted by Codeplug View Post
Anything interesting in "dmesg"?

gg
Yeah, for sure check the logs, like /var/log/messages and /var/log/kern.log or whatever they are called on your distro. You will probably want to raise the kernel and system logging level (the klogd and syslogd daemons) if you are not getting any message about it.
Last edited by MK27; 02-08-2010 at 12:04 PM.
Old 02-08-2010, 12:15 PM   #5
Registered User
 
Join Date: Oct 2006
Posts: 338
Quote:
Originally Posted by Codeplug View Post
Anything interesting in "dmesg"?
Lots of firewall messages, but nothing else. Is there a way to turn off firewall logging in the dmesg output, at least temporarily?
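One way to read the kernel log without touching the firewall configuration is to filter the firewall lines out after the fact. This is only a sketch: the "SFW2-" marker is an assumption about how the firewall tags its log lines on this system, and `drop_firewall_lines` is a hypothetical helper name.

```python
def drop_firewall_lines(lines, markers=("SFW2-",)):
    """Yield only the log lines that contain none of the firewall markers.

    `markers` is an assumption; adjust it to whatever prefix the firewall
    actually writes into the kernel log on the machine in question.
    """
    for line in lines:
        if not any(marker in line for marker in markers):
            yield line
```

It can be fed any iterable of lines, for example the saved output of dmesg or an open /var/log/messages file, and whatever remains is the non-firewall kernel chatter.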
Elkvis
Old 02-08-2010, 01:03 PM   #6
Registered User
 
Join Date: Oct 2006
Posts: 338
Quote:
Originally Posted by MK27 View Post
Yeah, for sure check the logs, like /var/log/messages and /var/log/kern.log or whatever they are called.
The /var/log/messages file covering the last day that it happened contained a lot of logged events from the firewall about SYN flooding. I'm pretty sure these are false positives, but I have no way to know for sure, since my server operates on port 80 but does not actually use HTTP, as the firewall may be expecting.

So perhaps the firewall cuts off the connection but doesn't notify the kernel that the socket is being closed. Is this even possible? Is it possible to configure the firewall to handle port 80 as a raw port instead of expecting HTTP (if it even enforces protocols at all)?
Elkvis
Old 02-08-2010, 01:16 PM   #7
Staff software engineer
 
brewbuck's Avatar
 
Join Date: Mar 2007
Location: Portland, OR
Posts: 6,014
Are you using semaphores?
__________________
"Congratulations on your purchase. To begin using your quantum computer, set the power switch to both off and on simultaneously." -- raftpeople@slashdot
Old 02-08-2010, 01:23 PM   #8
Registered User
 
Join Date: Oct 2006
Posts: 338
Quote:
Originally Posted by brewbuck View Post
Are you using semaphores?
No. I am not using any form of IPC, except for the TCP sockets and whatever libmysqlclient uses to talk to the server.
Elkvis
Old 02-08-2010, 01:28 PM   #9
Registered User
 
Codeplug's Avatar
 
Join Date: Mar 2003
Posts: 3,956
"Alt+SysRQ+t" may give a clue by dumping a stack trace of the D-state processes.

If you only care about a solution (instead of "why"), then your time may be better spent trying to reproduce the issue in the latest stable kernel (2.6.32.7).

gg
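On a server without a console keyboard, the same task dump can usually be requested by writing the command letter to /proc/sysrq-trigger as root. A minimal sketch, assuming the kernel was built with magic SysRq support; `trigger_sysrq` is a hypothetical helper name, and the dump itself lands in the kernel log (dmesg), not in the return value.

```python
def trigger_sysrq(command="t", trigger_path="/proc/sysrq-trigger"):
    """Write a single SysRq command character to the kernel's trigger file.

    't' asks the kernel to dump every task's state and stack to the kernel
    log, the same request as Alt+SysRq+t on the console. Needs root.
    """
    if len(command) != 1:
        raise ValueError("SysRq commands are a single character")
    with open(trigger_path, "w") as f:
        f.write(command)
```

After calling `trigger_sysrq()`, look in the kernel log for the entries of the PIDs that are in D state.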
Old 02-08-2010, 01:57 PM   #10
Registered User
 
Join Date: Oct 2006
Posts: 338
Quote:
Originally Posted by Codeplug View Post
"Alt+SysRQ+t" may give a clue by dumping a stack trace of the D-state processes.

If you only care about a solution (instead of "why"), then your time may be better spent trying to reproduce the issue in the latest stable kernel (2.6.32.7).

gg
If I build a custom kernel, will I need to rebuild my "world" as well? I remember having to do this when I have updated kernels on machines before.

Also, openSUSE 10.3 is not supported anymore, and I'd like to try using the kernel package from the 11.2 distribution (Linux 2.6.31.5). Can you see any problems I might face while doing this? Is it even something I should consider doing?
Elkvis
Old 02-08-2010, 02:10 PM   #11
{Jaxom,Imriel,TBD}'s Dad
 
Kennedy's Avatar
 
Join Date: Aug 2006
Location: Alabama
Posts: 1,035
I have seen this before in a system that didn't have enough memory (64 MB) and NO SWAP. The issue was that the kernel thread sd was attempting to allocate memory for a file I/O -- there was no memory left, however, and the driver handled it poorly. The only thing that ever worked for me was to kill that process and start it over again. The real fix came later when I upgraded to much more memory (2GB, in this era). I wonder if you are having similar issues?
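To rule the low-memory theory in or out at the moment a hang occurs, the free memory and swap figures can be read from /proc/meminfo. A small sketch; `parse_meminfo` is a hypothetical helper name, and the field names used in the usage note are the standard ones from /proc/meminfo.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of {field: integer value}.

    Most fields are reported in kB; a few (such as huge page counts)
    are plain counts. The unit suffix is dropped.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info
```

Reading the file with `parse_meminfo(open("/proc/meminfo").read())` and logging `MemFree` and `SwapFree` alongside the D-state timestamps would show whether memory pressure coincides with the hangs.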
Old 02-08-2010, 02:21 PM   #12
Registered User
 
Join Date: Oct 2006
Posts: 338
Quote:
Originally Posted by Kennedy View Post
I have seen this before in a system that didn't have enough memory (64 MB) and NO SWAP.
I have 16GB of memory and 32GB of swap, so this probably doesn't apply.

Quote:
The issue was that the kernel thread sd was attempting to allocate memory for a file I/O -- there was no memory left, however, and the driver handled it poorly. The only thing that ever worked for me was to kill that process and start it over again.
Which I can't do, because it's in the uninterruptible sleep state.

Quote:
The real fix came later when I upgraded to much more memory (2GB, in this era). I wonder if you are having similar issues?
Thanks for trying, but I don't think our situations are similar enough to consider this as a possibility.
Elkvis
Old 02-08-2010, 03:03 PM   #13
dat is, vast staat
 
MK27's Avatar
 
Join Date: Jul 2008
Location: SE Queens
Posts: 6,612
Quote:
Originally Posted by Elkvis View Post
the /var/log/messages file that included the last day that it happened contained a lot of logged events from the firewall about SYN flooding. I'm pretty sure these are false positives, but I have no way to know for sure, since my server operates on port 80 but does not actually use http, as the firewall may be expecting.
If you are getting a bunch of SYN requests on port 80 and your server doesn't deal with http, don't you think this could be a problem? It seems to me this is a very bad port to pick -- it's already created these complications for you, real or imagined -- and if you have any choice at all, choose another one.
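Whether the SYN flood reports are false positives can be checked from the server side by counting half-open connections on the port. A rough sketch, assuming the usual /proc/net/tcp layout (local address and port in hex in the second column, state code in the fourth, where 03 is SYN_RECV); `count_syn_recv` is a hypothetical helper name.

```python
SYN_RECV = "03"  # TCP state code for SYN_RECV as printed in /proc/net/tcp


def count_syn_recv(lines, port=80):
    """Count half-open (SYN_RECV) connections to a local port.

    `lines` is a list of the lines of /proc/net/tcp, header line included.
    """
    wanted_port = "%04X" % port
    count = 0
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_port = fields[1].rsplit(":", 1)[1]
        if local_port == wanted_port and fields[3] == SYN_RECV:
            count += 1
    return count
```

A count that stays near zero while the firewall keeps logging SYN floods would support the false-positive theory; a large and growing count would not.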

Quote:
Originally Posted by Elkvis View Post
if I build a custom kernel, will I need to rebuild my "world" as well? I remember having to do this when I have updated kernels on machines before.

also, opensuse 10.3 is not supported anymore, and I'd like to try using the kernel package from the 11.2 distribution (linux 2.6.31.5). can you see any problems I might face while doing this? is it even something I should consider doing?
Generally, building a new kernel does not mean having to change anything else, presuming you know what you are doing. It's safe to try anyway, since if it doesn't work out you can just go back to using your old one (which kernel boots is determined by the bootloader). It can be a very tedious and boring procedure, though: the config changes slightly with each version, meaning you may not be able to just swap in your old .config. Last time I downloaded a new one it took me at least two hours just to go through xconfig making sure everything was set appropriately. Unlike the distro kernels, I don't believe any effort is made to provide a useful "default" configuration with the source, and xconfig et al. do not do anything automatically. Configuring the kernel is a sure cure for insomnia.

If you think a new kernel will help maybe first try the newer distro package. If you already have a 2.6 kernel, though, you may as well stick with what you've got.

One thing I would definitely do is get on the user mailing list for the firewall and present your case there; hopefully someone will have some more pertinent advice.
Last edited by MK27; 02-08-2010 at 03:09 PM.
Old 02-08-2010, 03:25 PM   #14
Staff software engineer
 
brewbuck's Avatar
 
Join Date: Mar 2007
Location: Portland, OR
Posts: 6,014
Quote:
Originally Posted by MK27 View Post
It can be a very tedious and boring procedure though, the config changes slightly with each version, meaning you may not be able to just swap in your old .config -- last time I downloaded a new one it took me at least two hours just to go through xconfig making sure everything was set appropriately. Unlike the distro kernels, I don't believe any effort is made to provide a useful "default" co
That's what "make oldconfig" is for. It re-parses your older .config file, keeps the answers that still apply, asks only about options that are new, and (hopefully) sanitizes it enough so that make xconfig can deal with it.
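To get a feel for how much a newer kernel's configuration differs before running make oldconfig, the two .config files can be compared directly. A minimal sketch with hypothetical helper names; it only understands the two common line forms (`CONFIG_X=value` and `# CONFIG_X is not set`).

```python
def parse_kconfig(text):
    """Parse kernel .config text into {symbol: value}; unset symbols map to 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" -> "CONFIG_FOO"
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def new_symbols(old_text, new_text):
    """Return symbols present in the new config but absent from the old one."""
    old = parse_kconfig(old_text)
    new = parse_kconfig(new_text)
    return sorted(name for name in new if name not in old)
```

The symbols returned by `new_symbols` are roughly the ones that would need a decision when carrying an old configuration forward.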
Old 02-08-2010, 03:30 PM   #15
Registered User
 
jeffcobb's Avatar
 
Join Date: Dec 2009
Location: Henderson, NV
Posts: 887
To turn it another way, if you did not have to recompile driver/app X to install it, you probably won't have to after another kernel upgrade. For example back in the day you had to rebuild nVidia drivers or Alsa drivers after a kernel update. Now when you have a repository as large as Debian/Ubuntu it is no longer necessary. This is not a guarantee or an absolute but it has been a LOOOONG time since I have had to rebuild a driver after a kernel update.

Now before anyone jumps on my case, I am not trying to start a distro war or anything, but I do want to say this about kernel rebuilding: the Debian way is IMHO the safest and easiest for those who are new to it, unsure about kernel configuration, etc. Many moons ago I submitted a Debian How-To to Linux Laptops that used kernel rebuilding to support some of the hardware on a Vaio. You can look here to see how easy it is: JBCobb.net, "The difference between “Don’t wanna” and “Can’t”". Yeah, I know it's dated now, but the process of building the kernel and the help that Debian gives you here is what I am referring to.

Oh and I agree 1000% with MK: you really ought to rethink the wisdom of sticking non-HTTP services on a port universally recognized as HTTP. It's about the only thing Windows and UNIX agree on...
__________________
C/C++ Environment: GNU CC/Emacs
Make system: CMake
Debuggers: Valgrind/GDB
All times are GMT -6.

