Thread: Whoa, what happened? (Again)

  1. #1
    Administrator webmaster
    Join Date
    Aug 2001
    Posts
    1,012

    Whoa, what happened? (Again)

    It looks like we had a database failure similar to the last one. This time, I immediately decided to recover from backup, but it still takes a while to restore the backups. We've lost a few days of posts, as I had to restore from last week's backup.

    I'm not yet certain what caused the actual corruption of the DB that required recovery.

  2. #2
    Reverse Engineer maxorator
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318
    I guess it would be a good idea to make more frequent backups for the next few weeks or so...
    "The Internet treats censorship as damage and routes around it." - John Gilmore

  3. #3
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    Just out of curiosity, what database do you use?
    bit∙hub [bit-huhb] n. A source and destination for information.

  4. #4
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    MySQL was what was mentioned previously.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  5. #5
    Guest Sebastiani
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    So MySQL has some unresolved issues then? Wondering if I should switch DBs now.

    Any ideas about possible culprits?

  6. #6
    (?<!re)tired Mario F.
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    I'd look into a hardware failure before looking into the database.

    I don't see how MySQL could corrupt a database file in this manner. The first time it happened, I was surprised to learn it was because of a lack of space on the tmp partition. And I still am. I don't know how that could possibly corrupt a database file, unless the actual file or the indexes were stored there.

    And now it happens again. Either you "found" an as-yet-unknown bug, or the tmp partition is being misused and you should know better, or the hardware is at fault... memory?

    edit: or vBulletin has some weird bug.
    Last edited by Mario F.; 02-05-2010 at 12:16 PM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  7. #7
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by webmaster
    I'm not yet certain what caused the actual corruption of the DB that required recovery.
    I say it was Yarin. It was the only way he could get that list of IPs taken down, otherwise the folks from Amsterdam will kill his entire family then chop his fingers off.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    Guest Sebastiani
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by MK27
    I say it was Yarin. It was the only way he could get that list of IPs taken down, otherwise the folks from Amsterdam will kill his entire family then chop his fingers off.
    No - it was vBulletin and they threatened to sack all the adverts and smileys, dammit...couldn't have that, now could we? =}

  9. #9
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Sebastiani
    No - it was vBulletin and they threatened to sack all the adverts and smileys, dammit...couldn't have that, now could we? =}
    Thanks for inspiring my next avatar.

  10. #10
    Guest Sebastiani
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by MK27
    Thanks for inspiring my next avatar.
    Oh, right.

    Code:
      _ ___
     / \   \
     | =\} |
     \___\_/

  11. #11
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    This is killing my board mojo. Normally I remember where certain threads are and what has been said and now with two failures I'm lost as to where each thread is.

    Thanks for fixing it quickly. Why has this suddenly become an issue after all these years? Does it have to do with the board updates or possibly a bug in the board software?

  12. #12
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Mario F.
    I'd look into a hardware failure before looking into the database.
    While I agree hardware may be a factor, hardware is often only the means by which software design problems become evident.
    Quote Originally Posted by Mario F.
    I don't see how MySQL could corrupt a database file in this manner. The first time it happened, I was surprised to learn it was because of a lack of space on the tmp partition. And I still am. I don't know how that could possibly corrupt a database file, unless the actual file or the indexes were stored there.
    There are a lot of I/O operations involved in managing user connections, data received or sent over user connections, and data that gets stored in the database. If code doing those various I/O operations does not detect failure of an I/O operation, then data corruption can invisibly occur along the way.

    If some of those I/O operations involve writing data (temporarily) to /tmp or reading from there, then one way of causing an I/O failure is an attempt to write more data to /tmp than allowed. If some other process - the SQL server itself is one of several candidates - assumes data in /tmp is valid, and just copies it "verbatim" into the database, then the database itself may be corrupted.

    It's not difficult to envisage scenarios in which the data placed in /tmp at any point in time depends on the amount of data in the database, the number of connected users over the last 24 hours, or other things, all of which are associated with the forum site being more active.

    It is also not difficult to envisage scenarios in which some file written to /tmp is not cleaned up when no longer needed, so /tmp gradually fills up over time.
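
    One way to notice a slowly filling /tmp before it causes trouble is to poll the free space on that filesystem. A minimal sketch in C, assuming a POSIX system; free_bytes is a hypothetical helper name for illustration, not something taken from MySQL or vBulletin:

```c
#include <sys/statvfs.h>

/* Hypothetical helper: returns the number of bytes available to
   unprivileged processes on the filesystem containing path,
   or -1 if statvfs() fails (e.g. the path does not exist). */
long long free_bytes(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1;

    /* f_bavail counts free blocks for unprivileged users,
       f_frsize is the size of one such block in bytes. */
    return (long long)vfs.f_bavail * (long long)vfs.f_frsize;
}
```

    A monitoring job could call free_bytes("/tmp") periodically and raise an alert below some threshold; what counts as "too low" is a site-specific choice.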

    It all depends on the quality of implementation of various bits of software (the operating system, device drivers, the PHP code that talks to the SQL server, the PHP interpreter, the SQL server...).
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  13. #13
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by grumpy
    If some of those I/O operations involve writing data (temporarily) to /tmp or reading from there, then one way of causing an I/O failure is an attempt to write more data to /tmp than allowed. If some other process - the SQL server itself is one of several candidates - assumes data in /tmp is valid, and just copies it "verbatim" into the database, then the database itself may be corrupted.
    But, shouldn't an entire transaction fail if a write fails? But then again, transactions aren't really used a lot...
    But still, write() should be checked for failure - as it can fail - and should it fail, shouldn't a database do everything it can to prevent the database from going corrupt? And I can't imagine there's nothing it can do.
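
    For reference, checking write() properly means handling both outright errors and short writes. A minimal sketch in C, assuming POSIX; write_all is a hypothetical helper name, and this is not a claim about how MySQL or vBulletin actually do their I/O:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd.
   Returns 0 on success, or -1 on failure with errno set by write()
   (for example ENOSPC when the filesystem is full). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data was written: retry */
            return -1;      /* real failure: the caller must not assume the data is on disk */
        }
        /* write() may accept fewer bytes than requested; advance and loop */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

    The point is only that the caller gets a clear failure signal and can abort or roll back instead of treating a truncated temporary file as valid.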

  14. #14
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by EVOEx
    But, shouldn't an entire transaction fail if a write fails? But then again, transactions aren't really used a lot...
    But still, write() should be checked for failure - as it can fail - and should it fail, shouldn't a database do everything it can to prevent the database from going corrupt? And I can't imagine there's nothing it can do.
    I'm not saying there's nothing that can be done. I certainly agree write()s should be checked for failure. But all it takes is one oversight that leaves an error condition undetected, and voilà!

    There's nothing magical about database servers (or any other software that is a component of a forum suite) that makes them less prone to design or coding deficiencies than any other software of reasonable size (however you measure "size"). The probability of one or more defects lurking in code, just waiting to be triggered by the right set of input conditions, does tend to increase with code size. The probability also decreases as technical rigour in the development process increases, but it's very difficult to anticipate all possible error conditions or to find the causes of errors that occur quite rarely in practice (e.g. after a few months of uptime on a site like this). And exhaustive testing is prohibitively expensive too.

    Even developers of safety critical systems - where people can die if something goes wrong - can't realistically aim for a zero defect rate. Expecting that in a general purpose database is a tall order...

  15. #15
    Deprecated Dae
    Join Date
    Oct 2004
    Location
    Canada
    Posts
    1,034
    Yet another reason I've never been a fan of vBulletin; but hey, maybe that's because as a kid I couldn't afford a license and was forced to use alternatives like Ikonboard -> IPB. Database errors/bugs with vBulletin aren't as common as with, say, ASP products (database error hell), but with the PHP alternatives available it's a wonder people keep using vB. Perhaps try upgrading?
    Warning: Have doubt in anything I post.

    GCC 4.5, Boost 1.40, Code::Blocks 8.02, Ubuntu 9.10 010001000110000101100101
