Thread: "Blob" server?

  1. #1
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229

    "Blob" server?

    I have this strange idea about making a generic and very simple database server that will store nothing more than key/value pairs. Keys will be strings, and values can be anything serializable (the server won't interpret the values). That's enough for many applications: synchronizing the "read" status of RSS feeds, sharing bookmarks, short notes, application settings, etc. In general, applications that synchronize by sharing small "blobs" of data.

    The goal is to be extremely simple (for clients). Connections can go over HTTP (with the server written in PHP), since HTTP libraries exist for just about all languages under the sun, and requests should be stateless (efficiency is not a big concern here). Authentication could be done using challenge-and-response, but I don't see how that can be made stateless. Maybe not even that (just a hashed password).
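
    One way the stateless authentication question could be approached is to sign each request with an HMAC over its contents plus a timestamp, so the server needs no per-client session. This is a swapped-in technique, not something proposed in the thread, and it is only a rough sketch: the names SECRET, MAX_SKEW, sign and verify are hypothetical, as is the choice of which request fields to sign.

```python
import hashlib
import hmac
import time

# Hypothetical shared secret; in practice it might be derived from the password.
SECRET = b"correct horse battery staple"
MAX_SKEW = 300  # assumed: seconds for which a signed request stays valid


def sign(secret: bytes, method: str, key: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 over the request fields (assumes no newlines in method/key)."""
    message = b"\n".join([method.encode(), key.encode(), str(timestamp).encode(), body])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, method, key, body, timestamp, signature, now=None) -> bool:
    """Server side: reject stale timestamps, then compare signatures in constant time."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW:
        return False
    expected = sign(secret, method, key, body, timestamp)
    return hmac.compare_digest(expected, signature)
```

    The client would send the timestamp and signature along with each store or retrieve request. Note that a captured request can still be replayed inside the MAX_SKEW window, so this is weaker than a real challenge-and-response exchange.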

    There would just be a store(key, value) function, and a retrieve(key) function.
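
    To make the two-function interface concrete, here is a minimal sketch of the storage side only, assuming one file per key inside a single directory. The BlobStore class and the idea of hashing keys into file names are illustrative assumptions, not part of any existing server, and the HTTP layer is left out.

```python
import hashlib
import tempfile
from pathlib import Path


class BlobStore:
    """Keeps each value as an opaque byte string in its own file."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hash the key so arbitrary strings map to safe file names.
        return self.directory / hashlib.sha256(key.encode("utf-8")).hexdigest()

    def store(self, key: str, value: bytes) -> None:
        # The value is written as-is; the store never interprets it.
        self._path(key).write_bytes(value)

    def retrieve(self, key: str):
        # Returns the stored bytes, or None when the key was never stored.
        path = self._path(key)
        return path.read_bytes() if path.exists() else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blobs = BlobStore(tmp)
        blobs.store("settings/theme", b"dark")
        print(blobs.retrieve("settings/theme"))
```

    Serialization stays with the client: whatever bytes it hands to store() come back unchanged from retrieve().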

    And of course, each server can only support one user, but a server is no more than a directory on an HTTP server, so it's easy to run one "server" for each user.

    This is a very simple idea, and I'm guessing I'm not the first one to think of it, but I can't seem to find anything like that on Google.

    Many applications don't need the complexity of relational databases, and this can make life easier for many programmers.

    Thoughts?
    Last edited by cyberfish; 11-23-2009 at 11:20 PM.

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by cyberfish
    This is a very simple idea, and I'm guessing I'm not the first one to think of it, but I can't seem to find anything like that on Google.
    I kind of agree that SQL-style relational databases are overrated and overused. I might guess that the reason you haven't heard of anything like this is because it is so simple -- by the time most web programmers realize they don't have to use SQL, they probably also realize most (or a lot) of stuff can be done just as well, only easier, with a flat file. I just wrote a site search engine for someone and extracted all the text into a single file -- 9 GB, which is a lot of text -- but you can still run regexes over that in no time at all. Here's your key/value format:

    Code:
    filepath**filedata
    Regular expressions are a much more powerful (but slower) tool than SQL IMO (but I'm a lil' ignorant, too), and if you are going to code it in PHP, you might include that option.
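
    As a rough illustration of the flat-file approach, the sketch below scans records in the filepath**filedata layout and yields the paths whose data matches a regex. It assumes one record per line and that "**" never appears inside a path; the function name search_flat_file is made up for this example, and it is written in Python rather than PHP.

```python
import re


def search_flat_file(lines, pattern):
    """Yield the file path of every record whose data matches the regex."""
    regex = re.compile(pattern)
    for line in lines:
        # Split on the first "**" only; the data part may contain anything.
        path, sep, data = line.rstrip("\n").partition("**")
        if sep and regex.search(data):
            yield path
```

    Because it accepts any iterable of lines, an open file object can be passed in directly, so the whole file never has to sit in memory at once.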

    Also: there are some extremely popular Perl modules (YAML, Storable*) around that will serialize hashes, which is what you are talking about. AFAIK you have to load the whole thing into memory tho, which is no good, but doubtless there are other modules that build on this to overcome that limitation, and if you looked around on CPAN you'd find them. Those will be in widespread use, but they probably do not have splashy home sites, etc., and would be tricky to detect via Google.

    So it's probably a good idea in the sense that there will still be a lot of people around who could make use of it if they were aware of the option. Niche niche niche!

    * I believe these are some of the successors to dbm that laserlight refers to, dbm modules being available in most scripting languages I think
    Last edited by MK27; 11-24-2009 at 12:38 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by cyberfish
    This is a very simple idea, and I'm guessing I'm not the first one to think of it, but I can't seem to find anything like that on Google.
    It sounds like you want to re-invent dbm and its successors.
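
    For a feel of what a dbm-style store offers, here is a small sketch using the dbm module from Python's standard library, which picks whichever backend is available on the system. The store and retrieve wrappers are hypothetical names chosen to mirror the two functions from the first post.

```python
import dbm
import os
import tempfile


def store(path, key, value):
    """Open (or create) the dbm file and write one entry."""
    with dbm.open(path, "c") as db:
        db[key] = value


def retrieve(path, key):
    """Return the stored bytes, or None if the key is absent."""
    with dbm.open(path, "c") as db:
        return db.get(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "blobs")
        store(db_path, "notes/todo", b"buy milk")
        print(retrieve(db_path, "notes/todo"))
```

    Values come back as bytes regardless of what was stored, which matches the "server won't interpret the values" requirement.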
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    One issue with PHP I read about recently is that it does not work well with mpm_worker, which is the (faster) threaded setup of apache (vs. the forking setup)...

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by MK27
    One issue with PHP I read about recently is that it does not work well with mpm_worker, which is the (faster) threaded setup of apache (vs. the forking setup)...
    Yes, the PHP manual's installation instructions clearly warn about this. It is in the installation FAQ.

  6. #6
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by MK27
    One issue with PHP I read about recently is that it does not work well with mpm_worker, which is the (faster) threaded setup of apache (vs. the forking setup)...
    I wouldn't run a web server in threaded mode... I'm not sure I trust the code enough (especially PHP or any other apache modules). If you compromise the server somehow and it's threaded, you get instant access to all the clients running on that process. At least with separate processes the damage will be confined to a single client.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  7. #7
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Quote Originally Posted by MK27
    Regular expressions are a much more powerful (but slower) tool than SQL IMO (but I'm a lil' ignorant, too), and if you are going to code it in PHP, you might include that option.
    Regexes are always fun. I have tried to learn them a few times already. I didn't get stuck or anything... I just... forget about them after a while.

    Quote Originally Posted by laserlight
    It sounds like you want to re-invent dbm and its successors.
    Indeed! Thanks for pointing that out.

    So I guess now I just need to write a simple server wrapper thing... or maybe one exists already.

  8. #8
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by brewbuck
    I wouldn't run a web server in threaded mode... I'm not sure I trust the code enough (especially PHP or any other apache modules). If you compromise the server somehow and it's threaded, you get instant access to all the clients running on that process. At least with separate processes the damage will be confined to a single client.
    Interesting.

    I doubt that is a significant issue with 90% of web server activity tho. What are you going to do, read a bunch of identical requests for the same pages you have seen already? Spoof some material? Also, by default apache does not maintain persistent connections, so those threads are probably already dead and dying. Of course, you could "answer the phone" maybe.

    But that will be almost the same issue with the forking model -- each process handles bunches of clients.

    In a setting where, eg, people are submitting their credit cards or something, I guess there are many things like this you may want to account for.

    In fact I think the thread vs. fork performance issue is only significant 1) on large dedicated multi-core servers, 2) on small shared servers, where the memory footprint per client will be much smaller. I just like to diss PHP occasionally.

  9. #9
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    How about CouchDB?

  10. #10
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    There's also memcache, which is what you want, just memory-only. But if you write a server, you might want to be interface-compatible.
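
    If interface compatibility means speaking memcached's text protocol, the two commands a blob server would need look roughly like the sketch below. The helper names are made up for illustration, and only the plain "set" and single-key "get" forms are covered.

```python
def encode_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a text-protocol 'set' request: header line, data block, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def encode_get(key: str) -> bytes:
    """Build a text-protocol 'get' request for a single key."""
    return f"get {key}\r\n".encode("ascii")


def decode_get_reply(reply: bytes):
    """Parse the reply to a single-key 'get'; return the value, or None on a miss."""
    if reply == b"END\r\n":
        return None
    header, _, rest = reply.partition(b"\r\n")
    _value, _key, _flags, length = header.split()
    # The length field says how many bytes of data follow the header line.
    return rest[: int(length)]
```

    A server that answers these two commands the same way could be used with existing memcached client libraries, at least for basic store and retrieve.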
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  11. #11
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Quote Originally Posted by MK27
    I kind of agree that SQL-style relational databases are overrated and overused. I might guess that the reason you haven't heard of anything like this is because it is so simple -- by the time most web programmers realize they don't have to use SQL, they probably also realize most (or a lot) of stuff can be done just as well, only easier, with a flat file.
    Sure, and how do you deal with deadlocks, atomic operations, rollbacks and the various other features most DBMSs provide? Plus, not being relational, or not having much structure at all, puts a lot of onus on the application rather than the DBMS. Don't forget which group of programmers (read: web designers) uses PHP.

    Not to mention changing your file structure could spell the end of the world if you have hundreds of gigs of files.
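
    To show what "onus on the application" looks like in practice, here is a sketch of just one of those features, an atomic single-file update, done by hand with a temporary file and a rename. The function name atomic_write is hypothetical, and this covers neither rollbacks nor updates that span several keys.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` so readers see the old or the new contents, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp_path)
        raise
```

    Even this small piece takes some care to get right, which is the kind of work a DBMS does for you.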

    Quote Originally Posted by MK27
    I just like to diss PHP occasionally.
    We ought to start a club for that
    Last edited by zacs7; 12-03-2009 at 12:07 AM.

    Last Post: 07-19-2002, 01:54 PM