Thread: Scene / image recognition

  1. #1
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472

    Scene / image recognition

    Hi all, I would appreciate any thoughts on a new project I wish to undertake. I am rather fascinated by the idea of image recognition software, and I would like to write something that can pick categories of items from a scene. I was wondering how to approach this sort of thing. The idea of a template 'lookup' definition came to mind, with a library of basic objects as a starting point; I could then use heuristics etc. in an evaluation routine. I'd be happy if I just got the thing to output 'two cups', 'two saucers'. But I think even with a two-tone silhouette this would be tough, never mind the problems with a photograph, the multitude of angles and images, etc. Thanks for any thoughts on different approaches to this.
    Thought for the day:
    "Are you sure your sanity chip is fully screwed in sir?" (Kryten)
    FLTK: "The most fun you can have with your clothes on."

    Stroustrup:
    "If I had thought of it and had some marketing sense every computer and just about any gadget would have had a little 'C++ Inside' sticker on it'"

  2. #2
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    This is obviously a very hard problem, but one technique I've used in the past turned out to be fairly simple to implement, remarkably accurate, and computationally cheap: the humble wavelet. There's a little more to it, of course. The color data was first converted to a meaningful color space and then normalized to a signed-number representation before convolution. Statistical calculations were then applied to each "level" of the multiresolution result, and those statistics were compared with the ones from another image to come up with an overall final "score". Well, that's the basic idea, anyhow.
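    A minimal sketch of the pipeline described above, under loud assumptions: it works on a 1-D row of grayscale values rather than a full 2-D color image, it uses the mean absolute Haar detail coefficient as the per-level statistic, and the names `haar_level_stats` and `compare_stats` are hypothetical. The post does not say which statistics or color space were actually used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns one statistic per level of a 1-D Haar
// decomposition (finest level first). The statistic chosen here, the mean
// absolute detail coefficient, is an assumption made for illustration.
std::vector<double> haar_level_stats(std::vector<double> signal)
{
    const std::size_t n = signal.size();
    assert(n >= 2 && (n & (n - 1)) == 0); // length must be a power of two

    // Normalize to a signed, zero-mean representation before filtering.
    double mean = 0.0;
    for (double v : signal) mean += v;
    mean /= static_cast<double>(n);
    for (double& v : signal) v -= mean;

    std::vector<double> stats;
    for (std::size_t len = n; len >= 2; len /= 2)
    {
        std::vector<double> averages(len / 2);
        double detail_sum = 0.0;
        for (std::size_t i = 0; i < len / 2; ++i)
        {
            const double a = signal[2 * i];
            const double b = signal[2 * i + 1];
            averages[i] = (a + b) / 2.0;            // low-pass half
            detail_sum += std::fabs((a - b) / 2.0); // high-pass half
        }
        stats.push_back(detail_sum / static_cast<double>(len / 2));
        signal.swap(averages); // recurse on the coarser approximation
    }
    return stats;
}

// Equal-weight score: mean absolute difference of the per-level statistics.
// 0 means the two signatures are identical; larger means less similar.
double compare_stats(const std::vector<double>& a, const std::vector<double>& b)
{
    assert(a.size() == b.size() && !a.empty());
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        total += std::fabs(a[i] - b[i]);
    return total / static_cast<double>(a.size());
}
```

    For a real image the same step would be applied along rows and then columns at each level, once per color channel; the scoring idea stays the same.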

    [EDIT]
    Re-reading your post now, I realize that you're asking for something a bit more elaborate than what I described (a means to identify each object in the scene). Finding multiple objects is just a matter of iterating over the result and grouping it - nothing too challenging, really...
    [/EDIT]
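    One common way to do this kind of "iterate and group" step is connected-component labeling on a binary mask. That is an assumption on my part; the post does not say which grouping method was meant. A minimal sketch, with the hypothetical name `label_regions`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Labels 4-connected regions of non-zero cells in a row-major mask.
// Returns the number of regions found; labels[i] is 0 for background
// and 1..count for cells belonging to a region.
int label_regions(const std::vector<int>& mask,
                  std::size_t width, std::size_t height,
                  std::vector<int>& labels)
{
    assert(mask.size() == width * height);
    labels.assign(mask.size(), 0);

    int count = 0;
    std::vector<std::size_t> stack; // explicit stack avoids deep recursion
    for (std::size_t start = 0; start < mask.size(); ++start)
    {
        if (mask[start] == 0 || labels[start] != 0)
            continue; // background, or already part of a region

        ++count;
        labels[start] = count;
        stack.push_back(start);
        while (!stack.empty())
        {
            const std::size_t cur = stack.back();
            stack.pop_back();
            const std::size_t x = cur % width;
            const std::size_t y = cur / width;

            // Claim an unlabeled foreground neighbour for the current region.
            auto visit = [&](std::size_t nx, std::size_t ny) {
                const std::size_t idx = ny * width + nx;
                if (mask[idx] != 0 && labels[idx] == 0)
                {
                    labels[idx] = count;
                    stack.push_back(idx);
                }
            };
            if (x > 0)          visit(x - 1, y);
            if (x + 1 < width)  visit(x + 1, y);
            if (y > 0)          visit(x, y - 1);
            if (y + 1 < height) visit(x, y + 1);
        }
    }
    return count;
}
```

    Each labeled region can then be cropped and scored on its own, which is what turns a whole-image comparison into a per-object one.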
    Last edited by Sebastiani; 07-21-2014 at 12:41 AM.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  3. #3
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    I would try to combine the fingerprint approach used by Digikam for fuzzy search of similar images with edge processing, which could be used to split the image into separate objects, and then compute a fingerprint for each subarea separately.
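    A minimal sketch of the edge-processing half only, assuming a grayscale image stored row-major and using the Sobel operator. The post does not name a specific edge detector, so Sobel is a stand-in, `sobel_magnitude` is a hypothetical name, and the Digikam fingerprint itself is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the Sobel gradient magnitude for each interior pixel of a
// row-major grayscale image. Border pixels are left at 0.
std::vector<double> sobel_magnitude(const std::vector<double>& gray,
                                    std::size_t width, std::size_t height)
{
    assert(gray.size() == width * height);
    std::vector<double> out(gray.size(), 0.0);
    if (width < 3 || height < 3)
        return out; // no interior pixels to process

    for (std::size_t y = 1; y + 1 < height; ++y)
    {
        for (std::size_t x = 1; x + 1 < width; ++x)
        {
            // p(dx, dy) reads the 3x3 neighbourhood, with (1, 1) the centre.
            auto p = [&](std::size_t dx, std::size_t dy) {
                return gray[(y + dy - 1) * width + (x + dx - 1)];
            };
            // Horizontal gradient: right column minus left column.
            const double gx = (p(2, 0) + 2.0 * p(2, 1) + p(2, 2))
                            - (p(0, 0) + 2.0 * p(0, 1) + p(0, 2));
            // Vertical gradient: bottom row minus top row.
            const double gy = (p(0, 2) + 2.0 * p(1, 2) + p(2, 2))
                            - (p(0, 0) + 2.0 * p(1, 0) + p(2, 0));
            out[y * width + x] = std::sqrt(gx * gx + gy * gy);
        }
    }
    return out;
}
```

    Thresholding the result gives a binary edge map, which is one possible starting point for splitting the image into subareas.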
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  4. #4
    Registered User rogster001's Avatar
    nothing too challenging, really...
    Hehe, it's up there... I was considering something more like what vart has suggested: trying to obtain data that can be broken down to a 'shape' parameter. I'm definitely thinking edge detection, and then my idea was to have a template library to compare that data against. I suppose that would be required anyway?

    [EDIT] I see Sebastiani had already indicated use of a lookup
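    A minimal sketch of the template-library lookup, assuming each shape has already been reduced to a small vector of numeric 'shape' parameters. The `ShapeTemplate` struct, the `match_template` function, and nearest-neighbour matching by squared Euclidean distance are all hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// One library entry: a label plus its shape parameters (hypothetical layout).
struct ShapeTemplate
{
    std::string name;
    std::vector<double> features;
};

// Returns the name of the library entry whose features are closest to the
// query, using squared Euclidean distance.
std::string match_template(const std::vector<ShapeTemplate>& library,
                           const std::vector<double>& features)
{
    assert(!library.empty());
    std::string best;
    double best_dist = std::numeric_limits<double>::infinity();
    for (const ShapeTemplate& t : library)
    {
        assert(t.features.size() == features.size());
        double dist = 0.0;
        for (std::size_t i = 0; i < features.size(); ++i)
        {
            const double diff = t.features[i] - features[i];
            dist += diff * diff;
        }
        if (dist < best_dist)
        {
            best_dist = dist;
            best = t.name;
        }
    }
    return best;
}
```

    With a library such as `{"cup", {1.0, 0.6}}` and `{"saucer", {3.0, 0.9}}` (made-up aspect-ratio and fill-ratio values), a query of `{2.8, 0.85}` would be matched to "saucer". In practice a distance threshold would also be needed so that unknown shapes are rejected rather than forced onto the nearest entry.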
    Last edited by rogster001; 07-22-2014 at 01:53 PM.

  5. #5
    Registered User rogster001's Avatar
    Statistical calculations were then applied to each "level" of the multiresolutional result which was to be compared to that of another image to come up with an overall final "score"
    Thanks, that would be something I'd like to explore. I get using different resolutions, yes. Would you start from the lowest resolution? That seems the natural one for speed, maybe.
    Last edited by rogster001; 07-22-2014 at 02:14 PM.

  6. #6
    Guest Sebastiani's Avatar
    Quote Originally Posted by rogster001 View Post
    Thanks, that would be something I'd like to explore. I get using different resolutions, yes. Would you start from the lowest resolution? That seems the natural one for speed, maybe.
    I would say all of them, preferably. I employed an equal-weight contribution at every scale, but I suppose you could get even better results by assigning higher weights to the higher-resolution levels of the wavelet data.
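    A minimal sketch of the two weighting schemes mentioned above, assuming the per-level differences between two images have already been computed. The function name `weighted_score` and the example weights are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combines per-level differences into one score as a weighted average.
// Passing identical weights reproduces the equal-weight scheme.
double weighted_score(const std::vector<double>& level_diffs,
                      const std::vector<double>& weights)
{
    assert(!level_diffs.empty() && level_diffs.size() == weights.size());
    double sum = 0.0;
    double weight_sum = 0.0;
    for (std::size_t i = 0; i < level_diffs.size(); ++i)
    {
        assert(weights[i] >= 0.0);
        sum += weights[i] * level_diffs[i];
        weight_sum += weights[i];
    }
    assert(weight_sum > 0.0); // at least one level must contribute
    return sum / weight_sum;
}
```

    With the finest level first, weights of `{1, 1, 1}` give the equal contribution described, while something like `{3, 2, 1}` favours the higher-resolution levels. Whether that actually improves results would need to be measured on real images.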

    Quote Originally Posted by rogster001 View Post
    I was considering something more like what vart has suggested: trying to obtain data that can be broken down to a 'shape' parameter. I'm definitely thinking edge detection, and then my idea was to have a template library to compare that data against. I suppose that would be required anyway?
    Sure, you can do that, but don't expect spectacular results from it on its own; maybe couple it with some other method to get a hybrid comparison of the data. I used nothing more than a simple Haar band-pass filter and still achieved amazing results (and it was fast and efficient, too).
    Last edited by Sebastiani; 07-22-2014 at 03:35 PM.
