video tracking in C
I have to implement a system that analyses the image data captured by
a video camera mounted in the window of a shop.
(The images captured are in a grey-scale format)
I need to perform basic activity recognition, person tracking and person recognition.
The activity recognition is aimed to determine the level of interest
in the items displayed in the shop window. There are two types of
activities that need to be recognized: (1) person walking past the
shop and (2) person looking at the shop window. By counting the number
of persons that stop and look for some time at the window, the level
of interest in the shop can then be derived.
Person tracking and recognition is aimed at determining how many
different persons pass in front of the shop. This information can
again be used to determine the level of interest in the shop (a person may
be returning to get another look at the items on display) as well as
to identify possible criminal intent (“scoping the place”).
Therefore, the system needs to be able to track and identify all
persons in the scene and label their overall activity.
The activity needs to be labeled based on the cumulative information
about each person tracked. Specifically,
each new person entering the scene will be given the “person walking”
activity label. If they then stop in front of the shop, their activity
changes to “person looking at the shop window”. The system receives
the information in the form of a sequence of images.
The system should be able to do the following:
a) build a suitable average frame from a given sequence of images
b) clearly specify how many persons are present in each of the test
frames as well as:
i) the position and identity of each of the persons (using a
bounding box) and the label of their activity
ii) clearly specify if any of the persons have been seen before in
So I need some advice on how I can achieve this.
Where should I start (is there anything I should learn first?) to make something awesome like that?
The very beginning
I am assuming that the camera doesn't move.
So, if you have a sequence of frames, you can take the pixel-by-pixel difference between
each pair of consecutive frames. The blur that shows up in that difference is the moving things, and those are the people, since all the other objects don't move.
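Here is a minimal sketch of that differencing step in C. It assumes each frame is a flat array of 8-bit grey-scale pixels, which is my assumption about the data layout. The function names (average_frame, motion_mask) and the threshold parameter are made up for illustration and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Average n_frames grey-scale frames (n_pixels bytes each) into out.
 * This is one simple way to build the "average frame" from item a). */
void average_frame(const uint8_t *const *frames, size_t n_frames,
                   size_t n_pixels, uint8_t *out)
{
    assert(n_frames > 0);
    for (size_t p = 0; p < n_pixels; p++) {
        unsigned long sum = 0;
        for (size_t f = 0; f < n_frames; f++)
            sum += frames[f][p];
        /* rounded integer mean of this pixel over all frames */
        out[p] = (uint8_t)((sum + n_frames / 2) / n_frames);
    }
}

/* Set mask[p] to 1 where frame differs from reference by more than
 * threshold grey levels, otherwise 0. Returns the number of pixels set.
 * reference can be the previous frame or the average frame. */
size_t motion_mask(const uint8_t *frame, const uint8_t *reference,
                   size_t n_pixels, int threshold, uint8_t *mask)
{
    size_t changed = 0;
    for (size_t p = 0; p < n_pixels; p++) {
        int d = abs((int)frame[p] - (int)reference[p]);
        mask[p] = (uint8_t)(d > threshold);
        changed += mask[p];
    }
    return changed;
}
```

Passing the previous frame as reference gives the frame-by-frame difference described above. Passing the average frame instead keeps a person visible in the mask after they stop moving, which matters for the "looking at the window" case. The threshold has to be tuned on your own footage.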
If you have proper colour/image treatment and a simple convex recognition
algorithm, you will be able to count the number of people in each "test frame".
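One possible way to do the counting, sketched below, is connected-component labelling on a binary motion mask. Note that this is a technique I am swapping in, not the "convex recognition" mentioned above. The Box struct, the find_blobs function and the min_area parameter are hypothetical names for this sketch, and the mask is assumed to hold one byte per pixel (non-zero meaning "changed").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bounding box with inclusive corners, plus the blob's pixel count. */
typedef struct { int x0, y0, x1, y1; size_t area; } Box;

/* Find 4-connected blobs of non-zero pixels in mask (width*height bytes).
 * Blobs with at least min_area pixels are counted; the first max_boxes
 * of them are stored in boxes. Returns the count, or -1 if malloc fails.
 * The mask is consumed: every visited pixel is cleared to 0. */
int find_blobs(uint8_t *mask, int width, int height, size_t min_area,
               Box *boxes, int max_boxes)
{
    assert(width > 0 && height > 0);
    size_t n = (size_t)width * (size_t)height;
    int *stack = malloc(n * sizeof *stack);   /* each pixel is pushed at most once */
    if (!stack) return -1;

    int count = 0;
    for (int start = 0; start < (int)n; start++) {
        if (!mask[start]) continue;
        Box b = { width, height, -1, -1, 0 };
        size_t top = 0;
        stack[top++] = start;
        mask[start] = 0;
        while (top > 0) {                     /* iterative flood fill */
            int p = stack[--top];
            int x = p % width, y = p / width;
            if (x < b.x0) b.x0 = x;
            if (x > b.x1) b.x1 = x;
            if (y < b.y0) b.y0 = y;
            if (y > b.y1) b.y1 = y;
            b.area++;
            /* push the unvisited left, right, upper and lower neighbours */
            if (x > 0 && mask[p - 1]) { mask[p - 1] = 0; stack[top++] = p - 1; }
            if (x < width - 1 && mask[p + 1]) { mask[p + 1] = 0; stack[top++] = p + 1; }
            if (y > 0 && mask[p - width]) { mask[p - width] = 0; stack[top++] = p - width; }
            if (y < height - 1 && mask[p + width]) { mask[p + width] = 0; stack[top++] = p + width; }
        }
        if (b.area >= min_area) {             /* small blobs are treated as noise */
            if (count < max_boxes) boxes[count] = b;
            count++;
        }
    }
    free(stack);
    return count;
}
```

The return value is then a rough person count for the frame, and each Box is a candidate bounding box. Two people who overlap in the image will come out as one blob, so treat the count as an estimate.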
Cutting the test frame into pieces might help you identify whether the "people blurs"
are near the window or not. The rest is up to you.
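As a rough illustration of the "near the window" idea, the sketch below keeps one small record per tracked person and switches the label from walking to looking once the person has stayed nearly still inside a window zone for a number of consecutive frames. Everything here is an assumption for illustration: the Zone rectangle, the Track struct, the function names, and the max_move and min_still_frames parameters. Matching a blob to the right Track from frame to frame is not shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { PERSON_WALKING, PERSON_LOOKING } Activity;

/* Hypothetical rectangle (inclusive corners) in front of the shop window. */
typedef struct { int x0, y0, x1, y1; } Zone;

typedef struct {
    int cx, cy;         /* centre of the person's bounding box in the last frame */
    int still_frames;   /* consecutive frames spent nearly still inside the zone */
    Activity activity;
} Track;

/* Every new person starts with the "person walking" label. */
void track_init(Track *t, int cx, int cy)
{
    t->cx = cx;
    t->cy = cy;
    t->still_frames = 0;
    t->activity = PERSON_WALKING;
}

/* Feed the person's new centre for the current frame.
 * The label becomes PERSON_LOOKING after min_still_frames consecutive
 * frames in which the centre is inside window and has moved at most
 * max_move pixels per axis. The label is cumulative: once set to
 * PERSON_LOOKING it is never reset. */
Activity track_update(Track *t, int cx, int cy, Zone window,
                      int max_move, int min_still_frames)
{
    assert(max_move >= 0 && min_still_frames > 0);
    int in_zone = cx >= window.x0 && cx <= window.x1 &&
                  cy >= window.y0 && cy <= window.y1;
    int still = abs(cx - t->cx) <= max_move && abs(cy - t->cy) <= max_move;

    t->still_frames = (in_zone && still) ? t->still_frames + 1 : 0;
    if (t->still_frames >= min_still_frames)
        t->activity = PERSON_LOOKING;

    t->cx = cx;
    t->cy = cy;
    return t->activity;
}
```

Counting how many tracks end up as PERSON_LOOKING then gives the level-of-interest number asked for in the question. The zone coordinates and both thresholds depend on the camera placement and frame rate.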
I believe this is a good way to start. I don't know of any open-source
image recognition APIs that would help you, but Google will.
Thank you for the information. I like this forum.
Please contact me
ennavakillesal is including email addresses so they can check if their bots are able to pull addresses from posts for the purpose of spamming.