Multithreaded Facial Recognition with OpenCV

It has been quite a while since I have been maintaining this blog and posting information and code to work with. Lately I started noticing that this can become tedious, so from now on I will try to give access to my projects and work using Git. I have had a GitHub account since September 2013, but never got around to using it.

This project is a modification of the face recognition project that ships with Mastering OpenCV with Practical Computer Vision Projects. The book is available from Packt Publishing and Amazon. The code base is at https://github.com/MasteringOpenCV and is maintained by Shervin Emami.

I was trying to do the same on a Toradex Colibri T30 module, which is based on the NVIDIA Tegra 3 and has four CPU cores. The original code is single threaded and as such cannot detect faces while the training process is going on. I changed this so that face detection keeps running even while training is in progress. And mind you, training can go on for quite a while if there are more than 3-4 faces. So this is basically a two-threaded version of the original code, along with a few more changes to suit my personal requirements. You could actually go one step further and utilise three cores, though right now I can't recall what the job of the third core was supposed to be.
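The idea can be sketched with plain pthreads: a worker thread runs the long training step while the main thread keeps doing detection work. This is only an illustrative skeleton under my own naming — trainRecognizer, detectLoop and runDemo are made-up stand-ins, not functions from the actual project:

```cpp
// Hypothetical sketch: run a long "training" job on a worker thread while
// the main thread keeps "detecting". Sleeps stand in for real work.
#include <pthread.h>
#include <unistd.h>
#include <cassert>

static pthread_mutex_t stateMutex = PTHREAD_MUTEX_INITIALIZER;
static bool trainingDone = false;
static int framesProcessed = 0;

static void* trainRecognizer(void*)     // stands in for the slow training step
{
    usleep(100 * 1000);                 // pretend training takes 100 ms
    pthread_mutex_lock(&stateMutex);
    trainingDone = true;
    pthread_mutex_unlock(&stateMutex);
    return NULL;
}

static void detectLoop()                // keeps running while training is busy
{
    bool done = false;
    while (!done)
    {
        framesProcessed++;              // stands in for detect-and-draw work
        usleep(10 * 1000);
        pthread_mutex_lock(&stateMutex);
        done = trainingDone;            // flag is the only shared state
        pthread_mutex_unlock(&stateMutex);
    }
}

int runDemo()
{
    pthread_t trainer;
    pthread_create(&trainer, NULL, trainRecognizer, NULL);
    detectLoop();                       // detection is NOT blocked by training
    pthread_join(trainer, NULL);
    return framesProcessed;
}
```

The point is just that detection iterations keep accumulating while the trainer thread is asleep; only the completion flag is shared, so a single mutex suffices.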

I do apologize for the code not being very clean. At first I tried to use the threading facilities available in C++, but since I am no C++ expert I ran into problems which I wasn't able to fix quickly, so I decided to use pthreads, which I am much more familiar and comfortable with. You will find the C++ threading code I was attempting commented out. Once I gain some C++ mastery using Bruce Eckel's Thinking in C++, I will try to redo it cleanly in pure C++, or clean it up anyway when I get time.

You can clone the project with:

git clone https://github.com/SanchayanMaity/MultithreadedFaceRecognition.git

You will need to modify the Makefile to compile the project for your platform, which can be a PC or an embedded board. Please note that this project is only useful on a platform with two or more cores.

Hope you guys find it useful. Cheers! And Git and Linus are awesome.

Extracting frame from a gstreamer pipeline and displaying it with OpenCV

Not much to write or say in this post. I was trying to extract a frame from a GStreamer pipeline and then display it with OpenCV.

There are two approaches in the code below.

1. Register a callback that is invoked whenever a new buffer becomes available on appsink, then use a locking mechanism to synchronize frame extraction with display in the main thread.

2. Pull the buffer yourself in a while loop in the main thread.

The second approach is active in the code below and the first one is commented out. To enable the first mechanism, uncomment the mutex locking and signal-connect code and comment out the pull-buffer related code in the while loop.
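The synchronization in the first approach can be sketched without any GStreamer at all: a worker thread (standing in for the appsink callback) hands frames to the main thread through a mutex and condition variable. All names here are illustrative, and an int stands in for the GstBuffer:

```cpp
// Minimal sketch of the callback-signals-consumer pattern of approach 1.
#include <pthread.h>
#include <cassert>

static pthread_mutex_t frameMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t frameReady = PTHREAD_COND_INITIALIZER;
static int latestFrame = 0;        // stands in for the pulled GstBuffer
static bool haveFrame = false;

static void* producer(void*)       // plays the role of CaptureGstBuffer
{
    for (int i = 1; i <= 3; i++)
    {
        pthread_mutex_lock(&frameMutex);
        latestFrame = i;           // "pull" a new buffer; may overwrite an
        haveFrame = true;          // unconsumed one (i.e. frames get dropped)
        pthread_cond_signal(&frameReady);
        pthread_mutex_unlock(&frameMutex);
    }
    return NULL;
}

int consumeFrames(int count)       // plays the role of the display loop
{
    int consumed = 0;
    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);
    while (consumed < count)
    {
        pthread_mutex_lock(&frameMutex);
        while (!haveFrame)         // guard against spurious wakeups
            pthread_cond_wait(&frameReady, &frameMutex);
        haveFrame = false;         // "display" latestFrame here
        consumed++;
        pthread_mutex_unlock(&frameMutex);
    }
    pthread_join(t, NULL);
    return consumed;
}
```

Letting the producer overwrite an unconsumed frame loosely mirrors configuring appsink with drop enabled and max-buffers set to 1, as the real code does.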

Learn more about GStreamer from http://gstreamer.freedesktop.org/data/doc/gstreamer/head/manual/html/index.html and refer especially to section 19.

For some reason I am experiencing a memory leak with the code below (more so with the first approach) and have not been able to fix it yet; the likely culprit is creating a new IplImage for every frame without ever releasing it. Also, the GStreamer pipeline elements will be different for your platform. Another problem is that my GStreamer source element gives me x-raw-yuv data and I am only able to display a black and white image with OpenCV, presumably because the single-channel image only wraps the Y (luma) plane of the I420 buffer. Nonetheless, I thought this might be useful, and maybe someone can point out the error to me. I am not a GStreamer expert by any means.
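The greyscale output is at least consistent with the I420 layout: the buffer starts with a full-resolution Y (luma) plane, followed by quarter-resolution U and V planes, so wrapping the raw data as a single-channel 8-bit image shows only luma. A small sketch of the arithmetic (helper names are made up):

```cpp
// Sketch of the I420 memory layout: Y plane (w*h bytes) first, then
// U and V planes at quarter resolution (w*h/4 bytes each).
#include <vector>
#include <cstddef>
#include <cassert>

// Total bytes of one I420 frame.
size_t i420FrameSize(size_t w, size_t h)
{
    return w * h + 2 * (w * h / 4);   // = w*h*3/2 for even dimensions
}

// The plane a single-channel 8-bit image wrapped around the raw buffer
// would actually display: the Y plane, which starts at offset 0.
const unsigned char* i420LumaPlane(const std::vector<unsigned char>& buf)
{
    return buf.data();
}
```

For the 720x576 caps used below, that is 720*576*3/2 = 622080 bytes per buffer, of which only the first 414720 (the luma plane) end up on screen.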


#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv/cv.h>
#include <gstreamer-0.10/gst/gst.h>
#include <gstreamer-0.10/gst/gstelement.h>
#include <gstreamer-0.10/gst/app/gstappsink.h>
#include <iostream>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>

using namespace std;
using namespace cv;

/* Structure to contain all our information, so we can pass it around */
typedef struct _CustomData
{
    GstElement *appsink;
    GstElement *colorSpace;    
    GstElement *pipeline;
    GstElement *vsource_capsfilter, *mixercsp_capsfilter, *cspappsink_capsfilter;
    GstElement *mixer_capsfilter;
    GstElement *bin_capture;
    GstElement *video_source, *deinterlace;     
    GstElement *nv_video_mixer;    
    GstPad *pad;
    GstCaps *srcdeinterlace_caps, *mixercsp_caps, *cspappsink_caps;    
    GstBus *bus;
    GstMessage *msg;        
}gstData;

GstBuffer* buffer;        

pthread_mutex_t threadMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t waitForGstBuffer = PTHREAD_COND_INITIALIZER; 

/* Global variables */
CascadeClassifier face_cascade;
IplImage *frame = NULL;     
string window_name = "Toradex Face Detection Demo";
String face_cascade_name = "/home/root/haarcascade_frontalface_alt2.xml";
const int BORDER = 8;   // Border between GUI elements to the edge of the image.

template <typename T> string toString(T t)
{
    ostringstream out;
    out << t;
    return out.str();
}

// Draw text into an image. Defaults to top-left-justified text, but you can give negative x coords for right-justified text,
// and/or negative y coords for bottom-justified text
// Returns the bounding rect around the drawn text
Rect drawString(Mat img, string text, Point coord, Scalar color, float fontScale = 0.6f, int thickness = 1, int fontFace = FONT_HERSHEY_COMPLEX)
{
    // Get the text size & baseline.
    int baseline = 0;
    Size textSize = getTextSize(text, fontFace, fontScale, thickness, &baseline);
    baseline += thickness;

    // Adjust the coords for left/right-justified or top/bottom-justified.
    if (coord.y >= 0) {
        // Coordinates are for the top-left corner of the text from the top-left of the image, so move down by one row.
        coord.y += textSize.height;
    }
    else {
        // Coordinates are for the bottom-left corner of the text from the bottom-left of the image, so come up from the bottom.
        coord.y += img.rows - baseline + 1;
    }
    // Become right-justified if desired.
    if (coord.x < 0) {
        coord.x += img.cols - textSize.width + 1;
    }

    // Get the bounding box around the text.
    Rect boundingRect = Rect(coord.x, coord.y - textSize.height, textSize.width, baseline + textSize.height);

    // Draw anti-aliased text.
    putText(img, text, coord, fontFace, fontScale, color, thickness, CV_AA);

    // Let the user know how big their text is, in case they want to arrange things.
    return boundingRect;
}

void create_pipeline(gstData *data)
{
    data->pipeline = gst_pipeline_new ("pipeline");
    gst_element_set_state (data->pipeline, GST_STATE_NULL);
}

gboolean CaptureGstBuffer(GstAppSink *sink, gstData *data)
{            
    //g_signal_emit_by_name (sink, "pull-buffer", &buffer);
    pthread_mutex_lock(&threadMutex);
    buffer = gst_app_sink_pull_buffer(sink);
    if (buffer)
    {
        if (frame == NULL)
        {
            /* Create the image header only once and reuse it. The caps
               negotiated below are I420, so wrap the 8 bit single channel
               luma plane; allocating a fresh IplImage on every callback
               (and with a 16 bit 3 channel format that does not match the
               buffer) leaks memory. */
            frame = cvCreateImageHeader(cvSize(720, 576), IPL_DEPTH_8U, 1);
        }
        if (frame == NULL)
        {
            g_printerr("IplImageFrame is null.\n");
        }
        else
        {
            frame->imageData = (char*)GST_BUFFER_DATA(buffer);
            if (frame->imageData == NULL)
            {
                g_printerr("IplImage data is null.\n");
            }
        }
        pthread_cond_signal(&waitForGstBuffer);
    }
    pthread_mutex_unlock(&threadMutex);
    return TRUE;
}

gboolean init_video_capture(gstData *data)
{    
    data->video_source = gst_element_factory_make("v4l2src", "video_source_live");
    data->vsource_capsfilter = gst_element_factory_make ("capsfilter", "vsource_cptr_capsfilter");
    data->deinterlace = gst_element_factory_make("deinterlace", "deinterlace_live");
    data->nv_video_mixer = gst_element_factory_make("nv_omx_videomixer", "nv_video_mixer_capture");    
    data->mixercsp_capsfilter = gst_element_factory_make ("capsfilter", "mixercsp_capsfilter");
    data->colorSpace = gst_element_factory_make("ffmpegcolorspace", "csp");        
    data->cspappsink_capsfilter = gst_element_factory_make ("capsfilter", "cspappsink_capsfilter");
    data->appsink = gst_element_factory_make("appsink", "asink");
        
    if (!data->video_source || !data->vsource_capsfilter || !data->deinterlace || !data->nv_video_mixer || !data->mixercsp_capsfilter || !data->appsink \
        || !data->colorSpace || !data->cspappsink_capsfilter)
    {
        g_printerr ("Not all elements for video were created.\n");
        return FALSE;
    }        
    
    g_signal_connect( data->pipeline, "deep-notify", G_CALLBACK( gst_object_default_deep_notify ), NULL );        
    
    gst_app_sink_set_emit_signals((GstAppSink*)data->appsink, true);
    gst_app_sink_set_drop((GstAppSink*)data->appsink, true);
    gst_app_sink_set_max_buffers((GstAppSink*)data->appsink, 1);    
    
    data->srcdeinterlace_caps = gst_caps_from_string("video/x-raw-yuv, width=(int)720, height=(int)576, format=(fourcc)I420, framerate=(fraction)1/1");        
    if (!data->srcdeinterlace_caps)
        g_printerr("1. Could not create media format string.\n");        
    g_object_set (G_OBJECT (data->vsource_capsfilter), "caps", data->srcdeinterlace_caps, NULL);
    gst_caps_unref(data->srcdeinterlace_caps);        
    
    data->mixercsp_caps = gst_caps_from_string("video/x-raw-yuv, width=(int)720, height=(int)576, format=(fourcc)I420, framerate=(fraction)1/1, pixel-aspect-ratio=(fraction)1/1");    
    if (!data->mixercsp_caps)
        g_printerr("2. Could not create media format string.\n");        
    g_object_set (G_OBJECT (data->mixercsp_capsfilter), "caps", data->mixercsp_caps, NULL);
    gst_caps_unref(data->mixercsp_caps);    
    
    data->cspappsink_caps = gst_caps_from_string("video/x-raw-yuv, width=(int)720, height=(int)576, format=(fourcc)I420, framerate=(fraction)1/1");        
    if (!data->cspappsink_caps)
        g_printerr("3. Could not create media format string.\n");        
    g_object_set (G_OBJECT (data->cspappsink_capsfilter), "caps", data->cspappsink_caps, NULL);    
    gst_caps_unref(data->cspappsink_caps);        
            
    data->bin_capture = gst_bin_new ("bin_capture");        
    
    /*if(g_signal_connect(data->appsink, "new-buffer", G_CALLBACK(CaptureGstBuffer), NULL) <= 0)
    {
        g_printerr("Could not connect signal handler.\n");
        exit(1);
    }*/
    
    gst_bin_add_many (GST_BIN (data->bin_capture), data->video_source, data->vsource_capsfilter, data->deinterlace, data->nv_video_mixer, \
                        data->mixercsp_capsfilter, data->colorSpace, data->cspappsink_capsfilter, data->appsink, NULL);
    
    if (gst_element_link_many(data->video_source, data->vsource_capsfilter, data->deinterlace, NULL) != TRUE)
    {
        g_printerr ("video_src to deinterlace not linked.\n");
        return FALSE;
    }        
    
    if (gst_element_link_many (data->deinterlace, data->nv_video_mixer, NULL) != TRUE)
    {
        g_printerr ("deinterlace to video_mixer not linked.\n");
        return FALSE;
    }        
    
    if (gst_element_link_many (data->nv_video_mixer, data->mixercsp_capsfilter, data->colorSpace, NULL) != TRUE)
    {
        g_printerr ("video_mixer to colorspace not linked.\n");
        return FALSE;    
    }
    
    /* Route through cspappsink_capsfilter, which is configured and added to
       the bin above but would otherwise be left unlinked. */
    if (gst_element_link_many (data->colorSpace, data->cspappsink_capsfilter, data->appsink, NULL) != TRUE)
    {
        g_printerr ("colorspace to appsink not linked.\n");
        return FALSE;    
    }
    
    cout << "Returns from init_video_capture." << endl;
    return TRUE;
}

void delete_pipeline(gstData *data)
{
    gst_element_set_state (data->pipeline, GST_STATE_NULL);
    g_print ("Pipeline set to NULL\n");
    gst_object_unref (data->bus);
    gst_object_unref (data->pipeline);
    g_print ("Pipeline deleted\n");
}

gboolean add_bin_capture_to_pipe(gstData *data)
{
    if((gst_bin_add(GST_BIN (data->pipeline), data->bin_capture)) != TRUE)
    {
        g_print("bin_capture not added to pipeline\n");
    }
    
    if(gst_element_set_state (data->pipeline, GST_STATE_NULL) == GST_STATE_CHANGE_SUCCESS)
    {        
        return TRUE;
    }
    else
    {
        cout << "Failed to set pipeline state to NULL." << endl;
        return FALSE;        
    }
}

gboolean remove_bin_capture_from_pipe(gstData *data)
{
    gst_element_set_state (data->pipeline, GST_STATE_NULL);
    gst_element_set_state (data->bin_capture, GST_STATE_NULL);
    if((gst_bin_remove(GST_BIN (data->pipeline), data->bin_capture)) != TRUE)
    {
        g_print("bin_capture not removed from pipeline\n");
    }    
    return TRUE;
}

gboolean start_capture_pipe(gstData *data)
{
    /* A live source such as v4l2src returns GST_STATE_CHANGE_ASYNC or
       GST_STATE_CHANGE_NO_PREROLL here, so only treat an explicit
       FAILURE as an error. */
    if(gst_element_set_state (data->pipeline, GST_STATE_PLAYING) != GST_STATE_CHANGE_FAILURE)
        return TRUE;
    else
    {
        cout << "Failed to set pipeline state to PLAYING." << endl;
        return FALSE;
    }
}

gboolean stop_capture_pipe(gstData *data)
{
    gst_element_set_state (data->bin_capture, GST_STATE_NULL);
    gst_element_set_state (data->pipeline, GST_STATE_NULL);
    return TRUE;
}

gboolean deinit_video_live(gstData *data)
{
    gst_element_set_state (data->pipeline, GST_STATE_NULL);
    gst_element_set_state (data->bin_capture, GST_STATE_NULL);
    gst_object_unref (data->bin_capture);
    return TRUE;
}

gboolean check_bus_cb(gstData *data)
{
    GError *err = NULL;                
    gchar *dbg = NULL;   
          
    g_print("Got message: %s\n", GST_MESSAGE_TYPE_NAME(data->msg));
    switch(GST_MESSAGE_TYPE (data->msg))
    {
        case GST_MESSAGE_EOS:       
            g_print ("END OF STREAM... \n");
            break;

        case GST_MESSAGE_ERROR:
            gst_message_parse_error (data->msg, &err, &dbg);
            if (err)
            {
                g_printerr ("ERROR: %s\n", err->message);
                g_error_free (err);
            }
            if (dbg)
            {
                g_printerr ("[Debug details: %s]\n", dbg);
                g_free (dbg);
            }
            break;

        default:
            g_printerr ("Unexpected message of type %d", GST_MESSAGE_TYPE (data->msg));
            break;
    }
    return TRUE;
}

void get_pipeline_bus(gstData *data)
{
    data->bus = gst_element_get_bus (data->pipeline);
    data->msg = gst_bus_poll (data->bus, GST_MESSAGE_EOS | GST_MESSAGE_ERROR, -1);
    if(GST_MESSAGE_TYPE (data->msg))
    {
        check_bus_cb(data);
    }
    gst_message_unref (data->msg);
}

int main(int argc, char *argv[])
{        
    //Mat frame;
    VideoCapture capture;    
    gstData gstreamerData;
    GstBuffer *gstImageBuffer;
    
    //XInitThreads();
    gst_init (&argc, &argv);
    create_pipeline(&gstreamerData);
    if(init_video_capture(&gstreamerData))
    {        
        add_bin_capture_to_pipe(&gstreamerData);    
        start_capture_pipe(&gstreamerData);
        //get_pipeline_bus(&gstreamerData);    
    
        cout << "Starting while loop..." << endl;
        cvNamedWindow("Toradex Face Detection Demo with Gstreamer", 0);    
    
        while(true)
        {
            //pthread_mutex_lock(&threadMutex);
            //pthread_cond_wait(&waitForGstBuffer, &threadMutex);

            gstImageBuffer = gst_app_sink_pull_buffer((GstAppSink*)gstreamerData.appsink);

            if (gstImageBuffer != NULL)
            {
                if (frame == NULL)
                {
                    /* Create the image header once and reuse it; the pixel data
                       is borrowed from the GstBuffer each iteration, so there is
                       nothing to allocate per frame. Allocating a new IplImage
                       here on every iteration without releasing it leaks memory. */
                    frame = cvCreateImageHeader(cvSize(720, 576), IPL_DEPTH_8U, 1);
                }
                if (frame == NULL)
                {
                    g_printerr("IplImageFrame is null.\n");
                }
                else
                {
                    /* This wraps only the Y (luma) plane of the I420 buffer,
                       which is why the displayed image is black and white. */
                    cvSetData(frame, GST_BUFFER_DATA(gstImageBuffer), 720);
                    cvShowImage("Toradex Face Detection Demo with Gstreamer", frame);
                    cvWaitKey(1);
                }
                gst_buffer_unref(gstImageBuffer);
            }
            else
            {
                cout << "Appsink did not return a buffer." << endl;
            }
            /*
            if (frame)
            {
                cvShowImage("Toradex Face Detection Demo with Gstreamer", frame);
            }
            gst_buffer_unref(buffer);
            buffer = NULL;            
            pthread_mutex_unlock(&threadMutex);    
            cvWaitKey(1);*/                                    
        }
    }
    else
    {
        exit(1);
    }
              
    //Destroy the window
    cvDestroyWindow("Toradex Face Detection Demo with Gstreamer");
    remove_bin_capture_from_pipe(&gstreamerData);
    deinit_video_live(&gstreamerData);
    delete_pipeline(&gstreamerData);

    return 0;
}