Showing posts with label OpenCV. Show all posts

6/23/16

D-Jenga [Games]

This is a game I have been working on since April 2016. It is an augmented reality version of the Jenga tower game, integrated with Leap Motion. Currently the game works only on monitors; moving it to an HMD is probably the next step. The computer vision code is written from scratch using OpenCV, and the game engine (Blank Engine) is written from scratch in C++.
The soundtrack was made by me, so check the description if you like it!

9/23/15

FFmpeg H.264 RTSP Server Using Live555 [c++][Code on Git]

Hello again everyone! Today's post is related to video streaming/serving and transcoding.
I was working on an RTSP server to transcode a camera feed on the fly. I needed the server to serve only one client (unicast). I will explain bits of the code here and you will find the full code on my repository.

The Server is composed of 5 components:

  1. Transcoder (using FFmpeg):
    • Decoder 
    • Encoder
  2. Live Framed Source (Using Live555)
  3. Media Sub-session  (Using Live555)
  4. RTSP Server (Using Live555)
  5. Main file
Transcoder
The transcoder converts video from any codec to H.264.
This is done in two steps.

Transcoder: Decoder

The decoder's role is to provide frames for the encoder, which are later sent over the network.
The output of the decoder is a uint8_t * buffer of RGB data. This means it doesn't have to be an FFmpeg decoder; we could, for instance, use OpenCV and take the buffer from Mat.data, which is also a uint8_t *.

The decoder has a constructor that takes the path of the media file to be played, and an initialize function that sets up the FFmpeg structs and reads the file info (width, height, GOP*, bitrate and frame rate). A playMedia function starts playing the video file. An onFrame callback reports the arrival of each new frame, so that it can be passed to the encoder and then on to the server. The callback is registered with setOnframeCallbackFunction at the beginning of the run. The decoder is designed for file processing only. In case of streaming video, the decoder needs to be threaded (streaming packets and decoding).
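To make the callback flow concrete, here is a minimal sketch of a frame producer that reports each frame through a registered callback. It is not the decoder from the repository: the class name FrameProducer, the play function and the synthetic frames are assumptions made for illustration, and no FFmpeg calls are involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the decoder described above (not the repository class).
class FrameProducer {
public:
    // Callback receives a pointer to packed RGB data plus the frame dimensions.
    using OnFrame = std::function<void(const uint8_t* rgb, int width, int height)>;

    FrameProducer(int width, int height) : width_(width), height_(height) {
        assert(width > 0 && height > 0);
    }

    // Plays the role that setOnframeCallbackFunction has in the post.
    void setOnFrameCallback(OnFrame cb) { onFrame_ = std::move(cb); }

    // Stand-in for playMedia: produces frameCount synthetic frames and
    // reports each one through the callback, one by one.
    void play(int frameCount) {
        std::vector<uint8_t> rgb(static_cast<std::size_t>(width_) * height_ * 3);
        for (int i = 0; i < frameCount; ++i) {
            // Fill the buffer with a per-frame value so frames can be told apart.
            std::fill(rgb.begin(), rgb.end(), static_cast<uint8_t>(i));
            if (onFrame_) onFrame_(rgb.data(), width_, height_);
        }
    }

private:
    int width_;
    int height_;
    OnFrame onFrame_;
};
```

In the real pipeline the callback body would hand the buffer to the encoder; here it can be any lambda that accepts the pointer and the dimensions.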



*GOP (group of pictures): the interval between I-frames, i.e. how often we get an I-frame.

Transcoder: Encoder
As stated before, we want to transcode the video to H.264. The encoder's role is to encode the video frames and fill the RTSP server buffer with them. That's why the encoder has two queues: one receives the input RGB frames and the other holds the encoded packets on their way to the server buffer.
The sendNewFrame function simply enqueues a new RGB frame. The encoder's main loop dequeues frames from the input queue and calls writeFrame to encode each one and enqueue the encoded packet in the output queue. The GetFrame function is called asynchronously from the server side to extract a new packet from the output queue. If there is no new packet to write to the server buffer, the GetFrame call is ignored, and the server waits for the encoder to signal it again when new input arrives.
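The two-queue arrangement can be sketched as follows. This is an illustration under assumptions, not the repository code: LockedQueue, EncoderPipeline, encodeOne and fakeEncode are hypothetical names, and the "encoding" step is a placeholder that does not call FFmpeg.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

using Buffer = std::vector<uint8_t>;

// Small mutex-protected queue shared between producer and consumer threads.
template <typename T>
class LockedQueue {
public:
    void push(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(item));
    }
    // Returns false instead of blocking when the queue is empty.
    bool tryPop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<T> items_;
};

// Hypothetical encoder front end: RGB frames in, "encoded" packets out.
class EncoderPipeline {
public:
    // Decoder side: enqueue a raw RGB frame.
    void sendNewFrame(Buffer rgb) { input_.push(std::move(rgb)); }

    // One iteration of the encoder loop; false means nothing was waiting.
    bool encodeOne() {
        Buffer rgb;
        if (!input_.tryPop(rgb)) return false;
        output_.push(fakeEncode(rgb));
        return true;
    }

    // Server side: false means no packet is ready yet, so the call is a no-op.
    bool getFrame(Buffer& packet) { return output_.tryPop(packet); }

private:
    // Placeholder for the real H.264 encode step: keeps only the first byte.
    static Buffer fakeEncode(const Buffer& rgb) {
        assert(!rgb.empty());
        return Buffer(1, rgb.front());
    }
    LockedQueue<Buffer> input_;
    LockedQueue<Buffer> output_;
};
```

Returning false from getFrame mirrors the "ignored call" behaviour described above; the re-signalling of the server is left out of this sketch.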

Live Framed Source
The encoder interacts with the server through an interface defined by Live555 (FramedSource). Implementing this interface gives us the FFmpegH264Source class.
We use the FramedSource interface because FFmpeg only gives us single encoded frames in the form of packets. The API offered by FramedSource is basically a way to start delivering frames (doGetNextFrame), a way to stop getting them (doStopGettingFrames), and an extra callback function that the encoder uses to re-trigger doGetNextFrame when the queue was empty and new frames start to arrive.
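The re-signalling idea can be shown without Live555 at all. The sketch below is a single-threaded illustration, not the FramedSource API: PacketSource, requestNextFrame, onPacketReady and stop are hypothetical names that only mimic the roles of doGetNextFrame, the encoder callback and doStopGettingFrames.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

using Packet = std::vector<uint8_t>;

// Hypothetical source that parks a request when no packet is available
// and serves it as soon as the encoder reports a new packet.
class PacketSource {
public:
    using Deliver = std::function<void(const Packet&)>;

    explicit PacketSource(Deliver deliver) : deliver_(std::move(deliver)) {}

    // Role of doGetNextFrame: deliver now, or remember that the server is waiting.
    void requestNextFrame() {
        if (queue_.empty()) {
            waiting_ = true;
            return;
        }
        deliverFront();
    }

    // Role of the encoder callback: queue the packet and serve a parked request.
    void onPacketReady(Packet p) {
        queue_.push_back(std::move(p));
        if (waiting_) {
            waiting_ = false;
            deliverFront();
        }
    }

    // Role of doStopGettingFrames: drop any parked request.
    void stop() { waiting_ = false; }

private:
    void deliverFront() {
        assert(!queue_.empty());
        Packet p = std::move(queue_.front());
        queue_.pop_front();
        deliver_(p);
    }

    std::deque<Packet> queue_;
    bool waiting_ = false;
    Deliver deliver_;
};
```

A real implementation would also have to hand the packet over on the server's own thread; that part depends on the library and is deliberately not modelled here.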

Media Sub-session
The media sub-session mainly describes the input and the output of the server when a client connects to it. In our case, because we are using a unicast approach, we have only one sub-session, and its source is the H264VideoStreamDiscreteFramer created in createNewStreamSource.

H264VideoStreamDiscreteFramer means that we provide the RTSP server with discrete single frames, one in each call of GetFrame invoked by doGetNextFrame. Because FFmpeg by default expects to write its output packets to a file, we create a dummy video file at the beginning to fool FFmpeg, then skip the writing part and direct the packets to the server buffer.

The only issue here is that FFmpeg adds extra information to each packet, related to the location of the frame in the video file. If we look at a packet produced by FFmpeg, we can see that it is composed of a NALU. Each NALU has a 4-byte start code (in the case of x264, as used by FFmpeg) marking the beginning of a new frame in a video file. H264VideoStreamDiscreteFramer expects packets without the start code, so in the encoder we remove the first 4 bytes of the packet before enqueuing it in the output queue on its way to the server buffer.

The other aspect described by the sub-session is the RTPSink it uses to serve the client. We are using an H264VideoRTPSink, which holds all the encoded packets coming from the encoder before they are served to the client. You may need to increase the buffer size of the sink before starting the server, for example for 4K frames.
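Removing the start code can be sketched as a small helper. The function names startCodeLength and stripStartCode are assumptions for illustration, not names from the repository. The sketch checks for the start code instead of blindly dropping 4 bytes, and it also accepts the 3-byte form. It only handles a start code at the very beginning of the packet; a packet carrying several NAL units would need to be split, which is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Length of an Annex B start code at the beginning of data:
// 4 for 00 00 00 01, 3 for 00 00 01, 0 if there is none.
inline std::size_t startCodeLength(const uint8_t* data, std::size_t size) {
    if (size >= 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 && data[3] == 1) {
        return 4;
    }
    if (size >= 3 && data[0] == 0 && data[1] == 0 && data[2] == 1) {
        return 3;
    }
    return 0;
}

// Returns a copy of the packet without its leading start code (if any).
inline std::vector<uint8_t> stripStartCode(const std::vector<uint8_t>& packet) {
    const std::size_t skip = startCodeLength(packet.data(), packet.size());
    assert(skip <= packet.size());
    return std::vector<uint8_t>(packet.begin() + static_cast<std::ptrdiff_t>(skip),
                                packet.end());
}
```

A packet that does not start with a start code is returned unchanged, so calling the helper twice is harmless.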

RTSP Server
This component mainly initializes the Live555 components and links them together. It has a main loop that is entered at the end and blocks while waiting for clients. Here you may want to name your session, give it authentication credentials, or set the port you want to stream on if you don't want Live555 to choose it for you. At the end of the server initialization we define the input source and the sub-session we want to use. Here you can also change from unicast to multicast to serve more clients from the same source. If everything is OK, the server prints the URL you can stream from and then enters the eventLoop and blocks.

Main file
Here an instance of each of the decoder, encoder and server is created and initialized, and each instance runs on a separate thread. The decoder runs on the main thread. The playMedia function is called at the end of the main function; it reads frames from a file and sends them to the encoder one by one. The encoder then encodes them and sends them to the server, where they are ready for a client to stream.


If there are any bugs, please let me know; I am still working on enhancements, especially on performance.
OK, that's it. If you have any questions, feel free to ask!
Here is the git repo of the project:

1/25/13

Android Native Camera With OpenCV and OpenGLES

Today I will show how to use OpenCV and OpenGL to make your own camera preview. It can be used later for image processing or, as in my case, in an augmented reality app. Some questions, assumptions and background need to be covered first.

Why access the camera natively?
Because the image processing is done with OpenCV, which is written in C and C++. You don't want to get your frames in Java, send them to native code through JNI (which is slow), get the result back and then render the resulting frame on the screen. So what we are going to do is use the camera, OpenCV and OpenGL all from C++.

I heard that it is not supported on all phones, is that right?
The problem is that the Android developers change their native camera module very frequently. The OpenCV people try to cover all the modules implemented so far, and they ended up with 7 native libraries that cover the majority of phones. So I guess the answer is yes and no.

My code was tested on a Samsung Galaxy Note, and the renderer ran at more than 30 FPS (frames per second), although the camera hardware itself cannot deliver more than 30 FPS.

I am using OpenGL ES 1.1 because it's easier. It may not be the most efficient choice because there are no FBOs (Frame Buffer Objects), but I didn't care much because I reached my target FPS and even more.

No camera controls are implemented here, such as taking pictures or adjusting focus; the code just takes frames and renders them with OpenGL.

This topic doesn't have many resources, and that made me create a blog and write this post.

So let's begin :D



I am assuming that you have installed OpenCV & NDK and know how to use them.

First of all, you need to have a JNI folder, so create one if it is not there. Inside it, add another folder called build. Inside build, put the OpenCV include folder and libs folder. Both folders come with the OpenCV SDK:

OpenCV_Directory->sdk->native->jni->include
OpenCV_Directory->sdk->native->libs

Now you are ready to go.

I will explain the structure of my code in terms of threads.
We have the main thread (1) that controls the whole app. After creating a GLSurface and setting its renderer, we get another thread called GLThread (2), which is responsible for drawing the frames that we grab from the camera and rendering them on the screen. The last thread is the Frame-Grabber thread (3), a really slow thread: all it does is take frames and store them in a buffer so that they can later be drawn to the screen.
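The hand-off between the Frame-Grabber thread and the GL thread can be sketched as a small ring buffer guarded by a mutex. This is a generic illustration under assumptions: FrameRing, push and latest are hypothetical names, and the frame type is a template parameter so the sketch does not depend on OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Fixed-size ring of frames: the grabber overwrites the oldest slot,
// the renderer always reads the most recently written one.
template <typename Frame, std::size_t N>
class FrameRing {
    static_assert(N > 0, "ring needs at least one slot");

public:
    // Called from the grabber thread.
    void push(const Frame& frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        slots_[written_ % N] = frame;
        ++written_;
    }

    // Called from the render thread; returns false until a frame has arrived.
    bool latest(Frame& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (written_ == 0) return false;
        out = slots_[(written_ - 1) % N];
        return true;
    }

private:
    mutable std::mutex mutex_;
    Frame slots_[N];
    std::size_t written_ = 0;
};
```

Reading the write counter inside the same lock as the slot access keeps the index and the frame consistent with each other.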

Here is our main Activity class, CameraPreviewer.java:


package com.mesai.nativecamera;

import java.util.List;

import org.opencv.android.CameraBridgeViewBase.ListItemAccessor;
import org.opencv.android.NativeCameraView.OpenCvSizeAccessor;
import org.opencv.core.Size;
import org.opencv.highgui.Highgui;
import org.opencv.highgui.VideoCapture;

import android.app.Activity;
import android.opengl.GLSurfaceView;
import android.os.Bundle;
import android.view.Display;

public class CameraPreviewer extends Activity {

    GLSurfaceView mView;
    
    @Override protected void onCreate(Bundle icicle) {
        super.onCreate(icicle);
        Native.loadlibs();
        VideoCapture mCamera = new VideoCapture(Highgui.CV_CAP_ANDROID);
        java.util.List<Size> sizes = mCamera.getSupportedPreviewSizes();
        mCamera.release();
 
        mView = new GLSurfaceView(getApplication()){
         @Override
         public void onPause() {
          super.onPause();
          Native.releaseCamera();
         }
        };
        Size size = calculateCameraFrameSize(sizes,new OpenCvSizeAccessor());
        mView.setRenderer(new CameraRenderer(this,size));
        setContentView(mView);
    }
    
 protected Size calculateCameraFrameSize(List supportedSizes,
   ListItemAccessor accessor) {
  int calcWidth = Integer.MAX_VALUE;
  int calcHeight = Integer.MAX_VALUE;

  Display display = getWindowManager().getDefaultDisplay();

  int maxAllowedWidth = 1024;
  int maxAllowedHeight = 1024;

  for (Object size : supportedSizes) {
   int width = accessor.getWidth(size);
   int height = accessor.getHeight(size);

   if (width <= maxAllowedWidth && height <= maxAllowedHeight) {
    if ( width <= calcWidth 
      && width>=(maxAllowedWidth/2)
      &&(display.getWidth()%width==0||display.getHeight()%height==0)) {
     calcWidth = (int) width;
     calcHeight = (int) height;
    }
   }
  }

  return new Size(calcWidth, calcHeight);
 }
    @Override protected void onPause() {
        super.onPause();
        mView.onPause();
       
    }

    @Override protected void onResume() {
        super.onResume();
        mView.onResume();
        
    }
}

You can see some odd-looking lines in onCreate: a VideoCapture is opened, getSupportedPreviewSizes() is called on it, and it is released right away. This is because getSupportedPreviewSizes() is not available in the C++ version, and I needed the supported resolutions of the camera so that I could pick one that suits me. After that I create the GLSurface that is used for video rendering.
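For readers who prefer the native side, the selection rule used by calculateCameraFrameSize can be sketched in C++ as well. PreviewSize and pickPreviewSize are hypothetical names, and unlike the Java method this sketch reports failure with a boolean instead of returning Integer.MAX_VALUE sizes when nothing matches.

```cpp
#include <cassert>
#include <vector>

struct PreviewSize {
    int width;
    int height;
};

// Picks the narrowest size that fits in a maxSide x maxSide texture, is at
// least maxSide / 2 wide, and divides the display width or height evenly.
inline bool pickPreviewSize(const std::vector<PreviewSize>& supported,
                            int displayWidth, int displayHeight,
                            int maxSide, PreviewSize& chosen) {
    assert(displayWidth > 0 && displayHeight > 0 && maxSide > 0);
    bool found = false;
    for (const PreviewSize& s : supported) {
        if (s.width <= 0 || s.height <= 0) continue;  // skip invalid entries
        const bool fitsTexture = s.width <= maxSide && s.height <= maxSide;
        const bool largeEnough = s.width >= maxSide / 2;
        const bool dividesDisplay =
            displayWidth % s.width == 0 || displayHeight % s.height == 0;
        const bool notWider = !found || s.width <= chosen.width;
        if (fitsTexture && largeEnough && dividesDisplay && notWider) {
            chosen = s;
            found = true;
        }
    }
    return found;
}
```

With a 1280x800 display, a 1024 limit and the sizes 320x240, 640x480, 800x480 and 1280x720, this rule settles on 640x480.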

Now our custom renderer, CameraRenderer.java:
package com.mesai.nativecamera;


import javax.microedition.khronos.egl.EGLConfig;
import javax.microedition.khronos.opengles.GL10;

import org.opencv.core.Size;

import android.content.Context;
import android.opengl.GLSurfaceView.Renderer;


public class CameraRenderer implements Renderer {

 private Size size;
 private Context context;
 public CameraRenderer(Context c,Size size) {
  super();
  context = c;
  this.size = size;
 }
 
 public void onSurfaceCreated(GL10 gl, EGLConfig config) {
  Thread.currentThread().setPriority(Thread.MAX_PRIORITY);
  Native.initCamera((int)size.width,(int)size.height);
 }

 public void onDrawFrame(GL10 gl) {
//  long startTime   = System.currentTimeMillis();
  Native.renderBackground();
//  long endTime   = System.currentTimeMillis();
//  if(30-(endTime-startTime)>0){
//   try {
//    Thread.sleep(30-(endTime-startTime));
//   } catch (InterruptedException e) {}
//  }
//  endTime   = System.currentTimeMillis();
  //System.out.println(endTime-startTime+" ms");
 }
 
 public void onSurfaceChanged(GL10 gl, int width, int height) {
  Native.surfaceChanged(width,height,context.getResources().getConfiguration().orientation);
 }
} 
Uncommenting the timing code in onDrawFrame caps the preview at a steady rate of roughly 30 FPS and prints the time needed to draw each frame (just for debugging).
The renderer refers to a class called Native in all of its methods.
Here it is:
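The same pacing idea, sleeping for whatever is left of a fixed per-frame budget, can be sketched in C++. The names remainingSleep and drawPaced are assumptions for illustration. Note that a 30 ms budget corresponds to about 33 frames per second, so it is a cap near 30 FPS and not an exact one.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Time left in the budget after a frame took `elapsed`; never negative.
inline std::chrono::milliseconds remainingSleep(std::chrono::milliseconds budget,
                                                std::chrono::milliseconds elapsed) {
    return elapsed < budget ? budget - elapsed : std::chrono::milliseconds(0);
}

// Runs draw(), then sleeps for the unused part of the budget.
// Returns how long draw() itself took, which is useful for logging.
template <typename Draw>
std::chrono::milliseconds drawPaced(Draw draw, std::chrono::milliseconds budget) {
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    draw();
    const std::chrono::milliseconds elapsed =
        std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start);
    std::this_thread::sleep_for(remainingSleep(budget, elapsed));
    return elapsed;
}
```

A steady clock is used so that wall-clock adjustments do not distort the measured draw time.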
package com.mesai.nativecamera;

public class Native {
 public static void loadlibs(){
  System.loadLibrary("opencv_java");
  System.loadLibrary("NativeCamera");
 }
 public static native void initCamera(int width,int height);
 public static native void releaseCamera();
 public static native void renderBackground();
 public static native void surfaceChanged(int width,int height,int orientation);
}
OK, now for the native part. Here is the code for CameraRenderer.cpp:
#include <jni.h>
#include <GLES/gl.h>
#include <GLES/glext.h>
#include <android/log.h>
#include <opencv2/highgui/highgui.hpp>
#include <opencv/cv.h>
#include <pthread.h>
#include <time.h>
#include <math.h>   // lower case: header names are case-sensitive on Linux hosts
#include <string.h> // memset

// Utility for logging:
#define LOG_TAG    "CAMERA_RENDERER"
#define LOG(...)  __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)

GLuint texture;
cv::VideoCapture capture;
cv::Mat buffer[30];
cv::Mat rgbFrame;
cv::Mat inframe;
cv::Mat outframe;
int bufferIndex;
int rgbIndex;
int frameWidth;
int frameHeight;
int screenWidth;
int screenHeight;
int orientation;
pthread_mutex_t FGmutex = PTHREAD_MUTEX_INITIALIZER; // explicit static initialization
pthread_t frameGrabber;
pthread_attr_t attr;
struct sched_param param;

GLfloat vertices[] = {
      -1.0f, -1.0f, 0.0f, // V1 - bottom left
      -1.0f,  1.0f, 0.0f, // V2 - top left
       1.0f, -1.0f, 0.0f, // V3 - bottom right
       1.0f,  1.0f, 0.0f // V4 - top right
       };

GLfloat textures[8];

extern "C" {

void drawBackground();
void createTexture();
void destroyTexture();
void *frameRetriever(void*);

JNIEXPORT void JNICALL Java_com_mesai_nativecamera_Native_initCamera(JNIEnv*, jobject,jint width,jint height)
{
 LOG("Camera Created");
 capture.open(CV_CAP_ANDROID + 0);
 capture.set(CV_CAP_PROP_FRAME_WIDTH, width);
 capture.set(CV_CAP_PROP_FRAME_HEIGHT, height);
 frameWidth =width;
 frameHeight = height;
 LOG("frameWidth = %d",frameWidth);
 LOG("frameHeight = %d",frameHeight);
 glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
 glShadeModel(GL_SMOOTH);
 glClearDepthf(1.0f);
 glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_NICEST);
 pthread_attr_t attr;
 pthread_attr_init(&attr);
 pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
 pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
 memset(&param, 0, sizeof(param));
 param.sched_priority = 100;
 pthread_attr_setschedparam(&attr, &param);
 pthread_create(&frameGrabber, &attr, frameRetriever, NULL);
 pthread_attr_destroy(&attr);

}

JNIEXPORT void JNICALL Java_com_mesai_nativecamera_Native_surfaceChanged(JNIEnv*, jobject,jint width,jint height,jint orien)
{
 LOG("Surface Changed");
 glViewport(0, 0, width,height);
 if(orien==1) {
   screenWidth = width;
   screenHeight = height;
   orientation = 1;
  } else {
   screenWidth = height;
   screenHeight = width;
   orientation = 2;
  }


 LOG("screenWidth = %d",screenWidth);
 LOG("screenHeight = %d",screenHeight);
 glMatrixMode(GL_PROJECTION);
 glLoadIdentity();
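 float aspect = (float) screenWidth / (float) screenHeight; // avoid integer division
 float bt = (float) tan(22.5 * 3.14159265358979 / 180.0); // half of a 45-degree field of view; tan() expects radians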
 float lr = bt * aspect;
 glFrustumf(-lr * 0.1f, lr * 0.1f, -bt * 0.1f, bt * 0.1f, 0.1f,
   100.0f);
 glMatrixMode(GL_MODELVIEW);
 glLoadIdentity();
 glEnable(GL_TEXTURE_2D);
 glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
 glClearDepthf(1.0f);
 glEnable(GL_DEPTH_TEST);
 glDepthFunc(GL_LEQUAL);
 createTexture();
}

JNIEXPORT void JNICALL Java_com_mesai_nativecamera_Native_releaseCamera(JNIEnv*, jobject)
{
 LOG("Camera Released");
 capture.release();
 destroyTexture();

}

void createTexture() {
  textures[0] = ((1024.0f-frameWidth*1.0f)/2.0f)/1024.0f;
  textures[1] = ((1024.0f-frameHeight*1.0f)/2.0f)/1024.0f + (frameHeight*1.0f/1024.0f);
  textures[2] = ((1024.0f-frameWidth*1.0f)/2.0f)/1024.0f + (frameWidth*1.0f/1024.0f);
  textures[3] = ((1024.0f-frameHeight*1.0f)/2.0f)/1024.0f + (frameHeight*1.0f/1024.0f);
  textures[4] = ((1024.0f-frameWidth*1.0f)/2.0f)/1024.0f;
  textures[5] = ((1024.0f-frameHeight*1.0f)/2.0f)/1024.0f;
  textures[6] = ((1024.0f-frameWidth*1.0f)/2.0f)/1024.0f + (frameWidth*1.0f/1024.0f);
  textures[7] = ((1024.0f-frameHeight*1.0f)/2.0f)/1024.0f;
 LOG("Texture Created");
 glGenTextures(1, &texture);
 glBindTexture(GL_TEXTURE_2D, texture);
 glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
 glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
 glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 1024,1024, 0, GL_RGB,
   GL_UNSIGNED_SHORT_5_6_5, NULL);
 glBindTexture(GL_TEXTURE_2D, 0);
}

void destroyTexture() {
 LOG("Texture destroyed");
 glDeleteTextures(1, &texture);
}

JNIEXPORT void JNICALL Java_com_mesai_nativecamera_Native_renderBackground(
  JNIEnv*, jobject) {
 drawBackground();
}

void drawBackground() {
 glClear (GL_COLOR_BUFFER_BIT);
 glBindTexture(GL_TEXTURE_2D, texture);
 if(bufferIndex>0){
  pthread_mutex_lock(&FGmutex);
  cvtColor(buffer[(bufferIndex - 1) % 30], outframe, CV_BGR2BGR565);
  pthread_mutex_unlock(&FGmutex);
  cv::flip(outframe, rgbFrame, 1);
  if (texture != 0)
   glTexSubImage2D(GL_TEXTURE_2D, 0, (1024-frameWidth)/2, (1024-frameHeight)/2, frameWidth, frameHeight,
     GL_RGB, GL_UNSIGNED_SHORT_5_6_5, rgbFrame.ptr());
 }
 glEnableClientState (GL_VERTEX_ARRAY);
 glEnableClientState (GL_TEXTURE_COORD_ARRAY);
 glLoadIdentity();
 if(orientation!=1){
  glRotatef( 90,0,0,1);
 }
 // Set the face rotation
 glFrontFace (GL_CW);
 // Point to our vertex buffer
 glVertexPointer(3, GL_FLOAT, 0, vertices);
 glTexCoordPointer(2, GL_FLOAT, 0, textures);
 // Draw the vertices as triangle strip
 glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
 //Disable the client state before leaving
 glDisableClientState(GL_VERTEX_ARRAY);
 glDisableClientState(GL_TEXTURE_COORD_ARRAY);
}

void *frameRetriever(void*) {
 while (capture.isOpened()) {
  capture.read(inframe);
  if (!inframe.empty()) {
   pthread_mutex_lock(&FGmutex);
   inframe.copyTo(buffer[(bufferIndex++) % 30]);
   pthread_mutex_unlock(&FGmutex);
  }
 }
 LOG("Camera Closed");
 pthread_exit (NULL);
}


}
There are a lot of things here that need to be explained:
  1. initCamera: we initialize the camera by saying which camera we want to access, then we set the resolution we want our frames to have.
  2. initCamera (cont.): we start a new thread (the Frame-Grabber). For more information on native threads you can search for POSIX threads or pthreads.
  3. surfaceChanged: this is all OpenGL initialization.
  4. destroyTexture and releaseCamera are called when exiting the app.
  5. drawBackground renders the frames.
  6. frameRetriever is the function run by the Frame-Grabber thread.
Now you need an Android.mk file to compile the .cpp file:
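One detail worth spelling out is how createTexture fills the textures array: the camera frame is uploaded into the centre of a 1024x1024 texture, so the texture coordinates must select only that centred rectangle. The sketch below reproduces that arithmetic in a standalone form; TexRect, centeredTexRect and fillTexCoords are hypothetical names, and the texture size is a parameter here instead of the hard-coded 1024.

```cpp
#include <cassert>

// Normalised texture-space bounds of the frame inside the square texture.
struct TexRect {
    float u0;  // left edge
    float u1;  // right edge
    float v0;  // first row of the frame
    float v1;  // last row of the frame
};

// Bounds of a frame centred in a textureSize x textureSize texture.
inline TexRect centeredTexRect(int frameWidth, int frameHeight, int textureSize) {
    assert(frameWidth > 0 && frameWidth <= textureSize);
    assert(frameHeight > 0 && frameHeight <= textureSize);
    const float size = static_cast<float>(textureSize);
    TexRect r;
    r.u0 = (size - frameWidth) / 2.0f / size;
    r.u1 = r.u0 + frameWidth / size;
    r.v0 = (size - frameHeight) / 2.0f / size;
    r.v1 = r.v0 + frameHeight / size;
    return r;
}

// Writes the coordinates in the order used for the triangle strip in the post:
// (u0,v1), (u1,v1), (u0,v0), (u1,v0).
inline void fillTexCoords(const TexRect& r, float out[8]) {
    out[0] = r.u0; out[1] = r.v1;
    out[2] = r.u1; out[3] = r.v1;
    out[4] = r.u0; out[5] = r.v0;
    out[6] = r.u1; out[7] = r.v0;
}
```

For a 640x480 frame in a 1024 texture this gives u from 0.1875 to 0.8125 and v from 0.265625 to 0.734375, which matches the offsets passed to glTexSubImage2D in drawBackground.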

LOCAL_PATH := $(call my-dir)




include $(CLEAR_VARS)
LOCAL_MODULE := opencv-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libopencv_java.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera1-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r4.2.0.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera2-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r4.1.1.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera3-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r4.0.3.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera4-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r4.0.0.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera5-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r3.0.1.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera6-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r2.3.3.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := camera7-prebuilt
LOCAL_SRC_FILES = build/libs/$(TARGET_ARCH_ABI)/libnative_camera_r2.2.0.so
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/build/include
include $(PREBUILT_SHARED_LIBRARY)

include $(CLEAR_VARS)
OPENGLES_LIB  := -lGLESv1_CM
OPENGLES_DEF  := -DUSE_OPENGL_ES_1_1
LOCAL_MODULE    := NativeCamera
LOCAL_SHARED_LIBRARIES := opencv-prebuilt 
LOCAL_SRC_FILES := CameraRenderer.cpp
LOCAL_LDLIBS +=  $(OPENGLES_LIB) -llog -ldl -lEGL

include $(BUILD_SHARED_LIBRARY)

I hope you enjoyed my first tutorial post. If you have any questions, just leave a comment and feel free to ask.

UPDATE 1:

Here you can find the source code:
https://github.com/MESAI/NativeCamera
Add the OpenCV Java shared library.
Add the build folder and its contents to the JNI folder.
Then run ndk-build.