
Face Detection and Match with TypeScript and Cloud – 1


Intro

In the previous post we set up our intentions for this project series. We will be constructing an app that performs facial recognition from a web application.

In this post we will be setting up our project and grabbing an image using the native JavaScript client APIs.

To grab the image from the user's device camera we will use MediaDevices.getUserMedia(), a JavaScript API that gives us access to various device media (audio, video, screen share, etc.) according to the constraints we provide.
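The constraints object can request specific capabilities rather than just video: true. A quick sketch (the exact values here are illustrative; the browser will pick the closest match the hardware supports):

const constraints: MediaStreamConstraints = {
    audio: false,
    video: {
        // ask for roughly 1280x720 from the front-facing camera
        width: { ideal: 1280 },
        height: { ideal: 720 },
        facingMode: "user"
    }
};

navigator.mediaDevices.getUserMedia(constraints)
    .then((stream) => { /* bind stream to a <video> element */ })
    .catch((err) => console.log(err));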

These APIs allow us to create some really cool applications, especially with WebRTC (Real Time Communications), which is a way to establish peer-to-peer communications (video, audio, data) directly from browser to browser without a middleman server. (Aside: most WebRTC apps use signaling, STUN, or TURN servers for connection establishment, but that doesn't matter for our project. We will create a proper WebRTC post in another series.)

Setup

I'm a big fan of Visual Studio Code, so that is what I will be using as my editor. I also install TypeScript globally:

npm install -g typescript

Also create a new file, index.html, and a couple of TypeScript files, app.ts and camera.ts.

Whenever you want to compile the source, you can run the TypeScript compiler by typing:

tsc
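Since the markup below loads the compiled scripts from dist/, it helps to point the compiler at that folder. A minimal tsconfig.json sketch at the project root (these exact settings are just an assumption; adjust to taste):

{
    "compilerOptions": {
        // compile to a reasonably modern JS target
        "target": "es2017",
        // emit the compiled .js files into dist/
        "outDir": "dist"
    },
    "files": ["app.ts", "camera.ts"]
}

You can also run tsc -w to recompile automatically on every save.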

In the index.html file, populate with this markup: 

<html>
    <head>
    </head>
    <body>
        <button id="btnStartVideo">Start Video</button>
        <button id="btnSnap">Snap Image</button>
        <div>
            <video width="400" height="300" playsinline autoplay></video>
            <canvas width="400" height="300" ></canvas>
        </div>
        <script src="dist/camera.js"></script>
        <script src="dist/app.js"></script>
    </body>
</html>

This sets us up with a video element and a canvas. The idea here is that we will stream a live video feed from the user's device camera to the video element. Then we expose a button to snap a picture from that video feed (basically just grab a frame) and put that frame into the canvas.

Camera Class

In the Camera class we'll expose a function that starts the media feed to the video tag using the JavaScript APIs. The startVideo method accepts a video element parameter, and we will bind a media stream to it. As a security precaution, the browser will ask the user for access to the camera.

class Camera {

    // constraints passed to getUserMedia: video only, no audio
    mediaoptions = { audio: false, video: true };

    // requests camera access and binds the resulting stream
    // to the supplied video element
    startVideo(video: HTMLVideoElement) {
        navigator.mediaDevices.getUserMedia(this.mediaoptions)
            .then((stream) => {
                video.srcObject = stream;
            })
            .catch((err) => { console.log(err); });
    }
}

This will allow you to expose your camera feed.

Webcam Camera Stream in Browser

Now we have to set up a driver to use this class. In the app.ts file, I've added this code to supply the Camera class with a video element, as well as the functionality to extract a frame and feed it to the canvas. We can use the drawImage method to apply the frame data to the canvas.

class App {

    // dom element references
    btnStart: Element;
    btnSnap: Element;
    video: HTMLVideoElement;
    canvas: HTMLCanvasElement;

    // external util references
    camera: Camera;

    constructor() {
        // init utils
        this.camera = new Camera();

        // setup references and event listeners
        this.btnStart = document.querySelector("#btnStartVideo");
        this.btnStart.addEventListener("click", (e: Event) => { this.btnStartClicked(e); });

        this.btnSnap = document.querySelector("#btnSnap");
        this.btnSnap.addEventListener("click", (e: Event) => { this.btnSnapClicked(e); });

        this.video = document.querySelector("video");
        this.canvas = document.querySelector("canvas");
    }

    btnStartClicked(e: Event) {
        this.camera.startVideo(this.video);
    }

    btnSnapClicked(e: Event) {
        // grab the current video frame and paint it onto the canvas
        this.canvas.getContext('2d').drawImage(this.video, 0, 0, this.canvas.width, this.canvas.height);
    }
}

new App();

Now you should be able to snap a picture from the feed and put that image data into the canvas element.

Capturing image frame from camera feed

Here is a live example of this code running (camera required, obviously):

https://recaf-io.github.io/PostExamples/camera1.html

At this point we have managed to access the camera on the user's device, grab a frame from that feed, and apply it to a canvas. We will use this grabbed frame for the facial recognition processing in part 2.
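As a preview of where this is headed, the frame we drew onto the canvas can be serialized and shipped off for processing. A minimal sketch, assuming a hypothetical "/api/detect" endpoint (the real cloud face-detection call comes in part 2):

// serialize the canvas frame to a JPEG blob and POST it;
// the endpoint URL below is a placeholder, not the actual API
function sendFrame(canvas: HTMLCanvasElement) {
    canvas.toBlob((blob) => {
        if (!blob) { return; }
        fetch("/api/detect", { method: "POST", body: blob })
            .then((res) => res.json())
            .then((result) => console.log(result))
            .catch((err) => console.log(err));
    }, "image/jpeg", 0.9);
}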

Aside: WebRTC is extremely useful and enables applications to share data, media streams, and other channels between peers and clients without the use of intermediate servers. This enables peer-to-peer data transfer directly in the browser. We will be constructing another series on this. If interested, I highly recommend reading this book on WebRTC, which many people refer to as the bible of WebRTC. I have it and it is very easy to read.

Here is the latest source for this project:
