Facial recognition technology is a powerful tool: computer vision can identify a person simply by scanning their face. You’ve encountered this whenever your phone recognizes friends to help you organize your photos, or when you need a face scan to enter your top-secret laboratory. More seriously, facial recognition technology serves important purposes for law enforcement and surveillance experts. We’ll leave that to the professionals. In this blog, I implement facial recognition in real time on the movie Avengers: Endgame, using Deephaven’s powerful real-time analytic tools. I’ll show you how to easily ingest the data into a platform where you can instantly manipulate, analyze, and learn from it.
Here’s how we did it, in three easy steps:
- Set up a Kafka stream. We like Redpanda for this.
- Create Kafka topics.
- Run the script.
This is a fun example, but the workflow can be applied to many other computer vision cases. I’ll walk you through setting up the facial recognition mechanism as well as importing the image stream data into Deephaven through Redpanda. Stay tuned for other articles detailing the steps of analyzing image data in the Deephaven IDE.
In this demo, I point the webcam at the movie Avengers: Endgame. We can see that as the movie plays, all the characters are identified by the model and marked onscreen. (Yes, pointing Python at the movie file instead of pointing a webcam at the movie would work, but in this example we’re mimicking real-time events; you could point a webcam at everyone entering an office building, for example, and stream the data as it comes in.)
In my Deephaven session, on the top left, a streaming table records the first appearance of each character. On the bottom left, there are three graphs:
- Top leading character compares the total appearance times of all the main characters.
- Show-up times of each character shows the total appearance time (in minutes) for each character.
- Number of times each pair of characters hanging out tracks which pairs of characters appear together more often than others. Here, I’m basically tracking relationships between characters.
Below, I give the general details about how I accomplished this, with pointers for you to customize the code for your own purposes. You can also find my scripts in the deephaven-examples GitHub repository.
Now I’ll walk you through the steps of setting up a program that detects characters from the movie, then outputs their names as well as the time they appear on the screen.
In order to train the face recognition model, we need to build a face database that contains the faces/images of all the characters we want to identify. There are many ways to do this; here, we simply grab images of all the characters from the internet.
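The script below expects a flat images/ folder: it loads every .png or .jpg file it finds and uses the file name (minus its extension) as the character’s name. The layout might look something like this (the file names are just placeholders):
images/
├── Iron Man.jpg
├── Captain America.jpg
├── Thor.png
└── Black Widow.jpg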
Once the image database is prepared, we can use the face_recognition and OpenCV packages to develop the algorithm.
The sample code is shown below:
cap = cv2.VideoCapture(0)
name_re = set()
while True:
    """Capture frames from the webcam, run face detection and face encoding,
    compare each face against the known faces, and return the best-matching name.
    Draw a rectangle around every face along with its label."""
    degree = 0.25
    ret, img = cap.read()
    imgS = cv2.resize(img, (0, 0), None, degree, degree)
    imgS = cv2.cvtColor(imgS, cv2.COLOR_BGR2RGB)
    facesCurFrame = face_recognition.face_locations(imgS)
    encodesCurFrame = face_recognition.face_encodings(imgS, facesCurFrame)
facesCurFrame = face_recognition.face_locations(imgS) runs after each frame is captured from the webcam. It implements a pre-trained HOG (Histogram of Oriented Gradients) algorithm that measures gradient orientation in localized portions of an image to detect the shape of a face.
The next step is to encode each detected face and translate it into a representation a computer can compare. We use a deep learning model to generate 128 unique measurements for every face; face_recognition.face_encodings() does all the heavy lifting.
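To make those two calls concrete, here is a minimal sketch of what they return (the box coordinates are made-up values): face_locations yields one (top, right, bottom, left) tuple per detected face, and face_encodings yields one 128-element NumPy array per face.
locations = face_recognition.face_locations(imgS)             # e.g. [(64, 192, 128, 128)]
encodings = face_recognition.face_encodings(imgS, locations)  # one 128-d vector per detected face
print(len(encodings), encodings[0].shape)                     # e.g. 1 (128,)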
Once a frame has been processed, we need to find the most similar face in our database. face_recognition.face_distance() computes the Euclidean distance between each face from the webcam (or movie!) and the faces in our database, and we output the name of the known face with the lowest distance. By the end, we have a data stream of both the names of the characters and the times of their first appearance.
for encode_Face, face_Loc in zip(encodesCurFrame, facesCurFrame):
    matches = face_recognition.compare_faces(encodeListKnown, encode_Face)
    faceDis = face_recognition.face_distance(encodeListKnown, encode_Face)
    print(faceDis)
    matchIndex = np.argmin(faceDis)
    if faceDis[matchIndex] < 0.5:
        name = person_name[matchIndex].upper()
        name = re.sub(r'[^a-zA-Z]', '', name)
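As a toy illustration of that matching rule, with made-up distances and a hypothetical database order:
person_name = ["Iron Man", "Captain America", "Thor"]  # hypothetical database order
faceDis = np.array([0.62, 0.38, 0.71])                 # made-up distances to each known face
matchIndex = np.argmin(faceDis)                        # 1, i.e. "Captain America"
if faceDis[matchIndex] < 0.5:                          # 0.38 clears the 0.5 threshold
    name = person_name[matchIndex].upper()             # "CAPTAIN AMERICA"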
We need to store the real-time data so that it can be accessed easily, and we need our tables to auto-refresh as new data comes in. Kafka meets all of these needs. The Kafka stream works like a bridge for us to produce and consume real-time data. Kafka stores different kinds of data in separate topics; we only use two for the movie data, but this becomes much more useful when you require several. With topic names and key values, we can easily publish and consume the data in real time. Redpanda, a Kafka-compatible streaming platform, combined with Deephaven is a powerful tool. (To learn more, check out our full how-to guide.)
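Concretely, each topic carries small JSON records like the ones below (the character name is just an example). The relation topic receives a record for every detection in every frame, while the attendance topic receives a record only once per character, at their first appearance.
character_relation:    {"name": "IRONMAN"}
character_attendance:  {"name": "IRONMAN"}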
To start the server, run this code:
git clone https://github.com/deephaven-examples/cv_stream.git
cd cv_stream
docker compose up -d
This builds the container for Redpanda and Deephaven. To access the stream and experience all of Deephaven’s analytic tools, navigate to http://localhost:10000/ide.
After the server is up, we need to create topics for data to be produced and consumed. Anything will work, but we’ve chosen “character_attendance” and “character_relation”. Run:
docker exec -it redpanda-1 rpk topic create character_attendance --brokers=localhost:9092
docker exec -it redpanda-1 rpk topic create character_relation --brokers=localhost:9092
To check the existing topics, run:
docker exec -it redpanda-1 rpk cluster info
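Once the script below is producing data, you can also sanity-check a topic directly from the Redpanda container; rpk topic consume echoes the raw records as they arrive:
docker exec -it redpanda-1 rpk topic consume character_attendance --brokers=localhost:9092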
Now that you know how the facial recognition model works, here is the full script to try on your own.
To run the script, first install kafka-python, face_recognition, and opencv-python on the local machine with a simple pip install:
pip install kafka-python face_recognition opencv-python
Then:
import re
import cv2
import numpy as np
import face_recognition
import os
from datetime import datetime
from kafka import KafkaProducer
import json

# The two Kafka topics created above.
topic_name1 = 'character_relation'
topic_name2 = 'character_attendance'

def json_serializer(data):
    return json.dumps(data).encode("utf-8")

# Connect to the local Redpanda broker; every message is serialized as JSON.
producer = KafkaProducer(bootstrap_servers=["localhost:9092"], value_serializer=json_serializer)
# Build the face database: every .png/.jpg in ./images becomes one known face,
# and the file name (without its extension) becomes the character's name.
person_name = []
images = []
for pic in os.listdir("images"):
    if pic.endswith("png") or pic.endswith("jpg"):
        img = cv2.imread("images/{}".format(pic))
        name = os.path.splitext(pic)[0]
        images.append(img)
        person_name.append(name)
print(person_name)
def encoding(images):
    """Encode all of the database images, producing 128 measurements per face."""
    images_encoding = []
    for image in images:
        try:
            img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            encode = face_recognition.face_encodings(img)[0]
            images_encoding.append(encode)
        except IndexError:
            # Skip images in which no face could be detected.
            pass
    return images_encoding

encodeListKnown = encoding(images)
print('Encoding Complete')
# Capture frames from the webcam (pointed at the movie), detect and encode faces,
# match them against the database, and publish the results to the two topics.
cap = cv2.VideoCapture(0)
name_re = set()
while True:
    degree = 0.25
    ret, img = cap.read()
    # Downscale the frame to a quarter of its size to speed up detection.
    imgS = cv2.resize(img, (0, 0), None, degree, degree)
    imgS = cv2.cvtColor(imgS, cv2.COLOR_BGR2RGB)
    facesCurFrame = face_recognition.face_locations(imgS)
    encodesCurFrame = face_recognition.face_encodings(imgS, facesCurFrame)
    font = cv2.FONT_HERSHEY_COMPLEX
    for encode_Face, face_Loc in zip(encodesCurFrame, facesCurFrame):
        matches = face_recognition.compare_faces(encodeListKnown, encode_Face)
        faceDis = face_recognition.face_distance(encodeListKnown, encode_Face)
        print(faceDis)
        matchIndex = np.argmin(faceDis)
        if faceDis[matchIndex] < 0.5:
            name = person_name[matchIndex].upper()
            name = re.sub(r'[^a-zA-Z]', '', name)
            # Scale the face box back up to the original frame size (1 / degree = 4).
            y1, x2, y2, x1 = face_Loc
            y1, x2, y2, x1 = y1 * 4, x2 * 4, y2 * 4, x1 * 4
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(img, name, (x1 + 6, y2 - 6), font, 1, (255, 255, 255), 2)
            # Every detection goes to the relation topic...
            json_dic = {"name": name}
            print(json_dic)
            producer.send(topic_name1, json_dic)
            print("yes")
            # ...but each character's first appearance goes to the attendance topic only once.
            if name not in name_re:
                name_re.add(name)
                producer.send(topic_name2, json_dic)
    cv2.imshow('Webcam', img)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
producer = KafkaProducer(bootstrap_servers=["localhost:9092"], value_serializer=json_serializer) builds the connection to the local broker so that the data can be stored locally and accessed by Redpanda later on.
producer.send(topic_name, json_dic) sends each record to the server as the data stream is generated.
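On the consuming side, those topics can be pulled into live Deephaven tables inside the IDE. The sketch below is a minimal starting point rather than the full recipe (that’s covered in the follow-up articles); the exact import paths, spec helpers, and broker address are assumptions that depend on your Deephaven version and docker-compose network configuration.
from deephaven import kafka_consumer as kc
from deephaven import dtypes as dht

# Broker address is an assumption: inside the compose network it is often
# "redpanda:9092" rather than "localhost:9092".
attendance = kc.consume(
    {"bootstrap.servers": "redpanda:9092"},
    "character_attendance",
    key_spec=kc.KeyValueSpec.IGNORE,
    value_spec=kc.json_spec([("name", dht.string)]),
    table_type=kc.TableType.append(),
)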
If you use this project and blog post as a baseline for working with Deephaven, we’d love to hear about it. Let us know what you come up with in our GitHub Discussions or our Slack community.