MediaPipe Face Detection

The MediaPipe Face Classifier is a deep learning-based approach to face detection that uses a novel architecture to improve upon traditional computer vision models. Proposed by Google researchers, it is based on a multi-task learning framework that jointly optimizes a face detection task and a face classification task. The model combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to extract features from face images and classify them into different categories.

One of the key advantages of the MediaPipe Face Classifier is its ability to handle diverse poses and lighting conditions. Evaluated on the “Labeled Faces in the Wild” (LFW) dataset, it achieved an accuracy of 99.63% with a false positive rate of 0.17. In addition to its high accuracy, the model is computationally efficient and can be deployed on resource-constrained devices, such as smartphones or drones, making it a promising solution for real-world face detection applications.

It has one major drawback that makes it hard to use in drone applications: the model only works for close-up faces and rarely detects the faces of distant people.

Example

from dronevis.models import FaceDetectModel

model = FaceDetectModel()    # create model instance
model.load_model()           # load model weights
model.detect_webcam()        # run camera detection

Face Detection Class

class dronevis.models.FaceDetectModel(confidence=0.6)

Face detection class with mediapipe

This class inherits from base class CVModel, and implements its abstract methods for code integrity.
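
The inheritance pattern described above can be sketched with Python's abc module. This is a minimal illustration, not the dronevis source: CVModelSketch is a stand-in for the real base class, its abstract methods are assumed from the method names documented on this page, and DummyModel is hypothetical.

```python
from abc import ABC, abstractmethod


class CVModelSketch(ABC):
    """Stand-in for the CVModel base class (assumed interface, not the real one)."""

    @abstractmethod
    def load_model(self):
        """Load model weights."""

    @abstractmethod
    def transform_img(self, image):
        """Prepare an image for inference."""

    @abstractmethod
    def predict(self, image):
        """Run inference on an image."""


class DummyModel(CVModelSketch):
    """Hypothetical subclass that implements every abstract method."""

    def load_model(self):
        self.loaded = True

    def transform_img(self, image):
        return image  # no-op transform

    def predict(self, image):
        return self.transform_img(image)


model = DummyModel()
model.load_model()
result = model.predict([[0, 1], [2, 3]])
```

Because every method on the base class is abstract, Python refuses to instantiate the base class directly, and also refuses any subclass that leaves one of the methods unimplemented.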

__init__(confidence=0.6)

Construct model instance

Parameters
  • confidence (float, optional) – threshold for detection; input is a probability in [0, 1]. Defaults to 0.6.
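
A confidence threshold of this kind is typically used to discard low-scoring detections. The sketch below shows the idea in plain Python; filter_by_confidence and the detection dictionaries are hypothetical, and whether the real class uses an inclusive comparison is an assumption.

```python
def filter_by_confidence(detections, confidence=0.6):
    """Keep detections whose score is at least `confidence`."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a probability in [0, 1]")
    # Inclusive comparison is an assumption made for this sketch.
    return [d for d in detections if d["score"] >= confidence]


# Hypothetical detections: each has a label and a score in [0, 1].
detections = [
    {"label": "face", "score": 0.92},
    {"label": "face", "score": 0.55},
    {"label": "face", "score": 0.60},
]
kept = filter_by_confidence(detections, confidence=0.6)
```

Raising the threshold reduces false positives at the cost of missing faint or partially occluded faces.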

load_model()

Load model from memory

transform_img(image)

Idle (no-op) transformation of the image

predict(image)

Run model inference on input image and output face detection keypoints.

Parameters

image (np.array) – input image (assumed to be non-transformed)

Returns

output image with keypoints drawn

Return type

np.array
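
Drawing keypoints on an image requires pixel coordinates. Assuming the detector reports each keypoint as a normalized (x, y) pair in [0, 1] relative to the image width and height, as MediaPipe's face detection does, the conversion might look like the following. to_pixel_coords is a hypothetical helper, not part of dronevis.

```python
def to_pixel_coords(keypoints, width, height):
    """Convert normalized (x, y) keypoints to integer pixel coordinates."""
    pixels = []
    for x, y in keypoints:
        # Clamp so that points on or beyond the border stay inside the image.
        px = min(max(int(x * width), 0), width - 1)
        py = min(max(int(y * height), 0), height - 1)
        pixels.append((px, py))
    return pixels


# Hypothetical normalized keypoints for a 640x480 frame.
points = to_pixel_coords([(0.5, 0.5), (1.0, 1.0), (0.25, 0.75)], 640, 480)
```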

detect_webcam(video_index=0, window_name='Face Detection')

Run webcam (or any video streaming device) with face detection module

Parameters
  • video_index (Union[int, str], optional) – index of the video device; can also be an IP address or a video path. Defaults to 0.

  • window_name (str, optional) – name of opencv window. Defaults to “Face Detection”.
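
Since video_index accepts either an integer or a string, a caller may want to normalize user input before passing it on. The sketch below is one possible approach, not the dronevis implementation: resolve_video_source is a hypothetical helper, and the stream URL and file name are placeholders.

```python
from typing import Union


def resolve_video_source(video_index: Union[int, str]) -> Union[int, str]:
    """Return an int for device indices, otherwise the string unchanged."""
    if isinstance(video_index, int):
        return video_index
    stripped = video_index.strip()
    # A purely numeric string such as "0" is treated as a device index.
    if stripped.isdigit():
        return int(stripped)
    # Anything else is assumed to be a stream URL or a video file path.
    return stripped


# Placeholder inputs: a device index, a numeric string, a stream URL, a file.
sources = [
    resolve_video_source(s)
    for s in (0, "1", "rtsp://192.168.1.10/stream", "clip.mp4")
]
```

The resolved value could then be handed to OpenCV's cv2.VideoCapture, which accepts either an integer device index or a string path or URL.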