Detection Algorithms
Detection algorithms are among the most frequently used algorithms in computer vision generally, and in drone navigation specifically. Hence, the library comes equipped with state-of-the-art (SOTA) algorithms in different implementations:
Faster Region-based CNN (R-CNN) (PyTorch)
CenterNet (MXNet)
You Only Look Once (YOLO) (MXNet)
Single-Shot Detector (SSD) (PyTorch, MXNet)
Detection on Your Web Camera
You can get started by feeding a video stream from your web camera (or any camera) with a few lines of code.
from dronevis.detection_torch import FasterRCNN
model = FasterRCNN() # initialize model instance
model.load_model() # load the model weights
model.detect_webcam() # start video detection
A window pops up showing your webcam video stream, with boxes drawn around detected objects.
Note
The model weights need to be downloaded, so make sure you have a working internet connection.
Once the weights have been downloaded, they are stored in ~/.cache/torch/hub/checkpoints (on Ubuntu) and you don't need to download them again.
These models run on PyTorch, which automatically checks whether a GPU device is available and loads the model accordingly.
If you have multiple GPUs and you want to specify one of them for the detection, just set the device property of the model to your desired choice (either "cuda:<device-index>" or "cpu"):
model.device = "cuda:1" # set second GPU (index=1) for inference
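As a minimal sketch of choosing a valid device string, the helper below falls back to "cpu" when the requested GPU index does not exist. `resolve_device` is a hypothetical helper and not part of dronevis. In practice the GPU count would come from `torch.cuda.device_count()`, but it is passed in as a plain integer here so the snippet runs without PyTorch.

```python
import re


def resolve_device(requested: str, gpu_count: int) -> str:
    """Return `requested` if it names a usable device, otherwise "cpu".

    Hypothetical helper (not part of dronevis). `gpu_count` stands in for
    the value torch.cuda.device_count() would report on the machine.
    """
    if requested == "cpu":
        return "cpu"
    match = re.fullmatch(r"cuda:(\d+)", requested)
    if match is None:
        raise ValueError(f"expected 'cpu' or 'cuda:<device-index>', got {requested!r}")
    index = int(match.group(1))
    # Fall back to the CPU when the requested GPU index is out of range.
    return requested if index < gpu_count else "cpu"


# With two GPUs, index 1 is valid; with one GPU, it is not.
print(resolve_device("cuda:1", gpu_count=2))
print(resolve_device("cuda:1", gpu_count=1))
```

The returned string could then be assigned to the model's device property in the same way as the literal "cuda:1" above.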
Different Model Implementations
The library takes into account the numerous implementations found on the internet, and that users usually prefer one framework over another. Hence, detection models are currently built with two frameworks:
PyTorch
MXNet
Although the implementations differ, the most commonly used methods are unified across all models to keep the interface simple. Each model has four main methods:
load_model: load the model weights from cache, or download them.
predict: run the model on an input image.
transform_img: run the model's transformation on an input image.
detect_webcam: start detection on your webcam.
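Because every model exposes the same four methods, calling code does not need to know which framework backs it. The sketch below illustrates this with a stand-in class. `DummyModel` and `run_on_image` are hypothetical names used only for illustration, not part of dronevis, and the image is represented by a plain list of pixel values.

```python
class DummyModel:
    """Stand-in that mimics the four unified methods (illustration only)."""

    def __init__(self):
        self.loaded = False

    def load_model(self):
        # A real model would read weights from cache or download them.
        self.loaded = True

    def transform_img(self, image):
        # A real model would convert the image to its framework's tensor type.
        return [pixel / 255 for pixel in image]

    def predict(self, image):
        # A real model would run inference; here the transformed input is returned.
        if not self.loaded:
            raise RuntimeError("call load_model() before predict()")
        return self.transform_img(image)

    def detect_webcam(self, video_index=0, window_name="Cam Detection"):
        raise NotImplementedError("no camera in this sketch")


def run_on_image(model, image):
    """Works with any object that follows the unified interface."""
    model.load_model()
    return model.predict(image)


result = run_on_image(DummyModel(), [0, 255])
print(result)
```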
Abstract Models
To provide a unified interface for all detection models, all implementations must inherit from an abstract base class.
Main Abstract Model
- class dronevis.abstract.abstract_model.CVModel
Base class for creating custom computer vision models.
To use the abstract class, inherit from it and override the abstract methods.
Main methods:
1. load_model: Load model weights from the web or cache. You only need to download the model weights once; they are stored and loaded automatically each time you use them later.
2. predict: Run model inference on an input image. You don't have to transform the image beforehand; input images are transformed automatically.
3. transform_img: Transform an input image according to the model's transformations.
4. detect_webcam: Start webcam (or any camera) detection.
- abstract load_model()
Load model weights from disk
- abstract predict(image)
Get predictions for inference on input image
- abstract transform_img(image)
Transform input image using model transformations
- abstract detect_webcam(video_index, window_name='Cam Detection')
Run model on a video stream from the webcam
Now, each model inherits from this abstract class, and must implement its abstract methods. You can implement your own model as follows:
from dronevis.abstract.abstract_model import CVModel

class CustomModel(CVModel):
    def load_model(self):
        """Load your model weights"""
        pass

    def predict(self, image):
        """Run model on input image and return inference results"""
        pass

    def transform_img(self, image):
        """Transform input image"""
        pass

    def detect_webcam(self, video_index, window_name):
        """Retrieve video stream from device at video index, and start model inference"""
        pass
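The following sketch shows how an abstract base class can enforce this contract using Python's standard `abc` module. `BaseModel` is a simplified stand-in written for illustration; the source does not show how `CVModel` is actually implemented, so treat the details as assumptions.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Simplified stand-in for an abstract model class (illustration only)."""

    @abstractmethod
    def load_model(self):
        """Load model weights."""

    @abstractmethod
    def predict(self, image):
        """Run inference on an input image."""


class IncompleteModel(BaseModel):
    # predict() is not overridden, so this class stays abstract.
    def load_model(self):
        return "weights loaded"


class CompleteModel(BaseModel):
    def load_model(self):
        return "weights loaded"

    def predict(self, image):
        return []


def can_instantiate(cls) -> bool:
    """Return True if `cls` can be instantiated, False if it is still abstract."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(IncompleteModel))
print(can_instantiate(CompleteModel))
```

With this pattern, a subclass that forgets one of the abstract methods fails as soon as it is instantiated rather than later, at call time.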
Torch Abstract Models
- class dronevis.abstract.abstract_torch_model.TorchDetectionModel
Base class (inherits from the CVModel abstract class) for creating custom PyTorch models. To use the abstract class, inherit from it and override the abstract method.
For each prediction, the model outputs 300 labels and their corresponding 300 scores. A label is kept only if its score surpasses the detection threshold.
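The thresholding step can be sketched as below, using plain Python lists in place of tensors. `filter_detections` and the sample values are hypothetical and only illustrate the idea; the library's own implementation may differ.

```python
def filter_detections(labels, scores, detection_threshold=0.7):
    """Keep only the labels whose score exceeds the threshold.

    Hypothetical helper for illustration; `labels` and `scores` are
    parallel sequences, one score per label.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    return [label for label, score in zip(labels, scores) if score > detection_threshold]


labels = ["person", "dog", "kite"]
scores = [0.95, 0.40, 0.71]
print(filter_detections(labels, scores))
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the ones that remain are those the model is more confident about.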
- __init__()
Construct torch models, and detect device for inference (cuda or cpu).
Torch detection models are assumed to be trained on the COCO dataset. In addition, torch can detect if you have an available GPU. The property device contains the device that will be used for inference. You can change the device by changing the device property.
- predict(image, detection_threshold=0.7)
Predict all classes in an image using torch model
- Parameters:
image (numpy.ndarray) – video frame or image to predict the classes in it
detection_threshold (float) – threshold that determines whether a predicted class is kept or discarded
- Returns:
output image with boxes drawn
- Return type:
numpy.ndarray
- transform_img(image)
Transform image to tensor
- Parameters:
image (numpy.ndarray) – input image array
- Returns:
the image as a tensor
- Return type:
torch.Tensor
- draw_boxes(boxes, classes, labels, image)
Draw boxes for the predicted classes in an image using torch model
- Parameters:
boxes (numpy.ndarray) – predicted boxes returned by predict function
classes (List) – predicted classes in an image returned by predict function
labels (torch.Tensor) – class labels in an image returned by predict function
image (numpy.ndarray) – an image to draw boxes on.
- Returns:
cv2 image after drawing boxes of the predicted classes on it with their labels
- Return type:
numpy.ndarray
- detect_webcam(video_index=0, window_name='Cam Detection')
Detect objects from a webcam stream using the torch model (press ‘q’ to quit).
The stream is retrieved and decoded using the OpenCV library.
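A webcam loop of this kind is commonly written with OpenCV as in the sketch below. It assumes the `opencv-python` package is installed and a camera is attached. `stream_detections` and its `predict` argument are hypothetical names, and the library's own detect_webcam may be implemented differently.

```python
def stream_detections(predict, video_index=0, window_name="Cam Detection"):
    """Show frames from a camera after passing each one through `predict`.

    Hypothetical sketch, not the dronevis implementation. `predict` is any
    callable that takes a frame (numpy array) and returns the frame to display.
    """
    import cv2  # imported here so the function can be defined without OpenCV installed

    capture = cv2.VideoCapture(video_index)
    if not capture.isOpened():
        raise RuntimeError(f"could not open camera at index {video_index}")
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # the stream ended or the camera was disconnected
            cv2.imshow(window_name, predict(frame))
            # waitKey also services the GUI event loop; quit when 'q' is pressed.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        # Always free the camera and close the window, even on errors.
        capture.release()
        cv2.destroyAllWindows()
```

Calling `stream_detections(lambda frame: frame)` would display the raw camera feed; passing a function that draws boxes on each frame would display the detections instead.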
As pretrained PyTorch models have many methods in common, TorchDetectionModel unifies these methods in a single class, and each torch model implementation inherits from this class.
However, each inherited model must implement the load_model method.
from dronevis.abstract.abstract_torch_model import TorchDetectionModel

class CustomTorchModel(TorchDetectionModel):
    def __init__(self):
        super(CustomTorchModel, self).__init__()

    def load_model(self):
        """Load model weights"""
        pass