Pushing object detection to the limit
Rigid object detection is not a trivial task, especially if you want to perform it on ARM devices. If you are new to object detection, please read this page carefully (free cascades and a lot of tips for getting started with OpenCV).
In recent years rigid object detection has played an important role in computer vision, and the marketplace is now asking for more “powerful” detection algorithms. There are two main strategies:
- Detection/classification of a large number of objects (i.e. more than a dozen): the best (and trendiest) approach is Deep Learning, but this strategy is extremely expensive to train (billions of examples, thousands of labels) and computationally prohibitive on low-cost devices. It is the right approach for “remote server” classification, but it will not be “really” suitable for mobile phones and similar devices for at least a couple of years.
- Detection/classification of a limited number of objects: the best approach is a composition of boosted binary classifiers, trained with a limited number of examples. It is the right approach for “live” detection/classification, and it is already suitable for ARM and mobile devices in general (e.g. face detection in your smartphone’s camera app).
From a researcher’s point of view, a common feeling is that rigid object detection is a “closed” topic: the Viola and Jones approach with differential features detects objects with good reliability. This is only partially correct. In the coming Internet of Things (IoT) era, detecting objects on extremely low-power devices will be essential (digital signage, sensors, automotive, etc.). For these reasons we designed and implemented the SCARTMAN IOT cascade detector.
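For readers new to the cascade idea, here is a minimal C++ sketch of how a generic cascade of boosted binary classifiers evaluates one image window and rejects it as early as possible. It is only an illustration of the classic technique, not the SCARTMAN IOT implementation: the names `Stump`, `Stage` and `classify_window` are hypothetical, and the feature values are assumed to be already computed for the window.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical weak classifier: a decision stump over one precomputed feature value.
struct Stump {
    std::size_t feature_index; // which feature of the window to read
    float threshold;           // split point on the feature value
    float left_vote;           // vote when value < threshold
    float right_vote;          // vote when value >= threshold
};

// One boosted stage: the sum of stump votes is compared with a stage threshold.
struct Stage {
    std::vector<Stump> stumps;
    float stage_threshold;
};

// Returns true only if the window passes every stage.
// features_evaluated reports how many features were actually read,
// which is the quantity that dominates the detection time.
bool classify_window(const std::vector<Stage>& cascade,
                     const std::vector<float>& features,
                     std::size_t& features_evaluated) {
    features_evaluated = 0;
    for (const Stage& stage : cascade) {
        float score = 0.0f;
        for (const Stump& s : stage.stumps) {
            assert(s.feature_index < features.size());
            const float value = features[s.feature_index];
            score += (value < s.threshold) ? s.left_vote : s.right_vote;
            ++features_evaluated;
        }
        if (score < stage.stage_threshold) {
            return false; // early rejection: later stages are never evaluated
        }
    }
    return true;
}
```

Most windows in an image contain no object and are rejected by the first, cheapest stages, so the average cost per window depends strongly on how many features each stage needs.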
SCARTMAN IOT FEATURES
The following features are listed in comparison with the “classic” Viola and Jones approach with differential features:
- Given a feature (HAAR, LBP, HOG or a custom one), the SCARTMAN IOT training step improves the power of each feature by up to 10x. For a 10x improvement in learning, the feature is only about 5% slower in the field.
- For the same quality, SCARTMAN IOT needs just 20% of the features of a classic V&J cascade. As a result, detection is on average approximately 5x faster than with usual cascades.
- For the same computational load, SCARTMAN IOT has a greater capacity to describe the object. The dataset can include more poses, occlusions, or even different objects in the case of multi-class cascades.
- SCARTMAN IOT requires less than 15 MB of RAM to detect several classes of objects.
- SCARTMAN IOT is written in C++ and is suitable for Windows, Linux and Android. It does not depend on any third-party library, LAPACK and BLAS included (the linear algebra is written from scratch, so it is suitable for devices without an OS). Tested on ARM, ready for IoT.
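The link between "20% of the features" and "approximately 5x faster" in the list above can be checked with a back-of-the-envelope model. The sketch below assumes that the cost of a window is proportional to the number of features evaluated, that the per-stage pass rates stay the same, and that the roughly 5% per-feature overhead from the first bullet applies to every feature. The names `StageStats`, `expected_cost_per_window` and `scale_features`, as well as all stage numbers, are hypothetical and are not measurements of SCARTMAN IOT.

```cpp
#include <vector>

// Hypothetical description of one cascade stage.
struct StageStats {
    double num_features; // features evaluated when the stage is reached
    double pass_rate;    // fraction of windows passed on to the next stage
};

// Expected cost per window: each stage costs num_features * cost_per_feature,
// weighted by the probability that a window reaches that stage.
double expected_cost_per_window(const std::vector<StageStats>& stages,
                                double cost_per_feature) {
    double reach_probability = 1.0;
    double cost = 0.0;
    for (const StageStats& s : stages) {
        cost += reach_probability * s.num_features * cost_per_feature;
        reach_probability *= s.pass_rate;
    }
    return cost;
}

// Same cascade with every stage shrunk to the given fraction of its features.
std::vector<StageStats> scale_features(std::vector<StageStats> stages,
                                       double fraction) {
    for (StageStats& s : stages) {
        s.num_features *= fraction;
    }
    return stages;
}
```

Under these assumptions the ratio between the two costs is 1 / (0.20 × 1.05), about 4.8, whatever the stage sizes and pass rates are. For example, comparing `expected_cost_per_window(classic, 1.0)` with `expected_cost_per_window(scale_features(classic, 0.2), 1.05)` for any made-up `classic` cascade gives that same ratio, which is consistent with the "approximately 5x" figure.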