Pushing object detection to the limit

Rigid object detection is not a trivial task, especially if you want to run it on ARM devices. If you are new to object detection, please read this page carefully (free cascades and a lot of tips for getting started with OpenCV).

In recent years rigid object detection has played an important role in computer vision, and the marketplace is now asking for more “powerful” algorithms to detect objects. There are two main strategies:

  • Detection/classification of a large number of objects (i.e. more than a dozen): the best (trendy) approach is Deep Learning, but this strategy is extremely expensive to train (billions of examples, thousands of labels) and computationally prohibitive on low-cost devices. It is the right approach for “remote server” classification; it will not be “really” suitable for mobile phones and similar devices for at least a couple of years yet.
  • Detection/classification of a limited number of objects: the best approach is a composition of boosted binary classifiers, trained with a limited number of examples. It is the right approach for “live” detection/classification, and it is already suitable for ARM and mobile devices in general (e.g. face detection within your smartphone’s camera app).
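
To make the “composition of boosted binary classifiers” in the second strategy more concrete, here is a minimal sketch of a generic attentional cascade with early rejection. It is not the SCARTMAN IOT implementation: the `Stage` class, the `stump` helper and the toy thresholds are hypothetical names and values chosen only for illustration, and a “window” is reduced to a short list of numbers.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A weak classifier maps a window (here just a list of numbers) to a vote.
Weak = Callable[[Sequence[float]], float]


@dataclass
class Stage:
    weak_classifiers: List[Weak]  # boosted ensemble for this stage
    threshold: float              # minimum summed vote needed to pass


def cascade_accepts(stages: List[Stage], window: Sequence[float]) -> bool:
    """Return True only if the window passes every stage in order."""
    for stage in stages:
        score = sum(h(window) for h in stage.weak_classifiers)
        if score < stage.threshold:
            return False  # early rejection: later stages are never evaluated
    return True


def stump(index: int, cutoff: float, weight: float) -> Weak:
    """Hypothetical decision stump: votes `weight` when window[index] > cutoff."""
    return lambda w: weight if w[index] > cutoff else 0.0


# Toy two-stage cascade (illustrative values, not trained).
stages = [
    Stage([stump(0, 0.5, 1.0)], threshold=1.0),
    Stage([stump(1, 0.2, 0.6), stump(2, 0.2, 0.6)], threshold=1.0),
]

accepted = cascade_accepts(stages, [0.9, 0.3, 0.4])
rejected_early = cascade_accepts(stages, [0.1, 0.9, 0.9])
```

Most windows in an image contain no object, so rejecting them in the first cheap stage is what keeps this kind of detector fast enough for “live” use on small devices.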

From a researcher’s point of view, a common feeling is that rigid object detection is a “closed” topic: the Viola and Jones approach with differential features detects objects with good reliability. This is partially correct, but not fully. In the coming Internet of Things (IoT) era, detecting objects on extremely low-power devices will be essential (digital signage, sensors, automotive, etc.). For these reasons we designed and implemented the SCARTMAN IOT cascade detector.
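
For readers new to the topic, the sketch below shows the kind of feature the classic Viola and Jones detector relies on, assuming that “differential features” here means Haar-like rectangle differences computed on an integral image. The function names are illustrative and the code is plain Python for clarity, not an optimized or SCARTMAN IOT implementation.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the rectangle with top-left (x, y), in 4 lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii: List[List[int]], x: int, y: int,
                             w: int, h: int) -> int:
    """Left half minus right half of a w-by-h window (w is assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Tiny 2x4 "image" used only to exercise the functions.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8]]
ii = integral_image(img)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 2)
```

Once the integral image is built, every rectangle sum costs four memory reads regardless of its size, which is why this family of features suits low-power hardware.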


Here is a list of its features compared to the “classic” Viola and Jones approach with differential features:

  • Given a feature (HAAR, LBP, HOG or a custom one), the SCARTMAN IOT training step improves the power of each feature by up to 10x. For a 10x improvement in learning, the feature is only about 5% slower in the field.
  • For the same quality, SCARTMAN IOT needs just 20% of the features of a classic V&J cascade, so the average detection speed is approximately 5x faster than that of usual cascades.
  • For the same computational load, SCARTMAN IOT can describe the object in more detail. The dataset can include more poses and occlusions, or different objects as well in the case of multi-class cascades.
  • SCARTMAN IOT requires less than 15 MB of RAM to detect several classes of objects.
  • SCARTMAN IOT is written in C++ and is suitable for Windows, Linux and Android. It does not depend on any third-party library, LAPACK and BLAS included (linear algebra written from scratch, suitable for devices without an OS). Tested on ARM, ready for IoT.
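
As a back-of-the-envelope check of the speed figures in the list above, the sketch below assumes that detection time is simply proportional to the number of features evaluated times the cost of one feature. That model is a simplification introduced here (a real cascade evaluates a different number of features per window because of early rejection), and combining the 20% and +5% figures in a single formula is also an assumption, not a published result.

```python
def estimated_speedup(feature_fraction: float, per_feature_slowdown: float) -> float:
    """Speedup over the baseline, assuming time = features * cost per feature."""
    if feature_fraction <= 0 or per_feature_slowdown <= 0:
        raise ValueError("both arguments must be positive")
    return 1.0 / (feature_fraction * per_feature_slowdown)


# 20% of the features at unchanged per-feature cost.
ideal = estimated_speedup(0.20, 1.00)

# 20% of the features, each one 5% slower to evaluate.
with_overhead = estimated_speedup(0.20, 1.05)

print(f"ideal: {ideal:.2f}x, with 5% overhead: {with_overhead:.2f}x")
```

Under these assumptions the two figures together give a speedup a little below 5x, which is consistent with the “approximately 5x” stated above.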


    • Dear Mr. Nascimento,
      at the moment SCARTMAN IOT is used only in an internal project; we are planning to enter the marketplace by Jan. 2016. In the meantime we are talking with a couple of #AR companies about sharing this technology, but nothing has been decided yet.

      As for the training suite, we have an internal tool (multi-threaded, C++), but there is no chance of sharing it. In any case, the training suite creates models compatible only with our detection suite.

      As for the reports, we have some internal reports about performance and detection skills. For now we have shared the data you can read above. Once the final version is released we will update the “numbers” above. The stats you can read above are realistic and confirmed by intensive tests.

      If you are interested in evaluating our SCARTMAN IOT detector, please contact us by email, stating your company name and your requirements. We can also share some optimized cascades, suitable for OpenCV, so that you can start an evaluation.
