Boost The World: OpenCV Boosted Cascades HAAR LBP HOG

Introduction

In recent years, boosted cascades have become popular thanks to increasingly satisfying detection of a wide range of objects. Their success comes from the efficiency of the approach and its ability to detect rigid objects in real time under different conditions, including on low-cost, mostly ARM-based devices (such as smartphones, Raspberry Pi, Arduino, etc.). This repository aims to collect (with a scientific approach) the largest possible number of boosted cascades in OpenCV style. At the moment, no HAAR cascades will be released because of the amount of work, time and RAM needed to build a competitive cascade. Training a real HAAR cascade often means keeping a PC busy for approximately a week with more than 32 GB of RAM allocated. Of course, you can train a HAAR cascade with 500 positive/negative (P/N) samples, 20 stages and a subspace of the features. That is a good way to get acquainted with the problem, but not a way to develop something real for the marketplace.


Why re-train a boosted cascade?

There are many reasons to re-train a cascade; the most relevant are:

  • Your object is not covered by the OpenCV repository (e.g. cat faces; edit: recently added).
  • You need to handle wider variability or different situations (e.g. face poses, different light conditions, etc.). Our datasets are collected (with a semi-supervised approach) from Tumblr, YouTube, etc., capturing more data variability than usual. The OpenCV boosted cascades come from the (not so near) past: 5 to 10 years ago, datasets were too often poor in variability because of the difficulty of accessing “fresh” data.
  • A shortage of CPU/RAM usually leads to a “sparse” feature allocation. The cascades below were trained on an x64 machine with 16 cores and 32 GB of RAM (we can also use a monster machine with 64 cores and 128 GB of RAM, which gets close to the “full” feature-space allocation rather than just a subspace).
  • With a few modifications to OpenCV, the “traincascade” routine can be exploited to reach better performance. The biggest speed issues are linked to the cascade structure, the feature selection strategy, the dataset’s taxonomy, etc.
  • You need an ARM version with no float emulation.
  • HAAR features are outperformed by other features in both efficiency and accuracy for most “commercial” needs. It is therefore advisable to re-train your own cascade to make the most of the “new” features’ capabilities.
  • You need a faster and more reliable detector to beat your competitors (companies usually “wrap” the OpenCV detector within their commercial products).

Performance

The boosted cascades perform well in both efficiency and accuracy. With a few modifications (outside the scope of this academic licence) the cascades below can run 50% to 150% faster and their reliability can grow by 5% to 10% (in both false positives and false negatives). The following cascades were trained with a modified version of OpenCV 2.4.9 that includes many optimizations, improving both the accuracy and the speed of detection.

Generally speaking, the speed of an object detector depends heavily on many factors: parameters (e.g. minimum face size detected, scale progression), architecture (e.g. ARM, x64), field of application (e.g. first-person POV, video surveillance), etc. If your goal is detecting and tracking an object, the detection step is usually slower than the tracking step. In multi-object detection a “re-initialization” step is necessary anyway, so a full-image scan is periodically mandatory (when just one face at a time is supported, a lot of optimization can be done to speed up detection and tracking considerably).

Regardless of these premises, building a fast and reliable cascade is the first, and fundamental, step. No real-time system can be deployed on ARM and mobile with an inefficient cascade (do not consider the “classic” OpenCV boosted cascades “efficient”; testing them on an i5/i7 processor is not a reliable test). Our boosted cascades run easily at 30 FPS on a Raspberry Pi (single core, 700 MHz) with 2 cameras.
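
To give a feel for how the scale progression and the minimum object size affect the work done per frame, the sketch below counts how many window sizes a multi-scale scan would visit. It is a simplified model, not OpenCV's actual implementation: the function name `countScanScales` is hypothetical, and the only assumption is that the detection window grows geometrically from the minimum size until it no longer fits in the image.

```cpp
#include <cassert>

// Simplified model (an assumption, not OpenCV's code): the detection window
// starts at minObjectSide pixels and is multiplied by scaleFactor at each
// level until it no longer fits inside the shorter side of the image.
int countScanScales(int imageSide, int minObjectSide, double scaleFactor) {
    assert(scaleFactor > 1.0 && minObjectSide > 0);
    int levels = 0;
    for (double side = minObjectSide; side <= imageSide; side *= scaleFactor) {
        ++levels;  // one full sliding-window pass per window size
    }
    return levels;
}
```

Under these assumptions, with a hypothetical 480-pixel image side and a 40-pixel minimum object, `countScanScales(480, 40, 1.1)` returns 27 window sizes while `countScanScales(480, 40, 1.3)` returns 10. A coarser progression means fewer passes over the image, at the price of skipping intermediate object sizes.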

Our benchmarks on ~30,000 images with faces in real scenarios (frontal, semi-frontal, occlusions, etc.) follow. Our cascade drastically outperforms the OpenCV ones in every tested configuration.

 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.1000
 neighs: 2

 accuracy: 0.9950
 sensitivity: 0.9950
 fmeasure: 0.9975
 MISSED: 180

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.1000
 neighs: 2

 accuracy: 0.3795
 sensitivity: 0.3795
 fmeasure: 0.5502
 MISSED: 23194

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.1000
 neighs: 2

 accuracy: 0.4929
 sensitivity: 0.4929
 fmeasure: 0.6603
 MISSED: 18956

Full results available at the end of this page.
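
For readers who want to relate the sensitivity and F-measure figures, the sketch below computes the usual F-measure, F = 2PR / (P + R), with P the precision and R the sensitivity (recall). The helper name `fMeasure` is chosen here for illustration and is not part of the original benchmark code. As an observation rather than a statement from the authors, the reported F-measures match what this formula returns when the precision is set to 1 (to within rounding), which suggests that false positives were not counted in these tables.

```cpp
#include <cassert>

// Standard F-measure (F1): harmonic mean of precision and recall.
// Hypothetical helper, not taken from the original benchmark code.
double fMeasure(double precision, double recall) {
    assert(precision >= 0.0 && recall >= 0.0);
    if (precision + recall == 0.0) {
        return 0.0;  // avoid dividing by zero when nothing was detected
    }
    return 2.0 * precision * recall / (precision + recall);
}
```

For example, `fMeasure(1.0, 0.9950)` gives about 0.9975 and `fMeasure(1.0, 0.3795)` about 0.5502, in line with the first two entries above.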

Usage

The following boosted cascades are compatible with OpenCV >= 2.4.9 and 3.0.0 (beta). They are probably suitable for 2.4.5 < OpenCV < 2.4.9 as well, but this has not been tested. We are fairly sure the cascades are not fully compatible with versions before 2.4.5 (the reason is out of scope for this page). Use them at your own risk. We suggest using version 2.4.11 for testing.
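
The compatibility notes above can be turned into a small guard before loading a cascade. The sketch below is only an illustration: `cascadeSupport` is a hypothetical helper, not part of OpenCV or of the cascades, and the labels it returns are paraphrases of the text. Version 2.4.5 itself and versions newer than 3.0.0 are reported as unknown because the text does not cover them.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical helper: maps an OpenCV version (major, minor, patch) to the
// compatibility notes given in the text. Not an OpenCV API.
std::string cascadeSupport(int major, int minor, int patch) {
    const auto v = std::make_tuple(major, minor, patch);
    if (v > std::make_tuple(3, 0, 0)) return "unknown";  // newer than the text covers
    if (v >= std::make_tuple(2, 4, 9)) return "compatible";  // 2.4.9 up to 3.0.0 (beta)
    if (v > std::make_tuple(2, 4, 5)) return "probably compatible (untested)";
    if (v < std::make_tuple(2, 4, 5)) return "not fully compatible";
    return "unknown";  // exactly 2.4.5: the text does not say
}
```

For instance, `cascadeSupport(2, 4, 11)` returns "compatible" and `cascadeSupport(2, 4, 2)` returns "not fully compatible". In a real build the three numbers would typically come from OpenCV's version macros (`CV_MAJOR_VERSION`, `CV_MINOR_VERSION`, `CV_SUBMINOR_VERSION`); check the headers of your installed version before relying on them.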


License and Commercial Use

The available boosted cascades are a “limited” version of the “real” boosted cascades we built, and they are for research purposes ONLY. The limitations concern speed, accuracy (in both false and true positives), and portability to ARM architectures (they are compatible, but too slow for “realistic” use). The “full” boosted cascades are available for commercial use; we are also available to train brand new boosted cascades for your specific problems.

*NEWS*: in June 2016 the vision-ary project joined ARGO Vision, an innovative firm that excels in visual recognition. For inquiries about cascades and more, please contact ARGO Vision.

At the moment 5 boosted cascades are involved in 2 commercial products.

At the moment 7 boosted cascades are involved in 3 commercial products.

At the moment 8 boosted cascades are involved in 4 commercial products.

At the moment 9 boosted cascades are involved in 5 commercial products.

At the moment 10 boosted cascades are involved in 6 commercial products.

At the moment 11 boosted cascades are involved in 6 commercial products.

At the moment 12 boosted cascades are involved in 7 commercial products.

At the moment 13 boosted cascades are involved in 7 commercial products.


Available Cascades

Cats: LBP | HAAR | HOG

Pedestrians: LBP | HAAR | HOG

Frontal Face: LBP | HAAR | HOG

Frontal Car: LBP | HAAR | HOG [only commercial, contact us]

Front/Rear Car: LBP | HAAR | HOG [only commercial, contact us]

Eyes: LBP | HAAR | HOG

Coming soon

Rear Car: LBP | HAAR | HOG
Profile Car: LBP | HAAR | HOG

Bike: LBP | HAAR | HOG

Road Signs: LBP | HAAR | HOG

Profile Face: LBP | HAAR | HOG
Generic Face: LBP | HAAR | HOG [only commercial, contact us]

Mouth: LBP | HAAR | HOG
Nose: LBP | HAAR | HOG

Hand: LBP | HAAR | HOG


OpenCV reference: OpenCV official

Full results for the face detection problem

-------------------

 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.2000
 neighs: 2

 accuracy: 0.9940
 sensitivity: 0.9940
 fmeasure: 0.9970
 MISSED: 230

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.2000
 neighs: 2

 accuracy: 0.3182
 sensitivity: 0.3182
 fmeasure: 0.4828
 MISSED: 25487

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.2000
 neighs: 2

 accuracy: 0.4336
 sensitivity: 0.4336
 fmeasure: 0.6049
 MISSED: 21174

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.3000
 neighs: 2

 accuracy: 0.9916
 sensitivity: 0.9916
 fmeasure: 0.9958
 MISSED: 320

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.3000
 neighs: 2

 accuracy: 0.2406
 sensitivity: 0.2406
 fmeasure: 0.3878
 MISSED: 28389

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.3000
 neighs: 2

 accuracy: 0.3484
 sensitivity: 0.3484
 fmeasure: 0.5168
 MISSED: 24358

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.1000
 neighs: 3

 accuracy: 0.9947
 sensitivity: 0.9947
 fmeasure: 0.9973
 MISSED: 201

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.1000
 neighs: 3

 accuracy: 0.3552
 sensitivity: 0.3552
 fmeasure: 0.5242
 MISSED: 24104

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.1000
 neighs: 3

 accuracy: 0.4720
 sensitivity: 0.4720
 fmeasure: 0.6413
 MISSED: 19737

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.2000
 neighs: 3

 accuracy: 0.9932
 sensitivity: 0.9932
 fmeasure: 0.9966
 MISSED: 258

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.2000
 neighs: 3

 accuracy: 0.2931
 sensitivity: 0.2931
 fmeasure: 0.4533
 MISSED: 26425

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.2000
 neighs: 3

 accuracy: 0.4104
 sensitivity: 0.4104
 fmeasure: 0.5819
 MISSED: 22041

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.3000
 neighs: 3

 accuracy: 0.9885
 sensitivity: 0.9885
 fmeasure: 0.9942
 MISSED: 433

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.3000
 neighs: 3

 accuracy: 0.2096
 sensitivity: 0.2096
 fmeasure: 0.3465
 MISSED: 29548

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.3000
 neighs: 3

 accuracy: 0.3256
 sensitivity: 0.3256
 fmeasure: 0.4912
 MISSED: 25211

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.1000
 neighs: 4

 accuracy: 0.9941
 sensitivity: 0.9941
 fmeasure: 0.9970
 MISSED: 221

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.1000
 neighs: 4

 accuracy: 0.3368
 sensitivity: 0.3368
 fmeasure: 0.5039
 MISSED: 24792

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.1000
 neighs: 4

 accuracy: 0.4563
 sensitivity: 0.4563
 fmeasure: 0.6267
 MISSED: 20324

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.2000
 neighs: 4

 accuracy: 0.9915
 sensitivity: 0.9915
 fmeasure: 0.9957
 MISSED: 317

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.2000
 neighs: 4

 accuracy: 0.2719
 sensitivity: 0.2719
 fmeasure: 0.4276
 MISSED: 27217

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.2000
 neighs: 4

 accuracy: 0.3925
 sensitivity: 0.3925
 fmeasure: 0.5637
 MISSED: 22709

-------------------
 model: 'ARGO Vision frontal face cascade.xml'
 scale: 1.3000
 neighs: 4

 accuracy: 0.9846
 sensitivity: 0.9846
 fmeasure: 0.9922
 MISSED: 576

-------------------
 model: 'lbpcascade_frontalface.xml'
 scale: 1.3000
 neighs: 4

 accuracy: 0.1861
 sensitivity: 0.1861
 fmeasure: 0.3138
 MISSED: 30425

-------------------
 model: 'haarcascade_frontalface_default.xml'
 scale: 1.3000
 neighs: 4

 accuracy: 0.3053
 sensitivity: 0.3053
 fmeasure: 0.4677
 MISSED: 25971

29 comments

  1. Hi

    Do you have some kind of tutorial on how you train these cascades? I would like to train my own with OpenCV, but my current hit rate is very poor. I have 1600 positive images and about 4000 negative images of license plates and cars. I got about an 86% hit rate on the training set and lower than 50% on the test set.

    Best regards,
    Kälver

    • I have no tutorial at the moment. You are experiencing overfitting to your positive dataset (in training), or poor generalization because the training set doesn’t cover the “semantics” of the cars. Moreover, 1600 samples are not that many: I trained my latest (commercial) car cascade with ~100,000 randomly sampled positives.

      I’m available for consultancy, contact me at info {{AT}} “nameofthesite”.net

      Regards,
      vision-ary

  2. Hi, all this looks great. Thanks for posting your work and teaching articles.

    I tried to use the car truck model with opencv 2.4.10 and got no results. Even using the picture on your site: http://www.vision-ary.net/wp-content/uploads/2015/06/cars_and_trucks_vision-ary.net_.png

    I’m using the model like this:

    CascadeClassifier detector;
    detector.load("../../Models/visionary.net_cars_and_truck_cascade_web_HAAR.xml");
    detector.detectMultiScale(image, detections, 1.1, 1, 0 | CV_HAAR_SCALE_IMAGE, Size(40, 40));

    Any thoughts on this??

    Thank you very much.

    • Hello Matias,
      thanks for sharing your experience.

      I’m glad to help you, but I need to understand what you mean by “no results”:
      – no bounding boxes at all?
      – no cars detected (the cascade works poorly)?
      – software issues (check the detection time and report here) ?

      Regards,
      Alessandro.

      • Hi Alessandro, thanks for your reply. I insist, you are doing a great job here 🙂

        Yeah that’s it, no bounding boxes detected, meaning no cars detected as well.
        The model is loaded correctly, and it takes about 1 or 2 ms to analyse each picture (sometimes less, depending on the size of course). But no bounding boxes are detected.

        Regards,
        Alessandro.

        • Ok,
          it looks like something I have already experienced. There’s a “logical” bug in OpenCV that makes a cascade unusable across different versions. I wasted a week some months ago handling this bug; the “1-2 ms detection time” was exactly what highlighted the problem to me.

          I’ll deep dive again to check it!

          Gimme some days to check it. Regards,
          A.

  3. Hello Matias,
    I double-checked the haar car cascade: http://www.vision-ary.net/2015/06/boost-the-world-car-detection/

    As I pointed out, the training process differs slightly between OpenCV versions. Even if nominally compatible, cascades won’t work across some OpenCV versions: as I wrote on the car cascade page, the cascade is suitable ONLY for OpenCV < 2.4.3 (please open the cascade file, it is written inside as well). The reason is tricky and it leads back to the AdaBoost C++ implementation within the traincascade project. Please use OpenCV 2.4.2 to test the car cascade; I'm sure it works well (a similar one is within a commercial product). Unfortunately this cascade is the oldest one in the repository I've shared; it was trained with OpenCV 2.4.2. All the other cascades were trained with versions > 2.4.9.

    In the future I will convert the cascade to the latest “format”.

  4. Hello Vision-Ary,
    we evaluated your cascades, they are superior in performance (both speed and accuracy). My company would like to evaluate the chance to include them in our computer vision commercial projects, how can we proceed?

    Regards,
    Michael.

    • Hello Michael,
      thanks for contacting us!

      We are evaluating offers from several companies, we will contact you soon by email to start a discussion about it.

      Thanks,
      Vision-ary team

  5. Hello,
    we would like to improve the detection approach to a wider class of objects keeping the best efficiency. Do you have any idea or suggestion?

    Bye,
    Sheila!

    • Hello Sheila,
      thanks for contacting us!

      We developed a multi-class version of the cascade approach for object detection and classification. Please contact us by the contact form for further information.

      Regards,
      Vision-ary team.

  6. Hey!

    We evaluated your cascades, amazing! My company would like to discuss with you the chance to make them suitable for our products on the marketplace. How can we proceed?

    Is the training process suitable for Linux env.?

    • Hello,

      thanks for contacting us!

      Please send an email by the contact form, we will share with you the info you need to start with the integration.

      The training process is mixed Matlab/C++, suitable for Win and Linux.

      Regards,
      Vision-ary team

  7. Hi, I am interested in your cascades for my final project. Do you have a new cascade training? My project needs tracking and people detection in OpenCV, but so far I can only detect with detectMultiScale. Do you have another method to detect people with a webcam?

    Thanks for your great job

    • Hello Carlos,
      we have a lot of cascades; most of them are in commercial projects now.

      Please specify what you need; detecting people is a common task in computer vision and there are a lot of methods.

      Regards,
      vision-ary team.

  8. Hello,

    Great work you have here, congratulations. I’m an engineering student from Brazil and I wonder if it is possible to use your face and eye LBP cascades in my final paper. I sent an e-mail to you with all the details and I would appreciate it if you took a look.

    Thank you and continue this great job.
    Best regards.

  9. Do you mind if I quote a couple of your articles
    as long as I provide credit and sources
    back to your website? My blog site is in the very same niche
    as yours and my visitors would truly benefit from a lot of the information you provide
    here. Please let me know if this is okay with you. Appreciate it!

  10. Hello,

    I am interested in using your commercial cascade (HOG or Haar) for car detection. Could you please get back to me with more details?

  11. Hi,

    I need to detect cars from all angles. Is this being worked on? If yes, can you please let me know.

    Regards.

    • Hello Amit,
      we are able to do it! Send an email to us and we will reply to you about your project!

      Vision-ary team

  12. I am curious how the boosted cascades compare to using a CNN. With OpenCV 3.3 you can import an already built Caffe, PyTorch, etc. model on a Raspberry Pi and should get better results, I guess.

    • I’m pretty sure you are right. CNNs are extremely powerful compared with cascades. The question is: is it meaningful to detect a single-class rigid object with a CNN (slow training, dedicated GPU, very significant CPU load) if you can detect a face with a cascade on a 200-300 MHz processor in real time? Cascades are still the best speed/accuracy tradeoff, in our opinion.

  13. The result comparison that you show cannot be correct. Are the results you are reporting tested on the training set? If not, do you have any scientific publication about your detector? If you are getting something 150% faster and 10% more reliable, it should be worthy of a CV conference.

    • The results we reported were obtained on a test dataset completely separate from the training set (proprietary dataset, not public).

      Our goal is commercial, not research. This is why we have more than a dozen commercial projects with clients disappointed by OpenCV’s performance.

      Stay tuned for our research on deep learning; we will probably publish something in 2018.
