# Viola Jones

*Read this in other languages: [English](README.md)*

## Description

Implementation of the "Viola Jones" algorithm in Python and C++.

## Dependencies

- Python
- pip
- Bash
- Make
- curl
- tar
- CUDA Toolkit
- cuDNN

## Usage

### C++

You can configure the algorithm with the global variables defined at the top of the *ViolaJones.cpp* file, then run 'make start'. The 'make clean' command removes all compiled files.

### Python

You can configure the algorithm in the *config.py* file, then run it with 'make start'.

**Note: The script automatically downloads the dataset.**

**Note: You can delete all saved results with the 'make reset' command.**

## Training

The algorithm was trained on an Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz and an NVIDIA GeForce RTX 2080 Ti GPU.

### Execution time comparison table

R = CPU time / GPU or NJIT time

If R >= 1, the run is R times faster than on the CPU. If R < 1, the run is R^-1 times slower than on the CPU.

As it turns out, the GPU consistently beats the CPU in execution time, so the figures shown are the ratios R.

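
As a worked example of the ratio R defined above, here is a minimal Python sketch. The helper names `speedup_ratio` and `describe` are hypothetical and not part of the repository; the timings are the "Converting training set to integral images" values listed in the appendix tables.

```python
def speedup_ratio(cpu_time_ns: int, other_time_ns: int) -> float:
    """Return R = CPU time / GPU or NJIT time (hypothetical helper)."""
    if other_time_ns <= 0:
        raise ValueError("the compared time must be strictly positive")
    return cpu_time_ns / other_time_ns


def describe(ratio: float) -> str:
    """Spell out a ratio following the R >= 1 / R < 1 convention."""
    if ratio >= 1:
        return f"{ratio:.2f} times faster than CPU"
    return f"{1 / ratio:.2f} times slower than CPU"


# Converting the training set to integral images (timings in nanoseconds)
cpu_ns, gpu_ns, njit_ns = 1_120_022_400, 88_759_800, 3_989_000
print(describe(speedup_ratio(cpu_ns, gpu_ns)))   # 12.62 times faster than CPU
print(describe(speedup_ratio(cpu_ns, njit_ns)))  # 280.78 times faster than CPU
```
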
| Preprocessing                              | GPU     | NJIT   |
| ------------------------------------------ | ------- | ------ |
| Converting training set to integral images | 12.62   | 280.78 |
| Converting testing set to integral images  | 15.90   | 251.18 |
| Applying features to training set          | 3252.38 | 191.87 |
| Applying features to testing set           | 3204.09 | 114.48 |

| Training           | GPU   | NJIT   |
| ------------------ | ----- | ------ |
| ViolaJones T = 1   | 40.29 | 25.08  |
| ViolaJones T = 5   | 64.64 | 124.17 |
| ViolaJones T = 10  | 64.98 | 121.03 |
| ViolaJones T = 25  | 67.65 | 126.69 |
| ViolaJones T = 50  | 67.35 | 128.80 |
| ViolaJones T = 100 | 66.86 | 128.31 |
| ViolaJones T = 200 | 65.92 | 126.71 |
| ViolaJones T = 300 | 65.47 | 124.91 |

## Evaluation

Since the ViolaJones algorithm is deterministic, all models trained with a given T, regardless of the backend (CPU, NJIT or GPU), are the same models with the same parameters.

Reminder: ACC (accuracy), F1 (F1 score), FN (false negatives) and FP (false positives). Here, (E) denotes the training set (*entraînement*) and (T) the testing set.

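
As a reminder of how these metrics are defined, here is a minimal Python sketch that computes accuracy and the F1 score from confusion counts using the standard formulas. The function names and the example counts are made up for illustration; they are not taken from the repository or from the results table.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * TP / (2 * TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical confusion counts, for illustration only
tp, tn, fp, fn = 80, 90, 10, 20
print(f"ACC = {accuracy(tp, tn, fp, fn):.2%}")  # ACC = 85.00%
print(f"F1  = {f1_score(tp, fp, fn):.2f}")      # F1  = 0.84
```

Because the F1 score ignores true negatives, a classifier can reach a high accuracy and a low F1 score at the same time on a set dominated by negative samples.
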
| Evaluating         | ACC (E) | F1 (E) | FN (E) | FP (E) | ACC (T) | F1 (T) | FN (T) | FP (T) |
| ------------------ | ------- | ------ | ------ | ------ | ------- | ------ | ------ | ------ |
| ViolaJones T = 1   | 86.37%  | 0.82   | 753    | 198    | 75.64%  | 0.09   | 5,662  | 196    |
| ViolaJones T = 5   | 85.27%  | 0.77   | 318    | 710    | 91.86%  | 0.09   | 1,582  | 375    |
| ViolaJones T = 10  | 86.01%  | 0.80   | 545    | 431    | 93.34%  | 0.13   | 1,248  | 354    |
| ViolaJones T = 25  | 92.06%  | 0.89   | 373    | 181    | 93.79%  | 0.19   | 1,201  | 292    |
| ViolaJones T = 50  | 94.20%  | 0.92   | 239    | 166    | 96.23%  | 0.25   | 588    | 319    |
| ViolaJones T = 100 | 95.41%  | 0.93   | 152    | 168    | 96.54%  | 0.22   | 479    | 352    |
| ViolaJones T = 200 | 96.24%  | 0.95   | 133    | 129    | 96.78%  | 0.17   | 381    | 394    |
| ViolaJones T = 300 | 96.75%  | 0.95   | 94     | 133    | 96.93%  | 0.17   | 343    | 394    |

## Appendix

### Execution time of the common parts

| Preprocessing                       | Time spent (ns) | Formatted time spent |
| ----------------------------------- | --------------- | -------------------- |
| Compiling NJIT and GPU              | 6,315,144,200   | 6s 315ms 144µs 200ns |
| Loading sets                        | 155,582,900     | 155ms 582µs 900ns    |
| Building features                   | 292,216,000     | 292ms 216µs          |
| Selecting best features             | 3,956,470,800   | 3s 956ms 470µs 800ns |
| Precalculating training set argsort | 1,356,386,000   | 1s 356ms 386µs       |
| Precalculating testing set argsort  | 4,766,277,500   | 4s 766ms 277µs 500ns |

### Unit tests

The unit tests check that the files produced by each backend are equal (CPU == NJIT == GPU). Since the ViolaJones algorithm is deterministic, the files should be equal (as far as floating-point precision allows).

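
The kind of equality check described above can be sketched in Python as follows. This is only an illustration using the standard library: the helpers `all_close` and `backends_agree`, the tolerance values and the sample data are all assumptions, not the repository's actual test code.

```python
import math
from typing import Mapping, Sequence


def all_close(a: Sequence[float], b: Sequence[float],
              rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """True if both sequences have the same length and match within tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


def backends_agree(results: Mapping[str, Sequence[float]]) -> bool:
    """Compare every backend's output against the first one (CPU == NJIT == GPU)."""
    values = list(results.values())
    if len(values) < 2:
        return True  # nothing to compare
    reference, others = values[0], values[1:]
    return all(all_close(reference, other) for other in others)


# Hypothetical outputs; 0.1 + 0.2 differs from 0.3 only by rounding error
alphas = {
    "CPU": [0.1 + 0.2, 1.5],
    "NJIT": [0.3, 1.5],
    "GPU": [0.3, 1.5],
}
print(backends_agree(alphas))  # True
```

Comparing within a tolerance rather than bit for bit reflects the "as far as floating-point precision allows" caveat: different backends may order floating-point operations differently.
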
| Unit testing          | Test state | Time spent (ns) | Formatted time spent     |
| --------------------- | ---------- | --------------- | ------------------------ |
| X_train_feat          | Passed     | 10,527,171,100  | 10s 527ms 171µs 100ns    |
| X_test_feat           | Passed     | 86,698,306,700  | 1m 26s 698ms 306µs 700ns |
| X_train_ii            | Passed     | 1,120,532,600   | 1s 120ms 532µs 600ns     |
| X_test_ii             | Passed     | 589,468,800     | 589ms 468µs 800ns        |
| alphas_1              | Passed     | 15,958,700      | 15ms 958µs 700ns         |
| final_classifiers_1   | Passed     | 13,961,300      | 13ms 961µs 300ns         |
| alphas_5              | Passed     | 41,888,300      | 41ms 888µs 300ns         |
| final_classifiers_5   | Passed     | 23,936,300      | 23ms 936µs 300ns         |
| alphas_10             | Passed     | 62,881,900      | 62ms 881µs 900ns         |
| final_classifiers_10  | Passed     | 82,882,100      | 82ms 882µs 100ns         |
| alphas_25             | Passed     | 11,495,300      | 11ms 495µs 300ns         |
| final_classifiers_25  | Passed     | 62,827,900      | 62ms 827µs 900ns         |
| alphas_50             | Passed     | 3,987,200       | 3ms 987µs 200ns          |
| final_classifiers_50  | Passed     | 46,897,900      | 46ms 897µs 900ns         |
| alphas_100            | Passed     | 2,991,400       | 2ms 991µs 400ns          |
| final_classifiers_100 | Passed     | 100,732,100     | 100ms 732µs 100ns        |
| alphas_200            | Passed     | 6,979,900       | 6ms 979µs 900ns          |
| final_classifiers_200 | Passed     | 2,991,600       | 2ms 991µs 600ns          |
| alphas_300            | Passed     | 50,862,500      | 50ms 862µs 500ns         |
| final_classifiers_300 | Passed     | 997,400         | 997µs 400ns              |

### CPU execution time

| Preprocessing                                    | Time spent (ns)   | Formatted time spent         |
| ------------------------------------------------ | ----------------- | ---------------------------- |
| Converting training set to integral images (CPU) | 1,120,022,400     | 1s 120ms 22µs 400ns          |
| Converting testing set to integral images (CPU)  | 3,757,517,900     | 3s 757ms 517µs 900ns         |
| Applying features to training set (CPU)          | 2,607,923,836,600 | 43m 27s 923ms 836µs 600ns    |
| Applying features to testing set (CPU)           | 8,910,858,819,100 | 2h 28m 30s 858ms 819µs 100ns |

| Training                 | Time spent (ns)   | Formatted time spent         |
| ------------------------ | ----------------- | ---------------------------- |
| ViolaJones T = 1 (CPU)   | 32,948,442,200    | 32s 948ms 442µs 200ns        |
| ViolaJones T = 5 (CPU)   | 159,626,648,000   | 2m 39s 626ms 648µs           |
| ViolaJones T = 10 (CPU)  | 315,165,752,800   | 5m 15s 165ms 752µs 800ns     |
| ViolaJones T = 25 (CPU)  | 773,419,206,100   | 12m 53s 419ms 206µs 100ns    |
| ViolaJones T = 50 (CPU)  | 1,531,656,252,200 | 25m 31s 656ms 252µs 200ns    |
| ViolaJones T = 100 (CPU) | 3,056,693,435,300 | 50m 56s 693ms 435µs 300ns    |
| ViolaJones T = 200 (CPU) | 6,093,482,072,800 | 1h 41m 33s 482ms 72µs 800ns  |
| ViolaJones T = 300 (CPU) | 9,139,635,975,200 | 2h 32m 19s 635ms 975µs 200ns |

| Testing                  | Time spent (ns) (E) | Formatted time spent (E) | Time spent (ns) (T) | Formatted time spent (T) |
| ------------------------ | ------------------- | ------------------------ | ------------------- | ------------------------ |
| ViolaJones T = 1 (CPU)   | 0                   | <1ns                     | 997,200             | 997µs 200ns              |
| ViolaJones T = 5 (CPU)   | 997,100             | 997µs 100ns              | 997,700             | 997µs 700ns              |
| ViolaJones T = 10 (CPU)  | 997,700             | 997µs 700ns              | 2,992,200           | 2ms 992µs 200ns          |
| ViolaJones T = 25 (CPU)  | 1,994,600           | 1ms 994µs 600ns          | 4,986,800           | 4ms 986µs 800ns          |
| ViolaJones T = 50 (CPU)  | 3,989,400           | 3ms 989µs 400ns          | 11,968,000          | 11ms 968µs               |
| ViolaJones T = 100 (CPU) | 6,981,500           | 6ms 981µs 500ns          | 18,949,800          | 18ms 949µs 800ns         |
| ViolaJones T = 200 (CPU) | 12,965,800          | 12ms 965µs 800ns         | 36,902,700          | 36ms 902µs 700ns         |
| ViolaJones T = 300 (CPU) | 17,951,800          | 17ms 951µs 800ns         | 56,848,300          | 56ms 848µs 300ns         |

### GPU execution time

| Preprocessing                                    | Time spent (ns) | Formatted time spent |
| ------------------------------------------------ | --------------- | -------------------- |
| Converting training set to integral images (GPU) | 88,759,800      | 88ms 759µs 800ns     |
| Converting testing set to integral images (GPU)  | 236,366,600     | 236ms 366µs 600ns    |
| Applying features to training set (GPU)          | 801,849,700     | 801ms 849µs 700ns    |
| Applying features to testing set (GPU)           | 2,781,090,300   | 2s 781ms 90µs 300ns  |

| Training                 | Time spent (ns) | Formatted time spent     |
| ------------------------ | --------------- | ------------------------ |
| ViolaJones T = 1 (GPU)   | 817,811,700     | 817ms 811µs 700ns        |
| ViolaJones T = 5 (GPU)   | 2,469,417,100   | 2s 469ms 417µs 100ns     |
| ViolaJones T = 10 (GPU)  | 4,850,067,700   | 4s 850ms 67µs 700ns      |
| ViolaJones T = 25 (GPU)  | 11,432,447,600  | 11s 432ms 447µs 600ns    |
| ViolaJones T = 50 (GPU)  | 22,742,326,800  | 22s 742ms 326µs 800ns    |
| ViolaJones T = 100 (GPU) | 45,714,804,900  | 45s 714ms 804µs 900ns    |
| ViolaJones T = 200 (GPU) | 92,438,265,000  | 1m 32s 438ms 265µs       |
| ViolaJones T = 300 (GPU) | 139,605,228,600 | 2m 19s 605ms 228µs 600ns |

### Execution time of the CPU compiled with NJIT

| Preprocessing                                     | Time spent (ns) | Formatted time spent     |
| ------------------------------------------------- | --------------- | ------------------------ |
| Converting training set to integral images (NJIT) | 3,989,000       | 3ms 989µs                |
| Converting testing set to integral images (NJIT)  | 14,959,600      | 14ms 959µs 600ns         |
| Applying features to training set (NJIT)          | 13,592,361,400  | 13s 592ms 361µs 400ns    |
| Applying features to testing set (NJIT)           | 77,834,323,700  | 1m 17s 834ms 323µs 700ns |

| Training                  | Time spent (ns) | Formatted time spent     |
| ------------------------- | --------------- | ------------------------ |
| ViolaJones T = 1 (NJIT)   | 1,313,497,300   | 1s 313ms 497µs 300ns     |
| ViolaJones T = 5 (NJIT)   | 1,285,571,700   | 1s 285ms 571µs 700ns     |
| ViolaJones T = 10 (NJIT)  | 2,604,081,500   | 2s 604ms 81µs 500ns      |
| ViolaJones T = 25 (NJIT)  | 6,104,721,700   | 6s 104ms 721µs 700ns     |
| ViolaJones T = 50 (NJIT)  | 11,891,281,600  | 11s 891ms 281µs 600ns    |
| ViolaJones T = 100 (NJIT) | 23,822,338,800  | 23s 822ms 338µs 800ns    |
| ViolaJones T = 200 (NJIT) | 48,089,174,900  | 48s 89ms 174µs 900ns     |
| ViolaJones T = 300 (NJIT) | 73,169,668,200  | 1m 13s 169ms 668µs 200ns |

| Testing                   | Time spent (ns) (E) | Formatted time spent (E) | Time spent (ns) (T) | Formatted time spent (T) |
| ------------------------- | ------------------- | ------------------------ | ------------------- | ------------------------ |
| ViolaJones T = 1 (NJIT)   | 0                   | <1ns                     | 997,900             | 997µs 900ns              |
| ViolaJones T = 5 (NJIT)   | 0                   | <1ns                     | 0                   | <1ns                     |
| ViolaJones T = 10 (NJIT)  | 0                   | <1ns                     | 997,200             | 997µs 200ns              |
| ViolaJones T = 25 (NJIT)  | 997,100             | 997µs 100ns              | 997,400             | 997µs 400ns              |
| ViolaJones T = 50 (NJIT)  | 996,800             | 996µs 800ns              | 3,989,000           | 3ms 989µs                |
| ViolaJones T = 100 (NJIT) | 2,991,900           | 2ms 991µs 900ns          | 7,978,900           | 7ms 978µs 900ns          |
| ViolaJones T = 200 (NJIT) | 3,989,600           | 3ms 989µs 600ns          | 15,957,700          | 15ms 957µs 700ns         |
| ViolaJones T = 300 (NJIT) | 5,983,900           | 5ms 983µs 900ns          | 23,935,500          | 23ms 935µs 500ns         |

## Additional resources

- [Rapid Object Detection using a Boosted Cascade of Simple Features](https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/viola-cvpr-01.pdf)
- [Chapter 39. Parallel Prefix Sum (Scan) with CUDA](https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-39-parallel-prefix-sum-scan-cuda)
- [Understanding and Implementing the Viola-Jones Image Classification Algorithm](https://medium.datadriveninvestor.com/understanding-and-implementing-the-viola-jones-image-classification-algorithm-85621f7fe20b)

**2022 Pierre Saunders @saundersp**