AutoML for Large-Scale Image Classification and Object Detection

Nov. 19, 2017, 9:51 a.m. By: Kirti Bakshi


It has been a few months since the introduction of the AutoML project, an approach to automating the design of machine learning models. According to the team, AutoML was found to design small neural networks that perform on par with networks designed by human experts, but these results were limited to small academic datasets such as CIFAR-10 and Penn Treebank. The team became curious how the method would perform on larger, more challenging datasets, such as ImageNet image classification and COCO object detection, for which many state-of-the-art machine learning architectures have been invented by humans in academic competitions.

In Learning Transferable Architectures for Scalable Image Recognition, AutoML is applied to the ImageNet image classification and COCO object detection datasets, two of the most respected large-scale academic datasets in computer vision. Because they are orders of magnitude larger than the CIFAR-10 and Penn Treebank datasets, these two datasets pose a great challenge. For instance, naively applying AutoML directly to ImageNet would require many months of training.

To make the method applicable to ImageNet, the AutoML approach was altered in two ways to be more tractable for large-scale datasets:

  • The search space was redesigned so that AutoML could find the best layer, which can then be stacked many times in a flexible manner to create a final network.

  • The architecture search was performed on CIFAR-10, and the best learned architecture was then transferred to ImageNet image classification and COCO object detection.
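
The idea behind these two changes can be pictured with a small sketch. It is a toy illustration and not the NASNet implementation: `toy_cell`, `stack_cells`, and the repeat counts are hypothetical names and values chosen here, and the cell body is a placeholder for whatever layer the search would actually find. It only shows one cell being reused at different depths.

```python
from typing import Callable, List

# A "cell" is modelled as any function mapping a feature vector to a feature vector.
Cell = Callable[[List[float]], List[float]]


def toy_cell(features: List[float]) -> List[float]:
    """Hypothetical stand-in for a searched layer: a fixed elementwise transform."""
    return [max(0.0, 0.5 * x + 1.0) for x in features]


def stack_cells(cell: Cell, num_repeats: int) -> Cell:
    """Build a network by applying the same cell num_repeats times in sequence."""
    if num_repeats < 1:
        raise ValueError("num_repeats must be at least 1")

    def network(features: List[float]) -> List[float]:
        for _ in range(num_repeats):
            features = cell(features)
        return features

    return network


# The same cell is reused at two depths: a shallow stack standing in for the
# small search dataset and a deeper stack standing in for the large target.
small_net = stack_cells(toy_cell, num_repeats=2)
large_net = stack_cells(toy_cell, num_repeats=6)

print(small_net([0.0, 2.0]))
print(large_net([0.0, 2.0]))
```

Because the search only has to find the cell, the expensive part runs on the small dataset, while the depth of the final network is chosen separately for the larger one.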

With this method, AutoML was able to find the best layers, which work well not only on CIFAR-10 but also on ImageNet classification and COCO object detection. These two layers were then combined to form a novel architecture, called "NASNet".

AutoML Cell

On ImageNet image classification, NASNet achieves a prediction accuracy of 82.7% on the validation set, surpassing all previous Inception models [2, 3, 4]. Furthermore, NASNet can be resized to produce a family of models that achieve good accuracies at very low computational cost.
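
To give a feel for how resizing trades computation for capacity, here is a rough sketch under stated assumptions. It is not NASNet's actual cost: the network is assumed to be a plain stack of identical 3x3 depthwise-separable convolutions with equal input and output channels, and the function names and the (repeats, channels) settings are hypothetical, chosen only to show the trend.

```python
def separable_conv_mult_adds(height: int, width: int, channels: int, kernel: int = 3) -> int:
    """Multiply-adds of one depthwise-separable convolution.

    Assumes stride 1, same padding, and equal input and output channels.
    """
    depthwise = height * width * channels * kernel * kernel
    pointwise = height * width * channels * channels
    return depthwise + pointwise


def toy_model_cost(num_repeats: int, channels: int, height: int = 32, width: int = 32) -> int:
    """Cost of a hypothetical stack of identical layers on a fixed-size feature map."""
    return num_repeats * separable_conv_mult_adds(height, width, channels)


# Hypothetical (repeats, channels) settings for three model sizes.
family = {"small": (2, 16), "medium": (4, 32), "large": (6, 64)}
for name, (repeats, channels) in family.items():
    print(f"{name}: {toy_model_cost(repeats, channels):,} multiply-adds")
```

In this toy model the cost grows linearly with the number of repeats and roughly quadratically with the channel count, which is why small changes to either setting can yield models at quite different computational budgets.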

The picture below illustrates this:


The features learned on ImageNet were also transferred to object detection. In the experiments performed, combining the features learned from ImageNet classification with the Faster-RCNN framework surpassed the previously published predictive performance on the COCO object detection task, for both the largest and the mobile-optimized models. The largest model achieves 43.1% mAP, which is 4% better than the previous published state of the art.

It is also suspected that the image features learned by NASNet on ImageNet and COCO may be reused for many computer vision applications. NASNet has therefore been open-sourced for inference on image classification and object detection in the Slim and Object Detection TensorFlow repositories. The hope is that the larger machine learning community will be able to build on these models to address multitudes of computer vision problems that have not yet been imagined.

Image Source: geekboy