Compressing images using Artificial Intelligence

Aug. 22, 2017, 8:31 p.m. By: Pranjal Kumar


Technologies like artificial intelligence and neural networks have taken the world by storm. The recent surge in this field is due to several factors, including the availability of the large datasets that neural networks require, improved hardware, and the falling cost of these technologies. Neural networks are among the primary technologies used for tasks like image recognition and natural language understanding, but their use is not limited to those fields. They can also be used for other tasks, such as compressing images considerably faster than some of the slower traditional approaches.

Image compression is the process of converting a large image so that it occupies less space. Modern technology and content require high-quality images, which often take up a lot of space. More space means more data to transfer and more storage capacity, both of which increase cost. Codecs such as PNG and JPEG aim to reduce the size of the original image. There are two basic types of image compression: lossy and lossless. In lossy compression, some of the data is lost during conversion, while in lossless compression it is possible to recover all the data of the original image. For example, PNG is lossless whereas JPEG is lossy. Lossless compression preserves everything, but the result ends up taking a lot of space on disk.
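The difference between the two types can be illustrated with a small sketch. It uses Python's built-in zlib module (DEFLATE, the algorithm PNG uses internally) for the lossless case. For the lossy case it simply throws away the low 4 bits of every pixel, which is only a stand-in for what a real lossy codec such as JPEG does. The synthetic gradient image is an assumption made for the example.

```python
import zlib

# Hypothetical 8-bit grayscale image: 64 rows of a 64-pixel gradient.
width, height = 64, 64
pixels = bytes((x * 4 + y) % 256 for y in range(height) for x in range(width))

# Lossless: the decompressed bytes are identical to the original.
lossless = zlib.compress(pixels, 9)
restored = zlib.decompress(lossless)

# Lossy stand-in: keep only the top 4 bits of each pixel, then compress.
# The discarded bits cannot be recovered after this step.
quantized = bytes(p & 0xF0 for p in pixels)
lossy = zlib.compress(quantized, 9)
max_error = max(abs(a - b) for a, b in zip(pixels, quantized))

print("original bytes:", len(pixels))
print("lossless bytes:", len(lossless), "exact round trip:", restored == pixels)
print("lossy bytes:", len(lossy), "largest pixel error:", max_error)
```

The lossless path gives back every byte, while the lossy path trades a bounded per-pixel error for data that is easier to compress.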

Yes, there are better ways to compress images without losing much information, but they are quite slow. One of their limitations is that they use iterative approaches, which means they cannot easily be run in parallel over multiple CPU cores. This renders them quite impractical for everyday use.

One can use a standard convolutional neural network (CNN) to improve image compression. This method performs on par with the traditional approaches while leveraging the power of parallel computing to increase speed. CNNs are very good at extracting spatial information from images, which can then be represented in a more compact form. So one can use this capability of CNNs to represent images more efficiently. Two networks are needed for this purpose.
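To make the idea of a compact representation concrete, here is a minimal sketch of a single strided convolution in plain Python. The function name `strided_conv2d` and the fixed averaging kernel are assumptions for illustration. In a trained CNN the kernel weights would be learned, and there would be many layers and channels.

```python
def strided_conv2d(image, kernel, stride):
    """Valid (no padding) 2D convolution of a list-of-lists image."""
    k = len(kernel)
    out = []
    for top in range(0, len(image) - k + 1, stride):
        row = []
        for left in range(0, len(image[0]) - k + 1, stride):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

# Hypothetical 4x4 image whose pixel values are 0..15.
image = [[float(4 * y + x) for x in range(4)] for y in range(4)]

# A 2x2 averaging kernel with stride 2 halves the width and the height.
avg_kernel = [[0.25, 0.25], [0.25, 0.25]]
compact = strided_conv2d(image, avg_kernel, stride=2)
print(compact)  # [[2.5, 4.5], [10.5, 12.5]]
```

Each output value depends only on its own small patch of the input, so all of them can be computed independently. That independence is what lets convolutions run well on parallel hardware.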

The first network takes an image and generates a compact representation of it. Its output is then processed by a standard codec. The decoded image is passed to a second network, which ‘fixes’ the output of the codec and tries to recover the original image. This second network is known as the Reconstructive CNN, and the overall setup is similar to a GAN. Finally, you can also apply a residual step, in which the second network predicts a correction that ‘improves’ the image that the codec decodes.
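The data flow described above can be sketched on a single row of pixels. Every function here is a hypothetical stand-in, not the actual networks or codec. `compact_net` averages pairs of pixels, `codec_roundtrip` imitates a lossy codec by quantizing, and `reconstruction_net` adds a fixed smoothing residual where a trained network would add a learned one.

```python
def compact_net(pixels):
    """Stand-in for the first network: halve the length by averaging pairs."""
    return [(pixels[i] + pixels[i + 1]) / 2 for i in range(0, len(pixels) - 1, 2)]

def codec_roundtrip(values, step=8):
    """Stand-in for encoding and decoding with a lossy codec: quantize to multiples of step."""
    return [step * round(v / step) for v in values]

def upsample(values):
    """Return to the original length by repeating each decoded sample twice."""
    return [v for v in values for _ in range(2)]

def reconstruction_net(decoded):
    """Stand-in for the second network: add a residual to the decoded signal."""
    n = len(decoded)
    restored = []
    for i in range(n):
        left = decoded[max(i - 1, 0)]
        right = decoded[min(i + 1, n - 1)]
        # Hypothetical fixed correction; a trained network would learn this.
        residual = 0.25 * (left + right - 2 * decoded[i])
        restored.append(decoded[i] + residual)
    return restored

def compress_and_restore(pixels):
    compact = compact_net(pixels)          # network 1
    decoded = codec_roundtrip(compact)     # standard codec in the middle
    return reconstruction_net(upsample(decoded))  # network 2 with residual

original = [10, 12, 20, 22, 40, 44, 80, 84]
restored = compress_and_restore(original)
print(restored)
```

The codec sits between the two networks and only ever sees the compact representation, which is why an existing codec can be reused in this arrangement.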

The training method is similar to that of a GAN in that the two models are trained in alternation. First, the weights of the first model are fixed while the second model’s weights are updated. Then the second model’s weights are fixed, and the first model is trained. This method performs better than the alternatives while maintaining high speeds when used on capable hardware.
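The alternating schedule can be shown with a deliberately tiny example in which each "model" is a single scalar weight. This is a toy under stated assumptions: `a` plays the role of the first model, `b` the second, the codec is left out entirely, and the loss is the mean squared error between the input and the output of the two models applied in sequence. The learning rate, starting weights, and sample data are arbitrary choices for the sketch.

```python
def loss(a, b, xs):
    """Mean squared reconstruction error of b * (a * x) against x."""
    return sum((b * a * x - x) ** 2 for x in xs) / len(xs)

def grad_a(a, b, xs):
    """Gradient of the loss with respect to a, treating b as fixed."""
    return sum(2 * (b * a * x - x) * b * x for x in xs) / len(xs)

def grad_b(a, b, xs):
    """Gradient of the loss with respect to b, treating a as fixed."""
    return sum(2 * (b * a * x - x) * a * x for x in xs) / len(xs)

def train_alternating(xs, rounds=50, lr=0.5):
    a, b = 0.5, 0.5  # arbitrary starting weights
    history = [loss(a, b, xs)]
    for _ in range(rounds):
        b -= lr * grad_b(a, b, xs)  # first model frozen, second model updated
        a -= lr * grad_a(a, b, xs)  # second model frozen, first model updated
        history.append(loss(a, b, xs))
    return a, b, history

samples = [0.1, 0.4, 0.7, 1.0]
a, b, history = train_alternating(samples)
print("initial loss:", history[0])
print("final loss:", history[-1])
```

Only one set of weights changes in each half-step, so each model always trains against a fixed partner.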