Recent studies have shown remarkable success in image-to-image translation between two domains. However, existing approaches have limited scalability and robustness when handling more than two domains, since a separate model has to be built independently for every pair of image domains.
To address this limitation, we are presented with a novel and scalable approach that can perform image-to-image translation for multiple domains using only a single model: StarGAN.
This PyTorch implementation of StarGAN can flexibly translate any given input image to any desired target domain using only a single generator and a single discriminator.
StarGAN is a generative adversarial network capable of learning mappings among multiple domains. Its unified architecture allows simultaneous training on multiple datasets with different domains within a single network, and this leads to translated images of visibly higher quality than those produced by existing models.
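To make the "single generator for all domains" idea concrete, here is a minimal NumPy sketch of the conditioning step described in the StarGAN paper: the target-domain label vector is spatially replicated and concatenated with the input image along the channel axis before being fed to the generator. The array shapes and the five-domain setup here are illustrative assumptions, not the exact code of the repository.

```python
import numpy as np

def concat_label(x, c):
    """Sketch of StarGAN-style conditioning: spatially replicate the
    target-domain label vector c and concatenate it with the image x
    along the channel axis.
    x: (N, C, H, W) image batch, c: (N, n_domains) label batch."""
    n, _, h, w = x.shape
    c_map = np.broadcast_to(c[:, :, None, None], (n, c.shape[1], h, w))
    return np.concatenate([x, c_map], axis=1)

x = np.random.randn(4, 3, 128, 128)      # batch of 4 RGB images (assumed size)
c = np.tile(np.eye(5)[0], (4, 1))        # one-hot target domain, 5 domains assumed
inp = concat_label(x, c)
print(inp.shape)                         # (4, 8, 128, 128): 3 image + 5 label channels
```

Because the label travels with the image as extra channels, the same generator weights can be reused for every source/target domain pair.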
In addition, the effectiveness of the presented approach has been demonstrated empirically on facial attribute transfer and facial expression synthesis tasks.
Here the term attribute denotes a meaningful feature inherent in an image, such as hair color, gender, or age, while an attribute value is a particular value of an attribute, for example male/female for gender.
Image-to-image translation is the task of changing a particular aspect of a given image to another, for example changing a person's facial expression from frowning to smiling. The image below illustrates this:
Moving on to the contributions of StarGAN:
It proposes StarGAN, a novel generative adversarial network that learns mappings among multiple domains using only a single generator and a single discriminator, training effectively on images from all domains.
It also demonstrates how multi-domain image translation across multiple datasets can be learned using a mask vector method, which enables StarGAN to control all available domain labels.
It provides both qualitative and quantitative results on facial attribute transfer and facial expression synthesis tasks, showing StarGAN's superiority over all baseline models considered.
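The mask vector mentioned above can be sketched as follows. In the paper, labels from two datasets (CelebA attributes and RaFD expressions) are concatenated with a small one-hot mask that tells the model which dataset's labels are currently valid; the other dataset's labels are zeroed out. The label sizes below (5 and 8) and the helper name `unified_label` are illustrative assumptions.

```python
import numpy as np

# Assumed sizes: 5 CelebA attribute labels, 8 RaFD expression labels,
# plus a 2-dim one-hot mask selecting which dataset's labels are valid.
N_CELEBA, N_RAFD = 5, 8

def unified_label(celeba=None, rafd=None):
    """Build a StarGAN-style unified label [c_celeba, c_rafd, mask].
    The unused dataset's labels are zeroed, and the one-hot mask tells
    the model which part of the vector to act on (illustrative sketch)."""
    c1 = np.zeros(N_CELEBA) if celeba is None else np.asarray(celeba, float)
    c2 = np.zeros(N_RAFD) if rafd is None else np.asarray(rafd, float)
    mask = np.array([1.0, 0.0]) if rafd is None else np.array([0.0, 1.0])
    return np.concatenate([c1, c2, mask])

label = unified_label(celeba=[1, 0, 0, 1, 1])  # a CelebA-only training example
print(label.shape)                             # (15,): 5 + 8 + 2
print(label[-2:])                              # mask [1. 0.] -> CelebA labels valid
```

By ignoring the masked-out portion of the vector, a single network can be trained jointly on datasets whose label sets do not overlap.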
StarGAN is thus based on a simple idea: instead of learning a fixed translation, for example black to blond hair, the model takes both an image and domain information as inputs and learns to flexibly translate the input image into the corresponding target domain.
As mentioned before, a label vector is used to represent the domain information. During training, a target domain label is generated at random and the model is trained to translate the input image into that target domain. This gives control over the domain label, so at test time the image can be translated into any desired domain.
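The random target-label sampling during training can be sketched in a few lines. The five-domain setup and the helper name `sample_target_labels` are assumptions for illustration; the idea is simply that each image in a batch gets a randomly chosen target domain, so the generator sees every translation direction over the course of training.

```python
import numpy as np

rng = np.random.default_rng(0)
N_DOMAINS = 5  # assumed number of domains for this sketch

def sample_target_labels(batch_size):
    """Sample one random target domain per image, as StarGAN does during
    training, and return the labels as one-hot vectors (sketch)."""
    idx = rng.integers(0, N_DOMAINS, size=batch_size)
    return np.eye(N_DOMAINS)[idx]

targets = sample_target_labels(4)
print(targets.shape)          # (4, 5): one one-hot target label per image
print(targets.sum(axis=1))    # [1. 1. 1. 1.]: each row selects exactly one domain
```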
In principle, the proposed model can be applied to translation between any other types of domains, for example style transfer, which the authors mention as future work.
For More Information: GitHub
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation:
Video Source: Yunjey Choi