THE PICTURE WORLD OF THE FUTURE: AI TEXT-TO-IMAGE AS A NEW ERA OF VISUAL CONTENT CREATION

Authors

  • Dejan Dodić, EDUKOM Ltd, Vranje, Republic of Serbia
  • Slavčo Čungurski, Faculty of Informatics, UTMS, Skopje, Republic of North Macedonia

Keywords:

AI text-to-image, deep learning, generative adversarial networks, GANs, computer vision

Abstract

In this paper, we investigated AI text-to-image technology and its applications across different industries. We reviewed the literature on AI text-to-image methods, compared their advantages and limitations, and discussed potential use cases in fields such as design, medicine, architecture, and art. One important factor affecting the performance of AI text-to-image models is the size of the training dataset. Generating accurate, high-quality images requires large, diverse, high-quality datasets whose descriptive texts capture the different characteristics of the images to be generated. To improve the accuracy and quality of generated images, new datasets and techniques are being developed to produce such diverse, high-quality texts.
We also described the methodology used in the research, presented the results, analyzed the challenges, and discussed the ethical considerations arising from the use of this technology. Finally, we highlighted that AI text-to-image is an important and innovative technology with great potential to transform various fields, while taking into account ethical guidelines for its use.

Published

2023-03-31

How to Cite

Dodić, D., & Čungurski, S. (2023). THE PICTURE WORLD OF THE FUTURE: AI TEXT-TO-IMAGE AS A NEW ERA OF VISUAL CONTENT CREATION. KNOWLEDGE - International Journal, 57(3), 417–421. Retrieved from https://ikm.mk/ojs/index.php/kij/article/view/6010