Automatic Poster Generation System Using Protagonist Face Analysis

  • Yeonhwi You (Department of Computer Science and Engineering, Korea University of Technology and Education) ;
  • Sungjung Yong (Department of Computer Science and Engineering, Korea University of Technology and Education) ;
  • Hyogyeong Park (Department of Computer Science and Engineering, Korea University of Technology and Education) ;
  • Seoyoung Lee (Department of Computer Science and Engineering, Korea University of Technology and Education) ;
  • Il-Young Moon (Department of Computer Science and Engineering, Korea University of Technology and Education)
  • Received : 2023.06.29
  • Accepted : 2023.10.23
  • Published : 2023.12.31

Abstract

With the rapid growth of domestic and international over-the-top (OTT) markets, a large amount of video content is being created. As the volume of video content increases, consumers increasingly check information about a video before watching it. To meet this demand, video summaries are provided to consumers in the form of plot descriptions, thumbnails, posters, and other formats. This study proposes an approach that automatically generates posters to convey video content effectively while reducing the cost of video summarization. To generate a poster automatically, face recognition and clustering are used to collect and classify character data, and keyframes are extracted from the video to capture its overall atmosphere. Using the characters' facial data and the extracted keyframes as training data, the system employs DreamBooth, a fine-tuning technique for text-to-image diffusion models, to generate video posters automatically. This process significantly reduces the time and cost of video-poster production.
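
The pipeline the abstract describes (detect and cluster faces to isolate the protagonist, sample keyframes, then fine-tune a text-to-image model) can be illustrated with a short sketch. The paper does not publish code, so everything below is an assumption: the libraries (facenet-pytorch for MTCNN detection [5] and FaceNet embeddings [10], scikit-learn for clustering, OpenCV for frame access), the fixed-interval sampling standing in for keyframe extraction, and parameter values such as the DBSCAN eps are illustrative choices, not the authors' implementation.

```python
# Illustrative sketch only; library choices and parameters are assumptions,
# not the authors' published implementation.
import cv2
import numpy as np
import torch
from facenet_pytorch import MTCNN, InceptionResnetV1  # MTCNN [5], FaceNet [10]
from sklearn.cluster import DBSCAN

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
detector = MTCNN(keep_all=True, device=device)         # joint face detection/alignment
embedder = InceptionResnetV1(pretrained="vggface2").eval().to(device)

def sample_frames(video_path, every_n=30):
    """Yield every n-th frame as RGB (a crude stand-in for keyframe extraction)."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        idx += 1
    cap.release()

def cluster_faces(video_path):
    """Embed every detected face, then group the embeddings by identity."""
    embeddings, crops = [], []
    for frame in sample_frames(video_path):
        faces = detector(frame)                # aligned face crops, or None
        if faces is None:
            continue
        with torch.no_grad():
            vecs = embedder(faces.to(device)).cpu().numpy()
        embeddings.extend(vecs)
        crops.extend(faces.cpu())
    # One DBSCAN cluster per character; eps = 0.35 is a guessed threshold.
    labels = DBSCAN(eps=0.35, min_samples=5, metric="cosine").fit_predict(
        np.asarray(embeddings))
    return labels, crops

labels, crops = cluster_faces("movie.mp4")     # hypothetical input file
# Take the largest non-noise cluster as the protagonist.
protagonist = max(set(labels) - {-1}, key=list(labels).count)
protagonist_crops = [c for c, l in zip(crops, labels) if l == protagonist]
```

In the workflow the paper describes, the protagonist's face crops and the sampled keyframes would then serve as the subject images for DreamBooth fine-tuning [13], with the poster produced by prompting the fine-tuned model.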

Acknowledgement

This research was supported by the Basic Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1I1A3057800). These results were also supported by the "Regional Innovation Strategy (RIS)" program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (MOE) (2021RIS-004).

References

  1. J. Sang and C. Xu, "Character-based movie summarization," in Proceedings of the 18th ACM International Conference on Multimedia, pp. 855-858, 2010. DOI: 10.1145/1873951.1874096.
  2. J. Monaco, "How to Read a Film: The Art, Technology, Language, History and Theory of Film and Media," Oxford University Press, New York, 1982.
  3. M. Wang and W. Deng, "Deep face recognition: A survey," Neurocomputing, vol. 429, pp. 215-244, Mar. 2021. DOI: 10.1016/j.neucom.2020.10.081.
  4. W. W. Bledsoe, "The Model Method in Facial Recognition," Panoramic Research, Inc., Palo Alto: CA, Technical Report, PRI 15, 1964.
  5. K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks," IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, Oct. 2016. DOI: 10.1109/LSP.2016.2603342.
  6. C. Lu and X. Tang, "Surpassing human-level face verification performance on LFW with GaussianFace," in Proceedings of the AAAI Conference on Artificial Intelligence, Austin: TX, pp. 3811-3819, 2015. DOI: 10.1609/aaai.v29i1.9797.
  7. Y. Sun, X. Wang, and X. Tang, "Deeply learned face representations are sparse, selective, and robust," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston: MA, pp. 2892-2900, 2015. DOI: 10.1109/CVPR.2015.7298907.
  8. K. Yoon and J. Choi, "Compressed Ensemble of Deep Convolutional Neural Networks with Global and Local Facial Features for Improved Face Recognition," Journal of Korea Multimedia Society, vol. 23, no. 8, pp. 1019-1029, Aug. 2020. DOI: 10.9717/kmms.2020.23.8.1019.
  9. Y. Ha, J. Park, and J. Shim, "Comparison of Face Recognition Performance Using CNN Models and Siamese Networks," Journal of Korea Multimedia Society, vol. 26, no. 2, pp. 413-419, 2023. DOI: 10.9717/kmms.2023.26.2.413.
  10. F. Schroff, D. Kalenichenko, and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston: MA, pp. 815-823, 2015. DOI: 10.1109/CVPR.2015.7298682.
  11. S. Tabata, H. Yoshihara, H. Maeda, and K. Yokoyama, "Automatic layout generation for graphical design magazines," in SIGGRAPH '19: Special Interest Group on Computer Graphics and Interactive Techniques Conference, ACM SIGGRAPH 2019 Posters, Los Angeles: CA, pp. 1-2, 2019. DOI: 10.1145/3306214.3338574.
  12. A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, and I. Sutskever, "Zero-Shot Text-to-Image Generation," in Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 8821-8831, 2021.
  13. N. Ruiz, Y. Li, V. Jampani, Y. Pritch, M. Rubinstein, and K. Aberman, "DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation," arXiv preprint arXiv:2208.12242, Aug. 2022. DOI: 10.48550/arXiv.2208.12242.
  14. S. Guo, Z. Jin, F. Sun, J. Li, Z. Li, Y. Shi, and N. Cao, "Vinci: An Intelligent Graphic Design System for Generating Advertising Posters," in Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI '21), Yokohama, Japan, pp. 1-17, 2021. DOI: 10.1145/3411764.3445117.