Computer Vision as a Platform in Metaverse

  • Iqbal Muhamad Ali (Dept. of Computer Engineering, Jeju National University)
  • Ho-Young Kwak (Dept. of Computer Engineering, Jeju National University)
  • Soo Kyun Kim (Dept. of Computer Engineering, Jeju National University)
  • Received: 2023.07.21
  • Reviewed: 2023.08.29
  • Published: 2023.09.30

Abstract

The Metaverse is a rapidly advancing technology. This study examines it from a general perspective and, in particular, from the perspective of computer vision, through an analysis of computer vision topics related to the Metaverse. Its history, methods, architecture, benefits, and drawbacks are covered. The future of the Metaverse and the steps required to adapt to this technology are described, and the concepts of Mixed Reality (MR), Augmented Reality (AR), Extended Reality (XR), and Virtual Reality (VR) are briefly introduced. The role of computer vision, its applications, its advantages and disadvantages, and future research areas are discussed.

I. Introduction

Because the metaverse is a newly emerging field, very little computer vision research has so far been conducted to support it; this paper offers an overview that addresses that gap. The term metaverse combines meta (meaning beyond) and verse (taken from universe). It denotes a hypothetical virtual environment based on the real world. The premise originated with Neal Stephenson's novel Snow Crash [1], which introduced the theoretical idea of a three-dimensional virtual environment populated by avatars that represent people and interact with software agents. Second Life, developed by Philip Rosedale and his team, likewise depicted human beings as avatars in a virtual environment. Since its development in 2003, Second Life has built an enormous user base, and to date it has millions of active users.

Fig. 1. Metaverse Environment Overview

The Metaverse is a dynamic fusion of real-world and digital elements, brought to life through avatars and digital twins that enable remarkably lifelike simulations. This immersive space harnesses advanced technologies like Virtual Reality (VR), Augmented Reality (AR), and Brain-Computer Interface (BCI) to offer users a seamless and interactive experience. Artificial Intelligence (AI) techniques such as Machine Learning and Computer Vision ensure personalized and adaptive encounters within this virtual realm. Blockchain technology introduces Non-Fungible Tokens (NFTs) and Decentralized Finance (DeFi), allowing for virtual economies and ownership structures.

II. Preliminaries

2.1. Related works

To establish a foundation, we first review the concepts of Virtual Reality (VR), Augmented Reality (AR), Extended Reality (XR), and Mixed Reality (MR) before turning to computer vision. Extended Reality, also known as Cross Reality, is made possible by wearable devices that fuse real and virtual spaces, creating a platform for seamless human-machine interaction [2]. Mixed Reality integrates the physical and virtual realms into an immersive environment in which real and virtual elements coexist and interact in real time, bridging the gap between virtual and real environments; in this sense it blends virtual and augmented reality [3]. Augmented Reality primarily aims to provide users with information by extending their visual field with pertinent data [4]. In contrast, Virtual Reality immerses users entirely in a virtual realm, disconnecting them from the real world. VR offers a computer-generated three-dimensional space that users explore and interact with through specialized headsets and dedicated input devices. It simulates transitions in the virtual environment that mirror real-world experience, delivering a sense of immersion and presence in entirely fictional settings [5].

Around 1960, Larry Roberts, then a PhD student at MIT, introduced the foundational idea of extracting 3D information from 2D perspective views, a pivotal moment in computer vision [6]. David Marr's influential approach followed: image processing produces a 2.5D sketch from 2D images, which is then refined into 3D models [7]. However, practical difficulties, together with the fact that some applications do not need full 3D models, led to task-specific "purposive vision" algorithms, an approach championed by Yiannis Aloimonos of the University of Maryland [8]. This shift emphasizes matching computational complexity to the requirements of the specific application, improving efficiency and efficacy in computer vision.

Fig. 2. Merging Computer Vision and Metaverse

III. Metaverse in Computer Vision

3.1. Advantages of Metaverse in Computer Vision

Computer vision has played a pivotal role in enhancing human interaction and immersion in the Metaverse. It achieves this by utilizing digital avatars within virtual reality frameworks, creating a simulation of real-world interactions [9]. Many Extended Reality (XR) applications rely heavily on computer vision, making it a cornerstone of XR technology. Computer vision is crucial for processing, analyzing, and making decisions based on visual data, which is essential for creating precise and reliable virtual and augmented environments in XR devices. Human pose tracking, a computer vision challenge, is fundamental for XR applications, capturing spatial data related to the human body and enabling the creation of a 3D model of the user's surroundings. This model establishes the user's position and orientation, facilitating the tracking of user bodies and poses by the XR interactive system [9]. Avatars are used to represent human users and are tracked through computer vision within the Metaverse. The Metaverse also encourages environmental awareness, making computer vision and image processing highly valuable for constructing an effective Metaverse environment. Establishing a seamless connection between avatars simulating individuals in real time and the physical world is a critical task. This involves displaying the 3D world with minimal blur, reduced noise, and high resolution to maintain a realistic and immersive experience [10].

In summary, computer vision extends what users can do in the Metaverse in three ways. Combined with digital avatars, it enables realistic interactions in virtual reality settings [9]. By processing visual information, it improves the precision and reliability of XR environments [10]. Through human pose tracking, it supports the accurate 3D modeling of users and their surroundings on which interactive Metaverse environments depend [9].

Table 1. Advantages of Computer Vision in Metaverse Interactions

3.2. Demerits of Metaverse in Computer Vision

The application of computer vision in the Metaverse faces two major drawbacks: the limited availability of secure, high-speed internet, and the need to ensure that the relevant extended reality technology is accessible and available [11]. Delivering the best user experience through computer vision first requires a large-scale, highly advanced virtual universe, so accommodating large-scale changes to it later may not be feasible. In addition, as the user base expands, it becomes increasingly important that these technologies run at peak performance and offer up-to-date features. The demand for new technology of this kind is therefore never-ending.

3.3. Applications of Metaverse in Computer Vision

The Metaverse has the potential to revolutionize various domains, including healthcare, real estate, military, education, retail, and manufacturing. In healthcare, it can enable disease diagnosis, surgical simulations, real-time data analysis from 3D scans, and patient education. In the military, it offers realistic training scenarios to assess soldiers' physical and psychological abilities. Real estate professionals and clients can benefit from virtual property tours, saving time and offering thorough inspections. In manufacturing, virtual reality simulations can reduce accidents, while augmented reality aids in employee training, saving time and costs. The adoption of Metaverse technology in education can enhance learning through visual and graphical concepts, increasing understanding and retention of educational material. Avatars are crucial in the Metaverse and controlling them in 3D virtual environments is a key consideration. Human pose tracking, a computer vision technique, captures spatial data on the human body, representing joints and body parts. This body representation is essential for determining user poses in the Metaverse. Eye tracking further enhances user interactions, predicting gaze and intent for immersive experiences. However, consistent eye tracking performance across users and environments is crucial for a seamless Metaverse experience. Achieving real-time eye tracking within device constraints is a challenge that requires ongoing development. As industries adapt to the Metaverse, novel technologies in extended reality (XR) and augmented reality (AR) will continue to evolve, shaping the future of computer vision.
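To make the joint-based body representation mentioned above more concrete, the following minimal sketch stores a pose as named 3D keypoints and derives an elbow angle from three of them. The joint names, the coordinates, and the `joint_angle` helper are hypothetical illustrations; they are not taken from any particular pose-tracking system or from the cited work.

```python
import math

# Hypothetical pose: joint name -> (x, y, z) position in metres.
pose = {
    "shoulder": (0.0, 1.4, 0.0),
    "elbow": (0.3, 1.4, 0.0),
    "wrist": (0.3, 1.7, 0.0),
}


def joint_angle(a, b, c):
    """Return the angle at joint b, in degrees, between segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0.0:
        raise ValueError("degenerate joint: two keypoints coincide")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# The upper arm points along x and the forearm along y, so the elbow is bent at a right angle.
elbow_angle = joint_angle(pose["shoulder"], pose["elbow"], pose["wrist"])
```

An avatar controller could, under these assumptions, map such angles onto the corresponding joints of a rigged 3D character on every frame.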

Fig. 3. Metaverse Keys

IV. The Role of Artificial Intelligence in Metaverse

Artificial intelligence applications that incorporate deep learning outperform traditional techniques, enabling developers and designers to build more immersive real-time metaverse applications. However, implementing artificial intelligence is not by itself sufficient to ease users' tasks and offer a high-performance experience: existing AI models are frequently complex and computationally demanding. It is therefore essential to develop AI models that are compact and efficient [12].

4.1. Augmentation

Augmented reality (AR) technologies play a pivotal role in enhancing the Metaverse experience. They encompass a range of tools and techniques that integrate digital content seamlessly into the user's physical environment: spatial mapping, marker-based tracking, markerless tracking, real-time rendering, scene understanding, user interface design, content creation tools, and authoring tools. AR fuses the digital and the physical by superimposing digital content onto the user's real-world surroundings, which is made possible by sensor-based systems and computer vision techniques. Marker-based and markerless tracking ensure that digital content is aligned and positioned precisely in real time. Spatial mapping creates a three-dimensional representation of the physical environment, capturing structural, geometric, and layout details of the surroundings. Context-aware augmentation analyzes and identifies objects and elements in the environment and augments them according to context. Real-time rendering rapidly generates and displays realistic graphics that match the user's current perspective, enhancing the user's perception of the environment. User interface design focuses on intuitive interaction between the user and virtual objects; it aims to match the user's mental models and expectations so that interactions feel natural and easy to understand. Content creation and authoring tools let developers enrich the Metaverse with unique and customizable digital content, including characters, digital assets, environments, and virtual objects, and so craft distinctive Metaverse experiences.
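The alignment step described above, in which tracked digital content is positioned over the camera view, can be sketched with a standard pinhole camera model. The example below is illustrative only: it assumes the tracker has already supplied a camera pose (`rotation`, `translation`), and the intrinsic parameters (`fx`, `fy`, `cx`, `cy`), the anchor position, and the function names are hypothetical values chosen for the example.

```python
def world_to_camera(point_world, rotation, translation):
    """Transform a world-space point into camera space: p_cam = R * p_world + t."""
    return tuple(
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def project_point(point_cam, fx, fy, cx, cy):
    """Project a camera-space point (metres) to pixel coordinates with a pinhole model."""
    x, y, z = point_cam
    if z <= 0:
        return None  # the point is behind the camera, so nothing is drawn
    return (fx * x / z + cx, fy * y / z + cy)


# Assumed tracking result: camera at the world origin, looking down +z.
rotation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
translation = [0.0, 0.0, 0.0]

# Hypothetical virtual object anchored 2 m in front of the camera.
anchor_world = (0.5, -0.25, 2.0)
anchor_cam = world_to_camera(anchor_world, rotation, translation)
pixel = project_point(anchor_cam, fx=800.0, fy=800.0, cx=320.0, cy=240.0)
```

Re-running the projection with each new pose estimate is what keeps the overlay locked to the physical scene as the user moves.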

4.2. Simulation

Simulation creates and replicates real-world experiences in the virtual environment. Computer algorithms are used to build models that simulate social, physical, and interactive phenomena, giving users an interactive and immersive virtual space. Through simulation, users can engage in a broad range of activities that might be impractical in real life. This is achieved by combining techniques such as artificial intelligence, computer vision, and physics-based modelling, which together yield simulations with a high degree of realism. Simulation is used to train professionals in fields such as healthcare, the military, aviation, and architecture, and it also supports virtual tourism and social interaction in virtual environments. In a simulation, users can interact with virtual objects that mimic real-world objects, see the results of their actions immediately and safely, and practise making crucial decisions that would involve critical steps and high risk in the real world.
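As a small illustration of physics-based modelling, the sketch below advances a virtual ball under gravity with semi-implicit Euler integration and a simple ground bounce. The time step, the restitution coefficient, and the function names are assumptions made for this example; production simulations generally rely on a dedicated physics engine.

```python
GRAVITY = -9.81  # gravitational acceleration in m/s^2


def step(height, velocity, dt, restitution=0.8):
    """Advance the ball by one time step using semi-implicit Euler integration."""
    velocity += GRAVITY * dt   # update velocity first
    height += velocity * dt    # then update position with the new velocity
    if height < 0.0:
        # Ground contact: clamp to the floor and reverse velocity with energy loss.
        height = 0.0
        velocity = -velocity * restitution
    return height, velocity


def simulate(height, velocity, dt, steps):
    """Return the list of heights visited over the given number of steps."""
    trajectory = [height]
    for _ in range(steps):
        height, velocity = step(height, velocity, dt)
        trajectory.append(height)
    return trajectory


# Drop the ball from 1 m at rest and simulate 2 s at 100 steps per second.
trajectory = simulate(height=1.0, velocity=0.0, dt=0.01, steps=200)
```

Because each step is cheap, the same loop can run once per rendered frame, which is what lets users see the result of an action immediately.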

V. Technologies for Enhanced Engagement in the Metaverse

The Metaverse provides a platform in a digital space within a computer-generated environment. It spans multiple domains, including technology, user experience, and interface design. Its goal is to engage the user with the virtual world through a connection that feels immersive and authentic. To put the concept into practice, technologies such as brain-computer interfaces, emotion recognition systems, virtual social media platforms, and personalized AI assistance are used. These technologies play a key role in creating environments that are more interactive, realistic, and engaging than traditional two-dimensional interfaces. Haptic feedback gives users a more realistic feel when they engage with virtual objects. Emotion recognition tailors the virtual world's responses to the user's current emotional state, providing user-specific customization. Personalized AI assistance adapts the virtual experience to the user's preferences, making interaction with the virtual world more relevant and enjoyable. Brain-computer interfaces let users regulate interactions through brain activity, which offers a more natural way to interact with a virtual environment.

5.1. External Technologies

External technologies, that is, external devices integrated into the virtual environment, play a critical role in enhancing and complementing the overall metaverse experience, and they provide a wide range of functionality. Input devices such as motion-tracking sensors, haptic feedback devices, and controllers give the user an immersive and engaging experience. Head-mounted displays (HMDs) and high-resolution screens let the user explore the digital world in a more realistic and vivid manner. Network technologies enable real-time data exchange and thus seamless interaction between the user and the virtual environment. Depth sensors and optical sensors track the user's movements precisely, which increases the realism of the experience. Furthermore, integrating social media platforms into the metaverse enriches and diversifies the user experience and fosters an immersive, interconnected virtual world.

5.2. Automation of Digital Twin

The concept of digitalization in the Metaverse can be described in three categories: digital models, digital shadows, and digital twins. Digital models are digital duplicates of physical objects, while digital shadows represent real objects in a digital form, varying with their real-world movements. Digital twins are precise replicas of physical objects or systems with a high level of awareness and coherence, bridging the gap between the Metaverse and reality. They provide services like categorization, recognition, prediction, and determination. Automation, particularly through deep learning, plays a significant role in leveraging digital twins' potential. Deep learning automates data processing, analysis, and training, reducing the need for manual feature engineering. Historical data from virtual and physical systems are combined during model training and evaluation, followed by real-time data integration during implementation. Digital twins find applications in remote surgery validation prototypes, where they replicate patients and procedures using robotic arms, enhanced by deep learning for diagnosis and health prediction. Learning algorithms also continuously assess the well-being of the elderly through their digital twins. In the context of smart cities, digital twins are created by merging IoT data and Building Information Models (BIM), simplifying urban management and planning. They facilitate analysis of factors such as air pollution's impact on quality of life and optimal traffic light intervals. Digital twins also assist in tracking and forecasting building energy usage and making decisions like optimal solar panel placement within urban areas. Overall, digital twins are powerful tools in the Metaverse, enhancing interaction between the virtual and physical worlds and finding applications in various domains.
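The mirror-then-predict pattern described above can be sketched in a few lines. In this hypothetical example a twin of a building's energy meter ingests readings from the physical system and forecasts the next one. Simple exponential smoothing is used here purely as a stand-in for the deep learning models the text refers to, and the class name, the `alpha` parameter, and the readings are assumptions made for illustration.

```python
class EnergyTwin:
    """Hypothetical digital twin of a building's energy meter."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha        # smoothing factor in (0, 1]
        self.last_reading = None  # mirrored state of the physical meter
        self.smoothed = None      # running estimate used for forecasting

    def ingest(self, reading_kwh):
        """Mirror a new sensor reading and update the smoothed estimate."""
        self.last_reading = reading_kwh
        if self.smoothed is None:
            self.smoothed = reading_kwh
        else:
            self.smoothed = (
                self.alpha * reading_kwh + (1 - self.alpha) * self.smoothed
            )

    def forecast(self):
        """Predict the next reading (exponential smoothing, not a learned model)."""
        if self.smoothed is None:
            raise ValueError("no readings ingested yet")
        return self.smoothed


# Feed three made-up hourly readings (kWh) from the physical meter into the twin.
twin = EnergyTwin(alpha=0.5)
for reading in [10.0, 12.0, 14.0]:
    twin.ingest(reading)
next_hour = twin.forecast()
```

Swapping `forecast` for a trained model would leave the ingest-and-mirror interface unchanged, which is one reason to keep the two responsibilities separate.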

VI. Architecture of Metaverse

Because the metaverse is still in its development phase, no fixed architecture has yet been defined. Jon Radoff has presented an architecture built on seven distinct layers [13]. From bottom to top these are infrastructure, human interface, decentralization, spatial computing, the creator economy, discovery, and experience. Fig. 4 presents a simplified three-layer view: interaction, infrastructure, and ecosystem. The fundamental prerequisite of the metaverse is the translation of the physical world's architecture into the virtual world.

Fig. 4. Metaverse Three-layer Architecture

The comparative analysis table outlines three distinct metaverse frameworks: the Decentralized Paradigm, Cloud-Native Approach, and Hybrid Integration. Each framework has unique characteristics catering to different aspects of metaverse development. The Decentralized Paradigm employs a peer-to-peer network infrastructure, prioritizing security and privacy, immersive VR/AR experiences, and user-generated content in virtual social spaces. In contrast, the Cloud-Native Approach relies on centralized cloud servers for scalability, emphasizing mixed reality interfaces, multi-user interactions, and AI-generated content within shared experiences. The Hybrid Integration framework strikes a balance by combining elements from both decentralized and cloud-based paradigms, offering resilience and scalability. Its versatile user interface includes immersive VR/AR and mixed reality interfaces, accommodating both user-generated and AI-generated content. Use cases include virtual classrooms and healthcare simulations, facilitating real-time collaboration and professional training in the metaverse. This analysis provides valuable insights for informed decision-making in metaverse development.

Table 2. Metaverse Frameworks: A Comparative Analysis

VII. Conclusions

In summary, the Metaverse is an important and exciting topic today. Future Metaverse strategy is expected to concentrate on integrating the technology into people's lives rather than on developing the technology itself. The technology therefore needs further investigation now, and the areas it depends on need to be developed. These include extended reality, augmented reality, and computer vision, each of which will make a faster and more significant contribution to the Metaverse. The underlying technologies are advancing quickly, and businesses need to invest more in this area. Leading companies should participate in the Metaverse and set an example, particularly in the business world. Rapid adoption and execution of this idea are crucial for the future. These technologies also need to be more widely available. All members of society, regardless of income, should be able to use this technology, which will soon play a significant role in daily life. Developing less expensive products that everyone can use should take priority over developing expensive ones.

ACKNOWLEDGEMENT

This research was supported by the 2023 scientific promotion program funded by Jeju National University.

References

  1. Joshua, J. (2017). Information Bodies: Computational Anxiety in Neal Stephenson's Snow Crash. Interdisciplinary Literary Studies, 19(1), 17-47. https://doi.org/10.5325/intelitestud.19.1.0017
  2. North of 41 (2020). What really is the difference between AR / MR / VR / XR? Access Date: 15/12/2021 https://medium.com/@northof41/whatreally-is-the-difference-between-ar-mr-vr-xr-35bed1da1a4e
  3. Milgram, Paul & Kishino, Fumio. (1994). A Taxonomy of Mixed Reality Visual Displays. IEICE Trans. Information Systems. vol. E77-D, no. 12. 1321-1329.
  4. Caudell, T. P., & Mizell, D. W. (1992, January). Augmented reality: An application of heads-up display technology to manual manufacturing processes. In Hawaii International Conference on System Sciences (pp. 659-669).
  5. Fast-Berglund, A., Gong, L., & Li, D. (2018). Testing and validating Extended Reality (xR) technologies in manufacturing. Procedia Manufacturing, 25, 31-38.
  6. Huang, T. S. (1996). Computer vision: Evolution and promise. CERN European Organization for Nuclear Research-Reports-CERN, 21-26.
  7. Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. Henry Holt and Co., Inc., New York, NY.
  8. Huang, T. S. (1996). Computer vision: Evolution and promise. CERN European Organization for Nuclear Research-Reports-CERN, 21-26.
  9. Barioni, R. R., Figueiredo, L., Cunha, K., & Teichrieb, V. (2018, October). Human Pose Tracking from RGB Inputs. In 2018 20th Symposium on Virtual and Augmented Reality (SVR) (pp. 176-182). IEEE.
  10. Fast-Berglund, A., Gong, L., & Li, D. (2018). Testing and validating Extended Reality (xR) technologies in manufacturing. Procedia Manufacturing, 25, 31-38.
  11. Marion Davies (2021). Pros and Cons of the Metaverse. Access Date: 15/12/2021. https://www.konsyse.com/articles/pros-and-cons-of-themetaverse/
  12. Lee, L. H., Braud, T., Zhou, P., Wang, L., Xu, D., Lin, Z., ... & Hui, P. (2021). All one needs to know about metaverse: A complete survey on technological singularity, virtual ecosystem, and research agenda. arXiv preprint arXiv:2110.05352.
  13. Jon Radoff (2021). The Metaverse Value-Chain. Access Date: 15/12/2021. https://medium.com/building-the-metaverse/the-metaversevalue-chain-afcf9e09e3a7
  14. Lee, L. H., Braud, T., Zhou, P., Wang, L., Xu, D., Lin, Z., ... & Hui, P. (2021). All one needs to know about metaverse: A complete survey on technological singularity, virtual ecosystem, and research agenda. arXiv preprint arXiv:2110.05352.
  15. Smart, J., Cascio, J., & Paffendorf, J. (2007). Metaverse Roadmap 2007: pathways to the 3D Web. A Cross-industry Public Foresight Project. Retrieved December, 31, 2008.
  16. Jon Radoff (2021). The Metaverse Value-Chain. Access Date: 15/12/2021. https://medium.com/building-the-metaverse/the-metaversevalue-chain-afcf9e09e3a7
  17. Duan, H., Li, J., Fan, S., Lin, Z., Wu, X., & Cai, W. (2021, October). Metaverse for social good: A university campus prototype. In Proceedings of the 29th ACM International Conference on Multimedia (pp. 153-161).
  18. Oshiro, T. M., Perez, P. S., & Baranauskas, J. A. (2012). How many trees in a random forest? In International Workshop on Machine Learning and Data Mining in Pattern Recognition (pp. 154-168). Springer.
  19. Fuller, A., Fan, Z., Day, C., & Barlow, C. (2020). Digital twin: Enabling technologies, challenges and open research. IEEE Access, 8, 108952-108971.
  20. Farhat, M. H., Chiementin, X., Chaari, F., Bolaers, F., & Haddar, M. (2021). Digital twin-driven machine learning: Ball bearings fault severity classification. Measurement Science and Technology, 32(4), 044006.
  21. Agnusdei, G. P., Elia, V., & Gnoni, M. G. (2021). A classification proposal of digital twin applications in the safety domain. Computers & Industrial Engineering, 107137.
  22. Piltan, F., & Kim, J.-M. (2021). Bearing anomaly recognition using an intelligent digital twin integrated with machine learning. Applied Sciences, 11(10), 4602.
  23. Laaki, H., Miche, Y., & Tammi, K. (2019). Prototyping a digital twin for real time remote control over mobile networks: Application of remote surgery. IEEE Access, 7, 20325-20336. https://doi.org/10.1109/ACCESS.2019.2897018
  24. Liu, Y., Zhang, L., Yang, Y., Zhou, L., Ren, L., Wang, F., Liu, R., Pang, Z., & Deen, M. J. (2019). A novel cloud-based framework for the elderly healthcare services using digital twin. IEEE Access, 7, 49088-49101. https://doi.org/10.1109/ACCESS.2019.2909828
  25. Wang, Z., Liao, X., Zhao, X., Han, K., Tiwari, P., Barth, M. J., & Wu, G. (2020). A digital twin paradigm: Vehicle-to-cloud based advanced driver assistance systems. In 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring) (pp. 1-6). IEEE.