• Title/Summary/Keyword: Visual Intelligence


Bioimage Analyses Using Artificial Intelligence and Future Ecological Research and Education Prospects: A Case Study of the Cichlid Fishes from Lake Malawi Using Deep Learning

  • Joo, Deokjin;You, Jungmin;Won, Yong-Jin
    • Proceedings of the National Institute of Ecology of the Republic of Korea
    • /
    • v.3 no.2
    • /
    • pp.67-72
    • /
    • 2022
  • Ecological research relies on the interpretation of large amounts of visual data obtained from extensive wildlife surveys, but such large-scale image interpretation is costly and time-consuming. Using artificial intelligence (AI) machine learning models, especially convolutional neural networks (CNNs), it is possible to streamline these manual image-processing tasks, protect wildlife, and record and predict behavior. Ecological research using deep-learning-based object recognition technology serves various purposes, such as detecting wild animals, identifying their species, and locating poachers in real time. These advances in the application of AI technology can enable efficient management of endangered wildlife, animal detection in various environments, and real-time analysis of image information collected by unmanned aerial vehicles. Furthermore, there is a growing need for school education and social use of AI in addressing biodiversity and environmental issues. School education and citizen science related to ecological activities using AI technology can enhance environmental awareness and strengthen knowledge and problem-solving skills in science and research processes. Given these prospects, in this paper we compare the results of our earlier 2013 study, which automatically identified African cichlid fish species from photographs, with the results of a reanalysis using a CNN deep learning method. Using the PyTorch and PyTorch Lightning frameworks, we achieve an accuracy of 82.54% and an F1-score of 0.77 with minimal programming and data preprocessing effort. This is a significant improvement over our previous machine learning methods, which required heavy feature engineering and achieved 78% accuracy.
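
The accuracy and F1-score quoted in this abstract are standard classification metrics. The following is a minimal sketch of how such figures can be computed from predicted and true species labels. The abstract does not say how the F1-score was averaged across species, so the macro average used here is an assumption, and the function names and toy labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels standing in for species names (hypothetical data).
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Macro averaging weights every species equally, which is one plausible reason an F1-score can sit below the accuracy when some classes are rare.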

Assessment of Visual Landscape Image Analysis Method Using CNN Deep Learning - Focused on Healing Place - (CNN 딥러닝을 활용한 경관 이미지 분석 방법 평가 - 힐링장소를 대상으로 -)

  • Sung, Jung-Han;Lee, Kyung-Jin
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.51 no.3
    • /
    • pp.166-178
    • /
    • 2023
  • This study aims to introduce and assess CNN Deep Learning methods for analyzing visual landscape images on social media, which embed user perceptions and experiences. The study analyzed visual landscape images with a focus on healing places. Seven adjectives related to healing were selected through text mining and a review of previous studies. Subsequently, 50 evaluators were recruited to build a Deep Learning image dataset. Evaluators were asked to collect three images most suitable for 'healing', 'healing landscape', and 'healing place' on portal sites. The collected images were refined, and a data augmentation process was applied to build a CNN model. After that, 15,097 images of 'healing' and 'healing landscape' on portal sites were collected and classified to analyze the visual landscape of a healing place. The results showed that, excluding the 'other' and 'indoor' categories, 'quiet' was the largest category with 2,093 images (22%), followed by 'open', 'joyful', 'comfortable', 'clean', 'natural', and 'beautiful'. The research found that CNN Deep Learning is an analysis method that can derive results from visual landscape image analysis. The study also suggests that it is one way to supplement existing visual landscape analysis methods, and that establishing a landscape image learning dataset would allow more in-depth and diverse visual landscape analysis in the future.

Recent update on reading disability (dyslexia) focused on neurobiology

  • Kim, Sung Koo
    • Clinical and Experimental Pediatrics
    • /
    • v.64 no.10
    • /
    • pp.497-503
    • /
    • 2021
  • Reading disability (dyslexia) refers to an unexpected difficulty with reading for an individual who has the intelligence to be a much better reader. Dyslexia is most commonly caused by a difficulty in phonological processing (the appreciation of the individual sounds of spoken language), which affects the ability of an individual to speak, read, and spell. In this paper, I describe reading disabilities by focusing on their underlying neurobiological mechanisms. Neurobiological studies using functional brain imaging have uncovered the reading pathways, brain regions involved in reading, and neurobiological abnormalities of dyslexia. The reading pathway is in the order of visual analysis, letter recognition, word recognition, meaning (semantics), phonological processing, and speech production. According to functional neuroimaging studies, the important areas of the brain related to reading include the inferior frontal cortex (Broca's area), the midtemporal lobe region, the inferior parieto-temporal area, and the left occipitotemporal region (visual word form area). Interventions for dyslexia can affect reading ability by causing changes in brain function and structure. An accurate diagnosis and timely specialized intervention are important in children with dyslexia. In cases in which national infant development screening tests have been conducted, as in Korea, if language developmental delay and early predictors of dyslexia are detected, careful observation of the progression to dyslexia and early intervention should be made.

A Study on Reality Enhancement Method of VR Baseball Game (VR 야구 게임의 현실감 강화 방법 연구)

  • Yoo, Wang-Yun
    • Journal of Korea Game Society
    • /
    • v.19 no.2
    • /
    • pp.23-32
    • /
    • 2019
  • The popularization of VR content is slow because it has not created a new visual experience, that is, 'utility' beyond 'interest'. The utility of VR content starts from functional reality, and enhancing it requires realistic interaction. Specifically, this study presents three methods: network play, character artificial intelligence, and haptic implementation. To confirm the hypothesis, we carried out all phases of producing VR baseball content, from content production to play testing and technical verification. Tests of the final product by users and an evaluation institution found that these methods contributed to the realism of the content through realistic visual effects, play presentation, and the sense of impact conveyed by vibration.

Research on Technology Production in Chinese Virtual Character Industry

  • Pan, Yang;Kim, KiHong;Yan, JiHui
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.14 no.4
    • /
    • pp.64-79
    • /
    • 2022
  • The concept of the Virtual Character has developed over a long time, driven by people's demand for cultural and entertainment products such as games, animations, and movies. In recent years, with the rapid development of concepts and industries such as social media, self-media, Web 3.0, artificial intelligence, virtual reality, and the Metaverse, the Virtual Character has also expanded into new derivative concepts such as the Virtual Idol, Virtual YouTuber, and Virtual Digital Human. With the development of technology, people's lives are gradually moving towards digitalization and virtualization. At the same time, under the global environment of the COVID-19 pandemic, human social activities are rapidly shifting towards a networked, online society. From the perspective of digital media content, this paper studies the production technology of Virtual Character related products in the Chinese market, and analyzes the future development direction and possibilities of the Virtual Character industry in combination with new media development directions and technical production methods. It aims to provide a reference for the development of combined applications of the digital media content industry, Virtual Characters, and the Metaverse industry.

Malicious URL Detection by Visual Characteristics with Machine Learning: Roles of HTTPS (시각적 특징과 머신 러닝으로 악성 URL 구분: HTTPS의 역할)

  • Sung-Won HONG;Min-Soo KANG
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.2
    • /
    • pp.1-6
    • /
    • 2023
  • In this paper, we present a new method for classifying malicious URLs that reduces the learning difficulties caused by unfamiliar and difficult information security terms. This study extracts only visually distinguishable features within the URL structure, compares them using supervised learning algorithms, and examines the feature contribution values of the best-performing algorithm to identify the features that have the most impact on classifying malicious URLs. As research data, we used a Kaggle dataset containing 7,046 malicious URLs and 7,046 normal URLs. Among the three supervised learning algorithms used (Decision Tree, Support Vector Machine, and Logistic Regression), the Decision Tree algorithm showed the best performance, with 83% accuracy, an 83.1% F1-score, and 83.6% recall. The Decision Tree feature contributions confirmed that, among the visually distinguishable features (HTTPS use, subdomain, and prefix and suffix), HTTPS use has the highest contribution value. Although unfamiliar and difficult terms have so far been a barrier to learning, this study can provide an intuitive judgment method that needs no explanation of the terms and can prove useful in the field of malicious URL detection.
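
As an illustration of what "visually distinguishable features" of a URL might look like in practice, the sketch below extracts three boolean features with Python's standard `urllib.parse`. The abstract names the features (HTTPS use, subdomain, prefix and suffix) but not how they were computed, so the rules here are assumptions: a hyphen in the host name stands in for "prefix and suffix", and the subdomain test is a naive label count that will misjudge hosts such as `example.co.uk`. The function name `visual_features` is hypothetical.

```python
from urllib.parse import urlsplit


def visual_features(url):
    """Return three boolean features that a reader could check by eye.

    The feature definitions are illustrative assumptions, not the paper's.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    if labels and labels[0] == "www":
        labels = labels[1:]  # ignore a leading "www" when counting labels
    return {
        "uses_https": parts.scheme == "https",
        # Naive rule: more than "name.tld" means a subdomain is present.
        "has_subdomain": len(labels) > 2,
        # Assumed reading of "prefix and suffix": a hyphen in the host name.
        "has_prefix_suffix": "-" in host,
    }


plain = visual_features("https://example.com/login")
suspect = visual_features("http://secure-login.pay.example.com/x")
```

Feature dictionaries like these could then be turned into rows of 0/1 values and passed to any decision tree implementation to inspect feature contributions.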

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.22-34
    • /
    • 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Degraded-performance services can be utilized as limited systems that assure good performance in certain environments, but impair the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model robust to various noise settings, mimicking human dialogue recognition elements. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms, transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using a three-feature multi-fusion method is 1.711%, compared to the general 3.939% rate. This model is applicable in noise-affected environments owing to its enhanced stability and recognition rate.

Development of an Edutainment Contents using Wiimote Controller for Children with Visual Perception Disabilities (위모트를 활용한 시지각 장애아동 교육 콘텐츠개발)

  • Yoo, Sang-Jo;Han, Kyeong-Im;Kim, Bong-Seok;Park, Dong-Gyu
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.10
    • /
    • pp.1547-1556
    • /
    • 2010
  • Until now, many Computer Aided Education (CAE) contents have been developed for kids and children with disabilities. The contents cover various types of training, including visual perception training, intelligence development training, and literature education. A major problem with these contents is that they require long training time on a desktop machine, which reduces physical activity. This problem can also cause inaction syndrome in young kids and children with disabilities. To solve this problem, we need motion-sensing content on a touch screen or touch board that interacts with trainees and enhances activity, collaboration, and immersiveness. We implement and develop educational content that interacts with a trainee using a beam projector or screen and IR (Infra-Red) pens, based on Wiimote controller sensing technology.

The Development of Remodeling Process for Visual Content's Story by Big Data (빅데이터를 활용한 영상콘텐츠 스토리 리모델링 프로세스 개발)

  • Lee, Hye-Won;Park, Sung-Won;Kim, Lee-Kyung
    • Journal of Information Technology Applications and Management
    • /
    • v.26 no.3
    • /
    • pp.121-134
    • /
    • 2019
  • The Fourth Industrial Revolution is distinguished by technologies such as artificial intelligence, IoT (Internet of Things), big data, and mobile. As civilization develops, humanity devotes more time to cultural activities than to economic activity for food and shelter. Platform structures based on today's advanced information technology will expand the cultural contents area in a variety of ways. Cultural contents respond sensitively to changes in consumers and will become useful experiences within human activities. Therefore, it should be noted again that the contents industry should not be limited to discussing the application of fourth industrial technology, but should produce content with an emphasis on useful human experiences. In other words, the discussion of human activities around cultural contents should focus on how to apply fourth industrial technology, beyond merely using it. It is therefore necessary to analyze the basis of successful storytelling at the planning stage in order to connect fourth industrial technology with useful human experience as a method for developing cultural contents, and to build and propose a model as a strategic method. This study analyzes domestic and foreign cases of visual content produced using big data, a segment of the culture industry whose consumption continues to grow, and derives their success factors and limitations. Next, we extract the successful matching factors that influenced consumers' consciousness, find that the structure of cultural prototypes has been applied throughout the long history of mankind, and present it as a storytelling model. Through this research, the study aims to offer a new interpretation of, and creative approach to, cultural contents by presenting a storytelling model as a methodology for connecting creative knowledge, moving away from the general interpretation of social phenomena through big data.

HW/SW Co-design of a Visual Driver Drowsiness Detection System

  • Yu, Tian;Zhai, Yujia
    • Journal of Convergence Society for SMB
    • /
    • v.4 no.1
    • /
    • pp.31-39
    • /
    • 2014
  • A PID auto-tuning controller was designed via fuzzy logic. Typical values such as the error and the error-derivative feedback were converted into heuristic expressions, which determine the PID gains through fuzzy logic and a defuzzification process. The fuzzy procedure and the PID controller design were considered separately and then combined and analyzed. The resulting fuzzy-logic auto-tuning PID controller was able to control plants of less than third order. We also applied the designed auto-tuning scheme to a reference tracking problem.
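
The fuzzy gain-tuning idea in this abstract (mapping the error and its derivative to linguistic terms, firing rules, then defuzzifying to a gain) can be sketched as follows. Everything concrete here is an assumption for illustration: the three triangular sets, the rule table, the singleton gain values, and the weighted-average defuzzification are not taken from the paper, and only the proportional gain is tuned.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Linguistic terms over a normalized range [-1, 1] (assumed): Negative, Zero, Positive.
SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}

# Hypothetical rule table: (error term, derivative term) -> proportional gain.
KP_RULES = {
    ("N", "N"): 3.0, ("N", "Z"): 2.0, ("N", "P"): 1.0,
    ("Z", "N"): 2.0, ("Z", "Z"): 1.0, ("Z", "P"): 2.0,
    ("P", "N"): 1.0, ("P", "Z"): 2.0, ("P", "P"): 3.0,
}


def fuzzy_kp(error, d_error):
    """Infer a proportional gain from the error and its derivative."""
    e = max(-1.0, min(1.0, error))      # clamp inputs to the normalized range
    de = max(-1.0, min(1.0, d_error))
    num = den = 0.0
    for (e_term, de_term), kp in KP_RULES.items():
        # Rule strength: fuzzy AND implemented as the minimum of the memberships.
        w = min(tri(e, *SETS[e_term]), tri(de, *SETS[de_term]))
        num += w * kp
        den += w
    # Weighted-average defuzzification; den > 0 because the sets cover [-1, 1].
    return num / den


kp_rest = fuzzy_kp(0.0, 0.0)
kp_mid = fuzzy_kp(0.5, 0.0)
kp_far = fuzzy_kp(1.0, 1.0)
```

In a full auto-tuner, similar rule tables would supply the integral and derivative gains, and the inferred gains would be fed to an ordinary PID update at each control step.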
