• Title/Summary/Keyword: Visual Models

Improved Cycle GAN Performance By Considering Semantic Loss (의미적 손실 함수를 통한 Cycle GAN 성능 개선)

  • Tae-Young Jeong;Hyun-Sik Lee;Ye-Rim Eom;Kyung-Su Park;Yu-Rim Shin;Jae-Hyun Moon
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.908-909 / 2023
  • Recently, several generative models have emerged and are being used in various industries. Among them, Cycle GAN is still used in fields such as style transfer, medical care, and autonomous driving. In this paper, we propose two methods to improve the performance of the Cycle GAN model. First, the ReLU activation function previously used in the generator is replaced with Leaky ReLU. Second, a new loss function is proposed that uses a VGG feature extractor to consider the semantic level rather than only the pixel level. The proposed model showed improved quality on the test set in the art domain, and it is expected to improve performance when applied to other domains in the future.
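The two changes described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the negative slope of 0.2, the function names, and `toy_features` (a hypothetical stand-in for the VGG feature extractor) are all assumptions chosen to keep the example self-contained.

```python
def relu(x):
    """Standard ReLU: negative inputs are zeroed."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.2):
    """Leaky ReLU: negative inputs are scaled instead of zeroed.
    The slope of 0.2 is an assumed value, not taken from the paper."""
    return x if x >= 0 else negative_slope * x


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def toy_features(image):
    """Hypothetical stand-in for a VGG feature extractor.
    Maps a flat list of pixels to two coarse descriptors: mean and range."""
    return [sum(image) / len(image), max(image) - min(image)]


def semantic_loss(extract_features, generated, target):
    """Compare images in feature space rather than pixel by pixel."""
    return mse(extract_features(generated), extract_features(target))


# Two images with the same overall content but shifted pixels.
generated = [0.0, 1.0, 0.0, 1.0]
target = [1.0, 0.0, 1.0, 0.0]

pixel_loss = mse(generated, target)
feature_loss = semantic_loss(toy_features, generated, target)
```

In this toy case the pixel-level loss is large while the feature-level loss vanishes, which is the kind of distinction a semantic loss is meant to capture. A real setup would replace `toy_features` with activations from a pretrained network.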

Leveraging Deep Learning and Farmland Fertility Algorithm for Automated Rice Pest Detection and Classification Model

  • Hussain. A;Balaji Srikaanth. P
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.4 / pp.959-979 / 2024
  • Rice pest identification is essential in modern agriculture for the health of rice crops. As global rice consumption rises, yields and quality must be maintained. Various methodologies were employed to identify pests, encompassing sensor-based technologies, deep learning, and remote sensing models. Visual inspection by professionals and farmers remains essential, but integrating technology such as satellites, IoT-based sensors, and drones enhances efficiency and accuracy. A computer vision system processes images to detect pests automatically. It gives real-time data for proactive and targeted pest management. With this motive in mind, this research provides a novel farmland fertility algorithm with a deep learning-based automated rice pest detection and classification (FFADL-ARPDC) technique. The FFADL-ARPDC approach classifies rice pests from rice plant images. Before processing, FFADL-ARPDC removes noise and enhances contrast using bilateral filtering (BF). Additionally, rice crop images are processed using the NASNetLarge deep learning architecture to extract image features. The FFA is used for hyperparameter tweaking to optimise the model performance of the NASNetLarge, which aids in enhancing classification performance. Using an Elman recurrent neural network (ERNN), the model accurately categorises 14 types of pests. The FFADL-ARPDC approach is thoroughly evaluated using a benchmark dataset available in the public repository. With an accuracy of 97.58, the FFADL-ARPDC model exceeds existing pest detection methods.
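The bilateral filtering step mentioned in this abstract smooths noise while keeping edges, because each neighbor is weighted both by distance and by how similar its intensity is. The following is a one-dimensional sketch under assumed parameters (`radius`, `sigma_space`, `sigma_range` are illustrative values, not those of the paper); an image pipeline would apply the same idea in two dimensions.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=0.1):
    """Edge-preserving smoothing of a 1-D signal.

    Each output sample is a weighted average of its neighbors. The weight
    combines a spatial Gaussian (distance between indices) and a range
    Gaussian (difference between intensity values).
    """
    n = len(signal)
    out = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            similarity = math.exp(
                -((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2)
            )
            w = spatial * similarity
            weighted_sum += w * signal[j]
            weight_total += w
        out.append(weighted_sum / weight_total)
    return out


# A sharp step edge is left almost untouched,
# while a small isolated bump is pulled toward its neighbors.
edge = bilateral_filter_1d([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
bump = bilateral_filter_1d([0.0, 0.06, 0.0, 0.0, 0.0])
```

The range term is what separates this from a plain Gaussian blur: samples across a large intensity jump receive almost no weight, so the edge survives.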

A Research of User Experience on Multi-Modal Interactive Digital Art

  • Qianqian Jiang;Jeanhun Chung
    • International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.80-85 / 2024
  • The concept of single-modal digital art originated in the 20th century and has evolved through three key stages. Over time, digital art has transformed into multi-modal interaction, representing a new era in art forms. Based on multi-modal theory, this paper explores the characteristics of interactive digital art in innovative art forms and its impact on user experience. Through an analysis of practical applications of multi-modal interactive digital art, this study summarises the impact of creative models of digital art on the physical and mental aspects of user experience. In creating audio-visual-based art, multi-modal digital art should seamlessly incorporate sensory elements and leverage computer image processing technology. Focusing on user perception, emotional expression, and cultural communication, it strives to establish an immersive environment with user experience at its core. Future research, particularly with emerging technologies like Artificial Intelligence (AR) and Virtual Reality (VR), should not merely prioritize technology but aim for meaningful interaction. Through multi-modal interaction, digital art is poised to continually innovate, offering new possibilities and expanding the realm of interactive digital art.

Artificial Intelligence Plant Doctor: Plant Disease Diagnosis Using GPT4-vision

  • Yoeguang Hue;Jea Hyeoung Kim;Gang Lee;Byungheon Choi;Hyun Sim;Jongbum Jeon;Mun-Il Ahn;Yong Kyu Han;Ki-Tae Kim
    • Research in Plant Disease / v.30 no.1 / pp.99-102 / 2024
  • Integrated pest management is essential for controlling plant diseases that reduce crop yields. In the event of an outbreak, rapid diagnosis is crucial for identifying the cause and minimizing damage. Diagnosis methods range from indirect visual observation, which can be subjective and inaccurate, to machine learning and deep learning predictions that may suffer from biased data. Direct molecular-based methods, while accurate, are complex and time-consuming. However, large multimodal models such as GPT-4 combine image recognition with natural language processing to provide more accurate diagnostic information. This study introduces a GPT-4-based system for diagnosing plant diseases that uses a detailed knowledge base with 1,420 host plants, 2,462 pathogens, and 37,467 pesticide instances from the official plant disease and pesticide registries of Korea. The AI plant doctor offers interactive advice on diagnosis, control methods, and pesticide use for diseases in Korea and is accessible at https://pdoc.scnu.ac.kr/.

Design and Verification of Spacecraft Pose Estimation Algorithm using Deep Learning

  • Shinhye Moon;Sang-Young Park;Seunggwon Jeon;Dae-Eun Kang
    • Journal of Astronomy and Space Sciences / v.41 no.2 / pp.61-78 / 2024
  • This study developed a real-time spacecraft pose estimation algorithm that combined a deep learning model and the least-squares method. Pose estimation in space is crucial for automatic rendezvous docking and inter-spacecraft communication. Owing to the difficulty in training deep learning models in space, we showed that actual experimental results could be predicted through software simulations on the ground. We integrated deep learning with nonlinear least squares (NLS) to predict the pose from a single spacecraft image in real time. We constructed a virtual environment capable of mass-producing synthetic images to train a deep learning model. This study proposed a method for training a deep learning model using pure synthetic images. Further, a visual-based real-time estimation system suitable for use in a flight testbed was constructed. Consequently, it was verified that the hardware experimental results could be predicted from software simulations with the same environment and relative distance. This study showed that a deep learning model trained using only synthetic images can be sufficiently applied to real images. Thus, this study proposed a real-time pose estimation software for automatic docking and demonstrated that the method constructed with only synthetic data was applicable in space.
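The nonlinear least squares (NLS) step named in this abstract can be sketched in a much reduced form. The example below is an assumption-laden illustration, not the paper's algorithm: it estimates a 2-D rigid pose (rotation angle and translation) from known point correspondences with Gauss-Newton iterations, whereas the study works with full spacecraft pose from images. The function names and the idea that a network supplies the observed points are hypothetical.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0, 0.0, 0.0]
    for r in range(2, -1, -1):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x


def estimate_pose_2d(model_pts, observed_pts, iters=20):
    """Gauss-Newton fit of (theta, tx, ty) so that R(theta) p + t matches q."""
    theta, tx, ty = 0.0, 0.0, 0.0
    for _ in range(iters):
        JTJ = [[0.0] * 3 for _ in range(3)]
        JTr = [0.0] * 3
        c, s = math.cos(theta), math.sin(theta)
        for (x, y), (u, v) in zip(model_pts, observed_pts):
            # Residual of the current pose for this correspondence.
            res = (c * x - s * y + tx - u, s * x + c * y + ty - v)
            # Jacobian rows with respect to (theta, tx, ty).
            jac = ((-s * x - c * y, 1.0, 0.0), (c * x - s * y, 0.0, 1.0))
            for row, r in zip(jac, res):
                for a in range(3):
                    JTr[a] += row[a] * r
                    for b in range(3):
                        JTJ[a][b] += row[a] * row[b]
        # Normal equations: (J^T J) delta = -J^T r.
        delta = solve3(JTJ, [-g for g in JTr])
        theta += delta[0]
        tx += delta[1]
        ty += delta[2]
    return theta, tx, ty
```

In a pipeline of the kind the abstract describes, the observed points would presumably come from a learned model, and the least-squares stage would refine the pose that best explains them.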

A Study on the Visual Preference of Users according to the Location of Benches at Urban Community Parks (도시공원에서 벤치의 배치장소에 따른 이용자의 시각적 선호도에 관한 연구)

  • 유상완;문석기;권상준
    • Archives of design research / v.13 no.2 / pp.95-102 / 2000
  • The purpose of this study is to determine users' preferences according to the location of benches in urban community parks. Bench locations are separated into four patterns according to their arrangement relative to water space, walkways, pergolas and shelters, and green space. Visual preference is examined by analyzing the visual volume of the four patterns. The results are as follows. 1. Factor analysis of the total data showed that five factors explain 60.40 percent of the total variance in the visual character of bench locations. They were classified as the sensitive factor, visual factor, physical-individual factor, distinct factor, and density factor. Among the five factors, the sensitive factor, which represents psychological reaction, was rated highest. 2. Most of the 20 items showed the following order of mean values in the semantic differential experiment: Spot 1 -> Spot 4 -> Spot 2 -> Spot 3. The mean values differed significantly between locations, which can be explained by the sharp contrast between the natural factors (water space, green space, etc.) and the artificial factors (areas around pergolas, shelters, etc.).

Augmented Plasticity: Giving Morphological Editability to Physical Objects (증강가소성: 물리적 오브젝트에 형태적 편집가능성 부여하기)

  • Lee, Woo-Hun;Kang, Hye-Kyoung
    • Archives of design research / v.19 no.1 s.63 / pp.225-234 / 2006
  • Product designers sketch various ideas for foreground figures (detail design) onto background figures (basic form) and evaluate numerous combinations of them in the late stages of the design process. Designers have to test their ideas thoroughly with a high-fidelity physical model that looks like a real product. However, because of the time and expense required to make high-fidelity design models, it is impossible to evaluate so many combinations of background and foreground figures. Unlike digital models, physical design models are not easily modifiable, so designers cannot easily develop ideas through an iterative design-evaluation process. To address these problems, we propose a new concept, 'Augmented Plasticity', that gives morphological editability to a rigid physical object using Augmented Reality technology, and we implemented the idea as the Digital Skin system. The Digital Skin system determines the position and orientation of the object surface with an ARToolKit visual marker and seamlessly superimposes a deformed surface image using a differential rendering method. We applied the Digital Skin system to detail design, product redesign, and material exploration tasks. As a result, we found that the Digital Skin system has the potential to let designers implement and test their ideas very efficiently in the late stages of the design process.

THREE-DIMENSIONAL PHOTOELASTIC STRESS ANALYSIS OF CLASP RETAINERS INFLUENCED BY VARIOUS DESIGNS ON UNILATERAL FREE-END REMOVABLE PARTIAL DENTURES (하악 편측 유리단 국소의치의 직접유지장치 형태에 따른 3차원적 광탄성 응력분석 연구)

  • Kim Byeong-Moo;Yoo Kwang-Hee
    • The Journal of Korean Academy of Prosthodontics / v.32 no.4 / pp.526-552 / 1994
  • The extent and direction of movement of removable partial dentures during function are influenced by the nature of the supporting structures and the design of the prosthesis. Since forces are transmitted to the abutment teeth through occlusal rests, guide planes, and direct retainers during functional movements, proper design based on the available research data will maintain the health of the abutment teeth and their supporting structures. The purpose of this in vitro study is to evaluate the stress distribution around abutment teeth prepared for four types of clasping systems for unilateral free-end removable partial dentures. The three-dimensional photoelastic stress analysis method was used because it gives a visual display of stresses in the simulated abutment teeth and residual ridges and reveals stress concentrations that can be read at any given point in terms of direction and magnitude. For this study, the author fabricated four mandibular photoelastic epoxy models missing the left 1st and 2nd molars. The epoxy models were duplicated, and four unilateral removable partial dentures were constructed in accordance with the four types of direct retainers. Each unilateral free-end removable partial denture was positioned on its own model. A 6 kg force was applied vertically by the loading device to the central fossa of the mandibular left 1st molar of each removable partial denture. After the stress was frozen in a stress-freezing furnace, six specimens of 6 mm thickness were made from each epoxy model and examined with a circular polariscope. The results were as follows: 1. In general, the I-bar clasp showed the most favorable stress distribution around the abutment teeth. 2. At the end portion of the free-end ridge, the Back-action clasp showed the highest stress concentration at the bucco-lingual and top portions of the residual alveolar ridge. 3. At the distal area of the abutment teeth, the Akers clasp and Roach clasp showed higher stress concentration bucco-lingually and apically than the others. 4. On the abutment tooth, the I-bar clasp showed the least stress bucco-lingually, while the others showed irregular stress distribution. 5. At the mesial area of the abutment teeth, the order of effective stress distribution was I-bar clasp, Back-action clasp, Akers clasp, and Roach clasp. There was a large difference in stress distribution among them. 6. At the right 2nd premolar and 1st molar, the stress concentration of the Akers clasp was slightly high, whereas that of the I-bar clasp was low.

A Study on Model for Drivable Area Segmentation based on Deep Learning (딥러닝 기반의 주행가능 영역 추출 모델에 관한 연구)

  • Jeon, Hyo-jin;Cho, Soo-sun
    • Journal of Internet Computing and Services / v.20 no.5 / pp.105-111 / 2019
  • Core technologies that lead the Fourth Industrial Revolution era, such as artificial intelligence, big data, and autonomous driving, are implemented and serviced through the rapid development of computing power and hyper-connected networks based on the Internet of Things. In this paper, we implement two different models for drivable area segmentation in various environments and propose the better model by comparing the results. The models use DeepLab V3+ and Mask R-CNN, which perform well in image segmentation and are used in many studies of autonomous driving technology. For driving information in various environments, we use the BDD dataset, which provides driving videos and images in various weather conditions and at both day and night. The results show that Mask R-CNN performs better, with 68.33% IoU, than DeepLab V3+, with 48.97% IoU. In addition, in a visual inspection of drivable area segmentation on driving images, the accuracy of Mask R-CNN is 83% and that of DeepLab V3+ is 69%. This indicates that Mask R-CNN is more effective than DeepLab V3+ for drivable area segmentation.
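The IoU (intersection over union) metric reported in this abstract compares a predicted mask with a ground-truth mask. A minimal sketch follows, assuming masks are given as flat sequences of 0/1 values; the function name and the convention of returning 1.0 for two empty masks are assumptions made here, not details from the paper.

```python
def iou(predicted, truth):
    """Intersection over union of two binary masks of equal length.

    A pixel counts toward the intersection when both masks mark it,
    and toward the union when at least one mask marks it.
    """
    intersection = sum(1 for p, t in zip(predicted, truth) if p and t)
    union = sum(1 for p, t in zip(predicted, truth) if p or t)
    # Assumed convention: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# One pixel agrees out of three pixels marked by either mask.
score = iou([1, 1, 0, 0], [1, 0, 1, 0])
```

Because the union grows with every false positive and false negative, IoU penalizes both over- and under-segmentation, which makes it stricter than plain pixel accuracy.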

Development of Remote Measurement Method for Reinforcement Information in Construction Field Using 360 Degrees Camera (360도 카메라 기반 건설현장 철근 배근 정보 원격 계측 기법 개발)

  • Lee, Myung-Hun;Woo, Ukyong;Choi, Hajin;Kang, Su-min;Choi, Kyoung-Kyu
    • Journal of the Korea Institute for Structural Maintenance and Inspection / v.26 no.6 / pp.157-166 / 2022
  • Structural supervision on construction sites has been performed by visual inspection, which is highly labor-intensive and subjective. In this study, a remote technique was developed to improve the efficiency of rebar spacing measurements using a 360° camera and reconstructed 3D models. The proposed method was verified by measuring the spacings in a reinforced concrete structure, where twelve locations in the construction site (265 m²) were scanned within 20 seconds per location, taking 15 minutes in total. SLAM, consisting of SIFT, RANSAC, and General framework graph optimization algorithms, produces RGB-based 3D and 3D point cloud models, respectively. The minimum resolution of the 3D point cloud was 0.1 mm, while that of the RGB-based 3D model was 10 mm. Based on the results from both 3D models, the measurement error ranged from 0.3% to 10.8% in the 3D point cloud and from 3.1% to 28.4% in the RGB-based 3D model. The results demonstrate that the proposed method has great potential for remote structural supervision with respect to accuracy and objectivity.
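The abstract reports measurement error as a percentage but does not define it. One plausible reading, sketched below as an assumption, is the relative difference between a spacing measured in the reconstructed model and a reference spacing. The function names and the sample coordinates are hypothetical.

```python
def spacings(bar_centers):
    """Distances between consecutive rebar centers along one axis (mm)."""
    ordered = sorted(bar_centers)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def percent_error(measured, reference):
    """Relative error of a measured spacing against a reference, in percent.
    Assumed definition; the abstract does not state the formula."""
    return abs(measured - reference) / reference * 100.0


# Hypothetical bar centers extracted from a 3D model, in millimetres.
measured_spacings = spacings([0.0, 200.0, 405.0, 600.0])
errors = [percent_error(s, 200.0) for s in measured_spacings]
```

Reporting the minimum and maximum of such per-spacing errors would yield a range of the kind quoted in the abstract.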