• Title/Summary/Keyword: multi-modal


Range Estimating Performance Evaluation of the Underwater Broadband Source by Array Invariant (Array Invariant를 이용한 수중 광대역 음원의 거리 추정성능 분석)

  • Kim Se-Young;Chun Seung-Yong;Kim Boo-Il;Kim Ki-Man
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.6
    • /
    • pp.305-311
    • /
    • 2006
  • In this paper, the performance of the array invariant method is evaluated for source-range estimation in a horizontally stratified shallow-water ocean waveguide. The method requires little computational effort compared with existing source-localization methods such as matched field processing or the waveguide invariant, and array gain is fully exploited. In addition, no knowledge of the environment is required except that the received field should not be dominated by purely interference. This simple and instantaneous method is applied to a simulated acoustic propagation field to test its range estimation performance. Range estimation results as a function of SNR are presented for an underwater impulsive source with a broadband spectrum. The spatial smoothing method is applied to suppress the effect of multipath propagation of the high-frequency signal. The performance test shows that the range estimation error rate is within 20% at SNRs above 10 dB.

Performance Analysis for Accuracy of Personality Recognition Models based on Setting of Margin Values at Face Region Extraction (얼굴 영역 추출 시 여유값의 설정에 따른 개성 인식 모델 정확도 성능 분석)

  • Qiu Xu;Gyuwon Han;Bongjae Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.1
    • /
    • pp.141-147
    • /
    • 2024
  • Recently, there has been growing interest in personalized services tailored to an individual's preferences. This has led to ongoing research aimed at recognizing and leveraging an individual's personality traits. Among various methods for personality assessment, the OCEAN model stands out as a prominent approach. In utilizing OCEAN for personality recognition, a multi-modal artificial intelligence model that incorporates linguistic, paralinguistic, and non-linguistic information is often employed. This paper examines the impact of the margin value set for extracting facial areas from video data on the accuracy of a personality recognition model that uses facial expressions to determine OCEAN traits. The study employed personality recognition models based on 2D Patch Partition, R2plus1D, 3D Patch Partition, and Video Swin Transformer technologies. Setting the facial area extraction margin to 60 resulted in the highest 1-MAE performance, 0.9118. These findings indicate the importance of selecting an optimal margin value to maximize the efficiency of personality recognition models.
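The role of the margin in face-region extraction and the 1-MAE score can be illustrated with a minimal sketch. The abstract does not state the unit of the margin or how the crop is computed, so the code below assumes the margin is a number of pixels added on each side of a detected face box and clipped to the frame; `expand_box` and `one_minus_mae` are hypothetical helper names, not functions from the paper.

```python
def expand_box(box, margin, frame_w, frame_h):
    """Grow a face box (x1, y1, x2, y2) by `margin` pixels on every side.

    Assumption: the margin is in pixels and the result is clipped to the frame.
    """
    x1, y1, x2, y2 = box
    return (
        max(0, x1 - margin),
        max(0, y1 - margin),
        min(frame_w, x2 + margin),
        min(frame_h, y2 + margin),
    )


def one_minus_mae(predicted, reference):
    """Return 1 - mean absolute error between predicted and reference trait scores."""
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return 1.0 - sum(errors) / len(errors)


# Hypothetical face box in a 640x360 frame, expanded with a margin of 60.
crop = expand_box((100, 80, 200, 220), 60, 640, 360)
# Hypothetical OCEAN scores in [0, 1] for one clip.
score = one_minus_mae([0.5, 0.6, 0.7, 0.4, 0.5], [0.4, 0.8, 0.7, 0.4, 0.5])
```

A larger margin keeps more context around the face (hair, neck, background), while a smaller one keeps the crop tight; the abstract reports that the choice measurably changes the 1-MAE score.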

LH-FAS v2: Head Pose Estimation-Based Lightweight Face Anti-Spoofing (LH-FAS v2: 머리 자세 추정 기반 경량 얼굴 위조 방지 기술)

  • Hyeon-Beom Heo;Hye-Ri Yang;Sung-Uk Jung;Kyung-Jae Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.1
    • /
    • pp.309-316
    • /
    • 2024
  • Facial recognition technology is widely used in various fields but faces challenges due to its vulnerability to fraudulent activities such as photo spoofing. Extensive research has been conducted to overcome this challenge. Most existing approaches, however, require specialized equipment such as multi-modal cameras or operation in high-performance environments. In this paper, we introduce LH-FAS v2 (Lightweight Head-pose-based Face Anti-Spoofing v2), a system designed to operate on a commercial webcam without any specialized equipment, to address the issue of facial recognition spoofing. LH-FAS v2 utilizes FSA-Net for head pose estimation and ArcFace for facial recognition, effectively assessing changes in head pose and verifying facial identity. We developed the VD4PS dataset, incorporating photo spoofing scenarios, to evaluate the model's performance. The experimental results show the model's balanced accuracy and speed, indicating that head pose estimation-based facial anti-spoofing technology can be effectively used to counteract photo spoofing.

Audio Generative AI Usage Pattern Analysis by the Exploratory Study on the Participatory Assessment Process

  • Hanjin Lee;Yeeun Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.4
    • /
    • pp.47-54
    • /
    • 2024
  • The importance of cultural arts education utilizing digital tools is increasing in terms of enhancing tech literacy, self-expression, and developing convergent capabilities. The process of creating with and evaluating innovative multi-modal AI provides users with expanded creative audio-visual experiences. In particular, creating music with AI provides innovative experiences in all areas, from musical ideas to improving lyrics, editing, and variations. In this study, we attempted to empirically analyze the process of performing tasks using an Audio and Music Generative AI platform and discussing them with fellow learners. As a result, 12 services and 10 types of evaluation criteria were collected through voluntary participation and classified by usage pattern and purpose. Academic, technological, and policy implications are presented for AI-powered liberal arts education from the learners' perspective.

Optimization of Correction Factor for Linearization with Tc-99m HMPAO and Tc-99m ECD Brain SPECT (Tc-99m HMPAO와 Tc-99m ECD 뇌SPECT의 뇌혈류량 정량화에 사용되는 Linearization Algorithm의 Correction Factor 조사)

  • Cho, Ihn-Ho;Hayashida, Kohei;Won, Kyu-Chang;Lee, Hyoung-Woo;Watabe, Hiroshi;Kume, Norihiko;Uyama, Chikao
    • Journal of Yeungnam Medical Science
    • /
    • v.16 no.2
    • /
    • pp.237-243
    • /
    • 1999
  • We conducted this study to find the optimal correction factor (${\alpha}$) of Lassen's linearization algorithm, which has been applied to correct flow-limited uptake in the high-flow range for $^{99m}Tc$ d,l-hexamethylpropyleneamine oxime (HMPAO) and $^{99m}Tc$ ethyl cysteinate dimer (ECD). Ten patients with chronic cerebral infarction were included in this study. We obtained corrected $^{99m}Tc$-HMPAO and $^{99m}Tc$-ECD brain SPECT (single photon emission computed tomography) images using the algorithm with ${\alpha}$ values varied from 0.1 to 10 and compared the results with regional cerebral blood flow determined by positron emission tomography (PET-rCBF). Multi-modal volume registration by maximization of mutual information was used to match the PET-rCBF and SPECT images. The highest correlation coefficients between $^{99m}Tc$-HMPAO and $^{99m}Tc$-ECD brain uptake and PET-rCBF were found at ${\alpha}$ = 1.4 and 2.1, respectively. We concluded that the optimal ${\alpha}$ values of Lassen's linearization algorithm for $^{99m}Tc$-HMPAO and $^{99m}Tc$-ECD brain SPECT images, as judged by comparison with PET-rCBF, were 1.4 and 2.1, respectively.
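The abstract does not write out the linearization formula. As a rough sketch, the commonly cited form of Lassen's correction maps a regional-to-reference count ratio $C/C_r$ to a flow ratio $F/F_r = \alpha (C/C_r) / (1 + \alpha - C/C_r)$; the code below assumes that form, and the function name `lassen_linearize` and the sample ratios are illustrative only.

```python
def lassen_linearize(count_ratio, alpha):
    """Apply the commonly cited Lassen correction to a count ratio C/Cr.

    Assumed form: F/Fr = alpha * (C/Cr) / (1 + alpha - C/Cr).
    The correction is only defined while C/Cr < 1 + alpha.
    """
    denominator = 1.0 + alpha - count_ratio
    if denominator <= 0:
        raise ValueError("count_ratio must be smaller than 1 + alpha")
    return alpha * count_ratio / denominator


# Hypothetical count ratios relative to a reference region.
ratios = [0.6, 1.0, 1.2]
# The two alpha values reported in the abstract for HMPAO and ECD.
corrected_hmpao = [lassen_linearize(r, 1.4) for r in ratios]
corrected_ecd = [lassen_linearize(r, 2.1) for r in ratios]
```

Under this form a ratio of 1 is left unchanged, ratios above 1 are stretched upward, and a smaller $\alpha$ applies a stronger correction, which is why the study scans $\alpha$ and picks the value giving the best agreement with PET-rCBF.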

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.57-73
    • /
    • 2021
  • Failure prevention and maintenance through anomaly detection in ICT infrastructure are becoming increasingly important. System monitoring data is multidimensional time series data, which makes it difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables should be considered, and existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is usually preprocessed with the sliding window technique and time series decomposition for autocorrelation analysis; these techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field: statistical methods and regression analysis were used in the early days, and machine learning and artificial neural networks are now being actively applied. Statistical methods are difficult to apply when the data is non-homogeneous, and they do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and detect anomalies by comparing predicted and actual values; their performance drops when the model is not robust or when the data contains noise or outliers, so the training data is restricted with respect to noise and outliers. An autoencoder based on artificial neural networks is trained to produce output as similar as possible to its input. It has many advantages over existing probability and linear models, cluster analysis, and supervised learning: it can be applied to data that does not satisfy a probability distribution or linearity assumption, and it can be trained in an unsupervised manner without labeled data. However, identifying local outliers in multidimensional data remains a limitation in anomaly detection, and the dimension of the data grows considerably because of the characteristics of time series data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that improves anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to address the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of inputs, such as voice and image. The different modalities share the autoencoder's bottleneck, through which the correlation between them is learned. In addition, a CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoders for 41 variables was checked for the proposed model and the comparison models. Reconstruction performance differs by variable; for the Memory, Disk, and Network modalities the loss is small in all three autoencoder models, so reconstruction works well. The Process modality did not show a significant difference among the three models, and the CPU modality showed excellent performance with CMAE. ROC curves were prepared to evaluate the anomaly detection performance of the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and UAE. In particular, the recall of CMAE was 0.9828, indicating that it detects almost all anomalies. The accuracy of the model also improved to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has an advantage beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the resulting increase in dimension can slow down inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
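Two ideas in this abstract lend themselves to a short sketch: feeding time to the model as a condition so that periodicity can be learned without widening the input with sliding windows, and scoring detections with precision, recall, F1-score, and accuracy. The abstract does not say how time is encoded, so the sine/cosine encoding of the hour below is an assumption, and `time_condition`, `conditioned_input`, and `detection_metrics` are hypothetical names rather than parts of the CMAE implementation.

```python
import math


def time_condition(hour):
    """Encode hour-of-day as a point on the unit circle (assumed encoding)."""
    angle = 2.0 * math.pi * (hour % 24) / 24.0
    return [math.sin(angle), math.cos(angle)]


def conditioned_input(features, hour):
    """Append the two-value time condition to one row of monitoring features."""
    return list(features) + time_condition(hour)


def detection_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and accuracy for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical CPU, memory, and disk readings taken at 06:00.
row = conditioned_input([0.42, 0.77, 0.10], hour=6)
# Hypothetical ground truth and detector output.
metrics = detection_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

The time condition adds only two values per row regardless of the period being modeled, whereas a sliding window multiplies the input width by the window length.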

Modeling of Elastodynamic Problems in Finite Solid Media (유한 고체내 탄성동역학 문제의 모델링)

  • Cho, Youn-Ho
    • Journal of the Korean Society for Nondestructive Testing
    • /
    • v.20 no.2
    • /
    • pp.138-149
    • /
    • 2000
  • Various modeling techniques for ultrasonic wave propagation and scattering problems in finite solid media are presented. Elastodynamic boundary value problems in inhomogeneous multi-layered plate-like structures are set up for modal analysis of guided wave propagation and solved numerically to obtain dispersion curves, which show the propagation characteristics of guided waves. As a powerful modeling tool for overcoming numerical difficulties in wave scattering problems, such as geometrical complexity and mode conversion, the Boundary Element Method (BEM) is introduced and combined with the normal mode expansion technique to develop the hybrid BEM, an efficient technique for modeling multi-mode conversion in guided wave scattering problems. Time-dependent waveforms are obtained through the inverse Fourier transformation of the numerical solutions in the frequency domain. Development of a 3D BEM program is underway to model more practical ultrasonic wave signals. Some encouraging numerical results have recently been obtained in comparison with the analytical solutions for wave propagation in a bar subjected to time-harmonic longitudinal excitation. It is expected that the presented modeling techniques for elastic wave propagation and scattering can be applied to establish quantitative nondestructive evaluation techniques in various ways.
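The step from frequency-domain solutions to time-dependent waveforms can be sketched with a plain inverse discrete Fourier transform. This is only an illustration of the transform itself, not of the hybrid BEM: the sample spectra are made up, and taking the real part assumes the spectrum has the conjugate symmetry of a real-valued signal.

```python
import cmath


def inverse_dft(spectrum):
    """Return the real time-domain samples of a frequency-domain sequence.

    Assumption: the spectrum is conjugate-symmetric, so the imaginary part
    of the result is numerical noise and can be dropped.
    """
    n = len(spectrum)
    samples = []
    for t in range(n):
        total = sum(
            value * cmath.exp(2j * cmath.pi * k * t / n)
            for k, value in enumerate(spectrum)
        )
        samples.append(total.real / n)
    return samples


# A flat spectrum corresponds to a unit impulse at t = 0.
impulse = inverse_dft([1, 1, 1, 1])
# Energy in bins 1 and 3 of a 4-point spectrum gives one cosine cycle.
cosine = inverse_dft([0, 2, 0, 2])
```

In practice a fast Fourier transform routine would replace this direct summation, which costs on the order of n squared operations.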


On the Free Vibration Analysis of Thin-Walled Box Beams having Variable Cross-Sections (단면형상이 변하는 박판보의 진동해석에 관한 연구)

  • Lee, Gi-Jun;Sa, Jin-Yong;Kim, Jun-Sik
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.30 no.2
    • /
    • pp.111-117
    • /
    • 2017
  • In this paper, the local deformation effect in thin-walled box beams is investigated via finite element modal analysis. The analysis is carried out for single-cell and multi-cell box beam configurations. The single-cell box beam with and without a neck, which mimics a simple wind-turbine blade, is analyzed first. The results obtained with shell elements are compared to those of one-dimensional (1D) beam elements. It is observed that the wall thickness plays a crucial role in the natural frequencies of the beam. The 1D beam analysis deviates from the shell analysis when the wall is either thin or thick. The shell modes (local deformations) are dominant as the wall becomes thin, whereas the shear deformation effects are significant as it becomes thick. The analysis is extended to the single-cell box beam with a neck, in which the shell modes are confined to the vicinity of the neck. Finally, the multi-cell box beam with a taper, which is quite similar to a real wind-turbine blade configuration, is considered to investigate the local deformation effect. The results reveal that the 1D beam analysis cannot match the shell analysis because of the local deformation, especially for the lagwise frequencies. Errors of approximately 5 to 7% remain even if the number of segments is increased.

Facile Fabrication of Animal-Specific Positioning Molds For Multi-modality Molecular Imaging (다중 분자 영상을 위한 간편한 동물 특이적 자세 고정틀의 제작)

  • Park, Jeong-Chan;Oh, Ji-Eun;Woo, Seung-Tae;Kwak, Won-Jung;Lee, Jeong-Eun;Kim, Kyeong-Min;An, Gwang-Il;Choi, Tae-Hyun;Cheon, Gi-Jeong;Chang, Young-Min;Lee, Sang-Woo;Ahn, Byeong-Cheol;Lee, Jae-Tae;Yoo, Jeong-Soo
    • Nuclear Medicine and Molecular Imaging
    • /
    • v.42 no.5
    • /
    • pp.401-409
    • /
    • 2008
  • Purpose: Recently, multi-modal imaging systems have become widely adopted in molecular imaging. We tried to fabricate animal-specific positioning molds for PET/MR fusion imaging using easily available molding clay and rapid foam. The animal-specific positioning molds provide immobilization and reproducible positioning of small animals. Herein, we compared fiber-based molding clay with rapid foam in fabricating molds for experimental animals. Materials and Methods: A round-bottomed acrylic frame, which fitted into the microPET gantry, was prepared first. The experimental mouse was anesthetized and placed on the mold for positioning. Rapid foam and fiber-based clay were used to fabricate the mold. With both rapid foam and clay, the experimental animal needs to be pushed down gently into the mold for positioning. However, after the mouse was removed, the fabricated clay needed to be dried completely at $60^{\circ}C$ in an oven overnight for hardening. Four sealed pipet tips containing $[^{18}F]FDG$ solution were used as fiducial markers. After injection of $[^{18}F]FDG$ via the tail vein, microPET scanning was performed. MRI scanning then followed in the same animal. Results: Animal-specific positioning molds were fabricated using rapid foam and fiber-based molding clay for multimodality imaging. Functional and anatomical images were obtained with microPET and MRI, respectively. The fused PET/MR images were obtained using the freely available AMIDE program. Conclusion: Animal-specific molds were successfully prepared using easily available rapid foam, molding clay, and disposable pipet tips. Thanks to the animal-specific molds, fusion images of PET and MR were co-registered with negligible misalignment.

The Effects of Multi-Modal Cue for Haptic Imagery on Perceived Ownership (촉각적 심상화를 위한 다중감각 단서가 지각된 소유감에 미치는 영향)

  • Kim, Minsun;Han, Kwanghee
    • Science of Emotion and Sensibility
    • /
    • v.20 no.3
    • /
    • pp.49-60
    • /
    • 2017
  • Previous research found that merely touching an object can create psychological ownership and the endowment effect. It was also found that just imagining touching an object, without actually touching it, can have the same effect on psychological ownership. Prior research on haptic imagery examined the effect on psychological ownership of haptic imagery induced by direct instruction to imagine. We investigate a new method that can induce haptic imagery in a more natural way than direct instruction. We manipulated four imagery conditions (a visual-haptic congruent multimodal cue, a visual-haptic incongruent multimodal cue, direct instruction, and a control condition) and examined the effects on imagery vividness, feeling of physical control, perceived ownership, and purchase intention. We conducted the experiment with 140 undergraduate students, and our results showed that the visual-haptic congruent multimodal cue condition is more effective than direct instruction of haptic imagery, while the visual-haptic incongruent multimodal cue condition is not effective. Our study extends prior haptic imagery research by offering important marketing implications for online retailing.