• Title/Summary/Keyword: Multi-temporal Approach

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
    • ETRI Journal / v.46 no.1 / pp.22-34 / 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Degraded-performance services can be utilized as limited systems that assure good performance in certain environments, but impair the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model robust to various noise settings, mimicking human dialogue recognition elements. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms, transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using a three-feature multi-fusion method is 1.711%, compared to the general 3.939% rate. This model is applicable in noise-affected environments owing to its enhanced stability and recognition rate.
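
The abstract above evaluates recognition under synthesized noise at controlled signal-to-noise ratios but does not say how the noisy inputs were produced. One common way to build such test material, shown here as a minimal sketch, is to scale a noise recording so that the speech-to-noise power ratio hits a target value in dB before adding it to the clean speech. The function names, the sine-wave "speech", and the 5 dB target are all assumptions made for illustration.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR in dB from the clean signal and the noisy mixture."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


random.seed(0)
# Stand-in signals: a sine wave for speech, Gaussian samples for noise (both hypothetical).
speech = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

Measuring the SNR of the mixture with `measured_snr_db(speech, noisy)` gives back the requested target up to floating-point error, which is a quick sanity check before feeding such mixtures to a recognizer.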

A New Mean Frequency Extension Method in Doppler System (초음파 도플러 시스템에서 새로운 평균 주파수 확장 방법)

  • 백광렬
    • Journal of Biomedical Engineering Research / v.16 no.2 / pp.183-190 / 1995
  • Ultrasound pulsed Doppler systems have become increasingly popular because they make it easy to measure blood velocity, volume blood flow, and irregularities of the circulatory system. However, 2-D Doppler systems have several problems, such as range ambiguity, low signal-to-noise ratio, and slow frame rate. The mean frequency aliasing problem, which originates from the pulse repetition frequency, is one of the major limitations of pulsed Doppler systems. A conventional approach to resolving this problem is to track the mean frequency close to and beyond the Nyquist frequency along the temporal axis. In this paper, a new concept of tracking the mean frequency along the spatial axis is proposed. The proposed technique is fault tolerant by nature and more suitable for multigate and 2-D Doppler systems than conventional methods.
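
The idea of tracking a mean frequency past the Nyquist limit can be illustrated with a generic unwrapping rule: each aliased estimate is shifted by whichever multiple of the pulse repetition frequency keeps it closest to its already-unwrapped neighbor. The sketch below applies that rule along a sequence that could stand for adjacent range gates (the spatial axis). It is not the algorithm from the paper, whose details the abstract does not give. The PRF value, the velocity profile, and the assumptions that the first sample is unaliased and that neighbors differ by less than half the PRF are all hypothetical.

```python
def alias(frequency, prf):
    """Fold a true frequency into the observable interval [-prf/2, prf/2)."""
    return (frequency + prf / 2) % prf - prf / 2


def unwrap_mean_frequency(aliased, prf):
    """Undo aliasing by keeping each sample close to its unwrapped neighbor.

    Assumes the first sample is not aliased and that the true frequency
    changes by less than prf/2 between neighboring samples.
    """
    unwrapped = [aliased[0]]
    for f in aliased[1:]:
        k = round((unwrapped[-1] - f) / prf)  # number of PRF wraps to restore
        unwrapped.append(f + k * prf)
    return unwrapped


prf = 4000.0  # Hz, hypothetical pulse repetition frequency
# Hypothetical true mean frequencies across neighboring gates (Hz).
true_profile = [500.0, 1200.0, 1900.0, 2600.0, 3300.0, 2700.0, 1500.0]
aliased_profile = [alias(f, prf) for f in true_profile]
recovered = unwrap_mean_frequency(aliased_profile, prf)
```

In this toy profile the gates at 2600, 3300, and 2700 Hz exceed the 2000 Hz Nyquist limit and are observed as negative frequencies; the neighbor rule restores them.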

Multi-temporal Remote-Sensing Image Classification Using Artificial Neural Networks (인공신경망 이론을 이용한 위성영상의 카테고리분류)

  • Kang, Moon-Seong;Park, Seung-Woo;Lim, Jae-Chon
    • Proceedings of the Korean Society of Agricultural Engineers Conference / 2001.10a / pp.59-64 / 2001
  • The objective of this study is to propose a pattern classification method for remote sensing data using an artificial neural network. First, we apply the error back-propagation algorithm to classify the remote sensing data. In this case, the classification performance depends on the training data set. Using the training data set and the error back-propagation algorithm, a layered neural network is trained so that the training patterns are classified with a specified accuracy. After training the neural network, pixels that are incorrectly classified are deleted from the original training data set and a new training data set is built. Once training is complete, a testing data set is classified using the trained neural network. The classification results for Landsat TM data show that this approach produces excellent results that are more realistic and less noisy than those of a conventional Bayesian method.
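
The train, prune, and rebuild loop described in this abstract can be sketched independently of the classifier. To keep the example short and dependency-free, the sketch below swaps the back-propagation network for a nearest-centroid classifier; only the pruning step mirrors the abstract. The two-band feature vectors, class names, and the deliberately mislabeled pixel are invented for illustration.

```python
def train_centroids(samples):
    """samples: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is nearest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def prune_and_retrain(samples):
    """Drop training pixels the first model misclassifies, then retrain on the rest."""
    first_model = train_centroids(samples)
    kept = [(f, y) for f, y in samples if classify(first_model, f) == y]
    return train_centroids(kept), kept


# Hypothetical two-band training pixels; the last one is deliberately mislabeled.
samples = [
    ((0.1, 0.1), "water"), ((0.2, 0.1), "water"), ((0.1, 0.2), "water"),
    ((0.8, 0.9), "forest"), ((0.9, 0.8), "forest"), ((0.9, 0.9), "forest"),
    ((0.85, 0.85), "water"),
]
model, kept = prune_and_retrain(samples)
```

After pruning, the mislabeled pixel is gone and the "water" centroid is no longer pulled toward the forest cluster.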

Multi-Cattle Tracking Algorithm with Enhanced Trajectory Estimation in Precision Livestock Farms

  • Shujie Han;Alvaro Fuentes;Sook Yoon;Jongbin Park;Dong Sun Park
    • Smart Media Journal / v.13 no.2 / pp.23-31 / 2024
  • In precision cattle farming, reliably tracking the identity of each animal is necessary. Effective tracking of cattle within farm environments presents a unique challenge, particularly the need to minimize the number of excess tracking trajectories. To address this, we introduce a trajectory playback decision tree algorithm that reevaluates and cleans tracking results based on spatio-temporal relationships among trajectories. This approach treats trajectories as metadata, resulting in more realistic and accurate tracking outcomes. Extensive comparisons with popular tracking models demonstrate the robustness and capability of the algorithm, which consistently improves performance across evaluation metrics: HOTA, AssA, and IDF1 reach 68.81%, 79.31%, and 84.81%, respectively.

Extended Forecasts of a Stock Index using Learning Techniques : A Study of Predictive Granularity and Input Diversity

  • Kim, Steven H.;Lee, Dong-Yun
    • Asia Pacific Journal of Information Systems / v.7 no.1 / pp.67-83 / 1997
  • The utility of learning techniques in investment analysis has been demonstrated in many areas, ranging from forecasting individual stocks to entire market indexes. To date, however, the application of artificial intelligence to financial forecasting has focused largely on short predictive horizons. Usually the forecast window is a single period ahead; if the input data involve daily observations, the forecast is for one day ahead; if monthly observations, then a month ahead; and so on. Thus far little work has been conducted on the efficacy of long-term prediction involving multiperiod forecasting. This paper examines the impact of alternative procedures for extended prediction using knowledge discovery techniques. One dimension in the study involves temporal granularity: a single jump from the present period to the end of the forecast window versus a web of short-term forecasts involving a sequence of single-period predictions. Another parameter relates to the numerosity of input variables: a technical approach involving only lagged observations of the target variable versus a fundamental approach involving multiple variables. The dual possibilities along each of the granularity and numerosity dimensions entail a total of 4 models. These models are first evaluated using neural networks, then compared against a multi-input jump model using case based reasoning. The computational models are examined in the context of forecasting the S&P 500 index.
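
The granularity dimension in this abstract contrasts a single jump to the end of the forecast window with a chain of single-period predictions. A minimal sketch of that contrast follows, using the "technical" input choice (lagged values of the target only). A one-variable least-squares line stands in for the neural network and case-based reasoning models used in the study. The function names and the synthetic trend series are assumptions for illustration, not market data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def jump_forecast(series, horizon):
    """One model that maps y[t] directly to y[t + horizon]."""
    a, b = fit_line(series[:-horizon], series[horizon:])
    return a + b * series[-1]


def stepwise_forecast(series, horizon):
    """A one-step model applied `horizon` times, feeding each prediction back in."""
    a, b = fit_line(series[:-1], series[1:])
    value = series[-1]
    for _ in range(horizon):
        value = a + b * value
    return value


# Synthetic linear trend used only to exercise the two procedures.
series = [100.0 + 2.0 * t for t in range(30)]
jump = jump_forecast(series, horizon=5)
stepwise = stepwise_forecast(series, horizon=5)
```

On a noiseless linear trend the two procedures agree; on real data they generally diverge because the stepwise chain compounds its own one-step errors while the jump model has fewer training pairs.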

Mapping Topography Change via Multi-Temporal Sentinel-1 Pixel-Frequency Approach on Incheon River Estuary Wetland, Gochang, Korea (다중시기 Sentinel-1 픽셀-빈도 기법을 통한 고창 인천강 하구 습지의 지형 변화 매핑)

  • Won-Kyung Baek;Moung-Jin Lee;Ha-Eun Yu;Jeong-Cheol Kim;Joo-Hyung Ryu
    • Korean Journal of Remote Sensing / v.39 no.6_3 / pp.1747-1761 / 2023
  • Wetlands, defined as lands periodically inundated or exposed during the year, are crucial for sustaining biodiversity and filtering environmental pollutants. Mapping and monitoring their topographical changes is therefore essential. This study focuses on the topographical variations at the Incheon River estuary wetland after restoration, for which adequate prior measurements are lacking. Using a multi-temporal Sentinel-1 dataset from October 2014 to March 2023, we mapped long-term variations in water bodies and detected topographical change anomalies using a pixel-frequency approach. Our analysis, based on 196 Sentinel-1 acquisitions from an ascending orbit, revealed significant topography changes. Using the pixel-frequency technique, we observed area increases since 2020 of 0.0195, 0.0016, 0.0075, and 0.0163 km² in water level sections at depths of 2-3 m, 1-2 m, 0-1 m, and less than 0 m, respectively. These findings underscore the effectiveness of the wetland restoration efforts in the area.
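
A pixel-frequency map of the kind mentioned here can be sketched as follows: each acquisition is turned into a binary water mask, and the masks are averaged per pixel to give the fraction of dates on which that pixel was inundated. Comparing the maps of two periods then highlights where inundation became more or less frequent. The abstract does not state how water was detected, so the fixed backscatter threshold, the function names, and the tiny 2×2 scenes below are assumptions, relying on the common observation that open water appears dark in SAR imagery.

```python
def water_mask(backscatter_db, threshold_db=-18.0):
    """Flag pixels darker than the threshold as water (threshold is an assumed value)."""
    return [[1 if value < threshold_db else 0 for value in row] for row in backscatter_db]


def pixel_frequency(masks):
    """Fraction of acquisitions in which each pixel was flagged as water."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) / n for c in range(cols)] for r in range(rows)]


def frequency_change(before, after):
    """Per-pixel difference in inundation frequency between two periods."""
    return [[a - b for b, a in zip(row_b, row_a)] for row_b, row_a in zip(before, after)]


# Hypothetical 2x2 backscatter scenes (dB): two acquisitions per period.
period_1 = [[[-22.0, -10.0], [-20.0, -9.0]], [[-21.0, -19.0], [-12.0, -8.0]]]
period_2 = [[[-23.0, -20.0], [-11.0, -9.0]], [[-22.0, -21.0], [-10.0, -8.0]]]

freq_1 = pixel_frequency([water_mask(scene) for scene in period_1])
freq_2 = pixel_frequency([water_mask(scene) for scene in period_2])
change = frequency_change(freq_1, freq_2)
```

Positive values in `change` mark pixels that were under water more often in the second period, and negative values mark pixels that were exposed more often.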

Overview of Inter-Component Coding in 3D-HEVC (3D-HEVC를 위한 인터-컴포넌트 부호화 방법)

  • Park, Min Woo;Lee, Jin Young;Kim, Chanyul
    • Journal of Broadcast Engineering / v.20 no.4 / pp.545-556 / 2015
  • An HEVC-compatible 3D video coding method (3D-HEVC) has recently been developed as an extension of the high efficiency video coding (HEVC) standard. To deal efficiently with the multi-view video plus depth (MVD) format, 3D-HEVC exploits inter-component prediction, which allows prediction between texture and depth map images in addition to the temporal prediction used in conventional single-layer video coding such as H.264/AVC and HEVC. The performance of inter-component prediction is normally affected by the accuracy of the disparity vector, so it is important to have an accurate disparity vector for inter-component prediction. This paper therefore introduces a disparity derivation method and inter-component algorithms that use the disparity vector for efficient 3D video coding. Simulation results show that 3D-HEVC provides higher coding performance than the simulcast approach using HEVC and the simple multi-view extension (MV-HEVC).

A Study on Training Dataset Configuration for Deep Learning Based Image Matching of Multi-sensor VHR Satellite Images (다중센서 고해상도 위성영상의 딥러닝 기반 영상매칭을 위한 학습자료 구성에 관한 연구)

  • Kang, Wonbin;Jung, Minyoung;Kim, Yongil
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1505-1514 / 2022
  • Image matching is a crucial preprocessing step for the effective use of multi-temporal and multi-sensor very high resolution (VHR) satellite images. Deep learning (DL) methods, which are attracting widespread interest, have proven to be an efficient way to measure the similarity between image pairs quickly and accurately by extracting complex and detailed features from satellite images. However, image matching of VHR satellite images remains challenging because the results of DL models depend on the quantity and quality of the training dataset, and because creating a training dataset from VHR satellite images is difficult. Therefore, this study examines the feasibility of a DL-based method for matching pair extraction, which is the most time-consuming process in image registration. This paper also analyzes the factors that affect accuracy depending on the configuration of the training dataset when the dataset is developed from an existing, biased multi-sensor VHR image database for DL-based image matching. For this purpose, the training dataset was composed of correct and incorrect matching pairs by assigning true and false labels to image pairs extracted with a grid-based Scale Invariant Feature Transform (SIFT) algorithm from a total of 12 multi-temporal and multi-sensor VHR images. The Siamese convolutional neural network (SCNN), proposed for matching pair extraction on the constructed training dataset, learns and measures similarities by passing two images in parallel through two identical convolutional neural network structures. The results of this study confirm that data acquired from a VHR satellite image database can be used as a DL training dataset and indicate the potential to improve the efficiency of the matching process through appropriate configuration of multi-sensor images. DL-based image matching techniques using multi-sensor VHR satellite images are expected to replace existing manual feature extraction methods owing to their stable performance, and to develop further into an integrated DL-based image registration framework.
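
The defining property of a Siamese network, two inputs passed through identical branches with shared weights and then compared, can be shown with a toy example. The sketch below is not the SCNN from the paper: it uses a single hand-picked linear layer with ReLU instead of trained convolutional layers, and cosine similarity as the comparison. The weights, the four-value "patches", and all function names are hypothetical.

```python
import math


def embed(patch, weights):
    """One shared layer: linear projection followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, patch))) for row in weights]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def siamese_similarity(patch_a, patch_b, weights):
    """Pass both patches through the same weights and compare the embeddings."""
    return cosine(embed(patch_a, weights), embed(patch_b, weights))


# Hand-picked, untrained weights shared by both branches (illustrative only).
SHARED_WEIGHTS = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]

reference = [0.9, 0.1, 0.8, 0.2]     # flattened patch from one sensor
same_scene = [0.8, 0.2, 0.9, 0.1]    # similar structure, slightly different values
other_scene = [0.9, 0.1, 0.1, 0.9]   # different structure in the second half

sim_match = siamese_similarity(reference, same_scene, SHARED_WEIGHTS)
sim_mismatch = siamese_similarity(reference, other_scene, SHARED_WEIGHTS)
```

Because both branches use `SHARED_WEIGHTS`, the score is symmetric in its two inputs. In a trained network the weights would instead be learned from labeled true and false matching pairs, such as the SIFT-derived pairs the abstract describes.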

Land cover change and forest fragmentation analysis for Naypyidaw, Myanmar (미얀마 네피도 지역의 도시개발로 인한 토지피복변화 탐지 및 산림파편화 분석)

  • Kong, In-Hye;Baek, Gyoung-Hye;Lee, Dong-Kun
    • Journal of Environmental Impact Assessment / v.22 no.2 / pp.147-156 / 2013
  • Myanmar (Burma) has preserved valuable environmental resources because of its political isolation. Recently, however, Myanmar moved its capital city (Naypyidaw) to a central forest area, which has urbanized rapidly since 2005. In this paper, we built multi-temporal land cover maps from Landsat images from the 1970s to 2012 with ENVI 4.5 software. For a broad view, the administrative district of Yamethin, which includes Naypyidaw, is classified into 3 classes, while the Naypyidaw region alone is classified into 4-5 classes to analyse specific changes. Using forest cover extracted by object-oriented classification, we evaluated forest fragmentation in the Yamethin area before and after the development with Patch Analyst (FRAGSTATS 3.3). Forest cover changed significantly, from 51% in 1999 to 48% in 2012 in the Yamethin area and from 67% in 1999 to 57% in 2012 in the Naypyidaw area. The landscape indices from Patch Analyst also show that the total edge, edge density, and mean shape index of forest patches increased, while the total core area decreased. These changes are attributed to land cover change driven by urbanization and agricultural land expansion.

Interaction Intent Analysis of Multiple Persons using Nonverbal Behavior Features (인간의 비언어적 행동 특징을 이용한 다중 사용자의 상호작용 의도 분석)

  • Yun, Sang-Seok;Kim, Munsang;Choi, Mun-Taek;Song, Jae-Bok
    • Journal of Institute of Control, Robotics and Systems / v.19 no.8 / pp.738-744 / 2013
  • According to cognitive science research, the interaction intent of humans can be estimated by analyzing the behaviors that express it. This paper proposes a novel methodology for reliable analysis of human intention by applying this approach. To identify the intention, 8 behavioral features are extracted from 4 characteristics of human-human interaction, and we outline a set of core components of human nonverbal behavior. These nonverbal behaviors are associated with recognition modules built on multimodal sensors: localizing the sound source of the speaker in the audition part; recognizing the frontal face and facial expression in the vision part; and estimating human trajectories, body pose and leaning, and hand gestures in the spatial part. As a post-processing step, temporal confidential reasoning is used to improve recognition performance, and an integrated human model is used to quantitatively classify the intention from multi-dimensional cues by applying weight factors. Thus, interactive robots can make informed engagement decisions to interact effectively with multiple persons. Experimental results show that the proposed scheme works successfully between human users and a robot in human-robot interaction.