Search | Korea Science

Dual-stream Co-enhanced Network for Unsupervised Video Object Segmentation

Hongliang Zhu;Hui Yin;Yanting Liu;Ning Chen
- KSII Transactions on Internet and Information Systems (TIIS), v.18 no.4, pp.938-958, 2024
Unsupervised Video Object Segmentation (UVOS) is a highly challenging problem in computer vision because no annotation of the target object is available for the test video. The main difficulty is to effectively handle the complicated and changeable motion state of the target object and the confusion caused by similar background objects in the video sequence. In this paper, we propose a novel deep Dual-stream Co-enhanced Network (DC-Net) for UVOS via bidirectional motion cue refinement and multi-level feature aggregation, which takes full advantage of motion cues and effectively integrates features from different levels to produce high-quality segmentation masks. DC-Net is a dual-stream architecture in which the two streams enhance each other. One is a motion stream with a Motion-cues Refine Module (MRM), which learns from bidirectional optical flow images and produces a fine-grained, complete, and distinctive motion saliency map; the other is an appearance stream with a Multi-level Feature Aggregation Module (MFAM) and a Context Attention Module (CAM), which are designed to integrate features from different levels effectively. Specifically, the motion saliency map obtained by the motion stream is fused with each stage of the decoder in the appearance stream to improve the segmentation, and in turn the segmentation loss in the appearance stream feeds back into the motion stream to enhance the motion refinement. Experimental results on three datasets (Davis2016, VideoSD, SegTrack-v2) demonstrate that DC-Net achieves results comparable with some state-of-the-art methods.
https://doi.org/10.3837/tiis.2024.04.007

A Novel Way of Context-Oriented Data Stream Segmentation using Exon-Intron Theory (Exon-Intron이론을 활용한 상황중심 데이터 스트림 분할 방안)

Lee, Seung-Hun;Suh, Dong-Hyok
- The Journal of the Korea institute of electronic communication sciences, v.16 no.5, pp.799-806, 2021
In the IoT environment, event data from sensors is continuously reported over time. Event data obtained in this way accumulates indefinitely, so a method for efficient analysis and management of the data is required. In this study, a data stream segmentation method is proposed to support the effective selection and utilization of event data that sensors continuously report. An identifier marking the point at which to start the analysis process was selected. Introducing such identifiers makes it possible to clarify what is being analyzed and to reduce data throughput. The proposed method is a semantic-oriented data stream segmentation method whose identifiers are based on the event occurrence of each stream. The existence of identifiers in stream processing is useful for improving efficiency and reducing costs in an environment with a large-volume continuous data inflow.
https://doi.org/10.13067/JKIECS.2021.16.5.799

Implementation of Image Semantic Segmentation on Android Device using Deep Learning (딥-러닝을 활용한 안드로이드 플랫폼에서의 이미지 시맨틱 분할 구현)

Lee, Yong-Hwan;Kim, Youngseop
- Journal of the Semiconductor & Display Technology, v.19 no.2, pp.88-91, 2020
Image segmentation is the task of partitioning an image into multiple sets of pixels based on some characteristics. The objective is to simplify the image into a representation that is more meaningful and easier to analyze. In this paper, we apply deep learning to pre-train the model and implement an algorithm that performs image segmentation in real time by extracting frames from the stream input of the Android device. Based on the open-source DeepLab-v3+ implementation in TensorFlow, some convolution filters are modified to improve real-time operation on the Android platform.

Speaker Change Detection Based on a Graph-Partitioning Criterion

Seo, Jin-Soo
- The Journal of the Acoustical Society of Korea, v.30 no.2, pp.80-85, 2011
Speaker change detection involves identifying the time indices of an audio stream at which the identity of the speaker changes. In this paper, we propose novel measures for speaker change detection based on a graph-partitioning criterion over the pairwise distance matrix of the feature-vector stream. Experiments on both synthetic and real-world data showed that the proposed approach yields promising results compared with conventional statistical measures.
https://doi.org/10.7776/ASK.2011.30.2.080

An Automatic Road Sign Recognizer for an Intelligent Transport System

Miah, Md. Sipon;Koo, Insoo
- Journal of information and communication convergence engineering, v.10 no.4, pp.378-383, 2012
This paper presents the implementation of an automatic road sign recognizer for an intelligent transport system. In this system, lists of road signs are processed with actions such as line segmentation, single sign segmentation, and storing an artificial sign in the database. The process of taking the video stream, extracting the road sign, and storing it in the database is called road sign recognition. This paper presents a study on recognizing traffic sign patterns using a segmentation technique to improve the efficiency and speed of the system. The image is converted from one scale to another, such as RGB to grayscale or grayscale to binary. The images are pre-processed with several image processing techniques, such as threshold techniques, Gaussian filters, Canny edge detection, and the contour technique.
https://doi.org/10.6109/jicce.2012.10.4.378
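
The scale conversions mentioned in this abstract (RGB to grayscale, then grayscale to binary) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the ITU-R BT.601 luma weights, the fixed threshold of 128, and the function names `rgb_to_gray`, `to_binary`, and `rgb_image_to_binary` are assumptions made only for illustration.

```python
def rgb_to_gray(pixel):
    """Convert one (R, G, B) pixel to a gray level in 0..255.

    Assumption: ITU-R BT.601 luma weights; the abstract does not say
    which weighting the authors used.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_binary(gray_image, threshold=128):
    """Threshold a grayscale image (list of rows) into 0/1 values.

    Assumption: a fixed global threshold; the default 128 is arbitrary.
    """
    return [[1 if value >= threshold else 0 for value in row]
            for row in gray_image]


def rgb_image_to_binary(image, threshold=128):
    """Chain both conversions: RGB -> grayscale -> binary."""
    gray = [[rgb_to_gray(pixel) for pixel in row] for row in image]
    return to_binary(gray, threshold)
```

For example, `rgb_image_to_binary([[(255, 255, 255), (0, 0, 0)]])` maps the white pixel to 1 and the black pixel to 0. A real pipeline would apply Gaussian filtering, Canny edge detection, and contour extraction after this step, typically through an image processing library.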

The Voice Dialing System Using Dynamic Hidden Markov Models and Lexical Analysis (DHMM과 어휘해석을 이용한 Voice dialing 시스템)

최성호;이강성;김순협
- Journal of the Korean Institute of Telematics and Electronics B, v.28B no.7, pp.548-556, 1991
In this paper, Korean spoken continuous digits are recognized using DHMM (Dynamic Hidden Markov Model) and lexical analysis to provide a basis for developing a voice dialing system. Speech is segmented into phoneme units and then recognized. The system can be divided into the segmentation section, the standard speech design section, the recognition section, and the lexical analysis section. In the segmentation section, speech is segmented using the ZCR, the 0th-order LPC cepstrum, and Ai, a time-varying parameter for voice speech detection. In the standard speech design section, 19 phonemes or syllables are trained by DHMM and designed as standard speech. In the recognition section, the phoneme stream is recognized by the Viterbi algorithm. In the lexical decoder section, the finally recognized continuous digits are output. The experiment showed a recognition rate of 85.1% on data comprising 21 classes of 7 continuous digits, covering all occurring combinations, each spoken 7 times by 10 men.
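
As a rough illustration of one segmentation feature named above, the sketch below computes a per-frame zero-crossing rate, assuming that "ZCR" in the abstract refers to the standard zero-crossing rate. The function names `zero_crossing_rate` and `frame_zcr`, the frame length, and the hop size are hypothetical; the abstract does not give the authors' framing parameters.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative, so a move from 0 to a positive
    sample does not count as a crossing.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(samples, frame_len, hop):
    """ZCR for each full frame of `frame_len` samples, stepping by `hop`.

    Assumption: trailing samples that do not fill a frame are ignored.
    """
    return [
        zero_crossing_rate(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]
```

Frames with a high ZCR tend to correspond to unvoiced or noisy speech and frames with a low ZCR to voiced speech, which is why the measure is commonly paired with energy or cepstral features when locating phoneme boundaries.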

Operational Hydrological Forecast for the Nakdong River Basin Using HSPF Watershed Model (HSPF 유역모델을 이용한 낙동강유역 실시간 수문 유출 예측)

Shin, Changmin;Na, Eunye;Lee, Eunjeong;Kim, Dukgil;Min, Joong-Hyuk
- Journal of Korean Society on Water Environment, v.29 no.2, pp.212-222, 2013
A watershed model was constructed using the Hydrological Simulation Program-Fortran (HSPF) to quantitatively predict the stream flows at major tributaries of the Nakdong River basin, Korea. The entire basin was divided into 32 segments to effectively account for spatial variations in meteorological data and land segment parameter values of each tributary. The model was calibrated at ten tributaries, including the main stream of the river, for a three-year period (2008 to 2010). The deviation values (Dv) of runoff volumes for operational stream flow forecasting for a six-month period (2012.1.2 to 2012.6.29) at the ten tributaries ranged from -38.1 to 23.6%, which is on average 7.8% higher than those of runoff volumes for model calibration (-12.5 to 8.2%). The increased prediction errors were mainly due to the uncertainties of numerical weather prediction modeling; nevertheless, the stream flow forecasting results presented in this study were in good agreement with the measured data.

The Role of Post-lexical Intonational Patterns in Korean Word Segmentation

Kim, Sa-Hyang
- Speech Sciences, v.14 no.1, pp.37-62, 2007
The current study examines the role of post-lexical tonal patterns of a prosodic phrase in word segmentation. In a word spotting experiment, native Korean listeners were asked to spot a disyllabic or trisyllabic word from a twelve-syllable speech stream that was composed of three Accentual Phrases (AP). Words occurred with various post-lexical intonation patterns. The results showed that listeners spotted more words in phrase-initial than in phrase-medial position, suggesting that the AP-final H tone from the preceding AP helped listeners to segment the phrase-initial word in the target AP. Results also showed that listeners' error rates were significantly lower when words occurred with an initial rising tonal pattern, which is the most frequent intonational pattern imposed upon multisyllabic words in Korean, than with non-rising patterns. This result was observed both in AP-initial and in AP-medial positions, regardless of the frequency and legality of overall AP tonal patterns. Tonal cues other than the initial rising tone did not positively influence the error rate. These results not only indicate that a rising tone in AP-initial and AP-final position is a reliable cue for word boundary detection for Korean listeners, but further suggest that phrasal intonation contours serve as a possible word boundary cue in languages without lexical prominence.

Performance Analysis of Synchronization Communication Protocols for Real-Time Multimedia Services (실시간 멀티미디어 서비스용 동기 통신 프로토콜의 성능 분석)

김태규;조동호
- Journal of the Korean Institute of Telematics and Electronics A, v.31A no.4, pp.1-10, 1994
In the real-time delivery of multimedia data streams over networks, the most serious synchronization problems are the interruption of continuity in a single media stream and the mismatching of data within the same time interval across multimedia data streams transferred in parallel on different channels. Several mechanisms have been proposed to handle these problems. In this paper, these mechanisms are analyzed and compared from various points of view by computer simulation. The simulation results show that the method that uses segmentation and the method that uses a separate synchronization channel are superior to the method that uses synchronization marks in terms of real-time transmission and quality of service. In terms of channel utilization, however, the method that uses segmentation is superior to the method that uses a separate synchronization channel.

The Effect of Strong Syllables on Lexical Segmentation in English Continuous Speech by Korean Speakers (강음절이 한국어 화자의 영어 연속 음성의 어휘 분절에 미치는 영향)

Kim, Sunmi;Nam, Kichun
- Phonetics and Speech Sciences, v.5 no.2, pp.43-51, 2013
English native listeners have a tendency to treat strong syllables in a speech stream as the potential initial syllables of new words, since the majority of lexical words in English have a word-initial stress. The current study investigates whether Korean (L1) - English (L2) late bilinguals perceive strong syllables in English continuous speech as word onsets, as English native listeners do. In Experiment 1, word-spotting was slower when the word-initial syllable was strong, indicating that Korean listeners do not perceive strong syllables as word onsets. Experiment 2 was conducted in order to avoid any possibilities that the results of Experiment 1 may be due to the strong-initial targets themselves used in Experiment 1 being slower to recognize than the weak-initial targets. We employed the gating paradigm in Experiment 2, and measured the Isolation Point (IP, the point at which participants correctly identify a word without subsequently changing their minds) and the Recognition Point (RP, the point at which participants correctly identify the target with 85% or greater confidence) for the targets excised from the non-words in the two conditions of Experiment 1. Both the mean IPs and the mean RPs were significantly earlier for the strong-initial targets, which means that the results of Experiment 1 reflect the difficulty of segmentation when the initial syllable of words was strong. These results are consistent with Kim & Nam (2011), indicating that strong syllables are not perceived as word onsets for Korean listeners and interfere with lexical segmentation in English running speech.
https://doi.org/10.13064/KSSS.2013.5.2.043

Search Result 33, Processing Time 0.023 seconds
