• Title/Abstract/Keyword: Salient Segment Detection


Salient Object Detection via Adaptive Region Merging

  • Zhou, Jingbo; Zhai, Jiyou; Ren, Yongfeng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • Vol.10 No.9 / pp.4386-4404 / 2016
  • Most existing salient object detection algorithms employ segmentation techniques to eliminate background noise and reduce computation by treating each segment as a processing unit. However, individual small segments provide little information about global content, so such schemes have limited capability for modeling global perceptual phenomena. In this paper, a novel salient object detection algorithm is proposed based on region merging. An adaptive merging scheme is developed to reassemble regions based on their color dissimilarities. The merging strategy can be described as follows: a region R is merged with its adjacent region Q if Q has the lowest dissimilarity with R among all of R's adjacent regions. To guide the merging process, superpixels located at the boundary of the image are treated as seeds. However, it is possible for a boundary of the input image to be occupied by the foreground object. To avoid this case, we optimize the boundary influences by locating and eliminating erroneous boundaries before region merging. We show that even though three simple region saliency measurements are adopted for each region, encouraging performance can be obtained. Experiments on four benchmark datasets, including MSRA-B, SOD, SED, and iCoSeg, show that the proposed method produces uniformly enhanced objects and achieves state-of-the-art performance compared with nine existing methods.
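The merging rule quoted above can be sketched as follows. This is a minimal illustration: the region graph, the mean-color representation, and all region names are assumptions, not taken from the paper.

```python
def color_dissimilarity(c1, c2):
    """Euclidean distance between the mean colors of two regions."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

def most_similar_neighbor(colors, adjacency, r):
    """Return the adjacent region of r with the lowest color
    dissimilarity to r, i.e. the region r would merge with."""
    return min(adjacency[r], key=lambda q: color_dissimilarity(colors[r], colors[q]))
```

For example, a region with mean color (10, 0, 0) whose neighbors have means (12, 0, 0) and (50, 0, 0) would merge with the first neighbor.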

An Artificial Visual Attention Model based on Opponent Process Theory for Salient Region Segmentation

  • 정기선; 홍창표; 박동선
    • 전자공학회논문지
    • Vol.51 No.7 / pp.157-168 / 2014
  • In this paper, we propose a new artificial visual attention model that automatically detects and segments salient regions in natural images. The proposed model is grounded in human biological visual perception, and its main features are as follows. First, we propose a new artificial visual attention architecture based on opponent process theory that uses the intensity and color features of an image, and we design an entropy filter that considers the information content of the intensity and color feature channels to perceive salient regions. The entropy filter can detect and segment salient regions with high accuracy and precision. Finally, an adaptive combination method is also proposed to construct the final saliency map efficiently. This method evaluates the intensity and color conspicuity maps produced by each perception model and combines them using weights derived from the evaluation scores. Measuring AUC via ROC analysis of the saliency maps, existing state-of-the-art models averaged 0.7824, whereas the proposed model reached an AUC of 0.9256, an improvement of about 15%. For salient region segmentation, the F-beta score of existing state-of-the-art models was 0.5178, while the proposed model achieved 0.7325, an improvement of about 22%.
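As a rough illustration of the two ingredients described above, an entropy score over a feature channel and a score-weighted combination of conspicuity maps might look like this. The bin count, the scoring inputs, and the flat map layout are all assumptions, not details from the paper.

```python
import math
from collections import Counter

def entropy_score(values, bins=8):
    """Shannon entropy (bits) of a quantized feature channel; the bin
    count of 8 is an illustrative choice."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width on flat channels
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def combine_maps(maps, scores):
    """Adaptive combination: weight each conspicuity map (a flat list
    of pixel saliencies) by its normalized evaluation score."""
    total = sum(scores)
    weights = [s / total for s in scores]
    return [sum(w * m[i] for w, m in zip(weights, maps))
            for i in range(len(maps[0]))]
```

A flat channel scores zero entropy, while a channel split evenly between two levels scores one bit, so regions with richer feature variation are favored.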

Automatic Pronunciation Assessment of English Produced by Korean Learners Using Articulatory Features

  • 류혁수; 정민화
    • 말소리와 음성과학
    • Vol.8 No.4 / pp.103-113 / 2016
  • This paper proposes articulatory features as novel predictors for the automatic pronunciation assessment of English produced by Korean learners. Based on distinctive feature theory, in which phonemes are represented as a set of articulatory/phonetic properties, we propose articulatory Goodness-Of-Pronunciation (aGOP) features in terms of the corresponding articulatory attributes, such as nasal, sonorant, and anterior. An English speech corpus spoken by Korean learners is used in the assessment modeling. In our system, learners' speech is force-aligned and recognized using acoustic and pronunciation models derived from the WSJ corpus (native North American speech) and the CMU pronouncing dictionary, respectively. To compute aGOP features, articulatory models are trained for the corresponding articulatory attributes. In addition to the proposed features, various features divided into four categories (RATE, SEGMENT, SILENCE, and GOP) are applied as a baseline. To enhance the assessment modeling performance and investigate the weights of the salient features, relevant features are extracted using Best Subset Selection (BSS). The results show that the proposed model using aGOP features outperforms the baseline. In addition, analysis of the relevant features extracted by BSS reveals that the selected aGOP features capture the salient variations of Korean learners of English. The results are expected to be effective for automatic pronunciation error detection as well.
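A minimal sketch of a GOP-style score, with phone units replaced by articulatory-attribute models as the abstract describes: the intended unit's log-likelihood minus the best competing unit's, normalized by frame count. The attribute names and log-likelihood values below are illustrative placeholders, not outputs of the paper's models.

```python
def agop(log_likelihoods, target, n_frames=1):
    """GOP-style score for one articulatory attribute: log-likelihood
    of the intended attribute minus the maximum over all attribute
    models, per frame. Zero means the intended attribute wins; more
    negative means a worse pronunciation of that attribute."""
    best = max(log_likelihoods.values())
    return (log_likelihoods[target] - best) / n_frames
```

A correctly realized attribute thus scores 0, and the score falls as a competing attribute model fits the audio better.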

Detection of Music Mood for Context-aware Music Recommendation

  • 이종인; 여동규; 김병만
    • 정보처리학회논문지B
    • Vol.17B No.4 / pp.263-274 / 2010
  • To provide a context-aware music recommendation service, it is first necessary to identify the mood of the music a user prefers in a given situation or context. Most previous work on music mood detection selects a representative segment manually and classifies the mood using the features of that segment. While such approaches achieve good classification performance, they are hard to apply to new music because they require expert intervention. Moreover, since the mood changes as a piece progresses, detecting a single representative mood is even more difficult. In this paper, we propose a new method that determines music mood automatically to address these problems. First, the whole piece is divided into segments with similar characteristics through structural analysis, and the mood of each segment is classified. When identifying the mood of each segment, we model each listener's subjective mood tendencies with a regression method based on Thayer's two-dimensional mood model. Experimental results show that the proposed method achieves over 80% accuracy.
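The per-segment regression onto Thayer's two-dimensional (arousal-valence) mood model can be illustrated with a one-feature least-squares fit plus a quadrant lookup. The choice of feature (e.g. tempo predicting arousal) and the quadrant labels are assumptions for illustration, not details from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature, e.g. tempo -> arousal.
    Returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def mood_quadrant(arousal, valence):
    """Map a point on Thayer's 2-D model to one of four mood labels;
    the label names are illustrative."""
    if arousal >= 0:
        return "exuberant" if valence >= 0 else "anxious"
    return "contented" if valence >= 0 else "depressed"
```

Fitting one such regression per listener is one way the subjective, individual mood tendencies mentioned above could be captured.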

Automatic Person Identification using Multiple Cues

  • Swangpol, Danuwat; Chalidabhongse, Thanarat
    • 제어로봇시스템학회: Conference Proceedings
    • ICCAS 2005 / pp.1202-1205 / 2005
  • This paper describes a method for vision-based person identification that can detect, track, and recognize a person in video using multiple cues: height and clothing colors. The method does not require a constrained target pose or a fully frontal face image to identify the person. First, the system, which is connected to a pan-tilt-zoom camera, detects the target using motion detection and a human cardboard model. The system keeps tracking the moving target while determining whether it is a human and, if so, which of the registered persons in the database it is. To segment the moving target from the background scene, we employ a background subtraction technique and some spatial filtering. Once the target is segmented, we align it with the generic human cardboard model to verify whether the detected target is a human. If it is, the cardboard model is also used to segment the body parts and obtain salient features such as the head, torso, and legs. The whole-body silhouette is also analyzed to obtain shape information such as height and slimness. We then use these multiple cues (at present, shirt color, trousers color, and body height) to recognize the target using a supervised self-organization process. We ran a preliminary test of the system on a set of 5 subjects with multiple sets of clothes. The recognition rate is 100% if the person is wearing clothes that were learned before; when a person wears new clothes, the system fails to identify them, which shows that height alone is not enough to classify persons. We plan to extend the work by adding more cues such as skin color, and by adding face recognition that exploits the camera's zoom capability to obtain a high-resolution view of the face; we will then evaluate the system with more subjects.
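The background-subtraction step described above can be sketched per pixel, followed by a simple silhouette-height cue. The grayscale frame representation and the threshold value are illustrative assumptions, not the paper's actual parameters.

```python
def background_subtract(frame, background, threshold=25):
    """Foreground mask via per-pixel absolute difference against a
    background model; pixels differing by more than the threshold are
    marked foreground."""
    return [[abs(f - b) > threshold for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]

def silhouette_height(mask):
    """Height cue in pixels: number of rows containing at least one
    foreground pixel."""
    return sum(any(row) for row in mask)
```

In practice the raw mask would be cleaned with spatial filtering, as the abstract notes, before the cardboard-model alignment.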
