• Title/Summary/Keyword: Multi-features similarity

A study on searching image by cluster indexing and sequential I/O (연속적 I/O와 클러스터 인덱싱 구조를 이용한 이미지 데이타 검색 연구)

  • Kim, Jin-Ok;Hwang, Dae-Joon
    • The KIPS Transactions:PartD / v.9D no.5 / pp.779-788 / 2002
  • Searching multimedia data such as images, video, and audio poses many technical difficulties because such data is massive and more complex than simple text. Similarity retrieval has been studied as a way to search multimedia data: basic features are extracted automatically and the search is then run over those features, because exact matching does not suit the feature matrices of multimedia data. This paper proposes data clustering and cluster indexing as a fast similarity-retrieval method for multimedia data. The approach clusters similar images on adjacent disk cylinders and then builds indexes to access the clusters. To minimize search cost, hashing is used to index the clusters. In addition, to reduce I/O time, the proposed search takes just one I/O to look up the location of the cluster containing similar objects and one sequential file I/O to read that cluster. The proposed scheme addresses the multi-dimensionality problem through clustering and cluster indexing, and achieves higher search efficiency than content-based image retrieval that uses only a clustering structure or only an indexing structure.
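The lookup pattern this abstract describes (one hash probe for the cluster location, then one sequential read of that cluster) can be sketched in memory as follows. This is only an illustration under stated assumptions: the grid-cell quantization in `cluster_key`, the `CELL` width, and the `build`/`search` helpers are hypothetical and are not taken from the paper, and an `io.BytesIO` buffer stands in for the contiguous disk layout.

```python
import io
import struct

CELL = 0.5  # hypothetical grid cell width used to bucket feature vectors


def cluster_key(vec):
    """Quantize a feature vector to a grid cell; vectors in one cell form a cluster."""
    return tuple(int(x // CELL) for x in vec)


def build(vectors):
    """Write every cluster contiguously and return (file, hash index)."""
    clusters = {}
    for v in vectors:
        clusters.setdefault(cluster_key(v), []).append(v)
    f = io.BytesIO()  # stands in for a file laid out cluster by cluster
    index = {}        # cluster key -> (byte offset, number of vectors)
    for key, members in clusters.items():
        index[key] = (f.tell(), len(members))
        for v in members:
            f.write(struct.pack(f"{len(v)}d", *v))
    return f, index


def search(f, index, query):
    """One hash lookup for the cluster location, then one sequential read."""
    entry = index.get(cluster_key(query))
    if entry is None:
        return []
    offset, count = entry
    dim = len(query)
    f.seek(offset)
    raw = f.read(count * dim * 8)  # 8 bytes per double
    vals = struct.unpack(f"{count * dim}d", raw)
    members = [vals[i * dim:(i + 1) * dim] for i in range(count)]
    # Rank the cluster members by squared Euclidean distance to the query.
    return sorted(members, key=lambda m: sum((a - b) ** 2 for a, b in zip(m, query)))
```

A query only ever touches the one cluster its key maps to, so similar vectors that fall just across a cell boundary are missed; a real system would need a better clustering rule than this fixed grid.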

A Novel Video Image Text Detection Method

  • Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.3 / pp.941-953 / 2012
  • A novel and universal method for detecting text in video images is proposed, following a coarse-to-fine strategy. First, spectral clustering (SC) is applied to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the available information, the SC method employs a multi-parameter kernel function that combines feature similarity information with spatial adjacency information. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show encouraging performance for the proposed algorithm and classification features.
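The abstract does not give the form of the multi-parameter kernel, so the following is only one plausible reading: a product of two Gaussian kernels, one over feature distance and one over spatial distance, each with its own bandwidth. The function names and the default bandwidths `sigma_f` and `sigma_s` are assumptions made for illustration.

```python
import math


def combined_affinity(feat_a, feat_b, pos_a, pos_b, sigma_f=1.0, sigma_s=5.0):
    """Affinity that is high only when two regions are similar AND spatially close.

    sigma_f and sigma_s are hypothetical bandwidths for the feature and
    spatial terms; they are the "multiple parameters" of this sketch.
    """
    d_feat = sum((a - b) ** 2 for a, b in zip(feat_a, feat_b))
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_a, pos_b))
    return math.exp(-d_feat / (2 * sigma_f ** 2)) * math.exp(-d_pos / (2 * sigma_s ** 2))


def affinity_matrix(feats, positions, sigma_f=1.0, sigma_s=5.0):
    """Dense affinity matrix that a spectral clustering step could consume."""
    n = len(feats)
    return [
        [combined_affinity(feats[i], feats[j], positions[i], positions[j], sigma_f, sigma_s)
         for j in range(n)]
        for i in range(n)
    ]
```

The resulting matrix is symmetric with ones on the diagonal, which is the usual input to a spectral clustering routine that accepts a precomputed affinity.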

A Novel Video Image Text Detection Method

  • Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.4 / pp.1140-1152 / 2012
  • A novel and universal method for detecting text in video images is proposed, following a coarse-to-fine strategy. First, spectral clustering (SC) is applied to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the available information, the SC method employs a multi-parameter kernel function that combines feature similarity information with spatial adjacency information. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show encouraging performance for the proposed algorithm and classification features.

Multi-Scaling Models of TCP/IP and Sub-Frame VBR Video Traffic

  • Erramilli, Ashok;Narayan, Onuttom;Neidhardt, Arnold;Saniee, Iraj
    • Journal of Communications and Networks / v.3 no.4 / pp.383-395 / 2001
  • Recent measurement and simulation studies have revealed that wide area network traffic displays complex statistical characteristics (possibly multifractal scaling) on fine timescales, in addition to the well-known property of self-similar scaling on coarser timescales. In this paper we investigate the performance and network engineering significance of these fine timescale features using measured TCP and MPEG2 video traces, queueing simulations, and analytical arguments. We demonstrate that the fine timescale features can affect performance substantially at low and intermediate utilizations, while the longer timescale self-similarity is important at intermediate and high utilizations. We relate the fine timescale structure in the measured TCP traces to flow controls, and show that UDP traffic, which is not flow controlled, lacks such fine timescale structure. Likewise, we relate the fine timescale structure in MPEG2 video traces to sub-frame encoding. We show that it is possible to construct a relatively parsimonious multifractal cascade model of fine timescale features that matches the queueing performance of both the TCP and video traces. We outline an analytical method to estimate performance for traffic that is self-similar on coarse timescales and multifractal on fine timescales, and show that the engineering problem of setting safe operating points for planning or admission controls can be significantly influenced by fine timescale fluctuations in network traffic. The work reported here can be used to model the relevant characteristics of wide area traffic across a full range of engineering timescales, and can be the basis of more accurate network performance analysis and engineering.

Application of Combustion in Porous Inert Medium to Thermophotovoltaic Generation of Electricity and Excess Enthalpy Combustion Similarity to both Single and Multi-channels (다공체 내 연소의 열광전 발전에의 적용과 단일, 다중채널 및 다공체 내 초과 엔탈피 연소의 상사성)

  • Lee, Dae Keun
    • 한국연소학회:학술대회논문집 / 2012.11a / pp.171-174 / 2012
  • Thermophotovoltaics is a direct energy conversion technology that converts thermal energy to electric (voltaic) energy via photon radiation, without any thermodynamic cycle. In general, it is accomplished by immersing a solid body in a high-temperature heat source (e.g. a combustion field) to achieve high-intensity irradiation, and by receiving the resulting radiation on photovoltaic cells. This paper discusses the advantages of combustion in a porous inert medium for thermophotovoltaics in view of its excess enthalpy features. In addition, the similarities between flame behavior in a porous inert medium and flame behavior in both single and multi-channels are described.

Gated Recurrent Unit Architecture for Context-Aware Recommendations with improved Similarity Measures

  • Kala, K.U.;Nandhini, M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.2 / pp.538-561 / 2020
  • Recommender Systems (RecSys) play a major role in e-commerce by recommending products that each user may like, thereby improving business outcomes. Although many types of RecSys exist in the research literature, state-of-the-art RecSys focus on finding user similarity through sequence analysis (e.g. purchase history, movie-watching history) and prediction techniques such as Recurrent Neural Networks in deep learning. That is, recommendation is treated as a sequence prediction problem. However, evaluating similarity among customers is challenging when temporal aspects, context, and multi-component ratings of the item records in customer sequences are considered. To address this issue, we propose a deep learning based model that learns customer similarity directly from sequence-to-sequence similarity as well as item-to-item similarity, considering all features of the items, contexts, and rating components, using the Dynamic Temporal Warping (DTW) distance measure for dynamic temporal matching and a 2D-GRU (Two Dimensional-Gated Recurrent Unit) architecture. This overcomes the limitation of non-linearity in the time dimension when measuring similarity, and finds patterns more accurately and quickly from temporal and spatial contexts. Experiments on the real-world movie data set LDOS-CoMoDa demonstrate the efficacy and promising utility of the proposed personalized RecSys architecture.
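The DTW distance named in this abstract is commonly computed with a dynamic-programming table. The sketch below is the textbook recurrence, not the paper's model: the function name `dtw_distance` and the default absolute-difference cost are assumptions, and the 2D-GRU component is not represented at all.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic-programming DTW between two sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]; each step
    may advance in a, in b, or in both, which is what lets sequences that
    differ in pacing still align.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

For customer sequences, `dist` could be replaced by any item-to-item dissimilarity, for example one built from item features, context, and rating components; the table takes O(n·m) time and memory.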

Evaluation of shape similarity for 3D models (3차원 모델을 위한 형상 유사성 평가)

  • Kim, Jeong-Sik;Choi, Soo-Mi
    • The KIPS Transactions:PartA / v.10A no.4 / pp.357-368 / 2003
  • Evaluation of shape similarity for 3D models is essential in many areas, including medicine, mechanical engineering, and molecular biology. Moreover, as 3D models have become common on the Web, much research has been done on the classification and retrieval of 3D models. In this paper, we describe methods for 3D shape representation and the major concepts of similarity evaluation, and analyze the key features of recent shape comparison research after classifying it into four categories: multi-resolution, topology, 2D image, and statistics based methods. In addition, we evaluate the performance of the reviewed methods against selected criteria such as uniqueness, robustness, invariance, multi-resolution, efficiency, and comparison scope. Multi-resolution based methods decreased computation time for comparison but increased preprocessing time. Methods using geometric and topological information were able to compare a wider variety of models and were robust in partial shape comparison. 2D image based methods incurred overheads in time and space complexity. Statistics based methods allowed shape comparison without pose normalization and showed robustness against affine transformations and noise.

Acoustic Signal based Optimal Route Selection Problem: Performance Comparison of Multi-Attribute Decision Making methods

  • Borkar, Prashant;Sarode, M.V.;Malik, L. G.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.2 / pp.647-669 / 2016
  • Considering multiple attributes in decision making, including user preference, increases the complexity of the route selection process. Various approaches have been proposed to solve the optimal route selection problem. In this paper, multi-attribute decision making (MADM) algorithms, namely Simple Additive Weighting (SAW), the Weighted Product Method (WPM), the Analytic Hierarchy Process (AHP), and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), are proposed for acoustic signature based optimal route selection, to provide users with better quality of service. The traffic density state (very low, low, below medium, medium, above medium, high, or very high) on a road segment, determined by the occurrence and mixture weightings of traffic noise signals (tyre, engine, air turbulence, exhaust, honks, etc.), is considered as one of the attributes in the decision-making process. The short-term spectral envelope features of the cumulative acoustic signals are extracted using Mel-Frequency Cepstral Coefficients (MFCC), and an Adaptive Neuro-Fuzzy Classifier (ANFC) is used to model the seven traffic density states. The simple point method and AHP are used to calculate the weights of the decision parameters. Numerical results show that WPM, AHP, and TOPSIS provide similar performance.
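Three of the MADM methods listed in this abstract (SAW, WPM, and TOPSIS) have short textbook formulations, sketched below. This is a generic illustration, not the paper's setup: the function names, the `benefit` flags, and the choice of linear normalization for SAW and WPM are assumptions, weights are taken as given rather than derived by AHP, and all attribute values must be strictly positive.

```python
import math


def normalize(matrix, benefit):
    """Linear normalization: x / max for benefit criteria, min / x for cost criteria."""
    cols = list(zip(*matrix))
    return [
        [x / max(c) if b else min(c) / x for x, c, b in zip(row, cols, benefit)]
        for row in matrix
    ]


def saw(matrix, weights, benefit):
    """Simple Additive Weighting: weighted sum of normalized attributes."""
    return [sum(w * x for w, x in zip(weights, row)) for row in normalize(matrix, benefit)]


def wpm(matrix, weights, benefit):
    """Weighted Product Method: product of normalized attributes raised to their weights."""
    return [math.prod(x ** w for w, x in zip(weights, row)) for row in normalize(matrix, benefit)]


def topsis(matrix, weights, benefit):
    """TOPSIS: relative closeness to the ideal solution (1.0 is best)."""
    cols = list(zip(*matrix))
    norms = [math.sqrt(sum(x * x for x in c)) for c in cols]
    v = [[w * x / n for x, w, n in zip(row, weights, norms)] for row in matrix]
    vcols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(vcols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(vcols, benefit)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))  # undefined if all alternatives are identical
    return scores
```

In each method the alternative with the highest score is preferred. For route selection, rows would be candidate routes and columns attributes such as travel time or traffic density state, both of which would be flagged as cost criteria (`benefit=False`).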

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models offering various voice characteristics and personalized speech are widely used in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voices by generating speech from utterances of unseen target speakers. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate the naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.

Ground Target Classification Algorithm based on Multi-Sensor Images (다중센서 영상 기반의 지상 표적 분류 알고리즘)

  • Lee, Eun-Young;Gu, Eun-Hye;Lee, Hee-Yul;Cho, Woong-Ho;Park, Kil-Houm
    • Journal of Korea Multimedia Society / v.15 no.2 / pp.195-203 / 2012
  • This paper proposes a ground target classification algorithm based on decision fusion, together with a feature extraction method, using multi-sensor images. The decisions obtained from the individual classifiers are fused by a weighted voting method to improve the target recognition rate. To classify the targets in the individual sensor images, features robust to scale and rotation are extracted using the difference of brightness of CM images obtained from the CCD image, and the boundary similarity and the width ratio between the vehicle body and turret of the target in the FLIR image. Finally, we verify the performance of the proposed ground target classification algorithm and feature extraction method through experiments.
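Decision fusion by weighted voting, as mentioned in this abstract, can be sketched in a few lines. The abstract does not say how the weights are chosen, so here they are simply passed in; the function name `weighted_vote` and the example labels are hypothetical.

```python
from collections import defaultdict


def weighted_vote(decisions, weights):
    """Fuse per-classifier labels: each label accumulates its classifiers' weights,
    and the label with the largest total wins."""
    totals = defaultdict(float)
    for label, weight in zip(decisions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

For example, with one classifier per sensor image, a confident vote for "tank" at weight 0.9 outvotes two votes for "truck" at weights 0.3 and 0.4, whereas at weight 0.5 it would not. Ties are resolved in favor of the label seen first.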