• Title/Summary/Keyword: Feature representation

Automatic Tagging Scheme for Plural Faces (다중 얼굴 태깅 자동화)

  • Lee, Chung-Yeon;Lee, Jae-Dong;Chin, Seong-Ah
    • Journal of the Institute of Electronics Engineers of Korea CI, v.47 no.3, pp.11-21, 2010
  • As the volume of information and the number of web pages grow rapidly, research aimed at improving retrieval performance and reflecting users' retrieval needs has been actively conducted in recent years. One alternative approach is a tagging system, which lets users attach metadata, called tags, to resources such as text, pictures, and videos, and makes Internet resources easier to retrieve. Tags, which are similar to keywords, play a critical role in maintaining target pages. However, annotating tags still requires time-consuming manual labor, and overuse of tagging can itself become a hindrance. In this paper, we present an automatic tagging scheme that addresses these drawbacks and inconveniences of current tagging systems. To realize the approach, we propose a face recognition-based tagging system for SNS, built from a face area detection procedure, linear-based classification, and a boosting algorithm. The proposed tagging service can help users make more efficient use of SNS. Experimental results and a performance analysis are also presented.

Linguistic Theory in India and Panini (인도의 언어이론과 파니니)

  • 김형엽
    • Lingua Humanitatis, v.1 no.2, pp.123-139, 2001
  • In the history of world linguistics, the scholars of India can be regarded as representative linguists who laid the cornerstone for the academic development of the field. Without examining the Indian linguistic theories devised and developed in the past, it would be almost impossible to account for the origins of descriptive and historical linguistics. These linguistic trends became full-fledged in the 19th and 20th centuries and are still adopted by many researchers to analyze newly discovered languages and to train students who are just beginning linguistic studies. In this paper I show how far the influence of Indian linguistics has colored the historical course of linguistics. In particular, through an analysis of Panini's grammar I demonstrate the close relationship between Indian linguistic theory and generative grammar, currently the most active theory. The methods Panini used to formulate rules such as sutras encode a great deal of information, which can also be found in the rules postulated in generative grammar. One feature common to both theories is simplicity of rule representation. In generative grammar, a rule must be stated without redundancy. When several sounds such as p, b, and m undergo the same phonological change involving the lips (labial, in linguistic terms), separate rules need not be given for each sound. It is better to group the three sounds in a single rule by their shared phonetic feature 'labial'. In Panini's grammar, too, the form of a rule was decided on the basis of simplicity. For example, sutra 6.1.77 states the phonological connection between the vowels i, u, ṛ, ḷ and the semi-vowels y, v, r, l, yet it does not require four individual rules. Instead, a single rule covering both the vowels and the semi-vowels is proposed, and linguistically the rule makes clear that the simpler the rules are, the better they reflect the efficiency of human language acquisition. Although the systems introduced in Panini's grammar are somewhat removed from language education itself, we cannot deny that the grammar marks a turning point in the development of linguistics. It is essential to reconsider the grammar from the viewpoint of modern linguistic theories in order to understand their roots and trunk more thoroughly. Doing so will also help us predict the direction linguistics will take in the future.
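
The economy of a single class-based rule can be illustrated with a small sketch. The Python below is a simplified illustration, not a rendering of Panini's formalism: it treats sutra 6.1.77 as one mapping from the vowels i, u, ṛ, ḷ to the semi-vowels y, v, r, l, applied when a different vowel follows. The function name `apply_6_1_77`, the reduced vowel inventory, and the example words are assumptions chosen for illustration; long vowels are ignored, and two identical vowels meeting are simply concatenated.

```python
# One rule for the whole class instead of four separate rules.
# Simplified sketch: short vowels only, transliterated input.
VOWEL_TO_SEMIVOWEL = {"i": "y", "u": "v", "ṛ": "r", "ḷ": "l"}
VOWELS = set("aiuṛḷeo")  # assumed, reduced vowel inventory


def apply_6_1_77(first: str, second: str) -> str:
    """Join two words, replacing a final i/u/ṛ/ḷ with its semi-vowel
    when the next word begins with a different vowel."""
    if first and second:
        last, nxt = first[-1], second[0]
        if last in VOWEL_TO_SEMIVOWEL and nxt in VOWELS and nxt != last:
            return first[:-1] + VOWEL_TO_SEMIVOWEL[last] + second
    return first + second


print(apply_6_1_77("iti", "api"))    # ityapi
print(apply_6_1_77("madhu", "ari"))  # madhvari
```

The same function handles all four vowel and semi-vowel pairs because the pairing lives in one table, which is the kind of redundancy-free statement the abstract attributes to both grammars.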

Visualization of Korean Speech Based on the Distance of Acoustic Features (음성특징의 거리에 기반한 한국어 발음의 시각화)

  • Pok, Gou-Chol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.13 no.3, pp.197-205, 2020
  • In Korean, the pronunciation of phoneme units such as vowels and consonants is fixed, and the pronunciation associated with a written symbol does not change, so foreign learners can approach the language rather easily. However, when words, phrases, or sentences are pronounced, the pronunciation changes in varied and complex ways at syllable boundaries, and the association between notation and pronunciation no longer holds. Consequently, it is very difficult for foreign learners to master standard Korean pronunciation. Despite these difficulties, systematic analysis of pronunciation errors in Korean words is believed to be possible, because the relationship between Korean notation and pronunciation can be described as a set of firm rules without exceptions, unlike in other languages such as English. In this paper, we propose a visualization framework that displays the differences between standard and erroneous pronunciations as quantitative measures on a computer screen. Previous studies show only color representations and 3D graphics of speech properties, or animated views of the changing shapes of the lips and oral cavity. Moreover, the features used in their analysis are only point data, such as the average over a speech range. In this study, we propose a method that uses the time-series data directly instead of summarized or distorted data. The method is realized with a deep learning-based technique that combines a self-organizing map, a variational autoencoder, and a Markov model, and it achieves better performance than the method using point-based data.

Application of GIS-based Probabilistic Empirical and Parametric Models for Landslide Susceptibility Analysis (산사태 취약성 분석을 위한 GIS 기반 확률론적 추정 모델과 모수적 모델의 적용)

  • Park, No-Wook;Chi, Kwang-Hoon;Chung, Chang-Jo F.;Kwon, Byung-Doo
    • Economic and Environmental Geology, v.38 no.1, pp.45-55, 2005
  • Traditional GIS-based probabilistic spatial data integration models for landslide susceptibility analysis have lacked a theoretical basis and effective methods for integrating different types of spatial data, such as categorical and continuous data. This paper applies two spatial data integration models, non-parametric empirical estimation and parametric predictive discriminant analysis, which can directly use the original continuous data within a likelihood ratio framework. Similarity rates and a prediction rate curve are computed to compare the two models quantitatively. To illustrate the proposed models, two case studies from the Jangheung and Boeun areas were carried out and analyzed. In the Jangheung case study, the two models showed similar prediction capabilities. In the Boeun area, by contrast, the parametric predictive discriminant analysis model showed better prediction capability than the non-parametric empirical estimation model. In conclusion, the proposed models can effectively integrate continuous data for landslide susceptibility analysis; because each model represents continuous data in a distinctive way, more case studies should be carried out to support these results.
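
The likelihood ratio idea can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the authors' models: it compares the density of a continuous attribute (here, slope angle) at landslide locations against its density at non-landslide locations, with a Gaussian fitted to each group standing in for a parametric estimate. The function names and all sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def likelihood_ratio(x, landslide_values, stable_values):
    """Ratio f(x | landslide) / f(x | no landslide), with each density
    approximated by a Gaussian fitted to the sample (an assumption)."""
    numerator = gaussian_pdf(x, mean(landslide_values), stdev(landslide_values))
    denominator = gaussian_pdf(x, mean(stable_values), stdev(stable_values))
    return numerator / denominator


# Made-up slope angles (degrees) at past landslide and stable locations.
slope_landslide = [28.0, 32.0, 35.0, 30.0, 33.0]
slope_stable = [10.0, 15.0, 12.0, 18.0, 14.0]

for slope in (13.0, 31.0):
    ratio = likelihood_ratio(slope, slope_landslide, slope_stable)
    print(slope, "more susceptible" if ratio > 1 else "less susceptible")
```

A ratio above 1 marks attribute values that are more typical of landslide locations than of stable ones. A non-parametric variant would replace `gaussian_pdf` with an empirical density estimate while keeping the same ratio.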

Evaluation and Intercomparisons of the Estimated TOVS Precipitable Waters for the Tropical Plume (Tropical Plume 에 대한 TOVS 추정 가강수량의 평가와 상호비교)

  • 정효상;신동인
    • Korean Journal of Remote Sensing, v.9 no.2, pp.51-69, 1993
  • Precipitable water (PW) is retrieved over the tropical and subtropical Pacific Ocean from TOVS infrared and microwave channel brightness temperatures and OLR observations by means of stepwise linear regression. The retrieved TOVS PW fields, PW$_{sfc}$ (71.1 % of the variance and a standard error of 0.62 g cm$^{-2}$ over the surface) and PW$_{700-500}$ (71.7 % and 0.17 g cm$^{-2}$ over the 700-500 hPa layer), revealed more of the evolving synoptic signals over the tropical and subtropical Pacific Ocean. PW$_{sfc}$ does not show the tropical plume (TP) feature clearly, because it represents low PW for high-level clouds not associated with deep convection. It is therefore difficult to trace the TP on the PW$_{sfc}$ field unless supplementary information is provided. The ECMWF analysis has a general tendency to dry the subtropics and moisten the ITCZ (Intertropical Convergence Zone) and SPCZ (South Pacific Convergence Zone). Although the ECMWF analysis is fairly successful in capturing mean patterns, it is unsuccessful in following active synoptic signals such as a tropical plume. Similarly, SMMR-PW does not represent well the TP, which consists of high- and middle-level clouds, while PW$_{sfc}$ underestimates the moisture of the TP and does not depict a significant TP signal. In the PW field derived from microwave observations, the TP cannot be recognized well. Furthermore, the signature of PW$_{sfc}$ differed from OLR for the TP, which implies the presence of thin high- and middle-layer clouds, but agreed more closely for deep and active convection areas containing thick middle- and lower-layer clouds, although OLR represented the cloudiness in the tropics well. In synoptically active regions, it differed from the OLR analysis, primarily because of actual differences in water vapor and cloud features.

Local Prominent Directional Pattern for Gender Recognition of Facial Photographs and Sketches (Local Prominent Directional Pattern을 이용한 얼굴 사진과 스케치 영상 성별인식 방법)

  • Makhmudkhujaev, Farkhod;Chae, Oksam
    • Convergence Security Journal, v.19 no.2, pp.91-104, 2019
  • In this paper, we present a novel local descriptor, Local Prominent Directional Pattern (LPDP), for representing facial images in gender recognition. To achieve a clearly discriminative representation of local shape, the presented method encodes a target pixel with the prominent directional variations in its local structure, obtained by analyzing the statistics contained in a histogram of those directional variations. The use of this statistical information comes from the observation that a local neighborhood with an edge passing through it exhibits similar gradient directions, so the prominent accumulations of those gradient directions provide a solid basis for representing the shape of the local structure. Unlike existing methods, which use only the gradient direction of the target pixel, our coding scheme selects prominent edge directions accumulated from more samples (e.g., the surrounding neighboring pixels), which in turn minimizes the effect of noise by suppressing the noisy accumulations of single or few samples. In this way, the presented encoding strategy captures the shape of local structures more discriminatively while remaining robust to subtle changes such as local noise. We conduct extensive experiments on gender recognition datasets that contain a wide range of challenges, such as illumination, expression, age, and pose variations, as well as sketch images, and observe that the LPDP descriptor outperforms existing local descriptors.
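
The idea of voting over neighboring gradient directions can be sketched in a few lines. The Python below is a loose illustration of that idea only, not the published LPDP coding scheme: it accumulates magnitude-weighted gradient directions from the target pixel and its eight neighbors into a histogram and returns the two most prominent bins. The function name `prominent_directions`, the 5x5 patch layout, the central-difference gradients, and the choice of 8 bins are all assumptions.

```python
import math


def prominent_directions(patch, bins=8):
    """Return the two most prominent gradient-direction bins around the
    center pixel of a 5x5 intensity patch (list of 5 lists of 5 numbers)."""
    hist = [0.0] * bins
    bin_width = 2 * math.pi / bins
    # Visit the center pixel and its 8 neighbors (rows/cols 1..3).
    for r in range(1, 4):
        for c in range(1, 4):
            gx = patch[r][c + 1] - patch[r][c - 1]  # horizontal central difference
            gy = patch[r + 1][c] - patch[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat pixel: no direction to vote for
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(round(angle / bin_width)) % bins] += magnitude
    ranked = sorted(range(bins), key=lambda b: hist[b], reverse=True)
    return ranked[0], ranked[1], hist


# A vertical step edge: dark on the left, bright on the right.
edge_patch = [[0, 0, 10, 10, 10] for _ in range(5)]
primary, secondary, hist = prominent_directions(edge_patch)
print(primary, hist[primary])
```

Because every pixel along the edge votes for the same bin, a single noisy pixel would have to outweigh the accumulated votes of its neighbors to change the primary direction, which is the robustness argument made in the abstract.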

Research on the Spatial Expression Characteristics of Illustration in Picture Books (그림책 속 일러스트레이션의 공간 표현 특징 연구)

  • Han, YongGang;Kim, KieSu
    • The Journal of the Korea Contents Association, v.21 no.3, pp.131-142, 2021
  • This research concerns the design of pictures in picture books, in which the spatial representation of illustrations serves an important purpose. The various texts, pictures, and spaces in a picture require a range of editing skills from the creator so that the spatial arrangement works smoothly. In this paper, the characteristics of spatial expression in picture book design are derived by analyzing several examples of paintings and studies of classic picture books. First, picture and text are fused: both convey spatial information together as elements of the screen. Second, the space design of a picture book is coherent: the story and content must connect smoothly as the book is read. Third, when expressing a space, the creator should draw on the complementary strengths and weaknesses of abstract and conceived spatial expression as needed. Fourth, picture book spatial expression is symbolic: many symbolic expression techniques are applied according to semiotic principles, which greatly improves the cognitive efficiency of reading picture books. Fifth, the spatial expression of an excellent picture book has engaging elements and rich design means, and conveys screen content and screen format to readers in an interesting way. This research suggests that designers and artists should work within a spatial framework when designing picture books, which greatly improves the efficiency of the creation process while also providing a reader-centered, visually interesting experience.

A Thoracic Spine Segmentation Technique for Automatic Extraction of VHS and Cobb Angle from X-ray Images (X-ray 영상에서 VHS와 콥 각도 자동 추출을 위한 흉추 분할 기법)

  • Ye-Eun, Lee;Seung-Hwa, Han;Dong-Gyu, Lee;Ho-Joon, Kim
    • KIPS Transactions on Software and Data Engineering, v.12 no.1, pp.51-58, 2023
  • In this paper, we propose an organ segmentation technique for the automatic extraction of medical diagnostic indicators from X-ray images. To calculate diagnostic indicators of heart and spinal disease, such as VHS (vertebral heart scale) and the Cobb angle, the thoracic spine, carina, and heart must be accurately segmented in a chest X-ray image. We adopt a deep neural network model in which a high-resolution representation of the image and lower-resolution feature maps are connected in parallel at each layer. This structure enables relative position information in the image to be reflected effectively in the segmentation process. We show that learning performance can be improved by combining the OCR module, in which pixel information and object information interact in a multi-step process, with a channel attention module, which weights each channel of the network differently. In addition, a data augmentation method is presented to provide robust performance against changes in the position, shape, and size of the subject in the X-ray image. The effectiveness of the proposed method was evaluated in an experiment using 145 human chest X-ray images and 118 animal X-ray images.
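
Once the vertebrae are segmented, the Cobb angle itself reduces to simple geometry. The Python below is a minimal sketch, assuming the endplate lines of the two end vertebrae have already been extracted as pairs of image points; that extraction step, the function names, and the coordinates are hypothetical and are not taken from the paper.

```python
import math


def line_angle_deg(p1, p2):
    """Orientation of the line through p1 and p2, in degrees."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))


def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle between two endplate lines, each given as ((x1, y1), (x2, y2)).
    Lines are undirected, so the result is folded into [0, 90] degrees."""
    diff = abs(line_angle_deg(*upper_endplate) - line_angle_deg(*lower_endplate)) % 180.0
    return min(diff, 180.0 - diff)


# Made-up endplate endpoints in pixel coordinates.
upper = ((100.0, 200.0), (180.0, 200.0))  # horizontal endplate
lower = ((100.0, 400.0), (180.0, 480.0))  # endplate tilted by 45 degrees
print(round(cobb_angle_deg(upper, lower), 1))  # 45.0
```

The accuracy of the result depends entirely on how well the endplate points follow the segmented vertebra boundaries, which is why the abstract stresses accurate segmentation.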

Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

  • Bogyung Park;Somin Park;Hyunki Hong
    • KIPS Transactions on Software and Data Engineering, v.12 no.7, pp.303-314, 2023
  • Voice conversion, a technology that allows an individual's speech data to be regenerated with the acoustic properties (tone, cadence, gender) of another, has countless applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model, which uses one-hot vectors to encode source and target speaker information, this paper extracts feature vectors of target speakers using a pre-trained version of RawNet3. This results in a latent space where voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, the Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. The Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it is based.
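
The limitation of one-hot speaker codes can be seen in a toy example. The Python below is a sketch under stated assumptions and involves no actual StarGAN-VC or RawNet3 code: speaker IDs are hypothetical, and the embedding vectors are made-up stand-ins for what a pre-trained speaker encoder would output.

```python
import math

TRAINING_SPEAKERS = ["spk_a", "spk_b", "spk_c"]  # hypothetical speaker IDs


def one_hot(speaker):
    """One-hot code: defined only for speakers seen during training."""
    if speaker not in TRAINING_SPEAKERS:
        raise KeyError(f"{speaker!r} has no one-hot code")
    return [1.0 if s == speaker else 0.0 for s in TRAINING_SPEAKERS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Stand-ins for speaker-encoder outputs (made-up values).
embedding_seen = [0.9, 0.1, 0.3]
embedding_unseen = [0.8, 0.2, 0.4]

print(one_hot("spk_b"))
print(cosine_similarity(embedding_seen, embedding_unseen))
```

A one-hot code has a fixed slot per training speaker, so an unseen speaker cannot be expressed at all, whereas any voice can be mapped to a point in a continuous embedding space and compared with others.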

Improvement of Face Recognition Algorithm for Residential Area Surveillance System Based on Graph Convolution Network (그래프 컨벌루션 네트워크 기반 주거지역 감시시스템의 얼굴인식 알고리즘 개선)

  • Tan Heyi;Byung-Won Min
    • Journal of Internet of Things and Convergence, v.10 no.2, pp.1-15, 2024
  • The construction of smart communities is a new and important measure for ensuring the security of residential areas. To address the low face recognition accuracy caused by facial features being distorted by surveillance camera angles and other external factors, this paper proposes the following optimization strategies for designing a face recognition network. First, a global graph convolution module is designed to encode facial features as graph nodes, and a multi-scale feature enhancement residual module is designed to extract facial keypoint features in conjunction with the global graph convolution module. Second, the facial keypoints obtained are assembled into a directed graph structure, and graph attention mechanisms are used to enhance the representation power of the graph features. Finally, tensor computations are performed on the graph features of two faces, and the aggregated features are extracted and discriminated by a fully connected layer to determine whether the two individuals' identities are the same. In experiments, the network designed in this paper achieves an AUC of 85.65% for facial keypoint localization on the 300W public dataset and 88.92% on a self-built dataset. For face recognition, the proposed network achieves an accuracy of 83.41% on the IBUG public dataset and 96.74% on a self-built dataset. The experimental results demonstrate that the network achieves high detection and recognition accuracy for faces in surveillance videos.
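
How attention reweights neighboring keypoints on a directed graph can be sketched briefly. The Python below is a simplified illustration, not the paper's network: it scores each incoming neighbor by a plain dot product with the target node's features (a stand-in for a learned scoring function), normalizes the scores with softmax, and aggregates the neighbor features. Node names, feature values, and function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(node, incoming, features):
    """Aggregate the features of nodes with an edge into `node`,
    weighted by softmax-normalized dot-product scores."""
    neighbors = incoming[node]
    scores = [sum(a * b for a, b in zip(features[node], features[n])) for n in neighbors]
    weights = softmax(scores)
    dim = len(features[node])
    aggregated = [sum(w * features[n][k] for w, n in zip(weights, neighbors))
                  for k in range(dim)]
    return weights, aggregated


# Made-up 2-D features for three facial keypoints.
features = {"left_eye": [1.0, 0.0], "right_eye": [0.9, 0.1], "nose_tip": [0.0, 1.0]}
incoming = {"left_eye": ["right_eye", "nose_tip"]}  # directed edges into left_eye

weights, aggregated = attend("left_eye", incoming, features)
print(weights, aggregated)
```

Neighbors whose features align with the target node receive larger weights, so the aggregated feature is dominated by the most relevant keypoints rather than a uniform average.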