• Title/Summary/Keyword: perceptual similarity

A Study on the Visual Elevation Image of Apartment Buildings (아파트 입면계획에서 시각적 디자인 이미지에 관한 연구)

  • 손세욱;구시온
    • Journal of the Korean housing association
    • /
    • v.10 no.2
    • /
    • pp.247-257
    • /
    • 1999
  • This study proposes an evaluation model for forecasting the visual quality of apartment buildings, intended as a tool for developing architectural concepts that correspond to user needs. The study was carried out as a four-step experiment. The first step was to elicit the users' visual evaluation constructs; to this end, 22 adjective phrases applicable to all apartment buildings were extracted. The second step was to analyze the users' visual preferences, measured as the users' psychological quantities for the apartment buildings using the S.D. (Semantic Differential) method. The third step was to analyze the five psychological factors obtained from factor analysis. The perceptual images of the 41 experimental subjects were examined by evaluating and analyzing the factor scores of each subject on each psychological factor. The fourth step was to analyze the similarity of the various characteristics of a building, as reflected in the users' psychological quantities, and how buildings are grouped by it.

Improvement of Perceptual Quality of HEVC by Rate Distortion Optimization Using Frequency Domain Structural Similarity (주파수 도메인의 구조적 유사도를 통한 HEVC 주관적 화질 향상 율-왜곡 최적화)

  • Jung, Sanghyun;Jeon, Byuengwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2017.06a
    • /
    • pp.81-82
    • /
    • 2017
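  • In this paper, the rate-distortion optimization (RDO) of HEVC, which is tuned to maximize PSNR, is modified so that RDO is performed to maximize MS-SSIM instead. Following the way MS-SSIM is derived, a DCT (Discrete Cosine Transform) is applied to the original and to the outputs of four stages of low-pass filtering (LPF), and the Lagrange multiplier (${\lambda}$) is modified according to the ratio of the resulting AC coefficients. The weight of each scale is newly determined from the AC coefficient ratio, the weights derived from MS-SSIM, the LPF characteristics, and related factors; the final ${\lambda}$ weight is then determined, and RDO is performed on that basis. In simulation, the BD-rate of the proposed method relative to the HEVC reference software was 7% for PSNR and -13.2% for MS-SSIM, from which it can be concluded that subjective quality was improved.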
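The role of the Lagrange multiplier in the abstract above can be illustrated with a minimal sketch of Lagrangian mode decision, J = D + λR, in which λ is multiplied by a weight. The names `rd_cost`, `choose_mode`, the `weight` parameter, and the candidate values are hypothetical and chosen only for illustration; the abstract does not give the formula that maps AC coefficient ratios to the weight, so the weight is simply passed in here.

```python
from typing import List, Tuple

# (mode name, distortion, rate in bits); values are made up for illustration.
Candidate = Tuple[str, float, float]


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_mode(candidates: List[Candidate], lam: float, weight: float = 1.0) -> str:
    """Return the mode with the lowest RD cost under a scaled lambda.

    `weight` stands in for a perceptually derived lambda scaling factor
    (an assumption; it is not the weighting scheme of the paper).
    """
    scaled_lam = lam * weight
    best = min(candidates, key=lambda c: rd_cost(c[1], c[2], scaled_lam))
    return best[0]


candidates = [
    ("intra", 100.0, 10.0),
    ("inter", 60.0, 40.0),
    ("skip", 150.0, 1.0),
]

# Unscaled lambda favors the low-distortion mode; a larger weight
# penalizes rate more heavily and shifts the decision.
mode_default = choose_mode(candidates, lam=1.0)
mode_weighted = choose_mode(candidates, lam=1.0, weight=2.0)
```

In this toy setup, raising the weight makes bits more expensive relative to distortion, which is the lever a perceptually tuned RDO adjusts per block.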

Directional texture information for connecting road segments in high spatial resolution satellite images

  • Lee, Jong-Yeol
    • Proceedings of the KSRS Conference
    • /
    • 2005.10a
    • /
    • pp.245-245
    • /
    • 2005
  • This paper addresses the use of directional textural information for connecting road segments. In urban scenes, some roads are occluded by buildings, by the shadows cast by buildings, and by trees and cars on the streets. Automatic extraction of road networks from remotely sensed high-resolution imagery is generally hindered by these occlusions, so the results of automatic road network extraction are incomplete. To overcome this problem, perceptual grouping algorithms based on similarity, proximity, continuation, and symmetry are often used. Roads have directions and are connected to adjacent roads at certain angles. This directional information is used to guide the connection of road fragments, based on the directional inertia of roads or the characteristics of road junctions. In the primitive stage, roads with a certain length of linearity are extracted automatically using textural and directional information. The primitive road fragments are then connected based on the directional information to improve the road network. Experimental results show that this approach contributes to completing the road network, specifically in urban areas.

Production of English final stops by Korean speakers

  • Kim, Jungyeon
    • Phonetics and Speech Sciences
    • /
    • v.10 no.4
    • /
    • pp.11-17
    • /
    • 2018
  • This study reports on a production experiment designed to investigate how Korean-speaking learners of English produce English forms ending in stops. In a repetition experiment, Korean participants listened to English nonce words ending in a stop and repeated what they heard. English speakers were recruited for the same task as a control group. The experimental results indicated that the transcriptions of the Korean productions by English native speakers showed vowel insertion in only 3% of productions, although the pronunciation of English final stops showed that noise intervals after the closure of final stops were significantly longer for Korean speakers than for English speakers. This finding is inconsistent with the loanword data, where 49% of words showed vowel insertion. It is also not compatible with the perceptual similarity approach, which predicts that because Korean speakers accurately perceive an English final stop as a final consonant, they will insert a vowel to make the English sound more similar to the Korean sound.

The Effect of Construal Level on Variety Seeking across Subcategories

  • Suh, Jiyeon;Won, Eugene J.S.
    • Asia Marketing Journal
    • /
    • v.21 no.3
    • /
    • pp.1-20
    • /
    • 2019
  • The present study investigates how consumers' construal level affects their variety-seeking behavior when choosing multiple items simultaneously. In particular, the authors focus on the perceptual level at which variety seeking takes place and propose that variety seeking can take place not only at the brand level but also at the category or subcategory level. Categorical variety seeking refers to diversification of one's choices over multiple brands not within the same category but across multiple categories. Building on construal level theory, the authors expected that people engaging in higher-level construals tend to subcategorize the choice set and distribute their choices across more subcategories, and they designed four experiments to test the related hypotheses. The experimental results showed that consumers' construal level can affect the level at which variety seeking takes place, and that those with a higher construal level tend to choose options that are seemingly more dissimilar to each other.

Quality Assessment of Images Projected Using Multiple Projectors

  • Kakli, Muhammad Umer;Qureshi, Hassaan Saadat;Khan, Muhammad Murtaza;Hafiz, Rehan;Cho, Yongju;Park, Unsang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.6
    • /
    • pp.2230-2250
    • /
    • 2015
  • Multiple projectors with partially overlapping regions can be used to project a seamless image on a large projection surface. With the advent of high-resolution photography, such systems are gaining popularity. Experts set up such projection systems by subjectively identifying the types of errors induced by the system in the projected images and rectifying them by optimizing (correcting) the parameters associated with the system. This requires substantial time and effort, thus making it difficult to set up such systems. Moreover, comparing the performance of different multi-projector display (MPD) systems becomes difficult because of the subjective nature of evaluation. In this work, we present a framework to quantitatively determine the quality of an MPD system and any image projected using such a system. We have divided the quality assessment into geometric and photometric qualities. For geometric quality assessment, we use Feature Similarity Index (FSIM) and distance-based Scale Invariant Feature Transform (SIFT). For photometric quality assessment, we propose to use a measure incorporating Spectral Angle Mapper (SAM), Intensity Magnitude Ratio (IMR) and Perceptual Color Difference (ΔE). We have tested the proposed framework and demonstrated that it provides an acceptable method for both quantitative evaluation of MPD systems and estimation of the perceptual quality of any image projected by them.
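Two of the photometric ingredients named above have standard textbook definitions, sketched here under stated assumptions. The Spectral Angle Mapper is the angle between two color or spectral vectors, and the color difference is computed with the CIE76 formula (Euclidean distance in CIELAB). The abstract does not say which ΔE variant the authors use, nor how SAM, IMR, and ΔE are combined, so the function names and the CIE76 choice are assumptions, and IMR is left out because its formula is not given.

```python
import math
from typing import Sequence


def spectral_angle(a: Sequence[float], b: Sequence[float]) -> float:
    """Spectral Angle Mapper: angle in radians between two vectors.

    0 means identical direction (same hue/spectral shape regardless of
    brightness); larger angles mean a larger spectral mismatch.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("spectral angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def delta_e_cie76(lab1: Sequence[float], lab2: Sequence[float]) -> float:
    """CIE76 color difference: Euclidean distance between two L*a*b* triples."""
    return math.dist(lab1, lab2)


# A pixel and a uniformly brighter copy differ in magnitude, not in angle.
angle_same_shape = spectral_angle((1.0, 2.0, 3.0), (2.0, 4.0, 6.0))
angle_orthogonal = spectral_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
color_diff = delta_e_cie76((50.0, 0.0, 0.0), (50.0, 3.0, 4.0))
```

Because the spectral angle ignores overall intensity, it is naturally paired with a separate magnitude term, which is presumably the role the Intensity Magnitude Ratio plays in the proposed measure.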

A study on Brand Image of Korea Women's Apparel Market with Multidimensional Scaling (다차원 척도기법을 이용한 여성 기성복의 상품 이미지에 관한 연구)

  • Hwang, Seon-Jin
    • Journal of the Korean Society of Costume
    • /
    • v.15
    • /
    • pp.253-265
    • /
    • 1990
  • This article was written with two purposes in mind. The first was to introduce Multidimensional Scaling (MDS) to members of the clothing and textile community who may not be familiar with it, and to show the usefulness of the technique in the area of fashion merchandising. The second was to present the results of an empirical study on brand image that used MDS and related techniques as the main analysis tools. The main objective of the empirical study was to gain a better understanding of consumers' brand image by relating differences in perception to attributes of clothing in the women's ready-to-wear market. For this empirical study, ten brands and fifteen attributes of clothing were chosen. A questionnaire asking about the similarity between the selected brands and about their clothing attributes was administered to 185 career women during the summer of 1989. Data were analyzed using cluster analysis and the KYST and PROFIT programs for MDS. The results were as follows: 1. A perceptual map was drawn from the similarity data for the ten selected brands using the KYST program. The perceptual map showed that the selected brands were grouped into three clusters. 2. To obtain a somewhat objective view of which attributes consumers attribute to each brand, the PROFIT program was used. It revealed that assortment depth/width, price, youth-oriented style, and suitability for various social activities were significant attributes in consumers' brand choice, rather than physical attributes of clothing such as quality or durability. This may imply that consumer orientation is the basic idea in the rapidly changing environment of the women's apparel market, and that all fashion merchandising activities focus on the needs and responses of the target consumer group. Implications for future research and for brand positioning strategy were also suggested.
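How well a perceptual map reproduces the similarity judgments behind it is commonly summarized by Kruskal's stress-1. The sketch below computes it for a candidate 2-D configuration, as a simplified, metric illustration: a nonmetric program such as KYST would first replace the raw dissimilarities with monotone-regressed disparities, which is omitted here. The three "brands", their coordinates, and the dissimilarity values are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence


def stress1(points: Sequence[Sequence[float]], dissim: List[List[float]]) -> float:
    """Kruskal stress-1 between map distances and target dissimilarities.

    stress = sqrt( sum (d_ij - delta_ij)^2 / sum d_ij^2 ) over all pairs i < j,
    where d_ij is the distance on the map and delta_ij the judged dissimilarity.
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        numerator += (d - dissim[i][j]) ** 2
        denominator += d ** 2
    return math.sqrt(numerator / denominator)


# Hypothetical map positions for three brands (a 3-4-5 triangle).
brand_map = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]

# Judged dissimilarities that the map reproduces exactly.
exact = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
# Same judgments, except brands 0 and 1 were rated further apart.
perturbed = [[0, 4, 4], [4, 0, 5], [4, 5, 0]]

perfect_fit = stress1(brand_map, exact)
imperfect_fit = stress1(brand_map, perturbed)
```

A stress of zero means the map distances match the judgments exactly; an MDS program searches for the configuration that minimizes this quantity.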

Articulatory Attributes in Korean Nonassimilating Contexts

  • Son, Minjung
    • Phonetics and Speech Sciences
    • /
    • v.5 no.1
    • /
    • pp.109-121
    • /
    • 2013
  • This study examined several kinematic properties of the primary articulator (the tongue dorsum) and the supplementary articulator (the jaw) in the articulation of the voiceless velar stop (/k/) within nonassimilating contexts. We examined in particular the spatiotemporal properties (constriction duration and constriction maxima) from the constriction onset to the constriction offset by analyzing a velar (/k/) followed by the coronal fricative (/s/), the coronal stop (/t/), and the labial (/p/) in across-word boundary conditions (/k#s/, /k#t/, and /k#p/). Along with these measurements, we investigated intergestural temporal coordination between C1 and C2 and the jaw articulator in relation to its coordination with the articulation of consonant sequences. The articulatory movement data were collected by means of electromagnetic midsagittal articulometry (EMMA). Four native speakers of Seoul Korean participated in the laboratory experiment. The results showed several characteristics. First, a velar (/k/) in C1 was not categorically reduced. Constriction duration and constriction degree of the velar (/k/) were similar within nonassimilating contexts (/k#s/=/k#t/=/k#p/). This might mean that spatiotemporal attributes during constriction duration were stable and consistent across different contexts, which might be subsequently associated with the nontarget status of the velar in place assimilation. Second, the gestural overlap could be represented as the order of /k#s/ (less) < /k#p/ (intermediate) < /k#t/ (more), as measured by the onset-to-onset lag (a longer lag indicated less gestural overlap). This indicates that gestural overlap within nonassimilating contexts may not be constrained by any of several constraints, including the perceptual recoverability constraint (e.g., more overlap in Front-to-Back sequences compared to the reverse order (Back-to-Front), since perceptual cues in C1 can be recovered anytime during C2 articulation), the low-level speech motor constraint (e.g., more overlap in lingual-nonlingual sequences as compared to lingual-lingual sequences), or phonological context effects (e.g., similarity in gestural overlap within nonassimilating contexts). As one possible account for more overlap in /k#t/ sequences as compared to /k#p/, we suspect speakers' knowledge may be receptive to extreme encroachment on C1 by the gestural overlap of the coronal in C2, since it does not obscure the perceptual cue of C1 as much as the labial in C2. Third, actual jaw position during C2 was higher in coronals (/s/, /t/) than in the labial (/p/). However, within the coronals, there was no manner-dependent jaw height difference in C2 (/s/=/t/). Vertical jaw position of C1 and C2 was seen as interdependent, as higher jaw position in C1 was closely associated with C2. Lastly, a greater gap in jaw height was associated with longer intergestural timing (e.g., less overlap), but this was confined to the cluster type (/kp/) with the lingual-nonlingual sequence. This study showed that Korean jaw articulation was independent from coordinating primary articulators in gestural overlap in some cluster types (/k#s/, /k#t/) while not in others (e.g., /k#p/). Overall, the results coherently indicate that the velar stop (/k/) in C1 was robust in articulation, which may have subsequently contributed to the nontarget status of the velar (/k/) in place assimilation processes.

ISFRNet: A Deep Three-stage Identity and Structure Feature Refinement Network for Facial Image Inpainting

  • Yan Wang;Jitae Shin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.3
    • /
    • pp.881-895
    • /
    • 2023
  • Modern image inpainting techniques based on deep learning have achieved remarkable performance, and more and more people are working on repairing larger and more complex missing areas, although this remains challenging, especially for facial image inpainting. For a face image with a huge missing area, there are very few valid pixels available; however, people are able to imagine the complete picture in their mind according to their subjective will. It is important to simulate this capability while maintaining the identity features of the face as much as possible. To achieve this goal, we propose a three-stage network model, which we refer to as the identity and structure feature refinement network (ISFRNet). ISFRNet is based on 1) a pre-trained pSp-styleGAN model that generates an extremely realistic face image with rich structural features; 2) a shallow structured network with a small receptive field; and 3) a modified U-net with two encoders and a decoder, which has a large receptive field. We choose peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), L1 loss and learned perceptual image patch similarity (LPIPS) to evaluate our model. When the missing region is 20%-40%, the above four metric scores of our model are 28.12, 0.942, 0.015 and 0.090, respectively. When the missing region is between 40% and 60%, the metric scores are 23.31, 0.840, 0.053 and 0.177, respectively. Our inpainting network not only guarantees excellent face identity feature recovery but also exhibits state-of-the-art performance compared to other multi-stage refinement models.
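Two of the four metrics reported above are simple pixel-wise quantities and can be sketched directly. The example below computes mean absolute error (L1) and PSNR for images given as flat lists of intensities, assuming values scaled to [0, 1]; the abstract does not state the pixel scale or the authors' evaluation code, so the function names and the `max_value` default are assumptions. SSIM and LPIPS are omitted: SSIM needs windowed local statistics, and LPIPS requires a pretrained network.

```python
import math
from typing import Sequence


def l1_loss(reference: Sequence[float], restored: Sequence[float]) -> float:
    """Mean absolute error between two images given as flat pixel lists."""
    return sum(abs(r - x) for r, x in zip(reference, restored)) / len(reference)


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 "images" flattened row by row (values are made up).
reference = [0.0, 0.5, 1.0, 0.5]
restored = [0.1, 0.5, 0.9, 0.5]

l1_value = l1_loss(reference, restored)
psnr_value = psnr(reference, restored)
```

Higher PSNR and SSIM indicate a closer match to the ground truth, whereas lower L1 and LPIPS are better, which is why the four scores in the abstract move in opposite directions as the missing region grows.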

Enhanced Spectral Hole Substitution for Improving Speech Quality in Low Bit-Rate Audio Coding

  • Lee, Chang-Heon;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.3E
    • /
    • pp.131-139
    • /
    • 2010
  • This paper proposes a novel spectral hole substitution technique for low bit-rate audio coding. The spectral holes frequently occurring in relatively weak energy bands due to zero bit quantization result in severe quality degradation, especially for harmonic signals such as speech vowels. The enhanced aacPlus (EAAC) audio codec artificially adjusts the minimum signal-to-mask ratio (SMR) to reduce the number of spectral holes, but it still produces noisy sound. The proposed method selectively predicts the spectral shapes of hole bands using either intra-band correlation, i.e. harmonically related coefficients nearby or inter-band correlation, i.e. previous frames. For the bands that have low prediction gain, only the energy term is quantized and spectral shapes are replaced by pseudo random values in the decoding stage. To minimize perceptual distortion caused by spectral mismatching, the criterion of the just noticeable level difference (JNLD) and spectral similarity between original and predicted shapes are adopted for quantizing the energy term. Simulation results show that the proposed method implemented into the EAAC baseline coder significantly improves speech quality at low bit-rates while keeping equivalent quality for mixed and music contents.