• Title/Summary/Keyword: Perceptual Systems

Real-time Overlay Video Multicast System (실시간 동영상 오버레이 멀티캐스트 시스템)

  • Kang, Ho-Jong;Song, Hwang-Jun;Min, Kyung-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.2C / pp.139-147 / 2006
  • This paper presents an overlay video multicast system for the Internet. The proposed system consists of two parts: an overlay multicast tree suited to real-time video delivery and an H.263+ rate control scheme adapted to that tree. The overlay multicast tree is constructed to minimize the average delay experienced by members, and the H.263+ rate control pursues a tradeoff between spatial and temporal quality to enhance perceptual quality for the human visual system. The two components are integrated and tested over the real Internet, and experimental results are provided to show the performance of the proposed system.
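
A minimal sketch of one way to grow such a delay-minimizing overlay tree is shown below. It is a generic greedy heuristic (attach each member where its source-to-member delay is smallest), not necessarily the construction used in the paper; the `delay` map and the absence of node-degree constraints are assumptions.

```python
def build_overlay_tree(delay, source=0):
    """Greedy overlay-tree construction minimizing each member's delay from
    the source (and hence the average delay), ignoring degree constraints.

    delay  : dict {(u, v): one-way delay between overlay nodes u and v}
    source : index of the video source node
    returns: dict child -> parent
    """
    nodes = {u for edge in delay for u in edge}
    in_tree = {source}
    dist = {source: 0.0}
    parent = {}
    while len(in_tree) < len(nodes):
        best = None
        for u in in_tree:
            for v in nodes - in_tree:
                d = dist[u] + delay.get((u, v), delay.get((v, u), float("inf")))
                if best is None or d < best[0]:
                    best = (d, u, v)
        if best is None or best[0] == float("inf"):
            break  # remaining nodes are unreachable
        d, u, v = best
        dist[v] = d
        parent[v] = u
        in_tree.add(v)
    return parent
```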

An Empirical Study of Factors Affecting the Value Gap in IS Investment (정보시스템 투자의 성과격차 유발요인에 관한 실증연구)

  • Park Kiho;Cho Namjae
    • Korean Management Science Review / v.21 no.2 / pp.145-165 / 2004
  • Many organizations have experienced a discrepancy between the expected value and the realized value of IS (information systems) investments. Whether the difference is positive or negative, the existence of the discrepancy itself is evidence of less-than-sound management and measurement of IS projects. Analyzing the factors that cause such discrepancy has become an issue of scrutiny both in academia and in practice. We model which factors, as predictors, affect the value discrepancy, as dependent variables, in IS investment. This research establishes and examines the research model, the validity of the category classification of value-discrepancy factors, and the perceptual level of IS value discrepancy through survey research. The survey results show that strategic alignment, proper system design for staff, project planning capability, and interdepartmental task cooperation are perceived as factors that significantly affect the value discrepancy, whereas well-known IS success factors such as managerial support, change management, standardized processes, and competitive investment are not significant. The findings indicate which factors should be deliberately investigated in IS investment, both for practitioners considering IS deployment and for academia.

An Adaptive Wind Noise Reduction Method Based on a priori SNR Estimation for Speech Enhancement (음성 강화를 위한 a priori SNR 추정기반 적응 바람소리 저감 방법)

  • Seo, Ji-Hun;Lee, Seok-Pil
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.12 / pp.1756-1760 / 2015
  • This paper focuses on an a priori signal-to-noise ratio (SNR) estimation method for speech enhancement. Much research on speech enhancement has been conducted with various ambient noise cancellation methods. Spectral subtraction (SS), which is widely used in noise reduction, involves a trade-off between noise-reduction performance and signal distortion, so the need for an adaptive method such as a priori SNR estimation that can deliver high performance with low distortion is increasing. The decision-directed (DD) approach is used to determine the a priori SNR in noisy speech signals; the a priori SNR is estimated using only the magnitude components and consequently follows the a posteriori SNR with a one-frame delay. We propose a modified a priori SNR estimator and a weighted rational transfer function for speech enhancement in the presence of wind noise. Experimental results show that the proposed estimator achieves better Perceptual Evaluation of Speech Quality (PESQ, ITU-T P.862) scores than conventional DD-based systems and other noise reduction methods.
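
For context, the conventional decision-directed (DD) a priori SNR estimator that the abstract refers to can be sketched in NumPy as follows. The smoothing factor, the fixed noise PSD, and the Wiener-type gain used to feed back the clean-speech estimate are standard choices assumed here; this is not the authors' modified estimator or weighted transfer function.

```python
import numpy as np

def dd_a_priori_snr(noisy_mag, noise_psd, alpha=0.98):
    """Conventional decision-directed a priori SNR estimate per frame and bin.

    noisy_mag : (frames, bins) magnitude spectrogram of the noisy speech
    noise_psd : (bins,) estimated noise power spectral density
    alpha     : smoothing factor (typically 0.9-0.98)
    """
    noise_psd = np.maximum(noise_psd, 1e-12)    # avoid division by zero
    frames, bins_ = noisy_mag.shape
    xi = np.zeros((frames, bins_))              # a priori SNR
    prev_clean_power = np.zeros(bins_)          # |S_hat(m-1)|^2
    for m in range(frames):
        gamma = (noisy_mag[m] ** 2) / noise_psd             # a posteriori SNR
        xi[m] = alpha * prev_clean_power / noise_psd \
                + (1 - alpha) * np.maximum(gamma - 1.0, 0.0)
        gain = xi[m] / (1.0 + xi[m])                        # Wiener-type gain
        prev_clean_power = (gain * noisy_mag[m]) ** 2       # feeds next frame
    return xi
```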

Joint Spatial-Temporal Quality Improvement Scheme for H.264 Low Bit Rate Video Coding via Adaptive Frameskip

  • Cui, Ziguan;Gan, Zongliang;Zhu, Xiuchang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.1 / pp.426-445 / 2012
  • Conventional rate control (RC) schemes for H.264 video coding regulate the output bit rate to match the channel bandwidth by adjusting the quantization parameter (QP) at a fixed full frame rate; passive frame skipping to avoid buffer overflow then occurs when scene changes or high motion appear, especially at low bit rates, which degrades spatial-temporal quality and causes a jerky effect. In this paper, an active, content-adaptive frame skipping scheme is proposed instead. It skips subjectively trivial frames based on the structural similarity (SSIM) between the original frame and the frame interpolated via a motion vector (MV) copy scheme. The bits saved from skipped frames are allocated to the coded key frames to enhance their spatial quality, and the skipped frames are recovered at the decoder from adjacent key frames using the same MV copy scheme, maintaining a constant frame rate. Experimental results show that the proposed active SSIM-based frame-skip scheme achieves better and more consistent spatial-temporal quality in both the objective (PSNR) and subjective (SSIM) sense, with low complexity, compared with the classic fixed-frame-rate control method JVT-G012 and a prior frame-skip method based on an objective metric.
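
The skip decision described above can be illustrated with a small sketch: compute SSIM between the original frame and its MV-copy prediction, and skip the frame if the similarity is high enough. The threshold value and the use of scikit-image's SSIM are assumptions for illustration only.

```python
from skimage.metrics import structural_similarity as ssim

def should_skip(original_frame, mv_predicted_frame, threshold=0.95):
    """Decide whether a frame is 'subjectively trivial' and can be skipped.

    original_frame     : current frame (grayscale uint8 ndarray)
    mv_predicted_frame : the same frame reconstructed from the previous key
                         frame by copying its motion vectors (MV copy)
    threshold          : SSIM level above which the frame is skipped
                         (hypothetical value; the paper tunes its own rule)
    """
    score = ssim(original_frame, mv_predicted_frame)
    return score >= threshold   # high SSIM -> MV copy recovers it well -> skip
```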

FD-StackGAN: Face De-occlusion Using Stacked Generative Adversarial Networks

  • Jabbar, Abdul;Li, Xi;Iqbal, M. Munawwar;Malik, Arif Jamal
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.7 / pp.2547-2567 / 2021
  • It has been widely acknowledged that occlusion adversely affects the performance of many face recognition algorithms, so solving the problem of face image occlusion is crucial for face recognition. To this end, this paper aims to automatically de-occlude the major or discriminative regions of the human face to improve face recognition performance. We decompose the generative process into two key stages and employ a separate generative adversarial network (GAN)-based network in each stage. The first stage generates an initial coarse face image without an occlusion mask. The second stage refines the result from the first stage by forcing it closer to real face images, i.e., the ground truth. To increase performance and minimize artifacts in the generated result, a new refine loss (combining reconstruction, perceptual, and adversarial losses) is used to capture all differences between the generated de-occluded face image and the ground truth. Furthermore, we build a dataset of occluded face images and their corresponding occlusion-free face images. We trained our model on this new dataset and later tested it on real-world face images. The experimental results (qualitative and quantitative) and a comparative study confirm the robustness and effectiveness of the proposed work in removing challenging occlusion masks with various structures, sizes, shapes, types, and positions.
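
A hedged sketch of a combined refine loss of this kind (reconstruction + perceptual + adversarial) is given below in PyTorch. The loss weights, the use of L1 distances, and the choice of early VGG16 features for the perceptual term are assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class RefineLoss(nn.Module):
    """Sketch of a combined reconstruction + perceptual + adversarial loss."""
    def __init__(self, lambda_rec=1.0, lambda_perc=0.1, lambda_adv=0.01):
        super().__init__()
        self.l1 = nn.L1Loss()
        # Perceptual term from early VGG16 features (layer choice is assumed;
        # ImageNet normalization of inputs is omitted for brevity).
        self.vgg = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
        for p in self.vgg.parameters():
            p.requires_grad = False
        self.bce = nn.BCEWithLogitsLoss()
        self.lr, self.lp, self.la = lambda_rec, lambda_perc, lambda_adv

    def forward(self, fake, real, disc_logits_on_fake):
        rec = self.l1(fake, real)                          # pixel reconstruction
        perc = self.l1(self.vgg(fake), self.vgg(real))     # feature similarity
        adv = self.bce(disc_logits_on_fake,
                       torch.ones_like(disc_logits_on_fake))  # fool discriminator
        return self.lr * rec + self.lp * perc + self.la * adv
```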

Human Laughter Generation using Hybrid Generative Models

  • Mansouri, Nadia;Lachiri, Zied
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1590-1609 / 2021
  • Laughter is one of the most important nonverbal sounds that humans generate; it is a means of expressing emotion. The acoustic and contextual features of this specific sound differ from those of speech, and many difficulties arise during its modeling. In this work, we propose an audio laughter generation system based on unsupervised generative models: the autoencoder (AE) and its variants. The procedure combines three main sub-processes: (1) analysis, which consists of extracting the log-magnitude spectrogram from the laughter database; (2) training of the generative models; and (3) synthesis, which involves an intermediate mechanism, the vocoder. To improve synthesis quality, we propose hybrid models (LSTM-VAE, GRU-VAE, and CNN-VAE) that combine the representation learning capacity of the variational autoencoder (VAE) with the temporal modeling ability of long short-term memory RNNs (LSTM) and the ability of CNNs to learn invariant features. To assess the performance of the proposed audio laughter generation process, an objective evaluation (RMSE) and a perceptual audio quality test (listening test) were conducted. According to these evaluation metrics, the GRU-VAE outperforms the other VAE models.
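
As a reference point for the VAE-based models mentioned above, a minimal VAE training objective on log-magnitude spectrogram frames might look like the following; the MSE reconstruction term and the KL weight `beta` are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_spec, target_spec, mu, logvar, beta=1.0):
    """Standard VAE objective on log-magnitude spectrogram frames.

    recon_spec, target_spec : reconstructed / original spectrogram tensors
    mu, logvar              : encoder outputs parameterizing q(z|x)
    beta                    : KL weight (assumed; the paper's value is not given)
    """
    rec = F.mse_loss(recon_spec, target_spec, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl
```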

The Game Selection Model for the Payoff Strategy Optimization of Mobile CrowdSensing Task

  • Zhao, Guosheng;Liu, Dongmei;Wang, Jian
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.4 / pp.1426-1447 / 2021
  • The payoff game between task publishers and users in the mobile crowdsensing environment is a hot research topic. An optimal payoff selection model based on a stochastic evolutionary game is proposed. First, the payoff selection process is modeled as a stochastic evolutionary game between task publishers and users. Second, low-quality data are identified by a data quality evaluation algorithm, which improves how well perceptual tasks are matched to target users, so that task publishers and users can obtain the optimal payoff at the current moment. Finally, by solving for the stable strategy and analyzing the stability of the model, the optimal payoff strategy is obtained under different intensities of random interference and different initial states. Simulation results show that, for data quality evaluation, compared with the BP and SVM detection methods, the accuracy of anomalous data detection improves by 8.1% and 0.5%, and the accuracy of data classification improves by 59.2% and 32.2%, respectively. For optimal payoff strategy selection, it is verified that the proposed model can reasonably select the payoff strategy.
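
To illustrate the stochastic evolutionary game setting, a generic Euler-Maruyama simulation of a one-population stochastic replicator equation is sketched below; the payoff functions, noise model, and step sizes are assumptions and do not reproduce the paper's publisher-user model.

```python
import numpy as np

def simulate_replicator(x0, payoff_a, payoff_b, sigma=0.1, dt=0.01,
                        steps=2000, seed=0):
    """Euler-Maruyama simulation of a stochastic replicator equation.

    x0       : initial share of players using strategy A (0..1)
    payoff_a : payoff of strategy A given fraction x playing A -> f_A(x)
    payoff_b : payoff of strategy B given fraction x playing A -> f_B(x)
    sigma    : intensity of random interference (Gaussian noise)
    """
    rng = np.random.default_rng(seed)
    x = np.empty(steps + 1)
    x[0] = x0
    for t in range(steps):
        drift = x[t] * (1 - x[t]) * (payoff_a(x[t]) - payoff_b(x[t]))
        noise = sigma * x[t] * (1 - x[t]) * rng.normal() * np.sqrt(dt)
        x[t + 1] = np.clip(x[t] + drift * dt + noise, 0.0, 1.0)
    return x

# Example: constant payoffs; the population drifts toward the higher-payoff strategy.
traj = simulate_replicator(0.3, payoff_a=lambda x: 2.0, payoff_b=lambda x: 1.0)
```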

Implementation of Image Transmission Based on Vehicle-to-Vehicle Communication

  • Piao, Changhao;Ding, Xiaoyue;He, Jia;Jang, Soohyun;Liu, Mingjie
    • Journal of Information Processing Systems / v.18 no.2 / pp.258-267 / 2022
  • Weak over-the-horizon perception and blind spots are the main problems in intelligent connected vehicles (ICVs). In this paper, a road condition warning method based on V2V image transmission is proposed to solve them. Encoded road-emergency images collected by the ICV are transmitted to the on-board unit (OBU) through Ethernet. The OBU broadcasts the fragmented image information, including the location and clock of the vehicle, to other OBUs. To match the channel quality of V2X communication at different times, the OBU selects the optimal fragment length for processing the image information. Then, according to the position and clock information of the remote vehicles, the receiver's OBU selects valid messages to decode the image information, which helps the receiver extend its perceptual field. Experimental results show that the method has an average packet loss rate of 0.5% and a transmission delay of about 51.59 ms in low-speed driving scenarios, providing drivers with timely and reliable warnings of road conditions.
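
A rough sketch of the fragmentation step (splitting an encoded image into fragments that each carry position and clock information) is shown below; the header layout, field types, and the `fragment_image` helper are illustrative assumptions, not the paper's actual OBU message format.

```python
import struct
import time

def fragment_image(encoded: bytes, frag_len: int, lat: float, lon: float):
    """Split an encoded road-condition image into broadcast-ready fragments.

    Each fragment carries a small header (fragment offset, latitude, longitude,
    timestamp) so the receiver can reassemble the image and judge relevance.
    """
    ts = time.time()
    fragments = []
    for offset in range(0, len(encoded), frag_len):
        header = struct.pack("!Iddd", offset, lat, lon, ts)
        fragments.append(header + encoded[offset:offset + frag_len])
    return fragments
```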

MLSE-Net: Multi-level Semantic Enriched Network for Medical Image Segmentation

  • Di Gai;Heng Luo;Jing He;Pengxiang Su;Zheng Huang;Song Zhang;Zhijun Tu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.9 / pp.2458-2482 / 2023
  • Medical image segmentation techniques based on convolutional neural networks often suffer from parameter redundancy and unsatisfactory target localization during feature extraction, which leads to segmentation results that are less accurate for assisting doctors in diagnosis. In this paper, we propose a multi-level, semantic-rich encoding-decoding network that consists of a Pooling-Conv-Former (PCFormer) module and a Cbam-Dilated-Transformer (CDT) module. The PCFormer module tackles the issue of parameter explosion in the conventional transformer and compensates for feature loss in the down-sampling process. In the CDT module, the Cbam attention module is adopted to highlight feature regions by implicitly blending the intersection of attention mechanisms, and the Dilated convolution-Concat (DCC) module is designed as a parallel concatenation of multiple atrous convolution blocks to explicitly expand the perceptual field. In addition, a MultiHead Attention-DwConv-Transformer (MDTransformer) module is utilized to clearly distinguish the target region from the background region. Extensive experiments on the Glas, SIIM-ACR, ISIC, and LGG medical image segmentation datasets demonstrate that the proposed network outperforms existing advanced methods in terms of both objective evaluation and subjective visual performance.
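
The DCC idea of concatenating parallel atrous convolutions can be sketched in PyTorch as follows; the dilation rates, channel counts, and 1×1 fusion layer are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class DilatedConcat(nn.Module):
    """Parallel atrous convolutions whose outputs are concatenated, in the
    spirit of the DCC module described above (rates and channels assumed)."""
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]   # same spatial size
        return self.fuse(torch.cat(feats, dim=1))         # wider perceptual field
```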

Real-Time Video Quality Assessment of Video Communication Systems (비디오 통신 시스템의 실시간 비디오 품질 측정 방법)

  • Kim, Byoung-Yong;Lee, Seon-Oh;Jung, Kwang-Su;Sim, Dong-Gyu;Lee, Soo-Youn
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.3 / pp.75-88 / 2009
  • This paper presents a video quality assessment method based on the quality degradation factors of real-time multimedia streaming services. Video quality degradation is caused by video source compression and by network conditions. We propose a blocky metric in the image domain to measure the degradation caused by video compression: the proposed boundary strength index is defined as the ratio of the variation between the two pixel values adjacent to an 8×8 block boundary to the average variation of several pixels neighboring those two boundary pixels. On the other hand, unnatural image movement can be observed when network performance deteriorates due to factors such as jitter and delay. We therefore propose a temporal-jerkiness measurement that computes statistics of the luminance differences between consecutive frames and of the play-time intervals between frames. The final Perceptual Video Quality Metric (PVQM) consolidates both the blocking strength and the temporal jerkiness. To evaluate the proposed algorithm, its accuracy is compared against Difference Mean Opinion Score (DMOS) values based on the human visual system.
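
One possible reading of the boundary strength index is sketched below: for each vertical 8×8 block boundary, divide the jump across the boundary by the average local variation beside it. The neighborhood size, the restriction to vertical boundaries, and the final averaging are assumptions about details the abstract does not specify.

```python
import numpy as np

def boundary_strength(frame: np.ndarray, block=8, neighborhood=3):
    """Rough blockiness measure in the spirit of the boundary strength index:
    jump across each vertical 8x8 block boundary divided by the average
    variation of nearby pixels (exact normalization assumed).
    """
    frame = frame.astype(np.float64)
    h, w = frame.shape
    scores = []
    for x in range(block, w, block):
        # Variation across the boundary (columns x-1 and x).
        across = np.abs(frame[:, x] - frame[:, x - 1])
        # Average variation among a few pixels on either side of the boundary.
        left = np.abs(np.diff(frame[:, max(0, x - 1 - neighborhood):x], axis=1))
        right = np.abs(np.diff(frame[:, x:x + 1 + neighborhood], axis=1))
        local = (left.mean() + right.mean()) / 2.0 + 1e-6
        scores.append(across.mean() / local)
    return float(np.mean(scores))
```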