• Title/Summary/Keyword: 희소성 (sparsity)

Search Result 270, Processing Time 0.025 seconds

Addressing Low-Resource Problems in Statistical Machine Translation of Manual Signals in Sign Language (말뭉치 자원 희소성에 따른 통계적 수지 신호 번역 문제의 해결)

  • Park, Hancheol;Kim, Jung-Ho;Park, Jong C.
    • Journal of KIISE / v.44 no.2 / pp.163-170 / 2017
  • Despite the rise of studies on spoken-to-sign language translation, the low-resource problems of sign language corpora have rarely been addressed. As a first step towards translating from spoken to sign language, we addressed the problems arising from resource scarcity when translating spoken language into manual signals using statistical machine translation techniques. More specifically, we proposed three preprocessing methods: 1) paraphrase generation, which increases the size of the corpora, 2) lemmatization, which increases the frequency of each word in the corpora and the translatability of new input words in spoken language, and 3) elimination of function words that are not glossed into manual signals, which matches the corresponding constituents of the bilingual sentence pairs. In our experiments, we used different types of English-American Sign Language parallel corpora. The experimental results showed that the system with each method, and with the combination of the methods, improved the quality of manual signal translation regardless of the type of corpus.
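To make the second and third preprocessing steps concrete, here is a minimal Python sketch, assuming a toy lemma table and a toy function-word list. `LEMMAS`, `FUNCTION_WORDS`, and `preprocess` are hypothetical names invented for illustration, not the authors' resources or code. Paraphrase generation is not shown.

```python
# Hypothetical lemma table; a real system would use a proper lemmatizer.
LEMMAS = {"going": "go", "went": "go", "stores": "store", "boys": "boy"}

# Hypothetical list of function words assumed not to be glossed into manual signals.
FUNCTION_WORDS = {"a", "an", "the", "is", "are", "to", "of"}


def preprocess(sentence):
    """Lemmatize each token, then drop function words."""
    tokens = sentence.lower().split()
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    return [lemma for lemma in lemmas if lemma not in FUNCTION_WORDS]


print(preprocess("The boy is going to the stores"))
```

Mapping inflected forms to one lemma raises the count of each surviving word type, which is the frequency effect the abstract attributes to lemmatization.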

Visually Weighted Group-Sparsity Recovery for Compressed Sensing of Color Images with Edge-Preserving Filter (컬러 영상의 압축 센싱을 위한 경계보존 필터 및 시각적 가중치 적용 기반 그룹-희소성 복원)

  • Nguyen, Viet Anh;Trinh, Chien Van;Park, Younghyeon;Jeon, Byeungwoo
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.9 / pp.106-113 / 2015
  • This paper integrates human visual system (HVS) characteristics into compressed sensing recovery of color images. The proposed visual weighting of each color channel in group-sparsity minimization not only promotes the sparsity of the image but also reflects HVS characteristics well. Additionally, an edge-preserving filter is embedded in the scheme to remove noise while preserving the edges of the image, so that the quality of the reconstructed image is further enhanced. Experimental results show that the average PSNR of the proposed method is 0.56 to 4 dB higher than that of the state-of-the-art group-sparsity minimization method. These results demonstrate the advantage of the proposed method in terms of both objective and subjective quality.

Vehicle Recognition using NMF in Urban Scene (도심 영상에서의 비음수행렬분해를 이용한 차량 인식)

  • Ban, Jae-Min;Lee, Byeong-Rae;Kang, Hyun-Chul
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.7C / pp.554-564 / 2012
  • Vehicle recognition consists of two steps: vehicle region detection, and vehicle identification based on features extracted from the detected region. Features obtained by linear transformations reduce dimensionality, represent statistical characteristics, and are robust to translation and rotation of objects. Among linear transformations, NMF (non-negative matrix factorization) yields a part-based representation. Therefore, we can extract sparse NMF features and improve the vehicle recognition rate by representing local features of a car as basis vectors. In this paper, we propose an NMF-based feature extraction method suitable for vehicle recognition and verify its recognition rate. We also compared vehicle recognition rates for occluded areas using SNMF (sparse NMF), whose basis vectors are constrained, and an LVQ2 neural network. We showed that the features obtained through the proposed NMF are robust in urban scenes, where occlusions frequently occur.

Data Blurring Method for Solving Sparseness Problem in Collaborative Filtering (협동적 여과에서의 희소성 문제 해결을 위한 데이타 블러링 기법)

  • Kim, Hyung-Il;Kim, Jun-Tae
    • Journal of KIISE: Software and Applications / v.32 no.6 / pp.542-553 / 2005
  • Recommendation systems analyze user preferences and recommend items to a user by predicting the user's preference for those items. Among various kinds of recommendation methods, collaborative filtering (CF) has been widely used and successfully applied to practical applications. However, collaborative filtering has two inherent problems: data sparseness and the cold-start problem. If there are few known preferences for a user, it is difficult to find many similar users, and therefore recommendation performance is degraded. This problem is more serious when a new user first uses the system. In this paper we propose a method of integrating additional feature information of users and items into CF to overcome the difficulties caused by sparseness and improve the accuracy of recommendation. In our method, we first fill in unknown preference values by using the probability distribution of feature values, then generate the top-N recommendations by applying collaborative filtering to the modified data. We call this method of filling in unknown preference values data blurring. Several experimental results that show the effectiveness of the proposed method are also presented.
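The fill-in step can be pictured with a deliberately simplified Python sketch. The abstract does not give the estimator, so this version substitutes a plain per-feature mean. An unknown rating is replaced by the user's average rating over known items that share the same feature value (here, a genre). The function `blur` and the sample data are hypothetical and are not the authors' method.

```python
def blur(ratings, item_genre):
    """Return a copy of ratings with unknown values filled where possible.

    ratings: {user: {item: rating}}, item_genre: {item: genre}.
    Stand-in rule (an assumption): use the user's mean rating for the item's genre.
    """
    filled = {}
    for user, rated in ratings.items():
        by_genre = {}
        for item, rating in rated.items():
            by_genre.setdefault(item_genre[item], []).append(rating)
        row = dict(rated)
        for item, genre in item_genre.items():
            if item not in row and genre in by_genre:
                values = by_genre[genre]
                row[item] = sum(values) / len(values)
        filled[user] = row
    return filled


ratings = {"u1": {"A": 5, "B": 3}, "u2": {"C": 4}}
item_genre = {"A": "action", "B": "drama", "C": "action", "D": "comedy"}
print(blur(ratings, item_genre))
```

Items whose genre the user has never rated stay unknown, so the matrix becomes denser without inventing values that have no supporting evidence. Collaborative filtering would then run on the filled matrix.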

Tweet Entity Linking Method based on User Similarity for Entity Disambiguation (개체 중의성 해소를 위한 사용자 유사도 기반의 트윗 개체 링킹 기법)

  • Kim, SeoHyun;Seo, YoungDuk;Baik, Doo-Kwon
    • Journal of KIISE / v.43 no.9 / pp.1043-1051 / 2016
  • Web-based entity linking cannot be applied to tweet entity linking because Twitter documents are shorter than web documents. Therefore, tweet entity linking uses the information of users or groups. However, a data sparseness problem occurs for users with too little Twitter activity data; in addition, using the information of unrelated groups can reduce the accuracy of the linking results. To solve the data sparseness problem, we consider three features: the meanings from single tweets, the user's own tweet set, and the sets of other users' tweets. Furthermore, we improve the performance and accuracy of tweet entity linking by assigning a weight to the information of users with high similarity. Through a comparative experiment using actual Twitter data, we verify that the proposed tweet entity linking method has higher performance and accuracy than existing methods, and that using the information of highly similar users is correlated with solving the data sparseness problem and improving linking accuracy.

Time delay estimation between two receivers using basis pursuit denoising (Basis pursuit denoising을 사용한 두 수신기 간 시간 지연 추정 알고리즘)

  • Lim, Jun-Seok;Cheong, MyoungJun
    • The Journal of the Acoustical Society of Korea / v.36 no.4 / pp.285-291 / 2017
  • Many methods have been studied to estimate the time delay between the signals arriving at two receivers. In methods based on channel estimation, the relative delay between the input signals of the two receivers is estimated as the impulse response of the channel between the two signals. In this case, the channel is sparse. Most existing methods do not take advantage of this channel sparseness. In this paper, we propose a time delay estimation method using the BPD (Basis Pursuit Denoising) optimization technique, one of the sparse signal optimization methods, in order to exploit the channel sparseness. Compared with the existing GCC (Generalized Cross Correlation) method, the adaptive eigen decomposition method, and RZA-LMS (Reweighted Zero-Attracting Least Mean Square), the proposed method mitigates the threshold phenomenon even under a white Gaussian source, a colored signal source, and an oceanic mammal sound source.
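As an illustration of the idea, the following Python sketch estimates a delay by solving a basis pursuit denoising problem, min over h of 0.5·||x2 − A·h||² + λ·||h||₁, where A holds shifted copies of the first receiver's signal. It is a sketch under assumptions. The solver is plain ISTA (iterative soft-thresholding), chosen for brevity and not taken from the paper. The signals are synthetic and noise-free, and all names and parameter values are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def bpdn_ista(A, y, lam, iters=300):
    """Minimize 0.5*||y - A h||^2 + lam*||h||_1 with ISTA."""
    rows, cols = len(A), len(A[0])
    # ||A||_1 * ||A||_inf upper-bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe step size.
    max_row = max(sum(abs(a) for a in row) for row in A)
    max_col = max(sum(abs(A[i][j]) for i in range(rows)) for j in range(cols))
    step = 1.0 / (max_row * max_col)
    h = [0.0] * cols
    for _ in range(iters):
        resid = [sum(A[i][j] * h[j] for j in range(cols)) - y[i] for i in range(rows)]
        grad = [sum(A[i][j] * resid[i] for i in range(rows)) for j in range(cols)]
        h = [soft_threshold(h[j] - step * grad[j], step * lam) for j in range(cols)]
    return h


def estimate_delay(x1, x2, max_delay, lam=0.1):
    """Estimate the delay as the index of the largest tap of the sparse channel."""
    # Column k of A is x1 delayed by k samples (zero-padded at the start).
    A = [[x1[n - k] if n >= k else 0.0 for k in range(max_delay)] for n in range(len(x1))]
    h = bpdn_ista(A, x2, lam)
    return max(range(max_delay), key=lambda k: abs(h[k]))


rng = random.Random(0)
x1 = [rng.gauss(0.0, 1.0) for _ in range(64)]  # white Gaussian source
true_delay = 3
x2 = [0.0] * true_delay + x1[:-true_delay]  # second receiver: delayed copy
print(estimate_delay(x1, x2, max_delay=8))
```

The L1 penalty pushes all but a few channel taps to zero, which is how the formulation exploits the sparsity that the abstract says most existing methods ignore.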

A Movie Recommendation System processing High-Dimensional Data with Fuzzy-AHP and Fuzzy Association Rules (퍼지 AHP와 퍼지 연관규칙을 이용하여 고차원 데이터를 처리하는 영화 추천 시스템)

  • Oh, Jae-Taek;Lee, Sang-Yong
    • Journal of Digital Convergence / v.17 no.2 / pp.347-353 / 2019
  • Recent recommendation systems are moving toward the use of high-dimensional data. However, high-dimensional data can increase algorithm complexity by expanding dimensions and lower the accuracy of recommended items. In addition, it can cause data sparsity and make it difficult to provide users with proper recommendations. This study proposes an algorithm that classifies users' subjective data by objective criteria with fuzzy-AHP and makes use of rules with repetitive patterns through fuzzy association rules. To check how the algorithm mitigates the problems of high-dimensional data, we performed 5-fold cross validation while varying the number of users. The results show that the system applying the algorithm achieved accuracy 12.5% higher than that of the system applying fuzzy-AHP and mitigated the problem of data sparsity.

Feature-selection algorithm based on genetic algorithms using unstructured data for attack mail identification (공격 메일 식별을 위한 비정형 데이터를 사용한 유전자 알고리즘 기반의 특징선택 알고리즘)

  • Hong, Sung-Sam;Kim, Dong-Wook;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.20 no.1 / pp.1-10 / 2019
  • Since big-data text mining extracts many features and much data, clustering and classification can result in high computational complexity and low reliability of the analysis results. In particular, a term-document matrix obtained through text mining represents term-document features but is sparse. We designed an advanced genetic algorithm (GA) to extract features in text mining for a detection model. Term frequency-inverse document frequency (TF-IDF) is used to reflect the document-term relationships in feature extraction. Through a repetitive process, a predetermined number of features are selected. We also used a sparsity score to improve the performance of the detection model. If a spam mail data set has high sparsity, the detection model performs poorly and it is difficult to find the optimal detection model. In addition, we find a low-sparsity model that also has a high TF-IDF score by using s(F) in the numerator of the fitness function. We also verified its performance by applying the proposed algorithm to text classification. As a result, we found that our algorithm shows higher performance (speed and accuracy) in attack mail classification.
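A short Python sketch can show the two quantities this abstract combines: TF-IDF weights and the sparsity of a term-document matrix restricted to a candidate feature subset. The abstract does not define s(F) or the fitness function, so the `sparsity` function below (fraction of zero cells) is only an assumed stand-in. All function names and the sample mails are hypothetical.

```python
import math


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[d][t] is the count of term t in document d."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    matrix = [[doc.split().count(word) for word in vocab] for doc in docs]
    return vocab, matrix


def tf_idf(matrix):
    """Weight raw counts by log(N / document frequency)."""
    n_docs, n_terms = len(matrix), len(matrix[0])
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_terms)] for row in matrix]


def sparsity(matrix, columns):
    """Fraction of zero cells among the selected feature columns."""
    cells = [row[j] for row in matrix for j in columns]
    return sum(1 for cell in cells if cell == 0) / len(cells)


docs = ["free prize click now", "meeting agenda attached", "click free offer now"]
vocab, matrix = term_document_matrix(docs)
weights = tf_idf(matrix)
subset = [vocab.index(word) for word in ("click", "free", "now")]
print(sparsity(matrix, range(len(vocab))), sparsity(matrix, subset))
```

A genetic algorithm in this spirit would score each candidate column subset by some combination of its TF-IDF mass and its sparsity, preferring subsets that are both informative and dense.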

Applying Different Similarity Measures based on Jaccard Index in Collaborative Filtering

  • Lee, Soojung
    • Journal of the Korea Society of Computer and Information / v.26 no.5 / pp.47-53 / 2021
  • Sparse ratings data hinder reliable similarity computation between users, which degrades the performance of memory-based collaborative filtering techniques for recommender systems. Many works in the literature have addressed this data sparsity problem, of which the simplest and most representative are methods that use the Jaccard index. This index reflects the number of items commonly rated by two users and is mostly integrated into traditional similarity measures to compute the similarity between users more accurately. However, such integration is very straightforward and takes no account of the degree of data sparsity. This study suggests a novel idea of applying different similarity measures depending on the numeric value of the Jaccard index between two users. Performance experiments are conducted to obtain optimal values of the parameters used by the proposed method and to evaluate it in comparison with other relevant methods. As a result, the proposed method demonstrates the best or comparable performance in prediction and recommendation accuracy.
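The switching idea can be sketched in Python as follows. The Jaccard index is computed as the ratio of co-rated items to all items rated by either user. The abstract does not say which measures are applied or at which thresholds, so the 0.5 threshold and both branches in `similarity` are assumptions for illustration, and all names are hypothetical.

```python
import math


def jaccard(ratings_u, ratings_v):
    """Number of co-rated items divided by the number of items rated by either user."""
    union = set(ratings_u) | set(ratings_v)
    if not union:
        return 0.0
    return len(set(ratings_u) & set(ratings_v)) / len(union)


def cosine(ratings_u, ratings_v):
    """Cosine similarity over co-rated items only."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(ratings_u, ratings_v, threshold=0.5):
    """Pick a measure based on the Jaccard index (hypothetical rule)."""
    j = jaccard(ratings_u, ratings_v)
    if j >= threshold:
        # Enough overlap: trust the traditional measure as is.
        return cosine(ratings_u, ratings_v)
    # Little overlap: discount the traditional measure by the overlap ratio.
    return j * cosine(ratings_u, ratings_v)


alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "d": 1}
carol = {"a": 5}
print(similarity(alice, bob), similarity(alice, carol))
```

With one co-rated item, the cosine of `alice` and `carol` is a perfect 1.0, which is exactly the kind of unreliable score that sparse data produces. The low-overlap branch scales it down by the Jaccard index.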

Stacked Sparse Autoencoder-DeepCNN Model Trained on CICIDS2017 Dataset for Network Intrusion Detection (네트워크 침입 탐지를 위해 CICIDS2017 데이터셋으로 학습한 Stacked Sparse Autoencoder-DeepCNN 모델)

  • Lee, Jong-Hwa;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review / v.24 no.2 / pp.24-34 / 2021
  • Service providers using edge computing provide a high level of service. As a result, devices store important information in internal storage and have become targets of the latest cyberattacks, which are more difficult to detect. Although experts use security systems such as intrusion detection systems, existing intrusion detection systems have low detection accuracy. Therefore, in this paper, we propose a machine learning model for more accurate intrusion detection on devices in edge computing. The proposed model is a hybrid model that combines a stacked sparse autoencoder (SSAE) and a convolutional neural network (CNN) to extract important feature vectors from the input data using sparsity constraints. To find the optimal model, we compared and analyzed performance while adjusting the sparsity coefficient of the SSAE. As a result, the model showed its highest accuracy, 96.9%, when using the sparsity constraints. Therefore, the model performs best when it is trained only on important features.
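One common way to impose a sparsity constraint on an autoencoder is a KL-divergence penalty that pushes the mean activation of each hidden unit toward a small target value. The abstract does not state which constraint the authors used, so the Python sketch below is an assumption. `rho` is the target activation and `beta` plays the role of a sparsity coefficient. Both names and their default values are hypothetical.

```python
import math


def kl_sparsity_penalty(mean_activations, rho=0.05, beta=1.0):
    """Return beta * sum of KL(rho || rho_hat) over hidden units.

    mean_activations: average activation of each hidden unit, each in (0, 1).
    """
    total = 0.0
    for rho_hat in mean_activations:
        # Clamp to avoid log(0) for saturated units.
        rho_hat = min(max(rho_hat, 1e-8), 1.0 - 1e-8)
        total += rho * math.log(rho / rho_hat)
        total += (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat))
    return beta * total


print(kl_sparsity_penalty([0.05, 0.05]))  # units at the target activation
print(kl_sparsity_penalty([0.5, 0.4]))    # units that are active too often
```

This term would be added to the reconstruction loss during training. A larger `beta` forces more hidden units toward inactivity, which is the trade-off that tuning a sparsity coefficient explores.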