• Title/Summary/Keyword: Neighbor embedding

Consecutive Difference Expansion Based Reversible DNA Watermarking (연속적 차분 확장 기반 가역 DNA 워터마킹)

  • Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.51-62 / 2015
  • With growing interest in high-capacity DNA storage, DNA watermarking for copyright protection, and DNA steganography for covert communication, reversible DNA watermarking is needed both to embed a watermark without changing the functionality of the organism and to recover the host DNA sequence perfectly. In this paper, we present two difference expansion (DE) based reversible DNA watermarking methods that use noncoding DNA sequences. Reversible DNA watermarking must account for the string structure of a DNA sequence, organism functionality, perfect recovery, and high embedding capacity. We convert the four-character string sequence of a noncoding region into decimal coded values and embed watermark bits into the coded values in two ways: DE-based multiple bits embedding (DE-MBE), which uses pairs of neighboring coded values, and consecutive DE-MBE (C-DE-MBE). Both methods perform a comparison search to prevent false start codons, which would produce a false coding region. Experimental results verified that our methods achieve higher embedding capacity than conventional methods, produce no false start codons, and recover the host sequence perfectly without a reference sequence. In particular, C-DE-MBE achieves more than twice the embedding capacity of DE-MBE.
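
The abstract above builds on difference expansion (DE). As a rough illustration only, the sketch below shows the generic integer-pair form of DE: one bit is embedded into a pair of neighboring integer values and later removed to restore the pair exactly. It is not the paper's DE-MBE or C-DE-MBE scheme; the nucleotide-to-decimal coding, multi-bit embedding, and false start codon check are omitted, and the function names `de_embed` and `de_extract` are hypothetical.

```python
def de_embed(x, y, bit):
    """Embed one bit into the integer pair (x, y) by difference expansion."""
    avg = (x + y) // 2          # integer average, preserved by the embedding
    diff = x - y                # difference that will carry the bit
    expanded = 2 * diff + bit   # double the difference and append the bit
    return avg + (expanded + 1) // 2, avg - expanded // 2


def de_extract(x_marked, y_marked):
    """Return (x, y, bit): the restored original pair and the embedded bit."""
    avg = (x_marked + y_marked) // 2
    expanded = x_marked - y_marked
    bit = expanded % 2          # Python's % gives 0 or 1 even for negative values
    diff = expanded // 2        # floor division undoes the expansion
    return avg + (diff + 1) // 2, avg - diff // 2, bit


marked = de_embed(5, 3, 1)
print(marked, de_extract(*marked))
```

Because the integer average is unchanged and the bit sits in the parity of the expanded difference, extraction needs no reference copy of the original pair. A practical scheme would also have to keep the marked values inside the valid code range, which this sketch does not check.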

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators that describe the attributes or values of products and enable users or manufacturers to measure and understand them. When companies analyze their products or compare them with competitors' products, appropriate criteria must be selected for objective evaluation. The criteria should reflect the product features that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumer opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract product features and topics and use them as evaluation criteria. However, these approaches still produce criteria that are irrelevant to the products because the extracted words, including improper ones, are not refined. To overcome this limitation, this research proposes the LDA-k-NN model, which extracts candidate criteria words from online reviews using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase consisting of six steps. First, review data are collected from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews using part-of-speech information from a morpheme analysis module. After preprocessing, LDA yields the words for each topic in the reviews, and only the nouns among the topic words are chosen as candidate criteria words. Then, the words are tagged according to whether they can serve as criteria for each middle-level category. Next, every tagged word is vectorized with a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with the tags.
After the preparation phase is set up, the criteria extraction phase is conducted on low-level categories. This phase starts by crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Candidate criteria words are extracted by taking nouns from the data and are vectorized with the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the candidate criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. The review data came from the 'Electronics/Digital' section, one of the high-level categories on 11st. For performance evaluation, the proposed model was compared with three others: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the proposed model and the three models above. The criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. Each model scored a point when a chosen word had been extracted by that model. The proposed model scored higher than the other models in 8 out of 10 low-level categories. Paired t-tests on the scores of each model confirmed that the proposed model performs better in 26 tests out of 30. In addition, the proposed model was the best model in terms of accuracy.
This research proposes an evaluation criteria extraction method that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
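
The refinement step described above, classifying vectorized candidate words by their nearest tagged neighbors, might look roughly like the following. This is a minimal sketch under stated assumptions: cosine similarity as the neighbor measure, majority voting, and the tag names and two-dimensional toy vectors are all hypothetical. The abstract does not specify these details, and real vectors would come from a pre-trained word embedding model.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_tag(query, tagged_vectors, k=3):
    """Majority tag among the k tagged vectors most similar to `query`."""
    ranked = sorted(tagged_vectors, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(tag for _, tag in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy "embeddings" with hypothetical tags.
tagged = [
    ((0.9, 0.1), "criterion"),
    ((0.8, 0.2), "criterion"),
    ((0.7, 0.4), "criterion"),
    ((0.1, 0.9), "not_criterion"),
    ((0.2, 0.8), "not_criterion"),
]
print(knn_tag((0.85, 0.15), tagged))
```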

Fault Diagnosis of Ball Bearing using Correlation Dimension (상관차원에 의한 볼베어링 고장진단)

  • 김진수;최연선
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2004.05a / pp.979-984 / 2004
  • A ball bearing with faults generally shows nonlinear vibration characteristics, so nonlinear diagnostic methods can be used for effective fault diagnosis of ball bearings. In this paper, correlation dimension analysis based on nonlinear time series was applied to diagnose ball bearing faults. The correlation dimension analysis reveals intrinsic information about the underlying dynamical system and clearly classifies the faults of the ball bearing.
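
For readers unfamiliar with the correlation dimension, the sketch below follows the common Grassberger-Procaccia recipe: reconstruct state vectors from the time series by delay embedding, compute the correlation sum C(r) as the fraction of vector pairs closer than r, and read the dimension off the slope of log C(r) against log r. The abstract does not say which estimator or parameters the authors used, so the max-norm distance, the two-radius slope, and all function names here are assumptions.

```python
import math
from itertools import combinations


def delay_embed(series, dim, delay=1):
    """Build state vectors (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    count = len(series) - (dim - 1) * delay
    return [tuple(series[i + j * delay] for j in range(dim)) for i in range(count)]


def correlation_sum(vectors, radius):
    """Fraction of distinct vector pairs closer than `radius` (max norm)."""
    pairs = list(combinations(vectors, 2))
    close = sum(1 for u, v in pairs if max(abs(a - b) for a, b in zip(u, v)) < radius)
    return close / len(pairs)


def correlation_dimension(vectors, r_small, r_large):
    """Two-point slope of log C(r) versus log r; both sums must be nonzero."""
    c_small = correlation_sum(vectors, r_small)
    c_large = correlation_sum(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (math.log(r_large) - math.log(r_small))


# Evenly spaced points on a line should give a dimension close to 1.
line = delay_embed([float(i) for i in range(100)], dim=1)
print(correlation_dimension(line, 5.5, 10.5))
```

In practice the slope is fitted over a range of radii and checked for saturation as the embedding dimension grows; the two-radius version keeps the example short.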

Memory-Efficient NBNN Image Classification

  • Lee, YoonSeok;Yoon, Sung-Eui
    • Journal of Computing Science and Engineering / v.11 no.1 / pp.1-8 / 2017
  • Naive Bayes nearest neighbor (NBNN) is a simple image classifier based on identifying nearest neighbors. NBNN uses original image descriptors (e.g., SIFTs) without vector quantization to preserve the discriminative power of the descriptors, and it generalizes well. However, it has a distinct disadvantage: its memory requirement can be prohibitively high when processing a large amount of data. To deal with this problem, we apply a spherical hashing binary code embedding technique to encode the data compactly without significantly losing classification accuracy. We also propose using an inverted index to identify nearest neighbors among the binarized image descriptors. To demonstrate the benefits of our method, we apply it to two existing NBNN techniques with an image dataset. Using a 64-bit code length, we reduce memory use by a factor of 16, with higher runtime performance and no significant loss of classification accuracy. This result is achieved by our compact encoding scheme, which does not lose much information from the original image descriptors.
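
To make the idea concrete, the sketch below applies the NBNN decision rule to binary codes: for each class, sum over the query descriptors the Hamming distance to the nearest code of that class, then choose the class with the smallest total. It assumes the descriptors have already been binarized and stored as Python integers; the spherical hashing step and the inverted index from the paper are not implemented, and a brute-force search stands in for them. The class labels and 8-bit codes are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def nbnn_classify(query_codes, class_codes):
    """Pick the class with the smallest summed nearest-neighbor Hamming distance."""
    totals = {
        label: sum(min(hamming(q, c) for c in codes) for q in query_codes)
        for label, codes in class_codes.items()
    }
    return min(totals, key=totals.get)


# Hypothetical 8-bit codes; the paper reports results with 64-bit codes.
class_codes = {
    "cat": [0b00001111, 0b00111100],
    "dog": [0b11110000, 0b11000011],
}
query = [0b00001110, 0b00111000]
print(nbnn_classify(query, class_codes))
```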

A Study in Relationship between Facial Expression and Action Unit (Manifold Learning을 통한 표정과 Action Unit 간의 상관성에 관한 연구)

  • Kim, Sunbin;Kim, Hyeoncheol
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.763-766 / 2018
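  • Facial expressions are a powerful nonverbal means of expressing emotion between people. Facial expression recognition is one of the most important fields in machine learning. Machine learning models used for facial expression recognition show human-level performance. Despite this good performance, however, these models cannot provide grounds or explanations for their recognition results. To use Facial Action Coding Units (AUs) as the basis for facial expression recognition, this study visualizes the features extracted by a Convolutional Neural Network (CNN) model trained for facial expression recognition on the CK+ Dataset using t-distributed stochastic neighbor embedding (t-SNE), and then examines the relationship between the distributions of the recognized expressions and the AUs.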
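
As background for how t-SNE places points in the visualization, the sketch below computes the low-dimensional similarities q_ij that t-SNE uses: a Student-t kernel (one degree of freedom) over pairwise squared distances, normalized over all pairs. This is only one ingredient of the algorithm; the high-dimensional similarities and the gradient-based optimization are not shown, and the function name and sample points are hypothetical.

```python
def student_t_similarities(points):
    """t-SNE low-dimensional similarities q_ij, normalized over all pairs i != j."""
    n = len(points)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                kernel[i][j] = 1.0 / (1.0 + sq_dist)  # heavy-tailed Student-t kernel
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


# Hypothetical 2-D map positions: the first two points are close, the third is far.
q = student_t_similarities([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
print(q[0][1] > q[0][2])
```

The heavy tail of this kernel lets moderately distant points sit far apart in the map, which is why clusters in t-SNE plots tend to separate visibly.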

A semi-automatic cell type annotation method for single-cell RNA sequencing dataset

  • Kim, Wan;Yoon, Sung Min;Kim, Sangsoo
    • Genomics & Informatics / v.18 no.3 / pp.26.1-26.6 / 2020
  • Single-cell RNA sequencing (scRNA-seq) has been widely applied to provide insights into cell-by-cell expression differences in a given bulk sample. Accordingly, numerous analysis methods have been developed. Because scRNA-seq involves simultaneous analysis of many cells and genes, the efficiency of the methods is crucial. The conventional cell type annotation method is laborious and subjective. Here we propose a semi-automatic method that calculates a normalized score for each cell type based on a user-supplied list of cell type-specific marker genes. The method was applied to a publicly available scRNA-seq dataset of the mouse cardiac non-myocyte cell pool. Annotating the 35 t-distributed stochastic neighbor embedding (t-SNE) clusters into 12 cell types was straightforward, and the accuracy of the annotation was evaluated by constructing a co-expression network for each cell type. Gene Ontology analysis was congruent with the annotated cell types, and the corollary regulatory network analysis identified upstream transcription factors that are well supported by evidence in the literature. The source code is available as an R script upon request.
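
The abstract does not give the scoring formula, so the following is only one plausible reading of "a normalized score for each cell type", sketched in Python rather than the authors' R. It averages a cluster's expression over each cell type's marker genes and divides by the sum across cell types so that scores are comparable. The function name, gene names, cell type names, and expression values are all hypothetical.

```python
def annotate_cluster(mean_expression, markers):
    """Return (best_cell_type, normalized_scores) for one cluster.

    mean_expression: gene name -> mean expression within the cluster
    markers: cell type -> list of marker gene names (user-supplied)
    """
    raw = {
        cell_type: sum(mean_expression.get(gene, 0.0) for gene in genes) / len(genes)
        for cell_type, genes in markers.items()
    }
    total = sum(raw.values())
    # Assumed normalization: each score as a share of the total across cell types.
    scores = {cell_type: (score / total if total else 0.0) for cell_type, score in raw.items()}
    return max(scores, key=scores.get), scores


# Hypothetical marker list and cluster profile.
markers = {"type_1": ["GeneA", "GeneB"], "type_2": ["GeneC", "GeneD"]}
cluster = {"GeneA": 8.0, "GeneB": 4.0, "GeneC": 1.0, "GeneD": 1.0}
best, scores = annotate_cluster(cluster, markers)
print(best, scores)
```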

A Block-Based Adaptive Data Hiding Approach Using Pixel Value Difference and LSB Substitution to Secure E-Governance Documents

  • Halder, Tanmoy;Karforma, Sunil;Mandal, Rupali
    • Journal of Information Processing Systems / v.15 no.2 / pp.261-270 / 2019
  • Steganography algorithms are applied to protect secret digital documents against vulnerabilities during communication. They protect a digital file from unauthorized access by hiding its entire content. Pixel-value difference, a spatial-domain steganography method, uses the difference between neighboring pixels for this purpose. The proposed approach is a block-wise embedding process in which blocks of variable size are chosen from the cover image and a stream of secret digital content is hidden in them. Least significant bit (LSB) substitution is applied as an adaptive mechanism, and the optimal pixel adjustment process (OPAP) is used to minimize the error rate. The proposed approach maintains good hiding capacity and achieves a better signal-to-noise ratio than existing methods. Any means of digital communication, especially e-Governance applications, could benefit greatly from this approach.
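
Two of the building blocks named above, k-bit LSB substitution and OPAP, can be sketched for a single 8-bit pixel as follows. This is a generic illustration, not the paper's block-based adaptive scheme: the pixel-value-difference step and the variable block selection are left out, and the function names are hypothetical. OPAP here means shifting the stego value by plus or minus 2^k when that moves it closer to the original pixel, which leaves the embedded low bits untouched.

```python
def lsb_embed_opap(pixel, bits, k):
    """Replace the k LSBs of an 8-bit pixel with `bits`, then apply OPAP."""
    step = 1 << k
    stego = (pixel & ~(step - 1)) | bits  # plain k-bit LSB substitution
    error = stego - pixel
    # OPAP: a shift of +/- 2^k keeps the k LSBs but can reduce the error.
    if error > step // 2 and stego >= step:
        stego -= step
    elif error < -(step // 2) and stego < 256 - step:
        stego += step
    return stego


def lsb_extract(stego, k):
    """Read back the k embedded bits."""
    return stego & ((1 << k) - 1)


# Pixel 8 with payload 0b111: plain substitution gives 15, OPAP moves it to 7.
print(lsb_embed_opap(8, 0b111, 3), lsb_extract(lsb_embed_opap(8, 0b111, 3), 3))
```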

Physiological Signal-Based Emotion Recognition in Conversations Using T-SNE (생체신호 기반의 T-SNE 를 활용한 대화 내 감정 인식 )

  • Subeen Leem;Byeongcheon Lee;Jihoon Moon
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.703-705 / 2023
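  • This study proposes a more accurate and more general recognition technique for emotion recognition using physiological signal data recorded during conversations. To this end, we first equalize the number of measurements across conversations of different lengths and then, to compare and analyze effective combinations of physiological signal data, use the dimensionality reduction technique T-SNE (T-distributed Stochastic Neighbor Embedding) to examine the distribution of emotion labels. In addition, we use AutoML (Automated Machine Learning) to classify emotions and to predict arousal and valence from the reduced data, thereby finding the combination of physiological signal data that recognizes emotions best.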

Super Resolution Algorithm using TV-G Decomposition (TV-G 분해를 이용한 초해상도 알고리즘)

  • Eum, Kyoung-Bae;Beom, Dong-Kyu
    • Journal of Digital Contents Society / v.18 no.8 / pp.1517-1522 / 2017
  • Among single-image super-resolution (SR) techniques, the total variation (TV) based SR approach seems the most successful in terms of edge preservation and freedom from artifacts. However, this approach achieves insufficient SR for the texture component. In this paper, we propose a new TV-G decomposition based SR method to solve this problem. We propose SVR-based up-sampling to obtain better edge preservation in the structure component. NNE uses a relaxed constraint to improve on NE, and we use an NNE-based learning method to improve the resolution of the texture component. Experimental results confirm, both quantitatively and qualitatively, that the proposed SR method improves on the conventional interpolation method, ScSR, TV, and NNE.

Detection and Classification of Demagnetization and Short-Circuited Turns in Permanent Magnet Synchronous Motors

  • Youn, Young-Woo;Hwang, Don-Ha;Song, Sung-ju;Kim, Yong-Hwa
    • Journal of Electrical Engineering and Technology / v.13 no.4 / pp.1614-1622 / 2018
  • Fault diagnosis in permanent magnet synchronous motors (PMSMs) has attracted considerable attention in recent years because various faults, such as permanent magnet demagnetization and short-circuited turns, can occur and cause unexpected failure of motor-related systems. Several conventional current and back electromotive force (BEMF) analysis techniques have been proposed to detect certain faults in PMSMs; however, they generally deal with a single fault only, whereas multiple faults are common in PMSMs. We propose a fault diagnosis method for PMSMs with single and multiple combined faults. Our method uses three-phase BEMF voltages processed with the fast Fourier transform (FFT), a support vector machine (SVM), and visualization tools to identify fault types and severities in PMSMs. Principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) are used to project the high-dimensional data into a two-dimensional space for visualization. Experimental results show good visualization performance and high classification accuracy in identifying fault types and severities for single and multiple faults in PMSMs.
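
The first stage of such a pipeline, turning a sampled voltage waveform into frequency-domain features, can be illustrated with a plain discrete Fourier transform. The sketch below is a naive O(n^2) DFT standing in for the FFT, applied to a single synthetic channel; the abstract does not describe the paper's actual feature set, sampling setup, or three-phase handling, and the function name and test signal are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum for bins 0..n/2, scaled by 1/n (naive DFT)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


# Synthetic single-channel signal: 5 cycles of a unit sine over 64 samples.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spectrum = dft_magnitudes(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)
```

Magnitudes like these, computed per phase, are the kind of feature vector that could then be passed to a classifier or to a projection method for visualization.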