• Title/Summary/Keyword: t-SNE

A Study on the Visualization of an Airline's Fleet State Variation (항공사 기단의 상태변화 시각화에 관한 연구)

  • Lee, Yonghwa;Lee, Juhwan;Lee, Keumjin
    • Journal of the Korean Society for Aviation and Aeronautics / v.29 no.2 / pp.84-93 / 2021
  • The airline schedule is the most basic data for flight operations and is of significant importance to an airline's management. Knowing the current schedule status is crucial for managing the company effectively and for being prepared for abnormal situations. In this study, machine learning techniques were applied to actual schedule data to examine whether the airline's fleet state could be learned without prior information. Because the schedule is categorical, One Hot Encoding was applied, and t-SNE was used to reduce the dimension of the data and visualize it to gain insight into the airline's overall fleet status. The experiments produced interesting results, and the initial findings are expected to contribute to airline schedule health monitoring, anomaly detection, and disruption management.
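The one-hot encoding step mentioned in this abstract can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' code: the record fields (aircraft type, origin, destination) and their values are hypothetical, and the paper's actual schedule schema is not given here.

```python
# Hypothetical schedule records: (aircraft type, origin, destination).
# The fields and values are illustrative assumptions, not data from the paper.
records = [
    ("A320", "ICN", "NRT"),
    ("B737", "GMP", "CJU"),
    ("A320", "GMP", "CJU"),
]

def one_hot_encode(rows):
    """Return (matrix, columns) with one 0/1 column per (field index, category)."""
    n_fields = len(rows[0])
    # Sorted categories per field give a stable column order.
    categories = [sorted({row[i] for row in rows}) for i in range(n_fields)]
    columns = [(i, c) for i, cats in enumerate(categories) for c in cats]
    index = {col: j for j, col in enumerate(columns)}
    matrix = []
    for row in rows:
        vec = [0] * len(columns)
        for i, value in enumerate(row):
            vec[index[(i, value)]] = 1
        matrix.append(vec)
    return matrix, columns

matrix, columns = one_hot_encode(records)
for vec in matrix:
    print(vec)
```

Each row then contains exactly one 1 per original field. A matrix of this kind could be passed to a t-SNE implementation (for example scikit-learn's `sklearn.manifold.TSNE`) to obtain a two-dimensional embedding for plotting.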

Comparison of Homograph Meaning Representation according to BERT's layers (BERT 레이어에 따른 동형이의어 의미 표현 비교)

  • Kang, Il Min;Choi, Yong-Seok;Lee, Kong Joo
    • Annual Conference on Human and Language Technology / 2019.10a / pp.161-164 / 2019
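  • This paper uses the BERT model to examine differences in the word representations of homographs. BERT uses the encoder part of the Transformer model and can achieve bidirectional word prediction and sentence-level understanding. In the experiments, the embeddings of homograph words were clustered and the clusters were scored with Purity and NMI. In addition, the cosine distance between word embeddings was computed, and the changes across layers were visualized with t-SNE. The clustering scores were highest in the model's middle layers, and the cosine distance increased up to layer 8 and then changed sharply at layer 11.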
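The evaluation measures named in this abstract (Purity, NMI, and cosine distance) can be sketched in standard-library Python as below. This is an illustrative sketch, not the authors' implementation: the sense labels and cluster assignments are made up, and normalizing mutual information by the arithmetic mean of the two entropies is an assumption, since the abstract does not say which NMI variant was used.

```python
from collections import Counter
from math import log, sqrt

def purity(labels_true, labels_pred):
    """Fraction of points that belong to the majority class of their cluster."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(m).most_common(1)[0][1] for m in clusters.values())
    return majority / len(labels_true)

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(labels_true, labels_pred):
    """Mutual information divided by the mean of the two label entropies (assumed variant)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_t, count_p = Counter(labels_true), Counter(labels_pred)
    mi = sum((c / n) * log(c * n / (count_t[t] * count_p[p]))
             for (t, p), c in joint.items())
    denom = (entropy(labels_true) + entropy(labels_pred)) / 2
    return mi / denom if denom > 0 else 1.0

def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1 - dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Hypothetical sense labels for one homograph and two candidate clusterings.
senses = ["sense_a", "sense_a", "sense_b", "sense_b"]
clusters_good = [0, 0, 1, 1]
clusters_mixed = [0, 0, 0, 1]
print(purity(senses, clusters_good), purity(senses, clusters_mixed))
print(nmi(senses, clusters_good), nmi(senses, clusters_mixed))
```

Computing these scores once per BERT layer, using that layer's embeddings for clustering, would give the kind of layer-by-layer comparison the abstract describes.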

Detection and Classification of Demagnetization and Short-Circuited Turns in Permanent Magnet Synchronous Motors

  • Youn, Young-Woo;Hwang, Don-Ha;Song, Sung-ju;Kim, Yong-Hwa
    • Journal of Electrical Engineering and Technology / v.13 no.4 / pp.1614-1622 / 2018
  • Fault diagnosis in permanent magnet synchronous motors (PMSMs) has attracted considerable attention in recent years because faults such as permanent magnet demagnetization and short-circuited turns can occur and cause unexpected failure of motor-related systems. Several conventional current and back electromotive force (BEMF) analysis techniques have been proposed to detect certain faults in PMSMs; however, they generally deal with a single fault only, whereas multiple faults are common in PMSMs. We propose a fault diagnosis method for PMSMs with single and multiple combined faults. Our method applies the fast Fourier transform (FFT) to three-phase BEMF voltages and uses a support vector machine (SVM) and visualization tools to identify fault types and severities in PMSMs. Principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) are used to project the high-dimensional data into two-dimensional space for visualization. Experimental results show good visualization performance and high classification accuracy in identifying fault types and severities for single and multiple faults in PMSMs.
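The frequency-domain feature step described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: a naive O(n²) discrete Fourier transform from the standard library stands in for an FFT, the three-phase signals are synthetic, and concatenating per-phase magnitude spectra into one feature vector is a hypothetical choice.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive O(n^2) DFT; returns magnitudes for bins 0..n//2, scaled by 1/n."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags

# Synthetic three-phase voltages (made-up values): a fundamental at bin 4,
# with the phases shifted by 120 degrees from each other.
n = 64
phases = []
for shift in (0.0, 2 * math.pi / 3, 4 * math.pi / 3):
    phases.append([math.sin(2 * math.pi * 4 * t / n - shift) for t in range(n)])

# Add a small harmonic at bin 12 to phase A only, mimicking a distortion.
phases[0] = [v + 0.2 * math.sin(2 * math.pi * 12 * t / n)
             for t, v in enumerate(phases[0])]

# Concatenate the per-phase spectra into one feature vector (hypothetical layout).
features = [m for phase in phases for m in dft_magnitudes(phase)]

spectrum_a = dft_magnitudes(phases[0])
peak_bin = max(range(len(spectrum_a)), key=spectrum_a.__getitem__)
print(peak_bin, len(features))
```

Feature vectors of this kind, one per measurement, are what a classifier such as an SVM would be trained on and what PCA or t-SNE would project to two dimensions.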

Development of big data based Skin Care Information System SCIS for skin condition diagnosis and management

  • Kim, Hyung-Hoon;Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.137-147 / 2022
  • Diagnosing and managing skin condition is a basic and important function for workers in the beauty and cosmetics industries. Accurate diagnosis and management require an understanding of customers' skin condition and needs. In this paper, we developed SCIS, a big data-based skin care information system that supports skin condition diagnosis and management using social media big data. The system makes it possible to analyze text information and extract the core information needed for skin condition diagnosis and management. SCIS consists of a big data collection stage, a text preprocessing stage, an image preprocessing stage, and a text word analysis stage. SCIS collected the big data needed for skin diagnosis and management and extracted key words and topics from text through simple frequency analysis, relative frequency analysis, co-occurrence analysis, and correlation analysis of key words. In addition, by analyzing the extracted key words and information and applying visualization methods such as scatter plots, NetworkX, t-SNE, and clustering, the system can be used efficiently in diagnosing and managing skin conditions.

A Study on Regional Differences in Healthcare in Korea: Using Position Value for Relative Comparison Index (한국 지역 간 보건의료수준의 상대적 위치 비교 연구: Position Value for Relative Comparison Index를 활용하여)

  • Youn, Hin-Moi;Yun, Choa;Kang, Soo Hyun;Kwon, Junhyun;Lee, Hyeon Ji;Park, Eun-Cheol;Jang, Sung-In
    • Health Policy and Management / v.31 no.4 / pp.491-507 / 2021
  • Background: This study aims to measure regional healthcare differences in Korea and to define relatively underserved areas. Methods: We employed the position value for relative comparison index (PARC) to measure the healthcare status of 250 areas using 137 indicators in the following five domains: healthcare demand, supply, accessibility, service utilization, and outcome. We performed a sensitivity analysis using t-SNE (t-distributed stochastic neighbor embedding). Results: Based on PARC values, 83 areas were defined as relatively underserved, 49 of which were categorized as moderate and 34 as severe. The provincial regions with the most underserved areas were Gyeongbuk (16 areas), Gangwon (13), Jeonnam (13), and Gyeongnam (12). Conclusion: This study suggests a relative comparison approach to defining relatively underserved areas in healthcare. Further studies incorporating various perspectives and methods are required for policy implications.

Effect of Overshooting on Final Masses of Type Ibc Supernova Progenitors

  • Chun, Wonseok;Yoon, Sung-Chul
    • The Bulletin of The Korean Astronomical Society / v.39 no.2 / pp.88.1-88.1 / 2014
  • The helium mass in the envelope is one of the most important properties of the progenitors of type Ib/c supernovae (SNe Ib/c), since SN Ib/c progenitors are distinguished by the presence of He I lines. However, previous progenitor models do not reproduce the required He mass limit ($M_{He}$ < $0.14M_{\odot}$) suggested by a spectroscopic analysis of SN Ib/c. In this work, we investigated the effect of overshooting on the evolution of pure helium stars, focusing on the final He mass in the envelope, $M_{He,f}$. We used the MESA code to calculate single helium star models with initial masses of $M_{init}=5{\sim}30M_{\odot}$, Z=0.02, 0.04 and overshooting parameters of $f_{ov}=0{\sim}0.4$. The final He mass $M_{He,f}$ decreases as $f_{ov}$ increases, because the burning core is larger than in weak-overshooting models. The dependence of the final mass $M_{He,f}$ on overshooting is strongest for models with $M_{init}=7{\sim}10M_{\odot}$, and this effect originates from accelerated mass loss during the transition between the WNE and WC/O phases. However, $M_{He,f}$ exceeds $0.27M_{\odot}$ for all models, which still does not meet the criterion of $M_{He}$ < $0.14M_{\odot}$. This implies that mass loss during the post-helium-burning phase must be enhanced dramatically compared to what the standard models predict.

Discriminative Manifold Learning Network using Adversarial Examples for Image Classification

  • Zhang, Yuan;Shi, Biming
    • Journal of Electrical Engineering and Technology / v.13 no.5 / pp.2099-2106 / 2018
  • This study presents a novel approach that learns discriminative feature vectors through manifold learning, using a nonlinear dimension reduction (DR) technique to improve the loss function and adversarial examples to regularize the objective function for image classification. Traditional convolutional neural networks (CNNs) with many recent regularization approaches have been used successfully for image classification and achieve good results, but at a high cost in computation space and time. Distinct from traditional CNNs, we discriminate the feature vectors of objects without empirically tuned parameters. These discriminative features are intended to preserve, in the lower-dimensional space, the relationships of the corresponding high-dimensional manifold after the image feature vectors are projected from high to low dimension. We optimize the constraints for preserving local features on the manifold, which pull the mapped features of the same class together and push different classes apart. Using adversarial examples, the improved loss function with an additional regularization term is intended to boost the robustness and generalization of the neural network. Experimental results indicate that the approach based on discriminative manifold-learning features is not only valid but also more efficient in image classification tasks. Furthermore, the proposed approach achieves competitive classification performance on three benchmark datasets: MNIST, CIFAR-10, and SVHN.

Vibration-based structural health monitoring using CAE-aided unsupervised deep learning

  • Minte, Zhang;Tong, Guo;Ruizhao, Zhu;Yueran, Zong;Zhihong, Pan
    • Smart Structures and Systems / v.30 no.6 / pp.557-569 / 2022
  • Vibration-based structural health monitoring (SHM) is crucial for the dynamic maintenance of civil building structures to protect property and the lives of the public. Analyzing these vibrations with modern artificial intelligence and deep learning (DL) methods is a new trend. This paper proposes an unsupervised deep learning method based on a convolutional autoencoder (CAE), which can overcome the limitations of conventional supervised deep learning. With convolutional kernels applied in the DL network, the method can extract features self-adaptively and efficiently. The effectiveness of the method in detecting damage is then tested using a benchmark model. Thereafter, the method is used to detect damage and sudden disaster events in a rubber bearing-isolated gymnasium structure. The results indicate that the method enables the CAE network to learn the intact vibrations and thereby distinguish between different damage states of the benchmark model, and the outcome is consistent with the high-dimensional data distribution characteristics visualized by the t-SNE method. In addition, the CAE-based network trained with daily vibrations of the isolating layer in the gymnasium can precisely reconstruct newly collected vibrations and detect the occurrence of ground motion. The proposed method is effective at identifying nonlinear variations in the dynamic responses and has the potential to be used for structural condition assessment and safety warning.

Intensive Monitoring Survey of Nearby Galaxies (IMSNG) : Constraints on the progenitor system of a normal Type Ia SN 2019ein from its light curve at the early phase

  • Lim, Gu;Im, Myungshin;Kim, Dohyeong;Paek, Gregory S.H;Choi, Changsu;Kim, Sophia;Hwang, Sungyong
    • The Bulletin of The Korean Astronomical Society / v.46 no.1 / pp.55.2-56 / 2021
  • The progenitor of Type Ia supernovae (SNe Ia) is mainly believed to be a close binary system of a carbon-oxygen white dwarf (CO WD) and either a non-degenerate companion (single degenerate) or another WD (double degenerate). However, it is unclear which system is more prevalent. Here, we present a high-cadence optical/near-IR light curve of the normal but slightly faint Type Ia SN 2019ein from the IMSNG project. We fit the early light curve (t < +8.3 days from the first detection) with various models to search for shock-heated cooling emission from the interaction between the SN ejecta and the companion. No significant shock-heated cooling emission is found, from which we constrain the size of the progenitor companion as follows. The upper limit ($R_{upper,*}$) of the companion size in the R-band is $\sim0.2R_{\odot}$ when forcing the first light time ($t_{fl}$) to have one value and $\sim0.9R_{\odot}$ when using the mean value of $t_{fl}$ from the fitting in each band. Assuming the I-band light curve is powered almost entirely by radioactive decay, we obtained $R_{upper,*}\sim1.2R_{\odot}$. The early B-V color curve is in agreement with the model color curve of a $2M_{\odot}$ main-sequence companion. These results allow us to at least rule out large stars such as red giants as the companion star in the binary progenitor system of this supernova. The B-R and V-R colors do not show any significant sign of a red bump, which indicates a thin helium shell ($M_{He}$ < $0.1M_{\odot}$) for the sub-$M_{ch}$ WD (double detonation model). In addition, we estimated the distance to NGC 5353 as 37.098±0.028 Mpc.

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.155-174 / 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) were published. The rapid increase in the number of COVID-19 papers places time and technical constraints on healthcare professionals and policy makers who need to find important research quickly. In this study, we therefore propose a method of extracting useful information from the text of extensive literature using LDA and the Word2vec algorithm. Papers related to the search keywords were extracted from the COVID-19 papers, and detailed topics were identified. The data came from the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic and updated weekly. The research method has two main parts. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers with full text. For this purpose, the number of COVID-19 publications by year was analyzed through exploratory data analysis in Python, and the top 10 journals with active research were identified. LDA and the Word2vec algorithm were used to derive research topics related to COVID-19, and after related words were analyzed, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, yielding 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment'. For each collected set of papers, detailed topics were analyzed using the LDA and Word2vec algorithms, and a clustering method based on PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy result of this study is that topics that did not emerge from topic modeling over all COVID-19 papers did emerge from the topic modeling results for each research topic. For example, topic modeling of the papers related to 'vaccine' extracted a new topic, Topic 05 'neutralizing antibodies'. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and it is said to play an important role in the production of therapeutic agents and in vaccine development. Likewise, extracting topics from the papers related to 'treatment' revealed a new topic, Topic 05 'cytokine'. A cytokine storm occurs when the body's immune cells attack normal cells instead of defending against an attack. Hidden topics that could not be found across the entire corpus were uncovered by classifying papers by keyword and then performing topic modeling to find detailed topics. In this study, we proposed a method of extracting topics from a large body of literature using the LDA algorithm and of extracting similar words using Skip-gram, the Word2vec model that predicts surrounding words from the center word. Combining the LDA and Word2vec models was intended to improve performance by capturing both the document-topic relationships from LDA and the relationships captured by Word2vec. In addition, as a clustering method based on PCA dimension reduction, we presented a way to classify documents intuitively by using the t-SNE technique to group documents with similar themes into a structured organization. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of related academic papers, we hope this approach will save the precious time and effort of healthcare professionals and policy makers and help them gain new insights rapidly. It is also expected to serve as basic data for researchers exploring new research directions.
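The Skip-gram idea mentioned in this abstract, predicting surrounding words from a center word, can be illustrated by generating the (center, context) training pairs that such a model learns from. This is a minimal standard-library sketch, not the authors' code: the function name `skipgram_pairs`, the window size, and the token list are all hypothetical.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# Made-up token sequence, loosely echoing the 'vaccine' topic in the abstract.
tokens = ["neutralizing", "antibodies", "protect", "cells"]
pairs = skipgram_pairs(tokens, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A Word2vec implementation trains word vectors so that each center word predicts its context words well. Words that appear in similar contexts then end up with similar vectors, which is what makes the similarity measurement described above possible.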