A Statistical Model for Choosing the Best Translation of Prepositions. (통계 정보를 이용한 전치사 최적 번역어 결정 모델)

  • 심광섭
    • Language and Information / v.8 no.1 / pp.101-116 / 2004
  • This paper proposes a statistical model for translating prepositions in English-Korean machine translation. The model uses statistical information acquired from unlabeled Korean corpora to choose the best translation from several candidates. This information comprises functional word-verb co-occurrence, functional word-verb distance, and noun-postposition co-occurrence. The model was evaluated on 443 sentences, each containing a prepositional phrase, and attained 71.3% accuracy.
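
The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's model: the function name `choose_translation`, the add-one smoothing, the simple sum of log counts, and the romanized toy counts are all assumptions, and the distance information mentioned in the abstract is left out.

```python
import math
from collections import Counter


def choose_translation(candidates, noun, verb, noun_post_counts, post_verb_counts):
    """Return the candidate postposition with the highest smoothed log score.

    Hypothetical scoring: log(noun-postposition count + 1)
    plus log(postposition-verb count + 1).
    """
    def score(post):
        # Add-one smoothing keeps unseen pairs from producing log(0).
        noun_part = math.log(noun_post_counts[(noun, post)] + 1)
        verb_part = math.log(post_verb_counts[(post, verb)] + 1)
        return noun_part + verb_part

    return max(candidates, key=score)


# Made-up co-occurrence counts standing in for statistics from an unlabeled corpus.
noun_post_counts = Counter({("hakgyo", "eseo"): 40, ("hakgyo", "e"): 55})
post_verb_counts = Counter({("eseo", "gongbuhada"): 30, ("e", "gongbuhada"): 2})

best = choose_translation(
    ["e", "eseo"], "hakgyo", "gongbuhada", noun_post_counts, post_verb_counts
)
```

With these toy counts the verb evidence outweighs the noun evidence, so `best` is `"eseo"`.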

Analysis of L-asparaginase Related Adverse Reaction (L-asparaginase 약물 유해 반응 보고 분석)

  • Ko, Kyung Mi;La, Hyen O
    • Korean Journal of Clinical Pharmacy / v.27 no.3 / pp.143-149 / 2017
  • Background: L-asparaginase (L-ASP) is a critical agent for the treatment of acute lymphoblastic leukemia and lymphoma, but it is associated with serious toxicities including hypersensitivity, pancreatitis and thrombosis. Methods: To evaluate the toxicity of L-ASP in real clinical settings, we included patients with L-ASP adverse drug reactions (ADRs) reported to the regional pharmacovigilance center of Seoul St. Mary's Hospital from January 2014 to December 2015. Results: A total of 83 cases of L-ASP-related ADRs were reported in 54 patients. Of these 83 cases, 65 (78.3%, 65/83) were spontaneously reported and 18 (21.7%, 18/83) were detected by further review of medical records. Pediatric patients accounted for 83.3% of the patients with ADRs (45/54), and the median age was 9 years. The most common clinical manifestations of ADRs were hematologic (31.3%, 26/83), followed by hepatobiliary (18.1%, 15/83). Thirty-four serious ADRs were reported in 19 patients. The serious ADR group showed significantly longer hospitalization and a higher rate of L-ASP discontinuation than the non-serious ADR group (p = 0.005 and 0.03, respectively). The most common clinical manifestations of serious ADRs were hepatobiliary (41.2%, 14/34). In total, 8 cases (9.6%, 8/83) of unlabeled ADRs were identified, all of them serious. Conclusion: We identified unlabeled serious ADRs of L-ASP. Serious ADRs were also correlated with both length of hospitalization and discontinuation rate. Further investigation and better-developed spontaneous ADR reporting systems are needed to evaluate these correlations.

Automatic Text Categorization based on Semi-Supervised Learning (준지도 학습 기반의 자동 문서 범주화)

  • Ko, Young-Joong;Seo, Jung-Yun
    • Journal of KIISE:Software and Applications / v.35 no.5 / pp.325-334 / 2008
  • The goal of text categorization is to classify documents into a fixed number of pre-defined categories. Previous studies in this area have used a large number of labeled training documents for supervised learning. One problem is that labeled training documents are difficult to create: unlabeled documents are easy to collect, but categorizing them manually is not. In this paper, we propose a new text categorization method based on semi-supervised learning. The proposed method uses only unlabeled documents and keywords for each category, and automatically constructs training data from them. A text classifier is then trained on these data and used to classify text documents. The proposed method achieves performance similar to that of traditional supervised learning methods. This method can therefore be used in areas where low-cost text categorization is needed. It can also be used to create labeled training documents.
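
The idea of constructing training data from unlabeled documents and category keywords can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the function name `build_training_data`, the whitespace tokenization, and the rule of keeping a document only when one category has strictly the most keyword hits are all hypothetical.

```python
def build_training_data(documents, category_keywords):
    """Assign a pseudo-label to each document that clearly matches one category.

    Hypothetical rule: count keyword occurrences per category and keep the
    document only if some keyword matched and the top category is not tied.
    """
    labeled = []
    for doc in documents:
        # Naive tokenization; punctuation is not stripped in this sketch.
        tokens = doc.lower().split()
        hits = {
            category: sum(tokens.count(keyword) for keyword in keywords)
            for category, keywords in category_keywords.items()
        }
        best = max(hits, key=hits.get)
        ranked = sorted(hits.values(), reverse=True)
        unambiguous = len(ranked) == 1 or ranked[0] > ranked[1]
        if ranked[0] > 0 and unambiguous:
            labeled.append((doc, best))
    return labeled


# Made-up keywords and documents for illustration only.
category_keywords = {"sports": ["team", "match"], "finance": ["bank", "loan"]}
documents = [
    "The team won the match",
    "Bank approved the loan",
    "Weather is nice",
    "The team visited the bank",
]
training_data = build_training_data(documents, category_keywords)
```

The third document matches no keyword and the fourth is tied between categories, so both are left out; the remaining pairs could then be passed to any supervised text classifier.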

A Fusion Method of Co-training and Label Propagation for Prediction of Bank Telemarketing (은행 텔레마케팅 예측을 위한 레이블 전파와 협동 학습의 결합 방법)

  • Kim, Aleum;Cho, Sung-Bae
    • Journal of KIISE / v.44 no.7 / pp.686-691 / 2017
  • Telemarketing has become central to marketing in the information society. Recently, machine learning has been applied in many areas, especially financial prediction. Financial data are largely unlabeled, and it is difficult for humans to label them. In this paper, we propose a fusion method of semi-supervised learning that automatically labels unlabeled data to predict telemarketing outcomes. Specifically, we integrate the labeling results of label propagation and co-training with a decision tree. Data with low reliability are removed, and data that receive consistent labels from the two labeling methods are extracted. After these are added to the training set, a decision tree is trained on all of the data. To confirm the usefulness of the proposed method, we conduct experiments with a real telemarketing dataset from a Portuguese bank. The accuracy of the proposed method is 83.39%, which is 1.82% higher than that of the conventional method, and its precision is 19.37%, which is 2.67% higher than that of the conventional method. A t-test confirms that the proposed method performs better.
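
The filtering step in this abstract (drop low-reliability samples, keep samples on which both labeling methods agree) can be sketched as follows. This is a minimal illustration: the function name `fuse_pseudo_labels`, the `(label, confidence)` input format, and the 0.8 threshold are assumptions, and the label propagation and co-training models themselves are not shown.

```python
def fuse_pseudo_labels(lp_preds, ct_preds, min_conf=0.8):
    """Keep samples where both labelers agree and both are confident.

    lp_preds / ct_preds map a sample id to a (label, confidence) pair from
    label propagation and co-training respectively (hypothetical format).
    """
    kept = {}
    for sample_id, (lp_label, lp_conf) in lp_preds.items():
        if sample_id not in ct_preds:
            continue
        ct_label, ct_conf = ct_preds[sample_id]
        # Both conditions must hold: same label, and neither labeler unsure.
        if lp_label == ct_label and min(lp_conf, ct_conf) >= min_conf:
            kept[sample_id] = lp_label
    return kept


# Made-up predictions for four unlabeled samples.
lp_preds = {"a": ("yes", 0.95), "b": ("no", 0.90), "c": ("yes", 0.60), "d": ("no", 0.99)}
ct_preds = {"a": ("yes", 0.85), "b": ("yes", 0.90), "c": ("yes", 0.90)}

fused = fuse_pseudo_labels(lp_preds, ct_preds)
```

Only sample `"a"` survives: `"b"` has conflicting labels, `"c"` has a low-confidence vote, and `"d"` has no co-training prediction. The surviving samples would then be appended to the labeled training set before fitting the final classifier.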

The transfer of diacylglycerol from lipophorin to fat body in larval Manduca sexta (유충 Manduca sexta 리포포린에 의한 지방체로의 디아실글리세리드 운반)

  • Yun, Hwa-Kyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.4 / pp.1770-1774 / 2011
  • This paper characterizes the transfer of diacylglycerol (DAG) from lipophorin to Manduca sexta larval fat bodies. $[^3H]$-DAG-labeled Lp ($[^3H]$-DAG-Lp) was incubated with the larval fat bodies for different lengths of time, and the time course of DAG transfer was determined. Incubation of fat bodies with $[^3H]$-DAG-Lp resulted in accumulation of DAG and TAG in the tissue. The transfer of $[^3H]$-DAG was inhibited in the presence of suramin and unlabeled lipophorin, which is consistent with the presence of a lipophorin receptor. The effects of suramin may be complex, because it can change membrane properties when bound to the lipophorin receptor and thereby affect the rate of DAG transfer. To investigate lipid uptake via receptor-mediated endocytosis, we treated the fat bodies with the endocytosis inhibitors ammonium chloride and chloroquine. The results show that lipid transfer between lipophorin and fat bodies occurs by receptor-mediated endocytosis.

Auto-tagging Method for Unlabeled Item Images with Hypernetworks for Article-related Item Recommender Systems (잡지기사 관련 상품 연계 추천 서비스를 위한 하이퍼네트워크 기반의 상품이미지 자동 태깅 기법)

  • Ha, Jung-Woo;Kim, Byoung-Hee;Lee, Ba-Do;Zhang, Byoung-Tak
    • Journal of KIISE:Computing Practices and Letters / v.16 no.10 / pp.1010-1014 / 2010
  • An article-related product recommender system is an emerging e-commerce service that recommends items based on contextual associations between items and articles. Current services recommend items based on the similarity between the tags of articles and items, an approach that suffers from both the high cost of manual tagging and low recommendation accuracy. As a component of a novel article-related item recommender system, we propose a new method for tagging item images based on pre-defined categories. We suggest a hypernetwork-based algorithm for learning associations between images, which are represented by visual words, and product categories. The learned hypernetworks are used to assign multiple tags to unlabeled item images. We demonstrate our method on a product set from a real-world online shopping mall comprising 1,251 product images in 10 categories. Experimental results show that the proposed method has competitive tagging performance compared with other classifiers, and that the proposed multi-tagging method based on hypernetworks improves tagging accuracy.

Performance Improvement of Mean-Teacher Models in Audio Event Detection Using Derivative Features (차분 특징을 이용한 평균-교사 모델의 음향 이벤트 검출 성능 향상)

  • Kwak, Jin-Yeol;Chung, Yong-Joo
    • The Journal of the Korea institute of electronic communication sciences / v.16 no.3 / pp.401-406 / 2021
  • Recently, mean-teacher models based on convolutional recurrent neural networks (CRNNs) have become popular in audio event detection. The mean-teacher model is an architecture consisting of two parallel CRNNs, which can be trained effectively on weakly labeled and unlabeled audio data by applying a consistency learning metric to the outputs of the two networks. In this study, we tried to improve the performance of the mean-teacher model by using additional derivative features of the log-mel spectrum. In audio event detection experiments using the training and test data from Task 4 of the DCASE 2018/2019 Challenges, the mean-teacher model with the proposed derivative features achieved up to an 8.1% relative decrease in error rate (ER).
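
Two ingredients mentioned in this abstract can be sketched in a few lines: derivative features appended to each spectral frame, and the weight averaging that gives the mean-teacher model its name. This is a minimal illustration under stated assumptions: the function names, the plain first-order difference (the paper may use a different derivative estimate), and the smoothing factor `alpha` are hypothetical, and the CRNNs and the consistency loss are not shown.

```python
def delta(frames):
    """First-order time difference of a non-empty list of feature frames.

    The first frame has no predecessor, so its delta is set to zeros.
    """
    deltas = [[0.0] * len(frames[0])]
    for previous, current in zip(frames, frames[1:]):
        deltas.append([c - p for p, c in zip(previous, current)])
    return deltas


def append_delta(frames):
    """Concatenate each frame with its delta, doubling the feature dimension."""
    return [frame + d for frame, d in zip(frames, delta(frames))]


def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Move teacher weights toward student weights by exponential averaging."""
    return [
        alpha * t + (1.0 - alpha) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


# Toy input: two frames of a two-band log-mel spectrum (made-up values).
frames = [[1.0, 2.0], [3.0, 5.0]]
features = append_delta(frames)
teacher = ema_update([1.0], [0.0], alpha=0.5)
```

In a training loop, `ema_update` would be called after every student optimization step, with `alpha` close to 1 so that the teacher changes slowly.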

Sea Ice Type Classification with Optical Remote Sensing Data (광학영상에서의 해빙종류 분류 연구)

  • Chi, Junhwa;Kim, Hyun-cheol
    • Korean Journal of Remote Sensing / v.34 no.6_2 / pp.1239-1249 / 2018
  • Optical remote sensing sensors provide images that are visually more familiar than radar images. However, it is difficult to discriminate sea ice types in optical images using machine learning algorithms based on spectral information. This study addresses two topics. First, we propose semantic segmentation, a state-of-the-art deep learning technique, to identify ice types by learning hierarchical and spatial features of sea ice. Second, we propose a new approach that combines semi-supervised and active learning to obtain accurate and meaningful labels from unlabeled or unseen images, improving the performance of supervised classification across multiple images. With this approach, we successfully added new labels from unlabeled data and automatically updated the semantic segmentation model. It should be noted that an operational system for generating ice type products from optical remote sensing data may be possible in the near future.
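
One common way to combine semi-supervised and active learning is to accept confident model predictions as pseudo-labels and send the least confident samples to a human annotator. The sketch below illustrates that general pattern only; the abstract does not describe the paper's actual procedure, and the function name `route_unlabeled` and both thresholds are assumptions.

```python
def route_unlabeled(probabilities, accept=0.9, query=0.6):
    """Split unlabeled samples by the confidence of the current model.

    probabilities maps a sample id to a list of class probabilities.
    Returns (pseudo_labels, ids_to_query); samples whose top probability
    falls between the two thresholds are left untouched.
    """
    pseudo_labels = {}
    ids_to_query = []
    for sample_id, probs in probabilities.items():
        top_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[top_class] >= accept:
            # Semi-supervised branch: trust the model's own prediction.
            pseudo_labels[sample_id] = top_class
        elif probs[top_class] < query:
            # Active-learning branch: ask an annotator for the true label.
            ids_to_query.append(sample_id)
    return pseudo_labels, ids_to_query


# Made-up class probabilities for three image patches.
probabilities = {0: [0.95, 0.05], 1: [0.55, 0.45], 2: [0.70, 0.30]}
pseudo_labels, ids_to_query = route_unlabeled(probabilities)
```

Here patch 0 is pseudo-labeled as class 0, patch 1 is queued for annotation, and patch 2 waits for a later round after the model has been retrained.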

Semi-Supervised Learning to Predict Default Risk for P2P Lending (준지도학습 기반의 P2P 대출 부도 위험 예측에 대한 연구)

  • Kim, Hyun-jung
    • Journal of Digital Convergence / v.20 no.4 / pp.185-192 / 2022
  • This study investigates the effect of the semi-supervised learning (SSL) method on predicting the default risk of peer-to-peer (P2P) loans. Despite its proven performance, the supervised learning (SL) method requires labeled data, which can take considerable effort and resources to collect. With the rapid growth of P2P platforms, the number of loans issued annually that have no clear final resolution is continuously increasing, leading to an abundance of unlabeled data. An SSL model is therefore needed to predict default risk using information not only from labeled loans (fully paid or defaulted) but also from unlabeled loans. The P2P loan data used in this study were collected from the LendingClub platform. The results showed that, despite using only a small number of labeled data, the SSL method achieved much better default risk prediction performance than the SL method trained on a much larger set of labeled data.

Unsupervised feature learning for classification

  • Abdullaev, Mamur;Alikhanov, Jumabek;Ko, Seunghyun;Jo, Geun Sik
    • Proceedings of the Korean Society of Computer Information Conference / 2016.07a / pp.51-54 / 2016
  • In computer vision, especially in image processing, it has become popular to apply deep convolutional networks for supervised learning. Convolutional networks have shown state-of-the-art results in classification, object recognition, detection, and semantic segmentation. However, supervised learning has two major disadvantages: it requires a huge amount of labeled data to reach high accuracy, and training on so much data takes a long time. Unsupervised learning, on the other hand, can handle these problems more cheaply. In this paper, we show an efficient way to learn features for classification in an unsupervised manner. The network is trained layer-wise with backpropagation and learns features from unlabeled data. Our approach shows better results on the Caltech-256 and STL-10 datasets.