• Title/Summary/Keyword: POS

Detecting Errors in POS-Tagged Corpus on XGBoost and Cross Validation (XGBoost와 교차검증을 이용한 품사부착말뭉치에서의 오류 탐지)

  • Choi, Min-Seok;Kim, Chang-Hyun;Park, Ho-Min;Cheon, Min-Ah;Yoon, Ho;Namgoong, Young;Kim, Jae-Kyun;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.9 no.7 / pp.221-228 / 2020
  • A part-of-speech (POS) tagged corpus is a collection of electronic text in which each word is annotated with its POS tag, and it is widely used as training data for natural language processing. Training data is generally assumed to be error-free, but in reality it contains various types of errors, which degrade the performance of systems trained on it. To alleviate this problem, we propose a novel method for detecting errors in an existing POS-tagged corpus using an XGBoost classifier and cross-validation. We first train a POS-tagging classifier on the POS-tagged corpus, which contains some errors, and then apply it to the same corpus under cross-validation. The classifier cannot detect errors directly, because there is no training data for POS-tagging errors. We therefore detect errors by comparing the classifier's outputs (POS probabilities) while adjusting hyperparameters. The hyperparameters are estimated on a small error-tagged corpus, whose text is sampled from the POS-tagged corpus and whose POS errors are marked up by experts. We use recall and precision, which are widely used in information retrieval, as evaluation metrics. Because not all detected errors can be checked, we show that the proposed method is valid by comparing the distributions of the sample (the error-tagged corpus) and the population (the POS-tagged corpus). In the near future, we will apply the proposed method to a dependency tree-tagged corpus and a semantic role-tagged corpus.
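
The cross-validated detection idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: a simple per-word tag-frequency model stands in for the XGBoost classifier, the interleaved fold split, the toy corpus, the threshold value, and the function names (train, detect_errors, precision_recall) are all assumptions made for illustration. A token is flagged when a model trained on the other folds assigns its annotated tag a probability below the threshold.

```python
from collections import Counter, defaultdict

def train(tokens):
    """Stand-in classifier (NOT XGBoost): count how often each word carries each tag."""
    counts = defaultdict(Counter)
    for word, tag in tokens:
        counts[word][tag] += 1
    return counts

def detect_errors(corpus, k, threshold):
    """Return indices of tokens whose annotated tag looks unlikely under cross-validation."""
    flagged = []
    for fold in range(k):
        # Train on every token outside the held-out fold (interleaved split, an assumption).
        model = train(t for i, t in enumerate(corpus) if i % k != fold)
        for i, (word, tag) in enumerate(corpus):
            if i % k != fold:
                continue
            tag_counts = model.get(word, Counter())
            total = sum(tag_counts.values())
            if total == 0:
                continue  # unseen word: the stand-in model has no opinion
            if tag_counts[tag] / total < threshold:
                flagged.append(i)
    return sorted(flagged)

def precision_recall(flagged, gold_errors):
    """Precision and recall of flagged indices against expert-marked error indices."""
    flagged, gold = set(flagged), set(gold_errors)
    true_positives = len(flagged & gold)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Toy corpus (hypothetical); index 8 carries a deliberately wrong tag.
corpus = [
    ("the", "DET"), ("dog", "NOUN"), ("runs", "VERB"),
    ("the", "DET"), ("cat", "NOUN"), ("runs", "VERB"),
    ("the", "DET"), ("dog", "NOUN"), ("runs", "NOUN"),
    ("the", "DET"), ("cat", "NOUN"), ("runs", "VERB"),
]
flagged = detect_errors(corpus, k=2, threshold=0.3)
scores = precision_recall(flagged, gold_errors=[8])
```

In this sketch the threshold plays the role of the hyperparameter that the abstract says is tuned on a small expert-annotated sample; precision_recall shows how such a sample could be used to score a candidate threshold.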

A Study on the Layout of Master File of POS for Apparel Industry (국내 의류산업의 POS 시스템 사용 실태에 관한 연구)

  • 조진숙;차주희
    • Journal of the Korean Society of Clothing and Textiles / v.24 no.4 / pp.451-462 / 2000
  • This study investigates the current use of POS systems in the Korean clothing industry in order to suggest ways to use them better. We interviewed companies using POS systems as well as EAN Korea, which is in charge of POS data processing. As a result, we found that the standard KAN code has severe difficulty coping with the diversity of information required in the clothing industry. We therefore suggest using the KAN code as an identification code within a more structured master data file for extremely diverse clothing items.


Implementation of Secure POS SYSTEM (안전한 POS System의 구현)

  • 박동규;황유동
    • Journal of the Korea Society of Computer and Information / v.6 no.2 / pp.70-77 / 2001
  • This paper focuses on the design and implementation of a secure POS system. We propose a secure POS system that uses RSA, MD5, and Triple-DES for security and the RBAC model for access control. In the proposed system, client authentication is performed before data is transferred between client and server. We apply these security algorithms so that the system maintains confidentiality and integrity. In addition, we apply the RBAC model to control access to data. We verified the stability of the proposed system by applying it in real work settings.
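
As a rough illustration of the role-based access control (RBAC) idea mentioned in the abstract above, the following Python sketch checks a permission through the roles assigned to a user. The role names, permission strings, and the User and is_allowed names are hypothetical; the abstract does not describe the paper's actual role model, and the cryptographic parts (RSA, MD5, Triple-DES) are not covered here.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table for a POS deployment.
ROLE_PERMISSIONS = {
    "cashier": {"sale:create"},
    "manager": {"sale:create", "sale:refund", "report:read"},
}

@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)

def is_allowed(user, permission):
    """Grant access only if at least one of the user's roles holds the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user.roles
    )

cashier = User("kim", {"cashier"})
manager = User("park", {"manager"})
```

The point of the design is that permissions attach to roles rather than to individual users, so changing what a cashier may do means editing one table entry instead of every cashier account.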

A Hidden Markov Model Imbedding Multiword Units for Part-of-Speech Tagging

  • Kim, Jae-Hoon;Jungyun Seo
    • Journal of Electrical Engineering and Information Science / v.2 no.6 / pp.7-13 / 1997
  • Morphological analysis of Korean is known to be a very complicated problem. In particular, the degree of part-of-speech (POS) ambiguity is much higher than in English. Many researchers have used hidden Markov models (HMMs) to solve the POS tagging problem and have reported accuracy of around 95%. However, the lack of lexical information makes it difficult to improve the performance of an HMM for POS tagging. To alleviate this problem, this paper proposes a method for incorporating multiword units, which are a type of lexical information, into a hidden Markov model for POS tagging. This paper also proposes a method for extracting multiword units from a POS-tagged corpus. In this paper, a multiword unit is defined as a unit that consists of more than one word. We found that these multiword units are the major source of POS tagging errors. Our experiment shows that the error reduction rate of the proposed method is about 13%.
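
For readers unfamiliar with HMM-based tagging, the baseline that the abstract above builds on can be sketched as Viterbi decoding over a bigram HMM. This Python sketch covers only the plain HMM, not the paper's multiword-unit extension; the two-tag model, its probabilities, and the example sentence are invented for illustration.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for words under a bigram HMM."""
    if not words:
        return []
    # best[i][t] = (probability of the best path ending in tag t at position i, previous tag)
    best = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, 0.0), p)
                for p in tags
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

# Toy model (all numbers are assumptions, not estimates from any corpus).
tags = ["NOUN", "VERB"]
start_p = {"NOUN": 0.7, "VERB": 0.3}
trans_p = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
emit_p = {
    "NOUN": {"time": 0.6, "flies": 0.4},
    "VERB": {"time": 0.2, "flies": 0.8},
}
tagged = viterbi(["time", "flies"], tags, start_p, trans_p, emit_p)
```

A real tagger would work in log space to avoid underflow on long sentences and would smooth the emission probabilities for unseen words; both are omitted here to keep the recurrence visible.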


A Study on the Management Promotion of Small Retail Shops with Information System in Practical Use and Implementation of PDS (소규모 유통점포의 정보시스템 활용 현황과 PDS 구축을 통한 경영 활성화 방안 고찰)

  • JEOUNE, Dae-Seong;RYOO, Yun-Kyoo
    • Journal of the Korea society of information convergence / v.5 no.2 / pp.91-99 / 2012
  • In this paper, we discuss the effectiveness of information systems such as POS (point of sale) and PDS (POS data service) for government-supported small retail shops called nadle shops. We also examine the functional requirements for PDS implementation. Introducing an information system into a small retail shop is necessary for achieving good management performance, although its impact is small. For the empirical study, we analyze survey results on POS utilization and management performance for the nadle shops supported from 2010 to the first half of 2011. The results indicate that the information system has no direct effect on management performance. However, it undoubtedly contributes to service quality and customer satisfaction and provides refined information that is useful to shop owners.


The Effects of Perceived Organizational Support on Organizational Commitment and Career Commitment of Clinical Nurses (임상간호사의 조직후원인식이 조직몰입과 경력몰입에 미치는 영향)

  • Kim, Myoung-Sook
    • Journal of Korean Academy of Nursing Administration / v.14 no.4 / pp.458-466 / 2008
  • Purpose: The purpose of this study was to identify the effects of perceived organizational support (POS) on the organizational commitment and career commitment of nurses. Method: The subjects of this study were 336 nurses working in six hospitals. The data were collected with a structured questionnaire from Oct. 9 to Nov. 7, 2006. Data were analyzed using descriptive statistics, t-test, ANOVA, Scheffe test, Pearson correlation coefficients, and multiple regression. Results: The mean score of POS was 2.87, that of organizational commitment was 3.30, and that of career commitment was 3.08. POS was positively correlated with organizational commitment and career commitment. POS and marital status explained 21.3% of the variance in affective commitment and 12.1% of the variance in continuous commitment. POS and career explained 14.8% of the variance in career commitment. Conclusion: The findings showed that POS was an important factor in enhancing the organizational commitment and career commitment of clinical nurses. Therefore, nurse managers must establish strategies to improve the POS of nurses in order to promote organizational commitment and career commitment.


Detecting and correcting errors in Korean POS-tagged corpora (한국어 품사 부착 말뭉치의 오류 검출 및 수정)

  • Choi, Myung-Gil;Seo, Hyung-Won;Kwon, Hong-Seok;Kim, Jae-Hoon
    • Journal of Advanced Marine Engineering and Technology / v.37 no.2 / pp.227-235 / 2013
  • The quality of part-of-speech (POS) annotation in a corpus plays an important role in developing POS taggers. However, Korean POS-tagged corpora such as the Sejong Corpus contain several kinds of errors, including annotation errors, spelling errors, and insertion and/or deletion of unexpected characters. In this paper, we propose a method for detecting annotation errors using error patterns, and we develop a tool for correcting them effectively. Using the proposed method and the developed tool, we hand-corrected annotation errors in the Sejong POS Tagged Corpus. As a result, correction was at least 9 times faster than without any tool. We therefore conclude that the proposed method is effective for correcting annotation errors in POS-tagged corpora.

Improved Character-Based Neural Network for POS Tagging on Morphologically Rich Languages

  • Samat Ali;Alim Murat
    • Journal of Information Processing Systems / v.19 no.3 / pp.355-369 / 2023
  • Since the widespread adoption of deep learning and related distributed representations, there have been substantial advances in part-of-speech (POS) tagging for many languages. When word representations are trained, morphology and shape are typically ignored, as these representations rely primarily on capturing syntactic and semantic aspects of words. However, for tasks like POS tagging, notably in morphologically rich and resource-limited language environments, intra-word information is essential. In this study, we introduce a deep neural network (DNN) for POS tagging that learns character-level word representations and combines them with general word representations. Using the proposed approach and omitting hand-crafted features, we achieve 90.47%, 80.16%, and 79.32% accuracy on our own dataset for three morphologically rich languages: Uyghur, Uzbek, and Kyrgyz. The experimental results reveal that the presented character-based strategy greatly improves POS tagging performance for several morphologically rich languages (MRL) where character information is significant. Furthermore, compared with the previously reported state-of-the-art POS tagging results for Turkish on the METU Turkish Treebank dataset, the proposed approach improves slightly on the prior work. Overall, the experimental results indicate that character-based representations outperform word-level representations for MRL. Our technique is also robust to out-of-vocabulary issues and performs better on manually edited text.

A Study on the Understanding of Restaurant Information System : Focus on Point-of-Sale System (식당정보시스템에 관한 연구 Point-of-sale System을 중심으로)

  • Yu, Jong-Seo
    • Culinary science and hospitality research / v.5 no.2 / pp.303-323 / 1999
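  • Advances in computing have greatly influenced restaurant management techniques. Among these many changes, a number of positive benefits can be found today, including advances in areas such as nutritional analysis, accounting, and purchasing. However, not all computer (POS) systems offer the same functions, potential, and benefits, so understanding the structure of a POS system and anticipating its potential for development is very important. The computing environment is evolving very quickly, and what matters for a restaurant operator is purchasing the right POS system. Through this study, we can understand restaurant POS systems and predict the direction of their future development.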


Screening and isolation of antibacterial proteinaceous compounds from flower tissues: Alternatives for treatment of healthcare-associated infections

  • de Almeida, Renato Goulart;Silva, Osmar Nascimento;de Souza Candido, Elizabete;Moreira, Joao Suender;Jojoa, Dianny Elizabeth Jimenez;Gomes, Diego Garces;de Souza Freire, Mirna;de Miranda Burgel, Pedro Henrique;de Oliveira, Nelson Gomes Junior;Valencia, Jorge William Arboleda;Franco, Octavio Luiz;Dias, Simoni Campos
    • CELLMED / v.4 no.1 / pp.5.1-5.8 / 2014
  • Healthcare-associated infection represents a frequent cause of mortality that increases hospital costs. Due to increasing microbial resistance to antibiotics, it is necessary to search for alternative therapies. Consequently, novel alternatives for the control of resistant microorganisms have been studied. Among them, plant antimicrobial proteins present enormous potential, with flowers being a new source of antimicrobial molecules. In this work, the antimicrobial activity of protein-rich fractions from flower tissues of 18 different species was evaluated against several human pathogenic bacteria. The results showed that protein-rich fractions of 12 species were able to control bacterial development. Due to its broad inhibition spectrum and high antibacterial activity, the protein-rich fraction of Hibiscus rosa-sinensis was subjected to DEAE-Sepharose chromatography, yielding a retained fraction and a non-retained fraction. The retained fraction inhibited 29.5% of Klebsiella pneumoniae growth, and the non-retained fraction showed 31.5% growth inhibition against the same bacterium. The protein profile of the chromatography fractions was analyzed using SDS-PAGE, revealing the presence of two major protein bands, of 20 and 15 kDa, in the retained fraction. The results indicate that medicinal plants have the biotechnological potential to increase knowledge about antimicrobial protein structure and action mechanisms, assisting in the rational design of antimicrobial compounds for the development of new antibiotic drugs.