• Title/Summary/Keyword: Text Classification Application


A Comparative Study of the Rules of Ship Classification Societies on the Propeller Shaft Design. (추진축설계(推進軸設計)에 관(關)한 각선급협회규칙(各船級協會規則)의 비교연구(比較硏究))

  • K.C.,Kim;J.W.,Lee
    • Bulletin of the Society of Naval Architects of Korea / v.4 no.1 / pp.59-65 / 1967
  • Since screw propellers were adopted for ship propulsion, most propeller shaft replacements due to damage have resulted from fatigue failure caused by alternating stresses [1],[2]. To prevent such failures, alternating stresses should be carefully accounted for at the design stage of the propeller shaft. In this respect, the Rules of the Ship Classification Societies are regarded simply as guidance for a preliminary determination of the shaft diameter. This paper, limited to small and medium-sized motor ships, evaluates the Rules formulae against a theoretical calculation based on Soderberg's correlation [5] between the factor of safety and the resultant stresses obtained by applying the maximum shear theory. For this purpose, eleven (11) ships recently built in Korea were taken as samples (refer to Table 2 in the text). The following conclusions are drawn: (1) In general, the Rules formulae give a considerably larger propeller shaft diameter than the theoretical calculation: about 7% larger for the AB and BV Rules, and about 20% larger for the LR and KR-NK Rules. (2) The LR Rule gives the largest diameter of all, and the AB Rule gives the value closest to the theoretical one. (3) The formula of the AB Rule is considered the simplest to use and reasonable.

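As an illustration of the kind of theoretical check the propeller shaft abstract describes, one commonly cited textbook form of the Soderberg criterion combined with the maximum shear theory is given below. It applies to a solid shaft under steady torque and fully reversed bending. The load case and the symbols are assumptions made here: d is the shaft diameter, n the factor of safety, S_y the yield strength, S_e the endurance limit, T the steady torque, and M the reversed bending moment. The paper's own formulation, which may also account for thrust and stress concentration, is not given in the abstract.

```latex
\frac{1}{n} \;=\; \frac{32}{\pi d^{3}}
\sqrt{\left(\frac{T}{S_y}\right)^{2} + \left(\frac{M}{S_e}\right)^{2}}
\qquad\Longrightarrow\qquad
d \;=\; \left[\frac{32\,n}{\pi}
\sqrt{\left(\frac{T}{S_y}\right)^{2} + \left(\frac{M}{S_e}\right)^{2}}\,\right]^{1/3}
```

Under these assumptions, a diameter obtained from a classification society formula can be compared directly with d for a chosen factor of safety.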

An Opinion Document Clustering Technique for Product Characterization (제품 특징화를 위한 오피니언 문서의 클러스터링 기법)

  • Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.19 no.2 / pp.95-108 / 2014
  • Opinion mining is an application domain of text mining that extracts opinions from documents, and much research on it is currently underway. Most related research has focused on sentiment classification, which classifies documents into positive and negative opinions. However, there has been little interest in extracting the features that characterize an individual product. In this paper, we propose a technique that classifies opinion documents according to product features and selects the features that characterize each product. The proposed method uses a document clustering technique, for which we develop a new algorithm for evaluating the similarity between documents. In addition, we demonstrate the usefulness of the proposed method through experiments.

Application and Analysis of Emotional Attributes using Crowdsourced Method for Hangul Font Recommendation System (한글 글꼴 추천시스템을 위한 크라우드 방식의 감성 속성 적용 및 분석)

  • Kim, Hyun-Young;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.20 no.4 / pp.704-712 / 2017
  • With the development of digital content, various studies on content sensibility are under way, and emotional research on fonts is also under way in various fields. The expression of content needs to match the content itself, and the emotion of a font needs to be in harmony with the sensibility of the text. However, it is impossible to select a font with the proper emotion in Korea, because each of the more than 6,000 fonts carries its own emotion. In this paper, we analysed emotional classification attributes and constructed a Hangul font recommendation system. We also verified the credibility and validity of the attributes themselves in order to apply them to Korean Hangul fonts. We then tested whether general users can find a proper font in a commercial font set through this emotional recommendation system. As a result, when users want to express their emotions in sentences more visually, they can get a recommendation of a Hangul font with the desired emotion by using font-based emotion attribute values collected through the crowdsourced method.

A Dynamic Recommendation Agent System for E-Mail Management based on Rule Filtering Component (이메일 관리를 위한 룰 필터링 컴포넌트 기반 능동형 추천 에이전트 시스템)

  • Jeong, Ok-Ran;Cho, Dong-Sub
    • Proceedings of the KIEE Conference / 2004.05a / pp.126-128 / 2004
  • As e-mail becomes increasingly important in everyday life, mail users spend more and more time organizing and classifying the e-mails they receive into folders. Most existing recommendation systems and text classification approaches focus on recommending commercial products or web documents. This study therefore aims to apply these techniques to e-mail, where users need them more. As an effective method for e-mail management, this paper proposes a dynamic recommendation agent system based on a Rule Filtering Component, which recommends the relevant category when a new e-mail is received so that users can directly manage the optimal classification. Moreover, we try to improve accuracy by reducing the misclassification that can be critical when classifying e-mails by category. While existing Bayesian learning algorithms mostly use a fixed threshold, we show that replacing the fixed threshold with a dynamic threshold increases accuracy and thereby improves user satisfaction. We designed the main modules as rule filtering components to enhance the scalability and reusability of our system.

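The e-mail abstract contrasts a fixed decision threshold with a dynamic one but does not say how the threshold is adapted. The following is only a minimal sketch of one possible scheme, assuming the classifier outputs a confidence in [0, 1] and that user feedback on each recommendation is available. The class name `DynamicThreshold`, its parameters, and the update rule are all hypothetical and are not taken from the paper.

```python
class DynamicThreshold:
    """Hypothetical feedback-driven threshold (an assumption, not the paper's rule)."""

    def __init__(self, start=0.5, step=0.05, low=0.1, high=0.95):
        self.value = start   # current decision threshold
        self.step = step     # adjustment applied per feedback event
        self.low = low       # lower bound for the threshold
        self.high = high     # upper bound for the threshold

    def should_recommend(self, confidence):
        """Recommend a category only if the classifier confidence reaches the threshold."""
        return confidence >= self.value

    def feedback(self, accepted):
        """Relax the threshold after an accepted recommendation, tighten it after a rejection."""
        delta = -self.step if accepted else self.step
        self.value = min(self.high, max(self.low, self.value + delta))
```

With a fixed threshold the `feedback` step would simply be absent; here each rejected recommendation makes the agent slightly more conservative.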

Developing and Evaluating Damage Information Classifier of High Impact Weather by Using News Big Data (재해기상 언론기사 빅데이터를 활용한 피해정보 자동 분류기 개발)

  • Su-Ji, Cho;Ki-Kwang Lee
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.7-14 / 2023
  • Recently, the importance of impact-based forecasting has increased as the socio-economic impact of severe weather has emerged. As news articles contain unstructured information closely related to people's lives, this study developed and evaluated a binary classification algorithm for snowfall damage information by text mining media articles. We collected news articles from 2009 to 2021 that contain 'heavy snow' in the body text and labelled whether each article corresponds to specific damage fields such as car accidents. To develop the classifier, we proposed a probability-based classifier based on the ratio of two conditional probabilities, which is defined as the I/O Ratio in this study. During construction, we also adopted an n-gram approach to consider the contextual meaning of each keyword. The accuracy of the classifier was 75%, supporting the possibility of applying news big data to impact-based forecasting. We expect the performance of the classifier to improve in further research as more varied training data accumulate. The result of this study can be readily extended by applying the same methodology to other disasters in the future. Furthermore, the result of this study can reduce the social and economic damage of high-impact weather by supporting the establishment of an integrated meteorological decision support system.
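The weather damage abstract defines its classifier only as a ratio of two conditional probabilities over n-gram keywords. The sketch below shows one way such a ratio could be used, assuming it compares how often an n-gram occurs in damage articles against non-damage articles, with add-one smoothing and a summed log-ratio decision. The function names, the smoothing, and the threshold are assumptions for illustration, not the paper's definition of the I/O Ratio.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def fit_log_ratios(articles, n_max=2):
    """articles: list of (tokens, is_damage). Returns {ngram: log probability ratio}."""
    in_df, out_df = Counter(), Counter()   # document frequencies per class
    n_in = n_out = 0
    for tokens, is_damage in articles:
        feats = set(ngrams(tokens, n_max))
        if is_damage:
            n_in += 1
            in_df.update(feats)
        else:
            n_out += 1
            out_df.update(feats)
    ratios = {}
    for g in set(in_df) | set(out_df):
        p_in = (in_df[g] + 1) / (n_in + 2)     # smoothed P(ngram | damage)
        p_out = (out_df[g] + 1) / (n_out + 2)  # smoothed P(ngram | no damage)
        ratios[g] = math.log(p_in / p_out)
    return ratios


def is_damage_article(ratios, tokens, n_max=2, threshold=0.0):
    """Sum the log ratios of the n-grams present; unseen n-grams contribute 0."""
    score = sum(ratios.get(g, 0.0) for g in set(ngrams(tokens, n_max)))
    return score > threshold
```

Including bigrams lets a phrase such as "car accident" carry evidence beyond that of its individual words, which is the role the abstract gives to the n-gram approach.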

Efficient dimension reduction using QR-decomposition and its application to text categorization (QR-분해를 이용한 효율적인 차원 감소 방법과 문서 분류에의 응용)

  • Lee Moon-Hwi;Park Cheong-Hee
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.358-360 / 2006
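  • LDA is a linear dimension reduction method that improves discriminative power (classification performance) in the reduced space by finding a linear transformation that maximizes the separation between groups and minimizes the variance within groups. In this paper, we propose an efficient dimension reduction method using QR decomposition so that LDA can be applied to undersampled problems. In particular, the proposed method has the advantage that it can be applied effectively even when the independence of the data is not guaranteed, for example when a single document belongs to several categories at once, as in text categorization.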


Citation Practices in Academic Corpora: Implications for EAP Writing

  • Min, Su-Jung
    • English Language & Literature Teaching / v.10 no.3 / pp.113-126 / 2004
  • Explicit reference to the work of other authors is an essential feature of most academic research writing. Corpus analysis of academic text can reveal much about what writers actually do and why they do so. The application of corpus tools in language education has been well documented by many scholars (Pedersen, 1995, Swales, 1990, Thompson, 2000). They demonstrate how computer technology can assist in the effective analysis of corpus-based data. For teaching purposes, this recent research provides insights in the area of English for Academic Purposes (EAP). The need for such support is evident when students have to use appropriate citations in their writing. Using Swales' (1990) division of citation forms into integral and non-integral and Thompson and Tribble's (2001) classification scheme, this paper codifies academic texts in a corpus. The texts are academic research articles from different disciplines. The results lead into a comparison of the citation practices in different disciplines. Finally, it is argued that the information obtained in this study is useful for EAP writing courses in EFL countries.

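The integral versus non-integral distinction used in the citation practices abstract can be roughly approximated with pattern matching. The sketch below assumes author-date citations in the style "Swales (1990)" (integral: the author's name is part of the sentence) and "(Swales, 1990)" (non-integral: the whole reference sits in parentheses). The regular expressions and the function name `count_citation_forms` are illustrative assumptions and do not reproduce the coding procedure used in the paper.

```python
import re

# Author name (optionally possessive) directly followed by a parenthesised year,
# e.g. "Swales (1990)" or "Swales' (1990)".
INTEGRAL = re.compile(r"[A-Z][A-Za-z-]+(?:'s|')?\s*\(\d{4}[a-z]?\)")

# Each "Name, year" pair found inside one pair of parentheses,
# e.g. "(Pedersen, 1995; Thompson, 2000)".
PAREN_GROUP = re.compile(r"\(([^()]*)\)")
NAME_YEAR = re.compile(r"[A-Z][A-Za-z-]+[^;()]*?,\s*\d{4}[a-z]?")


def count_citation_forms(sentence):
    """Return rough counts of integral and non-integral citations in a sentence."""
    integral = len(INTEGRAL.findall(sentence))
    non_integral = sum(len(NAME_YEAR.findall(group))
                       for group in PAREN_GROUP.findall(sentence))
    return {"integral": integral, "non_integral": non_integral}
```

A heuristic like this would only pre-sort candidate sentences; finer categories such as those in Thompson and Tribble's scheme would still need manual coding.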

Healing of STEP AP214 Automotive CAD Data (STEP AP214 자동차 설계 데이터 정리 시스템)

  • 양정삼;한순흥
    • Korean Journal of Computational Design and Engineering / v.7 no.3 / pp.170-176 / 2002
  • To exchange CAD data between heterogeneous CAD systems, we generally use a neutral format, especially STEP, which is the international standard (ISO-10303) for product model data exchange. AP214 (Application Protocol) for the automotive industry not only takes into account geometry and organizational data, but also provides a classification mechanism for product modeling. When a STEP file exported from another CAD system is read during a design process, it is a burden for the designer to go through the tedious process of removing duplicate or non-manifold entities, adjusting parts, and rearranging text. We analyze the structure of AP214 and develop a healing tool to solve the following problem. Without the assembly information in the Master workspace of CATIA, or when reading a STEP file from Pro/Engineer, a designer has to repeat the process of disassembling an assembly into parts one by one. We have developed a post-processing tool for STEP AP214 that separates out a part from an assembly model and adjusts superfluous or useless entities using the ACIS kernel.

A Comparison Study on the Application Method of Naive Bayes for Text Classification (텍스트 분류의 성능 향상을 위한 나이브 베이즈 응용 기법 비교 연구)

  • Heo, Jae-Hee;Park, Eun-Young;Park, Young-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.734-736 / 2015
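  • As classifying text becomes increasingly important, machine learning shows the most effective performance among the available techniques. In particular, the naive Bayes classifier is known as one of the simplest and most efficient machine learning models. This paper introduces and analyzes previous studies that sought to adapt and improve naive Bayes techniques for more effective text classification.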
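For reference, the baseline that such studies build on can be sketched in a few lines. The following is a minimal multinomial naive Bayes text classifier with add-one (Laplace) smoothing, written with the standard library only. It is a generic textbook formulation, not any of the improved variants the paper surveys, and the function names are chosen here for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """docs: list of (tokens, label). Returns (class counts, per-class word counts, vocabulary)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior; words outside the vocabulary are skipped."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)              # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok in vocab:
                # add-one smoothed log likelihood of the word given the class
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

The "naive" part is the assumption that words are conditionally independent given the class, which is what allows the per-word log likelihoods to be summed.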

Application of a Topic Model on the Korea Expressway Corporation's VOC Data (한국도로공사 VOC 데이터를 이용한 토픽 모형 적용 방안)

  • Kim, Ji Won;Park, Sang Min;Park, Sungho;Jeong, Harim;Yun, Ilsoo
    • Journal of Information Technology Services / v.19 no.6 / pp.1-13 / 2020
  • Recently, 80% of big data consists of unstructured text data. In particular, various types of documents are stored as large-scale unstructured documents through social network services (SNS), blogs, news, etc., and the importance of unstructured data is being highlighted. As the possibility of using unstructured data increases, various analysis techniques such as text mining have recently appeared. In this study, therefore, a topic modeling technique was applied to the Korea Expressway Corporation's voice of customer (VOC) data, which include customer opinions and complaints. Currently, VOC data are divided according to the business areas of the Korea Expressway Corporation. However, the classified categories are often inaccurate, and ambiguous cases are classified as "other". Therefore, in order to use VOC data for efficient service improvement and similar purposes, a more systematic and efficient method of classifying VOC data is required. To this end, this study proposed two approaches: a method using only latent Dirichlet allocation (LDA), the most representative topic modeling technique, and a new method combining LDA with the word embedding technique Word2vec. As a result, it was confirmed that the categories of VOC data are classified relatively well when the new method is used. These results are expected to yield implications for the Korea Expressway Corporation that can be used for service improvement.