• Title/Summary/Keyword: classification model

Similar Contents Recommendation Model Based On Contents Meta Data Using Language Model (언어모델을 활용한 콘텐츠 메타 데이터 기반 유사 콘텐츠 추천 모델)

  • Donghwan Kim
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.27-40 / 2023
  • With the spread of smart devices and the impact of COVID-19, consumption of media content through smart devices has increased significantly. Along with this trend, the amount of media content viewed through OTT platforms is growing, which makes content recommendation on these platforms more important. Previous content-based recommendation research has mostly utilized metadata that describes the characteristics of the contents, and little research has utilized the contents' own descriptive metadata. In this paper, various text data that describe the contents, including titles and synopses, were used to recommend similar contents. KLUE-RoBERTa-large, a Korean language model with excellent performance, was trained on the text data. A dataset of over 20,000 content metadata records, including titles, synopses, composite genres, directors, actors, and hash tag information, was used as training data. To feed the various text features into the language model, the features were concatenated using special tokens that indicate each feature. To assess the model's similarity classification ability in a relative and objective way, the test set was built with a three-content comparison method and labeled through multiple rounds of inspection. Genre classification and hash tag classification prediction tasks were used to fine-tune the embeddings for the content meta text data. As a result, the hash tag classification model showed an accuracy of over 90% on the similarity test set, more than 9% better than the baseline language model. Hash tag classification training was found to improve the language model's ability to classify similar contents, which demonstrates the value of using a language model for content-based filtering.
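
The feature concatenation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the token strings (`[TITLE]`, `[SYNOPSIS]`, and so on), the field names, and the helper `build_model_input` are all hypothetical, since the abstract only says that special tokens mark each feature.

```python
# Hypothetical special tokens; the abstract does not give the actual strings.
FEATURE_TOKENS = {
    "title": "[TITLE]",
    "synopsis": "[SYNOPSIS]",
    "genres": "[GENRE]",
    "directors": "[DIRECTOR]",
    "actors": "[ACTOR]",
    "hashtags": "[HASHTAG]",
}


def build_model_input(item: dict) -> str:
    """Concatenate metadata fields into one string, each prefixed by its token."""
    parts = []
    for field, token in FEATURE_TOKENS.items():
        value = item.get(field)
        if not value:
            continue  # skip missing or empty fields
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)  # flatten multi-valued fields
        parts.append(f"{token} {value}")
    return " ".join(parts)


# Made-up example record, for illustration only.
example = {
    "title": "Example Show",
    "synopsis": "A detective returns home.",
    "genres": ["drama", "mystery"],
}
text = build_model_input(example)
print(text)
```

In a real pipeline the marker tokens would also need to be registered with the tokenizer so that they are not split into subwords; that step is omitted here.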

A Study on Fine-Tuning and Transfer Learning to Construct Binary Sentiment Classification Model in Korean Text (한글 텍스트 감정 이진 분류 모델 생성을 위한 미세 조정과 전이학습에 관한 연구)

  • JongSoo Kim
    • Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.15-30 / 2023
  • Recently, generative models based on the Transformer architecture, such as ChatGPT, have been gaining significant attention. The Transformer architecture has been applied to various neural network models, including Google's BERT (Bidirectional Encoder Representations from Transformers) sentence generation model. In this paper, a method is proposed to create a binary text classification model that determines whether a comment in a Korean movie review is positive or negative. To accomplish this, a pre-trained multilingual BERT sentence generation model is fine-tuned and transfer learned using a new Korean training dataset. The pre-trained BERT-Base model used for multilingual sentence generation covers 104 languages and has 12 layers, a hidden size of 768, 12 attention heads, and 110M parameters. To change the pre-trained BERT-Base model into a text classification model, the input and output layers were fine-tuned, resulting in a new model with 178 million parameters. Using the fine-tuned model, with a maximum word count of 128, a batch size of 16, and 5 epochs, transfer learning is conducted with 10,000 training examples and 5,000 test examples. The result is a binary sentiment classification model for Korean movie reviews with an accuracy of 0.9582, a loss of 0.1177, and an F1 score of 0.81. Performing transfer learning with a dataset five times larger produced a model with an accuracy of 0.9562, a loss of 0.1202, and an F1 score of 0.86.
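
As a rough worked example of the training budget implied by these settings, the sketch below counts optimizer steps for a batch size of 16 and 5 epochs. The helper `training_steps` is hypothetical, and reading "five times larger" as 50,000 training examples is an assumption; the abstract does not state the exact size.

```python
import math


def training_steps(num_examples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps per epoch, total steps), counting a final partial batch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Settings stated in the abstract: 10,000 training examples, batch size 16, 5 epochs.
small = training_steps(10_000, 16, 5)

# Assumption: the "five times larger" dataset has 50,000 training examples.
large = training_steps(50_000, 16, 5)

print(small, large)
```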

A Classification Model Supporting Dynamic Features of Product Databases (상품 데이터베이스의 동적 특성을 지원하는 분류 모형)

  • Kim Dongkyu;Lee Sang-goo;Choi Dong-Hoon
    • The KIPS Transactions:PartD / v.12D no.1 s.97 / pp.165-178 / 2005
  • A product classification scheme is the foundation on which product databases are designed, and it plays a central role in almost all aspects of the management and use of product information. It needs to accommodate diverse user views to support efficient and convenient use of product information. It needs to change and evolve frequently without breaking consistency when new products are introduced, existing products are discontinued, classes are reorganized, or classes are specialized. It also needs to be merged and mapped with other classification schemes without information loss when B2B transactions occur. To meet these requirements, a classification scheme should be dynamic enough to accommodate such changes at reasonable time and cost. However, the classification schemes widely used today, such as UNSPSC and eCl@ss, have many limitations in meeting these requirements for dynamic features. Product information carries rich semantics, such as class attributes (material, time, place, etc.) and integrity constraints. In this paper, we analyze the dynamic features of product databases and the limitations of existing code-based classification schemes, and describe the semantic classification model proposed in [1], which satisfies the requirements for dynamic features of product databases. It provides a means to express more semantics for product classes explicitly and formally, and organizes class relationships into a graph.

A Study on Automatic Classification of Subject Headings Using BERT Model (BERT 모형을 이용한 주제명 자동 분류 연구)

  • Yong-Gu Lee
    • Journal of the Korean Society for Library and Information Science / v.57 no.2 / pp.435-452 / 2023
  • This study experimented with automatic classification of subject headings using a BERT-based transfer learning model and analyzed its performance, both by the main class of the KDC classification and by the category type of the subject headings. Six datasets were constructed from Korean national bibliographies based on how frequently subject headings were assigned, and titles were used as classification features. As a result, classification performance reached 0.6059 and 0.5626 on the micro F1 and macro F1 scores, respectively, on the dataset (1,539,076 records) containing 3,506 subject headings. By main class of the KDC classification, performance was good in General works, Natural science, Technology, and Language, and low in Religion and Arts. By category type of the subject headings, the plant, legal name, and product name categories showed high performance, whereas the national treasure/treasure category showed low performance. In a large dataset, the ratio of subject headings that cannot be assigned increases, which lowers final performance, and improvement is needed to raise classification performance for low-frequency subject headings.
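
The gap between the micro and macro F1 scores reported here is easier to read with a small worked example. The sketch below computes both for a single-label toy case with made-up labels; the function `f1_scores` is illustrative, and it treats each record as having exactly one heading, which may not match the study's setup.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro F1, macro F1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Toy data: one frequent class predicted well, two rare classes confused.
y_true = ["a", "a", "a", "a", "b", "c"]
y_pred = ["a", "a", "a", "a", "c", "b"]
micro, macro = f1_scores(y_true, y_pred)
print(round(micro, 4), round(macro, 4))
```

Because macro F1 weights every class equally, errors on rare classes pull it down more than micro F1, which is consistent with the abstract's remark about low-frequency subject headings.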

Evaluation of Classification Performance of Inception V3 Algorithm for Chest X-ray Images of Patients with Cardiomegaly (심장비대증 환자의 흉부 X선 영상에 대한 Inception V3 알고리즘의 분류 성능평가)

  • Jeong, Woo-Yeon;Kim, Jung-Hun;Park, Ji-Eun;Kim, Min-Jeong;Lee, Jong-Min
    • Journal of the Korean Society of Radiology / v.15 no.4 / pp.455-461 / 2021
  • Cardiomegaly is one of the most common diseases seen on chest X-rays, and if it is not detected early, it can cause serious complications. Accordingly, with advances in various fields of science and technology, much recent research has applied deep learning algorithms to medical image analysis. In this paper, we evaluate whether the Inception V3 deep learning model is useful for classifying cardiomegaly from chest X-ray images. A total of 1026 chest X-ray images were used, taken from patients diagnosed with a normal heart and patients diagnosed with cardiomegaly at Kyungpook National University Hospital. In the experiment, the classification accuracy and loss of the Inception V3 deep learning model for the presence or absence of cardiomegaly were 96.0% and 0.22%, respectively. These results show that the Inception V3 deep learning model performs very well at feature extraction and classification of chest image data. The Inception V3 deep learning model is considered useful for the classification of chest diseases, and if similarly strong results are obtained from research using a wider variety of medical image data, it is expected to be of great help to doctors' diagnoses in the future.

Generation of 3D STEP Model from 2D Drawings Using Feature Definition of Ship Structure (선체구조 특징형상 정의에 의한 2D 도면에서 3D STEP 선체 모델의 생성)

  • 황호진;한순흥;김용대
    • Korean Journal of Computational Design and Engineering / v.8 no.2 / pp.122-132 / 2003
  • STEP AP218 has a standard schema to represent the structural model of a midship section. While it helps to exchange ship structural models among heterogeneous automation systems, most shipyards and classification societies still exchange information using 2D paper drawings. We propose a feature parameter input method to generate a 3D STEP model of a ship structure from 2D drawings. We analyzed the ship structure information contained in 2D drawings and defined a data model to express the contents of the drawings. We also developed a GUI for the feature parameter input. To translate the 2D information extracted from a drawing into a STEP AP218 model, we developed a shape generation library and generated the 3D ship model through this library. The generated 3D STEP model of a ship structure can be used to exchange information between design departments in a shipyard as well as between classification societies and shipyards.

An Analysis of Nursing Needs for Hospitalized Cancer Patients; Using Data Mining Techniques (데이터 마이닝을 이용한 입원 암 환자 간호 중증도 예측모델 구축)

  • Park, Sun-A
    • Asian Oncology Nursing / v.5 no.1 / pp.3-10 / 2005
  • Background: Nurses now account for one third of all hospital human resources, so efficient management of nursing manpower is becoming more important. Although nursing workload requirement analysis and patient severity classification clearly need to come first for efficient allocation of the nursing workforce, these processes have been conducted manually with ad hoc rules. Purposes: This study aimed to build a prediction model for patient classification according to nursing need, and to find an easier and faster method of classifying nursing patients that can support efficient management of nursing manpower. Methods: The nursing patient classification data of cancer patients hospitalized in one of the biggest cancer centers in Korea from 2003.1.1 to 2003.12.31 were assessed by trained nurses. This study developed a prediction model and analyzed nursing needs using data mining techniques. Patients were classified by three different data mining techniques (logistic regression, decision tree, and neural network), and the results were assessed. Results: The data set was created from 165,073 records for 2,228 patients in the patient classification database. The main explanatory variables for the three techniques were as follows. 1) Logistic regression: age, month, and section. 2) Decision tree: section, month, age, and tumor. 3) Neural network: section, diagnosis, age, sex, metastasis, hospital days, and month. Among the three techniques, the neural network showed the best predictive power in ROC curve verification. The patient classification prediction model built with the neural network on the basis of nursing needs achieved a prediction accuracy of 84.06%. Conclusion: A patient classification prediction model was developed and tested in this study using real patient data. The result can be used for more accurate calculation of required nursing staff and more effective use of the labor force.

Detecting Credit Loan Fraud Based on Individual-Level Utility (개인별 유틸리티에 기반한 신용 대출 사기 탐지)

  • Choi, Keunho;Kim, Gunwoo;Suh, Yongmoo
    • Journal of Intelligence and Information Systems / v.18 no.4 / pp.79-95 / 2012
  • As credit loan products increase significantly at most financial institutions, the number of fraudulent transactions is also growing rapidly. Therefore, to manage financial risk successfully, financial institutions should tighten loan qualifications and strengthen their ability to detect credit loan fraud proactively. When building a classification model to detect credit loan fraud, the utility of the classification results (i.e., the benefits from correct predictions and the costs from incorrect predictions) is more important than the classification accuracy rate. The objective of this paper is to propose a new approach to building a classification model for detecting credit loan fraud based on individual-level utility. Experimental results show that the model yields higher utility than fraud detection models that do not take the individual-level utility concept into account. They also show that the individual-level utility computed by the model is more accurate than the mean-level utility computed by other models, from both the opportunity utility and cash flow perspectives. We provide diverse views on the experimental results from both perspectives.
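
The contrast between accuracy and utility can be illustrated with a toy calculation. The sketch below is not the paper's model: the utility rules in `individual_utility` (principal saved or lost on fraud, interest earned or forgone on legitimate loans) and all amounts are hypothetical choices made only to show that two classifiers with equal accuracy can differ widely in utility.

```python
def individual_utility(is_fraud: bool, flagged: bool, principal: float, interest: float) -> float:
    """Hypothetical per-loan utility of one classification outcome."""
    if is_fraud:
        # Catching a fraud avoids losing the principal; missing it loses the principal.
        return principal if flagged else -principal
    # Approving a legitimate loan earns interest; rejecting it forgoes that interest.
    return -interest if flagged else interest


def evaluate(loans, flags):
    """Return (accuracy, total utility) for one model's fraud flags."""
    correct = sum(is_fraud == flag for (is_fraud, _, _), flag in zip(loans, flags))
    utility = sum(
        individual_utility(is_fraud, flag, principal, interest)
        for (is_fraud, principal, interest), flag in zip(loans, flags)
    )
    return correct / len(loans), utility


# Made-up loans: (is_fraud, principal, interest).
loans = [
    (True, 10_000.0, 800.0),
    (True, 500.0, 40.0),
    (False, 2_000.0, 160.0),
    (False, 3_000.0, 240.0),
]
model_a = [False, True, False, False]  # misses the large fraud
model_b = [True, False, False, False]  # misses the small fraud

result_a = evaluate(loans, model_a)
result_b = evaluate(loans, model_b)
print(result_a, result_b)
```

Both hypothetical models classify three of four loans correctly, yet the one that catches the large fraud ends up with far higher total utility, which is the kind of difference a mean-level cost figure would hide.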

A Model of Criteria for Classifying Fashion Brands - from the viewpoint of fashion business practice - (패션브랜드 분류 기준 모형에 관한 연구 - 패션업체 실무자 관점으로 -)

  • 박송애;이선재
    • Journal of the Korean Society of Costume / v.53 no.5 / pp.155-169 / 2003
  • The purpose of this study was to identify criteria for classifying fashion brands from the viewpoint of fashion business practice, in order to develop fashion brand strategy and manage brands effectively and systematically, and to suggest a theoretical framework for applying these criteria. A survey was conducted for this research. Data from 388 people who work in merchandising, sales, or design at fashion companies were analyzed. Questionnaires were developed based on 37 fashion brand classification criteria. The SPSS package and the LISREL program were used to analyze the data. Factor analysis, one-way ANOVA, multiple response analysis, correlation analysis, and structural equation model analysis were applied. The results of this study were as follows. First, factor analysis of the 37 classification criteria identified 7 factors as classification criteria that fashion companies can use effectively. Second, in two cases, based on job description and responsible items, analysis showed that the importance of the 7 classification criteria factors differed, and all 7 criteria were correlated with each other. Third, an effective method for classifying fashion brands was proposed by establishing a model of the relationships among the values of the 7 criteria and verifying it through structural equation model analysis. Two courses for classifying fashion brands were also shown. Fourth, when the criteria were evaluated for appropriateness and difficulty of implementation, the classification criteria factor "the level of product concept" was found to be very effective, and "the level of brand value" was found to be ineffective to apply.

Weather Classification and Fog Detection using Hierarchical Image Tree Model and k-mean Segmentation in Single Outdoor Image (싱글 야외 영상에서 계층적 이미지 트리 모델과 k-평균 세분화를 이용한 날씨 분류와 안개 검출)

  • Park, Ki-Hong
    • Journal of Digital Contents Society / v.18 no.8 / pp.1635-1640 / 2017
  • In this paper, a hierarchical image tree model for weather classification in a single outdoor image is defined, and a weather classification algorithm using image intensity and a k-means segmentation image is proposed. At the first level of the hierarchical image tree model, indoor and outdoor images are distinguished. At the second level, the intensity and the k-means segmentation image are used to judge whether the outdoor image is a daytime, night, or sunrise/sunset image. At the last level, an image classified as daytime at the second level is finally estimated to be sunny or foggy based on the edge map and fog rate. Experiments were conducted to verify the weather classification, and the results show that the proposed method effectively detects weather features in a given image.
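
The three-level decision flow described in this abstract can be sketched as a simple cascade. This is only an illustration of the hierarchy: the function `classify_weather`, its inputs, and every threshold are hypothetical, and a plain mean-intensity test stands in for the paper's intensity and k-means segmentation analysis at the second level.

```python
def classify_weather(
    is_outdoor: bool,
    mean_intensity: float,
    fog_rate: float,
    *,
    night_max: float = 60.0,   # hypothetical threshold on a 0-255 scale
    day_min: float = 140.0,    # hypothetical threshold on a 0-255 scale
    fog_min: float = 0.5,      # hypothetical threshold on a 0-1 fog rate
) -> str:
    """Walk the three levels of the tree and return a class label."""
    # Level 1: separate indoor from outdoor images.
    if not is_outdoor:
        return "indoor"
    # Level 2: split outdoor images by brightness (stand-in for intensity + k-means).
    if mean_intensity < night_max:
        return "night"
    if mean_intensity < day_min:
        return "sunrise/sunset"
    # Level 3: only daytime images are tested for fog.
    return "foggy" if fog_rate >= fog_min else "sunny"


print(classify_weather(True, 200.0, 0.8))
```

Structuring the classifier as a cascade means the fog test runs only on images already judged to be daytime, which mirrors the level ordering in the abstract.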