• Title/Summary/Keyword: Machine Learning Library


A Study on Applicability of Machine Learning for Book Classification of Public Libraries: Focusing on Social Science and Arts (공공도서관 도서 분류를 위한 머신러닝 적용 가능성 연구 - 사회과학과 예술분야를 중심으로 -)

  • Kwak, Chul Wan
    • Journal of the Korean BIBLIA Society for library and Information Science / v.32 no.1 / pp.133-150 / 2021
  • The purpose of this study is to assess the applicability of machine learning to the classification of books in public libraries, using titles as input. Data analysis was performed with Python's scikit-learn library in Jupyter Notebook on the Anaconda platform. The Okt class of the KoNLPy analyzer was used for Hangul morpheme analysis. The units of analysis were 2,000 title fields and KDC class numbers (300 and 600) extracted from the KORMARC records of public libraries. Analysis with six machine learning models showed that machine learning can be applied to book classification. Among the models used, the neural network model had the highest title classification accuracy. The study suggests the need to improve the accuracy of title classification and the need for further research on book titles, title tokenization, and stop words.

A Sweet Persimmon Grading Algorithm using Object Detection Techniques and Machine Learning Libraries (객체 탐지 기법과 기계학습 라이브러리를 활용한 단감 등급 선별 알고리즘)

  • Roh, SeungHee;Kang, EunYoung;Park, DongGyu;Kang, Young-Min
    • Journal of Korea Multimedia Society / v.25 no.6 / pp.769-782 / 2022
  • Research on agricultural automation has become increasingly important. In Korea, sweet persimmon farmers spend a lot of time and effort sorting out profitable persimmons. In this paper, we propose and implement an efficient grading algorithm for persimmons before shipment. We gathered more than 1,750 images of persimmons, which were graded and labeled for classification. Our main algorithm is based on the EfficientDet object detection model, but we implemented a more refined method for better classification performance. To improve classification precision, we adopted a machine learning algorithm suggested by PyCaret, a machine learning workflow generation library. Finally, we obtained an improved classification model with an accuracy of 81%.

A Method for Same Author Name Disambiguation in Domestic Academic Papers (국내 학술논문의 동명이인 저자명 식별을 위한 방법)

  • Shin, Daye;Yang, Kiduk
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.4 / pp.301-319 / 2017
  • The task of author name disambiguation involves identifying an author with different names or different authors with the same name. Author name disambiguation is important for correctly assessing authors' research achievements and finding experts in given areas, as well as for the effective operation of scholarly information services such as citation indexes. In this study, we performed error correction and normalization of data and applied rule-based author name disambiguation, comparing it with baseline machine learning disambiguation to see whether human intervention could improve machine learning performance. The corrected and normalized email-based author name disambiguation improved the F-measure by over 0.1 relative to machine learning. This demonstrates the potential of human pattern identification and inference, which enabled the data correction and normalization process as well as the formation of the rule-based disambiguation, to complement the weaknesses of machine learning and improve author name disambiguation results.
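As a rough illustration of what an email-based rule might look like, the sketch below groups records that share an author name and a normalized email address. The abstract does not specify the actual rules, so the normalization step (trimming and lowercasing), the function names, and the sample records are all hypothetical.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Trim and lowercase an email address (a hypothetical normalization rule)."""
    return email.strip().lower()


def group_by_name_and_email(records):
    """Group paper IDs by (author name, normalized email).

    Each record is a (paper_id, author_name, email) tuple. Records that share
    both the name and the normalized email are treated as the same author.
    """
    clusters = defaultdict(list)
    for paper_id, name, email in records:
        clusters[(name.strip(), normalize_email(email))].append(paper_id)
    return dict(clusters)


# Made-up records: three papers by authors with the same name.
records = [
    ("p1", "Kim, Minsu", "MKim@univ-a.example"),
    ("p2", "Kim, Minsu", " mkim@univ-a.example"),
    ("p3", "Kim, Minsu", "minsu.kim@univ-b.example"),
]
clusters = group_by_name_and_email(records)
```

Here `p1` and `p2` fall into one cluster once the email is normalized, while `p3` is kept apart as a possible different author with the same name.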

Store Sales Prediction Using Gradient Boosting Model (그래디언트 부스팅 모델을 활용한 상점 매출 예측)

  • Choi, Jaeyoung;Yang, Heeyoon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.171-177 / 2021
  • With the rapid development of machine learning, it has been applied in diverse ways not only in industry but also in daily life. Applications of machine learning to financial data have also attracted interest. Herein, we apply machine learning algorithms to store sales data and present future applications for fintech enterprises. We use several missing-data processing methods and apply gradient boosting algorithms (XGBoost, LightGBM, CatBoost) to predict the future revenue of individual stores. We found that median imputation of missing data combined with the XGBoost algorithm gave the best accuracy. With the proposed method, both fintech enterprises and their customers can benefit: stores can receive financial assistance in advance from fintech companies, while these companies can offer financial support to the stores at low risk.
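The median-imputation step named in the abstract can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the function name and the sample sales figures are made up, and the gradient boosting model itself is omitted.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical monthly sales for one store, with two missing months.
monthly_sales = [120.0, None, 90.0, 150.0, None]
filled = impute_median(monthly_sales)
```

The median is less sensitive to outlier months than the mean, which is one plausible reason it might work well on sales data.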

Automatic Classification of Radar Signals Using CNN (CNN을 이용한 레이다 신호 자동 분류)

  • Hong, Seok-Jun;Yi, Yearn-Gui;Jo, Jeil;Lee, Sang-Gil;Seo, Bo-Seok
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.30 no.2 / pp.132-140 / 2019
  • In this paper, we propose a classification method for radar signals depending on the type of threat by applying machine learning to parameter data of radar signals. Currently, the army uses a library of mapping relations between the parameters and the types of threat to recognize threat signals. This approach has certain limitations when classifying signals and recognizing new types of threat or types of threat that do not exist in the current libraries. In this paper, we propose an automatic radar signal classification method depending on the type of threat that uses only parameter data without a library. A convolutional neural network is used as the classifier and machine learning is applied to train the classifier. The proposed method does not use a library, and hence, can classify threat signals that are new or do not exist in the current library.

A Study on Automatic Recommendation of Keywords for Sub-Classification of National Science and Technology Standard Classification System Using AttentionMesh (AttentionMesh를 활용한 국가과학기술표준분류체계 소분류 키워드 자동추천에 관한 연구)

  • Park, Jin Ho;Song, Min Sun
    • Journal of Korean Library and Information Science Society / v.53 no.2 / pp.95-115 / 2022
  • The purpose of this study is to transform the sub-classification terms of the National Science and Technology Standard Classification System into technical keywords by applying a machine learning algorithm. For this purpose, AttentionMeSH was used as a learning algorithm suitable for topic word recommendation. The source data were four years of research status files, from 2017 to 2020, refined by the Korea Institute of Science and Technology Planning and Evaluation. For learning, four attributes that express the research content well were used: task name, research goal, research abstract, and expected effect. The experiment yielded a micro-averaged F-measure (MiF) of 0.6377 at a threshold of 0.5. To use machine learning in actual work and to secure technical keywords in the future, a term management system and data with various attributes will likely be needed.
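For readers unfamiliar with the metric, the sketch below shows how a micro-averaged F-measure at a score threshold is commonly computed for multi-label keyword recommendation. It assumes a keyword is predicted when its score is greater than or equal to the threshold. The function name and the sample scores are hypothetical and do not reproduce the study's data.

```python
def micro_f1(scores, gold, threshold=0.5):
    """Micro-averaged F-measure over all (document, label) decisions.

    scores: list of dicts mapping label -> predicted score, one per document.
    gold:   list of sets of true labels, one per document.
    A label counts as predicted when its score is >= threshold (an assumption).
    """
    tp = fp = fn = 0
    for doc_scores, doc_gold in zip(scores, gold):
        predicted = {label for label, s in doc_scores.items() if s >= threshold}
        tp += len(predicted & doc_gold)   # correctly recommended keywords
        fp += len(predicted - doc_gold)   # recommended but not in the gold set
        fn += len(doc_gold - predicted)   # gold keywords that were missed
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Two made-up documents with keyword scores and gold keyword sets.
scores = [
    {"battery": 0.9, "polymer": 0.4},
    {"battery": 0.6, "catalyst": 0.7},
]
gold = [{"battery"}, {"catalyst", "polymer"}]
mif = micro_f1(scores, gold, threshold=0.5)
```

Micro-averaging pools true positives, false positives, and false negatives across all documents before computing the score, so frequent keywords weigh more than rare ones.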

An Experimental Study on the Relation Extraction from Biomedical Abstracts using Machine Learning (기계 학습을 이용한 바이오 분야 학술 문헌에서의 관계 추출에 대한 실험적 연구)

  • Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.50 no.2 / pp.309-336 / 2016
  • This paper introduces a relation extraction system that identifies and classifies semantic relations between biomedical entities in scientific texts using machine learning methods such as Support Vector Machines (SVM). The system includes many functions for extracting various linguistic features from sentences containing a pair of biomedical entities and for applying those features to the training of relation extraction models to maximize their performance. Three globally representative collections in biomedical domains were used in the experiments, which demonstrate the system's superiority across these domains. The intensive experimental study conducted in this paper is expected to provide a meaningful foundation for research on machine learning-based bio-text analysis.

Implimentation of Automatic Attendance Management System for Classroom Using OpenCV and Machine Learning (머신러닝과 OpenCV를 이용한 교실용 자동 출결 관리 시스템 프로토타입 구현)

  • Yoo, Sang-yeop;Kim, Jae-won;Park, Hyeon-jun;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.327-329 / 2019
  • In this paper, we propose an automatic attendance management system for classrooms using OpenCV and machine learning technology. When a face photograph is captured at the classroom entrance with a general-purpose PC camera, attendance is checked by comparing its similarity with the stored face of each student. In this study, the prototype was implemented using the machine learning library dlib, and about 10% of the students had a recognition rate of about 70%.


Classification of Radar Signals Using Machine Learning Techniques (기계학습 방법을 이용한 레이더 신호 분류)

  • Hong, Seok-Jun;Yi, Yearn-Gui;Choi, Jong-Won;Jo, Jeil;Seo, Bo-Seok
    • Journal of IKEEE / v.22 no.1 / pp.162-167 / 2018
  • In this paper, we propose a method to classify radar signals according to jamming technique by applying machine learning to parameter data extracted from received radar signals. At present, the army classifies radar signals by threat type based on a library of radar signal parameters built mostly through preliminary investigation. However, since radar technology is continuously evolving and diversifying, this method cannot properly classify signals for new threats or threat types that do not exist in existing libraries, which limits the choice of appropriate jamming techniques. Therefore, unlike the method that relies on the existing threat library, signals need to be classified using only the parameter data of the radar signal so that the optimal jamming technique can be selected. In this study, we propose a machine learning-based method to cope with new forms of threat signals. The method classifies a new threat signal according to the corresponding jamming method by training a classifier, composed of a hidden Markov model and a neural network, on the existing library data.

Prediction of the DO concentration using the machine learning algorithm: case study in Oncheoncheon, Republic of Korea

  • Lim, Heesung;An, Hyunuk;Choi, Eunhyuk;Kim, Yeonsu
    • Korean Journal of Agricultural Science / v.47 no.4 / pp.1029-1037 / 2020
  • Machine learning algorithms have been widely used in water-related fields such as water resources, water management, hydrology, atmospheric science, water quality, water level prediction, weather forecasting, water discharge prediction, and water quality forecasting. However, water quality prediction studies based on machine learning are limited compared with other water-related applications because water quality data are limited. Most previous water quality prediction studies have predicted monthly water quality, which is useful but not sufficient in practice. In this study, we predicted dissolved oxygen (DO) using a recurrent neural network with long short-term memory (RNN-LSTM) algorithm with hourly and daily datasets. Bugok Bridge in Oncheoncheon, located in Busan, where data are collected in real time, was selected as the target for DO prediction. The 10-month data (temperature, wind speed, and relative humidity) were used as the hourly prediction inputs, and the 5-year data (temperature, wind speed, relative humidity, and rainfall) were used as the daily forecast inputs. Missing data were filled by linear interpolation. The prediction model was coded with TensorFlow, an open-source library developed by Google. The performance of the RNN-LSTM algorithm for hourly and daily water quality prediction was tested and analyzed. The results showed that hourly water quality data are useful for machine learning and that the RNN-LSTM algorithm has potential for hourly or daily water quality forecasting.
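The linear-interpolation step for missing data can be sketched as below. This is a standard-library illustration under stated assumptions: readings are evenly spaced in time, missing values are marked with `None`, and gaps at the start or end of the series are left unfilled. The function name and sample readings are hypothetical, and the LSTM model itself is not shown.

```python
def fill_linear(values):
    """Fill None gaps by linear interpolation between the nearest known neighbors.

    Leading or trailing gaps have only one known neighbor and are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of consecutive known readings and fill what lies between.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Made-up hourly DO readings (mg/L) with a two-hour gap and a trailing gap.
do_readings = [8.0, None, None, 5.0, None]
interpolated = fill_linear(do_readings)
```

The two-hour gap is filled with evenly spaced values between 8.0 and 5.0, while the trailing gap stays `None` because there is no later reading to interpolate toward.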