• Title/Summary/Keyword: Machine Learning


Building an Analytical Platform of Big Data for Quality Inspection in the Dairy Industry: A Machine Learning Approach (유제품 산업의 품질검사를 위한 빅데이터 플랫폼 개발: 머신러닝 접근법)

  • Hwang, Hyunseok; Lee, Sangil; Kim, Sunghyun; Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.125-140 / 2018
  • As one of the processes in the manufacturing industry, quality inspection examines intermediate or final products to separate the good-quality goods that meet the quality management standard from the defective goods that do not. Manual quality inspection in a mass production system may result in low consistency and efficiency, so the inspection of mass-produced products is automated in many processes, with machines checking and classifying the goods. Although there are many preceding studies on improving or optimizing processes using the data generated in production, actual implementation has faced many constraints due to the technical limitations of processing a large volume of data in real time. Recent research on big data has improved data processing technology and enabled collecting, processing, and analyzing process data in real time. This paper proposes a process and the details of applying big data to quality inspection, and examines the applicability of the proposed method to the dairy industry. We review previous studies and propose a big data analysis procedure applicable to the manufacturing sector. To assess the feasibility of the proposed method, we applied two methods, a convolutional neural network and a random forest, to one of the quality inspection processes in the dairy industry. We collected, processed, and analyzed the images of caps and straws in real time, and then determined whether the products were defective. The results confirmed a drastic increase in classification accuracy compared to the quality inspection performed in the past.
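The image-based defect classification the abstract describes might be sketched as follows with one of the paper's two methods, a random forest. Everything here is illustrative: the synthetic "cap images", the bright-center defect cue, and all names are assumptions, not data or code from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)

# Synthetic stand-in for cap/straw images: 8x8 grayscale patches.
# Here a "good" product shows a bright center blob (intact seal);
# a "defective" one does not. Real features would come from camera frames.
def make_image(defective):
    img = rng.normal(0.2, 0.05, size=(8, 8))
    if not defective:
        img[2:6, 2:6] += 0.6  # intact seal appears as a bright region
    return img.ravel()

X = np.array([make_image(defective=(i % 2 == 0)) for i in range(400)])
y = np.array([i % 2 == 0 for i in range(400)], dtype=int)  # 1 = defective

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
```

In a production line, the same `fit`/`predict` split would separate an offline training phase from the real-time check on each incoming frame.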

A Hybrid Collaborative Filtering-based Product Recommender System using Search Keywords (검색 키워드를 활용한 하이브리드 협업필터링 기반 상품 추천 시스템)

  • Lee, Yunju; Won, Haram; Shim, Jaeseung; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.151-166 / 2020
  • A recommender system recommends products or services that best meet the preferences of each customer using statistical or machine learning techniques. Collaborative filtering (CF) is the most commonly used algorithm for implementing recommender systems. However, in most cases it uses only purchase history or customer ratings, even though customers provide numerous other data. E-commerce customers frequently use a search function to find products of interest among the vast array of products offered, and such search keyword data may be a very useful source for modeling customer preferences; nevertheless, it is rarely used as an information source for recommender systems. In this paper, we propose a novel hybrid CF model based on the Doc2Vec algorithm that uses the search keywords and purchase history data of online shopping mall customers. To validate the applicability of the proposed model, we empirically tested its performance using real-world online shopping mall data from Korea. As the number of recommended products increases, the recommendation performance of the proposed CF (i.e., hybrid CF based on the customer's search keywords) improves, whereas the performance of conventional CF gradually decreases. As a result, we found that search keyword data effectively represent customer preferences and may contribute to improving conventional CF recommender systems.
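A minimal sketch of the idea of driving CF with search keywords, under loose assumptions: TF-IDF is used here as a simple stand-in for the paper's Doc2Vec embeddings, and all users, keywords, and items are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical customer data: search keyword logs and purchase histories.
searches = {
    "u1": "running shoes marathon socks",
    "u2": "trail shoes hiking socks",
    "u3": "laptop usb hub monitor",
}
purchases = {
    "u1": {"shoes-a", "socks-b"},
    "u2": {"shoes-c", "poles-a"},
    "u3": {"hub-x"},
}

users = list(searches)
vec = TfidfVectorizer()
X = vec.fit_transform(searches[u] for u in users)
sim = cosine_similarity(X)  # keyword-based user-user similarity

def recommend(user, k=2):
    i = users.index(user)
    order = np.argsort(-sim[i])  # visit neighbours by keyword similarity
    recs = []
    for j in order:
        if users[j] == user:
            continue
        # collect items the neighbour bought that the user does not own yet
        for item in sorted(purchases[users[j]] - purchases[user]):
            if item not in recs:
                recs.append(item)
    return recs[:k]
```

Since u1 and u2 share the keywords "shoes" and "socks", `recommend("u1")` pulls from u2's purchases first; a Doc2Vec model would replace the TF-IDF step with learned document vectors.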

Construction of Test Collection for Evaluation of Scientific Relation Extraction System (과학기술분야 용어 간 관계추출 시스템의 평가를 위한 테스트컬렉션 구축)

  • Choi, Yun-Soo; Choi, Sung-Pil; Jeong, Chang-Hoo; Yoon, Hwa-Mook; You, Beom-Jong
    • Proceedings of the Korea Contents Association Conference / 2009.05a / pp.754-758 / 2009
  • Extracting information from large-scale documents is very useful not only for information retrieval but also for question answering and summarization. Although relation extraction is a very important area, it is difficult to develop and evaluate a machine-learning-based system without a test collection. This study shows how to build a test collection (KREC2008) for relation extraction systems. We extracted technology terms from journal abstracts and selected several candidate relations between them using WordNet. Judges who were well trained in the evaluation process assigned a relation from the candidates. The process provides a method with which even non-experts can easily build a test collection. KREC2008 is open to the public for researchers and developers and will be used for the development and evaluation of relation extraction systems.


The 4th Industrial Revolution and the Changing Role of Korean Universities (4차산업혁명과 한국대학의 역할 변화)

  • Park, Sang-Kyu
    • Journal of Convergence for Information Technology / v.8 no.1 / pp.235-242 / 2018
  • Interest in the 4th Industrial Revolution has increased markedly in newspapers, industry, government, and academia. In particular, AI, whose effects many people can now feel directly, has already surpassed human ability even in creative areas; many people are starting to feel that the effects of the revolution are right in front of them. Several issues arise in this trend: the ability of machines to perform deep learning, the identity of humans, changes in the job environment, and concerns about social change. Recently, many studies of the 4th Industrial Revolution have been conducted in fields such as AI (artificial intelligence), CRISPR, big data, and driverless cars. Since positive and negative effects coexist and many preventive actions have recently been suggested, these opinions are compared and analyzed here to find better solutions. Several educational, political, scientific, social, and ethical effects and solutions are studied and suggested. The clear implication of this study is that the world we will live in from now on is changing faster than ever in its social, industrial, political, and educational environment. A society (nation or government) that reforms its social systems according to these changes will grasp the chance for development or take-off; otherwise, it will consume its resources ineffectively and lose the competition as a whole. However, the method of that reform is not apparent in many respects while the revolution is still in progress, and the revolution itself should be defined in both industrial and scientific terms. The person or nation that defines it will have the advantage of leading the future of that business or society.

A Study on the Effects of Online Word-of-Mouth on Game Consumers Based on Sentimental Analysis (감성분석 기반의 게임 소비자 온라인 구전효과 연구)

  • Jung, Keun-Woong; Kim, Jong Uk
    • Journal of Digital Convergence / v.16 no.3 / pp.145-156 / 2018
  • Unlike the past, when distributors sold games through retail stores, they now sell digital content through online distribution channels. This study analyzes the effects of eWOM (electronic word of mouth) on the sales volume of games sold on Steam, an online digital content distribution channel. Recently, data mining techniques based on big data have been widely studied. In this study, the emotion index of eWOM is derived by sentiment analysis, a text mining technique that can analyze the emotion of each review among the factors of eWOM. The sentiment analysis uses Naive Bayes and SVM classifiers, and the emotion index is calculated with the SVM classifier, which showed higher accuracy. Regression analysis is performed on the dependent variable, sales variation, using the emotion index, the number of reviews of each game, the size of the eWOM, and the user score of each game (the rating of the eWOM). The regression analysis revealed that the size of the eWOM and its emotion index significantly influence sales variation. This study suggests the factors of eWOM that affect sales volume when Korean game companies enter overseas markets through Steam.
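The review-level sentiment step might look like the sketch below: an SVM classifier trained on labelled reviews, with the emotion index taken as the share of reviews predicted positive. The training reviews, labels, and the tiny corpus size are all illustrative assumptions, not the paper's data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Tiny hand-labelled stand-in training set (1 = positive, 0 = negative);
# the study would train on a large corpus of labelled Steam reviews.
train_reviews = [
    "great game loved every minute", "fantastic story and fun combat",
    "awful controls total waste of money", "boring and buggy refund",
    "amazing soundtrack great level design", "terrible performance crashes",
]
train_labels = [1, 1, 0, 0, 1, 0]

clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(train_reviews, train_labels)

def emotion_index(reviews):
    """Share of reviews the classifier labels positive, in [0, 1]."""
    return clf.predict(reviews).mean()

new_reviews = ["great fun combat", "buggy waste of money"]
idx = emotion_index(new_reviews)
```

This per-game `idx`, together with review counts and user scores, would then enter the regression as an independent variable.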

A Method for Correcting Air-Pressure Data Collected by Mini-AWS (소형 자동기상관측장비(Mini-AWS) 기압자료 보정 기법)

  • Ha, Ji-Hun; Kim, Yong-Hyuk; Im, Hyo-Hyuc; Choi, Deokwhan; Lee, Yong Hee
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.3 / pp.182-189 / 2016
  • For highly accurate forecasts using numerical weather prediction models, we need weather observation data that are both extensive and dense. The Korea Meteorological Administration (KMA) maintains Automatic Weather Stations (AWSs) to collect weather observation data, but their installation and maintenance costs are high. The Mini-AWS is a very compact automatic weather station that can measure and record temperature, humidity, and pressure. In contrast to a full AWS, a Mini-AWS has low installation and maintenance costs and few space constraints, so it is easier to install wherever observation data are wanted. However, the data observed by Mini-AWSs cannot be used directly, because they can be affected by the surroundings. In this paper, we suggest a correction method that allows pressure data observed by a Mini-AWS to be used as weather observation data. We preprocessed the Mini-AWS pressure data and then corrected them using machine learning methods, with the aim of matching the pressure data of the nearest AWS. Our experimental results showed that the corrected pressure data meet the regulation standard, and our correction method using SVR performed very well.
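The SVR correction step can be sketched as a regression from raw Mini-AWS readings onto the nearest AWS values. The synthetic bias model (a fixed offset plus a slow oscillation and noise) is an assumption made only to have data to fit, not the error structure reported in the paper.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Synthetic stand-in: reference AWS pressure (hPa) and a Mini-AWS that
# reads with a systematic offset plus a smooth siting-dependent error.
aws = rng.uniform(990.0, 1030.0, size=300)                      # target
mini = aws + 2.5 + 0.3 * np.sin(aws / 5.0) + rng.normal(0, 0.2, 300)

# Train SVR to map raw Mini-AWS readings onto the reference AWS values.
model = SVR(kernel="rbf", C=100.0, epsilon=0.1)
model.fit(mini.reshape(-1, 1), aws)

corrected = model.predict(mini.reshape(-1, 1))
err_before = mean_absolute_error(aws, mini)
err_after = mean_absolute_error(aws, corrected)
```

The same pattern extends naturally to richer inputs (temperature, humidity, time of day) as extra feature columns.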

Study on High-speed Cyber Penetration Attack Analysis Technology based on Static Feature Base Applicable to Endpoints (Endpoint에 적용 가능한 정적 feature 기반 고속의 사이버 침투공격 분석기술 연구)

  • Hwang, Jun-ho; Hwang, Seon-bin; Kim, Su-jeong; Lee, Tae-jin
    • Journal of Internet Computing and Services / v.19 no.5 / pp.21-31 / 2018
  • Cyber penetration attacks can not only damage cyberspace but also strike entire infrastructures such as electricity, gas, water, and nuclear power, causing enormous damage to people's lives. Cyberspace has already been defined as the fifth battlefield, and strategic responses are very important. Most recent cyber attacks are caused by malicious code, and since more than 1.6 million samples appear per day, automated analysis technology that can cope with a large amount of malicious code is essential. However, it is difficult to deal with the encryption, obfuscation, and packing of malicious code, and dynamic analysis is limited not only by its performance requirements but also by techniques that evade virtual environments. In this paper, we propose a machine-learning-based malicious code analysis technique that remedies the detection weaknesses of existing analysis technologies while maintaining the lightweight, high-speed analysis performance applicable to commercial endpoints. On 71,000 normal and malicious files in a commercial environment, the proposed method achieved 99.13% accuracy, 99.26% precision, and 99.09% recall, and it can analyze more than five files per second in a PC environment. It can operate independently in the endpoint environment and work in a complementary fashion with existing antivirus technology and static and dynamic analysis technology. It is also expected to serve as a core element of EDR technology and malware variant analysis.
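A static-feature classifier of the kind the abstract describes might be sketched as below. The byte-frequency histogram is one common static feature; the paper's actual feature set is not specified here, and the synthetic "files" (ASCII-heavy benign vs. uniform packed-looking malicious) are an assumption made only to exercise the pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

def byte_histogram(data: bytes) -> np.ndarray:
    """Static feature: normalised 256-bin byte-frequency histogram.
    Computed without executing the file, so it stays fast on endpoints."""
    counts = np.bincount(np.frombuffer(data, dtype=np.uint8), minlength=256)
    return counts / max(len(data), 1)

# Synthetic stand-in files: "benign" skewed toward printable ASCII bytes,
# "malicious" close to uniform (as packed/encrypted payloads tend to be).
def fake_file(malicious):
    if malicious:
        return rng.integers(0, 256, 1024, dtype=np.uint8).tobytes()
    return rng.integers(32, 127, 1024, dtype=np.uint8).tobytes()

X = np.array([byte_histogram(fake_file(i % 2 == 0)) for i in range(200)])
y = np.array([i % 2 == 0 for i in range(200)], dtype=int)  # 1 = malicious

clf = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
```

Because feature extraction is a single pass over the bytes, throughput of several files per second on a PC, as the abstract reports, is plausible for this class of method.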

The study of Defense Artificial Intelligence and Block-chain Convergence (국방분야 인공지능과 블록체인 융합방안 연구)

  • Kim, Seyong; Kwon, Hyukjin; Choi, Minwoo
    • Journal of Internet Computing and Services / v.21 no.2 / pp.81-90 / 2020
  • The purpose of this study is to examine how block-chain technology can be applied to prevent data forgery and alteration in defense applications of AI (artificial intelligence). AI predicts from big data by clustering or classifying it with various machine learning methodologies, and military powers including the U.S. have reached the completion stage of the technology. If the data behind an AI system are forged or altered, then even a flawless processing pipeline could become the biggest risk factor, and such falsification can be all too easy through hacking. Unexpected attacks could occur if the data used by weaponized AI were hacked and manipulated by North Korea. Therefore, technology that prevents data from being falsified or altered is essential for the use of AI. Data forgery prevention is expected to be achieved by applying block-chain: encrypted data are stored in a distributed way and linked by hash functions, so that the data cannot be damaged, even if a single computer is hacked, unless more than half of the connected computers agree.
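The tamper-evidence property the abstract relies on can be shown with a minimal hash chain. This is a generic sketch of block linking via SHA-256, not the study's system; the distributed consensus ("more than half must agree") is omitted, and the sensor records are invented.

```python
import hashlib
import json

def block_hash(block):
    # Canonical JSON so the same block always hashes identically.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_chain(records):
    chain, prev = [], "0" * 64  # genesis predecessor
    for rec in records:
        block = {"data": rec, "prev": prev}
        prev = block_hash(block)  # next block commits to this one
        chain.append(block)
    return chain

def verify(chain):
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev:  # link broken => data was altered
            return False
        prev = block_hash(block)
    return True

chain = make_chain([{"sensor": "radar-1", "reading": 42},
                    {"sensor": "radar-1", "reading": 43}])
ok_before = verify(chain)
chain[0]["data"]["reading"] = 999  # simulated forgery of stored AI input
ok_after = verify(chain)
```

Changing any earlier record changes its hash, so every later block's stored `prev` no longer matches, which is exactly why forged training or sensor data becomes detectable.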

Clustering of Smart Meter Big Data Based on KNIME Analytic Platform (KNIME 분석 플랫폼 기반 스마트 미터 빅 데이터 클러스터링)

  • Kim, Yong-Gil; Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.13-20 / 2020
  • One of the major issues surrounding big data is the availability of massive time-based or telemetry data. The appearance of low-cost capture and storage devices has now made it possible to obtain very detailed time data for further analysis. We can use these time data to gain more knowledge about the underlying system or to predict future events with higher accuracy. In particular, it is very important to define custom-tailored contract offers for the many households and businesses with smart meter records, and to predict future electricity usage to protect electricity companies from power shortages or surpluses. Identifying a few groups with common electricity behavior is required to make the creation of customized contract offers worthwhile. This study presents a big data transformation workflow and a clustering technique for understanding electricity usage patterns, using open smart meter data and KNIME, an open source platform for data analytics that provides a user-friendly graphical workbench for the entire analysis process. While the big data components are not open source, they are available for trial if required. After importing, cleaning, and transforming the smart meter big data, each meter's data can be interpreted in terms of electricity usage behavior through a dynamic time warping method.
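The dynamic time warping step mentioned at the end can be sketched in plain code: DTW scores two load profiles as similar even when their peaks are shifted in time, which is what makes it suitable for grouping usage behavior. The hourly profiles below are invented for illustration; the study works inside KNIME rather than in Python.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical daily load profiles (kWh per hour, 6 readings for brevity).
morning_user = np.array([0.2, 1.5, 1.4, 0.3, 0.2, 0.2])
shifted_user = np.array([0.2, 0.3, 1.5, 1.4, 0.2, 0.2])  # same shape, shifted
night_user   = np.array([0.2, 0.2, 0.2, 0.3, 1.4, 1.5])

d_similar = dtw_distance(morning_user, shifted_user)
d_different = dtw_distance(morning_user, night_user)
```

A clustering step would then feed these pairwise DTW distances into, e.g., hierarchical clustering to obtain the behavior groups used for tailored contract offers.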

Current status and future plans of KMTNet microlensing experiments

  • Chung, Sun-Ju; Gould, Andrew; Jung, Youn Kil; Hwang, Kyu-Ha; Ryu, Yoon-Hyun; Shin, In-Gu; Yee, Jennifer C.; Zhu, Wei; Han, Cheongho; Cha, Sang-Mok; Kim, Dong-Jin; Kim, Hyun-Woo; Kim, Seung-Lee; Lee, Chung-Uk; Lee, Yongseok
    • The Bulletin of The Korean Astronomical Society / v.43 no.1 / pp.41.1-41.1 / 2018
  • We introduce the current status and future plans of the Korea Microlensing Telescope Network (KMTNet) microlensing experiments, including the observational strategy, pipeline, event-finder, and collaborations with Spitzer. The KMTNet experiments were initiated in 2015. Since 2016, KMTNet has observed 27 fields, comprising 6 main fields and 21 subfields. In 2017, we finished the DIA photometry for all 2016 and 2017 data, which makes real-time DIA photometry possible from 2018. The DIA photometric data are used for finding events with the KMTNet event-finder, which has been improved relative to the previous version that found 857 events in the 4 main fields of 2015. We applied the improved version to all 2016 data and found 2597 events, of which 265 lie in the KMTNet-K2C9 overlapping fields. To increase the detection efficiency of the event-finder, we are working on filtering out false events with a machine-learning method. In 2018, we plan to measure the event detection efficiency of KMTNet by injecting fake events into the pipeline near the image level. Thanks to high-cadence observations, KMTNet has found many interesting events, including exoplanets and brown dwarfs that were not found by other groups; the masses of these objects are measured through collaborations with Spitzer and other groups. KMTNet has cooperated closely with Spitzer since 2015 and observes the Spitzer fields, which has allowed microlens parallaxes to be measured for many events. The automated KMTNet PySIS pipeline was developed before the 2017 Spitzer season and played a very important role in selecting Spitzer targets. For the 2018 Spitzer season, we will improve the PySIS pipeline to obtain better photometric results.
