• Title/Summary/Keyword: keyword-based system

Search Result 519, Processing Time 0.029 seconds

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.1-23
    • /
    • 2018
  • Since the start of the 21st century, the growth of the internet and of information and communication technologies has produced a wide range of high-quality services. E-commerce in particular, led by companies such as Amazon and eBay, has expanded rapidly. As more products are registered at online shopping malls, customers can compare products and buy what they want more easily. This growth has also created a problem: with so many products registered, customers find it difficult to locate what they actually need. A search with a general keyword returns too many products, whereas a search with detailed product attributes returns few, because such attributes are rarely registered. Automatically recognizing text in images can address this. Because the bulk of product details is presented in catalogs as images, most product information cannot be found by current text-based search systems. If the information in the images were converted to text, customers could search by product details and shop more conveniently. Existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they fail under certain conditions, such as small text or inconsistent fonts. This research therefore proposes a way to recognize keywords in catalogs with deep learning, the state of the art in image recognition since the 2010s. 
Single Shot Multibox Detector (SSD), a model well regarded for object detection, can be used once its structure is redesigned to account for the differences between text and objects. However, because SSD is trained by supervised learning, it needs a large amount of labeled training data. Such data could be collected by manually labeling the location and class of text in catalogs, but manual collection causes several problems. Some keywords would be missed through human error. Collecting data at the required scale would be too time-consuming, or too costly if many workers were hired to shorten the time. Finding images that contain the specific keywords needed for training would also be difficult. To solve this data problem, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures and saves the location information of the keywords at the same time. With this program, data can be collected efficiently and the performance of the SSD model also improves: the model recorded a recognition rate of 81.99% with 20,000 training samples created by the program. The research also tested how differences in the data affect the SSD model, in order to analyze which features of the data influence text recognition performance. The results show that the number of labeled keywords, the addition of overlapping keyword labels, the presence of unlabeled keywords, the spacing between keywords, and the differences in background images are all related to the performance of the SSD model. These findings can guide performance improvements, through higher-quality data, for the SSD model or other deep-learning-based text recognizers. 
The SSD model redesigned to recognize text in images and the program developed to create training data are expected to contribute to better search systems in E-commerce. Suppliers can spend less time registering keywords for products, and customers can search for products using the product details written in the catalog.
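The labeling side of such a generator can be sketched as follows. This is a minimal illustration, not the authors' program: it only chooses random, non-overlapping positions for keywords and records their bounding boxes, and it leaves out the actual image rendering. The names `Box` and `place_keywords`, the canvas size, and the fixed character width are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box of one keyword (hypothetical label format)."""
    label: str
    x: int
    y: int
    w: int
    h: int

    def overlaps(self, other: "Box", gap: int = 0) -> bool:
        # Two boxes overlap unless one lies entirely to the side of,
        # above, or below the other (with at least `gap` pixels between).
        return not (
            self.x + self.w + gap <= other.x
            or other.x + other.w + gap <= self.x
            or self.y + self.h + gap <= other.y
            or other.y + other.h + gap <= self.y
        )


def place_keywords(keywords, canvas_w=300, canvas_h=300,
                   char_w=12, line_h=20, gap=4, seed=0, max_tries=200):
    """Pick a random non-overlapping position for each keyword.

    Text size is approximated as char_w pixels per character, which is
    an assumption; a real generator would measure the rendered font.
    """
    rng = random.Random(seed)
    placed = []
    for word in keywords:
        w, h = char_w * len(word), line_h
        if w > canvas_w or h > canvas_h:
            continue  # keyword cannot fit on the canvas at all
        for _ in range(max_tries):
            cand = Box(word,
                       rng.randint(0, canvas_w - w),
                       rng.randint(0, canvas_h - h),
                       w, h)
            if not any(cand.overlaps(p, gap) for p in placed):
                placed.append(cand)
                break
    return placed


if __name__ == "__main__":
    for box in place_keywords(["cotton", "waterproof", "USB"]):
        print(box)
```

Each returned `Box` carries both the class (the keyword) and the location, which is the kind of label a detector such as SSD is trained on. The `gap` parameter corresponds loosely to the spacing between keywords that the abstract reports as affecting performance.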

Design and Implementation of Web Based Instruction Based on Constructivism for Self-Directed Learning Ability (구성주의 이론에 기반한 자기주도적 웹 기반 교육의 설계와 구현)

  • Kim Gi-Nam;Kim Eui-Jeong;Kim Chang-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2006.05a
    • /
    • pp.855-858
    • /
    • 2006
  • Advances in information technology are changing the paradigm in every field, including education. Students can choose their own learning goals and objects and acquire methods of learning rather than an accumulation of knowledge. Teachers become advisers, and students play the key role in learning; that is, the subject of learning is the student. Constructivism emphasizes a student-oriented educational environment, which matches the characteristics of hypermedia. The Internet also makes it possible to put constructivism into practice: Web Based Instruction provides a suitable environment for constructivist practice and drives change in the education system. Web Based Instruction motivates students to learn more, and they can obtain plenty of information regardless of place or time. They can also consult more up-to-date information about their studies using hypermedia such as images, audio, video, and text, and communicate effectively with their instructor through bulletin boards, e-mail, chat, and so on. Schools and instructors have been working to develop new teaching models to cope with this changing environment. This thesis, 'Design and Implementation of Web Based Instruction Based on Constructivism', provides online, learner-oriented, indexed video lessons that give learners the opportunity for self-directed learning. Learners do not have to cover the entire content of a lesson but can choose the content they want from the indexed list of a lesson, and they can search for the content they want with a 'Keyword Search' on the main page, which can improve their achievement.

Analysis of Rice Blast Outbreaks in Korea through Text Mining (텍스트 마이닝을 통한 우리나라의 벼 도열병 발생 개황 분석)

  • Song, Sungmin;Chung, Hyunjung;Kim, Kwang-Hyung;Kim, Ki-Tae
    • Research in Plant Disease
    • /
    • v.28 no.3
    • /
    • pp.113-121
    • /
    • 2022
  • Rice blast is a major plant disease that occurs worldwide and significantly reduces rice yields. Rice blast disease occurs periodically in Korea, causing significant socio-economic damage due to the unique status of rice as a major staple crop. A disease outbreak prediction system is required for preventing rice blast disease. Epidemiological investigations of disease outbreaks can aid in decision-making for plant disease management. Currently, plant disease prediction and epidemiological investigations are mainly based on quantitatively measurable, structured data such as crop growth and damage, weather, and other environmental factors. On the other hand, text data related to the occurrence of plant diseases are accumulated along with the structured data. However, epidemiological investigations using these unstructured data have not been conducted. The useful information extracted using unstructured data can be used for more effective plant disease management. This study analyzed news articles related to the rice blast disease through text mining to investigate the years and provinces where rice blast disease occurred most in Korea. Moreover, the average temperature, total precipitation, sunshine hours, and supplied rice varieties in the regions were also analyzed. Through these data, it was estimated that the primary causes of the nationwide outbreak in 2020 and the major outbreak in Jeonbuk region in 2021 were meteorological factors. These results obtained through text mining can be combined with deep learning technology to be used as a tool to investigate the epidemiology of rice blast disease in the future.
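The counting step behind "years and provinces where rice blast disease occurred most" can be illustrated with a short sketch. It assumes each article is available as a `(year, text)` pair and matches a fixed keyword and province names by substring; the province list, the function name `count_outbreak_mentions`, and the sample articles are made up for illustration and are not the study's data or pipeline.

```python
from collections import Counter

# Illustrative subset of province names (assumption, not the study's list).
PROVINCES = ["Jeonbuk", "Jeonnam", "Gyeongbuk", "Chungnam"]


def count_outbreak_mentions(articles, keyword="rice blast"):
    """Count keyword-bearing articles per year and per (year, province)."""
    by_year = Counter()
    by_province = Counter()
    for year, text in articles:
        lowered = text.lower()
        if keyword not in lowered:
            continue  # article is not about the disease
        by_year[year] += 1
        for province in PROVINCES:
            if province.lower() in lowered:
                by_province[(year, province)] += 1
    return by_year, by_province


if __name__ == "__main__":
    # Toy articles invented for the example.
    sample = [
        (2020, "Rice blast spreads after long rainy season in Jeonnam"),
        (2021, "Jeonbuk farmers report severe rice blast damage"),
        (2021, "Rice prices rise in Jeonbuk"),
    ]
    years, provinces = count_outbreak_mentions(sample)
    print(years.most_common())
    print(provinces.most_common())
```

Counts of this kind could then be compared with temperature, precipitation, and sunshine records for the same years and regions, as the abstract describes.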

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

  • Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.203-219
    • /
    • 2011
  • In the past, ordinary users simply wanted to watch video content without any specific requirement or purpose. Today, however, users watching a video try to learn and discover more about the things that appear in it. As a result, the demand for finding multimedia, or for browsing information about objects of interest, is spreading with the growing use of multimedia such as video, which is available not only on internet-capable devices such as computers but also on smart TVs and smartphones. Meeting users' requirements makes labor-intensive annotation of objects in video content inevitable, and many researchers have therefore actively studied methods of annotating the objects that appear in video. In keyword-based annotation, information related to an object that appears in the video is added directly, and annotation data containing all the related information about the object must be managed individually. Users have to enter all the related information for the object themselves. Consequently, when a user browses for information related to the object, only the limited resources that exist in the annotation data can be found. Placing annotations on objects also imposes a heavy workload on users. To reduce this workload, existing object-based annotation attempts automatic annotation using computer vision techniques such as object detection, recognition, and tracking. With such techniques, the wide variety of objects that appear in video content must all be detected and recognized, and automated annotation still faces difficulties in doing so. To overcome these difficulties, we propose a system that consists of two modules. 
The first module is the annotation module, which enables many annotators to collaboratively annotate the objects in video content so that semantic data can be accessed using Linked Data. Annotation data managed by the annotation server is represented using an ontology so that the information can easily be shared and extended. The annotation data does not include all the relevant information about an object; instead, objects that appear in the video content are simply connected to existing objects in Linked Data to obtain all the related information. In other words, only the URI and metadata such as position, time, and size are stored on the annotation server, and when a user needs other related information about the object, it is retrieved from Linked Data through the relevant URI. The second module enables viewers, while watching a video, to browse interesting information about an object using the annotation data generated collaboratively by many users. With this system, a simple user interaction automatically generates a query, all the related information is retrieved from Linked Data, and the additional information about the object is presented to the user. In a future Semantic Web environment, the proposed system is expected to establish a better video content service environment by offering users relevant information about the objects that appear on the screen of any internet-capable device such as a PC, smart TV, or smartphone.
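The idea of storing only a URI plus position, time, and size, and generating a query from a user interaction, can be sketched as below. The record layout, the names `Annotation`, `build_query`, and `query_for_click`, and the example DBpedia URI are assumptions for illustration; the abstract does not specify the system's ontology or query form. The query string is a plain SPARQL `SELECT` over all properties of the resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotated object: a Linked Data URI plus when and where it appears."""
    uri: str
    start: float   # seconds into the video
    end: float
    x: int         # top-left corner of the object region, in pixels
    y: int
    width: int
    height: int

    def contains(self, t: float, px: int, py: int) -> bool:
        # True if the point (px, py) at playback time t falls on this object.
        return (self.start <= t <= self.end
                and self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def build_query(uri: str, limit: int = 50) -> str:
    """Build a SPARQL query asking for every property of the resource."""
    return f"SELECT ?property ?value WHERE {{ <{uri}> ?property ?value }} LIMIT {limit}"


def query_for_click(annotations, t, px, py):
    """Return the query for the first annotation hit by a click, else None."""
    for annotation in annotations:
        if annotation.contains(t, px, py):
            return build_query(annotation.uri)
    return None


if __name__ == "__main__":
    tower = Annotation("http://dbpedia.org/resource/Eiffel_Tower",
                       10.0, 15.0, 100, 50, 80, 120)
    print(query_for_click([tower], 12.0, 120, 60))
```

Because the annotation holds only the URI, any information added to the Linked Data source later becomes available to viewers without changing the stored annotation.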

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (비정형 텍스트 분석을 활용한 이슈의 동적 변이과정 고찰)

  • Lim, Myungsu;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.1-18
    • /
    • 2016
  • Owing to the extensive use of Web media and the development of the IT industry, a large amount of data has been generated, shared, and stored. Nowadays, various types of unstructured data such as image, sound, video, and text are distributed through Web media. Therefore, many attempts have been made in recent years to discover new value through an analysis of these unstructured data. Among these types of unstructured data, text is recognized as the most representative method for users to express and share their opinions on the Web. In this sense, demand for obtaining new insights through text analysis is steadily increasing. Accordingly, text mining is increasingly being used for different purposes in various fields. In particular, issue tracking is being widely studied not only in the academic world but also in industries because it can be used to extract various issues from text such as news and SNS (Social Network Services) and to analyze the trends of these issues. Conventionally, issue tracking is used to identify major issues sustained over a long period of time through topic modeling and to analyze the detailed distribution of documents involved in each issue. However, because conventional issue tracking assumes that the content composing each issue does not change throughout the entire tracking period, it cannot represent the dynamic mutation process of detailed issues that can be created, merged, divided, and deleted between these periods. Moreover, because only keywords that appear consistently throughout the entire period can be derived as issue keywords, concrete issue keywords such as "nuclear test" and "separated families" may be concealed by more general issue keywords such as "North Korea" in an analysis over a long period of time. This implies that many meaningful but short-lived issues cannot be discovered by conventional issue tracking. 
Note that detailed keywords are preferable to general keywords because the former can be clues for providing actionable strategies. To overcome these limitations, we performed an independent analysis on the documents of each detailed period. We generated an issue flow diagram based on the similarity of each issue between two consecutive periods. The issue transition pattern among categories was analyzed by using the category information of each document. In this study, we then applied the proposed methodology to a real case of 53,739 news articles. We derived an issue flow diagram from the articles. We then proposed the following useful application scenarios for the issue flow diagram presented in the experiment section. First, we can identify an issue that actively appears during a certain period and promptly disappears in the next period. Second, the preceding and following issues of a particular issue can be easily discovered from the issue flow diagram. This implies that our methodology can be used to discover the association between inter-period issues. Finally, an interesting pattern of one-way and two-way transitions was discovered by analyzing the transition patterns of issues through category analysis. Thus, we discovered that a pair of mutually similar categories induces two-way transitions. In contrast, one-way transitions can be recognized as an indicator that issues in a certain category tend to be influenced by other issues in another category. For practical application of the proposed methodology, high-quality word and stop word dictionaries need to be constructed. In addition, not only the number of documents but also additional meta-information such as the read counts, written time, and comments of documents should be analyzed. A rigorous performance evaluation or validation of the proposed methodology should be performed in future works.
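Linking issues across consecutive periods by similarity can be sketched as follows. The abstract does not name the similarity measure, so cosine similarity over keyword weights is an assumption here, as are the threshold value, the function names, and the toy issues.

```python
import math


def cosine(a, b):
    """Cosine similarity between two keyword-weight dictionaries."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def issue_flow(periods, threshold=0.3):
    """Link issues in period p to issues in period p + 1.

    `periods` is a list ordered by time; each element maps an issue name
    to its keyword weights. Returns (period_index, from_issue, to_issue,
    similarity) tuples for every pair at or above the threshold.
    """
    edges = []
    for p in range(len(periods) - 1):
        for name_a, vec_a in periods[p].items():
            for name_b, vec_b in periods[p + 1].items():
                sim = cosine(vec_a, vec_b)
                if sim >= threshold:
                    edges.append((p, name_a, name_b, round(sim, 3)))
    return edges


if __name__ == "__main__":
    # Toy issues invented for the example.
    periods = [
        {"A": {"north korea": 1.0, "nuclear test": 1.0}},
        {"B": {"nuclear test": 1.0, "sanction": 1.0},
         "C": {"election": 1.0}},
    ]
    for edge in issue_flow(periods):
        print(edge)
```

An issue with no outgoing edge disappears in the next period, and an issue with several incoming or outgoing edges corresponds to a merge or a split in the flow diagram.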

Analysis of Research Trends of 'Word of Mouth (WoM)' through Main Path and Word Co-occurrence Network (주경로 분석과 연관어 네트워크 분석을 통한 '구전(WoM)' 관련 연구동향 분석)

  • Shin, Hyunbo;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.179-200
    • /
    • 2019
  • Word-of-mouth (WoM) is defined as consumer activity that shares information about consumption. WoM activities have long been recognized as important in corporate marketing processes and have received much attention, especially in the marketing field. Recently, with the development of the Internet, the ways in which people exchange information through online news and online communities have expanded, and WoM has diversified into forms such as word of mouth, scores, ratings, and likes. Social media gives online users easy access to information, and online WoM is considered a key source of information. Although various studies on WoM have followed from this phenomenon, no meta-analysis has comprehensively analyzed them. This study proposes a method that applies text mining techniques to extract the major studies and grasp their main issues, in order to find the trend of WoM research using scholarly big data. To this end, a total of 4389 documents published from 1941 to 2018 were collected with the keyword 'Word-of-mouth' from Scopus (www.scopus.com), a citation database, and the data were refined through preprocessing such as English morphological analysis, stopword removal, and noun extraction. We adopted main path analysis (MPA) and word co-occurrence network analysis. MPA detects key studies, is used to track the development trajectory of an academic field, and presents the research trend from a macro perspective. For MPA, we constructed a citation network from the collected data, in which a node represents a document and a link represents a citation relation. We then detected the key-route main path by applying SPC (Search Path Count) weights. As a result, a main path composed of 30 documents was extracted from the citation network. 
The main path made it possible to confirm how the academic field changed over time, reflecting industrial change across various industry groups. The results of MPA revealed that WoM research can be divided into five periods: (1) establishment of aspects and critical elements of WoM, (2) relationship analysis between WoM variables, (3) beginning of research on online WoM, (4) relationship analysis between WoM and purchase, and (5) broadening of topics. Changes within the industry, such as online development and social media, were reflected in the results. Very recent studies showed that the topics and approaches related to WoM were diversifying in response to changing circumstances. However, even though WoM was used in diverse fields, the main stream of WoM research from start to end was related to marketing and to identifying the influential factors that proliferate WoM. Word co-occurrence network analysis presents the research trend from a micro perspective. A word co-occurrence network was constructed to analyze the relationships between keywords, and social network analysis (SNA) was applied. We divided the data into three periods to investigate periodic changes and trends in the discussion of WoM. SNA showed that Period 1 (1941~2008) consisted of clusters regarding relationship, source, and consumers. Period 2 (2009~2013) contained clusters of satisfaction, community, social networks, review, and internet. Clusters of Period 3 (2014~2018) involved satisfaction, medium, review, and interview. The periodic changes of clusters showed a transition from offline to online WoM, and the media of WoM have become an important factor in spreading it. This study conducted a quantitative meta-analysis based on scholarly big data regarding WoM. 
The main contribution of this study is that it provides both a micro and a macro perspective on the research trend of WoM. A limitation is that the citation network constructed for MPA is based only on direct citation relations among the collected documents.
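The SPC (Search Path Count) weight of a citation link is the number of source-to-sink paths that pass through it. A minimal sketch of that computation on a small acyclic citation network is shown below; it covers only the weighting step, not the key-route search the study applied afterwards. The function name `spc_weights`, the edge direction (cited document to citing document), and the toy network are assumptions for the example.

```python
from graphlib import TopologicalSorter


def spc_weights(edges):
    """Compute the Search Path Count of every edge in a citation DAG.

    `edges` is an iterable of (cited, citing) pairs, so knowledge flows
    from the first node to the second. SPC(u, v) equals the number of
    paths from any source to u times the number of paths from v to any
    sink.
    """
    edges = list(edges)
    nodes = {n for edge in edges for n in edge}
    preds = {n: set() for n in nodes}
    succs = {n: set() for n in nodes}
    for u, v in edges:
        succs[u].add(v)
        preds[v].add(u)

    # static_order() yields each node after all of its predecessors and
    # raises graphlib.CycleError if the network is not acyclic.
    order = list(TopologicalSorter(preds).static_order())

    from_sources = {}
    for n in order:
        from_sources[n] = sum(from_sources[p] for p in preds[n]) if preds[n] else 1

    to_sinks = {}
    for n in reversed(order):
        to_sinks[n] = sum(to_sinks[s] for s in succs[n]) if succs[n] else 1

    return {(u, v): from_sources[u] * to_sinks[v] for u, v in edges}


if __name__ == "__main__":
    # Toy network: A is cited by B and C, B is cited by C, C is cited by D.
    toy = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    for edge, weight in spc_weights(toy).items():
        print(edge, weight)
```

In the toy network there are two source-to-sink paths (A-B-C-D and A-C-D), and the link from C to D lies on both, so it receives the highest weight. A main path is then traced along the heaviest links.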

Fourth industrial revolution of Women's University Students and change of intelligent information technology

  • Hwang, Eui-Chul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.11
    • /
    • pp.235-243
    • /
    • 2019
  • Universities are opening related majors and subjects to nurture the convergent problem-solving talent that businesses want, at a time of rapid technological change. In this paper, we analyzed three years (2017-2019) of survey results from women's university students in order to understand and respond to changes in the 4th industrial revolution and intelligent information technology. The results show that 1) interest in the 4th industrial revolution increased from 59% in 2017 to 80% in 2019, 2) IoT, ICT, Artificial Intelligence, and Education Research System were the top priorities in technical strategy, 3) the prime keywords were AI, robot, and job, 4) 50% expected opportunities and the number of jobs in the science and technology field to increase, 5) the importance of universities and companies was rated at 50% and 80%, respectively, 6) the information most needed about science and technology was, in order, educational discipline, change in future science, and prospective future information, and 7) the most needed education was, in order, education on creativity, coding, cross-subject, and engineering. In the era of the fourth industrial revolution, it is essential to expand the SW manpower base in various fields. University education, which should provide connectivity for super-fusion, should provide curricula optimized for industrial demands such as fusion and connected education, creative thinking, and self-directed problem solving.

An Overview on Features of Research Topics in the Asia Pacific Journal of Small Business (APJSB) for 40 Years (「중소기업연구」 40년 연구주제의 전체 조망)

  • Kim, Sanghee;Lee, Choonwoo
    • Korean small business review
    • /
    • v.42 no.4
    • /
    • pp.47-67
    • /
    • 2020
  • This study analyzed the papers published in the Asia Pacific Journal of Small Business (APJSB) over 40 years. Its purpose is to examine research trends concerning small and medium-sized businesses. We tried to identify streams and features without manipulation. Text mining and frequency analysis were performed on the topics of every paper published in APJSB from 1979 to 2019. The results identify the important keywords and features of research topics in APJSB, and show the features of each period as well as the overall research trend in APJSB over 40 years. Furthermore, we suggest some implications derived from the results by adapting a business ecosystem model and a business managerial system model.
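A frequency analysis of paper topics by period can be sketched in a few lines. The grouping by decade, the regex tokenizer, the small stop-word list, and the sample titles are assumptions for illustration; the abstract does not describe the study's preprocessing.

```python
import re
from collections import Counter, defaultdict

# Minimal English stop-word list (assumption; a real analysis needs a fuller one).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to"}


def keyword_frequency_by_decade(papers):
    """Count title words per decade from (year, title) pairs."""
    counts = defaultdict(Counter)
    for year, title in papers:
        decade = year // 10 * 10
        tokens = re.findall(r"[a-z]+", title.lower())
        counts[decade].update(t for t in tokens if t not in STOP_WORDS)
    return counts


if __name__ == "__main__":
    # Toy titles invented for the example.
    sample = [
        (1985, "Financing of Small Business"),
        (1988, "Small Business Growth"),
        (2015, "Innovation in Small Firms"),
    ]
    for decade, counter in sorted(keyword_frequency_by_decade(sample).items()):
        print(decade, counter.most_common(3))
```

Comparing the top terms of each decade gives the period features, and summing the counters gives the overall picture across all years.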

Analysis of articles on water quality accidents in the water distribution networks using big data topic modelling and sentiment analysis (빅데이터 토픽모델링과 감성분석을 활용한 물공급과정에서의 수질사고 기사 분석)

  • Hong, Sung-Jin;Yoo, Do-Guen
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.spc1
    • /
    • pp.1235-1249
    • /
    • 2022
  • This study applied web crawling to extract big data news on water quality accidents in the water supply system and presented a step-by-step algorithm for obtaining accurate water quality accident news. In a large-scale water quality accident, development patterns such as accident recognition, accident spread, accident response, and accident resolution appear as the accident unfolds. The development of water quality accidents was therefore analyzed in detail for each stage through key keywords and sentiment analysis based on case studies, and the meanings were derived. The proposed methodology was applied to the larvae accident period in Incheon Metropolitan City in 2020. The results confirm that, when disclosure of information that directly affects consumers, such as water quality accidents, is restricted, the tone of news articles and media reports about water quality accidents with long-term damage and the degree of consumer pride clearly change over time. This suggests the need to prepare consumer-centered policies to increase consumer positivity, although rapid restoration of facilities is very important in the development of water quality accidents from the supplier's point of view.
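Stage-wise sentiment analysis can be illustrated with a simple lexicon-based sketch. The study's actual sentiment method is not described in the abstract, so the word lists, the scoring rule (positive hits minus negative hits), the function names, and the sample articles are all assumptions; only the stage names come from the text.

```python
from statistics import mean

# Tiny illustrative lexicons (assumption, not the study's dictionaries).
POSITIVE = {"restored", "resolved", "safe"}
NEGATIVE = {"contaminated", "larvae", "complaint"}


def score(text):
    """Sentiment score of one article: positive hits minus negative hits."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_by_stage(articles):
    """Average the article scores within each accident stage.

    `articles` is an iterable of (stage, text) pairs, where stage is one
    of recognition, spread, response, or resolution.
    """
    grouped = {}
    for stage, text in articles:
        grouped.setdefault(stage, []).append(score(text))
    return {stage: mean(scores) for stage, scores in grouped.items()}


if __name__ == "__main__":
    # Toy articles invented for the example.
    sample = [
        ("recognition", "larvae found in contaminated tap water"),
        ("recognition", "complaint filed"),
        ("resolution", "supply restored and declared safe"),
    ]
    print(sentiment_by_stage(sample))
```

Tracking the per-stage average over the course of an accident is one way to see the shift in tone that the abstract reports.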

Movie Recommended System base on Analysis for the User Review utilizing Ontology Visualization (온톨로지 시각화를 활용한 사용자 리뷰 분석 기반 영화 추천 시스템)

  • Mun, Seong Min;Kim, Gi Nam;Choi, Gyeong cheol;Lee, Kyung Won
    • Design Convergence Study
    • /
    • v.15 no.2
    • /
    • pp.347-368
    • /
    • 2016
  • Recent research on word of mouth (WOM) implies that consumers use WOM information about products in their purchase process. This study suggests methods that use opinion mining and visualization to understand consumers' opinions of each product and each market. We develop a domain ontology based on reviews confined to the "movie" category, because people who want to watch a movie now refer to others' movie reviews, and analyze it with opinion mining and visualization. The study differs from other research in that it classifies the attributes of evaluation factors and compiles a verbal dictionary of evaluation factors when building the ontology for analysis. We aim to show through the results whether the research method is valid. The results of this study fall into three parts. First, this research explains methods of developing a domain ontology using keyword extraction and topic modeling. Second, we visualize the reviews of each movie to understand audiences' overall opinion about specific movies. Third, we find clusters of products that received similar assessments according to the evaluation results. The case study yields three main clusters containing 130 movies, formed according to audiences' opinions.