• Title/Summary/Keyword: Model Based Evaluation


A Topic Modeling-based Recommender System Considering Changes in User Preferences (고객 선호 변화를 고려한 토픽 모델링 기반 추천 시스템)

  • Kang, So Young;Kim, Jae Kyeong;Choi, Il Young;Kang, Chang Dong
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.43-56 / 2020
  • Recommender systems help users make the best choice among various options. They play an especially important role on internet sites, where vast amounts of digital information are generated every second. Many studies on recommender systems have focused on accurate recommendation. However, several problems must be overcome for a recommender system to be commercially successful. First, recommender systems lack transparency: users cannot know why products are recommended. Second, recommender systems cannot immediately reflect changes in user preferences: although a user's product preferences change over time, the system must rebuild its model to reflect them. Therefore, in this study, we propose a recommendation methodology that uses topic modeling and sequential association rule mining on review data to solve these problems. Product reviews provide useful information for recommendations because they include not only product ratings but also varied content such as user experiences and emotional states, so reviews imply user preferences for products. Topic modeling is therefore useful for explaining why items are recommended to users, and sequential association rule mining is useful for identifying changes in user preferences. The proposed methodology is divided into two phases. The first phase creates a user profile based on topic modeling: after topics are extracted from user reviews of products, a user profile over topics is created. The second phase recommends products using sequential rules that appear in users' buying behaviors over time. The buying behaviors are derived from changes in each user's topics.
A collaborative filtering-based recommendation system was developed as a benchmark, and we compared its performance with that of the proposed methodology using Amazon's review dataset. Accuracy, recall, precision, and F1 were used as evaluation metrics. For topic modeling, collapsed Gibbs sampling was conducted, and 15 topics were extracted. Among the main topics, topic 1, topic 3, topic 4, topic 7, topic 9, topic 13, and topic 14 are related to "comedy shows", "high-teen drama series", "crime investigation drama", "horror theme", "British drama", "medical drama", and "science fiction drama", respectively. In the comparative analysis, the proposed methodology outperformed the collaborative filtering-based recommendation system. The results show that the time just prior to the recommendation is very important for inferring changes in user preference. Therefore, the proposed methodology can both secure the transparency of the recommender system and reflect user preferences that change over time. However, the proposed methodology has some limitations. It cannot make fine-grained recommendations when a topic contains a large number of products. In addition, the number of sequential patterns is small because the number of topics is small. Future research needs to address these limitations.
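The four evaluation metrics named in the abstract above can be sketched as follows. This is a minimal illustration, assuming each recommendation outcome has already been counted as a true/false positive or negative; the function name `evaluate` and the example counts are hypothetical and are not taken from the paper.

```python
def _ratio(num, den):
    # Return 0.0 instead of raising when a denominator is zero.
    return num / den if den else 0.0


def evaluate(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from outcome counts.

    tp: recommended and relevant      fp: recommended but not relevant
    fn: relevant but not recommended  tn: neither recommended nor relevant
    """
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(evaluate(tp=8, fp=2, fn=4, tn=6))
```

Reporting all four together is useful because accuracy alone can look high when most items are simply never recommended.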

A Study on the Establishment of Comparison System between the Statement of Military Reports and Related Laws (군(軍) 보고서 등장 문장과 관련 법령 간 비교 시스템 구축 방안 연구)

  • Jung, Jiin;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.109-125 / 2020
  • The Ministry of National Defense is pushing ahead with the Defense Acquisition Program to build strong defense capabilities, and it spends more than 10 trillion won annually on defense improvement. As the Defense Acquisition Program is directly related to the security of the nation as well as the lives and property of the people, it must be carried out very transparently and efficiently by experts. However, the excessive diversification of laws and regulations related to the Defense Acquisition Program has made it challenging for many working-level officials to carry out the program smoothly. Many officials reportedly discover relevant regulations they were unaware of only after they have pushed ahead with their work. In addition, statutory statements related to the Defense Acquisition Program tend to cause serious issues even if only a single expression in a sentence is wrong. Despite this, efforts to establish a sentence comparison system that corrects this issue in real time have been minimal. Therefore, this paper proposes an implementation plan for a "Comparison System between the Statement of Military Reports and Related Laws". The system uses a Siamese Network-based artificial neural network, a model from the field of natural language processing (NLP), to measure the similarity between sentences that are likely to appear in Defense Acquisition Program documents and sentences from related statutory provisions, to determine and classify the risk of illegality, and to make users aware of the consequences. Various artificial neural network models (Bi-LSTM, Self-Attention, D_Bi-LSTM) were studied using 3,442 pairs of "Original Sentence" (described in actual statutes) and "Edited Sentence" (edited sentences derived from "Original Sentence").
Among the many Defense Acquisition Program related statutes, the DEFENSE ACQUISITION PROGRAM ACT, the ENFORCEMENT RULE OF THE DEFENSE ACQUISITION PROGRAM ACT, and the ENFORCEMENT DECREE OF THE DEFENSE ACQUISITION PROGRAM ACT were selected. "Original Sentence" consists of 83 provisions that actually appear in the Act, namely the main clauses that working-level officials encounter most often in their work. "Edited Sentence" comprises 30 to 50 similar sentences per clause ("Original Sentence") that are likely to appear in modified form in a military report. The edited sentences were created by modifying the original sentences according to 12 rules, and they were produced in proportion to the number of such rules, as was the case for the original sentences. In 1:1 sentence similarity performance evaluation experiments, it was possible to classify each "Edited Sentence" as legal or illegal with considerable accuracy. In addition, the "Edited Sentence" dataset used to train the neural network models contains a variety of actual statutory statements ("Original Sentence") modified according to the 12 rules. On the other hand, the models are not able to effectively classify other sentences that appear in actual military reports when only the "Original Sentence" and "Edited Sentence" datasets have been fed to them, because the dataset is not large enough for the models to recognize new incoming sentences. Hence, the performance of the models was reassessed by writing an additional 120 new sentences that more closely resemble those in actual military reports and are still associated with the original sentences. We were then able to confirm that the models' performance surpassed a certain level even when they were trained only with "Original Sentence" and "Edited Sentence" data.
If sufficient model learning is achieved by improving and expanding the full training set with sentences that appear in actual reports, the models will be able to better classify other sentences from military reports as legal or illegal. Based on the experimental results, this study confirms the possibility and value of building a "Real-Time Automated Comparison System Between Military Documents and Related Laws". The approach studied here can identify which specific clause, among the several that appear in the related laws, is most similar to a sentence that appears in Defense Acquisition Program-related military reports. This helps determine whether the contents of military report sentences are at risk of illegality when compared with those of the law clauses.

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in an increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature on unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining, or sentiment analysis, refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not otherwise have been solved by existing traditional approaches. One of the most representative attempts using opinion mining may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published in various media is a typical example of unstructured text data. Every day, a large volume of new content is created, digitized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information.
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, of sentences in a document, and of the whole document. However, most traditional approaches share a common limitation: they do not consider the flexibility of sentiment polarity. That is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we argue that sentiment polarity should be assigned not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we present an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model.
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
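The idea of a sentiment dictionary as word/value pairs, and of polarity that shifts with the domain, can be sketched in a few lines. This is a minimal illustration only: the lexicon entries, the values, the function names, and the rule that a zero score counts as positive are all assumptions and do not come from the paper.

```python
def polarity_score(tokens, lexicon):
    """Sum the sentiment values of the tokens found in the lexicon."""
    return sum(lexicon.get(token, 0.0) for token in tokens)


def classify(tokens, lexicon):
    """Label an article; treating a zero score as positive is an assumption."""
    return "positive" if polarity_score(tokens, lexicon) >= 0 else "negative"


# Hypothetical lexicons: the words and values are invented for illustration.
GENERAL_LEXICON = {"rise": 1.0, "fall": -1.0, "cut": -1.0}
# In a stock-market context the same word may carry a different value.
STOCK_LEXICON = {"rise": 1.0, "fall": -1.0, "cut": 0.5}

if __name__ == "__main__":
    article = ["rate", "cut", "announced"]
    print(classify(article, GENERAL_LEXICON))
    print(classify(article, STOCK_LEXICON))
```

The same token list receives opposite labels under the two lexicons, which is the flexibility of polarity that motivates a domain-specific dictionary.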

The Brassica rapa Tissue-specific EST Database (배추의 조직 특이적 발현유전자 데이터베이스)

  • Yu, Hee-Ju;Park, Sin-Gi;Oh, Mi-Jin;Hwang, Hyun-Ju;Kim, Nam-Shin;Chung, Hee;Sohn, Seong-Han;Park, Beom-Seok;Mun, Jeong-Hwan
    • Horticultural Science & Technology / v.29 no.6 / pp.633-640 / 2011
  • Brassica rapa is an A genome model species for Brassica crop genetics, genomics, and breeding. With the sequencing of the B. rapa genome complete, functional analysis of the genome is the next issue. Expressed sequence tags (ESTs) are fundamental resources supporting annotation and functional analysis of the genome, including identification of tissue-specific genes and promoters. As of July 2011, 147,217 ESTs from 39 cDNA libraries of B. rapa had been reported in the public database. However, little information can be retrieved from the sequences because organized databases are lacking. To leverage the sequence information and to maximize the use of publicly available EST collections, the Brassica rapa tissue-specific EST database (BrTED) was developed. BrTED includes sequence information for 23,962 unigenes assembled by the StackPack program. The unigene set is used as a query unit for various analyses such as BLAST against the TAIR gene model, functional annotation using MIPS and UniProt, gene ontology analysis, and prediction of tissue-specific unigene sets based on statistical tests. The database is composed of two main units: an EST sequence processing and information retrieval unit, and a tissue-specific expression profile analysis unit. Information and data in both units are tightly interconnected through a web-based browsing system. RT-PCR evaluation of 29 selected unigene sets successfully amplified amplicons from the target tissues of B. rapa. BrTED allows the user to identify and analyze the expression of genes of interest and aids efforts to interpret the B. rapa genome through functional genomics. In addition, it can be used as a public resource providing reference information for studying the genus Brassica and other closely related cruciferous crops.

Development and Evaluation of the PBL Teaching/Learning Process Plan of 'Housing Culture and Practical Space Use' for Home Economics in Middle School (중학교 가정과 문제 중심 '주생활 문화와 주거 공간 활용' 교수·학습 과정안 개발과 평가)

  • Cho, Jiwon;Cho, Jaesoon
    • Journal of Korean Home Economics Education Association / v.32 no.2 / pp.59-76 / 2020
  • The purpose of this study was to develop and evaluate the teaching/learning process plan of 'housing culture and practical space use' for home economics in middle school according to the problem-based learning (PBL) model. The plan, consisting of 4 lessons, was developed and implemented following the steps of the ADDIE model. Various activity materials (4 scenarios, 6 individual activity sheets, 10 reading texts, and 5 working resources) and visual materials (4 sets of presentation slides and 4 videos), as well as a questionnaire, were developed for the 4 lessons. The plan was implemented in a single class of 21 junior students at H middle school in a rural area of Kyeongnam from the 1st to the 12th of April, 2019. Students highly enjoyed and were satisfied with all 4 lessons in aspects such as understanding of the contents, adequacy of materials and activities, and usefulness in their own daily lives. Additionally, they participated more actively in the lessons than usual and were even interested in taking more such lessons. Students also reported that they largely accomplished the goal of each lesson as well as the overall objectives. They showed interest in the major parts of the PBL lessons, such as the scenarios and group activities. They also engaged in drawing a shared housing space plan with the '5D planner' web program, which they described as the best part of the lessons. The teaching/learning process plan developed in this study may be used as a theme of maker education, which is emerging these days. It can be concluded that the PBL teaching/learning process plan for 'housing culture and practical space use' would contribute to improving students' attitudes toward living with others and their ability to manage their individual lives.

Data issue and Improvement Direction for Marine Spatial Planning (해양공간계획 지원을 위한 정보 현안 및 개선 방향 연구)

  • CHANG, Min-Chol;PARK, Byung-Moon;CHOI, Yun-Soo;CHOI, Hee-Jung;KIM, Tae-Hoon;LEE, Bang-Hee
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.4 / pp.175-190 / 2018
  • Recently, the policies of advanced maritime countries have shifted from preemptive use of the ocean to post-project development. In this study, we present the pending issues and suggested improvements derived from constructing a GIS-based database of marine spatial information for the Korean Marine Spatial Planning (KMSP). More than 250 items of spatial information for the seas of Korea were processed in the order of data collection, GIS transformation, data analysis and processing, data grouping, and space mapping. This process encountered problems such as coordinate system errors, digitizing necessitated by a lack of spatial information, and overlaps in the original marine spatial information. Moreover, solutions are needed for data processing methods that exclude personal information when producing spatial data for analyzing the status of marine use, and for methods that minimize the differences between GIS-based spatial information and real-world information. Therefore, the system for collecting and securing missing marine spatial information should be enhanced for marine spatial planning, and the marine fisheries survey system needs to be linked and expanded. Marine spatial planning requires an evaluation index for marine space and a detailed marine spatial map. In addition, marine spatial planning needs standard guidelines and a quality management system. The standard guidelines should cover the phases of production, processing, analysis, and utilization, and the quality management system should improve the quality of marine spatial information. Finally, we suggest the need for in-depth study of expanding open access to marine spatial information and of deriving application models.

Applicability evaluation of GIS-based erosion models for post-fire small watershed in the wildland-urban interface (WUI 산불 소유역에 대한 GIS 기반 침식모형의 적용성 평가)

  • Shin, Seung Sook;Ahn, Seunghyo;Song, Jinuk;Chae, Guk Seok;Park, Sang Deog
    • Journal of Korea Water Resources Association / v.57 no.6 / pp.421-435 / 2024
  • In April 2023, a wildfire broke out in Gangneung, located in the east coast region, under the influence of the Yanggang local wind. In this study, GIS-based RUSLE (Revised Universal Soil Loss Equation) and SEMMA (Soil Erosion Model for Mountain Areas) were used to evaluate the erosion rate according to vegetation recovery in a small watershed of the Gangneung WUI (Wildland-Urban Interface) fire. The small watershed has a low altitude range of 10-30 m and an average slope of 10.0±7.4°, which corresponds to a gentle slope. The soil texture was loamy sand with a high organic content and a deep soil depth. As the herbaceous layer regenerated profusely in the gully after the wildfire, the NDVI (Normalized Difference Vegetation Index) reached a maximum of 0.55. Simulated erosion rates ranged from 0.07-94.9 t/ha/storm for RUSLE and from 0.24-83.6 t/ha/storm for SEMMA. RUSLE overestimated the average erosion rate by 1.19-1.48 times compared to SEMMA. The erosion rates were estimated to be high on the middle slope, where burned pine trees were widely distributed and the slope was steep, and relatively low in the hollow below the gully, where the herbaceous layer recovers rapidly. SEMMA showed a rapid increase in erosion sensitivity at vegetation covers with NDVI below 0.25 (Ic = 0.35) on post-fire hillslopes. Gentle slopes with high organic content and rapid recovery of natural vegetation had relatively low erosion rates compared to steep slopes. As subsequent damage to infrastructure and people from sediment disasters caused by heavy rain is anticipated in WUI fire areas, the research results may be used as basic data for targeted management and decision making on the implementation of emergency treatment after a wildfire.
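For readers unfamiliar with RUSLE, the model named in the abstract above estimates soil loss as the product of five factors, A = R × K × LS × C × P. The sketch below shows only that general textbook form; it is not the GIS workflow of the paper, SEMMA is not reproduced, and the function name and all factor values are hypothetical.

```python
def rusle_soil_loss(r, k, ls, c, p=1.0):
    """Soil loss A as the product of the five RUSLE factors.

    r:  rainfall-runoff erosivity factor
    k:  soil erodibility factor
    ls: combined slope length and steepness factor (dimensionless)
    c:  cover-management factor (dimensionless, lower with more cover)
    p:  support practice factor (dimensionless, 1.0 when none is applied)
    The unit of A follows from the units chosen for r and k.
    """
    for name, value in (("r", r), ("k", k), ("ls", ls), ("c", c), ("p", p)):
        if value < 0:
            raise ValueError(f"factor {name} must be non-negative")
    return r * k * ls * c * p


if __name__ == "__main__":
    # Invented factor values: the same cell with sparse and recovered cover.
    burned = rusle_soil_loss(r=100.0, k=0.3, ls=1.5, c=0.2)
    recovered = rusle_soil_loss(r=100.0, k=0.3, ls=1.5, c=0.05)
    print(burned, recovered)
```

Because the factors multiply, a lower cover factor scales the estimated loss down proportionally, which is consistent with the abstract's observation that rapidly revegetated areas showed lower erosion rates.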

A Simple Method for Evaluation of Pepper Powder Color Using Vis/NIR Hyperspectral System (Vis/NIR 초분광 분석을 이용한 고춧가루 색도 간이 측정법 개발)

  • Han, Koeun;Lee, Hoonsoo;Kang, Jin-Ho;Choi, Eunah;Oh, Se-Jeong;Lee, Yong-Jik;Cho, Byoung-Kwan;Kang, Byoung-Cheorl
    • Horticultural Science & Technology / v.33 no.3 / pp.403-408 / 2015
  • Color is one of the quality-determining factors for pepper powder. To measure the color of pepper powder, several methods including high-performance liquid chromatography (HPLC), thin layer chromatography (TLC), and ASTA-20 have been used. Among these, the ASTA-20 method is most widely used for color measurement of a large number of samples because of its simplicity and accuracy. However, it requires time-consuming preprocessing steps and generates chemical waste containing acetone. As an alternative, we developed a fast and simple method based on visible/near infrared (Vis/NIR) hyperspectral analysis to measure the color of pepper powder. To evaluate the correlation between the ASTA-20 and Vis/NIR hyperspectral methods, we first measured the color of a total of 488 pepper powder samples using the two methods. Then, a partial least squares (PLS) model was built using the color values of 366 randomly selected samples to predict the ASTA values of unknown samples. When the ASTA values predicted by the PLS model were compared with those of the ASTA-20 method for the 122 samples not used for model development, there was a very high correlation between the two methods ($R^2=0.88$), demonstrating the reliability of the Vis/NIR hyperspectral method. We believe that this simple and fast method is suitable for high-throughput screening of a large number of samples because it does not require the preprocessing steps required for the ASTA-20 method and takes less than 30 min to measure the color of pepper powder.
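The validation design described above (488 samples, 366 for model development, 122 held out) and the $R^2$ comparison can be sketched with the standard library alone. The PLS model itself is not reproduced here. The function names and the seed are hypothetical, and computing $R^2$ as the coefficient of determination is an assumption, since the abstract does not state which definition was used.

```python
import random


def split_indices(n_samples, n_test, seed=0):
    """Randomly split sample indices into calibration and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    calibration, validation = split_indices(488, 122)
    print(len(calibration), len(validation))
    # Toy numbers, unrelated to the paper's measurements.
    print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Keeping the 122 validation samples out of model fitting is what lets the reported agreement speak to samples the model has not seen.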

Quantitative Microbial Risk Assessment Model for Staphylococcus aureus in Kimbab (김밥에서의 Staphylococcus aureus에 대한 정량적 미생물위해평가 모델 개발)

  • Bahk, Gyung-Jin;Oh, Deog-Hwan;Ha, Sang-Do;Park, Ki-Hwan;Joung, Myung-Sub;Chun, Suk-Jo;Park, Jong-Seok;Woo, Gun-Jo;Hong, Chong-Hae
    • Korean Journal of Food Science and Technology / v.37 no.3 / pp.484-491 / 2005
  • Quantitative microbial risk assessment (QMRA) analyzes the potential hazard of microorganisms to public health and offers a structured approach to assessing risks associated with microorganisms in foods. This paper addresses specific risk management questions associated with Staphylococcus aureus in kimbab, as well as the improvement and dissemination of QMRA methodology. A QMRA model was developed by constructing four nodes along the retail-to-table pathway. A predictive microbial growth model and survey data were combined with probabilistic modeling to simulate levels of S. aureus in kimbab at the time of consumption. Due to the lack of dose-response models, the final level of S. aureus in kimbab was used as a proxy for the potential hazard level, based on which the possibility of contamination over this level and the consumption level of S. aureus through kimbab were estimated as 30.7% and 3.67 log cfu/g, respectively. Regression sensitivity results showed that time-temperature during storage at the point of sale was the most significant factor. These results suggest that temperature control under $10^{\circ}C$ is a critical control point for preventing growth of S. aureus in kimbab production, and show that QMRA is useful for evaluating factors influencing potential risk and can be applied directly to risk management.

Evaluation of Runoff‧Peak Rate Runoff and Sediment Yield under Various Rainfall Intensities and Patterns Using WEPP Watershed Model (다양한 강우강도 및 패턴에 따른 WEPP 모형의 유출‧첨두유출‧토양유실량 평가)

  • Choi, Jae-Wan;Ryu, Ji-Chul;Kim, Ik-Jae;Lim, Kyoung-Jae
    • Journal of Korea Water Resources Association / v.45 no.8 / pp.795-804 / 2012
  • Recently, changes in rainfall intensity and patterns have been causing increasing soil loss worldwide. As a result, water ecosystems deteriorate and crop yields fall because of soil loss and the accompanying nutrient loss. Many studies have been conducted to estimate runoff and soil loss in order to predict or decrease non-point source pollution. Although the USLE has been used for many years to estimate soil loss, it cannot reflect the effects of changes in rainfall intensity and patterns on soil loss. The WEPP, a physically based model, is capable of predicting soil loss and runoff under various rainfall intensities. In this study, the WEPP model was used to simulate sediment yield, runoff, and peak runoff using rainfall data at 5, 10, 30, and 60 minute intervals, Huff's method, and design rainfall. When the rainfall interval changed from 5 minutes to 60 minutes, the sediment and runoff values decreased by 24% and 19%, respectively, and the peak rate runoff values decreased by 16%, indicating that the peak rate runoff values are affected by rainfall intensity to some degree. In the simulation using Huff's method, all values (sediment yield, runoff, peak runoff) were found to be greatest at the third quartile. According to the analysis under various design rainfall conditions (2, 3, 5, 10, 20, 30, 50, 100, 200, and 300 year frequency), sediment yield, runoff, and peak runoff of 906.2%, 249.4%, and 183.9%, respectively, were estimated using 2-year to 300-year frequency rainfall data.