• Title/Summary/Keyword: Time-Series Data Mining


A Study on Trends of Key Issues in Port Safety at Busan Port (부산항 항만안전 주요 이슈 동향에 관한 연구)

  • Jeong-Min Lee;Do-Yeon Ha;Joo-Hye Kim
    • Journal of Navigation and Port Research / v.48 no.1 / pp.34-48 / 2024
  • As global supply chain risks proliferate unpredictably, the high interdependence of the port and logistics industries intensifies the risk burden. This study conducted fundamental research to explore diverse safety issues in domestic ports. Using news article data about Busan Port, we employed LDA topic modeling and time-series linear regression to identify key safety trends. Over the past 30 years, Busan Port faced nine major safety issues, with maritime safety, import cargo inspection, labor strikes, and natural disasters emerging cyclically. The major port safety issues at Busan Port are primarily unpredictable in nature and fall under the socio-environmental and natural phenomena types, indicating a significant impact of global uncertainty. Therefore, systematic policies based on the identified issues need to be formulated to enhance port safety at Busan Port. Additionally, the resilience of port safety in unpredictable risk situations needs to be strengthened. In conclusion, advanced research is necessary to promote port safety in response to dynamically changing social conditions.
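The time-series linear regression step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each topic has already been reduced to a yearly share of articles (for example, from LDA output), and the function name `linear_trend` and the sample values are hypothetical.

```python
from statistics import mean

def linear_trend(years, shares):
    """Ordinary least-squares fit of shares = intercept + slope * year."""
    y_bar, s_bar = mean(years), mean(shares)
    sxx = sum((y - y_bar) ** 2 for y in years)
    sxy = sum((y - y_bar) * (s - s_bar) for y, s in zip(years, shares))
    slope = sxy / sxx
    intercept = s_bar - slope * y_bar
    return slope, intercept

# Hypothetical yearly share of one topic (illustrative values, not from the study).
years = [2019, 2020, 2021, 2022, 2023]
shares = [0.10, 0.12, 0.14, 0.16, 0.18]

slope, intercept = linear_trend(years, shares)
# A positive slope suggests a rising topic, a negative slope a declining one.
label = "rising" if slope > 0 else "falling" if slope < 0 else "flat"
print(label, round(slope, 4))
```

A real analysis would also test whether the slope is statistically significant before labeling a topic as rising or falling.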

A Single Index Approach for Time-Series Subsequence Matching that Supports Moving Average Transform of Arbitrary Order (단일 색인을 사용한 임의 계수의 이동평균 변환 지원 시계열 서브시퀀스 매칭)

  • Moon Yang-Sae;Kim Jinho
    • Journal of KIISE:Databases / v.33 no.1 / pp.42-55 / 2006
  • We propose a single-index approach for subsequence matching that supports moving average transform of arbitrary order in time-series databases. Using the single-index approach, we can reduce both storage space overhead and index maintenance overhead. Moving average transform is known to reduce the effect of noise and has been used in many areas such as econometrics since it is useful in finding overall trends. However, previous methods incur index overhead in both storage space and update maintenance, since they build several indexes to support arbitrary orders. In this paper, we first propose the concept of poly-order moving average transform, which uses a set of order values rather than one order value, by extending the original definition of moving average transform. That is, the poly-order transform makes a set of transformed windows from each original window since it transforms each window not for just one order value but for a set of order values. We then present theorems to formally prove the correctness of the poly-order transform based subsequence matching methods. Moreover, we propose two different subsequence matching methods supporting moving average transform of arbitrary order by applying the poly-order transform to the previous subsequence matching methods. Experimental results show that, in all cases, the proposed methods improve performance significantly over the sequential scan. For real stock data, the proposed methods improve average performance by 22.4 to 33.8 times over the sequential scan. Compared with building a separate index for every moving average order, the proposed methods significantly reduce the storage space required for indexes at the cost of only a small performance degradation (with 7 orders, the methods reduce the space to as little as 1/7.0 while the performance degradation is only 9% to 42% on average). In addition to their superiority in performance, index space, and index maintenance, the proposed methods have the advantage of generalizing to many other sorts of transforms beyond moving average transform. Therefore, we believe that our work can be widely and practically used in many sorts of transform-based subsequence matching methods.
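The idea of transforming one window for a set of orders rather than a single order can be illustrated with a short sketch. It uses the plain definition of a k-order moving average and does not reproduce the paper's index construction or its theorems. The names `moving_average` and `poly_order_transform` are hypothetical.

```python
def moving_average(seq, k):
    """k-order moving average: the mean of each run of k consecutive values."""
    if not 1 <= k <= len(seq):
        raise ValueError("order k must be between 1 and len(seq)")
    window_sum = sum(seq[:k])
    out = [window_sum / k]
    for i in range(k, len(seq)):
        # Slide the run by one position: add the new value, drop the oldest.
        window_sum += seq[i] - seq[i - k]
        out.append(window_sum / k)
    return out

def poly_order_transform(window, orders):
    """Map each order in `orders` to the transformed window for that order."""
    return {k: moving_average(window, k) for k in orders}

# One original window yields a set of transformed windows, one per order.
print(poly_order_transform([2, 4, 6, 8], [1, 2, 4]))
```

Because every order is derived from the same original window, a single structure built over the set of transformed windows can serve queries for any of those orders, which is the storage saving the abstract describes.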

A Demand Forecasting for Aircraft Spare Parts using ARIMA (ARIMA를 이용한 항공기 수리부속의 수요 예측)

  • Park, Young-Jin;Jeon, Geon-Wook
    • Journal of the military operations research society of Korea / v.34 no.2 / pp.79-101 / 2008
  • This study aims to improve the method for forecasting repair part demand for Republic of Korea Air Force aircraft. The demand forecasting methods currently in use are weighted moving average, linear moving average, trend analysis, simple exponential smoothing, and linear exponential smoothing. However, these methods use fixed weights and a fixed moving average range. In addition, NORS (Not Operationally Ready Supply) is increasing. The recommended Box-Jenkins ARIMA method can solve the problems of these methods and improve estimation accuracy. The current forecasting methods and ARIMA were compared using mean squared error (MSE), which responds sensitively to changes in error. ARIMA was more accurate than the existing forecasting methods. Applying the method of this study to other items could improve demand forecasting capability.
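The MSE comparison described in this abstract can be sketched for two of the baseline methods. This is an illustration under stated assumptions: the demand series, weights, and smoothing constant are hypothetical, and ARIMA itself is not fitted here (in practice a statistics library would supply that model and be scored with the same `mse` function).

```python
def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def weighted_moving_average_forecasts(demand, weights):
    """One-step-ahead forecasts; weights[-1] applies to the most recent value."""
    n = len(weights)
    total = sum(weights)
    return [
        sum(w * d for w, d in zip(weights, demand[t - n:t])) / total
        for t in range(n, len(demand))
    ]

def exponential_smoothing_forecasts(demand, alpha, start):
    """One-step-ahead forecasts for demand[start:], using a fixed alpha."""
    level = demand[0]
    forecasts = []
    for t in range(1, len(demand)):
        if t >= start:
            forecasts.append(level)  # forecast made before seeing demand[t]
        level = alpha * demand[t] + (1 - alpha) * level
    return forecasts

# Hypothetical quarterly demand for one part (not data from the study).
demand = [4, 6, 5, 7, 6, 8, 7, 9]
weights = [1, 2, 3]
actual = demand[len(weights):]

wma = weighted_moving_average_forecasts(demand, weights)
ses = exponential_smoothing_forecasts(demand, alpha=0.3, start=len(weights))
print("WMA MSE:", mse(actual, wma))
print("SES MSE:", mse(actual, ses))
```

Both baselines keep their weights fixed over the whole series, which is the limitation the abstract attributes to the existing methods.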

An Analysis of the International Trends of Research on Artificial Intelligence in Education Using Topic Modeling (인공지능 활용 교육의 토픽모델링 분석을 통한 수학교육 연구 방향의 함의)

  • Noh, Jihwa;Ko, Ho Kyoung;Kim, Byeongsoo;Huh, Nan
    • Journal of the Korean School Mathematics Society / v.26 no.1 / pp.1-19 / 2023
  • This study analyzed international trends in research on artificial intelligence in education by applying topic modeling to 352 papers recently published in the International Journal of Artificial Intelligence in Education (IJAIED). The IJAIED is the official, SCOPUS-indexed journal of the International AIED Society. The analysis revealed that international AIED research trends could be categorized into eight topics. Topics such as analyzing student behavior models in learning systems and designing feedback on student solutions increased over time, whereas research focusing on data handling methods decreased over time. Based on the findings, implications and suggestions for the research and development of AIED applications were provided.

A School-tailored High School Integrated Science Q&A Chatbot with Sentence-BERT: Development and One-Year Usage Analysis (인공지능 문장 분류 모델 Sentence-BERT 기반 학교 맞춤형 고등학교 통합과학 질문-답변 챗봇 -개발 및 1년간 사용 분석-)

  • Gyeongmo Min;Junehee Yoo
    • Journal of The Korean Association For Science Education / v.44 no.3 / pp.231-248 / 2024
  • This study developed a chatbot for first-year high school students, employing open-source software and the Korean Sentence-BERT model for AI-powered document classification. The chatbot utilizes the Sentence-BERT model to find the six most similar Q&A pairs to a student's query and presents them in a carousel format. The initial dataset, built from online resources, was refined and expanded based on student feedback and usability throughout the operational period. By the end of the 2023 academic year, the chatbot integrated a total of 30,819 datasets and recorded 3,457 student interactions. Analysis revealed students' inclination to use the chatbot when prompted by teachers during classes and primarily during self-study sessions after school, with an average of 2.1 to 2.2 inquiries per session, mostly via mobile phones. Text mining identified student input terms encompassing not only science-related queries but also aspects of school life such as assessment scope. Topic modeling using BERTopic, based on Sentence-BERT, categorized 88% of student questions into 35 topics, shedding light on common student interests. A year-end survey confirmed the efficacy of the carousel format and the chatbot's role in addressing curiosities beyond integrated science learning objectives. This study underscores the importance of developing chatbots tailored for student use in public education and highlights their educational potential through long-term usage analysis.
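The retrieval step, finding the six stored Q&A pairs most similar to a query, can be sketched with cosine similarity. This is a sketch rather than the chatbot's actual code: it assumes the sentence embeddings have already been computed by a Sentence-BERT model, and the toy vectors, ids, and function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_similar(query_vec, qa_vectors, k=6):
    """Return the ids of the k stored Q&A pairs most similar to the query."""
    ranked = sorted(
        qa_vectors,
        key=lambda qa_id: cosine(query_vec, qa_vectors[qa_id]),
        reverse=True,
    )
    return ranked[:k]

# Toy 2-dimensional embeddings; real sentence embeddings have hundreds of dimensions.
qa_vectors = {"qa1": [1.0, 0.0], "qa2": [0.0, 1.0], "qa3": [1.0, 1.0]}
print(top_k_similar([1.0, 0.1], qa_vectors, k=2))
```

With tens of thousands of stored pairs, a full sort per query is still workable, though a vector index would usually be preferred at larger scale.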

A Spatial Data Mining and Geographical Customer Relationship Management System (공간 데이터마이닝을 이용한 고객 관리시스템)

  • Lee, Sang-Moon;Seo, Jeong-Min
    • Journal of the Korea Society of Computer and Information / v.15 no.6 / pp.121-128 / 2010
  • Spatial data mining has been developed to discover association knowledge among spatial features and their non-spatial attributes in an application area. Many researchers are now applying data mining techniques to analysis areas such as civil engineering, environmental, and agricultural fields. Despite these efforts, no practical gCRMDMs has existed so far. A gCRMDMs merges a very large spatial database with a CRM information system. It also discovers association rules for predicting customers' shopping patterns in a huge database consisting of spatial and non-spatial datasets. To this end, a gCRMDMs needs spatial data mining techniques. However, in most cases no usable model for a gCRMDMs exists. Therefore, in this paper, we propose a practical gCRMDMs model that supports customer, store, street, building, and geographical information suited to the trade area.
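Association rules over mixed spatial and non-spatial attributes are usually scored by support and confidence. The sketch below shows those two measures only; it is not the gCRMDMs model, and the attribute names (such as `near_station`) and records are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

# Each record mixes a spatial predicate with a purchase attribute (made-up data).
transactions = [
    {"near_station", "buys_coffee"},
    {"near_station", "buys_coffee"},
    {"near_station"},
    {"buys_coffee"},
]

print(support(transactions, {"near_station", "buys_coffee"}))
print(confidence(transactions, {"near_station"}, {"buys_coffee"}))
```

A rule such as "near_station implies buys_coffee" would be kept only if both measures exceed thresholds chosen by the analyst.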

Analysis of Agenda-setting Changes in Alpine Agricultural of Uljin-gun Using Text-Mining - Focusing on the Keywords of Mass-media, Blog·Cafe - (텍스트마이닝 기법을 활용한 울진군 금강송 산지농업 의제설정 변화 - 매스미디어와 블로그·카페 키워드를 중심으로 -)

  • Do, Jee-Yoon;Jeong, Myeong-Cheol
    • Journal of the Korean Institute of Rural Architecture / v.24 no.3 / pp.47-57 / 2022
  • This study sought to understand the status and perception of Uljin Geumgangsong by examining mass media issues and user perception using big data, and to provide basic data for building perception-based monitoring by examining how agenda setting was established from a time-series perspective. The results of collecting and analyzing text data that reveal mass media and visitor awareness are as follows. First, both mass media and visitor keywords were related to the importance of the value and meaning of Uljin Geumgangsong. Second, in the connection network, Geumgang Pine Agriculture was central, but the perceptions of mass media and visitors differed because their objects of interest differed. Third, in the connection relationship structure, the connection strength was strong because much of the mass media content overlapped. Fourth, the centrality analysis showed that both mass media and visitor keywords positively recognized the area as a space created and maintained through institutional support, and objective perception could be identified by finding hidden keywords. Fifth, the time series analysis made it possible to follow the flow of issue keywords that appeared in each period, and, unlike in the past, the area was recognized as a place for tourism and travel. Finally, the examination of whether agenda setting is consistent showed a mass media influence, so more diverse information and publicity that make use of it appear to be needed.

Efficient Rotation-Invariant Boundary Image Matching Using the Envelope-based Lower Bound (엔빌로프 기반 하한을 사용한 효율적인 회전-불변 윤곽선 이미지 매칭)

  • Kim, Sang-Pil;Moon, Yang-Sae;Hong, Sun-Kyong
    • The KIPS Transactions:PartD / v.18D no.1 / pp.9-22 / 2011
  • In this paper we present an efficient solution to rotation-invariant boundary image matching. Computing the rotation-invariant distance between image time-series is a time-consuming process since it requires a lot of Euclidean distance computations for all possible rotations. In this paper we propose a novel solution that significantly reduces the number of distance computations using the envelope-based lower bound. To this end, we first present how to construct a single envelope from a query sequence and how to obtain a lower bound of the rotation-invariant distance using the envelope. We then show that the single envelope-based lower bound can reduce a number of distance computations. This approach, however, may cause bad performance since it may incur a larger lower bound by considering all possible rotated sequences in a single envelope. To solve this problem, we present a concept of rotation interval, and using the rotation interval we generalize the envelope-based lower bound by exploiting multiple envelopes rather than a single envelope. We also propose equi-width and envelope minimization divisions as the method of determining rotation intervals in the multiple envelope approach. Experimental results show that our envelope-based solutions outperform existing solutions by one or two orders of magnitude.
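The envelope idea can be sketched as follows, assuming the envelope is the pointwise minimum and maximum over a range of rotations of the query and that rotation intervals are split with equal width. This is a generic illustration of an envelope lower bound, not the paper's exact formulation; all function names are hypothetical and the envelope minimization division is not shown.

```python
import math

def rotate(seq, j):
    """Circularly shift seq left by j positions."""
    return seq[j:] + seq[:j]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rotation_invariant_distance(query, data):
    """Exact distance: minimum Euclidean distance over all rotations of query."""
    return min(euclidean(rotate(query, j), data) for j in range(len(query)))

def envelope(query, start, end):
    """Pointwise min and max over rotations start..end-1 of query."""
    rotations = [rotate(query, j) for j in range(start, end)]
    lower = [min(col) for col in zip(*rotations)]
    upper = [max(col) for col in zip(*rotations)]
    return lower, upper

def envelope_lower_bound(lower, upper, data):
    """Distance from data to the envelope; never exceeds the distance
    from data to any rotation enclosed by the envelope."""
    total = 0.0
    for lo, hi, x in zip(lower, upper, data):
        if x > hi:
            total += (x - hi) ** 2
        elif x < lo:
            total += (lo - x) ** 2
    return math.sqrt(total)

def multi_envelope_lower_bound(query, data, num_intervals):
    """Equi-width rotation intervals: the smallest per-interval bound
    is a lower bound of the rotation-invariant distance."""
    n = len(query)
    width = math.ceil(n / num_intervals)
    bounds = []
    for start in range(0, n, width):
        lower, upper = envelope(query, start, min(start + width, n))
        bounds.append(envelope_lower_bound(lower, upper, data))
    return min(bounds)

query = [1, 5, 2, 8, 3, 7]
data = [9, 1, 6, 2, 7, 3]
print(multi_envelope_lower_bound(query, data, 1))
print(multi_envelope_lower_bound(query, data, 3))
print(rotation_invariant_distance(query, data))
```

In a search, a candidate whose lower bound already exceeds the distance threshold can be discarded without computing the exact rotation-invariant distance. Narrower rotation intervals give tighter envelopes at the cost of more envelopes to evaluate.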

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led to the application of text-mining techniques to analyze big social media data in a more rigorous manner. Although social media text analysis algorithms have improved, previous approaches to social media text analysis have some limitations. In the field of sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common approach. Some studies have been conducted by adding grammatical factors to feature sets for training classification models. The other approach applies semantic analysis to sentiment analysis, but this approach is mainly applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result from adopting the Word2Vec algorithm is compared to the result from co-occurrence analysis to identify the difference between the two approaches. The results show that the related words extracted by the Word2Vec algorithm that express some emotion about the keyword are three times as numerous as those extracted by co-occurrence analysis. The difference between the two results comes from Word2Vec's vectorization of semantic features. Therefore, it is possible to say that the Word2Vec algorithm is able to catch hidden related words that have not been found in traditional analysis. In addition, Part Of Speech (POS) tagging for Korean is used to detect adjectives as "emotional words" in Korean. The emotion words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence.
The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets used in the study were collected by searching with three keywords: professor, prosecutor, and doctor, since these keywords carry rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25720, Doctor: 35110, Prosecutor: 43225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text data analysis. As a visualization method, Gephi (http://gephi.github.io) was used, and all programs used in text processing and analysis were written in Java. The contributions of this study are as follows: First, different approaches for sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Trigger can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that the words extracted by Emotion Trigger processing have a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger include Twitter data, which have a number of distinct features that we did not deal with in this study. These features will be considered in further study.
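The co-occurrence baseline that the study compares against Word2Vec can be sketched as below. It assumes sentences have already been POS-tagged. The tag names `ADJ` and `NOUN`, the sample sentences, and the function name are hypothetical, and the Word2Vec side of the comparison is not shown.

```python
from collections import Counter

def cooccurring_nouns(tagged_sentences, emotion_word):
    """Count nouns that share a sentence with the given emotion adjective."""
    counts = Counter()
    for sentence in tagged_sentences:
        adjectives = {tok for tok, pos in sentence if pos == "ADJ"}
        if emotion_word in adjectives:
            counts.update(tok for tok, pos in sentence if pos == "NOUN")
    return counts

# Each sentence is a list of (token, POS tag) pairs; the content is made up.
tagged_sentences = [
    [("angry", "ADJ"), ("professor", "NOUN"), ("scandal", "NOUN")],
    [("angry", "ADJ"), ("professor", "NOUN")],
    [("happy", "ADJ"), ("doctor", "NOUN")],
]

print(cooccurring_nouns(tagged_sentences, "angry").most_common(2))
```

Co-occurrence only finds nouns that literally appear beside the emotion word, whereas an embedding-based search can also surface nouns that appear in similar contexts, which is the gap the abstract attributes to Word2Vec.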

The Prediction of Currency Crises through Artificial Neural Network (인공신경망을 이용한 경제 위기 예측)

  • Lee, Hyoung Yong;Park, Jung Min
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.19-43 / 2016
  • This study examines the causes of the Asian exchange rate crisis and compares it to the European Monetary System crisis. In 1997, emerging countries in Asia experienced financial crises. Previously, in 1992, currencies in the European Monetary System had undergone the same experience. This was followed by Mexico in 1994. The objective of this paper is to generate useful insights from these crises. This research presents a comparison of South Korea, the United Kingdom, and Mexico, and then compares three different models for prediction. Previous studies of economic crises focused largely on the manual construction of causal models using linear techniques. However, the weakness of such models stems from the prevalence of nonlinear factors in reality. This paper uses a structural equation model to analyze the causes, followed by a neural network model to circumvent the linear model's weaknesses. The models are examined in the context of predicting exchange rates. The data are quarterly, and the Consumer Price Index, Gross Domestic Product, Interest Rate, Stock Index, Current Account, and Foreign Reserves serve as independent variables for the prediction. However, the time periods of each country's data differ. Lisrel is an emerging method and as such requires a fresh approach to financial crisis prediction model design, along with the flexibility to accommodate unexpected change. This paper indicates that the neural network model has the greater prediction performance in Korea, Mexico, and the United Kingdom. However, in Korea, the multiple regression shows the better performance. In Mexico, the multiple regression performs almost the same as Lisrel. Although Lisrel does not show significant performance, a refined model is expected to show better results. Future work should add psychological factors and other unobserved areas to the structural model.
The reason for the low hit ratio is that the alternative model in this paper uses only financial market data, so other important factors cannot be considered. Korea's hit ratio is lower than that of the United Kingdom, so there must be another construct that affects the financial market. The same holds for Mexico. However, the United Kingdom's financial market is more influenced and explained by financial factors than those of Korea and Mexico.