• Title/Summary/Keyword: 시간 마이닝 (time mining)

Search Result 400, Processing Time 0.025 seconds

An Efficient Clustering Algorithm based on Heuristic Evolution (휴리스틱 진화에 기반한 효율적 클러스터링 알고리즘)

  • Ryu, Joung-Woo;Kang, Myung-Ku;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.80-90 / 2002
  • Clustering is a useful technique for grouping data points so that points within a single group (cluster) have similar characteristics. Many clustering algorithms have been developed and used in engineering applications such as pattern recognition and image processing. Recently, clustering has drawn increasing attention as an important technique in data mining. However, clustering algorithms such as K-means and Fuzzy C-means suffer from two difficulties: the number of clusters must be determined a priori, and the clustering result depends on the initial set of clusters, so a poor initialization can yield undesirable results. In this paper, we propose a new clustering algorithm that solves these problems. Our method uses an evolutionary algorithm to address the local optima problem, in which clustering converges to an undesirable state when it starts from an inappropriate set of clusters. We also adopt a new measure of how well the data are clustered, defined in terms of both intra-cluster dispersion and inter-cluster separability. Using this measure, the number of clusters is determined automatically as a result of the optimization process. In addition, we combine heuristics, that is, problem-specific knowledge, with the evolutionary algorithm to speed up its search. We tested our algorithm on several sets of multi-dimensional data, and the results show that our algorithm outperforms existing algorithms.
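The abstract above does not give the formula for its clustering measure, only that it combines intra-cluster dispersion and inter-cluster separability. The following is a minimal sketch of one possible measure of that kind, not the authors' definition: the function name `cluster_quality` and the ratio form (mean centroid separation divided by mean distance to the own centroid) are assumptions made for illustration.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def cluster_quality(points, labels):
    """Hypothetical validity score: higher means tighter, better-separated clusters.

    intra = mean distance from each point to its own cluster centroid
    inter = mean distance between all pairs of cluster centroids
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centers = {label: centroid(members) for label, members in clusters.items()}

    intra = sum(dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)

    keys = list(centers)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    if not pairs:  # a single cluster has no separation to measure
        return 0.0
    inter = sum(dist(centers[a], centers[b]) for a, b in pairs) / len(pairs)

    return inter / (intra + 1e-12)  # small constant avoids division by zero


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = cluster_quality(points, [0, 0, 1, 1])  # groups nearby points together
bad = cluster_quality(points, [0, 1, 0, 1])   # mixes distant points
```

A score of this kind could serve as the fitness function of an evolutionary search over candidate partitions with different numbers of clusters, which is how the number of clusters would fall out of the optimization rather than being fixed in advance.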

Committee Learning Classifier based on Attribute Value Frequency (속성 값 빈도 기반의 전문가 다수결 분류기)

  • Lee, Chang-Hwan;Jung, In-Chul;Kwon, Young-S.
    • Journal of KIISE:Databases / v.37 no.4 / pp.177-184 / 2010
  • These days, many kinds of data, including sensor, delivery, credit, and stock data, are generated continuously in massive quantities. It is difficult to learn from these data because they are large in volume and their underlying concepts change quickly. To handle these problems, learning methods based on sliding windows over time have been used. However, these approaches must rebuild the model every time new data arrive, which requires considerable time and cost. Therefore, very simple incremental learning methods are needed. The Bayesian method is one example, but it has the disadvantage that it requires prior knowledge (probability) of the data. In this study, we propose a learning method based on attribute values. The proposed method can be applied even when the prior knowledge (probability) of the data is unknown. The main concept of this method is that each attribute value is regarded as an expert learner, and summing up the expert learners leads to better results. Experimental results show that our method learns from data very quickly and performs well compared with current learning methods (decision tree and Bayesian).
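The idea of treating each attribute value as an expert and summing the experts' opinions can be sketched as below. This is an illustrative reading of the abstract, not the authors' algorithm: the class name `AttributeValueVoter`, the normalized-frequency vote, and the toy weather data are all assumptions.

```python
from collections import Counter, defaultdict


class AttributeValueVoter:
    """Hypothetical incremental classifier: one 'expert' per (attribute, value) pair."""

    def __init__(self):
        # (attribute index, value) -> how often each class was seen with that value
        self.counts = defaultdict(Counter)

    def learn(self, row, label):
        """Update the experts with one example; no model rebuild is needed."""
        for index, value in enumerate(row):
            self.counts[(index, value)][label] += 1

    def predict(self, row):
        """Sum the class-frequency votes of every expert that matches the row."""
        votes = Counter()
        for index, value in enumerate(row):
            seen = self.counts.get((index, value))
            if not seen:  # this attribute value was never observed
                continue
            total = sum(seen.values())
            for label, n in seen.items():
                votes[label] += n / total
        return votes.most_common(1)[0][0] if votes else None


model = AttributeValueVoter()
for row, label in [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
]:
    model.learn(row, label)
```

Because `learn` only increments counters, each new example is absorbed in time proportional to the number of attributes, which matches the incremental-learning motivation stated in the abstract.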

Extracting Typical Group Preferences through User-Item Optimization and User Profiles in Collaborative Filtering System (사용자-상품 행렬의 최적화와 협력적 사용자 프로파일을 이용한 그룹의 대표 선호도 추출)

  • Ko Su-Jeong
    • Journal of KIISE:Software and Applications / v.32 no.7 / pp.581-591 / 2005
  • Collaborative filtering systems suffer from sparsity and from basing recommendations on correlations between only two users' preferences. These systems recommend items based only on preferences, without taking into account the contents of the items. As a result, the accuracy of recommendations depends on the data from user-rated items. When users rate items, it cannot be expected that all users do so earnestly, and this lowers the accuracy of recommendations. This paper proposes a collaborative recommendation method that extracts typical group preferences using user-item matrix optimization and user profiles in collaborative filtering systems. The method excludes unproven users by using entropy based on data from user-rated items, groups users into clusters after generating user profiles, and then extracts typical group preferences. The proposed method generates collaborative user profiles by using association word mining to reflect the contents as well as the preferences of items, and groups users into clusters based on the profiles by using the vector space model and the K-means algorithm. To compensate for the shortcoming of providing recommendations using correlations between only two users' preferences, the proposed method extracts typical preferences of groups using entropy theory. The typical preferences are extracted by combining user entropies with item preferences. A recommender system using typical group preferences solves the problem caused by recommendations based on preferences rated incorrectly by users and reduces the time needed to retrieve the most similar users in groups.
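The abstract does not specify how user entropy is computed. As a rough sketch, assuming it is the Shannon entropy of the distribution of a user's rating values, a user who gives every item the same rating has zero entropy and could be flagged as uninformative. The function name `rating_entropy` and that interpretation are assumptions, not the paper's definition.

```python
from collections import Counter
from math import log2


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the values in one user's list of ratings."""
    counts = Counter(ratings)
    n = len(ratings)
    return -sum((c / n) * log2(c / n) for c in counts.values())


uniform_rater = rating_entropy([5, 5, 5, 5])  # always the same rating
varied_rater = rating_entropy([1, 2, 3, 4])   # four equally frequent ratings
```

Under this assumption, a threshold on the entropy would decide which users are kept in the user-item matrix before clustering.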

Research Trends and Knowledge Structure of Digital Transformation in Fashion (패션 영역에서 디지털 전환 관련 연구동향 및 지식구조)

  • Choi, Yeong-Hyeon;Jeong, Jinha;Lee, Kyu-Hye
    • Journal of Digital Convergence / v.19 no.3 / pp.319-329 / 2021
  • This study aims to investigate Korean fashion-related research trends and knowledge structures on digital transformation through information-based approaches. Accordingly, we first identified the current status of the relevant research in Korean academic literature by year and journal; subsequently, we derived key research topics through network analysis, and then analyzed major research trends and knowledge structures over time. We collected 159 studies published on Korean academic platforms from 2010 to 2020, cleansed the data with Python 3.7, and measured centrality and built the network with NodeXL 1.0.1. The results are as follows. First, related research has been actively conducted since 2016, mainly concentrated in the clothing and art areas. Second, online platforms and AR/VR appeared as the most frequently mentioned topics, and consumer psychological analysis, marketing strategy suggestion, and case analysis were used as the main research methods. Through clustering, major research contents for each sub-major of clothing were derived. Third, the major subjects by period were examined; over time, they have changed from consumer-centered research to strategy suggestion and to design development research on platforms or services. This study contributes to enhancing insight into digital transformation in the fashion field and can serve as a basis for designing research on related topics.
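The study above measured centrality in NodeXL; the abstract does not say which centrality. As an illustration only, the sketch below computes normalized degree centrality on a keyword co-occurrence network in plain Python. The function name `degree_centrality` and the sample keyword sets are hypothetical and are not data from the study.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(keyword_sets):
    """Normalized degree centrality of keywords that co-occur within a paper.

    Two keywords are linked if they appear in the same paper; the centrality of
    a keyword is its number of distinct neighbors divided by (nodes - 1).
    """
    neighbors = defaultdict(set)
    for keywords in keyword_sets:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {k: len(v) / (n - 1) for k, v in neighbors.items()}


# Hypothetical keyword sets, one per paper (made up for this example).
papers = [
    {"platform", "AR/VR"},
    {"platform", "marketing"},
    {"platform", "consumer"},
]
centrality = degree_centrality(papers)
```

In this toy network "platform" is linked to every other keyword, so it receives the maximum centrality of 1.0, which is the kind of signal used to pick out key research topics.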

Exploring Potential Application Industry for Fintech Technology by Expanding its Terminology: Network Analysis and Topic Modelling Approach (용어 확장을 통한 핀테크 기술 적용가능 산업의 탐색 :네트워크 분석 및 토픽 모델링 접근)

  • Park, Mingyu;Jeon, Byeongmin;Kim, Jongwoo;Geum, Youngjung
    • The Journal of Society for e-Business Studies / v.26 no.1 / pp.1-28 / 2021
  • FinTech has been discussed as an important business area for technology-driven financial innovation. The term fintech is a combination of finance and technology and refers to ICT technology currently associated with all areas of finance. The popularity of the fintech industry has increased significantly over time, with full investment and support for numerous startups. Therefore, both academia and practice have tried to analyze trends in the fintech area. However, previous research has limitations in collecting relevant databases for fintech and identifying proper application areas. In response, this study proposes a new method for analyzing trends in fintech fields by expanding fintech terminology and using network analysis and topic modeling. A new fintech terminology list was created, and a total of 18,341 patents covering a 10-year period were collected from the USPTO. Co-classification analysis and network analysis were conducted to identify the technological trends of patent classifications. In addition, topic modeling was conducted to analyze the contents of fintech and identify its trends. This study is expected to help managers and investors who want to be involved in technology-driven financial services seize new fintech technology opportunities.

A study on the User Experience at Unmanned Checkout Counter Using Big Data Analysis (빅데이터 분석을 통한 무인계산대 사용자 경험에 관한 연구)

  • Kim, Ae-sook;Jung, Sun-mi;Ryu, Gi-hwan;Kim, Hee-young
    • The Journal of the Convergence on Culture Technology / v.8 no.2 / pp.343-348 / 2022
  • This study aims to analyze the user experience of unmanned checkout counters as perceived by consumers, using SNS big data. For this study, blogs, news, cafes, Q&A posts (Knowledge iN and Tip), and web documents on Naver and Daum were analyzed, and 'unmanned checkpoints' was used as the keyword for the data search. The data analysis period was the two years from January 1, 2020 to December 31, 2021. For data collection and analysis, frequency and matrix data were extracted through Textom, and network analysis and visualization were conducted using the NetDraw function of the UCINET 6 program. As a result, perceptions of the checkout counter were clustered into accessibility, usability, continuous use intention, and others, according to the definition of consumers' experience factors. If suppliers spread unmanned checkout counters indiscriminately to cope with the rising minimum wage and shorter working hours, a bigger employment problem will arise from a social point of view. In addition, institutionalization is needed to supply easy and convenient unmanned checkout counters for the elderly, younger generations, children, and foreigners who are not familiar with unmanned checkout.

Trend Forecasting and Analysis of Quantum Computer Technology (양자 컴퓨터 기술 트렌드 예측과 분석)

  • Cha, Eunju;Chang, Byeong-Yun
    • Journal of the Korea Society for Simulation / v.31 no.3 / pp.35-44 / 2022
  • In this study, we analyze and forecast quantum computer technology trends. Previous trend analyses of quantum computer technology have mainly focused on technology-centered application fields. Therefore, for a more market-driven technical analysis and prediction, this paper analyzes important quantum computer technologies and performs future signal detection and prediction. To this end, we analyze words used in news articles to identify rapid market changes and public interest. This paper extends the conference presentation of Cha & Chang (2022). The research is conducted by collecting domestic news articles from 2019 to 2021. First, we organize the main keywords through text mining. Next, we explore future quantum computer technologies through analysis of Term Frequency - Inverse Document Frequency (TF-IDF), the Key Issue Map (KIM), and the Key Emergence Map (KEM). Finally, the relationship between future technologies and supply and demand is identified through random forests, decision trees, and correlation analysis. The results show that interest in artificial intelligence was the highest in frequency analysis and in keyword diffusion and visibility analysis. For cyber-security, the rate of mention in news articles is becoming overwhelmingly higher than that of other technologies. Quantum communication, resistant cryptography, and augmented reality also showed a high rate of increase in interest. These results show that expectations are high for applying trend technologies in the market. The results of this study can be applied to identifying areas of interest in the quantum computer market and establishing a response system related to technology investment.
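TF-IDF, one of the measures named in the abstract above, weights a term by how often it occurs in a document and how rare it is across documents. The abstract does not state which TF-IDF variant was used, so the sketch below assumes the plain form tf × log(N / df); the function name `tf_idf` and the tokenized sample articles are made up for illustration.

```python
from collections import Counter
from math import log


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


# Hypothetical tokenized news articles (not data from the study).
articles = [
    ["quantum", "computer", "security"],
    ["quantum", "communication"],
    ["quantum", "computer"],
]
weights = tf_idf(articles)
```

With this variant, a word that appears in every article, such as "quantum" here, gets a weight of zero, while a word confined to one article stands out, which is why TF-IDF is commonly used to surface distinctive keywords.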

Trend Analysis of Sports for All-Related Issues in Early Stage of COVID-19 Using Topic Modeling (토픽 모델링을 활용한 코로나19 초기 생활체육 이슈 분석)

  • Chung, Yunkil;Seo, Sumin;Kang, Hyunmin
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.57-79 / 2022
  • COVID-19, which started in December 2019, has had a great impact on our lives in general, including politics, the economy, society, and culture, and activities in sports and the arts have also been significantly reduced. In sports, the sports-for-all field, in which ordinary citizens participate, was particularly affected, and cases of infection in places closely tied to people's lives, such as gyms and table tennis and badminton clubs, also amplified the social fear of the spread of COVID-19. Therefore, in this study, we analyzed news articles related to sports for all from the time when COVID-19 first spread and investigated which issues were emerging and being discussed in the sports-for-all field under the COVID-19 situation. Specifically, we collected news articles dealing with sports-for-all issues under the COVID-19 situation from Korea's leading portal news sites and identified key sports-for-all issues by performing topic modeling on these articles. Through the analysis, we found meaningful issues such as COVID-19 outbreaks in sports facilities and support for sports activities. In addition, through word cloud analysis of these major issues, we visualized the issues and identified how they changed over time.

Analysis of articles on water quality accidents in the water distribution networks using big data topic modelling and sentiment analysis (빅데이터 토픽모델링과 감성분석을 활용한 물공급과정에서의 수질사고 기사 분석)

  • Hong, Sung-Jin;Yoo, Do-Guen
    • Journal of Korea Water Resources Association / v.55 no.spc1 / pp.1235-1249 / 2022
  • This study applied web crawling to extract big data news on water quality accidents in the water supply system and presented a step-by-step algorithm for obtaining accurate water quality accident news. In a large-scale water quality accident, development patterns such as accident recognition, accident spread, accident response, and accident resolution appear as the accident unfolds. Accordingly, the development of water quality accidents was analyzed in detail through main keywords and sentiment analysis for each stage, based on case studies, and the implications were derived. The proposed methodology was applied to the period of the 2020 larvae accident in Incheon Metropolitan City. As a result, we confirmed that, when disclosure of information that directly affects consumers, such as water quality accidents, is restricted, the tone of news articles and media reports about water quality accidents with long-term damage and the degree of consumer pride clearly change over time. This suggests the need to prepare consumer-centered policies to increase consumer positivity, although rapid restoration of facilities is very important in handling water quality accidents from the supplier's point of view.

A Generation and Matching Method of Normal-Transient Dictionary for Realtime Topic Detection (실시간 이슈 탐지를 위한 일반-급상승 단어사전 생성 및 매칭 기법)

  • Choi, Bongjun;Lee, Hanjoo;Yong, Wooseok;Lee, Wonsuk
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.5 / pp.7-18 / 2017
  • Recently, the number of SNS users has increased rapidly with the development of the smart device industry, and the amount of generated data is increasing exponentially. On Twitter, text data generated by users is a key research subject because it involves events, accidents, product reputations, and brand images. Twitter has become a channel for users to receive and exchange information. An important characteristic of Twitter is its real-time nature. Among the various events, earthquakes, floods, and suicides should be analyzed rapidly so that responses can be made immediately. To analyze an event, it is necessary to collect the tweets related to it, but it is difficult to find all related tweets using normal keywords. To solve this problem, this paper proposes a generation and matching method of a normal-transient dictionary for real-time topic detection. Normal dictionaries consist of general keywords related to events (event: suicide; keywords: death-loop, death, die, hang oneself, etc.), whereas transient dictionaries consist of transient keywords related to events (event: suicide; keywords: names and information of celebrities, information on social issues). Experimental results show that the matching method using the two dictionaries finds more tweets related to the event than a simple keyword search.
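The matching step described above can be illustrated with a small sketch. It shows only the matching idea, namely that a tweet is kept if it contains a keyword from either dictionary; how the paper generates the dictionaries is not covered. The function name `match_tweets`, the substring matching rule, and the sample tweets and keywords (including the use of a place name as a transient keyword) are assumptions made for this example.

```python
def match_tweets(tweets, normal_keywords, transient_keywords=()):
    """Return tweets that contain at least one keyword from either dictionary.

    Matching is a case-insensitive substring test, chosen here for simplicity.
    """
    keywords = {k.lower() for k in normal_keywords} | {k.lower() for k in transient_keywords}
    return [t for t in tweets if any(k in t.lower() for k in keywords)]


# Made-up tweets and dictionaries for an earthquake event.
tweets = [
    "Strong earthquake felt downtown",
    "Pohang residents evacuated overnight",
    "Nice weather today",
]
normal = {"earthquake", "tremor"}  # general, long-lived event keywords
transient = {"Pohang"}             # short-lived keyword tied to the current event

normal_only = match_tweets(tweets, normal)
both = match_tweets(tweets, normal, transient)
```

Here the second tweet never mentions an earthquake, so a normal-keyword search misses it, while the transient keyword recovers it. This mirrors the abstract's claim that using both dictionaries finds more related tweets than a simple keyword search.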