• Title/Summary/Keyword: Graph Mining

Flow-based Anomaly Detection Using Access Behavior Profiling and Time-sequenced Relation Mining

  • Liu, Weixin;Zheng, Kangfeng;Wu, Bin;Wu, Chunhua;Niu, Xinxin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2781-2800 / 2016
  • Emerging attacks, such as Operation Aurora and Operation Shady RAT, aim to access proprietary assets and steal data for business or political motives. Skilled intruders would likely remove their traces on targeted hosts, but they cannot easily erase their network movements, which are continuously recorded by network devices. However, without complete knowledge of both inbound/outbound and internal traffic, it is difficult for a security team to unveil the hidden traces of intruders. In this paper, we propose an autonomous anomaly detection system based on behavior profiling and relation mining. The single-hop access profiling model employs a novel linear grouping algorithm, PSOLGA, to create behavior profiles for each individual server application discovered automatically in historical flow analysis. In addition, the double-hop access relation model uses an in-memory graph to mine time-sequenced access relations between different server applications. Using the behavior profiles and relation rules, this approach is able to detect possible anomalies and violations in real time. Finally, the experimental results demonstrate that the designed models are promising in terms of accuracy and computational efficiency.
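
The abstract above does not spell out how double-hop relations are mined, so the following is only a minimal sketch of one plausible reading: a relation is counted whenever a host is accessed and then itself accesses another server within a short time window. The flow tuple layout, the `window` parameter, the host names, and both function names are assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict

def mine_double_hop(flows, window=5.0):
    """Count (middle, destination) pairs where a host is accessed and then
    accesses another server within `window` seconds (hypothetical rule)."""
    # Index flows by source host: src -> list of (time, dst).
    outgoing = defaultdict(list)
    for t, src, dst in flows:
        outgoing[src].append((t, dst))

    relations = Counter()
    for t1, src, middle in flows:
        # Look for a follow-up access made by the host that was just accessed.
        for t2, dst in outgoing.get(middle, []):
            if 0 <= t2 - t1 <= window and dst != src:
                relations[(middle, dst)] += 1
    return relations

def is_violation(relation, rules, min_count=2):
    """Flag a double-hop relation seen fewer than `min_count` times in history."""
    return rules.get(relation, 0) < min_count

# Toy historical flows: (timestamp in seconds, source host, destination host).
flows = [
    (0.0, "client1", "web"),
    (0.4, "web", "db"),
    (10.0, "client2", "web"),
    (10.3, "web", "db"),
    (50.0, "web", "cache"),
]
rules = mine_double_hop(flows)
print(rules)
print(is_violation(("web", "cache"), rules))
```

In this toy history the relation web to db is observed twice after a client access, while web to cache never follows an inbound access inside the window, so it would be flagged.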

The Effect of Traditional Korean Medicine Treatment and Herbal Network Analysis in Postoperative Hip Fracture Inpatients (고관절 골절 수술 후 한의 입원치료 효과 및 다빈도 처방 약재 네트워크 분석)

  • Oh, Jihong;Lee, Myeong-Jong;Kim, Hojun
    • Journal of Korean Medicine Rehabilitation / v.32 no.3 / pp.119-129 / 2022
  • Objectives This study aimed to evaluate the effects of integrative traditional Korean medicine (TKM) treatment on 7 hospitalized patients after hip fracture surgery, and to identify significant herbs and co-prescribed herbs by using network analysis and association rule mining. Methods A retrospective chart review of the 7 hospitalized patients treated for postoperative hip fractures between January and December 2021 was performed. All TKM treatments for the patients were identified, and a Wilcoxon signed-rank test was performed to compare hip pain and mobility on admission and at discharge. We visualized the network of herbal medicines and complications. By using network analysis, we also identified the significant herbs (high degree, eigenvector, and subgraph centrality). Co-prescription patterns for the hip fracture patients were further analyzed by association rule mining. Results We found that TKM treatment significantly relieved hip pain and improved mobility. Accompanying symptoms reported by the patients were general weakness, anorexia, dizziness, delirium, edema, sputum, sore throat, cough, rhinorrhea, and chills. The constituent herbs of Sagunja-tang and Samul-tang showed high centralities and strong associations with other herbs. In addition, Gupan, Nokyong, Yukjongyong, Useul, and Hyunhosaek were identified as important herbs for postoperative hip fracture patients. Conclusions This study provides evidence for clinical TKM use as an effective postoperative treatment for pain relief and improvement of mobility in patients with hip fractures. In addition, herbs that can be considered in the treatment of patients after hip fracture surgery were identified through network analysis and association rule mining.
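
To make the association rule mining step concrete, here is a small sketch that computes support, confidence, and lift for pairs of co-prescribed items. The prescriptions and the item names `A` to `D` are invented placeholders, and `pair_rules` with its `min_support` threshold is a hypothetical helper; none of this reproduces the study's data or software.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(transactions)
    item_count, pair_count = {}, {}
    for t in transactions:
        items = sorted(set(t))
        for i in items:
            item_count[i] = item_count.get(i, 0) + 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), c in pair_count.items():
        support = c / n                      # share of prescriptions with both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):        # each pair yields two directed rules
            confidence = c / item_count[x]   # P(y | x)
            lift = confidence / (item_count[y] / n)
            rules.append((x, y, support, confidence, lift))
    return rules

# Placeholder prescriptions (invented data, one set of items per prescription).
prescriptions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "D"}]
herb_rules = pair_rules(prescriptions)
for rule in herb_rules:
    print(rule)
```

A lift above 1 suggests that two items are prescribed together more often than their individual frequencies would predict.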

Proposal for User-Product Attributes to Enhance Chatbot-Based Personalized Fashion Recommendation Service (챗봇 기반의 개인화 패션 추천 서비스 향상을 위한 사용자-제품 속성 제안)

  • Hyosun An;Sunghoon Kim;Yerim Choi
    • Journal of Fashion Business / v.27 no.3 / pp.50-62 / 2023
  • The e-commerce fashion market has experienced remarkable growth, leading to an overwhelming amount of shared information and numerous choices for users. In light of this, chatbots have emerged as a promising technological solution for enhancing personalized services. This study aimed to develop user-product attributes for a chatbot-based personalized fashion recommendation service using big data text mining techniques. To accomplish this, over one million consumer reviews from Coupang, an e-commerce platform, were collected and analyzed using frequency analyses to identify the upper-level attributes of users and products. Attribute terms were then assigned to each user-product attribute, including user body shape (body proportion, BMI), user needs (functional, expressive, aesthetic), user TPO (time, place, occasion), product design elements (fit, color, material, detail), product size (label, measurement), and product care (laundry, maintenance). The classification of user-product attributes was found to be applicable to the knowledge graph of the Conversational Path Reasoning model. A testing environment was established to evaluate the usefulness of the attributes based on real e-commerce users and purchased product information. This study is significant in proposing a new research methodology in the field of Fashion Informatics for constructing the knowledge base of a chatbot based on text mining analysis. The proposed research methodology is expected to enhance fashion technology and improve personalized fashion recommendation services and the user experience with chatbots in the e-commerce market.

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has not only affected many parts of our daily lives but has also had a huge impact on many areas, including the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are said to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which are known to cause enormous disruptions to the behavior and psychological well-being of many. Long-term negative emotions can reduce people's immunity and destroy their physical balance, so it is essential to understand people's psychological state during COVID-19. This study suggests a method of monitoring media news that reflects the current situation, in which the prolonged COVID-19 pandemic requires psychological as well as physical quarantine. Moreover, it presents how a simpler method of analyzing social media networks can be applied to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among the various major media, news headlines are considered important in the field of communication science as a summary of the core content that the media wants to convey to its audience. The news data used in this study were collected using "Bigkinds", which was created by integrating big data technology. With the collected news data, keywords were classified through text mining, and the relationships between words were visualized through semantic network analysis between keywords. Using the KrKwic program, a Korean semantic network analysis tool, text mining was performed and word frequencies were calculated to identify keywords.
The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized in a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared frequently in relation to the emotions of COVID-19. In addition, UCINET, a specialized social network analysis program, was used for connection (degree) centrality and cluster analysis, and the graph was visualized using NetDraw. The analysis of connection centrality showed that the most central keywords in the keyword-centric network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The network of co-occurrence frequencies among the keywords appearing in the news headlines was visualized as a graph. The thickness of a line on the graph is proportional to the frequency of co-occurrence, so two words that frequently appear together are joined by a thick line. The 'COVID-blue' pair is drawn with the thickest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs are drawn with relatively thick lines. 'Blue' in relation to COVID-19 is a word that means depression, and it was confirmed that COVID-19 and depression are keywords that deserve attention now. The research methodology used in this study makes it possible to measure social phenomena and changes quickly while reducing costs. In this study, by analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression, and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at once, health policy managers can be given a variety of perspectives when identifying and studying the phenomenon.
These results are expected to serve as basic data for support, treatment, and service development addressing psychological quarantine issues related to COVID-19.
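
The keyword network described in this abstract can be illustrated with a short sketch: each pair of keywords appearing in the same headline gets an edge whose weight is the co-occurrence count (the quantity the abstract maps to line thickness), and degree centrality is the fraction of other keywords a keyword is linked to. The study used KrKwic, UCINET, and NetDraw; the pure-Python functions and the three toy headlines below are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(keyword_lists):
    """Count how often each unordered keyword pair shares a headline."""
    edges = Counter()
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

def degree_centrality(edges):
    """Normalized degree: number of distinct neighbors divided by (n - 1)."""
    nodes = {node for edge in edges for node in edge}
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(nodes) - 1, 1)
    return {node: degree[node] / denom for node in nodes}

# Toy keyword lists, one per headline (invented, not the study's data).
headlines = [
    ["covid", "blue", "anxiety"],
    ["covid", "blue"],
    ["covid", "health"],
]
edges = cooccurrence_edges(headlines)
centrality = degree_centrality(edges)
print(edges.most_common(3))
print(sorted(centrality.items(), key=lambda kv: -kv[1]))
```

Here the pair with the largest count would be drawn with the thickest line, and the keyword linked to every other keyword has centrality 1.0.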

Mining Trip Patterns in the Large Trip-Transaction Database and Analysis of Travel Behavior (대용량 교통카드 트랜잭션 데이터베이스에서 통행 패턴 탐사와 통행 행태의 분석)

  • Park, Jong-Soo;Lee, Keum-Sook
    • Journal of the Economic Geographical Society of Korea / v.10 no.1 / pp.44-63 / 2007
  • The purpose of this study is to propose mining processes for the large trip-transaction database of the Metropolitan Seoul area and to analyze the spatial characteristics of travel behavior. For this purpose, this study introduces a mining algorithm developed for exploring trip patterns in the large trip-transaction database produced every day by transit users in the Metropolitan Seoul area. The algorithm computes trip chains of transit users by using the bus routes and a graph of the subway stops in the Seoul subway network. We explore the transfer frequency of transit users in their trip chains using one-day transaction databases from three different years. We find that the number of transit users who transfer to another bus or subway is increasing yearly. From the trip chains of the large trip-transaction database, trip patterns are mined to analyze how transit users travel in the public transportation system. The mining algorithm is a level-wise approach to finding frequent trip patterns. The resulting frequent patterns are illustrated to show the top-ranked subway stations and bus stops by support. From the outputs, we explore the travel patterns of three different time zones in a day. We find clear differences in the spatial structures of the origin and destination travel patterns depending on the time zone. In order to examine changes in travel patterns over time, we apply the algorithm to one day of data per year since 2004. The results are visualized using GIS, and the spatial characteristics of the travel patterns are then analyzed. The spatial distribution of trip origins and destinations shows a sharp distinction among time zones.
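
A level-wise search, as mentioned in this abstract, finds frequent patterns of size k before building candidates of size k + 1 from them. The sketch below is a generic Apriori-style version over unordered sets of stops; the paper's own algorithm works on trip chains and is not reproduced here. The stop names, the `min_support` count, and the function name are assumptions.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) mining; returns {itemset: support count}."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-sets into k-set candidates, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result

# Toy trips: the set of stops each rider used in a day (invented names).
trips = [
    {"S1", "S2", "B7"},
    {"S1", "S2"},
    {"S1", "B7"},
    {"S2", "B7"},
    {"S1", "S2", "B7"},
]
patterns = frequent_itemsets(trips, min_support=3)
for itemset, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Treating a trip as a set ignores the order of stops; mining ordered trip chains would need sequence patterns instead, which this sketch leaves out.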

Effective Utilization of Data based on Analysis of Spatial Data Mining (공간 데이터마이닝 분석을 통한 데이터의 효과적인 활용)

  • Kim, Kibum;An, Beongku
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.3 / pp.157-163 / 2013
  • Data mining is a useful technology that can support new discoveries based on pattern analysis and a variety of linkages between data, and it is currently used in various fields such as finance, marketing, and medicine. In this paper, we propose an effective method of utilizing data based on spatial data mining analysis. We make use of basic data on foreigners living in Seoul. However, unlike data from other areas, this data is classified as sensitive information and raises legal issues such as personal information protection. We therefore use basic statistical data that does not contain personal information. The main features and contributions of the proposed method are as follows. First, we can use Big Data as information in a variety of ways and can classify and cluster Big Data through refinement. Second, we can use this kind of information for future decision-making and for discovering new patterns. In the performance evaluation, we use a visual approach based on graphs of themes. The results of the performance evaluation show that analysis using data mining technology can support new discoveries of patterns and results.

Research Trends of Health Recommender Systems (HRS): Applying Citation Network Analysis and GraphSAGE (건강추천시스템(HRS) 연구 동향: 인용네트워크 분석과 GraphSAGE를 활용하여)

  • Haryeom Jang;Jeesoo You;Sung-Byung Yang
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.57-84 / 2023
  • With the development of information and communications technology (ICT) and big data technology, anyone can easily obtain and utilize vast amounts of data through the Internet. Therefore, the capability of selecting high-quality data from a large amount of information is becoming more important than the capability of just collecting them. This trend continues in academia; literature reviews, such as systematic and non-systematic reviews, have been conducted in various research fields to construct a healthy knowledge structure by selecting high-quality research from accumulated research materials. Meanwhile, after the COVID-19 pandemic, remote healthcare services, which have not been agreed upon, are allowed to a limited extent, and new healthcare services such as health recommender systems (HRS) equipped with artificial intelligence (AI) and big data technologies are in the spotlight. Although, in practice, HRS are considered one of the most important technologies to lead the future healthcare industry, literature review on HRS is relatively rare compared to other fields. In addition, although HRS are fields of convergence with a strong interdisciplinary nature, prior literature review studies have mainly applied either systematic or non-systematic review methods; hence, there are limitations in analyzing interactions or dynamic relationships with other research fields. Therefore, in this study, the overall network structure of HRS and surrounding research fields were identified using citation network analysis (CNA). Additionally, in this process, in order to address the problem that the latest papers are underestimated in their citation relationships, the GraphSAGE algorithm was applied. As a result, this study identified 'recommender system', 'wireless & IoT', 'computer vision', and 'text mining' as increasingly important research fields related to HRS research, and confirmed that 'personalization' and 'privacy' are emerging issues in HRS research. 
The study findings would provide both academic and practical insights into identifying the structure of the HRS research community, examining related research trends, and designing future HRS research directions.

Stock Price Prediction Based on Time Series Network (시계열 네트워크에 기반한 주가예측)

  • Park, Kang-Hee;Shin, Hyun-Jung
    • Korean Management Science Review / v.28 no.1 / pp.53-60 / 2011
  • Time series analysis methods have traditionally been used in stock price prediction. However, most existing methods have methodological limitations in reflecting the influence of external factors that affect the fluctuation of stock prices, such as oil prices, exchange rates, interest rates, and the stock price indexes of other countries. To overcome these limitations, we propose a network-based method that incorporates the relations between individual company stock prices and the external factors by using a graph-based semi-supervised learning algorithm. To verify the significance of the proposed method, it was applied to predicting the stock prices of companies listed on the KOSPI from January 2007 to August 2008.
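
The abstract does not say which graph-based semi-supervised learning algorithm was used, so the sketch below shows one common variant only: iterative label propagation, in which labeled nodes keep their labels and every unlabeled node repeatedly takes the weighted average of its neighbors' scores. The node names, edge weights, the +1/-1 encoding for up and down movement, and the function name are all hypothetical.

```python
def propagate_labels(weights, labels, iterations=200):
    """Iterative label propagation on a weighted graph.

    weights: dict node -> dict neighbor -> non-negative edge weight
    labels:  dict node -> +1.0 or -1.0 for the labeled nodes only
    Returns a score per node; the sign is the predicted class.
    """
    scores = {node: labels.get(node, 0.0) for node in weights}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in weights.items():
            if node in labels:
                updated[node] = labels[node]   # clamp known labels
                continue
            total = sum(neighbors.values())
            updated[node] = (
                sum(w * scores[m] for m, w in neighbors.items()) / total
                if total else 0.0
            )
        scores = updated
    return scores

# Toy network (invented): two labeled external factors, two unlabeled stocks.
weights = {
    "oil":     {"stock_a": 0.9},
    "fx":      {"stock_a": 0.1, "stock_b": 1.0},
    "stock_a": {"oil": 0.9, "fx": 0.1, "stock_b": 1.0},
    "stock_b": {"stock_a": 1.0, "fx": 1.0},
}
labels = {"oil": 1.0, "fx": -1.0}   # +1 = moved up, -1 = moved down
scores = propagate_labels(weights, labels)
print({node: round(score, 3) for node, score in scores.items()})
```

In this toy network, stock_a settles at a positive score because its strongest tie is to the node labeled up, while stock_b ends up negative.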

A XML Schema Matching based on Fuzzy Similarity Measure

  • Kim, Chang-Suk;Sim, Kwee-Bo
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.1482-1485 / 2005
  • Finding equivalent schema matches among several different source schemas is very important for information integration or mining on the XML-based World Wide Web. Finding the source schema most similar to a given mediated schema is a major bottleneck because of the arbitrary nesting and hierarchical structures of XML DTD schemas. It is a complex, labor-intensive, and error-prone job. In this paper, we present the first complex matching of XML schemas, i.e., XML DTDs, which inlines a two-dimensional DTD graph into flat feature values. The proposed method captures not only schematic information but also the integrity constraint information of DTDs in order to match differently structured DTDs. We show that hierarchical schema matching based on integrity constraints is more semantic than schema matching that uses only schematic information and stored data.

News Data Analysis Technique using Graph Mining (그래프 마이닝을 이용한 뉴스 데이터 분석 기법)

  • Lee, ChangJu;Park, Kisung;Han, Yongkoo;Lee, Young-Koo
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.730-733 / 2015
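  • Various analysis techniques, such as related-keyword and hot-keyword analysis, are being studied to find useful information in large volumes of Internet news data. Existing topic-model-based techniques fail to properly represent the associations among keywords, so the related keywords and hot keywords they mine have low accuracy. Recently, a graph-based modeling technique has been studied that represents news data with the words in the news as vertices and connects words appearing in the same sentence with edges. Applying graph mining techniques to such a news graph DB makes it possible to mine related keywords and hot keywords. This paper proposes an effective news data analysis technique based on graph mining. We present useful news graph patterns mined from real news data and show that they can be used effectively for news data analysis.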