• Title/Summary/Keyword: bigdata analysis

Association Analysis of Product Sales using Sequential Layer Filtering (순차적 레이어 필터링을 이용한 상품 판매 연관도 분석)

  • Sun-Ho Bang;Kang-Hyun Lee;Ji-Young Jang;Tsatsral Telmentugs;Kwang-Sup Shin
    • The Journal of Bigdata / v.7 no.1 / pp.213-224 / 2022
  • In logistics and distribution, Market Basket Analysis (MBA) is an important means of analyzing the correlation between major sales products and of increasing internal operational efficiency. In particular, the results of market basket analysis serve as important reference data for decision-making processes such as product purchase prediction, product recommendation, and in-store product display. With the recent growth of e-commerce, the number of items handled by a single distribution and logistics company has increased rapidly, and existing analytical methods such as Apriori and FP-Growth have slowed down because the amount of computation grows exponentially, which limits their ability to surface important association rules when applied to actual business. To overcome this limitation, this study first applied a utility itemset mining technique, which can also take product sales volume into account, at the Main-Category level (the highest level of the product classification system) to select groups of products that are mainly sold together. Then, at the sub-category level, the types of products sold together were identified using FP-Growth. This sequential layer filtering technique may reduce unnecessary calculations and find practically usable rules for enhancing effectiveness and profitability.
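
The two-stage idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy transactions, the function names, and the thresholds are hypothetical, the pair-level "utility" (summed sales of both categories in transactions containing both) is a simplification of utility itemset mining, and plain pair counting stands in for FP-Growth at the sub-category level.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each transaction is a list of
# (main_category, sub_category, sales_amount) tuples.
transactions = [
    [("Food", "Ramen", 3.0), ("Food", "Kimchi", 2.0), ("Kitchen", "Pot", 10.0)],
    [("Food", "Ramen", 3.0), ("Kitchen", "Pot", 10.0)],
    [("Food", "Kimchi", 2.0), ("Beauty", "Soap", 1.0)],
    [("Food", "Ramen", 6.0), ("Kitchen", "Pot", 10.0), ("Beauty", "Soap", 1.0)],
]

def high_utility_main_pairs(transactions, min_utility):
    """Stage 1: keep main-category pairs whose summed sales reach min_utility."""
    utility = Counter()
    for t in transactions:
        per_main = Counter()
        for main, _sub, amount in t:
            per_main[main] += amount
        # Credit each co-occurring pair with the sales of both categories.
        for a, b in combinations(sorted(per_main), 2):
            utility[(a, b)] += per_main[a] + per_main[b]
    return {pair for pair, u in utility.items() if u >= min_utility}

def frequent_sub_pairs(transactions, main_pair, min_support):
    """Stage 2: count sub-category pairs only inside one selected main-category pair."""
    counts = Counter()
    for t in transactions:
        subs = sorted({sub for main, sub, _amount in t if main in main_pair})
        for pair in combinations(subs, 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}

selected = high_utility_main_pairs(transactions, min_utility=30.0)
rules = {pair: frequent_sub_pairs(transactions, pair, min_support=2) for pair in selected}
```

Because stage 2 only looks at sub-categories under the main categories that survived stage 1, the number of candidate pairs it has to count is much smaller than in a single pass over all items, which is the saving the abstract describes.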

Korean Food Review Analysis Using Large Language Models: Sentiment Analysis and Multi-Labeling for Food Safety Hazard Detection (대형 언어 모델을 활용한 한국어 식품 리뷰 분석: 감성분석과 다중 라벨링을 통한 식품안전 위해 탐지 연구)

  • Eun-Seon Choi;Kyung-Hee Lee;Wan-Sup Cho
    • The Journal of Bigdata / v.9 no.1 / pp.75-88 / 2024
  • Recently, there have been cases reported in the news of individuals experiencing symptoms of food poisoning after consuming raw beef purchased from online platforms, or reviews claiming that cherry tomatoes tasted bitter. This suggests the potential for analyzing food reviews on online platforms to detect food hazards, enabling government agencies, food manufacturers, and distributors to manage consumer food safety risks. This study proposes a classification model that uses sentiment analysis and large language models to analyze food reviews and detect negative ones, multi-labeling key food safety hazards (food poisoning, spoilage, chemical odors, foreign objects). The sentiment analysis model effectively minimized the misclassification of negative reviews with a low False Positive rate using a 'funnel' model. The multi-labeling model for food safety hazards showed high performance with both recall and accuracy over 96% when using GPT-4 Turbo compared to GPT-3.5. Government agencies, food manufacturers, and distributors can use the proposed model to monitor consumer reviews in real-time, detect potential food safety issues early, and manage risks. Such a system can protect corporate brand reputation, enhance consumer protection, and ultimately improve consumer health and safety.

The Study on the Meaning Change of 'Startup' and 'Entrepreneurship' using the Bigdata-based Corpus Network Analysis (빅데이터 기반 어휘연결망분석을 활용한 '창업'과 '기업가정신'의 의미변화연구)

  • Kim, Yeonjong;Park, Sanghyeok
    • Journal of Korea Society of Digital Industry and Information Management / v.16 no.4 / pp.75-93 / 2020
  • The purpose of this study is to extract keywords for 'startup' and 'entrepreneurship' from Naver news articles in Korea since 1990 and from Google news articles abroad, and to understand how the meanings of startup and entrepreneurship have changed in each era. In summary, first, in terms of keyword frequency, the venture germination period is characterized by government-led entrepreneurship exemplified by company chairmen, with various technology investments, investments in establishing companies, and training for item development; in the venture re-leap period, youth-oriented entrepreneurship and innovation through the development of various educational programs were emphasized. Second, in the vocabulary network analysis, the network connectivity and centrality of keywords tended to be stronger in the leap period than in the germination period, but in the re-leap period they tended to return to the germination-period level. Third, in the topic analysis, Naver keyword topics are mostly business-related content concerning support, policy, and education, whereas topics from Google News consist of major keywords that are more specifically applicable to practical work.

Analysis of Research Trends in Homomorphic Encryption Using Bibliometric Analysis (서지통계학적 분석을 이용한 동형 암호의 연구경향 분석)

  • Akihiko Yamada;Eunsang Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.4 / pp.601-608 / 2023
  • Homomorphic encryption is a promising technology that has been extensively researched in recent years. It allows computations to be performed on encrypted data, without the need to decrypt it. In this paper, we perform bibliometric analysis to objectively and quantitatively analyze the research trends of homomorphic encryption technology using 6,047 homomorphic encryption papers from the Scopus database. Specifically, we analyze the number of papers by year, keyword co-occurrence, topic clustering, changes in related keywords over time, and country of homomorphic encryption research institutions. Our analysis results provide strategic directions for research and application of homomorphic encryption and can be a great help for subsequent research and industrial applications.
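
Two of the counts named in this abstract, papers per year and keyword co-occurrence, can be sketched as follows. The three records are hypothetical stand-ins for the Scopus export, and the field names `year` and `keywords` are assumptions rather than the actual export schema.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records; the study itself analyzed 6,047 Scopus papers.
papers = [
    {"year": 2021, "keywords": ["homomorphic encryption", "cloud computing"]},
    {"year": 2022, "keywords": ["homomorphic encryption", "machine learning", "privacy"]},
    {"year": 2022, "keywords": ["homomorphic encryption", "privacy"]},
]

# Number of papers published in each year.
papers_per_year = Counter(p["year"] for p in papers)

# Number of papers in which each unordered keyword pair appears together.
cooccurrence = Counter()
for p in papers:
    for a, b in combinations(sorted(set(p["keywords"])), 2):
        cooccurrence[(a, b)] += 1
```

The co-occurrence counts are the edge weights of a keyword network, which is the usual input for the clustering step a bibliometric tool would run next.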

Developing Graphic Interface for Efficient Online Searching and Analysis of Graph-Structured Bibliographic Big Data (그래프 구조를 갖는 서지 빅데이터의 효율적인 온라인 탐색 및 분석을 지원하는 그래픽 인터페이스 개발)

  • You, Youngseok;Park, Beomjun;Jo, Sunhwa;Lee, Suan;Kim, Jinho
    • The Journal of Bigdata / v.5 no.1 / pp.77-88 / 2020
  • Recently, much research has been done to organize and analyze various complex real-world relationships represented in the form of graphs. In particular, a computer science literature database such as DBLP is representative graph data composed of papers, their authors, and citations among papers. Because graph data is very complex in storage structure and representation, it is very difficult to search, analyze, and visualize large bibliographic big data. In this paper, we develop a graphical user interface tool, called EEUM, which visualizes bibliographic big data in the form of graphs. EEUM lets users browse bibliographic big data along the connected graph structure by displaying the graph data visually, and implements search, management, and analysis of the bibliographic big data. We also show that EEUM can be conveniently used to search, explore, and analyze data by applying it to the bibliographic graph big data provided by DBLP. Through EEUM, users can easily find influential authors or papers in each research field and conveniently use it as a search and analysis tool for complex bibliographic big data, for example to get a glimpse of all the relationships between several authors and papers.

Changes in Public Bicycle Usage Patterns before and after COVID-19 in Seoul (코로나19 전후 서울시 공공 자전거 이용 패턴의 변화)

  • Il-Jung Seo;Jaehee Cho
    • The Journal of Bigdata / v.6 no.2 / pp.139-149 / 2021
  • Ddareungi, a public bicycle service in Seoul, has established itself as a means of daily transportation for citizens of Seoul. We speculated that the pattern of Ddareungi use may have changed since COVID-19. This study explores changes in Ddareungi use after COVID-19 with descriptive statistical analysis and network analysis. The analysis results are summarized as follows. The average traveling distance and average traveling speed have decreased at all times of day since COVID-19. The round trip rate has increased at dawn and in the morning and has decreased in the evening and at night. The average weighted degree and average clustering coefficient have decreased, and the modularity has increased. The clusters located north of the Han River in Seoul had a similar geographic distribution before and after COVID-19. However, the clusters located south of the Han River had different geographic distributions after COVID-19. Traveling routes added to the top 5 traffic rankings after COVID-19 had an average traveling distance of less than 1,000 meters. We expect that the results of this study will help improve the public bicycle service in Seoul.
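
Two of the indicators mentioned in this abstract can be computed as in the sketch below. The abstract does not give the study's exact definitions, so the ones used here are assumptions: a round trip is a rental returned to the station it started from, and the station network is undirected with edge weights equal to the number of trips between two distinct stations. The trip list is hypothetical.

```python
from collections import Counter

# Hypothetical trips as (rental_station, return_station).
trips = [("A", "B"), ("B", "A"), ("A", "A"), ("B", "C"), ("C", "C"), ("A", "B")]

# Share of trips that return to the station they started from.
round_trip_rate = sum(o == d for o, d in trips) / len(trips)

# Undirected edge weight = number of trips between two distinct stations.
edge_weight = Counter()
for o, d in trips:
    if o != d:
        edge_weight[frozenset((o, d))] += 1

# Weighted degree of a station = total weight of the edges touching it.
weighted_degree = Counter()
for edge, w in edge_weight.items():
    for station in edge:
        weighted_degree[station] += w

avg_weighted_degree = sum(weighted_degree.values()) / len(weighted_degree)
```

Computing the same two numbers separately on pre- and post-COVID-19 trips, or per time-of-day bucket, gives the kind of before/after comparison the abstract reports.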

Analysis of Review Data of 'Tamna' Franchisees to Promote Sustainable Travel in Jeju City (제주시의 지속가능한 여행 활성화를 위한 지역화폐 '탐나는전' 가맹점의 리뷰 데이터 분석)

  • Sehui Baek;Sehyoung Kim;Miran Bae;Juyoung Kang
    • The Journal of Bigdata / v.7 no.2 / pp.113-128 / 2022
  • After COVID-19, interest in 'sustainable tourism' increased, and so did the number of tourists who wanted to experience it. However, the programs and methods for 'sustainable tourism' are neither specific nor diverse. In addition, since most interest in 'sustainable tourism' focuses on 'environment' and 'carbon neutrality', there are not many programs or government policies that can contribute to the local community. Therefore, this study analyzed news data and review data to suggest a method for promoting 'sustainable tourism'. First, major themes of sustainable travel were derived through news big data analysis. Through this analysis, policy themes and events related to 'sustainable tourism' were derived. By analyzing news big data related to 'sustainable tourism', we sought to identify why sustainable travel has not taken off in Korea. Finally, in order to promote sustainable travel on Jeju Island, we analyzed user review data for the Jeju local currency and propose an idea for coexisting with the local community.

Welfare Policy Visualization Analysis using Big Data -Chungcheong- (빅데이터를 활용한 복지정책 시각화분석 -충청도 중심으로-)

  • Dae-Yu Kim;Won-Shik Na
    • Advanced Industrial SCIence / v.2 no.1 / pp.15-20 / 2023
  • The purpose of this study is to analyze the changes and importance of welfare policies in Chungcheong Province using big data analysis technology in the era of the Fourth Industrial Revolution, and to propose stable welfare policies for all generations, including the socially underprivileged. Chungcheong-do policy-related big data is coded in Python, and stable government policies are proposed based on the results of visualization analysis. As a result of the study, the keywords of Chungcheong-do government policy were confirmed in the order of region, society, government and support, education, and women, and welfare policy should be strengthened with a focus on improving local health policy and social welfare. For future research direction, it will be necessary to compare overseas cases and make policy proposals on the stable impact of national welfare policies.

Artificial Intelligence Algorithms, Model-Based Social Data Collection and Content Exploration (소셜데이터 분석 및 인공지능 알고리즘 기반 범죄 수사 기법 연구)

  • An, Dong-Uk;Leem, Choon Seong
    • The Journal of Bigdata / v.4 no.2 / pp.23-34 / 2019
  • Recently, crimes that use digital platforms have been continuously increasing. About 140,000 cases occurred in 2015 and about 150,000 cases occurred in 2016. Therefore, there is a limit to handling such online crimes with old-fashioned investigation techniques. The manual online search and cognitive investigation methods that investigators broadly use today are not enough to cope proactively with rapidly changing civil crimes. In addition, the fact that content is posted to unspecified users of social media makes investigations more difficult. Among web content collection methods, this study suggests site-based collection and the Open API, considering the characteristics of the online media where the infringement crimes occur. Since illegal content is published and deleted quickly, and new words and variations are generated quickly and in many forms, it is difficult to recognize them quickly with dictionary-based morphological analysis that relies on manually registered entries. To solve this problem, we propose adding a tokenizing method based on WPM (Word Piece Model) to the existing dictionary-based morphological analysis, as a data preprocessing method for quickly recognizing and responding to illegal content posted in online infringement crimes. In the data analysis, the optimal precision is verified through a vote-based ensemble method that uses supervised classification models for the investigation of illegal content. This study applies a sorting algorithm model centered on illegal multilevel business cases to proactively recognize crimes that harm the public economy, and presents an empirical study on dealing effectively with social data collection and content investigation.
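
The vote-based ensemble step mentioned in this abstract can be sketched as a simple hard majority vote. The abstract does not say which classifiers or which voting scheme were used, so the three model outputs and the `majority_vote` helper below are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists (all the same length) by hard majority vote."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

# Hypothetical outputs of three classifiers on four posts (1 = illegal, 0 = legal).
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]

ensemble = majority_vote([model_a, model_b, model_c])
```

With an odd number of binary classifiers there are no ties, so each post gets the label that at least two of the three models agree on.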

Bigdata Analysis of Fine Dust Theme Stock Price Volatility According to PM10 Concentration Change (PM10 농도변화에 따른 미세먼지 테마주 주가변동 빅데이터 분석)

  • Kim, Mu Jeong;Lim, Gyoo Gun
    • Journal of Service Research and Studies / v.10 no.1 / pp.55-67 / 2020
  • Fine dust has recently become one of the greatest concerns of the Korean people and has been the target of considerable efforts by the national and local governments. In academia, much research has been carried out on fine dust, but relatively little of it addresses the economic field. We therefore wanted to know how fine dust affects the economy. Big data on PM10 concentration and on the prices of fine dust theme stocks were collected for the five years from 2013 to 2017. Regression analysis was performed using a linear regression model estimated by the generalized least squares method. As a result, the change in fine dust concentration was found to have an effect on the prices of the related theme stocks. When the fine dust concentration increased compared to the previous day, the prices of fine dust theme stocks also tended to increase. Also, according to the analysis of stock price changes from 2013 to 2017 for fine dust theme stocks, the companies with large regression coefficients changed every year. Among them, the regression coefficients of Monalisa were repeatedly high in 2014, 2015, and 2017, those of Samil Pharmaceutical in 2015, 2016, and 2017, and those of Welcron in 2016 and 2017, and these companies were judged to be sensitive to the concentration of fine dust. The companies that responded the most over the five years were Wokong, Welcron, Dongsung Pharmaceutical, Samil Pharmaceutical, and Monalisa. Once enough PM2.5 measurement data have accumulated, it would be meaningful to run a comparative analysis with PM2.5 concentration as an independent variable. In this study, only the fine dust concentration is used as an independent variable. However, adding appropriate additional variables to increase the explanatory power is expected to yield clearer and better-explained results.
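
The relationship examined in this abstract, day-over-day stock price change regressed on day-over-day PM10 change, can be sketched as below. This is an illustration only: the two series are hypothetical, and ordinary least squares is used in place of the generalized least squares estimator the study reports.

```python
# Hypothetical daily series: PM10 concentration and one theme stock's closing price.
pm10 = [40.0, 55.0, 50.0, 80.0, 60.0]
price = [100.0, 103.0, 102.0, 108.0, 104.0]

def day_over_day(xs):
    """Change relative to the previous day."""
    return [b - a for a, b in zip(xs, xs[1:])]

def ols(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = ols(day_over_day(pm10), day_over_day(price))
```

A positive `slope` corresponds to the abstract's finding that theme stock prices tend to rise when the fine dust concentration rises relative to the previous day; fitting the same model per company and per year would give the per-company coefficients the abstract compares.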