• Title/Summary/Keyword: Big Data Mining

Search Result 684, Processing Time 0.029 seconds

A Study on the Characteristics of Amekaji Fashion Trends Using Big Data Text Mining Analysis (빅데이터 텍스트 마이닝 분석을 활용한 아메카지 패션 트렌드 특징 고찰)

  • Kim, Gihyung
    • Journal of Fashion Business
    • /
    • v.26 no.3
    • /
    • pp.138-154
    • /
    • 2022
  • The purpose of this study is to identify the characteristics of domestic American casual fashion trends using big data text mining analysis. 108,524 posts and 2,038,999 extracted keywords from Naver and Daum related to American casual fashion in the past 5 years were collected and refined by the Textom program, and frequency analysis, word cloud, N-gram, centrality analysis, and CONCOR analysis were performed. The frequency analysis, 'vintage', 'style', 'daily look', 'coordination', 'workwear', 'men's wear' appeared as the main keywords. The main nationality of the representative brands was Japanese, followed by American, Korean, and others. As a result of the CONCOR analysis, four clusters were derived: "general American casual trend", "vintage taste", "direct sales mania", and "American styling". This study results showed that Japanese American casual clothes are influenced by American casual clothes, and American casual fashion in Korea, which has been reinterpreted, is completed with various coordination and creative styles such as workwear, street, military, classic, etc., focusing on items and brands. Looks were worn and shared on social networks, and the existence of an active consumer group and market potential to obtain genuine products, ranging from second-hand transactions for limited edition vintages to individual transactions were also confirmed. The significance of this study is that it presented the characteristics of American casual fashion trends academically based on online text data that the public actually uses because it has been spread by the public.

The Big Data Analytics Regarding the Cadastral Resurvey News Articles

  • Joo, Yong-Jin;Kim, Duck-Ho
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.32 no.6
    • /
    • pp.651-659
    • /
    • 2014
  • With the popularization of big data environment, big data have been highlighted as a key information strategy to establish national spatial data infrastructure for a scientific land policy and the extension of the creative economy. Especially interesting from our point of view is the cadastral information is a core national information source that forms the basis of spatial information that leads to people's daily life including the production and consumption of information related to real estate. The purpose of our paper is to suggest the scheme of big data analytics with respect to the articles of cadastral resurvey project in order to approach cadastral information in terms of spatial data integration. As specific research method, the TM (Text Mining) package from R was used to read various formats of news reports as texts, and nouns were extracted by using the KoNLP package. That is, we searched the main keywords regarding cadastral resurvey, performing extraction of compound noun and data mining analysis. And visualization of the results was presented. In addition, new reports related to cadastral resurvey between 2012 and 2014 were searched in newspapers, and nouns were extracted from the searched data for the data mining analysis of cadastral information. Furthermore, the approval rating, reliability, and improvement of rules were presented through correlation analyses among the extracted compound nouns. As a result of the correlation analysis among the most frequently used ones of the extracted nouns, five groups of data consisting of 133 keywords were generated. The most frequently appeared words were "cadastral resurvey," "civil complaint," "dispute," "cadastral survey," "lawsuit," "settlement," "mediation," "discrepant land," and "parcel." In Conclusions, the cadastral resurvey performed in some local governments has been proceeding smoothly as positive results. On the other hands, disputes from owner of land have been provoking a stream of complaints from parcel surveying for the cadastral resurvey. Through such keyword analysis, various public opinion and the types of civil complaints related to the cadastral resurvey project can be identified to prevent them through pre-emptive responses for direct call centre on the cadastral surveying, Electronic civil service and customer counseling, and high quality services about cadastral information can be provided. This study, therefore, provides a stepping stones for developing an account of big data analytics which is able to comprehensively examine and visualize a variety of news report and opinions in cadastral resurvey project promotion. Henceforth, this will contribute to establish the foundation for a framework of the information utilization, enabling scientific decision making with speediness and correctness.

Big Data Smoothing and Outlier Removal for Patent Big Data Analysis

  • Choi, JunHyeog;Jun, Sunghae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.21 no.8
    • /
    • pp.77-84
    • /
    • 2016
  • In general statistical analysis, we need to make a normal assumption. If this assumption is not satisfied, we cannot expect a good result of statistical data analysis. Most of statistical methods processing the outlier and noise also need to the assumption. But the assumption is not satisfied in big data because of its large volume and heterogeneity. So we propose a methodology based on box-plot and data smoothing for controling outlier and noise in big data analysis. The proposed methodology is not dependent upon the normal assumption. In addition, we select patent documents as target domain of big data because patent big data analysis is a important issue in management of technology. We analyze patent documents using big data learning methods for technology analysis. The collected patent data from patent databases on the world are preprocessed and analyzed by text mining and statistics. But the most researches about patent big data analysis did not consider the outlier and noise problem. This problem decreases the accuracy of prediction and increases the variance of parameter estimation. In this paper, we check the existence of the outlier and noise in patent big data. To know whether the outlier is or not in the patent big data, we use box-plot and smoothing visualization. We use the patent documents related to three dimensional printing technology to illustrate how the proposed methodology can be used for finding the existence of noise in the searched patent big data.

Bigdata Analysis on Keyword by Generations through Text Mining: Focused on Board of Nate Pann in 10s, 20s, 30s (텍스트 마이닝을 활용한 세대별 키워드 빅데이터 분석: 네이트판 10대·20대·30대 게시판을 중심으로)

  • Jeong, Baek;Bae, Sungwon;Hwangbo, Yujeong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.07a
    • /
    • pp.513-516
    • /
    • 2022
  • 본 논문에서는 텍스트 마이닝 기법을 이용하여 MZ 세대를 이해하는 키워드를 도출하고자 한다. MZ 세대의 비중이 높아지면서, MZ 세대를 분석하려고 하는 많은 연구들이 수행되고 있다. 이에 본 연구에서는 MZ 세대를 이해하기 위하여 네이트 판의 연령별 게시판 크롤링을 통해 빅데이터를 수집하였다. 그리고 텍스트 마이닝 기법을 활용하여 10대, 20대, 30대의 각각의 키워드를 도출할 수 있었다. 본 논문에서 도출된 키워드는 이는 MZ 세대를 이해하는데 중요한 키워드로 볼 수 있을 것이다. 향후 연구로는 MZ 세대와 기성 세대를 비교하기 위하여 추가 크롤링을 통해 세대 간 비교 연구를 수행하고자 한다.

  • PDF

A Study on FIFA Partner Adidas of 2022 Qatar World Cup Using Big Data Analysis

  • Kyung-Won, Byun
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.15 no.1
    • /
    • pp.164-170
    • /
    • 2023
  • The purpose of this study is to analyze the big data of Adidas brand participating in the Qatar World Cup in 2022 as a FIFA partner to understand useful information, semantic connection and context from unstructured data. Therefore, this study collected big data generated during the World Cup from Adidas participating in sponsorship as a FIFA partner for the 2022 Qatar World Cup and collected data from major portal sites to understand its meaning. According to text mining analysis, 'Adidas' was used the most 3,340 times based on the frequency of keyword appearance, followed by 'World Cup', 'Qatar World Cup', 'Soccer', 'Lionel Messi', 'Qatar', 'FIFA', 'Korea', and 'Uniform'. In addition, the TF-IDF rankings were 'Qatar World Cup', 'Soccer', 'Lionel Messi', 'World Cup', 'Uniform', 'Qatar', 'FIFA', 'Ronaldo', 'Korea', and 'Nike'. As a result of semantic network analysis and CONCOR analysis, four groups were formed. First, Cluster A named it 'Qatar World Cup Sponsor' as words such as 'Adidas', 'Nike', 'Qatar World Cup', 'Sponsor', 'Sponsor Company', 'Marketing', 'Nation', 'Launch', 'Official', 'Commemoration' and 'National Team' were formed into groups. Second, B Cluster named it 'Group stage' as words such as 'Qatar', 'Uruguay', 'FIFA' and 'group stage' were formed into groups. Third, C Cluster named it 'Winning' as words such as 'World Cup Winning', 'Champion', 'France', 'Argentina', 'Lionel Messi', 'Advertising' and 'Photograph' formed a group. Fourth, D Cluster named it 'Official Ball' as words such as 'Official Ball', 'World Cup Official Ball', 'Soccer Ball', 'All Times', 'Al Rihla', 'Public', 'Technology' was formed into groups.

Big data Cloud Service for Manufacturing Process Analysis (제조 공정 분석을 위한 빅데이터 클라우드 서비스)

  • Lee, Yong-Hyeok;Song, Min-Seok;Ha, Seung-Jin;Baek, Tae-Hyun;Son, Sook-Young
    • The Journal of Bigdata
    • /
    • v.1 no.1
    • /
    • pp.41-51
    • /
    • 2016
  • Big data is an emerging issue as large data which was impossible to be processed in the past is possible to be handled with the development of information and communication technology. Manufacturing is the most promising field that big data is applied such that there are abundant data available. It is important to improve an efficiency of manufacturing process for quality control and production efficiency because the processes from production design, sales, productions and so on are mixed intricately. This study proposes big data cloud service for manufacturing analysis using a big data technology and a process mining technique. It is expected for manufacturing corporations to improve a manufacturing process and reduced the cost by applying the proposed service. The service provides various analyses including manufacturing analysis and manufacturing duration analysis. Big data cloud service has been implemented and it has been validated by conducting a case study.

  • PDF

A Big Data Analysis on Research Keywords, Centrality, and Topics of International Trade using the Text Mining and Social Network (텍스트 마이닝과 소셜 네트워크 기법을 활용한 국제무역 키워드, 중심성과 토픽에 대한 빅데이터 분석)

  • Chae-Deug Yi
    • Korea Trade Review
    • /
    • v.47 no.4
    • /
    • pp.137-159
    • /
    • 2022
  • This study aims to analyze international trade papers published in Korea during the past 2002-2022 years. Through this study, it is possible to understand the main subject and direction of research in Korea's international trade field. As the research mythologies, this study uses the big data analysis such as the text mining and Social Network Analysis such as frequency analysis, several centrality analysis, and topic analysis. After analyzing the empirical results, the frequency of key word is very high in trade, export, tariff, market, industry, and the performance of firm. However, there has been a tendency to include logistics, e-business, value and chain, and innovation over the time. The degree and closeness centrality analyses also show that the higher frequency key words also have been higher in the degree and closeness centrality. In contrast, the order of eigenvector centrality seems to be different from those of the degree and closeness centrality. The ego network shows the density of business, sale, exchange, and integration appears to be high in order unlike the frequency analysis. The topic analysis shows that the export, trade, tariff, logstics, innovation, industry, value, and chain seem to have high the probabilities of included in several topics.

Extracting of Interest Issues Related to Patient Medical Services for Small and Medium Hospital by SNS Big Data Text Mining and Social Networking (중소병원 환자의료서비스에 관한 관심 이슈 도출을 위한 SNS 빅 데이터 텍스트 마이닝과 사회적 연결망 적용)

  • Hwang, Sang Won
    • Korea Journal of Hospital Management
    • /
    • v.23 no.4
    • /
    • pp.26-39
    • /
    • 2018
  • Purposes: The purpose of this study is to analyze the issue of interest in patient medical service of small and medium hospitals using big data. Methods: The method of this study was implemented by data mining and social network using SNS big data. The analysis tool were extracted key keywords and analyzed correlation by using Textom, Ucinet6 and NetDraw program. Findings: In the results of frequency, the network-centered and closeness centrality analysis, It was shown that the government center is interested in the major explanations and evaluations of the technology, information, security, safety, cost and problems of small and medium hospitals, coping with infections, and actual involvement in bank settlement. And, were extracted care for disabilities such as pediatrics, dentistry, obstetrics and gynecology, dementia, nursing, the elderly, and rehabilitation. Practical Implications: Future studies will be more useful if analyzed the needs of customers for medical services in the metropolitan area and provinces may be different in the small and medium hospitals to be studied, further classification studies.

A Study of Consumer Perception on Fashion Show Using Big Data Analysis (빅데이터를 활용한 패션쇼에 대한 소비자 인식 연구)

  • Kim, Da Jeong;Lee, Seunghee
    • Journal of Fashion Business
    • /
    • v.23 no.3
    • /
    • pp.85-100
    • /
    • 2019
  • This study examines changes in consumer perceptions of fashion shows, which are critical elements in the apparel industry and a means to represent a brand's image and originality. For this purpose, big data in clothing marketing, text mining, semantic network analysis techniques were applied. This study aims to verify the effectiveness and significance of fashion shows in an effort to give directions for their future utilization. The study was conducted in two major stages. First, data collection with the key word, "fashion shows," was conducted across websites, including Naver and Daum between 2015 and 2018. The data collection period was divided into the first- and second-half periods. Next, Textom 3.0 was utilized for data refinement, text mining, and word clouding. The Ucinet 6.0 and NetDraw, were used for semantic network analysis, degree centrality, CONCOR analysis and also visualization. The level of interest in "models" was found to be the highest among the perception factors related to fashion shows in both periods. In the first-half period, the consumer interests focused on detailed visual stimulants such as model and clothing while in the second-half period, perceptions changed as the value of designers and brands were increasingly recognized over time. The findings of this study can be utilized as a tool to evaluate fashion shows, the apparel industry sectors, and the marketing methods. Additionally, it can also be used as a theoretical framework for big data analysis and as a basis of strategies and research in industrial developments.

Keyword Analysis of Arboretums and Botanical Gardens Using Social Big Data

  • Shin, Hyun-Tak;Kim, Sang-Jun;Sung, Jung-Won
    • Journal of People, Plants, and Environment
    • /
    • v.23 no.2
    • /
    • pp.233-243
    • /
    • 2020
  • This study collects social big data used in various fields in the past 9 years and explains the patterns of major keywords of the arboretums and botanical gardens to use as the basic data to establish operational strategies for future arboretums and botanical gardens. A total of 6,245,278 cases of data were collected: 4,250,583 from blogs (68.1%), 1,843,677 from online cafes (29.5%), and 151,018 from knowledge search engine (2.4%). As a result of refining valid data, 1,223,162 cases were selected for analysis. We came up with keywords through big data, and used big data program Textom to derive keywords of arboretums and botanical gardens using text mining analysis. As a result, we identified keywords such as 'travel', 'picnic', 'children', 'festival', 'experience', 'Garden of Morning Calm', 'program', 'recreation forest', 'healing', and 'museum'. As a result of keyword analysis, we found that keywords such as 'healing', 'tree', 'experience', 'garden', and 'Garden of Morning Calm' received high public interest. We conducted word cloud analysis by extracting keywords with high frequency in total 6,245,278 titles on social media. The results showed that arboretums and botanical gardens were perceived as spaces for relaxation and leisure such as 'travel', 'picnic' and 'recreation', and that people had high interest in educational aspects with keywords such as 'experience' and 'field trip'. The demand for rest and leisure space, education, and things to see and enjoy in arboretums and botanical gardens increased than in the past. Therefore, there must be differentiation and specialization strategies such as plant collection strategies, exhibition planning and programs in establishing future operation strategies.