• Title/Summary/Keyword: Text data


A Big Data Analysis on Research Keywords, Centrality, and Topics of International Trade using the Text Mining and Social Network (텍스트 마이닝과 소셜 네트워크 기법을 활용한 국제무역 키워드, 중심성과 토픽에 대한 빅데이터 분석)

  • Chae-Deug Yi
    • Korea Trade Review / v.47 no.4 / pp.137-159 / 2022
  • This study analyzes international trade papers published in Korea from 2002 to 2022 in order to identify the main subjects and directions of research in Korea's international trade field. As its research methodology, the study uses big data analysis based on text mining and social network analysis, including frequency analysis, several centrality analyses, and topic analysis. The empirical results show that the most frequent keywords are trade, export, tariff, market, industry, and firm performance. Over time, however, keywords such as logistics, e-business, value, chain, and innovation have tended to appear more often. The degree and closeness centrality analyses show that the more frequent keywords also rank higher in degree and closeness centrality. In contrast, the ranking by eigenvector centrality differs from the rankings by degree and closeness centrality. The ego network analysis shows that density is highest for business, sale, exchange, and integration, in that order, unlike the frequency analysis. The topic analysis shows that export, trade, tariff, logistics, innovation, industry, value, and chain have high probabilities of being included in several topics.
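
To illustrate the kind of frequency and degree centrality analysis this abstract describes, here is a minimal sketch in Python. It is not the author's pipeline: the function name `keyword_stats`, the co-occurrence rule (two keywords are linked if they appear in the same paper), and the sample keyword lists are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(papers):
    """Return (frequency, degree_centrality) for a list of per-paper keyword lists.

    Assumption: two keywords are connected if they co-occur in at least one paper.
    Degree centrality is the number of distinct neighbors divided by (n - 1).
    """
    # Count each keyword once per paper.
    freq = Counter(kw for kws in papers for kw in set(kws))

    # Build an undirected co-occurrence graph as adjacency sets.
    neighbors = {kw: set() for kw in freq}
    for kws in papers:
        for a, b in combinations(sorted(set(kws)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    n = len(neighbors)
    centrality = {
        kw: (len(nb) / (n - 1) if n > 1 else 0.0)
        for kw, nb in neighbors.items()
    }
    return freq, centrality


# Hypothetical sample data, not taken from the study.
papers = [
    ["trade", "export", "tariff"],
    ["trade", "logistics"],
    ["export", "innovation", "trade"],
]
freq, centrality = keyword_stats(papers)
```

In this toy data, the most frequent keyword is also the one with the highest degree centrality, which mirrors the relationship the abstract reports. Closeness and eigenvector centrality would need a graph library or additional code.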

Keywords Analysis of Clothing Materials in Consumer Reviews Using Big Data Text Mining (빅데이터 텍스트 마이닝을 활용한 소비자 리뷰에서의 의류 소재 키워드 분석)

  • Gaeun Kang;Jiwon Park;Shinjung Yoo
    • Journal of the Korean Society of Clothing and Textiles / v.48 no.4 / pp.729-743 / 2024
  • This research explores consumer preferences for materials in different clothing product categories, using web-crawling and text mining techniques. Specifically, the study focuses on the material-related terms found in consumer reviews across three distinct product categories: functional clothing, formal shirts, and knit sweaters. Top-selling products within each category were identified on the Naver Shopping website based on the volume of reviews, and the four most-reviewed products were selected. Six hundred reviews per product were analyzed using the Textom big-data analysis software to determine the frequency of material-related mentions and word associations. The analysis utilized two comparative metrics: product category and usage duration. Our findings reveal notable variations in the material preferences mentioned by consumers across different product categories. The study suggests a need to re-evaluate existing standardized review criteria to better reflect consumer interests specific to each product category. Additionally, an increase in material-related terms in reviews over one month indicates the potential importance of extending the duration of product reviews to enhance the accuracy of information that reflects longer-term consumer experiences with material quality.

A Study on Comparison of Open Application Programming Interface of Securities Companies Supporting Python

  • Ryu, Gui Yeol
    • International journal of advanced smart convergence / v.10 no.1 / pp.97-104 / 2021
  • Securities and investment services had the most data per company on average and used the most data. Investors increasingly want to invest using their own analysis methods, so securities and investment companies provide stock data to investors through open APIs. The data received through an open API is in text format, and Python is effective and convenient for requesting and receiving text data. We investigated 22 major securities and investment companies in Korea and found that only 6 of them provide an open API; only Daishin Securities Co. officially supports Python. We compare how stock data is received through an open API using Python, along with the relevant Python programming features. The open APIs studied are CYBOS Plus from Daishin Securities Co. and xingAPI from eBest Investment & Securities Co. Comparing the two APIs for receiving current stock data, we find two main differences: the login method and the method of sending and receiving data. As for the login method, CYBOS Plus has login information, whereas xingAPI does not. As for sending and receiving data, CYBOS Plus does so by calling a request method and a reply method, whereas xingAPI does so in the form of events. As a result, xingAPI requires more code than CYBOS Plus. We also find that CYBOS Plus can run loops over lists, tuples, and dictionaries, and supports the basic commands provided by Python.

Automatic Measurement of Temperature in Real Time by Using an Internal and Data Processing System (인터넷을 이용한 원격 실시간 온도 계측 모니터 및 계측데이터 자동처리 시스템)

  • Kim, Hui-Sik;Kim, Yeong-Il;Seol, Dae-Yeon;Nam, Cheol;O, Heung-Il
    • Proceedings of the KIEE Conference / 2003.11b / pp.99-102 / 2003
  • In this paper, we develop a system for monitoring and processing real-time sensor data from a remote site over the Internet. To realize this system, measurement equipment and a protocol are used to transmit the measurement data to a remote server and to process it. On the server side, the data received from the remote sensor is converted to text or graphic charts for the user. The measurement device on the sensor side receives data from the sensor and stores it in internal memory for transmission to the server over the Internet. The measurement device can also receive data from the server. A temperature sensor is connected to the measurement device located in the laboratory, and the device measures the laboratory temperature, which the user can check over the Internet. We developed a server program running on Linux that stores measurement data from the measurement device in server memory. The program uses SNMP (Simple Network Management Protocol) to exchange data with the measurement device and converts the measurement data into text and graphic charts for display. It uses Apache and PHP for user display and queries. The real-time temperature measurement system can be applied in many areas of industry and daily life.


HTML Text Extraction Using Tag Path and Text Appearance Frequency (태그 경로 및 텍스트 출현 빈도를 이용한 HTML 본문 추출)

  • Kim, Jin-Hwan;Kim, Eun-Gyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1709-1715 / 2021
  • To extract the necessary text from a web page accurately, one approach is to tell the web crawler which tags and style attributes contain the main content. The problem with this approach is that the extraction logic must be modified whenever the web page configuration changes. To solve this problem, a previous study proposed extracting the text by analyzing how frequently text appears, but that method was limited in that its performance varied widely depending on the collection channel of the web page. In this paper, we therefore propose a method that extracts text with high accuracy from various collection channels by analyzing not only the frequency of appearance of text but also the parent tag paths of text nodes extracted from the DOM tree of web pages.
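
The idea of combining text appearance frequency with parent tag paths can be sketched as follows. This is a simplified illustration, not the authors' algorithm: the helper names (`TextNodeCollector`, `collect`, `extract_main_text`), the `max_share` threshold, the rule of picking the tag path with the most remaining characters, and the sample pages are all assumptions.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}
SKIP_TAGS = {"script", "style"}


class TextNodeCollector(HTMLParser):
    """Collect (parent tag path, text) pairs for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and not (self.stack and self.stack[-1] in SKIP_TAGS):
            self.nodes.append(("/".join(self.stack), text))


def collect(html):
    parser = TextNodeCollector()
    parser.feed(html)
    parser.close()
    return parser.nodes


def extract_main_text(pages, target_index, max_share=0.5):
    """Assumed heuristic: drop text repeated on more than max_share of the pages
    (menus, footers), then keep the tag path holding the most remaining characters."""
    all_nodes = [collect(p) for p in pages]
    page_freq = Counter(t for nodes in all_nodes for t in {t for _, t in nodes})
    limit = max_share * len(pages)

    kept = []
    chars_by_path = defaultdict(int)
    for path, text in all_nodes[target_index]:
        if len(pages) > 1 and page_freq[text] > limit:
            continue  # appears on too many pages: treated as boilerplate
        kept.append((path, text))
        chars_by_path[path] += len(text)

    if not kept:
        return ""
    best_path = max(chars_by_path, key=chars_by_path.get)
    return "\n".join(t for p, t in kept if p == best_path)


# Hypothetical pages from the same collection channel.
page_a = ("<html><body><div><a>Home</a><a>Login</a></div>"
          "<div><p>First article body text.</p><p>More of the first article.</p></div>"
          "<div><span>Copyright</span></div></body></html>")
page_b = ("<html><body><div><a>Home</a><a>Login</a></div>"
          "<div><p>Second article body text.</p></div>"
          "<div><span>Copyright</span></div></body></html>")
result = extract_main_text([page_a, page_b], 0)
```

Because the selection depends on tag paths and cross-page repetition rather than on hard-coded tag or style names, a sketch like this does not need to change when a site renames its CSS classes, which is the motivation stated in the abstract.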

Incidence of Online Public Opinion on Guangzhou Simultaneous Renting and Purchasing Policy - A data mining application

  • Wang, Yancheng;Li, Haixian
    • Asian Journal for Public Opinion Research / v.5 no.4 / pp.266-284 / 2018
  • This paper adopts a big data research method and draws 491 data items from the Tianya Forum about the Simultaneous Renting and Purchasing policy of Guangzhou. The qualitative analysis software NVivo 11 is used to cluster the main questions about the policy raised in the forum, and 36 high-frequency words are obtained through text clustering. Through grounded theory analysis, the main driving factors behind people's doubts are summarized into 9 main categories and 3 core categories, and a model of the driving factors in online forums is established. The study finds that resource factors are the most critical factor, economic factors are important drivers, and policy guidance factors are secondary drivers.

N-gram Adaptation Using Information Retrieval and Dynamic Interpolation Coefficient (정보검색 기법과 동적 보간 계수를 이용한 N-gram 언어모델의 적응)

  • Choi Joon Ki;Oh Yung-Hwan
    • MALSORI / no.56 / pp.207-223 / 2005
  • The goal of language model adaptation is to improve the background language model with a relatively small adaptation corpus. This study presents a language model adaptation technique for the case where no additional text data for adaptation exist. We propose using an information retrieval (IR) technique with N-gram language modeling to collect the adaptation corpus from the baseline text data. We also propose using a dynamic language model interpolation coefficient to combine the background language model and the adapted language model. The interpolation coefficient is estimated from the word hypotheses obtained by segmenting the input speech data reserved as held-out validation data. This allows the final adapted model to improve on the performance of the background model consistently. The proposed approach reduces the word error rate by 13.6% relative to the baseline 4-gram model for two-hour broadcast news speech recognition.
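
Combining a background and an adapted language model with an interpolation coefficient is commonly written as P(w|h) = λ·P_adapted(w|h) + (1 − λ)·P_background(w|h). The sketch below estimates λ with the standard EM update on held-out words. This is a generic technique swapped in for illustration, not the paper's specific estimation procedure from word hypotheses; the function names and probability values are hypothetical.

```python
def interpolate(p_adapted, p_background, lam):
    """Linear interpolation of two language model probabilities for one word."""
    return lam * p_adapted + (1.0 - lam) * p_background


def estimate_lambda(p_adapted, p_background, iterations=20, lam=0.5):
    """Estimate the interpolation coefficient by EM on held-out words.

    p_adapted[i] and p_background[i] are the probabilities each model assigns
    to the i-th held-out word given its history. Assumes all values are > 0.
    """
    for _ in range(iterations):
        # E-step: posterior that each word was generated by the adapted model.
        posteriors = [
            lam * pa / (lam * pa + (1.0 - lam) * pb)
            for pa, pb in zip(p_adapted, p_background)
        ]
        # M-step: the new coefficient is the average posterior.
        lam = sum(posteriors) / len(posteriors)
    return lam


# Hypothetical held-out probabilities where the adapted model fits better.
held_out_adapted = [0.2, 0.3, 0.1]
held_out_background = [0.05, 0.1, 0.02]
lam = estimate_lambda(held_out_adapted, held_out_background)
```

Re-estimating λ for each new segment of held-out data is one way a coefficient can be made dynamic; how the paper derives it from recognizer hypotheses is not detailed in the abstract.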


Recommended Chocolate Applications Based On The Propensity To Consume Dining outside Using Big Data On Social Networks

  • Lee, Tae-gyeong;Moon, Seok-jae;Ryu, Gihwan
    • International Journal of Advanced Culture Technology / v.8 no.3 / pp.325-333 / 2020
  • In the past, eating out was usually just for the purpose of eating. Recently, however, it has expanded into a restaurant culture market. In particular, a dessert culture is becoming established in which people can talk and enjoy themselves. Consumers differ in what drives them to buy chocolate, such as health, taste, and atmosphere. It is therefore timely to recommend chocolate according to consumers' propensity to eat out. In this paper, we propose a chocolate recommendation application based on the propensity to eat out, using data from social networks. To collect keyword-based chocolate information, Textom is used as a text mining big data analysis solution. Text mining analysis is performed, and related topics are extracted and modeled in order to shorten the time needed to recommend chocolate to users. In addition, the analysis of the propensity to eat out is based on prior research. Finally, the application is implemented as a hybrid app.

A Study on the Contour Machining of Text using CNC Laser Machine (CNC레이저 가공기를 이용한 활자체 가공에 관한 연구)

  • 구영회
    • Proceedings of the Korean Society of Machine Tool Engineers Conference / 1999.10a / pp.554-559 / 1999
  • The purpose of this study is the machining of text shapes using contour fitting data. The hardware of the system comprises a PC, a scanning system, and a CO2 laser machine. The process has four steps: (1) loading the text image from scanned shapes or 2D image files, (2) generating contour fitting data using lines, arcs, and cubic Bezier curves, (3) generating NC code from the contour fitting data, and (4) machining by the DNC system. A software package was developed that runs a micro CAM system for the CNC laser machine on the PC at low cost.
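
Step (3), generating NC code from contour fitting data, can be illustrated for a single cubic Bezier segment. This is a rough sketch under stated assumptions, not the paper's software: the curve is approximated by straight `G01` moves, the function names, segment count, and feed rate are hypothetical, and laser on/off commands are omitted because they are machine-specific.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve with control points p0..p3 at parameter t."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return x, y


def bezier_to_nc(p0, p1, p2, p3, segments=8, feed=300):
    """Approximate the curve with line segments and emit NC code lines.

    G00 is a rapid move to the start point; G01 is a linear cutting move.
    """
    lines = [f"G00 X{p0[0]:.3f} Y{p0[1]:.3f}"]
    for i in range(1, segments + 1):
        x, y = cubic_bezier(p0, p1, p2, p3, i / segments)
        lines.append(f"G01 X{x:.3f} Y{y:.3f} F{feed}")
    return lines


# Hypothetical control points for one contour segment (units: mm).
program = bezier_to_nc((0, 0), (0, 10), (10, 10), (10, 0), segments=4)
```

More segments give a smoother contour at the cost of a longer program. A real CAM system would typically choose the segment count from a tolerance, and would output arcs where the fit allows.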


Implementation of Annotation and Thesaurus for Remote Sensing

  • Chae, Gee-Ju;Yun, Young-Bo;Park, Jong-Hyun
    • Proceedings of the KSRS Conference / 2003.11a / pp.222-224 / 2003
  • Many users want to add their own information to data on the web or on a computer without actually modifying the data. In remote sensing, the result data of image classification generally consist of an image and a text file. To overcome this inconvenience, we suggest an annotation method using XML. We present an efficient annotation method that can be applied to the web and to the viewing of image classification results, so that annotations can be attached to web content and to classification results consisting of image and text files. Thesaurus construction is needed because information on remote sensing and GIS is lacking in search engines such as Empas, Naver, and Google. In a search engine, we cannot search simultaneously for information on a term that has many different names. We select remote sensing data from different sources and establish relations among the many terms. For this process, we analyze the meanings of different terms that have similar meanings.
