• Title/Summary/Keyword: Related Keywords

A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns (인터넷 검색기록 분석을 통한 쇼핑의도 포함 키워드 자동 추출 기법)

  • Kim, Mingyu;Kim, Namgyu;Jung, Inhwan
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.123-136 / 2014
  • Recently, online shopping has further developed as the use of the Internet and a variety of smart mobile devices becomes more prevalent. The increase in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, competition among online retailers is increasingly fierce, and many Internet shopping malls are making significant attempts to attract online users to their sites. One such attempt is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they enter a specific keyword on an Internet portal site. The price of each keyword is generally estimated from the keyword's frequency of appearance. However, it is widely accepted that the price of keywords cannot be based solely on their frequency, because many keywords appear frequently but have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on some keywords simply because people frequently use them. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords. Further, the demand for automating this extraction process is increasing because of the drive to improve online sales performance. In this study, we propose a methodology that can automatically extract only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as a keyword that is used directly before shopping behaviors. In other words, only search keywords whose search results page leads to shopping-related pages are extracted from the entire set of search keywords. A comparison is then made between the extracted keywords' rankings and the rankings of the entire set of search keywords. Two types of data are used in our study's experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. The experimental data came from a website-ranking service and from the largest portal site in Korea. The original sample dataset contains 150 million transaction logs. First, portal sites are selected, and the search keywords used on those sites are extracted. Search keywords can be easily extracted by simple parsing. The extracted keywords are ranked according to their frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal site, from which a total of 344,822 search keywords were extracted. Next, using the web browsing history and site information, the shopping-related keywords were taken from the entire set of search keywords, yielding 4,709 shopping-related keywords. For performance evaluation, we compared the hit ratios of all the search keywords with those of the shopping-related keywords. To achieve this, we extracted 80,298 search keywords from several Internet shopping malls and then chose the top 1,000 keywords as a set of true shopping keywords. We measured the precision, recall, and F-score of the entire set of keywords and of the shopping-related keywords. The F-score is the harmonic mean of precision and recall. The precision, recall, and F-score of the shopping-related keywords derived by the proposed methodology were higher than those of the entire set of keywords. This study proposes a scheme that obtains shopping-related keywords in a relatively simple manner: shopping-related keywords can be extracted simply by examining transactions whose next visit is a shopping mall. The resultant shopping-related keyword set is expected to be a useful asset for many shopping malls that participate in keyword marketing. Moreover, the proposed methodology can be easily applied to the construction of keyword sets for other special areas as well as shopping.
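The evaluation described in this abstract (precision, recall, and their harmonic mean against a reference set of true shopping keywords) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `precision_recall_f1` and all keyword sets below are invented toy data, not figures from the study.

```python
def precision_recall_f1(candidates, truth):
    """Return (precision, recall, F-score) of a candidate keyword set
    measured against a reference set of true keywords."""
    candidates, truth = set(candidates), set(truth)
    hits = len(candidates & truth)  # keywords found in both sets
    precision = hits / len(candidates) if candidates else 0.0
    recall = hits / len(truth) if truth else 0.0
    # The F-score is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy data (hypothetical): a reference set, the full keyword set,
# and the subset kept by a shopping-intent filter.
truth = {"sneakers", "laptop", "jacket", "headphones"}
all_keywords = {"weather", "news", "sneakers", "laptop",
                "lottery", "map", "jacket", "bus"}
shopping_keywords = {"sneakers", "laptop", "jacket", "coupon"}

all_scores = precision_recall_f1(all_keywords, truth)
shopping_scores = precision_recall_f1(shopping_keywords, truth)
print(all_scores)
print(shopping_scores)
```

In this toy case the filtered set keeps the same hits while discarding unrelated keywords, so precision rises without lowering recall, which is the kind of effect the abstract reports.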

A Study on the Keyword Extraction for ESG Controversies Through Association Rule Mining (연관규칙 분석을 통한 ESG 우려사안 키워드 도출에 관한 연구)

  • Ahn, Tae Wook;Lee, Hee Seung;Yi, June Suh
    • The Journal of Information Systems / v.30 no.1 / pp.123-149 / 2021
  • Purpose: This study defines the anti-ESG activities of companies as recognized by the media, reflecting the recent attention to ESG, and extracts keywords for ESG controversies through association rule mining. Design/methodology/approach: The research framework for extracting keywords for ESG controversies is as follows: 1) From the DeepSearch DB, we collect 23,837 articles on anti-ESG activities, published by 130 media outlets from 2013 to 2018, concerning 294 listed companies with ESG ratings. 2) We set keywords related to environment, social, and governance issues, and delete or merge them with other keywords based on the support, confidence, and lift derived from association rule mining. 3) We illustrate the importance of keywords and the relevance between keywords through density, degree centrality, and closeness centrality in a network analysis. Findings: We identify a total of 26 keywords for ESG controversies. 'Gapjil' records the highest frequency, followed by 'corruption', 'bribery', and 'collusion'. Of the 26 keywords, 16 are related to governance, 8 to social issues, and 2 to the environment. The highly ranked keywords are mostly related to the responsibility of shareholders within corporate governance. ESG controversies associated with social issues are often related to unfair trade. The confidence analysis shows that keywords related to social and governance issues form clusters, and the probability of co-occurrence between keywords is high within each group. In particular, "owner's arrest" is associated with "bribery" and "misappropriation" at an 80% confidence level. The network analysis shows that 'corruption' is located at the center, is the most likely to occur alone, and is highly related to 'breach of duty', 'embezzlement', and 'bribery'.
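The three association-rule measures named in this abstract (support, confidence, and lift) have standard definitions, which the sketch below computes for a single rule. The function name `rule_metrics` and the five toy "articles" are hypothetical and only illustrate the arithmetic; they are not data from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute (support, confidence, lift) for the rule
    antecedent -> consequent over a list of keyword sets."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # contain antecedent
    n_c = sum(1 for t in transactions if c <= t)         # contain consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain both
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    # Lift compares the confidence with the baseline rate of the consequent.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Toy data (hypothetical): each set holds the keywords found in one article.
articles = [
    {"bribery", "owner's arrest"},
    {"bribery", "misappropriation", "owner's arrest"},
    {"bribery"},
    {"collusion"},
    {"bribery", "collusion", "owner's arrest"},
]

support, confidence, lift = rule_metrics(articles, {"bribery"}, {"owner's arrest"})
print(support, confidence, lift)
```

A lift above 1 means the two keywords co-occur more often than they would if they were independent, which is one plausible basis for merging or keeping keywords as the abstract describes.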

A Study on the General Public's Perceptions of Dental Fear Using Unstructured Big Data

  • Han-A Cho;Bo-Young Park
    • Journal of dental hygiene science / v.23 no.4 / pp.255-263 / 2023
  • Background: This study used text mining techniques to determine public perceptions of dental fear, extracted keywords related to dental fear, identified the connections between the keywords, and categorized and visualized perceptions related to dental fear. Methods: Keywords in texts posted on Internet portal sites (NAVER and Google) between 1 January 2000 and 31 December 2022 were collected. Four stages of analysis were used to explore the keywords: frequency analysis, term frequency-inverse document frequency (TF-IDF), centrality analysis and co-occurrence analysis, and convergent correlations. Results: In the top ten keywords based on frequency analysis, the most frequently used keyword was 'treatment,' followed by 'fear,' 'dental implant,' 'conscious sedation,' 'pain,' 'dental fear,' 'comfort,' 'taking medication,' 'experience,' and 'tooth.' In the TF-IDF analysis, the top three keywords were dental implant, conscious sedation, and dental fear. The co-occurrence analysis, which explores keywords that appear together, showed that 'fear and treatment' and 'treatment and pain' appeared the most frequently. Conclusion: Texts collected as unstructured big data were analyzed to identify general perceptions related to dental fear, and this study is valuable as source data for understanding public perceptions of dental fear by grouping associated keywords. The results will help in understanding dental fear and may be used in future work on factors affecting oral health.
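Two of the analysis stages mentioned in this abstract, TF-IDF and co-occurrence counting, can be sketched as below. The abstract does not state which TF-IDF variant was used, so this sketch assumes relative term frequency multiplied by the natural log of (number of documents / document frequency); the function names and the toy posts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.
    Assumed variant: tf = count / doc length, idf = ln(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return scores


def co_occurrence(docs):
    """Count how many documents contain each unordered keyword pair."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


# Toy data (hypothetical): each list is the keyword sequence of one post.
docs = [
    ["fear", "treatment", "pain"],
    ["treatment", "implant"],
    ["treatment", "sedation", "fear"],
]

scores = tf_idf(docs)
pairs = co_occurrence(docs)
print(scores[1])
print(pairs.most_common(2))
```

Under this variant a keyword that appears in every document, such as 'treatment' here, gets a TF-IDF score of zero, which illustrates how a frequency ranking and a TF-IDF ranking can differ.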

Design and Implementation of Potential Advertisement Keyword Extraction System Using SNS (SNS를 이용한 잠재적 광고 키워드 추출 시스템 설계 및 구현)

  • Seo, Hyun-Gon;Park, Hee-Wan
    • Journal of the Korea Convergence Society / v.9 no.7 / pp.17-24 / 2018
  • One of the major issues in big data processing is extracting keywords from the Internet and using them to process the necessary information. Most proposed keyword extraction algorithms extract keywords using the search function of a large portal site. In addition, these methods extract keywords from already posted or created documents or from fixed content. In this paper, we propose KAES (Keyword Advertisement Extraction System), which supports potential shopping keyword marketing by extracting issue keywords and related keywords from dynamic instant messages, such as the various issues, interests, and comments posted on SNS. KAES builds a list of specific accounts and extracts the keywords and related keywords that appear most frequently on SNS.

Occupational Health Could be the New Normal Challenge in the Trade and Health Cycle: Keywords Analysis Between 1990 and 2020

  • Kiran, Sibel
    • Safety and Health at Work / v.12 no.2 / pp.272-276 / 2021
  • This brief report aims to establish the keyword content of studies on occupational health and safety, the key framework of the world of work, in the trade and health domain. Data were collected from the SCOPUS database, focusing on articles on occupational health and safety and related keywords, with an emphasis on abstracts and titles. Data were analyzed and summarized based on keywords included from the MeSH database. There were 24,499 manuscripts in the domain and 1,346 (5.40%) occupational health-related keywords, including those that overlapped. The most frequently referenced occupational health-related keyword was "occupational health" (452 articles), followed by "occupational safety" (141 articles). There were fewer keywords on occupational health in the trade and health literature. As the world of work has been prioritized because of the new normal of work life since the COVID-19 pandemic, examining the focus of occupational health priorities from a global perspective is crucial.

An Analysis of Domestic Research Trend on Research Data Using Keyword Network Analysis (키워드 네트워크 분석을 이용한 연구데이터 관련 국내 연구 동향 분석)

  • Sangwoo Han
    • Journal of Korean Library and Information Science Society / v.54 no.4 / pp.393-414 / 2023
  • The goal of this study is to investigate domestic research trends in the study of research data. To this end, articles on research data were collected from RISS. After data cleansing, 134 author keywords were extracted from a total of 58 articles, and keyword network analysis was performed. The results are as follows. First, only 58 studies related to research data have been published in Korea, so many more related studies need to be conducted in the future. Second, most research related to research data was concentrated in library and information science, within the interdisciplinary studies category. Third, frequency analysis of the author keywords showed that 'research data management', 'research data sharing', 'data repository', and 'open science' were the most frequent keywords, indicating that research on research data focuses on these topics. The keyword network analysis also showed that high-frequency keywords occupy central positions in degree centrality and betweenness centrality and serve as core keywords in related studies. These results allowed us to identify recent trends in research on research data and areas that require intensive research in the future.
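As a rough illustration of the keyword network analysis described in this abstract, the sketch below links author keywords that appear in the same article and computes normalized degree centrality, that is, the number of distinct neighbors divided by (number of nodes - 1). Only degree centrality is shown; betweenness centrality is omitted. The function name and the toy article keyword lists are hypothetical, not data from the study.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(keyword_lists):
    """Build an undirected co-occurrence network from per-article
    keyword lists and return {keyword: normalized degree}."""
    neighbors = defaultdict(set)
    for keywords in keyword_lists:
        unique = set(keywords)
        for keyword in unique:
            neighbors[keyword]  # register the node even if it has no links
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n <= 1:
        return {keyword: 0.0 for keyword in neighbors}
    return {keyword: len(linked) / (n - 1)
            for keyword, linked in neighbors.items()}


# Toy data (hypothetical): author keywords of three articles.
articles = [
    ["research data management", "data repository"],
    ["research data management", "open science"],
    ["research data management", "research data sharing", "open science"],
]

centrality = degree_centrality(articles)
for keyword, value in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {value:.2f}")
```

In this toy network the most frequent keyword is also linked to every other keyword, mirroring the abstract's observation that high-frequency keywords tend to occupy central positions.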

A Study on Social Perceptions of Public Libraries Utilizing the sentiment analysis

  • Noh, Younghee;Kim, Dongseok
    • International Journal of Knowledge Content Development & Technology / v.12 no.4 / pp.41-65 / 2022
  • This study seeks to understand society's overall perception of public libraries by analyzing texts related to public libraries using semantic network analysis and sentiment analysis. For this purpose, data containing the keywords 'Library' and 'Lifelong Learning Center' were collected from the blogs and cafés of major domestic portal sites for the five years from January 1, 2016 through November 30, 2020. With the collected data, text mining, keyword centrality, network structure, structural equivalence, and sentiment analyses were conducted. The analysis found the following. First, 'reading' and 'book' were identified as representative keywords that form the social perception of public libraries. Second, keywords related to library use and to untact (contactless) services appeared owing to the recent spread of COVID-19. Third, the keywords with positive meanings suggest that, to develop public libraries, it is necessary to create continuing services that can form a new image of the library, breaking away from its existing fixed role and image, and to increase convenience of use. Fourth, facilities for library services were perceived from a neutral point of view. Fifth, the spread of infectious diseases, social distancing, and the temporary closure and shutdown of libraries were negatively related to public libraries, and awareness of librarians was also identified among the negative keywords.

A Study on the Research Trend of Elementary Environmental Education through an Analysis of the Network of Author Keywords (저자 키워드 네트워크 분석을 통한 초등 환경교육의 연구 동향 탐색)

  • Kim, Dong-Ryeul
    • Journal of Korean Elementary Science Education / v.36 no.2 / pp.113-128 / 2017
  • This study aims to investigate the research trend of elementary environmental education. To this end, author keywords were extracted from a total of 197 academic theses related to elementary environmental education, covering two periods in which detailed goals were applied to the 2007 and 2009 revised curriculums respectively, and the network of author keywords was analyzed. The results can be summarized as follows. First, the frequency analysis of author keywords found 369 author keywords in the period when detailed goals were applied to the 2009 revised curriculum. Of these, 'climate change education' showed the highest frequency, followed by 'environmental literacy' and 'environmental perception', excluding such central keywords as 'environmental education' and 'elementary school student'. In the period when detailed goals were applied to the 2007 revised curriculum, a total of 394 author keywords were found, and 'environmental literacy' showed the highest frequency, followed by 'environmental perception' and 'ESD (education for sustainable development)'. Second, the network analysis of author keywords found that the total number of network connections, average connection degree, density, and number of cliques were somewhat higher in the 2007 revised curriculum period than in the 2009 revised curriculum period. The centrality analysis found that, in both periods, 'environmental perception' and 'environmental literacy' were high in degree centrality and betweenness centrality, excluding such central keywords as 'environmental education' and 'elementary school student'. The analysis of components of author keywords as sub-networks found 9 components in the 2009 revised curriculum period and 6 components in the 2007 revised curriculum period. In both periods, the largest component was composed of keywords high in degree centrality and betweenness centrality.

Structuring Risk Factors of Industrial Incidents Using Natural Language Process (자연어 처리 기법을 활용한 산업재해 위험요인 구조화)

  • Kang, Sungsik;Chang, Seong Rok;Lee, Jongbin;Suh, Yongyoon
    • Journal of the Korean Society of Safety / v.36 no.1 / pp.56-63 / 2021
  • The narrative texts of industrial accident reports help to identify accident risk factors. They relate the accident triggers to the sequence of events and the outcomes of an accident. In particular, a set of related keywords in the context of the narrative can represent how the accident proceeded. Previous studies on text analytics for structuring accident reports have been limited to extracting individual keywords without context. To remedy this shortcoming, we propose a context-based analysis using a Natural Language Processing (NLP) algorithm. This study applies Word2Vec, an NLP algorithm, to extract adjacent keywords; this technique, known as word embedding, is carried out by a neural network algorithm based on supervised learning. During processing, Word2Vec takes adjacent keywords in the narrative texts as inputs for its supervised learning; keyword weights emerge as vectors representing the degree of neighboring among keywords. Similar keyword weights mean that the keywords are closely arranged within sentences in the narrative text. Consequently, a set of keywords that have similar weights represents similar accidents. We extracted ten accident processes containing related keywords and used them to understand the risk factors determining how an accident proceeds. This information helps identify how a checklist for an accident report should be structured.

An Analysis of the Experience of Users of National Ecological and Cultural Exploration Routes Using Big Data - A Focus on the Buan Masil Road and Gunsan Gubul Road - (빅데이터를 활용한 국가생태문화탐방로 이용자의 경험분석 - 부안 마실길과 군산 구불길을 대상으로 -)

  • Lee, Hyun-Jung;An, Byung-Chul
    • Journal of the Korean Society of Environmental Restoration Technology / v.23 no.6 / pp.151-166 / 2020
  • Various experience keywords were derived through text mining analysis of two National Ecological and Cultural Exploration Routes. The interaction between the experience keywords was analyzed using the degree centrality, closeness centrality, and betweenness centrality values calculated through centrality analysis of the experience keywords for each research site. The results are as follows. First, in the text mining analysis, 'walking' appeared as the top keyword in periods I, II, and III for both target areas. For Masil Road, keywords related to accommodation types, "rental cottage" and "recreational forest", were derived, whereas no accommodation-related keywords were derived for Gubul Road. Second, in the centrality analysis, degree centrality was high for the keywords "walking", "sea", "look", and "salt flats" on Masil Road and for "walking", "lake", and "park" on Gubul Road. The keywords located at the center are "walking" and "sea" for Masil Road and "walking" for Gubul Road. The influential keyword is "experience" for Masil Road and "history" for Gubul Road. Third, the top keywords for Masil Road are derived from keywords related to courses 1 to 8, which suggests that visitors use the trails of courses 1 to 8 evenly. For Gubul Road, however, only a few courses appear among the top keywords, which suggests that three courses, Gubul Road 6, 6-1, and 8, are visited most intensively.