• Title/Abstract/Keywords: Document Frequency

298 search results (processing time: 0.028 seconds)

Research on Brand Value Dimensions of Employers: Based on Online Reviews by the Employees

  • XU, Meng
    • The Journal of Asian Finance, Economics and Business / Vol. 9, No. 10 / pp.215-225 / 2022
  • This study investigates employees' online reviews, conducts in-depth text topic mining, summarizes the dimensions of employer brand value, and seeks effective ways to build employer brands from a multi-dimensional perspective. It takes samples of employer reviews, filters keywords by term frequency-inverse document frequency (TF-IDF), builds a network of reviews that share keywords, detects communities in that network, and summarizes the theme dimensions. It also makes a dynamic comparison and analysis of the employer brand value dimensions across industries and enterprises. The study shows that the themes found through community detection can be summarized into 11 dimensions of employer brand value, and that these dimensions differ significantly across industries and among enterprises within an industry. Attention to the employer brand value dimensions also changes significantly over time. Various industries pay increasing attention to the dimensions of work intensity and career development, while employers pay steady attention to the dimension of welfare benefits. The findings suggest that seeking heterogeneity in employer brand resources from these multi-dimensional differences and changes is an effective way to improve the competitiveness of enterprises in the human capital market.
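
The keyword filtering and review-network steps described in this abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy reviews, the `top_k` cutoff, the smoothed IDF formula, and the helper names `tfidf_keywords` and `shared_keyword_edges` are all hypothetical, and the community detection step is left out.

```python
import math
from collections import Counter
from itertools import combinations

def tfidf_keywords(docs, top_k=2):
    """Return the top_k highest TF-IDF terms for each tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            # Smoothed IDF (an assumption; many TF-IDF variants exist).
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(set(ranked[:top_k]))
    return keywords

def shared_keyword_edges(keywords):
    """Link two reviews whenever their keyword sets overlap."""
    return [
        (i, j)
        for i, j in combinations(range(len(keywords)), 2)
        if keywords[i] & keywords[j]
    ]

# Hypothetical toy reviews, already tokenized.
reviews = [
    "overtime overtime workload salary".split(),
    "overtime workload promotion".split(),
    "salary bonus bonus welfare".split(),
]
keywords = tfidf_keywords(reviews)
edges = shared_keyword_edges(keywords)
```

The resulting edge list can be handed to any graph library for community detection; which algorithm the study used is not stated in the abstract.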

Comparison of Topics Related to Nurse on the Internet Portals and Social Media Before and During the COVID-19 era Using Topic Modeling

  • 윤영미;김성광;김혜경;김은주;정윤의
    • 근관절건강학회지 / Vol. 27, No. 3 / pp.255-267 / 2020
  • Purpose: The purpose of this study is to compare topics, through keywords related to nurses on internet portals and social media, before and during the coronavirus disease (COVID-19) era. Methods: "Nurse" was searched on the internet for the six months before and the six months during the outbreak of COVID-19 in Korea. For data collection, we implemented web crawlers in programming languages such as Python and collected keywords. The collected keywords were classified into three domains through topic modeling. Results: Occurrences of the keyword "nurse" increased by 15% during the COVID-19 era. Before COVID-19, the keywords with high Term Frequency - Inverse Document Frequency (TF-IDF) values included "nurse" and "C-section". During COVID-19, however, they included not only "nurse" but also pandemic-related keywords such as "emergency" and "gown". Conclusion: Various topics were being uploaded to internet media. Nursing professionals should take an interest in the text that appears in internet media and try to continuously identify and improve problems.

LSTM Android Malicious Behavior Analysis Based on Feature Weighting

  • Yang, Qing;Wang, Xiaoliang;Zheng, Jing;Ge, Wenqi;Bai, Ming;Jiang, Frank
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 6 / pp.2188-2203 / 2021
  • With the rapid development of the mobile Internet, smartphones have become widespread, and the Android platform dominates among them. Because Android is open source, malware on the platform is rampant. To improve the efficiency of malware detection, this paper proposes a deep learning Android malware detection system based on behavior features. First, the detection system uses static analysis to extract different types of behavior features from Android applications, then applies the Term Frequency-Inverse Document Frequency algorithm to each extracted behavior feature to identify sensitive behavior features, and constructs detection features through a unified abstract expression. Second, a Long Short-Term Memory neural network model is established to select and learn from the extracted attributes; the learned attributes are used to detect malicious Android applications and to analyze and further optimize the application behavior parameters, yielding a deep learning Android malware detection method based on feature analysis. We use different types of features to evaluate our method and compare it with various machine learning-based methods. The study shows that it outperforms most existing machine learning-based approaches and detects 95.31% of the malware.

The Impact of Transforming Unstructured Data into Structured Data on a Churn Prediction Model for Loan Customers

  • Jung, Hoon;Lee, Bong Gyou
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 12 / pp.4706-4724 / 2020
  • Along with various structured data, such as company size, loan balance, and savings accounts, this study analyzed the voice of customer (VOC), which is text data containing contact history and counseling details. To analyze the unstructured data, term frequency-inverse document frequency (TF-IDF) analysis, semantic network analysis, sentiment analysis, and a convolutional neural network (CNN) were implemented. A performance comparison of the models revealed that the predictive model using the CNN had the best predictive power, followed by the model using TF-IDF, and then the model using semantic network analysis. In particular, a character-level CNN and a word-level CNN were developed separately, and the character-level CNN performed better in an analysis of Korean-language text. Moreover, a systematic selection model for optimal text mining techniques was proposed, suggesting which analytical technique is appropriate for analyzing text data depending on the context. This study also provides evidence that the finding of previous studies, that individual customers leave when their loyalty and switching cost are low, also applies to corporate customers, and it suggests that VOC data indicating customers' needs are very effective for predicting their behavior.

Analysis of speech in game marketing video using text mining techniques

  • 이여경;김재직
    • 응용통계연구 / Vol. 35, No. 1 / pp.147-159 / 2022
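  • Today a variety of social media platforms are widespread, and people use them closely in their daily lives. As a result, influencers with large numbers of subscribers, views, and comments have come to wield great influence in society. Following this trend, many companies actively use influencers for marketing purposes to promote sales of their products and services. In this study, we extract influencers' speech from game marketing videos, transcribe it into text, and analyze it in an exploratory manner using text mining techniques. In the analysis, we distinguish successful marketing videos from unsuccessful ones and compare the linguistic characteristics of the influencers in the two groups.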

Anti-Jamming and Time Delay Performance Analysis of Future SATURN Upgraded Military Aerial Communication Tactical Systems

  • Yang, Taeho;Lee, Kwangyull;Han, Chulhee;An, Kyeongsoo;Jang, Indong;Ahn, Seungbeom
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 9 / pp.3029-3042 / 2022
  • For over half a century, the United States (US) and its coalition military aircraft have used Ultra High Frequency (UHF) band analog modulation (AM) radios for ground-to-air and short-range air-to-air communications. Evolving from this, since 2007 the US military and the North Atlantic Treaty Organization (NATO) have adopted HAVE QUICK for use by almost all aircraft, because it had been shown that earlier aircraft communication signals could be intercepted and jammed, which posed a serious threat to defense systems. The second-generation Anti-jam Tactical UHF Radio for NATO (SATURN) was developed to replace HAVE QUICK systems by 2023. The NATO Standardization Agreement (STANAG) 4372 is a classified document that defines the SATURN technical and operational specifications. In preparation for this future upgrade to SATURN systems, this paper reviews the SATURN technical and operational specifications and describes the network synchronization, frequency hopping, and communication setup parameters that are controlled by the Network (NET) Time, Time Of Day (TOD), Word Of Day (WOD), and Multiple Word of Day (MWOD), in addition to the basic features of SATURN Edition 3 (ED3) and the future Edition 4 (ED4). In addition, an anti-jamming performance analysis (with reference to partial band jamming and pulse jamming) and a time delay queueing model analysis are conducted based on an assumed model of a SATURN transmitter and receiver.

Retrieval methodology for similar NPP LCO cases based on domain specific NLP

  • No Kyu Seong;Jae Hee Lee;Jong Beom Lee;Poong Hyun Seong
    • Nuclear Engineering and Technology / Vol. 55, No. 2 / pp.421-431 / 2023
  • Nuclear power plants (NPPs) have technical specifications (Tech Specs) to ensure that the equipment and key operating parameters necessary for the safe operation of the plant are maintained within limiting conditions for operation (LCO) determined by a safety analysis. The LCO in the Tech Specs, which identify the lowest functional capability of equipment required for the safe operation of a facility, must be complied with for the safe operation of an NPP. Previous studies have used rule-based expert systems to aid compliance with LCO; however, expert systems have an obvious limit in implementing rules for the many situations related to LCO. Therefore, in this study, we present a retrieval methodology for similar LCO cases to help determine whether an LCO is met or not met. To reflect NPP-specific features in natural language processing, a domain dictionary was built, and the optimal term frequency-inverse document frequency variant was selected. Retrieval performance was improved by adding a Boolean retrieval model, based on terms related to the LCO, to the vector space model. The developed domain dictionary and retrieval methodology are expected to be exceedingly useful in determining whether an LCO is met.
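
The combination of a Boolean retrieval model with a vector space model mentioned in this abstract could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the toy case texts, the `required_terms` filter, the smoothed TF-IDF variant, and the function names `build_tfidf`, `cosine`, and `retrieve` do not come from the paper, which selected its own TF-IDF variant and domain dictionary.

```python
import math
from collections import Counter

def build_tfidf(docs):
    """Build sparse TF-IDF vectors (dicts) and the IDF table for tokenized docs."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; the variant chosen in the paper is not given in the abstract.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, required_terms):
    """Rank docs by cosine similarity after a Boolean AND filter."""
    vectors, idf = build_tfidf(docs)
    q_vec = {t: c * idf[t] for t, c in Counter(query).items() if t in idf}
    results = []
    for i, doc in enumerate(docs):
        # Boolean step: keep only cases that contain every required term.
        if not required_terms <= set(doc):
            continue
        results.append((i, cosine(q_vec, vectors[i])))
    return sorted(results, key=lambda r: -r[1])

# Hypothetical toy cases, already tokenized.
cases = [
    "pump a inoperable train restored".split(),
    "valve leak detected containment".split(),
    "pump b inoperable surveillance overdue".split(),
]
ranked = retrieve("pump a inoperable".split(), cases, required_terms={"pump"})
```

Here the Boolean step discards the valve case before any similarity is computed, and the vector space step orders the remaining cases by how closely they match the query.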

Comparison of Term-Weighting Schemes for Environmental Big Data Analysis

  • 김정진;정한석
    • 한국수자원학회:학술대회논문집 / 한국수자원학회 2021 Conference / pp.236-236 / 2021
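  • As the rate at which unstructured data such as text is generated has risen sharply in recent years, the need for techniques to analyze it is growing. Text mining is one technique that uses natural language processing to structure unstructured text and extract valuable information from documents. Text mining generally represents the importance of terms with a document-term frequency matrix, which records how often each term is used in each document, and this matrix is used in many research fields. However, the raw frequencies in a document-term frequency matrix make it hard to express how documents differ and how important each term is as a result, so it is essential to apply term weights to characterize documents. Various term-weighting methods have been developed and applied, but in the environmental field there has been little research evaluating how effective they are. In addition, for environmental issue analysis, it is relatively important to reflect the characteristics of each document according to its distribution over time, rather than simply identifying document characteristics and classifying the given documents. Therefore, this study used text mining on environmental news data for the Seoul area from 2015 to 2020 to compare term-weighting schemes suitable for environmental issue analysis. The schemes applied were TF-IDF (term frequency-inverse document frequency), BM25, TF-IGM (TF-inverse gravity moment), and TF-IDF-ICSDF (TF-IDF-inverse class space density frequency). Through this study, we aim to present an optimized term-weighting scheme for classifying environmental documents and entities, and to provide information on key terms extracted in relation to environmental issues in the Seoul area.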
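
To make the difference between two of the term-weighting schemes named in this abstract concrete, the sketch below scores the same toy documents with raw-count TF-IDF and with BM25. It is an illustration under assumptions, not the study's setup: the toy news snippets, the parameter values `k1=1.5` and `b=0.75`, the BM25 IDF variant with `+ 1` inside the logarithm, and the function names are all hypothetical. TF-IGM and TF-IDF-ICSDF need class labels and are not shown.

```python
import math
from collections import Counter

def tfidf_scores(query, docs):
    """Sum of raw term count times log(N / df) over the query terms."""
    n_docs = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return [
        sum(Counter(doc)[t] * math.log(n_docs / df[t]) for t in query if df[t])
        for doc in docs
    ]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """BM25 with term-frequency saturation (k1) and length normalization (b)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # IDF variant that stays non-negative (an assumption).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy news snippets, already tokenized.
news = [
    "fine dust warning fine dust seoul".split(),
    "river water quality seoul".split(),
    "fine dust".split(),
]
query = ["fine", "dust"]
tfidf = tfidf_scores(query, news)
bm25 = bm25_scores(query, news)
```

On this toy data the raw-count TF-IDF score favors the longer first snippet, because it repeats the query terms, while BM25 ranks the short third snippet highest because of its length normalization and frequency saturation.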

Convolutional Neural Network-based Malware Classification Method utilizing Local Feature-based Global Image

  • 장세준;성연식
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2020 Spring Conference / pp.222-223 / 2020
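  • Damage caused by malware has been increasing recently. Because the appropriate response differs by the type of malware, research on classifying malware by type is also important. Existing approaches classify malware by type using a global image of the malware generated through a malware visualization process. The global image is generated from binary information extracted from the malware. However, when only the global image is used to classify malware by type, classification accuracy drops because the features that are important for each type are not considered. This paper proposes a malware classification method that uses a local feature-based global image, in which the type-specific features of the malware are represented in its global image. First, the binary is extracted from the malware and used to generate a global image. Second, local features are extracted from the malware, and the key local features for each malware type are selected using the Term Frequency Inverse Document Frequency (TFIDF) algorithm. Third, the key features of each malware family are converted to pixels and applied to the generated global image. Fourth, a convolutional model is trained on the resulting local feature-based global images, and the trained model is used to classify malware by type.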

A Study on Social Issues for Hydrogen Industry Using News Big Data

  • 최일영;김혜경
    • 한국수소및신에너지학회논문집 / Vol. 33, No. 2 / pp.121-129 / 2022
  • With the advent of the post-2020 climate regime, the hydrogen industry is growing rapidly around the world. To build the hydrogen economy, it is important to identify social issues related to hydrogen and prepare countermeasures for them. Accordingly, this study conducted a semantic network analysis of hydrogen news from NAVER. The analysis shows that the number of hydrogen news articles in 2020 increased 4.5-fold compared with 2016, and that as of 2018 the hydrogen issue has shifted from an environmental aspect to an economic aspect. In addition, although the initially government-led hydrogen industry is expanding into privately led mobility fields such as fuel cell electric vehicles and hydrogen fuel, terms expressing safety concerns, such as explosions, continue to appear. Thus, it is necessary not only to expand the hydrogen ecosystem through the participation of private companies, but also to promote hydrogen safety.