• Title/Abstract/Keywords: research topic

비정형 Security Intelligence Report의 정형 정보 자동 추출 (An Automatically Extracting Formal Information from Unstructured Security Intelligence Report)

  • 허윤아;이찬희;김경민;조재춘;임희석
    • 디지털융복합연구 / Vol. 17, No. 11 / pp. 233-240 / 2019
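  • To predict and respond to cyber attacks, many security companies quickly identify the characteristics and types of attack techniques and distribute Security Intelligence Reports (SIRs) about them. However, the SIRs distributed by each company are voluminous and do not follow a common format. To reduce the time needed to extract information from large numbers of unstructured SIRs and to grasp their content efficiently, this paper proposes a framework that applies five analysis techniques to structure the SIRs and extract their key information. Because SIR data have no gold labels, four of the techniques rely on unsupervised learning: keyword extraction, topic modeling, document summarization, and similar-document retrieval. Finally, a dataset was built for extracting threat information from SIRs and applied to named entity recognition, which recognizes words belonging to the IP, Domain/URL, Hash, and Malware categories and determines which type each word belongs to. With this technique included, the framework comprises five analysis techniques in total.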
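The last step in the abstract above, recognizing indicator strings and assigning each one a type, can be illustrated with a minimal sketch. This is not the paper's method: the paper trains a named entity recognizer, whereas the sketch below swaps in plain regular expressions as a rule-based stand-in. The names `PATTERNS` and `tag_indicators` are hypothetical, and the Malware type is left out because malware family names cannot be matched by pattern alone.

```python
import re

# Hypothetical rule-based tagger: regular expressions stand in for the
# trained named entity recognizer described in the abstract.
PATTERNS = [
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    # MD5 (32), SHA-1 (40), or SHA-256 (64) hexadecimal digests.
    ("Hash", re.compile(r"\b(?:[a-fA-F0-9]{64}|[a-fA-F0-9]{40}|[a-fA-F0-9]{32})\b")),
    # A domain with an alphabetic top-level label, optional scheme and path.
    ("Domain/URL", re.compile(
        r"\b(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/\S*)?", re.IGNORECASE)),
]


def tag_indicators(text):
    """Return (matched_text, type) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(0), label))
    found.sort()  # order by position in the text
    return [(value, label) for _, value, label in found]


report = ("The dropper at http://evil.example.com/a.bin contacted "
          "192.168.0.1 and has MD5 d41d8cd98f00b204e9800998ecf8427e")
for value, label in tag_indicators(report):
    print(label, value)
```

A learned recognizer can use the surrounding context to handle obfuscated indicators and malware names, which fixed patterns like these cannot.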

미세먼지 관련 건강행위 강화를 위한 정책의 탐색적 연구: 미디어 정보의 토픽 및 의미연결망 분석을 활용하여 (An Exploratory Study on the Policy for Facilitating of Health Behaviors Related to Particulate Matter: Using Topic and Semantic Network Analysis of Media Text)

  • 변혜민;박유진;윤은경
    • 대한간호학회지 / Vol. 51, No. 1 / pp. 68-79 / 2021
  • Purpose: This study aimed to analyze the mass and social media contents and structures related to particulate matter before and after the policy enforcement of the comprehensive countermeasures for particulate matter, derive nursing implications, and provide a basis for designing health policies. Methods: After crawling online news articles and posts on social networking sites before and after policy enforcement with particulate matter as the keyword, we conducted topic and semantic network analysis using TEXTOM, R, and UCINET 6. Results: In topic analysis, 'behavior tips' was the common main topic in both media before and after the policy enforcement. After the policy enforcement, 'influence on health' disappeared from the main topics in mass media due to increased reports about reduction measures and government, whereas 'influence on health' appeared as a main topic in social media. However, semantic network analysis confirmed that social media had a larger number of nodes and links and lower centrality than mass media, leaving substantial information that was unstructured and not organically connected. Conclusion: An understanding of how particulate matter policy and its implications influence health, as well as of gaps in the needs and use of health information, should be integrated with leadership and support in nurses' care of vulnerable patients and public health promotion.

A Development Method of Framework for Collecting, Extracting, and Classifying Social Contents

  • Cho, Eun-Sook
    • 한국컴퓨터정보학회논문지 / Vol. 26, No. 1 / pp. 163-170 / 2021
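  • As big data is applied in diverse ways across many fields, the big data market is expanding from hardware into service software. In particular, it is growing into a large platform market that provides applications for holistic, intuitive visualization of big data, covering the ability to grasp and understand its meaning as well as the results of analysis. Within this market, demand for extracting and analyzing big data from social media such as SNS (Social Network Service) is very active, from companies down to individuals. However, despite this strong demand for collecting and analyzing social media data for user trend analysis and marketing, there has been little research on resolving the difficulty of dynamic integration caused by the heterogeneity of social media service interfaces, or the complexity of building and operating a software platform. This paper therefore presents a method for developing a framework that integrates and operates, as a single process, the steps from collecting social media data through extraction and classification. The proposed framework resolves the problem of heterogeneous social media data collection channels with the adapter pattern, and improves the accuracy of social topic extraction and classification through a semantic-association-based extraction technique and a topic-association-based classification technique.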
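The adapter pattern mentioned in the abstract above can be sketched as follows. This is only an illustration of the general pattern, not the paper's framework: every class and method name here (`Post`, `FakeMicroblogClient`, `FakeForumClient`, `MicroblogAdapter`, `ForumAdapter`, `collect`) is hypothetical, and the two "clients" are in-memory fakes standing in for social media services with incompatible interfaces.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Post:
    """Common record that every collection channel is converted into."""
    source: str
    author: str
    text: str


class SocialAdapter(Protocol):
    """Uniform interface the collector depends on."""
    def fetch(self, keyword: str) -> List[Post]: ...


# Two hypothetical service clients whose interfaces do not match.
class FakeMicroblogClient:
    def search_statuses(self, q):
        return [{"user": "kim", "body": f"{q} is trending"}]


class FakeForumClient:
    def find_threads(self, term):
        return [("lee", f"Thread about {term}")]


class MicroblogAdapter:
    def __init__(self, client):
        self.client = client

    def fetch(self, keyword):
        # Translate dict-shaped results into the common Post record.
        return [Post("microblog", s["user"], s["body"])
                for s in self.client.search_statuses(keyword)]


class ForumAdapter:
    def __init__(self, client):
        self.client = client

    def fetch(self, keyword):
        # Translate tuple-shaped results into the common Post record.
        return [Post("forum", author, title)
                for author, title in self.client.find_threads(keyword)]


def collect(adapters, keyword):
    """Gather posts from every channel through the shared interface."""
    posts = []
    for adapter in adapters:
        posts.extend(adapter.fetch(keyword))
    return posts
```

Under this design, supporting another channel means writing one more adapter, and the downstream extraction and classification code that consumes `Post` records stays unchanged.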

한국산업경영시스템학회지 연구 주제의 토픽모델링 분석 비교: 1978년~99년 논문을 중심으로 (Topic Modeling Analysis Comparison for Research Topic in Korean Society of Industrial and Systems Engineering: Concentrated on Research Papers from 1978~1999)

  • 박동준;오형술;김호균;윤민
    • 산업경영시스템학회지 / Vol. 44, No. 4 / pp. 113-127 / 2021
  • Topic modeling has been receiving much attention in academic disciplines in recent years. Topic modeling is one of the applications of machine learning and natural language processing. It is a statistical modeling procedure for discovering topics in a collection of documents. Recently, there have been many attempts to identify topics in diverse fields of academic research. Although the first Department of Industrial Engineering (I.E.) was established at Hanyang University in 1958, the Korean Institute of Industrial Engineers (KIIE), which is truly the most academic society, was first founded in 1974 to contribute to research in I.E. and promote industrial techniques. The Korean Society of Industrial and Systems Engineering (KSIE) was established four years later. However, the research topics of the KSIE journal have not been deeply examined until now. Using topic modeling algorithms, we cautiously aim to detect the research topics of the KSIE journal for the first half of the society's history, from 1978 to 1999. We used the titles and abstracts of research papers to identify topics in the KSIE journal by applying four algorithms: LSA, HDP, LDA, and LDA Mallet. The topic analysis results obtained by the algorithms were compared. We tried to show the whole procedure of topic analysis in detail for further practical use in the future. We employed visualization techniques using the analysis results obtained from LDA. As a result of thorough topic modeling analysis, eight major research topics were discovered: Production/Logistics/Inventory, Reliability, Quality, Probability/Statistics, Management Engineering/Industry, Engineering Economy, Human Factor/Safety/Computer/Information Technology, and Heuristics/Optimization.

LDA 를 이용한 '프랜차이즈 규제' 관련 뉴스기사 토픽모델링 (Topic Modeling of News Article Related to Franchise Regulation Using LDA)

  • 양우령;양회창
    • 한국프랜차이즈경영연구 / Vol. 13, No. 4 / pp. 1-12 / 2022
  • Purpose: In 2020, the franchise industry achieved significant growth compared to the previous year, as the number of franchise companies increased by 9.0% while the number of franchise brands increased by 12.5%. Despite this growth in size, the Korean franchise industry underwent many negative incidents, such as sales of franchise ownership to private equity funds, that led to the deterioration of businesses. From this point of view, this study aims to make various proposals to help policy makers develop franchise industry policies by analyzing trends in the current and previous presidential administrations' franchise policies and regulations using newspaper articles. Research design, data and methodology: A total of 7,439 articles registered in Naver API from February 25, 2013 to November 29, 2021 were extracted. Among them, 34 unrelated video articles were deleted, and a total of 7,405 articles from both administrations were used for analysis. The R package was used for word frequency analysis, word clouds, word correlation analysis, and LDA (Latent Dirichlet Allocation) topic modeling. Results: The keyword frequency analysis shows that the most frequently mentioned keywords during the previous administration include 'no-brand', 'major company', 'bill', 'business field', and 'SMEs', and those mentioned during the current administration include 'industry' and 'policy'. As a result of LDA topic modeling, 9 topics such as 'global startups' and 'job creation' were derived from the previous administration, and 10 topics such as 'franchise business' and 'distribution industry' from the current administration. The results of LDAvis showed that the previous administration operated a policy based on mutual growth of large and small businesses rather than hostile regulations in the franchise business, whereas the current administration extended the regulation related to franchise business to the employment sector. Conclusions: From the analysis of the past two administrations' franchise policies, it can be suggested that franchisors and franchisees may complement each other in developing the Fair Transactions in Franchise Business Act and achieving balanced growth. Moreover, political support is needed for the sound development of franchisors. Limitations and future research suggestions are presented at the end of this study.

국내 모바일 뱅킹 애플리케이션에 대한 이용자 중요도-만족도 분석(IPA): 구글 플레이스토어 리뷰 데이터를 활용하여 (Importance-Performance Analysis for Korea Mobile Banking Applications: Using Google Playstore Review Data)

  • 김소희;김무건;류민호
    • 한국산업정보학회논문지 / Vol. 27, No. 6 / pp. 115-126 / 2022
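  • This study applies text mining techniques to user review data for Korean mobile banking applications in order to conduct an importance-performance analysis (IPA) and derive improvement priorities. The analysis used user reviews from the Google Play Store for the mobile banking applications of Korean commercial banks (Kookmin Bank, Shinhan Bank, Woori Bank, Hana Bank), regional banks (Kyongnam Bank, Busan Bank), and internet banks (KakaoBank, K bank, Toss). Topic modeling, frequency analysis, and sentiment analysis were performed to derive the main attributes and to measure the importance and satisfaction of each attribute. The results show that 'authentication service', 'feature improvement', 'login', 'speed/connectivity', 'system/update', and 'banking service' are attributes of relatively high importance to users of mobile banking applications, yet satisfaction with them falls below the average level, making them the attributes in most urgent need of improvement.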

국내 농촌 이주민의 사회통합을 위한 국·내외 연구 동향 분석 - 계량서지학적 방법론을 중심으로 - (An Analysis of Internal and External Research Trend on the Issues of Rural Migrant's Social Integration - Focused on Bibliometric Method -)

  • 김두원;남진보
    • 한국농촌건축학회논문집 / Vol. 25, No. 1 / pp. 35-44 / 2023
  • This study aimed to understand changes in the drivers of recent research on rural areas and migrants, to draw out overarching issues, and to provide implications that contribute to migrants' social integration in Korean rural areas. As for the scope and method of the study, data from quantitative bibliographic analysis and research keywords by period were derived. To address this aim, the study employed bibliometric analysis, using network mapping analysis in VOSviewer and topic modeling analysis in Netminer. The findings revealed, first, that mental health issues in international research, and employment and discrimination in domestic research, both derived from migrant mobility, constituted the key issues; and second, that domestic and international research differed on two issues, health and violence, which Korea has seriously overlooked. This study therefore presents three implications: first, health- and violence-related provisions for migrants should be specified in domestic law; second, a domestically focused MIPEX index should be developed in which the two issues are weighted more heavily; and last, the newly emerging approach of 'inclusive formation of social psychological mechanisms' should be widely spread. In conclusion, delivering these implications can foster migrants' integration in rural areas, which will ultimately contribute to migrants' quality of life.

토픽 모델링을 활용한 광범위 선천성 대사이상 신생아 선별검사 관련 온라인 육아 커뮤니티 게시 글 분석: 계량적 내용분석 연구 (Analysis of online parenting community posts on expanded newborn screening for metabolic disorders using topic modeling: a quantitative content analysis)

  • 이명선;정현숙;김진선
    • 여성건강간호학회지 / Vol. 29, No. 1 / pp. 20-31 / 2023
  • Purpose: As more newborns have received expanded newborn screening (NBS) for metabolic disorders, the overall number of false-positive results has increased. The purpose of this study was to explore the psychological impacts experienced by mothers in relation to the NBS process. Methods: An online parenting community in Korea was selected, and questions regarding NBS were collected using web crawling for the period from October 2018 to August 2021. In total, 634 posts were analyzed. The collected unstructured text data were preprocessed, and keyword analysis, topic modeling, and visualization were performed. Results: Of 1,057 words extracted from posts, the top keyword based on 'term frequency-inverse document frequency' values was "hypothyroidism," followed by "discharge," "close examination," "thyroid-stimulating hormone levels," and "jaundice." The top keyword based on the simple frequency of appearance was "XXX hospital," followed by "close examination," "discharge," "breastfeeding," "hypothyroidism," and "professor." As a result of LDA topic modeling, posts related to inborn errors of metabolism (IEMs) were classified into four main themes: "confirmatory tests of IEMs," "mother and newborn with thyroid function problems," "retests of IEMs," and "feeding related to IEMs." Mothers experienced substantial frustration, stress, and anxiety when they received positive NBS results. Conclusion: The online parenting community played an important role for mothers of newborns in acquiring and sharing information, as well as in providing psychological support related to NBS. Nurses can use this study's findings to develop timely and evidence-based information for parents whose children receive positive NBS results, to reduce the negative psychological impact.
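The abstract above reports different top keywords under TF-IDF and under simple frequency. A minimal sketch of why the two rankings can diverge follows. The toy documents are invented for illustration and are not the study's data, and the weighting used here (raw count times log(N/df), summed over documents) is only one common TF-IDF variant; the study's exact formula is not stated. The function names `term_frequency` and `tfidf_scores` are hypothetical.

```python
import math
from collections import Counter


def term_frequency(docs):
    """Total number of occurrences of each term across all documents."""
    total = Counter()
    for doc in docs:
        total.update(doc)
    return total


def tfidf_scores(docs):
    """Sum over documents of count * log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            scores[term] += count * math.log(n_docs / doc_freq[term])
    return scores


# Invented toy posts, already tokenized.
docs = [
    ["hospital", "discharge", "jaundice"],
    ["hospital", "hypothyroidism", "hypothyroidism"],
    ["hospital", "discharge"],
]
print(term_frequency(docs).most_common(1))
print(tfidf_scores(docs).most_common(1))
```

In this toy corpus "hospital" appears in every document, so its inverse document frequency is log(1) = 0 and it drops out of the TF-IDF ranking even though it is the most frequent term.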

공공기관 기록물 분류체계 재정비를 위한 지능화 방안: L 기관 사례를 중심으로 (An Intelligent Approach for Reorganization Record Classification Schemes in Public Institutions: Case Study on L Institution)

  • 임진솔;한희정;오효정
    • 정보관리학회지 / Vol. 40, No. 2 / pp. 137-156 / 2023
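  • As social and political paradigms change, the functions and organizational structures of public institutions are continually created, merged, or abolished. From the standpoint of effective records management, it is necessary to reflect these changes and review whether the previously established record classification scheme fits the current business context. In most institutions, however, reorganizing the classification scheme is done manually and relies on the experience-based judgment of working-level staff or the institution's records managers, which makes it difficult to reflect organizational changes in a timely manner or to grasp the overall context in an integrated way. This study therefore proposes a method for reorganizing record classification schemes using automation and intelligent technologies, both to address these problems and to manage records more efficiently. The proposed methodology was applied to an actual public institution, and its effectiveness and limitations were verified through interviews about the results with the institution's practitioners in charge of functional classification. Through this, the study seeks to raise the accuracy and reliability of the reorganized record classification scheme and to help standardize records management.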

토픽모델링을 활용한 한국산업경영시스템학회지의 최근 연구주제 분석 (Recent Research Trend Analysis for the Journal of Society of Korea Industrial and Systems Engineering Using Topic Modeling)

  • 박동준;구평회;오형술;윤민
    • 산업경영시스템학회지 / Vol. 46, No. 3 / pp. 170-185 / 2023
  • The advent of big data has brought about the need for analytics. Natural language processing (NLP), a field of big data, has received a lot of attention. Among NLP techniques, topic modeling is widely applied to identify key topics in various academic journals. The Korean Society of Industrial and Systems Engineering (KSIE) has published academic journals since 1978. To enhance its status, it is imperative to recognize the diversity of research domains. We have already discovered eight major research topics for papers published by KSIE from 1978 to 1999. As a follow-up study, we aim to identify the major topics of research papers published in KSIE from 2000 to 2022. We performed topic modeling on 1,742 research papers from this period using LDA and BERTopic, which has recently attracted attention. BERTopic outperformed LDA by providing a set of coherent topic keywords that can effectively distinguish the 36 topics found in this study. In terms of visualization techniques, pyLDAvis presented better two-dimensional scatter plots for the intertopic distance map than BERTopic. However, BERTopic provided much more diverse visualization methods to explore the relevance of the 36 topics. BERTopic was also able to classify hot and cold topics by presenting 'topic over time' graphs that can identify topic trends over time.