• Title/Summary/Keyword: 비정형자료 (unstructured data)

Search Result 104, Processing Time 0.027 seconds

Using noise filtering and sufficient dimension reduction method on unstructured economic data (노이즈 필터링과 충분차원축소를 이용한 비정형 경제 데이터 활용에 대한 연구)

  • Jae Keun Yoo;Yujin Park;Beomseok Seo
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.119-138 / 2024
  • Text indicators are increasingly valuable in economic forecasting, but are often hindered by noise and high dimensionality. This study aims to explore post-processing techniques, specifically noise filtering and dimensionality reduction, to normalize text indicators and enhance their utility through empirical analysis. Predictive target variables for the empirical analysis include monthly leading index cyclical variations, BSI (business survey index) All industry sales performance, BSI All industry sales outlook, as well as quarterly real GDP SA (seasonally adjusted) growth rate and real GDP YoY (year-on-year) growth rate. This study explores the Hodrick and Prescott filter, which is widely used in econometrics for noise filtering, and employs sufficient dimension reduction, a nonparametric dimensionality reduction methodology, in conjunction with unstructured text data. The analysis results reveal that noise filtering of text indicators significantly improves predictive accuracy for both monthly and quarterly variables, particularly when the dataset is large. Moreover, this study demonstrated that applying dimensionality reduction further enhances predictive performance. These findings imply that post-processing techniques, such as noise filtering and dimensionality reduction, are crucial for enhancing the utility of text indicators and can contribute to improving the accuracy of economic forecasts.
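As a rough illustration of the noise-filtering step named in this abstract, the sketch below implements a basic Hodrick-Prescott filter in plain Python. It is not the authors' code: the function name `hp_filter`, the dense-matrix solver, and the default smoothing parameter `lam=1600` (the conventional choice for quarterly data) are assumptions made here for illustration, and the abstract does not state which value the study used.

```python
def hp_filter(series, lam=1600.0):
    """Split a series into (trend, cycle) with the Hodrick-Prescott filter.

    The trend solves (I + lam * D'D) trend = y, where D takes second
    differences. lam=1600 is an assumed default, not a value from the paper.
    """
    y = [float(v) for v in series]
    n = len(y)
    if n < 3:
        return y[:], [0.0] * n

    # Build A = I + lam * D'D as a dense matrix (adequate for short series).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = (1.0, -2.0, 1.0)  # row r of D touches positions r, r+1, r+2
    for r in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[r + p][r + q] += lam * d[p] * d[q]

    # Solve A * trend = y by Gaussian elimination with partial pivoting.
    b = y[:]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor:
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
                b[r] -= factor * b[col]

    # Back substitution on the upper-triangular system.
    trend = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * trend[c] for c in range(r + 1, n))
        trend[r] = (b[r] - tail) / a[r][r]

    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return trend, cycle


# Hypothetical noisy text indicator; the smoothed trend is what would be
# passed on to a forecasting model in place of the raw series.
indicator = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 12.0]
trend, cycle = hp_filter(indicator)
```

A larger `lam` gives a smoother trend. For long series a banded solver or an existing statistics library would be a more practical choice than this dense solve.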

Ultrastructural Characteristics of a Human Sialolith (인간 타석의 미세구조적 특징)

  • Kim, Hyun-Jin;Lee, Soo-Guen;Suh, Bong-Jik
    • Journal of Oral Medicine and Pain / v.24 no.4 / pp.375-385 / 1999
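  • Research on sialoliths covers a wide range, from the clinical features, diagnosis, and treatment of patients with sialolithiasis to the composition and structure of the stones themselves. Studies of sialolith ultrastructure have reported a variety of forms, and the effectiveness of extracorporeal shock wave lithotripsy, recently introduced as a treatment for sialolithiasis, is thought likely to depend on the structure of the stone. Judging that basic data on the ultrastructure of human sialoliths were needed, the author examined a submandibular gland sialolith removed from a middle-aged Korean woman using light microscopy and scanning electron microscopy, and reached the following conclusions. 1. The sialolith consisted of a central nucleus, a lamellar structure surrounding the nucleus, and an outer covering layer. 2. The nucleus consisted of an amorphous center and a relatively homogeneous periphery. 3. The region around the nucleus showed a mostly concentric lamellar structure, but was homogeneous in some areas. 4. The overall diameter of the sialolith cross-section and the diameter of the central nucleus were $3,500{\mu}m$ and $1,500{\mu}m$, respectively, and the thickness of each layer in the lamellar structure was about $10{\sim}40{\mu}m$, depending on location.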

Design of a Sentiment Analysis System to Prevent School Violence and Student's Suicide (학교폭력과 자살사고를 예방하기 위한 감성분석 시스템의 설계)

  • Kim, YoungTaek
    • The Journal of Korean Association of Computer Education / v.17 no.6 / pp.115-122 / 2014
  • One of the problems facing today's youth is the rising rate of violence and suicide in school life, and this study designs a sentiment analysis system to help prevent suicide using big data processing. The main design concerns are economical implementation and easy, fast processing for users, so the open-source Hadoop system with the MapReduce algorithm is used on HDFS (Hadoop Distributed File System) for the experiments. To avoid expensive and complex statistical analysis methods, the study applies a text-mining word count method to perform sentiment analysis on informal data from SNS communications containing various kinds of violent words.
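The word count approach described in this abstract can be sketched in a MapReduce style without a Hadoop cluster. The example below is only an illustration under stated assumptions: the lexicon `VIOLENT_WORDS`, the function names, and the sample messages are hypothetical and do not come from the paper, and a real Hadoop job would distribute the map and reduce phases across nodes.

```python
import re
from collections import Counter

# Hypothetical lexicon of violence-related terms (not from the paper).
VIOLENT_WORDS = {"hit", "kill", "bully", "hate"}


def map_phase(message):
    """Emit a (word, 1) pair for every lexicon word found in one message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return [(token, 1) for token in tokens if token in VIOLENT_WORDS]


def reduce_phase(pairs):
    """Sum the emitted counts per word, as a reducer would."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def count_violent_words(messages):
    """Run the map step over all messages, then reduce the combined pairs."""
    pairs = [pair for message in messages for pair in map_phase(message)]
    return reduce_phase(pairs)


# Hypothetical SNS messages.
sample = ["They hit me and I hate school", "Stop the bully, I hate this"]
counts = count_violent_words(sample)
```

Here `counts` maps each lexicon word to its frequency across all messages, which could then be compared against a threshold to flag a conversation for review.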

Hadoop Security Technologies and Vulnerability Analysis (하둡 보안 기술과 취약점 분석)

  • Kim, A-Yong;He, Yilun;Kim, Han-Kil;Park, Man-Seub;Jung, Hoe-Kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.05a / pp.681-683 / 2013
  • With the spread of smartphones, we have entered the Big Data era, in which SNS (social network services) such as Facebook and Twitter are used routinely in everyday life. Hadoop, developed under the Apache Software Foundation, makes it possible to extract, analyze, and make use of unstructured SNS data rather than discard it. Hadoop is an open-source framework that can handle large amounts of data. Hadoop has been introduced into domestic enterprises and commercial development, but it has been pointed out that its security lags behind its technical development. In this paper, we analyze Hadoop's security technologies and vulnerabilities and propose a method to enhance its security.

Design and Implementation of Opinion Mining System based on Association Model (연관성 모델에 기반한 오피년마이닝 시스템의 설계 및 구현)

  • Kim, Keun-Hyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.1 / pp.133-140 / 2011
  • For both customers and companies, it is very important to analyze online customer reviews, which are short documents containing opinions or experiences about products or services: customers can obtain useful information, and companies can establish good marketing strategies. In this paper, we propose an association model for opinion mining that can analyze customer opinions posted on the web. The association model modifies the association rule mining model from data mining so that association mining techniques can be applied efficiently and effectively to opinion mining. We designed and implemented an opinion mining system based on the modified association model and a grouping idea that enables it to generate more significant rules.
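For readers unfamiliar with association rule mining, the sketch below computes the two standard rule measures, support and confidence, over a set of reviews. It shows only the textbook measures, not the modified association model or grouping idea proposed in the paper, whose details the abstract does not give. The function name `rule_metrics` and the sample reviews are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of transactions containing both sides of the rule
    confidence = share of antecedent transactions that also contain the consequent
    """
    a, c = set(antecedent), set(consequent)
    total = len(transactions)
    with_a = sum(1 for t in transactions if a <= set(t))
    with_both = sum(1 for t in transactions if (a | c) <= set(t))
    support = with_both / total if total else 0.0
    confidence = with_both / with_a if with_a else 0.0
    return support, confidence


# Hypothetical reviews, each reduced to a set of feature and opinion words.
reviews = [
    {"battery", "good"},
    {"battery", "good"},
    {"battery", "bad"},
    {"screen", "good"},
]
support, confidence = rule_metrics(reviews, {"battery"}, {"good"})
```

In this toy data the rule "battery -> good" holds in 2 of 4 reviews (support 0.5) and in 2 of the 3 reviews that mention the battery (confidence about 0.67).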

Analysis of Factors Affecting Surge in Container Shipping Rates in the Era of Covid19 Using Text Analysis (코로나19 판데믹 이후 컨테이너선 운임 상승 요인분석: 텍스트 분석을 중심으로)

  • Rha, Jin Sung
    • Journal of Korea Society of Industrial Information Systems / v.27 no.1 / pp.111-123 / 2022
  • In the COVID-19 era, container shipping rates have surged. Many studies have attempted to investigate the factors behind this surge. However, little of the literature uses text mining techniques to analyze its underlying causes. This study aims to identify the factors behind the unprecedented surge in shipping rates using network text analysis and LDA topic modeling. For the analysis, we collected data and keywords from articles in Lloyd's List over the past two years (2020-2021). The results of the text analysis showed that the current surge is mainly due to "US-China trade war", "rising blank sailings", "port congestion", "container shortage", and "unexpected events such as the Suez canal blockage".

P-TAF: A Big Data-based Platform for Total Air Traffic Forecast (빅데이터 기반 항공 수요예측 통합 플랫폼 설계 및 실증)

  • Jung, Jooik;Son, Seokhyun;Cha, Hee-June
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.281-282 / 2021
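  • This paper presents the design and demonstration results of a big data-based platform for air traffic demand forecasting. The integrated platform collects and analyzes aviation industry data using Open APIs, RSS feeds, web crawlers, and similar tools, and is implemented to visualize results based on an air demand forecasting algorithm developed in-house. In addition, by setting variables through the platform's user interface, users can derive forecast statistics by unit (global, national, etc.), by period (short term, medium to long term, etc.), and by type (passenger, cargo, etc.). To verify the platform's performance, we used structured data as well as unstructured data collected from social network services (SNS), search engines, and other sources to analyze the correlation between the frequency of specific keywords and air traffic demand on specific routes. We expect the intelligent air demand forecasting algorithm of the integrated platform to contribute to overall airport operations and to the formulation of airport operation policy.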

Advancing Societal Statistics Processing Methodology through Artificial Intelligence: A Case Study on Household Trend Survey and Time Use Survey (인공지능 기반 사회 통계 생산 방법론 고도화 방안: 가계동향조사와 생활시간조사 사례)

  • Kyo-Joong Oh;Ho-Jin Choi;Ilgu Kim;Seungwoo Han;Kunsoo Kim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.563-567 / 2023
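  • This study is an attempt to innovate the data processing procedures and methods of the Household Trend Survey and the Time Use Survey conducted by Statistics Korea. It aims at AI-based statistics production that overcomes the limitations of existing statistics production methodology and enables effective management and analysis of large-scale data. Conducted at the intersection of data science and statistics, the study uses artificial intelligence techniques, particularly natural language processing and deep learning, to verify the performance of unstructured text classification methods, and explores the scalability of the AI-based statistical classification methodology and the possibility of extending it to additional surveys. The results contribute to improving the quality and reliability of statistical data and provide a deeper and more accurate understanding of the public's living patterns and behavior.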

Analysis of Domestic Research on Depression and Stress : Focused on the Treatment and Subjects (우울과 스트레스에 관한 국내 연구 분석 : 치료와 대상자를 중심으로)

  • Jo, Nam-Hee;Na, Eun-Young
    • Journal of Convergence for Information Technology / v.7 no.6 / pp.53-59 / 2017
  • This study aimed to identify trends in domestic research on depression and stress. The subjects of the analysis were 1,875 degree theses held in the National Assembly Library, retrieved with the keywords depression and stress as of November 30, 2016. The analysis visualizes unstructured data with a word cloud, one of the text mining techniques. We also used the R LDA package and LDA to classify treatments and subjects. The analysis found 233 papers (12.4% of the total) with treatment-related keywords. The treatment methods applied were, in order, art therapy, music therapy, horticultural therapy, cognitive behavior therapy, clinical art therapy, cognitive therapy, psychological therapy, depression treatment, group therapy, and laughter therapy. The study subjects were, in order, adolescents, the elderly, patients, mothers, children, women, parents, and college students. The LDA topic analysis for adolescents yielded four topics: self-support, treatment program, relationship effect, and variable study.

Analysis of the National Police Agency business trends using text mining (텍스트 마이닝 기법을 이용한 경찰청 업무 트렌드 분석)

  • Sun, Hyunseok;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.301-317 / 2019
  • There has been significant research on how to discover insights from text data using statistical techniques. In this study, we analyzed text data produced by the Korean National Police Agency to identify trends in its work by year and to compare work characteristics among local authorities by identifying distinctive keywords in the documents each authority produced. Preprocessing suited to the characteristics of each dataset was carried out, and word frequencies were calculated for each document in order to draw meaningful conclusions. Simple term frequency within a document does not adequately describe the characteristics of keywords; therefore, the frequency of each term was recalculated using term frequency-inverse document frequency (TF-IDF) weights. L2 norm normalization was used to compare word frequencies. The analysis can serve as basic data for future police work improvement policies and as a way to improve the efficiency of the police service; it also helps identify demand for improvements in indoor work.
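The weighting and normalization steps named in this abstract can be sketched as follows. This is a minimal version that assumes the plain `tf * log(N / df)` form of TF-IDF; the abstract does not specify which variant the authors used, and the function name `tfidf_l2` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_l2(docs):
    """Return one {term: weight} dict per document, L2-normalized.

    docs is a list of token lists. Weight = tf * log(N / df), an assumed
    textbook variant; each document vector is then scaled to unit length.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        weights = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {term: w / norm for term, w in weights.items()}
        vectors.append(weights)
    return vectors


# Hypothetical tokenized documents from two local authorities.
documents = [
    ["patrol", "traffic", "traffic"],
    ["patrol", "cyber"],
]
vectors = tfidf_l2(documents)
```

A term that appears in every document, such as "patrol" here, receives zero weight, so only the terms that distinguish one authority's documents from another's remain. The unit-length scaling makes the weights comparable across documents of different lengths.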