• Title/Summary/Keyword: Preprocess Data


A Study on Big Data Analysis of Related Patents in Smart Factories Using Topic Models and ChatGPT (토픽 모형과 ChatGPT를 활용한 스마트팩토리 연관 특허 빅데이터 분석에 관한 연구)

  • Sang-Gook Kim;Minyoung Yun;Taehoon Kwon;Jung Sun Lim
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.15-31 / 2023
  • In this study, we propose a novel approach to analyzing patent big data in the field of smart factories, using the Latent Dirichlet Allocation (LDA) topic modeling method and the generative artificial intelligence technology ChatGPT. Our method extracts insights from a large dataset of related patents, using LDA to identify latent topics and their corresponding patent documents. We then validate the suitability of the generated topics using generative AI and review the results with domain experts. We also employ the big data analysis tool KNIME to preprocess and visualize the patent data, which facilitates a better understanding of the global patent landscape and enables a comparative analysis with the domestic patent environment. To explore quantitative and qualitative comparative advantages, we selected six indicators for quantitative analysis. Consequently, our approach allows us to explore the distinctive characteristics and investment directions of individual countries in research and development and commercialization, based on a global-scale patent analysis in the field of smart factories. We anticipate that these findings will serve as guidance for determining individual countries' directions in research and development investment. Furthermore, we propose a novel use of ChatGPT as a tool for validating the suitability of selected topics, for policy makers who must choose topics across various scientific and technological domains.
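
The LDA step in the abstract above can be illustrated with a minimal sketch: a toy collapsed Gibbs sampler in pure Python. The patent-like snippets, the topic count, and the hyperparameters `alpha` and `beta` are illustrative assumptions, not the study's data, settings, or KNIME workflow.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    # Count tables: document-topic, topic-word, and per-topic totals.
    ndk = [[0] * n_topics for _ in docs]
    nkw = [[0] * V for _ in range(n_topics)]
    nk = [0] * n_topics
    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w_id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = w_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                # Conditional p(z = t | all other assignments), up to a constant.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    # Smoothed document-topic proportions (theta); each row sums to 1.
    return [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

# Hypothetical tokenized documents standing in for preprocessed patent text.
docs = [
    "sensor network factory monitoring sensor".split(),
    "robot arm control robot factory".split(),
    "sensor data monitoring network".split(),
    "robot control arm motion".split(),
]
theta = lda_gibbs(docs, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each printed row is one document's mixture over the two topics; a document would be attached to the topic with the largest proportion. Production work would normally rely on an established LDA implementation rather than this sampler.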

A study on rethinking EDA in digital transformation era (DX 전환 환경에서 EDA에 대한 재고찰)

  • Seoung-gon Ko
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.87-102 / 2024
  • Digital transformation refers to the process by which a company or organization changes or innovates its existing business model or sales activities using digital technology. This requires the use of various digital technologies, such as cloud computing, IoT, and artificial intelligence, to strengthen competitiveness in the market, improve customer experience, and discover new businesses. In addition, to derive knowledge and insight about the market, customers, and production environment, it is necessary to select the right data, preprocess the data into an analyzable state, and establish the right process for systematic analysis suited to the purpose. The usefulness of such digital data depends on careful preprocessing and on the correct application of exploratory data analysis (EDA), which is useful for exploring information and hypotheses and for visualizing knowledge and insights. In this paper, we reexamine the philosophy and basic concepts of EDA and, for effective visualization, discuss key visualization information, information expression methods based on the grammar of graphics, and the ACCENT principle, which is the final visualization review standard.

A Real-time Context Integration System for Multimodal Sensor Networks using XML (XML을 활용한 멀티모달 센서기반 실시간 컨텍스트 통합 시스템)

  • Yang, Sung-Ihk;Hong, Jin-Hyuk;Cho, Sung-Bae
    • 한국HCI학회:학술대회논문집 / 2008.02a / pp.141-146 / 2008
  • As interest in ubiquitous environments increases, much research is being conducted on services in these environments. An important issue for these services is interpreting the user's context using many kinds of sensors, such as PDAs, GPS, and accelerometers. Low-level raw data, such as those produced by accelerometers, are hard to use, so preprocessing and interpreting the data into context in real time is important for providing real-time services. This paper describes a context integration system that can integrate these sensors, including sensors that produce raw data such as accelerometers and physiological sensors, and that defines the context interpretation rules in XML. By using XML, the proposed system reduces the programming work needed when adding a sensor to the sensor network or modifying a context interpretation rule. Using this system, we implemented a real-time data monitoring system that plots numeric data as graphs, helps the user validate the data and the results of the preprocessing phase, and lets external services and applications use the user's context.
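
A minimal sketch of how context interpretation rules might be kept in XML so that adding a sensor or changing a rule needs no code change. The rule schema (`rule` elements with `sensor`, `context`, `min`, and `max` attributes), the sensor names, and the value ranges are hypothetical; the abstract does not describe the paper's actual rule format.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file: each rule maps a value range of one sensor to a context label.
RULES_XML = """
<rules>
  <rule sensor="accelerometer" context="resting" min="0.0" max="0.5"/>
  <rule sensor="accelerometer" context="walking" min="0.5" max="2.0"/>
  <rule sensor="heart_rate" context="calm" min="40" max="90"/>
  <rule sensor="heart_rate" context="excited" min="90" max="200"/>
</rules>
"""

def load_rules(xml_text):
    """Parse the rules into {sensor: [(low, high, context), ...]}."""
    rules = {}
    for rule in ET.fromstring(xml_text).iter("rule"):
        rules.setdefault(rule.get("sensor"), []).append(
            (float(rule.get("min")), float(rule.get("max")), rule.get("context"))
        )
    return rules

def interpret(rules, sensor, value):
    """Return the first context whose half-open range [min, max) contains value."""
    for low, high, context in rules.get(sensor, []):
        if low <= value < high:
            return context
    return None  # unknown sensor or value outside every range

rules = load_rules(RULES_XML)
print(interpret(rules, "accelerometer", 1.2))  # walking
print(interpret(rules, "heart_rate", 95))      # excited
```

Supporting a new sensor in this sketch only requires adding `rule` elements to the XML; `interpret` itself stays unchanged.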


A RAMS Atmospheric Field I Predicted by an Improved Initial Input Dataset - An Application of NOAA SST data - (초기 입력 자료의 개선에 의한 RAMS 기상장의 예측 I - NOAA SST자료의 적용 -)

  • Won, Gyeong-Mee;Jeong, Gi-Ho;Lee, Hwa-Woon;Jung, Woo-Sik;Lee, Kang-Yoel
    • Journal of Environmental Science International / v.18 no.5 / pp.489-499 / 2009
  • To examine the sensitivity of the Regional Atmospheric Modeling System (RAMS ver. 4.3) to its initial meteorological input data, detailed observational NOAA satellite SST (Sea Surface Temperature) data were employed. The NOAA satellite SST, which is currently provided daily as a seven-day mean value at 0.1$^{\circ}$ grid spacing, was used instead of the climatologically derived monthly mean SST used in RAMS. In addition, the RAMS SST data need to be replaced because they were constructed in 1993. For more realistic initial meteorological fields, the NOAA satellite SST was incorporated into the RAMS preprocessing package named ISentropic Analysis package (ISAN). With the NOAA SST data imposed on the initial condition of the prognostic RAMS model, the resulting near-surface atmospheric fields were discussed and compared with those obtained with the default SST option. The new SST data were successfully produced in the standard RAMS format and showed the detailed variation of SST. As the modeling grid became smaller, the differences in diurnal variation between the NOAA SST run and the RAMS SST43 (default) run were very minor. Nevertheless, this research can be applied to further studies of realistic SST conditions and to improved prediction of regional atmospheric fields, which involve regional circulation due to differential surface heating between sea and land, or climatological phenomena.

Virtual Block Game Interface based on the Hand Gesture Recognition (손 제스처 인식에 기반한 Virtual Block 게임 인터페이스)

  • Yoon, Min-Ho;Kim, Yoon-Jae;Kim, Tae-Young
    • Journal of Korea Game Society / v.17 no.6 / pp.113-120 / 2017
  • With the development of virtual reality technology, user-friendly hand gesture interfaces have been studied more in recent years for natural interaction with virtual 3D objects. Most earlier studies on hand gesture interfaces use relatively simple hand gestures. In this paper, we suggest an intuitive hand gesture interface for interacting with 3D objects in virtual reality applications. For hand gesture recognition, we first preprocess various hand data and classify the data with a binary decision tree. The classified data are resampled and converted to chain code, and hand feature data are then constructed from histograms of the chain code. Finally, the input gesture is recognized from the feature data by MCSVM-based machine learning. To test the proposed hand gesture interface, we implemented a 'Virtual Block' game. Our experiments showed a recognition rate of about 99.2% for 16 kinds of command gestures, and the interface was more intuitive and user-friendly than a conventional mouse interface.
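
The chain-code and histogram steps can be sketched as follows. This assumes an 8-direction code (0 = east, counted counterclockwise, y axis pointing up) and a normalized histogram; the paper's actual direction count, resampling scheme, decision tree, and MCSVM classifier are not given in the abstract and are not reproduced here.

```python
import math
from collections import Counter

def chain_code(points):
    """Quantize each step between consecutive points to one of 8 directions."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        codes.append(round(angle / (math.pi / 4)) % 8)
    return codes

def chain_code_histogram(codes):
    """Normalized 8-bin histogram of direction codes, usable as a feature vector."""
    counts = Counter(codes)
    total = len(codes) or 1
    return [counts.get(d, 0) / total for d in range(8)]

# Hypothetical resampled trajectory: right, right, up, up, left.
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2)]
codes = chain_code(stroke)
hist = chain_code_histogram(codes)
print(codes)  # [0, 0, 2, 2, 4]
print(hist)
```

Because the histogram discards the order of the steps, it gives a fixed-length feature vector regardless of how many points the gesture contains.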

A Study on Building Extraction from LiDAR Data Using LISA (LISA를 이용한 LIDAR 데이터로부터 건물 추출에 관한 연구)

  • Byun, Young-Gi;Lee, Jeong-Ho;Son, Jeong-Hoon;Yu, Ki-Yun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.24 no.4 / pp.335-341 / 2006
  • This paper aims to develop an efficient method that extracts buildings using the local spatial association of raw LiDAR data, without setting empirical variables such as a minimum building area, and to apply the method to survey data to evaluate its efficiency. To do this, we use LISA (Local Indicators of Spatial Association) statistics, which reflect local variations that can appear in the research area. The method can also serve as a preprocessing step that detects spatial outliers through a significance test of the LISA statistics and interpolates them using kernel estimation. Buildings and their boundaries can be extracted based on the quadrants of the Moran scatterplot. Experimental results show that the proposed method is promising for automatically extracting buildings from LiDAR data.
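
A minimal sketch of one LISA statistic, the local Moran's I, together with the Moran scatterplot quadrant of each cell. It assumes a gridded height raster and row-standardized rook (4-neighbor) weights; the paper works on raw LiDAR points, and its weighting scheme, significance test, and kernel interpolation are not reproduced.

```python
def local_moran(grid):
    """Return (local Moran's I, quadrant) per cell with rook-neighbor weights.

    Assumes the grid is not constant (the variance must be non-zero).
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)  # variance
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            z = grid[r][c] - mean
            nbrs = [
                grid[rr][cc] - mean
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            lag = sum(nbrs) / len(nbrs)  # spatial lag of the deviations
            # Quadrant: own value high/low, then neighborhood high/low.
            quadrant = ("H" if z > 0 else "L") + ("H" if lag > 0 else "L")
            out_row.append((z * lag / m2, quadrant))
        result.append(out_row)
    return result

# Hypothetical heights: flat ground at 2 m with a 10 m block in one corner.
heights = [
    [2, 2, 2, 2],
    [2, 2, 2, 2],
    [2, 2, 10, 10],
    [2, 2, 10, 10],
]
res = local_moran(heights)
print(res[3][3])  # (3.0, 'HH')
print(res[0][0][1])  # LL
```

In this toy grid the block cells fall in the high-high quadrant and the open ground in low-low, which is the kind of separation the Moran scatterplot is used for in the abstract.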

Reversible Watermarking with Adaptive Embedding Threshold Matrix

  • Gao, Guangyong;Shi, Yun-Qing;Sun, Xingming;Zhou, Caixue;Cui, Zongmin;Xu, Liya
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.9 / pp.4603-4624 / 2016
  • In this paper, a new reversible watermarking algorithm with an adaptive embedding threshold matrix is proposed. First, to avoid overflow and underflow, two flexible thresholds, TL and TR, are applied to preprocess the image histogram with the least histogram shift cost. Second, to achieve an optimal or near-optimal tradeoff between embedding capacity and imperceptibility, the embedding threshold matrix, composed of the embedding thresholds of all blocks, is determined adaptively by combining composite chaos with the average energy of each Integer Wavelet Transform (IWT) block. As a nonlinear system with good randomness, composite chaos is suitable for searching for the optimal embedding thresholds. Meanwhile, the average energy of each IWT block is calculated to adjust the block embedding capacity, so that more data are embedded into IWT blocks with larger average energy. The experimental results demonstrate that, compared with state-of-the-art reversible watermarking schemes, the proposed scheme achieves a better tradeoff between embedding capacity and imperceptibility.

Implementation of Variable Threshold Dual Rate ADPCM Speech CODEC Considering the Background Noise (배경잡음을 고려한 가변임계값 Dual Rate ADPCM 음성 CODEC 구현)

  • Yang, Jae-Seok;Han, Kyong-Ho
    • Proceedings of the KIEE Conference / 2000.07d / pp.3166-3168 / 2000
  • This paper proposes a variable threshold dual rate ADPCM coding method, modified from the standard ADPCM of ITU G.726, to improve speech quality. Using the ZCR (Zero Crossing Rate), variable threshold dual rate ADPCM achieves better speech quality than single rate ADPCM in noisy environments without increasing complexity. The ZCR is used to divide input signal samples into two categories (noise and speech): samples with a higher ZCR are categorized as the noisy region, and samples with a lower ZCR are categorized as the speech region. The noisy region uses a higher threshold value and is compressed at 16 kbps to reduce the bit rate, while the speech region uses a lower threshold value and is compressed at 40 kbps to improve speech quality. Compared with conventional ADPCM, which adopts a fixed coding rate, the proposed method improves noise characteristics without increasing the bit rate. For real-time applications, ZCR calculation was considered as a simple method of obtaining background noise information for the preprocessing of speech analysis such as FFT, and the experiment showed that this simple ZCR calculation can be used without increasing complexity. Dual rate ADPCM can efficiently decrease the amount of transferred data without increasing complexity or reducing speech quality. Therefore, the results of this paper can be applied to real-time speech applications such as Internet phones or VoIP.
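
The ZCR-based rate selection can be sketched as below. The decision threshold of 0.3 and the sample frames are illustrative assumptions; the paper's actual threshold values and the G.726 ADPCM encoder itself are not shown.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def select_bit_rate(frame, zcr_threshold=0.3):
    """Return 16 (kbps) for noise-like frames and 40 (kbps) for speech-like frames.

    zcr_threshold is a hypothetical value chosen for this example.
    """
    return 16 if zero_crossing_rate(frame) > zcr_threshold else 40

speech_like = [3, 5, 6, 4, 1, -2, -5, -6, -3, 1]   # slow oscillation, 2 crossings
noise_like = [1, -1, 2, -2, 1, -1, 1, -2, 2, -1]   # sign changes at every step

print(select_bit_rate(speech_like))  # 40
print(select_bit_rate(noise_like))   # 16
```

The ZCR needs only one comparison per sample, which is why it can stand in for heavier analysis such as an FFT in a real-time setting.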


DEM Extraction from LiDAR DSM of Urban Area (도시지역 LiDAR DSM으로부터 DEM추출기법 연구)

  • Choi, Yun-Woong;Cho, Gi-Sung
    • Journal of Korean Society for Geospatial Information Science / v.13 no.1 s.31 / pp.19-25 / 2005
  • Nowadays, DEMs of urban areas can be constructed effectively and economically with LiDAR systems. However, LiDAR data take the form of a DSM, which includes various objects such as trees and buildings. Preprocessing is therefore necessary to extract DEMs from LiDAR DSMs for particular purposes, such as analyzing the effects of man-made objects in flood prediction. Since this study aims to extract a DEM from the LiDAR DSM of an urban area, we detected the edges of various objects using an edge detection algorithm from image processing. We then applied mean value filtering, median value filtering, and minimum value filtering to the detected edges, instead of the interpolation method used in the previous study, which could modify the source data. This approach minimized the modification of the source data, and the process of extracting DEMs from DSMs could be simplified and automated.
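
A minimal sketch of window filtering restricted to flagged cells, so that unflagged source data stay untouched. The mask here comes from a simple height threshold, which is only a stand-in for the edge detection step; the grid values, window size, and function name are hypothetical.

```python
from statistics import mean, median

def filter_cells(dsm, mask, reducer=min, radius=1):
    """Replace masked cells with reducer() of their window; leave other cells unchanged."""
    rows, cols = len(dsm), len(dsm[0])
    out = [row[:] for row in dsm]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Window is clipped at the grid border and always reads the original DSM.
            window = [
                dsm[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = reducer(window)
    return out

# Hypothetical DSM (metres): ground near 10 m with a small building near 18 m.
dsm = [
    [10.0, 10.1, 10.0, 10.2],
    [10.1, 18.0, 18.2, 10.1],
    [10.0, 18.1, 18.0, 10.0],
    [10.2, 10.0, 10.1, 10.1],
]
mask = [[h > 15 for h in row] for row in dsm]  # stand-in for detected object cells

dem_min = filter_cells(dsm, mask, reducer=min)
dem_med = filter_cells(dsm, mask, reducer=median)
dem_mean = filter_cells(dsm, mask, reducer=mean)
print(dem_min[1][1], dem_med[1][1], round(dem_mean[1][1], 2))
```

In this toy case the minimum and median filters pull the building cells down to ground level, while the mean filter stays biased upward by the building heights inside the window.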


A Study on the Gender and Age Classification of Speech Data Using CNN (CNN을 이용한 음성 데이터 성별 및 연령 분류 기술 연구)

  • Park, Dae-Seo;Bang, Joon-Il;Kim, Hwa-Jong;Ko, Young-Jun
    • The Journal of Korean Institute of Information Technology / v.16 no.11 / pp.11-21 / 2018
  • This research categorizes voices using deep learning technology. The study examines neural network-based sound classification studies and suggests an improved neural network for voice classification. Related studies classified urban data, but they showed poor performance with shallow neural networks. Therefore, in this paper we first preprocess the voice data and extract feature values. Next, we categorize the voices by feeding the feature values into both the previous sound classification network and the proposed neural network. Finally, we compare and evaluate the classification performance of the two neural networks. The neural network in this paper is organized deeper and wider so that learning works better. The results showed a classification performance of 84.8 percent for the neural network from related studies and 91.4 percent for the proposed neural network, so the proposed neural network scored about 6 percentage points higher.