• Title/Summary/Keyword: Data Pre-processing

Development of an Informetric Analysis System KnowledgeMatrix (계량정보분석시스템 KnowledgeMatrix 개발)

  • Lee, Bangrae;Yeo, Woon Dong;Lee, June Young;Lee, Chang-Hoan;Kwon, Oh-Jin;Moon, Yeong-ho
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.167-171 / 2007
  • Application areas of Knowledge Discovery in Databases (KDD) have expanded into many R&D management processes, including technology trend analysis, forecasting, and evaluation. Established research fields such as informetrics (or scientometrics) have recently made full use of KDD techniques and methods. A few researchers and institutions have developed various systems to support the analysis of large-scale R&D-related databases, such as patent or bibliographic databases. However, existing systems pose problems for Korean users: they are expensive, they do not support Korean language processing, and they do not reflect users' demands. To solve these problems, the Korea Institute of Science and Technology Information (KISTI) developed a stand-alone information analysis system named KnowledgeMatrix. KnowledgeMatrix offers various functions for analyzing data sets retrieved from databases. Its main operations comprise user-defined lists, matrix generation, cluster analysis, visualization, and data pre-processing. KnowledgeMatrix shows better performance and offers a wider range of functions than existing systems.

Design and Implementation of ASTERIX Parsing Module Based on Pattern Matching for Air Traffic Control Display System (항공관제용 현시시스템을 위한 패턴매칭 기반의 ASTERIX 파싱 모듈 설계 및 구현)

  • Kim, Kanghee;Kim, Hojoong;Yin, Run Dong;Choi, SangBang
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.3 / pp.89-101 / 2014
  • Recently, as domestic air traffic has increased dramatically, the need for ATC (air traffic control) systems has grown to ensure safe and efficient ATM (air traffic management). For smooth ATC in particular, it is essential to guarantee the performance of the display system, which must show the entire air traffic situation in the FIR (Flight Information Region) without additional latency. In this paper, we design an ASTERIX (All purpose STructured Eurocontrol suRveillance Information eXchange) parsing module that promotes stable ATC by minimizing system load, which it achieves by reducing the overhead incurred when parsing ASTERIX messages. Our pattern-matching-based ASTERIX parsing module creates patterns by analyzing received ASTERIX data and handles subsequently received ASTERIX data with predefined procedures selected through those patterns. Unlike existing parsing modules, which contain unnecessary parsing procedures, this module minimizes display errors by rapidly extracting only the information needed for display. The designed module is therefore intended to help controllers conduct stable ATC. A comparison with an existing general bit-level ASTERIX parsing module shows that the pattern-matching-based module has shorter processing delay, higher throughput, and lower CPU usage.

Implementation of High-Throughput SHA-1 Hash Algorithm using Multiple Unfolding Technique (다중 언폴딩 기법을 이용한 SHA-1 해쉬 알고리즘 고속 구현)

  • Lee, Eun-Hee;Lee, Je-Hoon;Jang, Young-Jo;Cho, Kyoung-Rok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.4 / pp.41-49 / 2010
  • This paper proposes a new high-speed SHA-1 architecture using multiple unfolding and pre-computation techniques. We unfold the iterative hash operations into 2 continuous hash stages and reschedule the computation timing. Part of the critical path is then computed in the previous hash round, and the rest is performed in the present round. These techniques reduce the critical path from 3 additions to 2. The resulting maximum clock frequency is 118 MHz, which provides a throughput of 5.9 Gbps. The proposed architecture shows 26% higher throughput with a 32% smaller hardware size compared to its counterparts. This paper also introduces an analytical model of a multiple SHA-1 architecture at the system level that maps large input data onto SHA-1 blocks in parallel. The model gives the number of SHA-1 blocks required to process large multimedia data, which helps in deciding the hardware configuration. The high-speed SHA-1 is useful for generating a condensed message and may strengthen the security of mobile communication and internet services.

A Basic Study on the Differential Diagnostic System of Laryngeal Diseases using Hierarchical Neural Networks (다단계 신경회로망을 이용한 후두질환 감별진단 시스템의 개발)

  • 전계록;김기련;권순복;예수영;이승진;왕수건
    • Journal of Biomedical Engineering Research / v.23 no.3 / pp.197-205 / 2002
  • The objective of this paper is to implement a diagnostic classifier for the differential diagnosis of laryngeal diseases from acoustic signals acquired in a noisy room. For this purpose, voice signals of the vowel /a/ were collected from patients in a soundproof chamber and mixed with noise. The acoustic parameters were then analyzed, and hierarchical neural networks were applied to classify the data. The classifier had a structure of five-step hierarchical neural networks. The first neural network classified the group into normal and benign or malignant laryngeal disease cases. The second network classified the group into normal or benign laryngeal disease cases. The following network distinguished polyp, nodule, and palsy among the benign laryngeal cases. Glottic cancer cases were discriminated into T1, T2, T3, and T4 by the fourth and fifth networks. All the neural networks were based on the multilayer perceptron model, which classifies non-linear patterns effectively, and were trained with an error back-propagation algorithm. We chose some acoustic parameters for classification by investigating the distribution of laryngeal diseases and the pilot classification results of parameters derived from MDVP. The classifier was tested with the chosen parameters to find the optimum ones. The networks were then improved by including pre-processing steps such as linear and z-score transformation. Results showed that 90% of T1 and 100% of T2-4 cases were correctly distinguished. In addition, 88.23% of vocal polyps and 100% of normal cases, vocal nodules, and vocal cord paralysis were correctly classified from the data collected in a noisy room.
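
The two pre-processing steps named in this abstract can be sketched as follows. The abstract does not say which linear transformation was used, so this sketch assumes min-max rescaling to a fixed range; the function names and sample values are hypothetical. The z-score transformation subtracts the mean and divides by the standard deviation (the population standard deviation here, which is also an assumption).

```python
from statistics import mean, pstdev


def linear_rescale(values, lo=0.0, hi=1.0):
    """Min-max rescaling to [lo, hi]; one possible 'linear transformation'."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant feature carries no information; map it to the lower bound.
        return [lo for _ in values]
    return [lo + (v - vmin) / (vmax - vmin) * (hi - lo) for v in values]


def z_score(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one acoustic parameter across eight recordings.
samples = [2, 4, 4, 4, 5, 5, 7, 9]
print(linear_rescale(samples))
print(z_score(samples))
```

Either transformation puts parameters with very different ranges on a comparable scale before they are fed to a multilayer perceptron, which tends to make back-propagation training better conditioned.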

Park Golf Participation of Physically Disabled Impact on Psychological Well-being and Subjective Happiness (파크골프 참여가 지체장애인의 심리적 웰빙과 주관적 행복감에 미치는 영향)

  • Kim, Dong Won
    • 재활복지 / v.18 no.4 / pp.187-205 / 2014
  • The purpose of this research is to identify how participation in a 12-week Park Golf program affects the psychological well-being and subjective happiness of physically disabled people. The subjects were 24 physically disabled people aged 40 or older, divided into an experimental group of 12 (seven with amputation, five with physical dysfunction) and a control group of 12 (six with amputation, four with physical dysfunction, and two with joint disorders). The program was carried out three times a week (Mon, Wed, Fri), 50 minutes per session, over the 12-week experimental period at River Park Golf Course A. Means and standard deviations of the pre- and post-test data were calculated using the SPSS Statistics 21.0 statistical data processing program, and a two-way repeated measures ANOVA (2-way [2] RM ANOVA) was performed to analyze the effects of Park Golf participation on the psychological well-being and subjective happiness of the participants. First, in the between-group comparison of psychological well-being before and after Park Golf participation, only enjoyment and immersion were not significantly different, while within each group all sub-factors (enjoyment, competence, self-realization, and immersion) showed significant differences. Second, for subjective happiness, there was a significant difference between the groups before and after exercise participation, but no significant difference within the groups. Taken together, the results show that Park Golf participation has positive effects on the psychological well-being and subjective happiness of physically disabled people.

Convergence Study on Effects of Underwater Rehabilitation Exercise on Physical Fitness and Blood Lipids in Middle Aged Women (중년여성의 수중재활운동이 신체적성과 혈중지질에 미치는 융합연구)

  • Beak, Soon-Gi;Kim, Do-Jin
    • Journal of Convergence for Information Technology / v.9 no.8 / pp.260-267 / 2019
  • The purpose of this study is to find out how 10 weeks of underwater rehabilitation exercise affects physical fitness and blood lipids, and to provide basic data to help prevent cardiovascular disease in middle-aged women. The subjects of this study were middle-aged women living in Seoul, Korea. The underwater rehabilitation exercise was performed three times a week for 10 weeks, and each session lasted 60 minutes, including the warm-up, the main exercise, and the cool-down. The exercise intensity was set at 60-70% of the heart rate reserve calculated from the pre-exercise test. The measurement variables were physical fitness and blood lipids. In the data processing, descriptive statistics were presented for each measurement item, and a 2-way RGRM ANOVA was conducted to examine the interaction effects between groups. The results showed significant interaction effects in physical fitness (flexibility, cardiorespiratory endurance, muscular endurance) and blood lipids (TG, TC, HDL-C, LDL-C). This study found that the 10-week underwater rehabilitation exercise program increased the physical fitness level of middle-aged women and decreased or increased individual blood lipid measures, and that it could be an effective, convergent program for preventing and reducing cardiovascular disease.
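
The intensity prescription in this abstract can be made concrete with the heart rate reserve (Karvonen) calculation: target heart rate = resting heart rate + intensity × (maximum heart rate − resting heart rate). The sketch below assumes the resting and maximum heart rates come from the pre-exercise test; the function names and the example values are hypothetical and not taken from the study.

```python
def target_heart_rate(hr_rest, hr_max, intensity):
    """Karvonen formula: rest + intensity * (max - rest), in beats per minute."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be a fraction between 0 and 1")
    if hr_max <= hr_rest:
        raise ValueError("hr_max must exceed hr_rest")
    return hr_rest + intensity * (hr_max - hr_rest)


def target_zone(hr_rest, hr_max, low=0.60, high=0.70):
    """Lower and upper target heart rates for a 60-70% HRR prescription."""
    return (target_heart_rate(hr_rest, hr_max, low),
            target_heart_rate(hr_rest, hr_max, high))


# Hypothetical participant: resting 70 bpm, maximum 170 bpm (reserve 100 bpm).
low_bpm, high_bpm = target_zone(70, 170)
print(f"target zone: {low_bpm:.0f}-{high_bpm:.0f} bpm")
```

For this hypothetical participant the zone works out to roughly 130 to 140 beats per minute.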

MLP-based 3D Geotechnical Layer Mapping Using Borehole Database in Seoul, South Korea (MLP 기반의 서울시 3차원 지반공간모델링 연구)

  • Ji, Yoonsoo;Kim, Han-Saem;Lee, Moon-Gyo;Cho, Hyung-Ik;Sun, Chang-Guk
    • Journal of the Korean Geotechnical Society / v.37 no.5 / pp.47-63 / 2021
  • Recently, demand has been increasing for three-dimensional (3D) underground maps from the perspective of digital twins, as well as for their linked use. However, the vastness of national geotechnical survey data and the uncertainty in applying geostatistical techniques pose challenges in modeling regional underground geotechnical characteristics. In this study, an optimal learning model based on a multi-layer perceptron (MLP) was constructed for 3D subsurface lithological and geotechnical classification in Seoul, South Korea. First, the geotechnical layers and 3D spatial coordinates of each borehole dataset in the Seoul area were compiled into a geotechnical database according to a standardized format, and data pre-processing for machine learning, such as missing-value correction and normalization, was performed. An optimal fitting model was designed through hyperparameter optimization of the MLP model and model performance evaluation, such as precision and accuracy tests. Then, a 3D grid network that locally assigns a geotechnical layer classification was constructed by applying the MLP-based best-fitting model to each unit lattice. The constructed 3D geotechnical layer map was evaluated by comparing it with the results of a geostatistical interpolation technique and the topsoil properties of the geological map.

Cross-Lingual Style-Based Title Generation Using Multiple Adapters (다중 어댑터를 이용한 교차 언어 및 스타일 기반의 제목 생성)

  • Yo-Han Park;Yong-Seok Choi;Kong Joo Lee
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.341-354 / 2023
  • The title of a document is a brief summary of the document. Readers can understand a document more easily if they are given its title in their preferred style and language. In this research, we propose a cross-lingual and style-based title generation model using multiple adapters. Training such a model requires a parallel corpus in several languages with different styles. Constructing this kind of parallel corpus is quite difficult; however, a monolingual title generation corpus of a single style can be built easily. Therefore, we apply a zero-shot strategy to generate a title in a different language and with a different style for an input document. The baseline model is a Transformer consisting of an encoder and a decoder, pre-trained on several languages. The model is then equipped with multiple adapters for translation, languages, and styles. After the model learns a translation task from a parallel corpus, it learns a title generation task from a monolingual title generation corpus. When training the model on a task, we activate only the adapter that corresponds to that task. When generating a cross-lingual and style-based title, we activate only the adapters that correspond to the target language and the target style. Experimental results show that our proposed model is only as good as a pipeline model that first translates into the target language and then generates a title. Natural language generation has changed significantly with the emergence of large-scale language models. However, research on improving natural language generation with limited resources and limited data needs to continue. In this regard, this study seeks to explore the significance of such research.

Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

  • Bogyung Park;Somin Park;Hyunki Hong
    • KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.303-314 / 2023
  • Voice conversion, a technology that allows an individual's speech data to be regenerated with the acoustic properties (tone, cadence, gender) of another, has countless applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model, which uses one-hot vectors of source and target speaker information, this paper extracts feature vectors of target speakers using a pre-trained version of RawNet3. This results in a latent space where voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, the Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. The Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it is based.

Development of deep learning structure for complex microbial incubator applying deep learning prediction result information (딥러닝 예측 결과 정보를 적용하는 복합 미생물 배양기를 위한 딥러닝 구조 개발)

  • Hong-Jik Kim;Won-Bog Lee;Seung-Ho Lee
    • Journal of IKEEE / v.27 no.1 / pp.116-121 / 2023
  • In this paper, we develop a deep learning structure for a complex microbial incubator that applies deep learning prediction result information. The proposed work consists of pre-processing of complex microbial data, conversion of the complex microbial data structure, design of a deep learning network, training of the designed network, and development of a GUI applied to the prototype. In the pre-processing step, one-hot encoding is performed on the amounts of molasses, nutrients, plant extract, salt, etc. required for microbial culture, and min-max normalization is applied to the pH concentration measured as a result of the culture and to the number of microbial cells. In the data structure conversion step, the pre-processed data is converted into a graph structure by connecting the water temperature and the number of microbial cells, and is then expressed as an adjacency matrix and attribute information to be used as input data for the deep learning network. In the network design step, a graph convolutional network specialized for graph structures is designed to learn the complex microbial data. The designed network uses a cosine loss function so that training proceeds in the direction that minimizes the error. The GUI developed for the prototype shows the target pH concentration (3.8 or less) and the number of cells (10^8 or more) of complex microorganisms, in an order suitable for culturing, according to the water temperature selected by the user. To evaluate the performance of the proposed microbial incubator, experiments were conducted by authorized testing institutes; the results showed an average pH of 3.7 and a complex microorganism cell count of 1.7 × 10^8. These results demonstrate the effectiveness of the deep learning structure proposed in this paper for the complex microbial incubator applying deep learning prediction result information.
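
Two of the data-preparation steps described in this abstract, one-hot encoding of ingredient amounts and expressing a graph as an adjacency matrix, can be sketched as below. The abstract does not give the category levels or say exactly how nodes are connected, so the amount levels, the node numbering, and the edge list here are hypothetical placeholders rather than the authors' construction.

```python
def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over the known categories."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def adjacency_matrix(num_nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from undirected edges."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for a, b in edges:
        matrix[a][b] = 1
        matrix[b][a] = 1
    return matrix


# Hypothetical discrete molasses amounts used across culture runs.
MOLASSES_LEVELS = [10, 20, 30]
print(one_hot(20, MOLASSES_LEVELS))

# Hypothetical graph: three measurement nodes linked in a chain.
for row in adjacency_matrix(3, [(0, 1), (1, 2)]):
    print(row)
```

In a graph convolutional network, a matrix like this is paired with a per-node attribute matrix, which is where the encoded and normalized measurements would go.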