• Title/Summary/Keyword: Data quality metrics

The Effect of Process Maturity on the Performance of Industrial R&D Projects

  • 홍순욱
    • 산업공학 / Vol. 16, No. 3 / pp.362-374 / 2003
  • The major objective of this paper is to empirically examine the effect of process maturity on the performance of industrial R&D projects. Process maturity, a fundamental concept of the Capability Maturity Model developed by the Software Engineering Institute, represents the essence of Total Quality Management (TQM). Based on the literature, our research model constructs process maturity in terms of structured process, goal setting and controlling, metrics, and process learning, and links it to R&D performance, which consists of technical, commercial, and managerial success. The model also includes firm size as a moderator of the different effects that process maturity may have across firms. Measures for process maturity are based on the best practices identified in the literature. Data were obtained from 77 successful R&D projects carried out by Korean manufacturing firms. Multiple regression and t-tests are used to test the proposed hypotheses. The findings are as follows. (1) In the R&D process, process maturity partially contributes to the performance of R&D projects. More specifically, practices related to goal setting and controlling drive both technical and commercial success, while practices related to process learning drive commercial success. In contrast, traditionally emphasized elements such as structured process and metrics are found not to be significant. (2) The degree of process maturity is significantly higher in large firms. (3) Process maturity affects commercial success in large firms, whereas it affects technical success in small firms. The results imply that TQM principles are partially associated with R&D performance, and that the nature of the benefit from high maturity may vary with firm size.

A Malicious Traffic Detection Method Using X-means Clustering

  • 한명지;임지혁;최준용;김현준;서정주;유철;김성렬;박근수
    • 정보과학회 논문지 / Vol. 41, No. 9 / pp.617-624 / 2014
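  • Malicious traffic refers to traffic, such as DDoS attacks and botnet communication, that is generated with the intent to disrupt the Internet or to damage a specific network, server, or host. Such malicious traffic has steadily evolved in both quantity and quality since the Internet emerged, and research on countermeasures has continued as well. This paper presents a method for effectively detecting malicious traffic by applying the existing X-means clustering algorithm. In particular, it systematically presents a way to analyze the statistical characteristics of malicious traffic and to define metrics for clustering. The effectiveness of the method is verified through experiments on two publicly available traffic data sets.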

Comparison of term weighting schemes for document classification

  • 정호영;신상민;최용석
    • 응용통계연구 / Vol. 32, No. 2 / pp.265-276 / 2019
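  • The document-term frequency matrix is a common data form in text mining that holds information on the objects to be analyzed. This study introduces TF-IDF, the existing term weighting scheme applied to the document-term frequency matrix for document classification. In addition, it introduces and compares the definitions, advantages, and disadvantages of the recently proposed term weighting schemes TF-IDF-ICSDF and TF-IGM. It also presents a method for extracting keywords to improve the quality of document classification analysis. Based on the extracted keywords, a support vector machine, one of the most widely used machine learning algorithms for document classification, was applied. Performance measures such as precision, recall, and F1-score were used to compare the term weighting schemes introduced in this study. As a result, the TF-IGM method showed high values on all performance measures and proved to be the best-suited method for classifying text.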
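As a rough illustration of the baseline weighting named in this abstract, the sketch below computes TF-IDF on a toy corpus. It assumes raw term counts and idf = log(N / df), which is only one common textbook variant; the paper's exact formulation, and the TF-IDF-ICSDF and TF-IGM schemes, are not reproduced here. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes raw term frequency and idf = log(N / df); other variants exist.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["data", "quality", "metrics"],
    ["data", "mining", "data"],
    ["quality", "control"],
]
w = tf_idf(docs)
```

Terms that occur in fewer documents receive a larger idf, so "metrics" (one document) outweighs a single occurrence of "data" (two documents).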

Design and Implementation of Multi-rate Broadcast based Link Quality Measurement for WLAN Mesh Network

  • 이덕환;양승철;김종덕
    • 한국통신학회논문지 / Vol. 36, No. 9A / pp.801-808 / 2011
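  • This paper proposes MBAP (Multi-rate Broadcast Active Probing), a method for improving the accuracy of link quality measurement in WLAN mesh networks. Unlike MANETs, which mainly use hop count as the routing metric, mesh networks mainly use link-quality-based metrics such as ETX and ETT. To measure ETX, ETT, and similar metrics, existing studies use a measurement method based on single-rate broadcast. However, this method does not properly reflect the quality of WLAN links, which can operate at various transmission rates, and this can lead to inappropriate path selection. MBAP improves the accuracy of link quality measurement by using broadcasts at multiple transmission rates. We implemented MBAP at the link layer on Linux and verified its usefulness through experiments.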
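The two metrics named in this abstract can be sketched as follows, using their commonly cited definitions: ETX = 1 / (d_f × d_r) from forward and reverse delivery ratios, and ETT = ETX × (packet size / link rate). The per-rate probe table, the packet size, and the step that picks the rate with the lowest ETT are illustrative assumptions, not the MBAP procedure itself.

```python
def etx(d_f, d_r):
    """Expected transmission count from forward/reverse delivery ratios."""
    if d_f <= 0 or d_r <= 0:
        return float("inf")  # a link that never delivers has infinite cost
    return 1.0 / (d_f * d_r)


def ett(d_f, d_r, packet_bits, rate_bps):
    """Expected transmission time in seconds: ETX * (packet size / rate)."""
    return etx(d_f, d_r) * packet_bits / rate_bps


# Hypothetical probe results: rate (bit/s) -> (forward, reverse) delivery ratio.
probes = {
    6_000_000: (0.95, 0.95),
    24_000_000: (0.80, 0.90),
    54_000_000: (0.30, 0.90),
}
PACKET_BITS = 8 * 1024  # illustrative 1024-byte probe packet

# Evaluate ETT separately at every probed rate, then take the cheapest one.
per_rate = {r: ett(df, dr, PACKET_BITS, r) for r, (df, dr) in probes.items()}
best_rate = min(per_rate, key=per_rate.get)
```

With these made-up numbers the middle rate wins: the fastest rate loses too many frames, and the slowest rate spends too long on air. A single-rate probe would not expose that trade-off.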

A Flow Analysis Framework for Traffic Video

  • Bai, Lu-Shuang;Xia, Ying;Lee, Sang-Chul
    • 한국공간정보시스템학회 논문지 / Vol. 11, No. 2 / pp.45-53 / 2009
  • Rapid progress in multimedia data acquisition technologies has made it possible to collect vast amounts of video in real time. Although the information gathered from these videos can be high in both quantity and quality, use of the collected data is typically limited by human-centric monitoring systems. In this paper, we propose a framework for analyzing long traffic videos using a series of content-based analysis tools. Our framework suggests a method to integrate these analysis tools to extract highly informative features specific to traffic video analysis. Our analytical framework provides (1) re-sampling tools for efficient and precise analysis, (2) foreground extraction methods for unbiased traffic flow analysis, (3) frame property analysis tools using a variety of frame characteristics including brightness, entropy, Harris corners, and variance of traffic flow, and (4) a visualization tool that summarizes the entire video sequence and automatically highlights a collection of frames based on metrics defined by semi-automated or fully automated techniques. Based on the proposed framework, we developed an automated traffic flow analysis system, and in our experiments we show results from two example traffic videos taken from different monitoring angles.
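Two of the frame properties listed in item (3), brightness and entropy, can be sketched in a few lines. This assumes a grayscale frame stored as a list of pixel rows, brightness taken as the mean gray level, and entropy taken as the Shannon entropy of the gray-level histogram; the abstract does not state the exact definitions used, and the function names are hypothetical.

```python
import math
from collections import Counter


def brightness(frame):
    """Mean gray level of a frame given as a list of pixel rows."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def entropy(frame):
    """Shannon entropy, in bits, of the frame's gray-level histogram."""
    pixels = [p for row in frame for p in row]
    n = len(pixels)
    counts = Counter(pixels)  # histogram: gray level -> pixel count
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical 2x4 frame: half black, half white.
frame = [[0, 0, 255, 255], [0, 0, 255, 255]]
b = brightness(frame)
h = entropy(frame)
```

A uniform frame has zero entropy, so a sudden rise in entropy between frames is one plausible cue for highlighting, though the thresholds would have to be chosen per video.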

Performance Analysis Framework for Post-Evaluation of Construction Projects through Benchmarking from Advanced Countries

  • 이강욱
    • 한국산업융합학회 논문집 / Vol. 25, No. 6_2 / pp.1017-1027 / 2022
  • Development of social overhead capital (SOC) requires substantial national funding, and performance issues such as cost-efficiency, safety, and environment have been constantly raised. However, each construction client currently has limited access to its own projects' performance and lacks an analytic methodology for industry-level comparisons and benchmarking for improvement. To overcome this problem, this study proposes a comprehensive performance analysis framework for post-evaluation of large-scale construction projects. To this end, this study performed a case study of advanced countries (the U.S., the U.K., and Japan) and consulted related experts to develop a tailored performance analysis framework for the Post-Construction Evaluation and Management system in Korea. The developed framework covers three categories (project performance, project efficiency, and ripple effect), nine areas (cost, schedule, change, safety, quality, demand, benefit-cost ratio, civil complaint, and defect), and 31 detailed metrics. Using an industry-level project performance database and statistical techniques, the proposed framework can be used not only to diagnose excellent and unsatisfactory performance areas for completed construction projects, but also to provide reference data for similar future projects. This study can contribute to improving clients' performance management practices and the effectiveness of construction projects.

Effectiveness of Fuzzy Graph Based Document Model

  • Aswathy M R;P.C. Reghu Raj;Ajeesh Ramanujan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 8 / pp.2178-2198 / 2024
  • Graph-based document models have good capabilities to reveal inter-dependencies among unstructured text data. Natural language processing (NLP) systems that use such models as an intermediate representation have shown good performance. This paper proposes a novel fuzzy graph-based document model and demonstrates its effectiveness by applying fuzzy logic tools to text summarization. The proposed system accepts a text document as input and identifies some of its sentence-level features, namely sentence position, sentence length, numerical data, thematic word, proper noun, title feature, upper case feature, and sentence similarity. The fuzzy membership value of each feature is computed from the sentences. We also propose a novel algorithm to construct the fuzzy graph as an intermediate representation of the input document. The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric is used to evaluate the model. An evaluation based on different quality metrics was also performed to verify the effectiveness of the model. The ANOVA test confirms the hypothesis that the proposed model improves summarizer performance by 10% compared with state-of-the-art summarizers that employ alternative intermediate representations of the input text.
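For readers unfamiliar with the evaluation metric named here, the sketch below computes ROUGE-N recall in its basic form: the clipped count of reference n-grams that also appear in the candidate summary, divided by the number of n-grams in the reference. It assumes whitespace tokenization and a single reference; stemming, stopword handling, and multi-reference scoring found in full ROUGE toolkits are left out. The function name and example sentences are hypothetical.

```python
from collections import Counter


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall for token lists: matched reference n-grams / all reference n-grams."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference shorter than n tokens
    # Clip each match by how often the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical reference and system summaries.
reference = "the cat sat on the mat".split()
candidate = "the cat lay on a mat".split()
r1 = rouge_n_recall(candidate, reference, n=1)
r2 = rouge_n_recall(candidate, reference, n=2)
```

Here four of the six reference unigrams are matched, but only one of the five reference bigrams, which shows how higher-order ROUGE penalizes word-order differences.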

Criticality benchmarking of ENDF/B-VIII.0 and JEFF-3.3 neutron data libraries with RMC code

  • Zheng, Lei;Huang, Shanfang;Wang, Kan
    • Nuclear Engineering and Technology / Vol. 52, No. 9 / pp.1917-1925 / 2020
  • New versions of the ENDF/B and JEFF data libraries have been released during the past two years with significant updates to the neutron reaction sublibrary and the thermal neutron scattering sublibrary. To get a more comprehensive impression of the criticality quality of these two latest neutron data libraries, and to provide a reference for selecting evaluated nuclear data libraries for science and engineering applications of the Reactor Monte Carlo code RMC, criticality benchmarking of the two latest neutron data libraries has been performed. RMC was employed as the computational tool, and its processing capability for the continuous representation of the ENDF/B-VIII.0 thermal neutron scattering laws was developed. The RMC criticality validation suite, consisting of 116 benchmarks, was established for the benchmarking work. The latest ACE format data libraries of the neutron reaction and the thermal neutron scattering laws for ENDF/B-VIII.0, ENDF/B-VII.1, and JEFF-3.3 were downloaded from the corresponding official sites. The ENDF/B-VII.0 data library was also employed to provide code-to-code validation for RMC. All the calculations for the four data libraries were performed using a parallel version of RMC, and all the calculated standard deviations are lower than 30 pcm. Comprehensive analyses, including the C/E values with uncertainties, the δk/σ values, and the metrics χ² and <|Δ|>, were conducted and presented. The calculated keff eigenvalues based on the four data libraries generally agree well with the benchmark evaluations in most cases. Among the 116 criticality benchmarks, the numbers of calculated keff eigenvalues that agree with the benchmark evaluations within the 3σ interval (with a confidence level of 99.6%) are 107, 109, 112, and 113 for ENDF/B-VII.0, ENDF/B-VII.1, ENDF/B-VIII.0, and JEFF-3.3, respectively. The present results indicate that the ENDF/B-VIII.0 neutron data library has a better performance on average.
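The summary statistics named in this abstract can be sketched as below. The definitions are assumptions, since the abstract does not spell them out: C/E is the calculated-to-benchmark keff ratio, δk/σ divides the difference by the combined uncertainty sqrt(σ_calc² + σ_bench²), χ² is the mean of the squared δk/σ values, and <|Δ|> is the mean absolute difference expressed in pcm (1 pcm = 1e-5). The input numbers are made up for illustration and are not results from the paper.

```python
import math


def benchmark_metrics(cases):
    """Summarize criticality benchmark results.

    cases: list of (k_calc, sigma_calc, k_bench, sigma_bench) tuples.
    All definitions here are assumed, commonly used forms.
    """
    ratios, deviations, abs_diffs = [], [], []
    for k_c, s_c, k_e, s_e in cases:
        sigma = math.sqrt(s_c ** 2 + s_e ** 2)  # combined 1-sigma uncertainty
        ratios.append(k_c / k_e)
        deviations.append((k_c - k_e) / sigma)
        abs_diffs.append(abs(k_c - k_e))
    n = len(cases)
    return {
        "c_over_e": ratios,
        "dk_over_sigma": deviations,
        "chi2": sum(d * d for d in deviations) / n,
        "mean_abs_diff_pcm": sum(abs_diffs) / n * 1e5,
        "within_3sigma": sum(abs(d) <= 3 for d in deviations),
    }


# Hypothetical results for two benchmarks.
cases = [
    (1.00100, 0.00030, 1.00000, 0.00040),
    (0.99700, 0.00030, 1.00000, 0.00040),
]
m = benchmark_metrics(cases)
```

In this toy input the first case sits 2σ from the benchmark and the second sits 6σ away, so only one of the two counts as agreeing within the 3σ interval.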

Predicting Program Code Changes Using a CNN Model

  • 김동관
    • 한국융합학회논문지 / Vol. 12, No. 9 / pp.11-19 / 2021
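  • Over its life cycle, a software system requires program code changes for various reasons, such as adding features, fixing bugs, and accommodating new computing environments. Because such modifications can introduce new errors, modifying program code must be handled as carefully as developing a new system. Moreover, in a software development environment where reuse of open-source programs has become commonplace, being able to predict the likelihood of code changes in open-source programs could be expected to lead to higher-quality program development. This paper proposes a Convolutional Neural Network (CNN)-based deep learning model that predicts source code changes. Predicting source code changes is a binary classification problem in deep learning, and it uses supervised learning, which requires labeled data. Java source code and code change logs collected from GitHub are used as data for training and testing the code change prediction model. Software metrics are computed from the collected Java source code and then used as input data for the proposed code change prediction model. Precision, recall, F1-score, and accuracy were measured to evaluate the proposed model; on each of these evaluation metrics, the CNN model achieved 95% and the multilayer perceptron-based DNN model achieved 92%.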
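The four evaluation metrics reported in this abstract follow directly from the confusion matrix of a binary classifier. The sketch below assumes label 1 means "changed" and label 0 means "unchanged"; the labels and predictions are invented for illustration and do not come from the paper's data.

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, F1-score, and accuracy for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical labels: 1 = file changed, 0 = file unchanged.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
scores = binary_metrics(y_true, y_pred)
```

Because the four numbers weigh errors differently, a model can score the same on all of them only when false positives and false negatives are balanced, which is worth keeping in mind when reading a single headline percentage.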

Accelerated Resting-State Functional Magnetic Resonance Imaging Using Multiband Echo-Planar Imaging with Controlled Aliasing

  • Seo, Hyung Suk;Jang, Kyung Eun;Wang, Dingxin;Kim, In Seong;Chang, Yongmin
    • Investigative Magnetic Resonance Imaging / Vol. 21, No. 4 / pp.223-232 / 2017
  • Purpose: To report the use of multiband accelerated echo-planar imaging (EPI) for resting-state functional MRI (rs-fMRI) to achieve rapid high temporal resolution at 3T compared to conventional EPI. Materials and Methods: rs-fMRI data were acquired from 20 healthy right-handed volunteers by using three methods: conventional single-band gradient-echo EPI acquisition (Data 1), and multiband gradient-echo EPI acquisition with 240 volumes (Data 2) and 480 volumes (Data 3). Temporal signal-to-noise ratio (tSNR) maps were obtained by dividing the mean of the time course of each voxel by its temporal standard deviation. The resting-state sensorimotor network (SMN) and default mode network (DMN) were estimated using independent component analysis (ICA) and a seed-based method. One-way analysis of variance (ANOVA) was performed between the tSNR map, SMN, and DMN from the three data sets for between-group analysis. P < 0.05 with a family-wise error (FWE) correction for multiple comparisons was considered statistically significant. Results: One-way ANOVA and post-hoc two-sample t-tests showed that the tSNR was higher in Data 1 than in Data 2 and 3 in white matter structures such as the striatum and medial and superior longitudinal fasciculus. One-way ANOVA revealed no differences in SMN or DMN across the three data sets. Conclusion: Within the adapted metrics estimated under the specific imaging conditions employed in this study, multiband accelerated EPI, which substantially reduced scan times, provides functional connectivity images of the same quality as rs-fMRI using conventional EPI at 3T. Under the imaging conditions employed, this technique shows strong potential for clinical acceptance and translation of rs-fMRI protocols, with potential advantages in spatial and/or temporal resolution. However, further study is warranted to evaluate whether the current findings can be generalized to diverse settings.
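The tSNR definition given in Materials and Methods (voxel mean over time divided by its temporal standard deviation) translates into a short sketch. It assumes each voxel's time course is a plain list of signal values and uses the population standard deviation; the abstract does not say whether the sample or population form was used, and the voxel coordinates and values below are hypothetical.

```python
import math
import statistics


def tsnr(time_course):
    """Temporal SNR of one voxel: mean over time / standard deviation over time."""
    mean = statistics.fmean(time_course)
    sd = statistics.pstdev(time_course)  # population SD; an assumption
    return math.inf if sd == 0 else mean / sd


def tsnr_map(volume_series):
    """Apply tsnr to every voxel; volume_series maps voxel id -> time course."""
    return {voxel: tsnr(series) for voxel, series in volume_series.items()}


# Hypothetical time courses for two voxels over four volumes.
series = {
    (10, 12, 8): [100, 102, 98, 100],
    (10, 13, 8): [100, 110, 90, 100],
}
result = tsnr_map(series)
```

The second voxel has the same mean but five times the temporal fluctuation, so its tSNR is one fifth of the first voxel's.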