• Title/Abstract/Keywords: Classification Database

Novel Database Classification and Life Estimation Model for Accurate Database Asset Valuation

  • Youn-Soo Park;Ho-Hyun Park;Dong-Woon Jeon
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 28, No. 7
    • /
    • pp.131-143
    • /
    • 2023
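  • In the future knowledge society, the importance of business data is expected to grow, and data is increasingly regarded as a raw material with which companies manufacture products or develop services. As data grows in importance, research on assessing the economic value of database assets is also under way. However, existing studies have not sufficiently reflected the characteristics of database assets. This study therefore classifies database assets, in light of their characteristics, into profit-generating, non-profit-generating, and public-good database assets. In addition, noting that profit-generating database assets can be valued in a manner similar to conventional technology valuation, it develops a method for calculating the life of a database asset that incorporates the firm's risk-adjusted discount rate.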

Object Classification Method Using Dynamic Random Forests and Genetic Optimization

  • Kim, Jae Hyup;Kim, Hun Ki;Jang, Kyung Hyun;Lee, Jong Min;Moon, Young Shik
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 21, No. 5
    • /
    • pp.79-89
    • /
    • 2016
  • In this paper, we propose an object classification method using a genetic algorithm and a dynamic random forest composed of an optimal combination of unit trees. A random forest is an ensemble classification model that combines unit decision trees through bagging; by randomizing the training samples and the feature selection assigned to each decision tree, it can ensure good generalization performance when a large number of trees are combined. However, because a random forest is composed of randomly built unit trees, it shows excellent classification performance only when a sufficient number of trees are combined. There is no quantitative method for determining the number of trees, so there is no choice but to keep generating random trees repeatedly. The proposed algorithm composes the random forest from an optimal combination of trees while maintaining the generalization performance of the random forest. To achieve this, the problem of improving classification performance is cast as an optimization problem of finding the optimal tree combination, to which a genetic algorithm is applied. Experiments showed that, in specific cases such as a common database and the authors' own infrared database, the proposed algorithm improved classification performance by about 3~5% compared with the existing random forest. In addition, the optimal tree combination was determined at about 55~60% of the maximum number of trees.

The Relationship between Preoperative Wound Classification and Postoperative Infection: A Multi-Institutional Analysis of 15,289 Patients

  • Mioton, Lauren M.;Jordan, Sumanas W.;Hanwright, Philip J.;Bilimoria, Karl Y.;Kim, John Y.S.
    • Archives of Plastic Surgery
    • /
    • Vol. 40, No. 5
    • /
    • pp.522-529
    • /
    • 2013
  • Background: Despite advances in surgical techniques, sterile protocols, and perioperative antibiotic regimens, surgical site infections (SSIs) remain a significant problem. We investigated the relationship between wound classification (i.e., clean, clean/contaminated, contaminated, dirty) and SSI rates in plastic surgery. Methods: We performed a retrospective review of a multi-institutional, surgical outcomes database for all patients undergoing plastic surgery procedures from 2006-2010. Patient demographics, wound classification, and 30-day outcomes were recorded and analyzed by multivariate logistic regression. Results: A total of 15,289 plastic surgery cases were analyzed. The overall SSI rate was 3.00%, with superficial SSIs occurring at comparable rates across wound classes. There were similar rates of deep SSIs in the clean and clean/contaminated groups (0.64%), while rates reached over 2% in contaminated and dirty cases. Organ/space SSIs occurred in less than 1% of each wound classification. Contaminated and dirty cases were at an increased risk for deep SSIs (odds ratios, 2.81 and 2.74, respectively); however, wound classification did not appear to be a significant predictor of superficial or organ/space SSIs. Clean/contaminated, contaminated, and dirty cases were at increased risk for a postoperative complication, and contaminated and dirty cases also had higher odds of reoperation and 30-day mortality. Conclusions: Analyzing a multi-center database, we found that wound classification was a significant predictor of overall complications, reoperation, and mortality, but not an adequate predictor of surgical site infections. When comparing infections for a given wound classification, plastic surgery had lower overall rates than the surgical population at large.

Gabor 웨이블릿을 이용한 회전 변화에 무관한 질감 분류 기법 (Rotation-Invariant Texture Classification Using Gabor Wavelet)

  • 김원희;윤청파;문광석;김종남
    • 한국멀티미디어학회논문지
    • /
    • Vol. 10, No. 9
    • /
    • pp.1125-1134
    • /
    • 2007
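  • This paper proposes a rotation-invariant texture classification method using the Gabor wavelet. Conventional methods showed a low correct classification rate on large texture databases. The proposed method defines a global feature vector and a local feature matrix on Gabor-wavelet-filtered images. Using these two rotation-invariant feature groups, it defines an improved similarity-measurement discriminant, and experiments on a large texture database yielded an improved correct classification rate. In addition, exploiting the symmetry of the texture image spectrum reduced the number of experiments by nearly 50% compared with conventional methods. In conclusion, on 112 Brodatz texture classes, the correct classification rate improved by 2.3% to 15.6%, depending on the method compared against.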

내진설계기준의 지반분류체계 및 설계응답스펙트럼 개선을 위한 연구 - (I) 데이터베이스 및 지반응답해석 (Site Classification and Design Response Spectra for Seismic Code Provisions - (I) Database and Site Response Analyses)

  • 조형익;;김동수
    • 한국지진공학회논문집
    • /
    • Vol. 20, No. 4
    • /
    • pp.235-243
    • /
    • 2016
  • Korea is part of a region of low to moderate seismicity located inside the Eurasian plate, with bedrock located at depths of less than 30 m. However, the spectral acceleration obtained from site response analyses based on the geologic conditions of inland areas of the Korean peninsula is significantly different from that of the current Korean seismic code. Therefore, a suitable site classification scheme and design response spectra based on local site conditions in the Korean peninsula are required to produce reliable estimates of earthquake ground motion. In this study, site-specific response analyses were performed at more than 300 sites, with at least 100 sites in each of the site categories $S_C$, $S_D$, and $S_E$ as defined in the current seismic code in Korea. The process of creating a large database of input parameters for site response analyses (shear wave velocity profiles, normalized shear modulus reduction curves, damping curves, and input earthquake motions) is described. The response spectra and site coefficients obtained from site response analyses were compared with those proposed for the site categories in the current code. Problems with the current seismic design code are then discussed; the development and verification of a new site classification system and corresponding design response spectra are detailed in companion papers (II: development of new site categories and design response spectra, and III: verifications).

MFCC 특징 벡터를 이용한 수중 천이 신호 식별 (Classification of Underwater Transient Signals Using MFCC Feature Vector)

  • 임태균;황찬식;이형욱;배건성
    • 한국통신학회논문지
    • /
    • Vol. 32, No. 8C
    • /
    • pp.675-680
    • /
    • 2007
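  • Transient signal classification is generally an active research area in seismology and condition monitoring, and particularly in underwater acoustic signal processing. Transient signals occurring in the underwater environment include those produced by marine life such as dolphins and man-made ones generated by ships, submarines, and the like; identifying such underwater transient signals is a very important research topic for underwater surveillance systems. This paper proposes a frame-based classification algorithm built on MFCC (Mel Frequency Cepstral Coefficient), which shows excellent performance in speech recognition. For an input signal detected as a transient, an MFCC feature vector is extracted for each analysis frame, and its Euclidean distance to the MFCC feature vectors of all reference signals in the target database is computed. Each input frame is then mapped to the reference signal with the smallest distance, and the detected underwater transient signal is identified as the reference signal to which the most frames are mapped.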
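
The frame-based voting described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: MFCC extraction is assumed to have been done already, the feature vectors are made-up two-dimensional toy values, and the names `classify_transient`, `dolphin_click`, and `ship_noise` are hypothetical. The distance from a frame to a reference signal is taken here as the minimum over that reference's frames, which is also an assumption.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_transient(input_frames, references):
    """Map each input frame to its nearest reference signal, then vote.

    input_frames: list of MFCC vectors, one per analysis frame.
    references:   dict mapping a reference name to its list of MFCC vectors.
    Returns (winning reference name, Counter of per-reference votes).
    """
    votes = Counter()
    for frame in input_frames:
        # Assumption: frame-to-reference distance is the minimum distance
        # to any frame of that reference signal.
        nearest = min(
            references,
            key=lambda name: min(euclidean(frame, ref) for ref in references[name]),
        )
        votes[nearest] += 1
    # The reference that collected the most frames labels the whole signal.
    return votes.most_common(1)[0][0], votes


# Toy data (hypothetical values standing in for real MFCC vectors).
references = {
    "dolphin_click": [[1.0, 0.0], [0.9, 0.1]],
    "ship_noise": [[0.0, 1.0], [0.1, 0.9]],
}
input_frames = [[0.95, 0.05], [0.8, 0.2], [0.2, 0.8]]

label, votes = classify_transient(input_frames, references)
print(label, dict(votes))
```

Voting per frame rather than comparing whole signals means a few badly matched frames do not decide the outcome on their own.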

An Active Co-Training Algorithm for Biomedical Named-Entity Recognition

  • Munkhdalai, Tsendsuren;Li, Meijing;Yun, Unil;Namsrai, Oyun-Erdene;Ryu, Keun Ho
    • Journal of Information Processing Systems
    • /
    • Vol. 8, No. 4
    • /
    • pp.575-588
    • /
    • 2012
  • Exploiting unlabeled text data with a relatively small labeled corpus has been an active and challenging research topic in text mining, due to the recent growth of the amount of biomedical literature. Biomedical named-entity recognition is an essential prerequisite task before effective text mining of biomedical literature can begin. This paper proposes an Active Co-Training (ACT) algorithm for biomedical named-entity recognition. ACT is a semi-supervised learning method in which two classifiers based on two different feature sets iteratively learn from informative examples that have been queried from the unlabeled data. We design a new classification problem to measure the informativeness of an example in unlabeled data. In this classification problem, the examples are classified based on a joint view of a feature set to be informative/non-informative to both classifiers. To form the training data for the classification problem, we adopt a query-by-committee method. Therefore, in the ACT, both classifiers are considered to be one committee, which is used on the labeled data to give the informativeness label to each example. The ACT method outperforms the traditional co-training algorithm in terms of f-measure as well as the number of training iterations performed to build a good classification model. The proposed method tends to efficiently exploit a large amount of unlabeled data by selecting a small number of examples having not only useful information but also a comprehensive pattern.

An Automatic Diagnosis System for Hepatitis Diseases Based on Genetic Wavelet Kernel Extreme Learning Machine

  • Avci, Derya
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 11, No. 4
    • /
    • pp.993-1002
    • /
    • 2016
  • Hepatitis is a major public health problem all around the world. This paper proposes an automatic disease diagnosis system for hepatitis based on Genetic Algorithm (GA) Wavelet Kernel (WK) Extreme Learning Machines (ELM). The classifier used in this paper is a single layer neural network (SLNN) trained by the ELM learning method. The hepatitis disease datasets are obtained from the UCI machine learning database. The Wavelet Kernel Extreme Learning Machine (WK-ELM) structure has three adjustable wavelet kernel parameters. These parameters and the number of hidden neurons play a major role in the performance of ELM, so their values should be tuned carefully for the problem being solved. In this study, the optimum values of these parameters and the number of hidden neurons of ELM were obtained by using a Genetic Algorithm (GA). The performance of the proposed GA-WK-ELM method is evaluated using statistical methods such as classification accuracy, sensitivity and specificity analysis, and ROC curves. The results of the proposed GA-WK-ELM method are compared with the results of previous hepatitis disease studies using the same database as well as different databases. In previous studies, high classification accuracies were obtained when the feature vector was reduced to a low dimension. However, the proposed GA-WK-ELM method gives satisfactory results without reducing the feature vector. The highest classification accuracy calculated for the proposed GA-WK-ELM method is 96.642%.

데이터마이닝 기법(CHAID)을 이용한 효과적인 데이터베이스 마케팅에 관한 연구 (A Study on the Effective Database Marketing using Data Mining Technique(CHAID))

  • 김신곤
    • 정보기술과데이타베이스저널
    • /
    • Vol. 6, No. 1
    • /
    • pp.89-101
    • /
    • 1999
  • An increasing number of companies recognize that understanding customers and their markets is indispensable for survival and business success. Companies are rapidly increasing their investment in developing customer databases, which are the basis for database marketing activities. Database marketing is closely related to data mining. Data mining is the non-trivial extraction of implicit, previously unknown, and potentially useful knowledge or patterns from large data. Data mining applied to database marketing can contribute greatly to reinforcing a company's competitiveness and sustainable competitive advantages. This paper develops a classification model to select the most responsive customers from the customer databases for a telemarketing system and evaluates the performance of the developed model using the LIFT measure. The model employs the decision tree algorithm CHAID, one of the well-known data mining techniques. This paper also presents an effective database marketing strategy by applying the data mining technique to a credit card company's telemarketing system.
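
As a rough illustration of the LIFT measure mentioned in this abstract, the sketch below uses the common definition of lift: the response rate among the top-scored fraction of customers divided by the overall response rate. The abstract does not state the exact formula the paper uses, so this definition, the function name `lift_at`, and the toy scores are all assumptions.

```python
def lift_at(scores, responded, fraction):
    """Lift of the top `fraction` of customers ranked by model score.

    scores:    model scores, higher meaning more likely to respond.
    responded: 1 if the customer actually responded, else 0.
    fraction:  share of customers to contact, e.g. 0.2 for the top 20%.
    """
    n = len(scores)
    k = max(1, int(n * fraction))  # number of customers in the top group
    base_rate = sum(responded) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when nobody responded")
    # Rank customers from highest to lowest score and keep the top k.
    ranked = sorted(zip(scores, responded), key=lambda p: p[0], reverse=True)
    top_rate = sum(r for _, r in ranked[:k]) / k
    return top_rate / base_rate


# Toy data (hypothetical): 10 customers, 4 of whom responded.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]

lift = lift_at(scores, responded, 0.2)
print(lift)
```

A lift above 1 for the top group means that contacting the model's highest-scored customers yields more responses per call than contacting customers at random.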

실시간 서지데이터베이스 평가방법에 관한 연구 (A Study on Real-time Quality Evaluation Method of Bibliographic Database)

  • 노경란;권오진;유현종;문영호;홍성화
    • 한국콘텐츠학회논문지
    • /
    • Vol. 2, No. 4
    • /
    • pp.76-84
    • /
    • 2002
  • The conventional database evaluation method relies on the person in charge of each specialty database (the DB manager) manually composing evaluation sheets for correction and revision of the already-constructed database, and then measuring and re-educating DB workers on that basis. As a result, much time is consumed on career information and measurement work about DB workers, which lowers time and cost efficiency, prevents systematic management of DB workers, and hinders database quality improvement. This research provides online, real-time measurement results on the efficiency of DB production and DB workers by combining static measurement with dynamic measurement by the DB manager, both of which utilize the System. Therefore, the DB manager can contribute to the improvement of DB quality by deciding whether DB workers should continue DB production or by carrying out their re-education, without being affected by time or spatial constraints.
