• Title/Abstract/Keywords: decision tree


Identification of Tea Diseases Based on Spectral Reflectance and Machine Learning

  • Zou, Xiuguo;Ren, Qiaomu;Cao, Hongyi;Qian, Yan;Zhang, Shuaitang
    • Journal of Information Processing Systems
    • /
    • Vol. 16, No. 2
    • /
    • pp.435-446
    • /
    • 2020
  • With the ability to learn rules from training data, a machine learning model can classify unknown objects. At the same time, the dimension of hyperspectral data is usually large, which may cause over-fitting. In this research, an identification methodology for tea diseases was proposed based on spectral reflectance and machine learning, comprising a feature selector based on the decision tree and a tea disease recognizer based on random forest. The proposed methodology was evaluated through experiments. The experimental results showed that the proposed methodology significantly improved the identification accuracy, the recall rate, and the F1 score for tea disease, with average values of 15%, 7%, and 11%, respectively. Therefore, the proposed methodology can make relatively better feature selections and learn from high-dimensional data so as to achieve non-destructive and efficient identification of different tea diseases. This research provides a new idea for the feature selection of high-dimensional data and the non-destructive identification of crop diseases.

Design of a GIS-Based Distribution System with Service Consideration

  • 황흥석;조규성
    • 경영과학
    • /
    • Vol. 18, No. 2
    • /
    • pp.125-134
    • /
    • 2001
  • This paper is concerned with the development of a GIS-based distribution system with service consideration. The proposed model can be used for a wide range of logistics applications, for planning, engineering, and operational purposes in logistics systems. This research formulates the complex problems of a two-echelon logistics system in order to plan supply center locations and distribution jointly, based on GIS. We propose an integrated logistics model for determining the optimal patterns of supply centers and inventory allocations (customers) with a three-step sequential approach: 1) developing a GIS-distance model and a stochastic set-covering program to determine the optimal pattern of supply center locations; 2) optimal sector clustering to support customers; and 3) optimal vehicle route scheduling based on GIS (GIS-VRP). In this research we developed a GUI-tree program; the GIS-VRP provides vehicle and freight information to users in real time. We applied a set of sample examples to this model and demonstrated sample results. The examples show that the proposed model is potentially efficient and useful in solving multi-depot problems. Moreover, the proposed model can help logistics decision makers obtain the best supply schedule.


Social Network Analysis to Analyze the Purchase Behavior of Churning Customers and Loyal Customers

  • 김재경;최일영;김혜경;김남희
    • 경영과학
    • /
    • Vol. 26, No. 1
    • /
    • pp.183-196
    • /
    • 2009
  • Customer retention has been a pressing issue for companies seeking to win and keep loyal customers in a competitive environment. Many researchers have sought to identify the characteristics of churning customers and loyal customers using data mining techniques such as decision trees. However, such existing studies do not consider relationships among customers. Social network analysis has been used to examine relationships among entities in genetic networks, traffic networks, organization networks, and so on. In this study, a customer network is proposed to investigate the differences in network characteristics between churning customers and loyal customers. The customer networks are constructed by analyzing real purchase data collected from a Korean cosmetics provider. We investigated whether the customer networks of churning customers and loyal customers have different degree centralities and densities. In addition, we compared the products purchased by churning customers with those purchased by loyal customers. Our data analysis results indicate that the degree centrality and density of the churning customer network are higher than those of the loyal customer network, and that churning customers purchase a wider variety of products than loyal customers. We expect that the suggested social network analysis can be used as a complementary methodology alongside existing statistical analysis and data mining analysis.
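
The two network measures named in the abstract above, degree centrality and density, can be illustrated with a minimal sketch. The abstract does not say how customers are linked, so the rule used here (two customers are connected when they bought at least one product in common) is an assumption made only for illustration, and the customer IDs and products are hypothetical toy data.

```python
from itertools import combinations


def build_customer_network(purchases):
    """Return undirected edges between customers who share a product.

    The linking rule is an illustrative assumption, not the paper's method.
    """
    edges = set()
    for a, b in combinations(sorted(purchases), 2):
        if purchases[a] & purchases[b]:  # non-empty intersection of product sets
            edges.add((a, b))
    return edges


def degree_centrality(nodes, edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(nodes)
    degree = {v: 0 for v in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return {v: (d / (n - 1) if n > 1 else 0.0) for v, d in degree.items()}


def density(nodes, edges):
    """Observed edges divided by the n * (n - 1) / 2 possible edges."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


# Hypothetical toy purchase data: customer -> set of purchased products.
purchases = {
    "c1": {"toner", "cream"},
    "c2": {"cream"},
    "c3": {"lipstick"},
    "c4": {"toner", "lipstick"},
}
nodes = list(purchases)
edges = build_customer_network(purchases)
centrality = degree_centrality(nodes, edges)
network_density = density(nodes, edges)
```

Computing these two quantities separately for a churning-customer network and a loyal-customer network would give the kind of comparison the abstract reports.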

Identify Major Gene-Gene Interaction Effects Using SNPHarvester

  • 이제영;김동철
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 6
    • /
    • pp.915-923
    • /
    • 2009
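  • Genome-wide association studies have searched for genes related to human diseases among a vast number of genes. Applying existing methods for finding disease-related genes directly to so many genes has the drawbacks of complex computation, high cost, and long running time. SNPHarvester was therefore developed as a method for finding major gene groups among numerous genes. In this study, rather than human diseases, we considered several economic traits of Hanwoo (Korean cattle): we used SNPHarvester to find superior gene groups among 17 SNPs, and we used a decision tree to identify the superior genotypes within the SNP groups that can improve those economic traits.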

A Case Study of Economic Analysis on R&D Investment: Focusing on the Development of a Superconducting Fault Current Limiter

  • 조현춘;김재천;박상덕
    • 기술혁신연구
    • /
    • Vol. 6, No. 2
    • /
    • pp.159-177
    • /
    • 1998
  • Although each company is trying to develop an economic analysis model with its own particular style or format, an appropriate method has not yet been developed because many problems remain to be solved, such as the uncertainty of outcomes and the intangible benefits of technology. The purpose of this paper is therefore to suggest an economic analysis methodology that reflects the complexity and risk of R&D investment, through a case study on the development of a superconductor fault current limiter. A self-developed Monte Carlo simulation program, used as the main tool in this paper, was very useful for the risk analysis of R&D investment, which could not be handled by the previous DCF (Discounted Cash Flow) model. We also introduce a learning effect to account for intangible benefits such as the know-how obtained from R&D execution. The expected value of an R&D investment and its probability distribution can be obtained by combining the Monte Carlo method with the decision tree approach. This result is helpful in judging the priority of R&D projects and allocating resources among them. To improve the reliability of economic analysis on R&D projects, however, it is necessary to develop a more precise model for quantifying the technology stock, as well as a simulation program that uses continuous probability distributions for expected values.
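
Combining Monte Carlo sampling with a decision tree, as the abstract above describes, can be sketched as follows. This is not the authors' program: the tree shape (one decision node, one chance node), the R&D cost, the success probability, and the triangular benefit distribution are all hypothetical values chosen only to show the mechanics.

```python
import random


def simulate_rd_project(n_trials=20_000, seed=0):
    """Sample net outcomes of the 'invest' branch of a tiny decision tree.

    All numeric parameters below are hypothetical illustration values.
    """
    rng = random.Random(seed)
    rd_cost = 10.0   # assumed up-front R&D cost
    p_success = 0.6  # assumed probability of technical success
    outcomes = []
    for _ in range(n_trials):
        if rng.random() < p_success:
            # Chance node "success": uncertain discounted benefit,
            # drawn from a triangular distribution (low, high, mode).
            benefit = rng.triangular(5.0, 60.0, 25.0)
            outcomes.append(benefit - rd_cost)
        else:
            # Chance node "failure": the R&D cost is lost.
            outcomes.append(-rd_cost)
    return outcomes


outcomes = simulate_rd_project()
expected_value = sum(outcomes) / len(outcomes)
prob_loss = sum(o < 0 for o in outcomes) / len(outcomes)
# Decision node: invest only if it beats the "do nothing" branch (value 0).
invest = expected_value > 0
```

The list `outcomes` approximates the probability distribution of the investment's value, so risk measures such as `prob_loss` come from the same run as the expected value, which a single-point DCF estimate does not provide.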


A Study on Integration of Semantic Topic Based Knowledge Model

  • 전승수;이상진;배상태
    • 한국정보과학회: Conference Proceedings
    • /
    • 한국정보과학회 Korea Computer Congress 2012 Proceedings, Vol. 39, No. 1(B)
    • /
    • pp.181-183
    • /
    • 2012
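  • Recently, methods have been proposed for efficiently generating and analyzing semantic knowledge models using natural and formal language processing, artificial intelligence algorithms, and related techniques. Such semantic knowledge models are used for efficient decision-making trees and for the systematic analysis of problem-solving paths in specific situations. In particular, in the analysis of various complex systems and social networks, they form the basis of simulation models that support static indicator generation and regression analysis, trend analysis through behavioral models, and macro-level forecasting. For the integration of such semantic knowledge models, this study presents a method and a formal algorithm for integrating topic models derived through text mining. To this end, we first describe how to convert the keyword maps derived through text mining into equivalent knowledge maps and how to integrate these into a semantic knowledge model. We also propose a method for projecting a meaningful topic map from a keyword map and an algorithm for deriving a semantically equivalent model. The integrated semantic knowledge model supports relational semantic analysis, including structural rules among topics, degree centrality, closeness centrality, and betweenness centrality, and can serve as a practical foundation for the semantic analysis and use of large-scale unstructured documents.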

A Study on Potential Flood Damage Classification and Characteristic Analysis

  • 김수진;은상규;김성필;배승종
    • 농촌계획
    • /
    • Vol. 23, No. 3
    • /
    • pp.21-36
    • /
    • 2017
  • Climate change is intensifying storms and floods around the world. Where nature has been destroyed by development, communities are at risk from these intensified climate patterns. This study suggests a methodology for estimating flood vulnerability using the Potential Flood Damage (PFD) concept and classifies cities and counties by PFD using various typology techniques. To evaluate the PFD at the spatial resolution of city/county units, 20 representative evaluation index factors were carefully selected across three categories: damage target (FDT), damage potential (FDP), and prevention ability (FPA). The three flood vulnerability indices FDT, FDP, and FPA were applied to 167 cities and counties in Korea for the pattern classification of potential flood damage. PFD was classified using grouping analysis, decision tree analysis, and cluster analysis, and the characteristics of each type were analyzed. The suggested PFD is expected to serve as a useful flood vulnerability index for more rational and practical risk management plans against flood damage.

Efficient Acoustic Echo Cancellation System for Distant-Talking Automatic Speech Recognition

  • 김기범;김상윤;이우정;권민석;고병섭
    • 한국소음진동공학회: Conference Proceedings
    • /
    • 한국소음진동공학회 2014 Autumn Conference Proceedings
    • /
    • pp.150-155
    • /
    • 2014
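  • This paper proposes a fast and efficient echo cancellation system based on subband filtering for distant-talking speech recognition. The proposed system first applies spatial decorrelation to prevent the adaptive filter from malfunctioning when the inter-channel correlation is high. It then adopts a subband structure based on a tree-structured IIR filterbank, so that effective analysis and synthesis filtering can be performed with a low filter order. The spectral aliasing between subbands that inevitably arises in this process can be resolved by applying a notch filter. For the adaptive filter, the improved proportionate normalized least-mean-square (IP-NLMS) algorithm is used, and it was confirmed to be superior in convergence speed and echo cancellation performance. Finally, a residual echo suppressor based on decision-directed estimation is applied to remove the residual echo. The paper introduces the theoretical background behind each stage and demonstrates the effectiveness of the proposed echo cancellation system in an environment with real echo, in terms of ERLE, distant-talking speech recognition rate, and computational complexity.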
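
The adaptive-filter stage mentioned in the abstract above can be illustrated with a minimal sketch. Note that this is the plain full-band NLMS update, not the IP-NLMS variant or the subband structure the paper uses; the echo path, signal length, step size `mu`, and all function names are hypothetical choices for illustration.

```python
import random


def nlms_echo_canceller(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Plain NLMS: adapt FIR weights so the filtered far-end signal tracks the mic echo."""
    w = [0.0] * taps    # current estimate of the echo path
    buf = [0.0] * taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))  # estimated echo
        e = d - y                                   # echo-cancelled output
        norm = sum(xi * xi for xi in buf) + eps     # input power normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        errors.append(e)
    return w, errors


# Hypothetical noise-free test: a short FIR echo path driven by white noise.
rng = random.Random(0)
echo_path = [0.6, -0.3, 0.1, 0.05]
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]
w, errors = nlms_echo_canceller(far_end, mic)
```

In this idealized setting the weights `w` converge toward `echo_path` and the residual `errors` decay toward zero. Proportionate variants such as IP-NLMS differ by giving each tap its own step-size gain, which this sketch does not implement.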


Fast Quadtree Structure Decision for HEVC Intra Coding Using Histogram Statistics

  • Li, Yuchen;Liu, Yitong;Yang, Hongwen;Yang, Dacheng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 9, No. 5
    • /
    • pp.1825-1839
    • /
    • 2015
  • The final draft of the latest video coding standard, High Efficiency Video Coding (HEVC), was approved in January 2013. The coding efficiency of HEVC surpasses its predecessor, H.264/MPEG-4 Advanced Video Coding (AVC), by using only half of the bitrate to encode the same sequence with similar quality. However, the complexity of HEVC is sharply increased compared to H.264/AVC. In this paper, a method is proposed to decrease the complexity of intra coding in HEVC. Early pruning and an early splitting strategy are applied to the quadtree structure of coding tree units (CTU) and residual quadtree (RQT). According to our experiment, when our method is applied to sequences from Class A to Class E, the coding time is decreased by 44% at the cost of a 1.08% Bjontegaard delta rate (BD-rate) increase on average.

A Novel Feature Selection Method in the Categorization of Imbalanced Textual Data

  • Pouramini, Jafar;Minaei-Bidgoli, Behrouze;Esmaeili, Mahdi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 12, No. 8
    • /
    • pp.3725-3748
    • /
    • 2018
  • Text data distribution is often imbalanced. Imbalanced data is one of the challenges in text classification, as it degrades the performance of classifiers. Many studies have been conducted so far in this regard. The proposed solutions fall into several general categories, including sampling-based and algorithm-based methods. In recent studies, feature selection has also been considered as one of the solutions to the imbalance problem. In this paper, a novel one-sided feature selection method known as probabilistic feature selection (PFS) is presented for imbalanced text classification. The PFS is a probabilistic method that is calculated using the feature distribution. Compared to similar methods, the PFS has more parameters. To evaluate the performance of the proposed method, the feature selection methods Gini, MI, FAST, and DFS were implemented for comparison. To assess the proposed method, classifiers such as the C4.5 decision tree and Naive Bayes were used. The results of tests on Reuters-21578 and WebKB, measured by F-measure, suggest that the proposed feature selection significantly improves the performance of the classifiers.