• Title/Summary/Keyword: balanced tree


A Study on Enhancement of Digital Image Performance Using Dual Tree Wavelet Transformation in Non-separable Image Processing (비분리 영상처리에서 이중 트리 웨이브렛 변환을 사용한 디지털 영상 성능 개선에 관한 연구)

  • Lim, Joong-Hee;Jee, Inn-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.1 / pp.65-74 / 2012
  • In this paper, we explore the application of the 2-D dual-tree discrete wavelet transform (DDWT), a directional and redundant transform, to image coding. The DDWT introduces limited redundancy and provides approximate shift invariance and directionally selective filters while preserving the usual properties of perfect reconstruction and computational efficiency, with well-balanced frequency responses. In addition, the quincunx lattice yields a non-separable 2-D wavelet transform that is symmetric in both the horizontal and vertical directions, and the non-separable wavelet transform can generate sub-images that are versions rotated by multiple angles. The proposed 2-D non-separable DDWT can efficiently approximate directional image features, such as edges and contours that are not aligned with the horizontal or vertical direction. Finally, non-separable image processing using the DDWT achieves good performance.

Classification of Class-Imbalanced Data: Effect of Over-sampling and Under-sampling of Training Data (계급불균형자료의 분류: 훈련표본 구성방법에 따른 효과)

  • 김지현;정종빈
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.445-457 / 2004
  • Given class-imbalanced data in a two-class classification problem, we often over-sample and/or under-sample the training data to make it balanced. We investigate the validity of this practice and study its effect on boosting of classification trees. Experiments on twelve real datasets show that keeping the natural distribution of the training data is the best approach when boosting methods are to be applied to class-imbalanced data.
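The two sampling practices examined in this abstract can be sketched as follows. This is a minimal illustration that assumes plain random sampling; the abstract does not say which sampling variants the authors used, and the function names `random_oversample` and `random_undersample` are hypothetical.

```python
import random
from collections import Counter


def _group_by_class(records, labels):
    """Collect records into a dict keyed by class label."""
    by_class = {}
    for rec, lab in zip(records, labels):
        by_class.setdefault(lab, []).append(rec)
    return by_class


def random_oversample(records, labels, seed=0):
    """Duplicate randomly chosen records of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = _group_by_class(records, labels)
    target = max(len(recs) for recs in by_class.values())
    out_records, out_labels = [], []
    for lab, recs in by_class.items():
        extra = [rng.choice(recs) for _ in range(target - len(recs))]
        out_records.extend(recs + extra)
        out_labels.extend([lab] * target)
    return out_records, out_labels


def random_undersample(records, labels, seed=0):
    """Randomly drop records of larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = _group_by_class(records, labels)
    target = min(len(recs) for recs in by_class.values())
    out_records, out_labels = [], []
    for lab, recs in by_class.items():
        out_records.extend(rng.sample(recs, target))
        out_labels.extend([lab] * target)
    return out_records, out_labels


# Toy data: 8 majority records (class 0) and 2 minority records (class 1).
records = list(range(10))
labels = [0] * 8 + [1] * 2
over_records, over_labels = random_oversample(records, labels)
under_records, under_labels = random_undersample(records, labels)
print(Counter(over_labels), Counter(under_labels))
```

Keeping `records` and `labels` unchanged corresponds to the "natural distribution" option that the abstract reports as the best choice when boosting is applied.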

Characterization of palm oil and its utilization in food industry (팜기름의 특성 및 식품산업에의 이용)

  • Yoon, Suk Hoo
    • Food Science and Industry / v.50 no.3 / pp.70-92 / 2017
  • Crude palm oil (CPO) is obtained from the fruit of the oil palm tree and is rich in palmitic acid, ${\beta}$-carotene, and vitamin E. CPO, which contains a balanced range of saturated and unsaturated fatty acids, is fractionated mainly into liquid palm olein and solid palm stearin. Palm oil is highly stable during frying because of its fatty acid composition and the synergistic antioxidant activity of ${\beta}$-carotene and tocotrienol. Blending and interesterification of palm oil with other oils are the main processes used to obtain functional, nutritional, and technical advantages in producing oils suitable for margarine, shortening, vanaspati, frying oils, etc. The advantages of using palm oil products include cheap raw materials, good availability, and low processing cost, since hydrogenation is not necessary. Future research should lead to the production of oils with a higher oleic acid content and a higher content of vitamin E, carotenoids, and tocotrienols.

Studies on the Estimation of Annual Tree Volume Growth for the Use as Basic Data on the Plan of Timber Supply and Demand in Korea - The Sub-sampling Oriented - (우리나라 목재수급계획(木材需給計劃)의 기초자료(基礎資料)로 활용(活用)키 위한 연간(年間) 임목성장량(林木成長量)의 추정(推定)에 관한 연구(硏究) - 부차추출법(副次抽出法)을 중심(中心)으로 -)

  • Lee, Jong Lak
    • Journal of Korean Society of Forest Science / v.61 no.1 / pp.37-44 / 1983
  • This study estimated total annual volume growth by measuring mean tree growth over the last 10 years. The surveyed forest stand was the second block (20.80 ha) of the Kyung Hee University Forests, located at San 58 and 64, Gaegok-Ri, Gapyung-Yeup, Gapyung-Goon, Kyunggi province, Korea. The stand was mainly composed of uneven-aged Pinus densiflora, and tree volume was estimated by taking cores at the D.B.H. of sample trees selected by sub-sampling. The results obtained were as follows: 1) The regression between the diameter ($D$) and the diameter growth ($\hat{I}$) was $\hat{I}=0.5499+0.0101D$. 2) The estimated equation of the confidence interval for the diameter growth was $S^2_{\hat{I}}=0.00817(0.09538-0.00952D+0.00027D^2)$. 3) The equation for estimating tree height ($H$) from diameter was $H=1.32376D^{0.77958}$. 4) The equation for estimating tree volume ($V$) from diameter and height was $V=0.0000622D^{1.6918}H^{1.1397}$. 5) Total annual tree volume growth was $5.4041\ m^3/ha$, and ranged from $5.1984$ to $5.6131\ m^3/ha$. 6) The annual growth rate of total tree volume and its error were 8.8% and 3.9%, respectively. The annual volume growth per tree for any district can be estimated by this method, and the annual volume growth can be predicted successfully. Because of the poor forest growing stock in Korea, the annual allowable cut should not exceed the annual tree volume growth, both for better forest management and for a balance between timber supply and demand in Korea. Any shortfall relative to demand would be covered by imported timber. Such plans enable the Korean Government to develop a better policy for forest resource management.
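The three fitted equations reported in this abstract chain together naturally: diameter gives height, and diameter plus height gives volume. The sketch below only evaluates the published coefficients. The units (D in cm, H in m, V in m³) are an assumption, since the abstract does not state them, and the function names are hypothetical.

```python
def diameter_growth(d):
    """Regression of diameter growth on diameter: I = 0.5499 + 0.0101 * D."""
    return 0.5499 + 0.0101 * d


def tree_height(d):
    """Height from diameter: H = 1.32376 * D ** 0.77958."""
    return 1.32376 * d ** 0.77958


def tree_volume(d, h):
    """Volume from diameter and height: V = 0.0000622 * D ** 1.6918 * H ** 1.1397."""
    return 0.0000622 * d ** 1.6918 * h ** 1.1397


def volume_from_diameter(d):
    """Chain the height and volume equations so only the diameter is needed."""
    return tree_volume(d, tree_height(d))


# Example tree with an assumed diameter of 20 (units assumed to be cm).
d = 20.0
print(f"growth={diameter_growth(d):.4f}  height={tree_height(d):.2f}  "
      f"volume={volume_from_diameter(d):.4f}")
```

Because both exponents in the volume equation are positive, the estimated volume increases monotonically with diameter, which is a quick sanity check on any transcription of the coefficients.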


A Hybrid Multi-Level Feature Selection Framework for prediction of Chronic Disease

  • G.S. Raghavendra;Shanthi Mahesh;M.V.P. Chandrasekhara Rao
    • International Journal of Computer Science & Network Security / v.23 no.12 / pp.101-106 / 2023
  • Chronic illnesses are among the most common serious problems affecting human health. Early diagnosis of chronic diseases can help to avoid or mitigate their consequences, potentially decreasing mortality rates. Using machine learning algorithms to identify risk factors is a promising strategy. The issue with existing feature selection approaches is that each method yields a distinct set of features, which affects model accuracy, and present methods do not perform well on huge multidimensional datasets. We introduce a novel model with a feature selection approach that selects optimal features from big multidimensional datasets to provide reliable predictions of chronic illnesses without sacrificing data uniqueness.[1] To ensure the success of the proposed model, we balanced the classes by applying hybrid balanced class sampling methods to the original dataset, together with data pre-processing and data transformation methods, to provide credible data for training the model. We ran and assessed our model on datasets with binary and multivalued classifications, using multiple datasets (Parkinson, arrhythmia, breast cancer, kidney, diabetes). Suitable features are selected by a hybrid feature model consisting of LassoCV, decision tree, random forest, gradient boosting, AdaBoost, and stochastic gradient descent, followed by voting on the attributes that are common outputs of these methods. The accuracy on the original dataset before applying the framework is recorded and evaluated against the accuracy on the reduced attribute set. The results are shown separately to provide comparisons. Based on the result analysis, we conclude that our proposed model produced higher accuracy on multi-valued class datasets than on binary class datasets.[1]
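The voting step described in this abstract, where several selectors each propose features and the common ones are kept, might look like the sketch below. It is a generic vote counter, not the authors' implementation: the function `vote_features`, the `min_votes` parameter, and the example feature names are all hypothetical, and the per-selector outputs are given as plain lists rather than computed by real models.

```python
from collections import Counter


def vote_features(selections, min_votes=None):
    """Return features chosen by at least `min_votes` selectors, sorted by name.

    selections: dict mapping a selector name to the features it picked.
    min_votes:  defaults to the number of selectors, i.e. keep only the
                features that every selector agrees on.
    """
    if min_votes is None:
        min_votes = len(selections)
    # set(...) ensures a selector cannot vote twice for the same feature.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical outputs of three selectors on a made-up dataset.
selections = {
    "lasso": ["age", "bp", "glucose"],
    "decision_tree": ["bp", "glucose"],
    "random_forest": ["glucose", "bmi", "bp"],
}
common = vote_features(selections)               # unanimous features only
majority = vote_features(selections, min_votes=2)
print(common, majority)
```

Lowering `min_votes` trades strict agreement for a larger feature set; the abstract's wording ("common output") suggests the unanimous setting, but that reading is an assumption.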

A Cluster Based Energy Efficient Tree Routing Protocol in Wireless Sensor Networks (광역 WSN 을 위한 클러스팅 트리 라우팅 프로토콜)

  • Nurhayati, Nurhayati;Choi, Sung-Hee;Lee, Kyung-Oh
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.576-579 / 2011
  • Wireless sensor networks (WSNs) are widely used across many fields. Because of their distinctive characteristics, energy consumption must be taken into account when designing a routing protocol. Wireless sensor networks consist of small battery-powered devices with limited energy resources. Once deployed, the small sensor nodes are usually inaccessible to the user, so replacing the energy source is not feasible. Hence, energy efficiency is a key design issue that must be addressed to extend the life span of the network. In BCDCP, all sensors send data to the CH (Cluster Head), which then forwards it to the BS (Base Station). BCDCP works well in a small-scale network but is not preferred in a large-scale network, since it uses much energy for long-distance wireless communication. TBRP can be used for large-scale networks, but its weakness is that nodes run out of energy easily, since it uses multi-hop data transmission to the Base Station. Here, we propose a routing protocol, the Cluster Based Energy Efficient Tree Routing Protocol (CETRP), for WSNs to prolong network lifetime through balanced energy consumption. CETRP selects Cluster Heads in a cluster-tree shape and uses at most two-hop data transmission to the Cluster Head at every level. Several experiments show that CETRP outperforms BCDCP and TBRP.

Prediction of Safety Grade of Bridges Using the Classification Models of Decision Tree and Random Forest (의사결정나무 및 랜덤포레스트 분류 모델을 이용한 교량 안전등급 예측)

  • Hong, Jisu;Jeon, Se-Jin
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.3 / pp.397-411 / 2023
  • The number of deteriorated bridges with a service period of more than 30 years has been increasing rapidly in Korea. Accordingly, the importance of advanced maintenance technologies that predict the age-induced deterioration, condition, and performance of bridges is increasingly recognized. This study proposes a method for predicting the safety grade of bridges using the machine-learning classification models of the Decision Tree and the Random Forest. Analyzing these models for 8,850 bridges on national roads with various evaluation indexes, such as the confusion matrix, balanced accuracy, recall, ROC curve, and AUC, showed that the Random Forest generally gave better predictive performance than the Decision Tree. In particular, random under-sampling in the Random Forest showed higher predictive performance than the other sampling techniques for the C and D grade bridges, which need more attention to maintenance because of their significant deterioration, with a recall of 83.4%. The proposed model can be usefully applied to rapidly identify the safety grade and to establish an efficient and economical maintenance plan for bridges that have not recently been inspected.
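Two of the evaluation indexes named in this abstract, recall and balanced accuracy, are easy to compute by hand, and doing so shows why they matter for rare grades. The sketch below uses the standard definitions (per-class recall, and balanced accuracy as the mean of per-class recalls). The grade labels and predictions are made-up toy data, not results from the study.

```python
def recall_per_class(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    result = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        result[c] = hits / total
    return result


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    recalls = recall_per_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def accuracy(y_true, y_pred):
    """Plain fraction of correct predictions."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy example: 8 bridges of a common grade "A" and 2 of a rare grade "C".
y_true = ["A"] * 8 + ["C"] * 2
y_pred = ["A"] * 8 + ["A", "C"]  # one rare-grade bridge is missed
print(recall_per_class(y_true, y_pred))
print(accuracy(y_true, y_pred), balanced_accuracy(y_true, y_pred))
```

In this toy case plain accuracy looks high even though half of the rare grade is missed, which is the situation balanced accuracy and per-class recall are meant to expose.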

A Hybrid SVM Classifier for Imbalanced Data Sets (불균형 데이터 집합의 분류를 위한 하이브리드 SVM 모델)

  • Lee, Jae Sik;Kwon, Jong Gu
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.125-140 / 2013
  • We call a data set in which the number of records belonging to a certain class far outnumbers the number of records belonging to the other class an 'imbalanced data set'. Most classification techniques perform poorly on imbalanced data sets. When we evaluate the performance of a classification technique, we need to measure not only 'accuracy' but also 'sensitivity' and 'specificity'. In a customer churn prediction problem, 'retention' records account for the majority class, and 'churn' records account for the minority class. Sensitivity measures the proportion of actual retentions that are correctly identified as such. Specificity measures the proportion of actual churns that are correctly identified as such. The poor performance of classification techniques on imbalanced data sets is due to the low value of specificity. Many previous studies on imbalanced data sets employed the 'oversampling' technique, in which members of the minority class are sampled more than those of the majority class in order to make a relatively balanced data set. When a classification model is constructed using this oversampled balanced data set, specificity can be improved but sensitivity will be decreased. In this research, we developed a hybrid model of support vector machine (SVM), artificial neural network (ANN), and decision tree that improves specificity while maintaining sensitivity. We named this hybrid model the 'hybrid SVM model.' The process of construction and prediction of our hybrid SVM model is as follows. By oversampling from the original imbalanced data set, a balanced data set is prepared. The SVM_I model and the ANN_I model are constructed using the imbalanced data set, and the SVM_B model is constructed using the balanced data set. The SVM_I model is superior in sensitivity, and the SVM_B model is superior in specificity. For a record on which both the SVM_I model and the SVM_B model make the same prediction, that prediction becomes the final solution. If they make different predictions, the final solution is determined by the discrimination rules obtained by ANN and decision tree. For the records on which the SVM_I model and the SVM_B model make different predictions, a decision tree model is constructed using the ANN_I output value as input and actual retention or churn as target. We obtained the following two discrimination rules: 'IF ANN_I output value <0.285, THEN Final Solution = Retention' and 'IF ANN_I output value ${\geq}0.285$, THEN Final Solution = Churn.' The threshold 0.285 is the value optimized for the data used in this research. The result we present in this research is the structure or framework of our hybrid SVM model, not a specific threshold value such as 0.285. Therefore, the threshold value in the above discrimination rules can be changed to any value depending on the data. In order to evaluate the performance of our hybrid SVM model, we used the 'churn data set' in the UCI Machine Learning Repository, which consists of 85% retention customers and 15% churn customers. The accuracy of the hybrid SVM model is 91.08%, which is better than that of the SVM_I model or the SVM_B model. The points worth noticing here are its sensitivity, 95.02%, and specificity, 69.24%. The sensitivity of the SVM_I model is 94.65%, and the specificity of the SVM_B model is 67.00%. Therefore, the hybrid SVM model developed in this research improves on the specificity of the SVM_B model while maintaining the sensitivity of the SVM_I model.
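The combination rule described in this abstract is compact enough to write down directly. The sketch below takes the three model outputs as given inputs (no SVM or ANN is trained here) and applies the stated rules with the 0.285 threshold, which the abstract notes is specific to its data. The function names are hypothetical; the sensitivity and specificity definitions follow the abstract, with 'Retention' as the positive class.

```python
def hybrid_predict(svm_i_pred, svm_b_pred, ann_i_output, threshold=0.285):
    """Combine SVM_I, SVM_B and ANN_I outputs using the abstract's rules.

    If the two SVM models agree, their shared prediction is final.
    Otherwise the ANN_I output is compared with the threshold:
    below it means Retention, at or above it means Churn.
    """
    if svm_i_pred == svm_b_pred:
        return svm_i_pred
    return "Retention" if ann_i_output < threshold else "Churn"


def sensitivity_specificity(y_true, y_pred, positive="Retention"):
    """Sensitivity: share of actual retentions identified as retention.
    Specificity: share of actual churns identified as churn."""
    pos_total = sum(1 for t in y_true if t == positive)
    neg_total = len(y_true) - pos_total
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp / pos_total, tn / neg_total


# Made-up records: (SVM_I prediction, SVM_B prediction, ANN_I output).
cases = [
    ("Churn", "Churn", 0.10),      # agreement: ANN_I output is ignored
    ("Retention", "Churn", 0.20),  # disagreement, below threshold
    ("Retention", "Churn", 0.285), # disagreement, at threshold
]
final = [hybrid_predict(*c) for c in cases]
print(final)
```

Passing a different `threshold` reflects the abstract's point that the framework, not the specific value 0.285, is the contribution.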

Distributing Network Loads in Tree-based Content Distribution System

  • Han, Seung Chul;Chung, Sungwook;Lee, Kwang-Sik;Park, Hyunmin;Shin, Minho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.22-37 / 2013
  • Content distribution to a large number of concurrent clients stresses both the server and the network. While the server limitation can be circumvented by deploying server clusters, the network limitation is far less easy to cope with, owing to the difficulty of measuring and balancing network load. In this paper, we use two network load metrics, the worst link stress (WLS) and the degree of interference (DOI), and formulate the problem as partitioning the clients into disjoint subsets, subject to the server capacity constraint, so that the WLS and the DOI are reduced for each session and also well balanced across the sessions. We present a network load-aware partition algorithm that is practicable and effective in achieving the design goals. Through experiments on PlanetLab, we show that the proposed scheme has remarkable advantages over existing schemes in reducing and balancing the network load. We expect that the algorithm and performance metrics can be easily applied to various Internet applications, such as media streaming and multicast group member selection.

Mobile Base Station Placement with BIRCH Clustering Algorithm for HAP Network (HAP 네트워크에서 BIRCH 클러스터링 알고리즘을 이용한 이동 기지국의 배치)

  • Chae, Jun-Byung;Song, Ha-Yoon
    • Journal of KIISE:Computing Practices and Letters / v.15 no.10 / pp.761-765 / 2009
  • This research aims at an optimal placement of Mobile Base Stations (MBS) under HAP-based network configurations, subject to the restrictions of HAP capabilities. With a clustering algorithm based on BIRCH, mobile ground nodes are clustered, and the centroids of the clusters become the locations of the MBSs. The hierarchical structure of BIRCH enables mobile node management by means of the CF tree, and the restrictions on the maximum number of nodes per MBS and the maximum radio coverage are satisfied by splitting and merging clusters. Mobility models based on Jeju island are used for simulations, and the restrictions are met with proper placement of the MBSs.