Search | Korea Science

Ensemble Gene Selection Method Based on Multiple Tree Models

Mingzhu Lou
Journal of Information Processing Systems / v.19 no.5 / pp.652-662 / 2023
Identifying highly discriminating genes is a critical step in tumor recognition tasks based on microarray gene expression profile data and machine learning. Gene selection based on tree models has been the subject of several studies. However, these methods rely on a single tree model and are often not robust to ultra-high-dimensional microarray datasets, resulting in the loss of useful information and unsatisfactory classification accuracy. Motivated by these limitations, this study examined ensemble gene selection methods based on multiple tree models to improve the classification performance of tumor identification. Specifically, we selected the three most representative tree models: ID3, random forest, and gradient boosting decision tree. Each tree model selects the top-n genes from the microarray dataset based on its intrinsic mechanism. Subsequently, three ensemble gene selection methods were investigated: multiple-tree model intersection, multiple-tree module union, and multiple-tree module cross-union. Experimental results on five benchmark public microarray gene expression datasets showed that the multiple-tree module union is significantly superior to gene selection based on a single tree model and to other competitive gene selection methods in classification accuracy.
https://doi.org/10.3745/JIPS.04.0290
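
The ensemble step described in the abstract above can be illustrated with a minimal sketch. It assumes each tree model has already produced a per-gene importance score; the gene names, scores, and the helper `top_n` are hypothetical and not taken from the paper. Only the intersection and union ensembles are shown, because the abstract does not define cross-union.

```python
def top_n(scores, n):
    """Return the n gene names with the highest importance score.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda gene: (-scores[gene], gene))
    return set(ranked[:n])


# Hypothetical importance scores, one dict per tree model (assumed values).
id3_scores = {"g1": 0.9, "g2": 0.7, "g3": 0.1, "g4": 0.4, "g5": 0.0}
rf_scores = {"g1": 0.8, "g2": 0.2, "g3": 0.6, "g4": 0.5, "g5": 0.1}
gbdt_scores = {"g1": 0.7, "g2": 0.1, "g3": 0.2, "g4": 0.9, "g5": 0.3}

# Each model contributes its own top-n gene set (n = 2 here).
selected = [top_n(s, 2) for s in (id3_scores, rf_scores, gbdt_scores)]

# Intersection keeps genes chosen by every model; union keeps genes chosen by any.
intersection = set.intersection(*selected)
union = set.union(*selected)
```

The intersection is the stricter of the two and can be empty when the models disagree, while the union grows to at most three times n genes.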

The Superior Tree Breeding of Rubus coreanus Miq. Cultivar 'Jungkeum' for High Productivity in Korea

Kim, Sea-Hyun;Chung, Hun-Gwan;Han, Jin-Gyu
Korean Journal of Plant Resources / v.19 no.3 / pp.381-384 / 2006
This study was conducted to select Korean black raspberry (Rubus coreanus Miq.) clones for high productivity. Eight major agronomic traits were investigated in 198 clones from the clone bank established at the Korea Forest Research Institute, Suwon, Korea. Applying selection levels of number of fruit per fructify lateral (NFFL) over 20, fruit weight (FW) over 1.3 g, and yield of individual per fructify lateral (YIFL) over 25 g to the 198 clones resulted in 17 selected clones. The 17 selected superior trees showed regional differences in the amount of fruiting among 4 test sites. When the selection condition required number of fruit per fruit petiole (NRFP), fruit weight (FW), yield of individual (YI), and sugar content to exceed 20, 1.4 g, 6 kg, and 9.5 Brix, respectively, 5 of the 17 clones were reselected as superior trees over 3 years.

Multivariate Analysis on Fruit Morphological Characteristics and Estimation on Selection Effect of Selected Individuals of Sorbus alnifolia (Sieb. et Zucc.) K. Koch (팥배나무 집단의 열매의 형태적 특성에 의한 다변량분석과 선발효과추정)

Kim, Moon Sup;Kim, Sea Hyun;Han, Jingyu;Kwon, Hae Yun;Song, Jeong Ho;Kim, Hyeusoo
Journal of Korean Society of Forest Science / v.103 no.2 / pp.196-202 / 2014
To select superior trees based on fruit characteristics and to provide basic information necessary for their improvement, a total of 107 individual trees of Sorbus alnifolia (Sieb. et Zucc.) K. Koch were selected from 11 wild populations in South Korea. After collecting normal fruiting branches, we investigated the morphological characteristics of the fruit and examined the relationships among the 11 populations by multivariate analysis. Principal component analysis showed that five principal components gave a cumulative explained variance of 85.8%. According to cluster analysis based on fruit characteristics, the natural S. alnifolia populations were classified into four groups, and the Mt. Mani population differed from the other populations. The selection effect for outstanding candidate trees, including 5 superior individual trees (Gwangyo 1, Gwangyo 2, Deogyu 7, Mani 29, Mani 30), was estimated at 122.8%, 115.5%, and 182.7% for fruit width, fruit length, and yield per fruit bunch, respectively. These results provide valuable information for selection breeding of S. alnifolia in South Korea.
https://doi.org/10.14578/jkfs.2014.103.2.196

Selection of Superior Trees for Larger Fruit and High Productivity in Sorbus commixta Hedl.

Kim, Sea-Hyun;Jang, Yong-Seok;Chung, Hun-Gwan;Choi, Myoung-Sub;Kim, Sun-Chang
Plant Resources / v.6 no.2 / pp.120-128 / 2003
This study analyzed the variation in leaf and fruit characteristics among ten selected populations of Sorbus commixta Hedl. to support the conservation of gene resources and to provide information for the selection of superior trees. The results can be summarized as follows. Overall, the Mt. Sungin population on Ulleung island showed the larger values across the characteristics, whereas the Mt. Halla population on Jeju island showed the smaller values. ANOVA tests showed statistically significant differences in all leaf characteristics among populations as well as among individual trees within populations. For fruit characteristics, however, differences were statistically significant only among populations. Cluster analysis using the single linkage method based on leaf and fruit characteristics showed that the ten selected populations of S. commixta in Korea could be clustered into three groups: Group I is Mt. Sungin on Ulleung island, Group II is Mt. Halla on Jeju island, and Group III comprises Osan, Mt. Kaji, Mt. Duckyoo, Mt. Balwang, Mt. Sobaek, Mt. O-dae, Mt. Jiri, and Mt. Taebaek. Selection levels based on major agronomic traits, namely Number of Fruit per Fruiting Lateral (NFL) over 50, Fruit Length (FL) and Width (FW) over 10 mm, and Weight of 100 Fruit (WF100) over 66 g, were applied to 100 sample trees, and five trees were selected. The selection effects of the selected trees in NFL, FL, FW, and WF100 were evaluated as 132%, 151%, 142%, and 264%, respectively, compared to the mean of the 100 sample trees. In particular, Ulleung 2 showed excellent values, with NFL and WF100 of 95 and 69 g, respectively, suggesting a promising new cultivar for larger fruit and high productivity.
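
The threshold selection and the selection effect reported in the abstract above can be sketched as follows. This is a minimal illustration that assumes the selection effect is the mean of the selected trees expressed as a percentage of the mean of all sample trees, which is how the abstract phrases the comparison. The tree records and the helper names are hypothetical; only the threshold values come from the abstract.

```python
# Thresholds quoted in the abstract: NFL > 50, FL and FW > 10 mm, WF100 > 66 g.
THRESHOLDS = {"nfl": 50, "fl": 10.0, "fw": 10.0, "wf100": 66.0}

# Hypothetical sample trees (made-up measurements for illustration only).
trees = [
    {"id": "A", "nfl": 95, "fl": 11.0, "fw": 10.5, "wf100": 69.0},
    {"id": "B", "nfl": 40, "fl": 9.0, "fw": 8.5, "wf100": 30.0},
    {"id": "C", "nfl": 55, "fl": 10.2, "fw": 10.1, "wf100": 67.0},
    {"id": "D", "nfl": 30, "fl": 8.0, "fw": 8.0, "wf100": 20.0},
]


def passes(tree):
    """True when the tree exceeds every threshold."""
    return all(tree[trait] > limit for trait, limit in THRESHOLDS.items())


def selection_effect(selected, population, trait):
    """Mean of the selected trees as a percentage of the population mean."""
    selected_mean = sum(t[trait] for t in selected) / len(selected)
    population_mean = sum(t[trait] for t in population) / len(population)
    return 100.0 * selected_mean / population_mean


selected = [t for t in trees if passes(t)]
nfl_effect = selection_effect(selected, trees, "nfl")
```

A value above 100% means the selected trees exceed the sample mean for that trait.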

The Seeds Characteristics of Artificial Populations of Yellowhorn (Xanthoceras sorbifolium) in China

Hyunseok Lee
Proceedings of the Plant Resources Society of Korea Conference / 2020.08a / pp.71-71 / 2020
Xanthoceras sorbifolia Bunge, the sole species in the genus Xanthoceras, is a flowering plant in the family Sapindaceae. It is an important tree species as a source of edible oil and biodiesel, with the capacity to serve as a pioneer on degraded and desert land. Seeds of X. sorbifolia were collected from two plantations and two superior trees in Inner Mongolia, and from one plantation and one superior tree in Liaoning, China. An inter simple sequence repeat (ISSR) analysis showed genetic variation among four artificial populations in China: two in Inner Mongolia (IM), one in Liaoning (LN), and one in Shandong (SD). The average percentage of polymorphic loci was 81.25% for these four populations. Based on an analysis of molecular variance, 23% of the total genetic variation was found among populations and 77% within populations. Seed traits varied considerably within and among areas; for example, two trees produced seeds that differed in several traits even though they stand adjacent to each other on the same farm. Because little attention has been paid to seed traits, a genetic test is needed to understand this variation. It is necessary to obtain information on seed characteristics first and then provide basic information for further research on the selection of superior trees and provenances.

Decision Tree-Based Feature-Selective Neural Network Model: Case of House Price Estimation (의사결정나무를 활용한 신경망 모형의 입력특성 선택: 주택가격 추정 사례)

Yoon Han-Seong
Journal of Korea Society of Digital Industry and Information Management / v.19 no.1 / pp.109-118 / 2023
Data-based analysis methods are increasingly used to estimate or predict housing prices, and neural network models and decision trees from the big data field are also increasingly used. Neural network models are often evaluated as superior to existing statistical models in terms of estimation or prediction accuracy. However, there is ambiguity in determining the input features of the neural network's input layer, that is, the type and number of input features, and decision trees are sometimes used to overcome this disadvantage. In this paper, we evaluate existing methods of using decision trees and propose a method that uses decision trees to prioritize input feature selection in neural network models. This can serve as a complementary or combined analysis method for neural network models and decision trees, and its validity was confirmed by applying the proposed method to house price estimation. Several comparisons indicated that selecting appropriate input features according to priority can increase the estimation power of the model.
https://doi.org/10.17662/ksdim.2023.19.1.109

Ethnobotany of Wild Baobab (Adansonia digitata L.): A Way Forward for Species Domestication and Conservation in Sudan

Gurashi, N.A.;Kordofani, M.A.Y.;Adam, Y.O.
Journal of Forest and Environmental Science / v.33 no.4 / pp.270-280 / 2017
Selection of superior phenotypes of fruit trees and products based on established criteria by local people is a prerequisite for future species domestication and conservation. Thus the study objective was to identify the local people's perceptions and preferences on baobab trees and products. A sample of 142 respondents was randomly selected using structured interviews in Blue Nile and North Kordofan, Sudan in 2013. Descriptive analysis was employed using SPSS and Excel programs. The study results indicated that local people use the morphological characteristics of the tree (leaves, fruits, seeds, kernels and bark) to differentiate individual trees. Based on the perceptions, local people recorded trees with delicious leaves, white pulp color, big fruit size and mature capsule size, and high pulp yield as criteria for differentiating between baobab trees in the study areas. In contrast, the undesirable traits were connected to trees with acidic pulp, slimy pulp, bitter leaves, and low pulp yield. The study concluded that the ethnobotanical knowledge of the baobab tree and its products may play an important role in tree domestication and improvement in Sudan. However, further research on tree genetics is needed to complement the ethnobotanical knowledge for baobab resources domestication and conservation.
https://doi.org/10.7747/JFES.2017.33.4.270

Morphological Variations in Tetrapleura tetraptera Taub. (Fabaceae) Fruits and Seed Traits from Lowland Rainforest Zones of Nigeria: A Keystone Non Timber Forest Tree Species in the Tropics

Aishat Adeola Olaniyi;Samuel Olalekan Olajuyigbe;Musbau Bayo Olaniyi
Journal of Forest and Environmental Science / v.40 no.2 / pp.111-117 / 2024
An evaluation was carried out on variability in morphology of fruits and seeds (number and weight) of Tetrapleura tetraptera (Schumach. and Thonn.) Taub. from different populations across its distribution range in Nigeria. Bulk fruit samples were collected and examined for variations in morphological characters. Differences in morphological character of fruits and seeds among the populations were determined using analysis of variance at 5% level of probability. The relationships among morphological characters were determined using Pearson correlation coefficient (r). Significant variations (p<0.05) existed among T. tetraptera populations for all the evaluated characters: fruit length, fruit width, number of seeds per fruit and seed weight. A positive significant strong correlation (r=0.96) was found between seed weight and number of seeds per fruit, while no correlation existed between fruit length, width and number of seeds. Seed weight was positively correlated with minimum altitude (r=0.97) and maximum altitude (r=0.99) of seed populations. Number of seeds was also significantly correlated with maximum altitude (r=0.965). There was no significant correlation between geo-climatic variables and fruit dimensions (length and width). Observed variations in morphological traits within and across populations of T. tetraptera may be used as proxy to estimate genetic diversity and selection of superior trees for improved productivity.
https://doi.org/10.7747/JFES.2024.40.2.111
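
For readers unfamiliar with the Pearson correlation coefficient (r) used in the abstract above, a minimal sketch of the standard formula follows. The seed-weight and seed-count values are made up for illustration and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical population means (assumed values, not from the paper).
seed_weight = [1.0, 2.0, 3.0, 4.0]
seeds_per_fruit = [2.0, 4.0, 6.0, 8.0]

r_positive = pearson_r(seed_weight, seeds_per_fruit)
r_negative = pearson_r(seed_weight, list(reversed(seeds_per_fruit)))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate none; significance testing is a separate step not shown here.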

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

Lee, Dae-Bum;Seo, Jae-Hyun
Journal of the Korea Convergence Society / v.10 no.5 / pp.35-42 / 2019
With the recent emergence of the Internet and various wearable devices, Internet technology has made it more convenient to obtain information and do business. However, as the Internet is used in more areas, the attack surface exposed to attacks is growing, and attempts to invade networks for unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method to improve the classification performance for classes of abnormal behavior in network traffic. The UNSW-NB15 dataset has a class imbalance problem in which rare classes have relatively few instances compared to other classes, and an undersampling method is used to eliminate it. We use the SVM, k-NN, and decision tree algorithms and extract a subset of combinations with superior detection accuracy and RMSE through training and verification. The subset achieved recall values of more than 98% in the wrapper-based experiments, and DT_PSO showed the best performance.
https://doi.org/10.15207/JKCS.2019.10.5.035
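
The abstract above does not say which undersampling method was used. As one common possibility, the sketch below shows random undersampling, which trims every class down to the size of the rarest class. The function name, the sample values, and the labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the rarest class in size."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(items) for items in by_class.values())
    kept_samples, kept_labels = [], []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            kept_samples.append(sample)
            kept_labels.append(label)
    return kept_samples, kept_labels


# Hypothetical imbalanced traffic records: 6 normal, 2 of a rare attack class.
samples = list(range(8))
labels = ["normal"] * 6 + ["rare_attack"] * 2

balanced_samples, balanced_labels = random_undersample(samples, labels)
class_counts = Counter(balanced_labels)
```

Undersampling discards majority-class data, so it trades some information for a balanced training set.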

Application of Random Forest Algorithm for the Decision Support System of Medical Diagnosis with the Selection of Significant Clinical Test (의료진단 및 중요 검사 항목 결정 지원 시스템을 위한 랜덤 포레스트 알고리즘 적용)

Yun, Tae-Gyun;Yi, Gwan-Su
The Transactions of The Korean Institute of Electrical Engineers / v.57 no.6 / pp.1058-1062 / 2008
In a clinical decision support system (CDSS), unlike rule-based expert methods, an appropriate data-driven machine learning method can readily provide information on individual features (clinical tests) for disease classification. However, currently developed methods focus on improving classification accuracy for diagnosis. By analyzing feature importance in classification, one may infer novel clinical test sets that strongly differentiate specific diseases or disease states. Against this background, we introduce a novel CDSS that integrates a classifier and a feature selection module. The random forest algorithm is applied as both the classifier and the feature importance measure. The system selects the significant clinical tests that discriminate the diseases by examining the classification error during backward elimination of the features. The superior performance of the random forest algorithm in clinical classification was assessed against artificial neural network and decision tree algorithms using breast cancer, diabetes, and heart disease data from the UCI Machine Learning Repository. Tests with the same data sets show that the proposed system can successfully select the significant clinical test set for each disease.
