• Title/Summary/Keyword: Accuracy of Selection


Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Because the stock market is driven by traders' expectations, many studies have tried to predict stock price movements by analyzing various sources of text data. Research has examined not only the relationship between text data and stock price fluctuations but also stock trading based on news articles and social media responses. As in other text mining work, studies that predict stock price movements apply classification algorithms to a term-document matrix. Because documents contain a large number of words, it is better to build the matrix from the words that contribute most. Typically, words with very low frequency or importance are removed, and words are also selected according to how much they contribute to classifying a document correctly. The conventional approach to constructing a term-document matrix is thus to collect all the documents to be analyzed and to select the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We then extract the words surrounding each selected neutral word and use them to generate the term-document matrix. The underlying idea is that the presence of a neutral word is itself only weakly related to stock movement, whereas the words surrounding it are more likely to affect stock price movements. The generated term-document matrix is then passed to an algorithm that classifies stock price fluctuations. We first removed stop words and selected neutral words for each stock, and then excluded from the selected words those that also appear in news articles about other stocks. Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of articles as training data and applied the model to the remaining month of articles to predict the stock price movement of the next day. We used SVM, boosting and random forest to build the models and predict stock price movements. The stock market was open for a total of 80 days during the four months (2016/02/01 ~ 2016/05/31); the first 60 days served as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than word selection based on sparsity. In summary, this study predicted stock price fluctuations by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with that of the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine the words to extract. In other words, it removes not only the words that appear in both the rise and fall categories but also the words that appear commonly in news about other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. The limitations of this study are that stock price prediction was set up only to classify rise and fall, and that the experiment was conducted only for the top ten stocks, which do not represent the entire stock market. In addition, it is difficult to show investment performance, because stock price fluctuation and profit rate may differ. Therefore, further research is needed that covers more stocks and predicts yields through trading simulation.
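The neutral-word idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how many surrounding words are kept, so the `window` parameter, the function names and the toy tokens are all assumptions.

```python
from collections import Counter


def context_terms(tokens, neutral_words, window=2):
    """Return the tokens lying within `window` positions of any neutral word.

    The neutral words themselves are dropped; only their neighbours are kept.
    """
    keep = set()
    for i, tok in enumerate(tokens):
        if tok in neutral_words:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            keep.update(range(lo, hi))
    return [tokens[j] for j in sorted(keep) if tokens[j] not in neutral_words]


def term_document_matrix(docs, neutral_words, window=2):
    """Build (vocabulary, rows); rows[d][j] counts vocabulary[j] in document d."""
    counts = [Counter(context_terms(d, neutral_words, window)) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[t] for t in vocab] for c in counts]
    return vocab, rows


# Hypothetical tokenized articles; "said" plays the role of a neutral word.
docs = [
    ["samsung", "shares", "said", "rise", "sharply"],
    ["profit", "said", "fall"],
]
vocab, rows = term_document_matrix(docs, {"said"}, window=1)
```

The resulting rows could then be passed to any classifier (the abstract names SVM, boosting and random forest) with the first 60 trading days as training data and the last 20 as test data.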

A Study on Hepatomegaly and Facial Telangiectasia in a Group of the Insured (간종대(肝腫大)와 안면모세혈관확장(顔面毛細血管擴張)의 보험의학적연구(保險醫學的硏究))

  • Im, Young-Hoon
    • The Journal of the Korean life insurance medical association / v.4 no.1 / pp.110-132 / 1987
  • A study was undertaken of hepatomegaly detected by abdominal palpation, and of facial telangiectasia, in a total of 3,418 insured persons medically examined at the Honam Medical Room of Dong Bang Life Insurance Company Ltd. from February 1984 to August 1985. The results were as follows:
    1) Hepatomegaly was found in 383 cases (27.5%) among the 1,395 insured males and in 163 cases (8.1%) among the 2,023 insured females. The difference in the incidence of hepatomegaly between males and females was statistically significant (p<0.001). In each age group, the incidence of hepatomegaly was higher in males than in females. In males, the incidence increased considerably with age; it was 11.6%, 16.2%, 42.6% and 52.9% from the second to the sixth decade in order, and then decreased to 26.7% in the seventh decade, whereas in females it increased only slightly with each age group.
    2) Facial telangiectasia was found in 318 cases (22.8%) among all males and in 157 cases (7.8%) among all females. The difference in the incidence of telangiectasia between males and females was statistically significant (p<0.001). In each age group except the second decade, the incidence of telangiectasia was higher in males than in females. The incidence of facial telangiectasia increased considerably with age in males, while it increased only slightly in females.
    3) Facial telangiectasia accompanied by hepatomegaly was found in 235 cases (61.4%) among 383 cases of hepatomegaly in males and in 69 cases (42.3%) among 163 cases of hepatomegaly in females. The difference in the incidence of telangiectasia between males and females was statistically significant (p<0.001).
    4) Facial telangiectasia without spider angiomata accompanied by hepatomegaly was found in 201 cases (52.5%) among 383 cases of hepatomegaly in all males and in 67 cases (41.4%) among 163 cases of hepatomegaly in all females; facial spider angiomata accompanied by hepatomegaly was found in 34 cases (8.9%) among 383 cases of hepatomegaly in all males and in 2 cases (1.2%) among 163 cases of hepatomegaly in all females.
    5) Abnormal SGOT activity was found in 19 cases (7.9%) among 242 cases of hepatomegaly in all males and in one case (1.5%) among 67 cases of hepatomegaly in all females. The difference in the incidence of abnormal SGOT activity was statistically significant (p<0.001). The incidence of abnormal SGOT activity by the size of hepatomegaly, that is, palpated <1 finger's breadth, <2 fingers' breadth and $\geq 2$ fingers' breadth, was 2.2%, 6.0% and 60.0% respectively in all males, while in females abnormal SGOT activity was found in only one case, in the fifth decade, among 67 cases of hepatomegaly.
    6) In ordinary medical examinations (where the insured amount is low), abnormal SGOT activity was found in 7 cases (4.8%) among 146 cases of hepatomegaly palpated at $1\frac{1}{2}$ fingers' breadth or less, while it was not found in 37 cases of hepatomegaly of the same size in all females. These 7 cases are thought to be very significant because they account for 35% of the 20 cases of abnormal SGOT activity with hepatomegaly.
    7) Abnormal SGOT activity was found in 12 cases (4.4%) among 273 cases of hepatomegaly of "not firm" consistency, while it was found in 8 cases (22.2%) among 36 cases of hepatomegaly of "firm" consistency. The difference in the incidence of abnormal SGOT activity was statistically significant (p<0.05).
    8) Abnormal SGOT activity was found in 5 cases (17.9%) among 28 cases of spider angiomata with hepatomegaly, while it was found in 10 cases (7.3%) among 166 cases of telangiectasia without spider angiomata with hepatomegaly. Owing to the small number of cases, statistical significance was not recognized, but the incidence of abnormal SGOT activity tends to be higher in spider angiomata cases with hepatomegaly than in telangiectasia cases without spider angiomata with hepatomegaly.
    9) The incidence of abnormal SGOT activity tends to increase with age in the male group; abnormal SGOT activity was not found among 4 cases of hepatomegaly in the second decade, and it was 3.8% in the third decade, 4.5% in the fourth decade, 9.3% in the fifth decade, 17.5% in the sixth decade and 33.3% in the seventh decade, while in females it was found in only one case among 67 cases.
    10) It is believed that performing liver function tests on subjects with hepatomegaly, even in ordinary medical examinations (where the insured amount is low), will contribute considerably to the medical selection of hepatomegaly risk.
    11) The age of the insured (young or old), the presence and especially the severity of facial telangiectasia or spider angiomata, and the consistency of the enlarged liver (firm or not) should be considered to increase accuracy in evaluating hepatomegaly risk.
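The sex difference reported in finding 1) can be checked from the counts given in the abstract. The abstract does not say which test was used; the sketch below assumes a Pearson chi-square test on a 2x2 table without continuity correction, and the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts taken from finding 1) of the abstract.
male_cases, male_total = 383, 1395
female_cases, female_total = 163, 2023

male_rate = male_cases / male_total        # proportion of males with hepatomegaly
female_rate = female_cases / female_total  # proportion of females with hepatomegaly

stat = chi_square_2x2(
    male_cases, male_total - male_cases,
    female_cases, female_total - female_cases,
)

# With 1 degree of freedom, p < 0.001 corresponds to a statistic above about 10.83.
significant_at_0_001 = stat > 10.83
```

The two proportions round to the 27.5% and 8.1% quoted in the abstract, and the statistic lies far above the 0.001 threshold, which agrees with the reported p<0.001.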


Establishing a Nomogram for Stage IA-IIB Cervical Cancer Patients after Complete Resection

  • Zhou, Hang;Li, Xiong;Zhang, Yuan;Jia, Yao;Hu, Ting;Yang, Ru;Huang, Ke-Cheng;Chen, Zhi-Lan;Wang, Shao-Shuai;Tang, Fang-Xu;Zhou, Jin;Chen, Yi-Le;Wu, Li;Han, Xiao-Bing;Lin, Zhong-Qiu;Lu, Xiao-Mei;Xing, Hui;Qu, Peng-Peng;Cai, Hong-Bing;Song, Xiao-Jie;Tian, Xiao-Yu;Zhang, Qing-Hua;Shen, Jian;Liu, Dan;Wang, Ze-Hua;Xu, Hong-Bing;Wang, Chang-Yu;Xi, Ling;Deng, Dong-Rui;Wang, Hui;Lv, Wei-Guo;Shen, Keng;Wang, Shi-Xuan;Xie, Xing;Cheng, Xiao-Dong;Ma, Ding;Li, Shuang
    • Asian Pacific Journal of Cancer Prevention / v.16 no.9 / pp.3773-3777 / 2015
  • Background: This study aimed to establish a nomogram by combining clinicopathologic factors with overall survival of stage IA-IIB cervical cancer patients after complete resection with pelvic lymphadenectomy. Materials and Methods: This nomogram was based on a retrospective study on 1,563 stage IA-IIB cervical cancer patients who underwent complete resection and lymphadenectomy from 2002 to 2008. The nomogram was constructed based on multivariate analysis using Cox proportional hazard regression. The accuracy and discriminative ability of the nomogram were measured by concordance index (C-index) and calibration curve. Results: Multivariate analysis identified lymph node metastasis (LNM), lymph-vascular space invasion (LVSI), stromal invasion, parametrial invasion, tumor diameter and histology as independent prognostic factors associated with cervical cancer survival. These factors were selected for construction of the nomogram. The C-index of the nomogram was 0.71 (95% CI, 0.65 to 0.77), and calibration of the nomogram showed good agreement between the 5-year predicted survival and the actual observation. Conclusions: We developed a nomogram predicting 5-year overall survival of surgically treated stage IA-IIB cervical cancer patients. More comprehensive information that is provided by this nomogram could provide further insight into personalized therapy selection.
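The concordance index used above to measure discriminative ability can be illustrated with a small sketch. It assumes Harrell's pairwise definition for right-censored data; the function name and the toy data are hypothetical and unrelated to the study cohort.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times[i]       observed follow-up time of subject i
    events[i]      1 if the event (death) was observed, 0 if censored
    risk_scores[i] predicted risk; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: risk falls as survival time rises, so ranking is perfect.
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.71 indicates moderate discrimination.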

Comparison of Breeding Value by Establishment of Genomic Relationship Matrix in Pure Landrace Population (유전체 관계행렬 구성에 따른 Landrace 순종돈의 육종가 비교)

  • Lee, Joon-Ho;Cho, Kwang-Hyun;Cho, Chung-Il;Park, Kyung-Do;Lee, Deuk Hwan
    • Journal of Animal Science and Technology / v.55 no.3 / pp.165-171 / 2013
  • A genomic relationship matrix (GRM) was constructed from whole-genome SNP markers of swine, and genomic breeding values were estimated by replacing the pedigree-based numerator relationship matrix (NRM) with the GRM. Genotypes of 40,706 SNP markers from 448 pure Landrace pigs were used in this study, and five GRM construction methods, G05, GMF, GOF, $GOF^*$ and GN, were compared with each other and with the NRM. The coefficients of GOF, which considers each observed allele frequency, deviated least from the coefficients of the NRM, whereas the coefficients of GMF, which considers the average minor allele frequency, deviated greatly from them; the mean is therefore expected to shift depending on how allele frequencies are considered. All GRM construction methods except $GOF^*$ showed normally distributed Mendelian sampling. When breeding values (BV) for days to 90 kg (D90KG) and average back-fat thickness (ABF) were estimated using the NRM and the GRMs, the correlation between the BVs from the NRM and the GRM was highest for GOF, and genetic variance was overestimated by $GOF^*$, which confirmed that the scale of the GRM is closely related to the estimation of genetic variance. With the same amount of phenotype information, the accuracy of BV based on genomic information was higher than that of BV based on pedigree information, and this tendency was more pronounced for ABF than for D90KG. Genetic evaluation of animals using a relationship matrix built from genomic information could be useful when phenotype or relationship information is lacking and for predicting the BV of young animals without phenotypes.
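As a rough illustration of how a GRM can be built from SNP genotypes, the sketch below assumes the commonly used VanRaden-style form G = ZZ' / (2 Σ p(1-p)), with Z the genotype matrix centred by twice the allele frequency. The abstract does not give the formulas behind G05, GMF, GOF, $GOF^*$ or GN, so reading observed frequencies as a GOF-like variant and fixed frequencies of 0.5 as a G05-like variant is an assumption, as are all names and the toy genotypes.

```python
def genomic_relationship_matrix(genotypes, allele_freqs=None):
    """Build a GRM from genotypes coded 0/1/2 (copies of the counted allele).

    genotypes[i][k] is the code of animal i at SNP k. When allele_freqs is
    None, the frequencies observed in the sample are used.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    if allele_freqs is None:
        allele_freqs = [
            sum(g[k] for g in genotypes) / (2 * n_animals) for k in range(n_snps)
        ]
    # Centre every genotype by twice the allele frequency of its SNP.
    z = [[g[k] - 2 * allele_freqs[k] for k in range(n_snps)] for g in genotypes]
    scale = 2 * sum(p * (1 - p) for p in allele_freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; the GRM is undefined")
    return [
        [sum(zi[k] * zj[k] for k in range(n_snps)) / scale for zj in z]
        for zi in z
    ]


# Toy data: three animals, three SNPs (hypothetical values).
toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
g_observed = genomic_relationship_matrix(toy)                # observed frequencies
g_fixed = genomic_relationship_matrix(toy, [0.5, 0.5, 0.5])  # all frequencies 0.5
```

Changing the allele frequencies changes both the centring and the scaling term, which is one way to see why the abstract ties the scale of the GRM to the estimated genetic variance.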

Lung cancer, chronic obstructive pulmonary disease and air pollution (대기오염에 의한 폐암 및 만성폐색성호흡기질환 -개인 흡연력을 보정한 만성건강영향평가-)

  • Sung, Joo-Hon;Cho, Soo-Hun;Kang, Dae-Hee;Yoo, Keun-Young
    • Journal of Preventive Medicine and Public Health / v.30 no.3 s.58 / pp.585-598 / 1997
  • Background: Although there are growing concerns about the adverse health effects of air pollution, little evidence on the health effects of current air pollution levels has yet been accumulated in Korea. This study was designed to evaluate the chronic health effects of air pollution using Korean Medical Insurance Corporation (KMIC) data and air quality data. Medical insurance data in Korea have some drawbacks in accuracy, but they have strengths in their national coverage and in their unified ID system and individual information, which enable various data linkages and studies of chronic health effects. Method: This study used the data of the Korean Environmental Surveillance System Study (Surveillance Study), which cover asthma, acute bronchitis, chronic obstructive pulmonary disease (COPD), cardiovascular diseases (congestive heart failure and ischemic heart disease), all cancers, accidents and congenital anomalies, i.e., mainly potential environmental diseases. We reconstructed a nested case-control study from the Surveillance Study data and air pollution data in Korea. Among the 1,037,210 insured persons who completed a questionnaire and physical examination in 1992, disease-free (for chronic respiratory disease and cancer) persons aged 35-64 with smoking status information were selected to reconstruct a cohort of 564,991 persons. The cohort was followed up to 1995 (1992-5), and the subjects who had the diseases covered by the Surveillance Study were selected. Finally, the patients with address information and available air pollution data were retained as the 'final subjects'. Cases were defined as all lung cancer cases (424) and COPD admission cases (89), while the control group comprised all patients among the 'final subjects' other than the two case groups. That is, cases are putative chronic environmental diseases, while controls are mainly acute environmental diseases. For exposure, air quality data from 73 monitoring sites between 1991 and 1993 were analyzed as a surrogate for the air pollution exposure level of the areas where the sites were located (58 areas). Data on five major air pollutants, TSP, $O_3$, $SO_2$, CO and NOx, were available, and the area means were applied to the residents of each area. The 3-year arithmetic mean and the counts of days violating both the long-term and short-term standards during the period were used as indices of exposure. A multiple logistic regression model was applied. All analyses were adjusted for current and past smoking history, age and gender. Results: Plain arithmetic means of pollutant levels did not reveal any relation to the risk of lung cancer or COPD, while the cumulative counts of non-attainment days did. No pollutant index showed a significant positive association with COPD excess. Lung cancer risks were significantly and consistently associated with increases in $O_3$ and CO exceedance counts (corrected error level: 0.017), and less strongly and less consistently with $SO_2$ and TSP. $O_3$ and CO were estimated to increase the risk of lung cancer by 2.04 and 1.46 respectively; these are the maximal probable risks, derived by comparing a more polluted area (95%) with a cleaner area (5%). Conclusions: Although not decisive because of potential misclassification of exposure, these results were drawn by relatively conservative interpretation and could be used as evidence of chronic health effects, especially for lung cancer. $O_3$ might be a candidate promoter of lung cancer, while CO should be considered a surrogate measure of motor vehicle emissions. The control selection in this study may have been less appropriate for COPD, and further evaluation in another setting might be necessary.
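If the reported 2.04 and 1.46 are read as odds ratios from the logistic model (the abstract does not say so explicitly), the comparison of a more polluted area (95%) with a cleaner area (5%) can be sketched as below. The coefficient, the exposure values, the nearest-rank percentile rule and the function name are all hypothetical.

```python
import math


def odds_ratio_between_percentiles(beta, exposure_values, low=5, high=95):
    """Odds ratio implied by a logistic coefficient across two exposure percentiles.

    beta is the log-odds change per unit of exposure (for example per
    exceedance day). Percentiles use the simple nearest-rank rule.
    """
    xs = sorted(exposure_values)

    def percentile(p):
        rank = max(1, math.ceil(p * len(xs) / 100))
        return xs[rank - 1]

    return math.exp(beta * (percentile(high) - percentile(low)))


# Hypothetical exceedance-day counts for 100 areas: 1, 2, ..., 100.
areas = list(range(1, 101))
# A coefficient chosen so that the 5th-to-95th percentile contrast doubles the odds.
beta = math.log(2) / 90
odds_ratio = odds_ratio_between_percentiles(beta, areas)
```

Because the contrast spans almost the whole observed exposure range, an odds ratio obtained this way is close to an upper bound, which matches the abstract's description of these figures as maximal probable risks.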


Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun;Lee, Myungjin
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.43-61 / 2019
  • The development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI-related research has been actively conducted in a variety of fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s, this research has focused on solving cognitive problems related to human intelligence, such as learning and problem solving. Owing to recent interest in the technology and research on various algorithms, the field of artificial intelligence has achieved more technological advances than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to enable artificial intelligence agents to make decisions by using machine-readable and processible knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and recently it has been used together with statistical artificial intelligence such as machine learning. More recently, the purpose of a knowledge base has been to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. Such knowledge bases are used for intelligent processing in various fields of artificial intelligence, such as the question answering systems of smart speakers. However, building a useful knowledge base is a time-consuming task that still requires a great deal of expert effort. In recent years, much research and technology in knowledge-based artificial intelligence has used DBpedia, one of the biggest knowledge bases, which aims to extract structured content from the various information in Wikipedia. DBpedia contains various information extracted from Wikipedia, such as titles, categories, and links, but the most useful knowledge comes from Wikipedia infoboxes, which present a user-created summary of some unifying aspect of an article. This knowledge is created by the mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. Because it generates knowledge from semi-structured infobox data created by users, DBpedia can expect high reliability in terms of the accuracy of its knowledge. However, since only about 50% of all wiki pages in Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of this method, we describe a knowledge extraction model that follows the DBpedia ontology schema and is trained on Wikipedia infoboxes. Our knowledge extraction model consists of three steps: classification of documents into ontology classes, classification of the sentences suitable for extracting triples, and value selection and transformation into the RDF triple structure. The structure of Wikipedia infoboxes is defined by infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these infobox templates. Based on these mapping relations, we classify the input document according to infobox categories, which correspond to ontology classes. After determining the class of the input document, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples. To train the models, we generated a training data set from the Wikipedia dump by adding BIO tags to sentences, and trained about 200 classes and about 2,500 relations for knowledge extraction. Furthermore, we conducted comparative experiments with CRF and Bi-LSTM-CRF for the knowledge extraction process. Through the proposed process, structured knowledge can be utilized by extracting knowledge from text documents according to the ontology schema. In addition, this methodology can significantly reduce the effort experts need to construct instances according to the ontology schema.
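The last step of the pipeline described above, turning BIO-tagged sentences into triples, might look roughly like the sketch below. It assumes plain `B`/`I`/`O` tags produced by a sequence tagger such as a CRF; the function names, the example sentence and the `dbr:`/`dbo:` identifiers are illustrative and are not taken from the paper.

```python
def bio_spans(tokens, tags):
    """Join the tokens of every B/I-tagged span into one value string."""
    spans, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":
            if current:                 # close a span that is still open
                spans.append(" ".join(current))
            current = [tok]
        elif tag == "I" and current:
            current.append(tok)         # continue the open span
        else:
            if current:
                spans.append(" ".join(current))
            current = []                # "O" (or a stray "I") ends the span
    if current:
        spans.append(" ".join(current))
    return spans


def to_triples(subject, predicate, tokens, tags):
    """Form one (subject, predicate, object) triple per extracted value."""
    return [(subject, predicate, value) for value in bio_spans(tokens, tags)]


# Hypothetical sentence already classified as relevant to a "country" attribute.
tokens = ["Seoul", "is", "the", "capital", "of", "South", "Korea"]
tags = ["O", "O", "O", "O", "O", "B", "I"]
triples = to_triples("dbr:Seoul", "dbo:country", tokens, tags)
```

In the described model, the document class and the sentence classifier would supply the subject and predicate, while the tagger supplies the tags.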

Development and Analysis of COMS AMV Target Tracking Algorithm using Gaussian Cluster Analysis (가우시안 군집분석을 이용한 천리안 위성의 대기운동벡터 표적추적 알고리듬 개발 및 분석)

  • Oh, Yurim;Kim, Jae Hwan;Park, Hyungmin;Baek, Kanghyun
    • Korean Journal of Remote Sensing / v.31 no.6 / pp.531-548 / 2015
  • Atmospheric Motion Vectors (AMVs) derived from satellite images have shown a Slow Speed Bias (SSB) in comparison with rawinsonde observations. The SSB originates from errors in tracking, selection, and height assignment, of which height assignment error is known to be the leading one. However, recent work has shown that height assignment error cannot fully explain the SSB. This paper attempts a new approach to examine whether the SSB of COMS AMVs can be reduced by using a new target tracking algorithm. Tracking error can be caused by the averaging of various wind patterns within a target and by changes in cloud shape over time during the search process. To overcome this problem, a Gaussian Mixture Model (GMM) was adopted to extract the coldest cluster as the target, since the shape of such a target is less subject to transformation. An image filtering scheme is then applied to give more weight to the selected coldest pixels than to the others, which makes the target easier to track. When AMVs derived from our algorithm with the sum of squared distance method and current COMS AMVs are compared with rawinsonde data, our products show a noticeable improvement over the COMS products: mean wind speed increases by $2.7\ \mathrm{m\,s^{-1}}$ and the SSB is reduced by 29%. However, the bias statistics show a negative impact at mid/low levels with our algorithm, and the number of vectors is reduced by 40% relative to COMS. Therefore, further study is required to improve accuracy for mid/low-level winds and to increase the number of AMVs.
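The sum of squared distance matching mentioned above can be sketched as a brute-force search of a target box inside a later image. This is only an illustration of the matching step: the GMM clustering and the image filtering are omitted, and the grid values, pixel size and time interval are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized 2-D grids."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_match(target, search):
    """Return the (row, col) offset in `search` where `target` fits best."""
    th, tw = len(target), len(target[0])
    sh, sw = len(search), len(search[0])
    best = None
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            window = [row[c:c + tw] for row in search[r:r + th]]
            score = ssd(target, window)
            if best is None or score < best[0]:
                best = (score, r, c)
    return best[1], best[2]


def speed_m_per_s(origin, match, pixel_km, minutes):
    """Convert a pixel displacement into a speed in metres per second."""
    dr, dc = match[0] - origin[0], match[1] - origin[1]
    distance_km = pixel_km * (dr ** 2 + dc ** 2) ** 0.5
    return distance_km * 1000 / (minutes * 60)


# Hypothetical brightness temperatures (K): a cold 2x2 target on a warm background.
target = [[200, 210], [205, 190]]
later_image = [
    [250, 250, 250, 250],
    [250, 250, 200, 210],
    [250, 250, 205, 190],
    [250, 250, 250, 250],
]
match = best_match(target, later_image)
speed = speed_m_per_s((1, 0), match, pixel_km=4, minutes=15)
```

Restricting the target to the coldest pixels, as the abstract describes, is meant to keep the pattern stable between images so that the minimum of this search is less ambiguous.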

The Effect of Non-genetic Factors on Birth Weight and Weaning Weight in Three Sheep Breeds of Zimbabwe

  • Assan, N.;Makuza, S.M.
    • Asian-Australasian Journal of Animal Sciences / v.18 no.2 / pp.151-157 / 2005
  • Sheep production is affected by genetic and non-genetic factors. Knowledge of these factors is essential for efficient management and for the accurate estimation of breeding values. The objective of this study was to establish the non-genetic factors that affect birth weight and weaning weight in Dorper, Mutton Merino and indigenous Sabi sheep breeds. A total of 2,625 birth and weaning weight records collected at Grasslands Research Station from 1991 through 1993 were used. The records were collected from indigenous Sabi (939), Dorper (807) and Mutton Merino (898) sheep. A mixed classification model containing the fixed effects of year, birth status and sex was used to identify non-genetic factors. Sire within breed was included as a random effect. Two-factor and three-factor interactions were important in indigenous Sabi, Mutton Merino and Dorper sheep. The mean birth weights were 4.37$\pm$0.04 kg, 4.62$\pm$0.04 kg and 3.29$\pm$0.04 kg for Mutton Merino, Dorper and Sabi sheep, respectively. Sire had significant effects (p<0.05) on birth weight in Mutton Merino and indigenous Sabi sheep. Year of lambing had significant effects (p<0.05) on birth weight in indigenous Sabi, Mutton Merino and Dorper sheep. The effect of birth status was non-significant in Dorper and Mutton Merino sheep, while it was significant on birth weight in indigenous Sabi sheep. In indigenous Sabi sheep, lambs born as singles (3.30$\pm$0.05 kg) were 0.23 kg heavier than twins (3.07$\pm$0.05 kg); in Mutton Merino, lambs born as singles (3.99$\pm$0.08 kg) were 0.07 kg heavier than twins (3.92$\pm$0.08 kg); and in Dorper, lambs born as singles (4.41$\pm$0.04 kg) were 0.02 kg heavier than twins (4.39$\pm$0.04 kg). On average males were heavier than females (p<0.05), weighing 3.32$\pm$0.04 kg vs. 3.05$\pm$0.07 kg in indigenous Sabi, 4.73$\pm$0.03 kg vs. 4.08$\pm$0.05 kg in Dorper and 4.26$\pm$0.07 kg vs. 3.66$\pm$0.09 kg in Mutton Merino sheep. The two-way interactions of sire*year, year*sex and sex*birth status had significant effects (p<0.05) on birth weight in indigenous Sabi, Mutton Merino and Dorper sheep, while the effect of year*birth status on birth weight was non-significant in indigenous Sabi sheep. The three-way interaction of year*sex*birth status had a significant effect (p<0.01) on birth weight in indigenous Sabi and Mutton Merino. Tupping weight fitted as a covariate had significant effects (p<0.001) on birth weight in indigenous Sabi, Mutton Merino and Dorper sheep. The mean weaning weights were 17.94$\pm$0.31 kg, 18.19$\pm$0.28 kg and 14.39$\pm$0.28 kg for Mutton Merino, Dorper and indigenous Sabi sheep, respectively. The effects of sire and sire*year on weaning weight were non-significant in Dorper and Mutton Merino, while year, sex and the sex*year interaction had significant effects (p<0.001) on weaning weight. On average males were heavier than females (p<0.001) at weaning. The respective weaning weights were 18.05$\pm$0.46 kg, 18.68$\pm$0.19 kg and 14.14$\pm$0.15 kg for males and 16.64$\pm$0.60 kg, 16.41$\pm$0.31 kg and 12.64$\pm$0.32 kg for females in Mutton Merino, Dorper and indigenous Sabi sheep. Lambs born as singles were significantly heavier at weaning than twins, by 0.05 kg, 0.06 kg and 0.78 kg for Mutton Merino, Dorper and indigenous Sabi sheep, respectively. The effect of tupping weight on weaning weight was highly significant. The three-way interaction year*sex*birth status had a significant effect (p<0.01) on weaning weight. Correction for environmental effects is necessary to increase the accuracy of direct selection for birth weight and weaning weight.

Habitat Distribution Change Prediction of Asiatic Black Bears (Ursus thibetanus) Using Maxent Modeling Approach (Maxent 모델을 이용한 반달가슴곰의 서식지 분포변화 예측)

  • Kim, Tae-Geun;Yang, DooHa;Cho, YoungHo;Song, Kyo-Hong;Oh, Jang-Geun
    • Korean Journal of Ecology and Environment / v.49 no.3 / pp.197-207 / 2016
  • This study aims to provide basic data for objectively evaluating the areas suitable for reintroduction of the Asiatic black bear (Ursus thibetanus), in order to preserve the species effectively in Korean protected areas, including national parks, and to support successful species restoration. To this end, this study predicted the potential habitats in East Asia, Southeast Asia and India, where appearances of Asiatic black bears have been recorded, using the Maxent model and environmental variables related to climate, topography, roads and land use. In addition, this study evaluated the effects of the relevant climate and environmental variables. It also analyzed the area of the range suitable for Asiatic black bears and its geographic change under future climate change. As for the predictive accuracy of the Maxent model, which is widely used in habitat distribution research for wildlife conservation, the AUC value was calculated as 0.893 (sd=0.121). The model was therefore useful for predicting the potential habitat of Asiatic black bears and for evaluating the characteristics of habitat change under future climate change. Compared with the distribution map of Asiatic black bears assessed by the IUCN, habitat suitability from the Maxent model was regionally diverse in the extant areas and low in the extinct areas of the IUCN map. This may reflect regional differences in the environmental conditions that Asiatic black bears inhabit. Among the environmental factors affecting the potential habitat distribution of Asiatic black bears, land cover type had the greatest influence on the inhabitation rate, compared with climate, topography and artificial factors such as distance from roads. In particular, deciduous broadleaf forest was predicted to be preferred over other land cover types. Annual mean precipitation and precipitation during the driest period were projected to have a greater effect than the annual temperature range, and the probability of inhabitation was higher the farther an area was from roads. The presumed reason is that Asiatic black bears prefer stable areas free from human intervention, as well as prey resources. The range was predicted to expand gradually to the southern part of India, China's southeast coast and adjacent inland area, and Vietnam, Laos and Malaysia in the eastern coastal areas of Southeast Asia. The following areas are forecast to be the core areas that Asiatic black bears can inhabit in Asia: the Jeonnam, Jeonbuk and Gangwon areas in South Korea; Kyushu, Chugoku, Shikoku, Chubu, Kanto and the border area of Tohoku in Japan; and the border area of Jiangxi, Zhejiang and Fujian in China. This study is expected to serve as basic data for the preservation and efficient management of Asiatic black bear habitat, the selection of release areas for artificially introduced bears, and the management of zones of conflict with humans.
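The AUC reported above for the Maxent model can be understood as the probability that a randomly chosen presence site receives a higher suitability score than a randomly chosen background site. A minimal sketch of that rank-based definition follows; the scores are hypothetical and the function name is illustrative.

```python
def auc(presence_scores, background_scores):
    """Area under the ROC curve from pairwise comparisons.

    Each presence/background pair contributes 1 when the presence site scores
    higher, 0.5 for a tie, and 0 otherwise.
    """
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical habitat-suitability scores in the range 0..1.
presence = [0.9, 0.8, 0.4]
background = [0.5, 0.3, 0.2]
value = auc(presence, background)
```

On this scale 0.5 means no better than chance, so a value near 0.9 indicates that presence sites are ranked well above the background in most comparisons.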

Discrimination of African Yams Containing High Functional Compounds Using FT-IR Fingerprinting Combined by Multivariate Analysis and Quantitative Prediction of Functional Compounds by PLS Regression Modeling (FT-IR 스펙트럼 데이터의 다변량 통계분석을 이용한 고기능성 아프리칸 얌 식별 및 기능성 성분 함량 예측 모델링)

  • Song, Seung Yeob;Jie, Eun Yee;Ahn, Myung Suk;Kim, Dong Jin;Kim, In Jung;Kim, Suk Weon
    • Horticultural Science & Technology / v.32 no.1 / pp.105-114 / 2014
  • We established a high-throughput screening system for African yam tuber lines with high contents of total carotenoids, flavonoids, and phenolic compounds, using ultraviolet-visible (UV-VIS) spectroscopy and Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. The total carotenoid contents of 62 African yam tubers varied from 0.01 to $0.91\ \mu\mathrm{g}\cdot\mathrm{g}^{-1}$ dry weight (wt). The total flavonoid and phenolic compound contents varied from 12.9 to $229\ \mu\mathrm{g}\cdot\mathrm{g}^{-1}$ and from 0.29 to $5.2\ \mathrm{mg}\cdot\mathrm{g}^{-1}$ dry wt, respectively. FT-IR spectra showed typical spectral differences in the frequency regions of 1,700-1,500, 1,500-1,300 and $1,100-950\ \mathrm{cm}^{-1}$. These spectral regions reflect the quantitative and qualitative variations of amide I and II from amino acids and proteins ($1,700-1,500\ \mathrm{cm}^{-1}$), phosphodiester groups from nucleic acids and phospholipids ($1,500-1,300\ \mathrm{cm}^{-1}$) and carbohydrate compounds ($1,100-950\ \mathrm{cm}^{-1}$). Principal component analysis (PCA) and subsequent partial least squares discriminant analysis (PLS-DA) were able to separate the 62 African yam tuber lines into three clusters corresponding to their taxonomic relationships. Quantitative prediction models for total carotenoids, flavonoids, and phenolic compounds of African yam tuber lines were established from FT-IR spectra using the partial least squares regression algorithm. The regression coefficients ($R^2$) between predicted values and estimated values of total carotenoids, flavonoids and phenolic compounds were 0.83, 0.86, and 0.72, respectively. These results show that total carotenoids, flavonoids, and phenolic compounds can be predicted quantitatively with high accuracy from the FT-IR spectra of African yam tuber lines. We therefore suggest that the quantitative prediction system established in this study could be applied as a rapid selection tool for high-yielding African yam lines.
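The $R^2$ values quoted above compare the contents predicted by the PLS models with the measured contents. The abstract does not define the statistic, so the sketch below assumes the squared Pearson correlation between the two series; the function name and the numbers are hypothetical.

```python
def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series with at least two values")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov * cov / (var_m * var_p)


# Hypothetical measured vs. predicted contents for four tuber lines.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 2.0, 4.0]
fit = r_squared(measured, predicted)
```

A value near 1 means the predicted ranking and spacing of lines closely follow the measured ones, which is what matters when the model is used to screen lines for selection.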