• Title/Summary/Keyword: Model validation


Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has gained attention, a great deal of research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, much of it written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can return results that are far from users' intentions. Although much progress has been made in recent years in enhancing search engine performance so that users receive appropriate results, there is still considerable room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense-tagging processes. It evaluates the effectiveness of the method with a Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. In the experiments, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities using cross validation. Only nouns, the target of word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged with the sense indices defined by that dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating the sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, terms were extracted from the input sentences and term vectors for the sentences were created. Given the extracted term vector and the sense vector model built during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study demonstrates the effectiveness of the corpus merged from the examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiments show that the merged corpus yields better precision and recall. This study suggests that the method can practically enhance the performance of Internet search engines and help to capture the meaning of a sentence more accurately in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. It assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
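The Naïve Bayes approach described in the abstract above can be illustrated with a minimal sketch. It assumes a tiny, made-up English training set in place of the dictionary examples and the Sejong Corpus, add-one (Laplace) smoothing, and hypothetical function names (`train`, `disambiguate`); none of these details come from the paper itself.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count senses and context words from (sense, context_tokens) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, tokens in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(tokens)
        vocab.update(tokens)
    return sense_counts, word_counts, vocab

def disambiguate(tokens, model):
    """Return the sense with the highest Naive Bayes log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                # add-one smoothing (an assumption of this sketch)
                score += math.log((word_counts[sense][tok] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Made-up sense-tagged contexts for the ambiguous word "bank".
examples = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "interest", "money"]),
    ("bank/river", ["river", "water", "fishing"]),
    ("bank/river", ["river", "mud", "boat"]),
]
model = train(examples)
print(disambiguate(["money", "loan"], model))
print(disambiguate(["river", "boat"], model))
```

Each context word contributes independently to the score, which is the independence assumption the abstract refers to.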

Comparison between Uncertainties of Cultivar Parameter Estimates Obtained Using Error Calculation Methods for Forage Rice Cultivars (오차 계산 방식에 따른 사료용 벼 품종의 품종모수 추정치 불확도 비교)

  • Young Sang Joh;Shinwoo Hyun;Kwang Soo Kim
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.3 / pp.129-141 / 2023
  • Crop models have been used to predict yield under diverse environmental and cultivation conditions, which can support decisions on the management of forage crops. Cultivar parameters are among the inputs required by crop models to represent the genetic properties of a given forage cultivar. The objective of this study was to compare calibration and ensemble approaches in order to minimize the uncertainty of crop yield estimates using the SIMPLE crop model. Cultivar parameters were calibrated using the log-likelihood (LL) and the Generic Composite Similarity Measure (GCSM) as objective functions for the Metropolis-Hastings (MH) algorithm. In total, 20 sets of cultivar parameters were generated for each method. Two types of ensemble approach were examined. The first was the average of the model outputs obtained with the individual parameter sets (Eem). The second was the model output obtained with the cultivar parameters averaged over the 20 sets (Epm). Comparisons were made for each cultivar and for each error calculation method. 'Jowoo' and 'Yeongwoo', forage rice cultivars used in Korea, were subjected to parameter calibration. Yield data were obtained from experimental fields at Suwon, Jeonju, Naju, and Iksan. Data for 2013, 2014, and 2016 were used for parameter calibration. For validation, yield data reported from 2016 to 2018 at Suwon were used. Initial calibration indicated that the genetic coefficients obtained by LL were distributed in a narrower range than those obtained by GCSM. A two-sample t-test comparing the ensemble approaches found no significant difference between them. The uncertainty of GCSM can be neutralized by adjusting the acceptance probability. The Epm results indicate that uncertainty can be reduced with less computation using an ensemble approach.
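The difference between the two ensemble approaches (Eem and Epm) can be sketched as follows. The yield function here is a hypothetical stand-in, not the SIMPLE crop model, and the parameter names and values are invented purely for illustration.

```python
from statistics import mean

def toy_yield_model(params, radiation):
    """Hypothetical stand-in for a crop model (NOT the SIMPLE model)."""
    rue, harvest_index = params  # invented parameter names
    return rue * radiation * harvest_index  # nonlinear in the parameters

def ensemble_of_outputs(param_sets, radiation):
    """Eem: run the model once per parameter set, then average the outputs."""
    return mean(toy_yield_model(p, radiation) for p in param_sets)

def output_of_mean_params(param_sets, radiation):
    """Epm: average the parameter sets first, then run the model once."""
    mean_params = tuple(mean(column) for column in zip(*param_sets))
    return toy_yield_model(mean_params, radiation)

# Three made-up parameter sets standing in for the 20 calibrated sets.
param_sets = [(1.0, 0.4), (1.2, 0.5), (1.4, 0.6)]
eem = ensemble_of_outputs(param_sets, radiation=1000.0)
epm = output_of_mean_params(param_sets, radiation=1000.0)
print(round(eem, 2), round(epm, 2))
```

Because the toy model is nonlinear in its parameters, the two estimates differ slightly. Epm needs a single model run where Eem needs one run per parameter set, which is the sense in which it requires less computation.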

Comparative study on the performance of Pod type waterjet by experiment and computation

  • Kim, Moon-Chan;Park, Warn-Gyu;Chun, Ho-Hwan;Jung, Un-Hwa
    • International Journal of Naval Architecture and Ocean Engineering / v.2 no.1 / pp.1-13 / 2010
  • A comparative study between computation and experiment has been conducted to predict the performance of a Pod type waterjet for an amphibious wheeled vehicle. The Pod type waterjet was chosen on the basis of the required specific speed of more than 2500. Because the Pod type waterjet is an extreme form of axial flow waterjet, theoretical and experimental work on it is very rare. The main purpose of the present study is to validate the in-house CFD code, based on the RANS equations, by comparing its results with the experimental results for the Pod type waterjet. The developed code was first validated against the experimental results of the well-known turbine problem. The validation was then extended to a flush type waterjet, for which the pressures along the duct surface and the velocities at the nozzle area were compared with experimental results. The Pod type waterjet was designed, and the performance of the designed waterjet system, including the duct, impeller, and stator, was analyzed with the in-house CFD code. The pressure distributions and limiting streamlines on the blade surfaces were computed to confirm the performance of the designed waterjet. In addition, the torque and momentum were computed to find the overall efficiency, and these were compared with the model test results. Measurements were taken of the flow rate at the nozzle exit, the static pressure at various sections along the duct and the nozzle, the revolution of the impeller, and the torque, thrust, and towing forces at various advance speeds, both for performance prediction and for comparison with the computations. Based on these measurements, the performance was analyzed according to the ITTC96 standard analysis method. The full-scale effective power and delivered power of the wheeled vehicle were estimated to predict the service speed. This paper emphasizes the confirmation of the ITTC96 analysis method and of the developed analysis code for the design and analysis of the Pod type waterjet system.

Determination of Precipitable Water Vapor from Combined GPS/GLONASS Measurements and its Accuracy Validation (GPS/GLONASS 통합관측자료를 이용한 가강수량 산출과 정확도 검증)

  • Sohn, Dong Hyo;Park, Kwan Dong;Kim, Yeon Hee
    • Journal of Korean Society for Geospatial Information Science / v.21 no.4 / pp.95-100 / 2013
  • Several types of observation equipment are used to determine water vapor content and precipitable water vapor (PWV) because water vapor is highly variable in time and space. In this study, we used GNSS systems such as GPS and GLONASS in standalone and combined modes to compute PWV and validated their accuracy against the results of other water-vapor monitoring systems. The other systems used were a radiosonde and a microwave radiometer, and the comparisons were convenient because all three systems were collocated at the test site. The mean differences in PWV were in the range of 0.6-3.4 mm, and their standard deviations were 1.0-3.8 mm. The relatively large difference between GNSS and the other two systems is believed to be caused by the fact that the GNSS antenna used in this study was of a kind for which the international standard phase center variation (PCV) calibration is not available. We expect better accuracy and improved availability of PWV determination through integrated processing of GPS/GLONASS data when an appropriate antenna with a PCV correction model is used.
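The kind of accuracy comparison summarized above (mean difference and standard deviation between two PWV series) can be sketched as follows. This assumes a signed mean difference and the sample standard deviation; the abstract does not state which definitions were used, and the function name and the PWV values are made up for illustration.

```python
from statistics import mean, stdev

def compare_pwv(gnss_mm, reference_mm):
    """Return (mean difference, standard deviation) of GNSS minus reference PWV.

    Assumes both series are time-matched and given in millimetres.
    """
    diffs = [g - r for g, r in zip(gnss_mm, reference_mm)]
    return mean(diffs), stdev(diffs)

# Made-up, time-matched PWV values in mm (not data from the study).
gnss = [20.0, 25.0, 30.0, 35.0]
radiosonde = [19.0, 23.0, 29.0, 35.0]
bias, spread = compare_pwv(gnss, radiosonde)
print(f"mean difference = {bias:.2f} mm, std = {spread:.2f} mm")
```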

A Study on the Influence of Originality and Usefulness of Artificial Intelligence Music Products on Consumer Perceived Attractiveness and Purchase intention

  • Meilin, Jin
    • Journal of the Korea Society of Computer and Information / v.25 no.9 / pp.45-52 / 2020
  • In this paper, we study Chinese consumers' intention to purchase smart music products. We examine the influence of the originality and usefulness of intelligent music products on the purchase intention of Chinese consumers and explore how these two attributes affect it. To this end, 372 questionnaires were collected through the Internet, and frequency analysis, factor analysis, reliability analysis, and structural equation analysis were carried out on the data using SPSS v22.0 and AMOS v22.0. The hypotheses in the model were tested to reveal the psychological and behavioral responses of consumers to smart music products. The results show that the originality and usefulness of new products not only directly affect the purchase intention of Chinese consumers but also indirectly affect it by enhancing the products' perceived attractiveness. The conclusions of this study offer guidance for the development of intelligent music products and of marketing strategies for them.

Validation of Equivalent Shear Beam Container Using Dynamic Centrifuge Tests (동적 원심모형실험을 이용한 등가전단보 토조의 성능 검증)

  • Kim, Yoon-Ah;Lee, Hae-In;Ko, Kil-Wan;Kim, Dong-Soo
    • Journal of the Korean Geotechnical Society / v.36 no.11 / pp.61-70 / 2020
  • In dynamic centrifuge tests, an equivalent shear beam (ESB) container minimizes the boundary effect between the soil model and the wall of the container so as to effectively simulate the boundary conditions of the real field. The ESB container at KAIST was evaluated to be performing properly by Lee et al. (2013). However, it is necessary to re-evaluate its performance because the container may have deteriorated over time. Thus, the performance of the eight-year-old ESB container was re-evaluated through dynamic centrifuge tests. First, the natural period of the empty ESB container was compared with the results of Lee et al. (2013). Then the boundary effect of the sand-filled ESB container was evaluated. The results show that the dynamic behavior of the sand-filled ESB container was similar to that of the ground, despite a decrease in the natural period of the empty ESB container over time. In addition, the dynamic response of the ground built in the ESB container was similar to that of the same ground simulated through numerical analysis with free-field boundary conditions. Therefore, the boundary effect of the ESB container due to the decrease in the natural period was found not to be significant.

Construction of a reference stature growth curve using spline function and prediction of final stature in Korean (스플라인 함수를 이용한 한국인 키 기준 성장 곡선 구성과 최종 키 예측 연구)

  • An, Hong-Sug;Lee, Shin-Jae
    • The Korean Journal of Orthodontics / v.37 no.1 s.120 / pp.16-28 / 2007
  • Objective: Evaluation of individual growth is important in orthodontics. The aim of this study was to develop convenient software that can evaluate current growth status and predict further growth. Methods: Stature data of 2 to 20 year-old Koreans (4893 boys and 4987 girls) were extracted from nationwide data. Age- and sex-specific continuous functions describing percentile growth curves were constructed using the natural cubic spline function (NCSF). Then, a final stature prediction algorithm was developed, and its validity was tested using longitudinal series of stature measurements from 200 randomly selected samples. Various accuracy measurements and analyses of the errors between observed stature and stature predicted with the NCSF growth curves were performed. Results: NCSF growth curves were shown to be excellent models for describing the reference percentile stature growth curve over age. The prediction accuracy compared favorably with that of previous prediction models and was even higher. The current prediction models gave more accurate results in girls than in boys. Although the prediction accuracy was high, the error pattern of the validation data showed that in most cases many residuals had the same sign, suggesting autocorrelation among them. Conclusion: A more sophisticated growth prediction algorithm is warranted to improve the goodness of fit of the model for individual growth.
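A natural cubic spline of the kind used for the growth curves above can be sketched in plain Python. This is the textbook construction, not the authors' software; the function name is hypothetical, and the ages and statures in the usage example are made-up illustrative values rather than data from the study.

```python
from bisect import bisect_right

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Right-hand side of the tridiagonal system for the curvature terms.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep; "natural" means zero second derivative at both ends.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution for the cubic coefficients of each interval.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        # Pick the interval containing x, clamped to the first/last interval.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate

# Made-up median statures (cm) at a few ages (years), for illustration only.
ages = [2, 6, 10, 14, 18]
statures = [87.0, 115.0, 138.0, 160.0, 172.0]
curve = natural_cubic_spline(ages, statures)
print(round(curve(12), 1))  # interpolated stature at age 12
```

The spline passes through every reference point and is smooth between them, so a percentile curve tabulated at discrete ages can be evaluated at any age in between.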

An Analysis on Curriculum Content of Child Nursing in Korea (아동간호학 국가시험문제 보완을 위한 교과목 강의 내용 분석)

  • Cho Kyoul Ja;Song Ji Ho;Choe Myoung Ae;Shin Hee Sun;Kim Soon Ae;Jung Hyun Sook;Tak Young Ran
    • Child Health Nursing Research / v.4 no.1 / pp.5-16 / 1998
  • The purpose of nursing education is to prepare professional practitioners, as nurses, who will be interested in health and the related aspects of the community and who will assume responsibility for contributing to the improvement of health for all. This means that nursing education must provide opportunities for the development of the knowledge, skills, and attitudes that make this possible. Consequently, this approach has relevance for nursing education. Faculty have engaged in endless debates about what is to be included, to what depth, and what will be given short shrift as a result. Thus, it can be seen why there is so much confusion and lack of agreement about the emphasis and objectives in nursing. This study attempted to review and identify the curriculum content of child nursing in Korea in order to build and develop standard curriculum content for the national board examination for nurses and for children's health needs in the coming 21st century. The questionnaire consisted of items on the selection and organization of the knowledge components and the types of unit, with the weight to be attained in child nursing. Responses were received from 34% of the nursing programs in universities and junior colleges. Content analysis was done using consensual validation of essential knowledge for curriculum content to identify what is obvious or trivial. This study pointed out that it is not yet apparent that demographic facts have greatly influenced child nursing curriculum content. In a similar vein, the majority of child nursing content devotes little time and weight to issues that are socially and epidemiologically significant to child health. The content of child nursing may need to push a paradigm shift in nursing education toward areas such as health promotion and prevention and the potential roles of the child and family. In conclusion, it is time to convene and debate in order to converge on a model of essential content and to standardize job analysis for the national board exam for nurses in Korea.


Development and Validation of the Stand Density Management Diagram for Pinus densiflora Forests in Korea (소나무 임분밀도관리도 작성 및 실용성 검정)

  • Park, Joon Hyung;Lee, Kwang Soo;Yoo, Byung Oh;Park, Yong Bae;Jung, Su Young
    • Journal of Korean Society of Forest Science / v.105 no.3 / pp.342-350 / 2016
  • This study aims to construct a stand density management diagram, which is very useful for establishing systematic management plans and achieving management goals in Pinus densiflora forests. To estimate the five models that make up the stand density management diagram, we used a total of 1,886 sample plots in which pine trees accounted for more than 75% of the total basal area of each stand. To test the goodness of fit, $\chi^2$ was computed with a significance level of 5% and an acceptable error range of 20%. The standard deviation of the model was $34.59m^3{\cdot}ha^{-1}$, the minimum acceptable error range was 16.59%, and the coefficient of variation was 22.11%. The stand density management diagram would be useful for establishing timber yield and thinning plans based on an understanding of the pathway of stand density management.

Recommendation of Nitrogen Topdressing Rates at Panicle Initiation Stage of Rice Using Canopy Reflectance

  • Nguyen, Hung T.;Lee, Kyu-Jong;Lee, Byun-Woo
    • Journal of Crop Science and Biotechnology / v.11 no.2 / pp.141-150 / 2008
  • The response of grain yield (GY) and milled-rice protein content (PC) to crop growth status and nitrogen (N) rates at panicle initiation stage (PIS) is critical information for prescribing the topdress N rate at PIS (Npi) for target GY and PC. Three split-split-plot experiments including various N treatments and rice cultivars were conducted at the Experimental Farm, Seoul National University, Korea in 2003-2005. Shoot N density (SND, g N in shoot $m^{-2}$) and canopy reflectance were measured before N application at PIS, and GY, PC, and SND were measured at harvest. Data from the first two years (2003-2004) were used to calibrate predictive models for GY, PC, and SND accumulated from PIS to harvest, using SND at PIS and Npi as predictors in multiple stepwise regression. The calibrated models were then used to calculate the N requirement at PIS for each of nine plots in the 2005 experiment, based on the target PC of 6.8% and the values of SND at PIS estimated by the canopy reflectance method. The results showed that SND at PIS in combination with Npi successfully predicted GY, PC, and SND from PIS to harvest in the calibration dataset, with coefficients of determination ($R^2$) of 0.87, 0.73, and 0.82 and relative errors in prediction (REP) of 5.5, 4.3, and 21.1%, respectively. In general, the calibrated model equations showed slightly lower performance in calculating GY, PC, and SND in the validation dataset (data from 2005), but the REP, ranging from 3.3% for PC to 13.9% for SND accumulated from PIS to harvest, was acceptable. The nitrogen rate prescription treatment (PRT) for the target PC of 6.8% reduced the coefficient of variation in PC from 4.6% in the fixed rate treatment (FRT, 3.6 g N $m^{-2}$) to 2.4% in PRT, and the average PC of PRT was 6.78%, very close to the target PC of 6.8%. In addition, PRT increased GY by 42.1 g $m^{-2}$ while Npi increased by 0.63 g $m^{-2}$ compared to the FRT, resulting in a high agronomic N-use efficiency of 68.8 kg grain per additional kg N. The high agronomic N-use efficiency might have resulted from the higher response of grain yield to the applied N in the prescribed N rate treatment, because the N rate was prescribed based on the crop growth and N status of each plot.
