• Title/Summary/Keyword: 벡터합 (vector sum)


A Korean Community-based Question Answering System Using Multiple Machine Learning Methods (다중 기계학습 방법을 이용한 한국어 커뮤니티 기반 질의-응답 시스템)

  • Kwon, Sunjae;Kim, Juae;Kang, Sangwoo;Seo, Jungyun
    • Journal of KIISE / v.43 no.10 / pp.1085-1093 / 2016
  • A community-based question answering system provides answers to questions from documents uploaded to web communities. To improve question analysis, earlier methods either developed rules tailored to a target domain or applied machine learning to only part of the process. However, these methods incur excessive cost when expanding to new fields, or lead to systems that are overfitted to a specific field. This paper proposes a multiple machine learning method that automates the overall process by applying an appropriate machine learning technique at each step of a community-based question answering system. The system is divided into a question analysis part and an answer selection part. The question analysis part consists of the question focus extractor, which identifies the focused phrases in questions using conditional random fields, and the question type classifier, which classifies the topics of questions using a support vector machine. In the answer selection part, we train the weights used by the similarity estimation models through an artificial neural network. In addition, there are many cases in which the results of morphological analysis are unreliable for data uploaded to web communities. Therefore, we suggest a method that minimizes the impact of morphological analysis by using character features in the question analysis stage. The proposed system outperforms the former system, achieving a Mean Average Precision of 0.765 and an R-Precision of 0.872.
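
The two evaluation measures reported above can be sketched as follows. This is a minimal illustration of the standard definitions of Mean Average Precision and R-Precision, not the authors' evaluation script; the function names and the toy answer identifiers are assumptions made for the example.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at every rank k that holds a relevant answer."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, answer in enumerate(ranked, start=1):
        if answer in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of relevant answers among the top R results, R = number of relevant answers."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for answer in ranked[:r] if answer in relevant) / r


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of the per-question average precision values."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical ranked answers for two questions (identifiers are made up).
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = (1/2) / 1
]
print(mean_average_precision(runs))
print([r_precision(ranked, rel) for ranked, rel in runs])
```

Both measures lie between 0 and 1, and higher is better, so they can be read on the same scale as the 0.765 and 0.872 figures in the abstract.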

Speaker-Independent Korean Digit Recognition Using HCNN with Weighted Distance Measure (가중 거리 개념이 도입된 HCNN을 이용한 화자 독립 숫자음 인식에 관한 연구)

  • 김도석;이수영
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.10 / pp.1422-1432 / 1993
  • The nonlinear mapping function of the HCNN (Hidden Control Neural Network) can change over time to model the temporal variability of a speech signal, combining the nonlinear prediction of conventional neural networks with the segmentation capability of the HMM. This paper makes two contributions. First, we show that the performance of the HCNN is better than that of the HMM. Second, we propose an HCNN whose prediction error is measured by a weighted distance, so that a distance measure suited to the HCNN is used, and we show the superiority of the proposed system for speaker-independent speech recognition tasks. The weighted distance takes into account the differences between the variances of the components of the feature vector extracted from the speech data. In a speaker-independent Korean digit recognition experiment, a recognition rate of 95% was obtained for the HCNN with Euclidean distance. This result is 1.28% higher than that of the HMM, and shows that the HCNN, which models the dynamical system, is superior to the HMM, which is based on statistical restrictions. We obtained 97.35% for the HCNN with weighted distance, which is 2.35% better than the HCNN with Euclidean distance. The HCNN with weighted distance performs better because it reduces the variation of the recognition error rate across speakers by increasing the recognition rate for speakers who have many misclassified utterances. We therefore conclude that the HCNN with weighted distance is more suitable for speaker-independent speech recognition tasks.
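
The difference between a plain Euclidean prediction error and a variance-aware one can be sketched as below. The abstract does not give the exact weighting, so this example assumes inverse-variance weights (each squared component difference divided by that component's variance); the function names and the toy feature frames are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def component_variances(frames: Sequence[Sequence[float]]) -> List[float]:
    """Population variance of each feature component over a set of frames."""
    return [pvariance(column) for column in zip(*frames)]


def squared_euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Unweighted squared distance: every component counts equally."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def squared_weighted(x: Sequence[float], y: Sequence[float],
                     variances: Sequence[float], eps: float = 1e-12) -> float:
    """Assumed weighting: divide each squared difference by the component variance."""
    return sum((a - b) ** 2 / (v + eps) for a, b, v in zip(x, y, variances))


# Toy frames: the second component varies far more than the first.
frames = [[0.0, 0.0], [2.0, 20.0]]
variances = component_variances(frames)  # variance 1 and 100 for the two components

predicted, observed = [0.0, 0.0], [1.0, 10.0]
print(squared_euclidean(predicted, observed))            # dominated by component 2
print(squared_weighted(predicted, observed, variances))  # both components contribute equally
```

Under this assumption, a high-variance component no longer dominates the prediction error, which is one plausible reading of why a weighted measure would even out error rates across speakers.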


Comparison of Electrocardiographic Time Intervals, Amplitudes and Vectors in 7 Different Athletic Groups (운동종목별(運動種目別) 선수(選手)의 심전도시간간격(心電圖時間間隔), 파고(波高) 및 벡터의 비교(比較))

  • Kwon, Ki-Young;Lee, Won-Jung;Hwang, Soo-Kwan;Choo, Young-Eun
    • The Korean Journal of Physiology / v.19 no.1 / pp.61-72 / 1985
  • In order to compare the cardiac function of various groups of athletes, the resting electrocardiographic time intervals, amplitudes and vectors were analyzed in high school athletes of throwing (n=7), jumping (n=11), short track (n=8), long track (n=14), boxing (n=7), volleyball (n=8) and baseball (n=9), and in nonathletic control students (n=19). All athletic groups showed a significantly longer R-R interval (0.96-1.09 sec) than the controls (0.78 sec). Therefore, the heart rate was significantly slower in the athletes than in the controls, but did not differ among the athletic groups. The R-R interval is the sum of the P-R, Q-T and T-P intervals: the P-R and Q-T intervals showed no difference among the control and athletic groups, but the T-P interval in the jumping, short track, long track and boxing groups was significantly longer than in the controls. The R-R interval showed a significant correlation with the T-P or Q-T intervals but no correlation with the P-R interval or QRS complex. Comparing the amplitudes of the electrocardiographic waves, the athletic groups showed a trend toward a lower P wave than the controls. The T wave in lead $V_5\;(Tv_5)$ was similar in the athletic and control groups. The long track group showed significantly higher $Rv_5$, $Sv_1$, and sum of $Rv_5$ and $Sv_1$ than not only the controls but also the other athletic groups. The angles of the P, QRS, and T vectors in the frontal and horizontal planes did not differ among the control and athletic groups. Every athletic group showed a trend toward a lower P vector amplitude in the frontal plane, while in the horizontal plane the throwing, jumping, short track and baseball groups showed significantly lower amplitudes than the controls. The amplitudes of the QRS and T vectors were similar in the athletic and control groups, except that the baseball group showed a significantly higher QRS vector in the frontal plane. Taken together, all the athletic groups showed a slower heart rate than the controls, mainly because of an elongated T-P interval. Comparing the electrocardiographic waves and vectors, the athletic groups showed lower amplitudes of the P wave and P vector than the controls. The values of $Rv_5$ and $Sv_1$ strongly suggest that, among the athletic groups, only the long distance runners had developed left ventricular hypertrophy.
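
The arithmetic linking the R-R interval to heart rate and to the T-P interval can be sketched as follows. The R-R values are the ones quoted in the abstract; the P-R and Q-T values are hypothetical placeholders chosen only to illustrate the decomposition, since the abstract does not report them.

```python
def heart_rate_bpm(rr_seconds: float) -> float:
    """Heart rate in beats per minute from the R-R interval in seconds."""
    return 60.0 / rr_seconds


def tp_interval(rr_seconds: float, pr_seconds: float, qt_seconds: float) -> float:
    """T-P interval from the relation R-R = P-R + Q-T + T-P stated in the abstract."""
    return rr_seconds - pr_seconds - qt_seconds


# R-R intervals from the abstract: controls, and the range reported for athletes.
for label, rr in [("control", 0.78), ("athlete, low", 0.96), ("athlete, high", 1.09)]:
    print(label, round(heart_rate_bpm(rr), 1), "bpm")

# Hypothetical P-R and Q-T values, held equal for both groups (not from the paper).
PR, QT = 0.16, 0.36
print(round(tp_interval(0.78, PR, QT), 2), round(tp_interval(0.96, PR, QT), 2))
```

With P-R and Q-T held fixed, as the abstract reports for all groups, any lengthening of the R-R interval appears entirely in the T-P interval, which is the reasoning behind the abstract's closing statement.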


Development of Highway Safety Evaluation Considering Design Consistency using Acceleration (가속도를 고려한 도로의 설계일관성 평가기법에 관한 연구)

  • 하태준;박제진;김유철
    • Journal of Korean Society of Transportation / v.21 no.1 / pp.127-136 / 2003
  • Under current road design practice, road safety is defined in terms of minimum design standards, and the design review process consists of checking against those standards. In practice, however, road safety depends not only on the individual elements of a road but also on the road shape, such as the transitions between straight and curved sections and between successive curves. It is also related to the horizontal and vertical alignments and to the cross section. That is, a practical road design should be examined in terms of both three-dimensionality and continuity (consistency), because an actual road is a continuous 3-dimensional object. This paper presents an acceleration-based concept for evaluating road consistency that considers the actual 3-dimensional road shape. Vehicle acceleration affects road consistency through the running state of the vehicle and the state of the driver. The magnitude of acceleration, in particular, strongly affects drivers. On this basis, the acceleration at each point on a 3-D road can be calculated, and the displacement can then be obtained. Computing the acceleration means combining the components calculated along each axis. The speed profile follows “Development of a safety evaluation model for highway horizontal alignment based on running speed” (Jeong, Jun-Hwa, 2001), and the acceleration is calculated from that speed profile. Based on a literature review, a definition of 3-D acceleration and a g-g-g diagram are established. For example, if the evaluated acceleration falls outside the allowable range, the road is judged inconsistent. The paper calculates the change in acceleration on a hypothetical road designed to the minimum standards and attempts to apply that change to consistency evaluation. However, accurate acceleration values are not obtained, because the speed forecasting model is limited and the paper does not consider the state of the vehicle (suspension, tires and vehicle model). If the speed profile is defined precisely, acceleration can be calculated for all road shapes, such as compound curves and clothoid curves, and then applied to consistency evaluation. Unfortunately, speed forecasting models for 3-D roads and compound curves have rarely been presented. Speed forecasting and speed profile models need to be established, and consistency evaluation standards need to be developed and verified with experimental vehicles.
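
Combining per-axis components into one acceleration magnitude can be sketched as a vector sum. The component formulas here are textbook kinematics assumed for illustration (constant acceleration between two stations of a speed profile, and v²/R on horizontal and vertical curves); the abstract does not state the paper's own formulation, and all function names and numbers are hypothetical.

```python
import math


def longitudinal_accel(v1: float, v2: float, ds: float) -> float:
    """Constant-acceleration estimate between two stations ds metres apart (m/s^2)."""
    return (v2 ** 2 - v1 ** 2) / (2.0 * ds)


def curve_accel(v: float, radius: float) -> float:
    """Centripetal acceleration v^2 / R; a straight section (R = inf) gives 0."""
    return v ** 2 / radius


def resultant_accel(ax: float, ay: float, az: float) -> float:
    """Magnitude of the vector sum of the three axis components."""
    return math.sqrt(ax * ax + ay * ay + az * az)


# Hypothetical point on a speed profile: speeding up from 20 to 22 m/s over 42 m,
# on a horizontal curve of radius 200 m and a vertical curve of radius 2000 m.
ax = longitudinal_accel(20.0, 22.0, 42.0)
ay = curve_accel(21.0, 200.0)
az = curve_accel(21.0, 2000.0)
print(ax, ay, az, resultant_accel(ax, ay, az))
```

Evaluating this magnitude at successive stations gives a profile that could be compared against an allowable range, in the spirit of the consistency check described in the abstract.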

Identifying sources of heavy metal contamination in stream sediments using machine learning classifiers (기계학습 분류모델을 이용한 하천퇴적물의 중금속 오염원 식별)

  • Min Jeong Ban;Sangwook Shin;Dong Hoon Lee;Jeong-Gyu Kim;Hosik Lee;Young Kim;Jeong-Hun Park;ShunHwa Lee;Seon-Young Kim;Joo-Hyon Kang
    • Journal of Wetlands Research / v.25 no.4 / pp.306-314 / 2023
  • Stream sediments are an important component of water quality management because they receive various pollutants, such as heavy metals and organic matter emitted from upland sources, and can act as secondary pollution sources that adversely affect the water environment. To manage stream sediments effectively, the primary sources of sediment contamination must be identified and source-specific control strategies developed. We evaluated the performance of machine learning models in identifying primary sources of sediment contamination based on the physico-chemical properties of stream sediments. A total of 356 stream sediment data sets covering 18 quality parameters, including 10 heavy metal species (Cd, Cu, Pb, Ni, As, Zn, Cr, Hg, Li, and Al), 3 soil parameters (clay, silt, and sand fractions), and 5 water quality parameters (water content, loss on ignition, total organic carbon, total nitrogen, and total phosphorus), were collected near abandoned metal mines and industrial complexes across the four major river basins in Korea. Two machine learning algorithms, linear discriminant analysis (LDA) and support vector machine (SVM) classifiers, were used to classify the sediments into four cases defined by combinations of sampling period and location (i.e., mine in dry season, mine in wet season, industrial complex in dry season, and industrial complex in wet season). Both models performed well in the classification, with SVM outperforming LDA; the accuracies of LDA and SVM were 79.5% and 88.1%, respectively. An SVM ensemble model was used for multi-label classification of multiple contamination sources, including land uses in the upland areas within a 1 km radius of the sampling sites. The results showed that the multi-label classifier performed comparably to the single-label SVM in classifying mines and industrial complexes, but was less accurate in classifying dominant land uses (50~60%). The poor performance of the multi-label SVM is likely due to overfitting caused by the small data set relative to the complexity of the model. A larger data set might improve the performance of the machine learning models in identifying contamination sources.

Development of Beauty Experience Pattern Map Based on Consumer Emotions: Focusing on Cosmetics (소비자 감성 기반 뷰티 경험 패턴 맵 개발: 화장품을 중심으로)

  • Seo, Bong-Goon;Kim, Keon-Woo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.179-196 / 2019
  • Recently, the "Smart Consumer" has been emerging. He or she is increasingly inclined to search for and purchase products by taking into account personal judgment or expert reviews rather than by relying on information delivered through manufacturers' advertising. This is especially true when purchasing cosmetics. Because cosmetics act directly on the skin, consumers respond seriously to dangerous chemical elements they contain or to skin problems they may cause. Above all, cosmetics should fit well with the purchaser's skin type. In addition, changes in global cosmetics consumer trends make it necessary to study this field. The desire to find one's own individualized cosmetics is evident among consumers around the world and is known as "Finding the Holy Grail." Many consumers show a deep interest in customized cosmetics with the cultural boom known as "K-Beauty" (an aspect of "Han-Ryu"), the growth of personal grooming, and the emergence of "self-culture" that includes "self-beauty" and "self-interior." These trends have led to the explosive popularity of cosmetics made in Korea in the Chinese and Southeast Asian markets. In order to meet consumers' needs for customized cosmetics, cosmetics manufacturers and related companies are responding by concentrating on delivering premium services through convergence with ICT (Information and Communication Technology). Although companies' responses to the market trend toward customized cosmetics have evolved, there is no "Intelligent Data Platform" that deals holistically with consumers' skin condition experience and thus attaches emotions to products and services. To find the Holy Grail of customized cosmetics, it is important to acquire and analyze consumer data on what they want in order to address their experiences and emotions.
The emotions consumers express when purchasing cosmetics vary by age, sex, skin type, and specific skin issues, and influence what price is considered reasonable. Therefore, it is necessary to classify emotions regarding cosmetics by individual consumer. Because of its importance, consumer emotion analysis has been used for both services and products. Given the trends identified above, we judge that consumer emotion analysis can be used in our study. Therefore, we collected and indexed data on consumers' emotions regarding their cosmetics experiences, focusing on consumers' language. We crawled cosmetics emotion data from SNS (blogs and Twitter) according to sales ranking ($1^{st}$ to $99^{th}$), focusing on the ampoule/serum category. A total of 357 emotional adjectives were collected, and we combined and abstracted similar or duplicate emotional adjectives. We conducted a "Consumer Sentiment Journey" workshop to build a "Consumer Sentiment Dictionary," and this resulted in a total of 76 emotional adjectives regarding the cosmetics consumer experience. Using these 76 emotional adjectives, we performed clustering with the Self-Organizing Map (SOM) method. As a result of the analysis, we derived eight final clusters of cosmetics consumer sentiments. Using the vector values of each node for each cluster, the characteristics of each cluster were derived based on the top ten most frequently appearing consumer sentiments. Different characteristics were found in consumer sentiments in each cluster. We also developed a cosmetics experience pattern map. The study results confirmed that recommendation and classification systems that consider consumer emotions and sentiments are needed because each consumer differs in what he or she pursues and prefers.
Furthermore, this study reaffirms that the application of emotion and sentiment analysis can be extended to various fields other than cosmetics, and it implies that consumer insights can be derived using these methods. We expect that these methods can be used not only to build a specialized sentiment dictionary using scientific processes and "Design Thinking Methodology," but also to help us understand consumers' psychological reactions and cognitive behaviors. If this study is further developed, we believe that it will be able to provide solutions based on consumer experience, and therefore that it can be developed as an aspect of marketing intelligence.
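
The Self-Organizing Map clustering mentioned in the abstract can be sketched with a minimal online update rule. This is a generic SOM on a 1-D chain of nodes with a Gaussian neighbourhood, written under stated assumptions: the abstract does not describe the map topology, the input vectors for the 76 adjectives, or the training schedule, so the function names, the linear decay, and the toy data are all hypothetical.

```python
import math
from typing import List

Vector = List[float]


def best_matching_unit(weights: List[Vector], x: Vector) -> int:
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def som_step(weights: List[Vector], x: Vector, lr: float, sigma: float) -> int:
    """One online update: pull every node toward x, scaled by its chain
    distance to the best matching unit. Modifies weights in place."""
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))  # neighbourhood weight
        for d in range(len(w)):
            w[d] += lr * h * (x[d] - w[d])
    return bmu


def train_som(weights: List[Vector], data: List[Vector], epochs: int = 20,
              lr0: float = 0.5, sigma0: float = 1.0) -> List[Vector]:
    """Repeated passes over the data with linearly decaying rate and radius."""
    total = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in data:
            frac = 1.0 - t / total  # goes from 1 toward 0 over training
            som_step(weights, x, lr0 * frac, max(sigma0 * frac, 1e-3))
            t += 1
    return weights


# Toy 2-D data and a two-node map (values are made up for illustration).
nodes = [[0.0, 0.0], [10.0, 10.0]]
samples = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [10.0, 9.0]]
train_som(nodes, samples)
print([best_matching_unit(nodes, s) for s in samples])
```

After training, each input is assigned to the cluster of its best matching unit, and the node weight vectors can be inspected to characterize each cluster, which corresponds to the abstract's use of node vector values.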