• Title/Summary/Keyword: 유사도 가중치 (similarity weight)


An Efficient kNN Algorithm (효율적인 kNN 알고리즘)

  • Lee Jae Moon
    • The KIPS Transactions:PartB / v.11B no.7 s.96 / pp.849-854 / 2004
  • This paper proposes an algorithm that reduces the execution time of kNN in document classification. The proposed algorithm lowers the cost of computing the similarity between two documents by using the list of pairs, whereas conventional kNN uses the list of pairs. The list of pairs can be obtained by applying matrix transposition to the list of pairs during the training phase of document classification. The paper analyzes the time complexity of the proposed algorithm and compares it with that of conventional kNN, and also compares the two experimentally on the Reuters-21578 data. The experimental results show that the proposed algorithm outperforms conventional kNN by about 90% in terms of execution time.
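
The transposition idea in this abstract can be illustrated with an inverted index. The sketch below is a minimal illustration, not the paper's algorithm: the abstract as listed does not say what the pairs contain, so the term -> (document, weight) layout, the dot-product similarity, and the names `build_inverted_index` and `knn_classify` are all assumptions.

```python
from collections import defaultdict


def build_inverted_index(train_docs):
    """Transpose doc -> {term: weight} into term -> [(doc_id, weight), ...].

    Assumed layout: each training document is a dict of term weights.
    """
    index = defaultdict(list)
    for doc_id, vec in enumerate(train_docs):
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index


def knn_classify(query, index, labels, k=3):
    """Score only documents that share a term with the query, then vote."""
    scores = defaultdict(float)
    for term, q_weight in query.items():
        # .get avoids creating empty entries for unseen terms.
        for doc_id, d_weight in index.get(term, ()):
            scores[doc_id] += q_weight * d_weight  # dot-product similarity
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    votes = defaultdict(float)
    for doc_id, score in top:
        votes[labels[doc_id]] += score  # similarity-weighted vote
    return max(votes, key=votes.get) if votes else None
```

With this layout the query touches only the posting lists of its own terms, so documents that share no term with the query are never visited.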

Traffic Violation Fine Standard by the Severity and the Number of Total/Fatal Accidents (교통/사망 사고 발생건수 및 보도에 의한 범칙금 부과 방안)

  • 이태경;장명순
    • Journal of Korean Society of Transportation / v.16 no.4 / pp.89-98 / 1998
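  • The causes of traffic accidents are classified into human factors, vehicle factors, and road-environment factors. Under given road and vehicle conditions, the driver bears the final responsibility for controlling safety. Therefore, to prevent traffic accidents in advance, drivers' traffic law violations are punished under the Road Traffic Act by imprisonment, fines, detention, minor fines, administrative fines, or penalty fines. When traffic law violations are enforced, punishment should be differentiated, including the intensity of enforcement, according to the likelihood of causing an accident and the degree of risk. To propose a standard for traffic penalty fines, traffic accidents and traffic law violations over the five years 1991-1995 were analyzed. Among all enforcement of traffic law violations, the share targeting accident-related violations (moving violations that cause accidents) was 44%, much lower than the 61% in Japan. Enforcement based on the likelihood of causing accidents therefore needs to be strengthened. A severity model and a frequency model were then compared as schemes for setting fines. The severity model, which considers the cost of accidents caused by violations, was found unsuitable because it did not differentiate fines clearly and showed little discriminating power. For the frequency model, the weight (w) between the number of accidents and the number of fatal accidents was set by comparison with data from Japan, where moving violations are similar to those in Korea; $\chi^2$ was lowest, at 31.71, when the weights were 0.7 for Korea and 0.8 for Japan. Accordingly, a weight of 0.7 was applied to the number of accidents and 0.3 to the number of fatal accidents. Finally, the current fines and the proposed fines were compared.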


Moving Objects Modeling for Supporting Content and Similarity Searches (내용 및 유사도 검색을 위한 움직임 객체 모델링)

  • 복경수;김미희;신재룡;유재수;조기형
    • Journal of Korea Multimedia Society / v.7 no.5 / pp.617-632 / 2004
  • Video data includes moving objects whose spatial positions change over time. In this paper, we propose a new modeling method for moving objects contained in video data. To retrieve moving objects effectively, the proposed method represents the spatial position and size of a moving object. It also represents the visual features and the trajectory by considering the direction, distance, and speed of moving objects over time. It therefore allows various types of retrieval, such as visual-feature-based similarity retrieval, distance-based similarity retrieval, trajectory-based similarity retrieval, and weighted retrieval that mixes these types.


Performance Improvement in Speech Recognition by Weighting HMM Likelihood (은닉 마코프 모델 확률 보정을 이용한 음성 인식 성능 향상)

  • 권태희;고한석
    • The Journal of the Acoustical Society of Korea / v.22 no.2 / pp.145-152 / 2003
  • In this paper, assuming that the score of a speech utterance is the product of the HMM log likelihood and the HMM weight, we propose a new method in which HMM weights are adapted iteratively, as in general MCE training. The proposed method adjusts the HMM weights for better performance using a delta coefficient defined in terms of the misclassification measure. The parameter estimation and Viterbi algorithms of the conventional HMM can therefore be applied easily to the proposed model by constraining the sum of the HMM weights to the number of HMMs in an HMM set. Compared with the general segmental MCE training approach, computing time decreases because fewer parameters are estimated and gradient calculation through the optimal state sequence is avoided. To evaluate the performance of an HMM-based speech recognizer that weights the HMM likelihood, we perform Korean isolated digit recognition experiments. The experimental results show better performance than the MCE algorithm with state weighting.

Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.26 no.6 / pp.778-789 / 2021
  • Recently, various lightweighting methods for using Convolutional Neural Network (CNN) models on mobile devices have emerged. Weight quantization, which lowers the bit precision of weights, is a lightweighting method that lets a model run with integer arithmetic in a mobile environment where GPU acceleration is unavailable. It has already been used in various models to reduce computational complexity and model size with a small loss of accuracy. Considering memory size and computing speed as well as the storage size of the device and the limited network environment, this paper proposes a method of compressing the integer weights after quantization using a video codec. To verify the performance of the proposed method, experiments were conducted on VGG16, Resnet50, and Resnet18 models trained on the ImageNet and Places365 datasets. As a result, an accuracy loss of less than 2% and high compression efficiency were achieved across the models. In addition, a comparison with similar compression methods verified that the compression efficiency was more than doubled.

Implementation of Tactical Path-finding Integrated with Weight Learning (가중치 학습과 결합된 전술적 경로 찾기의 구현)

  • Yu, Kyeon-Ah
    • Journal of the Korea Society for Simulation / v.19 no.2 / pp.91-98 / 2010
  • Conventional path-finding has focused on finding short collision-free paths. However, as computer games become more sophisticated, path-finding must take into account tactical information such as ambush points or enemy lines of sight. One way to let this information affect path-finding is to represent the heuristic function of a search algorithm as a weighted sum of tactics. In this paper we consider the problem of learning a heuristic to optimize path-finding based on given tactical information. Learning here means producing a good weight vector for the heuristic function. Training examples are given by a game level-designer and are compared with the search results at every search level to update the weights. This paper proposes a learning algorithm integrated with search for tactical path-finding. A perceptron-like method for updating the weights is described, and a simulation tool that implements these ideas is presented. In the heuristic learning tool, a level-designer can mark desired paths according to characters' properties; the tool then uses them as training examples to learn the weights and shows how the paths change as weight learning proceeds.
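
A perceptron-like update for a weighted-sum cost of tactical features might look like the sketch below. The abstract does not give the paper's update rule, so the rule used here (a structured-perceptron step on path feature totals), the clamp to non-negative weights, and the names `path_cost` and `update_weights` are assumptions for illustration.

```python
def path_cost(weights, feats):
    """Weighted sum of a path's tactical feature totals."""
    return sum(w * f for w, f in zip(weights, feats))


def update_weights(weights, found_feats, desired_feats, lr=0.1):
    """One perceptron-like step (assumed rule, not taken from the paper).

    Raises the weight of features the found path has more of and lowers
    the weight of features the designer's path has more of, so that the
    desired path becomes relatively cheaper. Weights are clamped at zero.
    """
    return [
        max(0.0, w + lr * (f - d))
        for w, f, d in zip(weights, found_feats, desired_feats)
    ]
```

For example, with hypothetical features (distance, exposure to enemy sight), a search that returns a short but exposed path while the designer marked a longer hidden one would shift weight from distance toward exposure.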

Automated Areal Feature Matching in Different Spatial Data-sets (이종의 공간 데이터 셋의 면 객체 자동 매칭 방법)

  • Kim, Ji Young;Lee, Jae Bin
    • Journal of Korean Society for Geospatial Information Science / v.24 no.1 / pp.89-98 / 2016
  • In this paper, we propose an automated areal feature matching method based on geometric similarity that requires no user intervention and handles areal features in many-to-many relations, for the conflation of spatial data-sets of different scales and updating cycles. First, areal features (nodes) whose inclusion function value exceeds 0.4 are connected as edges in an adjacency matrix, and candidate corresponding areal features, including many-to-many relations, are identified by multiplication of the adjacency matrix. For geometric matching, these multiple candidate corresponding areal features are transformed into an aggregated polygon, a convex hull generated by a curve-fitting algorithm. Second, we define matching criteria to measure geometric quality, and a similarity function converts these criteria into normalized values, the similarities. Next, shape similarity is defined as a weighted linear combination of these similarities, with weights calculated by the Criteria Importance Through Intercriteria Correlation (CRITIC) method. Finally, on training data, we identify the Equal Error Rate (EER), the trade-off value in a plot of precision versus recall over all threshold values (PR curve), as the threshold and decide whether candidate pairs are corresponding pairs. Applying the proposed method to a digital topographic map and a base map of the address system (KAIS), we confirmed in visual evaluation that some many-to-many areal features were mis-detected, while in statistical evaluation precision, recall, and F-measure were high at 0.951, 0.906, and 0.928, respectively. This means that the accuracy of automated matching between different spatial data-sets by the proposed method is high. However, further research is needed on the inclusion function and on detailed matching criteria to quantify many-to-many areal features exactly.
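
The CRITIC weighting and the weighted linear combination mentioned in this abstract can be sketched as follows. This uses the commonly cited CRITIC formulation (standard deviation times the summed conflict 1 - r with the other criteria) on already-normalized similarity values; the input layout and the names `critic_weights` and `shape_similarity` are assumptions, and the paper's actual criteria are not reproduced here.

```python
from statistics import mean, stdev


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def critic_weights(rows):
    """CRITIC weights from rows of normalized criterion values.

    rows: one row per candidate pair, one column per criterion.
    Columns must not be constant, or the correlation is undefined.
    """
    cols = list(zip(*rows))
    info = []
    for cj in cols:
        # Conflict of criterion j with every criterion (self term is ~0).
        conflict = sum(1.0 - pearson(cj, ck) for ck in cols)
        info.append(stdev(cj) * conflict)
    total = sum(info)
    return [c / total for c in info]


def shape_similarity(similarities, weights):
    """Weighted linear combination of per-criterion similarities."""
    return sum(s * w for s, w in zip(similarities, weights))
```

A candidate pair would then be accepted when its `shape_similarity` exceeds a threshold chosen on training data, such as the EER-based threshold the abstract describes.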

Gathering Common-word and Document Reclassification to improve Accuracy of Document Clustering (문서 군집화의 정확률 향상을 위한 범용어 수집과 문서 재분류 알고리즘)

  • Shin, Joon-Choul;Ock, Cheol-Young;Lee, Eung-Bong
    • The KIPS Transactions:PartB / v.19B no.1 / pp.53-62 / 2012
  • Clustering technology is used to deal efficiently with the many documents retrieved in an information retrieval system. However, clustering accuracy meets the requirements of only some domains. This paper proposes two methods to increase clustering accuracy. We define a common-word as a word that is frequently used but has low weight during clustering, and propose a method that automatically gathers common-words and calculates their weights from the retrieved documents. In the experiments, the clustering error rate using common-words is reduced to 34% compared with clustering using stop-words. After generating initial clusters from the retrieved documents using average-link clustering, we propose an algorithm that re-evaluates the similarity between each document and the clusters and reclassifies the document into the more similar cluster. In experiments using the Naver JiSikIn categories, the accuracy of the reclassified clusters increases by 1.81% compared with the initial clusters without reclassification.

An Extended Faceted Classification Scheme and Hybrid Retrieval Model to Support Software Reuse (소프트웨어 재사용을 지원하는 확장된 패싯 분류 방식과 혼합형 검색 모델)

  • Gang, Mun-Seol;Kim, Byeong-Gi
    • The Transactions of the Korea Information Processing Society / v.1 no.1 / pp.23-37 / 1994
  • In this paper, we propose the Extended Faceted Classification Scheme and the Hybrid Retrieval Method, which support classifying software components, storing them in a library, and retrieving them efficiently according to the user's request, and we design and implement a prototype system. To design the classification scheme, we identify the necessary items by analyzing the basic classes of the software components to be classified. We then classify the items by their characteristics, decide the facets, and compose the component descriptors. We store software components in the library by clustering them by application domain according to their basic characteristics, and assign weights to the facets and their items to describe the component characteristics. To retrieve software components, we use the retrieval-by-query model together with the weights and similarity, so that similar components are easy to retrieve. Applying the proposed classification scheme and retrieval model makes it easy to identify similar components and simplifies the classification process. In addition, query construction becomes simple, the number and order of retrieved components can be controlled, and retrieval effectiveness improves.


An E-Mail Question Answering System using Question Generation Model (질의생성 모델을 이용한 전자우편 질의응답 시스템)

  • Zhang, Jeong-Sun;Kim, Sang-Bum;Seo, Hee-Chul;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology / 2002.10e / pp.176-183 / 2002
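  • We propose an e-mail question answering system for long natural-language queries that follow a fixed form, such as e-mail. The system assigns weights to the words of the user's query and answers the query by retrieving similar questions from an existing question-answer set and providing their answers to the user. Given a long natural-language query, the proposed question generation model uses the query's category and sentence importance information to compute the probability that each word in the query is used as a topic word, and assigns importance based on that probability. The similarity between the user's query and previously asked e-mail questions is computed from lexical similarity that considers word frequency, semantic similarity based on a Korean thesaurus, and topic similarity based on the question generation model proposed in this paper. Experiments were run on a question-answer set in real-world use; they compare the contribution of each similarity measure and evaluate the effect of the proposed question generation model on performance.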
