• Title/Summary/Keyword: Vector Store


QA Pair Passage RAG-based LLM Korean chatbot service (QA Pair Passage RAG 기반 LLM 한국어 챗봇 서비스)

  • Joongmin Shin;Jaewook Lee;Kyungmin Kim;Taemin Lee;Sungmin Ahn;JeongBae Park;Heuiseok Lim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.683-689 / 2023
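  • Natural language processing has advanced considerably in recent years, and the emergence of very large language models in particular has had a major impact on the field. Models such as GPT perform well on a variety of NLP tasks and are treated as especially important in the chatbot domain. However, these models still have several limitations and problems, one of which is that the model generates unexpected results. Among the various methods for addressing this, Retrieval-Augmented Generation (RAG) has attracted attention. This paper proposes a way to improve the efficiency of a domain-specific question-answering system through integration with a knowledge base, and a way to correct and update chatbot answers by modifying the vector database. The main contributions of this paper are: 1) a new RAG system that uses QA Pair Passage RAG, with an analysis of its performance improvement; 2) a performance measurement of existing LLM and RAG systems and a presentation of their limitations; 3) a chatbot control methodology that uses RDBMS-based vector search and updates.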
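The third contribution above (controlling chatbot answers through vector search and updates in an RDBMS) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' implementation: the table layout, the helper names `add_pair`, `retrieve`, and `update_answer`, and the toy bag-of-words `embed` function, which stands in for a real sentence encoder.

```python
import json
import math
import sqlite3
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical schema: one row per QA pair, vector stored as JSON text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE qa (id INTEGER PRIMARY KEY, question TEXT, answer TEXT, vec TEXT)"
)


def add_pair(question, answer):
    conn.execute(
        "INSERT INTO qa (question, answer, vec) VALUES (?, ?, ?)",
        (question, answer, json.dumps(embed(question))),
    )


def retrieve(query):
    """Return (id, answer) of the stored pair whose question is most similar."""
    q = embed(query)
    rows = conn.execute("SELECT id, answer, vec FROM qa").fetchall()
    # Brute-force scan; a production system would use an index.
    best = max(rows, key=lambda r: cosine(q, Counter(json.loads(r[2]))))
    return best[0], best[1]


def update_answer(pair_id, new_answer):
    """Correct the chatbot's answer by editing the row, without retraining."""
    conn.execute("UPDATE qa SET answer = ? WHERE id = ?", (new_answer, pair_id))


add_pair("what are the opening hours", "We open at 9am.")
add_pair("how do I reset my password", "Use the reset link.")
pair_id, _ = retrieve("opening hours today")
update_answer(pair_id, "We open at 10am.")
```

Because the retrieved passage is a stored answer, editing that one row changes what the chatbot would be given as context on the next matching query.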


Development of An Automatic Classification System for Game Reviews Based on Word Embedding and Vector Similarity (단어 임베딩 및 벡터 유사도 기반 게임 리뷰 자동 분류 시스템 개발)

  • Yang, Yu-Jeong;Lee, Bo-Hyun;Kim, Jin-Sil;Lee, Ki Yong
    • The Journal of Society for e-Business Studies / v.24 no.2 / pp.1-14 / 2019
  • Because of the characteristics of game software, it is important to quickly identify users' needs and reflect them in the software after launch. However, most sites where users can download games and post reviews, such as the Google Play Store, provide only very limited and ambiguous classification categories for game reviews. Therefore, in this paper, we develop an automatic classification system that sorts game reviews into categories that are clearer and more useful for game providers. The system converts the words in reviews into vectors using word2vec, a representative word embedding model, and classifies each review into the most relevant categories by measuring the similarity between those vectors and each category. In particular, because the similarity measure directly affects classification performance, we compare three representative measures (Euclidean similarity, cosine similarity, and extended Jaccard similarity) in a real environment to choose the best one. Furthermore, to allow a review to be classified into multiple categories, we use a threshold-based multi-category classification method. Through experiments on real reviews collected from the Google Play Store, we confirm that the system achieves up to 95% accuracy.
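The threshold-based multi-category step described above can be sketched as follows. This is only an illustration under stated assumptions: the 3-dimensional word vectors and category vectors are made up (a real system would load word2vec embeddings), averaging word vectors into a review vector is an assumed choice, and the threshold value 0.5 is arbitrary.

```python
import math

# Hypothetical word vectors; placeholders for word2vec embeddings.
WORD_VECTORS = {
    "crash": [0.9, 0.1, 0.0],
    "bug": [0.8, 0.2, 0.0],
    "price": [0.0, 0.9, 0.1],
    "expensive": [0.1, 0.8, 0.1],
    "fun": [0.0, 0.1, 0.9],
}

# Hypothetical category vectors in the same space.
CATEGORY_VECTORS = {
    "stability": [1.0, 0.0, 0.0],
    "payment": [0.0, 1.0, 0.0],
    "gameplay": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def review_vector(review):
    """Average the vectors of known words (assumed aggregation)."""
    known = [WORD_VECTORS[w] for w in review.lower().split() if w in WORD_VECTORS]
    if not known:
        return None
    return [sum(col) / len(known) for col in zip(*known)]


def classify(review, threshold=0.5):
    """Return every category whose similarity reaches the threshold."""
    vec = review_vector(review)
    if vec is None:
        return []
    return sorted(
        name for name, cvec in CATEGORY_VECTORS.items()
        if cosine(vec, cvec) >= threshold
    )
```

With these toy vectors, a review that mentions both a crash and a high price clears the threshold for two categories at once, which is the point of using a threshold instead of picking only the single best match.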

Effective Picture Search in Lifelog Management Systems using Bluetooth Devices (라이프로그 관리 시스템에서 블루투스 장치를 이용한 효과적인 사진 검색 방법)

  • Chung, Eun-Ho;Lee, Ki-Yong;Kim, Myoung-Ho
    • Journal of KIISE:Computing Practices and Letters / v.16 no.4 / pp.383-391 / 2010
  • A lifelog management system provides users with services to store, manage, and search their lifelogs. This paper proposes a fully automatic method for collecting real-world social contacts and a lifelog search engine that uses the collected social contact information as keywords. Short-range wireless network devices in mobile phones are used to detect the social contacts of their users. A human-Bluetooth relationship matrix is built based on how frequently a person and a Bluetooth device are observed at the same time. Results show that when only 20% of the full social contact information over the observation period is used for the calculation, 90% of the human-Bluetooth relationships can still be correctly acquired. We also suggest a lifelog search engine that takes human names as keywords and, using the vector information retrieval model, compares two vectors: a row of the human-Bluetooth matrix and a vector of the Bluetooth devices scanned when a lifelog was created. This search engine returns more lifelogs than an existing text-matching search engine and, unlike it, ranks the results.

Low-Complexity Lattice Reduction Aided MIMO Detectors Using Look-Up Table (Look-Up Table 기반의 복잡도가 낮은 Lattice Reduction MIMO 검출기)

  • Lee, Chung-Won;Lee, Ho-Kyoung;Heo, Seo-Weon
    • Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.5 / pp.88-94 / 2009
  • We propose a scheme that reduces the computational complexity of the lattice reduction (LR) aided detector in MIMO systems. The ML detection algorithm performs well, but its computational complexity grows exponentially with the number of antenna elements and constellation points. The LR aided detector achieves the same diversity as the ML scheme with relatively less complexity. However, the LR scheme still requires many computations, since it involves several iterations of size reduction and column vector exchange. We observe that the LR process depends only on the channel matrix and not on the received signal, so the LR process can be applied offline and its results stored in a Look-Up Table (LUT). In this paper, we propose an algorithm for generating a LUT that requires less memory, and we evaluate the performance and complexity of the proposed system. We show that the proposed system requires less computation than the conventional LR aided detector while achieving similar detection performance.

A novel approach to the form-finding of membrane structures using dynamic relaxation method

  • Labbafi, S. Fatemeh;Sarafrazi, S. Reza;Gholami, Hossein;Kang, Thomas H.K.
    • Advances in Computational Design / v.2 no.3 / pp.123-141 / 2017
  • Analyzing any kind of structure requires solving a system of linear or non-linear equations. There are many ways to solve a system of equations, and they can be classified as implicit and explicit techniques. The explicit methods eliminate round-off errors and use less memory. The dynamic relaxation (DR) method is one of the powerful and simple explicit processes. Importantly, DR does not need to store the global stiffness matrix, because it uses only the residual load vector. In this paper, a new approach to the DR method is presented. In this approach, the damping, mass, and time steps are similar to those of the traditional dynamic relaxation method; the difference lies in how the damping is calculated. The time step is constant, and the damping is zero except in steps with maximum energy, where concentrated damping is applied to minimize the energy of the system. Under this condition, damping does not have to be calculated in every step, which reduces the volume of computation. The DR method is employed here for the form-finding of membrane structures. The form-finding of three plans of membrane structures under different loading is considered to investigate the efficiency of the proposed method. The numerical results show that the convergence rate of the proposed method is higher in all cases than that of other methods.
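The idea of applying damping only at energy peaks can be illustrated with the classic kinetic-damping variant of dynamic relaxation. This sketch is not the paper's algorithm: it assumes unit masses, a fixed time step, a small linear system K x = f, and it keeps a dense `K` purely for convenience (in practice only the residual is evaluated). The function name and parameters are hypothetical.

```python
def dynamic_relaxation(K, f, dt=0.1, tol=1e-9, max_steps=1_000_000):
    """Solve K x = f by explicit pseudo-dynamic iteration with kinetic damping.

    Assumes K is symmetric positive definite and dt is small enough for the
    explicit integration to be stable (dt < 2 / sqrt(largest eigenvalue)).
    """
    n = len(f)
    x = [0.0] * n          # displacements
    v = [0.0] * n          # pseudo-velocities
    prev_ke = 0.0          # kinetic energy at the previous step
    for _ in range(max_steps):
        # Residual load vector r = f - K x drives the motion.
        r = [f[i] - sum(K[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(ri) for ri in r) < tol:
            break
        # Undamped velocity update with unit masses.
        v = [v[i] + dt * r[i] for i in range(n)]
        ke = 0.5 * sum(vi * vi for vi in v)
        if ke < prev_ke:
            # Kinetic energy has just passed a peak: remove all of it at once
            # and restart from rest at the current position.
            v = [0.0] * n
            prev_ke = 0.0
        else:
            x = [x[i] + dt * v[i] for i in range(n)]
            prev_ke = ke
    return x


# Two-degree-of-freedom spring system used only as a toy example.
solution = dynamic_relaxation([[2.0, -1.0], [-1.0, 2.0]], [1.0, 0.0])
```

Between peaks the motion is undamped, so no damping coefficient has to be computed at ordinary steps, which mirrors the computational saving the abstract describes.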

Efficient Capturing of Spatial Data in Geographic Database System (지리 데이타베이스 시스템에서의 효율적인 공간 데이타 수집)

  • Kim, Jong-Hun;Kim, Jae-Hong;Bae, Hae-Yeong
    • The Transactions of the Korea Information Processing Society / v.1 no.3 / pp.279-289 / 1994
  • A Geographic Database System is a database system that supports map-formed output and allows users to store, retrieve, manage, and analyze spatial and aspatial data. Because of the large amount of data, entering spatial data into a Geographic Database System takes too much time and too much storage. An efficient spatial data collecting system is therefore needed to reduce the input processing time and to use storage efficiently. In this paper, we analyze conventional vectorizing methods and suggest a different approach. Instead of vectorizing all the map data, our approach vectorizes specific geographic data when users input its aspatial data. We also propose an optimized vector data format that uses a tag bit to store the collected data efficiently.


Random Partial Haar Wavelet Transformation for Single Instruction Multiple Threads (단일 명령 다중 스레드 병렬 플랫폼을 위한 무작위 부분적 Haar 웨이블릿 변환)

  • Park, Taejung
    • Journal of Digital Contents Society / v.16 no.5 / pp.805-813 / 2015
  • Many researchers expect that compressive sensing and sparse recovery can overcome the limitations of conventional digital techniques. However, these new approaches require solving l1 norm optimization problems for signal reconstruction. In the reconstruction process, the transform computation, which multiplies a random matrix by a vector, consumes considerable computing power. To address this issue, parallel processing is applied to the optimization problems. In particular, because the original signal is so large, it is hard to store the random matrix directly in memory, so a procedural approach to handling the random matrix is needed. This paper presents a new parallel algorithm for calculating the random partial Haar wavelet transform on Single Instruction Multiple Threads (SIMT) platforms.

A Fast Method for Face Detection Based on PCA and SVM (PCA와 SVM에 기반하는 빠른 얼굴탐지 방법)

  • Xia, Chun-Lei;Shin, Hyeon-Gab;Park, Myeong-Chul;Ha, Seok-Wun
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.6 / pp.1129-1135 / 2007
  • Human face detection plays an important role in computer vision. It has many applications, such as face recognition, video surveillance, human-computer interfaces, face image database management, and querying image databases. In this paper, building on previous studies of face detection, we propose a fast face detection approach that uses Principal Component Analysis (PCA) and Support Vector Machines (SVM). The proposed system first filters candidate face areas using a statistical feature generated by analyzing the local histogram distribution; this step speeds up detection by eliminating most of the non-face area. In the next step, PCA feature vectors are generated, and an SVM classifier determines whether faces are present in the test image. Finally, the detection results are stored and output on the test image. The test images in this paper are from the CMU face database, and the face and non-face samples are selected from the MIT data set. The experimental results indicate that the proposed method performs well for face detection.

Simulation of Evacuation Route Scenarios Through Multicriteria Analysis for Rescue Activities

  • Castillo Osorio, Ever Enrique;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.5 / pp.303-313 / 2019
  • After a disaster happens in an urban area, many people need support for a quick evacuation. This work aims to develop a method for calculating the most feasible evacuation route inside buildings. In the methodology, we simplify the geometry of the structural and non-structural elements from the BIM (Building Information Modeling) and store them in a spatial database that follows standards for vector data. We then apply multicriteria analysis, allocating prioritization values and weight factors validated through the AHP (Analytic Hierarchy Process), to obtain the Importance Index S(n) of the elements. The criteria consider security conditions and the distribution of the building's facilities. S(n) is included as additional heuristic data for calculating the evacuation route through an algorithm developed as a variant of $A^*$ pathfinding. The experimental results, from simulated evacuation scenarios for vulnerable people in healthy physical condition and for the elderly group, show that the conditions applied in the multicriteria analysis (width of routes, restricted areas, vulnerable elements, floor roughness, and location of facilities in the building) have a high influence on the processing of the developed $A^*$ variant. The criteria modify the evacuation route: for people in healthy physical condition, they treat the safest route, rather than the shortest, as the most feasible. Likewise, for the elderly group, they treat the route that passes facilities supporting the movement of the elderly as the most feasible. These results are important for decision makers who must choose between the shortest and the safest route as the feasible one for search and rescue activities.
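How an importance index can steer an $A^*$ search toward a safer route instead of the shortest one can be sketched on a small grid. The abstract does not say exactly how S(n) enters the algorithm, so this sketch assumes it acts as a non-negative penalty added to the cost of stepping onto a cell; the grid, the `penalty` dictionary, and the function name are all hypothetical.

```python
import heapq


def a_star(grid, penalty, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 0 means the cell is walkable.

    Step cost is 1 plus an assumed per-cell risk penalty (a stand-in for S(n)).
    The Manhattan distance stays admissible because every step costs >= 1.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0.0, start)]
    best = {start: 0.0}       # cheapest known cost to reach each cell
    parent = {start: None}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], g
        if g > best[cell]:
            continue          # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1.0 + penalty.get(nxt, 0.0)
                if ng < best.get(nxt, float("inf")):
                    best[nxt] = ng
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None, float("inf")


free = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
shortest, shortest_cost = a_star(free, {}, (0, 0), (0, 2))
safest, safest_cost = a_star(free, {(0, 1): 5.0}, (0, 0), (0, 2))
```

In this toy case the penalized cell lies on the direct route, so the search with the penalty returns a longer detour with a lower total cost, which is the shortest-versus-safest trade-off the abstract discusses.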

Real-time Color Recognition Based on Graphic Hardware Acceleration (그래픽 하드웨어 가속을 이용한 실시간 색상 인식)

  • Kim, Ku-Jin;Yoon, Ji-Young;Choi, Yoo-Joo
    • Journal of KIISE:Computing Practices and Letters / v.14 no.1 / pp.1-12 / 2008
  • In this paper, we present a real-time algorithm for recognizing vehicle color from indoor and outdoor vehicle images based on GPU (Graphics Processing Unit) acceleration. In the preprocessing step, we construct feature vectors from sample vehicle images with different colors. We then combine the feature vectors for each color and store them as a reference texture to be used in the GPU. Given an input vehicle image, the CPU constructs its feature vector, and the GPU compares it with the sample feature vectors in the reference texture. The similarities between the input feature vector and the sample feature vectors for each color are measured, and the result is transferred to the CPU to recognize the vehicle color. The output colors are categorized into seven colors: three achromatic colors (black, silver, and white) and four chromatic colors (red, yellow, blue, and green). We construct feature vectors from histograms that consist of hue-saturation pairs and hue-intensity pairs, with a weight factor applied to the saturation values. Our algorithm achieves a 94.67% color recognition rate by using a large number of sample images captured in various environments, by generating feature vectors that distinguish different colors, and by utilizing an appropriate likelihood function. We also accelerate color recognition by utilizing the parallel computation functionality of the GPU. In the experiments, we constructed a reference texture from 7,168 sample images, with 1,024 images used for each color. The average time for generating a feature vector is 0.509ms for a $150{\times}113$ resolution image. After the feature vector is constructed, the execution time for GPU-based color recognition is 2.316ms on average, which is 5.47 times faster than when the algorithm is executed on the CPU. Our experiments were limited to vehicle images, but the algorithm can be extended to input images of general objects.