Search | Korea Science

Feature Analysis for Detecting Mobile Application Review Generated by AI-Based Language Model

Lee, Seung-Cheol;Jang, Yonghun;Park, Chang-Hyeon;Seo, Yeong-Seok
- Journal of Information Processing Systems
- /
- v.18 no.5
- /
- pp.650-664
- /
- 2022
Mobile applications can be easily downloaded and installed via markets. However, malware and malicious applications containing unwanted advertisements exist in these application markets. Therefore, smartphone users install applications with reference to the application review to avoid such malicious applications. An application review typically comprises contents for evaluation; however, a false review with a specific purpose can be included. Such false reviews are known as fake reviews, and they can be generated using artificial intelligence (AI)-based text-generating models. Recently, AI-based text-generating models have been developed rapidly and demonstrate high-quality generated texts. Herein, we analyze the features of fake reviews generated from Generative Pre-Training-2 (GPT-2), an AI-based text-generating model and create a model to detect those fake reviews. First, we collect a real human-written application review from Kaggle. Subsequently, we identify features of the fake review using natural language processing and statistical analysis. Next, we generate fake review detection models using five types of machine-learning models trained using identified features. In terms of the performances of the fake review detection models, we achieved average F1-scores of 0.738, 0.723, and 0.730 for the fake review, real review, and overall classifications, respectively.
https://doi.org/10.3745/JIPS.02.0182 인용 PDF KSCI

An Efficient Block Index Scheme with Segmentation for Spatio-Textual Similarity Join

Xiang, Yiming;Zhuang, Yi;Jiang, Nan
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.11 no.7
- /
- pp.3578-3593
- /
- 2017
Given two collections of objects that carry both spatial and textual information in the form of tags, a $\text\underline{S}patio$-$\text\underline{T}extual$-based object $\text\underline{S}imilarity$ $\text\underline{JOIN}$ (ST-SJOIN) retrieves the pairs of objects that are textually similar and spatially close. In this paper, we have proposed a block index-based approach called BIST-JOIN to facilitate the efficient ST-SJOIN processing. In this approach, a dual-feature distance plane (DFDP) is first partitioned into some blocks based on four segmentation schemes, and the ST-SJOIN is then transformed into searching the object pairs falling in some affected blocks in the DFDP. Extensive experiments on real and synthetic datasets demonstrate that our proposed join method outperforms the state-of-the-art solutions.
https://doi.org/10.3837/tiis.2017.07.015 인용 PDF KSCI

Recent Trends in Deep Learning-Based Optical Character Recognition (딥러닝 기반 광학 문자 인식 기술 동향)

Min, G.;Lee, A.;Kim, K.S.;Kim, J.E.;Kang, H.S.;Lee, G.H.
- Electronics and Telecommunications Trends
- /
- v.37 no.5
- /
- pp.22-32
- /
- 2022
Optical character recognition is a primary technology required in different fields, including digitizing archival documents, industrial automation, automatic driving, video analytics, medicine, and financial institution, among others. It was created in 1928 using pattern matching, but with the advent of artificial intelligence, it has since evolved into a high-performance character recognition technology. Recently, methods for detecting curved text and characters existing in a complicated background are being studied. Additionally, deep learning models are being developed in a way to recognize texts in various orientations and resolutions, perspective distortion, illumination reflection and partially occluded text, complex font characters, and special characters and artistic text among others. This report reviews the recent deep learning-based text detection and recognition methods and their various applications.
https://doi.org/10.22648/ETRI.2022.J.370503 인용 PDF

Alzheimer's disease recognition from spontaneous speech using large language models

Jeong-Uk Bang;Seung-Hoon Han;Byung-Ok Kang
- ETRI Journal
- /
- v.46 no.1
- /
- pp.96-105
- /
- 2024
We propose a method to automatically predict Alzheimer's disease from speech data using the ChatGPT large language model. Alzheimer's disease patients often exhibit distinctive characteristics when describing images, such as difficulties in recalling words, grammar errors, repetitive language, and incoherent narratives. For prediction, we initially employ a speech recognition system to transcribe participants' speech into text. We then gather opinions by inputting the transcribed text into ChatGPT as well as a prompt designed to solicit fluency evaluations. Subsequently, we extract embeddings from the speech, text, and opinions by the pretrained models. Finally, we use a classifier consisting of transformer blocks and linear layers to identify participants with this type of dementia. Experiments are conducted using the extensively used ADReSSo dataset. The results yield a maximum accuracy of 87.3% when speech, text, and opinions are used in conjunction. This finding suggests the potential of leveraging evaluation feedback from language models to address challenges in Alzheimer's disease recognition.
https://doi.org/10.4218/etrij.2023-0356 인용 PDF

Hardware Implementation for MLP Based Text Detection (MLP 기반의 문자 추출을 위한 하드웨어 구현)

Kyoung, Dong-Wuk;Jung, Kee-Chul
- 한국HCI학회:학술대회논문집
- /
- 2006.02a
- /
- pp.766-771
- /
- 2006
현재 많은 신경망의 하드웨어 구현은 부동 소수점 연산에 비해서 적은 면적과 빠른 수행시간을 가지는 고정소수점 연산을 많이 사용하지만, 소프트웨어에서는 일반적으로 높은 정확도를 가지는 부동소수점 연산을 사용한다. 신경망의 하드웨어 구현에서 많이 사용하는 고정소수점 연산은 부동소수점 연산에 비해서 빠른 처리속도와 적은 면적으로써 쉽게 하드웨어 구현에 용이하지만, 부동소수점 연산에 비해서 낮은 정확도와 기존의 부동소수점 연산을 사용하는 소프트웨어 신경망을 쉽게 적용할 수 없는 단점을 가진다. 본 논문에서는 부동소수점 연산을 사용하여 문자 추출 MLP의 데이터 변환 없이 적용할 수 있는 전체 파이프라이닝 설계 구조를 제안한다. 제안된 설계방법은 신경망의 전체 구조를 입력층과 은닉층을 링크 병렬화 방법과 은닉층과 출력층을 뉴런 병렬화 방법을 개선하여 쉽게 파이프라이닝 구조로 설계함으로써 신경망 처리는 은닉층 뉴런수와 동일한 주기로 처리되며, 기존의 문자추출 소프트웨어 신경망을 제안된 하드웨어 설계방법으로 구현하였을 때 11배의 빠른 성능을 나타낸다.
PDF

Rapid and Brief Communication GPU implementation of neural networks

Oh, Kyoung-Su;Jung, Kee-Chul
- 한국HCI학회:학술대회논문집
- /
- 2007.02c
- /
- pp.322-325
- /
- 2007
Graphics processing unit (GPU) is used for a faster artificial neural network. It is used to implement the matrix multiplication of a neural network to enhance the time performance of a text detection system. Preliminary results produced a 20-fold performance enhancement using an ATI RADEON 9700 PRO board. The parallelism of a GPU is fully utilized by accumulating a lot of input feature vectors and weight vectors, then converting the many inner-product operations into one matrix operation. Further research areas include benchmarking the performance with various hardware and GPU-aware learning algorithms. (c) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.patcog.2004.01.013 인용 PDF

A Study on the Acquisition of Identification Information from Warship Image with Deep Learning (딥러닝을 적용한 영상기반 군함 식별정보 획득에 관한 연구)

Kang, Jiyoung;Kim, Wooju
- Journal of the Korea Institute of Military Science and Technology
- /
- v.25 no.1
- /
- pp.55-64
- /
- 2022
Identifying warships contacted at sea is important to prepare for threats. It is necessary to obtain a basis to identify warships. In this study, we propose a 2-step model that acquires the warship's type and hullnumber with identification information from the warship images. The model classifies the warship's type and detects its hullnumber area by applying object detection, then recognizes hullnumber through text recognition algorithms. Proposed model achieved high performance by using state-of-the-art deep learning algorithms.
https://doi.org/10.9766/KIMST.2022.25.1.055 인용 PDF KSCI

An Efficient Text Detection Model using Bidirectional Feature Fusion (양방향 특징 결합을 이용한 효율적 문자 탐지 모델)

Lim, Seong-Taek;Choi, Hoeryeon;Lee, Hong-Chul
- Proceedings of the Korean Society of Computer Information Conference
- /
- 2021.07a
- /
- pp.67-68
- /
- 2021
기존 객체탐지는 경계 상자 회귀방식을 적용하였지만, 문자는 왜곡과 변형이 심한 특성을 가진 객체로 U-net 구조의 이미지 분할 방식을 사용하는 경우가 많다. 따라서 최근 문자 탐지는 통계적 모델에 비해 높은 정확도를 보이는 심층 신경망 기반의 모델 연구가 많이 진행되고 있다. 본 연구에서는 이미지 분할을 통한 양방향 특징 결합 기법을 사용한 문자 탐지 모델을 제안한다. 이미지 분할 방식은 메모리의 효율이 떨어지기 때문에 이를 극복하고자 특징 추출 단계에서 경량화된 네트워크를 적용하였다. 또한, 객체 탐지에서 큰 성과를 보인 양방향 특징 결합 모듈을 U-net 구조에 추가하여 추출된 특징이 효과적으로 결합 되는 결과를 얻었다. 제안하는 모델의 문자 탐지 성능은 합성 문자 데이터셋을 이용한 실험을 통해 기존의 U-net 구조의 이미지 분할 방식보다 향상되었음을 확인하였다.
PDF

A Novel Character Segmentation Method for Text Images Captured by Cameras

Lue, Hsin-Te;Wen, Ming-Gang;Cheng, Hsu-Yung;Fan, Kuo-Chin;Lin, Chih-Wei;Yu, Chih-Chang
- ETRI Journal
- /
- v.32 no.5
- /
- pp.729-739
- /
- 2010
Due to the rapid development of mobile devices equipped with cameras, instant translation of any text seen in any context is possible. Mobile devices can serve as a translation tool by recognizing the texts presented in the captured scenes. Images captured by cameras will embed more external or unwanted effects which need not to be considered in traditional optical character recognition (OCR). In this paper, we segment a text image captured by mobile devices into individual single characters to facilitate OCR kernel processing. Before proceeding with character segmentation, text detection and text line construction need to be performed in advance. A novel character segmentation method which integrates touched character filters is employed on text images captured by cameras. In addition, periphery features are extracted from the segmented images of touched characters and fed as inputs to support vector machines to calculate the confident values. In our experiment, the accuracy rate of the proposed character segmentation system is 94.90%, which demonstrates the effectiveness of the proposed method.
https://doi.org/10.4218/etrij.10.1510.0086 인용 PDF KSCI

Automatic Title Detection by Spatial Feature and Projection Profile for Document Images (공간 정보와 투영 프로파일을 이용한 문서 영상에서의 타이틀 영역 추출)

Park, Hyo-Jin;Kim, Bo-Ram;Kim, Wook-Hyun
- Journal of the Institute of Convergence Signal Processing
- /
- v.11 no.3
- /
- pp.209-214
- /
- 2010
This paper proposes an algorithm of segmentation and title detection for document image. The automated title detection method that we have developed is composed of two phases, segmentation and title area detection. In the first phase, we extract and segment the document image. To perform this operation, the binary map is segmented by combination of morphological operation and CCA(connected component algorithm). The first phase provides segmented regions that would be detected as title area for the second stage. Candidate title areas are detected using geometric information, then we can extract the title region that is performed by removing non-title regions. After classification step that removes non-text regions, projection is performed to detect a title region. From the fact that usually the largest font is used for the title in the document, horizontal projection is performed within text areas. In this paper, we proposed a method of segmentation and title detection for various forms of document images using geometric features and projection profile analysis. The proposed system is expected to have various applications, such as document title recognition, multimedia data searching, real-time image processing and so on.
PDF KSCI

Search Result 402, Processing Time 0.029 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)