• Title/Summary/Keyword: Language Models

Feature Analysis for Detecting Mobile Application Review Generated by AI-Based Language Model

  • Lee, Seung-Cheol;Jang, Yonghun;Park, Chang-Hyeon;Seo, Yeong-Seok
    • Journal of Information Processing Systems / v.18 no.5 / pp.650-664 / 2022
  • Mobile applications can be easily downloaded and installed via application markets. However, malware and malicious applications containing unwanted advertisements exist in these markets. Therefore, smartphone users refer to application reviews before installing an application to avoid such malicious applications. An application review typically contains an evaluation of the application; however, false reviews written for a specific purpose can be included. Such false reviews are known as fake reviews, and they can be generated using artificial intelligence (AI)-based text-generating models. Recently, AI-based text-generating models have developed rapidly and generate high-quality texts. Herein, we analyze the features of fake reviews generated by Generative Pre-Training-2 (GPT-2), an AI-based text-generating model, and create a model to detect those fake reviews. First, we collect real human-written application reviews from Kaggle. Subsequently, we identify features of the fake reviews using natural language processing and statistical analysis. Next, we build fake review detection models using five types of machine-learning models trained on the identified features. The fake review detection models achieved average F1-scores of 0.738, 0.723, and 0.730 for the fake review, real review, and overall classifications, respectively.
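
The per-class and overall F1-scores reported in this abstract can be illustrated with a minimal Python sketch. The labels below are toy data, not the paper's results, and treating the "overall" score as the unweighted (macro) average of the two class scores is an assumption; the abstract does not state how it was computed.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """F1-score when `positive` is treated as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions: precision and recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, for illustration only).
y_true = ["fake", "fake", "real", "real", "fake", "real"]
y_pred = ["fake", "real", "real", "fake", "fake", "real"]

f1_fake = f1_for_class(y_true, y_pred, "fake")
f1_real = f1_for_class(y_true, y_pred, "real")
macro_f1 = (f1_fake + f1_real) / 2  # assumed definition of the "overall" score
print(f1_fake, f1_real, macro_f1)
```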

Unified Modeling Language based Analysis of Security Attacks in Wireless Sensor Networks: A Survey

  • Hong, Sung-Hyuck;Lim, Sun-Ho;Song, Jae-Ki
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.4 / pp.805-821 / 2011
  • Wireless Sensor Networks (WSNs) are rapidly emerging because of their potential applications in military and civilian environments. Due to unattended and hostile deployment environments, shared wireless links, and inherent resource constraints, providing high-level security services in WSNs is challenging. In this paper, we revisit various security attack models and analyze them using a well-known standard notation, Unified Modeling Language (UML). We provide a set of UML collaboration diagrams and sequence diagrams of attack models observed in different network layers: physical, data/link, network, and transport. The proposed UML-based analysis not only facilitates understanding of attack strategies, but also provides deep insight into designing and developing countermeasures in WSNs.

A Process Programming Language and Its Runtime Support System for the SEED Process-centered Software Engineering Environment (SEED 프로세스 중심 소프트웨어 개발 환경을 위한 프로세스 프로그래밍 언어 및 수행지원 시스템)

  • Kim, Yeong-Gon;Choe, Hyeok-Jae;Lee, Myeong-Jun;Im, Chae-Deok;Han, U-Yong
    • Journal of KIISE:Computing Practices and Letters / v.5 no.6 / pp.727-737 / 1999
  • Process-centered Software Engineering Environments (PSEEs) support software development activities through the enactment of process models, providing a variety of activities such as supplying information to software developers, automating routine tasks, invoking and controlling software development tools, and enforcing mandatory rules and practices. The SEED (Software Engineering Environment for Development) system is a PSEE developed by ETRI for effective software development and for controlling the enactment of process models. In this paper, we describe the implementation of the SimFlex process programming language, which is used to design process models in SEED, and its runtime support system, the SEED Engine. SimFlex is a process programming language that describes process models with simple language constructs, and it could be embedded into other PSEEs through appropriate customization. The SimFlex compiler analyzes process models described in SimFlex, checks the models for errors, and produces intermediate process models referenced by the SEED Engine. Using the intermediate process models, the SEED Engine provides automatic enactment of the process models described in SimFlex, as well as useful information for agents in conjunction with the external monitoring tool. With the help of the SimFlex language and its runtime support system, we can reduce the cost and time of modeling software processes and manage projects conveniently, producing high-quality software products.

Verification of educational goal of reading area in Korean SAT through natural language processing techniques (대학수학능력시험 독서 영역의 교육 목표를 위한 자연어처리 기법을 통한 검증)

  • Lee, Soomin;Kim, Gyeongmin;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.1 / pp.81-88 / 2022
  • The major educational goal of the reading section, which makes up an important portion of the Korean language area of the Korean SAT, is to evaluate whether a given text can be fully understood. Therefore, the questions in the exam must be solvable solely from the given text. In this paper, we developed a dataset based on the Korean SAT's reading section to evaluate whether a deep learning language model can classify a given question statement as true or false, which is a binary classification task in NLP. As a result, by applying language models solely to the passages in the dataset, most of the language models outperformed the human performance of 59.2% in F1 score; KoELECTRA scored 62.49% in our experiment. We also showed that the structural limits of language models can be eased by adjusting data preprocessing.

Constraint Description language and Automatic Code Generator for Single-Machine Job Sequencing Problems (단일기계 일정계획을 위한 제약조건 표현언어 및 코드 자동생성기)

  • Lee, You-K.;Baek, Seon-D.;Bae, Sung-M.;Jun, Chi-H.;Chang, Soo-Y.;Choi, In-J.
    • Journal of Korean Institute of Industrial Engineers / v.22 no.2 / pp.209-229 / 1996
  • Scheduling problems, which determine the sequence of jobs, are an important issue in many industries. This paper deals with a single-machine job sequencing problem that has complex constraints and an objective function. To solve the problem, an expressive constraint description language and an automatic code generator are developed for our scheduling system. The user only needs to describe the scheduling problem using the constraint description language, which can express both quantitative and qualitative constraints as well as an objective function in real-world semantics. A complete scheduling program based on a constraint satisfaction technique is then automatically generated by the code generator. The advantage of this approach is that models of scheduling problems are easily developed and maintained, because the models are formulated in a language that reflects real-world semantics.

Question Answering that leverage the inherent knowledge of large language models (거대 언어 모델의 내재된 지식을 활용한 질의 응답 방법)

  • Myoseop Sim;Kyungkoo Min;Minjun Park;Jooyoung Choi;Haemin Jung;Stanley Jungkyu Choi
    • Annual Conference on Human and Language Technology / 2023.10a / pp.31-35 / 2023
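  • Recently, methods that exploit the knowledge stored in the parameters of Large Language Models (LLMs) have been actively studied in the field of Question Answering (QA). In Open Domain QA (ODQA), a retriever-reader pipeline has traditionally been used, but LLMs have recently begun to take over the role of the retriever as well as that of the reader. In this paper, we propose a method that uses the inherent knowledge of LLMs for question answering. Before answering a question, the model generates a passage related to the question and then generates the answer based on that passage. This method outperforms existing prompting methods in Closed-Book QA, demonstrating that question answering ability can be improved by exploiting the knowledge inherent in large language models.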
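
The two-step procedure described in this abstract (first generate a related passage, then answer from it) can be sketched as follows. This is only an illustration under stated assumptions: the prompt wording, the function name `generate_then_answer`, and the stand-in `toy_llm` are hypothetical and are not taken from the paper; any callable that maps a prompt string to a completion string could be substituted.

```python
from typing import Callable


def generate_then_answer(question: str, llm: Callable[[str], str]) -> str:
    """Closed-book QA in two calls: generate a passage, then answer from it."""
    # Step 1: ask the model to write down what it knows about the question.
    passage_prompt = (
        "Write a short passage containing facts relevant to the question.\n"
        f"Question: {question}\n"
        "Passage:"
    )
    passage = llm(passage_prompt).strip()

    # Step 2: answer the question conditioned on the generated passage.
    answer_prompt = f"Passage: {passage}\nQuestion: {question}\nAnswer:"
    return llm(answer_prompt).strip()


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real language model (canned outputs only)."""
    if prompt.rstrip().endswith("Passage:"):
        return "Seoul is the capital of South Korea."
    return "Seoul"


print(generate_then_answer("What is the capital of South Korea?", toy_llm))
```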

A Study on Keyword Spotting System Using Pseudo N-gram Language Model (의사 N-gram 언어모델을 이용한 핵심어 검출 시스템에 관한 연구)

  • 이여송;김주곤;정현열
    • The Journal of the Acoustical Society of Korea / v.23 no.3 / pp.242-247 / 2004
  • Conventional keyword spotting systems use a connected word recognition network consisting of keyword models and filler models. For this reason, such systems cannot effectively construct language models of word occurrence for detecting keywords in a large-vocabulary continuous speech recognition system with large text data. To solve this problem, we propose a keyword spotting system that uses a pseudo N-gram language model for detecting keywords, and we investigate the performance of the system as the occurrence frequencies of keywords and filler models change. When the unigram probabilities of the keywords and filler models were set to 0.2 and 0.8, respectively, the experimental results showed that CA (Correctly Accept for In-Vocabulary) and CR (Correctly Reject for Out-Of-Vocabulary) were 91.1% and 91.7%, respectively, which means that the proposed system improves average CA-CR performance by 14% over conventional methods in terms of ERR (Error Reduction Rate).

Hierarchical Hidden Markov Model for Finger Language Recognition (지화 인식을 위한 계층적 은닉 마코프 모델)

  • Kwon, Jae-Hong;Kim, Tae-Yong
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.9 / pp.77-85 / 2015
  • Finger language is the part of sign language that expresses vowels and consonants with hand gestures. Korean finger language has 31 gestures, and each of them needs many learned models for accurate recognition. With a large number of learned models, searching takes a long time, so a real-time recognition system must focus on reducing the search space. To solve this problem, this paper proposes a hierarchical HMM structure that effectively reduces the search space without decreasing the recognition rate. Korean finger language is divided into 3 categories according to the direction of the wrist, and a model is searched for within these categories. Pre-classification can distinguish similar Korean finger language gestures and allows the search space to be managed effectively. Therefore, the proposed method can be applied to real-time recognition systems. Experimental results demonstrate that the proposed method reduces recognition time by a factor of about three compared with a general HMM recognition method.
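
The search-space reduction described in this abstract can be sketched as a two-stage lookup: a cheap pre-classifier picks one of three categories, and only the models in that category are scored. Everything concrete below is an assumption for illustration: the category names, the angle thresholds, the gesture labels, and the toy distance-based scorers (which stand in for per-gesture HMM likelihoods) do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Scorer = Callable[[Sequence[float]], float]


def pre_classify(features: Sequence[float]) -> str:
    """Hypothetical pre-classifier: bucket by wrist angle (features[0], degrees)."""
    angle = features[0]
    if angle > 60:
        return "up"
    if angle < -60:
        return "down"
    return "side"


def make_scorer(center: float) -> Scorer:
    """Toy stand-in for an HMM likelihood: higher when features[1] is near center."""
    return lambda features: -abs(features[1] - center)


def hierarchical_recognize(
    features: Sequence[float],
    models_by_category: Dict[str, List[Tuple[str, Scorer]]],
) -> Tuple[str, int]:
    """Return the best label and the number of models actually evaluated."""
    candidates = models_by_category[pre_classify(features)]
    scored = [(label, scorer(features)) for label, scorer in candidates]
    best_label = max(scored, key=lambda item: item[1])[0]
    return best_label, len(candidates)


# Nine toy models split evenly over three categories.
models = {
    "up": [("A", make_scorer(1.0)), ("B", make_scorer(2.0)), ("C", make_scorer(3.0))],
    "side": [("D", make_scorer(1.0)), ("E", make_scorer(2.0)), ("F", make_scorer(3.0))],
    "down": [("G", make_scorer(1.0)), ("H", make_scorer(2.0)), ("I", make_scorer(3.0))],
}

label, evaluated = hierarchical_recognize([80.0, 2.2], models)
print(label, evaluated)
```

In this toy setup only three of the nine models are scored per input, which mirrors the idea of restricting the search to one category.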