• Title/Summary/Keyword: 정규표현식 (regular expression)

Korean Sentence Symbol Preprocess System for the Improvement of Speech Synthesis Quality (음성 합성 시스템의 품질 향상을 위한 한국어 문장 기호 전처리 시스템)

  • Lee, Ho-Joon
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.149-156 / 2015
  • In this paper, we propose a Korean sentence symbol preprocessor for an SSML (Speech Synthesis Markup Language) supported speech synthesis system in order to improve the quality of the synthesized result. After analyzing Korean Wikipedia documents, we propose 8 categories for the meaning of sentence symbols and 11 regular expressions for classifying each category. With the resulting Korean sentence symbol preprocessing system, we achieved 56% precision and 71.45% recall on 63,000 sentences.

A Study on Identifying Personal Information on Conversational Text Data (대화형 텍스트 데이터 내 개인정보 식별에 대한 연구)

  • Cha, Do Hyun;Kown, Bo Keun;Youn, Hee Chang;Lee, Gu Hyup;Joo, Jong Wha J.
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.11-13 / 2022
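  • Following the enactment of Korea's three data laws, companies must de-identify data containing personal information before they can use it. Existing approaches have clear limits: identifying personal information in unstructured text with regular expressions is constrained by the diversity of the data, and the conventional Named Entity Recognition (NER) task cannot handle ambiguous expressions in language or determine whose personal information appears in a two-person conversation. To overcome these limitations, we train a BERT language model with speaker information and propose labeling each eojeol (space-delimited word unit) with two tags, in an attempt to identify personal information accurately.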

The Mean and Variance of the MUSIC Null-Spectrum (MUSIC Null-Spectrum의 평균과 분산)

  • 최진호;윤진선;김형명;송익호;박성일
    • The Journal of Korean Institute of Communications and Information Sciences / v.17 no.2 / pp.114-120 / 1992
  • In this paper we derive the asymptotic distribution of the MUSIC null-spectrum, from which an exact expression of the asymptotic variance of the MUSIC null-spectrum can be obtained. From this result, an explicit expression of the normalized standard deviation is also derived, and it is shown that the normalized standard deviation depends only on the number of sensors and the number of signals.

Development of an E-Wallet Application for Credit Card Payment for Android (카드 결제 내역을 관리하는 안드로이드 앱의 개발)

  • Ryu, Yeon-Joong;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2014.01a / pp.287-290 / 2014
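  • Unlike cash, widespread card use makes spending hard to track and encourages overspending. To reduce unnecessary spending and support rational consumption, this paper proposes a card payment history management app that manages the SMS messages received at the time of card payment in one place so that payment history can be analyzed. A Metro UI suited to touch environments is used on the main screen so that every menu is reachable from it, and SMS messages are analyzed automatically so that usage records need not be entered manually, making the app quick and easy to use.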

The Design & Implementation of Korean Hypertext Automatic Translator (한글 하이퍼텍스트 자동변환시스팀의 설계 및 구현)

  • Ahn, B.I.;Kim, Jay;Kim, Y.W.
    • Annual Conference on Human and Language Technology / 1993.10a / pp.91-98 / 1993
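  • Hypertext offers a new alternative for computerizing document retrieval, but authoring it requires considerable time and effort. In this study, we designed and implemented a conversion system that automatically converts existing Korean documents into hypertext documents. The logical structure of a document is analyzed from a user-supplied regular expression describing the subheading format, and the document is then converted into hypertext through document segmentation, morphological analysis, representative card selection, and link generation. Trial operation demonstrated that the system can automatically convert large volumes of Korean documents into practical hypertext documents with little effort.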
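
The first step described in this abstract, recovering a document's logical structure from a user-supplied subheading regular expression, might look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' system: the function name `split_by_subheadings`, the numbered-heading pattern, and the sample text are all hypothetical, and the later stages (morphological analysis, representative card selection, link generation) are not modeled.

```python
import re


def split_by_subheadings(text, heading_pattern):
    """Split text into (heading, body) pairs.

    heading_pattern is a user-supplied regular expression that matches
    one whole subheading line (hypothetical interface for illustration).
    """
    heading_re = re.compile(heading_pattern, re.MULTILINE)
    matches = list(heading_re.finditer(text))
    sections = []
    for i, m in enumerate(matches):
        # A section body runs from the end of this heading
        # to the start of the next heading (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((m.group(0).strip(), text[m.end():end].strip()))
    return sections


# Hypothetical sample document with numbered subheadings.
sample = (
    "1. Introduction\n"
    "Hypertext helps retrieval.\n"
    "2. Design\n"
    "The system splits documents.\n"
)
sections = split_by_subheadings(sample, r"^\d+\.\s+.+$")
for heading, body in sections:
    print(heading, "->", body)
```

Each resulting section could then serve as one hypertext node; text before the first heading is ignored in this sketch.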

A study on the efficient method of constrained iterative regular expression pattern matching (제약 반복적인 정규표현식 패턴 매칭의 효율적인 방법에 관한 연구)

  • Seo, Byung-Suk
    • Design & Manufacturing / v.16 no.3 / pp.34-38 / 2022
  • Regular expression pattern matching is widely used in applications such as antivirus software, network intrusion detection systems (NIDS), and DNA sequencing analysis. Hardware-based pattern matching is used when high-performance processing is required due to time constraints. ReCPU, SMPU, and REMP, which are processor-based regular expression matching processors, have been proposed to solve the problem of the hardware-based method, which requires resynthesis whenever a pattern is updated. However, these processor-based regular expression matching processors handle the repetitive operations of regular expressions inefficiently. In this paper, we propose a new instruction set to improve the inefficient repetitive operations of ReCPU and SMPU. We propose REMPi, a regular expression matching processor that enables efficient iterative operations based on the REMP instruction set. REMPi improves the inefficient method of processing a particularly short sub-pattern as a repeat operation OR, and enables processing with a single instruction. In addition, by using a down counter and a counter stack, nested iterative operations are also processed efficiently. REMPi was described in Verilog and synthesized on an Intel Stratix IV FPGA.
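
The idea of handling nested repetition with a down counter and a counter stack can be modeled in software. The sketch below is a hedged illustration only and does not reproduce the REMPi instruction set: the opcode names `CHAR`, `REPEAT`, and `END` and the function `match_program` are hypothetical, and only exact repeat counts of at least 1 are supported (no alternation, ranges, or backtracking).

```python
def match_program(program, text):
    """Run a tiny pattern program over text; return True on a full match.

    Hypothetical opcodes (not the REMPi instruction set):
      ("CHAR", c)   : consume exactly the character c
      ("REPEAT", n) : start a loop whose body runs n times (n >= 1)
      ("END",)      : close the innermost loop
    """
    pc = 0       # program counter
    pos = 0      # current position in text
    stack = []   # counter stack of [body_start_pc, remaining_iterations]
    while pc < len(program):
        op = program[pc]
        if op[0] == "CHAR":
            if pos >= len(text) or text[pos] != op[1]:
                return False
            pos += 1
            pc += 1
        elif op[0] == "REPEAT":
            # Push a fresh down counter for this loop level.
            stack.append([pc + 1, op[1]])
            pc += 1
        elif op[0] == "END":
            stack[-1][1] -= 1              # count down one iteration
            if stack[-1][1] > 0:
                pc = stack[-1][0]          # jump back to the loop body
            else:
                stack.pop()                # loop finished, drop its counter
                pc += 1
        else:
            raise ValueError(f"unknown opcode: {op[0]}")
    return pos == len(text)


# Program equivalent to the regular expression (a(bc){2}){2}.
nested = [
    ("REPEAT", 2),
    ("CHAR", "a"),
    ("REPEAT", 2),
    ("CHAR", "b"),
    ("CHAR", "c"),
    ("END",),
    ("END",),
]
print(match_program(nested, "abcbcabcbc"))
print(match_program(nested, "abcbcabc"))
```

Because each loop level keeps its own counter on the stack, the inner count is reset every time the outer loop starts a new iteration, which is what makes nesting work without unrolling the pattern.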

Efficient Regular Expression Matching Using FPGA (FPGA를 이용한 효율적 정규표현매칭)

  • Lee, Jang-Haeng;Lee, Seong-Won;Park, Neung-Soo
    • The KIPS Transactions:PartC / v.16C no.5 / pp.583-588 / 2009
  • A network intrusion detection system (NIDS) monitors all incoming packets in the network and detects packets that are malicious to the internal system. The NIDS should also be able to update its detection rules because new attack patterns are unpredictable. Incorporating FPGAs into the NIDS is one of the best solutions, providing both high performance and high flexibility compared with other approaches such as software solutions. In this paper we propose and design a novel approach, a prefix-sharing parallel pattern matcher, that both minimizes additional resources and maximizes processing performance. Experimental results showed that the throughput for 16-bit input is twice that for 8-bit input, while the LEs/Char used in the FPGA increases by only 1.07 times.

Multivariate empirical distribution functions and descriptive methods (다변량 경험분포함수와 시각적인 표현방법)

  • Hong, Chong Sun;Park, Jun;Park, Yong Ho
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.87-98 / 2017
  • The multivariate empirical distribution function (MEDF) is defined in this work. The MEDF's expectation and variance are derived, and we show that the MEDF converges to its real distribution function. Based on random samples from the bivariate standard normal distribution with various correlation coefficients, we also obtain MEDFs and propose two kinds of graphical methods to visualize MEDFs on a two-dimensional plane. One is represented with at most n stairs, with arguments similar to those for the step function, and the other is described with at most n curves that look like bivariate quantile vectors. Even though these two descriptive methods could be expressed in three-dimensional space, the two-dimensional representation is obtained with ease and is enough to explain the characteristics of bivariate distribution functions. Hence, it is possible to visualize trivariate empirical distribution functions with three-dimensional quantile vectors. With bivariate and four-variate illustrative examples, the proposed MEDF descriptive plots are obtained and explored.
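
As a rough illustration of the quantity being discussed, the sketch below evaluates an empirical distribution function for multivariate data. It assumes the standard definition (the fraction of sample points that are less than or equal to the query point in every coordinate); the abstract does not state the paper's exact definition, and the function name `medf` and the sample data are hypothetical.

```python
def medf(samples, point):
    """Empirical distribution function at `point`.

    Returns the fraction of samples x with x[j] <= point[j] for every
    coordinate j (standard definition, assumed here for illustration).
    """
    n = len(samples)
    count = sum(
        all(x_j <= t_j for x_j, t_j in zip(x, point))
        for x in samples
    )
    return count / n


# Hypothetical bivariate sample of size n = 4.
data = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
print(medf(data, (2.0, 2.0)))  # three of the four points lie at or below (2, 2)
```

The function takes at most n + 1 distinct values (0, 1/n, ..., 1), which is consistent with the abstract's description of plots built from at most n stairs.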

Browser fuzzing and analysis using known vulnerability (파이썬 모듈과 정규표현식을 활용한 웹 취약점 탐색 자동화 봇)

  • Kim, Nam-gue;Kim, Ki Hwan;Lee, Hoon-Jae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.749-751 / 2016
  • Internet technology has become ubiquitous, and a wide range of activities such as reading news, shopping, and searching are carried out through web browsers. As its scale grows, so does the scale of information security incidents, and as the resulting damage increases, safety in the use of the Internet is emphasized. The IE browser has various protection measures such as ASLR and Isolated Heap and has been patched continually for many vulnerabilities, yet new vulnerabilities keep emerging. To prevent security incidents, vulnerabilities therefore need to be found and removed before they are exploited. In this paper, we introduce fuzzing, a technique used to discover vulnerabilities, and describe the related automation technology. Using known vulnerabilities, we also show a typical procedure for vulnerability analysis.

Analytic Error Caused by the Inconsistency of the Approximation Order between the Non Local Boundary Condition and the Parabolic Governing Equation (포물선 지배 방정식과 비국소적 경계조건의 근사 차수 불일치에 의한 해석적 오차)

  • Lee Keun-Hwa;Seong Woo-Jae
    • The Journal of the Acoustical Society of Korea / v.25 no.5 / pp.229-238 / 2006
  • This paper shows the analytic error caused by the inconsistency of the approximation order between the non local boundary condition (NLBC) and the parabolic governing equation. To obtain the analytic error, we first transform the NLBC to the half space domain using plane wave analysis. Then, the analytic error is derived on the boundary between the true numerical domain and the half space domain equivalent to the NLBC. The derived analytic error is physically expressed as the artificial reflection. We examine the characteristic of the analytic error for the grazing angle, the approximation order of the PE or the NLBC. Our main contribution is to present the analytic method of error estimation and the application limit for the high order parabolic equation and the NLBC.