Search | Korea Science

A Method for Automatic Detection of Character Encoding of Multi Language Document File (다중 언어로 작성된 문서 파일에 적용된 문자 인코딩 자동 인식 기법)

Seo, Min Ji;Kim, Myung Ho
- KIISE Transactions on Computing Practices / v.22 no.4 / pp.170-177 / 2016
Character encoding converts a document into a binary file for storage on a computer by means of a code table. To decode such a file back into readable text, one must know which code table was applied when the file was encoded, so identifying that code table is an essential part of decoding. In this paper, we propose a method for automatically detecting the character encoding of a given binary document file. The method combines several techniques to increase the detection rate: character code range detection, escape character detection, character code characteristic detection, and commonly used word detection. Because commonly used word detection draws on multiple word databases, the method achieves a much higher detection rate on multi-language files than other methods. When a language makes up less than 20% of a document, the conventional method recognizes the encoding in about 50% of cases, whereas the proposed method achieves up to 96% encoding recognition regardless of the proportion of each language.
https://doi.org/10.5626/KTCP.2016.22.4.170
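
The combination of code range checks and commonly used word detection described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate encoding list, the tiny word lists in `COMMON_WORDS`, and the function name `detect_encoding` are all assumptions made for illustration. A strict decode failure stands in for the range check, and the count of common words in the decoded text stands in for the word database lookup.

```python
# Hypothetical word lists; the paper's actual word databases are not given here.
COMMON_WORDS = {
    "en": ["the", "and", "of"],
    "ko": ["그리고", "있다", "것이다"],
    "ja": ["です", "ます", "これ"],
}

# Hypothetical candidate encodings (all are standard Python codec names).
CANDIDATES = ["utf-8", "euc-kr", "shift_jis"]


def detect_encoding(data: bytes, candidates=CANDIDATES):
    """Return the candidate encoding whose decoded text contains the most common words."""
    best, best_score = None, -1
    for enc in candidates:
        try:
            text = data.decode(enc)  # strict decoding rejects invalid byte ranges
        except UnicodeDecodeError:
            continue
        # Score across every language so multi-language files are not penalized.
        score = sum(text.count(w) for words in COMMON_WORDS.values() for w in words)
        if score > best_score:
            best, best_score = enc, score
    return best


sample = "그리고 the cat and 이것은 있다".encode("euc-kr")
print(detect_encoding(sample))
```

Scoring against all word lists at once, rather than one language at a time, is what lets a document with a small share of a second language still accumulate evidence for the correct code table.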

Analysis of Cross-Correlation of Extended Non-Linear Binary Sequences (확장된 비선형 이진수열의 상호상관관계 분석)

Choi, Un-Sook;Cho, Sung-Jin;Kwon, Sook-Hi
- The Journal of the Korea institute of electronic communication sciences / v.7 no.2 / pp.263-269 / 2012
Code-Division Multiple-Access (CDMA) allows several users simultaneous access to a common channel by assigning each user a distinct pseudonoise sequence called a spectrum code. Each user in a CDMA system uses the assigned spectrum code to modulate their signal, so the choice of codes is very important to system performance. The best performance occurs when the signal of the desired user is well separated from the signals of other users. The receiver synchronizes to the code to recover the data, and the use of independent codes allows multiple users to access the same frequency band at the same time. In this paper, we propose a generalized model of non-linear binary sequences based on the trace function and analyze the cross-correlation of these sequences. In a direct-sequence spread spectrum communication system, these sequences, which have low correlation, large linear span, and large family size, help to minimize multiple access interference, improve system security, and increase the number of users supported.
https://doi.org/10.13067/JKIECS.2012.7.2.263
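
To make the notion of cross-correlation between binary sequences concrete, the sketch below computes the periodic correlation of two 0/1 sequences after mapping them to +1/-1. The paper's trace-function-based non-linear sequences are not reproduced here; as a stand-in, the example uses a plain linear m-sequence of period 7 from a 3-stage LFSR with recurrence x_n = x_{n-2} XOR x_{n-3}. The function names are hypothetical.

```python
def lfsr_sequence(taps, state, length):
    """Generate bits from a Fibonacci LFSR; taps index the state bits XORed as feedback."""
    state = list(state)
    out = []
    for _ in range(length):
        out.append(state[-1])          # output the oldest bit
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]  # shift the new bit in
    return out


def periodic_cross_correlation(a, b):
    """Correlation of a with each cyclic shift of b, with bits mapped 0 -> +1, 1 -> -1."""
    n = len(a)
    return [
        sum((-1) ** (a[i] ^ b[(i + tau) % n]) for i in range(n))
        for tau in range(n)
    ]


seq = lfsr_sequence(taps=(1, 2), state=(1, 0, 0), length=7)
print(seq)                                   # [0, 0, 1, 0, 1, 1, 1]
print(periodic_cross_correlation(seq, seq))  # [7, -1, -1, -1, -1, -1, -1]
```

Passing two different sequences from the same family to `periodic_cross_correlation` gives the values whose magnitude determines multiple access interference; lower off-peak values mean better separation between users.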

OLE File Analysis and Malware Detection using Machine Learning

Choi, Hyeong Kyu;Kang, Ah Reum
- Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.149-156 / 2022
Recently, there have been many reports of document-type malware that injects malicious code into Microsoft Office files. Such malware often hides the malicious code by encoding it within the document, which lets it bypass anti-virus programs easily. We found that malicious code was inserted into Visual Basic for Applications (VBA) macros, a function supported by Microsoft Office. We identified malicious code such as shellcode that runs external programs and URL-related code that downloads files from external URLs. We selected 354 keywords that appear repeatedly in malicious Microsoft Office files and defined the number of times each keyword appears in the body of the document as a feature. We performed machine learning with SVM, naïve Bayes, logistic regression, and random forest algorithms, which achieved accuracies of 0.994, 0.659, 0.995, and 0.998, respectively.
https://doi.org/10.9708/jksci.2022.27.05.149
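
The feature definition in the abstract above, one count per keyword, can be sketched as follows. The four keywords in `KEYWORDS` are hypothetical placeholders, since the paper's list of 354 keywords is not given here, and case-insensitive substring counting is an assumption about how occurrences are tallied.

```python
import re

# Hypothetical keywords; the paper's 354-keyword list is not reproduced here.
KEYWORDS = ["Shell", "CreateObject", "URLDownloadToFile", "AutoOpen"]


def keyword_features(text, keywords=KEYWORDS):
    """Return one feature per keyword: its case-insensitive occurrence count in text."""
    lowered = text.lower()
    return [len(re.findall(re.escape(k.lower()), lowered)) for k in keywords]


macro = (
    'Sub AutoOpen()\n'
    '  Set o = CreateObject("WScript.Shell")\n'
    '  Shell "cmd"\n'
    'End Sub'
)
print(keyword_features(macro))  # [2, 1, 0, 1]
```

Each document then becomes a fixed-length count vector, which is the form of input that classifiers such as SVM or random forest expect.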

Recognition of a New Car License Plate Using HSI Information, Fuzzy Binarization and ART2 Algorithm (HSI 정보와 퍼지 이진화 및 ART2 알고리즘을 이용한 신차량 번호판의 인식)

Kim, Kwang-Baek;Woo, Young-Woon;Park, Choong-Shik
- Journal of the Korea Institute of Information and Communication Engineering / v.11 no.5 / pp.1004-1012 / 2007
In this paper, we propose a new car license plate recognition method using an unsupervised ART2 algorithm with the HSI color model. The proposed method consists of two main modules: extracting the plate area from a vehicle image, and then recognizing the characters in the plate. To extract the plate area, the hue (H) component of the HSI color model is used, and the sub-area containing characters is acquired using a modified fuzzy binarization method. The individual characters are then separated by a 4-directional edge tracking algorithm, and the noise-robust ART2 algorithm is employed to recognize them. When the proposed algorithm is applied to license plate characters, the extraction rate is better than that of the existing RGB model and the overall recognition rate is about 97.4%.
https://doi.org/10.6109/jkiice.2007.11.5.1004
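
As a small illustration of the hue (H) component that the abstract above relies on, the sketch below converts an RGB pixel to its HSI hue angle using the standard geometric formula. The helper `in_hue_band` and its thresholds are hypothetical; the paper's actual hue range for plate extraction is not stated here.

```python
import math


def hsi_hue(r, g, b):
    """Return the HSI hue of an RGB pixel in degrees; 0.0 is returned for grays."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        return 0.0  # hue is undefined when r == g == b
    ratio = max(-1.0, min(1.0, num / den))  # guard against rounding outside [-1, 1]
    theta = math.degrees(math.acos(ratio))
    return 360.0 - theta if b > g else theta


def in_hue_band(pixel, low, high):
    """Hypothetical helper: True if the pixel's hue lies within [low, high] degrees."""
    return low <= hsi_hue(*pixel) <= high


print(round(hsi_hue(255, 0, 0)))  # 0   (red)
print(round(hsi_hue(0, 255, 0)))  # 120 (green)
print(round(hsi_hue(0, 0, 255)))  # 240 (blue)
```

Applying `in_hue_band` to every pixel yields a binary mask of candidate plate regions, because hue depends on the color itself rather than on how brightly the plate is lit.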

An Analysis Technique for Encrypted Unknown Malicious Scripts (알려지지 않은 악성 암호화 스크립트에 대한 분석 기법)

Lee, Seong-Uck;Hong, Man-Pyo
- Journal of KIISE:Information Networking / v.29 no.5 / pp.473-481 / 2002
Decryption of encrypted malicious scripts is essential in order to analyze the scripts and to determine whether they are malicious. An effective decryption technique is one designed around the characteristics of script languages rather than specific encryption patterns. However, X-raying and emulation are currently not suitable for scripts because they were designed to decrypt malicious binary code. In addition, heuristic techniques cannot decrypt unknown scripts that use unknown encryption techniques. In this paper, we propose a new technique that decrypts malicious scripts based on an analytical approach, and we describe its implementation.

A Meta-data Generation Technique for Efficient and Secure Code Reuse Attack Detection with a Consideration on Two Types of Instruction Set (안전하고 효율적인 Code Reuse Attack 탐지를 위한 ARM 프로세서의 두 가지 명령어 세트를 고려한 Meta-data 생성 기술)

Heo, Ingeo;Han, Sangjun;Lee, Jinyong;Paek, Yunheung
- Proceedings of the Korea Information Processing Society Conference / 2014.11a / pp.443-446 / 2014
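A code reuse attack (CRA) is a powerful attack in which the attacker assembles a malicious program by collecting the needed code fragments (gadgets) from existing code and chaining them with indirect branch instructions. Because the attacker uses existing code instead of planting new code on the target system, the attack can defeat the W^X protection enforced by most general-purpose operating systems (OS). To counter CRAs, many studies have proposed signature-based detection techniques that analyze branch traces to find characteristics unique to CRAs. In this paper, we propose a binary analysis and meta-data generation technique that efficiently supports signature-based detection of CRAs on ARM processors. In particular, this paper takes into account the characteristics of ARM's two instruction sets, which our previous paper did not consider, and generates meta-data for both modes so that a CRA can be blocked regardless of which instruction set the attacker uses. Experiments show that the meta-data increases size by about 20.8% relative to the original binary code.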
https://doi.org/10.3745/PKIPS.y2014m11a.443

A Fast Cell Search Algorithm using Code Position Modulation within code block in Asynchronous W-CDMA System (비동기 W-CDMA 시스템을 위한 코드블럭 내의 코드위치변조를 이용한 고속 셀 탐색 알고리즘)

최정현;김낙명
- The Journal of Korean Institute of Communications and Information Sciences / v.25 no.5A / pp.611-617 / 2000
The asynchronous-mode W-CDMA system is known to be well suited to the next-generation mobile communication system, especially in a non-homogeneous cellular architecture. In this case, however, each base station needs to use a different spreading code for identification, so it is a demanding task for a mobile terminal to find the best cell site and achieve accurate code synchronization at the beginning of a communication. Since slow acquisition of a base station could mean the failure of initiation, a fast algorithm to accelerate the cell search process is essential. In this paper, a new cell search algorithm based on binary code position modulation within the code block is proposed. Different cell sites are identified by different hopping code sequences, and each position modulation is performed by the hopping code. The proposed algorithm is shown to make the cell search time much shorter than previous algorithms in most places in a cell, and to simplify receiver implementation.

Implementation of PDF417 2-dimensional Barcode Decoder (PDF417 이차원 바코드 디코딩 알고리즘의 구현)

정정구;한희일
- Proceedings of the IEEK Conference / 2001.09a / pp.289-292 / 2001
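Whereas conventional one-dimensional barcodes have mainly served as data keys for accessing a database that holds the information, two-dimensional barcodes can hold large amounts of data at high density, so information about the person or object to be checked can be obtained without an online connection to a host computer's database. This paper focuses on PDF417, the most widely used two-dimensional barcode symbology, and proposes an algorithm that binarizes an image captured with a digital camera, searches for the start or stop symbol to extract the two-dimensional barcode region, detects header information such as the number of rows and columns and the error correction level from the extracted region, and then extracts the codewords on that basis. To store data efficiently, the resulting codewords are encoded differently depending on whether the information is numeric, ASCII code, or byte data, and we propose a decoding algorithm for each case.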

Service Modulization of the Code Visualization (코드 가시화의 서비스 모듈화)

Lee, Jin-Hyub;Yi, Keunsang;Seo, Chae-Yun;Kim, R. YoungChul
- Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.629-632 / 2017
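Large Korean companies check the quality and stability of their software through sufficient software testing. Small and medium-sized companies, by contrast, face a difficult testing environment because of limited staff and expensive commercial testing tools. As a result, they release software products with insufficient testing. As one solution to this problem, this paper focuses on reducing potential errors by having developers measure the internal complexity of their code. To this end, we propose improvements to open-source software tools and implement a visualization. This makes a visualization service for each quality factor available to developers at venture and small and medium-sized companies. Visually modularizing internal code properties such as coupling, cohesion, complexity, and reuse makes it possible to improve software quality.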
https://doi.org/10.3745/PKIPS.y2017m04a.629

The Design for a Method of Detecting Polymorphic Script Virus Using Static Analysis (정적 분석을 이용한 다형성 스크립트 바이러스의 탐지기법 설계)

이형준;김철민;이성욱;홍만표
- Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.407-409 / 2003
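Unlike binary code, malicious script viruses, including macro viruses, store their code as text, so many variants are possible and polymorphic forms are easy to create; new forms therefore appear frequently [1]. Accordingly, various techniques that move away from signature-based detection have been proposed, but they are not applied in practice because of the time delay caused by fine-grained analysis and a high false positive rate. To improve on this, a technique was proposed that completes static analysis in a relatively short time and combines it with code insertion to resolve the false positive problem [2]. However, the static analysis used in that technique does not take polymorphic script viruses into account. In this paper, we extend the proposed static analysis technique to present a technique that can detect polymorphic script viruses.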

Search Result 255, Processing Time 0.034 seconds
