• Title/Summary/Keyword: Source Code Analysis

Search Result 437, Processing Time 0.025 seconds

Source Coding Rule of Characters to Minimize HDB-3 Scrambling in Line Coder for UTF-8 code (UTF-8 부호의 HDB-3스크램블링 최소화를 위한 문자의 원천부호화 규칙)

  • Hong, Wan-Pyo
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.10 no.9
    • /
    • pp.1019-1026
    • /
    • 2015
  • This paper studied the source coding rule of the characters to minimize the HDB-3 scrambling for UTF-8 code. An existing source coding rule of the characters to minimize the HDB-3 scrambling in the line coder is for the source codes which are directly entered into the line coder without any transformation. Therefore the existing source coding rule can't apply the UTF-8 code which is directly came into an input of line coder. The reason is that the scrambling code in the source codes are not same as UTF-8 codes. So, if they want to analysis the scrambling occurrence situation in UTF-8 codes and make an unscrambling UTF-8 code, they should make a UTF-8 code table for the source codes, find out the scrambling occurrence codes and then encode the unscrambling source code. The source coding rule for UTF-8 code showing this paper can omit such a complicated procedure to encode an unscrambling source code.

Applying Practice Patterns to Improve Source Code Quality of Embedded Software (임베디드 소프트웨어의 소스 코드 품질 향상을 위한 Practice Patterns의 적용)

  • Hong Jang-Eui
    • The KIPS Transactions:PartA
    • /
    • v.12A no.7 s.97
    • /
    • pp.589-596
    • /
    • 2005
  • Source code quality is very Important that software embedded into product is difficult to change. In order to improve source code quality, it should be considered the quality of analysis and design models as well as the quality of source code. In this paper, we suggest 'Practice Pattern' as one of practical techniques to improve embedded software source code quality. Practice pattern is a procedural pattern to guide modeling and coding activities in software development phases. We believe that applying our pattern provides the improvement of optimum performance, modularization, and portability for embedded software source code.

Detecting Inconsistent Code Identifiers (코드 비 일관적 식별자 검출 기법)

  • Lee, Sungnam;Kim, Suntae;Park, Sooyoung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.5
    • /
    • pp.319-328
    • /
    • 2013
  • Software maintainers try to comprehend software source code by intensively using source code identifiers. Thus, use of inconsistent identifiers throughout entire source code causes to increase cost of software maintenance. Although participants can adopt peer reviews to handle this problem, it might be impossible to go through entire source code if the volume of code is huge. This paper introduces an approach to automatically detecting inconsistent identifiers of Java source code. This approach consists of tokenizing and POS tagging all identifiers in the source code, classifying syntactic and semantic similar terms, and finally detecting inconsistent identifiers by applying proposed rules. In addition, we have developed tool support, named CodeAmigo, to support the proposed approach. We applied it to two popular Java based open source projects in order to show feasibility of the approach by computing precision.

A Survey of Automatic Code Generation from Natural Language

  • Shin, Jiho;Nam, Jaechang
    • Journal of Information Processing Systems
    • /
    • v.17 no.3
    • /
    • pp.537-555
    • /
    • 2021
  • Many researchers have carried out studies related to programming languages since the beginning of computer science. Besides programming with traditional programming languages (i.e., procedural, object-oriented, functional programming language, etc.), a new paradigm of programming is being carried out. It is programming with natural language. By programming with natural language, we expect that it will free our expressiveness in contrast to programming languages which have strong constraints in syntax. This paper surveys the approaches that generate source code automatically from a natural language description. We also categorize the approaches by their forms of input and output. Finally, we analyze the current trend of approaches and suggest the future direction of this research domain to improve automatic code generation with natural language. From the analysis, we state that researchers should work on customizing language models in the domain of source code and explore better representations of source code such as embedding techniques and pre-trained models which have been proved to work well on natural language processing tasks.

Quench analysis and protection circuit design of a superconducting magnet system for RISP 28GHz ECR ion source

  • Song, S.;Ko, T.K.;Choi, S.;Ahn, M.C.
    • Progress in Superconductivity and Cryogenics
    • /
    • v.18 no.2
    • /
    • pp.37-41
    • /
    • 2016
  • This paper presents the developed quench analysis code and protection circuit design for a superconducting magnet system of 28GHz electron cyclotron resonance (ECR) ion source. The superconducting magnet is composed of a hexapole magnet and four solenoid magnets located outside of the hexapole one. All magnets are wound with NbTi composite wire and impregnated by epoxy. By using the developed characteristic analysis code, the normal zone resistance, decaying current and temperature rising can be estimated during quench. Also, the stored magnetic energy is successfully consumed from the series resistor of the designed protection circuit. The analytical results are compared with the experimental results to verify the developed quench analysis code and protection circuit.

A Plagiarism Detection Technique for Java Program Using Bytecode Analysis (바이트코드 분석을 이용한 자바 프로그램 표절검사기법)

  • Ji, Jeong-Hoon;Woo, Gyun;Cho, Hwan-Gue
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.7
    • /
    • pp.442-451
    • /
    • 2008
  • Most plagiarism detection systems evaluate the similarity of source codes and detect plagiarized program pairs. If we use the source codes in plagiarism detection, the source code security can be a significant problem. Plagiarism detection based on target code can be used for protecting the security of source codes. In this paper, we propose a new plagiarism detection technique for Java programs using bytecodes without referring their source codes. The plagiarism detection procedure using bytecode consists of two major steps. First, we generate the token sequences from the Java class file by analyzing the code area of methods. Then, we evaluate the similarity between token sequences using the adaptive local alignment. According to the experimental results, we can find the distributions of similarities of the source codes and that of bytecodes are very similar. Also, the correlation between the similarities of source code pairs and those of bytecode pairs is high enough for typical test data. The plagiarism detection system using bytecode can be used as a preliminary verifying tool before detecting the plagiarism by source code comparison.

Measuring Method of Worst-case Execution Time by Analyzing Relation between Source Code and Executable Code (소스코드와 실행코드의 상관관계 분석을 통한 최악실행시간 측정 방법)

  • Seo, Yongjin;Kim, Hyeon Soo
    • Journal of Internet Computing and Services
    • /
    • v.17 no.4
    • /
    • pp.51-60
    • /
    • 2016
  • Embedded software has requirements such as real-time and environment independency. The real-time requirement is affected from worst-case execution time of loaded tasks. Therefore, to guarantee real-time requirement, we need to determine a program's worst-case execution time using static analysis approach. However, the existing methods for worst-case execution time analysis do not consider the environment independency. Thus, in this paper, in order to provide environment independency, we propose a method for measuring task's execution time from the source codes. The proposed method measures the execution time through the control flow graph created from the source codes instead of the executable codes. However, the control flow graph created from the source code does not have information about execution time. Therefore, in order to provide this information, the proposed method identifies the relationships between statements in the source code and instructions in the executable code. By parameterizing those parts that are dependent on processors based on the relationships, it is possible to enhance the flexibility of the tool that measures the worst-case execution time.

The Identification Framework for source code author using Authorship Analysis and CNN (작성자 분석과 CNN을 적용한 소스 코드 작성자 식별 프레임워크)

  • Shin, Gun-Yoon;Kim, Dong-Wook;Hong, Sung-sam;Han, Myung-Mook
    • Journal of Internet Computing and Services
    • /
    • v.19 no.5
    • /
    • pp.33-41
    • /
    • 2018
  • Recently, Internet technology has developed, various programs are being created and therefore various codes are being made through many authors. On this aspect, some author deceive a program or code written by other particular author as they make it themselves and use other writers' code indiscriminately, or not indicating the exact code which has been used. Due to this makes it more and more difficult to protect the code. In this paper, we propose author identification framework using Authorship Analysis theory and Natural Language Processing(NLP) based on Convolutional Neural Network(CNN). We apply Authorship Analysis theory to extract features for author identification in the source code, and combine them with the features being used text mining to perform author identification using machine learning. In addition, applying CNN based natural language processing method to source code for code author classification. Therefore, we propose a framework for the identification of authors using the Authorship Analysis theory and the CNN. In order to identify the author, we need special features for identifying the authors only, and the NLP method based on the CNN is able to apply language with a special system such as source code and identify the author. identification accuracy based on Authorship Analysis theory is 95.1% and identification accuracy applied to CNN is 98%.

A Study on the Improvement of Source Code Static Analysis Using Machine Learning (기계학습을 이용한 소스코드 정적 분석 개선에 관한 연구)

  • Park, Yang-Hwan;Choi, Jin-Young
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.6
    • /
    • pp.1131-1139
    • /
    • 2020
  • The static analysis of the source code is to find the remaining security weaknesses for a wide range of source codes. The static analysis tool is used to check the result, and the static analysis expert performs spying and false detection analysis on the result. In this process, the amount of analysis is large and the rate of false positives is high, so a lot of time and effort is required, and a method of efficient analysis is required. In addition, it is rare for experts to analyze only the source code of the line where the defect occurred when performing positive/false detection analysis. Depending on the type of defect, the surrounding source code is analyzed together and the final analysis result is delivered. In order to solve the difficulty of experts discriminating positive and false positives using these static analysis tools, this paper proposes a method of determining whether or not the security weakness found by the static analysis tools is a spy detection through artificial intelligence rather than an expert. In addition, the optimal size was confirmed through an experiment to see how the size of the training data (source code around the defects) used for such machine learning affects the performance. This result is expected to help the static analysis expert's job of classifying positive and false positives after static analysis.

Study on the comparison result of Machine code Program (실행코드 비교 감정에서 주변장치 분석의 유효성)

  • Kim, Do-Hyeun;Lee, Kyu-Tae
    • Journal of Software Assessment and Valuation
    • /
    • v.16 no.1
    • /
    • pp.37-44
    • /
    • 2020
  • The similarity of the software is extracted by the verification of comparing with the source code. The source code is the intellectual copyright of the developer written in the programming language. And the source code written in text format contains the contents of the developer's expertise and ideas. The verification for judging the illegal use of software copyright is performed by comparing the structure and contents of files with the source code of the original and the illegal copy. However, there is hard to do the one-to-one comparison in practice. Cause the suspected source code do not submitted Intentionally or unconsciously. It is now increasing practically. In this case, the comparative evaluation with execution code should be performed, and indirect methods such as reverse assembling method, reverse engineering technique, and sequence analysis of function execution are applied. In this paper, we analyzed the effectiveness of indirect comparison results by practical evaluation . It also proposes a method to utilize to the system and executable code files as a verification results.