http://dx.doi.org/10.3745/KTSDE.2013.2.5.319

Detecting Inconsistent Code Identifiers  

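Lee, Sungnam (Defense Acquisition Program Administration)
Kim, Suntae (Department of Computer Engineering, Kangwon National University)
Park, Sooyoung (Department of Computer Engineering, Sogang University)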
Publication Information
KIPS Transactions on Software and Data Engineering, v.2, no.5, 2013, pp. 319-328
Abstract
Software maintainers rely heavily on source code identifiers to comprehend software. Inconsistent identifiers across a code base therefore increase the cost of software maintenance. Although developers can use peer reviews to address this problem, reviewing the entire source code may be infeasible when the code base is large. This paper introduces an approach for automatically detecting inconsistent identifiers in Java source code. The approach consists of tokenizing and POS tagging all identifiers in the source code, classifying syntactically and semantically similar terms, and finally detecting inconsistent identifiers by applying the proposed rules. In addition, we have developed a supporting tool, named CodeAmigo. We applied it to two popular Java-based open source projects and computed precision to show the feasibility of the approach.
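The tokenizing and syntactic-similarity steps can be illustrated with a minimal sketch. It is not the CodeAmigo implementation: the class name `IdentifierSketch`, the splitting rule, the use of edit distance as the similarity measure, and the minimum term length of 4 are all assumptions made for illustration. POS tagging and semantic similarity are omitted because they need an external parser and lexical database.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; names and thresholds are illustrative assumptions.
class IdentifierSketch {

    // Split a camelCase or snake_case identifier into lowercase terms.
    static List<String> tokenize(String identifier) {
        List<String> terms = new ArrayList<>();
        // Break at underscores, lower-to-upper boundaries, and acronym-to-word boundaries.
        String[] parts = identifier.split(
                "_+|(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");
        for (String part : parts) {
            if (!part.isEmpty()) {
                terms.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Classic dynamic-programming edit distance (insert, delete, substitute).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Report pairs of distinct terms that are suspiciously close in spelling.
    // Terms shorter than 4 characters are skipped (assumed cutoff) to limit noise.
    static List<String> similarPairs(List<String> identifiers, int maxDistance) {
        Set<String> distinct = new LinkedHashSet<>();
        for (String id : identifiers) {
            distinct.addAll(tokenize(id));
        }
        List<String> terms = new ArrayList<>(distinct);
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            for (int j = i + 1; j < terms.size(); j++) {
                String a = terms.get(i);
                String b = terms.get(j);
                if (Math.min(a.length(), b.length()) >= 4
                        && editDistance(a, b) <= maxDistance) {
                    pairs.add(a + " ~ " + b);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("loadConfig", "saveConfg", "getUserName");
        System.out.println(similarPairs(ids, 1));
    }
}
```

With the sample identifiers in `main`, the only candidate pair is `config` and `confg`, which a reviewer could then confirm or dismiss. A real detector would also need the POS tags and semantic relations described in the abstract to tell genuine inconsistencies from legitimate near-matches.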
Keywords
Inconsistent Identifiers; Source Code Analysis; Natural Language Processing