Search | Korea Science

Designing a Repository Independent Model for Mining and Analyzing Heterogeneous Bug Tracking Systems (다형의 버그 추적 시스템 마이닝 및 분석을 위한 저장소 독립 모델 설계)

Lee, Jae-Kwon; Jung, Woo-Sung
- Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.103-115 / 2014
In this paper, we propose UniBAS (Unified Bug Analysis System), which provides a unified repository model by integrating data extracted from heterogeneous bug tracking systems. UniBAS reduces the cost and complexity of the MSR (Mining Software Repositories) research process and lets researchers focus on their own logic rather than on tedious, repetitive work such as extracting repositories, processing data, and building analysis models. In addition to extracting data, the system automatically generates the database tables, views, and stored procedures that researchers need to perform query-based analysis easily. It can also export files in various formats for use with external analysis tools or for managing research data. To evaluate the usefulness of the system, we performed a case study with UniBAS on detecting duplicate bug reports from the Firefox project on the Mozilla site. Experiments that applied various natural language processing algorithms and flexible queries to the automatically extracted data also showed the effectiveness of the proposed system.
https://doi.org/10.9708/jksci.2014.19.9.103

A Study of GitHub Documentation Repositories: What Makes GitHub Documentation Repository Popular? (깃허브 문서 저장소들에 대한 연구: 무엇이 깃허브 문서 저장소를 유명하게 하는가?)

Jung Il Kim
- The Transactions of the Korea Information Processing Society / v.13 no.8 / pp.374-381 / 2024
Documentation repositories on GitHub are used to share information that is helpful in performing various tasks. Popular documentation repositories have an advantage in attracting contributors who can help manage and extend them. Therefore, to develop strategies for attracting users' attention, it is important to understand which characteristics help a documentation repository become popular. This paper presents a study of GitHub documentation repositories. To conduct the study, we collected 566 documentation repositories from GitHub and manually categorized them into 30 topics. Based on stargazer scores, we divided the collected repositories into popular and unpopular groups and investigated the topics in the popular group. We then statistically examined the differences in README characteristics between the two groups. As a result, we found that the studied documentation repositories have 23 popular topics and that the popular and unpopular groups differ in 5 README characteristics. The results of our study indicate what kinds of documentation repositories become popular on GitHub.
https://doi.org/10.3745/TKIPS.2024.13.8.374

Design and Implementation of a Data Extraction Tool for Analyzing Software Changes

Lee, Yong-Hyeon; Kim, Kisub; Lee, Jaekwon; Jung, Woosung
- Journal of the Korea Society of Computer and Information / v.21 no.8 / pp.65-75 / 2016
In this paper, we present a novel approach to help MSR researchers obtain the data they need with a tool termed General Purpose Extractor for Source code (GPES). GPES has a single function that extracts high-quality data, e.g., the version history, abstract syntax tree (AST), changed code diff, and software quality metrics. Moreover, support for features such as ASTs of other languages or new software metrics can be added easily, because GPES has a flexible data model and a component-based design. We conducted several case studies to evaluate the usefulness and effectiveness of our tool. The case studies show that researchers can reduce the overall cost of data analysis by transforming the data into the required formats.
https://doi.org/10.9708/jksci.2016.21.8.065

Towards Effective Analysis and Tracking of Mozilla and Eclipse Defects using Machine Learning Models based on Bugs Data

Hassan, Zohaib; Iqbal, Naeem; Zaman, Abnash
- Soft Computing and Machine Intelligence / v.1 no.1 / pp.1-10 / 2021
Analysis and tracking of bug reports is a challenging field in mining software repositories. It is one of the fundamental ways to explore the large amount of data acquired from defect tracking systems in order to discover patterns and valuable knowledge about the bug triaging process. Furthermore, bug data is publicly accessible from systems such as Bugzilla and JIRA. Moreover, with robust machine learning (ML) techniques, it is possible to process and analyze massive amounts of data to extract underlying patterns, knowledge, and insights. Therefore, it is an interesting area in which to propose innovative and robust solutions for analyzing and tracking bug reports originating from different open source projects, including Mozilla and Eclipse. This study presents an ML-based classification model to analyze and track bug defects for enhancing software engineering management (SEM) processes. In this work, Artificial Neural Network (ANN) and Naive Bayesian (NB) classifiers are implemented using open-source bug datasets from Mozilla and Eclipse. Different evaluation measures are employed to analyze and evaluate the experimental results, and a comparative analysis of ANN and NB is given. The experimental results indicate that the ANN achieved higher accuracy than the NB. The proposed study will enhance SEM processes and contribute to the body of knowledge of the data mining field.

Towards cross-platform interoperability for machine-assisted text annotation

de Castilho, Richard Eckart; Ide, Nancy; Kim, Jin-Dong; Klie, Jan-Christoph; Suderman, Keith
- Genomics & Informatics / v.17 no.2 / pp.19.1-19.10 / 2019
In this paper, we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes that are particularly problematic for, or amenable to, enabling seamless communication across different platforms. The study is conducted in the context of a specific annotation methodology, namely machine-assisted interactive annotation (also known as human-in-the-loop annotation). This methodology requires the ability to freely combine resources from different document repositories, access a wide array of NLP tools that automatically annotate corpora for various linguistic phenomena, and use a sophisticated annotation editor that enables interactive manual annotation coupled with on-the-fly machine learning. We consider three independently developed platforms, each of which utilizes a different model for representing annotations over text, and each of which performs a different role in the process.
https://doi.org/10.5808/GI.2019.17.2.e19

Facilitating Web Service Taxonomy Generation : An Artificial Neural Network based Framework, A Prototype Systems, and Evaluation (인공신경망 기반 웹서비스 분류체계 생성 프레임워크의 실증적 평가)

Hwang, You-Sub
- Journal of Intelligence and Information Systems / v.16 no.2 / pp.33-54 / 2010
The World Wide Web is transitioning from being a mere collection of documents that contain useful information toward providing a collection of services that perform useful tasks. The emerging Web service technology has been envisioned as the next technological wave and is expected to play an important role in this recent transformation of the Web. By providing interoperable interface standards for application-to-application communication, Web services can be combined with component-based software development to promote application interaction both within and across enterprises. To make Web services for service-oriented computing operational, it is important that Web service repositories not only be well-structured but also provide efficient tools for developers to find reusable Web service components that meet their needs. As the potential of Web services for service-oriented computing is being widely recognized, the demand for effective Web service discovery mechanisms is concomitantly growing. A number of public Web service repositories have been proposed, but Web service taxonomy generation has not been satisfactorily addressed. Unfortunately, most existing Web service taxonomies are either too rudimentary to be useful or too hard to maintain. In this paper, we propose a Web service taxonomy generation framework that combines artificial neural network based clustering techniques with descriptive label generation and leverages the semantics of the XML-based service specification in WSDL documents. We believe that this is one of the first attempts at applying data mining techniques in the Web service discovery domain. We have developed a prototype system based on the proposed framework using an unsupervised artificial neural network and empirically evaluated the proposed approach and tool using real Web service descriptions drawn from operational Web service repositories. We report on some preliminary results demonstrating the efficacy of the proposed approach.

Evaluation of Web Service Similarity Assessment Methods (웹서비스 유사성 평가 방법들의 실험적 평가)

Hwang, You-Sub
- Journal of Intelligence and Information Systems / v.15 no.4 / pp.1-22 / 2009
The World Wide Web is transitioning from being a mere collection of documents that contain useful information toward providing a collection of services that perform useful tasks. The emerging Web service technology has been envisioned as the next technological wave and is expected to play an important role in this recent transformation of the Web. By providing interoperable interface standards for application-to-application communication, Web services can be combined with component-based software development to promote application interaction and integration both within and across enterprises. To make Web services for service-oriented computing operational, it is important that Web service repositories not only be well-structured but also provide efficient tools for developers to find reusable Web service components that meet their needs. As the potential of Web services for service-oriented computing is being widely recognized, the demand for effective Web service discovery mechanisms is concomitantly growing. A number of techniques for Web service discovery have been proposed, but the discovery challenge has not been satisfactorily addressed. Unfortunately, most existing solutions are either too rudimentary to be useful or too domain-dependent to be generalizable. In this paper, we propose a Web service organizing framework that combines clustering techniques with string matching and leverages the semantics of the XML-based service specification in WSDL documents. We believe that this is one of the first attempts at applying data mining techniques in the Web service discovery domain. Our proposed approach has several appealing features: (1) it minimizes the requirement of prior knowledge from both service consumers and publishers; (2) it avoids exploiting domain-dependent ontologies; and (3) it is able to visualize the semantic relationships among Web services. We have developed a prototype system based on the proposed framework using an unsupervised artificial neural network and empirically evaluated the proposed approach and tool using real Web service descriptions drawn from operational Web service registries. We report on some preliminary results demonstrating the efficacy of the proposed approach.

Search Result 7, Processing Time 0.019 seconds

