• Title/Summary/Keyword: Language Processing

A Technique to Recommend Appropriate Developers for Reported Bugs Based on Term Similarity and Bug Resolution History (개발자 별 버그 해결 유형을 고려한 자동적 개발자 추천 접근법)

  • Park, Seong Hun;Kim, Jung Il;Lee, Eun Joo
    • KIPS Transactions on Software and Data Engineering / v.3 no.12 / pp.511-522 / 2014
  • During software development, a variety of bugs are reported. Bug tracking systems such as Bugzilla, MantisBT, Trac, and JIRA are used to manage reported bug information in many open source development projects. Bug reports in a bug tracking system are triaged to manage bugs and to determine the developer responsible for resolving each report. As software grows in size and bug reports tend to be duplicated, bug triage becomes increasingly complex and difficult. In this paper, we present an approach to assigning bug reports to appropriate developers, which is a main part of the bug triage task. First, words that appear in resolved bug reports are classified by developer. Second, words in newly submitted bug reports are selected. After these two steps, vectors whose items are the selected words are generated. Third, the TF-IDF (term frequency-inverse document frequency) of each selected word is computed and used as the weight of the corresponding vector item. Finally, developers are recommended based on the similarity between each developer's word vector and the vector of the new bug report. We conducted an experiment on the Eclipse JDT and CDT projects to show the applicability of the proposed approach. We also compared the proposed approach with an existing study based on machine learning. The experimental results show that the proposed approach is superior to the existing method.
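
The steps in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the use of cosine similarity as the similarity measure, and the names `recommend`, `history`, and `new_report` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as {term: weight} dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(history, new_report, top_k=3):
    """Rank developers for a new bug report.

    history: dict mapping developer name -> list of resolved bug report texts.
    new_report: text of the newly reported bug.
    """
    # Step 1: collect the words of resolved reports per developer.
    dev_tokens = {dev: tokenize(" ".join(reports)) for dev, reports in history.items()}

    # Document frequency: in how many developers' vocabularies each word occurs.
    n_devs = len(dev_tokens)
    df = Counter()
    for tokens in dev_tokens.values():
        df.update(set(tokens))

    def weigh(tokens):
        # Step 3: TF-IDF weight per word; words never seen in history are dropped.
        # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
        tf = Counter(t for t in tokens if t in df)
        return {t: c * (math.log((1 + n_devs) / (1 + df[t])) + 1.0) for t, c in tf.items()}

    # Step 2: select the words of the new report and build its vector.
    query = weigh(tokenize(new_report))

    # Final step: rank developers by similarity to the new report.
    scores = {dev: cosine(weigh(tokens), query) for dev, tokens in dev_tokens.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the Eclipse JDT or CDT projects.
    history = {
        "alice": ["null pointer exception in parser", "parser crashes on empty input"],
        "bob": ["button layout broken in dialog", "dialog font rendering wrong"],
    }
    print(recommend(history, "parser throws exception on input"))
```

In this toy setup the developer whose past reports share the most heavily weighted words with the new report is ranked first, and a developer with no shared words receives a score of zero.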

Design for Database Retrieval System using Virtual Database in Intranet (인트라넷에서 가상데이터베이스를이용한 데이터베이스 검색 시스템의 설계)

  • Lee, Dong-Wook;Park, Young-Bae
    • The Transactions of the Korea Information Processing Society / v.5 no.6 / pp.1404-1417 / 1998
  • Currently, there exist two different methods for database retrieval on the Internet. The first is to use a search engine, and the second is to use plug-in or ActiveX technology. If a search engine, which searches using indices built from keywords of simple text data, is used to access a database, then first, it is not possible to access more than one database at a time; second, it is not possible to support various conditional retrievals as with a query language; and third, the set of data received might include many unwanted data, in other words, the precision rate might be relatively low. Plug-in or ActiveX technology makes use of the Web browser to execute clients' queries for database retrieval. The problems associated with this are, first, that it is not possible to activate more than one DBMS simultaneously even if they are of the same data model, and second, that it is not possible to execute a user query other than the ones previously defined by the client program. In this paper, to resolve the aforementioned problems, we design and implement a database retrieval system using a virtual database, which makes it possible to provide a direct query interface through a conventional Web browser. We assume that the virtual database is designed and aggregated from more than one relational database using the same data model.

Unstructured Data Analysis using Equipment Check Ledger: A Case Study in Telecom Domain (장비점검 일지의 비정형 데이터분석을 통한 고장 대응 효율화 사례 연구)

  • Ju, Yeonjin;Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Internet Computing and Services / v.21 no.1 / pp.127-135 / 2020
  • As the importance of using and analyzing big data grows, there is increasing interest in natural language processing techniques for unstructured data such as news articles and comments. In particular, as the collection of big data becomes possible, data mining techniques capable of pre-processing and analyzing data are emerging. In this case study with a telecom company, we propose a methodology for formalizing unstructured data using text mining. The domain is equipment failure, and the data consist of about 2.2 million equipment check ledger records. About 800,000 equipment failure records per year are accumulated in the equipment check ledger. The equipment check ledger contains both formal and unstructured data. Although formal data can easily be used for analysis, unstructured data is difficult to use immediately. However, unstructured data is highly likely to contain important information, because it can include content that is not written in the formal data. Therefore, in this study, we develop a digital transformation method for the unstructured data in the equipment check ledger.

Design and Implementation of National Language Ability Test System using Korean Style Internet-Based Test added Middle-Server (미들서버방식 한국형 IBT를 이용한 국가언어능력평가 시스템의 설계 및 구현)

  • Chang, Young-Hyun;Park, Dea-Woo
    • Journal of the Korea Society of Computer and Information / v.16 no.9 / pp.185-192 / 2011
  • The purpose of this paper is to propose the design and implementation of a Korean-style Internet-based test system that uses a middle server for efficiency and stability. The current assessment system has some unstable elements with regard to transmission procedure, cost, system load, and stability. This paper proposes a series of activities for improving the performance of the Korean-style Internet-based test system, which ultimately produced excellent results in expense control, human resources, and special operational affairs. The technological factors of the proposed middle-server system have been tested through a basic simulation pilot system. The actual development procedure starts from the analysis required to improve on the shortcomings of existing Internet-based test systems. An efficiency comparison between the existing system and the newly developed system was made in terms of the number of operators, abnormal processing, and system maintenance. The Korean-style Internet-based test system using a middle server showed an efficiency gain of up to two times in processing effectiveness for various parts. It has also received good evaluations for its convenience of use and for its management system for operators and supervisors.

Explanation of mushroom academic terminology (버섯 학술 용어 해설)

  • Lee, Jae-Sung;Sung, Jae-Mo;Kim, Yang-Sub;Chai, Jung-Ki;Yoo, Young-Bok;Yu, Seung-Hun;Cha, Jae-Soon;Lee, Hyun-Sook;Lee, Jae-Dong;Lee, Jong-Soo;Bak, Won-Cheol;Koo, Chang-Duck;Seok, Soon-Ja;Kim, Young-Gab;Cha, Byeong-Jin;Chang, Hyun-Yoo
    • Journal of Mushroom / v.4 no.4 / pp.144-213 / 2006
  • Mushroom production has reached 1,000 billion won in monetary value in Korea. However, no systematic terminology dictionary has yet been published. Recently, new varieties of medicinal mushrooms, in addition to culinary mushrooms, are being introduced steadily throughout the world. This creates a need for a coordinated and consistent arrangement of the terms involved in the culture, cultivation, and physiological aspects of mushrooms. The various components related to medicinal and physiological functionality also pose ambiguity in terminology, as do the terms used in breeding and genetic research. Moreover, some of the scientific terms are being used erroneously. To help mushroom cultivators, students, and mushroom business personnel understand the terms of mushroom science and technology, we set out to collect and organize all the terms related to mushroom morphology and cultivation, poison and medicinal functionality, processing and utilization, and so on. Thirteen professionals from each field participated in this project. The fields included here are: 1) Genetics and breeding of mushrooms, 2) Cultivation and physiology of mushrooms, 3) Taxonomy and ecology of mushrooms, 4) Processing and functional components, 5) Blight and insects of mushrooms.

A Korean Community-based Question Answering System Using Multiple Machine Learning Methods (다중 기계학습 방법을 이용한 한국어 커뮤니티 기반 질의-응답 시스템)

  • Kwon, Sunjae;Kim, Juae;Kang, Sangwoo;Seo, Jungyun
    • Journal of KIISE / v.43 no.10 / pp.1085-1093 / 2016
  • A community-based question answering system is a system that provides answers to questions from documents uploaded to web communities. To enhance the capacity of question analysis, former methods have developed specific rules suitable for a target region or have applied machine learning to partial processes. However, these methods incur an excessive cost when expanding to new fields or lead to cases in which the system is overfitted to a specific field. This paper proposes a multiple machine learning method that automates the overall process by applying an appropriate machine learning technique in each procedure for efficient processing in a community-based question answering system. The system can be divided into a question analysis part and an answer selection part. The question analysis part consists of the question focus extractor, which analyzes the focused phrases in questions and uses conditional random fields, and the question type classifier, which classifies the topics of questions and uses a support vector machine. In the answer selection part, we train the weights used by the similarity estimation models through an artificial neural network. Also, there are a number of cases in which the results of morphological analysis are not reliable for data uploaded to web communities. Therefore, we suggest a method that minimizes the impact of morphological analysis by using character features in the question analysis stage. The proposed system outperforms the former system, achieving a Mean Average Precision of 0.765 and an R-Precision of 0.872.
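
For readers unfamiliar with the two reported metrics, the following sketch shows the standard textbook definitions of Mean Average Precision and R-Precision. It is not taken from the paper; the function names and the toy ranking are hypothetical, and the paper's exact evaluation protocol is not described in the abstract.

```python
def average_precision(ranked, relevant):
    # Mean of precision@k over every rank k at which a relevant item appears,
    # divided by the total number of relevant items.
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    # runs: list of (ranked_list, relevant_set) pairs, one per question.
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def r_precision(ranked, relevant):
    # Precision within the top R results, where R is the number of relevant items.
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for item in ranked[:r] if item in relevant) / r


if __name__ == "__main__":
    # Hypothetical ranking of four candidate answers, two of which are relevant.
    ranked = ["a", "b", "c", "d"]
    relevant = {"a", "c"}
    print(average_precision(ranked, relevant))
    print(r_precision(ranked, relevant))
```

Both metrics range from 0 to 1, and higher values mean that relevant answers appear nearer the top of the ranking.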

Work Hours and Cognitive Function: The Multi-Ethnic Study of Atherosclerosis

  • Charles, Luenda E.;Fekedulegn, Desta;Burchfiel, Cecil M.;Fujishiro, Kaori;Hazzouri, Adina Zeki Al;Fitzpatrick, Annette L.;Rapp, Stephen R.
    • Safety and Health at Work / v.11 no.2 / pp.178-186 / 2020
  • Background: Cognitive impairment is a public health burden. Our objective was to investigate associations between work hours and cognitive function. Methods: Multi-Ethnic Study of Atherosclerosis (MESA) participants (n = 2,497; 50.7% men; age range 44-84 years) reported hours per week worked in all jobs in Exams 1 (2000-2002), 2 (2002-2004), 3 (2004-2005), and 5 (2010-2011). Cognitive function was assessed (Exam 5) using the Cognitive Abilities Screening Instrument (version 2), a measure of global cognitive functioning; the Digit Symbol Coding, a measure of processing speed; and the Digit Span test, a measure of attention and working memory. We used a prospective approach and linear regression to assess associations for every 10 hours of work. Results: Among all participants, associations of hours worked with cognitive function of any type were not statistically significant. In occupation-stratified analyses (interaction p = 0.051), longer work hours were associated with poorer global cognitive function among Sales/Office and blue-collar workers, after adjustment for age, sex, physical activity, body mass index, race/ethnicity, educational level, annual income, history of heart attack, diabetes, apolipoprotein E-epsilon 4 allele (ApoE4) status, birth-place, number of years in the United States, language spoken at MESA Exam 1, and work hours at Exam 5 (β = -0.55, 95% CI = -0.99, -0.09) and (β = -0.80, -1.51, -0.09), respectively. In occupation-stratified analyses (interaction p = 0.040), we also observed an inverse association with processing speed among blue-collar workers (adjusted β = -0.80, -1.52, -0.07). Sex, race/ethnicity, and ApoE4 did not significantly modify associations between work hours and cognitive function. Conclusion: Weak inverse associations were observed between work hours and cognitive function among Sales/Office and blue-collar workers.

An Intra Prediction Hardware Design for High Performance HEVC Encoder (고성능 HEVC 부호기를 위한 화면내 예측 하드웨어 설계)

  • Park, Seung-yong;Guard, Kanda;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.875-878 / 2015
  • In this paper, we propose an intra prediction hardware architecture with less processing time, fewer computations, and reduced hardware area for a high performance HEVC encoder. The proposed intra prediction hardware architecture uses common operation units to reduce computational complexity and uses a $4 \times 4$ block unit to reduce hardware area. To reduce operation time, the common operation unit uses one operation unit to generate predicted pixels and filtered pixels in all prediction modes. The intra prediction hardware architecture introduces $4 \times 4$ PU design processing to reduce the hardware area and uses internal registers to support $32 \times 32$ PU processing. The proposed hardware architecture uses ten common operation units, which can reduce the execution cycles of intra prediction. The proposed intra prediction hardware architecture is designed using Verilog HDL (Hardware Description Language) and has a total of 41.5k gates in the TSMC $0.13\,\mu$m CMOS standard cell library. At 150 MHz, it can support 4K UHD video encoding at 30 fps in real time, and it operates at a maximum of 200 MHz.

Development of C++ Compiler and Programming Environment (C++컴파일러 및 프로그래밍 환경 개발)

  • Jang, Cheon-Hyeon;O, Se-Man
    • The Transactions of the Korea Information Processing Society / v.4 no.3 / pp.831-845 / 1997
  • In this paper, we propose and develop a compiler and interactive programming environments for C++, which is the most noteworthy of the object-oriented languages. To develop the compiler for C++, we adopted a front-end/back-end model using the EM virtual machine. In developing the front-end, we formalized the C++ grammar with context-sensitive tokens, which must be handled by the lexical scanner, and designed an AST class library consisting of a hierarchy of AST node classes with well-defined interfaces among them. In developing the back-end, we proposed a model for three major components: the code optimizer, the code generator, and the run-time environment. We emphasized a retargetable back-end that can be systematically reconfigured to generate code for a variety of distinct target computers. We also developed a tree pattern matching algorithm and implemented a target code generator that produces SPARC code. We also proposed the theory and model for constructing interactive programming environments. To represent language features, we adopt the AST as the internal representation and propose an incremental analysis algorithm and visual diagrams. We also studied an unparsing scheme, visual diagrams, and a graphical user interface to generate interactive environments automatically. The results of our research will be very useful for developing compilers and programming environments, and can also be used in compilers for parallel and distributed environments.

A Formal Modeling of Managed Object Behaviour with Dynamic Temporal Properties (동적 시간지원 특성을 지원하는 망관리 객체의 정형적 모델링)

  • Choi, Eun-Bok;Lee, Hyung-Hyo;Noh, Bong-Nam
    • The Transactions of the Korea Information Processing Society / v.7 no.1 / pp.166-180 / 2000
  • Recommendations of ITU-T and ISO stipulate the managerial abstraction of static and dynamic characteristics of network elements, management functions, and the management communication protocol. The current recommendations provide a formal mechanism for the structural parts of managed objects, such as managed object classes and attributes. However, because the current description method specifies the behavioral characteristics of managed objects in natural language rather than through a clear formal mechanism, managed objects are not completely specified. Also, the behaviour of managed objects is affected by their temporal and active properties. While temporal properties representing periodic or repetitive intervals describe managed object behaviour in a rather strict way, the description becomes more powerful if more dynamic temporal properties, determined by external conditions, are added to managed objects. In this paper, we added dynamic features to scheduling managed objects and described, in GDMO, scheduling managed objects that support dynamic features. We also described the behaviour of managed objects in a newly defined BDL that has dynamic temporal properties. This paper showed that dynamic temporal managed objects provide a systematic and formal method in the agent management function model.
