• Title/Abstract/Keyword: object-oriented data model


Prediction & Assessment of Change Prone Classes Using Statistical & Machine Learning Techniques

  • Malhotra, Ruchika;Jangra, Ravi
    • Journal of Information Processing Systems
    • Vol. 13, No. 4
    • pp.778-804
    • 2017
  • Software today has become an inseparable part of our lives. To meet the ever-growing demands of customers, it has to evolve rapidly and incorporate a large number of changes. In this paper, our aim is to study the relationship between object-oriented metrics and the change-proneness attribute of a class. Prediction models based on this study can help identify the change-prone classes of a software system, so that testing effort can be focused on those classes to yield better-quality software. Previously, researchers have used statistical methods for predicting change-prone classes, but machine learning methods have rarely been used for this task. In our study, we evaluate and compare the performance of ten machine learning methods with a statistical method. The evaluation is based on two open source software systems developed in Java. We also validated the developed prediction models on another software data set in the same domain (3D modelling). The performance of the prediction models was evaluated using receiver operating characteristic (ROC) analysis. The results indicate that the machine learning methods are on par with the statistical method for predicting change-prone classes. A further analysis showed that the models constructed for one software system can also be used to predict the change-prone nature of classes in another system in the same domain. This study should help developers perform effective regression testing at low cost and effort, and design systems with fewer change-prone classes and hence better maintainability.
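
As a hedged illustration of the comparison described above, and not the authors' actual experimental setup, the sketch below fits a statistical model (logistic regression) and a machine-learning model (random forest) on synthetic object-oriented metrics and compares them by ROC analysis; the metric names, the toy data, and the choice of classifiers are assumptions for illustration only.

```python
# Hedged sketch: statistical vs. machine-learning prediction of change-prone
# classes from object-oriented metrics, evaluated with ROC analysis.
# The synthetic data and metric layout (WMC, CBO, RFC, LCOM style columns)
# are illustrative assumptions, not the paper's data sets.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 400                                              # classes in the toy system
X = rng.poisson(lam=[10, 6, 20, 8], size=(n, 4))     # WMC, CBO, RFC, LCOM style metrics
# toy ground truth: larger, more coupled classes are more likely to change
y = (X[:, 0] + X[:, 1] + rng.normal(scale=4, size=n) > 18).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

for name, model in [("logistic regression (statistical)", LogisticRegression(max_iter=1000)),
                    ("random forest (machine learning)", RandomForestClassifier(random_state=42))]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: ROC AUC = {auc:.3f}")            # area under the ROC curve
```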

A Store Recommendation Procedure in Ubiquitous Market for User Privacy

  • 김재경;채경희;구자철
    • Asia pacific journal of information systems
    • Vol. 18, No. 3
    • pp.123-145
    • 2008
  • Recently, as information and communication technology develops, the ubiquitous environment is being discussed from diverse perspectives. A ubiquitous environment is one in which data can be transferred through networks regardless of physical space, virtual space, time, or location. Realizing it requires Pervasive Sensing technology, which recognizes users' data without a border between physical and virtual space, as well as the latest technologies such as Context-Awareness, which constructs the context around the user by sharing data obtained through Pervasive Sensing, and linkage technology, which prevents information loss across wired and wireless networks and databases. In particular, Pervasive Sensing is regarded as an essential technology that enables user-oriented services by recognizing users' needs even before they ask. The technologies above give the ubiquitous environment a number of characteristics, such as ubiquity, abundance of data, mutuality, high information density, individualization, and customization. Among them, information density refers to the amount and quality of accessible information; through Pervasive Sensing, information is stored in bulk with assured quality. Using this, companies can provide personalized contents (or information) to target customers. Above all, a growing number of studies address recommender systems that provide what customers need even when they do not explicitly ask for it. Recommender systems are well known for their positive effect of enlarging selling opportunities and reducing customers' search costs, since they find and provide information in advance according to customers' traits and preferences in a commerce environment. Recommender systems have proved their usefulness through many methodologies and experiments across different fields since the mid-1990s. Most research on recommender systems so far has taken products or information in internet or mobile contexts as its object, but there is not enough research on recommending suitable stores to customers in a ubiquitous environment. In a ubiquitous environment it is possible to track customers' behavior, just as in an online market space, even when customers are purchasing in an offline marketplace. Unlike the existing internet space, in a ubiquitous environment there is growing interest in stores, that is, in providing information according to the traffic line of customers. In other words, the same product can be purchased in several different stores, and the preferred store can differ between customers according to personal preferences regarding traffic lines between stores, location, atmosphere, quality, and price. Krulwich (1997) developed Lifestyle Finder, which recommends a product and a store using demographic and purchasing information generated in internet commerce, and Fano (1998) created Shopper's Eye, an information-providing system.
Shopper's Eye shows information about the store closest to the customer's present location when the customer submits a to-buy list. Sadeh (2003) developed MyCampus, which recommends appropriate information and a store according to the schedule saved on a customer's mobile device. Keegan and O'Hare (2004) proposed EasiShop, which provides suitable store information, including price, after-sales service, and accessibility, after analyzing the customer's to-buy list and current location. However, Krulwich (1997) does not reflect the characteristics of physical space because it is based on an online commerce context, Keegan and O'Hare (2004) only provide information about stores related to a product, and Fano (1998) does not fully consider the relationship between store preference and the store itself. The most recent of these studies, Sadeh (2003), experimented on a campus with a recommender system that reflects situation and preference information in addition to the characteristics of the physical space. Yet there is a potential problem, since these studies rely on customers' location and preference information, which raises privacy concerns. According to Al-Muhtadi (2002), Beresford and Stajano (2003), and Ren (2006), the primary point of controversy in a ubiquitous environment is the invasion of privacy and personal information, and Srivastava (2000) notes that individuals want to remain anonymous to protect their personal information. Therefore, in this paper we propose a methodology for recommending stores in a U-market, based on the ubiquitous environment, that does not use personal information, in order to protect personal information and privacy. The main idea behind the proposed methodology is the Feature Matrices model (FM model; Shahabi and Banaei-Kashani, 2003), which uses clusters of customers' similar transaction data, similar to collaborative filtering. Unlike collaborative filtering, however, this methodology avoids the problems of personal information and privacy because it does not know exactly who the customer is. The methodology is compared with a single-trait model (vector model) such as visitor logs, to examine the actual improvement in recommendations when context information is used. Since real U-market data is hard to obtain, we experimented with actual data from a real department store augmented with context information. The recommendation procedure for the U-market proposed in this paper is divided into four major phases. The first phase collects and preprocesses data for analyzing customers' shopping patterns; the traits of shopping patterns are expressed as N-dimensional feature matrices. In the second phase, similar shopping patterns are grouped into clusters and the representative pattern of each cluster is derived, with the distance between shopping patterns calculated by the Projected Pure Euclidean Distance (Shahabi and Banaei-Kashani, 2003). The third phase finds a representative pattern similar to the target customer, while the customer's shopping information is traced and saved dynamically. In the fourth phase, the next store is recommended based on the physical distance between the stores in the representative pattern and the target customer's present location. We evaluated the accuracy of the recommendation method on actual data derived from a department store.
Because of the technological difficulty of real-time tracking, we extracted purchasing-related information and added context information to each transaction. As a result, recommendation based on the FM model, which uses both purchasing and context information, is more stable and accurate than that of the vector model. In addition, the recommendations became more precise as more shopping information accumulated. Realistically, because a full ubiquitous environment could not be realized, we were unable to reflect every kind of context, but a more detailed analysis is expected to become attainable once a practical system is built.
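
The clustering-and-representative-pattern idea in the four phases above can be sketched roughly as follows. This is a simplified stand-in, not the authors' FM model: it flattens shopping patterns into store-visit vectors, uses k-means with plain Euclidean distance instead of the Projected Pure Euclidean Distance, and ignores physical distance between stores; the store names and visit counts are toy assumptions.

```python
# Hedged sketch of a clustering-based store recommendation: group shopping
# patterns, find the representative pattern closest to the target customer's
# partial visit history, and suggest an unvisited store favored by that pattern.
import numpy as np
from sklearn.cluster import KMeans

stores = ["A", "B", "C", "D", "E"]
# rows = past customers, columns = visit counts per store (toy data)
visits = np.array([
    [3, 2, 0, 0, 1],
    [2, 3, 1, 0, 0],
    [0, 0, 4, 3, 1],
    [0, 1, 3, 4, 0],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(visits)
centroids = kmeans.cluster_centers_           # representative shopping patterns

def recommend_next(partial_visits):
    """Recommend the unvisited store favored by the nearest representative pattern."""
    cluster = int(np.argmin(np.linalg.norm(centroids - partial_visits, axis=1)))
    pattern = centroids[cluster]
    candidates = [i for i, v in enumerate(partial_visits) if v == 0]
    best = max(candidates, key=lambda i: pattern[i])
    return stores[best]

print(recommend_next(np.array([1, 2, 0, 0, 0])))  # -> "C" with this toy data
```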

Twitter Issue Tracking System by Topic Modeling Techniques

  • 배정환;한남기;송민
    • 지능정보연구
    • Vol. 20, No. 2
    • pp.109-122
    • 2014
  • Today we generate enormous amounts of data on social network services (SNS). In particular, the combination of mobile devices and SNS produces volumes of data incomparable to the past and has a large social impact. If the issues that people talk about most can be identified within this vast body of SNS data, that information can serve as an important source of new value creation across society. To meet this demand for SNS big-data analysis, this study designs and builds TITS (Twitter Issue Tracking System), which uses Twitter data to extract the issues that arose on Twitter and visualizes them on the web. TITS provides 1) sets of topic keywords ranked by day, 2) a daily time-series graph of each topic over one month, 3) a treemap of topic importance based on score and frequency, and 4) a daily time-series graph over one month for any keyword entered through keyword search. First, the study analyzes big data generated in real time on SNS using the open-source tools Hadoop and MongoDB, presenting a highly relevant methodology at a time when real-time processing of big data is becoming ever more important. Second, topic modeling techniques used in library and information science as well as many other research areas were applied to real Twitter data, confirming their usefulness for storytelling and time-series analysis. Third, based on the experiments, the system was implemented as a practically usable system through visualization and a web front end. The main significance of this work is that it presents a practical way to mine social trends generated on social media and provide meaningful information through data analysis. The experiments use roughly 150 million Korean tweets from March 2013 in JSON (JavaScript Object Notation) file format.
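
For readers unfamiliar with the topic-extraction step behind a system like TITS, the sketch below fits LDA on a handful of toy tweets and prints the top keywords per topic. The example tweets, the number of topics, and the use of scikit-learn instead of the paper's Hadoop/MongoDB pipeline are illustrative assumptions.

```python
# Hedged sketch of daily topic extraction with LDA: build a tweet-term matrix,
# fit a topic model, and list the top keywords per topic (toy data only).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

tweets = [
    "new phone release battery camera review",
    "phone camera low light photos impressive",
    "election debate policy candidates tonight",
    "candidates argue economy policy in debate",
    "heavy rain flood warning downtown traffic",
    "traffic jam downtown after flood warning",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(tweets)                   # tweet-term count matrix

lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)
terms = vectorizer.get_feature_names_out()

for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {k}: {', '.join(top)}")              # daily topic keyword sets
```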

A Study for BIM based Evaluation and Process for Architectural Design Competition - Case Study of Domestic and International BIM-based Competition

  • 박승화;홍창희
    • 한국산학기술학회논문지
    • Vol. 18, No. 2
    • pp.23-30
    • 2017
  • In the construction industry, Building Information Modeling (BIM) is not merely a tool for supporting design more easily and producing three-dimensional models: it is structured so that diverse data across the construction life cycle, from design and construction to maintenance and demolition, can be entered and managed. Because such a structured model enables various kinds of analysis and interpretation, the role of BIM keeps growing. Beginning with the design competition for the National Museum of Art at Vestbanen in Norway, efforts to apply BIM to design competitions and their evaluation have been made on many fronts at home and abroad. Unlike conventional design competitions, a BIM-based competition allows automatic, reliable, and efficient evaluation through comprehensive simulation. This requires modeling guides for each BIM tool and, in turn, an evaluation system that can verify compliance with them. By comparing and analyzing previous design competition cases such as the Korea Power Exchange headquarters and describing improvements to them, this paper proposes a direction for evaluating future BIM-based architectural design competitions using BIM technology, which is driving a new paradigm in the construction industry.
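
As a rough, hypothetical illustration of the automated, rule-based checking that such a BIM-based competition evaluation could enable (not a procedure described in the paper), the sketch below tests quantities taken off an entry's model against requirements from a competition brief; the field names and threshold values are invented for illustration.

```python
# Hedged sketch: rule-based evaluation of model-derived quantities for a design
# competition entry. Requirements, fields, and values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class DesignEntry:
    name: str
    gross_floor_area: float   # m2, taken off the BIM model
    exhibition_area: float    # m2
    storey_count: int

REQUIREMENTS = {
    "gross_floor_area": lambda v: v <= 54000,      # maximum allowed GFA
    "exhibition_area":  lambda v: v >= 10000,      # minimum exhibition area
    "storey_count":     lambda v: v <= 8,          # height restriction proxy
}

def evaluate(entry: DesignEntry) -> dict:
    """Return pass/fail per rule so results are comparable across entries."""
    return {rule: check(getattr(entry, rule)) for rule, check in REQUIREMENTS.items()}

entry = DesignEntry("Entry-01", gross_floor_area=52300, exhibition_area=11200, storey_count=7)
print(evaluate(entry))   # e.g. {'gross_floor_area': True, 'exhibition_area': True, 'storey_count': True}
```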

A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach

  • 노상규;박현정;박진수
    • Asia pacific journal of information systems
    • Vol. 17, No. 4
    • pp.31-59
    • 2007
  • We frequently use search engines to find relevant information on the Web but still end up with too much information. In order to solve this problem of information overload, ranking algorithms have been applied to various domains. As more information becomes available, ranking search results effectively and efficiently will become even more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated based on the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if it is referred to by many other pages, and its importance increases further if the importance of the referring pages is high. Kleinberg's algorithm is another link-structure-based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm uses two kinds of scores: the authority score and the hub score. If a page has a high authority score, it is an authority on a given topic and many pages refer to it; a page with a high hub score links to many authoritative pages. As mentioned above, link-structure-based ranking has played an essential role in the World Wide Web (WWW), and its effectiveness and efficiency are now widely recognized. Meanwhile, as the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, which makes ranking algorithms for RDF knowledge bases highly important. The RDF graph consists of nodes and directed links, similar to the Web graph, so link-structure-based ranking seems highly applicable to ranking Semantic Web resources. However, the information space of the Semantic Web is more complex than that of the WWW. For instance, the WWW can be considered one huge class, i.e., a collection of Web pages, with only a single recursive property, a 'refers to' property corresponding to hyperlinks. The Semantic Web, by contrast, encompasses various kinds of classes and properties, so ranking methods used in the WWW should be modified to reflect the complexity of its information space. Previous research addressed the problem of ranking query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm in order to rank Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, corresponding to the authority score and the hub score of Kleinberg's algorithm, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of one resource on another depending on the characteristics of the property linking the two resources. A node with a high objectivity score is the object of many RDF triples, and a node with a high subjectivity score is the subject of many RDF triples. They developed several Semantic Web systems to validate their technique and reported experimental results verifying its applicability to the Semantic Web. Despite their efforts, however, some limitations remained, which they reported in their paper.
First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, the ratio of links to nodes should be high, or resources overall should be described in sufficient detail, for their algorithm to work properly. Second, the Tightly-Knit Community (TKC) effect, the phenomenon that pages which are less important but densely connected obtain higher scores than pages which are more important but sparsely connected, remains problematic. Third, a resource may receive a high score not because it is actually important, but simply because it is very common and consequently has many links pointing to it. In this paper, we examine these ranking problems from a novel perspective and propose a new algorithm that resolves the problems of the previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach adopted by previous research, under our approach a user determines the weight of a property by comparing its significance relative to the other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are meant to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. It closely reflects the way people evaluate things in the real world and turns out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm resolves the TKC effect and also sheds light on the other limitations noted above. In addition, we propose two ways to incorporate data-type properties, which previous methods have not employed even when they have some significance for resource importance. We designed an experiment to show the effectiveness of the proposed algorithm and the validity of the ranking results, which had not been attempted in previous research. We also conducted a comprehensive mathematical analysis, overlooked in previous research, which enabled us to simplify the calculation procedure. Finally, we summarize our experimental results and discuss further research issues.
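
To make the link-analysis vocabulary above concrete, the sketch below runs a HITS-style iteration over a few toy RDF triples, with per-property weights standing in for the class-oriented weighting the paper proposes; the triples, weights, and plain power iteration are illustrative assumptions rather than the authors' algorithm.

```python
# Hedged sketch: weighted HITS-style ranking over RDF triples, where the
# authority/hub scores play the role of the objectivity/subjectivity scores
# discussed above. Triples and property weights are toy assumptions.
import numpy as np

triples = [  # (subject, property, object)
    ("paperA", "cites", "paperB"),
    ("paperC", "cites", "paperB"),
    ("paperA", "publishedIn", "journalX"),
    ("paperB", "publishedIn", "journalX"),
]
weights = {"cites": 1.0, "publishedIn": 0.3}   # user-chosen per-property weights

nodes = sorted({n for s, _, o in triples for n in (s, o)})
idx = {n: i for i, n in enumerate(nodes)}
W = np.zeros((len(nodes), len(nodes)))
for s, p, o in triples:
    W[idx[s], idx[o]] += weights[p]            # weighted directed link s -> o

hub = np.ones(len(nodes))
for _ in range(50):                            # HITS-style power iteration
    auth = W.T @ hub                           # "objectivity": pointed to by strong hubs
    hub = W @ auth                             # "subjectivity": points at strong authorities
    auth /= np.linalg.norm(auth)
    hub /= np.linalg.norm(hub)

for n in sorted(nodes, key=lambda n: -auth[idx[n]]):
    print(f"{n}: objectivity={auth[idx[n]]:.3f}")
```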

A Framework on 3D Object-Based Construction Information Management System for Work Productivity Analysis for Reinforced Concrete Work

  • 김준;차희성
    • 한국건설관리학회논문집
    • Vol. 19, No. 2
    • pp.15-24
    • 2018
  • Despite the recognized need for productivity information and the importance of using it in the Korean construction industry, productivity information is not managed with effective feedback when construction is planned at the site level. Effective use of productivity information is required to improve the reliability of construction planning, but on site, planning still relies on the experience and intuition of managers. The reasons productivity information is not managed effectively were found to include the difficulty of assigning additional staff to manage it and the difficulty of applying existing productivity information to new projects. To address these problems, a literature review and interviews were conducted, which confirmed the need for a new system. The requirements for the new system include minimal additional workload, a limited scope of information management, classification of information, feedback of information, and consideration of productivity-reducing factors. Based on these requirements, this study proposes a framework for a system that can manage productivity information together with productivity-reducing factors. The system uses the highly accessible SketchUp software so that the additional staffing and workload required to operate it are minimized. The proposed system handles information through four stages: preparation, input, processing, and output. The entered construction information is classified through a Task Breakdown Structure (TBS) and a Material Breakdown Structure (MBS), organized with reference to the standard specification for building construction, and converted into productivity information; the converted information is then displayed graphically so that users can draw on the productivity information of the site. The proposed productivity information management system was applied to the construction site of the K office building and validated in terms of field applicability and usefulness of the information, with very positive results for usability and applicability and expected benefits from using the information. With this system, construction planning based on productivity information becomes possible, and as information continues to accumulate, the expected benefits of this study will grow further.
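
A minimal sketch of the classification step described above, under assumed codes and fields rather than the paper's actual breakdown structures: a daily work record is mapped through TBS and MBS entries and converted into a productivity figure together with its productivity-reducing factors.

```python
# Hedged sketch: classify a daily work record via task/material breakdown
# structures and derive a productivity value (quantity per labor-hour).
# The codes, units, and record fields are illustrative assumptions.
TBS = {"RC-FORM": "Formwork installation", "RC-REBAR": "Rebar placement", "RC-POUR": "Concrete pouring"}
MBS = {"FRM-EURO": "Euro-form panel", "RB-D16": "Rebar D16", "CON-24": "Ready-mixed concrete 24 MPa"}

def to_productivity(record: dict) -> dict:
    """Convert one daily work record into a classified productivity entry."""
    return {
        "task": TBS[record["tbs_code"]],
        "material": MBS[record["mbs_code"]],
        "productivity": record["quantity"] / record["labor_hours"],  # quantity per labor-hour
        "delay_factors": record.get("delay_factors", []),            # productivity-reducing factors
    }

record = {"tbs_code": "RC-FORM", "mbs_code": "FRM-EURO",
          "quantity": 120.0, "labor_hours": 48.0, "delay_factors": ["rain"]}
print(to_productivity(record))   # {'task': 'Formwork installation', ..., 'productivity': 2.5, ...}
```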

The Effects of Online Service Quality on Consumer Satisfaction and Loyalty Intention - About Booking and Issuing Air Tickets on Website -

  • 박종기;고도은;이승창
    • 한국유통학회지:유통연구
    • Vol. 15, No. 3
    • pp.71-110
    • 2010
  • This study measures not only the service quality of websites for booking and issuing air tickets but also their service recovery, and empirically examines how service quality and service recovery affect customer satisfaction and loyalty intention. To measure online service quality and online service recovery, we used the E-S-QUAL and E-RecS-QUAL scales developed by Parasuraman, Zeithaml, & Malhotra (2005). E-S-QUAL measures online service quality with 22 items across four dimensions: efficiency, system availability, fulfillment, and privacy. E-RecS-QUAL measures online service recovery with 11 items across three dimensions: responsiveness, compensation, and contact. The survey for the empirical analysis targeted consumers who had purchased domestic or international air tickets through airline or travel agency websites; 400 questionnaires were collected, of which 342 were used in the final analysis. AMOS 7.0 and SPSS 15.0 were used for the analysis. First, hypothesis tests in SPSS 15.0 using regression on factor scores supported all of Hypotheses I-1, 2, 3, 4, II-1, 2, 3, III-1, and IV-1: every dimension of online service quality and online service recovery had a significant effect on overall service quality, overall service quality had a significant effect on customer satisfaction, and customer satisfaction in turn had a significant effect on loyalty intention. Next, the model was analyzed with AMOS 7.0; the model fit was adequate for hypothesis testing. Based on this model, Hypotheses I-1, 3, II-1, 3, III-1, and IV-1 were supported, while Hypotheses I-2, 4, and II-2 were rejected. This result suggests, as Parasuraman et al. (2005) argued, that regression analysis using factor scores is better suited to representing E-S-QUAL. Implications of the study are discussed on this basis.
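
As a hedged sketch of the "regression on factor scores" step reported above, using simulated responses and open-source tools rather than the study's SPSS/AMOS workflow, the code below extracts factor scores from multi-item survey responses and regresses an overall-quality outcome on them.

```python
# Hedged sketch: factor scores from multi-item responses, then regression of an
# overall-quality outcome on those scores. The simulated data and the 4-factor /
# 12-item layout are illustrative assumptions, not the study's measurement model.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 342                                          # analyzed sample size reported above
latent = rng.normal(size=(n, 4))                 # 4 latent quality dimensions (toy)
items = latent.repeat(3, axis=1) + 0.5 * rng.normal(size=(n, 12))   # 12 survey items
overall_quality = latent @ np.array([0.5, 0.3, 0.4, 0.2]) + 0.3 * rng.normal(size=n)

scores = FactorAnalysis(n_components=4, random_state=0).fit_transform(items)
reg = LinearRegression().fit(scores, overall_quality)

print("R^2 on factor scores:", round(reg.score(scores, overall_quality), 3))
print("coefficients:", np.round(reg.coef_, 3))   # one coefficient per extracted factor
```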
