토픽 모델링을 이용한 트위터 이슈 트래킹 시스템 (Twitter Issue Tracking System by Topic Modeling Techniques)
-
- 지능정보연구
- /
- 제20권2호
- /
- pp.109-122
- /
- 2014
현재 우리는 소셜 네트워크 서비스(Social Network Service, 이하 SNS) 상에서 수많은 데이터를 만들어 내고 있다. 특히, 모바일 기기와 SNS의 결합은 과거와는 비교할 수 없는 대량의 데이터를 생성하면서 사회적으로도 큰 영향을 미치고 있다. 이렇게 방대한 SNS 데이터 안에서 사람들이 많이 이야기하는 이슈를 찾아낼 수 있다면 이 정보는 사회 전반에 걸쳐 새로운 가치 창출을 위한 중요한 원천으로 활용될 수 있다. 본 연구는 이러한 SNS 빅데이터 분석에 대한 요구에 부응하기 위해, 트위터 데이터를 활용하여 트위터 상에서 어떤 이슈가 있었는지 추출하고 이를 웹 상에서 시각화 하는 트위터이슈 트래킹 시스템 TITS(Twitter Issue Tracking System)를 설계하고 구축 하였다. TITS는 1) 일별 순위에 따른 토픽 키워드 집합 제공 2) 토픽의 한달 간 일별 시계열 그래프 시각화 3) 토픽으로서의 중요도를 점수와 빈도수에 따라 Treemap으로 제공 4) 키워드 검색을 통한 키워드의 한달 간 일별 시계열 그래프 시각화의 기능을 갖는다. 본 연구는 SNS 상에서 실시간으로 발생하는 빅데이터를 Open Source인 Hadoop과 MongoDB를 활용하여 분석하였고, 이는 빅데이터의 실시간 처리가 점점 중요해지고 있는 현재 매우 주요한 방법론을 제시한다. 둘째, 문헌정보학 분야뿐만 아니라 다양한 연구 영역에서 사용하고 있는 토픽 모델링 기법을 실제 트위터 데이터에 적용하여 스토리텔링과 시계열 분석 측면에서 유용성을 확인할 수 있었다. 셋째, 연구 실험을 바탕으로 시각화와 웹 시스템 구축을 통해 실제 사용 가능한 시스템으로 구현하였다. 이를 통해 소셜미디어에서 생성되는 사회적 트렌드를 마이닝하여 데이터 분석을 통한 의미 있는 정보를 제공하는 실제적인 방법을 제시할 수 있었다는 점에서 주요한 의의를 갖는다. 본 연구는 JSON(JavaScript Object Notation) 파일 포맷의 1억 5천만개 가량의 2013년 3월 한국어 트위터 데이터를 실험 대상으로 한다.
Personalized services directly and indirectly acquire personal data, in part, to provide customers with higher-value services that are specifically context-relevant (such as place and time). Information technologies continue to mature and develop, providing greatly improved performance. Sensory networks and intelligent software can now obtain context data, and that is the cornerstone for providing personalized, context-specific services. Yet, the danger of overflowing personal information is increasing because the data retrieved by the sensors usually contains privacy information. Various technical characteristics of context-aware applications have more troubling implications for information privacy. In parallel with increasing use of context for service personalization, information privacy concerns have also increased such as an unrestricted availability of context information. Those privacy concerns are consistently regarded as a critical issue facing context-aware personalized service success. The entire field of information privacy is growing as an important area of research, with many new definitions and terminologies, because of a need for a better understanding of information privacy concepts. Especially, it requires that the factors of information privacy should be revised according to the characteristics of new technologies. However, previous information privacy factors of context-aware applications have at least two shortcomings. First, there has been little overview of the technology characteristics of context-aware computing. Existing studies have only focused on a small subset of the technical characteristics of context-aware computing. Therefore, there has not been a mutually exclusive set of factors that uniquely and completely describe information privacy on context-aware applications. Second, user survey has been widely used to identify factors of information privacy in most studies despite the limitation of users' knowledge and experiences about context-aware computing technology. To date, since context-aware services have not been widely deployed on a commercial scale yet, only very few people have prior experiences with context-aware personalized services. It is difficult to build users' knowledge about context-aware technology even by increasing their understanding in various ways: scenarios, pictures, flash animation, etc. Nevertheless, conducting a survey, assuming that the participants have sufficient experience or understanding about the technologies shown in the survey, may not be absolutely valid. Moreover, some surveys are based solely on simplifying and hence unrealistic assumptions (e.g., they only consider location information as a context data). A better understanding of information privacy concern in context-aware personalized services is highly needed. Hence, the purpose of this paper is to identify a generic set of factors for elemental information privacy concern in context-aware personalized services and to develop a rank-order list of information privacy concern factors. We consider overall technology characteristics to establish a mutually exclusive set of factors. A Delphi survey, a rigorous data collection method, was deployed to obtain a reliable opinion from the experts and to produce a rank-order list. It, therefore, lends itself well to obtaining a set of universal factors of information privacy concern and its priority. An international panel of researchers and practitioners who have the expertise in privacy and context-aware system fields were involved in our research. Delphi rounds formatting will faithfully follow the procedure for the Delphi study proposed by Okoli and Pawlowski. This will involve three general rounds: (1) brainstorming for important factors; (2) narrowing down the original list to the most important ones; and (3) ranking the list of important factors. For this round only, experts were treated as individuals, not panels. Adapted from Okoli and Pawlowski, we outlined the process of administrating the study. We performed three rounds. In the first and second rounds of the Delphi questionnaire, we gathered a set of exclusive factors for information privacy concern in context-aware personalized services. The respondents were asked to provide at least five main factors for the most appropriate understanding of the information privacy concern in the first round. To do so, some of the main factors found in the literature were presented to the participants. The second round of the questionnaire discussed the main factor provided in the first round, fleshed out with relevant sub-factors. Respondents were then requested to evaluate each sub factor's suitability against the corresponding main factors to determine the final sub-factors from the candidate factors. The sub-factors were found from the literature survey. Final factors selected by over 50% of experts. In the third round, a list of factors with corresponding questions was provided, and the respondents were requested to assess the importance of each main factor and its corresponding sub factors. Finally, we calculated the mean rank of each item to make a final result. While analyzing the data, we focused on group consensus rather than individual insistence. To do so, a concordance analysis, which measures the consistency of the experts' responses over successive rounds of the Delphi, was adopted during the survey process. As a result, experts reported that context data collection and high identifiable level of identical data are the most important factor in the main factors and sub factors, respectively. Additional important sub-factors included diverse types of context data collected, tracking and recording functionalities, and embedded and disappeared sensor devices. The average score of each factor is very useful for future context-aware personalized service development in the view of the information privacy. The final factors have the following differences comparing to those proposed in other studies. First, the concern factors differ from existing studies, which are based on privacy issues that may occur during the lifecycle of acquired user information. However, our study helped to clarify these sometimes vague issues by determining which privacy concern issues are viable based on specific technical characteristics in context-aware personalized services. Since a context-aware service differs in its technical characteristics compared to other services, we selected specific characteristics that had a higher potential to increase user's privacy concerns. Secondly, this study considered privacy issues in terms of service delivery and display that were almost overlooked in existing studies by introducing IPOS as the factor division. Lastly, in each factor, it correlated the level of importance with professionals' opinions as to what extent users have privacy concerns. The reason that it did not select the traditional method questionnaire at that time is that context-aware personalized service considered the absolute lack in understanding and experience of users with new technology. For understanding users' privacy concerns, professionals in the Delphi questionnaire process selected context data collection, tracking and recording, and sensory network as the most important factors among technological characteristics of context-aware personalized services. In the creation of a context-aware personalized services, this study demonstrates the importance and relevance of determining an optimal methodology, and which technologies and in what sequence are needed, to acquire what types of users' context information. Most studies focus on which services and systems should be provided and developed by utilizing context information on the supposition, along with the development of context-aware technology. However, the results in this study show that, in terms of users' privacy, it is necessary to pay greater attention to the activities that acquire context information. To inspect the results in the evaluation of sub factor, additional studies would be necessary for approaches on reducing users' privacy concerns toward technological characteristics such as highly identifiable level of identical data, diverse types of context data collected, tracking and recording functionality, embedded and disappearing sensor devices. The factor ranked the next highest level of importance after input is a context-aware service delivery that is related to output. The results show that delivery and display showing services to users in a context-aware personalized services toward the anywhere-anytime-any device concept have been regarded as even more important than in previous computing environment. Considering the concern factors to develop context aware personalized services will help to increase service success rate and hopefully user acceptance for those services. Our future work will be to adopt these factors for qualifying context aware service development projects such as u-city development projects in terms of service quality and hence user acceptance.
The wall shear stress in the vicinity of end-to end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer(FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE: graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor of ANFH formation and the graft failure. The present study suggests a correlation between regions of the low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia(ANPH) in end-to-end anastomoses. 30523 T00401030523 ^x Air pressure decay(APD) rate and ultrafiltration rate(UFR) tests were performed on new and saline rinsed dialyzers as well as those roused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline rinsed and reused dialyzers showed considerable amount of decay. C-DAH dialyzers had a larger APD(11.70
The wall shear stress in the vicinity of end-to end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer(FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE: graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor of ANFH formation and the graft failure. The present study suggests a correlation between regions of the low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia(ANPH) in end-to-end anastomoses. 30523 T00401030523 ^x Air pressure decay(APD) rate and ultrafiltration rate(UFR) tests were performed on new and saline rinsed dialyzers as well as those roused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline rinsed and reused dialyzers showed considerable amount of decay. C-DAH dialyzers had a larger APD(11.70