• Title/Summary/Keyword: Utility frequency


Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems, v.20 no.2, pp.109-122, 2014
  • People now create a tremendous amount of data on social network services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unprecedented, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Because SNS Big Data covers the whole of society, discovering the trend of an issue in it can provide an important new source of value. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over a month; (3) show the importance of a topic through a treemap based on the score system and frequency; and (4) visualize the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to process a large amount of real-time data rapidly, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale from a single node to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed to create Data-Driven Documents that bind arbitrary data to the Document Object Model (DOM); interaction with the data is easy, which is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a set of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
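The abstract above describes stop-word removal followed by daily keyword ranking but does not give the pipeline itself. The following is a minimal sketch of that idea only, assuming whitespace tokenization and a tiny hand-written stop-word list; it does not perform the noun extraction or topic modeling the authors used, and the names `STOP_WORDS`, `daily_keyword_counts`, and `top_keywords`, as well as the sample tweets, are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "is", "on", "in", "of", "and", "to"}


def daily_keyword_counts(tweets, stop_words=STOP_WORDS):
    """Count keyword frequency per day.

    tweets: iterable of (day, text) pairs, where day is any hashable label.
    Returns a dict mapping each day to a Counter of lowercased tokens.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        # Naive tokenization: split on whitespace, strip common punctuation.
        tokens = (tok.strip(".,!?;:#@\"'").lower() for tok in text.split())
        counts[day].update(t for t in tokens if t and t not in stop_words)
    return counts


def top_keywords(counts, day, k=3):
    """Return the k most frequent keywords for one day, most frequent first."""
    return [word for word, _ in counts[day].most_common(k)]


if __name__ == "__main__":
    # Made-up sample input for illustration only.
    sample = [
        ("2013-03-01", "The election is on the news"),
        ("2013-03-01", "Election results!"),
        ("2013-03-02", "Rain in Seoul"),
    ]
    counts = daily_keyword_counts(sample)
    print(top_keywords(counts, "2013-03-01", k=1))
```

Per-day counters of this kind are the sort of data a daily ranking or time-series graph could be drawn from; a production system would replace the tokenizer with a proper morphological analyzer and distribute the counting.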

The Utility of Immunological Markers and Pulmonary Function Test in the Early Diagnosis of Pulmonary Involvement in the Patients with Rheumatoid Arthritis (류마티스 관절염환자에서 폐침범의 진단에 있어서 면역학적 지표와 폐기능 검사의 유용성)

  • Lee, Dong-Suk;Lee, Chang-Beam;Koh, Hee-Kwan;Moon, Doo-Seop;Lee, Jae-Young;Lee, Kyung-Sang;Yang, Suck-Chul;Yoon, Ho-Joo;Bae, Sang-Cheol;Shin, Dong-Ho;Kim, Seong-Yoon;Park, Sung-Soo;Lee, Jung-Hee
    • Tuberculosis and Respiratory Diseases, v.42 no.6, pp.878-887, 1995
  • Background: The frequency of pulmonary involvement in patients with rheumatoid arthritis is reported to reach 10 to 50%, and pulmonary involvement is a principal cause of death. As immunology and radiology have developed, interest in the early diagnosis of pulmonary involvement has increased. Method: Among patients seen at Hanyang University Hospital from March 1990 to July 1995, we compared 29 patients who had pulmonary involvement on clinical and chest high-resolution computed tomography (HRCT) findings with 18 control patients, using immunological markers and pulmonary function test findings. We sought useful markers for the early diagnosis of pulmonary involvement with noninvasive investigations. Results: The ratio of males to females was 14:15 in the pulmonary involvement group, and all 18 patients in the control group were female. A smoking history was present in 31% (9/29) of the former group and in none of the latter group. Rheumatoid factor (RF) was positive in 95.5% (28/29) of the pulmonary involvement group and in 100% (18/18) of the control group (p=0.12). Antiperinuclear factor (APF) showed a significant difference: it was positive in 86.9% (20/23, average value: 2.0) of the pulmonary involvement group and in 50% (8/16, average value: 1.1) of the control group (p=0.04). Antinuclear antibody (ANA) was positive in 60.7% (17/28) of the pulmonary involvement group and in 72.2% (13/18) of the control group (p=0.33). Cryoglobulin also showed a significant difference: it was positive in 72% (18/25) of the pulmonary involvement group and in 56.2% (9/16) of the control group (p=0.02). Bony erosion was present in 61.5% (16/26) of the pulmonary involvement group and in 77.7% (14/18) of the control group (p=0.8). On the pulmonary function test, the average alveolar volume-corrected diffusion capacity and residual volume were 1.07 mmol/min/kPa (predicted value: 64.2%) and 1.32 L (predicted value: 70%) in the pulmonary involvement group, versus 1.44 mmol/min/kPa and 3.75 L (predicted value: 86.6%) in the control group; both differences were significant (p=0.003 and p=0.004, respectively). Conclusion: APF or cryoglobulin on serological testing, together with measurement of residual volume and alveolar volume-corrected diffusion capacity, may be used as effective markers in the diagnosis of pulmonary involvement in patients with rheumatoid arthritis.
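The abstract reports p-values for comparisons of positive/negative counts between two groups but does not name the statistical test. As an illustration only, the sketch below uses a two-sided Fisher's exact test, a common choice for small 2x2 tables that is swapped in here as an assumption; it should not be expected to reproduce the reported p-values. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1 = a + b                      # size of the first group
    col1, col2 = a + c, b + d         # totals of positives and negatives
    total = comb(col1 + col2, row1)   # number of ways to fill the first row

    def weight(x):
        # Unnormalized probability of x positives in the first group.
        return comb(col1, x) * comb(col2, row1 - x)

    lo = max(0, row1 - col2)
    hi = min(col1, row1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / total


if __name__ == "__main__":
    # APF counts from the abstract: 20 of 23 positive vs. 8 of 16 positive.
    print(fisher_exact_two_sided(20, 3, 8, 8))
```

The rows of the table are the two patient groups and the columns are positive and negative results, so the APF comparison becomes `[[20, 3], [8, 8]]`.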


Anthropometric Measurement, Dietary Behaviors, Health-related Behaviors and Nutrient Intake According to Lifestyles of College Students (대학생의 라이프스타일 유형에 따른 신체계측, 식행동, 건강관련 생활습관 및 영양소 섭취상태에 관한 연구)

  • Cheong, Sun-Hee;Na, Young-Joo;Lee, Eun-Hee;Chang, Kyung-Ja
    • Journal of the Korean Society of Food Science and Nutrition, v.36 no.12, pp.1560-1570, 2007
  • The purpose of this study was to investigate differences in anthropometric measurements, dietary attitudes, health-related behaviors and nutrient intake among college students according to lifestyle. The subjects were 994 college students nationwide (385 male, 609 female), divided into 7 clusters (PEAO: passive economy/appearance-oriented type, NCPR: non-consumption/pursuit of relationship type, PTA: pursuit of traditional actuality type, PAH: pursuit of active health type, UO: utility-oriented type, POF: pursuit of open fashion type, PFR: pursuit of family relations type). A cross-sectional survey was conducted using a self-administered questionnaire, and the data were collected via the Internet or by mail. The nutrient intake data collected from food records were analyzed with the Computer Aided Nutritional Analysis Program. Data were analyzed with SPSS 12.0. The average ages of the male and female college students were 23.7 years and 21.6 years, respectively. Most of the college students had poor eating habits. In particular, about 60% of the PEAO group had irregular meal times. The students in the PAH and POF groups showed significantly higher consumption frequency of fruits, meat products and foods cooked with oil compared to the other groups. As for exercise, drinking and smoking, there were significant differences between the PAH group and the other groups. Asked for the reason for body weight control, 16.2% of the NCPR group answered "for health", but 24.8% of the PEAO group and 26.3% of the POF group answered "for appearance". Calorie, vitamin A, vitamin $B_2$, calcium and iron intakes of all the groups were lower than the Korean DRIs. Female students in the PTA group showed significantly lower vitamin $B_1$ and niacin intakes compared to the PFR group. These results therefore provide nationwide information on health-related behaviors and nutrient intake according to lifestyle among Korean college students.