• Title/Summary/Keyword: Single-model

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan; Han, Nam-Gi; Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People now create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the defining conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Because SNS data covers the whole of society, discovering the trend of an issue in SNS Big Data can serve as an important new source of value creation. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides four functions: (1) providing the topic keyword set that corresponds to the daily ranking; (2) visualizing the daily time-series graph of a topic over one month; (3) showing the importance of a topic through a treemap based on the scoring system and frequency; and (4) visualizing the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. It also requires up-to-date big data technology to rapidly process large amounts of real-time data, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize big data processing because Hadoop is designed to scale from a single node to thousands of machines. Furthermore, we use MongoDB, a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no fixed schema or tables, and its primary goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as its visualization tool. This library is designed for creating Data-Driven Documents that bind arbitrary data to the Document Object Model (DOM); it makes interaction with data easy and is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a set of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS graphical user interface (GUI) is designed using these libraries and makes it possible to detect issues on Twitter easily and intuitively. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS), and on this basis confirm the utility of storytelling and time-series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The experiments used nearly 150 million tweets collected in Korea during March 2013.
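The preprocessing and daily-ranking steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain word-frequency counting stands in for the topic model, whitespace splitting stands in for noun extraction, and the stop-word list, the function name `daily_keyword_ranking`, and the sample tweets are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a language-specific one.
STOP_WORDS = {"the", "a", "is", "on", "in", "of", "and", "to"}


def daily_keyword_ranking(tweets, top_n=3):
    """Rank keywords per day by frequency after stop-word removal.

    tweets: iterable of (day, text) pairs.
    Returns {day: [(keyword, count), ...]} with at most top_n entries per day.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        # Naive tokenization: split on whitespace, trim punctuation, lowercase.
        tokens = [t.strip(".,!?#@").lower() for t in text.split()]
        counts[day].update(t for t in tokens if t and t not in STOP_WORDS)
    return {day: c.most_common(top_n) for day, c in counts.items()}


# Made-up sample data, for illustration only.
sample = [
    ("2013-03-01", "The election is on the news"),
    ("2013-03-01", "Election results and news"),
    ("2013-03-02", "Rain in Seoul"),
]
ranking = daily_keyword_ranking(sample, top_n=2)
```

Each day's list could then feed a time-series or treemap view; how TITS actually scores topics is not specified in the abstract.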

Prevalence of Enterobius vermicularis infection and preventive effects of mass treatment among children in rural and urban areas, and children in orphanages (농촌, 도시 및 집단생활 아동의 요충 감염과 집단 구충에 의한 예방 효과)

  • Kim, Jong-Su; Lee, Hae-Yong; An, Yeong-Gyeom
    • Parasites, Hosts and Diseases / v.29 no.3 / pp.235-244 / 1991
  • An epidemiological study and mass treatments of Enterobius vermicularis infection were carried out among children near the Wonju area of Kangwon province. The children were divided into 4 groups according to where they lived: mountainous area, rural area, urban area, and orphanages. They were examined by the adhesive cellotape anal swab technique, and egg-positive rates were obtained. Egg reduction rates and re-infection rates after repeated mass treatments were also observed. The results were as follows: 1. The overall egg-positive rate of E. vermicularis in the first screening was 19.9% (251 out of 1,262 examinees; 19.7% in males and 20.1% in females). The positive rates were 13.0% in the mountainous area, 11.9% in the rural area, 15.1% in the urban (medium-sized) area, and 61.9% in orphanages. 2. The highest positive rates were observed in kindergarten children and in 1st and 2nd grade primary school children (26.2~32.2%), and the lowest rate (13.6%) in 6th grade primary school children. 3. Cumulative detection rates from 3 repeated anal swabs at 4~5 day intervals were higher (70.8%) than those from single anal swabs (50.0~59.2%). 4. Among the examinees who showed the highest cumulative positive rate (70.8%), about 39.2% were consecutively positive in all 3 anal swabs. Among the different groups of children, the higher the total egg detection rate (87.5%), the higher the consecutive positive rate (71.9%). 5. A total of 2,609 worms (male : female = 1 : 12.4) were collected from 17 egg-positive cases treated with anthelmintics. The mean number of worms per child was 153 (range: 4-824). 6. The egg-positive cases in several studied groups (180 children) were treated with anthelmintics 6 times at 3-week intervals. In this case, the overall positive rate decreased from 54.8% to 2.2% at 15 weeks after the treatments, but complete negative conversion was not achieved. However, in a group of 154 children in which both egg-positive and egg-negative cases were treated with anthelmintics at 3-week intervals, complete egg-negative conversion was observed in the 9th week after treatment. 7. The egg detection rate in the brothers or sisters of egg-positive children was 70.0% (28 out of 40 examined), and the egg-positive rate by family unit was 69.7%. In summary, Enterobius vermicularis infection is still highly prevalent among children in Korea, and repeated mass treatment more than 3 times will be effective for control of this infection.
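The rates in this abstract that are given with raw counts can be reproduced with simple arithmetic. The short check below uses only figures stated in the abstract; the helper name `rate_percent` is hypothetical.

```python
def rate_percent(positives, examined):
    """Return positives/examined as a percentage rounded to one decimal."""
    return round(100 * positives / examined, 1)


overall_rate = rate_percent(251, 1262)  # first-screening egg-positive rate
sibling_rate = rate_percent(28, 40)     # egg detection rate among siblings
mean_worms = 2609 / 17                  # worms per treated egg-positive child

print(overall_rate)       # 19.9
print(sibling_rate)       # 70.0
print(round(mean_worms))  # 153
```

The computed values agree with the reported 19.9%, 70.0%, and a mean of 153 worms per child.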
