• Title/Summary/Keyword: open information extraction

Design of a MapReduce-Based Mobility Pattern Mining System for Next Place Prediction

  • Kim, Jongwhan;Lee, Seokjun;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.3 no.8 / pp.321-328 / 2014
  • In this paper, we present a MapReduce-based mobility pattern mining system that can efficiently predict the next place of mobile users. It learns the mobility pattern model of each user, represented by a Hidden Markov Model (HMM), from a large-scale trajectory dataset, and then predicts the next place the user will visit by applying the learned model to the current trajectory. Our system consists of two parts: the back-end part, in which the mobility pattern models are learned for individual users, and the front-end part, where the next place for a given user to visit is predicted based on the mobility pattern models. While the back-end part comprises three distinct MapReduce modules for POI extraction, trajectory transformation, and mobility pattern model learning, the front-end part has two modules for candidate route generation and next place prediction. The map and reduce functions of each module were designed to make full use of the underlying Hadoop infrastructure and maximize parallel processing. We evaluated the proposed system in experiments on a large-scale open benchmark dataset, GeoLife, and the results confirmed its high performance.
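The map/reduce split described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs in plain Python rather than on Hadoop, and it swaps the paper's HMM for a much simpler first-order Markov chain over place labels. The function names, the place labels, and the sample trajectory are all hypothetical.

```python
from collections import Counter, defaultdict


def map_transitions(user_id, trajectory):
    """Map step: emit ((user, current, next), 1) for each consecutive pair of places."""
    for current_place, next_place in zip(trajectory, trajectory[1:]):
        yield (user_id, current_place, next_place), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per (user, current, next) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def build_models(totals):
    """Group the reduced counts into one transition table per user."""
    models = defaultdict(lambda: defaultdict(Counter))
    for (user_id, current_place, next_place), count in totals.items():
        models[user_id][current_place][next_place] += count
    return models


def predict_next_place(models, user_id, current_place):
    """Return the most frequent successor of current_place, or None if unseen."""
    candidates = models.get(user_id, {}).get(current_place)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical trajectory of place labels (assumed already extracted as POIs).
trajectories = {
    "u1": ["home", "work", "gym", "home", "work", "cafe", "home", "work", "gym"],
}

emitted = [
    pair
    for user_id, trajectory in trajectories.items()
    for pair in map_transitions(user_id, trajectory)
]
models = build_models(reduce_counts(emitted))
print(predict_next_place(models, "u1", "work"))
```

Because each emitted key carries the user id, the counting step could be partitioned by user, which is the property that makes per-user model learning easy to parallelize.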

Prevalence and Factors Associated with Oral Pre-Malignant Lesions in Northeast Thailand

  • Juntanong, Narongrit;Siewchaisakul, Pallop;Bradshaw, Peter;Vatanasapt, Patravoot;Chen, Sam Li-Sheng;Yen, Amy Ming-Fang;Chen, Tony Hsiu-Hsi;Promthet, Supannee
    • Asian Pacific Journal of Cancer Prevention / v.17 no.8 / pp.4175-4179 / 2016
  • Background: Oral cavity cancer (OCC) is one of the most common cancers worldwide. No studies have reported on the prevalence and epidemiologic risk factors of oral premalignant lesions (OPLs) in Thailand. The purpose of this study was to investigate the prevalence of OPLs and associated factors in Roi Et Province, Thailand. Materials and Methods: To investigate the prevalence of OPLs, a cross-sectional descriptive study was conducted in which 2,300 subjects over 40 years of age were recruited and screened for OPLs. To identify factors associated with OPLs, a matched case-control study was conducted in which the subjects were 102 cases with OPLs and 102 matched controls without OPLs. The studies were conducted in Roi Et Province from 1 February 2014 to 30 April 2014, and the data were collected using a structured interview questionnaire and by extracting information from medical records. Data analyses involved descriptive statistics, McNemar's test, and conditional logistic regression. Results: The overall prevalence of OPLs was 3.8%, and no one was diagnosed with more than one type of OPL. The factors associated with a statistically significantly higher risk of an OPL were betel nut chewing, smoking, and alcohol consumption. The associations with these factors were strong, especially for betel nut chewing and smoking. Conclusions: The habits of betel nut chewing, smoking, and alcohol use are confirmed as factors associated with OPLs in a population of Roi Et Province, Thailand. Campaigns to reduce such health-risk behaviours are needed, but whether any actual decrease will prevent the eventual transformation of an OPL into an OCC remains an open question.

A novel method for determining dose distribution on panoramic reconstruction computed tomography images from radiotherapy computed tomography

  • Hiroyuki Okamoto;Madoka Sakuramachi;Wakako Yatsuoka;Takao Ueno;Kouji Katsura;Naoya Murakami;Satoshi Nakamura;Kotaro Iijima;Takahito Chiba;Hiroki Nakayama;Yasunori Shuto;Yuki Takano;Yuta Kobayashi;Hironori Kishida;Yuka Urago;Masato Nishitani;Shuka Nishina;Koushin Arai;Hiroshi Igaki
    • Imaging Science in Dentistry / v.54 no.2 / pp.129-137 / 2024
  • Purpose: Patients with head and neck cancer (HNC) who undergo dental procedures during radiotherapy (RT) face an increased risk of developing osteoradionecrosis (ORN). Accordingly, new tools must be developed to extract critical information regarding the dose delivered to the teeth and mandible. This article proposes a novel approach for visualizing 3-dimensional planned dose distributions on panoramic reconstruction computed tomography (pCT) images. Materials and Methods: Four patients with HNC who underwent volumetric modulated arc therapy were included. One patient experienced ORN and required the extraction of teeth after RT. In the study approach, the dental arch curve (DAC) was defined using an open-source platform. Subsequently, pCT images and dose distributions were generated based on the new coordinate system. All teeth and mandibles were delineated on both the original CT and pCT images. To evaluate the consistency of dose metrics, the Mann-Whitney U test and Student t-test were employed. Results: A total of 61 teeth and 4 mandibles were evaluated. The correlation coefficient between the 2 methods was 0.999, and no statistically significant difference was observed (P>0.05). This method facilitated a straightforward and intuitive understanding of the delivered dose. In 1 patient, ORN corresponded to the region of the root and the gum receiving a high dosage (approximately 70 Gy). Conclusion: The proposed method particularly benefits dentists involved in the management of patients with HNC. It enables the visualization of a 3-dimensional dose distribution in the teeth and mandible on pCT, enhancing the understanding of the dose delivered during RT.

An Epidemiologic Study on the Orthodontic Patients Who Visited Department of Orthodontics, Chosun University Dental Hospital Last 10 Years (1990-1999)

  • Yoon, Young-Jooh;Kim, Kwang-Won;Hwang, Mee-Sun
    • The Korean Journal of Orthodontics / v.31 no.2 s.85 / pp.283-300 / 2001
  • With socioeconomic development and changing esthetic awareness, the demand for orthodontic treatment and the number of orthodontic patients have been increasing rapidly, and the frequency of malocclusion types has changed. This epidemiologic study was undertaken to help accommodate patients' orthodontic needs adequately and to obtain reliable quantitative information on the characteristics of orthodontic patients. Distribution and trends were examined in 3,070 malocclusion patients who had been examined and diagnosed at the Department of Orthodontics, Dental Hospital, Chosun University over the 10-year period from 1990 to 1999. The results were as follows: 1. The number of patients per year showed an increasing trend, with a higher visiting rate in females (56.5%) than in males (43.5%). 2. By age, the 7-12 year-old group was the largest (37.9%), and the 13-18, 19-24, above-19, and 0-6 year-old groups accounted for 32.0%, 19.6%, 7.1%, and 3.4%, respectively. 3. Hellman dental age IVa, the completion of the permanent dentition, showed the highest percentage in both males and females. 4. Geographically, the majority of patients lived in Kwang Ju (71.0%). The group living within 10 km of Chosun Dental Hospital was 56.3%, and the group within 20 km was 14.7%. 5. Anterior crossbite was the most frequent chief complaint, and mandibular (Mn.) prognathism and protrusion of the maxillary (Mx.) teeth accounted for 12.6% and 12.2%, respectively. 6. By Angle's classification, the types of malocclusion were distributed as follows: 38.9% Class I, 20.7% Class II division 1, 2.0% Class II division 2, and 38.4% Class III. 7. For dental vertical dysplasia by Angle's classification, deep bite was the most frequent in Class II div. 1 and div. 2 (24.3% and 56.7%), and open bite in Class III (21.4%). 8. For skeletal sagittal dysplasia, 39.3% of skeletal Class II was due to undergrowth of the mandible and 46.3% of skeletal Class III was due to overgrowth of the mandible. 9. By treatment type, 66.9% were nonextraction cases and 33.1% were extraction cases; extraction of the four first bicuspids was the most frequent pattern (38.6%). 10. Patients who had orthognathic surgery comprised 7.9%, with an increasing trend.

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media with highly interactive Web 2.0 applications have provided very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content expressing their opinions and interests on social media such as blogs, forums, chat rooms, and discussion boards, and the content is released in real time on the Internet. For that reason, many researchers and marketers regard social media content as a source of information for business analytics to develop business insights, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, as techniques to extract, classify, understand, and assess the opinions implicit in text, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques, and tools have been presented by these researchers. However, we found weaknesses in their methods, which are often technically complicated and not sufficiently user-friendly for supporting business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conducting opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach to opinion mining consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose the target social media. Each target medium requires a different means of access, such as open APIs, search tools, DB2DB interfaces, or purchasing content. The second phase is pre-processing to generate useful materials for meaningful analysis. 
If garbage data are not removed, the results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, in which the cleansed social media content set is analyzed. The qualified data set includes not only user-generated content but also content identification information such as creation date, author name, user id, content id, hit counts, review or reply, favorite, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is used for reputation analysis. There are also various applications, such as stock prediction, product recommendation, and sales forecasting. The last phase is visualization and presentation of the analysis results. The major focus and purpose of this phase are to explain the results of the analysis and help users comprehend their meaning. Therefore, to the extent possible, deliverables from this phase should be simple, clear, and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, with a 66.5% market share; the firm has held the No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of content, including blogs, forum posts, and news articles. After collecting the social media content data, we generated instant-noodle-business-specific language resources for data manipulation and analysis using natural language processing. In addition, we tried to classify the content into more detailed categories such as marketing features, environment, reputation, etc. 
In these phases, we used freeware programs such as the TM, KoNLP, ggplot2, and plyr packages in the R project. As a result, we presented several useful visualization outputs, such as domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, valence tree maps, and other visualized images, to provide vivid, full-colored examples using open library software packages of the R project. Business actors can quickly detect, at a glance, areas that are weak, strong, positive, negative, quiet, or loud. The heat map can explain movements of sentiment or volume in a category-by-time matrix, in which color density indicates the values for each time period. The valence tree map, one of the most comprehensive and holistic visualization models, should be very helpful for analysts and decision makers to quickly understand the "big picture" business situation with a hierarchical structure, since a tree map can present buzz volume and sentiment as a visualized result for a certain period. This case study offers real-world business insights from market sensing, demonstrating to practical-minded business users how they can use these types of results for timely decision making in response to ongoing changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.
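The four phases named in this abstract (collecting, qualifying, analyzing, visualizing) can be sketched roughly as follows. This is only an illustration under loud assumptions: the study itself used R packages, whereas this sketch uses plain Python; the tiny lexicon, the sample posts, the category labels, and all function names are hypothetical; and a text table stands in for the heat map's category-by-time matrix.

```python
from collections import defaultdict

# Hypothetical sentiment lexicon; the study built a domain-specific one instead.
LEXICON = {"tasty": 1, "love": 1, "good": 1, "salty": -1, "bad": -1, "expensive": -1}


def qualify(posts):
    """Phase 2 (qualifying): drop empty posts and normalise the text."""
    cleaned = []
    for post in posts:
        text = post["text"].strip().lower()
        if text:
            cleaned.append({**post, "text": text})
    return cleaned


def score(text):
    """Sum lexicon polarities over whitespace tokens, ignoring edge punctuation."""
    return sum(LEXICON.get(token.strip(".,!?"), 0) for token in text.split())


def analyze(posts):
    """Phase 3 (analyzing): aggregate volume and sentiment per (category, month)."""
    cells = defaultdict(lambda: {"volume": 0, "sentiment": 0})
    for post in posts:
        cell = cells[(post["category"], post["month"])]
        cell["volume"] += 1
        cell["sentiment"] += score(post["text"])
    return dict(cells)


def render(cells):
    """Phase 4 (visualizing): print-friendly stand-in for a heat map matrix."""
    lines = []
    for (category, month), cell in sorted(cells.items()):
        lines.append(
            f"{category:<12}{month:<9}"
            f"volume={cell['volume']} sentiment={cell['sentiment']:+d}"
        )
    return "\n".join(lines)


# Phase 1 (collecting) is replaced here by a hard-coded, made-up sample.
posts = [
    {"text": "Tasty noodles, love them!", "category": "reputation", "month": "2014-01"},
    {"text": "Too salty and expensive.", "category": "reputation", "month": "2014-01"},
    {"text": "   ", "category": "marketing", "month": "2014-02"},
    {"text": "Good new advert", "category": "marketing", "month": "2014-02"},
]

cells = analyze(qualify(posts))
print(render(cells))
```

Keeping volume and sentiment in the same cell mirrors the abstract's point that a single view should show both how much is being said and how positive or negative it is.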