• Title/Summary/Keyword: Unstructured data

A Study on the General Public's Perceptions of Dental Fear Using Unstructured Big Data

  • Han-A Cho;Bo-Young Park
    • Journal of dental hygiene science
    • /
    • v.23 no.4
    • /
    • pp.255-263
    • /
    • 2023
  • Background: This study used text mining techniques to determine public perceptions of dental fear. It extracted keywords related to dental fear, identified the connections between the keywords, and categorized and visualized perceptions related to dental fear. Methods: Keywords were collected from texts posted on Internet portal sites (NAVER and Google) between 1 January 2000 and 31 December 2022. Four stages of analysis were used to explore the keywords: frequency analysis, term frequency-inverse document frequency (TF-IDF), centrality analysis and co-occurrence analysis, and convergent correlations. Results: Among the top ten keywords by frequency, the most frequently used keyword was 'treatment,' followed by 'fear,' 'dental implant,' 'conscious sedation,' 'pain,' 'dental fear,' 'comfort,' 'taking medication,' 'experience,' and 'tooth.' In the TF-IDF analysis, the top three keywords were 'dental implant,' 'conscious sedation,' and 'dental fear.' The co-occurrence analysis, which explores keywords that appear together, showed that 'fear and treatment' and 'treatment and pain' appeared most frequently. Conclusion: Texts collected as unstructured big data were analyzed to identify general perceptions related to dental fear. By grouping associated keywords, this study provides valuable source data for understanding public perceptions of dental fear. The results will be helpful for understanding dental fear and may be used as factors affecting oral health in the future.
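The frequency, TF-IDF, and co-occurrence stages named in the abstract above can be illustrated with a minimal sketch. The toy corpus, the function names, and the specific TF-IDF weighting (raw count times log(N/df), summed over documents) are assumptions made for illustration; the abstract does not state which tools or formula variant the authors used.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is a list of already-extracted keywords.
docs = [
    ["treatment", "fear", "pain"],
    ["treatment", "dental implant", "fear"],
    ["conscious sedation", "treatment", "pain"],
]

def term_frequency(docs):
    """Count how often each keyword occurs across all documents."""
    return Counter(word for doc in docs for word in doc)

def tf_idf(docs):
    """Sum per-document TF * IDF for each keyword, with IDF = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing the keyword.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = Counter()
    for doc in docs:
        for word, count in Counter(doc).items():
            scores[word] += count * math.log(n_docs / doc_freq[word])
    return scores

def co_occurrence(docs):
    """Count unordered keyword pairs that appear in the same document."""
    pairs = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs

if __name__ == "__main__":
    print(term_frequency(docs).most_common(3))
    print(tf_idf(docs).most_common(3))
    print(co_occurrence(docs).most_common(3))
```

In this sketch a keyword that appears in every document (here 'treatment') gets a TF-IDF score of zero even though it is the most frequent, which is one way a frequency ranking and a TF-IDF ranking can differ.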

Analysis of Unstructured Data on Detecting of New Drug Indication of Atorvastatin (아토바스타틴의 새로운 약물 적응증 탐색을 위한 비정형 데이터 분석)

  • Jeong, Hwee-Soo;Kang, Gil-Won;Choi, Woong;Park, Jong-Hyock;Shin, Kwang-Soo;Suh, Young-Sung
    • Journal of health informatics and statistics
    • /
    • v.43 no.4
    • /
    • pp.329-335
    • /
    • 2018
  • Objectives: In recent years, there has been an increased need for a way to extract desired information from many medical publications at once. This study was conducted to confirm the usefulness of unstructured data analysis of previously published medical literature in searching for new indications. Methods: New indications were searched for through text mining, network analysis, and topic modeling analysis using 5,057 articles published from 1990 to 2017 on atorvastatin, a treatment for hyperlipidemia. Results: A total of 273 keywords were extracted. In the frequency analysis of text mining and in the network analysis, the existing indications of atorvastatin ranked at the top. The novel indications identified by Term Frequency-Inverse Document Frequency (TF-IDF) were atrial fibrillation, heart failure, breast cancer, rheumatoid arthritis, combined hyperlipidemia, arrhythmias, multiple sclerosis, non-alcoholic fatty liver disease, contrast-induced acute kidney injury, and prostate cancer. Conclusions: Unstructured data analysis for discovering new indications from massive medical literature is expected to be used in the drug repositioning industry.

Survey-based unstructured data analysis to predict flipped learning performance (플립드러닝 성과를 예측하기 위한 설문조사 기반의 비정형 데이터 분석)

  • Chayoung Kim;Yoon Kim
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.6
    • /
    • pp.519-524
    • /
    • 2023
  • This study summarizes operational experience in applying flipped learning to various IT-related liberal arts subjects and proposes a specific application method. So far, most studies have analyzed various strategies and learner responses to flipped learning. It is now time for teachers, who are the main operators of flipped learning classes, to study how to provide immediate feedback and apply it while the relevant courses are running. Studies on this topic are gradually emerging. In general, most studies share reference materials based on results obtained after applying various strategies, such as instructors developing the class structure themselves, combining it with discussion classes, or developing varied content. This study proposes a method for analyzing how various strategies can be applied in a subject and for obtaining results while the class is in progress, by analyzing unstructured data from a survey that can collect immediate feedback.

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.101-116
    • /
    • 2015
  • Recently, due to the development of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research effort is being directed toward analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured data analysis to unstructured data analysis. In the field of recommendation systems, a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has also steadily increased. Approaches to improving the performance of recommendation systems fall into two groups: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts to improve recommendation system performance have taken the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are very timely and necessary. In particular, as the interests of users are directly connected with their needs, identifying users' interests through unstructured big data analysis can be a clue for improving the performance of recommendation systems. This study therefore proposes a methodology for improving recommendation systems by measuring users' interests. Specifically, it proposes a method to quantify users' interests by analyzing their internet usage patterns and to predict their repurchases based on the discovered preferences. There are two important modules in this study. The first module predicts the repurchase probability of each category by analyzing users' purchase history. We include the first module in our research scope in order to compare the accuracy of a traditional purchase-based prediction model with that of our new model presented in the second module. This procedure extracts the purchase history of users. The core part of our methodology is the second module, which extracts users' interests by analyzing the news articles the users have read. The second module constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. The module then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. After that, by merging the results of the previous processes in the second module, we obtain a correspondence matrix between users and topics. This matrix describes users' interests in a structured manner. Finally, using this matrix, the second module builds a model for predicting the repurchase probability of each category. In this paper, we also provide experimental results of our performance evaluation. The data used in our experiments are outlined as follows. We acquired web transaction data for 5,000 panel members from a company that specializes in analyzing the rankings of internet sites. We first extracted from the original data 15,000 URLs of news articles published from July 2012 to June 2013 and crawled the main contents of the news articles. We then selected 2,615 users who had read at least one of the extracted news articles. Among the 2,615 users, we found that 359 target users had purchased at least one item from our target shopping mall 'G'. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. From the performance evaluation, we found that our prediction model using both users' interests and purchase history outperforms a prediction model using only users' purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used. Similarly, our model outperformed the traditional one in the beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used, although the improvement was very small.
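The merging step described in the abstract above, which combines an article-user correspondence matrix with a topic-article correspondence matrix into a user-topic matrix, can be sketched as a matrix product. This is only one plausible reading: the abstract does not say how the merge is computed, and the matrices, values, and the `matmul` helper below are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Hypothetical users x articles matrix: 1 if the user read the article.
user_article = [
    [1, 1, 0],  # user 0 read articles 0 and 1
    [0, 0, 1],  # user 1 read article 2
]

# Hypothetical articles x topics matrix: topic weights from topic modeling.
article_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]

# Assumed merge: each user's topic interest is the sum of the topic weights
# of the articles that user read.
user_topic = matmul(user_article, article_topic)

if __name__ == "__main__":
    for user_id, row in enumerate(user_topic):
        print(user_id, row)
```

Each row of `user_topic` could then serve as a structured feature vector for a repurchase classifier, alongside purchase-history features.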

A Comparative Analysis of Cognitive Change about Big Data Using Social Media Data Analysis (소셜 미디어 데이터 분석을 활용한 빅데이터에 대한 인식 변화 비교 분석)

  • Yun, Youdong;Jo, Jaechoon;Hur, Yuna;Lim, Heuiseok
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.7
    • /
    • pp.371-378
    • /
    • 2017
  • Recently, with the spread of smart devices and the introduction of web services, online data has been increasing rapidly and is being utilized in various fields. In particular, the emergence of social media has led to a rapid increase in the amount of unstructured data in the big data field. To extract meaningful information from such unstructured data, interest in big data technology has grown in various fields, and big data is becoming a key resource in many areas. The future prospects of big data are positive, but concerns about data breaches and privacy are constantly being raised. Although positive and negative views of big data coexist, research analyzing people's opinions on the subject is currently lacking. In this study, we compared changes in people's perception of big data based on unstructured data collected from social media using text mining. As a result, we observed yearly keywords for domestic big data, declining positive opinions, and increasing negative opinions. Based on these results, we could predict the direction of domestic big data.

Korean Consumers' Political Consumption of Japanese Fashion Products (국내 소비자의 일본 패션제품에 대한 정치적 소비 연구)

  • Choi, Yeong-Hyeon;Lee, Kyu-Hye
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.44 no.2
    • /
    • pp.295-309
    • /
    • 2020
  • In 2019, Japan announced trade regulations against Korean products; consequently, the sales of Japanese products in Korea dropped due to a boycott by Korean consumers. This study measured Korean consumers' political consumption behavior toward Japanese fashion products. Unstructured text data were collected from online media sources and consumer-posted sources such as blogs and SNS. Text mining techniques and semantic network analysis were used to process the unstructured data. The results identified boycotting of Japanese fashion products and buycotting of alternative products and Korean brands as consequences of consumers' political consumption. Two brand cases were investigated in detail. Online text data from before and after the political action were compared, and significant changes in consumption as well as in emotional expressions were identified. Product-related industry sectors were identified in terms of the political consumption of fashion: the liquor, automobile, and tourism sectors were closely linked to the fashion sector in terms of boycotting. More "boycott" and "buycott" fashion brands (reflected in consumer attitudes and feelings) were detected in consumer-driven texts than in media-driven sources.

Using a Cellular Automaton to Extract Medical Information from Clinical Reports

  • Barigou, Fatiha;Atmani, Baghdad;Beldjilali, Bouziane
    • Journal of Information Processing Systems
    • /
    • v.8 no.1
    • /
    • pp.67-84
    • /
    • 2012
  • A substantial amount of clinical data concerning the medical history of a patient is in the form of clinical reports written by doctors. These reports describe patients, their pathologies, their personal and medical histories, findings made during interviews or procedures, and so forth. They are a source of valuable information that can be used in several applications, such as retrieving information to diagnose new patients, epidemiological studies, decision support, statistical analysis, and data mining. However, this information is difficult to access because it is often in unstructured text form. To make access to patient data easy, our research aims to develop a system for extracting information from unstructured text. In a previous work, a rule-based approach was applied to a corpus of clinical reports on infectious diseases to extract structured data in the form of named entities and properties. In this paper, we propose the use of a Boolean inference engine, based on a cellular automaton, to perform the extraction. Our motivation for adopting this Boolean modeling approach is twofold: first, to optimize storage, and second, to reduce the response time of entity extraction.

AUTOMATIC GENERATION OF UNSTRUCTURED SURFACE GRID SYSTEM USING CAD SURFACE DATA (CAD 형상 데이터를 이용한 비정렬 표면 격자계의 자동 생성 기법)

  • Lee, B.J.;Kim, B.S.
    • Journal of computational fluids engineering
    • /
    • v.12 no.4
    • /
    • pp.68-73
    • /
    • 2007
  • The Computational Fluid Dynamics (CFD) approach now plays an important role in the engineering process. Generating a proper grid system for the region of interest in a timely manner is a prerequisite for efficient numerical calculation of flow physics using the CFD approach. Grid generation is, however, usually considered a major obstacle to the routine and successful application of numerical approaches in the engineering process. The CFD approach based on unstructured grid systems is gaining popularity because grid generation is simpler and more efficient than with structured grid approaches, especially for complex geometries. In this paper, an automated triangular surface grid generation method using CAD (Computer Aided Design) surface data is proposed. In the present method, CAD surface data imported in the STL (Stereo-lithography) format is first processed to identify the feature edges defining the topology and geometry of the surface shape. Once the feature edges are identified, node points are distributed along them. The initial fronts connecting those feature edge nodes are constructed and then advanced inward along the CAD surface data, using the Advancing Front Method, until the surface is fully covered by triangular surface grid cells. It is found that this approach can be implemented in an automated way, saving man-hours and reducing human error in generating triangular surface grid systems.

Review of Operational Multi-Scale Environment Model with Grid Adaptivity

  • Kang, Sung-Dae
    • Environmental Sciences Bulletin of The Korean Environmental Sciences Society
    • /
    • v.10 no.S_1
    • /
    • pp.23-28
    • /
    • 2001
  • A new numerical weather prediction and dispersion model, the Operational Multi-scale Environment model with Grid Adaptivity (OMEGA), which includes an embedded Atmospheric Dispersion Model (ADM), is introduced as a next-generation atmospheric simulation system for real-time hazard prediction, such as severe weather or the transport of hazardous releases. OMEGA is based on an unstructured grid that allows a continuously varying horizontal grid resolution ranging from 100 km down to 1 km and a vertical resolution ranging from 20-30 meters in the boundary layer to 1 km in the free atmosphere. OMEGA is also naturally scale spanning and time. In particular, the unstructured grid cells in the horizontal dimension can increase the local resolution to better capture the topography or important physical features of the atmospheric circulation and cloud dynamics. This means that OMEGA can readily adapt its grid to a stationary surface, terrain features, or dynamic features in an evolving weather pattern. While adaptive numerical techniques have yet to be extensively applied in atmospheric models, the OMEGA model is the first to exploit the adaptive nature of an unstructured gridding technique for atmospheric simulation and real-time hazard prediction. The purpose of this paper is to provide a detailed description of the OMEGA model and the OMEGA system, and a detailed comparison of OMEGA forecast results with observed data.

Numerical Simulation of Rotor-Fuselage Aerodynamic Interaction Using an Unstructured Overset Mesh Technique

  • Lee, Bum-Seok;Jung, Mun-Seung;Kwon, Oh-Joon;Kang, Hee-Jung
    • International Journal of Aeronautical and Space Sciences
    • /
    • v.11 no.1
    • /
    • pp.1-9
    • /
    • 2010
  • Numerical simulation of unsteady flows around helicopters was conducted to investigate the aerodynamic interaction between the main rotor and other components such as the fuselage and tail rotor. For this purpose, a three-dimensional inviscid flow solver was developed based on unstructured meshes. An overset mesh technique was used to describe the relative motion between the main rotor and the other components. As an application of the present method, calculations were made for the rotor-fuselage aerodynamic interaction of the ROBIN (ROtor Body INteraction) configuration and for a complete UH-60 helicopter configuration consisting of main rotor, fuselage, and tail rotor. The computational results were compared with measured time-averaged and instantaneous fuselage surface pressure distributions for the ROBIN configuration, and with thrust distribution and available experimental data for the UH-60 configuration. It is demonstrated that the present method is efficient and robust for the simulation of complete rotorcraft configurations.