• Title/Summary/Keyword: bigdata analysis

Analysis of the propensity of medical expenses for auto insurance patients by type of medical institution (의료기관 종류별 자동차보험 환자의 진료비 성향 분석)

  • Ha, Au-Hyun
    • Journal of Convergence for Information Technology / v.12 no.2 / pp.184-191 / 2022
  • This study aims to provide basic information needed to develop an efficient management plan for patients using auto insurance. The analysis used five years (2016 to 2020) of auto insurance medical expense review data registered in the Healthcare Bigdata Hub. For inpatient treatment, the largest share of expenses was treatment and surgery fees at certified tertiary hospitals; hospitalization fees at general hospitals, hospitals, and clinics; and treatment and surgery fees at oriental medical institutions and dental hospitals. For outpatient treatment, the largest share was doctor's fees at medical institutions, and treatment and surgery fees at oriental medical institutions and dental hospitals. The shares of medication, anesthesia, and special equipment significantly affected inpatient costs, and the share of physical therapy significantly affected outpatient costs.

Selecting Optimal Locations for Bicycle Lanes to Prevent Accidents in Seoul (서울특별시 자전거 안전사고 예방을 위한 자전거 도로 최적 입지 선정: 자전거 전용도로 및 전용차로를 중심으로)

  • Ji-eun Kim;Sumin Nam;ZoonKy Lee
    • The Journal of Bigdata / v.8 no.2 / pp.45-54 / 2023
  • Seoul's public bicycle system, 'Ttareungyi,' introduced in 2015, reached an annual ridership of 40 million in 2022. Similarly, electric scooters, a type of personal mobility device, surpassed one million riders in 2020 thanks to various sharing platforms. However, bicycle lanes, the main roadways for these new modes of transportation, are notably insufficient compared with the infrastructure for other forms of transport. Hence, this study proposes an optimal location selection method for bicycle lanes in Seoul to prevent accidents and enhance bicycle ride safety. The location selection process prioritizes road safety with respect to bicycle accident risk. Regression models identify high-risk areas for bicycle accidents, and cluster analysis groups these areas into six clusters, each suggesting suitable types of bicycle lanes based on cluster-specific characteristics. We hope that this study will contribute to the improvement of Seoul's transportation environment, including the expansion of dedicated bicycle lanes and lanes for personal mobility devices.

Design and Implementation of Efficient Storage and Retrieval Technology of Traffic Big Data (교통 빅데이터의 효율적 저장 및 검색 기술의 설계와 구현)

  • Kim, Ki-su;Yi, Jae-Jin;Kim, Hong-Hoi;Jang, Yo-lim;Hahm, Yu-Kun
    • The Journal of Bigdata / v.4 no.2 / pp.207-220 / 2019
  • Recent developments in information and communication technology have enabled the deployment of sensor-based data to provide real-time services. In Korea, the Korea Transportation Safety Authority collects driving information from all commercial vehicles through a fitted digital tachograph (DTG). The information gathered by the DTG can be used in various ways in the field of transportation. Notably, in autonomous driving, real-time analysis of this information can be used to prevent or respond to dangerous driving behavior. However, a traditional database system is limited in how much data it can process at a level suitable for real-time services. Because of this technical problem, processing large volumes of traffic big data for real-time analysis of commercial vehicle operation information has never been attempted in Korea. To solve this problem, this study optimized a new database server system and confirmed that a real-time service is possible. The constructed database system is expected to be used to secure the base data needed to establish digital twin and autonomous driving environments.

Abusive Detection Using Bidirectional Long Short-Term Memory Networks (양방향 장단기 메모리 신경망을 이용한 욕설 검출)

  • Na, In-Seop;Lee, Sin-Woo;Lee, Jae-Hak;Koh, Jin-Gwang
    • The Journal of Bigdata / v.4 no.2 / pp.35-45 / 2019
  • Recently, the damage and social cost of malicious comments have been increasing, as shown by news of celebrities committing suicide under the effects of malicious comments. The damage from malicious comments, including abusive language and slang, is increasing and spreading in various types and forms throughout society. In this paper, we propose a technique for detecting abusive language using a bidirectional long short-term memory neural network model. We collected comments from the web with a web crawler and applied stopword processing to remove unused tokens such as English letters and special characters. For the stopword-processed comments, the bidirectional long short-term memory neural network model, which considers the words before and after each word in a sentence, was used to detect abusive language. To use the model, the comments were subjected to morphological analysis and vectorization, and each word was labeled for abusive language. Experimental results showed a performance of 88.79% for a total of 9,288 comments screened and collected.
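
The filtering step mentioned in the abstract above (dropping English letters and special characters before the comments reach the network) might look roughly like the following. This is a minimal sketch, not the authors' code: the function name `clean_comment` and the exact character classes are assumptions, and the morphological analysis and the bidirectional LSTM itself are not shown.

```python
import re

# Assumed noise definition: ASCII letters, underscores, and any character
# that is neither a word character nor whitespace (punctuation, symbols).
# Hangul and digits count as word characters in Python's Unicode-aware \w.
_NOISE = re.compile(r"[A-Za-z_]|[^\w\s]")


def clean_comment(text):
    """Remove English letters and special characters, then collapse spaces."""
    without_noise = _NOISE.sub(" ", text)
    return " ".join(without_noise.split())
```

Replacing noise with a space rather than deleting it keeps neighboring Korean words from being fused together before tokenization.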

Grid-based Index Generation and k-nearest-neighbor Join Query-processing Algorithm using MapReduce (맵리듀스를 이용한 그리드 기반 인덱스 생성 및 k-NN 조인 질의 처리 알고리즘)

  • Jang, Miyoung;Chang, Jae Woo
    • Journal of KIISE / v.42 no.11 / pp.1303-1313 / 2015
  • MapReduce provides high levels of system scalability and fault tolerance for large-scale data processing. A MapReduce-based k-nearest-neighbor (k-NN) join algorithm seeks to produce the k nearest neighbors of each point of a dataset from another dataset. The algorithm is considered important in bigdata analysis. However, the existing k-NN join query-processing algorithm suffers from a high index-construction cost that makes it unsuitable for processing bigdata. To solve this problem, we propose a new grid-based k-NN join query-processing algorithm. Our algorithm retrieves only the data neighboring a query cell and sends them to each MapReduce task, which reduces the data transmission and computation overhead. Our performance analysis shows that our algorithm outperforms the existing scheme by up to seven-fold in terms of query-processing time, while also achieving a high level of query-result accuracy.
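
The idea of restricting a k-NN join to the cells around a query cell can be illustrated on a single machine. The sketch below is an assumption-laden illustration, not the paper's algorithm: it omits MapReduce entirely, works on 2-D points only, and the names `cell_of`, `build_grid`, `ring`, and `knn_join` are hypothetical. It expands square rings of cells around the query cell and stops once the k-th candidate is provably closer than any unvisited cell, so unlike an approximate scheme it returns exact neighbors.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Return the (column, row) index of the grid cell containing a 2-D point."""
    return (math.floor(point[0] / cell_size), math.floor(point[1] / cell_size))


def build_grid(points, cell_size):
    """Index points by grid cell; this is the whole 'index construction' step."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, cell_size)].append(p)
    return grid


def ring(cx, cy, r):
    """Yield the cells at Chebyshev distance exactly r from cell (cx, cy)."""
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            if max(abs(dx), abs(dy)) == r:
                yield (cx + dx, cy + dy)


def knn_join(r_points, s_points, k, cell_size):
    """For each point in r_points, find its k nearest points in s_points."""
    grid = build_grid(s_points, cell_size)
    k = min(k, len(s_points))
    result = []
    for q in r_points:
        cx, cy = cell_of(q, cell_size)
        candidates = []
        r = 0
        while k > 0:
            for cell in ring(cx, cy, r):
                for p in grid.get(cell, ()):
                    candidates.append((math.dist(q, p), p))
            candidates.sort()
            # Any point outside ring r lies more than r * cell_size away,
            # so the search can stop once the k-th candidate is that close.
            if len(candidates) >= k and candidates[k - 1][0] <= r * cell_size:
                break
            r += 1
        result.append((q, [p for _, p in candidates[:k]]))
    return result
```

In a distributed setting, each query cell and its surrounding rings would presumably be shipped to one task, which is where the savings in transmitted data would come from.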

The Study of Facebook Marketing Application Method: Facebook 'Likes' Feature and Predicting Demographic Information (페이스북 마케팅 활용 방안에 대한 연구: 페이스북 '좋아요' 기능과 인구통계학적 정보 추출)

  • Yu, Seong Jong;Ahn, Seun;Lee, Zoonky
    • The Journal of Bigdata / v.1 no.1 / pp.61-66 / 2016
  • With big data analysis, companies use customized marketing strategies based on customers' information. However, because of concerns about privacy and identity theft, people have started erasing their personal information or changing their privacy settings on social networking sites. Facebook, the most used social networking site, has a feature called 'Likes' that can be used as a tool to predict users' demographic profiles, such as sex and age range. To build an accurate analysis model for the study, the 'Likes' data were processed using Gaussian RBF and nFactors for dimensionality reduction. With random forest and 5-fold cross-validation, the results show accuracy rates of 75% for sex and 97.85% for age. From this study, we expect to provide a useful guideline for companies and marketers who struggle to collect customers' data.
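
The 5-fold cross-validation used to report the accuracy figures above can be sketched in plain Python. Everything here is illustrative: `k_fold_indices`, `cross_val_accuracy`, and `majority_class` are hypothetical names, and the majority-class baseline merely stands in for the random forest, which the abstract does not describe in detail. The sketch assumes at least `k` samples so that no fold is empty.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(features, labels, train_fn, k=5, seed=0):
    """Average held-out accuracy over k folds.

    train_fn(train_x, train_y) must return a predict(x) callable.
    """
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        predict = train_fn([features[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        correct = sum(predict(features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


def majority_class(train_x, train_y):
    """Placeholder model: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common
```

Each sample is held out exactly once, so the averaged score reflects performance on data the model did not see during training.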

Comparison of Online Shopping Mall BEST 100 using Exploratory Data Analysis (탐색적 자료 분석(EDA) 기법을 활용한 국내 11개 대표 온라인 쇼핑몰 BEST 100 비교)

  • Kang, Jicheon;Kang, Juyoung
    • The Journal of Bigdata / v.3 no.1 / pp.1-12 / 2018
  • Since the first online shopping malls appeared, BEST 100 lists have been provided as a core feature of shopping mall websites. BEST 100 is important because consumers can identify popular products at a glance. However, existing studies use only sales outcome indicators, and prior studies using BEST 100 are scarce. Therefore, this study selected 11 online shopping malls and compared their main characteristics. As a research method, exploratory data analysis (EDA) was applied to data crawled from the BEST 100 components of each shopping mall website, such as product name, price, and free-shipping status. As a result, the overall average price across the 11 shopping malls was 72,891.41 won. Sales texts were classified into 8 categories by text mining. The most common category was fashion; notably, the categories were derived from the marketing text rather than from product attributes. This study has implications for understanding the current online market flow and suggesting future directions by using EDA.

A Study on Subscriber's Preference Factors through Korea, United States and Japan Webtoon Data Analysis : With Naver Webtoon (한, 미, 일 웹툰 분석을 통한 구독자 선호 요인 탐색 : 네이버 웹툰을 중심으로)

  • Do, Sang-Beum;Kang, Juyoung
    • The Journal of Bigdata / v.3 no.1 / pp.21-32 / 2018
  • The webtoon industry is currently a promising, high-potential market because of its high growth trend. The greatest advantage of webtoons is that they can provide appropriate services to customers with various needs. Because of this feature, the webtoon industry is expanding throughout the world. This situation may give authors and webtoon service corporations a great chance to export webtoon content. It could also be an opportunity for webtoons to become new "Korean Wave" content. To enter these markets successfully, a close analysis of customers in the target export countries is needed. In this research, we collected data from Naver Webtoon and analyzed the features of webtoons and webtoon subscribers by country. With this research, it should be possible to identify specific methods and variables that affect the preferences of webtoon subscribers.

Emotion Analysis Using a Bidirectional LSTM for Word Sense Disambiguation (양방향 LSTM을 적용한 단어의미 중의성 해소 감정분석)

  • Ki, Ho-Yeon;Shin, Kyung-shik
    • The Journal of Bigdata / v.5 no.1 / pp.197-208 / 2020
  • Lexical ambiguity means that a word can be interpreted as having two or more meanings, as with homonymy and polysemy, and there are many cases of word sense ambiguity in words expressing emotions. Because they project human psychology, these words convey specific and rich contexts, resulting in lexical ambiguity. In this study, we propose an emotion classification model that disambiguates word sense using a bidirectional LSTM. It is based on the assumption that if the information in the surrounding context is fully reflected, the problem of lexical ambiguity can be solved and the emotion that the sentence intends to express can be identified as a single one. Bidirectional LSTM is an algorithm frequently used in natural language processing research that requires contextual information, and it is used in this study to learn context. GloVe embedding is used as the embedding layer of the research model, and the performance of the model was verified by comparison with models using LSTM and RNN algorithms. Such a framework could contribute to various fields, including marketing, where it could connect the emotions of SNS users to their desire for consumption.

Comparative research on urban image assets of Iksan by analysing bigdata (빅데이터 분석을 통한 익산의 도시 이미지 자산 비교 연구)

  • Yang, Ji-Yu
    • Journal of Digital Contents Society / v.19 no.2 / pp.385-392 / 2018
  • Iksan is a medium-sized city in Jeollabuk-do, South Korea. It has a favorable natural environment, with potential for specialization in natural industries and development projects. In addition, it has various historical and cultural resources, including Mireuksaji, and the opening of the KTX Honam line is a representative feature of Iksan as a transport city. However, it faces weak connections with neighboring cities and large-scale development in neighboring areas, especially Jeonju and Gunsan. In this paper, we classify the urban image assets of Iksan using the keywords 'Iksan Station' and 'KTX' and analyze its potential as a center of transportation and logistics through big data analysis of data extracted from SNS and websites. By comparison with Gwangju Songjeong, a KTX Honam line station that has developed with similar regional characteristics, the study aims to establish a basis for improving and establishing the urban image of Iksan in the future.