• Title/Summary/Keyword: bigdata analysis

Suggestions on how to convert official documents to Machine Readable (공문서의 기계가독형(Machine Readable) 전환 방법 제언)

  • Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.67 / pp.99-138 / 2021
  • In the era of big data, analyzing unstructured data as well as structured data is emerging as an important task. Official documents produced by government agencies are large, text-based unstructured data and are therefore also candidates for big data analysis. From the perspectives of internal work efficiency, knowledge management, records management, and the like, public documents need to be analyzed to derive useful implications. However, because many of the public documents currently held by public institutions are not in an open format, big data analysis first requires a preprocessing step that extracts text from the bitstream. In addition, because contextual metadata is not sufficiently stored in the document file, separate effort is needed to secure metadata for high-quality analysis. In conclusion, current official documents have a low level of machine readability, which makes big data analysis expensive.

COVID-19 Discourse and Social Welfare Intervention through Online News Big Data: Focusing on the Elderly Living Alone (온라인 뉴스 빅데이터를 통한 코로나 19 담론과 사회복지 개입방안: 독거노인을 중심으로)

  • Yeo, Jiyoung
    • 한국노년학 / v.41 no.3 / pp.353-371 / 2021
  • The purpose of this study is to inform social welfare policy making by revealing, from big data, the discourse on social intervention and response concerning elderly people living alone during the COVID-19 situation. Keyword analysis, network analysis, and topic analysis were used to explore how news media have portrayed the challenges facing older individuals and how the central and local governments and private organizations have responded to them. The results are as follows. First, networks (degree, closeness, and betweenness) formed around region, delivery, society, support, and vulnerability, suggesting an increased demand for economic assistance and social support as well as for stronger service delivery systems. Second, the key topics derived included "Establishing public delivery systems", "Establishing local networks", "Managing care gap", "Establishing a private economic support system", and "Establishing service organization system". Based on these results, the study presents a discourse on the organic roles of government, communities, and the private sector, and draws policy and practical implications by proposing how to intervene for elderly people living alone in disaster situations such as COVID-19.
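
The network measures named in this abstract can be illustrated with a minimal sketch. It assumes a keyword co-occurrence network in which two keywords are linked whenever they appear in the same article; the abstract does not say how the network was built, and the article keyword lists below are hypothetical. Only degree and closeness centrality are shown; betweenness is omitted for brevity.

```python
from collections import deque
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Link two keywords whenever they appear in the same document."""
    graph = {}
    for keywords in documents:
        unique = sorted(set(keywords))
        for kw in unique:
            graph.setdefault(kw, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path lengths (BFS)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical keyword sets, one list per news article.
articles = [
    ["support", "region", "delivery"],
    ["support", "vulnerability"],
    ["support", "society", "region"],
]
graph = build_cooccurrence_graph(articles)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
```

In this toy network "support" is linked to every other keyword, so it scores highest on both measures, which is the sense in which a network "forms around" a keyword.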

A study on the effect of SME IT resource on performance (중소기업의 IT자원이 업무성과에 미치는 영향에 관한 연구)

  • Jin, Jeongsuk;Park, Jooseok;Park, Jaehong
    • The Journal of Bigdata / v.4 no.2 / pp.141-158 / 2019
  • Based on the resource-based view (RBV), this study classifies the IT of SMEs into IT resources and IT capabilities and confirms that both affect performance. Using a questionnaire administered to SMEs and IT professionals, it separates capabilities from the overall IT resources that SMEs possess. Of the four attributes presented by Barney (1991) (value, rarity, non-substitutability, and imperfect imitability), the study focuses on value and imperfect imitability and investigates how SMEs perceive IT resources and capabilities. It then tests how IT resources and capabilities influence corporate performance. First, the results show that resources requiring a knowledge base are classified as IT capabilities and the rest as IT resources. Servers, databases (DB), system administrators, programmers, the CIO, and BA were capabilities, whereas desktop PCs, PC software, salary and accounting management software, e-commerce, the homepage, and the network inside the enterprise were resources. Second, both IT resources and IT capabilities affected company performance (employee satisfaction and CEO satisfaction). In conclusion, a resource can be either an IT resource or an IT capability depending on how it is used, and both influence corporate performance. Therefore, when considering IT investment, a company can purchase the necessary IT resources and actively use them so that they become IT capabilities, which in turn can influence corporate performance.

Designing the Optimal Urban Distribution Network using GIS : Case of Milk Industry in Ulaanbaatar Mongolia (GIS를 이용한 최적 도심 유통 네트워크 설계 : 몽골 울란바타르 내 우유 산업 사례)

  • Enkhtuya, Daariimaa;Shin, KwangSup
    • The Journal of Bigdata / v.4 no.2 / pp.159-173 / 2019
  • Last-mile delivery optimization plays a key role in urban supply chain operation, because the last mile is the most expensive, time-consuming, and complicated part of the whole delivery process. The urban consolidation center (UCC) is regarded as a significant asset for meeting customer demand in last-mile delivery service. The key benefit of a UCC is that it improves the load balance of vehicles and reduces the total travelling distance by finding better routes through well-organized multi-leg vehicle journeys in the urban area. This paper presents a model that combines multiple-scenario analysis with mathematical optimization techniques using a Geographic Information System (GIS). The model aims to find the best design for a distribution network consisting of a DC and UCCs, and is applied to the case of Ulaanbaatar, Mongolia. The proposed methodology integrates two sub-models, a location-allocation model and a vehicle routing problem. The scenarios, devised by selecting different UCC locations, are compared in terms of both general performance and delivery patterns. Quantitative metrics such as the economic value of capital cost, operating cost, and the balance of available resource usage are adopted to support better decisions. The results may help managers or public authorities who need to design a distribution network that optimizes last-mile delivery service using UCCs within an urban area.
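
The scenario comparison described in this abstract can be illustrated with a minimal sketch of the allocation step. It assumes straight-line distances and a nearest-site assignment rule, both simplifications; the paper works with GIS data and a fuller location-allocation model followed by vehicle routing. The coordinates, scenario names, and the `evaluate_scenario` helper are hypothetical.

```python
import math


def evaluate_scenario(ucc_sites, customers):
    """Assign each customer to its nearest UCC; return total distance and load per site."""
    load = {site: 0 for site in ucc_sites}
    total = 0.0
    for customer in customers:
        site = min(ucc_sites, key=lambda s: math.dist(s, customer))
        load[site] += 1
        total += math.dist(site, customer)
    return total, load


# Hypothetical customer locations and two candidate sets of UCC sites.
customers = [(0.0, 1.0), (1.0, 0.0), (4.0, 5.0), (5.0, 4.0)]
scenarios = {
    "A": [(0.0, 0.0), (5.0, 5.0)],
    "B": [(0.0, 0.0), (1.0, 1.0)],
}

results = {name: evaluate_scenario(sites, customers) for name, sites in scenarios.items()}
# Pick the scenario with the smallest total customer-to-UCC distance.
best = min(results, key=lambda name: results[name][0])
```

The per-site load returned alongside the distance gives a rough view of resource balance, one of the metrics the abstract mentions; cost terms would need real capital and operating figures.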

Social Factors Affecting Internet Searches on Cyber Bullying in Korea and America Using Social Big Data and Google Search Trends (소셜 빅데이터와 Google 검색트렌드를 활용한 한국과 미국의 사이버불링 검색에 영향을 미치는 요인 분석)

  • Song, Tae-Min;Song, Juyoung;Cheon, Mi-Kyung
    • The Journal of Bigdata / v.1 no.1 / pp.67-75 / 2016
  • The study analyzed big data extracted from Google and social media to identify factors related to searches on cyber bullying in Korea and America. The analysis of cyber bullying in Korea used social big data collected from online news sites, blogs, cafés, social network services, and message boards between January 1, 2011 and March 31, 2013. Google search trends for the search words stress, exercise, drinking, and cyber bullying were obtained for the period from January 1, 2004 to December 22, 2013. The main results were as follows. First, stress was a more significant factor in cyber bullying in Korea than in America. Second, a positive relationship was found between stress and drinking, exercise, and cyber bullying in both Korea and America. Third, significant differences between Korea and America were found on all paths. The study shows that both adults and teenagers are affected in Korea. Because actual exposure to cyber bullying is related to psychological and behavioral characteristics, an online application is needed that can intervene in real time when cyber bullying behavior is predicted.

Comparative Analysis for Clustering Based Optimal Vehicle Routes Planning (클러스터링 기반의 최적 차량 운행 계획 수립을 위한 비교연구)

  • Kim, Jae-Won;Shin, KwangSup
    • The Journal of Bigdata / v.5 no.1 / pp.155-180 / 2020
  • Assigning vehicles and designing an optimal route for each vehicle is the most important problem in raising the logistics service level. Solving it requires considering several cost factors at the same time, such as the number of vehicles, vehicle capacity, and total travelling distance. Although most logistics service providers have introduced a Transportation Management System (TMS), such systems cannot take practical constraints into account, so experts must revise the TMS solution based on their own experience and intuition before it can be applied. Unlike previous research, which focused on minimizing total cost, this research proposes a methodology that enhances the efficiency and the fairness of asset utilization simultaneously. It adopts the Cluster-First Route-Second (CFRS) approach. Customers are first grouped into clusters by location using four clustering algorithms (K-Means, K-Medoids, DBSCAN, and model-based clustering) and a procedural approach, the Fisher & Jaikumar algorithm. Optimal vehicle routes are then developed within each cluster. Numerical experiments indicate that the proposed CFRS-based approach may give better performance in terms of total travelling time and distance. In addition, judging by the variance in travelling distance and in the number of customers visited across vehicles, the proposed approach also assigns tasks more fairly.
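
The Cluster-First Route-Second idea in this abstract can be illustrated with a minimal sketch. It assumes plain K-Means seeded with the first k points for the clustering step and a nearest-neighbour heuristic for the routing step; the heuristic is a stand-in, not the optimal routing the paper develops, and the depot and customer coordinates are hypothetical.

```python
import math
import statistics


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment


def nearest_neighbour_route(depot, stops):
    """Greedy route: always drive to the closest unvisited stop."""
    route, current, remaining = [], depot, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda s: math.dist(current, s))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route


def route_length(depot, route):
    """Length of the closed tour depot -> stops -> depot."""
    path = [depot, *route, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


# Hypothetical depot and customer locations.
depot = (0.0, 0.0)
customers = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]

labels = kmeans(customers, k=2)  # cluster first
routes = {
    c: nearest_neighbour_route(depot, [p for p, a in zip(customers, labels) if a == c])
    for c in sorted(set(labels))
}  # route second, one vehicle per cluster
lengths = [route_length(depot, r) for r in routes.values()]
spread = statistics.pvariance(lengths)  # lower variance = more even workloads
```

The variance of route lengths is one simple way to express the fairness criterion the abstract describes; the number of customers per route could be compared in the same way.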

Classification Modeling for Predicting Medical Subjects using Patients' Subjective Symptom Text (환자의 주관적 증상 텍스트에 대한 진료과목 분류 모델 구축)

  • Lee, Seohee;Kang, Juyoung
    • The Journal of Bigdata / v.6 no.1 / pp.51-62 / 2021
  • In medical artificial intelligence, there has been much research on disease prediction and classification algorithms that support doctors' judgment, but relatively little interest in artificial intelligence that helps medical consumers acquire and assess information. The fact that more than 150,000 questions about which hospital to visit were asked on the NAVER portal over the past year attests to the need for medical information suited to consumers. This study therefore set out to build a model that classifies symptom text written directly by patients, collected from the NAVER portal, into 8 medical subjects, to help consumers choose the appropriate medical subject for their symptoms. To ensure the validity of data based on patients' subjective descriptions, we measured the similarity between objective symptom text (typical symptoms by medical subject, organized by the Seoul Emergency Medical Information Center) and subjective symptom text (NAVER data). The measurements showed that two texts describing symptoms of the same medical subject had relatively higher similarity than texts from different medical subjects. The classification model was then built by applying ridge regression to the validated subjective symptom text, yielding an accuracy of 0.73.
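
The similarity check described in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity over whitespace-tokenized word counts; the abstract does not name the similarity measure or the tokenizer, and the reference symptom texts and the patient complaint below are hypothetical. The ridge regression classifier is not shown.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical "objective" symptom descriptions per medical subject.
reference = {
    "dermatology": "itchy skin rash redness",
    "orthopedics": "knee pain joint swelling",
}
# Hypothetical "subjective" text written by a patient.
complaint = "my knee has pain and swelling"

scores = {subject: cosine_similarity(complaint, text) for subject, text in reference.items()}
best = max(scores, key=scores.get)
```

If subjective texts are valid evidence for a subject, they should score higher against the reference text of the same subject than against other subjects, which is the pattern the abstract reports.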

A Study on the Forecasting Trend of Apartment Prices: Focusing on Government Policy, Economy, Supply and Demand Characteristics (아파트 매매가 추이 예측에 관한 연구: 정부 정책, 경제, 수요·공급 속성을 중심으로)

  • Lee, Jung-Mok;Choi, Su An;Yu, Su-Han;Kim, Seonghun;Kim, Tae-Jun;Yu, Jong-Pil
    • The Journal of Bigdata / v.6 no.1 / pp.91-113 / 2021
  • Despite the weight of real estate in the Korean asset market, market trends are hard to predict, and apartments are especially hard to predict because they are both residential spaces and investment assets. The factors affecting apartment prices vary, and regional characteristics must also be considered. This study compares the factors that affect apartment prices in Seoul as a whole, in the three Gangnam districts, and in the Nowon, Dobong, Gangbuk, Geumcheon, Gwanak, and Guro districts, and assesses whether prices can be predicted on that basis. The analysis used machine learning algorithms such as neural networks, CHAID, linear regression, and random forests. The most important factor affecting the average selling price of all apartments in Seoul was government policy, and easing policies, such as relaxed transaction and financial regulations, were highly influential. In the three Gangnam districts, the influence of policy was low, and in Gangnam-gu, housing supply was the most important factor. By contrast, in the six mid-to-lower-tier districts, government policies acted as important variables, and the districts were commonly influenced by financial regulatory policies.

Building a Big Data-based Car Camping Website and Proposing a Business Models for the Corona19 Untact Trip (코로나19 언택트 여행을 위한 차박 캠핑 웹사이트 구축 및 비즈니스 모델 제안)

  • Kim, Minjeong;Kim, Soohyun;Oh, Jihye;Eom, Jiyoon;Kang, Juyoung
    • The Journal of Bigdata / v.6 no.1 / pp.179-196 / 2021
  • With the spread of untact culture resulting from the Covid-19 pandemic, the size of the car camping market has expanded to minimize contact with others. As a result, SUVs have exceeded sales of sedans, and sales of recreational vehicles (RVs) have increased by 101% compared to the same period last year. Despite the explosive increase in demand for car camping, research on car camping has not matched this increase. Therefore, in this study, we intended to conduct a study focused on car camping users. According to a survey of Naver's famous car camping cafe, it was difficult to find articles, maps, and websites with car camping places. Analysis of car camping websites showed that most only post information about the camping itself, so details of car camping places were not available. Furthermore, according to a survey derived from related prior studies and literature surveys, most users urged solutions to the problem of unauthorized garbage dumping in the car camping locations. In addition, car camping users wanted to receive information on amenities near the car camping places. Therefore, we aimed to establish a car camping website that provides basic information on car camping places and nearby convenience facilities. Moreover, to solve the problem of garbage dumping, we provided a category wherein users can post pictures of clean camping campaigns. We also developed a business model utilizing the certification process of clean camping. The business model is designed with a structure wherein car camping users are rewarded through the clean camping certification process. Compensation for clean camping certification was proposed to be provided through partnerships with domestic automakers, Korea Tourism Organization, and Small Business Market Promotion Agency.

A Study on the Methodology of Early Diagnosis of Dementia Based on AI (Artificial Intelligence) (인공지능(AI) 기반 치매 조기진단 방법론에 관한 연구)

  • Oh, Sung Hoon;Jeon, Young Jun;Kwon, Young Woo;Jeong, Seok Chan
    • The Journal of Bigdata / v.6 no.1 / pp.37-49 / 2021
  • The number of dementia patients in Korea is estimated at over 800,000, and the severity of dementia is becoming a social problem. However, no treatment or drug that cures dementia has yet been developed anywhere in the world, and the number of patients is expected to increase further as the population ages rapidly. At present, detecting dementia early and delaying the progression of its symptoms is the best alternative. This study presents a methodology for the early diagnosis of dementia based on measuring and analyzing amyloid plaques. Measured in the retina through AI-based image analysis, this protein allows dementia to be diagnosed most clearly and early. We performed CNN-based binary and multi-class classification learning on retina data, and developed a deep learning algorithm that can diagnose dementia early from pre-processed retinal data. The accuracy and recall of the deep learning model were verified, and the results satisfied both measures. In the future, we plan to continue the study with clinical data from actual dementia patients, and the results are expected to contribute to solving the dementia problem.
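
The two evaluation measures named in this abstract can be illustrated with a minimal sketch for the binary case. The labels below are hypothetical (1 = dementia, 0 = no dementia) and are not results from the study.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Accuracy = correct / all; recall = true positives / actual positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical ground truth and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
accuracy, recall = accuracy_and_recall(y_true, y_pred)
```

Reporting recall alongside accuracy matters for screening tasks, because a model can reach high accuracy on imbalanced data while still missing many true cases.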