• Title/Summary/Keyword: Public Data Analysis

Suggestions on how to convert official documents to Machine Readable (공문서의 기계가독형(Machine Readable) 전환 방법 제언)

  • Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.67 / pp.99-138 / 2021
  • In the era of big data, analyzing unstructured as well as structured data is emerging as an important task. Official documents produced by government agencies, as large bodies of text-based unstructured data, are also subject to big data analysis. From the perspectives of internal work efficiency, knowledge management, and records management, public documents need to be analyzed as big data to derive useful implications. However, since many of the public documents currently held by public institutions are not in an open format, big data analysis requires a pre-processing step that extracts text from a bitstream. In addition, since contextual metadata is not sufficiently stored in the document file, separate efforts to secure metadata are required for high-quality analysis. In conclusion, current official documents have a low level of machine readability, which makes big data analysis expensive.

Sentiment analysis of nuclear energy-related articles and their comments on a portal site in Rep. of Korea in 2010-2019

  • Jeong, So Yun;Kim, Jae Wook;Kim, Young Seo;Joo, Han Young;Moon, Joo Hyun
    • Nuclear Engineering and Technology / v.53 no.3 / pp.1013-1019 / 2021
  • This paper reviewed temporal changes in public opinion on nuclear energy in Korea through a big data analysis of nuclear energy-related articles and their comments posted on the portal site NAVER. All articles that included at least one of "nuclear energy," "nuclear power plant (NPP)," "nuclear power phase-out," or "anti-nuclear" in their titles or main text were extracted from those posted on NAVER from January 2010 to December 2019. First, we performed an annual word frequency analysis to identify which words appeared most frequently in the articles. For that period, the most frequent words were "NPP," "nuclear energy," and "energy." In addition, "safety" has remained in the upper ranks since the Fukushima NPP accident. Then, we performed sentiment analysis of the pre-processed articles. The sentiment analysis showed that positive-tone articles were reported more frequently than negative-tone articles over the entire analysis period. Last, we performed sentiment analysis of the comments on the articles to examine the public's intention regarding nuclear issues. The analysis showed that the number of negative comments on articles each month, irrespective of the articles' positive or negative tone, was always larger than the number of positive comments over the entire analysis period.
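
The comment-level step described above can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not say which sentiment dictionary or classifier was used, so the `POSITIVE` and `NEGATIVE` word sets, the `tone` and `monthly_tone_counts` helpers, and the sample comments are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon; the abstract does not name the dictionary actually used.
POSITIVE = {"safe", "clean", "reliable", "support"}
NEGATIVE = {"danger", "accident", "risk", "oppose"}


def tone(text):
    """Label a text 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def monthly_tone_counts(comments):
    """Count comment tones per month; comments are (month, text) pairs."""
    counts = {}
    for month, text in comments:
        counts.setdefault(month, Counter())[tone(text)] += 1
    return counts


# Made-up sample comments, for illustration only.
sample = [
    ("2011-03", "accident shows the risk"),
    ("2011-03", "i oppose this danger"),
    ("2011-03", "nuclear is clean and reliable"),
]
result = monthly_tone_counts(sample)
```

Comparing `result[month]["negative"]` with `result[month]["positive"]` for each month gives the kind of monthly comparison the abstract reports.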

A Case Study of Producing Infographics Using Tableau Public (Tableau Public을 이용한 인포그래픽 제작 사례연구)

  • Kim, Dong Hwan
    • Spatial Information Research / v.23 no.2 / pp.21-29 / 2015
  • Recently, as the volume of data has grown, many media outlets and organizations have focused on big data, data visualization, information visualization, and infographics. Domestically, Chosun.com and Hankyoreh online have advanced in the data visualization field, and internationally the Guardian, the Wall Street Journal, and the New York Times lead in that area. Until now, infographics have largely been regarded in Korea as a design-oriented product. However, Tableau Public, a notable data visualization program, can visualize data more efficiently. In this paper, a Data Visualization Methods Quadrant for Policy Making is defined, and data analysis and infographic production are carried out. World Bank open data were used, and two results were extracted from the number of passenger cars per 1,000 people. First, in the high-income group, the higher the GNI per capita, the smaller the Slope, whereas in the middle-income group a higher GNI per capita has a positive effect on the Slope. Second, during the global financial crisis, the car ownership rate was about 1.7 times that of the usual state of the global economy. Through the case study, this paper suggests that the direction of infographic production should shift from design-oriented to data-oriented. Moreover, data-oriented infographics should be promoted as a means of scientific research and policy making.

Big-Data Integration in Public Institutions for Supporting Start-up Businesses (창업지원을 위한 공공기관 빅데이터 통합)

  • Shin, Seong-Yoon;Kim, Do-Goan
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.6 / pp.1341-1346 / 2015
  • Nowadays, many small businesses experience failure or hardship. In this regard, specific and integrated information for start-up businesses is needed to lower the failure rate and raise the success rate. This study suggests integrating the various data that public institutions hold separately. For this purpose, it classifies the data types used in constructing big data for start-up businesses and suggests a data integration approach, an analysis method, and web or information system services for supporting start-up businesses.

Suggestions of Big-Data Integration in Public Institutions for Supporting Start-up Businesses (창업지원을 위한 공공기관 빅데이터 통합 제언)

  • Kim, Do-Goan;Jin, Chan-Yong;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.204-206 / 2015
  • Nowadays, many small businesses experience failure or hardship. In this regard, specific and integrated information for start-up businesses is needed to lower the failure rate and raise the success rate. This study suggests integrating the various data that public institutions hold separately into big data. For this purpose, it classifies the data types used in constructing big data for start-up businesses and suggests an analysis model for an information system supporting start-up businesses.

An Analysis on Management Efficiency of The Regional Public Hospitals Using D.E.A (DEA를 이용한 지방의료원 경영효율성 분석)

  • Kim, Young-Jong;Kim, Kwang-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.8 / pp.512-520 / 2020
  • This study analyzed the relative operational efficiency of regional public hospitals and the factors that influence it, in order to present benchmarking points for improving inefficient regional public hospitals. Internal resource and management performance data were collected from 34 regional public hospitals in Korea for the five years from 2014 to 2018. The final sample comprised 33 regional public hospitals, excluding Jinan Regional Public Hospital, which opened in 2015, in the middle of the survey period. General characteristics and input/output variables were analyzed with frequency analysis and descriptive statistics, and Data Envelopment Analysis (DEA) was performed to measure the operational efficiency index and make relative comparisons. According to the study, there were 11 efficient hospitals (33.3%) and 22 inefficient hospitals (66.7%). Of the 22 inefficient hospitals, 13 showed Increasing Returns to Scale (IRS) and required scale expansion, and nine showed Decreasing Returns to Scale (DRS) and required scale reduction or rebalancing. The significance of this study is that its analysis of relative efficiency and influencing factors presents specific alternatives or directions that could help regional public hospitals improve the efficiency of their growth, manage sustainably, and expand their public role.

An Analysis on Media Trends in Public Agency for Social Service Applying Text Mining (텍스트 마이닝을 적용한 사회서비스원 언론보도기사 분석)

  • Park, Hae-Keung;Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence / v.8 no.2 / pp.41-48 / 2022
  • This study empirically explored which issues, that is, which social perceptions, have formed around the public social service agency (hereafter SSA), using mass media coverage of the SSA. The study is meaningful in that it identifies the overall social perception of, and trends in, the SSA through public opinion. To extract media trend data, the big data analysis system Textom was used to collect data from the representative portals Naver News and Daum News. A total of 2,709 texts were collected: 1,299 from 2020 and 1,410 from 2021. As a result of the analysis, first, the most frequent words in the texts were 'SSA', 'establishment', and 'operation'. Second, the N-gram analysis yielded word pairs directly related to the SSA, such as 'SSA and public', 'SSA and opening', 'SSA and launch', 'SSA and Department Director', 'SSA and Staff', and 'SSA and Caregiver'. Third, the TF-IDF analysis and word network analysis, similar to the word frequency and N-gram results, yielded 'establishment', 'operation', 'public', 'launch', 'provided', 'opened', 'Holding', and 'Care'. Based on these results, the study suggested strengthening the emergency care support group, commercializing it in detail, and stabilizing jobs.
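
The three text-mining measures named above (word frequency, N-grams, TF-IDF) can be sketched in a few lines. This is only an illustration under stated assumptions: the `docs` list is made-up tokenized text rather than the 2,709 collected articles, `tf_idf` is a hypothetical helper using the plain textbook weighting, and Textom's own weighting may differ.

```python
import math
from collections import Counter

# Made-up tokenized headlines; the study's collected texts are not reproduced here.
docs = [
    ["ssa", "establishment", "public", "care"],
    ["ssa", "operation", "public"],
    ["ssa", "launch", "care", "staff"],
]

# 1) Word frequency over the whole corpus.
freq = Counter(word for doc in docs for word in doc)

# 2) Bigram counts (N-gram with N = 2): adjacent word pairs within each document.
bigrams = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))


def tf_idf(term, doc, corpus):
    """Plain TF-IDF: relative term frequency times log(N / document frequency)."""
    tf = doc.count(term) / len(doc)
    df = sum(term in d for d in corpus)
    return tf * math.log(len(corpus) / df) if df else 0.0
```

With this weighting, a term that occurs in every document (here `"ssa"`) scores zero, which is why TF-IDF tends to surface more distinctive words than raw frequency does.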

Evaluation of Workload and Full-Time Equivalents in Kindergarten Dietitians through Job Analysis by Kindergarten Establishment Type (직무분석을 통한 유치원 설립유형별 영양(교)사의 과업량 및 적정인력 추정)

  • Shin, Yulee;Kyung, Minsook;Ham, Sunny
    • Journal of the Korean Dietetic Association / v.28 no.1 / pp.1-18 / 2022
  • This study was conducted to estimate the appropriate workforce of dietitians by type of kindergarten through a perception survey and a job analysis of kindergartens. Dietitians' work was classified into 6 duties, 28 tasks, and 94 task elements. Statistical analysis was performed using the Statistical Package for the Social Sciences (SPSS) (ver. 25.0). The time spent on the 6 duties was as follows: 'Nutrition management' (public attached 666.24 hours/year, public independent 843.04 hours/year), 'Foodservice management Practices' (public attached 1,472.52 hours/year, public independent 1,298.11 hours/year), 'Hygiene management of kindergarten foodservice' (public attached 611.78 hours/year, public independent 607.18 hours/year), 'Nutrition-diet education and counseling' (public attached 340.53 hours/year, public independent 253.42 hours/year), 'Managing snack during semesters and lunch/snacks during breaks' (public independent 309.04 hours/year), and 'Professionalism enhancement' (public attached 88.86 hours/year, public independent 65.17 hours/year). Total working hours for dietitians were 3,179.94 hours/year (public attached) and 3,375.97 hours/year (public independent). Appropriate full-time equivalents (FTEs) were derived by dividing the total working hours/year by time/day × 5 days/week × 52 weeks/year. The analysis showed that the FTEs were 1.53 for public attached kindergartens and 1.62 for public independent kindergartens, and the total FTEs were 1.55. This is the first study to analyze the workload of kindergarten dietitians and the appropriate workforce by kindergarten establishment type. It is expected to serve as a valuable policy basis for measures to operate kindergarten dietitian staffing efficiently.
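
The FTE arithmetic can be checked with a short sketch. One assumption is needed: the abstract gives only "time/day", so an 8-hour working day (`HOURS_PER_DAY`) is assumed here because it reproduces the reported 1.53 and 1.62. The function name `full_time_equivalents` is hypothetical, and the sketch does not attempt the total figure of 1.55, whose weighting the abstract does not describe.

```python
# Assumption: an 8-hour working day; the abstract states only "time/day".
HOURS_PER_DAY = 8
ANNUAL_HOURS_PER_FTE = HOURS_PER_DAY * 5 * 52  # 5 days/week, 52 weeks/year = 2,080

# Total working hours per year as reported in the abstract.
total_hours = {
    "public attached": 3179.94,
    "public independent": 3375.97,
}


def full_time_equivalents(hours_per_year):
    """Convert annual working hours into FTEs, rounded to two decimals."""
    return round(hours_per_year / ANNUAL_HOURS_PER_FTE, 2)


ftes = {kind: full_time_equivalents(h) for kind, h in total_hours.items()}
```

Under the 8-hour assumption, `ftes` matches the two per-type values reported in the abstract.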

Issue tracking and voting rate prediction for 19th Korean president election candidates (댓글 분석을 통한 19대 한국 대선 후보 이슈 파악 및 득표율 예측)

  • Seo, Dae-Ho;Kim, Ji-Ho;Kim, Chang-Ki
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.199-219 / 2018
  • With the everyday use of the Internet and the spread of various smart devices, users have become able to communicate in real time, and existing communication styles have changed. As the Internet changed who produces information, data grew massive, giving rise to the very large body of information called big data. Big data is seen as a new opportunity to understand social issues. In particular, text mining explores patterns in unstructured text data to find meaningful information. Since text data exists in various places such as newspapers, books, and the web, it is diverse and large in volume, which makes it suitable for understanding social reality. In recent years, there have been a growing number of attempts to analyze text from the web, such as SNS and blogs, where the public can communicate freely. Text mining is recognized as a useful way to grasp public opinion immediately, so it can be used for research on political, social, and cultural issues. It has received much attention as a way to investigate candidates' public reputation and to predict the voting rate in place of polling, because many people question the credibility of surveys and people tend to refuse to respond or to hide their real intentions when polled. This study collected comments from the largest Internet portal site in Korea and conducted research on the 19th Korean presidential election in 2017. We collected 226,447 comments from April 29, 2017 to May 7, 2017, a span that includes the period just before election day during which public opinion polls are prohibited. We analyzed frequencies, associative emotional words, topic emotions, and candidate voting rates. Through frequency analysis, we identified the words that represented the most important issues each day. In particular, following the presidential debates, the candidate who had become an issue appeared at the top of the frequency analysis. Through the analysis of associative emotional words, we were able to identify the issues most relevant to each candidate. Topic emotion analysis was used to identify each candidate's topics and the public's emotions about those topics. Finally, we estimated the voting rate by combining the volume of comments and the sentiment score. In this way, we explored the issues for each candidate and predicted the voting rate. The analysis showed that news comments are an effective tool for tracking issues around presidential candidates and for predicting the voting rate. In particular, this study presented daily issues and a quantitative index of sentiment. It also predicted the voting rate for each candidate and precisely matched the ranking of the top five candidates. Each candidate will be able to grasp public opinion objectively and reflect it in election strategy. Candidates can use positive issues more actively in their election strategies and try to correct negative issues. In particular, candidates should be aware that they can suffer severe damage to their reputation if they face a moral problem. Voters can look objectively at the issues and public opinion about each candidate and make more informed decisions when voting. If they refer to the results of this study before voting, they will be able to see public opinion from big data and vote for a candidate from a more objective perspective. If candidates campaign with reference to big data analysis, the public will be more active on the web, recognizing that their wants are being reflected. Political views can be expressed in various places on the web, which can contribute to political participation by the people.
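
The abstract says the voting rate was estimated by combining comment volume and sentiment score but does not give the formula. The sketch below shows one plausible combination purely as an assumption: each candidate's comment count is weighted by the share of positive comments and the results are normalized to sum to 1. The function `estimate_vote_shares`, the candidate labels, and the numbers are all hypothetical and are not taken from the study.

```python
def estimate_vote_shares(stats):
    """stats maps candidate -> (comment_count, positive_ratio in [0, 1]).

    Returns each candidate's share of the volume-weighted positive score.
    Assumed formula for illustration; the study's actual model is not given.
    """
    scores = {name: count * pos for name, (count, pos) in stats.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Made-up inputs, not data from the study.
stats = {"A": (1000, 0.5), "B": (600, 0.5), "C": (400, 0.5)}
shares = estimate_vote_shares(stats)

# Rank candidates by estimated share, highest first.
ranking = sorted(shares, key=shares.get, reverse=True)
```

A ranking produced this way could then be compared with the official result, which is the kind of check the abstract describes for the top five candidates.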

The Effect of Housing Policy on Purchased Public Housing in Seoul (서울시 매입임대주택 거주자 특성 및 정책효과 실증분석)

  • Sung, Jin Uk;Song, Ki Wook
    • Land and Housing Review / v.11 no.1 / pp.1-10 / 2020
  • The purpose of this study is to identify the characteristics of residents of purchased public housing in Seoul, using empirical panel data. The study covers Seoul as of 2017. The research methods include a literature review, statistical analysis, and spatial analysis using the QGIS software program. The data used are from the Panel Survey of Public Housing in Seoul (2017). The main results are summarized as follows. First, residents live in housing with a larger floor area than their previous housing. Second, they can live there for a long time at low rent. The housing cost burden is 71.8% in the case of the deposit. Third, there is little concern about social stigma. Purchased public housing was found to fare well with respect to stigmatization arising from low-income clustering. Lastly, accessibility to the city center was good. In particular, the one-way commuting time was 34.79 minutes, about 4 minutes shorter than for other types of public housing.