• Title/Summary/Keyword: public records big data


Big Data Utilization and Policy Suggestions in Public Records Management (공공기록관리분야의 빅데이터 활용 방법과 시사점 제안)

  • Hong, Deokyong
    • Journal of Korean Society of Archives and Records Management / v.21 no.4 / pp.1-18 / 2021
  • Today, records management has become more important as records generated from administrative work and data production have increased significantly and as information and communication technology, the working environment, and the size and functions of government have expanded. This study explains, with examples, how the concept of public records relates to the characteristics of big data. A Social, Technological, Economic, Environmental, and Political (STEEP) analysis was conducted to examine these areas in light of the big data generation environment. The appropriateness and necessity of applying big data technology to public records management were identified, a top-priority applicable framework for public records management work was schematized, and business implications were presented. First, a new organization, additional research, and further attempts are needed to apply big data analysis technology to public records management procedures and standards and to records management experts. Second, records management specialists need to be trained with "big data analysis qualifications" grounded in integrated thinking so that unstructured and hidden patterns can be found in large amounts of data. Third, after self-learning that combines big data technology and artificial intelligence in the field of public records, the context should be analyzed, and the social phenomena and environment of public institutions should be analyzed and predicted.

Efficient K-Anonymization Implementation with Apache Spark

  • Kim, Tae-Su;Kim, Jong Wook
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.17-24 / 2018
  • Today, we are living in the era of data and information. With the advent of the Internet of Things (IoT), the popularity of social networking sites, and the development of mobile devices, a large amount of data is being produced in diverse areas. The collection of such data generated in various areas is called big data. As the importance of big data grows, there has been a growing need to share big data containing information regarding individual entities. Because big data contains sensitive information about individuals, directly releasing it for public use may violate existing privacy requirements. Thus, privacy-preserving data publishing (PPDP) has been actively studied as a way to share big data containing personal information for public use while preserving the privacy of the individual. K-anonymity, the most popular method in the area of PPDP, transforms each record in a table such that at least k records have the same values for the given quasi-identifier attributes, so that each record is indistinguishable from the other records in the same class. As big data continues to grow in size, there is a growing demand for methods that can efficiently anonymize vast amounts of data. Thus, in this paper, we develop an efficient k-anonymity method using the Spark distributed framework. Experimental results show that the developed method achieves significant gains in processing time.
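
The k-anonymity property described in this abstract can be illustrated with a small single-machine sketch. This is not the authors' Spark implementation; the field names (`age`, `zip`, `disease`), the generalization helpers, and the sample rows are all invented for illustration. The sketch only checks whether every combination of quasi-identifier values appears at least k times, before and after a simple generalization step.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

def generalize_age(record, width=10):
    """Hypothetical generalization: replace an exact age with a range of `width` years."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

def generalize_zip(record, keep=3):
    """Hypothetical generalization: keep the first `keep` digits and mask the rest."""
    z = record["zip"]
    return {**record, "zip": z[:keep] + "*" * (len(z) - keep)}

# Invented sample table; "disease" is the sensitive attribute and is left untouched.
records = [
    {"age": 23, "zip": "13053", "disease": "flu"},
    {"age": 27, "zip": "13068", "disease": "asthma"},
    {"age": 34, "zip": "14850", "disease": "flu"},
    {"age": 38, "zip": "14853", "disease": "diabetes"},
]
quasi_identifiers = ["age", "zip"]

# Raw rows are all distinct on (age, zip), so the table is not 2-anonymous.
raw_ok = is_k_anonymous(records, quasi_identifiers, k=2)

# After generalization, rows fall into two equivalence classes of size 2.
generalized = [generalize_zip(generalize_age(r)) for r in records]
gen_ok = is_k_anonymous(generalized, quasi_identifiers, k=2)
```

In a distributed setting the grouping step would typically become a group-by-and-count over the quasi-identifier columns, but how the paper partitions and generalizes the data is not stated in the abstract.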

A Study on the Accumulation and Use of Corporate Records: Corporate Records Management as a Big Data Platform (기업의 현용기록 축적과 이용 방안 연구: 빅데이터 플랫폼으로서의 기업기록관리)

  • Kim, Sung-woo;Rieh, Hae-young
    • Journal of Korean Society of Archives and Records Management / v.20 no.3 / pp.99-118 / 2020
  • The creation of value and the enhancement of benefits through records management by enterprises are comparable to those by public institutions. However, Korea has yet to establish guidelines on corporate records management. Global companies are strengthening their competitiveness by reducing trial and error in their work through the accumulation and use of records as information assets, which serve as the output of their work processes. While Korean companies agree on the necessity of corporate records management, most of them are concerned with archival (noncurrent records) management, such as historical compilation and historical data management, rather than records (current records) management. Therefore, through a case study of a K-company with effective records management, this study identifies methods to promote the accumulation, use, and management of corporate records in pursuit of value and benefits. Moreover, the study emphasizes the management of corporate records as a big data platform that accumulates and uses data, which is an important resource in the era of the Fourth Industrial Revolution, and proposes measures for their revitalization.

Application of 4th Industrial Revolution Technology to Records Management (제4차 산업혁명 기술의 기록관리 적용 방안)

  • An, Dae-jin;Yim, Jin-hee
    • The Korean Journal of Archival Studies / no.54 / pp.211-248 / 2017
  • This study examined ways to improve records management by using the new technologies of the Fourth Industrial Revolution. To do this, we selected four technologies that have a huge impact on the production and management of records: cloud, big data, artificial intelligence, and the Internet of Things. We tested these technologies and summarized their concepts, characteristics, and applications. The changed characteristics of the records produced were analyzed for each technology. Because of new technology, the production of records has rapidly increased and the types of records have become diverse. With this, there is also a need for solutions that explain the quality of data and the context of production. To effectively introduce each technology into records management, a roadmap should be designed by classifying which technologies should be applied immediately and which should be applied later, depending on the maturity of the technology. To cope with changes in the characteristics of the records produced, a flexible data structure must be produced in a standardized format. Public authorities should also be able to procure Software as a Service (SaaS) products and use digital technology to improve the quality of public services.

Suggestions on how to convert official documents to Machine Readable (공문서의 기계가독형(Machine Readable) 전환 방법 제언)

  • Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.67 / pp.99-138 / 2021
  • In the era of big data, analyzing not only structured data but also unstructured data is emerging as an important task. Official documents produced by government agencies are also subject to big data analysis as large, text-based unstructured data. From the perspective of internal work efficiency, knowledge management, records management, etc., it is necessary to analyze big data of public documents to derive useful implications. However, since many of the public documents currently held by public institutions are not in an open format, a pre-processing step that extracts text from the bitstream is required for big data analysis. In addition, since contextual metadata is not sufficiently stored in the document file, separate efforts to secure metadata are required for high-quality analysis. In conclusion, current official documents have a low level of machine readability, so big data analysis becomes expensive.

Building Linked Big Data for Stroke in Korea: Linkage of Stroke Registry and National Health Insurance Claims Data

  • Kim, Tae Jung;Lee, Ji Sung;Kim, Ji-Woo;Oh, Mi Sun;Mo, Heejung;Lee, Chan-Hyuk;Jeong, Han-Young;Jung, Keun-Hwa;Lim, Jae-Sung;Ko, Sang-Bae;Yu, Kyung-Ho;Lee, Byung-Chul;Yoon, Byung-Woo
    • Journal of Korean Medical Science / v.33 no.53 / pp.343.1-343.8 / 2018
  • Background: Linkage of public healthcare data is useful in stroke research because patients may visit different sectors of the health system before, during, and after stroke. Therefore, we aimed to establish high-quality big data on stroke in Korea by linking an acute stroke registry with national health claim databases. Methods: Acute stroke patients (n = 65,311) with claim data suitable for linkage were included in the Clinical Research Center for Stroke (CRCS) registry during 2006-2014. We linked the CRCS registry with national health claim databases in the Health Insurance Review and Assessment Service (HIRA). Linkage was performed using 6 common variables: birth date, gender, provider identification, receiving year and number, and statement serial number in the benefit claim statement. For matched records, linkage accuracy was evaluated using differences between the hospital visiting date in the CRCS registry and the commencement date for health insurance care in HIRA. Results: Of 65,311 CRCS cases, 64,634 were matched to HIRA cases (match rate, 99.0%). The proportion of true matches was 94.4% (n = 61,017) in the matched data. Among true matches (mean age 66.4 years; men 58.4%), the median National Institutes of Health Stroke Scale score was 3 (interquartile range 1-7). When comparing baseline characteristics between true matches and false matches, no substantial difference was observed for any variable. Conclusion: We were able to establish big data on stroke by linking CRCS registry and HIRA records, using claims data without personal identifiers. We plan to conduct national stroke research and improve stroke care using the linked big database.
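
The linkage procedure in this abstract (exact matching on six common variables, then a date comparison to judge match accuracy) can be sketched roughly as follows. The field names, the `max_day_gap` tolerance, and the sample rows are assumptions made for illustration; the abstract does not state the date-difference threshold the authors used. The last lines recompute the two reported percentages from the reported counts.

```python
from datetime import date

# Hypothetical field names standing in for the six common variables.
LINK_KEYS = (
    "birth_date", "gender", "provider_id",
    "receiving_year", "receiving_number", "statement_serial",
)

def link_key(record):
    """Build the exact-match key from the six common variables."""
    return tuple(record[k] for k in LINK_KEYS)

def link_records(registry, claims, max_day_gap=1):
    """Deterministic linkage; assumes each key is unique among the claims.

    A matched pair is flagged as a true match when the registry visit date and
    the claim's care start date differ by at most `max_day_gap` days (assumed).
    """
    claims_by_key = {link_key(c): c for c in claims}
    matched = []
    for r in registry:
        c = claims_by_key.get(link_key(r))
        if c is None:
            continue  # no claim shares all six variables
        gap = abs((r["visit_date"] - c["care_start_date"]).days)
        matched.append({"registry": r, "claim": c, "true_match": gap <= max_day_gap})
    return matched

def make(birth, gender, provider, year, number, serial, **extra):
    """Helper for building invented sample rows."""
    base = dict(zip(LINK_KEYS, (birth, gender, provider, year, number, serial)))
    return {**base, **extra}

registry = [
    make(date(1950, 1, 2), "M", "P01", 2010, 11, 1, visit_date=date(2010, 3, 5)),
    make(date(1948, 7, 9), "F", "P02", 2011, 42, 3, visit_date=date(2011, 6, 1)),
    make(date(1960, 4, 4), "M", "P03", 2012, 77, 2, visit_date=date(2012, 9, 9)),
]
claims = [
    make(date(1950, 1, 2), "M", "P01", 2010, 11, 1, care_start_date=date(2010, 3, 5)),
    make(date(1948, 7, 9), "F", "P02", 2011, 42, 3, care_start_date=date(2011, 8, 20)),
]

matched = link_records(registry, claims)
true_matches = [m for m in matched if m["true_match"]]

# Rates recomputed from the counts reported in the abstract.
match_rate = round(64634 / 65311 * 100, 1)
true_match_rate = round(61017 / 64634 * 100, 1)
```

Here two of the three invented registry rows find a claim with the same key, and only the first pair passes the date check, mirroring the distinction the abstract draws between matched records and true matches.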

A Study on the Improvement Legal System for Next-generation Records Management (차세대 기록관리를 위한 법체계 개선방안 연구)

  • Lee, Jin Ryong;Ju, Hyun Mi;Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.55 / pp.275-305 / 2018
  • The advent of e-government following the information revolution has affected public records systems. Records management should now move to an environment in which a national records management system is built on the Internet of Things (IoT), cloud, big data, and mobile (ICBM), and it is time to make a fresh start toward a next-generation records management system that responds to changes in the environment. Ultimately, it is time for a records management system that ensures a proper way of dealing with new environmental changes. It has been nearly 20 years since the Public Records Management Act was enacted in 1999, and it was completely amended in 2006 so that electronic records could be efficiently managed. As compliance in records management needs to be rechecked, a full redesign is required so that the current legal system can respond to present-day circumstances. Therefore, this study is intended to suggest ways to improve the records management legal system as the environment changes over the next generation and to lay the legal groundwork for innovation in the national records management system.

Big Data Analytics of Construction Safety Incidents Using Text Mining (텍스트 마이닝을 활용한 건설안전사고 빅데이터 분석)

  • Jeong Uk Seo;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence / v.27 no.3 / pp.581-590 / 2024
  • This study aims to extract key topics through text mining of incident records (incident history, post-incident measures, preventive measures) from construction safety accident case data available on the public data portal. It also seeks to provide fundamental insights that contribute to the establishment of manuals for disaster prevention by identifying correlations between these topics. After pre-processing the input data, we used an LDA-based topic modeling technique to derive the main topics. Consequently, we obtained five topics related to incident history and four topics each related to post-incident measures and preventive measures. Although no dominant patterns emerged from the topic pattern analysis, the study is significant in that it provides quantitative information on the follow-up actions related to the incident history, thereby suggesting practical implications for establishing a preventive decision-making system through the linkage between accident history and subsequent measures for recurrence prevention.
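
The pipeline this abstract outlines (pre-process text, then fit an LDA topic model) can be sketched with a small collapsed Gibbs sampler written in plain Python. This is a toy illustration under stated assumptions, not the authors' tooling: the stop-word list, the hyperparameters `alpha` and `beta`, the number of topics, and the four sample incident sentences are all invented, and a real analysis would normally use an established topic modeling library.

```python
import random
import re

STOP_WORDS = {"the", "a", "and", "of", "from", "was", "were", "on", "to", "in", "while", "during"}

def tokenize(text):
    """Pre-processing: lowercase, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]       # topic counts per document
    topic_word = [[0] * V for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                     # total tokens per topic
    assignments = []
    for d, doc in enumerate(docs):                     # random initial assignment
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, z = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    top_words = [
        [vocab[i] for i in sorted(range(V), key=lambda i: -topic_word[t][i])[:3]]
        for t in range(num_topics)
    ]
    doc_mix = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha) for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return top_words, doc_mix

# Invented incident descriptions, for illustration only.
raw_docs = [
    "The worker fell from the scaffold during formwork installation.",
    "A worker fell from a ladder while painting the ceiling.",
    "The excavator struck a worker during trench excavation.",
    "A crane load struck the scaffold and the formwork collapsed.",
]
docs = [tokenize(text) for text in raw_docs]
top_words, doc_mix = lda_gibbs(docs, num_topics=2)
```

`top_words` holds the three most frequent words per topic and `doc_mix` the smoothed topic proportions per document; with so few documents the topics themselves are not meaningful, so the sketch shows only the mechanics.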

A Study on the Impact of the Epidemic Disease on the Number of Books Checked Out of the Public Libraries: Based on the Middle East Respiratory Syndrome Coronavirus (유행성 질병이 공공도서관의 대출책수에 미치는 영향: 메르스 사태를 중심으로)

  • Kim, Wan-Jong
    • Journal of the Korean Society for Information Management / v.32 no.4 / pp.273-287 / 2015
  • This study aimed to investigate the impact of epidemic diseases, including Middle East Respiratory Syndrome Coronavirus (MERS), on the usage of public libraries. Such diseases cause anxiety throughout the nation and discourage social activities in general. A total of 18,711,453 records from 303 public libraries were examined with the "big data retrieval & analysis platform for public libraries" located in Sejong National Library. The results are as follows. First, in 2015, when MERS was prevalent, the daily mean of books checked out was 64,645.05, a decrease of 6,300 per day compared to 2014. Second, in 2014, the daily mean of books checked out from July 5th to August 19th was greater than that from April 4th to May 19th and that from May 20th to July 4th, implying that summer vacation increases the number of books checked out of public libraries. Third, in 2015, the daily mean of books checked out from July 5th was greater than that during the MERS outbreak (from May 20th to July 4th), while it did not differ significantly from that before the outbreak. Fourth, the daily mean of books checked out did not differ significantly between 2014 and 2015 before and during the outbreak, while it differed significantly between 2014 and 2015 after the epidemic period. The results indicate that MERS and the anxiety it brought nationwide had an impact on the daily mean of books checked out of public libraries after the epidemic period rather than during the outbreak.

Panic Disorder Intelligent Health System based on IoT and Context-aware

  • Huan, Meng;Kang, Yun-Jeong;Lee, Sang-won;Choi, Dong-Oun
    • International Journal of Advanced Smart Convergence / v.10 no.2 / pp.21-30 / 2021
  • With the rapid development of artificial intelligence and big data, a large amount of medical data is being used effectively, and the diagnosis and analysis of diseases have entered the era of intelligence. With increasing public health awareness, ordinary citizens have also put forward new demands for panic disorder health services. Specifically, people hope to predict the risk of panic disorder as early as possible and understand their own condition without leaving home. Against this backdrop, the smart health industry has come into being. In the Internet age, a large amount of panic disorder health data has been accumulated, such as diagnostic records, medical record information, and electronic files. At the same time, various health monitoring devices are emerging one after another, enabling the collection and storage of personal daily health information at any time. How to use these data to provide people with convenient panic disorder self-assessment services and reduce the incidence of panic disorder in China has become an urgent problem. To solve this problem, this research applies context awareness to the automatic diagnosis of human diseases. While helping patients detect diseases early and receive timely treatment, it can effectively assist doctors in making correct diagnoses and reduce the probability of misdiagnosis and missed diagnosis.