• Title/Abstract/Keywords: Public Open Datasets

Search results: 14 items (processing time: 0.02 s)

Development of a Method for Analyzing and Visualizing Concept Hierarchies based on Relational Attributes and its Application on Public Open Datasets

  • Hwang, Suk-Hyung
    • 한국컴퓨터정보학회논문지 / Vol. 26, No. 9 / pp. 13-25 / 2021
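  • As we enter an era of digital innovation built on the Internet, information and communication, and artificial intelligence technologies, huge datasets are being generated, collected, and accumulated, and various public institutions open them online to provide useful public information. To gain useful insights and information from such data, formal concept analysis, which analyzes, classifies, clusters, and visualizes data based on the binary relation between the objects and attributes inherent in a dataset, has been used successfully. This paper extends formal concept analysis and proposes a technique and a supporting tool for classifying, conceptualizing, and visualizing datasets based not only on the attributes of objects but also on the relations among objects. In several experiments applying the proposed technique to selected public open datasets, concept hierarchies were generated from the datasets and visualized, and more useful knowledge was extracted, demonstrating the validity and usefulness of the technique. The proposed analysis technique can serve as a useful tool for effective data analysis, classification, clustering, visualization, and information retrieval.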
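The abstract above builds on formal concept analysis. As a rough illustration of the classical (unextended) technique only, the following sketch enumerates the formal concepts of a tiny binary object-attribute context by brute force. The context, the dataset names, and all function names are hypothetical and are not taken from the paper; the relational extension the paper proposes is not shown.

```python
from itertools import combinations

# Hypothetical toy context: each object (a dataset) maps to its binary attributes.
CONTEXT = {
    "bus_stops": {"geospatial", "csv"},
    "air_quality": {"geospatial", "api"},
    "budget": {"csv"},
}


def common_attributes(objects, context):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of objects.

    Brute force, exponential in the number of objects; fine only for toy data.
    """
    concepts = set()
    objs = sorted(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the largest extent (most general) to the smallest.
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

Ordering the resulting concepts by inclusion of their extents gives the concept hierarchy (concept lattice) that such methods visualize.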

Opening the Nation: Leveraging Open Data to Create New Business and Provide Services

  • ;이홍주
    • 지식경영연구 / Vol. 16, No. 4 / pp. 157-168 / 2015
  • Opening government data has been one of the main goals of nations building their e-government structures. Beyond publishing government data for public viewing, however, the bigger concern now is promoting the use and proving the usefulness of available public data. To do this, governments must not only publicize data but also publish the kind of data that infomediaries and developers can use to create new products and services for citizens. This research investigates 30 open data use cases in South Korea listed on Data.go.kr. The study aims to contribute to a better understanding of how open datasets are utilized in a technologically advanced, well-developed nation, and to offer insights into how open data is currently being used, how it is opening up new business, and, more importantly, how it contributes to civil society by providing services to the public.

거주민 공간복지 향상을 위한 공공 개방 민원 데이터 분석 모델 - 강동구 공간복지 분석 사례를 중심으로 - (A Public Open Civil Complaint Data Analysis Model to Improve Spatial Welfare for Residents - A Case Study of Community Welfare Analysis in Gangdong District -)

  • 신동윤
    • 한국BIM학회 논문집 / Vol. 13, No. 3 / pp. 39-47 / 2023
  • This study aims to introduce a model for enhancing community well-being through the utilization of public open data. To objectively assess abstract notions of residential satisfaction, text data from complaints is analyzed. By leveraging accessible public data, costs related to data collection are minimized. Initially, relevant text data containing civic complaints is collected and refined by removing extraneous information. This processed data is then combined with meaningful datasets and subjected to topic modeling, a text mining technique. The insights derived are visualized using Geographic Information System (GIS) and Application Programming Interface (API) data. The efficacy of this analytical model was demonstrated in the Godeok/Gangil area. The proposed methodology allows for comprehensive analysis across time, space, and categories. This flexible approach involves incorporating specific public open data as needed, all within the overarching framework.

Knowledge Graph of Administrative Codes in Korea: The Case for Improving Data Quality and Interlinking of Public Data

  • Haklae Kim
    • Journal of Information Science Theory and Practice / Vol. 11, No. 3 / pp. 43-57 / 2023
  • Government codes are created and utilized to streamline and standardize government administrative procedures. They are generally employed in government information systems. Because they are included in open datasets of public data, users must be able to understand them. However, information needed to comprehend administrative codes is lost when data are released from government systems, making it difficult for data consumers to grasp the codes and limiting the connection or convergence of different datasets that use the same code. This study proposes a way to employ the administrative codes produced by the Korean government as a standard in a public data environment on a regular basis. Because consumers of public data are barred from accessing government systems, a means of universal access to administrative codes is required. An ontology model is used to represent the data structure and meaning of the administrative codes, and the full set of administrative codes is built as a knowledge graph. The resulting knowledge graph is used to assess the accuracy and connection of administrative codes in public data. The method proposed in this study has the potential to increase the quality of coded information in public data as well as data connectivity.

개념계층구조를 기반으로 하는 다치 삼원 데이터집합의 지식 추출 (Knowledge Mining from Many-valued Triadic Dataset based on Concept Hierarchy)

  • 황석형;정영애;황세웅
    • Journal of Platform Technology / Vol. 12, No. 3 / pp. 3-15 / 2024
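  • Knowledge mining is a research field that discovers useful and valuable knowledge in large volumes of diverse data by applying techniques such as data modeling, information extraction and analysis, visualization, and interpretation of results. It plays an important role in transforming raw data into useful knowledge in domains such as business, healthcare, and scientific research. This paper extends formal concept analysis and proposes an analysis technique for knowledge discovery and data mining from diverse data. We define models for representing the various formats and structures of the data to be analyzed (many-valued data tables and triadic data tables), together with algorithms for data processing (binarization and flattening), concept hierarchy construction, and association rule extraction, and we demonstrate the usefulness of the proposed technique through experiments that apply it to public open data.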
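The abstract above names two preprocessing steps, binarization and flattening, without giving their details. The sketch below shows one common reading of each from the formal concept analysis literature: nominal scaling of a many-valued table, and flattening of a triadic context by pairing each attribute with its condition. The paper's own algorithms may differ; the tables, station names, and function names here are hypothetical.

```python
# Hypothetical many-valued table: object -> {attribute: value}.
MANY_VALUED = {
    "station_a": {"region": "east", "grade": "good"},
    "station_b": {"region": "west", "grade": "good"},
}

# Hypothetical triadic data: (object, attribute, condition) triples.
TRIADIC = {
    ("station_a", "pm10_high", "2023"),
    ("station_a", "pm10_high", "2024"),
    ("station_b", "ozone_high", "2024"),
}


def binarize(table):
    """Nominal scaling: each (attribute, value) pair becomes one binary attribute."""
    return {
        obj: {f"{attr}={value}" for attr, value in row.items()}
        for obj, row in table.items()
    }


def flatten(triples):
    """Turn a triadic context into a dyadic one.

    Each (attribute, condition) pair is treated as a single binary attribute.
    """
    context = {}
    for obj, attr, cond in triples:
        context.setdefault(obj, set()).add((attr, cond))
    return context


if __name__ == "__main__":
    print(binarize(MANY_VALUED))
    print(flatten(TRIADIC))
```

Both functions return an object-to-attribute-set mapping, which is the binary context form that classical formal concept analysis takes as input.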


연구를 위한 건강보험 청구자료 요구 및 이용 요인분석 (Assessment of Needs and Accessibility Towards Health Insurance Claims Data)

  • 이정아;오주환;문상준;임준태;이진석;이진용;김윤
    • 보건행정학회지 / Vol. 21, No. 1 / pp. 77-92 / 2011
  • Objectives: This study examined health policy researchers' needs for and accessibility to health insurance claim datasets according to their academic capacity. Methods: An online questionnaire capturing relevant proxy variables for academic needs, accessibility, and research capacity was constructed based on previous studies. The survey was delivered to active health policy researchers through three major scholarly associations in South Korea. Seven hundred and one scholars responded while the survey was open for 12 days (starting on December 20th, 2010). Descriptive statistics and logistic regression analysis were carried out. Results: Regardless of the operational definition of needs, the prevalent needs of survey respondents were not met by the current provision of claim data. Greater research capacity was correlated with increased demand for claim data, and attempts to obtain claim datasets were positively correlated with research capacity. Greater research capacity, however, was not necessarily correlated with better accessibility to the claim data. Conclusions: The substantial unmet need for claim data among the healthcare policy research community calls for establishing proactive institutions that could systematically prepare and make available public datasets and provide call-in services to facilitate proper handling of data.

행정정보 데이터세트 종합관리시스템의 서비스 방안 연구 (A Study on the Service of the Integrated Administrative Information Dataset Management System)

  • 김지혜;윤성호;양동민
    • 한국기록관리학회지 / Vol. 22, No. 2 / pp. 27-49 / 2022
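  • With the 2020 amendment of the Enforcement Decree of the Public Records Management Act, records management for administrative information datasets was written into law, and the National Archives of Korea announced plans to build an integrated administrative information dataset management system to support this work. However, concrete service measures that take into account the characteristics of datasets and management reference tables are currently lacking. This paper therefore compared and analyzed the dataset services of 14 domestic and foreign public data portals and archive websites, derived implications, and proposed six service measures applicable to the integrated administrative information dataset management system. We expect the results of this study to lead to greater use of administrative information datasets and more active services.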

FAIR 원칙 기반 메타데이터 평가 프레임워크 (FAIR Principle-Based Metadata Assessment Framework)

  • 박진효;김성희;윤주상
    • 정보처리학회논문지:컴퓨터 및 통신 시스템 / Vol. 11, No. 12 / pp. 461-468 / 2022
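  • With the recent growth of the big data industry, more and more digital platforms provide data utilization services. In this context, research is under way on applying the FAIR principles, which can be used to evaluate (meta)data quality, services, and functions, to data quality assessment. In particular, the European open data portal applies a FAIR-based assessment model, conducts data maturity assessments against it, and publishes the results in an annual report. In contrast, Korea's public data portals do not conduct metadata-based data maturity assessments. This paper therefore proposes a new data maturity assessment model that applies the FAIR principles used by the European open data portal to several domestic public data portals and to big data platforms built for data trading, and carries out the assessment. The proposed maturity assessment model evaluates the quality of datasets on public data portals.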

청소년 건강관련 공개자료 접근 및 활용에 관한 고찰 (Access to and Utilization of the Open Source Data-related to Adolescent Health)

  • 이재은;성정혜;이원재;문인옥
    • 한국학교ㆍ지역보건교육학회지 / Vol. 11, No. 1 / pp. 67-78 / 2010
  • Background & Objectives: Funding agencies increasingly require investigators to share their data with others. However, there is limited guidance on how to access and utilize the shared data. We sought to determine what the common data sharing practices in the U.S.A. are, what data related to adolescent health are freely available, and how to handle large datasets that adopt a complex study design. Methods: The study included only research data related to adolescent health that were collected in the U.S.A. and accessible without restriction through the internet. Only raw data, not aggregated data, were considered. The major keywords for the web search were "adolescent", "children", "health", and "school". Results: Current approaches to public health data sharing lacked common standards and varied widely owing to the data's complex nature and large size, local expertise, and internal procedures. Common data sharing practices include unlimited access, formal screened access, restricted access, and informal exclusive access. The Inter-University Consortium for Political and Social Research and the Centers for Disease Control and Prevention were the best data repositories. "Data on the net" was a search engine for websites providing freely available data. Six freely available datasets related to adolescent health were identified. The importance of, and methods for, incorporating complex research design into analysis were discussed. Conclusion: There have been various attempts to standardize the process for open access and open data using information technology concepts. However, it may not be easy for researchers to adapt to this technology. Therefore, the guidance provided by this study may help researchers improve their access to and utilization of open source data.


Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies

  • Park, Hyeoun-Ae;Jung, Hyesil;On, Jeongah;Park, Seul Ki;Kang, Hannah
    • Healthcare Informatics Research / Vol. 24, No. 4 / pp. 253-262 / 2018
  • Objectives: We reviewed digital epidemiological studies to characterize how researchers are using digital data by topic domain, study purpose, data source, and analytic method. Methods: We reviewed research articles published within the last decade that used digital data to answer epidemiological research questions. Data were abstracted from these articles using a data collection tool that we developed. Finally, we summarized the characteristics of the digital epidemiological studies. Results: We identified six main topic domains: infectious diseases (58.7%), non-communicable diseases (29.4%), mental health and substance use (8.3%), general population behavior (4.6%), environmental, dietary, and lifestyle (4.6%), and vital status (0.9%). We identified four categories for the study purpose: description (22.9%), exploration (34.9%), explanation (27.5%), and prediction and control (14.7%). We identified eight categories for the data sources: web search query (52.3%), social media posts (31.2%), web portal posts (11.9%), webpage access logs (7.3%), images (7.3%), mobile phone network data (1.8%), global positioning system data (1.8%), and others (2.8%). Of these, 50.5% used correlation analyses, 41.3% regression analyses, 25.6% machine learning, and 19.3% descriptive analyses. Conclusions: Digital data collected for non-epidemiological purposes are being used to study health phenomena in a variety of topic domains. Digital epidemiology requires access to large datasets and advanced analytics. Ensuring open access is clearly at odds with the desire to have as little personal data as possible in these large datasets to protect privacy. Establishment of data cooperatives with restricted access may be a solution to this dilemma.