• Title/Summary/Keyword: 데이터과학과

Trends in Science Big Data Processing Technology (사이언스 빅 데이터 처리 기술 동향)

  • Kim, Hui-Jae;Ju, Gyeong-No;Yun, Chan-Hyeon
    • Information and Communications Magazine
    • /
    • v.29 no.11
    • /
    • pp.11-23
    • /
    • 2012
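  • This article describes trends in science big data processing technology, that is, technology for processing large-scale data in scientific fields. The introduction covers the definition of and need for science big data. The main body describes the emergence of the data-centric science paradigm and the requirements it places on science big data, along with science big data processing techniques, which consist of source collection and cleansing, storage and management, processing, and analysis. It also introduces, as examples, science big data platforms currently under study at various institutions and workflow-control-based science big data processing techniques that use MapReduce and similar tools.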

Development of Science Technology Information Service using Citation Information Data (인용정보 데이터를 활용한 과학기술 학술정보서비스 개발)

  • Park, Yoo-Na;Bae, Su-Yeong;Lee, Hye-Jin;Lee, Seok-Hyoung;Choi, Hee-Seok
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.12
    • /
    • pp.241-249
    • /
    • 2020
  • The citation information of academic resources captures the flow of knowledge from previous research, so it can connect fragmented research through citation relationships. Because citation information reveals the overall flow of research, it can promote convergence research, such as extending existing research or deriving related fields. In this study, therefore, the citation information of academic literature, previously provided only as simple disclosure, was reconstructed based on citation relationships. Backward and forward citation analyses were then conducted along a time series, and the research flow was analyzed by setting citation stages. Finally, we developed an academic information service that visualizes the main research contents of backward and forward citations over time. The service provides access to academic resources through the meaning contained in the citation information.

Adaptive Data Replication Strategy using data access history in DataGrid (데이터 접근 기록 정보를 이용한 적응적 데이터 복제 기법 제안)

  • Sung, GiMun;Lee, DongWoo;Choi, Jihyun;Ramakrishna, R.S.
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.05a
    • /
    • pp.937-940
    • /
    • 2004
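  • Each site that forms a Virtual Organization by providing processor and data storage resources expects an optimized data grid system that maximizes application throughput when the available network resources are limited. This paper proposes an adaptive data replication technique for size-limited, geographically distributed data storage, along with an evaluation model for the geographical distribution of replicas. To this end, it proposes a discrete decision model that applies logical-time data access records and statistics to identify the files to replicate, and it applies logical-time access records to decide which replicas to delete.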

Design and Implementation of XMDR based on OGSA-DAI System for Data Integration retrieval (데이터 통합검색을 위한 XMDR기반의 OGSA-DAI 시스템 설계 및 구현)

  • Ma, Jin;Moon, Seok-Jae;Jung, Gye-Dong;Choi, Young-Keun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.173-174
    • /
    • 2009
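  • Most of the important information resources that enterprises manage are stored across multiple legacy systems. These stored resources also exist in various mutually incompatible forms. Solving this problem requires a system for integrating distributed data and sharing knowledge. The purpose of data integration is to remove redundancy and provide only a single copy of the data by using standard rules and metadata for the data sources generated by an enterprise's organization, major business operations, and core applications. In this paper, we design and implement an integrated retrieval system using XMDR-based OGSA-DAI, and propose a system that enables integrated data retrieval across distributed legacy systems. The proposed system is well suited to building a collaborative environment among distributed legacy databases, and to fast information delivery and business support in a real-time enterprise environment.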

High-performance and Highly Scalable Big Data Analysis Platform (고성능, 고확장성 빅데이터 분석 플랫폼)

  • Park, Kyongseok;Yu, Chan Hee;Kim, Yuseon;Um, Jung-Ho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.535-536
    • /
    • 2021
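  • Developing machine learning models with big data requires both a platform for big data processing and tools capable of advanced analysis, such as deep learning frameworks. However, using big data platforms and deep learning frameworks freely requires a considerable level of technical knowledge and experience. In addition, developing deep learning models with big data requires knowledge of distributed and parallel processing as well as additional work. This study presents a platform on which machine learning models that use big data can be freely developed and shared, and which, through system-level support for distributed deep learning, can be used by applied researchers who develop deep learning models in their own fields. We expect that researchers in various fields who develop models with their own data will be able to overcome the technical constraints of distributed and parallel processing, develop models faster and more efficiently, and apply them in practice.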

Development of a Data Science Education Program for High School Students Taking the High School Credit System (고교학점제 수강 고등학생을 위한 데이터과학교육 프로그램 개발)

  • Semin Kim;SungHee Woo
    • Journal of Practical Engineering Education
    • /
    • v.14 no.3
    • /
    • pp.471-477
    • /
    • 2022
  • In this study, an educational program was developed that allows students taking data science courses under the high school credit system to explore related fields after studying data science. To this end, existing research and requirements for data science education were analyzed, a learning plan was designed, and the educational program was developed step by step. In addition, because existing studies include no research on data science education for the high school credit system, the research was conducted in the stages of problem definition, data collection, data preprocessing, data analysis, data visualization, and simulation, with reference to studies on data science education conducted in existing schools. This study is expected to stimulate further research on data science education in the high school credit system.

Improvement of Current Legal System for Promoting Scientific Analysis and Utilization of Maritime Data (해사데이터의 과학적 분석 및 활용을 위한 현행 법제도 개선방안)

  • KwangHyun Lim;JongHwa Baek;DeukJae Cho
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2022.06a
    • /
    • pp.304-305
    • /
    • 2022
  • As digital communication technology has recently been widely applied to the maritime field, large amounts of maritime data are being accumulated. Accordingly, attempts to create new value by applying data science and Artificial Intelligence (AI) technologies are emerging. For example, the Ministry of Oceans and Fisheries has been providing the Korean e-Navigation service since 2021 over the LTE-Maritime communication network, and R&D on creating value-added services by analyzing large volumes of maritime traffic data is underway. Any data-based research, however, requires a legal system that serves as research infrastructure and lets researchers obtain the data whenever they need it. This paper reviews the types of data in the maritime field and examines the legal system related to their scientific analysis and utilization. It confirms that some legal factors restrict such analysis and utilization, and concludes by suggesting improvements to boost R&D using maritime data.

Exploring the Job Competencies of Data Scientists Using Online Job Posting (온라인 채용정보를 이용한 데이터 과학자 요구 역량 탐색)

  • Jin, Xiangdan;Baek, Seung Ik
    • The Journal of Society for e-Business Studies
    • /
    • v.27 no.2
    • /
    • pp.1-20
    • /
    • 2022
  • As the global business environment changes rapidly due to the 4th industrial revolution, new jobs that did not exist before are emerging. Among them, the job that companies are most interested in is 'Data Scientist'. As information and communication technologies take up most of our lives, data on not only online but also offline activities are stored in computers every hour, generating big data. Companies put a lot of effort into discovering new opportunities from such big data, and the data scientist is the new job that emerged along with these efforts. The demand for data scientists, a promising job that leads the big data era, is constantly increasing, but the supply is still insufficient. Although data analysis technologies and tools that anyone can easily use have been introduced, companies still have great difficulty finding suitable experts. One of the main reasons the shortage of data scientists is so serious is a lack of understanding of the data scientist's job. Therefore, in this study, we explore the job competencies of a data scientist by qualitatively analyzing actual job postings from companies. This study finds that data scientists need not only the technical and system skills required of software engineers and system analysts in the past, but also the business-related and interpersonal skills required of business consultants and project managers. The results of this study are expected to provide basic guidelines to people who are interested in the data scientist profession and to companies that want to hire data scientists.

Is Big Data Analysis to Be a Methodological Innovation? : The cases of social science (빅데이터 분석은 사회과학 연구에서 방법론적 혁신인가?)

  • SangKhee Lee
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.3
    • /
    • pp.655-662
    • /
    • 2023
  • Big data research supplements existing social science research methods. Whereas survey and experimental methods are somewhat inaccurate because they rely mainly on recalled memories, big data are more accurate because they are real-time records. Social science research so far has mainly studied samples for reasons such as time and cost, but big data research analyzes almost the entire population of data. However, it is not easy to repeat and reproduce social research, because the social atmosphere can change and the subjects of research are not the same. While social science research has a strong triangular structure of 'theory-method-data', big data analysis is weak in theory, which is a serious problem: without theory as the logic of scientific explanation, research results, even when obtained, cannot be properly interpreted or fully utilized. Therefore, for big data research to become a methodological innovation, I propose big thinking along with researchers' efforts to create new theories (black boxes).

Deriving the Determining Factor for the Management of Oceanographic Data (해양관측데이터 관리를 위한 결정요소 도출)

  • Kim, Sun-Tae;Lee, Tae-Young;Kim, Yong
    • Journal of Information Management
    • /
    • v.43 no.3
    • /
    • pp.97-115
    • /
    • 2012
  • This paper derives determining factors for the management of oceanographic data in two ways. 1) The types of oceanographic observation and the raw data collected in the areas of marine physics, marine chemistry, marine biology, and marine geology were analyzed. 2) The services of the KODC (Korea Oceanographic Data Center), NFRDI (National Fisheries Research & Development Institute), and KHOA (Korea Hydrographic and Oceanographic Administration) were analyzed to derive metadata elements for retrieval. From this analysis, 42 determining factors were derived in 9 areas (general, observer, satellites, observation instruments, observatories, space, information, projects, and observational data, data processing).