• Title/Summary/Keyword: large Dataset


Automatic Text Summarization based on Selective Copy mechanism against for Addressing OOV (미등록 어휘에 대한 선택적 복사를 적용한 문서 자동요약)

  • Lee, Tae-Seok;Seon, Choong-Nyoung;Jung, Youngim;Kang, Seung-Shik
    • Smart Media Journal / v.8 no.2 / pp.58-65 / 2019
  • Automatic text summarization shortens a text document by either extraction or abstraction. Recent work applies abstractive approaches based on deep learning, which scale to large document collections. Abstractive summarization relies on pre-generated word embeddings. Low-frequency but salient words, such as technical terms, are seldom included in the vocabulary; this is the so-called out-of-vocabulary (OOV) problem, and it degrades the performance of neural encoder-decoder models. To address OOV words in abstractive summarization, we propose a copy mechanism that copies new words from the target document while generating summary sentences. Unlike previous studies, the proposed approach combines accurate pointing information with a selective copy mechanism based on a bidirectional RNN and a bidirectional LSTM. In addition, a neural gate model that estimates the generation probability and a loss function that optimizes the entire abstraction model are applied. The dataset was constructed from the abstracts and titles of journal articles. Experimental results show that ROUGE-1 (based on word recall) and ROUGE-L (based on the longest common subsequence) of the proposed encoder-decoder model improved to 47.01 and 29.55, respectively.
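
The role of a generation-probability gate in a copy mechanism can be illustrated with a minimal sketch. It assumes the generic pointer-generator style mixture, P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on source positions holding w); it is not the paper's selective copy model, and the function name `final_distribution`, the toy vocabulary, and the attention weights are all hypothetical.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_probs, source_tokens, attention):
    """Mix a generation distribution with a copy distribution.

    p_gen         -- gate value in [0, 1]: probability of generating from the vocabulary
    vocab_probs   -- dict mapping in-vocabulary words to generation probabilities
    source_tokens -- tokens of the source document (may contain OOV words)
    attention     -- one attention weight per source token, summing to 1
    """
    if len(source_tokens) != len(attention):
        raise ValueError("need exactly one attention weight per source token")
    mixed = defaultdict(float)
    # Generation path: scaled by the gate.
    for word, prob in vocab_probs.items():
        mixed[word] += p_gen * prob
    # Copy path: attention mass is added to whatever token sits at each position,
    # so a word missing from the vocabulary can still receive probability.
    for token, weight in zip(source_tokens, attention):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


# Toy example (all numbers are made up): "phytoplankton" is out of vocabulary.
vocab_probs = {"the": 0.5, "model": 0.3, "<unk>": 0.2}
source_tokens = ["the", "phytoplankton", "model"]
attention = [0.1, 0.7, 0.2]
dist = final_distribution(0.4, vocab_probs, source_tokens, attention)
best_word = max(dist, key=dist.get)
print(best_word, round(dist[best_word], 2))
```

With a low gate value and attention concentrated on the OOV token, the copied word ends up with the largest probability even though the vocabulary cannot generate it.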

A Noise-Tolerant Hierarchical Image Classification System based on Autoencoder Models (오토인코더 기반의 잡음에 강인한 계층적 이미지 분류 시스템)

  • Lee, Jong-kwan
    • Journal of Internet Computing and Services / v.22 no.1 / pp.23-30 / 2021
  • This paper proposes a noise-tolerant image classification system using multiple autoencoders. Advances in deep learning have dramatically improved the performance of image classifiers. However, when images are contaminated by noise, performance degrades rapidly. Noise is inevitably introduced while an image is acquired and transmitted, so a classifier used in a real environment has to cope with it. An autoencoder is an artificial neural network trained so that its output is similar to its input. If the input data is similar to the training data, the error between the input and output of the autoencoder will be small; if it is not, the error will be large. The proposed system exploits this relationship between the input and output of the autoencoder and classifies images in two phases. In the first phase, the classes with the highest likelihood are selected, and these are subjected to the procedure again in the second phase. For performance analysis, classification accuracy was tested on an MNIST dataset contaminated with Gaussian noise. The experiments confirmed that, in the noisy environment, the proposed system achieves higher accuracy than the CNN-based classification technique.
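
The reconstruction-error idea behind the first phase can be sketched as follows. This is only an illustration under strong assumptions: each "autoencoder" is a hypothetical stand-in that always outputs a fixed class prototype, whereas the paper trains real autoencoders, and the helper names (`reconstruction_error`, `shortlist_classes`, `make_toy_autoencoder`) are invented here.

```python
def reconstruction_error(x, reconstruct):
    """Mean squared error between an input vector and its reconstruction."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def shortlist_classes(x, autoencoders, k):
    """Keep the k classes whose autoencoder reconstructs x with the smallest error."""
    errors = {label: reconstruction_error(x, ae) for label, ae in autoencoders.items()}
    ranked = sorted(errors, key=errors.get)
    return ranked[:k], errors


def make_toy_autoencoder(prototype):
    """Hypothetical stand-in for a trained per-class autoencoder: it ignores
    its input and returns the class prototype."""
    return lambda x: list(prototype)


autoencoders = {
    "zero": make_toy_autoencoder([0.0, 0.0, 0.0, 0.0]),
    "half": make_toy_autoencoder([0.5, 0.5, 0.5, 0.5]),
    "one": make_toy_autoencoder([1.0, 1.0, 1.0, 1.0]),
}

# A noisy sample that is closest to the "one" prototype.
noisy_input = [0.9, 1.1, 0.8, 1.0]
candidates, errors = shortlist_classes(noisy_input, autoencoders, k=2)
print(candidates)
```

The shortlisted classes would then go through a second pass; how that pass is carried out is specific to the paper and is not reproduced here.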

Hyperparameter Optimization for Image Classification in Convolutional Neural Network (합성곱 신경망에서 이미지 분류를 위한 하이퍼파라미터 최적화)

  • Lee, Jae-Eun;Kim, Young-Bong;Kim, Jong-Nam
    • Journal of the Institute of Convergence Signal Processing / v.21 no.3 / pp.148-153 / 2020
  • To obtain high accuracy with a convolutional neural network (CNN), it is necessary to set optimal hyperparameters. However, the exact hyperparameter values that yield high performance are not known in advance, and the optimal values differ with the type of dataset, so they must be found through experiments. In addition, since the range of hyperparameter values is wide and the number of combinations is large, the optimal values should be searched for after an experimental design in order to save time and computational cost. In this paper, we suggest an algorithm that uses the design of experiments and grid search to determine the optimal hyperparameters for a classification problem. The algorithm determines the optimal hyperparameter values that yield high performance using a factorial design of experiments. It is shown that computation time can be reduced and accuracy improved by performing a grid search after the search range of each hyperparameter has been narrowed through the experimental design. Moreover, the experimental results show that the learning rate is the hyperparameter with the greatest effect on the performance of the model.
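
The two-stage idea (screen factors with a factorial design, then grid-search a narrowed range) can be sketched as below. Everything here is an assumption made for illustration: `toy_accuracy` is a made-up stand-in for training and validating a CNN, and the factor names, levels, and refined learning-rate grid are hypothetical rather than taken from the paper.

```python
import math
from itertools import product


def toy_accuracy(lr, batch_size, dropout):
    """Hypothetical stand-in for 'train a CNN and return validation accuracy'."""
    return (0.95 - 0.10 * abs(math.log10(lr) + 2.5)
            - 0.01 * (batch_size / 128) - 0.02 * dropout)


def main_effects(levels, score):
    """Two-level full factorial design: for each factor, the mean response
    at its high level minus the mean response at its low level."""
    names = list(levels)
    runs = []
    for combo in product(*(levels[n] for n in names)):
        setting = dict(zip(names, combo))
        runs.append((setting, score(**setting)))
    effects = {}
    for name in names:
        low, high = levels[name]
        low_scores = [s for setting, s in runs if setting[name] == low]
        high_scores = [s for setting, s in runs if setting[name] == high]
        effects[name] = (sum(high_scores) / len(high_scores)
                         - sum(low_scores) / len(low_scores))
    return effects


def grid_search(grid, score):
    """Exhaustive search over the Cartesian product of the given value lists."""
    names = list(grid)
    settings = [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]
    best = max(settings, key=lambda s: score(**s))
    return best, score(**best)


# Stage 1: screen the factors with a 2^3 factorial design (8 runs).
levels = {"lr": (0.001, 0.1), "batch_size": (32, 128), "dropout": (0.2, 0.5)}
effects = main_effects(levels, toy_accuracy)
most_influential = max(effects, key=lambda n: abs(effects[n]))

# Stage 2: fix the minor factors at their better level and refine only the
# dominant factor on a narrowed grid (the refined values are chosen by hand).
narrowed = {
    name: [levels[name][0] if effects[name] < 0 else levels[name][1]]
    for name in levels if name != most_influential
}
narrowed[most_influential] = [0.0003, 0.001, 0.003, 0.01]
best_setting, best_score = grid_search(narrowed, toy_accuracy)
print(most_influential, best_setting)
```

The screening stage costs 8 runs and the refinement 4, compared with the much larger full grid that would be needed to cover every factor at the finer resolution.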

Quality Evaluation of Long-Term Shipboard Salinity Data Obtained by NIFS (국립수산과학원 장기 정선 관측 염분 자료의 정확성 평가)

  • PARK, JONGJIN
    • The Sea: Journal of the Korean Society of Oceanography / v.26 no.1 / pp.49-61 / 2021
  • The repeated shipboard measurements conducted by the National Institute of Fisheries Science (NIFS) for more than half a century provide valuable long-term hydrographic data with high spatial-temporal resolution. However, this unprecedented dataset has rarely been used in oceanic climate science because of concerns about its reliability. In this study, the temporal variability of salinity error in the NIFS data was quantified by exploiting the extremely small variability of salinity in the deep layer of the south-western East Sea, in order to contribute to studies on the long-term variability of the East Sea. The NIFS salinity errors estimated on the 1℃ isothermal surfaces show remarkable temporal variation: on average, ~0.160 g/kg in 1961~1980, ~0.060 g/kg in 1981~1994, ~0.020 g/kg in 1995~2002, and ~0.010 g/kg in 2003~2014, which basically represent bias error. In recent years, even though salinity quality has improved, relatively large bias errors remain, presumably due to failures in salinity sensor management, especially in 2011, 2013, and 2014. In contrast, the salinity in 2012 was very accurate and stable, with an estimated error of about 0.001 g/kg, comparable to the salinity sensor accuracy. Thus, provided that proper data quality control procedures and sensor management systems are developed, I expect that the NIFS shipboard hydrographic data could be of sufficient quality to support various studies on the ocean response to climate variability. Additionally, a few points for improving the current NIFS shipboard measurements are suggested in the discussion section.

A Study on the Mapping of Fishing Activity using V-Pass Data - Focusing on the Southeast Sea of Korea - (선박패스(V-Pass) 자료를 활용한 어업활동 지도 제작 연구 - 남해동부해역을 중심으로 -)

  • HAN, Jae-Rim;KIM, Tae-Hoon;CHOI, Eun Yeong;CHOI, Hyun-Woo
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.1 / pp.112-125 / 2021
  • Marine spatial planning (MSP) divides marine space into nine kinds of use zones for the systematic and rational management of marine spaces. One of them is the fishery protection zone, which is necessary for the sustainable production of fishery products, including the protection and fostering of fishing activities. This study aims to quantitatively identify the fishing activity space, one of the elements needed to designate fishery protection zones, by mapping fishing activities using V-Pass data and deriving zones where fishing activity is concentrated. To this end, the V-Pass data were pre-processed: a dataset combining static and dynamic information was constructed, the speed of fishing vessels was calculated, fishing activity points were extracted, and data in non-fishing activity zones were removed. Finally, using the selected V-Pass point data, a fishing activity map was produced by kernel density estimation, and the space where fishing activity is concentrated was analyzed. In addition, it was confirmed that the spatial distribution of fishing activities differs according to the type of fishing vessel and the season. The pre-processing technique for large-volume V-Pass data and the mapping method for fishing activities developed in this study are expected to contribute to future studies evaluating the spatial characteristics of fishing activities.
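
The pre-processing chain described above (speed calculation, extraction of fishing activity points, kernel density estimation) might look roughly like the following sketch. The track format `(t_seconds, lat, lon)`, the 0.5 to 5 knot speed window, the 500 m bandwidth, and all function names are assumptions made for illustration; the paper's actual V-Pass schema and thresholds are not given in the abstract.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds_knots(track):
    """Speed over ground for every fix after the first.

    track -- list of (t_seconds, lat, lon) tuples sorted by time (assumed format).
    """
    result = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order fixes
        metres_per_second = haversine_m(la0, lo0, la1, lo1) / dt
        result.append(((t1, la1, lo1), metres_per_second * 3600 / 1852))
    return result


def fishing_points(track, min_knots=0.5, max_knots=5.0):
    """Keep fixes whose speed falls in an assumed 'fishing' speed window."""
    return [fix for fix, knots in speeds_knots(track) if min_knots <= knots <= max_knots]


def kernel_density(points, lat, lon, bandwidth_m):
    """Gaussian kernel density (per square metre) of the points at (lat, lon)."""
    if not points:
        return 0.0
    total = 0.0
    for _, plat, plon in points:
        d = haversine_m(lat, lon, plat, plon)
        total += math.exp(-0.5 * (d / bandwidth_m) ** 2)
    return total / (len(points) * 2 * math.pi * bandwidth_m ** 2)


# Toy track: one fast transit leg (about 18 knots), then two slow legs (about 1.8 knots).
track = [(0, 35.0, 129.0), (600, 35.05, 129.0), (1200, 35.055, 129.0), (1800, 35.06, 129.0)]
fishing = fishing_points(track)
near = kernel_density(fishing, 35.0575, 129.0, bandwidth_m=500.0)
far = kernel_density(fishing, 35.0, 129.0, bandwidth_m=500.0)
print(len(fishing), near > far)
```

Evaluating `kernel_density` on a regular grid of cells would give the raster from which concentrated fishing zones can be read off.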

Current Calculation Simulation Model for Smartgrid-based Energy Distribution System Operation (스마트 그리드 기반 에너지 시스템 운영을 위한 배전계통 조류계산 시뮬레이션 모델 개발)

  • Bae, HeeSun;Shin, Seungjae;Moon, Il-Chul;Bae, Jang Won
    • Journal of the Korea Society for Simulation / v.30 no.1 / pp.113-126 / 2021
  • Future energy consumption patterns will differ greatly from the present ones because of the growth of distributed power sources such as renewable energy and the emergence of the prosumer concept. Accordingly, the way appropriate production and supply plans are established, considering the stability and consumption efficiency of the entire power grid, can also be expected to change. This paper proposes a simulation model that can test new operational strategies under a number of possible future environments. With the proposed model, it is possible to simulate and analyze the power consumed and supplied in a future smart grid environment, to which many new concepts including energy storage service (ESS) and distributed energy resources (DER) will be added. In particular, complex systems can be modeled structurally by using the DEVS formalism among the ABM (Agent-Based Model) methodologies, which can model the decision-making of each agent in the grid, and several factors can easily be added to the grid. The simulation model was verified using a given dataset for the current situation, and scenario analysis was performed by simply adding an ESS, one of the main elements of the smart grid, to the model.

A Study on the Application of GOCI to Analyzing Phytoplankton Community Distribution in the East Sea (동해에서 식물플랑크톤 군집 분포 분석을 위한 GOCI 활용 연구)

  • Choi, Jong-kuk;Noh, Jae Hoon;Brewin, Robert J.W.;Sun, Xuerong;Lee, Charity M.
    • Korean Journal of Remote Sensing / v.36 no.6_1 / pp.1339-1348 / 2020
  • Phytoplankton control marine ecosystems in terms of nutrients, photosynthetic rate, the carbon cycle, etc., and the degree of their influence on the marine environment depends on their physical size. Many studies have attempted to identify marine phytoplankton size classes using remote sensing techniques. One successful approach is the three-component model, which estimates the chlorophyll concentrations of three phytoplankton size classes (micro-phytoplankton, >20 µm; nano-, 2-20 µm; and pico-, <2 µm) as a function of total chlorophyll. Here, we examined the applicability of the Geostationary Ocean Colour Imager (GOCI) to mapping the phytoplankton size class distribution in the East Sea. The three-component model was fitted to a biomarker pigment dataset collected in the study area over several years, including a large harmful algal bloom period, to derive size-fractionated chlorophyll concentration (CHL). The tuned three-component model was then applied to hourly GOCI images to identify the fraction of the total CHL contributed by each phytoplankton size class. Finally, we investigated the distribution of the phytoplankton community in terms of size structure in the East Sea during the harmful Cochlodinium polykrikoides blooms in the summer of 2013.
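
A three-component model of this kind is commonly written with two saturating exponentials of total chlorophyll; the sketch below assumes that form. The parameter defaults are illustrative placeholders only, not the values tuned for the East Sea in this study, and the function name `size_fractions` is hypothetical.

```python
import math


def size_fractions(chl, cpn_max=1.057, s_pn=0.851, cp_max=0.107, s_p=6.801):
    """Split total chlorophyll (mg/m^3) into micro-, nano- and pico-phytoplankton.

    Assumed form:
        C_pn = cpn_max * (1 - exp(-s_pn * chl))   # combined pico + nano
        C_p  = cp_max  * (1 - exp(-s_p  * chl))   # pico only
        C_n  = C_pn - C_p
        C_m  = chl - C_pn
    The default parameters are placeholders for illustration.
    """
    pico_nano = cpn_max * (1.0 - math.exp(-s_pn * chl))
    pico = cp_max * (1.0 - math.exp(-s_p * chl))
    return {"micro": chl - pico_nano, "nano": pico_nano - pico, "pico": pico}


low = size_fractions(0.1)    # oligotrophic-like water
high = size_fractions(5.0)   # bloom-like water
print({k: round(v / 0.1, 2) for k, v in low.items()})
print({k: round(v / 5.0, 2) for k, v in high.items()})
```

By construction the three components sum to the total chlorophyll, and the share of the largest size class grows as total chlorophyll increases, which is the behaviour that makes the model usable on per-pixel satellite chlorophyll estimates.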

A Study for Generation of Artificial Lunar Topography Image Dataset Using a Deep Learning Based Style Transfer Technique (딥러닝 기반 스타일 변환 기법을 활용한 인공 달 지형 영상 데이터 생성 방안에 관한 연구)

  • Na, Jong-Ho;Lee, Su-Deuk;Shin, Hyu-Soung
    • Tunnel and Underground Space / v.32 no.2 / pp.131-143 / 2022
  • A lunar exploration autonomous vehicle operates on lunar topography information obtained from real-time image characterization. Highly accurate topography characterization requires a large number of training images with various background conditions. Since real lunar topography images are difficult to obtain, it would be helpful to generate imitation lunar image data artificially from the planetary analog site images and real lunar images that are available. In this study, we aim to artificially create lunar topography images using the location information-based style transfer algorithm known as Wavelet Correct Transform (WCT2). To assess the efficacy of the suggested approach, we conducted comparative experiments using lunar analog site images and real lunar topography images taken during China's and America's lunar exploration projects (i.e., Chang'e and Apollo). The results show that the proposed technique can create realistic images that preserve the topography information of the analog site image while showing the same conditions as an image taken on the lunar surface. The proposed algorithm also outperforms a conventional algorithm, Deep Photo Style Transfer (DPST), in terms of temporal and visual aspects. In future work, we intend to use the generated styled image data in combination with real image data to train models on lunar topography objects for topographic detection and segmentation. This approach is expected to significantly improve the performance of detection and segmentation models on real lunar topography images.

3D Explosion Analyses of Hydrogen Refueling Station Structure Using Portable LiDAR Scanner and AUTODYN (휴대형 라이다 스캐너와 AUTODYN를 이용한 수소 충전소 구조물의 3차원 폭발해석)

  • Baluch, Khaqan;Shin, Chanhwi;Cho, Yongdon;Cho, Sangho
    • Explosives and Blasting / v.40 no.3 / pp.19-32 / 2022
  • Hydrogen has the highest energy among common fuels, which makes it a clean energy source for the future. However, using hydrogen as a fuel raises carrier and storage issues, as hydrogen is a highly flammable and unstable gas susceptible to explosion. Explosions of hydrogen-air mixtures have already been encountered and are well documented in research experiments. However, large gaps remain in this research field, as both numerical tools and field experiments are required to fully understand the safety measures necessary to prevent hydrogen explosions. The purpose of the present study is to develop and simulate a 3D numerical model of an existing hydrogen gas station in Jeonju using a handheld LiDAR scanner and Ansys AUTODYN; this includes processing the point cloud scans and using the point cloud dataset to develop a 3D meshed FEM model for the numerical simulation that predicts peak overpressures. The results show that the LiDAR scanning technique combined with Ansys AUTODYN can help to determine the safety distance as well as to construct, simulate, and predict the peak overpressures of hydrogen refueling station explosions.

Mobile Underground High-capacity 3D Spatial Information Tiling Transfer Protocol Development (모바일 지하 대용량 3D 공간정보 타일링 전송 프로토콜 개발)

  • Lee, Tae Hyung;Jo, Won Je;Kim, Hyun Woo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.6 / pp.491-496 / 2021
  • In line with the modern era, in which the safety of underground facilities and the use of underground information are increasingly emphasized, the state is pushing to secure and utilize more precise and accurate underground spatial information. Therefore, we need to pay more attention to subsurface geospatial data. In the future, the Ministry of Land, Infrastructure and Transport is expected to actively utilize the 15 types of Integrated Underground Geospatial Information Map (6 types of underground facilities, 6 types of underground structures, 3 types of ground) that it is building as three-dimensional underground spatial information, and thereby to contribute greatly to improving national safety and convenience in underground construction. However, when a site manager requests an Integrated Underground Geospatial Information Map with a mobile device, if the large-capacity map is not transmitted and served quickly over the wireless section, the site manager is inconvenienced and work is delayed. The goal of this paper is to enable field managers to quickly receive a tiled Integrated Underground Geospatial Information Map with minimal information exchange. To this end, a tiling system is configured according to the dataset for high-speed transmission of the Mobile Integrated Underground Geospatial Information Map. In addition, a transmission system for the Mobile Integrated Underground Geospatial Information Map is established, and a TCP/IP (Transmission Control Protocol/Internet Protocol)-based spatial information tiling transmission protocol dedicated to the on-site Integrated Underground Geospatial Information Map is developed.