• Title/Summary/Keyword: Large-scale Data

Classification and discrimination of excel radial charts using the statistical shape analysis (통계적 형상분석을 이용한 엑셀 방사형 차트의 분류와 판별)

  • Seungeon Lee;Jun Hong Kim;Yeonseok Choi;Yong-Seok Choi
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.1
    • /
    • pp.73-86
    • /
    • 2024
  • A radial chart in Excel is a very useful graphical method for conveying information about numerical data. However, it is not easy to discriminate or classify many individuals with it. In this case, after converting each individual in a radial chart into a shape, we need to apply shape analysis. For a radial chart, since there are as many landmarks as there are variables representing the characteristics of the object, we consider the shape formed by connecting them with line segments. If the shape becomes complicated because of a large number of variables, it is difficult to grasp even when visualized with a radial chart. Principal component analysis (PCA) is therefore performed on the variables to create a visually effective shape. The classification table and classification rate are checked by applying traditional discriminant analysis, support vector machine (SVM), and artificial neural network (ANN), before and after principal component analysis. In addition, the difference in discrimination between generalized Procrustes analysis (GPA) coordinates and Bookstein coordinates is compared. Bookstein coordinates are obtained by transforming the position, rotation, and scale of the shape relative to the base landmarks, and they show a higher classification rate than GPA coordinates.
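
As a rough illustration of the Bookstein coordinates mentioned in this abstract, the sketch below places each radial-chart value on its own axis to form landmarks, then maps two base landmarks to (-0.5, 0) and (0.5, 0), which removes position, rotation, and scale. The function names `radial_landmarks` and `bookstein_coordinates`, the choice of the first two landmarks as the base, and the sample values are assumptions made for this example; they are not taken from the paper.

```python
import math


def radial_landmarks(values):
    """Place each value on its own radial axis and return (x, y) landmarks.

    Hypothetical helper: axis i is assumed to sit at angle 2*pi*i/k.
    """
    k = len(values)
    return [
        (v * math.cos(2 * math.pi * i / k), v * math.sin(2 * math.pi * i / k))
        for i, v in enumerate(values)
    ]


def bookstein_coordinates(landmarks, base=(0, 1)):
    """Send the two base landmarks to (-0.5, 0) and (0.5, 0).

    Treating points as complex numbers, translation, rotation, and scaling
    reduce to one subtraction and one division.
    """
    z = [complex(x, y) for x, y in landmarks]
    a, b = z[base[0]], z[base[1]]
    if a == b:
        raise ValueError("base landmarks must be distinct")
    w = [(p - a) / (b - a) - 0.5 for p in z]
    return [(p.real, p.imag) for p in w]


# Two charts with the same profile at different magnitudes give the same shape.
small = bookstein_coordinates(radial_landmarks([1, 2, 3, 4]))
large = bookstein_coordinates(radial_landmarks([2, 4, 6, 8]))
```

Because the base landmarks are fixed by construction, only the remaining landmarks carry shape information, and those are what a classifier would use as features.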

Segmentation Foundation Model-based Automated Yard Management Algorithm (의미론적 분할 기반 모델을 이용한 조선소 사외 적치장 객체 자동 관리 기술)

  • Mingyu Jeong;Jeonghyun Noh;Janghyun Kim;Seongheon Ha;Taeseon Kang;Byounghak Lee;Kiryong Kang;Junhyeon Kim;Jinsun Park
    • Smart Media Journal
    • /
    • v.13 no.2
    • /
    • pp.52-61
    • /
    • 2024
  • In the shipyard, aerial images are acquired at regular intervals using Unmanned Aerial Vehicles (UAVs) for the management of external storage yards. These images are then inspected by humans to manage the status of the storage yards. This method requires a significant amount of time and manpower, especially for large areas. In this paper, we propose an automated management technology based on a semantic segmentation foundation model to address these challenges and accurately assess the status of external storage yards. In addition, as there is no sufficient publicly available dataset for external storage yards, we collected a small-scale dataset of external storage yard objects and equipment. Using this dataset, we fine-tune an object detector and extract initial object candidates. These candidates are used as prompts for the Segment Anything Model (SAM) to obtain precise semantic segmentation results. Furthermore, to facilitate continuous collection of storage yard data, we propose a training data generation pipeline using SAM. Our proposed method achieved 4.00%p higher performance on average than previous semantic segmentation methods. Specifically, our method achieved 5.08% higher performance than SegFormer.

Literature Review of AI Hallucination Research Since the Advent of ChatGPT: Focusing on Papers from arXiv (챗GPT 등장 이후 인공지능 환각 연구의 문헌 검토: 아카이브(arXiv)의 논문을 중심으로)

  • Park, Dae-Min;Lee, Han-Jong
    • Informatization Policy
    • /
    • v.31 no.2
    • /
    • pp.3-38
    • /
    • 2024
  • Hallucination is a significant barrier to the use of large-scale language models or multimodal models. In this study, we collected 654 computer science papers with "hallucination" in the abstract, posted to arXiv from December 2022 to January 2024 following the advent of ChatGPT, and conducted frequency analysis, knowledge network analysis, and a literature review to explore the latest trends in hallucination research. The results showed that research was active in the fields of "Computation and Language," "Artificial Intelligence," "Computer Vision and Pattern Recognition," and "Machine Learning." We then analyzed the research trends in these four major fields by focusing on the main authors and dividing the work into data, hallucination detection, and hallucination mitigation. The main research trends included hallucination mitigation through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), inference enhancement via "chain of thought" (CoT), and growing interest in hallucination mitigation within the domain of multimodal AI. This study provides insights into the latest developments in hallucination research through a technology-oriented literature review. It is expected to help subsequent research in engineering as well as the humanities and social sciences by clarifying the latest trends in hallucination research.

Study on Establishment of Space Operation Plan for Yangpyeong-gun Public Library (양평군 공공도서관 공간 운영 계획 수립에 관한 연구)

  • Inho Chang;Younghee Noh;Woojung Kwak
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.35 no.1
    • /
    • pp.301-324
    • /
    • 2024
  • In this study, we propose operational directions for each space in order to transform the newly built library in Yangpyeong-gun into a space for cultural enjoyment and creativity development for local residents. The purpose is to understand the current space composition of Yangpyeong-gun public libraries and to establish a draft operation plan for the space to be constructed. To this end, we analyzed the names, operation status, and cases of similar spaces in other libraries, and analyzed the spatial characteristics of library cases to establish a space operation plan for the Yangpyeong-gun public library. The study found that it is important to use spaces such as children's resource rooms to improve early reading habits for infants and children, contribute to their development, and develop various senses, and that small theaters should be planned with a focus on large-scale performances. Furniture and space for reading and relaxation should be provided next to Byeokmyeonga & Bookstair, and the area should be operated as a communication space where small talk is possible within a certain limit. The multipurpose room should be operated by promoting experiential creative activities and creative performances. The club room needs an operation plan established through regular communication and opinion sharing. The makerspace supports various creative activities, and the general data room, which provides materials on all topics, must be operated by regularly communicating with users and reflecting their opinions. Lastly, we suggest that the family room be used like a book cafe where children and parents can freely drink tea together in the same space.

Probability Map of Migratory Bird Habitat for Rational Management of Conservation Areas - Focusing on Busan Eco Delta City (EDC) - (보존지역의 합리적 관리를 위한 철새 서식 확률지도 구축 - 부산 Eco Delta City (EDC)를 중심으로 -)

  • Kim, Geun Han;Kong, Seok Jun;Kim, Hee Nyun;Koo, Kyung Ah
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.26 no.6
    • /
    • pp.67-84
    • /
    • 2023
  • In some areas of the Republic of Korea, the designation and management of conservation areas do not adequately reflect regional characteristics and often impose behavioral regulations without considering the local context. One prominent example is the Busan EDC area. As a result, conflicts may arise, including large-scale civil complaints, regarding the conservation and utilization of these areas. Therefore, for the efficient designation and management of protected areas, it is necessary to consider various ecosystem factors, changes in land use, and regional characteristics. In this study, we specifically focused on the Busan EDC area and applied machine learning techniques to analyze the habitat of regional species. Additionally, we employed Explainable Artificial Intelligence techniques to interpret the results of our analysis. To analyze the regional characteristics of the waterfront area in the Busan EDC district and the habitat of migratory birds, we used bird observations as dependent variables, distinguishing between presence and absence. The independent variables were constructed using land cover, elevation, slope, bridges, and river depth data. We utilized the XGBoost (eXtreme Gradient Boosting) model, known for its excellent performance in various fields, to predict the habitat probabilities of 11 bird species. Furthermore, we employed the SHapley Additive exPlanations technique, one of the representative methodologies of XAI, to analyze the relative importance and impact of the variables used in the model. The analysis results showed that in the EDC business district, as one moves closer to the river from the waterfront, the likelihood of bird habitat increases based on the overlapping habitat probabilities of the analyzed bird species. By synthesizing the major variables influencing the habitat of each species, key variables such as rivers, rice fields, fields, pastures, inland wetlands, tidal flats, orchards, cultivated lands, cliffs & rocks, elevation, lakes, and deciduous forests were identified as areas that can serve as habitats, shelters, resting places, and feeding grounds for birds. On the other hand, artificial structures such as bridges, railways, and other public facilities were found to have a negative impact on bird habitat. The development of a management plan for conservation areas based on the objective analysis presented in this study is expected to be extensively utilized in the future. It will provide diverse evidential materials for establishing effective conservation area management strategies.
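
To make the idea behind SHapley Additive exPlanations more concrete, the following minimal sketch computes exact Shapley values by enumerating feature subsets for a tiny scoring function. Everything here is an assumption for illustration: `toy_score`, its coefficients, and the three binary features stand in for a fitted model and are not the paper's XGBoost model or data. Exact enumeration like this is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(subset):
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for s in combinations(others, r):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi[i] = total
    return phi


def toy_score(f):
    # Hypothetical habitat score, NOT a fitted model: rivers help,
    # bridges hurt, and a river next to a wetland helps a bit more.
    return 0.2 + 0.5 * f["river"] - 0.3 * f["bridge"] + 0.1 * f["river"] * f["wetland"]


cell = {"river": 1, "bridge": 1, "wetland": 1}
empty = {"river": 0, "bridge": 0, "wetland": 0}
phi = shapley_values(toy_score, cell, empty)
```

The attributions sum to the difference between the score for `cell` and the score for `empty`, and the interaction term is split evenly between `river` and `wetland`, which is the property that lets per-variable contributions be read as positive or negative impacts.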

Development of Estimation Models for Parking Units -Focused on Gwangju Metropolitan City Condominium Apartments- (주차원단위 산정 모형 개발에 관한 연구 -광주광역시 공동 주택 아파트를 대상으로-)

  • Kwon, Sung-Dae;Ko, Dong-Bong;Park, Je-Jin;Ha, Tae-Jun
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.34 no.2
    • /
    • pp.549-559
    • /
    • 2014
  • The rapid expansion of cities led to a shortage of housing in urban areas. The government compensated for this shortage through large-scale residential developments that increased the housing supply. The supply of condominium apartments remains above 83% of the entire housing supply, and the proportion of apartments is steadily increasing, at about 50%. Due to the increase, illegally parked cars resulting from the shortage of parking spaces within apartment complexes have become increasingly problematic, as they block the passage of emergency vehicles and heighten tension among neighboring residents competing for parking spaces. In particular, parking is planned with future residents in mind, based on the estimated parking demand. However, the parking unit method used to estimate parking demand is based on the area of exclusive space, which is believed to be far from actual parking demand. The reason for this discrepancy is that, as the number of households decreases and the area of exclusive space is expanded, the planned parking increases. On the other hand, when the number of households increases and the area of exclusive space is reduced, the planned parking decreases; thus, a method to recalculate the parking units based on estimated parking demand is an urgent concern. To estimate the parking units for condominium apartments, this study first examined the existing research literature and selected the survey sites from which to collect the necessary data. In addition, field study data and surveys were collected and analyzed in order to identify the problems underlying parking units, and problems with the current traffic impact assessment parking unit calculation method were deduced. By identifying the factors that influence parking demand estimates and performing a factor analysis on the collected data, variables related to the parking demand estimates were selected to develop the parking unit estimation model. Finally, by comparing and verifying the existing traffic impact assessment parking unit estimate against the newly developed model using the collected data, a far more realistic parking unit estimate was suggested, reflecting the characteristics of the residents. The parking unit estimation model developed in this study is anticipated to serve as a guideline for future parking lot legislation, as well as a basis for more realistic estimates of parking demand based on the resident characteristics of an apartment complex.

Geochemistry of Total Gaseous Mercury in Nan-Ji-Do, Seoul, Korea (난지도 지역의 대기수은 지화학)

  • Kim, Min-Young;Lee, Gang-Woong;Shin, Jae-Young;Kim, Ki-Hyun
    • Journal of the Korean earth science society
    • /
    • v.21 no.5
    • /
    • pp.611-622
    • /
    • 2000
  • To investigate the exchange rates of mercury (Hg) across the soil-air boundary, we measured Hg flux using the gradient technique at a major waste reclamation site, Nan-Ji-Do. Based on these measurement data, we attempted to provide insights into various aspects of Hg exchange in a strongly polluted soil environment. According to our analysis, the study site turned out to be not only a major emission source area but also a major sink area. When these data were compared on an hourly basis over a full-day scale, large fluxes of emission and deposition were concentrated in daytime periods relative to nighttime periods. However, when the frequency with which emission or deposition occurred was compared, a very contrasting pattern emerged. While emission was dominant during nighttime periods, deposition was most favored during daytime periods. When a similar comparison was made as a function of wind direction, it was noticed that there may be a major Hg source to the east that causes significant deposition of Hg in the study area. To account for the environmental conditions controlling the vertical direction of Hg exchange, we compared environmental conditions for both the whole data group and those observed from the wind direction of strong deposition events. Results of this analysis indicated that the concentrations of pollutant species varied sensitively enough to reflect the environmental conditions for each direction of exchange. When correlation analysis was applied to our data, results indicated that wind speed and ozone concentrations best reflected changes in the magnitudes of emission/deposition fluxes. The results of factor analysis also indicated the possibility that Hg emission in the study area is a temperature-driven process, while deposition is affected by the mixed effects of various factors including temperature, ozone, and non-methane HCs. If the computed emission rate is extrapolated to the whole study area, we estimate that annual emission of Hg from the study area can amount to approximately 6 kg.

Design and Implementation of an Execution-Provenance Based Simulation Data Management Framework for Computational Science Engineering Simulation Platform (계산과학공학 플랫폼을 위한 실행-이력 기반의 시뮬레이션 데이터 관리 프레임워크 설계 및 구현)

  • Ma, Jin;Lee, Sik;Cho, Kum-won;Suh, Young-kyoon
    • Journal of Internet Computing and Services
    • /
    • v.19 no.1
    • /
    • pp.77-86
    • /
    • 2018
  • For the past few years, KISTI has been operating an online simulation execution platform, called EDISON, that allows users to conduct simulations with various scientific applications supplied by diverse computational science and engineering disciplines. Typically, these simulations involve large-scale computation and accordingly produce a huge volume of output data. One critical issue arising when conducting those simulations on an online platform stems from the fact that a number of users simultaneously submit to the platform simulation requests (or jobs) with the same (or almost unchanged) input parameters or files, placing a significant burden on the platform. In other words, identical computing jobs consume computing and storage resources redundantly at an undesirably fast pace. To overcome excessive resource usage by such identical simulation requests, in this paper we introduce a novel framework, called IceSheet, to efficiently manage simulation data based on execution metadata, that is, provenance. The IceSheet framework captures and stores the provenance associated with each conducted simulation. The collected provenance records are used not only for detecting duplicate simulation requests but also for searching existing simulation results via an open-source search engine, ElasticSearch. In particular, this paper elaborates on the core components of the IceSheet framework that support the search and reuse of stored simulation results. We implemented a prototype of the proposed framework using the engine in conjunction with the online simulation execution platform. Our evaluation of the framework was performed on real simulation execution-provenance records collected on the platform. Once the prototyped IceSheet framework is fully functional with the platform, users can quickly search for past parameter values entered into the desired simulation software and, if any exist, receive existing results for the same input parameter values. Therefore, we expect that the proposed framework will contribute to eliminating duplicate resource consumption and significantly reducing execution time for requests identical to previously executed simulations.
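
The duplicate-request idea in this abstract can be sketched as a lookup keyed on a fingerprint of the execution provenance. The code below is only an illustration under stated assumptions: `provenance_key`, `ProvenanceStore`, the choice of SHA-256 over canonical JSON, and the in-memory dictionary are all hypothetical and do not describe how IceSheet or its ElasticSearch-backed search is actually implemented.

```python
import hashlib
import json


def provenance_key(simulator, params, input_files=None):
    """Fingerprint a request from its simulator name, parameters, and inputs.

    Keys are sorted so that parameter order does not change the fingerprint.
    """
    record = {
        "simulator": simulator,
        "params": params,
        "inputs": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in (input_files or {}).items()
        },
    }
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProvenanceStore:
    """Hypothetical in-memory store that reuses results of identical requests."""

    def __init__(self):
        self._results = {}

    def run(self, simulator, params, execute, input_files=None):
        """Return (result, reused); execute(params) runs only on a cache miss."""
        key = provenance_key(simulator, params, input_files)
        if key in self._results:
            return self._results[key], True
        result = execute(params)
        self._results[key] = result
        return result, False


calls = []


def fake_solver(params):
    # Stand-in for an expensive simulation job.
    calls.append(params)
    return params["mesh"] * params["steps"]


store = ProvenanceStore()
first = store.run("cfd-demo", {"mesh": 64, "steps": 10}, fake_solver)
second = store.run("cfd-demo", {"steps": 10, "mesh": 64}, fake_solver)
```

Here the second request differs only in parameter order, so it maps to the same key and the stored result is returned without running the solver again. A production system would also need to persist the records and decide how to treat "almost unchanged" inputs, which this sketch does not attempt.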

Sapflux Measurement Database Using Granier's Heat Dissipation Method and Heat Pulse Method (수액류 측정 데이터베이스: 그래니어(Granier) 센서 열손실탐침법(Heat Dissipation Method)과 열파동법(Heat Pulse Method)을 이용한 수액류 측정)

  • Lee, Minsu;Park, Juhan;Cho, Sungsik;Moon, Minkyu;Ryu, Daun;Lee, Hoontaek;Lee, Hojin;Kim, Sookyung;Kim, Taekyung;Byeon, Siyeon;Jeon, Jihyun;Bhusal, Narayan;Kim, Hyun Seok
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.22 no.4
    • /
    • pp.327-339
    • /
    • 2020
  • Transpiration is the movement of water into the atmosphere through the leaf stomata of plants, and it accounts for more than half of evapotranspiration from the land surface. Transpiration can be measured in various ways, including the eddy covariance and water balance methods. However, transpiration measurements of individual trees are necessary to quantify and compare the water use of each species and individual component within stands. For measuring the transpiration of individual trees, thermometric methods such as the heat dissipation and heat pulse methods are widely used. However, it is difficult and labor-intensive to maintain transpiration measurements of individual trees over a wide area, especially in long-term experiments. Therefore, sharing sap flow data through a database should help promote studies on transpiration and water balance at large spatial scales. In this paper, we present a sap flow database, which contains Granier-type sap flux data from 18 Korean pine (Pinus koraiensis) since 2011 and 16 Quercus aliena since 2013 in the Mt. Taehwa Seoul National University forest, and from 18 needle fir (Abies holophylla), seven Quercus serrata, and three each of Carpinus laxiflora and C. cordata since 2013 in Gwangneung. In addition, the database includes the sapling transpiration of nine species (Prunus sargentii, Larix kaempferi, Quercus acutissima, Pinus densiflora, Fraxinus rhynchophylla, Chamaecyparis obtusa, P. koraiensis, Betula platyphylla, A. holophylla, Pinus thunbergii), which has been measured using the heat pulse method since 2018. We believe this is the first database to share sap flux data in the Republic of Korea, and we hope that it will be used by other researchers and contribute to a variety of research in this field.

A Study on Foreign Exchange Rate Prediction Based on KTB, IRS and CCS Rates: Empirical Evidence from the Use of Artificial Intelligence (국고채, 금리 스왑 그리고 통화 스왑 가격에 기반한 외환시장 환율예측 연구: 인공지능 활용의 실증적 증거)

  • Lim, Hyun Wook;Jeong, Seung Hwan;Lee, Hee Soo;Oh, Kyong Joo
    • Knowledge Management Research
    • /
    • v.22 no.4
    • /
    • pp.71-85
    • /
    • 2021
  • The purpose of this study is to find out which artificial intelligence methodology is most suitable for creating a foreign exchange rate prediction model using indicators from the bond market and the interest rate market. KTBs and MSBs, which are representative products of the Korean bond market, are sold on a large scale when risk aversion occurs, and in such cases the USD/KRW exchange rate often rises. When USD liquidity problems occur in the onshore Korean market, the KRW Cross-Currency Swap price in the interest rate market falls, which acts as a signal to buy USD/KRW in the foreign exchange market. Considering that the prices and movements of products traded in the bond market and interest rate market directly or indirectly affect the foreign exchange market, the three markets can be regarded as having a close and complementary relationship. There have been studies that reveal the relationship and correlation between the bond market, interest rate market, and foreign exchange market, but many past exchange rate prediction studies have mainly relied on macroeconomic indicators such as GDP, current account surplus/deficit, and inflation, while active research to predict exchange rates using artificial intelligence based on bond market and interest rate market indicators has not yet been conducted. This study uses bond market and interest rate market indicators and runs an artificial neural network suitable for nonlinear data analysis, logistic regression suitable for linear data analysis, and a decision tree suitable for both nonlinear and linear data analysis, and shows that the artificial neural network is the most suitable methodology for predicting foreign exchange rates, which are nonlinear time series data. Beyond revealing the simple correlation between the bond market, interest rate market, and foreign exchange market, capturing the trading signals among the three markets to reveal their active correlation and demonstrate their mutual organic movement is expected not only to provide foreign exchange market traders with a new trading model but also to contribute to increasing the efficiency and knowledge management of the entire financial market.