• Title/Summary/Keyword: data preprocessing


Finite Element Method Modeling for Individual Malocclusions: Development and Application of the Basic Algorithm (유한요소법을 이용한 환자별 교정시스템 구축의 기초 알고리즘 개발과 적용)

  • Shin, Jung-Woog;Nahm, Dong-Seok;Kim, Tae-Woo;Lee, Sung Jae
    • The korean journal of orthodontics
    • /
    • v.27 no.5 s.64
    • /
    • pp.815-824
    • /
    • 1997
  • The purpose of this study is to develop the basic algorithm for finite element method modeling of individual malocclusions. Preprocessing usually takes a great deal of time. To reduce the time required, we developed a standardized procedure for measuring the position of each tooth and a program that performs the preprocessing automatically. The following procedures were carried out to complete this study. 1. The morphologies of twenty-eight teeth were constructed three-dimensionally for the finite element analysis and saved as separate files. 2. Standard brackets were attached so that the FA points coincide with the centers of the brackets. 3. A study model of the patient was made. 4. Using the study model, the crown inclination, the angulation, and the vertical distance from the tip of each tooth were measured with specially designed tools. 5. The arch form was determined from a picture of the model with an image processing technique. 6. The measured data were input as a rotation matrix. 7. The program produces an output file containing the necessary information about the three-dimensional position of the teeth, which is applicable to several commonly used finite element programs. The program for the basic algorithm was written in Turbo-C, and the resulting output file was applied to ANSYS. This standardized measuring procedure and the program reduce the time required, especially for preprocessing, and can easily be applied to other malocclusions.
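Step 6 above (entering the measured angles as a rotation matrix) can be illustrated with a minimal sketch. The abstract does not give the axis conventions or the order of rotations, so the choices here (inclination about the x axis, angulation about the y axis, inclination applied first) and all function names are assumptions for illustration only, not the authors' Turbo-C program.

```python
import math

def rot_x(deg):
    """Rotation matrix about the x axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(deg):
    """Rotation matrix about the y axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate_point(R, v):
    """Apply the 3x3 matrix R to the 3-vector v."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def tooth_rotation(inclination_deg, angulation_deg):
    # Assumed convention: inclination about x is applied first,
    # then angulation about y.
    return matmul(rot_y(angulation_deg), rot_x(inclination_deg))

# Hypothetical measurements for one tooth, in degrees.
R = tooth_rotation(7.0, 5.0)
node = rotate_point(R, [0.0, 0.0, 1.0])  # rotate one mesh node of the tooth
```

Each node of a stored tooth mesh would be rotated this way and then translated to its place on the arch form; the translation is omitted here.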


A Comparative Study on the Effective Deep Learning for Fingerprint Recognition with Scar and Wrinkle (상처와 주름이 있는 지문 판별에 효율적인 심층 학습 비교연구)

  • Kim, JunSeob;Rim, BeanBonyka;Sung, Nak-Jun;Hong, Min
    • Journal of Internet Computing and Services
    • /
    • v.21 no.4
    • /
    • pp.17-23
    • /
    • 2020
  • Biometric information, which refers to measurements of human characteristics, has attracted great attention as a highly reliable security technology because it cannot be stolen or lost. Among biometric information, fingerprints are mainly used in fields such as identity verification and identification. When a fingerprint image contains a problem that makes authentication difficult, such as a wound, wrinkle, or moisture, a fingerprint expert identifies the problem directly in the preprocessing step and applies an image processing algorithm appropriate to it. In this case, artificial intelligence software that distinguishes fingerprint images with cuts and wrinkles makes it easy to check whether cuts or wrinkles are present, and the fingerprint image can then be improved by selecting an appropriate algorithm. In this study, we built a database of 17,080 fingerprints by acquiring all fingerprints of 1,010 students from the Royal University of Cambodia, 600 people in the Sokoto open data set, and 98 Korean students. Criteria were established for determining whether a fingerprint in the database shows injuries or wrinkles, and the data were validated by experts. The training and test data sets consisted of the Cambodian and Sokoto data, split at a ratio of 8:2. The data of the 98 Korean students were used as a validation data set. Using the constructed data set, five CNN-based architectures were implemented: Classic CNN, AlexNet, VGG-16, ResNet50, and YOLO v3. We then compared the models to find the one that performed best. Among the five architectures, ResNet50 showed the best performance with 81.51%.

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.121-142
    • /
    • 2023
  • As the number and weight of imported food steadily increase, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance as well as import inspections at the customs clearance stage. However, because time, cost, and resources are limited, a data-based safety management plan for imported food is needed. In this study, we tried to increase the efficiency of on-site inspection by preparing a machine learning prediction model that pre-selects the companies expected to fail before the on-site inspection. We collected basic information on 303,272 foreign food facilities and processing businesses from the Integrated Food Safety Information Network, along with 1,689 cases of on-site inspection data from 2019 to April 2022. After preprocessing the data of foreign food facilities, only the data subject to on-site inspection were extracted using the foreign food facility_code. The resulting data set consisted of 1,689 records and 103 variables. Of the 103 variables, those whose Theil-U index was '0' were removed, and after further reduction by Multiple Correspondence Analysis, 49 characteristic variables were finally derived. We built eight different models and performed hyperparameter tuning through 5-fold cross-validation. Then, the performance of the generated models was evaluated. Because the purpose of selecting companies for on-site inspection is to catch nonconforming companies, the goal is to maximize recall, which is the probability of judging nonconforming companies as nonconforming. After applying various machine learning algorithms, the Random Forest model, which had the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy, was evaluated as the best model.
Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the reason each individual facility was selected as nonconforming, and discuss applicability to the on-site inspection facility selection system. The results of this study are expected to contribute to the efficient use of limited resources such as manpower and budget by establishing an imported food management system built on a data-based scientific risk management model.
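The recall metric that the abstract optimizes can be made concrete with a small sketch. This is plain Python rather than the authors' pipeline, and the labels below are hypothetical; macro recall is computed here as the unweighted mean of the per-class recalls.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        # True positives for class c and the number of true members of c.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        support = sum(1 for t in y_true if t == c)
        recalls.append(tp / support)
    return sum(recalls) / len(recalls)

# Hypothetical inspection outcomes: 1 = nonconforming, 0 = conforming.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

score = macro_recall(y_true, y_pred)  # mean of 3/4 and 4/6
```

Because each class contributes equally regardless of its size, this metric does not let a large conforming majority mask poor detection of the rare nonconforming class.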

Study on the Limitation of AVO Responses Shown in the Seismic Data from East-sea Gas Reservoir (동해 가스전 탄성파 자료에서 나타나는 AVO 반응의 한계점에 대한 고찰)

  • Shin, Seung-Il;Byun, Joong-Moo;Choi, Hyung-Wook;Kim, Kun-Deuk;Ko, Seung-Won;Seo, Young-Tak;Cha, Young-Ho
    • Geophysics and Geophysical Exploration
    • /
    • v.11 no.3
    • /
    • pp.242-249
    • /
    • 2008
  • AVO analysis of seismic subsurface sections has recently been widely used in oil exploration as a direct indicator of the existence of gas. For deep reservoirs such as the gas reservoirs in the East-sea, it is often difficult to observe AVO responses in CMP gathers even though bright spots appear in the stacked section. Because a reservoir becomes more consolidated as its depth increases, the P-wave velocity does not decrease significantly when the pore fluid is replaced by gas. Thus the difference in Poisson's ratio between the reservoir and the layer above it, which is a key factor for the AVO response, does not increase significantly. In this study, we analyzed the effects of the Poisson's ratio difference on the AVO response using a variety of Poisson's ratios for the upper and lower layers. The results show that, as the difference in Poisson's ratio between the upper and lower layers decreases, the change in reflection amplitude with incidence angle decreases and AVO responses become insignificant. To examine the limitation of the AVO responses shown in the East-sea gas reservoir, a velocity model was built by simulating the Gorae V structure with seismic data and well logs. A comparison of the AVO responses observed in the synthetic data with theoretical AVO responses calculated from material properties shows that the change in reflection amplitude with increasing incidence angle is very small when the difference in Poisson's ratio between the upper and lower layers is small. In addition, the characteristics of the AVO responses were concealed by noise or amplitude distortion arising during preprocessing. To overcome such limitations of AVO analysis for data from deep reservoirs, we need to acquire reflection amplitudes precisely in the data acquisition stage and use processing tools that preserve reflection amplitude in the data processing stage.
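The link between velocity and Poisson's ratio in this argument can be sketched with the standard isotropic-elastic relation ν = (Vp² − 2Vs²) / (2(Vp² − Vs²)). The velocities below are hypothetical round numbers chosen only to illustrate the trend; they are not taken from the Gorae V model.

```python
def poisson_ratio(vp, vs):
    """Poisson's ratio of an isotropic elastic medium from P- and S-wave velocity."""
    return (vp**2 - 2 * vs**2) / (2 * (vp**2 - vs**2))

# Hypothetical velocities in m/s (assumed for illustration only).
# Shallow, unconsolidated sand: gas lowers Vp strongly.
shallow_brine = poisson_ratio(2500.0, 1000.0)
shallow_gas = poisson_ratio(1900.0, 1000.0)

# Deep, consolidated sand: gas barely changes Vp.
deep_brine = poisson_ratio(4000.0, 2300.0)
deep_gas = poisson_ratio(3900.0, 2300.0)

shallow_drop = shallow_brine - shallow_gas
deep_drop = deep_brine - deep_gas
print(f"shallow drop in Poisson's ratio: {shallow_drop:.3f}")
print(f"deep drop in Poisson's ratio:    {deep_drop:.3f}")
```

With these assumed values, the drop in Poisson's ratio on gas substitution is several times smaller in the deep case, which matches the abstract's explanation of why the AVO response weakens for deep reservoirs.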

Oceanic Application of Satellite Synthetic Aperture Radar - Focused on Sea Surface Wind Retrieval - (인공위성 합성개구레이더 영상 자료의 해양 활용 - 해상풍 산출을 중심으로 -)

  • Jang, Jae-Cheol;Park, Kyung-Ae
    • Journal of the Korean earth science society
    • /
    • v.40 no.5
    • /
    • pp.447-463
    • /
    • 2019
  • Sea surface wind is a fundamental element for understanding oceanic phenomena and for analyzing changes in the Earth's environment caused by global warming. Research institutes worldwide have developed and operated scatterometers to observe the sea surface wind accurately and continuously, with an accuracy of approximately ${\pm}20^{\circ}$ for wind direction and ${\pm}2\;\mathrm{m\,s^{-1}}$ for wind speed. Given that the spatial resolution of a scatterometer is 12.5-25.0 km, the applicability of the data to coastal areas is limited by the complicated coastlines and many islands around the Korean Peninsula. In contrast, Synthetic Aperture Radar (SAR), a type of microwave sensor, is an all-weather instrument that enables us to retrieve sea surface wind at high resolution (<1 km) and compensate for the coarse resolution of the scatterometer. In this study, we investigated the Geophysical Model Functions (GMF), the algorithms for retrieving sea surface wind speed from SAR data for each radar band (C-, L-, or X-band). We reviewed the simulation of backscattering coefficients as a function of relative wind direction, incidence angle, and wind speed using the LMOD, CMOD, and XMOD model functions, and analyzed the characteristics of each GMF. We also reviewed previous studies on the validation of wind speeds retrieved from SAR data with these GMFs. The accuracy of sea surface wind from SAR data varied with the observation mode, GMF type, reference data used for validation, preprocessing method, and method used to calculate the relative wind direction. This study is expected to help potential users of SAR images who retrieve wind speeds from SAR data in the coastal region around the Korean Peninsula.

Improved Social Network Analysis Method in SNS (SNS에서의 개선된 소셜 네트워크 분석 방법)

  • Sohn, Jong-Soo;Cho, Soo-Whan;Kwon, Kyung-Lag;Chung, In-Jeong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.117-127
    • /
    • 2012
  • With the recent expansion of Web 2.0-based services and the widespread use of smartphones, online social network services have become popular among users. Online social network services are online community services that enable users to communicate with each other, share information, and expand human relationships. In social network services, the relations between users are represented by a graph consisting of nodes and links. As the number of users of online social network services increases rapidly, SNS are actively used in enterprise marketing, analysis of social phenomena, and so on. Social Network Analysis (SNA) is a systematic way to analyze social relationships among the members of a social network using network theory. In general, a social network consists of nodes and arcs, and it is often depicted in a social network diagram. In a social network diagram, nodes represent individual actors within the network and arcs represent relationships between the nodes. With SNA, we can measure relationships among people such as degree of intimacy, intensity of connection, and classification of groups. Since Social Networking Services (SNS) began drawing attention from millions of users, numerous studies have analyzed their user relationships and messages. Typical SNA methods are degree centrality, betweenness centrality, and closeness centrality. In degree centrality analysis, the shortest path between nodes is not considered. However, it is a crucial factor in betweenness centrality, closeness centrality, and other SNA methods. In previous SNA research, computation time was not a serious cost because the social networks were small. Unfortunately, most SNA methods require significant time to process the relevant data, which makes it difficult to apply them to the ever-increasing SNS data in social network studies.
For instance, if the number of nodes in an online social network is n, the maximum number of links is n(n-1)/2. This makes analysis of the social network very expensive: for example, if the number of nodes is 10,000, the number of links is 49,995,000. Therefore, we propose a heuristic-based method for finding the shortest path among users in the SNS user graph. Using this shortest-path method, we show how efficient the proposed approach is by conducting betweenness centrality analysis and closeness centrality analysis, both of which are widely used in social network studies. Moreover, we devised an enhanced method that adds a best-first-search method and a preprocessing step to reduce computation time and speed up the search for shortest paths in a very large online social network. The best-first-search method finds the shortest path heuristically, in a way that generalizes human experience. Because a large number of links are shared by only a few nodes in online social networks, most nodes have relatively few connections. As a result, a node with many connections functions as a hub node. When searching for a particular node, looking at users with numerous links first, instead of searching all users indiscriminately, gives a better chance of finding the desired node quickly. In this paper, we employ the degree of user node vn as the heuristic evaluation function in a graph G = (N, E), where N is a set of vertices and E is a set of links between two different nodes. With this heuristic evaluation function, the worst case occurs when the target node is situated at the bottom of a skewed tree. To eliminate such target nodes, the preprocessing step is conducted. Next, we efficiently find the shortest path between two nodes in the social network and then analyze the network. To verify the proposed method, we crawled data on 160,000 people online and constructed a social network from it.
Then we compared the proposed method with previous methods, namely best-first-search and breadth-first-search, in terms of search and analysis time. The suggested method takes 240 seconds to search nodes, whereas the breadth-first-search based method takes 1,781 seconds (7.4 times faster). Moreover, for social network analysis, the suggested method is 6.8 times and 1.8 times faster in betweenness centrality analysis and closeness centrality analysis, respectively. The proposed method shows that a large social network can be analyzed with better time performance. As a result, our method would improve the efficiency of social network analysis, making it particularly useful in studying social trends or phenomena.
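The link-count arithmetic and the degree-based heuristic described in this abstract can be sketched as follows. This is a generic greedy best-first search that expands high-degree (hub) nodes first; it is not the authors' implementation, it omits their preprocessing step, and a greedy search of this kind is not guaranteed to return a shortest path. The graph and node names are hypothetical.

```python
import heapq

def max_links(n):
    """Maximum number of undirected links among n nodes: n(n-1)/2."""
    return n * (n - 1) // 2

def degree_best_first_path(graph, start, goal):
    """Greedy best-first search that always expands the highest-degree node.

    `graph` maps each node to a list of its neighbours. Node labels must be
    mutually comparable because they break ties in the priority queue.
    """
    # heapq is a min-heap, so the degree is negated to pop hubs first.
    frontier = [(-len(graph[start]), start)]
    parent = {start: None}
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                heapq.heappush(frontier, (-len(graph[neighbour]), neighbour))
    return None  # goal is not reachable from start

# Hypothetical toy network with one hub node.
graph = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
    "hub": ["a", "b", "c"],
}

print(max_links(10_000))                        # 49995000, as in the abstract
print(degree_best_first_path(graph, "a", "d"))  # ['a', 'hub', 'c', 'd']
```

In a network where a few hubs hold most of the links, routing the search through hubs first tends to reach the target after expanding far fewer nodes than an exhaustive breadth-first sweep.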

Analysis of Research Trends of 'Word of Mouth (WoM)' through Main Path and Word Co-occurrence Network (주경로 분석과 연관어 네트워크 분석을 통한 '구전(WoM)' 관련 연구동향 분석)

  • Shin, Hyunbo;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.179-200
    • /
    • 2019
  • Word-of-mouth (WoM) is defined as consumer activity that shares information concerning consumption. WoM activities have long been recognized as important in corporate marketing processes and have received much attention, especially in the marketing field. Recently, with the development of the Internet, the ways in which people exchange information in online news and online communities have expanded, and WoM has diversified into forms such as word of mouth, scores, ratings, and likes. Social media gives online users easy access to information, and online WoM is considered a key source of information. Although this phenomenon has prompted various studies on WoM, there is no meta-analysis that comprehensively analyzes them. This study proposes a method that applies text mining techniques to extract major studies and grasp their main issues, in order to find the trend of WoM research using scholarly big data. To this end, a total of 4389 documents from 1941 to 2018 were collected with the keyword 'Word-of-mouth' from Scopus (www.scopus.com), a citation database, and the data were refined through preprocessing such as English morphological analysis, stopword removal, and noun extraction. To carry out this study, we adopted main path analysis (MPA) and word co-occurrence network analysis. MPA detects key studies, is used to track the development trajectory of an academic field, and presents the research trend from a macro perspective. For this, we constructed a citation network based on the collected data. In the citation network, a node represents a document and a link represents a citation relation. We then detected the key-route main path by applying SPC (Search Path Count) weights. As a result, a main path composed of 30 documents was extracted from the citation network.
The main path confirmed how the academic field changed over time, reflecting industrial changes across various industry groups. The results of MPA revealed that WoM research can be divided into five periods: (1) establishment of aspects and critical elements of WoM, (2) relationship analysis between WoM variables, (3) beginning of research on online WoM, (4) relationship analysis between WoM and purchase, and (5) broadening of topics. Changes within the industry, such as online development and social media, were reflected in the results. Very recent studies showed that the topics and approaches related to WoM were diversifying in response to changing circumstances. However, the results showed that even though WoM was used in diverse fields, the mainstream of WoM research from start to end was related to marketing and to identifying the influential factors that proliferate WoM. By applying word co-occurrence network analysis, the research trend is presented from a microscopic point of view. A word co-occurrence network was constructed to analyze the relationships between keywords, and social network analysis (SNA) was applied to it. We divided the data into three periods to investigate the periodic changes and trends in the discussion of WoM. SNA showed that Period 1 (1941~2008) consisted of clusters regarding relationship, source, and consumers. Period 2 (2009~2013) contained clusters of satisfaction, community, social networks, review, and internet. Clusters of Period 3 (2014~2018) involved satisfaction, medium, review, and interview. The periodic changes of the clusters showed a transition from offline to online WoM. The media of WoM have become an important factor in spreading the word. This study conducted a quantitative meta-analysis based on scholarly big data regarding WoM.
The main contribution of this study is that it provides a micro perspective on the research trend of WoM as well as the macro perspective. The limitation of this study is that the citation network constructed in this study is a network based on the direct citation relation of the collected documents for MPA.
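The SPC (Search Path Count) weighting mentioned in this abstract can be sketched on a toy citation network. In this sketch the SPC weight of a link (u, v) is the number of source-to-sink paths that pass through it, computed as (paths from any source to u) × (paths from v to any sink). The edge direction (cited document to citing document), the document labels, and the function name are assumptions for illustration; this is not the authors' tooling.

```python
from functools import lru_cache

def spc_weights(edges):
    """Search Path Count weight for every edge of an acyclic citation network."""
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)

    @lru_cache(maxsize=None)
    def paths_from_sources(node):
        # A node without predecessors is a source and counts as one path.
        parents = pred.get(node, [])
        return 1 if not parents else sum(paths_from_sources(p) for p in parents)

    @lru_cache(maxsize=None)
    def paths_to_sinks(node):
        # A node without successors is a sink and counts as one path.
        children = succ.get(node, [])
        return 1 if not children else sum(paths_to_sinks(c) for c in children)

    return {(u, v): paths_from_sources(u) * paths_to_sinks(v) for u, v in edges}

# Hypothetical network: A and B are the oldest documents, D and E the newest.
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("A", "D")]
weights = spc_weights(edges)
for edge, weight in sorted(weights.items()):
    print(edge, weight)
```

Links with the largest weights lie on the most source-to-sink paths, and chaining them yields a main path. The recursive formulation is meant for small examples; a network of thousands of documents would call for an iterative pass in topological order.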

Feature Ranking for Detection of Neuro-degeneration and Vascular Dementia in micro-Raman spectra of Platelet (특징 순위 방법을 이용한 혈소판 라만 스펙트럼에서 퇴행성 뇌신경질환과 혈관성 인지증 분류)

  • Park, Aa-Ron;Baek, Sung-June
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.48 no.4
    • /
    • pp.21-26
    • /
    • 2011
  • Feature ranking is useful for gaining knowledge of data and identifying relevant features. In this study, we propose the use of feature ranking for classifying neuro-degeneration and vascular dementia in micro-Raman spectra of platelets. The entire spectrum is divided into local regions that each include several peaks, and each region is then modeled by Gaussian curve fitting. Local minima are selected from each subregion, and the background is then removed by interpolation based on their positions. After these preprocessing steps, significant features were selected by the feature ranking method to improve the classification accuracy and reduce the computational complexity of the classification system. The selected features and the full feature set were each transformed by PCA (principal component analysis), and the resulting principal components were used for classification. These were classified with MAP (maximum a posteriori), and the results were compared with the classification results obtained using all features. In all experiments, the computational complexity of the classification system was remarkably reduced and the classification accuracy was partially increased. In particular, the proposed method increased the classification accuracy by an average of 1.7% in the experiment classifying Parkinson's disease versus normal samples. These results confirm that the proposed method can be used efficiently in a platelet-based classification system for neuro-degenerative disease and vascular dementia.

Hardware-Based High Performance XML Parsing Technique Using an FPGA (FPGA를 이용한 하드웨어 기반 고성능 XML 파싱 기법)

  • Lee, Kyu-hee;Seo, Byeong-seok
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.12
    • /
    • pp.2469-2475
    • /
    • 2015
  • Structured XML has been widely used to present services on various Web services. XML is also used for digital documents and digital signatures and for the representation of multimedia files in email systems. An XML document must first be parsed to access its elements. Parsing is the most compute-intensive task in the use of XML documents. Most previous work has focused on hardware-based XML parsers in order to improve parsing performance, while little work has studied parsing techniques. We present a high-performance parsing technique that can be used in all XML parsers and design a hardware-based XML parser using an FPGA. The proposed parsing technique uses element analyzers instead of a state machine and performs multibyte-based element matching. As a result, our parsing technique reduces the number of clock cycles per byte (CPB) and does not require any preprocessing, such as loading XML data into memory. Compared to other parsers, our parser achieves a 1.33~1.82 times improvement in system performance. Therefore, the proposed parsing technique can process XML documents in real time and is suitable for all XML parsers.

Measurement of Quality Parameters of Honey by Reflectance Spectra

  • Park, Chang-Hyun;Yang, Won-Jun;Sohn, Jae-Hyung;Kim, Jong-Hoon
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1530-1530
    • /
    • 2001
  • The objectives of this study were to develop models to predict quality parameters of Korean bee-honeys by visible and NIR spectroscopic techniques. Two kinds of bee-honey, from acacia and polyflower sources, were tested in this study. The honeys were harvested in the spring of 2000 and stored in a storage facility at 20 $^{\circ}$C during the experiments. A total of 394 honey samples were analyzed. Reflectance spectra, moisture content, ash, invert sugar, sucrose, F/G (fructose/glucose) ratio, HMF (hydroxymethyl furfural), and C12/C13 ratio of the honeys were measured. The average values for the tested honeys were 19.9% moisture content, 0.12% ash, 68.4% invert sugar, 5.7% sucrose, an F/G (fructose/glucose) ratio of 1.27, 14.4 mg/kg of HMF, and a C12/C13 ratio of -19.1. A spectrophotometer equipped with a single-beam scanning monochromator (NIR Systems, Model 6500, USA) and a horizontal setup module was used to collect reflectance data from the honey. The reflectance spectra were measured over the wavelength range of 400∼2,498 nm at 2 nm intervals. Thirty-two repetitive scans were averaged, transformed to log(1/Reflectance), and then stored in a microcomputer file, forming one spectrum per measurement. A sample cell and a reflectance plate were made to hold the honey samples consistently. The spectra of the honey samples were divided into a calibration set and a validation set. The calibration set was used during model development, and the validation set was used to predict quality parameters from unknown spectra. PLS (Partial Least Squares) models were developed to predict the quality parameters of the honeys. The first and second derivatives of the raw spectra were also used to develop the models with a proper smoothing gap. MSC (multiplicative scatter correction) and SNV & Dtr. (standard normal variate and detrending) preprocessing were applied to all spectra to minimize sample-to-sample light scatter differences.
The PLS models showed good relationships between predicted and measured quality parameters of honeys in the wavelength range of 1100∼2200 nm. However, the PLS analysis was not good enough to predict HMF of honeys.
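The SNV (standard normal variate) preprocessing named in this abstract can be sketched in a few lines: each spectrum is centered on its own mean and scaled by its own standard deviation, which removes additive offsets and multiplicative scaling between samples. This is a generic illustration with made-up absorbance values, not the authors' software; the sample standard deviation (n − 1 denominator) is used here, and the detrending step is omitted.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]

# Hypothetical log(1/R) values for one sample, and the same sample seen with
# an additive offset and a multiplicative scatter factor.
sample = [0.10, 0.20, 0.40, 0.30]
scattered = [2.0 * x + 0.5 for x in sample]

corrected_a = snv(sample)
corrected_b = snv(scattered)  # matches corrected_a up to rounding
```

After SNV the two spectra coincide, which is the sense in which the transform minimizes sample-to-sample scatter differences before a model such as PLS is fitted.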
