• Title/Summary/Keyword: Error patterns


Predicting the Number of Confirmed COVID-19 Cases Using Deep Learning Models with Search Term Frequency Data (검색어 빈도 데이터를 반영한 코로나 19 확진자수 예측 딥러닝 모델)

  • Sungwook Jung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.9
    • /
    • pp.387-398
    • /
    • 2023
  • The COVID-19 outbreak has significantly impacted human lifestyles and patterns. Because COVID-19 spreads through the air as well as through droplets or aerosols, people were advised to avoid face-to-face contact and overcrowded indoor places as much as possible. Therefore, if a person who has been in contact with a COVID-19 patient, or was at a place where a COVID-19 case occurred, is concerned that he or she may have been infected, it can be fully expected that he or she will search for COVID-19 symptoms on Google. In this study, an exploratory data analysis using deep learning models (DNN and LSTM) was conducted to see whether we could predict the number of confirmed COVID-19 cases by once again drawing on Google Trends, which previously played a major role in influenza surveillance and management, and combining it with data on the number of confirmed COVID-19 cases. In particular, the search term frequency data used in this study are publicly available and do not invade privacy. When the deep neural network model was applied, Seoul (9.6 million), the most populous city in South Korea, and Busan (3.4 million), the second most populous, recorded lower error rates when search term frequency data were included in the forecasts. These results demonstrate that search term frequency data play an important role in cities with a population above a certain size. We also hope that these predictions can be used as evidence for policy decisions, such as deregulation or the implementation of stronger preventive measures.
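The setup described above lends itself to a compact illustration. Below is a minimal sketch (not the authors' code) of the LSTM variant: a sliding window of past daily case counts plus a Google Trends search index predicts the next day's confirmed cases. The synthetic data, window length, and layer sizes are illustrative assumptions.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

WINDOW = 14  # days of history fed into each prediction (assumed)

rng = np.random.default_rng(0)
days = 300
cases = np.abs(np.cumsum(rng.normal(0, 5, days))) + 50       # stand-in daily case counts
trends = cases / cases.max() * 100 + rng.normal(0, 5, days)  # correlated search-term index
features = np.stack([cases, trends], axis=1).astype("float32")

# Sliding-window samples: X[i] = days i..i+WINDOW-1, y[i] = cases on day i+WINDOW
X = np.stack([features[i:i + WINDOW] for i in range(days - WINDOW)])
y = cases[WINDOW:].astype("float32")

model = Sequential([
    LSTM(32, input_shape=(WINDOW, 2)),  # 2 features: cases + search frequency
    Dense(1),                           # next-day case count
])
model.compile(optimizer="adam", loss="mae")
model.fit(X, y, epochs=20, batch_size=16, validation_split=0.2, verbose=0)
print("MAE:", model.evaluate(X, y, verbose=0))
```

Dropping the `trends` column from `features` gives the cases-only baseline the paper compares against.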

Radar-based rainfall prediction using generative adversarial network (적대적 생성 신경망을 이용한 레이더 기반 초단시간 강우예측)

  • Yoon, Seongsim;Shin, Hongjoon;Heo, Jae-Yeong
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.8
    • /
    • pp.471-484
    • /
    • 2023
  • Deep learning models based on generative adversarial networks specialize in generating new information from learned information. The DGMR (deep generative model of radar) developed by Google DeepMind is a generative adversarial network that generates predictive radar images by learning complex patterns and relationships in large-scale radar image data. In this study, the DGMR model was trained using radar rainfall observation data from the Ministry of Environment, rainfall prediction was performed for a heavy rainfall case in August 2021, and the accuracy was compared with existing prediction techniques. The DGMR generally resembled the observed rainfall distribution in the first 60 minutes, but tended to predict continuous development of rainfall in cases where strong rainfall occurred over the entire area. Statistical evaluation also showed that DGMR is an effective rainfall prediction method compared to other methods, with a critical success index of 0.57 to 0.79 and a mean absolute error of 0.57 to 1.36 mm for 1-hour-ahead prediction. However, a lack of diversity in the generated results sometimes reduces prediction accuracy, so it is necessary to improve that diversity and to supplement the forecasts with rainfall predicted by a physics-based numerical model to improve accuracy more than 2 hours ahead.
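The two verification scores quoted above are straightforward to compute. The sketch below shows one plausible implementation of the critical success index (over a rain/no-rain threshold) and the mean absolute error for a pair of radar fields; the random fields and the 1 mm threshold are illustrative, not the study's settings.

```python
import numpy as np

def csi(pred_mm, obs_mm, threshold=1.0):
    """Critical success index: hits / (hits + misses + false alarms)."""
    p, o = pred_mm >= threshold, obs_mm >= threshold
    hits = np.sum(p & o)
    misses = np.sum(~p & o)
    false_alarms = np.sum(p & ~o)
    return hits / (hits + misses + false_alarms)

def mae(pred_mm, obs_mm):
    """Mean absolute error in mm."""
    return np.mean(np.abs(pred_mm - obs_mm))

# Stand-in 1-hour-ahead forecast and observation fields
pred = np.random.gamma(2.0, 1.0, size=(480, 480))
obs = np.random.gamma(2.0, 1.0, size=(480, 480))
print(f"CSI={csi(pred, obs):.2f}, MAE={mae(pred, obs):.2f} mm")
```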

Analysis of the Relationship between Melon Fruit Growth and Net Quality Using Computer Vision and Fruit Modeling (컴퓨터 비전과 과실 모델링을 이용한 멜론 과실 생장과 네트 품질의 관계 분석)

  • Seungri Yoon;Minju Shin;Jin Hyun Kim;Ji Wong Bang;Ho Jeong Jeong;Tae In Ahn
    • Journal of Bio-Environment Control
    • /
    • v.32 no.4
    • /
    • pp.456-465
    • /
    • 2023
  • Melon fruits exhibit a wide range of morphological variations in fruit shape, sugar content, net quality, diameter and weight, which are largely dependent on the variety. These characteristics significantly affect marketability. For netted varieties, the uniformity and pattern of the net serve as key factors in determining the external quality of the melon and act as indicators of its internal quality. In this study, we evaluated the effect of fruit morphology and growth on netting by analyzing the changes in melon fruit quality under LED light treatment and monitoring fruit growth. Computer vision analysis was used for quantitative evaluation of fruit net quality, and a three-variable logistic model was applied to simulate fruit growth. The results showed that melons grown under LED conditions exhibited more uniform fruit shape and improvements in both net quality and sugar content compared to the control group. The results of the logistic model showed minimal error values and consistent curve slopes across treatments, confirming its ability to accurately predict fruit growth patterns under varying light conditions. This study provides an understanding of the effects of fruit shape and growth on net quality.
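The three-variable logistic growth model referred to above is commonly written as $y(t) = a/(1 + e^{-b(t-c)})$. Below is a minimal sketch of fitting such a curve to fruit-size measurements with SciPy; the data points and starting values are invented for illustration, not the study's measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, a, b, c):
    # a: asymptotic size, b: growth rate, c: inflection day
    return a / (1.0 + np.exp(-b * (t - c)))

days = np.array([0, 5, 10, 15, 20, 25, 30, 35, 40], dtype=float)
diameter_cm = np.array([1.2, 2.0, 3.8, 6.5, 9.0, 11.2, 12.4, 12.9, 13.1])

params, _ = curve_fit(logistic, days, diameter_cm, p0=[13.0, 0.2, 15.0])
a, b, c = params
print(f"asymptote={a:.1f} cm, rate={b:.3f}/day, inflection=day {c:.1f}")
```

Comparing the fitted `b` and `c` between light treatments is the kind of curve-slope comparison the abstract describes.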

Comparative Assessment of the Half-lives of Benfuresate and Oxolinic Acid Estimated from Kinetic Models Under Field Soil Conditions (포장조건에서 Kinetic Models로부터 산출한 Benfuresate 및 Oxolinic Acid의 토양중 반감기 비교평가)

  • Yang, Jae-E.;Park, Dong-Sik;Han, Dae-Sung
    • Korean Journal of Environmental Agriculture
    • /
    • v.14 no.3
    • /
    • pp.302-311
    • /
    • 1995
  • Benfuresate and oxolinic acid, as experimental pesticides, were applied to paddy or upland soils of different textures, respectively, under field conditions, and the residual concentrations were determined. Six kinetic models were employed to identify the best-fit model describing the residual pattern of benfuresate or oxolinic acid, and the $t_{1/2}$ estimated from each model was comparatively assessed. All six models described the residual patterns significantly, but the empirical models such as PF, EL, and PB were not recommendable for $t_{1/2}$ estimation. Among the theoretical models, the residual patterns were described in the order second-order (SO) > first-order (FO) > zero-order (ZO) kinetics, judging from the size and significance of the coefficient of determination and the standard error. However, the multiple FO model, consisting of fast and slow decomposition steps, fit the residual pattern better than the single FO model, and its $r^2$ became similar to that of the SO kinetic model. Thus the multiple FO and SO models represented the best-fit models for the experimental pesticides. The $t_{1/2}$ of benfuresate estimated from the single FO kinetic model in the Weolgog and Cheongwon series was 49 and 63 days, respectively, 20 and 13% longer than the respective $t_{1/2}$ from the SO kinetic model. The $t_{1/2}$ of oxolinic acid from the FO model in the Yonggye and Ihyeon series was 87 and 51% longer than that from the SO kinetic model, respectively. These results demonstrate that the best-fit model representing the residual pattern of a pesticide, and the resulting $t_{1/2}$, may vary with the pesticide and the environmental conditions. Therefore it is recommended that the half-life of a pesticide be assessed from the best-fit model rather than uniformly from the FO kinetic model.
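To make the FO/SO comparison concrete, the sketch below fits both kinetic models to the same residue series by linear regression on the transformed concentrations and derives a half-life from each, mirroring the kind of divergence the paper reports. The concentrations are invented, not the paper's field data.

```python
import numpy as np

days = np.array([0, 7, 14, 28, 42, 63], dtype=float)
conc = np.array([2.00, 1.55, 1.25, 0.85, 0.60, 0.38])  # mg/kg, illustrative

# First-order: ln C = ln C0 - k*t  ->  t1/2 = ln 2 / k
k_fo = -np.polyfit(days, np.log(conc), 1)[0]
t_half_fo = np.log(2) / k_fo

# Second-order: 1/C = 1/C0 + k*t  ->  t1/2 = 1/(k*C0) = intercept/slope
slope, intercept = np.polyfit(days, 1.0 / conc, 1)
t_half_so = intercept / slope

print(f"FO half-life: {t_half_fo:.1f} d, SO half-life: {t_half_so:.1f} d")
```

Because the SO half-life depends on the initial concentration while the FO half-life does not, the two estimates diverge whenever the data are not strictly first-order, which is the paper's central point.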


Recommending Core and Connecting Keywords of Research Area Using Social Network and Data Mining Techniques (소셜 네트워크와 데이터 마이닝 기법을 활용한 학문 분야 중심 및 융합 키워드 추천 서비스)

  • Cho, In-Dong;Kim, Nam-Gyu
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.127-138
    • /
    • 2011
  • The core service of most research portal sites is providing research papers that match each researcher's interests. This kind of service is effective and easy to use only when a user can provide correct and concrete information about a paper, such as its title, authors, and keywords. Unfortunately, most users of this service are not acquainted with concrete bibliographic information, which implies that they inevitably experience repeated trial-and-error attempts at keyword-based search. Retrieving a relevant research paper is especially difficult when a user is a novice in the research domain and does not know appropriate keywords. In this case, a user must search iteratively: i) perform an initial search with an arbitrary keyword, ii) acquire related keywords from the retrieved papers, and iii) perform another search with the acquired keywords. This usage pattern implies that the service quality and user satisfaction of a portal site are strongly affected by its keyword management and search mechanism. To overcome this inefficiency, some leading research portal sites have adopted association rule mining-based keyword recommendation, similar to the product recommendation of online shopping malls. However, keyword recommendation based only on association analysis has the limitation that it can show only a simple, direct relationship between two keywords; association analysis by itself cannot present the complex relationships among many keywords across adjacent research areas. To overcome this limitation, we propose a hybrid approach for establishing an association network among keywords used in research papers. The keyword association network is established in the following phases: i) the set of keywords specified in a paper is regarded as a set of co-purchased items, ii) association analysis is performed on the keywords to extract frequent keyword patterns that satisfy predefined thresholds of confidence, support, and lift, and iii) the frequent keyword patterns are schematized as a network showing the core keywords of each research area and the connecting keywords among two or more research areas. To assess the practical applicability of our approach, we performed a simple experiment with 600 keywords extracted from 131 research papers published in five prominent Korean journals in 2009. In the experiment, we used SAS Enterprise Miner for the association analysis and the R software for the social network analysis. As the final outcome, we present a network diagram and a cluster dendrogram for the keyword association network, summarized in Section 4 of this paper. The main contributions of our proposed approach are as follows: i) the keyword network can provide an initial roadmap of a research area to researchers who are novices in the domain, ii) a researcher can grasp the distribution of keywords neighboring a certain keyword, and iii) researchers can get ideas for converging different research areas by observing the connecting keywords in the keyword association network. Further studies should include the following. First, the current version of our approach does not implement a standard meta-dictionary; for practical use, homonym, synonym, and multilingual problems should be resolved with a standard meta-dictionary. Additionally, clearer guidelines for clustering research areas and defining core and connecting keywords should be provided. Finally, intensive experiments should be performed not only on Korean research papers but also on international papers.
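The authors used SAS Enterprise Miner and R, but the three-phase pipeline they describe can be sketched in Python as well. The version below (an assumption, not their code) treats each paper's keyword set as a transaction, mines frequent keyword pairs with mlxtend, and turns the qualifying pairs into a networkx graph whose high-degree nodes approximate core keywords. The keyword data are invented.

```python
import pandas as pd
import networkx as nx
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Phase i: each paper's keyword set is a "transaction" of co-purchased items
papers = [
    ["data mining", "clustering", "SOM"],
    ["data mining", "association rules", "recommendation"],
    ["social network", "association rules", "recommendation"],
    ["social network", "data mining", "clustering"],
]

# Phase ii: mine frequent keyword patterns above support/confidence thresholds
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(papers).transform(papers), columns=te.columns_)
frequent = apriori(onehot, min_support=0.25, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.5)

# Phase iii: schematize qualifying keyword pairs as a network
G = nx.Graph()
for _, r in rules.iterrows():
    for a in r["antecedents"]:
        for c in r["consequents"]:
            G.add_edge(a, c, lift=r["lift"])
print(nx.degree_centrality(G))  # high-centrality nodes ~ core keywords
```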

PCA­based Waveform Classification of Rabbit Retinal Ganglion Cell Activity (주성분분석을 이용한 토끼 망막 신경절세포의 활동전위 파형 분류)

  • 진계환;조현숙;이태수;구용숙
    • Progress in Medical Physics
    • /
    • v.14 no.4
    • /
    • pp.211-217
    • /
    • 2003
  • Principal component analysis (PCA) is a well-known data analysis method that is useful for linear feature extraction and data compression. PCA is a linear transformation that applies an orthogonal rotation to the original data so as to maximize the retained variance, and it is a classical technique for obtaining an optimal overall mapping of linearly dependent patterns of correlation between variables (e.g., neurons). PCA provides, in the mean-squared-error sense, an optimal linear mapping of the signals that are spread across a group of variables. These signals are concentrated into the first few components, while the noise, i.e., variance that is uncorrelated across variables, is sequestered in the remaining components. PCA has been used extensively to resolve temporal patterns in neurophysiological recordings, and because the retinal signal is a stochastic process, PCA can be used to identify retinal spikes. The retina was isolated from an excised rabbit eye, and a piece of retina was attached, ganglion cell side down, to the surface of the microelectrode array (MEA). The MEA consisted of a glass plate with 60 substrate-integrated, insulated gold connection lanes terminating in an $8{\times}8$ array (spacing 200 ${\mu}m$, electrode diameter 30 ${\mu}m$) in the center of the plate. The MEA 60 system was used to record retinal ganglion cell activity, and the action potentials of each channel were sorted with an offline analysis tool. Spikes were detected with a threshold criterion and sorted according to their principal component composition. The first (PC1) and second (PC2) principal component values were calculated using all the waveforms of each channel and all n time points in each waveform, and several clusters could be separated clearly in two dimensions. We verified that PCA-based waveform detection is effective as an initial approach to spike sorting.
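The PCA step described here reduces each detected spike waveform to its PC1/PC2 coordinates, where clusters correspond to putative units. The sketch below illustrates the idea on simulated waveforms; k-means is used as a stand-in clustering step, since the abstract separates clusters in the PC1-PC2 plane without prescribing a specific algorithm.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 300 spikes x 40 time points: two waveform shapes plus noise
t = np.linspace(0, 1, 40)
unit_a = -np.exp(-((t - 0.3) ** 2) / 0.005)        # sharp early trough
unit_b = -0.6 * np.exp(-((t - 0.5) ** 2) / 0.02)   # broad late trough
spikes = np.vstack([
    unit_a + 0.05 * rng.standard_normal((150, 40)),
    unit_b + 0.05 * rng.standard_normal((150, 40)),
])

pcs = PCA(n_components=2).fit_transform(spikes)    # (300, 2): PC1 and PC2
labels = KMeans(n_clusters=2, n_init=10).fit_predict(pcs)
print(np.bincount(labels))  # cluster sizes ~ spikes per putative unit
```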


Physical Characterization of Domestic Aggregate (국내 골재의 물리적 특성 분석)

  • Junyoung Ko;Eungyu Park;Junghae Choi;Jong-Tae Kim
    • The Journal of Engineering Geology
    • /
    • v.33 no.1
    • /
    • pp.169-187
    • /
    • 2023
  • Aggregates from 84 cities and counties in Korea were tested for quality to allow analysis of the physical characteristics of aggregates from river, land, and forest environments. River and land aggregates were analyzed for 18 test items, and forest aggregates for 12 test items; they were classified by watershed and geology, respectively. The observed physical characteristics of the river aggregates by basin were as follows: aggregates from the Geum River basin were characterized by the amounts passing the 2.5, 1.2, 0.6, 0.3, 0.15, and 0.08 mm sieves; the Nakdong River basin by clay lumps; the Seomjin River basin by the 10, 5, and 2.5 mm sieve passages; the Youngsang River basin by the 1.2, 0.6, 0.3, 0.15, and 0.08 mm sieve passages; and the Han River basin by the 10, 5, 2.5, 1.2, 0.6, 0.3, and 0.08 mm sieve passages and stability. The standard errors of the mean amounts passing the 10, 0.6, and 0.08 mm sieves and the performance rate showed distribution patterns different from the other physical characteristics. Analysis of variance found that 16 of the 18 items, all except the absorption rate and the performance rate, had statistically significant differences in their means by region. For land aggregates by basin, those from the Nakdong River basin (but not the Geum River basin) were characterized by clay lumps, those from the Seomjin River basin by the 10 and 5 mm sieve passages, those from the Youngsang River basin by the 0.08 mm sieve passage, and those from the Han River basin by the 10, 0.6, and 0.08 mm sieve passages. The standard error of the mean quantity showed a distribution pattern different from the other physical characteristics. Analysis of variance found statistically significant differences in the means of all 18 items by region. For forest aggregates by geology, the porosity of metamorphic rocks (but not igneous rocks) showed a distribution pattern different from the other physical characteristics, and the wear rate and porosity of sedimentary rocks did likewise. There were statistically significant differences in the mean volume mass, water absorption rate, wear rate, and Sc/Rc items by geology.
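The analysis of variance used repeatedly above is a one-way ANOVA of each test item across regions. A minimal sketch follows, with invented measurements and basin labels.

```python
from scipy.stats import f_oneway

# Hypothetical water absorption rates (%) for river aggregates, by basin
absorption_by_basin = {
    "Han":     [1.2, 1.4, 1.1, 1.3],
    "Nakdong": [1.6, 1.5, 1.7, 1.4],
    "Geum":    [1.3, 1.2, 1.5, 1.4],
}
stat, p = f_oneway(*absorption_by_basin.values())
print(f"F={stat:.2f}, p={p:.3f}")  # p < 0.05 -> means differ significantly by basin
```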

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

  • Lee, Su Hyun;Park, Jung Min;Lee, Hyoung Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.111-131
    • /
    • 2015
  • Only a handful of studies have been conducted on pattern analysis of corporate distress, compared with research on bankruptcy prediction. The few that exist mainly focus on audited firms, because financial data are easier to collect for these firms. But in reality, corporate financial distress is a far more common and critical phenomenon for non-audited firms, which are mainly small and medium-sized firms. The purpose of this paper is to classify non-audited firms under distress according to their financial ratios using a data mining technique, the Self-Organizing Map (SOM). SOM is a type of artificial neural network trained with unsupervised learning to produce a lower-dimensional discretized representation of the input space of the training samples, called a map. SOM differs from other artificial neural networks in that it applies competitive learning rather than error-correction learning such as backpropagation with gradient descent, and in that it uses a neighborhood function to preserve the topological properties of the input space. It is one of the most popular and successful clustering algorithms. In this study, we classify types of financially distressed firms, specifically non-audited firms. In the empirical test, we collected 10 financial ratios of 100 non-audited firms under distress in 2004, for the previous two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios; 12% of the firms fell into this pattern. In pattern 2, financial distress was weak in almost all financial ratios; 14% of the firms fell into this pattern. In pattern 3, the growth ratio was the worst among all patterns; it is speculated that these firms may be under distress due to severe competition in their industries, and approximately 30% of the firms fell into this group. In pattern 4, the growth ratio was higher than in any other pattern, but the cash ratio and profitability ratio were not at the level of the growth ratio; these firms appear to have fallen into distress while expanding their business, and about 25% of the firms were in this pattern. Last, pattern 5 encompassed very solvent firms, perhaps distressed by a bad short-term strategic decision or by problems with the firms' entrepreneurs; approximately 18% of the firms fell under this pattern. This study makes both academic and empirical contributions. From the academic perspective, non-audited companies, which go bankrupt more easily and whose financial data are unstructured or easily manipulated, are classified with a data mining technique (the Self-Organizing Map), rather than large audited firms with well-prepared, reliable financial data. From the empirical perspective, even though only financial data of non-audited firms were analyzed, the results are useful for detecting first-order symptoms of financial distress, which helps forecast bankruptcy and manage early-warning signals. A limitation of this research is that only 100 firms were analyzed, owing to the difficulty of collecting financial data for non-audited firms, which prevented analysis by category or size. Also, non-financial qualitative data are crucial for the analysis of bankruptcy; thus, non-financial qualitative factors should be taken into account in a subsequent study. This study sheds some light on distress prediction for non-audited small and medium-sized firms in the future.
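The SOM step described above can be sketched as follows, using the MiniSom package (an assumption; the paper does not name its implementation). One hundred firms with ten standardized financial ratios are mapped onto a small grid, and each map unit collects firms with a similar distress profile; the ratios below are random stand-ins.

```python
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(1)
ratios = rng.standard_normal((100, 10))             # stand-in: 100 firms x 10 ratios
ratios = (ratios - ratios.mean(0)) / ratios.std(0)  # z-score each financial ratio

# A small map; a 3x2 grid gives up to six candidate distress patterns
som = MiniSom(x=3, y=2, input_len=10, sigma=1.0, learning_rate=0.5, random_seed=1)
som.train_random(ratios, num_iteration=1000)

# Each firm is assigned to its best-matching unit; units ~ distress patterns
assignments = [som.winner(r) for r in ratios]
print({unit: assignments.count(unit) for unit in set(assignments)})
```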

Within Field Distribution Pattern and Design of a Sampling Plan for Damaged Onions by the Onion maggot, Hylemya antiqua Meigen(Diptera: Anthomyiidae) (고자리파리에 의한 양파피해(被害)의 포장내(圃場內) 분포양식(分布樣式)과 피해량(被害量) 추정(推定)을 위한 표본추출(標本抽出) 계획(計劃))

  • Park, C.G.;Hyun, J.S.;Cho, D.J.;Lee, K.S.;Hah, J.K.
    • Korean journal of applied entomology
    • /
    • v.24 no.1 s.62
    • /
    • pp.29-33
    • /
    • 1985
  • Every plant in a $990m^2$ onion field was inspected for damage by the onion maggot. Maps were constructed every ten days from April 11 to May 21, 1984, showing which plants were infested and which were not. The maps were sectioned into squares of 80 onion plants each, and the counts of damaged onions in each square were fitted to Poisson and negative binomial distributions and tested by chi-square. The satisfactory fit of the negative binomial distribution $(P({\chi}^2)>0.05)$ provided a useful description of the spatial distribution pattern of the damaged onions. The edge effect was tested by the differences in damage ratio and variance/mean ratio (${\sigma}^2/m$) between the edge and center parts. The damage ratios and variances of all periods, and the ${\sigma}^2/m$ values after May 1, were greater in the edge part than in the center part. The maps were then sectioned into four blocks, and the squares (sample units) into quadrants. Applying the variance component technique suggested that $2{\sim}8$ sample units for a 5% sampling error, or $1{\sim}2$ sample units for a 10% error, should be sampled randomly to estimate the damage ratio when $2{\sim}3$ quadrants are inspected.
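The distribution fit described above can be reproduced in outline: estimate the negative binomial parameters from the sample mean and variance (method of moments) and test the fit with chi-square. The counts below are invented, not the 1984 field data.

```python
import numpy as np
from scipy import stats

# Damaged onions per 80-plant square (illustrative counts)
counts = np.array([0, 1, 1, 2, 0, 3, 5, 1, 0, 2, 4, 1, 0, 7, 2, 1])
m, v = counts.mean(), counts.var(ddof=1)
k = m**2 / (v - m)   # aggregation parameter (valid when variance > mean)
p = k / (k + m)      # SciPy's negative binomial parameterization

observed = np.bincount(counts)
expected = stats.nbinom.pmf(np.arange(len(observed)), k, p) * len(counts)
expected *= observed.sum() / expected.sum()  # normalize to the same total
chi2, pval = stats.chisquare(observed, expected, ddof=2)  # 2 fitted parameters
print(f"chi2={chi2:.2f}, p={pval:.3f}")  # p > 0.05 -> NB fit not rejected
```

A variance/mean ratio above 1 (here `v/m` ≈ 2) is the aggregation signature that motivates the negative binomial over the Poisson.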


A digital Audio Watermarking Algorithm using 2D Barcode (2차원 바코드를 이용한 오디오 워터마킹 알고리즘)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.2
    • /
    • pp.97-107
    • /
    • 2011
  • Nowadays there are many issues of copyright infringement on the Internet, because digital content on the network can be copied and delivered easily, and the copied version has the same quality as the original. So, copyright owners and content providers want a powerful solution to protect their content. A popular solution was DRM (digital rights management), which is based on encryption technology and rights control. However, DRM-free services were launched after Steve Jobs, then CEO of Apple, proposed a new music service paradigm without DRM, and DRM has since disappeared from the online music market. Even though online music services decided not to adopt DRM, copyright owners and content providers are still searching for a way to protect their content. One technology that can replace DRM is digital audio watermarking, which can embed copyright information into the music itself. In this paper, the author proposes a new audio watermarking algorithm with two key elements. First, the watermark information is generated from a two-dimensional barcode, which carries an error correction code, so the information can recover itself if the errors fall within the error tolerance. Second, the algorithm uses the spreading (chip) sequences of CDMA (code division multiple access). Together these make the algorithm robust to several malicious attacks. Among the many 2D barcodes, the QR code, a matrix barcode, can express information more freely than other matrix barcodes. The QR code has nested square finder patterns at three of its corners that indicate the boundary of the symbol, a feature well suited to expressing watermark information. That is, because the QR code is a two-dimensional, nonlinear matrix code, it can be modulated into a spread spectrum signal and used in the watermarking algorithm. The proposed algorithm assigns a different spread spectrum sequence to each individual user. Because the assigned code sequences are orthogonal, the watermark information of an individual user can be identified from the audio content. The algorithm uses the Walsh code as the orthogonal code. The watermark information is rearranged from the 2D barcode into a 1D sequence, modulated by the Walsh code, and embedded into the DCT (discrete cosine transform) domain of the original audio content. For the performance evaluation, three audio samples were used: "Amazing Grace", "Oh! Carol", and "Take me home country roads". The attacks for the robustness test were MP3 compression, an echo attack, and a subwoofer boost. The MP3 compression was performed with Cool Edit Pro 2.0, with CBR (constant bit rate) 128 kbps, 44,100 Hz, stereo. The echo attack applied an echo with initial volume 70%, decay 75%, and delay 100 ms. The subwoofer boost attack modified the low-frequency part of the Fourier coefficients. The test results showed that the proposed algorithm is robust to these attacks. Under the MP3 attack, the strength of the watermark information was not affected, and the watermark could be detected in all of the sample audios. Under the subwoofer boost attack, the watermark was detected when the embedding strength was 0.3, and under the echo attack, the watermark could be identified when the strength was greater than or equal to 0.5.
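The embedding chain described in the abstract (QR bits → 1D sequence → Walsh-code spreading → DCT-domain addition) can be sketched as follows. This is an illustration under assumed parameters, not the paper's implementation: the random bit matrix stands in for a real QR symbol, and the Walsh code row, DCT band offset, and embedding strength are arbitrary choices.

```python
import numpy as np
from scipy.fft import dct, idct
from scipy.linalg import hadamard

WALSH_LEN = 64   # length of each user's Walsh (Hadamard-row) spreading code
STRENGTH = 0.5   # embedding strength; the paper reports detection at 0.3-0.5
BAND = 1000      # starting index of the DCT band used for embedding (assumed)

walsh = hadamard(WALSH_LEN)[3]                         # this user's orthogonal code
bits = np.random.randint(0, 2, size=(21, 21)).ravel()  # stand-in for QR-code bits
chips = np.concatenate([(2 * b - 1) * walsh for b in bits])  # BPSK spreading

audio = np.random.randn(BAND + len(chips) + 1000)      # stand-in audio frame
coeffs = dct(audio, norm="ortho")
coeffs[BAND:BAND + len(chips)] += STRENGTH * chips     # additive DCT-domain embed
watermarked = idct(coeffs, norm="ortho")

# Blind detection: correlate the suspect DCT band with the user's Walsh code
band = dct(watermarked, norm="ortho")[BAND:BAND + len(chips)]
corr = band.reshape(-1, WALSH_LEN) @ walsh / WALSH_LEN
recovered = (corr > 0).astype(int)
print("bit errors:", int(np.sum(recovered != bits)))
```

Because the Walsh rows are mutually orthogonal, correlating the same band with a different user's row yields near-zero values, which is what lets the scheme separate individual users' watermarks.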