• Title/Summary/Keyword: kernel operator

An End-to-End Sequence Learning Approach for Text Extraction and Recognition from Scene Image

  • Lalitha, G.;Lavanya, B.
    • International Journal of Computer Science & Network Security / v.22 no.7 / pp.220-228 / 2022
  • Images always carry useful information, so detecting text in scene images is imperative. The purpose of the proposed work is to recognize text in scene images, for example signboard images along highways. Detecting text on highway signboards plays a vital role in road safety measures. In the initial stage, preprocessing techniques are applied to sharpen the image and enhance its features. Likewise, morphological operators are applied to remove the narrow gaps that exist between objects. We propose a two-phase algorithm for extracting and recognizing text from scene images. In phase I, text is extracted from the scene image by applying image preprocessing techniques such as blurring, erosion, and top-hat, followed by thresholding and morphological gradient with fixed kernel sizes; the Canny edge detector is then applied to detect the text contained in the scene images. In phase II, the text is recognized using MSER (Maximally Stable Extremal Regions) and OCR. The proposed work aims to detect the text contained in scene images from the popular dataset repositories SVT, ICDAR 2003, and MSRA-TD 500; these images were captured under various illumination conditions and angles. The proposed algorithm produces higher accuracy in minimal execution time compared with state-of-the-art methodologies.
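
The morphological operators named in this abstract (erosion, closing of small gaps, morphological gradient) can be illustrated on a tiny binary image. This is a minimal sketch in plain Python, not the authors' pipeline: the 3x3 square kernel, the zero-padded border, and the toy image are assumptions made only for illustration.

```python
# Binary morphology on a list-of-lists image (values 0 or 1).
# Assumption: square (2k+1)x(2k+1) kernel; pixels outside the image count as 0.

def erode(img, k=1):
    """A pixel stays 1 only if every pixel in its window is inside the image and 1."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-k, k + 1) for dx in range(-k, k + 1)))
             for x in range(w)] for y in range(h)]

def dilate(img, k=1):
    """A pixel becomes 1 if any in-bounds pixel in its window is 1."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-k, k + 1) for dx in range(-k, k + 1)))
             for x in range(w)] for y in range(h)]

def closing(img, k=1):
    """Dilation followed by erosion: fills gaps narrower than the kernel."""
    return erode(dilate(img, k), k)

def gradient(img, k=1):
    """Morphological gradient: dilation minus erosion, which outlines objects."""
    d, e = dilate(img, k), erode(img, k)
    return [[dv - ev for dv, ev in zip(dr, er)] for dr, er in zip(d, e)]

# Toy image (hypothetical): one horizontal stroke with a one-pixel gap at column 3.
stroke = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
closed = closing(stroke)
```

Here `closing(stroke)` joins the two halves of the stroke into one connected segment, which is the gap-removal effect the abstract describes.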

Assessment of Changed Input Modules with SMOKE Model (SMOKE 모델의 입력 모듈 변경에 따른 영향 분석)

  • Kim, Ji-Young;Kim, Jeong-Soo;Hong, Ji-Hyung;Jung, Dong-Il;Ban, Soo-Jin;Lee, Yong-Mi
    • Journal of Korean Society for Atmospheric Environment / v.24 no.3 / pp.284-299 / 2008
  • Emission input modules were developed to produce emission input data and to change some profiles for Sparse Matrix Operator Kernel Emissions (SMOKE), using activity data from the Clean Air Policy Support System (CAPSS) and previous studies. In particular, this study focused on improving the chemical speciation and temporal allocation profiles of SMOKE. First, SCC code mapping was carried out: 579 SCC codes of CAPSS were matched with those of the EPA. Temporal allocation profiles were changed using CAPSS monthly activities, and chemical speciation profiles were replaced using the studies of Kang et al. (2000), Lee et al. (2005), and Kim et al. (2005). A simulation of the Seoul Metropolitan Area (Seoul, Incheon, Gyeonggi) using the MM5, SMOKE, and CMAQ modeling system was performed to analyze the effect of the changed SMOKE input modules. Emission model results obtained with the new input modules changed slightly compared with those obtained with the EPA's default modules. SMOKE outputs show that aldehyde emissions decreased by 4.78% after changing the chemical profiles and increased by 0.85% after implementing the new temporal profiles. Toluene emissions decreased by 18.56% after changing the chemical speciation profiles and likewise increased by 0.67% after replacing the temporal profiles. Simulated air quality results were also slightly elevated with the new input modules. Continued accumulation of domestic data and studies to develop an input system for air quality modeling would yield improved air quality predictions.
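
The arithmetic behind a monthly temporal allocation profile, and behind percentage changes like those reported above, can be sketched as follows. This is only an illustration of the idea: the function names and activity values are hypothetical, and the code does not reproduce SMOKE's actual profile file formats.

```python
# Sketch: derive monthly allocation factors from activity data and apply them.
# All names and numbers here are hypothetical, not taken from CAPSS or SMOKE.

def monthly_factors(activity):
    """Normalize 12 monthly activity values so the factors sum to 1."""
    if len(activity) != 12:
        raise ValueError("expected 12 monthly activity values")
    total = sum(activity)
    if total <= 0:
        raise ValueError("total activity must be positive")
    return [a / total for a in activity]

def allocate(annual_emission, factors):
    """Split an annual emission total across months using the factors."""
    return [annual_emission * f for f in factors]

def percent_change(new, base):
    """Relative change of `new` with respect to `base`, in percent."""
    return 100.0 * (new - base) / base

# Hypothetical activity index with higher winter values (e.g. heating fuel use).
activity = [14, 12, 10, 8, 6, 5, 5, 5, 6, 8, 10, 11]
factors = monthly_factors(activity)
monthly = allocate(1200.0, factors)  # 1200 t/yr spread over 12 months
```

Because the factors are normalized, the monthly values always add back up to the annual total; only their distribution over the year changes.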

Korea Emissions Inventory Processing Using the US EPA's SMOKE System

  • Kim, Soon-Tae;Moon, Nan-Kyoung;Byun, Dae-Won W.
    • Asian Journal of Atmospheric Environment / v.2 no.1 / pp.34-46 / 2008
  • Emissions inputs for use in air quality modeling of Korea were generated with the emissions inventory data from the National Institute of Environmental Research (NIER), maintained under the Clean Air Policy Support System (CAPSS) database. Source Classification Codes (SCC) in the Korea emissions inventory were adapted to use with the U.S. EPA's Sparse Matrix Operator Kernel Emissions (SMOKE) by finding the best-matching SMOKE default SCCs for the chemical speciation and temporal allocation. A set of 19 surrogate spatial allocation factors for South Korea were developed utilizing the Multi-scale Integrated Modeling System (MIMS) Spatial Allocator and Korean GIS databases. The mobile and area source emissions data, after temporal allocation, show typical sinusoidal diurnal variations with high peaks during daytime, while point source emissions show weak diurnal variations. The model-ready emissions are speciated for the carbon bond version 4 (CB-4) chemical mechanism. Volatile organic carbon (VOC) emissions from painting related industries in area source category significantly contribute to TOL (Toluene) and XYL (Xylene) emissions. ETH (Ethylene) emissions are largely contributed from point industrial incineration facilities and various mobile sources. On the other hand, a large portion of OLE (Olefin) emissions are speciated from mobile sources in addition to those contributed by the polypropylene industry in point source. It was found that FORM (Formaldehyde) is mostly emitted from petroleum industry and heavy duty diesel vehicles. Chemical speciation of PM2.5 emissions shows that PEC (primary fine elemental carbon) and POA (primary fine organic aerosol) are the most abundant species from diesel and gasoline vehicles. 
To reduce uncertainties in processing the Korea emission inventory due to the mapping of Korean SCCs to those of the U.S., it would be practical to develop and use domestic source profiles for the top 10 SCCs for area and point sources and the top 5 SCCs for on-road mobile sources when VOC emissions from the sources are more than 90% of the total.

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results.
In particular, the mutation operator prevents GA from falling into local optima, so a globally optimal or near-optimal solution can be found with it. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled with four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models.
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
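
The selection, crossover, and mutation loop described in this abstract can be sketched for the feature-subset part of the problem. This is a generic GA in plain Python, not the authors' Evolver/LIBSVM setup: tournament selection, one-point crossover, elitism, and the toy fitness function are all assumptions. In the study, the fitness would instead be the validation accuracy of an MSVM trained on the selected features and kernel parameters.

```python
import random

def run_ga(fitness, n_bits, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Evolve bit masks (1 = feature kept). Returns (best_mask, best_fitness_history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]

    def select():
        # Binary tournament: pick two individuals at random, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = select(), select()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability to keep diversity.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history

# Hypothetical stand-in for cross-validated accuracy: features 0, 3 and 5 are
# "informative", and every selected feature carries a small cost.
USEFUL = {0, 3, 5}

def toy_fitness(mask):
    return sum(mask[i] for i in USEFUL) - 0.1 * sum(mask)

best_mask, history = run_ga(toy_fitness, n_bits=8)
```

With elitism, the best fitness recorded in `history` never decreases from one generation to the next, while mutation keeps introducing masks that crossover alone could not produce.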

Impact of Emission Inventory Choices on PM10 Forecast Accuracy and Contributions in the Seoul Metropolitan Area (배출량 목록에 따른 수도권 PM10 예보 정합도 및 국내외 기여도 분석)

  • Bae, Changhan;Kim, Eunhye;Kim, Byeong-Uk;Kim, Hyun Cheol;Woo, Jung-Hun;Moon, Kwang-Joo;Shin, Hye-Jung;Song, In Ho;Kim, Soontae
    • Journal of Korean Society for Atmospheric Environment / v.33 no.5 / pp.497-514 / 2017
  • This study quantitatively analyzes the effects of emission inventory choices on the simulated particulate matter (PM) concentrations and the domestic/foreign contributions in the Seoul Metropolitan Area (SMA) with an air quality forecasting system. The forecasting system is composed of Weather Research and Forecasting (WRF)-Sparse Matrix Operator Kernel Emissions (SMOKE)-Community Multi-Scale Air Quality (CMAQ). Different domestic and foreign emission inventories were selectively adopted to set up four sets of emissions inputs for air quality simulations in this study. All modeling cases showed that model performance statistics satisfied the criteria levels (correlation coefficient >0.7, fractional error <50%) suggested by previous studies. Notwithstanding the apparently good model performance for total PM concentrations in all emission cases, annual average simulated total PM concentrations varied by up to $20{\mu}g/m^3$ (160%) depending on the combination of emission inventories. In detail, the difference in simulated annual average concentrations of the primary PM coarse (PMC) was up to $25.2{\mu}g/m^3$ (6.5 times) compared with other cases. Furthermore, model performance analyses on PM species showed that the difference in the simulated primary PMC led to gross model overestimation in general, which indicates that the primary PMC emissions need to be improved. The contribution analysis using model direct outputs indicated that the domestic contributions to the annual average PM concentrations in the SMA vary from 44% to 67%. To account for the uncertainty of the simulated concentration, the contribution correction factor method proposed by Bae et al. (2017) was applied, which resulted in converged contributions (from 48% to 57%). We believe this study shows that it is necessary to improve the simulated concentrations of PM components in order to enhance the accuracy of the forecasting model.
It is deemed that these improvements will provide more accurate contribution results.
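
The two performance criteria quoted above (correlation coefficient > 0.7, fractional error < 50%) can be checked with a few lines of Python. The abstract does not give the formulas, so this sketch assumes the Pearson correlation and the commonly used mean fractional error, FE = (2/N) Σ |M − O| / (M + O) × 100%; the function names and sample values are hypothetical.

```python
import math

def pearson_r(model, obs):
    """Pearson correlation coefficient between modeled and observed values."""
    n = len(model)
    mean_m, mean_o = sum(model) / n, sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)

def fractional_error(model, obs):
    """Mean fractional error in percent (assumed definition, see lead-in)."""
    n = len(model)
    return 200.0 / n * sum(abs(m - o) / (m + o) for m, o in zip(model, obs))

def meets_criteria(model, obs, r_min=0.7, fe_max=50.0):
    """True if both thresholds quoted in the abstract are satisfied."""
    return pearson_r(model, obs) > r_min and fractional_error(model, obs) < fe_max

# Hypothetical daily PM concentrations (ug/m3): modeled vs. observed.
modeled = [10.0, 21.0, 29.0]
observed = [10.0, 20.0, 30.0]
ok = meets_criteria(modeled, observed)
```

A perfectly correlated series can still fail the check: a model that is consistently half of the observations has r = 1 but a fractional error of about 67%.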

Estimation of Chemical Speciation and Temporal Allocation Factor of VOC and PM2.5 for the Weather-Air Quality Modeling in the Seoul Metropolitan Area (수도권 지역에서 기상-대기질 모델링을 위한 VOC와 PM2.5의 화학종 분류 및 시간분배계수 산정)

  • Moon, Yun Seob
    • Journal of the Korean earth science society / v.36 no.1 / pp.36-50 / 2015
  • The purpose of this study is to assign emission source profiles of volatile organic compounds (VOCs) and particulate matter (PM) for chemical speciation, and to correct the temporal allocation factors and the chemical speciation of source profiles according to the source classification code within the Sparse Matrix Operator Kernel Emissions (SMOKE) system in the Seoul metropolitan area. The chemical speciation from the source profiles of VOCs such as gasoline, diesel vapor, coating, dry cleaning, and LPG includes 12 and 34 species for the Carbon Bond IV (CBIV) chemical mechanism and the Statewide Air Pollution Research Center 99 (SAPRC99) chemical mechanism, respectively. Also, the chemical speciation of PM2.5 from sources such as soil, road dust, gasoline and diesel vehicles, industrial sources, municipal incinerators, coal-fired power plants, biomass burning, and marine sources was allocated to five species: fine PM, organic carbon, elemental carbon, $NO_3^-$, and $SO_4^{2-}$. In addition, temporal profiles for point and line sources were obtained using the stack telemetry system (TMS) and hourly traffic flows in the Seoul metropolitan area for 2007. In particular, the temporal allocation factor for ozone modeling at point sources was estimated based on $NO_X$ emission inventories from the stack TMS data.
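
Chemical speciation as described here amounts to splitting a bulk PM2.5 emission into species using a source profile of mass fractions. The sketch below shows only that arithmetic: the species keys and the profile fractions are hypothetical placeholders, not values from the study or from SMOKE.

```python
# Sketch: apply a speciation profile (mass fractions summing to 1) to bulk PM2.5.
# The profile below is hypothetical and for illustration only.

def speciate(total_mass, profile, tol=1e-9):
    """Return per-species mass; the profile fractions must sum to 1."""
    if abs(sum(profile.values()) - 1.0) > tol:
        raise ValueError("profile fractions must sum to 1")
    return {species: total_mass * frac for species, frac in profile.items()}

# Hypothetical profile for a diesel-vehicle source, over the five species
# named in the abstract.
diesel_profile = {
    "organic_carbon": 0.30,
    "elemental_carbon": 0.50,
    "nitrate": 0.01,
    "sulfate": 0.04,
    "other_fine_pm": 0.15,
}

species_mass = speciate(100.0, diesel_profile)  # 100 t of PM2.5
```

Checking that the fractions sum to 1 ensures that speciation conserves mass, so replacing one profile with another redistributes emissions among species without changing the total.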

An Estimation of Concentration of Asian Dust (PM10) Using WRF-SMOKE-CMAQ (MADRID) During Springtime in the Korean Peninsula (WRF-SMOKE-CMAQ(MADRID)을 이용한 한반도 봄철 황사(PM10)의 농도 추정)

  • Moon, Yun-Seob;Lim, Yun-Kyu;Lee, Kang-Yeol
    • Journal of the Korean earth science society / v.32 no.3 / pp.276-293 / 2011
  • In this study, a modeling system consisting of Weather Research and Forecasting (WRF), Sparse Matrix Operator Kernel Emissions (SMOKE), the Community Multiscale Air Quality (CMAQ) model, and the CMAQ-Model of Aerosol Dynamics, Reaction, Ionization, and Dissolution (MADRID) model has been applied to estimate enhancements of $PM_{10}$ during Asian dust events in Korea. In particular, 5 experimental formulas were applied to the WRF-SMOKE-CMAQ (MADRID) model to estimate Asian dust emissions from source locations for major Asian dust events in China and Mongolia: the US Environmental Protection Agency (EPA) model, the Goddard Global Ozone Chemistry Aerosol Radiation and Transport (GOCART) model, and the Dust Entrainment and Deposition (DEAD) model, as well as formulas by Park and In (2003) and Wang et al. (2000). According to the weather map, backward trajectory, and satellite image analyses, Asian dust is generated by a strong downwind associated with the upper trough from a stagnation wave due to development of the upper jet stream, and transport of Asian dust to Korea shows up behind a surface front related to the cut-off low (known as comma type cloud) in satellite images. In the WRF-SMOKE-CMAQ modeling to estimate the $PM_{10}$ concentration, Wang et al.'s experimental formula reproduced the temporal and spatial distribution of Asian dust well, and the GOCART model yielded low mean bias errors and root mean square errors. Also, in the vertical profile analysis of Asian dust using Wang et al.'s experimental formula, strong Asian dust with a concentration of more than $800\;{\mu}g/m^3$ for the period of March 31 to April 1, 2007 was transported under the boundary layer (about 1 km high), and weak Asian dust with a concentration of less than $400\;{\mu}g/m^3$ for the period of March 16 to 17, 2009 was transported above the boundary layer (about 1-3 km high).
Furthermore, the difference in $PM_{10}$ concentration between the CMAQ model and the CMAQ-MADRID model for the period of March 31 to April 1, 2007 was large in the East Asia area: the CMAQ-MADRID model showed a concentration about $25\;{\mu}g/m^3$ higher than the CMAQ model. In addition, the $PM_{10}$ concentration removed by the cloud liquid phase mechanism within the CMAQ-MADRID model reached a maximum of $15\;{\mu}g/m^3$ in the East Asia area.
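
The two error measures used above to compare dust emission formulas, mean bias error and root mean square error, can be computed as in this short sketch. The standard definitions are assumed (model minus observation); the function names and sample values are hypothetical.

```python
import math

def mean_bias_error(model, obs):
    """Average of (model - observation); positive means overprediction."""
    return sum(m - o for m, o in zip(model, obs)) / len(model)

def root_mean_square_error(model, obs):
    """Square root of the mean squared difference between model and observation."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(model))

# Hypothetical hourly PM10 concentrations (ug/m3) during a dust event.
modeled = [300.0, 500.0]
observed = [100.0, 700.0]
mbe = mean_bias_error(modeled, observed)
rmse = root_mean_square_error(modeled, observed)
```

In this example the over- and underpredictions cancel in the mean bias error but not in the root mean square error, which is why the two measures are usually reported together.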