• Title/Summary/Keyword: Processing parameters


A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.133-148 / 2014
  • Recently, with the advent of various information channels, the amount of available data has continued to grow. The main cause of this phenomenon is the significant increase in unstructured data, as smart devices enable users to create data in the form of text, audio, images, and video. Among the various types of unstructured data, users' opinions and a variety of other information are most clearly expressed in text data such as news, reports, papers, and articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also rely on many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification, so a precise definition of each is needed to distinguish between them. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion for distinguishing text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. We first established two prediction models, one based on opinion mining and the other on text mining, then compared the main processes used by the two models and their prediction accuracy on 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy than the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for documents with strong certainty was higher than that for documents with weak certainty. Above all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by referring to the sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Second, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. Nevertheless, this research contributes a performance comparison of text mining and opinion mining for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.
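
To make the lexicon-based side of this comparison concrete, here is a minimal Python sketch of opinion classification with a sentiment lexicon; the lexicon entries, tokenizer, and decision threshold are hypothetical stand-ins, not the authors' actual resources.

```python
# Minimal sketch of lexicon-based opinion classification (hypothetical lexicon).
# A generated lexicon like this can be reused across a similar domain, which is
# the reuse advantage the abstract above attributes to opinion mining.

SENTIMENT_LEXICON = {  # token -> polarity weight (illustrative values)
    "excellent": 1.0, "touching": 0.8, "enjoyable": 0.6,
    "boring": -0.8, "predictable": -0.5, "terrible": -1.0,
}

def classify_review(text: str) -> str:
    """Classify a review as positive/negative by summing lexicon scores."""
    tokens = text.lower().split()  # stand-in for real parsing/filtering steps
    score = sum(SENTIMENT_LEXICON.get(tok, 0.0) for tok in tokens)
    return "positive" if score >= 0 else "negative"

print(classify_review("touching and enjoyable but a bit predictable"))  # positive
```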

A Dynamic Prefetch Filtering Schemes to Enhance Usefulness Of Cache Memory (캐시 메모리의 유용성을 높이는 동적 선인출 필터링 기법)

  • Chon Young-Suk;Lee Byung-Kwon;Lee Chun-Hee;Kim Suk-Il;Jeon Joong-Nam
    • The KIPS Transactions:PartA / v.13A no.2 s.99 / pp.123-136 / 2006
  • Prefetching is an effective way to reduce the latency caused by memory access. However, excessively aggressive prefetching not only leads to cache pollution, canceling out the benefits of prefetching, but also increases bus traffic, degrading overall performance. In this thesis, a prefetch filtering scheme is proposed which dynamically decides whether to commence prefetching by referring to a filtering table, thereby reducing the cache pollution caused by unnecessary prefetches. First, a prefetch hashing table 1-bit state filtering scheme (PHT1bSC) is analyzed to expose the lock problem of the conventional scheme; like the conventional scheme it uses N:1 mapping, but it encodes two states in the 1-bit value of each entry. A complete block address table filtering scheme (CBAT) is introduced as a reference for the comparative study. A prefetch block address lookup table scheme (PBALT) is then proposed as the main idea of this paper, which exhibits the most exact filtering performance. This scheme has the same table length as the PHT1bSC scheme and the same entry fields as the CBAT scheme, and each recently prefetched but never-referenced data block address is mapped 1:1 to an entry of the filter table. Commonly used prefetch schemes were simulated on general benchmarks and multimedia programs while varying the cache parameters. Compared with no filtering, the PBALT scheme improved performance by up to 22%, and its enhanced filtering accuracy decreased the cache miss ratio by 7.9% compared with the conventional PHT2bSC. The MADT of the proposed PBALT scheme also decreased by 6.1% compared with conventional schemes, reducing the total execution time.
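
The Python sketch below illustrates the 1:1 filter-table idea in the spirit of PBALT: block addresses that were prefetched but never referenced are remembered, and a repeated prefetch of such a block is suppressed. The table size, index hash, and clearing policy are illustrative assumptions, not the paper's exact design.

```python
# Sketch of a prefetch filtering table (assumed behavior, not the exact PBALT design).
TABLE_SIZE = 256

class PrefetchFilter:
    def __init__(self):
        self.table = [None] * TABLE_SIZE        # one block address per entry (1:1)

    def _index(self, block_addr: int) -> int:
        return block_addr % TABLE_SIZE          # simple direct-mapped hash

    def should_prefetch(self, block_addr: int) -> bool:
        """Suppress the prefetch if this block was prefetched before but never used."""
        return self.table[self._index(block_addr)] != block_addr

    def on_prefetch(self, block_addr: int):
        self.table[self._index(block_addr)] = block_addr  # remember as not-yet-used

    def on_demand_hit(self, block_addr: int):
        idx = self._index(block_addr)
        if self.table[idx] == block_addr:
            self.table[idx] = None              # block proved useful; stop filtering it

f = PrefetchFilter()
print(f.should_prefetch(0x1A0))  # True: unknown block, allow the prefetch
f.on_prefetch(0x1A0)
print(f.should_prefetch(0x1A0))  # False: prefetched earlier but never referenced
```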

A Reflectance Normalization Via BRDF Model for the Korean Vegetation using MODIS 250m Data (한반도 식생에 대한 MODIS 250m 자료의 BRDF 효과에 대한 반사도 정규화)

  • Yeom, Jong-Min;Han, Kyung-Soo;Kim, Young-Seup
    • Korean Journal of Remote Sensing / v.21 no.6 / pp.445-456 / 2005
  • The land surface parameters should be determined with sufficient accuracy because they play an important role in climate change near the ground. As surface reflectance exhibits strong anisotropy, off-nadir viewing results in a strong dependency of the observations on the Sun-target-sensor geometry, and these angular effects contribute random noise to the data. The principal objective of this study is to provide a database of accurate surface reflectance, with the angular effects eliminated, from MODIS 250m reflective channel data over Korea. The MODIS (Moderate Resolution Imaging Spectroradiometer) sensor provides visible and near-infrared channel reflectance at 250m resolution on a daily basis. The successive analytic processing steps were first performed on a per-pixel basis to remove cloudy pixels. Geometric distortion was then corrected by nearest-neighbor resampling using a 2nd-order polynomial obtained from the geolocation information of the MODIS data set. To correct the surface anisotropy effects, this paper applies a semiempirical kernel-driven Bidirectional Reflectance Distribution Function (BRDF) model. The algorithm inverts the kernel-driven model against the angular components, such as viewing zenith angle, solar zenith angle, viewing azimuth angle, and solar azimuth angle, from the reflectance observed by the satellite. We first compose sets of model observations over a 31-day period to run the BRDF model. In the next step, nadir-view reflectance normalization is carried out by modifying the angular components separated by the BRDF model for each spectral band and each pixel. The modeled reflectance values show good agreement with the measured reflectance values, with an overall RMSE (Root Mean Square Error) of about 0.01 (maximum = 0.03). Finally, we provide a normalized surface reflectance database consisting of 36 images for 2001 over Korea.
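
A minimal sketch of the kernel-driven inversion and nadir normalization steps, assuming the volumetric and geometric kernel values have already been computed from the Sun-target-sensor geometry; the synthetic numbers below are placeholders, not MODIS data, and the kernels stand in for RossThick/LiSparse-style functions.

```python
# Kernel-driven BRDF model: R = f_iso + f_vol * K_vol + f_geo * K_geo.
# Invert the kernel weights over a multi-day window, then predict nadir reflectance.
import numpy as np

def invert_brdf(reflectance, k_vol, k_geo):
    """Least-squares fit of (f_iso, f_vol, f_geo) from multi-angle observations."""
    A = np.column_stack([np.ones_like(k_vol), k_vol, k_geo])
    coeffs, *_ = np.linalg.lstsq(A, reflectance, rcond=None)
    return coeffs

def nadir_normalized(coeffs, k_vol_nadir, k_geo_nadir):
    """Evaluate the fitted model at the nadir-view kernel values."""
    f_iso, f_vol, f_geo = coeffs
    return f_iso + f_vol * k_vol_nadir + f_geo * k_geo_nadir

# 31 daily observations of one pixel (synthetic example values)
rng = np.random.default_rng(0)
k_vol = rng.uniform(-0.2, 0.4, 31)    # volumetric kernel values (assumed precomputed)
k_geo = rng.uniform(-1.5, 0.0, 31)    # geometric kernel values (assumed precomputed)
refl = 0.20 + 0.05 * k_vol + 0.02 * k_geo + rng.normal(0, 0.005, 31)

coeffs = invert_brdf(refl, k_vol, k_geo)
print(nadir_normalized(coeffs, k_vol_nadir=0.0, k_geo_nadir=-1.0))
```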

An Implementation of OTB Extension to Produce TOA and TOC Reflectance of LANDSAT-8 OLI Images and Its Product Verification Using RadCalNet RVUS Data (Landsat-8 OLI 영상정보의 대기 및 지표반사도 산출을 위한 OTB Extension 구현과 RadCalNet RVUS 자료를 이용한 성과검증)

  • Kim, Kwangseob;Lee, Kiwon
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.449-461 / 2021
  • Analysis Ready Data (ARD) for optical satellite images represents a pre-processed product obtained by applying the spectral characteristics and viewing parameters of each sensor. Atmospheric correction is one of the fundamental and complicated topics involved; it helps to produce Top-of-Atmosphere (TOA) and Top-of-Canopy (TOC) reflectance from multi-spectral image sets. Most remote sensing software provides algorithms or processing schemes dedicated to these corrections for the Landsat-8 OLI sensor. Furthermore, Google Earth Engine (GEE) provides direct access to Landsat reflectance products, USGS-based ARD (USGS-ARD), in the cloud environment. We implemented an atmospheric correction extension for the Orfeo ToolBox (OTB), an open-source remote sensing software package for manipulating and analyzing high-resolution satellite images. This is the first such tool, because OTB has not provided calibration modules for any Landsat sensor. Using this extension, we conducted absolute atmospheric correction on Landsat-8 OLI images of Railroad Valley, United States (RVUS) and validated the reflectance products against the reflectance data sets of RVUS in the RadCalNet portal. The results showed that the reflectance products produced by the OTB extension for Landsat differed from the RadCalNet RVUS data by less than 5%. In addition, we performed a comparative analysis with reflectance products obtained from other open-source tools, namely the QGIS semi-automatic classification plugin and SAGA, in addition to the USGS-ARD products. Compared to the other two open-source tools, the reflectance products of the OTB extension showed high consistency with those of USGS-ARD, within the acceptable level of the RadCalNet RVUS measurement data range. In this study, the atmospheric calibration processor in the OTB extension was verified, which demonstrates its applicability to other satellite sensors such as the Compact Advanced Satellite (CAS)-500 or new optical satellites.
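
As a concrete illustration of the first step such a correction tool performs, the sketch below converts Landsat-8 OLI digital numbers to TOA reflectance using the rescaling coefficients published in a scene's MTL metadata (REFLECTANCE_MULT_BAND_x, REFLECTANCE_ADD_BAND_x, SUN_ELEVATION). The numeric values are illustrative, not taken from a real scene or from the OTB extension's code.

```python
# Landsat-8 OLI DN -> TOA reflectance, per the USGS Landsat handbook formula:
#   rho' = M_rho * Q_cal + A_rho,  rho = rho' / sin(sun_elevation)
import numpy as np

def toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Convert quantized pixel values to solar-angle-corrected TOA reflectance."""
    rho_prime = mult * dn.astype(np.float64) + add
    return rho_prime / np.sin(np.deg2rad(sun_elevation_deg))

dn = np.array([[7500, 8200], [9100, 10000]])  # toy quantized pixel values
# Typical MTL coefficients: MULT = 2.0e-5, ADD = -0.1 (illustrative scene metadata)
print(toa_reflectance(dn, mult=2.0e-5, add=-0.1, sun_elevation_deg=45.0))
```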

Investigating Data Preprocessing Algorithms of a Deep Learning Postprocessing Model for the Improvement of Sub-Seasonal to Seasonal Climate Predictions (계절내-계절 기후예측의 딥러닝 기반 후보정을 위한 입력자료 전처리 기법 평가)

  • Uran Chung;Jinyoung Rhee;Miae Kim;Soo-Jin Sohn
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.2 / pp.80-98 / 2023
  • This study explores the effectiveness of various data preprocessing algorithms for improving subseasonal to seasonal (S2S) climate predictions from six climate forecast models and their Multi-Model Ensemble (MME) using a deep learning-based postprocessing model. A pipeline of data transformation algorithms was constructed to convert raw S2S prediction data into training data processed with several statistical distributions, and a dimensionality reduction algorithm was applied to select features through rankings of the correlation coefficients between the observed and the input data. The training model was designed with a TimeDistributed wrapper applied to all convolutional layers of a U-Net: the TimeDistributed wrapper allows a U-Net convolutional layer to be applied directly to 5-dimensional time series data while maintaining the time axis, whereas every input to the U-Net itself must be at least 3D. We found that the Robust and Standard transformation algorithms are the most suitable for improving S2S predictions. The dimensionality reduction based on feature selection did not significantly improve predictions of daily precipitation for the six climate models and even worsened predictions of daily maximum and minimum temperatures. While the deep learning-based postprocessing also improved MME S2S precipitation predictions, it did not have a significant effect on temperature predictions, particularly for lead times of weeks 1 and 2. Further research is needed to develop an optimal deep learning model for improving S2S temperature predictions by testing various models and parameters.
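
For reference, a minimal sketch of the two transformation algorithms the study found most suitable, using scikit-learn's scalers; the toy array stands in for the real gridded forecast data.

```python
# Standard vs. Robust transformation of raw prediction values.
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

raw = np.array([[280.1], [281.3], [279.8], [295.0], [280.5]])  # e.g. temperatures (K)

standard = StandardScaler().fit_transform(raw)  # (x - mean) / std
robust = RobustScaler().fit_transform(raw)      # (x - median) / IQR, outlier-tolerant

print(standard.ravel())
print(robust.ravel())   # the 295.0 outlier distorts the robust fit far less
```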

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as the development of information and communication technology has brought various kinds of online news media, news about events occurring in society has increased greatly. Automatically summarizing key events from massive amounts of news data would therefore help users survey many events at a glance, and building an event network based on the relevance of events can greatly help readers understand current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017 and integrated synonyms, leaving only meaningful words, through preprocessing using NPMI and Word2Vec. Latent Dirichlet Allocation (LDA) topic modeling was used to calculate the topic distribution by date, find the peaks of each distribution, and detect events. A total of 32 topics were extracted, and the occurrence time of each event was deduced from the point at which the corresponding topic distribution surged. As a result, a total of 85 events were detected, from which a final 16 events were filtered and presented using a Gaussian smoothing technique. We then calculated relevance scores between the detected events using the cosine coefficient of their co-occurrence and connected the events accordingly. Finally, we constructed the event network by assigning each event to a vertex and each relevance score to the edge connecting the corresponding vertices. The event network constructed in this way allowed us to sort the major events in the political and social fields in Korea over the past year in chronological order and, at the same time, to identify which events are related to one another. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to analyze large amounts of data easily and to identify relationships between events that were difficult to detect with existing methods. We also applied various text mining techniques and Word2Vec in the text preprocessing to improve the accuracy of extracting proper nouns and compound nouns, which has been difficult in analyzing Korean texts. The event detection and network construction techniques of this study have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily extract topics, topic words, and their distributions from a huge amount of data, and by using the date information of the collected news articles, the distribution of each topic can be expressed as a time series. Second, by calculating relevance scores and constructing an event network from the co-occurrence of topics, which is difficult to grasp with existing event detection, we can present the connections between events in a summarized form. This is supported by the fact that the inter-event relevance-based event network proposed in this study was actually constructed in order of occurrence time. It is also possible to identify, through the event network, which event served as the starting point of a series of events.
The limitation of this study is that LDA topic modeling yields different results depending on the initial parameters and the number of topics, and the topic and event names in the analysis results must be assigned by the subjective judgment of the researcher. Also, since each topic is assumed to be exclusive and independent, the relevance between topics is not taken into account. Subsequent studies need to calculate the relevance between events that are not covered in this study or that belong to the same topic.
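
A compact sketch of the detection and linking pipeline described above, under simplifying assumptions: the daily topic weights are synthetic stand-ins for real LDA output, peaks of the Gaussian-smoothed series are taken as events, and events are linked by the cosine coefficient of their day-level topic profiles.

```python
# Event detection via Gaussian smoothing of topic time series, then cosine linking.
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

rng = np.random.default_rng(1)
daily_topic_weight = rng.random((32, 365))  # 32 topics x 365 days (toy LDA output)

events = []                                  # (topic, day) pairs
for topic, series in enumerate(daily_topic_weight):
    smoothed = gaussian_filter1d(series, sigma=3)          # Gaussian smoothing
    peaks, _ = find_peaks(smoothed, height=smoothed.mean() + 2 * smoothed.std())
    events += [(topic, int(day)) for day in peaks]         # surges = candidate events

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Edge weight: similarity of the two events' topic profiles on their peak days
edges = [
    (a, b, cosine(daily_topic_weight[:, da], daily_topic_weight[:, db]))
    for a, (_, da) in enumerate(events)
    for b, (_, db) in enumerate(events) if b > a
]
print(len(events), "events,", len(edges), "candidate edges")
```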

Quantitative Differences between X-Ray CT-Based and $^{137}Cs$-Based Attenuation Correction in Philips Gemini PET/CT (GEMINI PET/CT의 X-ray CT, $^{137}Cs$ 기반 511 keV 광자 감쇠계수의 정량적 차이)

  • Kim, Jin-Su;Lee, Jae-Sung;Lee, Dong-Soo;Park, Eun-Kyung;Kim, Jong-Hyo;Kim, Jae-Il;Lee, Hong-Jae;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine / v.39 no.3 / pp.182-190 / 2005
  • Purpose: There are differences between the Standardized Uptake Values (SUV) of CT-based attenuation-corrected PET and those of $^{137}Cs$-based correction. Since various causes can lead to differences in SUV, it is important to identify their source. Because only X-ray CT and $^{137}Cs$ transmission data are used for attenuation correction in the Philips GEMINI PET/CT scanner, the proper transformation of these data into usable attenuation coefficients for 511 keV photons has to be ascertained. The aim of this study was to evaluate the accuracy of the CT measurement and to compare CT- and $^{137}Cs$-based attenuation correction in this scanner. Methods: For all experiments, the CT was set to 40 keV (120 kVp) and 50 mAs. To evaluate the accuracy of the CT measurement, a CT performance phantom was scanned and the Hounsfield units (HU) of its regions were compared to the true values. For the comparison of CT- and $^{137}Cs$-based attenuation corrections, transmission scans of an elliptical lung-spine-body phantom and an electron density CT phantom composed of various components, such as water, bone, brain, and adipose, were performed using CT and $^{137}Cs$. The attenuation coefficients transformed from these data were compared to each other and to the true 511 keV attenuation coefficients acquired using $^{68}Ge$ and an ECAT EXACT 47 scanner. In addition, CT- and $^{137}Cs$-derived attenuation coefficients and $^{18}F$-FDG SUV values measured from regions with normal and pathological uptake in patient data were also compared. Results: The HU of all regions in the CT performance phantom measured using GEMINI PET/CT were equivalent to the known true values. The CT-based attenuation coefficients were about 10% lower than those of $^{68}Ge$ in the bony region of the NEMA ECT phantom. Attenuation coefficients derived from $^{137}Cs$ data were also slightly higher than those from CT data in the images of the electron density CT phantom and the patients' bodies. However, the SUV values in the attenuation-corrected images using $^{137}Cs$ were lower than those in the images corrected using CT; the percent difference between the SUV values was about 15%. Conclusion: Although the HU measured using this scanner were accurate, the accuracy of the conversion from CT data into 511 keV attenuation coefficients was limited in the bony region. The discrepancy between the CT- and $^{137}Cs$-based attenuation coefficients and SUV values shown in this study suggests that further optimization of various parameters in data acquisition and processing is necessary for this scanner.
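
The conversion questioned in the conclusion is commonly implemented as a bilinear scaling of CT numbers, with a separate slope above the water breakpoint to account for bone. The sketch below uses typical literature values for the 511 keV attenuation of water and bone, not the GEMINI scanner's actual calibration; real implementations also adjust the bone slope per kVp.

```python
# Illustrative bilinear HU -> 511 keV linear attenuation coefficient conversion.

MU_WATER_511 = 0.096   # /cm, approximate attenuation of water at 511 keV
MU_BONE_511 = 0.172    # /cm, approximate value for cortical bone

def hu_to_mu_511(hu: float) -> float:
    """Bilinear scaling: air-water segment below 0 HU, water-bone mixture above."""
    if hu <= 0:
        return MU_WATER_511 * (1.0 + hu / 1000.0)
    return MU_WATER_511 + hu / 1000.0 * (MU_BONE_511 - MU_WATER_511)

for hu in (-1000, 0, 500, 1000):  # air, water, trabecular-like, dense bone
    print(hu, "HU ->", round(hu_to_mu_511(hu), 4), "/cm")
```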

Application and Analysis of Ocean Remote-Sensing Reflectance Quality Assurance Algorithm for GOCI-II (천리안해양위성 2호(GOCI-II) 원격반사도 품질 검증 시스템 적용 및 결과)

  • Sujung Bae;Eunkyung Lee;Jianwei Wei;Kyeong-sang Lee;Minsang Kim;Jong-kuk Choi;Jae Hyun Ahn
    • Korean Journal of Remote Sensing / v.39 no.6_2 / pp.1565-1576 / 2023
  • An atmospheric correction algorithm based on a radiative transfer model is required to obtain remote-sensing reflectance (Rrs) from the top-of-atmosphere observations of the Geostationary Ocean Color Imager-II (GOCI-II). The Rrs derived from this atmospheric correction is used to estimate various marine environmental parameters such as chlorophyll-a concentration, total suspended materials concentration, and absorption of dissolved organic matter. Atmospheric correction is therefore a fundamental algorithm, as it significantly impacts the reliability of all other ocean color products. In clear waters, however, the atmospheric path radiance in the blue wavelengths can be more than ten times higher than the water-leaving radiance. This makes atmospheric correction a highly error-sensitive process: a 1% error in estimating the atmospheric radiance can cause more than a 10% error in Rrs. The quality assessment of Rrs after atmospheric correction is therefore essential for reliable ocean environment analysis using ocean color satellite data. In this study, a Quality Assurance (QA) algorithm based on in-situ Rrs data, archived in a database built from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) Bio-optical Archive and Storage System (SeaBASS), was applied and modified to account for the different spectral characteristics of GOCI-II. This method is officially employed in the National Oceanic and Atmospheric Administration (NOAA)'s ocean color satellite data processing system. It provides quality analysis scores for Rrs ranging from 0 to 1 and classifies the water types into 23 categories. When the QA algorithm is applied to the initial-phase GOCI-II data with less calibration, the most frequent score is a relatively low 0.625. When the algorithm is applied to the improved GOCI-II atmospheric correction results with updated calibrations, however, the most frequent score rises to 0.875. The water-type analysis using the QA algorithm indicated that parts of the East Sea, the South Sea, and the Northwest Pacific Ocean are primarily characterized as relatively clear case-I waters, while the coastal areas of the Yellow Sea and the East China Sea are mainly classified as highly turbid case-II waters. We expect the QA algorithm to support GOCI-II users not only in statistically identifying Rrs results with significant errors but also in achieving more reliable calibration with quality-assured data. The algorithm will be included in the level-2 flag data provided with the GOCI-II atmospheric correction.
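
A schematic sketch of the spectrum-shape QA idea: normalize an Rrs spectrum, match it to the closest reference water type, and score the fraction of bands falling within that type's tolerance envelope, yielding the 0-to-1 scores described above. The three-type reference table and the ±20% envelopes below are invented stand-ins for the actual 23-type database derived from SeaBASS.

```python
# Toy Rrs quality-assurance scoring (shape matching + per-band tolerance check).
import numpy as np

def qa_score(rrs, ref_spectra, lower, upper):
    """Return (water_type, score in [0, 1]) for one Rrs spectrum."""
    norm = rrs / np.linalg.norm(rrs)                     # compare shape only
    sims = ref_spectra @ norm / np.linalg.norm(ref_spectra, axis=1)
    wt = int(np.argmax(sims))                            # closest water type
    in_band = (norm >= lower[wt]) & (norm <= upper[wt])  # bands inside the envelope
    return wt, in_band.mean()                            # fraction of valid bands

# Hypothetical database: 3 water types x 5 bands (the real system uses 23 types)
ref = np.array([[0.7, 0.5, 0.4, 0.2, 0.1],
                [0.4, 0.5, 0.5, 0.4, 0.3],
                [0.2, 0.3, 0.5, 0.6, 0.5]], dtype=float)
ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)   # unit-normalized rows
lower, upper = ref * 0.8, ref * 1.2                      # hypothetical envelopes

print(qa_score(np.array([0.65, 0.52, 0.41, 0.22, 0.12]), ref, lower, upper))
```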