• Title/Summary/Keyword: Data Normalization

Building Hybrid Stop-Words Technique with Normalization for Pre-Processing Arabic Text

  • Atwan, Jaffar
    • International Journal of Computer Science & Network Security / v.22 no.7 / pp.65-74 / 2022
  • In natural language processing, commonly used words such as prepositions are referred to as stop-words; they have no inherent meaning and are therefore ignored in indexing and retrieval tasks. Removing stop-words from Arabic text significantly reduces the size of a corpus, which improves the effectiveness and performance of Arabic-language processing systems. This study investigated the effectiveness of applying stop-word list elimination with normalization as a preprocessing step. The idea was to merge a statistical method with a linguistic method to attain the best efficacy, and to compare the effects of this two-pronged approach in reducing corpus size for Arabic natural language processing systems. Three stop-word lists were considered: an Arabic Text Lookup Stop-list, a Frequency-based Stop-list using Zipf's law, and a Combined Stop-list. An experiment was conducted using a selected file from the Arabic Newswire data set. In the experiment, the size of the corpus was compared after removing the words contained in each list. The results showed that the best reduction in size was achieved by using the Combined Stop-list with normalization, with a word count reduction of 452930 and a compression rate of 30%.
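The combined-list idea can be illustrated with a minimal sketch. It is not the authors' implementation: the normalization rules (stripping diacritics and tatweel, unifying alef variants) are common Arabic preprocessing choices assumed here, the frequency-based list is simplified to the `top_k` most frequent tokens rather than a Zipf's-law cutoff, and all function names are hypothetical.

```python
import re
from collections import Counter

# Assumed normalization rules (not taken from the paper):
DIACRITICS = re.compile("[\u064B-\u0652\u0640]")    # tashkeel marks plus tatweel
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")  # alef with madda / hamza forms


def normalize(token):
    """Strip diacritics and tatweel, then map alef variants to bare alef."""
    token = DIACRITICS.sub("", token)
    return ALEF_VARIANTS.sub("\u0627", token)


def frequency_stoplist(tokens, top_k):
    """Simplified statistical list: the top_k most frequent tokens."""
    return {word for word, _ in Counter(tokens).most_common(top_k)}


def remove_stopwords(tokens, lookup_list, top_k):
    """Apply the combined (lookup + frequency) list after normalization.

    Returns the kept tokens, the word-count reduction, and the reduction rate.
    """
    normalized = [normalize(t) for t in tokens]
    combined = {normalize(w) for w in lookup_list}
    combined |= frequency_stoplist(normalized, top_k)
    kept = [t for t in normalized if t not in combined]
    reduction = len(normalized) - len(kept)
    rate = reduction / len(normalized) if normalized else 0.0
    return kept, reduction, rate
```

Normalizing before matching lets one stop-list entry cover several spelling variants of the same word, which is presumably why the two steps are combined.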

Illumination Normalization Method for Robust Eye Detection in Lighting Changing Environment (조명변화에 강인한 눈 검출을 위한 조명 정규화 방법)

  • Xu, Chengzhe;Islam, Ihtesham Ul;Kim, In-Taek
    • Proceedings of the IEEK Conference / 2008.06a / pp.955-956 / 2008
  • This paper presents a new method for illumination normalization in eye detection. Based on the retinex image formation model, we employ the discrete wavelet transform to remove the lighting effect in face image data. The final result based on the proposed method shows better performance in detecting eyes than previous work.

On-Line Blind Channel Normalization for Noise-Robust Speech Recognition

  • Jung, Ho-Young
    • IEIE Transactions on Smart Processing and Computing / v.1 no.3 / pp.143-151 / 2012
  • A new data-driven method for the design of a blind modulation frequency filter that suppresses the slow-varying noise components is proposed. The proposed method is based on the temporal local decorrelation of the feature vector sequence, and is done on an utterance-by-utterance basis. Although conventional modulation frequency filtering approaches use the same form regardless of the task and environment conditions, the proposed method can provide an adaptive modulation frequency filter that outperforms conventional methods for each utterance. In addition, the method ultimately performs channel normalization in a feature domain with applications to log-spectral parameters. The performance was evaluated by speaker-independent isolated-word recognition experiments under additive noise environments. The proposed method achieved outstanding improvement for speech recognition in environments with significant noise and was also effective in a range of feature representations.

Adaptive Channel Normalization Based on Infomax Algorithm for Robust Speech Recognition

  • Jung, Ho-Young
    • ETRI Journal / v.29 no.3 / pp.300-304 / 2007
  • This paper proposes a new data-driven method for high-pass approaches, which suppress slow-varying noise components. Conventional high-pass approaches are based on the idea of decorrelating the feature vector sequence, and aim for adaptability to various conditions. The proposed method is based on temporal local decorrelation using the information-maximization theory for each utterance. This is performed on an utterance-by-utterance basis, which provides an adaptive channel normalization filter for each condition. The performance of the proposed method is evaluated by isolated-word recognition experiments with channel distortion. Experimental results show that the proposed method yields outstanding improvement for channel-distorted speech recognition.
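As a point of reference for what channel normalization does, here is a minimal sketch of the simplest fixed (non-adaptive) variant: per-utterance mean subtraction, as in cepstral mean normalization. This is a stand-in baseline, not the Infomax-based filter the abstract proposes. It assumes features in a log-spectral or cepstral domain, where a stationary channel shows up as a constant additive offset; the function name is hypothetical.

```python
def mean_normalize(utterance):
    """Subtract the per-dimension mean over all frames of ONE utterance.

    utterance: list of frames, each a list of floats of equal length.
    A constant offset added to every frame (a stationary channel in a
    log-spectral or cepstral domain) is removed by this subtraction.
    """
    if not utterance:
        return []
    n_frames = len(utterance)
    n_dims = len(utterance[0])
    means = [sum(frame[d] for frame in utterance) / n_frames
             for d in range(n_dims)]
    return [[frame[d] - means[d] for d in range(n_dims)]
            for frame in utterance]
```

Because the mean is recomputed for each utterance, the normalization follows the channel of that utterance, but the filter shape itself stays fixed, which is the limitation the adaptive approach targets.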

Image Classification Method using Independent Component Analysis and Normalization (독립성분해석과 정규화를 이용한 영상분류 방법)

  • Hong, Jun-Sik;Ryu, Jeong-Woong
    • Journal of KIISE:Software and Applications / v.28 no.9 / pp.629-633 / 2001
  • In this paper, we improve noise tolerance in image classification by combining ICA (Independent Component Analysis) with Normalization. When we add noise to the raw image data, the degree of noise tolerance becomes N(0, 0.4) for PCA and N(0, 0.53) for ICA. However, when we use the preprocessing approach, the degree of noise tolerance after Normalization becomes N(0, 0.75), which shows the improvement of noise tolerance in classification.

Short Term Sensor's Drift Analysis and Compensation Using Internal Normalization (내부 최적화를 이용한 화학 센서의 단기 드리프트 분석 및 보정)

  • Jeon, Jin-Young;Baek, Jong-Hyun;Byun, Hyung-Gi
    • Journal of Sensor Science and Technology / v.24 no.4 / pp.270-273 / 2015
  • One of the main problems when working with chemical sensors is the lack of repeatability and reproducibility of the sensor response. If the problem is not properly taken into consideration, the stability and reliability of a system using chemical sensors decrease. In this paper, we analyzed short-term sensor drift and proposed a compensation method that reduces the effects of the drift in order to improve the stability and reliability of the chemical sensor. The sensor drift was analyzed with a trend line graph, and the coefficient of variation (CV) was used to quantify it. We compensated for the drift by using internal normalization. As a result, the value of CV decreased after compensation.
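The CV comparison can be sketched as follows. The abstract does not define its internal normalization, so the compensation step here is only an assumption: each response is divided by a reference response measured in the same run. The function names are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def normalize_by_reference(responses, references):
    """ASSUMED form of internal normalization (not taken from the paper):
    divide each response by a reference response from the same run."""
    return [r / ref for r, ref in zip(responses, references)]


# Illustrative values with a shared multiplicative drift.
responses = [10.0, 10.5, 11.0, 11.5]
references = [1.0, 1.05, 1.1, 1.15]

cv_before = coefficient_of_variation(responses)
cv_after = coefficient_of_variation(
    normalize_by_reference(responses, references))
print(f"CV before: {cv_before:.2f}%, after: {cv_after:.2e}%")
```

If drift scales the response and the reference by the same factor, the ratio cancels it and the CV falls, which matches the direction of the reported result.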

Variation of Water Level on the Upstream Gauging Station by Operation of the Drainage Sluice Gate of Geumgang Estuary Dam (금강하구둑 배수갑문 조작에 의한 상류수역의 수위변동)

  • Park, Seung-Ki
    • Journal of The Korean Society of Agricultural Engineers / v.47 no.6 / pp.15-24 / 2005
  • Normalization of the characteristics of water level change at the upstream gauging station was attempted with respect to the operation of the drainage sluice gate of the Geumgang estuary dam. The characteristics were normalized by analyzing the water level change and by linear regression of the water level data measured at the inner station of the Geumgang estuary dam and the upstream gauging station. The results of the normalization may serve as a reference for the management of Geumgang estuary lake and the operation of pumping and drainage stations on the shore of the lake. The mean response times of water level change at the Ibpo, Ganggyeong, and Gyuam water level stations were 39, 81, and 160 minutes, respectively, when the sluice gate was opened. The mean velocity of the surface wave, the mean displacement of water level change, the mean time of water level change, and the mean rate of water level change varied largely depending on the location of the gauging station and the characteristics of the stream section at the water level gauging station.

Comparative Analysis for Emotion Expression Using Three Methods Based by CNN (CNN기초로 세 가지 방법을 이용한 감정 표정 비교분석)

  • Yang, Chang Hee;Park, Kyu Sub;Kim, Young Seop;Lee, Yong Hwan
    • Journal of the Semiconductor & Display Technology / v.19 no.4 / pp.65-70 / 2020
  • Representative CNN techniques for emotion detection include the basic CNN algorithm, batch normalization, and dropout. We present the methods and data of the three experiments in this paper. The training database and the test database are set up differently. The first experiment extracts emotions using Batch Normalization, which complemented the shortcomings of distribution. The second experiment extracts emotions using Dropout, which is used for rapid computation. The third experiment uses a CNN with convolution and max pooling. All three results show a low detection rate. To address these problems, we will develop a deep learning algorithm using a feature extraction method specialized for the image processing field.
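For readers unfamiliar with the first technique, a minimal sketch of the batch normalization forward pass in training mode is given below. It is a generic illustration, not the network used in the paper; the scalar `gamma` and `beta` stand in for the learnable per-feature scale and shift, and the function name is hypothetical.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of floats of equal length.
    gamma, beta: scalars standing in for learnable per-feature parameters.
    """
    n = len(batch)
    dims = len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        column = [sample[d] for sample in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n  # biased variance
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][d] = gamma * (column[i] - mean) * inv_std + beta
    return out
```

Each feature ends up with roughly zero mean and unit variance across the batch, regardless of its original scale.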

Motion Recognition of Workers using Skeleton and LSTM (Skeleton 정보와 LSTM을 이용한 작업자 동작인식)

  • Jeon, Wang Su;Rhee, Sang Yong
    • Journal of Korea Multimedia Society / v.25 no.4 / pp.575-582 / 2022
  • In the manufacturing environment, research to minimize robot collisions with human beings has been widespread, but in order to interact with robots, it is important to precisely recognize and predict human actions. In this research, after enhancing performance by applying group normalization to the Hourglass model to detect operator motion, the skeleton was estimated and data were created using this model. Then, three types of operator movements were recognized using LSTM. In the experiment, the accuracy was enhanced by 1% using group normalization, and the recognition accuracy was 99.6%.
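Group normalization can be sketched in a few lines. This is a generic illustration for a single sample, not the Hourglass configuration used in the paper; the learnable scale and shift are omitted, and the function name is hypothetical.

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize ONE sample over groups of channels.

    channels: list of C channels, each a flat list of spatial values.
    Statistics are computed over all values in each group of
    C // num_groups channels, independently of other samples.
    """
    c = len(channels)
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = c // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * inv_std for v in channel]
                    for channel in group])
    return out
```

Unlike batch normalization, the statistics do not depend on the batch, so the behavior is the same for any batch size.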

Combining Support Vector Machine Recursive Feature Elimination and Intensity-dependent Normalization for Gene Selection in RNAseq (RNAseq 빅데이터에서 유전자 선택을 위한 밀집도-의존 정규화 기반의 서포트-벡터 머신 병합법)

  • Kim, Chayoung
    • Journal of Internet Computing and Services / v.18 no.5 / pp.47-53 / 2017
  • In the past few years, high-throughput sequencing, big-data generation, cloud computing, and computational biology have been revolutionary. RNA sequencing is emerging as an attractive alternative to DNA microarrays. However, methods for constructing a Gene Regulatory Network (GRN) from RNA-Seq are extremely lacking and urgently required. Because GRN has received substantial attention from genomics and bioinformatics, an elementary requirement of the GRN has been to maximize distinguishable genes. Although RNA sequencing techniques generate a large amount of data, there are few computational methods to exploit it. Therefore, we suggest a novel gene selection algorithm combining Support Vector Machines and Intensity-dependent normalization, which uses the log differential expression ratio in RNAseq. It is an extended variation of the support vector machine recursive feature elimination (SVM-RFE) algorithm. This algorithm accomplishes minimum relevancy with subsets of Big-Data, such as NCBI-GEO. The proposed algorithm was compared to the existing one, which uses gene expression profiling with DNA microarrays. The proposed algorithm was found to be more convenient and quicker than the previous one because it uses functions available in R packages, and it improves classification accuracy based on gene ontology and time consumption on Big-Data. The comparison was performed based on the number of genes selected in RNAseq Big-Data.