Search | Korea Science

Exploring the Feature Selection Method for Effective Opinion Mining: Emphasis on Particle Swarm Optimization Algorithms

Eo, Kyun Sun;Lee, Kun Chang
- Journal of the Korea Society of Computer and Information, v.25 no.11, pp.41-50, 2020
Sentiment analysis begins with the search for words that determine the sentiment inherent in data. Managers can understand market sentiment by analyzing the sentiment words that consumers tend to use. In this study, we explore the performance of feature selection methods embedded with Particle Swarm Optimization Multi-Objective Evolutionary Algorithms. The feature selection methods were benchmarked with machine learning classifiers: Decision Tree, Naive Bayesian Network, Support Vector Machine, Random Forest, Bagging, Random Subspace, and Rotation Forest. Our empirical opinion mining results showed that the number of features was significantly reduced without hurting performance. Specifically, the Support Vector Machine showed the highest accuracy, and Random Subspace produced the best AUC results.
https://doi.org/10.9708/jksci.2020.25.11.041

Exploring the Predictive Variables of Government Statistical Indicators on Retail Sales Using Machine Learning: Focusing on Pharmacy

Lee, Gwang-Su
- Journal of Internet Computing and Services, v.23 no.3, pp.125-135, 2022
This study uses machine learning to explore whether government statistical indicators, which were built to create an industrial ecosystem based on data, networks, and artificial intelligence, affect pharmacy sales, and to identify analysis techniques suited to predicting those sales. Predictive variables and model performance were examined with Random Forest, XGBoost, LightGBM, and CatBoost, using data from January 2016 to December 2021 on 28 government statistical indicators and on pharmacies in the retail sector. The analysis found that three economic indicators were important variables affecting pharmacy sales: the economic sentiment index, the cyclical component of the coincident economic index, and the consumer sentiment index. On the regression metrics MAE, MSE, and RMSE, Random Forest outperformed XGBoost, LightGBM, and CatBoost. Based on these results, the study presents the variables that affect pharmacy sales and the best-performing machine learning technique, and proposes several implications and follow-up studies.
https://doi.org/10.7472/jksii.2022.23.3.125
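
The three regression metrics named in the abstract above can be computed as in the following minimal sketch. The sales figures are hypothetical and are not taken from the study; the function name `regression_errors` is likewise only illustrative.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / len(residuals)  # mean absolute error
    mse = sum(r * r for r in residuals) / len(residuals)   # mean squared error
    rmse = math.sqrt(mse)                                  # root mean squared error
    return mae, mse, rmse


# Hypothetical monthly sales values (illustration only, not data from the study).
actual = [100.0, 120.0, 90.0, 110.0]
predicted = [98.0, 125.0, 95.0, 110.0]
mae, mse, rmse = regression_errors(actual, predicted)
print(mae, mse, rmse)
```

Lower values are better for all three metrics. RMSE is in the same units as the target, which makes it easier to compare against typical sales levels than MSE.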

Design of fuzzy model using meiosis-genetic algorithm

Koh, Taek-Beom;Lee, Deog-Kyoo
- Proceedings of the KIEE Conference, 2000.07d, pp.2696-2698, 2000
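This study proposes a method that searches for the optimal structure and parameters of a fuzzy model with a meiosis-genetic algorithm and then fine-tunes the parameters with a gradient descent algorithm. The meiosis-genetic algorithm applies meiosis to individuals composed of real-valued chromosomes to produce gametes, and carries out the search as generations evolve through random selection and crossover of these gametes. The proposed method is applied to build a fuzzy model of the Box-Jenkins gas furnace data, which demonstrates its applicability.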

An Improved Partitioning Algorithm in Hardware Software Codesign

Oh, Ju-Young
- Proceedings of the Korea Information Processing Society Conference, 2001.10a, pp.689-692, 2001
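This paper proposes a partitioning algorithm that decides which parts of a design to implement in hardware and which in software, so that a low-cost, high-efficiency target can be synthesized while satisfying the given constraints. In the simulated annealing approach presented in [6], candidates are selected by moving nodes at random, so the same candidates are selected repeatedly and the search takes a long time. To overcome this drawback, the proposed algorithm selects candidates by considering those variables of the cost function that can affect system execution time and implementation cost, which shortens the running time of the partitioning algorithm in its search for the optimal solution. Experimental results show that, as the number of target nodes grows, the proposed method finds the optimal solution faster than the existing method.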

The Improved Evolutionary Programming with Direction Vectors

박진현;배준경
- Journal of the Korean Institute of Intelligent Systems, v.10 no.6, pp.542-547, 2000
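Evolutionary programming (EP) is a search algorithm that mimics the principle of natural selection and is a very useful technique for optimization problems. Compared with conventional optimization algorithms, EP is a global search method that explores several solutions simultaneously, so it is less likely to suffer from local convergence, and it does not require conditions such as continuity of the optimization parameter domain or the existence of derivatives. Despite these advantages, the search region of EP is determined by the initial conditions, the random generation of the optimization parameters, and the strategy parameters needed for optimization, and its convergence is slow. To address these problems, this study proposes an improved EP that offers both fast convergence and diversity, and applies the proposed improved EP with direction vectors to function optimization problems to demonstrate its usefulness.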

Effect of Genetic Correlations on the P Values from Randomization Test and Detection of Significant Gene Groups

Yi, Mi-Sung;Song, Hae-Hiang
- The Korean Journal of Applied Statistics, v.22 no.4, pp.781-792, 2009
At an early stage of genomic investigations, a small sample of microarrays is used in gene expression experiments to identify small subsets of candidate genes for further, more accurate investigation. Unlike the statistical methods used for large samples of microarrays, an appropriate method for identifying small subsets is a randomization test, which provides exact P values. These exact P values from a randomization test on a small sample of microarrays are discrete. The possible existence of differentially expressed genes in the full set of genes can be tested against the null hypothesis of a uniform distribution. Subsets of smaller P values are of prime interest for further investigation, and these outlier cells can be identified from a multinomial distribution of P values by the M test of Fuchs et al. (1980). Genome-wide gene expression levels in microarrays are correlated, yet most statistical methods in microarray analysis assume that genes are independent and ignore possibly correlated expression levels. Using simulation studies, we investigated the effect that correlated gene expression levels could have on the randomization test and M test results, and found that the effects are often not negligible.
https://doi.org/10.5351/KJAS.2009.22.4.781
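
To illustrate why exact P values from a randomization test on a small sample are discrete, the sketch below enumerates every way of splitting the pooled observations into two groups. It assumes a two-group, one-sided difference-in-means statistic, which the abstract does not specify, and the expression values are hypothetical.

```python
from itertools import combinations


def exact_randomization_p(group_a, group_b):
    """One-sided exact P value: the share of all regroupings whose
    mean difference (A minus B) is at least as large as the observed one."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    n_splits = 0
    n_extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        n_splits += 1
        # Small tolerance guards against floating-point round-off.
        if diff >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_splits


# Hypothetical expression values for one gene, 3 arrays per condition.
treated = [5.1, 4.8, 5.5]
control = [3.9, 4.2, 4.0]
p_value = exact_randomization_p(treated, control)
print(p_value)
```

With three arrays per group there are only 20 possible regroupings, so every P value is a multiple of 1/20 and the smallest attainable one-sided value is 0.05.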

Research of FOV difference correction between Electro Optic Tracking System and Radar System

Kwon, Kang-hoon;Kim, Young-gil
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2013.05a, pp.151-154, 2013
A variety of equipment is typically available to detect and track targets, and targets are detected and tracked quickly and accurately by exchanging information between the pieces of equipment. These systems have similar detection areas (FOV), but the areas differ somewhat because of the limits of each system's resolution. In this paper, we study a method for reducing the time needed to search for and detect a target, and a method for tracking it automatically.

Review of control parameter of SCE-UA

Taehun Jung;Sangho Lee;Namjoo Lee
- Proceedings of the Korea Water Resources Association Conference, 2023.05a, pp.350-350, 2023
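The SCE-UA (Shuffled Complex Evolution-University of Arizona) method is an optimal-solution search algorithm developed as a tool for calibrating conceptual rainfall-runoff models. As a metaheuristic method, SCE-UA must evaluate the objective function many times to find the optimal solution. Control parameters govern the number of objective function evaluations and the convergence of the solution, and the user must supply appropriate values for them. This study reviewed the functions of the SCE-UA control parameters and examined how well the global optimum of the Ackley function, used as a test function, is found as the number of complexes varies. The global search results varied with the random seed, and the number of objective function evaluations tended to increase with the number of complexes. As the dimension of the test function (the number of decision variables) increases, the rate of finding the global optimum decreases; using more complexes tends to find the global optimum more reliably, but requires more objective function evaluations. In two dimensions the search success rate was 90% or higher with seven or more complexes, whereas in ten dimensions the rate was only 37% even with 20 complexes, the largest number tested. These results can help users understand the basic concepts behind the SCE-UA configuration parameters and choose their values.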
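
For reference, the following sketch gives the Ackley test function in its commonly used form. The constants a = 20, b = 0.2, and c = 2π are the conventional defaults and are an assumption here, since the abstract does not state which values the study used.

```python
import math


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Ackley function for a point x given as a sequence of floats.

    The global minimum is at the origin, where the value is 0.
    """
    d = len(x)
    mean_sq = sum(xi * xi for xi in x) / d
    mean_cos = sum(math.cos(c * xi) for xi in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


# The two dimensionalities mentioned in the abstract.
at_origin_2d = ackley([0.0] * 2)
at_origin_10d = ackley([0.0] * 10)
away_from_origin = ackley([1.0, 1.0])
print(at_origin_2d, at_origin_10d, away_from_origin)
```

The function has many local minima surrounding a single global minimum, which is why it is often used to check whether a global search method escapes local optima.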

Area-Based Q-learning for Multiple Robots Control

Yoon Han-Ul;Jang In-Hoon;Sim Kwee-Bo
- Proceedings of the Korean Institute of Intelligent Systems Conference, 2005.04a, pp.198-201, 2005
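This paper discusses area-based Q-learning for efficiently controlling multiple robots. Each robot has six sensors arranged at $60^{\circ}$ intervals, with which it senses the distance between itself and its surroundings. From the acquired distance data, the robot computes the areas in the six directions and moves to the region that guarantees a wider range of action for subsequent moves. Treating this move as a transition from one state to another, the robot recomputes the six areas after moving and updates the Q-value of the action that led from the previous state to the current one. In the experiments, five robots were used to find an object hidden among obstacles, and results are reported for three control methods: random search, area-based search, and area-based Q-learning search.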
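
The Q-value update mentioned in the abstract above can be sketched with the standard tabular Q-learning rule. This is a minimal illustration and not the authors' implementation: the state labels, the reward, the learning rate `alpha`, and the discount factor `gamma` are all hypothetical, and the abstract does not say how the six areas are encoded as states or rewards.

```python
from collections import defaultdict

N_ACTIONS = 6  # assumption: one action per sensing direction (60 degrees apart)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q(state, action)."""
    best_next = max(q[next_state])  # value of the greedy action in the next state
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


# Q-table with all values initialised to zero; states are hypothetical labels.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)
new_value = q_update(q_table, state="s0", action=2, reward=1.0, next_state="s1")
print(new_value)
```

In an area-based variant, the reward could plausibly be derived from the areas computed after the move, but that mapping is not given in the abstract and is left open here.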

Design of a Real Time, High Speed, Large Scale Data Storage System using the DEVS formalism

이찬수;성영락;오하령
- Proceedings of the Korea Society for Simulation Conference, 1997.04a, pp.75-80, 1997
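This study analyzes the requirements for a data storage system capable of high-speed input and output of large volumes of data, and designs a system that satisfies them. To meet the conditions of high speed, large capacity, and random access, several hard disks are connected in parallel and incoming data are divided among them for storage. However, hard disk performance is strongly affected by the seek operation of the disk arm, so meeting real-time requirements calls for more than simply adding disks: a way to control the disk arm seek operations efficiently is also needed. The designed system is therefore divided into four parts, the MCU (Master Control Unit), DDU (Data Distribution Unit), SCU (Slave Control Unit), and DSU (Data Storage Unit), and the disk arm seek operation of each disk is controlled by an independent SCU. To confirm that the design satisfies the given requirements, the proposed system was described and simulated using the DEVS formalism, a mathematical language for describing discrete event systems, and the trajectories of the events generated during simulation were analyzed. The analysis showed that the proposed system meets the stated requirements well.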
