Integrated Search | Korea Science

김민경; 황범석
- The Korean Journal of Applied Statistics (응용통계연구) / Vol. 37, No. 5 / pp. 663-673 / 2024
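The multi-armed bandit (MAB) problem arises in sequential decision-making settings and focuses on selecting, among several possible actions in a dynamic environment, the optimal action that maximizes reward. Thompson sampling, one of the representative algorithms for solving the MAB problem in the context of statistical learning theory, is known to remain flexibly applicable even in complex settings when approximation techniques are applied. However, studies using data from real commercial services are scarce. In this study, we use banner click data, a common recommender-system setting, to evaluate the performance of Thompson sampling with and without various approximation techniques under simulated environments with several conditions. The experiments show that Thompson sampling with the Langevin Monte Carlo approximation performs comparably to standard Thompson sampling in a big-data environment. The significance of this study lies in demonstrating empirically that Thompson sampling with approximation techniques can retain the inherent advantages of those techniques while performing comparably to the original model.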
https://doi.org/10.5351/KJAS.2024.37.5.663
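As a rough illustration of the basic idea behind Thompson sampling on click data, the following is a minimal sketch and not the authors' implementation. It assumes a Beta-Bernoulli model with a uniform Beta(1, 1) prior per banner and uses no approximation technique such as Langevin Monte Carlo. The function name `thompson_sampling`, the parameter `click_rates`, and the simulated click rates are hypothetical.

```python
import random


def thompson_sampling(click_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling; return pull counts per banner.

    click_rates: hypothetical true click probabilities, used only to simulate clicks.
    """
    rng = random.Random(seed)
    n = len(click_rates)
    clicks = [0] * n      # observed clicks per banner
    no_clicks = [0] * n   # observed non-clicks per banner
    pulls = [0] * n       # how often each banner was shown

    for _ in range(rounds):
        # Draw one sample per banner from its Beta(1 + clicks, 1 + no_clicks) posterior.
        samples = [
            rng.betavariate(1 + clicks[i], 1 + no_clicks[i]) for i in range(n)
        ]
        # Show the banner whose sampled click rate is largest.
        arm = max(range(n), key=lambda i: samples[i])
        pulls[arm] += 1

        # Simulate the user's response and update that banner's posterior counts.
        if rng.random() < click_rates[arm]:
            clicks[arm] += 1
        else:
            no_clicks[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.1, 0.3, 0.7], rounds=2000))
```

Because each banner is chosen with the probability that it is the best one under the current posterior, exploration fades naturally as evidence accumulates. Approximation techniques become relevant when the posterior has no closed form, which this conjugate sketch avoids.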

Chung, Byung Chang
- Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지) / Vol. 25, No. 12 / pp. 1960-1963 / 2021
In this paper, we propose a multiplay Thompson sampling algorithm for multipath communication systems. Multipath communication systems offer advantages in communication capacity, robustness, survivability, and so on. It is important to select an appropriate network path according to the status of each individual path. It is hard, however, to obtain information on the quality of the paths simultaneously. To address this issue, we apply Thompson sampling, which is popular in the machine learning area. We found some issues when the algorithm is applied directly to the proposed system and suggest some modifications. Through simulation, we verified that the proposed algorithm can utilize all of the network paths. In summary, the proposed algorithm can be applied to path allocation in multipath-based communication systems.
https://doi.org/10.6109/jkiice.2021.25.12.1960
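To make the term "multiplay" concrete, the sketch below shows the plain multiple-play variant of Thompson sampling, in which several paths are selected per round rather than one. It is an illustrative assumption only: the abstract does not specify the modifications the paper proposes, and none are reproduced here. The function name `multiplay_thompson_sampling`, the parameter `success_probs`, and the Bernoulli model of per-path transmission success are all hypothetical.

```python
import random


def multiplay_thompson_sampling(success_probs, m, rounds, seed=0):
    """Simulate multiple-play Thompson sampling; return selection counts per path.

    success_probs: hypothetical per-path success probabilities (simulation only).
    m: number of paths selected in each round.
    """
    rng = random.Random(seed)
    n = len(success_probs)
    successes = [0] * n
    failures = [0] * n
    pulls = [0] * n

    for _ in range(rounds):
        # Sample a plausible quality for every path from its Beta posterior.
        samples = [
            rng.betavariate(1 + successes[i], 1 + failures[i]) for i in range(n)
        ]
        # Select the m paths with the largest sampled quality.
        chosen = sorted(range(n), key=lambda i: samples[i], reverse=True)[:m]

        # Observe each selected path and update only those posteriors.
        for path in chosen:
            pulls[path] += 1
            if rng.random() < success_probs[path]:
                successes[path] += 1
            else:
                failures[path] += 1

    return pulls


if __name__ == "__main__":
    print(multiplay_thompson_sampling([0.2, 0.5, 0.8, 0.9], m=2, rounds=2000))
```

In this plain form the selections concentrate on the best m paths over time, so the remaining paths end up rarely used.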