• Title/Summary/Keyword: DOC


Multiple Fusion-based Deep Cross-domain Recommendation (다중 융합 기반 심층 교차 도메인 추천)

  • Hong, Minsung;Lee, WonJin
    • Journal of Korea Multimedia Society / v.25 no.6 / pp.819-832 / 2022
  • Cross-domain recommender systems transfer knowledge across domains to improve recommendation performance in a target domain that has a relatively sparse model. However, they suffer from "negative transfer", in which the transferred knowledge acts as noise. This paper proposes a novel Multiple Fusion-based Deep Cross-Domain Recommendation model named MFDCR. We exploit Doc2Vec, a well-known embedding technique, to fuse data user-wise and transfer knowledge across multiple domains, which alleviates the negative transfer problem. Additionally, we introduce a simple multi-layer perceptron to learn the user-item interactions and predict the probability that a user prefers an item. Extensive experiments with three domain datasets from Amazon demonstrate that MFDCR outperforms recent single-domain and cross-domain recommendation algorithms. Furthermore, the experimental results show that MFDCR can address negative transfer and improve recommendation performance for multiple domains simultaneously. We also show that our approach extends efficiently to more domains.

Entity-oriented Sentence Extraction and Relation-Context Co-attention for Document-level Relation Extraction (문서 수준 관계 추출을 위한 개체 중심 문장 추출 및 Relation-Context Co-attention 방법)

  • Park, SeongSik;Kim, HarkSoo
    • Annual Conference on Human and Language Technology / 2020.10a / pp.9-13 / 2020
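  • Relation extraction is the task of identifying the semantic relations between entities that appear in a given sentence or document. Since the recent release of DocRED, a document-level relation extraction corpus, research on document-level relation extraction has been active. In addition, as pre-trained masked language models (MLMs) have influenced the whole field of natural language processing, studies that apply MLMs to relation extraction are also under way. However, because documents are long, using a self-attention-based MLM for document-level relation extraction increases the computational cost of the model. To address this, this paper proposes a simple preprocessing method that selects the sentences needed for relation extraction. It also proposes a Relation-Context Co-attention method that can automatically acquire the lexical information needed for relation extraction regardless of document length. The proposed model achieved high performance on the DocRED corpus, with a Dev F1 of 62.01% and a Test F1 of 59.90%.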
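
The abstract does not spell out how sentences are selected, so the following is only a minimal sketch of one plausible entity-oriented filter: keep the sentences that mention either entity of the pair under consideration. The function name `select_sentences`, the substring matching, and the toy document are all assumptions for illustration, not the authors' method.

```python
def select_sentences(sentences, head_mentions, tail_mentions):
    """Keep sentences that mention the head or tail entity, in original order.

    Assumption: a mention counts if it appears as a case-insensitive substring.
    """
    mentions = [m.lower() for m in list(head_mentions) + list(tail_mentions)]
    return [s for s in sentences if any(m in s.lower() for m in mentions)]


# Hypothetical toy document; not taken from DocRED.
document = [
    "Marie Curie was born in Warsaw.",
    "The city lies on the Vistula river.",
    "She later moved to Paris.",
    "Curie won two Nobel Prizes.",
]

# Entity pair: head = Marie Curie (with an alias), tail = Warsaw.
selected = select_sentences(document, ["Marie Curie", "Curie"], ["Warsaw"])
```

Filtering in this way shortens the input passed to a self-attention model, which is the cost the abstract says the preprocessing step is meant to reduce.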


Designing a Recommendation System between Korean Start-ups and Foreign Buyers based on Convolutional Neural Network (CNN 기반의 국내 스타트업 해외-바이어간 추천시스템 설계)

  • Choi, Jungsuk;Moon, Nammee
    • Annual Conference of KIPS / 2021.11a / pp.795-796 / 2021
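  • This paper designs a system that finds overseas buyers suited to the products and services of Korean start-ups and recommends them in a customized manner. The recommendation algorithm uses CNN-based Word2Vec and Doc2Vec algorithms, and uses visual information to improve accuracy. The data used by the recommendation system are unstructured text, namely company introductions and product/service descriptions, and product photos serve as the visual information. To recommend in descending order of similarity, the sentence data are converted into keyword lists and fed into a Word2Vec model, which produces keyword coordinates and thereby vectorizes them. The distances between sentence centroids are then computed to derive the similarity and relatedness between companies. On this basis, overseas buyers are recommended in descending order of similarity to the Korean start-up's sentence data and visual information.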
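
The centroid-distance step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional `embedding` table stands in for a trained Word2Vec model, Euclidean distance is assumed (the abstract does not name the metric), the visual information is left out, and the names `centroid`, `rank_buyers`, `buyer_a` and `buyer_b` are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("no known keywords to average")
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_buyers(startup_keywords, buyers, embedding):
    """Return buyer names ordered by centroid distance (closest first)."""
    def text_centroid(keywords):
        # Keywords missing from the embedding table are skipped.
        return centroid([embedding[k] for k in keywords if k in embedding])

    target = text_centroid(startup_keywords)
    scored = [(euclidean(target, text_centroid(kws)), name)
              for name, kws in buyers.items()]
    return [name for _, name in sorted(scored)]


# Toy keyword coordinates (assumption: real ones would come from Word2Vec).
embedding = {
    "battery": [1.0, 0.0],
    "charger": [0.8, 0.2],
    "skincare": [0.0, 1.0],
    "cosmetics": [0.1, 0.9],
}
buyers = {
    "buyer_a": ["skincare", "cosmetics"],
    "buyer_b": ["battery", "charger"],
}
ranking = rank_buyers(["battery"], buyers, embedding)
```

With these toy coordinates, `buyer_b` is ranked ahead of `buyer_a` because its keyword centroid lies closer to the start-up's.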

Characterization of Dissolved Organics Based on Their Origins (상수 원수에 따른 용존 유기물의 특성 평가)

  • 허준무;박종안;장봉기;이종화
    • Journal of Environmental Science International / v.8 no.3 / pp.337-347 / 1999
  • This study evaluated the characteristics of dissolved organics based on their origins, which were divided into two groups. The first group consisted of river water, lake water and secondary sewage treatment effluent, chosen as representative of their origins. The second group consisted of artificial samples made from AHA (Aldrich humic acids) and WHA (Wako humic acids). The physicochemical characteristics, biological degradability and THMFP (trihalomethane formation potential) of the samples were analysed based on the AMWD (apparent molecular weight distribution). A large portion of the dissolved organic carbon (DOC) in the river and lake samples was LMW (low molecular weight), while that of AHA and WHA was HMW (high molecular weight). The DOC of the lake was evenly distributed across the whole range of molecular weights. The river, lake and secondary treated effluent samples had lower ultraviolet (UV) absorbance at 254 nm, and had a higher amount of humic acids. The higher absorbance of the humic acids indicates that they contained aliphatic bonds and benzenoid-type components that absorb UV light. Based on the specific ultraviolet absorbance (SUVA), the lake sample was expected to be the most biodegradable of the samples investigated, followed in order by the secondary sewage treatment effluent, river, WHA and AHA. The biodegradability tests showed a similar result except for AHA; dissolved organics in the LMW range decreased during the test, whereas those in the HMW range increased. Production of SMPs (soluble microbial products) was observed during humification of the dissolved organics. THM formation was high in the samples containing HMW organics, and the THMFP showed a similar tendency, except for WHA.
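
For readers unfamiliar with SUVA, the short sketch below shows how the index is conventionally computed: UV absorbance at 254 nm (per cm) divided by DOC (mg/L), multiplied by 100 to give L/(mg·m). The abstract does not state the formula or any measured values, so the function name `suva` and the input numbers are hypothetical.

```python
def suva(uv254_per_cm, doc_mg_per_l):
    """Specific UV absorbance in L/(mg*m).

    uv254_per_cm: UV absorbance at 254 nm, in 1/cm.
    doc_mg_per_l: dissolved organic carbon, in mg/L (must be positive).
    """
    if doc_mg_per_l <= 0:
        raise ValueError("DOC must be positive")
    # The factor 100 converts absorbance per cm to absorbance per m.
    return uv254_per_cm / doc_mg_per_l * 100.0


# Hypothetical sample: absorbance 0.05 per cm at a DOC of 2.5 mg/L.
value = suva(0.05, 2.5)
```

Because the index normalizes absorbance by carbon content, samples with different DOC concentrations can be compared directly, which is how the abstract uses it to order the samples.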
