• Title/Summary/Keyword: Matching Score


Building a Korean-English Parallel Corpus by Measuring Sentence Similarities Using Sequential Matching of Language Resources and Topic Modeling (언어 자원과 토픽 모델의 순차 매칭을 이용한 유사 문장 계산 기반의 위키피디아 한국어-영어 병렬 말뭉치 구축)

  • Cheon, JuRyong;Ko, YoungJoong
    • Journal of KIISE / v.42 no.7 / pp.901-909 / 2015
  • In this paper, we propose a method for building a Korean-English parallel corpus from Wikipedia by finding similar sentences using language resources and topic modeling. We first apply language resources (a Wiki-dictionary, numbers, and the Daum online dictionary) to match words sequentially. The Wiki-dictionary is constructed from Wikipedia titles, and, to take advantage of Wikipedia, we use the translation probabilities in the Wiki-dictionary for word matching. In addition, we improve the accuracy of the sentence similarity measure by using word distributions based on topic modeling. In the experiments, a previous study achieved an F1-score of 48.4% using only language resources in a linear combination, and 51.6% when topic modeling over entire word distributions was added. Our proposed sequential matching, which adds translation probability to the language resources, achieved an F1-score of 58.3%, 9.9 percentage points higher than the previous study. When sequential matching of language resources was combined with topic modeling that considers important word distributions, the proposed system achieved 59.1%, 7.5 percentage points higher than the previous study.
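
The sequential matching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the resource order, the unit weight for number and dictionary matches, and the length normalization are all assumptions made for illustration. Each source token is tried against the resources in a fixed order, and the first resource that matches determines its contribution.

```python
def sequential_match_score(src_tokens, tgt_tokens, wiki_dict, general_dict):
    """Score a sentence pair by trying resources in a fixed order per source token.

    wiki_dict:    {source_word: {target_word: translation_probability}} (hypothetical format)
    general_dict: {source_word: [target_word, ...]}                      (hypothetical format)
    """
    tgt = set(tgt_tokens)
    total = 0.0
    for tok in src_tokens:
        # 1) Wiki-dictionary: use the best translation probability among matches.
        wiki_hits = [p for cand, p in wiki_dict.get(tok, {}).items() if cand in tgt]
        if wiki_hits:
            total += max(wiki_hits)
            continue
        # 2) Numbers: identical digit strings count as a full match.
        if tok.isdigit() and tok in tgt:
            total += 1.0
            continue
        # 3) General bilingual dictionary: any listed translation counts as a match.
        if any(cand in tgt for cand in general_dict.get(tok, ())):
            total += 1.0
    # Normalize by the longer sentence so long, loosely related pairs score low.
    return total / max(len(src_tokens), len(tgt_tokens), 1)
```

A sentence pair whose score exceeds a chosen threshold would then be kept as a candidate parallel pair; the threshold itself is not specified in the abstract.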

Wage Differentials between Regular and Irregular Workers (데이터 매칭을 이용한 비정규직의 임금격차 분석)

  • Kim, Sunae;Kim, Jinyoung
    • Journal of Labour Economics / v.34 no.2 / pp.53-77 / 2011
  • The last decade has witnessed a surge of research interest in differences between regular and irregular workers. Recent studies estimating wage differentials between the two types of workers have typically used linear regression analysis. Our study uses a new methodology to estimate these wage differentials: data matching. Our method can perform better than ordinary regression analysis because it carefully addresses the selection bias problem. Our results indicate that there is no significant difference in wages between regular and irregular workers.


A BERT-based Transfer Learning Model for Bidirectional HR Matching (양방향 인재매칭을 위한 BERT 기반의 전이학습 모델)

  • Oh, Sojin;Jang, Moonkyoung;Song, Hee Seok
    • Journal of Information Technology Applications and Management / v.28 no.4 / pp.33-43 / 2021
  • While youth unemployment has fallen to its lowest level since the global COVID-19 pandemic, SMEs (small and medium-sized enterprises) are still struggling to fill vacancies. Because of information mismatch, it is difficult for SMEs to find good candidates and for job seekers to find appropriate job offers. To overcome this mismatch, this study proposes a fine-tuned model for bidirectional HR matching based on a pre-trained language model called BERT (Bidirectional Encoder Representations from Transformers). The proposed model is capable of recommending job openings suitable for an applicant, or applicants appropriate for a job, through sufficient pre-training on terms including technical jargon. The experimental results demonstrate the superior performance of our model in terms of precision, recall, and F1-score compared to the existing content-based metric learning model. This study provides insights for developing practical models for job recommendations and offers suggestions for future research.

A study of using quality for Radial Basis Function based score-level fusion in multimodal biometrics (RBF 기반 유사도 단계 융합 다중 생체 인식에서의 품질 활용 방안 연구)

  • Choi, Hyun-Soek;Shin, Mi-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.192-200 / 2008
  • Multimodal biometrics is a method for personal authentication and verification using two or more types of biometric data. RBF-based score-level fusion applies a pattern recognition algorithm to multimodal biometrics, seeking the optimal decision boundary for classifying score feature vectors, each of which consists of the matching scores obtained from several unimodal biometric systems for a sample. In this setting, all matching scores are assumed to have the same reliability. However, recent research reports that the quality of the input sample affects the biometric result. Matching scores with low reliability caused by low-quality samples are currently not considered in pattern recognition modelling for multimodal biometrics. To solve this problem, we propose an RBF-based score-level fusion approach that uses quality information of the input biometric data to adjust the decision boundary. The proposed method with quality information showed better recognition performance than both unimodal biometrics and the usual RBF-based score-level fusion without quality information.
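
One simple way to let sample quality influence a score-level decision is to append the quality values to the matching-score vector and evaluate Gaussian RBF units on the augmented vector. The sketch below assumes exactly that; it is not the paper's trained model. The centers, the `gamma` width, and the "genuine minus impostor" decision rule are hypothetical choices for illustration.

```python
import math

def rbf(x, center, gamma):
    """Gaussian radial basis function exp(-gamma * ||x - center||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, center)))

def fused_decision(scores, qualities, genuine_centers, impostor_centers, gamma=5.0):
    """Return a signed fusion value; positive means 'accept'.

    scores:    matching scores from each unimodal system
    qualities: quality estimates of each input sample (assumed to lie in [0, 1])
    centers:   prototype vectors in the same augmented space (scores + qualities)
    """
    x = list(scores) + list(qualities)  # quality-augmented feature vector
    genuine = sum(rbf(x, c, gamma) for c in genuine_centers) / len(genuine_centers)
    impostor = sum(rbf(x, c, gamma) for c in impostor_centers) / len(impostor_centers)
    return genuine - impostor
```

Because the quality values are part of the input, the boundary between accept and reject can shift with sample quality instead of treating every matching score as equally reliable.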

The Usefulness of Multiple-Choice Name Matching Test in Aphasic Patients (실어증 환자에서 선다형 이름 맞추기 검사의 유용성)

  • Min, Yong;Ko, Myoung-Hwan;Seo, Jeong-Hwan
    • Phonetics and Speech Sciences / v.4 no.3 / pp.137-142 / 2012
  • The aim of this study is to investigate the usefulness of the multiple-choice name matching test (MC-NMT) in adults with aphasia by comparing it with the Korean version of the Boston Naming Test (K-BNT) and subsets of the Korean version of the Western Aphasia Battery (K-WAB). Thirty-nine patients with aphasia participated in the study. All patients were examined with the K-BNT, MC-NMT and K-WAB. The MC-NMT consisted of the 30 original BNT object stimuli, each presented with four response choices (written words) of similar frequency, including one correct and three incorrect responses. Cards containing the drawings were presented to the patient one at a time. An item was passed if the patient chose the correct response within 10 seconds. We analyzed two groups: the total group and a low K-BNT group (15 points or below). We evaluated the correlation of the K-BNT and MC-NMT scores with the production, naming, repetition, comprehension, reading and writing scores in subsets of the K-WAB. In the total group, there was a highly positive correlation between the K-BNT score and the naming score of the K-WAB, whereas the MC-NMT was highly correlated with the reading scores of the K-WAB. In the low K-BNT group, the K-BNT correlated strongly with the production, naming and repetition scores of the K-WAB, which suggests that the K-BNT reflects motor language function. The MC-NMT, in contrast, correlated strongly with the comprehension, reading and writing scores of the K-WAB, which suggests that it reflects sensory language function. We suggest that the combination of the K-BNT and the newly developed MC-NMT will be useful for evaluating speech functions in aphasic patients.

Comparison of short-term outcomes of open and laparoscopic assisted pancreaticoduodenectomy for periampullary carcinoma: A propensity score-matched analysis

  • Utpal Anand;Rohith Kodali;Kunal Parasar;Basant Narayan Singh;Kislay Kant;Sitaram Yadav;Saad Anwar;Abhishek Arora
    • Annals of Hepato-Biliary-Pancreatic Surgery / v.28 no.2 / pp.220-228 / 2024
  • Backgrounds/Aims: Postoperative pancreatic fistula is the key worry in the ongoing debate about the safety and effectiveness of total laparoscopic pancreaticoduodenectomy (TLPD). Laparoscopic-assisted pancreaticoduodenectomy (LAPD), a hybrid approach combining laparoscopic resection and anastomosis with a small incision, is an alternative to TLPD. This study compares the short-term outcomes and oncological efficacy of LAPD vs. open pancreaticoduodenectomy (OPD). Methods: A retrospective analysis was performed on the data of all patients who underwent LAPD or OPD for periampullary carcinoma at a tertiary care center in Northeast India from July 2019 to August 2023. A total of 30 LAPDs and 30 OPDs were compared after 1:1 propensity score matching. Demographic data, intraoperative and postoperative data (30 days), and pathological data were compared. Results: The study included a total of 93 patients: 30 underwent LAPD and 62 underwent OPD. After propensity score matching, the matched cohort included 30 patients in each group. The LAPD group showed several advantages over the OPD group, including a shorter incision length, reduced postoperative pain, earlier initiation of oral feeding, and shorter hospital stays. LAPD was not found to be inferior to OPD in terms of pancreatic fistula incidence (Grade B, 30.0% vs. 33.3%), achieving R0 resection (100% vs. 93.3%), and the number of lymph nodes harvested (12 vs. 14, p = 0.620). No significant differences in blood loss, short-term complications, pathological outcomes, readmissions, and early (30-day) mortality were observed between the two groups. Conclusions: LAPD has safety, technical feasibility, and short-term oncological efficacy comparable to those of OPD.
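
For readers unfamiliar with 1:1 propensity score matching, the following is a minimal sketch of one common variant: greedy nearest-neighbour matching without replacement and with a caliper. The abstract does not state which matching algorithm or caliper the study used, so the function name, the caliper value, and the processing order are assumptions. The propensity scores are taken as already estimated (for example, by logistic regression on baseline covariates).

```python
def match_one_to_one(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching on propensity score, without replacement.

    treated, control: dicts mapping a patient id to an estimated propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used
    pairs = []
    # Process treated units in a fixed order so the result is reproducible.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        # Only accept the pair if it falls within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs
```

Treated units with no control inside the caliper stay unmatched, which is why a matched cohort can be smaller than the original sample.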

Matching Pursuit Sinusoidal Modeling with Damping Factor (Damping 요소를 첨가한 매칭 퍼슈잇 정현파 모델링)

  • Jeong, Gyu-Hyeok;Kim, Jong-Hark;Lim, Joung-Woo;Joo, Gi-Ho;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.1 / pp.105-113 / 2007
  • In this paper, we propose matching pursuit with damping factors, a new sinusoidal model that improves on matching pursuit for codecs based on the sinusoidal model. The proposed model defines damping factors using the correlation of parameters between the current and adjacent frames, estimates sinusoidal parameters more accurately in the analysis frame by applying matching pursuit according to the damping factor, and synthesizes the final signal. This makes it possible to model the signal efficiently without interpolation schemes. The proposed sinusoidal model yields better speech quality, without additional delay, than the conventional sinusoidal model with interpolation methods. We compare the performance of our model with that of matching pursuit using interpolation methods in terms of SNR (signal-to-noise ratio), MOS (Mean Opinion Score), LR (Itakura-Saito likelihood ratio), and CD (cepstral distance).
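
As background, generic matching pursuit over a dictionary of damped sinusoids can be sketched as follows. This is plain matching pursuit, not the proposed model: the paper derives damping factors from adjacent frames, whereas this sketch simply places atoms with a few fixed damping values in the dictionary. The function names, the atom form `exp(-damping * n) * cos(freq * n)`, and the stopping tolerance are assumptions.

```python
import math

def damped_atom(freq, damping, length):
    """Unit-norm atom exp(-damping * n) * cos(freq * n) for n = 0..length-1."""
    raw = [math.exp(-damping * n) * math.cos(freq * n) for n in range(length)]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

def matching_pursuit(signal, atoms, n_iter=10, tol=1e-12):
    """Greedily decompose `signal` over unit-norm `atoms`.

    Returns (picks, residual), where picks is a list of (atom_index, coefficient).
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Inner product of the residual with every atom.
        coefs = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(coefs[k]))
        if abs(coefs[best]) < tol:
            break  # nothing left that the dictionary can explain
        picks.append((best, coefs[best]))
        # Subtract the selected component from the residual.
        residual = [r - coefs[best] * a for r, a in zip(residual, atoms[best])]
    return picks, residual
```

A usage example under the same assumptions: build a small dictionary with `atoms = [damped_atom(f, d, 32) for f in (0.3, 0.9) for d in (0.0, 0.05)]` and pass a frame of samples as `signal`; the returned picks give the selected frequency and damping pairs by index.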

Optimization Driven MapReduce Framework for Indexing and Retrieval of Big Data

  • Abdalla, Hemn Barzan;Ahmed, Awder Mohammed;Al Sibahee, Mustafa A.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.1886-1908 / 2020
  • With technical advances, the amount of big data is increasing day by day, to the point that traditional software tools struggle to handle it. Additionally, the presence of imbalanced data in big data is a major concern for researchers. To ensure effective management of big data and to deal with imbalanced data, this paper proposes a new indexing algorithm for retrieving big data in the MapReduce framework. In the mappers, data clustering is performed with the Sparse Fuzzy-c-means (Sparse FCM) algorithm. The reducer combines the clusters generated by the mappers and again performs data clustering with the Sparse FCM algorithm. Two-level query matching is then performed to locate the requested data: the first level determines the cluster, and the second level accesses the requested data within it. The data are ranked using the proposed Monarch chaotic whale optimization algorithm (M-CWOA), which is designed by combining Monarch butterfly optimization (MBO) [22] and the chaotic whale optimization algorithm (CWOA) [21]. Here, the Parametric Enabled-Similarity Measure (PESM) is adapted for measuring the similarity between two datasets. The proposed M-CWOA outperformed other methods, with a maximal precision of 0.9237, recall of 0.9371, and F1-score of 0.9223.
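
The two-level query matching described in this abstract can be sketched as a nearest-centroid lookup followed by a nearest-item lookup inside the chosen cluster. This is only an illustration: Euclidean distance stands in for the paper's PESM similarity, and the function name and data layout are hypothetical.

```python
import math

def two_level_match(query, clusters):
    """Return the item closest to `query` inside the cluster with the closest centroid.

    clusters: list of (centroid, items) pairs; centroids and items are tuples of floats.
    """
    # Level 1: pick the cluster whose centroid is nearest to the query.
    _, items = min(clusters, key=lambda c: math.dist(query, c[0]))
    # Level 2: search only within that cluster.
    return min(items, key=lambda item: math.dist(query, item))
```

The benefit is that the second-level search touches only one cluster instead of the full dataset, at the cost of possibly missing a closer item that sits in a neighbouring cluster.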

Appearance-Order-Based Schema Matching

  • Ding, Guohui;Cao, Keyan;Wang, Guoren;Han, Dong
    • Journal of Computing Science and Engineering / v.8 no.2 / pp.94-106 / 2014
  • Schema matching is widely used in many applications, such as data integration, ontology merging, data warehouses, and dataspaces. In this paper, we propose a novel matching technique based on the order in which attributes appear in the schema structure of query results. The appearance order reflects how important an attribute is to the user examining the query results. The core idea of our approach is to collect statistics about the appearance order of attributes from the query logs and use them to find correspondences between attributes in the schemas to be matched. As a first step, we employ a matrix to structure the statistics on the appearance order of attributes. Then, two scoring functions are considered to measure the similarity of the collected statistics. Finally, a traditional algorithm is employed to find the mapping with the highest score. Our approach can be seen as a complementary member of the family of existing matchers, and can also be combined with them to obtain more accurate results. We validate our approach with an experimental study, the results of which demonstrate that it is effective and performs well.
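
The final step, finding the attribute mapping with the highest total score, is an assignment problem. The abstract does not name the algorithm used, so the sketch below makes its own choice: it enumerates all one-to-one mappings, which is only practical for small schemas, and assumes a square similarity matrix. A polynomial-time solver such as the Hungarian algorithm would take its place at scale.

```python
from itertools import permutations

def best_mapping(sim):
    """Find the one-to-one mapping that maximizes total similarity.

    sim[i][j]: similarity between attribute i of schema A and attribute j of schema B.
    Returns (pairs, total), where pairs is a list of (i, j) index pairs.
    """
    n = len(sim)

    def total(perm):
        return sum(sim[i][perm[i]] for i in range(n))

    # Exhaustive search over all n! assignments (fine for a handful of attributes).
    best = max(permutations(range(n)), key=total)
    return list(enumerate(best)), total(best)
```

With `sim = [[0.9, 0.8], [0.8, 0.1]]`, picking the single best cell first would pair (0, 0) and leave (1, 1) for a total of 1.0, while the exhaustive search pairs (0, 1) and (1, 0) for a higher total, which is why a global search is preferred over a greedy one.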

A New Perspective to Stable Marriage Problem in Profit Maximization of Matrimonial Websites

  • Bhatnagar, Aniket;Gambhir, Varun;Thakur, Manish Kumar
    • Journal of Information Processing Systems / v.14 no.4 / pp.961-979 / 2018
  • For many years, matching in a bipartite graph has been widely used in various assignment problems, such as the stable marriage problem (SMP). As an application of bipartite matching, the stable marriage problem is defined over equally sized sets of men and women, and seeks a stable matching in which each person is assigned a partner of the opposite gender according to their preferences. The classical SMP proposed by Gale and Shapley uses a preference list for each individual, which is infeasible in real-world applications with a large population of men and women, such as matrimonial websites. In this paper, we propose an enhancement to the SMP that computes a weighted score for the users registered at matrimonial websites. The enhancement is formulated as profit maximization for matrimonial websites in terms of their ability to provide a suitable match for their users, which leads to a combinatorial optimization problem. We propose greedy and genetic algorithm based approaches to solve this optimization problem, and show that the genetic algorithm based approaches outperform the existing Gale-Shapley algorithm on a dataset crawled from matrimonial websites.
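
For reference, the classical Gale-Shapley algorithm that serves as the baseline here can be sketched as below. This is the textbook proposer-optimal version, not the weighted-score enhancement proposed in the paper; it assumes equally sized groups and complete preference lists, and the function and variable names are illustrative.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as {proposer: receiver}.

    proposer_prefs / receiver_prefs: {person: [preferred partners, best first]}.
    Assumes both sides have the same size and every list ranks the whole other side.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index each proposer will try
    engaged = {}                                  # receiver -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p                # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)                # rejected: proposer tries the next choice
    return {p: r for r, p in engaged.items()}
```

Each person must supply a full ranking of the other side, which is the requirement the abstract describes as infeasible for large matrimonial websites.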