• Title/Abstract/Keywords: lowest frequency words

Search results: 11 items; processing time: 0.018 seconds

Word Sense Disambiguation based on Concept Learning with a focus on the Lowest Frequency Words

  • 김동성;최재웅
    • 한국언어정보학회지:언어와정보 / Vol. 10, No. 1 / pp. 21-46 / 2006
  • This study proposes a Word Sense Disambiguation (WSD) algorithm based on concept learning, with special emphasis on statistically meaningful lowest-frequency words. Previous work on WSD typically relies on collocation frequencies and their probabilities. Such probability-based approaches tend to ignore the lowest-frequency words, which can be meaningful in context. In this paper, we present an algorithm that extracts and uses meaningful lowest-frequency words in WSD. The learning method is adapted from the Find-Specific algorithm of Mitchell (1997), in which the search proceeds from the most specific predefined hypothesis spaces to more general ones. In our model, this algorithm first finds contexts with the most specific classifiers and then moves to more general ones. We build a small set of seed data and apply it to relatively large test data. Following the algorithm in Yarowsky (1995), the classified test data are exhaustively added to the seed data, thus expanding it. However, this can introduce considerable noise into the seed data. We therefore introduce the maximum a posteriori (MAP) hypothesis, based on the Bayes assumption, to check the new seed data for noise. We use the Naive Bayes classifier and show that applying the Find-Specific algorithm improves the accuracy of WSD.

  • PDF
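
The specific-to-general search described in the abstract above can be illustrated with the textbook Find-S procedure from Mitchell (1997). This is a minimal sketch, not the authors' implementation: the context features (left word, right word, part of speech) and the toy seed data are hypothetical, and the seed expansion and Naive Bayes noise check are not shown.

```python
WILDCARD = "?"  # matches any value of an attribute


def find_s(examples):
    """Return the most specific hypothesis that covers every positive example.

    `examples` is an iterable of (features, is_positive) pairs, where
    `features` is a tuple of attribute values.
    """
    hypothesis = None  # None stands for the empty (most specific) hypothesis
    for features, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(features)  # first positive example, copied as-is
            continue
        for i, value in enumerate(features):
            if hypothesis[i] != value:
                hypothesis[i] = WILDCARD  # generalize only where needed
    return hypothesis


def matches(hypothesis, features):
    """Check whether a hypothesis covers a feature tuple."""
    return all(h == WILDCARD or h == f for h, f in zip(hypothesis, features))


# Hypothetical seed contexts for one sense of an ambiguous word:
# (left word, right word, part of speech)
seed = [
    (("money", "deposit", "noun"), True),
    (("money", "withdraw", "noun"), True),
    (("river", "fish", "noun"), False),
]
learned = find_s(seed)
```

With this toy seed the hypothesis keeps the attributes shared by both positive contexts and generalizes only the one that differs.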

An Analysis on the Korean Language for Optimum Transmission of Hangul Code

  • 홍완표
    • 한국전자통신학회논문지 / Vol. 10, No. 1 / pp. 33-38 / 2015
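  • This paper studies the Hangul letters (jamo) that must be considered in order to optimize the transmission of Hangul code. The letters were analyzed on the basis of the Hangul characters that make up the Korean language. The analysis covered three sets of letters. The first set was the 24 letters of the Unified Hangul Orthography (한글맞춤법 통일안). The second was the 28 letters of the standard two-set (Dubeolsik) keyboard layout. The third was a total of 54 letters: those of the Unified Hangul Orthography plus the compound letters. The usage frequency of each letter was analyzed for each of the three sets. The Korean-language data of the National Institute of Korean Language contain 58,437 Korean words, and these words are composed of a total of 1,540 Hangul characters. In terms of usage frequency, for the first set the most frequent consonant was "ㅇ" and the least frequent was "ㅋ", while the most frequent vowel was "ㅏ" and the least frequent was "ㅑ". For the second set, the consonant results were the same as for the first set, and the most frequent vowel was "ㅏ" and the least frequent was "ㅒ". For the third set, the most frequent consonant was "ㄱ" and the least frequent was "ㄽ", while the most frequent vowel was "ㅏ" and the least frequent was "ㅞ".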
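
A letter-frequency count of this kind can be sketched with the arithmetic decomposition of precomposed Hangul syllables defined by Unicode. This is only an illustration: it counts the Unicode initial, medial, and final jamo as they are, which does not necessarily correspond to the 24-, 28-, or 54-letter sets used in the paper, and the sample word is hypothetical.

```python
from collections import Counter

# Jamo orderings used by the Unicode Hangul Syllables block (U+AC00..U+D7A3).
LEADS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")          # 19 initial consonants
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")      # 21 medial vowels
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # none + 27 finals

SYLLABLE_BASE = 0xAC00
SYLLABLE_COUNT = len(LEADS) * len(VOWELS) * len(TAILS)  # 11,172


def decompose(syllable):
    """Split one precomposed Hangul syllable into (lead, vowel, tail)."""
    index = ord(syllable) - SYLLABLE_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    lead, rest = divmod(index, len(VOWELS) * len(TAILS))
    vowel, tail = divmod(rest, len(TAILS))
    return LEADS[lead], VOWELS[vowel], TAILS[tail]


def jamo_counts(words):
    """Count jamo over all Hangul syllables in the given words."""
    counts = Counter()
    for word in words:
        for ch in word:
            if SYLLABLE_BASE <= ord(ch) < SYLLABLE_BASE + SYLLABLE_COUNT:
                # Skip the empty string that marks "no final consonant".
                counts.update(j for j in decompose(ch) if j)
    return counts


counts = jamo_counts(["한글"])
```

Calling `counts.most_common()` on a real word list would give the kind of ranking the abstract reports, under the counting assumptions stated above.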

Non-equal DC link Voltages in a Cascaded H-Bridge with a Selective Harmonic Mitigation-PWM Technique Based on the Fundamental Switching Frequency

  • Moeini, Amirhossein;Iman-Eini, Hossein;Najjar, Mohammad
    • Journal of Power Electronics / Vol. 17, No. 1 / pp. 106-114 / 2017
  • In this paper, the Selective Harmonic Mitigation-PWM (SHM-PWM) method is used in single-phase and three-phase Cascaded H-Bridge (CHB) inverters in order to fulfill different power quality standards such as EN 50160, CIGRE WG 36-05, IEC 61000-3-6 and IEC 61000-2-12. Non-equal DC link voltages are used to increase the degrees of freedom of the proposed SHM-PWM technique. In addition, it is shown that the obtained solutions are continuous and free of sudden changes. As a result, the look-up tables can be significantly reduced. The proposed three-phase modulation method can mitigate harmonics up to the 50th in the output voltage, while each switch switches just once per fundamental period. In other words, the switching frequency of the power switches is limited to 50 Hz, which is the lowest switching frequency that can be achieved in multilevel converters when the optimal selective harmonic mitigation method is employed. In single-phase mode, the proposed method can successfully mitigate harmonics up to the 50th, where the switching frequency is 150 Hz. Finally, the validity of the proposed method is verified by simulations and experiments on a 9-level CHB inverter.
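
The abstract does not give the equations, so the following is only a sketch of the standard Fourier expression for a quarter-wave-symmetric staircase waveform with one switching angle per cell at the fundamental frequency. It shows how non-equal DC link voltages enter the harmonic amplitudes as extra degrees of freedom. The function name and all numeric values are hypothetical, and the paper's optimization procedure is not reproduced.

```python
import math


def harmonic_amplitude(n, dc_voltages, angles_rad):
    """Amplitude of the n-th harmonic of a quarter-wave-symmetric staircase.

    Each cell k contributes a quasi-square wave of height dc_voltages[k]
    that turns on at angles_rad[k], so for odd n:
        V_n = 4 / (n * pi) * sum_k V_k * cos(n * theta_k)
    Even harmonics vanish because of half-wave symmetry.
    """
    if len(dc_voltages) != len(angles_rad):
        raise ValueError("one switching angle is needed per DC link voltage")
    if n % 2 == 0:
        return 0.0
    total = sum(v * math.cos(n * a) for v, a in zip(dc_voltages, angles_rad))
    return 4.0 / (n * math.pi) * total


# Hypothetical 4-cell (9-level) example with non-equal DC links, in per unit.
dc_links = [1.0, 0.9, 1.1, 1.0]
angles = [math.radians(a) for a in (8.0, 22.0, 40.0, 61.0)]
spectrum = {n: harmonic_amplitude(n, dc_links, angles) for n in range(1, 50, 2)}
```

A mitigation method would then search for the angles and DC link voltages that keep each entry of `spectrum` within the limits of the relevant standard while holding the fundamental at its target value.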

The prosodic characters of particles in Korean -- focusing on the read speech --

  • 전은;이숙향
    • 대한음성학회지:말소리 / No. 37 / pp. 73-85 / 1999
  • The prosodic characteristics of Korean particles in read speech were examined in this paper, based on the K-ToBI labeling system, in order to see whether they are prosodically weak forms like function words in English. Acoustic measurements and statistical analyses were carried out, focusing on the distribution of particles over a variety of prosodic positions, prosodic positional effects on the phonetic realization of particles, and the acoustic strength of particles compared to that of their surrounding syllables. The particles were distributed rather equally over all 4 prosodic positions, with the highest frequency at IP-medial/AP-final position and the lowest at IP-medial/AP-medial position, except that the topic marker 'Un/nUn' showed a preference for IP-final/AP-final position. There was a significant prosodic positional effect on the duration and F0 of the particles. Duration was the longest at IP-final/AP-final position and, interestingly, at IP-medial/AP-medial position, while F0 was the highest at IP-final/AP-medial position, as expected. The comparison of the acoustic properties of the particles with those of neighboring syllables showed that duration was generally significantly longer and that energy also showed larger values, if not significantly so, in particles, suggesting that particles in Korean are not prosodically weaker in the way function words in English are.

  • PDF

Analysis of Korean Language to Optimize the Hangul Character Coding for Information Processing and Communication

  • 홍완표
    • 한국전자통신학회논문지 / Vol. 10, No. 3 / pp. 375-380 / 2015
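  • This paper studies the Korean language in order to optimize the encoding of Hangul used for information processing and transmission. It analyzes the composition of the Hangul characters that make up Korean and the usage frequency of each of those characters. It then compares the resulting composition with the Hangul characters encoded in the Korean KS character standard and in Unicode, the international character standard. The Korean data used for the study were taken from the National Institute of Korean Language report "현대국어사용빈도조사결과" (results of a survey of usage frequency in modern Korean). The report contains a total of 58,437 Korean words. The analysis showed that these 58,437 words are composed of a total of 1,540 Hangul characters. Of these 1,540 characters, the most frequently used was "다", which accounted for 15% of total usage. The least frequently used was "휫", which accounted for 0.00003% of total usage. The number of Hangul characters that make up Korean was found to be about 7.2 times smaller than the number of Hangul characters in the Unicode Hangul code and about 1.5 times smaller than the number in the KS X 1001 Hangul code.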
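
The two ratios in the abstract can be checked with a short calculation. The repertoire sizes used here are not stated in the abstract and are supplied as assumptions: 11,172 precomposed syllables in the Unicode Hangul Syllables block and 2,350 Hangul syllables in KS X 1001.

```python
# Number of Hangul characters found in the corpus, as reported in the abstract.
USED_CHARACTERS = 1_540

# Assumed repertoire sizes (not given in the abstract).
UNICODE_HANGUL_SYLLABLES = 19 * 21 * 28  # leads x vowels x tails = 11,172
KS_X_1001_HANGUL_SYLLABLES = 2_350

unicode_ratio = UNICODE_HANGUL_SYLLABLES / USED_CHARACTERS
ks_ratio = KS_X_1001_HANGUL_SYLLABLES / USED_CHARACTERS

print(f"Unicode / used:   {unicode_ratio:.2f}")
print(f"KS X 1001 / used: {ks_ratio:.2f}")
```

Under these assumptions the ratios come out at roughly 7.25 and 1.53, consistent with the "about 7.2 times" and "about 1.5 times" reported above.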

N1-N2 Audiograms of the Guinea Pig Cochlea

  • 장순석
    • 대한의용생체공학회:의공학회지 / Vol. 16, No. 1 / pp. 77-84 / 1995
  • N1 and N2 gross neural action potentials were measured from the round window of the guinea pig cochlea at the onset of the acoustic stimuli. N1-N2 audiograms were made by regulating stimulus intensities in order to produce constant N1-N2 potentials as criteria for different input tone pip frequencies. The lowest threshold was measured with an input tone pip of 15 dB SPL in intensity and 12 kHz in frequency when the animal was in normal physiological condition. The procedure of the experimental measurements is explained in detail. This experimental approach is very useful for the investigation of cochlear function. Both nonlinear and active functions of the cochlea can be monitored by N1-N2 audiograms. Key words: Guinea Pig, Cochlea, N1 and N2 Gross Neural Action Potential, N1-N2 audiogram.

  • PDF

A Research on Difference Between Consumer Perception of Slow Fashion and Consumption Behavior of Fast Fashion: Application of Topic Modelling with Big Data

  • YANG, Oh-Suk;WOO, Young-Mok;YANG, Yae-Rim
    • 융합경영연구 / Vol. 9, No. 1 / pp. 1-14 / 2021
  • Purpose: The article deals with the proposition that consumers' fashion consumption behavior will still follow the consumption behavior of fast fashion, despite recognizing the importance of slow fashion. Research design, data and methodology: The research model used to verify this proposition is topic modelling with big data, including unstructured textual data. We combined 5,506 news articles about fast fashion and slow fashion posted on the Naver news search platform during the 2003-2019 period, derived high-frequency words, and found topics using an LDA model. Based on these, we examined consumers' perception of slow fashion and their consumption behavior through Topic Network analysis. Results: (1) Looking at the annual number of articles collected, consumers' interest in slow fashion mainly began in 2005 and showed a steady increase up to 2019. (2) Term Frequency analysis showed that the keywords for slow fashion have the lowest frequency, with consumers' consumption patterns continuing to center on 'brand.' (3) Each topic's weight in the articles showed that 'social value', which includes slow fashion, ranked sixth among the 9 topics, with low linkage to other topics. (4) Lastly, 'brand' and 'fashion trend' were key topics, and the topic 'social value' accounted for a low proportion. Conclusion: Slow fashion was not a considerable factor in consumption behavior. Consumption patterns in the fashion sector are still dominated by general consumption patterns centered on brands and fast fashion.
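
The term-frequency step mentioned in the abstract can be sketched as a plain word count over the collected articles. This is an illustration only: the tokenization rule and the three sample headlines are hypothetical, the study's actual preprocessing (for example, Korean morphological analysis) is not described in the abstract, and the LDA step is not shown.

```python
import re
from collections import Counter


def term_frequencies(documents):
    """Count how often each lowercase alphabetic token occurs in the documents."""
    counts = Counter()
    for doc in documents:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return counts


# Hypothetical stand-ins for news article texts.
articles = [
    "Fast fashion brand launches new line",
    "Brand sales rise as fast fashion grows",
    "Slow fashion gains attention",
]
tf = term_frequencies(articles)
lowest = min(tf.values())
rare_terms = sorted(term for term, count in tf.items() if count == lowest)
```

Comparing the counts for terms such as `brand` and `slow` is the kind of evidence the abstract's result (2) refers to.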

Patent Technology Trends of Oral Health: Application of Text Mining

  • Hee-Kyeong Bak;Yong-Hwan Kim;Han-Na Kim
    • 치위생과학회지 / Vol. 24, No. 1 / pp. 9-21 / 2024
  • Background: The purpose of this study was to use text network analysis and topic modeling to identify the relationships among keywords in patent information related to oral health, and then to extract and visualize latent topics. By examining major keywords and specific subjects, the study sought to understand the technological trends in oral health-related innovations. It also aimed to serve as foundational material suggesting directions for technological advancement in dentistry and dental hygiene. Methods: The data used in this study consisted of information registered over a 20-year period up to July 31, 2023, obtained from the patent information retrieval service KIPRIS. A total of 6,865 patent titles related to keywords such as "dentistry," "teeth," and "oral health" were collected through the searches. The research tools included a custom program written in Python 3.10 for the research objectives. This program was used for keyword frequency analysis, semantic network analysis, and Latent Dirichlet Allocation for topic modeling. Results: In the analysis of connection centrality among the 50 most frequent words, "method," "tooth," and "manufacturing" displayed the highest centrality, while "active ingredient" had the lowest. In the topic modeling results, the "implant" topic had the largest share at 22.0%, while the topics "devices and materials for oral health" and "toothbrushes and oral care" had the lowest proportions at 5.5% each. Conclusion: Technologies concerning methods and implants are continually being researched in patents related to oral health, while there is comparatively less technological development in devices and materials for oral health. This study is expected to be a valuable resource for uncovering potential themes from a large volume of patent titles and suggesting research directions.
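
The abstract does not say which centrality measure was used, so the sketch below assumes degree centrality on a keyword co-occurrence network, where two words are linked if they appear in the same patent title. The function and the sample titles are hypothetical and do not reproduce the authors' program.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(titles):
    """Degree centrality of words in a title co-occurrence network.

    Two words are connected if they occur in the same title. Centrality is
    the number of distinct neighbors divided by (number of nodes - 1).
    """
    neighbors = defaultdict(set)
    for title in titles:
        words = sorted(set(title.lower().split()))
        for a, b in combinations(words, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    node_count = len(neighbors)
    if node_count < 2:
        return {}
    return {word: len(linked) / (node_count - 1) for word, linked in neighbors.items()}


# Hypothetical patent titles.
titles = [
    "tooth implant method",
    "implant manufacturing method",
    "toothbrush manufacturing",
]
centrality = degree_centrality(titles)
least_central = min(centrality, key=centrality.get)
```

In this toy network `toothbrush` has a single neighbor and so ends up with the lowest centrality, in the same way that "active ingredient" is reported as the least central keyword above.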

A Comparative Study on the Service Characteristics for Transferring Process of High-Speed Rail and Domestic Airline Systems by Using Structural Equation Modeling

  • 김태호;정광섭;박제진
    • 대한토목학회논문집 / Vol. 29, No. 2D / pp. 183-190 / 2009
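  • For high-speed rail to achieve sustained performance and stability, it must raise passenger satisfaction so that passengers keep using it. At the same time, it must gain an advantage over its competing mode, domestic air travel, through a comparison of service characteristics with those experienced by domestic airline customers. A survey was conducted to develop a structural equation model of the effects of service on users, and reliability analysis, correlation analysis, and factor analysis were carried out as preliminary analyses to establish the validity of the measurement indicators and to set the research hypotheses. The analysis of service characteristics across the competing modes with the structural equation model showed that high-speed rail (KTX) needs to improve users' internal services (waiting, travel), whereas domestic airlines need to improve users' external services (access, transfer). In other words, for high-speed rail (KTX), improvement is needed in waiting facilities and in travel, which is currently constrained on some sections. For domestic airlines, access to airports and transfers to other modes of transport were found to be inconvenient, so service improvements in the access system and in transfers are judged to be necessary.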

Influence of Self-Differentiation and Acculturation on Marriage Satisfaction Among Immigrant Women by Residential Area

  • 이영분;이유경
    • 가정과삶의질연구 / Vol. 28, No. 1 / pp. 145-157 / 2010
  • This study explores how self-differentiation and acculturation among married immigrant women influence their marriage satisfaction, by residential area. The aim was to verify the level of self-differentiation and acculturation that married immigrant women attain in multicultural marriages. To this end, a questionnaire was widely distributed to women participating in education and other services at health and family support centers, multicultural family support centers, general social welfare centers, immigrant women's shelters, and Korean language classrooms located in Seoul, Gyeonggi, Chungcheong, Jeolla, and Gyeongsang. Data analysis involved frequency analysis, descriptive statistics, one-way ANOVA, and multiple regression analysis. In the descriptive statistics, two factors had the lowest averages: (1) interpersonal-relation differentiation, a sub-scale of self-differentiation, and (2) marginalization, a sub-scale of acculturation. In testing its hypotheses, the study obtained the following results. First, among demographic characteristics, there were differences in means for marriage duration, average monthly income, the frequency of meetings with the woman's parents-in-law and her own parents, and the average cost of supporting the woman's parents-in-law and her own parents. Second, among demographic characteristics, residing in farming and fishing villages had a negative influence on marriage satisfaction. This shows that women residing in cities, whether small, medium, or large, have higher marriage satisfaction. In addition, when testing whether self-differentiation influences marriage satisfaction, interpersonal-relation differentiation had a negative influence on marriage satisfaction. Third, as for the influence of acculturation on marriage satisfaction, only integration, a sub-scale of acculturation, had a positive effect. In other words, among the sub-scales of self-differentiation, interpersonal-relation differentiation had a negative influence on marriage satisfaction, and among the sub-scales of acculturation, integration had a positive influence. Based on these results, in order to increase interpersonal-relation differentiation as well as marriage satisfaction among immigrant women, the study suggests integrating the women's families with the nuclear and extended families in the communities where the women reside.