• Title/Abstract/Keywords: Lexical Statistics

Search results: 17 (processing time: 0.023 s)

A Quantitative Linguistic Study for the Appreciation of the Lexical Richness (Focusing on French Texts)

  • 배희숙
    • 음성과학 / Vol. 7, No. 3 / pp.139-149 / 2000
  • Studying language with quantitative methods is not a recent development. Lately, however, interest in quantitative linguistics has grown with the demand for communication between humans and between humans and machines, which requires transferring the system of natural language onto machines. This in turn requires quantitative linguistics, because the properties of small linguistic units and their structure cannot be grasped intuitively. Quantitative linguistics treats the internal structure of language through the relation between linguistic units and their quantitative properties, so the growing interest in it is natural. Korean linguists are also taking an interest in quantitative linguistics, although the field in Korea is not yet advanced in its level of statistical analysis. This study therefore shows how statistics can be applied in linguistics, using two texts written in French: Lovers of the Subway and Our life's A. B. C.

A Study on the Integrated Images in Exterior and Interior Design According to the Apartments Brand

  • 신미옥;김남효
    • 한국실내디자인학회: Conference Proceedings / 한국실내디자인학회 2008 Spring Conference Proceedings / pp.139-143 / 2008
  • Brand power is becoming more important in the apartment market, to the point that consumers considering a purchase tend to ask first what the apartment brand is. Although the brand name is the first factor to reach the consumer, message and image can be delivered to customers through visual factors. Since visual images can effectively remind customers of the brand image, construction companies should actively build a portfolio that synthesizes the brand image. This research investigates the consistency of images in exterior and interior design according to apartment brand. The lexical meanings of adjectives were used as the criteria for extracting images, survey items were selected, and responses were rated on a 5-point scale using the semantic differential (SD) method. The collected cases were analyzed with the statistics software SPSS for Windows release 11.0. This research offers a way to convey a visual image that fits the brand image and suggests a direction for further design.

A Study on the Diachronic Evolution of Ancient Chinese Vocabulary Based on a Large-Scale Rough Annotated Corpus

  • Yuan, Yiguo;Li, Bin
    • 아시아태평양코퍼스연구 / Vol. 2, No. 2 / pp.31-41 / 2021
  • This paper makes a quantitative analysis of the diachronic evolution of ancient Chinese vocabulary by constructing a large-scale rough annotated corpus and computing statistics over it. The texts from Si Ku Quan Shu (a collection of Chinese ancient books) are automatically segmented to obtain ancient Chinese vocabulary with time information, which is used to compute statistics on word frequency, standardized type/token ratio, and the proportions of monosyllabic and disyllabic words. The data analysis yields four findings. First, the high-frequency words in ancient Chinese are stable to a certain extent. Second, there is no obvious disyllabic trend in ancient Chinese vocabulary. Third, the Northern and Southern Dynasties (420-589 AD) and the Yuan Dynasty (1271-1368 AD) are probably the two periods with the most abundant vocabulary in ancient Chinese. Finally, the high-frequency words unique to each dynasty are mainly official titles with real power. These findings break away from the qualitative methods of traditional research on Chinese language history and instead use quantitative methods to draw macroscopic conclusions from a large-scale corpus.
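
The standardized type/token ratio mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition (the mean of the plain type/token ratio over consecutive, equally sized chunks of tokens). The function names, the default chunk size of 1000, and the rule of dropping a trailing partial chunk are assumptions made for illustration, not details taken from the paper.

```python
def type_token_ratio(tokens):
    """Plain TTR: number of distinct tokens divided by total tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def standardized_ttr(tokens, chunk_size=1000):
    """Mean TTR over consecutive full chunks of `chunk_size` tokens.

    A trailing partial chunk is ignored (an assumption of this sketch).
    If the text is shorter than one chunk, the plain TTR is returned.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    n_chunks = len(tokens) // chunk_size
    if n_chunks == 0:
        return type_token_ratio(tokens)
    ratios = [
        type_token_ratio(tokens[i * chunk_size:(i + 1) * chunk_size])
        for i in range(n_chunks)
    ]
    return sum(ratios) / n_chunks
```

Averaging over fixed-size chunks makes texts of different lengths comparable, because the plain ratio tends to fall as a text grows longer.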

Analysis of Impact Between Data Analysis Performance and Database

  • Kyoungju Min;Jeongyun Cho;Manho Jung;Hyangbae Lee
    • Journal of Information and Communication Convergence Engineering / Vol. 21, No. 3 / pp.244-251 / 2023
  • Engineering or humanities data are stored in databases and are often used for search services. While the latest deep-learning technologies, such as BART and BERT, are used for data analysis, humanities data still rely on traditional databases. Representative analysis methods include n-gram and lexical statistical extraction. However, when a database is used, it often limits the performance of the result calculations. This study presents an experimental process using MariaDB on a PC, a setup easily available in a laboratory, to analyze the impact of the database on data analysis performance. The findings show that the database becomes a bottleneck when analyzing large-scale text data, particularly beyond hundreds of thousands of records. To address this issue, a method is proposed for providing real-time humanities data analysis web services by leveraging the open-source database, with a focus on the Seungjeongwon-Ilgy, one of the largest datasets in the humanities.
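
As a rough illustration of the n-gram extraction named in this abstract, the sketch below counts n-grams in memory. It does not reproduce the paper's MariaDB setup, and the function names are hypothetical. The input can be a list of words or, for unsegmented text, a list of characters.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of `tokens` as tuples, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens, n):
    """Count how often each n-gram occurs in `tokens`."""
    return Counter(ngrams(tokens, n))
```

For example, `ngram_counts(list(text), 2).most_common(10)` would list the ten most frequent character bigrams of a string `text`.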

An Analysis of Speech Acts for Korean Using Support Vector Machines

  • 은종민;이성욱;서정연
    • 정보처리학회논문지B / Vol. 12B, No. 3 / pp.365-368 / 2005
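  • This study proposes a method for analyzing the speech acts of Korean dialogue using Support Vector Machines. We use the words, part-of-speech tags, and part-of-speech bigrams of an utterance as sentence features, and the context of the preceding utterance as contextual information. Appropriate features were selected using the chi-square statistic, and a support vector machine was trained on the selected features. The trained support vector machine classifier was then used to analyze the speech act of each utterance. In experiments on a corpus from the hotel reservation domain, the proposed system achieved an accuracy of about $90.54\%$.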
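
The chi-square feature selection step described in this abstract can be sketched as follows. This is a minimal illustration using the standard 2x2 contingency form of the statistic. The function names, the choice of scoring each feature by its maximum over classes, and the sample format are assumptions, since the abstract does not specify them. The SVM training step is not shown.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square for a 2x2 table.

    a: samples with the feature, in the class
    b: samples with the feature, not in the class
    c: samples without the feature, in the class
    d: samples without the feature, not in the class
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the feature or class is constant
    return n * (a * d - b * c) ** 2 / denom


def chi_square_scores(samples):
    """Score each feature by its maximum chi-square over all labels.

    `samples` is a list of (features, label) pairs, where `features`
    is an iterable of feature names present in that utterance.
    """
    n = len(samples)
    label_counts = Counter(label for _, label in samples)
    feature_counts = Counter()
    joint_counts = Counter()
    for features, label in samples:
        for f in set(features):
            feature_counts[f] += 1
            joint_counts[(f, label)] += 1
    scores = {}
    for f, fc in feature_counts.items():
        best = 0.0
        for label, lc in label_counts.items():
            a = joint_counts[(f, label)]
            b = fc - a
            c = lc - a
            d = n - fc - c
            best = max(best, chi_square(a, b, c, d))
        scores[f] = best
    return scores


def select_features(samples, k):
    """Return the k highest-scoring features (ties broken alphabetically)."""
    scores = chi_square_scores(samples)
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]
```

Features that occur equally often inside and outside every class score zero and are dropped first, which is the point of the selection step.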

Semantic Similarity Measures Between Words within a Document using WordNet

  • 강석훈;박종민
    • 한국산학기술학회논문지 / Vol. 16, No. 11 / pp.7718-7728 / 2015
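  • Semantic similarity between words can be applied in many fields, for example computational linguistics, artificial intelligence, and information processing. In this paper, we present a method that applies in-document word weights to measure the semantic similarity between words. The method considers edge distance and depth in WordNet, and computes the semantic similarity between words based on information within the document. The in-document information consists of word frequency and word sense frequency. For each word in a document, a weight is derived from its word frequency and sense frequency. The method is thus a similarity measure that combines three things: the distance between words, depth, and in-document word weights. We compared its performance experimentally with other existing methods, and the results showed an improvement over them. The approach also allows word weights to be computed separately for each document. Methods based simply on the shortest path, and existing methods that consider depth, either failed to represent the characteristics of the information properly or failed to combine other information properly. In this paper, we considered the shortest path, depth, and the information content of words within the document, and demonstrated improved performance.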
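
To make the distance and depth ingredients concrete, the sketch below computes edge distance and the well-known Wu-Palmer similarity on a tiny hand-made taxonomy. It is not the paper's measure: the abstract does not give the formula or the in-document weighting, so neither is reproduced here. The `PARENT` table and all function names are hypothetical stand-ins for WordNet.

```python
# Toy is-a hierarchy standing in for WordNet (hypothetical data).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes share no common ancestor")


def path_length(a, b):
    """Number of edges on the shortest path between a and b."""
    lcs = lowest_common_subsumer(a, b)
    return depth(a) + depth(b) - 2 * depth(lcs)


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

In this toy hierarchy "dog" and "cat" are two edges apart and share the deeper ancestor "animal", so they score higher than "dog" and "tree", which meet only at the root.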

The Analysis of Mechanism on Color Scheme and Emotional Affectivity Preferences according to Wood Material Finishing in the Cafe Images

  • 최진경;김주연
    • 한국생활환경학회지 / Vol. 24, No. 5 / pp.654-664 / 2017
  • The use of environmentally friendly finishing materials lets us create spaces where we can feel nature and find stability and peace in the city center. In this paper, we examined people's emotional responses to three café spaces in which wooden finishing materials are used in the spatial elements, which are changing in response to the demand for environmentally friendly spaces brought on by pollution of the living environment. First, we examined wood finishing materials and emotional vocabulary through a literature review and previous research. Second, the L*, a*, b* values and the sR, sG, sB values were extracted using a line spectrophotometer (Ci6X). Third, we conducted a 7-point scale questionnaire based on the 13 extracted pairs of emotional vocabulary. Using SPSS 21, we performed frequency analysis with descriptive statistics, cross-tabulation analysis by visiting purpose and intention, and factor analysis of the emotional vocabulary. The study found the following. First, the spatial analysis showed differences among CB (The Coffee Bean, 65%), SB (Starbucks, 40%), and HS (Hollys Coffee, 37%). Second, CB used colors similar to the wood color of its walls and furniture, whereas HS varied the color or brightness of the wood finish of its furniture. HS and SB showed a favorable use of the wood color scheme. Third, SB (26.3%) and HS (19.7%) were selected by taste. Fourth, the factor values differed on the 'local-exotic' item for CB and the 'dark-bright' item for SB. The atmosphere evaluation of wood finishing materials differed depending on the spatial factors and the color of the furniture. However, this study falls short in the accuracy of the ratio of wood finishing material applied to each spatial element and in the size of the survey. Given the growing interest in environmentally friendly spaces, further study of emotional image evaluation according to the ratio of wood finishing materials is needed.