• Title/Summary/Keyword: GO language

Design and Implementation of Indexing and Query Languages for an Efficient Retrieval of SGML Documents (SGML 문서의 효율적인 검색을 위한 색인 및 질의 언어의 설계 및 구현)

  • Lee, Bong-Sin;Lee, Gyeong-Ho;Go, Seung-Gyu;Choe, Yun-Cheol
    • The Transactions of the Korea Information Processing Society / v.6 no.11 / pp.2911-2921 / 1999
  • We present new methods for the efficient retrieval of SGML documents. We define IDDL (index database description language), which can describe information such as metadata, an indexing range, and the creation and manipulation of a database. In addition, we design IDQL (index database query language), which supports queries on metadata as well as on logical structure. A retrieval system based on IDDL and IDQL has been implemented and tested on a large number of documents. The experimental results show that the proposed method supports dynamic creation of an index database and provides a convenient retrieval environment.

An Approach to Automatically Generating Infobox for Wikipedia in Cross-languages through Translation and Webgraph (번역과 웹그래프를 활용한 언어 간 위키피디아 인포박스 자동생성 기법)

  • Kim, Eun-Kyung;Choi, DongHyun;Go, Eun-Bi;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 2011.10a / pp.9-15 / 2011
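  • Because the amount and content of information in Wikipedia differ across its language editions, interest is growing in research on extracting and integrating information across languages. In particular, the infobox, which serves as a summary of a Wikipedia article, is the most fundamental structured information in an article. This paper describes a technique for automatically generating infoboxes by 1) acquiring infoboxes from source-language resources and translating them into the target language, and 2) combining the translated output with information obtained from target-language data using a webgraph. The webgraph measures the relatedness between two different terms through Wikipedia's link structure and is used to identify content to add to the infobox. For cross-language infobox generation, the technique takes English infobox data as input and generates Korean infobox data. For evaluation, the output was compared against infobox data that already exist in Korean, yielding on average 40% precision and 83% recall. However, because the existing Korean infobox data cannot be regarded as containing everything an infobox should include, the precision of the experiment was judged to be relatively low for that reason. When humans manually assessed the suitability of the newly generated infobox data, the averages were 76% precision and 91% recall.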

Recognition of the appearance and fashion style of women who experience childbirth (출산을 경험한 여성의 외모에 대한 인식과 패션스타일)

  • Kim, Koh Woon
    • The Research Journal of the Costume Culture / v.29 no.3 / pp.453-470 / 2021
  • This study explores the experiences of modern Korean women who have given birth, examining their perceptions of body and appearance in everyday life and how fashion provides a means of self-expression. It uses focused ethnography, a qualitative research method, to observe women's behavior and language and to explore their life values, such as beliefs, attitudes, and behaviors regarding fashion style in everyday life. The purpose is to reveal the actual meaning of childbirth, the resulting change in appearance, and patterns of specific style expression. Describing the experiences of married women with children in Korea in vivid language should promote an in-depth understanding of their lives. The study surveyed 24 women (aged 25-40) who had experienced pregnancy and childbirth, with categories of early pregnancy, pre-birth, and post-birth parenting. The derived subcategories were "unfeasible pregnancy," "unpredictable and unprepared anxiety," "self-awareness of changing bodies," "pressure on healthy bodies," "opportunity to let go of pressure on appearance management," "pressure on hard parenting," and "experience of change in unmanaged areas." Pregnant women and women with children showed tastes and style preferences suited to their differentiated situations and roles, along with distinct perceptions of appearance.

Unethical Expressions in Messenger Talks for Interactive Artificial Intelligence (대화형 인공지능을 위한 메신저 대화의 비윤리적 표현 연구)

  • Yelin Go;Kilim Nam;Hyunju Song
    • Annual Conference on Human and Language Technology / 2022.10a / pp.22-25 / 2022
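  • This study is a foundational investigation aimed at preventing conversational AI from learning or generating unethical expressions. We collected unethical expressions at the word level and at the phrase level and above from messenger conversations and analyzed their characteristics. Unethical expressions comprise profanity, hate and discriminatory expressions, aggressive expressions, and sexual expressions. Profanity accounted for the largest share of unethical expressions in messenger conversations; it includes not only non-standard forms but also cases such as '존-' and '미치다' that must be judged in context. We examined the type-token ratio (TTR) of the most frequent profanity groups, the '존나', '씨발', and '새끼' families, and found that the '새끼' family had the highest TTR. Hate and discriminatory expressions made up a larger share than aggressive or sexual expressions, and those related to 'nationality/race' and 'gender' were especially frequent. Hate and discriminatory expressions appeared more often as phrase-level or longer units than as single words, and tended to span the whole conversation rather than being confined to a single sentence. We therefore confirmed that detecting hate and discriminatory expressions requires detection at the phrase level and above rather than at the word level.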
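
The type-token ratio mentioned in this abstract is conventionally the number of distinct word forms (types) divided by the total number of occurrences (tokens). The following is a minimal sketch of that standard definition, not the authors' implementation; the function name `type_token_ratio` and the placeholder tokens are assumptions made for illustration, and the abstract does not say how the paper tokenized its data.

```python
def type_token_ratio(tokens):
    """Return distinct types divided by total tokens (0.0 for empty input)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical placeholder tokens, not data from the paper.
sample = ["a", "b", "a", "c", "a", "b"]
print(type_token_ratio(sample))  # 3 types / 6 tokens = 0.5
```

Under this definition, a higher TTR for one group of expressions means that its occurrences are spread over more distinct variant forms.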

BERT-based Document Summarization model using Copying-Mechanism and Reinforcement Learning (복사 메커니즘과 강화 학습을 적용한 BERT 기반의 문서 요약 모델)

  • Hwang, Hyunsun;Lee, Changki;Go, Woo-Young;Yoon, Han-Jun
    • Annual Conference on Human and Language Technology / 2020.10a / pp.167-171 / 2020
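  • Document summarization is the task of producing a short document or sentence from a long source document while preserving its meaning. As deep-learning-based natural language processing has advanced, methods that apply sequence-to-sequence models, an end-to-end approach to natural language generation, to summary generation have been studied. This paper proposes a document summarization model that adds a copying mechanism and reinforcement learning to a natural language generation model built on BERT, which performs well across many natural language processing tasks. The copying mechanism copies words from the input sentence into the output sentence, improving performance on words such as proper nouns that are hard to learn from training data. Unlike supervised learning, which trains to raise the probability of the correct word, reinforcement learning trains to raise the reward score of the whole sentence obtained through sequential word generation, so it can be effective for natural language generation problems in which the final generated sentence matters more than the individual words. Experiments confirmed that the model with the copying mechanism and reinforcement learning achieves a higher ROUGE score than the existing BERT generation model.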

ICLAL: In-Context Learning-Based Audio-Language Multi-Modal Deep Learning Models (ICLAL: 인 컨텍스트 러닝 기반 오디오-언어 멀티 모달 딥러닝 모델)

  • Jun Yeong Park;Jinyoung Yeo;Go-Eun Lee;Chang Hwan Choi;Sang-Il Choi
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.514-517 / 2023
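  • This study addresses a multi-modal deep learning model for applying in-context learning to audio-language tasks. The goal is to develop a multi-modal deep learning model that learns, during training, representations in a form through which audio and text can communicate, and that can perform a variety of audio-text tasks. The model connects an audio encoder to a language encoder, and the language model is an autoregressive large language model with 6.7B and 30B parameters. The audio encoder is an audio feature extraction model pretrained with self-supervised learning. Because the language model is comparatively large, training uses the Frozen method, which fixes the language model's parameters and updates only the audio encoder's parameters. The training tasks are automatic speech recognition and abstractive summarization. After training, the model was tested on question answering. The results suggest that additional training is needed to generate correct answer sentences, but the model pretrained on speech recognition generated grammatically correct sentences that used keywords similar to the answer.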

Bakhtinian Reading of the Su-Hyeon Kim's Lines 2 Focused on Carnivalistic Component of Bakhtinian Dialogism Theory (김수현 대사의 바흐찐적 독해 2 바흐찐 대화주의 이론의 카니발적 요소를 중심으로)

  • Yoo, Jin-Hee
    • The Journal of the Korea Contents Association / v.18 no.10 / pp.631-643 / 2018
  • This study is part of the first series of papers to examine a single TV drama writer, Su-Hyeon Kim, on a diversified and systematic scale. The preceding study showed that the conceptual properties of Kim's lines correspond to Bakhtin's dialogism and its core, polyphony; this study analyzes her concrete method of creation through two carnivalistic components of Bakhtin's theory, 'Grotesque Realism' and 'Unofficial Square Language'. Using her work of (SBS, 1992), the study verifies that her lines are characteristically built from the 'Grotesque Realism' components of overturn, destruction, and creation, along with the 'Unofficial Square Language' components of the masses, material and physical language, and the discourse of self-liberation. In this way, the study aims both to overcome the problem of applying Bakhtin's theory in a fragmented way and to enhance the quality of research on the writer and her lines. The analysis finds that she achieves a distinctive ambivalence through demotion and overturn, using the material, physical, unofficial square language of the masses, together with a stronger realization of self-denial, positivity, liberation, and life. The writer is therefore highly rated for her keen awareness of language and her artistic spirit. The paper also proposes a follow-up study to appraise the impact of the writer's distinctiveness and artistic spirit on the creation of Korean TV drama, the contents industry, and mass media.

Case Study on the Writing of the Papers of Journal of the Korean Association for Science Education (한국과학교육학회지 논문의 글쓰기 사례 연구)

  • Han, JaeYoung
    • Journal of The Korean Association For Science Education / v.35 no.4 / pp.649-663 / 2015
  • This study investigated the current state of writing in science education research papers, focusing on translationese and basic Korean grammar, and identified ways to improve the Korean used. Science education research has characteristics of both social science and natural science and is more often quantitative than qualitative, which could influence how research papers are written. Translationese refers to conventional expressions that originate from languages other than Korean. Basic Korean grammar covers 'agreement,' 'spelling, word spacing, punctuation mark,' 'causative suffix,' and 'use of English or loanword,' while translationese is divided into 'English,' 'Japanese,' and 'English and Japanese.' The sentences in nine research papers in the 'Journal of the Korean Association for Science Education' were analyzed, and the problematic sentences were discussed and given alternatives. The high-frequency cases include '-jeok,' 'use of English,' 'expression of the plural,' 'passive voice of the verb with -hada,' '-go inneun,' '-eul tonghayeo,' '-e daehayeo,' 'gajida,' 'genitive case marker -eui,' 'passive voice with subject of thing,' and 'causative suffix, -sikida.' Based on the results, the writing of science education research papers was characterized as 'writing of quantitative research,' 'objective writing of academic research,' and 'writing of research of foreign origin.' To improve the writing of science education research papers, researchers should pay attention to basic Korean grammar and translationese and become familiar with concrete examples of problematic cases. The results of this study could be used in teaching Korean writing and grammar.

A Study on the Keyboard of Jawi Script (Arabic-Malay Script) (아랍식-말레이문자(Jawi Script) 키보드(Keyboard)에 관한 연구)

  • KANG, Kyoung Seok
    • SUVANNABHUMI / v.3 no.1 / pp.47-66 / 2011
  • Malay society is rooted in Islam, which influenced every corner of a society that had once stood at the edge of the Indus and Ganges civilizations. Sanskrit, the script of the Hindu religion, was once adopted in an attempt to put the Malay language (Bahasa Melayu) into writing, but in vain. It was too complicated for Malay society to put into practice in everyday life, being a wholly different type of script with many similar allographs for a single sound. In the end Malay society gave it up and used the Malay language without any script of its own. A few centuries later, Islam entered Malay society and brought Arabic letters with it. As Islam spread, it influenced not only Malay culture but also religious life, and Arabic letters finally became the means by which the Malay language was written. Letters that had formerly been used for Arabic were thus applied to a new language, Malay. Arabic letters pose no problem for Arabic, but they pose some for Malay, because they were designed for Arabic and for no other language. As a result, six Malay consonants, p, ng, g, c, ny, and v, could not be written in Arabic letters, and these sounds were unknown in Arabic pronunciation. Malay society therefore had to modify existing letters into new forms for these six sounds, which occur frequently in Malay. Pa was derived from fa, nga from ain, ga from kaf, ca from jim, nya from tha or ba, and va from wau.
Where should these six newly modified letters be placed on the Arabic keyboard? This is the core question of this working paper. The six letters were placed in the positions of six Arabic signs that are scarcely used in writing Malay and that are typed only with the shift key. The newly designed letters replaced the original positions of fatha, kasra, damma, sukun, tanween, and so on. The main difference between the two sets of six is that those on the original Arabic keyboard are only signs for Arabic letters, whereas the six Malay ones are real letters. In other words, six newly modified Malay letters were substituted for six unused Arabic signs on the Malay keyboard. This Malay Jawi script keyboard is still used in Malaysia, Brunei, and some other Malay countries. However, a further keyboard system is still needed that accords with the alphabetically ordered keyboard system, so that alif is typed with the A key and zai with the Z key. This system is called the 'Malay Jawi-English Rumi matching keyboard system', even though it would probably be inconvenient for Malay Jawi experts who are accustomed to the Arabic 'alif-ba-ta' order.

The Challenges Native English-Speaking Teachers Face in Korean Secondary Schools

  • Nam, Hyun-Ha
    • English Language & Literature Teaching / v.17 no.2 / pp.59-77 / 2011
  • In recent years, as many native English speakers have come to work in Asia as English teachers, team teaching with local teachers has become common in the Korean EFL classroom. Using qualitative case studies, this paper explores native English-speaking teachers' (NESTs) perceptions of team teaching and the challenges they face at different Korean secondary schools. The study documents the challenges faced by three foreign teachers embedded in intercultural teaching teams. The data show that common challenges include vague role distribution among teachers, problems presented by mixed levels of students, large classes, and students' low valuation of foreign teachers' classes, which go ungraded. The study calls for serious governmental efforts to address these fundamental problems and to closely examine the local factors that strongly affect team teaching practices, rather than importing foreign teachers without proper preparation.
