• Title/Abstract/Keyword: Language generation

735 search results

재일 동포의 한국어에 대한 태도와 학습 동기 강도가 한국어 능력에 미치는 영향 (The Influence of Attitudes toward Korean Language and Motivational Intensity on Korean Proficiency of Korean Residents in Japan)

  • 김희상;김효은
    • 한국어교육
    • /
• Vol.28 No.1
    • /
    • pp.49-78
    • /
    • 2017
  • This study aims to analyze the effect of the attitudes of Korean residents in Japan toward learning the Korean language, and of their motivational intensity, on their Korean proficiency. Data for this study came from a survey on the language use of Korean residents in Japan conducted in 2016; the questionnaire items covered language attitudes, language use and comprehension, language learning, and Korean ethnic identity. The main results are as follows. First, there were significant differences in Korean language proficiency depending on age, education level, and immigrant generation. Second, after controlling for socio-demographic characteristics, the influence of attitudes toward the Korean language on Korean proficiency was statistically significant. However, Korean proficiency was not significantly influenced by motivational intensity. Lastly, the moderating effect of immigrant generation on the relation between Korean language attitudes and Korean proficiency was significant: the effect of language attitudes on proficiency was stronger for second- and third-generation Korean-Japanese learners than for first-generation learners. Based on these results, this study suggests that promoting Korean language education for Korean residents in Japan requires building positive attitudes toward the Korean language and considering immigrant generation as a major factor.

Subword Neural Language Generation with Unlikelihood Training

  • Iqbal, Salahuddin Muhammad;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication
    • /
• Vol.12 No.2
    • /
    • pp.45-50
    • /
    • 2020
  • A neural language model is commonly trained with a likelihood loss so that it learns the sequences of human text. State-of-the-art results have been achieved in various language generation tasks, e.g., text summarization, dialogue response generation, and text generation, by utilizing the language model's next-token output probabilities. Monotonous and repetitive outputs are a well-known problem of such models, yet only a few solutions have been proposed to address it. Several decoding techniques have been proposed to suppress repetitive tokens. Unlikelihood training approaches this problem by penalizing the probabilities of candidate tokens that have already been seen in previous steps. While the method successfully produces less repetitive generations, it has a large memory consumption because training requires a big vocabulary. We effectively reduce the memory footprint by encoding words as sequences of subword units. Finally, we report results competitive with token-level unlikelihood training in several automatic evaluations compared to the previous work.
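The unlikelihood objective described in this abstract can be sketched in a few lines. This is a toy scalar version, assuming a simple probability table rather than a real neural model; the function name and example distribution are illustrative, following the token-level formulation (a likelihood term for the target plus a penalty on previously seen tokens):

```python
import math

def unlikelihood_loss(probs, target, prev_tokens, alpha=1.0):
    """Token-level unlikelihood loss for one prediction step.

    probs: dict mapping token -> model probability for the next position
    target: the ground-truth next token
    prev_tokens: tokens already seen in the context (negative candidates)
    alpha: weight of the unlikelihood term
    """
    # Likelihood term: maximize probability of the gold next token.
    nll = -math.log(probs[target])
    # Unlikelihood term: push down probability mass on tokens already
    # generated in the context, excluding the target itself.
    candidates = set(prev_tokens) - {target}
    ul = -sum(math.log(1.0 - probs[c]) for c in candidates)
    return nll + alpha * ul
```

With subword encoding, the same penalty is applied over subword units instead of whole words, which is what shrinks the vocabulary and the memory footprint.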

A Frame-based Approach to Text Generation

  • Le, Huong Thanh
    • 한국언어정보학회:학술대회논문집
    • /
    • 한국언어정보학회 2007년도 정기학술대회
    • /
    • pp.192-201
    • /
    • 2007
  • This paper is a study on constructing a natural language interface to databases, concentrating on generating textual answers. TGEN, a system that generates textual answers from query result tables, is presented. The TGEN architecture guarantees its portability across domains. The combination of a frame-based approach and natural language generation techniques in TGEN provides text fluency and flexibility. The implementation results show that this approach is feasible, while a deep NLG approach is still far from being reached.
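The frame-based idea can be illustrated with a minimal sketch: a frame pairs a sentence template with slots filled from one row of a query result table. The frame, slot names, and data below are illustrative, not taken from the paper:

```python
# A toy frame in the spirit of TGEN's frame-based generation.
FRAME = "The {attribute} of {entity} is {value}."

def generate_answer(row):
    """Fill the sentence frame from one row of a query result table."""
    return FRAME.format(**row)

# e.g. generate_answer({"entity": "Seoul", "attribute": "population",
#                       "value": "9.4 million"})
```

Because the templates are data rather than code, swapping in a new frame set is enough to move the generator to a new domain, which is the portability claim above.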


정상성인의 뇌기능적 자기공명영상에서 명사, 동사, 형용사 그리고 부사 만들기 과제들에 대한 언어영역편재화의 재현성에 관한 연구 (Reproducibility of Hemispheric Language Dominance by Noun, Verb, Adjective and Adverb Generation Paradigms in Functional Magnetic Resonance Imaging of Normal Volunteers)

  • In Chan Song;Kee Hyun Chang;Chun Kee Chung;Sang Hyun Lee;Moon Hee Han
    • Investigative Magnetic Resonance Imaging
    • /
• Vol.5 No.1
    • /
    • pp.24-32
    • /
    • 2001
  • Purpose: We investigated the reproducibility of language lateralization by four different word generation paradigms, and by the rest condition used in each paradigm, using functional magnetic resonance imaging in normal volunteers. Materials and Methods: Nine left-handed normal volunteers (mean age: 25 yrs) were examined on a 1.5T MR unit using a single-shot gradient-echo EPI BOLD sequence. Four different word generation paradigms (noun, verb, adjective, and adverb) were used in each volunteer to investigate the language system. In each paradigm, two different rest conditions were used: only seeing the "+" symbol, or reading meaningless letters. Each task consisted of 96 phases, including 3 activation periods and 6 rest periods of the two different conditions. Two activation maps per task were obtained under the two rest conditions using the correlation method. We evaluated the detection rates of the Broca and Wernicke areas and the differences in language lateralization among the four word generation paradigms and between the rest conditions. Results: The detection rates of the Broca and Wernicke areas were over 67% in all four language paradigms, with no significant difference among paradigms or between the two rest conditions. Language dominance was consistent across all four paradigms in 66% of subjects, but differed between paradigms in some subjects. The rest condition had no significant effect on the determination of hemispheric dominance, but the success rate of determining dominance from the four paradigms was higher when reading meaningless letters (100%, n=9) than when only seeing "+" on the screen during the rest task (78%, n=7).
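The abstract does not state how lateralization was quantified; the conventional choice in fMRI language mapping is a laterality index over activated voxel counts, sketched here as an assumption rather than the paper's exact method:

```python
def laterality_index(left_voxels, right_voxels):
    """Conventional laterality index LI = (L - R) / (L + R).

    LI > 0 indicates left-hemisphere dominance, LI < 0 right-hemisphere
    dominance; values near 0 suggest bilateral representation.
    """
    return (left_voxels - right_voxels) / (left_voxels + right_voxels)

# e.g. 120 activated voxels on the left vs 40 on the right gives LI = 0.5
```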


A Survey of Automatic Code Generation from Natural Language

  • Shin, Jiho;Nam, Jaechang
    • Journal of Information Processing Systems
    • /
• Vol.17 No.3
    • /
    • pp.537-555
    • /
    • 2021
  • Many researchers have carried out studies related to programming languages since the beginning of computer science. Besides programming with traditional programming languages (i.e., procedural, object-oriented, functional languages, etc.), a new paradigm of programming is emerging: programming with natural language. By programming with natural language, we expect to free our expressiveness, in contrast to programming languages, which impose strong syntactic constraints. This paper surveys the approaches that generate source code automatically from a natural language description. We also categorize the approaches by their forms of input and output. Finally, we analyze the current trend of approaches and suggest future directions for this research domain to improve automatic code generation from natural language. From the analysis, we suggest that researchers should work on customizing language models to the domain of source code and explore better representations of source code, such as embedding techniques and pre-trained models, which have proved to work well on natural language processing tasks.

마이크로명령어 기술언어를 사용한 마이크로코드 자동생성 시스템 (An Automatic Microcode Generation System Using a Microinstruction Description Language)

  • 이상정;조영훈;임인칠
    • 전자공학회논문지B
    • /
• Vol.28B No.7
    • /
    • pp.540-547
    • /
    • 1991
  • This paper proposes a machine-independent automatic microcode generation system using a microinstruction description language, MDL. The MDL, which has a structure similar to the C language, is a high-level microarchitecture description language. It defines the hardware elements and the operand selection of microoperations. The proposed system generates microcode automatically by describing the structural information of a target microarchitecture and accepting the behavioral information of microoperations, which is generated as an intermediate language from HLML-C. The system is implemented in C and YACC on a SUN workstation (4.3 BSD).


Enhanced Regular Expression as a DGL for Generation of Synthetic Big Data

  • Kai, Cheng;Keisuke, Abe
    • Journal of Information Processing Systems
    • /
• Vol.19 No.1
    • /
    • pp.1-16
    • /
    • 2023
  • Synthetic data generation is generally used in performance evaluation and function tests in data-intensive applications, as well as in various areas of data analytics, such as privacy-preserving data publishing (PPDP) and statistical disclosure limitation/control. A significant amount of research has been conducted on tools and languages for data generation. However, existing tools and languages have been developed for specific purposes and are unsuitable for other domains. In this article, we propose a regular expression-based data generation language (DGL) for flexible big data generation. To achieve a general-purpose and powerful DGL, we enhanced standard regular expressions to support data domains, type/format inference, sequence and random generation, probability distributions, and resource references. To implement the proposed language efficiently, we propose caching techniques for both intermediate results and database queries. We evaluated the proposed improvements experimentally.
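The core idea of driving a generator with regular expressions can be sketched with a toy interpreter. The abstract does not give the DGL's actual syntax, so this sketch supports only a tiny, assumed subset (literals, `\d`, simple character classes, and `{n}` repetition), with each repetition re-sampled:

```python
import random
import re

def generate(pattern, rng=None):
    """Generate one string from a tiny regex subset: literal characters,
    \\d, character classes like [a-z0-9], and {n} repetition of the
    previous atom."""
    rng = rng or random.Random()
    # Tokenize into atoms and quantifiers; anything else is a literal.
    tokens = re.findall(r'\\d|\[[^\]]+\]|\{\d+\}|.', pattern)
    out, atoms = [], []
    for tok in tokens:
        if re.fullmatch(r'\{\d+\}', tok):
            n = int(tok[1:-1])
            # One copy of the previous atom was already emitted.
            out.extend(atoms[-1]() for _ in range(n - 1))
        elif tok == r'\d':
            atom = lambda: str(rng.randint(0, 9))
            atoms.append(atom)
            out.append(atom())
        elif tok.startswith('['):
            # Expand simple ranges like a-z inside the class.
            chars, body, i = [], tok[1:-1], 0
            while i < len(body):
                if i + 2 < len(body) and body[i + 1] == '-':
                    chars.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
                    i += 3
                else:
                    chars.append(body[i])
                    i += 1
            atom = lambda chars=chars: rng.choice(chars)
            atoms.append(atom)
            out.append(atom())
        else:
            atom = lambda tok=tok: tok
            atoms.append(atom)
            out.append(atom())
    return ''.join(out)

# generate(r'\d{3}-[a-z]{2}') yields strings like "472-kq"
```

The enhancements the paper describes (probability distributions, type inference, resource references) would slot in as additional atom kinds in the same loop.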

언어함수를 이용한 영문 생성기의 구현에 관한 연구 (A study on Implementation of English Sentence Generator using Lexical Functions)

  • 정희연;김희연;이웅재
    • 인터넷정보학회논문지
    • /
• Vol.1 No.2
    • /
    • pp.49-59
    • /
    • 2000
  • With the development of computers and the growth of Internet users, interest in natural language processing research has been increasing. However, most research has concentrated on natural language analysis and understanding, so natural language generation has received little attention; there is even a tendency to regard generation as simply the inverse of analysis. As the need for natural language processing grows, such as for multilingual translation on the Web, natural language interfaces, and natural language retrieval systems, the need for natural language generation is naturally increasing as well, and building more systematic generation systems requires research on more concrete generation algorithms. This paper proposes an algorithm for generating more natural English sentences and, in particular, discusses the implementation of an English sentence generator that produces clause-length explanatory text through lexical combination using the lexical functions (LFs) of Igor Mel'čuk (Mel'čuk & Zholkovsky, 1988).
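A lexical function maps a keyword to the collocate a fluent speaker would choose, which is what keeps generated text from sounding stilted ("make a decision" rather than "do a decision"). The LF names below (Magn, Oper1) come from Mel'čuk's theory, but the dictionary entries are toy examples, not taken from the paper's lexicon:

```python
# Illustrative lexical-function dictionary.
LEXICAL_FUNCTIONS = {
    ("Magn", "rain"): "heavy",      # Magn: intensifying collocate
    ("Oper1", "decision"): "make",  # Oper1: light verb governing the keyword
}

def collocate(lf, keyword):
    """Look up the collocate an LF assigns to a keyword, if any."""
    return LEXICAL_FUNCTIONS.get((lf, keyword))

# collocate("Oper1", "decision") -> "make", so the generator emits
# "make a decision" instead of guessing a verb.
```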


Framework for evaluating code generation ability of large language models

  • Sangyeop Yeo;Yu-Seung Ma;Sang Cheol Kim;Hyungkook Jun;Taeho Kim
    • ETRI Journal
    • /
• Vol.46 No.1
    • /
    • pp.106-117
    • /
    • 2024
  • Large language models (LLMs) have revolutionized various applications in natural language processing and exhibited proficiency in generating programming code. We propose a framework for evaluating the code generation ability of LLMs and introduce a new metric, pass-ratio@n, which captures the granularity of accuracy according to the pass rate of test cases. The framework is intended to be fully automatic to handle the repetitive work involved in generating prompts, conducting inferences, and executing the generated codes. A preliminary evaluation focusing on the prompt detail, problem publication date, and difficulty level demonstrates the successful integration of our framework with the LeetCode coding platform and highlights the applicability of the pass-ratio@n metric.
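The abstract does not give the exact formula for pass-ratio@n; a plausible reading, sketched here as an assumption, is the mean fraction of passed test cases over the n generated solutions, which is finer-grained than an all-or-nothing pass@n:

```python
def pass_ratio_at_n(results):
    """Mean test-case pass rate over n generations for one problem.

    results: list of (passed, total) test-case counts, one pair per
    generated solution.
    """
    return sum(passed / total for passed, total in results) / len(results)

def pass_at_n(results):
    """All-or-nothing baseline: 1.0 if any generation passes every test."""
    return 1.0 if any(passed == total for passed, total in results) else 0.0

# Two generations, one passing 3/4 tests and one passing 4/4:
# pass_at_n scores 1.0, while pass_ratio_at_n scores 0.875,
# preserving the partial-credit granularity the abstract describes.
```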

문장생성에 의한 통신보조시스템의 설계 및 구현 (Design and Implementation of a Augmentative and Alternative Communication System Using Sentence Generation)

  • 우요섭;민홍기;황인정
    • 한국멀티미디어학회논문지
    • /
• Vol.8 No.9
    • /
    • pp.1248-1257
    • /
    • 2005
  • This paper describes the design and implementation of sentence generation for an augmentative and alternative communication (AAC) system. An AAC system is an assistive system for people with language disabilities, and its goal is to generate sentences while reducing the time and the number of keystrokes required. We generate sentences in a way that complements the strengths and weaknesses of existing sentence generation methods, exploiting the structure of Korean, in which the candidate nouns are constrained by the verb and the particle. A distinctive feature of this work is that nouns and verbs are linked through domain concepts. Generation uses lexical information built from the characteristics of Korean, and several current sentence generation methods are compared. Sentence generation is based on lexical information obtained by sentence feature extraction.
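The domain-concept linking can be sketched as a verb-to-noun-domain constraint: once the user picks a verb, only nouns from the verb's domains are offered, shrinking the candidate list to scan. The verbs, nouns, and domains below are illustrative examples, not the paper's actual lexicon:

```python
# Hypothetical domain-concept tables linking verbs to noun domains.
VERB_DOMAINS = {"eat": {"food"}, "drink": {"beverage"}}
NOUN_DOMAIN = {"rice": "food", "water": "beverage", "desk": "furniture"}

def candidate_nouns(verb):
    """Nouns whose domain is selected by the chosen verb."""
    domains = VERB_DOMAINS.get(verb, set())
    return [noun for noun, dom in NOUN_DOMAIN.items() if dom in domains]

# candidate_nouns("eat") offers only food nouns, so fewer keystrokes
# are needed to complete the sentence.
```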
