• Title/Summary/Keyword: sentence similarity analysis (문장유사성분석)

Search Results: 69, Processing Time: 0.027 seconds

Analysis of dieting practices in 2016 using big data (빅데이터를 통한 2016년의 다이어트 실태 분석)

  • Jung, Eun-Jin;Chang, Un-Jae;Jo, Kyungae
    • Korean Journal of Food Science and Technology / v.51 no.2 / pp.176-181 / 2019
  • The aim of this study was to analyze dieting practices and tendencies in 2016 using big data. Keywords related to diet were collected from the portal site Naver and analyzed through simple frequency analysis, N-gram analysis, keyword network analysis, and analysis of seasonality. The results showed that exercise had the highest frequency in the simple frequency analysis, whereas diet menu appeared most frequently in the N-gram analysis. In addition, the analysis of seasonality showed that interest in dieting increased steadily from February to July and peaked in October 2016. The monthly frequency of the keyword high-fat diet was highest in October, when the TV program 'Low Carbohydrate High Fat' was broadcast. Although dieting shows a certain pattern on a yearly basis, the emergence of new trendy diets in the mass media also affects this pattern. Therefore, continuous rather than periodic monitoring and analysis of dieting trends is needed.
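The two counting steps in this abstract, simple keyword frequency and N-gram frequency, can be sketched as follows. The `posts` lists are invented placeholders, not the study's Naver data, and the keyword-network and seasonality analyses are not shown.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count all contiguous n-grams in one token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Invented tokenized posts standing in for the study's Naver keyword data.
posts = [
    ["diet", "menu", "exercise"],
    ["exercise", "diet", "menu"],
    ["diet", "menu", "recipe"],
]

# Simple frequency: count each keyword on its own.
unigrams = Counter(token for post in posts for token in post)

# N-gram (here bigram) frequency: count adjacent keyword pairs.
bigrams = Counter()
for post in posts:
    bigrams.update(ngram_counts(post, 2))

print(unigrams.most_common(3))
print(bigrams.most_common(1))
```

The contrast the abstract reports (a keyword topping the unigram count while a different phrase tops the N-gram count) falls out naturally from counting pairs rather than single tokens.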

The Critical Discussion about Lacanian Structural Definition of Sexual Difference. (남녀성차에 대한 라캉의 구조적 정의와 그 문제)

  • Moun, Jean-sou
    • Journal of Korean Philosophical Society / v.129 / pp.53-82 / 2014
  • This paper analyzes the Lacanian concept of the subject and the structural definition of sexual difference between man and woman, and criticizes some problems with those definitions. To do so, it is important to understand precisely the core terms of psychoanalysis as used by Lacan: the basic meanings and relations of the Imaginary, the Symbolic, and the Real; of the ideal ego and the ego ideal; of phallus and signifier; of desire and the Other; of consciousness and unconsciousness; of alienation and separation; and so on. Chapter 2 discusses the relation between the Imaginary and the ideal ego, and chapter 3 deals with the relation between the Symbolic and the ego ideal; through these discussions I explain both the similarity and the difference between the ideal ego and the ego ideal. Chapter 4 explains the relation among the Other, desire, and the subject of the unconscious, and chapter 5 analyzes the meaning of phallus and signifier. On the basis of the work in these chapters, I criticize the Lacanian structural definition of sexual difference. These discussions lead to my final conclusion that the Lacanian concept of the subject and the structural definition of sexual difference depend on a reductionism that regards everything as symbolic, which contains many contradictions in itself. In order for any discussion of sexual difference to have at least an objective meaning, it has to rely on the anatomical differences between man and woman.

A Suggestion of the Direction of Construction Disaster Document Management through Text Data Classification Model based on Deep Learning (딥러닝 기반 분류 모델의 성능 분석을 통한 건설 재해사례 텍스트 데이터의 효율적 관리방향 제안)

  • Kim, Hayoung;Jang, YeEun;Kang, HyunBin;Son, JeongWook;Yi, June-Seong
    • Korean Journal of Construction Engineering and Management / v.22 no.5 / pp.73-85 / 2021
  • This study proposes an efficient management direction for Korean construction accident cases through a deep learning-based text data classification model. A deep learning model was developed to categorize accident cases into five representative KOSHA accident types: fall, electric shock, flying object, collapse, and narrowness. In initial model tests, the classification accuracy for fall disasters was relatively high, while cases of the other types were often misclassified as fall disasters. Analysis of these results indicated that 1) specific accident-causing behavior, 2) similar sentence structure, and 3) complex accidents corresponding to multiple types affect the results. Two accuracy-improvement experiments were then conducted: 1) reclassification and 2) elimination. Classification performance improved by 185.7% when complex accidents were eliminated, resolving the multicollinearity introduced by complex accidents that contain the contents of multiple accident types. In conclusion, this study suggests the necessity of independently managing complex accidents while preparing a system to describe the circumstances of future accidents in detail.
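The five-way classification task described above can be illustrated with a deliberately simple bag-of-words nearest-centroid sketch. This is not the authors' deep learning model, and the one-sentence training texts are invented; it only shows the shape of the task: map an accident description to one of the five KOSHA types.

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term counts for one accident description."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented labeled descriptions, one per KOSHA accident type.
train = {
    "fall": "worker fell from scaffold ladder height",
    "electric shock": "worker touched live wire electric shock",
    "flying object": "struck by flying object falling debris",
    "collapse": "trench wall collapse buried worker",
    "narrowness": "hand caught pinched between press rollers",
}
centroids = {label: vectorize(text) for label, text in train.items()}

def classify(text):
    """Assign the type whose centroid is most similar to the text."""
    v = vectorize(text)
    return max(centroids, key=lambda label: cosine(v, centroids[label]))

print(classify("the worker fell from a ladder"))
```

A complex accident whose description mixes the wording of several types would score similarly against several centroids, which mirrors the confusion the study resolved by managing complex accidents separately.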

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.71-88 / 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character based on sequential input data. N-gram models have been widely used, but they cannot model the correlation between input units efficiently, since they are probabilistic models based on the frequency of each unit in the training set. Recently, with the development of deep learning algorithms, recurrent neural network (RNN) models and long short-term memory (LSTM) models have been widely used as neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependency between objects that are entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). In order to train a neural language model, texts need to be decomposed into words or morphemes. However, since a training set of sentences generally includes a huge number of words or morphemes, the dictionary becomes very large and model complexity increases. In addition, word-level or morpheme-level models can generate only vocabulary contained in the training set. Furthermore, for highly morphological languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to cause errors in the decomposition process (Lankinen et al., 2016). Therefore, this paper proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit that comprises Korean text. We constructed the language model using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm and more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was conducted with Old Testament texts using the deep learning package Keras with the Theano backend.
After pre-processing the texts, the dataset included 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed input vectors of 20 consecutive characters, with the following 21st character as the output. In total, 1,023,411 input-output pairs were included in the dataset, which we divided into training, validation, and test sets in a 70:15:15 ratio. All simulations were conducted on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss evaluated on the validation set, the perplexity evaluated on the test set, and the training time of each model. All the optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, clearly superior to those of the stochastic gradient algorithm, which also took the longest to train for both the 3- and 4-layer LSTM models. On average, the 4-LSTM-layer model took 69% longer to train than the 3-LSTM-layer model, yet its validation loss and perplexity were not significantly improved, and even became worse under specific conditions. On the other hand, when comparing the automatically generated sentences, the 4-LSTM-layer model tended to generate sentences closer to natural language than the 3-LSTM-layer model. Although there were slight differences in the completeness of the generated sentences between the models, sentence generation performance was quite satisfactory under all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost perfectly grammatical. The results of this study are expected to be widely used for the processing of Korean in language processing and speech recognition, which are the basis of artificial intelligence systems.
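The sliding-window construction of input-output pairs (20 characters in, the 21st character out) can be sketched as below. For brevity this works on raw English characters rather than decomposed Korean phonemes, and the sample sentence is only a stand-in for the study's Old Testament corpus.

```python
def make_windows(text, length=20):
    """Build (input, target) pairs: `length` consecutive characters
    predict the single character that follows them."""
    pairs = []
    for i in range(len(text) - length):
        pairs.append((text[i:i + length], text[i + length]))
    return pairs

# Stand-in text; the study used the full Old Testament, decomposed
# into phonemes, yielding 1,023,411 such pairs.
text = "In the beginning God created the heaven and the earth."
pairs = make_windows(text)
x, y = pairs[0]
print(len(pairs), repr(x), repr(y))
```

Each pair is what one LSTM training step sees: the model reads the 20-unit context and is scored on its prediction of the next unit.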

A Comparative Study on International Baccalaureate Diploma Programme(IBDP) Textbooks and Korean Textbooks by the 2015 Revised Curriculum -Focus on function from a mathematical modeling perspective- (우리나라 교과서와 International Baccalaureate Diploma Programme(IBDP) 교과서 비교·분석 -수학적 모델링의 관점에서 함수 영역을 중심으로-)

  • Park, Woo Hong;Choi-Koh, Sang Sook
    • Journal of the Korean School Mathematics Society / v.25 no.2 / pp.125-148 / 2022
  • This study compared and analyzed the number and characteristics of modeling problems in the function-related chapters of International Baccalaureate Diploma Programme (IBDP) mathematics textbooks and Korean high school mathematics textbooks, with implications for how textbooks contribute to improving students' modeling competency. Three IBDP textbooks and all nine textbooks from the Korean 2015 revised curriculum were selected. All problems in the textbooks were classified into real-world and non-real-world problems. Problems classified as real-world problems were further divided into word problems and modeling problems according to the need to set up a mathematical model, and modeling problems were then categorized into standard applications and good modeling problems depending on whether all the information necessary for solving them was included in the problem. Among the 12 textbooks, the one with the most modeling problems was the IBDP textbook 'Math: Applications and Interpretation', in which modeling problems accounted for 50.41% of all problems. This textbook provided learners with significantly more modeling opportunities than the other IBDP and Korean textbooks, whose modeling-problem ratios were 2% and 9%, respectively. In all 12 textbooks, every problem classified as a modeling problem was a standard application; there were no good modeling problems. Among the six sub-competencies of mathematical modeling, 'mathematical analysis' and 'interpretation and evaluation of results' appeared most often, in very similar numbers of modeling problems, followed by 'mathematization'. The results of this study are expected to help compare the number and ratio of modeling problems across textbooks and provide a better understanding of which modeling sub-competencies appear, and to what extent, in modeling problems.

A School-tailored High School Integrated Science Q&A Chatbot with Sentence-BERT: Development and One-Year Usage Analysis (인공지능 문장 분류 모델 Sentence-BERT 기반 학교 맞춤형 고등학교 통합과학 질문-답변 챗봇 -개발 및 1년간 사용 분석-)

  • Gyeongmo Min;Junehee Yoo
    • Journal of The Korean Association For Science Education / v.44 no.3 / pp.231-248 / 2024
  • This study developed a chatbot for first-year high school students, employing open-source software and the Korean Sentence-BERT model for AI-powered document classification. The chatbot uses the Sentence-BERT model to find the six Q&A pairs most similar to a student's query and presents them in a carousel format. The initial dataset, built from online resources, was refined and expanded based on student feedback and usability over the operational period. By the end of the 2023 academic year, the chatbot had integrated a total of 30,819 dataset entries and recorded 3,457 student interactions. Analysis revealed students' inclination to use the chatbot when prompted by teachers during classes and primarily during self-study sessions after school, with an average of 2.1 to 2.2 inquiries per session, mostly via mobile phones. Text mining identified student input terms encompassing not only science-related queries but also aspects of school life such as assessment scope. Topic modeling using BERTopic, which is based on Sentence-BERT, categorized 88% of student questions into 35 topics, shedding light on common student interests. A year-end survey confirmed the efficacy of the carousel format and the chatbot's role in addressing curiosities beyond the integrated science learning objectives. This study underscores the importance of developing chatbots tailored for student use in public education and highlights their educational potential through long-term usage analysis.
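The retrieval step, ranking stored Q&A pairs by similarity to the query embedding and keeping the most similar ones, can be sketched as below. The toy 3-dimensional vectors stand in for real Sentence-BERT embeddings, and `k` is set to 2 only to keep the example small (the chatbot uses the top six).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query_vec, qa_vecs, k=6):
    """Indices of the k stored Q&A pairs most similar to the query."""
    ranked = sorted(range(len(qa_vecs)),
                    key=lambda i: cosine(query_vec, qa_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings" standing in for Sentence-BERT vectors
# of the stored Q&A pairs.
qa_vecs = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
query = (1.0, 0.05, 0.0)
print(top_k(query, qa_vecs, k=2))
```

In production the same ranking runs over all 30,819 embedded entries, and the six winners populate the carousel.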

Validity and Reliability of a Korean Version of Nurse Clinical Reasoning Competence Scale (한국어판 간호사 임상적 추론 역량 척도의 타당도와 신뢰도)

  • Joung, Jaewon;Han, Jeong Won
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.4 / pp.304-310 / 2017
  • This methodological study tested the validity and reliability of a Korean version of the Nurse Clinical Reasoning Competence (NCRC) scale, an instrument developed by Liou and colleagues, as basic data for enhancing the clinical reasoning competence of nurses. The instrument was translated into Korean, and the equivalence of sentence structure and meaning between the two versions was checked. Validity and reliability were verified with 166 nurses working in four tertiary hospitals located in Seoul and Busan. Expert analysis of content validity showed that all items had a content validity index (CVI) above 0.8. Exploratory and confirmatory factor analysis found that the instrument comprises 15 items loading on a single factor. Concurrent validity was tested by examining correlations with measures of nurses' critical thinking disposition and clinical decision-making ability (correlation coefficients = .55-.64, p<.001), and reliability was Cronbach's α = .93. Thus, the Korean version of the NCRC may be a useful instrument for evaluating the clinical reasoning competence of Korean nurses and for providing basic data for assessing that competence and developing strategies to promote it.
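The reliability figure reported above is Cronbach's α, computed as α = (k/(k−1))(1 − Σ item variances / variance of total scores) over k items. A minimal sketch with invented item scores (not the study's data):

```python
def variance(xs):
    """Population variance of a list of scores."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """items: one score list per scale item, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # per-respondent totals
    item_var = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var / variance(totals))

# Invented responses: 3 items, 4 respondents, perfectly consistent,
# so alpha comes out at its maximum of 1.0.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(round(cronbach_alpha(items), 3))
```

Real scales like the 15-item NCRC land below 1.0; the reported .93 indicates high internal consistency.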

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • Recently, as word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Among these directions, cross-language transfer, which enables semantic exchange between different languages, is growing together with the development of embedding models. Academic interest in vector alignment is increasing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to enable mapping between specialized and general domains: mapping the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or providing a clue for mapping vocabulary between mutually different specialized fields. However, linear vector alignment, which has been the main focus of academic study, assumes statistical linearity and therefore tends to oversimplify the vector space. It essentially assumes that the two vector spaces are geometrically similar, which inevitably causes distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology consists of sequentially training a skip-connected autoencoder and a regression model to align the specialized word embeddings with the general embedding space; through inference with the two trained models, the specialized vocabulary can be aligned in the general space.
To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of health care among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology outperforms existing linear vector alignment in terms of cosine similarity.
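The linear baseline the authors improve on can be sketched in one dimension: fit an affine map from the specialized space to the general space by ordinary least squares. The vectors below are invented and related by an exact affine shift, so the linear map recovers them perfectly; the paper's point is that real embedding spaces are not related this simply, which motivates the nonlinear autoencoder-plus-regression pipeline.

```python
def lsq_align(src, tgt):
    """Fit y ≈ a*x + b by ordinary least squares: the 1-D analogue
    of a linear vector-alignment map between embedding spaces."""
    n = len(src)
    mx = sum(src) / n
    my = sum(tgt) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(src, tgt))
    var = sum((x - mx) ** 2 for x in src)
    a = cov / var
    b = my - a * mx
    return a, b

# Invented 1-D "embeddings": tgt is exactly 2*src + 1, so the linear
# map is recovered without distortion. Real spaces are nonlinear.
src = [0.0, 1.0, 2.0, 3.0]
tgt = [1.0, 3.0, 5.0, 7.0]
a, b = lsq_align(src, tgt)
print(a, b)
```

When the true relationship between the spaces is nonlinear, this fit leaves a residual that no choice of `a` and `b` removes, which is the distortion the proposed method addresses.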

A Study on the Morphological Structure of Sasul-Sijo (사설시조의 형태구조 연구)

  • Won, Yong-Moon
    • Sijohaknonchong / v.23 / pp.161-188 / 2005
  • The purpose of this study was to delve into the morphological types of Sijo in order to determine the morphological structure of Sasul-sijo, and to present a standard for discriminating Pyong-sijo, Eos-sijo, and Sasul-sijo from one another from a morphological standpoint. It is suggested that Sijo with three Jangs, six verses, and 12 or more stanzas; with three Jangs, seven verses, and 14 or more stanzas; and with three Jangs, eight verses, and 16 or more stanzas should respectively be called Pyong-sijo, Eos-sijo, and Sasul-sijo. After discussing what is and is not Sijo, the study describes how to distinguish Eos-sijo from Sasul-sijo, and finally presents the structure of Sasul-sijo. As for Sijo and non-Sijo, works that consisted of three Jangs like Sijo, yet did not suit its framework and Yuljo and were written in Chinese characters, were regarded as non-Sijo. Concerning discrimination between Eos-sijo and Sasul-sijo, the type of Sijo that includes one or more additional verses and two or more additional stanzas in one of the three Jangs is defined as Eos-sijo, and the type that involves two or more additional verses and four or more additional stanzas in one of the three Jangs is called Sasul-sijo. In other words, Eos-sijo contains one more verse in one of the three Jangs, and Sasul-sijo includes one more Jang in one of the three Jangs; the sort of Sijo that contains one more Jang in one of the three Jangs can be viewed as Sasul-sijo. Regarding the structure of Sasul-sijo, one piece of Sasul-sijo should have three Jangs, eight verses, and 16 stanzas, and any type of Sijo that contains two or more additional verses and four or more additional stanzas can be called Sasul-sijo. Such additions of verses and stanzas can be made in various ways:
(1) adding stanzas to the first Jang, (2) adding stanzas to the second Jang, (3) adding stanzas to the final Jang, (4) adding stanzas to both the first and second Jangs, (5) adding stanzas to the second and final Jangs, and (6) adding stanzas to the first, second, and third Jangs at the same time. Besides, there was an extremely broad gap between the numbers of verses and stanzas in Sasul-sijo, which ranged from a low of eight stanzas to a high of 87 stanzas in one of the three Jangs.
