• Title/Summary/Keyword: language model

Research Trends of Ergonomics in Occupational Safety and Health through MEDLINE Search: Focus on Abstract Word Modeling using Word Embedding (MEDLINE 검색을 통한 산업안전보건 분야에서의 인간공학 연구동향 : 워드임베딩을 활용한 초록 단어 모델링을 중심으로)

  • Kim, Jun Hee;Hwang, Ui Jae;Ahn, Sun Hee;Gwak, Gyeong Tae;Jung, Sung Hoon
    • Journal of the Korean Society of Safety / v.36 no.5 / pp.61-70 / 2021
  • This study analyzed research trends in the abstracts of ergonomics studies registered in MEDLINE, a medical bibliographic database, using word embedding. Medical ergonomics studies mainly focus on work-related musculoskeletal disorders, and no study has yet analyzed their words as data using natural language processing techniques such as word embedding. In this study, the abstracts of ergonomics studies were extracted with a Python program written with the Selenium and BeautifulSoup modules. The abstract data were embedded using the word2vec model, which vectorized the words found in the abstracts. The vectorized data were visualized in two dimensions using t-Distributed Stochastic Neighbor Embedding (t-SNE). The word "ergonomics" and ten of the most frequently used words in the abstracts were selected as keywords. The results revealed that the most frequently used words in the abstracts of ergonomics studies include "use", "work", and "task". In addition, the t-SNE technique revealed that words such as "workplace", "design", and "engineering" exhibited the highest relevance to ergonomics. The keywords observed in the abstracts using t-SNE were classified into four groups. Ergonomics studies registered in MEDLINE have investigated the risk factors associated with workers performing an operation or task using tools, and this study characterized them by the relationships between keywords obtained through word embedding. The results will provide useful and diverse insights into future research directions for ergonomics studies.
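
The keyword-selection step in this abstract (picking the most frequently used words across the abstracts) can be sketched with the Python standard library alone. This is only an illustration: the stop-word list, the sample sentences, and the function name `top_words` are hypothetical, and the word2vec and t-SNE steps of the study are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the study's actual preprocessing is not described.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "for", "on", "with", "is", "are"}


def top_words(abstracts, n=10):
    """Return the n most frequent non-stop-word tokens across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Lowercase and keep alphabetic runs only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Made-up sample data, for illustration only.
sample_abstracts = [
    "Workers use tools to perform the task.",
    "The task design affects work posture.",
    "Work and task demands in the workplace.",
]

top_two = top_words(sample_abstracts, n=2)
```

With the made-up sample above, `top_two` holds the two most frequent tokens with their counts; a real run would feed in the scraped MEDLINE abstracts instead.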

Modified multi-sense skip-gram using weighted context and x-means (가중 문맥벡터와 X-means 방법을 이용한 변형 다의어스킵그램)

  • Jeong, Hyunwoo;Lee, Eun Ryung
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.389-399 / 2021
  • In recent years, word embedding has been a popular topic in natural language processing research, and skip-gram has become one of the successful word embedding methods. It assigns an embedding vector to each word using its contexts, which provides an effective way to analyze text data. However, due to a limitation of the vector space model, standard word embedding methods assume that every word has only a single meaning. Because multi-sense words, that is, words with more than one meaning, occur in practice, Neelakantan et al. (2014) proposed the multi-sense skip-gram (MSSG), which uses a clustering method to find an embedding vector for each sense of a multi-sense word. In this paper, we propose a modification of the MSSG to improve statistical accuracy. Moreover, we propose a data-adaptive choice of the number of clusters, that is, the number of meanings of a multi-sense word. Numerical evidence is provided through real data-based simulations.

End-to-end speech recognition models using limited training data (제한된 학습 데이터를 사용하는 End-to-End 음성 인식 모델)

  • Kim, June-Woo;Jung, Ho-Young
    • Phonetics and Speech Sciences / v.12 no.4 / pp.63-71 / 2020
  • Speech recognition is one of the areas most actively commercialized using deep learning and machine learning techniques. However, the majority of speech recognition systems on the market are developed on data with limited speaker diversity and tend to perform well only on typical adult speakers, because most speech recognition models are trained on speech databases collected from adult males and females. This tends to cause problems in recognizing the speech of the elderly, children, and speakers of dialects. Solving these problems may require maintaining a large database or collecting data for speaker adaptation. Instead, this paper proposes a new end-to-end speech recognition method consisting of an acoustic augmented recurrent encoder and a transformer decoder with linguistic prediction. The proposed method can deliver reliable acoustic and language model performance under limited data conditions. It was evaluated on recognizing the speech of Korean elderly speakers and children with a limited amount of training data and performed better than a conventional method.

A Study on the Methods of Communication Education based on 'Empathy'; for Example <(500) Days of Summer> ('공감'을 기반으로 한 의사소통교육 방법 모색 ; 영화 <500일의 섬머>를 예로)

  • Kim, Kyung Ae
    • Journal of Digital Convergence / v.19 no.3 / pp.279-285 / 2021
  • This paper criticizes the fact that online classes during the COVID-19 period centered on knowledge and information education, and seeks ways to foster empathy as a means of improving students' sociality. The teaching-learning process was designed around the movie <(500) Days of Summer>, whose theme and story concern parting and growth. In this paper, empathy is divided into three stages: the recognize-into, feeling-into, and emotional-transaction stages. In particular, considering the transition from emotional empathy to behavioral empathy to be the key to communication education, the class was designed in five stages, with an expression stage placed between the feeling-into stage and the emotional-transaction stage. This course works only when learners empathize with the work itself and reflect on their own narratives, so a literary therapy approach was used, and students' response statements were collected to show that this process is meaningful for improving empathy. In this article, the class was designed for the movie <(500) Days of Summer>, but this teaching-learning model can be applied to other contemporary film texts.

A study on the difficulty adjustment of programming language multiple-choice problems using machine learning (머신러닝을 활용한 프로그래밍언어 객관식 문제의 난이도 조정에 대한 연구)

  • Kim, EunJung
    • Journal of Korea Society of Industrial Information Systems / v.27 no.2 / pp.11-24 / 2022
  • In LMS-based online evaluation, the professor either sets the exam questions directly or uses an automatic question-setting method that draws questions by difficulty level from a question bank divided by category. In the automatic method in particular, it is important to manage question difficulty objectively and efficiently, because the questions presented to each examinee may differ. In this paper, we propose a difficulty readjustment algorithm that considers not only the correct-answer rate of a problem but also the time taken to solve it. For this, a logistic regression classification algorithm from machine learning was used, and a reference threshold was set on the predicted probability of the trained model and used to readjust the difficulty of each item. As a result, it was confirmed that the difficulty of many items, previously determined by the correct-answer rate alone, changed. In addition, a group evaluation using the difficulty-adjusted problems confirmed that the average score improved in most groups compared with problems whose difficulty was based only on the percentage of correct answers.
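
The thresholding idea in this abstract (a logistic model over correct-answer rate and solving time, with cut-offs on the predicted probability) might look roughly like the sketch below. Everything numeric here is an assumption made for illustration: the coefficients, the bias, the two thresholds, and the three difficulty labels are hypothetical and are not values from the paper, which fits its model to real exam data.

```python
import math


def predict_hard_probability(correct_rate, mean_seconds,
                             weights=(-4.0, 0.05), bias=1.0):
    """Logistic model giving the probability that an item is 'hard'.

    weights and bias are hypothetical placeholders, not fitted values.
    """
    z = bias + weights[0] * correct_rate + weights[1] * mean_seconds
    return 1.0 / (1.0 + math.exp(-z))


def readjust_difficulty(correct_rate, mean_seconds, low=0.3, high=0.7):
    """Map the predicted probability to a label using two assumed thresholds."""
    p = predict_hard_probability(correct_rate, mean_seconds)
    if p >= high:
        return "hard"
    if p <= low:
        return "easy"
    return "medium"


# Same correct-answer rate, very different solving times (made-up numbers).
quick_item = readjust_difficulty(correct_rate=0.9, mean_seconds=20)
slow_item = readjust_difficulty(correct_rate=0.9, mean_seconds=120)
```

In this toy setup the two items share a correct-answer rate but receive different labels, which is the kind of change the abstract reports when solving time is taken into account.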

Performance Improvement of Context-Sensitive Spelling Error Correction Techniques using Knowledge Graph Embedding of Korean WordNet (alias. KorLex) (한국어 어휘 의미망(alias. KorLex)의 지식 그래프 임베딩을 이용한 문맥의존 철자오류 교정 기법의 성능 향상)

  • Lee, Jung-Hun;Cho, Sanghyun;Kwon, Hyuk-Chul
    • Journal of Korea Multimedia Society / v.25 no.3 / pp.493-501 / 2022
  • This paper studies context-sensitive spelling error correction and uses the Korean WordNet (KorLex)[1], which defines the relationships between words as a graph, to improve the performance of a correction technique[2] based on word embedding vector information. The Korean WordNet was derived from WordNet[3], developed at Princeton University in the United States, and was additionally extended for Korean. To learn from a semantic network in graph form, or to combine it with learned vector information, the graph must be transformed into vector form through embedding learning. For this transformation, a limited number of nodes from the network-form graph are listed in a line, like a sentence, before being used as training input. One of the learning techniques that use this strategy is DeepWalk[4]. DeepWalk is used to learn the graph between words in the Korean WordNet. The graph embedding information is concatenated with the word vector information of the trained language model for correction, and the final correction word is determined by the cosine distance between the vectors. To test whether graph embedding information improves the performance of context-sensitive spelling error correction, confused word pairs were constructed and tested from the perspective of Word Sense Disambiguation (WSD). In the experimental results, the average correction performance over all confused word pairs improved by 2.24% compared with the baseline correction performance.
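
The final selection step in this abstract (concatenating a word vector with a graph embedding and choosing the candidate by cosine distance) can be illustrated with a minimal sketch. The vectors, the candidate words, the way the context vector is obtained, and the function names are all hypothetical; the paper's actual model and dimensions are not given in the abstract.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def choose_correction(context_vec, candidates):
    """Pick the candidate whose concatenated vector is closest to the context.

    candidates maps each word to a (word_vector, graph_vector) pair; the two
    parts are concatenated before the distance is computed.
    """
    best_word, best_dist = None, float("inf")
    for word, (word_vec, graph_vec) in candidates.items():
        combined = list(word_vec) + list(graph_vec)
        dist = cosine_distance(context_vec, combined)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word


# Made-up two-dimensional vectors for a confused word pair.
toy_candidates = {
    "word_a": ([1.0, 0.0], [1.0, 0.0]),
    "word_b": ([0.0, 1.0], [0.0, 1.0]),
}
toy_context = [1.0, 0.0, 1.0, 0.0]
chosen = choose_correction(toy_context, toy_candidates)
```

The context vector must have the same length as the concatenated candidate vectors; how it is built is left open here because the abstract does not say.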

Analysis of Factors Affecting Academic Ability of Preschool-age Children

  • Moon, Kyung-Im
    • Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.205-213 / 2022
  • This study analyzes the relationships among the latent variables of self-development, social development, learning readiness, and academic ability using data from the Panel Study on Korean Children surveyed in 2014, in order to find factors affecting the academic ability of preschool children. The subjects were the 6-year-old children of 1,113 households out of the 2,150 households in the 7th Panel Study on Korean Children (2014) data, after excluding 1,037 households with non-responses or system-missing values. An analysis of the path effects of the research model showed that self-development had a direct effect on academic skills and also had a significant indirect effect mediated by social development and learning readiness. In addition, among self-development, social development, and learning readiness, learning readiness had the greatest influence on academic skills. Accordingly, the academic skills of preschool-age children should be treated with great importance in order to develop them into talents with creativity and problem-solving ability.

Error Analysis of Recent Conversational Agent-based Commercialization Education Platform (최신 대화형 에이전트 기반 상용화 교육 플랫폼 오류 분석)

  • Lee, Seungjun;Park, Chanjun;Seo, Jaehyung;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.3 / pp.11-22 / 2022
  • Recently, research and development using various Artificial Intelligence (AI) technologies has been conducted in the field of education. Within AI in Education (AIEd), conversational agents are not limited by time and space, and they can support more effective learning when combined with AI technologies such as voice recognition and translation. This paper conducted a trend analysis of commercialized applications that have a large number of users and use conversational agents for English learning. The trend analysis showed that currently commercialized educational platforms using conversational agents have several limitations and problems. To analyze these specifically, a comparative experiment was conducted with the latest pre-trained large-capacity dialogue model. Sensibleness and Specificity Average (SSA) human evaluation was conducted to evaluate conversational human-likeness. Based on the experiment, this paper proposes the need for dialogue models trained with large-capacity parameters, educational data, and information retrieval functions for effective English conversation learning.
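
As a rough illustration of the SSA metric named in this abstract, the sketch below averages the share of responses judged sensible and the share judged both sensible and specific. The input format, the function name `ssa`, and the sample labels are hypothetical; the convention that a response counts as specific only if it is also sensible is an assumption of this sketch, and the paper's own annotation protocol is not described in the abstract.

```python
def ssa(labels):
    """Compute the Sensibleness and Specificity Average.

    labels is a list of (sensible, specific) booleans, one pair per response.
    A response is counted as specific only if it was also judged sensible.
    """
    n = len(labels)
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    specificity = sum(1 for sensible, specific in labels if sensible and specific) / n
    return (sensibleness + specificity) / 2


# Made-up human judgments for four responses.
toy_labels = [(True, True), (True, False), (False, False), (True, True)]
toy_score = ssa(toy_labels)
```

Here three of four responses are sensible and two of four are also specific, so `toy_score` is the mean of those two rates.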

Teleworking Survey in Saudi Arabia: Reliability and Validity of Arabic Version of the Questionnaire

  • Heba Yaagoub, AlNujaidi;Mehwish, Hussain;Sama'a H., AlMubarak;Asma Saud, AlFayez;Demah Mansour, AlSalman;Atheer Khalid, AlSaif;Mona M., Al-Juwair
    • Journal of Preventive Medicine and Public Health / v.55 no.6 / pp.578-585 / 2022
  • Objectives: This study aimed to adapt the survey questionnaire designed by Moens et al. (2021) and determine the validity and reliability of the Arabic version of the survey in a sample of the Saudi population experiencing teleworking. Methods: The questionnaire includes 2 sections. The first consists of 13 items measuring the impact of extended telework during the coronavirus disease 2019 (COVID-19) crisis. The second section includes 6 items measuring the impact of the COVID-19 crisis on the self-view of telework and digital meetings. The survey instrument was translated based on the guidelines for the cultural adaptation of self-administered measures. Results: The reliability of the questionnaire responses was measured by Cronbach's alpha. The construct validity was checked through exploratory factor analysis followed by confirmatory factor analysis (CFA) to further assess the factor structure. CFA revealed that the model had excellent fit (root mean square error of approximation, 0.00; comparative fit index, 1.0; Tucker-Lewis index, 1; standardized root mean squared residual, 0.0). Conclusions: The Arabic version of the teleworking questionnaire had high reliability and good validity in assessing experiences and perceptions toward teleworking. While the validated survey examined perceptions and experiences during COVID-19, its use can be extended to capture experiences and perceptions during different crises.
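
The reliability measure named in this abstract, Cronbach's alpha, follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below applies that formula with the standard library; the response data are made up and the function name is hypothetical, so it says nothing about the study's actual results.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Compute Cronbach's alpha.

    responses is a list of respondents, each a list of k item scores.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up data: three respondents, two perfectly consistent items.
toy_responses = [[1, 1], [2, 2], [3, 3]]
toy_alpha = cronbach_alpha(toy_responses)
```

Because the two toy items move together exactly, the result is the maximum value of 1; real questionnaire data would give something lower.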

Electronic Data Interchange Framework for Financial Management System

  • Aldowesh, Nora;Alfaleh, Aljawharah;Alhejazi, Manal;Baghdadi, Heyam;Atta-ur-Rahman, Atta-ur-Rahman
    • International Journal of Computer Science & Network Security / v.22 no.6 / pp.275-287 / 2022
  • In response to the growing expansion of postgraduate studies across the university's faculties, the Deanship of Graduate Studies was established at the university in 1430 AH/2009 CE to address the needs of the current and prospective graduate population and to supervise postgraduate programs in coordination with the concerned faculties. This reflects the university's conviction that providing postgraduate opportunities beyond the bachelor's degree is important for qualifying ambitious young people appropriately. The University offers 72 different graduate programs, awarding doctoral and master's degrees along with fellowships and diplomas in disciplines such as health, engineering, science, literature, and education. Currently, the financial model for admission and student payments is manual and paper based. This paper proposes a user interface for financial management in the Deanship of Graduate Studies. The basic purpose of the system is to minimize human interference and the mistakes it causes, to provide efficient and fast performance, and to perform Electronic Data Interchange (EDI) for tasks such as billing and scheduling details.