• Title/Summary/Keyword: target text

Search results: 233

A Proposal on Data Modification Detection System using SHA-256 in Digital Forensics (디지털 포렌식을 위한 SHA-256 활용 데이터 수정 감지시스템 제안)

  • Jang, Eun-Jin;Shin, Seung-Jung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.4 / pp.9-13 / 2021
  • With the development of communication technology, digital crime is taking increasingly varied forms, and the need for digital forensics is growing. When a text document containing sensitive data is deliberately deleted or modified by a particular person, a system that detects data modification can provide important evidence linking that person to the crime. This paper proposes a data modification detection system for text files that analyzes the SHA-256 hash value (SHA-256 is a cryptographic hash function), file size, file creation date, file modification date, file access date, and similar attributes to determine whether the target text file has been modified.
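
The comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the names `FileFingerprint`, `fingerprint`, and `changed_fields` are hypothetical, and the creation date is left out because `os.stat` does not expose it consistently across platforms.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileFingerprint:
    """Attributes recorded for one file at one point in time (hypothetical type)."""
    sha256: str
    size: int
    modified: float  # last modification time (st_mtime)
    accessed: float  # last access time (st_atime)


def fingerprint(path):
    """Record the metadata first, then hash the file contents with SHA-256."""
    info = os.stat(path)  # taken before reading, since reading may update atime
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return FileFingerprint(digest.hexdigest(), info.st_size,
                           info.st_mtime, info.st_atime)


def changed_fields(before, after):
    """Return the names of the attributes that differ between two fingerprints."""
    names = ("sha256", "size", "modified", "accessed")
    return [name for name in names
            if getattr(before, name) != getattr(after, name)]
```

A differing `sha256` value indicates that the content changed, whereas a change limited to the timestamps only shows that the file was touched or read.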

English-Korean speech translation corpus (EnKoST-C): Construction procedure and evaluation results

  • Jeong-Uk Bang;Joon-Gyu Maeng;Jun Park;Seung Yun;Sang-Hun Kim
    • ETRI Journal / v.45 no.1 / pp.18-27 / 2023
  • We present an English-Korean speech translation corpus, named EnKoST-C. End-to-end model training for speech translation tasks often suffers from a lack of parallel data, such as speech data in the source language and equivalent text data in the target language. Most available public speech translation corpora were developed for European languages, and there is currently no public corpus for English-Korean end-to-end speech translation. Thus, we created EnKoST-C, centered on TED Talks. In this process, we enhanced the sentence alignment approach using subtitle time information and bilingual sentence embedding information. As a result, we built a 559-h English-Korean speech translation corpus. The proposed sentence alignment approach showed excellent performance, with an F-measure of 0.96. We also show the baseline performance of an English-Korean speech translation model trained with EnKoST-C. EnKoST-C is freely available on a Korean government open data hub site.
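
As a rough illustration of how an alignment score such as the F-measure reported here can be computed, the sketch below compares predicted sentence pairs against a reference set. It assumes the balanced F1 variant, which the abstract does not specify; the function name `alignment_f_measure` and the pair representation are hypothetical.

```python
def alignment_f_measure(predicted, reference):
    """Balanced F-measure (F1) over sets of (source_index, target_index) pairs."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(reference)     # share of reference pairs recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three predicted pairs, two of which match the reference.
predicted_pairs = [(0, 0), (1, 1), (2, 3)]
reference_pairs = [(0, 0), (1, 1), (2, 2), (3, 3)]
score = alignment_f_measure(predicted_pairs, reference_pairs)
```

Here precision is 2/3 and recall is 1/2, so the score is 4/7.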

A Comparative Study on the Types and its Importance of Trade Claims between China and the United States: Using Text Mining Techniques (중국과 미국의 무역클레임 유형과 중요도 비교 연구 : 텍스트 마이닝 기법을 활용하여)

  • Cheon Yu;Yun-Seop Hwang
    • Korea Trade Review / v.47 no.3 / pp.177-190 / 2022
  • This study identifies differences in the types and importance of trade claims at the national level. For the analysis data, abstracts of arbitration and court judgments published on the website of the United Nations Commission on International Trade Law were collected. The target countries are China and the United States, with 102 cases from China and 59 cases from the United States. Trade claims are categorized by applying topic modeling techniques to the collected decisions of China and the United States, and the importance of each type is identified using the network centrality index derived through semantic network analysis. The main types of trade claims were the same for both the United States and China: product nonconformity, delivery issues, and payments. However, the order of importance differed: in China it was product nonconformity > delivery issues > payments, whereas in the United States it was payments > product nonconformity > delivery issues. This study is significant in that it presents a strategic trade claim management plan using a quantitative methodology.
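
The abstract does not say which centrality index was used. As one possible illustration, the sketch below computes degree centrality on a term co-occurrence network, where two terms are linked if they appear in the same document. The function name and the sample terms are hypothetical.

```python
from itertools import combinations


def degree_centrality(documents):
    """Degree centrality of each term in a co-occurrence network.

    documents: iterable of term lists; two terms are connected if they
    occur together in at least one document.
    """
    neighbours = {}
    for terms in documents:
        unique = sorted(set(terms))
        for term in unique:
            neighbours.setdefault(term, set())  # keep isolated terms as nodes
        for first, second in combinations(unique, 2):
            neighbours[first].add(second)
            neighbours[second].add(first)
    node_count = len(neighbours)
    if node_count < 2:
        return {term: 0.0 for term in neighbours}
    # Normalize by the largest possible number of neighbours.
    return {term: len(adjacent) / (node_count - 1)
            for term, adjacent in neighbours.items()}


# Hypothetical keyword sets extracted from two case abstracts.
cases = [["payment", "delivery"], ["payment", "nonconformity"]]
centrality = degree_centrality(cases)
```

In this toy network, "payment" is linked to both other terms and therefore has the highest centrality.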

A Multi-level Representation of the Korean Narrative Text Processing and Construction-Integration Theory: Morpho-syntactic and Discourse-Pragmatic Effects of Verb Modality on Topic Continuity (한국어 서사 텍스트 처리의 다중 표상과 구성 통합 이론: 주제어 연속성에 대한 양태 어미의 형태 통사적, 담화 화용적 기능)

  • Cho Sook-Whan;Kim Say-Young
    • Korean Journal of Cognitive Science / v.17 no.2 / pp.103-118 / 2006
  • This paper investigates the effects of discourse topic and morpho-syntactic verbal information on the resolution of null pronouns in Korean narrative text within the framework of the construction-integration theory (Kintsch, 1988; Singer & Kintsch, 2001; Graesser, Gernsbacher, & Goldman, 2003). Two conditions were designed: an explicit condition with both a consistently maintained discourse topic and person-specific verb modals, and a neutral condition with neither a discourse topic nor morpho-syntactic information. We measured the reading times for the target sentence containing a null pronoun, the question response times for finding an antecedent, and the accuracy rates for finding an antecedent. During the experiments, the passages were presented one at a time on a computer-controlled display, and each new sentence appeared on the screen when the participant pressed a button on the computer keyboard. The main findings indicate that pronoun interpretation is facilitated by macro-structure (topicality) in conjunction with micro-structure (morpho-syntax). We speculate that global processing alone may not be able to determine which potential antecedent is to be focused on unless aided by lexical information. We argue that the results largely support the resonance-based model but not the minimalist hypothesis.

Literature-Based Instruction for Fashion Design Class: Using Alice's Adventures in Wonderland by Lewis Carroll (영문학을 활용한 의상디자인 전공을 위한 영어교육: Alice's Adventures in Wonderland by Lewis Carroll 을 활용한 학습 모형)

  • Kim, Minjung
    • Fashion & Textile Research Journal / v.20 no.3 / pp.287-292 / 2018
  • The present study proposes a model for literature-based instruction within the context of a fashion design curriculum at a Korean university. The fashion design market continues to grow and now requires more culture-bound strategies and efficient communication skills. Literature provides authentic resources and is highly relevant to the development of students' cultural awareness as well as language awareness. Alice's Adventures in Wonderland, written by Lewis Carroll, contains various cultural contexts of the Victorian era. The text provides explicit knowledge of the era through both its illustrations and its satirical language. This study instructs students to analyze and interpret the text and illustrations so that they can engage critically and analytically in reading, increasing their cultural awareness and language awareness. The integration of literature and fashion design gives students an opportunity to explore language choice and acquire refined knowledge of the target culture. Along with the text, the illustrations stimulate students' imaginative and creative thinking skills. Students can reflect on and discuss many issues related to Victorian values, which reinforces creative thinking. In the final stage, students design fashion inspired by Victorian values and present their own designs using the language they have acquired. This eventually leads students to adapt to new ideas in the fashion market and become competent communicators in the fashion world.

Study on the Improvement of Extraction Performance for Domain Knowledge based Wrapper Generation (도메인 지식 기반 랩퍼 생성의 추출 성능 향상에 관한 연구)

  • Jeong Chang-Hoo;Choi Yun-Soo;Seo Jeong-Hyeon;Yoon Hwa-Mook
    • Journal of Internet Computing and Services / v.7 no.4 / pp.67-77 / 2006
  • Wrappers play an important role in extracting specified information from various sources. The wrapper rules by which information is extracted are often created from domain-specific knowledge. Domain-specific knowledge helps recognize the meaning of text representing various entities and values and helps detect their formats. However, such domain knowledge becomes powerless when value-representing data are not labeled with appropriate textual descriptions, or when there is nothing but a hyperlink where certain text labels or values are expected. To alleviate these problems, we propose a probabilistic method for recognizing the entity type, i.e., generating wrapper rules, when there is no label associated with value-representing text. In addition, we have devised a method for using the information reachable by following hyperlinks when textual data are not immediately available on the target web page. Our experimental work shows that the proposed methods help increase the precision of the resulting wrapper, particularly when extracting the title information, the most important entity on a web page. The proposed methods can be useful in making a more efficient and accurate information extraction system for various sources of information without user intervention.

Convergence Study of Relation between Job Stress and Self-efficacy of Nurses (간호사의 직무 스트레스와 자기효능감 관련 연구에 대한 융합적 고찰)

  • Moon, Heakyung;Jung, Miran;Noh, Wonjung
    • Journal of Convergence for Information Technology / v.9 no.3 / pp.146-151 / 2019
  • This study was performed to identify the relationship between job stress and self-efficacy based on a review of related research and a text network analysis. For the literature review, we searched three domestic databases and one foreign database using the keywords 'nurse', 'stress', and 'self-efficacy'. A total of 18 papers were selected as the target literature. Nine of these studies reported a statistically significant negative correlation between nurses' job stress and self-efficacy. It was difficult to compare results across studies because the choice of questionnaire varied between them. In addition, a text network analysis was conducted by extracting keywords from the 18 papers. The most frequent keyword was job stress, and other frequently appearing words were self-efficacy, hospital, and correlation. To clarify the relationship between the keywords, we propose surveying the influencing factors through the development of a Korean-version measurement instrument.

A DOM-Based Fuzzing Method for Analyzing Seogwang Document Processing System in North Korea (북한 서광문서처리체계 분석을 위한 Document Object Model(DOM) 기반 퍼징 기법)

  • Park, Chanju;Kang, Dongsu
    • KIPS Transactions on Computer and Communication Systems / v.8 no.5 / pp.119-126 / 2019
  • Typical software developed and used by North Korea includes Red Star and internal application software. However, most existing research on North Korean software covers only software installation methods and general analysis of execution screens. File fuzzing is a typical method for identifying security vulnerabilities in software. In this paper, we use file fuzzing to analyze the security vulnerabilities of the software used in North Korea's Seogwang Document Processing System. We propose analyzing the open document text (ODT) files produced by the Seogwang Document Processing System, extracting nodes based on the Document Object Model (DOM) to determine test targets, and generating mutated files through insertion and substitution. This approach increases the number of crashes detected in the same testing time.
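
The node-level mutation step can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' tool: it works on a plain XML string rather than on the `content.xml` inside an ODT archive, picks the target node at random, and the function name `mutate_document` and the sample markup are hypothetical.

```python
import copy
import random
import xml.etree.ElementTree as ET


def mutate_document(xml_text, rng):
    """Apply one node-level mutation and return (operation, mutated_xml)."""
    root = ET.fromstring(xml_text)
    target = rng.choice(list(root.iter()))  # pick a test target among the DOM nodes
    if rng.random() < 0.5:
        # Insertion: nest a deep copy of the node inside itself.
        target.append(copy.deepcopy(target))
        operation = "insert"
    else:
        # Substitution: replace the node's text with an oversized string.
        target.text = "A" * 4096
        operation = "substitute"
    return operation, ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for the XML body of a document.
sample = "<document><body><p>hello</p><p>world</p></body></document>"
operation, mutated = mutate_document(sample, random.Random(0))
```

Because the mutation is applied to the parsed tree, the output stays well-formed XML, so the test exercises the application's document handling rather than only its XML parser.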

Study on Participants' Perceptions of Sharing Economy Policies: A Text Mining Approach to Online Community Posts (공유경제 참여자의 비즈니스 등록정책에 대한 인식과 심적기재: 온라인 발화에 대한 텍스트마이닝)

  • Park, Soo Kyung
    • Journal of Digital Convergence / v.20 no.2 / pp.47-56 / 2022
  • With the advent of online platforms, individuals have been able to trade small resources, such as a room, in the market. However, because these economic activities are not clearly regulated, various side effects have emerged. Accordingly, the government reestablished related policies to resolve the unintended consequences of these activities. However, the policy has not yet been implemented, and many participants do not comply with it. This study therefore examines their perceptions in detail. For this purpose, a text mining technique was applied to posts and comments collected from major online communities. By applying the topic modeling technique, 5 topics were derived. Because compliance with the government's policy is a voluntary decision, an in-depth understanding of the policy target is necessary. Based on this study, methods to induce participants to conform to the policy can be discussed in detail in the future.

Developing the Automated Sentiment Learning Algorithm to Build the Korean Sentiment Lexicon for Finance (재무분야 감성사전 구축을 위한 자동화된 감성학습 알고리즘 개발)

  • Su-Ji Cho;Ki-Kwang Lee;Cheol-Won Yang
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.1 / pp.32-41 / 2023
  • With the recent development of big data analysis technology, many studies have sought to extract sentiment from text and verify its informational value in the field of finance. A number of prior studies use pre-defined sentiment dictionaries or machine learning methods to extract sentiment from financial documents. However, both methods have the disadvantage of being labor-intensive and subjective because they require a manual sentiment learning process. In this study, we developed a financial sentiment dictionary that automatically extracts sentiment from the body text of analyst reports using a modified Bayes rule, and we verified the performance of the model through a binary classification model that predicts actual stock price movements. The prediction results show that the proposed financial dictionary has about 4% better predictive power for actual stock price movements than the representative Loughran and McDonald (2011) financial dictionary. The sentiment extraction method proposed in this study enables efficient and objective judgment because it automatically learns the sentiment of words using both the change in target price and the cumulative abnormal returns. In addition, the dictionary can be easily updated by re-calculating conditional probabilities. The results of this study are expected to be readily extendable and applicable not only to analyst reports but also to other financial texts such as performance reports, IR reports, press articles, and social media.
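
The abstract does not spell out the modified Bayes rule, so the sketch below substitutes a plain smoothed conditional probability: the share of reports containing a word that were followed by a positive outcome. The function name, the add-one smoothing, and the sample reports are assumptions made for illustration only.

```python
from collections import Counter


def word_sentiment(reports, smoothing=1.0):
    """Estimate P(positive outcome | word appears in report) for every word.

    reports: iterable of (words, was_positive) pairs.
    smoothing: additive smoothing (an assumption, not taken from the paper).
    """
    positive_counts = Counter()
    total_counts = Counter()
    for words, was_positive in reports:
        for word in set(words):  # count each word at most once per report
            total_counts[word] += 1
            if was_positive:
                positive_counts[word] += 1
    return {
        word: (positive_counts[word] + smoothing)
        / (total_counts[word] + 2 * smoothing)
        for word in total_counts
    }


# Hypothetical labeled reports: (tokens, whether the outcome was positive).
sample_reports = [
    (["upgrade", "growth"], True),
    (["upgrade"], True),
    (["downgrade"], False),
]
lexicon = word_sentiment(sample_reports)
```

Updating such a lexicon with new reports only requires adding to the counts and recomputing the ratios, which mirrors the easy-update property mentioned in the abstract.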