• Title/Summary/Keyword: MS-Word

Search Result 74, Processing Time 0.026 seconds

Efficient Hangul Word Processor (HWP) Malware Detection Using Semi-Supervised Learning with Augmented Data Utility Valuation (효율적인 HWP 악성코드 탐지를 위한 데이터 유용성 검증 및 확보 기반 준지도학습 기법)

  • JinHyuk Son;Gihyuk Ko;Ho-Mook Cho;Young-Kuk Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.1 / pp.71-82 / 2024
  • With the advancement of information and communication technology (ICT), the use of electronic document types such as PDF, MS Office, and HWP files has increased. This trend has led cyber attackers to increasingly spread malicious documents through e-mail and messengers. To counter such attacks, AI-based methodologies have been actively employed to detect malicious document files. The main challenge in detecting malicious HWP (Hangul Word Processor) files is the lack of quality datasets, because HWP usage is largely limited to Korea, whereas PDF and MS Office files are widely used worldwide. To address this limitation, data augmentation has been proposed to diversify training data by transforming existing datasets, but because the usefulness of the augmented data is not evaluated, the augmented data can end up harming the model's performance. In this paper, we propose an effective semi-supervised learning technique for detecting malicious HWP document files, which improves overall AI model performance by quantifying the utility of augmented data and filtering out useless training data.
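
The idea of quantifying the utility of augmented data and filtering out harmful samples can be illustrated with a small Python sketch. It is not the authors' method: it assumes a toy 1-nearest-neighbour classifier and defines utility as the change in validation accuracy when a single augmented sample is added to the training set. The names `predict`, `accuracy`, and `filter_augmented` are hypothetical, and the semi-supervised part of the paper is not covered.

```python
def predict(train, x):
    """Toy 1-nearest-neighbour classifier (an assumption, not the paper's model)."""
    nearest = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
    )
    return nearest[1]


def accuracy(train, validation):
    """Fraction of validation samples classified correctly."""
    correct = sum(1 for x, y in validation if predict(train, x) == y)
    return correct / len(validation)


def filter_augmented(train, augmented, validation):
    """Keep augmented samples whose marginal utility is non-negative.

    Utility here is the validation accuracy with the sample added
    minus the baseline accuracy without it.
    """
    baseline = accuracy(train, validation)
    kept = []
    for sample in augmented:
        utility = accuracy(train + [sample], validation) - baseline
        if utility >= 0:
            kept.append(sample)
    return kept


# Tiny illustrative data: (feature vector, label)
train = [((0.0,), 0), ((1.0,), 1)]
validation = [((0.2,), 0), ((0.4,), 0), ((0.8,), 1)]
helpful = ((0.9,), 1)   # consistent with the validation labels
harmful = ((0.3,), 1)   # mislabeled, lowers validation accuracy
kept = filter_augmented(train, [helpful, harmful], validation)
```

In this toy setup only the `helpful` sample survives the filter, because adding `harmful` causes a validation point to be misclassified.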

A Research of Anomaly Detection Method in MS Office Document (MS 오피스 문서 파일 내 비정상 요소 탐지 기법 연구)

  • Cho, Sung Hye;Lee, Sang Jin
    • KIPS Transactions on Computer and Communication Systems / v.6 no.2 / pp.87-94 / 2017
  • Microsoft Office is an office suite of applications developed by Microsoft. Recently, users with malicious intent have customized Office files as containers for malware because MS Office is the most commonly used word processing program. To attack target systems, many malicious Office files use a variety of techniques, such as macro functions and hiding shellcode inside unused areas. Two techniques are usually used to detect this kind of malware: signature-based detection and sandboxing. However, both have limits because of the increasing complexity of malware. Therefore, this paper proposes methods to detect malicious MS Office files from a computer forensics perspective. For this purpose, we checked macros and potentially problematic areas through structural analysis of the MS Office file.
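
One structural check of the kind described above can be sketched in Python. The sketch is not the authors' method. It assumes the newer OOXML container (for example .docm or .xlsm), which is a ZIP archive in which VBA macros are typically stored in a part named `vbaProject.bin`. Legacy binary files such as .doc use a different container and are not covered. The function names `find_macro_parts` and `build_sample` are hypothetical.

```python
import io
import zipfile


def find_macro_parts(data: bytes) -> list[str]:
    """Return names of archive members that look like VBA macro storage.

    Assumption: the input is an OOXML file (a ZIP archive). Anything
    that is not a ZIP archive yields an empty list.
    """
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return []
    with zipfile.ZipFile(buffer) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith("vbaproject.bin")
        ]


def build_sample(with_macro: bool) -> bytes:
    """Build a tiny in-memory archive that mimics an OOXML layout."""
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        if with_macro:
            # Placeholder bytes only; a real macro part is an OLE stream.
            archive.writestr("word/vbaProject.bin", b"placeholder")
    return out.getvalue()
```

The presence of a macro part does not make a file malicious; a check like this only flags files that deserve closer inspection.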

The final stop consonant perception in typically developing children aged 4 to 6 years and adults (4-6세 정상발달아동 및 성인의 종성파열음 지각력 비교)

  • Byeon, Kyeongeun;Ha, Seunghee
    • Phonetics and Speech Sciences / v.7 no.1 / pp.57-65 / 2015
  • This study aimed to identify the developmental pattern of final stop consonant perception using the gating task. Sixty-four subjects participated in the study: 16 children aged 4 years, 16 children aged 5 years, 17 children aged 6 years, and 15 adults. One-syllable words with consonant-vowel-consonant (CVC) structure, mok̚-mot̚ and pap̚-pat̚, were used as stimuli. To remove the redundancy of acoustic cues in the stimulus words, 40 ms (-40ms) and 60 ms (-60ms) were deleted from the entire duration of the final consonant. Three conditions (the whole word segment, -40ms, -60ms) were used for this speech perception experiment. In total, 48 tokens (4 stimuli × 3 conditions × 4 trials) were presented to participants. The results indicated that 5- and 6-year-olds showed final consonant perception similar to that of adults for the stimuli pap̚-pat̚, whereas only the 6-year-olds showed adult-like perception for the stimuli mok̚-mot̚. The results suggest that younger typically developing children require more acoustic information than older children and adults to perceive final consonants accurately. Final consonant perception ability may become adult-like around 6 years of age. The study provides fundamental data on the developmental pattern of speech perception in typically developing children, which can be compared with data from children with communication disorders.

Design and Implementation of Metadata Management System for Serious Game (기능성 게임을 위한 메타데이터 관리 시스템의 설계 및 구현)

  • Yoon, Sun-Jung;Park, Hee-Sook
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.4 / pp.893-900 / 2010
  • Serious games are currently one of the fields attracting the most interest, and the serious game industry is growing rapidly in scale every year. Because effective management of serious game metadata is an important issue, we propose the design of an integrated management system for this metadata. The system provides an Internet-based service. Users of serious games (general users, developers, experts, and metadata managers) can use the proposed system to work effectively, for example by entering new metadata, searching existing metadata, and saving search results as generated document files (HTML, XML, Excel, MS-Word) on their own local computers.

Document Summarization using Topic Phrase Extraction and Query-based Summarization (주제어구 추출과 질의어 기반 요약을 이용한 문서 요약)

  • 한광록;오삼권;임기욱
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.488-497 / 2004
  • This paper describes a hybrid document summarization method that combines indicative summarization and query-based summarization. Learning models are built from training documents in order to extract topic phrases. We use Naive Bayes, Decision Tree, and Support Vector Machine as the machine learning algorithms. Based on these models, the system automatically extracts topic phrases from a new document and outputs a summary of the document using query-based summarization, which treats the extracted topic phrases as queries and calculates the locality-based similarity of each topic phrase. We examine how the topic phrases affect the summarization and how many phrases are appropriate for summarization. We then evaluate the extracted summary by comparing it with a manual summary, and we also compare our summarization system with the summarization method of MS-Word.
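
The step of treating extracted topic phrases as queries can be sketched in Python. The sketch swaps in a plain term-overlap score in place of the locality-based similarity named in the abstract, which is not specified here, and it takes the topic phrases as given instead of learning them. The names `tokenize` and `summarize` are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(sentences: list[str], topic_phrases: list[str], k: int = 2) -> list[str]:
    """Pick the k sentences that best match the topic phrases.

    Score (an assumption): fraction of a sentence's tokens that also
    occur in any topic phrase. Selected sentences keep document order.
    """
    query_terms = {term for phrase in topic_phrases for term in tokenize(phrase)}
    scored = []
    for index, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        if not tokens:
            continue
        overlap = sum(1 for token in tokens if token in query_terms)
        scored.append((overlap / len(tokens), index))
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return [sentences[index] for _, index in sorted(top, key=lambda item: item[1])]


document = [
    "Malware hides in office macros.",
    "The weather was nice.",
    "Macros can download malware payloads.",
    "We thank the reviewers.",
]
summary = summarize(document, ["office macros", "malware"], k=2)
```

With these inputs the first and third sentences are selected, since they are the only ones that share terms with the topic phrases.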

Design and Implementation of Input and Output System for Unstructured Big Data (비정형 대용량 데이터 입력 및 출력 시스템 설계 및 구현)

  • Kim, Chang-Su;Shim, Kyu-Chul;Kang, Byoung-Jun;Kim, Kyung-Hwan;Jung, Hoe-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.2 / pp.387-393 / 2014
  • In recent years, computers have become increasingly widespread, and efficient processing of unstructured big data is required. In this paper, we propose a system that quickly extracts data typed in a word processor: the data entered in an office file (HWP, MS Office) is converted to XML, and the user then creates an XML mapping file. In addition, we propose a system that can look up the necessary data from a database using a form entered in advance and convert word processor documents to office files through an application program. This makes the unstructured big data available for use.

Perception of English High Vowels by Korean Speakers of English

  • Lee, Ji-Yeon
    • Phonetics and Speech Sciences / v.1 no.4 / pp.39-46 / 2009
  • This study compares the perception of English high tense and lax vowels (/i, ɪ, u, ʊ/) by English speakers and Korean speakers of English. The four vowels were produced in /hVd/ context by a native speaker of English, and each word's vowel duration was manipulated to range from 170 ms to 290 ms in 30 ms increments. Two English speakers and six Korean speakers of English were asked to listen to pairs of tense and lax vowel words with manipulated vowel durations and to identify the pair by choosing either heed-hid or hid-heed for front vowels and either who'd-hood or hood-who'd for back vowels. The results show that English speakers distinguished tense vowels from lax vowels with 100% accuracy regardless of the different durations, compared to 62% accuracy for Korean speakers of English. Most errors occurred for lengthened lax vowels and shortened tense vowels. The results of this study demonstrate that Korean speakers mainly rely on vowel duration as a cue to discriminate the tense and lax vowels. The theoretical and pedagogical implications of this finding are discussed.

ShareIt: An Application Sharing System using Window Capturing and Multicast under Heterogeneous Window Systems

  • Jung, Jin-H.;Park, Hyun, J.;Yang, Hyun-S.
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06b / pp.99-104 / 1998
  • Application sharing is the ability to use existing applications, such as Excel or MS-Word, during a group session without modification. In this paper, we present the design and implementation of an application sharing system called ShareIt, which enables users to share arbitrary MS-Windows applications under Win 3.1/95/NT and the X Window System, along with an evaluation of the system's performance. To share an application, the image of the application window is captured and transmitted to other sites. By using this window capturing method, ShareIt allows any MS-Windows application to be shared regardless of both the window system and version upgrades of the window system.

Document Summarization Using Mutual Recommendation with LSA and Sense Analysis (LSA를 이용한 문장 상호 추천과 문장 성향 분석을 통한 문서 요약)

  • Lee, Dong-Wook;Baek, Seo-Hyeon;Park, Min-Ji;Park, Jin-Hee;Jung, Hye-Wuk;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.5 / pp.656-662 / 2012
  • In this paper, we describe a new summarization method based on graph-based and sense-based analysis. In the graph-based analysis, we convert the sentences in a document into word vectors and calculate the similarity between sentences using LSA. We use this sentence similarity and the rarity scores of the words in each sentence to define the edge weights of the graph. In the sense-based analysis, to determine whether the sense of a word is subjective or objective, we built a database extended from a gold standard using WordNet. We calculate the subjectivity of sentences from the senses of their words and select the more subjective sentences. Lastly, we combine the results of these two methods. We evaluate the performance of the proposed method using classification games, which are commonly used to measure the performance of summarization methods. We compare our method with MS-Word auto-summarization and verify the effectiveness of our method.

Does Cloned Template Text Compromise the Information Integrity of a Paper, and is it a New Form of Text Plagiarism?

  • Jaime A. Teixeira da Silva
    • International Journal of Knowledge Content Development & Technology / v.13 no.2 / pp.23-35 / 2023
  • Word templates exist for select journals, and their primary objective is to facilitate submissions to those journals, thereby optimizing editors' and publishers' time and resources by ensuring that the desired style (e.g., of sections, references, etc.) is followed. However, if multiple unrelated authors use the exact same template, a risk exists that some text might be erroneously cloned if template-based papers are not carefully screened by authors, journal editors or proof copyeditors. Elsevier Procedia® was used as an example. Select cloned text, presumably derived from MS Word templates used for submissions to Elsevier Procedia® journals, was assessed using ScienceDirect. Typically, in academic publishing, identical text is screened using text similarity software during the submission process, and if detected, may be flagged as plagiarism. After searching for "heading should be left justified, bold, with the first letter capitalized", 44 Elsevier Procedia® papers were found to be positive for vestigial template text. The integrity of the information in these papers has been compromised, so these errors should be corrected with an erratum, or in the case of extensive errors and vast tracts (e.g., pages long) of template text, papers should be retracted and republished.
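
A local analogue of the phrase search described above can be sketched in Python. The study itself searched ScienceDirect; this sketch instead assumes the paper text is already available as a string. It normalizes whitespace and case so that a template sentence broken across lines is still found. The names `normalize` and `contains_template_text` are hypothetical; the phrase is the one quoted in the abstract.

```python
import re

# Vestigial template sentence quoted in the abstract.
TEMPLATE_PHRASE = (
    "heading should be left justified, bold, with the first letter capitalized"
)


def normalize(text: str) -> str:
    """Collapse runs of whitespace to one space and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_template_text(paper_text: str, phrase: str = TEMPLATE_PHRASE) -> bool:
    """Return True if the normalized phrase occurs in the normalized text."""
    return normalize(phrase) in normalize(paper_text)


sample_paper = (
    "2. Methods\n"
    "Heading should be left justified, bold,\n"
    "with the first letter capitalized and numbered.\n"
    "We collected data from 30 sites."
)
clean_paper = "2. Methods\nWe collected data from 30 sites."
```

Exact matching of this kind misses paraphrased or partially edited template text, so a hit is a prompt for manual screening, not a verdict.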