• Title/Summary/Keyword: Text data

Simple Image Stenography Technology for Large Scale Text (대용량 텍스트를 위한 손실 없는 영상 은닉기술)

  • Rhee, Keun-Moo
    • Annual Conference of KIPS / 2008.05a / pp.1104-1107 / 2008
  • Image and document hiding (steganography) techniques are being studied and used for a variety of purposes across all types of digital data, including document images and audio. This study implements a simple hiding technique that can deliver large volumes of text requiring only low-level security. The text image was first encoded into a 24-bit-depth color image and then restored; analysis of the result using a correlation technique confirmed that the loss ratio of the text image was slight.
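
The abstract above describes hiding text in a 24-bit color image and restoring it with little or no loss, but it does not name the embedding scheme. As a minimal sketch, assuming least-significant-bit (LSB) embedding (an assumption, not necessarily the author's method), the idea could look like the following. The function names `embed` and `extract`, the 32-bit length header, and the synthetic cover buffer are all hypothetical.

```python
# Hypothetical sketch: LSB embedding is an assumption, not the paper's stated method.

def embed(pixels: bytes, message: bytes) -> bytearray:
    """Hide `message` in the lowest bit of each channel byte of an RGB buffer."""
    payload = len(message).to_bytes(4, "big") + message  # 32-bit length header
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return stego


def extract(stego: bytes) -> bytes:
    """Recover the hidden message from a buffer produced by `embed`."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


cover = bytes(range(256)) * 3            # stand-in for 256 RGB pixels (768 channel bytes)
secret = "텍스트 data".encode("utf-8")   # sample payload
stego = embed(cover, secret)
recovered = extract(stego)
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

In this sketch the message is recovered exactly, and each channel value of the cover changes by at most one level, which is why LSB schemes are usually visually unobtrusive in 24-bit images.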

Analysis of Prevention Methods by Type of Construction Disaster Using Text Mining Techniques (텍스트마이닝을 활용한 건설현장 재해 유형별 예방 대책 분석)

  • Gyu Pil Jo;Myungdo Lee;Yoon-seok Shin;Baek-Joong Kim
    • Journal of the Society of Disaster Information / v.20 no.1 / pp.13-19 / 2024
  • Purpose: This study provides prevention methods by type of construction disaster using text mining techniques. Method: Based on the database that analyzed the cases of critical disasters in the domestic construction sector, preventive measures and causes are analyzed by text mining techniques, and the contents of the analysis are visually shown. Result: This visual data represents the measures for preventing critical disasters of each process according to the importance. Conclusion: It is believed that the results will be helpful in identifying factors to be considered in preparing preventive measures for serious accidents in construction.

Comparison of Cognitive Loads between Koreans and Foreigners in the Reading Process

  • Im, Jung Nam;Min, Seung Nam;Cho, Sung Moon
    • Journal of the Ergonomics Society of Korea / v.35 no.4 / pp.293-305 / 2016
  • Objective: This study aims to measure cognitive load by analyzing the EEG of Koreans and foreigners while they carefully read Korean texts selected by level in terms of grammar and vocabulary, and to compare the cognitive load levels quantitatively. The results can serve as basic data for a more scientific approach when Korean texts or books are developed and when an evaluation method is built for foreigners who encounter such texts for learning or assignments. Background: As of 2014, the number of foreign students studying in Korea was 84,801, and the number increases annually. Most of them are from Asia and come to Korea to enter a university or graduate school. Because these students aim to study at Korean universities, they receive Korean language education from the time they prepare to study in Korea. To enter a university in Korea, they must attain grade 4 or higher in the Test of Proficiency in Korean (TOPIK) or complete a designated educational program at a university-affiliated language institution. In such programs, learners receive Korean education based on texts, except in the speaking domain, and text comprehension can determine their academic achievement after they enter their desired schools (Jeon, 2004). However, many foreigners who finish a short-term language course and must begin university study cannot keep up with university classes that require expertise, given only the vocabulary and grammar learned during the language course. Therefore, reading education centered on strategies for understanding university textbooks, which are top-level reading texts for foreigners, is necessary (Kim and Shin, 2015). 
This study carried out an experiment from the perspective that quantitative data on readers, who are the main agents of reading education, and on teaching materials need to be secured to support the need for reading education for learners preparing for university study and to approach educational design scientifically. Namely, this study assessed reading difficulty by measuring the cognitive load shown during the reading of each text, dividing the difficulty of the teaching material (book) into eight levels and the readers into Koreans and foreigners. Method: To identify the cognitive load that arises when Koreans and foreigners carefully read Korean texts, this study recruited 16 participants (eight Koreans and eight foreigners). The foreigners were limited to language course students taking the intermediate-level Korean course at university-affiliated language institutions in the Seoul Metropolitan Area. To identify cognitive load as participants read a text selected by level from the Korean books (difficulty: eight levels) published by King Sejong Institute (Sejonghakdang.org), EEG sensors were attached over the frontal lobe (Fz) and occipital lobe (Oz). After the experiment, a questionnaire survey was carried out to measure subjective evaluation and to identify comprehension and the perceived difficulty of grammar and words. To examine the effect of schema, which may affect text comprehension, this study controlled the Korean texts and measured EEG and subjective satisfaction. Results: To identify the brain's cognitive load, the beta band was extracted. As a result, interactions (Fz: p =0.48; Oz: p =0.00) between reader group (Koreans and foreigners) and text difficulty were revealed. The cognitive loads of Koreans, whose mother tongue is Korean, were lower in reading Korean texts than those of the foreigners, and the foreigners' cognitive loads rose gradually with the difficulty of the texts. 
Starting from text four, which is of intermediate difficulty, remarkable differences between the Koreans and foreigners began to appear relative to the beginner-level texts. In the subjective evaluation, an interaction between reader group and text difficulty was revealed (p =0.00), and satisfaction was lower as the text became more difficult. Conclusion: When readers had background knowledge, namely when a schema was formed, comprehension of and satisfaction with the texts were higher, even when the texts included vocabulary and grammar above the readers' level. For texts whose grammar was rated as difficult in the subjective evaluation, foreigners' cognitive loads were also high, showing that the load rises in proportion to the increase in difficulty. This means that the grammar factor functions as a stress factor in foreigners' reading comprehension. Application: This study quantitatively evaluated the cognitive loads of Koreans and foreigners through EEG, by reader group and text difficulty, when they read Korean texts. The results can be used for developing Korean teaching materials or Korean education content and for topic selection for foreigners. If the research scope is expanded to the reading process using an eye-tracker, reading education programs and evaluation methods for foreigners can be developed on the basis of quantitative values.

Text Mining Techniques for Adaptable Learning (적응적인 학습을 위한 텍스트 마이닝 기술)

  • Kim, Cheon-Shik;Jung, Myung-Hee;Hong, You-Sik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.3 / pp.31-39 / 2008
  • Many technologies have been developed to improve study ability using e-learning systems. In most e-learning systems, learners study through lecture materials and practice problems. Study ability and motivation, however, can be improved through shared materials and discussion. In this case, learning materials are shared through learners' discussions on Internet boards and MSN. Because such data was not classified by the learners, it was not easy for them to search for related valuable information, and it therefore did not help learning. Most text mining technologies extract summary data from a collection of documents or classify complex documents into groups of similar documents. In this paper, we implemented an e-learning system to improve learners' abilities and, in particular, applied text mining technology to classify learning materials to help learners.

Glycosyl glycerides from the stems of 'Baekma' cultivar of Chrysanthemum morifolium (국화 '백마'(Chrysanthemum morifolium) 줄기로부터 glycosyl glyceride 의 분리 및 동정)

  • Oh, Hyun-Ji;Kim, Hyoung-Geun;Pak, Ha-Seung;Baek, Yun-Su;Kwon, Oh-Keun;Shin, Hak-Ki;Baek, Nam-In
    • Journal of Applied Biological Chemistry / v.61 no.2 / pp.131-134 / 2018
  • The stems of Chrysanthemum morifolium 'Baekma' were repeatedly extracted with 80% aqueous MeOH, and the concentrates were partitioned into ethyl acetate (EtOAc), n-butyl alcohol, and H₂O fractions. Repeated silica gel and octadecyl silica gel column chromatography of the EtOAc fraction led to the isolation of two glycosyl glycerides. The chemical structures of the compounds were determined as (2S)-1-O-β-D-galactopyranosyl-2,3-dilinoleoylglycerol (1) and (2S)-1-O-β-D-galactopyranosyl-2,3-dipalmitoylglycerol (2) based on spectroscopic data analyses including nuclear magnetic resonance, mass spectrometry, infrared spectrometry, and gas chromatography-mass spectrometry.

Chatting Pattern Based Game BOT Detection: Do They Talk Like Us?

  • Kang, Ah Reum;Kim, Huy Kang;Woo, Jiyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2866-2879 / 2012
  • Among the various security threats in online games, the use of game bots is the most serious problem. Previous studies on game bot detection have proposed many methods to identify behaviors that discriminate bots from humans, based on the fact that a bot's playing pattern differs from that of a human. In this paper, we look at the chatting data that reflects gamers' communication patterns and propose a communication pattern analysis framework for online game bot detection. In massively multiplayer online role-playing games (MMORPGs), game bots use chat messages differently from normal users. We derive four features: a network feature, a descriptive feature, a diversity feature and a text feature. To measure the diversity of communication patterns, we propose lightly summarized indices, which are computationally inexpensive and intuitive. For text features, we derive lexical, syntactic and semantic features from chatting contents using text mining techniques. To build the learning model for game bot detection, we test and compare three classification models: the random forest, logistic regression and lazy learning. We apply the proposed framework to AION operated by NCsoft, a leading online game company in Korea. In our experiments, the random forest outperformed logistic regression and lazy learning. The model that employs the entire feature set gives the highest performance, with a precision of 0.893 and a recall of 0.965.
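
The abstract above reports precision and recall separately. As a small worked example, the two values can be combined into an F1 score (the harmonic mean of precision and recall). The F1 value is derived here for illustration and is not reported in the abstract; the function name `f1_score` is a hypothetical helper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.893, 0.965    # values reported in the abstract
f1 = f1_score(precision, recall)    # derived here; not reported in the abstract
print(f"F1 = {f1:.3f}")             # prints "F1 = 0.928"
```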

A Study on the Improvement of Retrieval Efficiency Based on the CRFMD (공통기술표현포맷에 기반한 다매체자료의 검색효율 향상에 관한 연구)

  • Park, Il-Jong;Jeong, Ki-Tai
    • Journal of the Korean Society for information Management / v.23 no.3 s.61 / pp.5-21 / 2006
  • In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems and have progressed quickly with the rapid progress in data processing speeds. This study proposes a common representation format for multimedia documents (CRFMD) composed of both images and text to form a single data structure. It also shows that image classification of a given test set is dramatically improved when text features are encoded together with image features. CRFMD might be applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval.

Improvement OCR Algorithm for Efficient Book Catalog Retrieval Technology (효과적인 도서목록 검색을 위한 개선된 OCR알고리즘에 관한 연구)

  • HeWen, HeWen;Baek, Young-Hyun;Moon, Sung-Ryong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.152-159 / 2010
  • Existing character recognition algorithms recognize characters only under simple conditions. Their recognition rates often drop drastically when the input document image has low quality, rotated text, or text in various fonts or sizes, because of external noise or data loss. This paper proposes an optical character recognition algorithm that uses bicubic interpolation for catalog retrieval when the input image contains rotated or blurred text in various fonts and sizes. The proposed algorithm consists of a detection part and a recognition part. The detection part applies the Roberts and Hausdorff distance algorithms to correctly detect the book catalog. The recognition part applies bicubic interpolation to compensate for data loss due to low quality and text of various fonts and sizes. Next, rotation is applied to the bicubic interpolation result to correct slant. Experimental results show that the proposed method improves the recognition rate by 6%, with a search time of 1.077 s.
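
The detection stage described above names two standard building blocks: the Roberts operator and the Hausdorff distance. The following is a minimal sketch of both in plain Python, assuming grey-level images stored as nested lists and point sets stored as lists of `(x, y)` tuples. How the paper combines them is not stated in the abstract, so the toy image, the threshold of zero, and the `template`/`shifted` point sets are hypothetical.

```python
import math


def roberts_cross(img):
    """Gradient magnitude via the Roberts cross operator on a 2-D list of grey levels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (w - 1) for _ in range(h - 1)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x] - img[y + 1][x + 1]   # first diagonal difference
            gy = img[y][x + 1] - img[y + 1][x]   # second diagonal difference
            out[y][x] = math.hypot(gx, gy)
    return out


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)
    return max(directed(a, b), directed(b, a))


# Toy image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
edges = roberts_cross(image)
edge_points = [(x, y) for y, row in enumerate(edges) for x, v in enumerate(row) if v > 0]

template = [(1, 0), (1, 1)]   # hypothetical template that matches the edge exactly
shifted = [(2, 0), (2, 1)]    # the same template shifted one pixel to the right
```

Here `hausdorff(edge_points, template)` is zero for the exact match and grows as the template moves away from the detected edge, which is the property that makes the distance usable for locating a known shape.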

An Analysis of Online Black Market: Using Data Mining and Social Network Analysis (온라인 해킹 불법 시장 분석: 데이터 마이닝과 소셜 네트워크 분석 활용)

  • Kim, Minsu;Kim, Hee-Woong
    • The Journal of Information Systems / v.29 no.2 / pp.221-242 / 2020
  • Purpose: This study collects and analyzes data from the recently active online black market to present a specific method for preparing for hacking attacks. It aims to protect individuals and businesses from cyber attacks, including hacking, by closely analyzing hacking methods and tools in a situation where they are easily shared. Design/methodology/approach: To prepare for hacking attacks arising through the online black market, this study uses routine activity theory to identify the opportunity factors of a hacking attack. Based on this, text mining and social network techniques are applied to reveal the most dangerous areas of security. Suitable targets in routine activity theory are identified through text mining, and motivated offenders through social network analysis. Lastly, the absence of guardians and the areas where guardians are required are extracted using both analysis techniques together. Findings: Text mining showed that there was a large supply of hacking gift cards and that the demand to attack sites such as Amazon and Netflix was very high. In addition, accounts and combos were in high demand and supply. Social network analysis identified users who actively share hacking information and tools. When these two analyses were synthesized, it was found that specialized managers are required in the proxy and maker areas, many managers are required for the buyer network, and skilled managers are required for the seller network.

An Improved Automatic Text Summarization Based on Lexical Chaining Using Semantical Word Relatedness (단어 간 의미적 연관성을 고려한 어휘 체인 기반의 개선된 자동 문서요약 방법)

  • Cha, Jun Seok;Kim, Jeong In;Kim, Jung Min
    • Smart Media Journal / v.6 no.1 / pp.22-29 / 2017
  • With the recent rapid advancement and spread of smart devices, document data on the Internet is increasing sharply. The growth of information on the Web, including a massive number of documents, makes it increasingly difficult for users to understand the data. Various studies on automatic summarization are under way to summarize documents efficiently. This study uses the TextRank algorithm for that purpose. TextRank represents sentences or keywords as a graph and determines the importance of sentences by using its vertices and edges to capture semantic relations between words and sentences. It extracts high-ranking keywords and, based on those keywords, extracts important sentences. To extract important sentences, the algorithm first groups vocabulary using a specific weighting scale. The program selects sentences with higher scores on the weight scale and, based on the selected sentences, extracts important sentences to summarize the document. The study confirmed that this process performs better than the summarization methods of previous research and that the algorithm can summarize documents more efficiently.
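
The graph-ranking step described above can be sketched as follows. This is a plain sentence-level TextRank using word-overlap similarity and a fixed number of power iterations; it does not implement the lexical chaining or semantic word relatedness that the paper adds. The function names, the damping factor of 0.85, the iteration count, and the sample sentences are assumptions chosen for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lower-cased set of alphabetic words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def similarity(a, b):
    """Word overlap normalised by the log lengths of both sentences."""
    denom = math.log(len(a)) + math.log(len(b)) if a and b else 0.0
    return len(a & b) / denom if denom > 0 else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    words = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[similarity(words[i], words[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence receives a share of its neighbours' scores,
        # proportional to the edge weight between them.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=1):
    """Return the k top-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


sentences = [
    "Text mining extracts keywords from documents.",
    "TextRank ranks sentences and keywords in documents with a graph.",
    "Graph ranking scores sentences by links to other sentences.",
    "The weather was pleasant yesterday.",
]
scores = textrank(sentences)
summary = summarize(sentences, k=1)
```

In this toy input the second sentence shares words with both the first and the third, so it collects the highest score, while the unrelated fourth sentence keeps only the baseline score of `1 - damping`.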