• Title/Summary/Keyword: text research (텍스트 연구)

Search Result 3,492, Processing Time 0.033 seconds

TF-IDF Based Association Rule Analysis System for Medical Data (의료 정보 추출을 위한 TF-IDF 기반의 연관규칙 분석 시스템)

  • Park, Hosik;Lee, Minsu;Hwang, Sungjin;Oh, Sangyoon
    • KIPS Transactions on Software and Data Engineering / v.5 no.3 / pp.145-154 / 2016
  • With recent interest in u-Health and advances in IT, the need to make use of medical information data has increased. Among earlier studies that apply data mining algorithms to medical information data, some use association rule analysis. These studies aim to discover associations between symptoms and specific diseases; however, in most cases they do not consider infrequent terms that may carry important information for diagnosis. In this paper, we propose a new association rule mining system that uses TF-IDF weights to account for the importance of each term, so that infrequent but important items are considered. In addition, the proposed system can predict candidate diagnoses from medical text records using term similarity analysis based on a medical ontology.
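The abstract does not state which TF-IDF variant the system uses. As a rough illustration only, the sketch below assumes the textbook form, term frequency times log(N / document frequency), and applies it to hypothetical toy symptom records (not data from the paper). It shows why a rare term can outweigh a term that occurs in every record.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: (count / document length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy records; each record is a list of symptom terms.
records = [
    ["fever", "cough", "fatigue"],
    ["fever", "cough", "headache"],
    ["fever", "rash", "fatigue"],
]
weights = tf_idf(records)

# "fever" occurs in every record, so its weight is zero;
# "rash" occurs in only one record, so it gets the largest weight there.
print(weights[2])
```

Under this assumed weighting, a term that appears in every record contributes nothing, while a term that appears in a single record is ranked highest, which matches the abstract's goal of keeping infrequent but important items.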

UA Tree-based Reduction of Speech DB in a Large Corpus-based Korean TTS (대용량 한국어 TTS의 결정트리기반 음성 DB 감축 방안)

  • Lee, Jung-Chul
    • Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.91-98 / 2010
  • Large corpus-based concatenative Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. Because improving the naturalness, personality, speaking style, and emotion of synthetic speech requires a larger speech DB, it becomes necessary to prune redundant speech segments from a large speech segment DB. In this paper, we propose a new method, based on a clustering algorithm, for constructing a downsized segmental speech DB for a Korean TTS system. For the performance test, synthetic speech was generated using a Korean TTS system consisting of a language processing module, prosody processing module, segment selection module, speech concatenation module, and segmental speech DB. A MOS test was then conducted on sets of synthetic speech generated with 4 different segmental speech DBs, constructed by combining the CM1 (or CM2) tree clustering method with the full DB (or reduced DB). Experimental results show that the proposed method can reduce the size of the speech DB by 23% while achieving a high MOS in the perception test. The proposed method can therefore be applied to build a small-sized TTS.

Cloud Messaging Service for Preventing Smishing Attack (스미싱 공격 방지를 위한 클라우드 메시징 서비스)

  • Park, Hyo-Min;Kim, Wan-Seok;Kang, So-Jeong;Shin, Sang Uk
    • Journal of Digital Convergence / v.15 no.4 / pp.285-293 / 2017
  • Malicious attacks on smart devices are evolving rapidly, and protecting smart devices from these attacks in a timely manner has become a very important issue. In particular, the smishing attack has emerged as one of the most important threats on smartphones. In this paper, we propose a cloud service that can fundamentally protect the user from the risk of smishing attacks. The proposed scheme provides a cloud messaging service that filters text messages containing URLs on the user's smart device and lets the user view and manage them through a virtual machine provided by a cloud server. Existing techniques for preventing smishing attacks protect only against malicious code with a known pattern, and they are subject to errors such as FP (False Positive) and FN (False Negative). In contrast, because the proposed method automatically filters all text messages containing URLs and stores, views, and manages them in the user's own storage space on the cloud server, it can completely block the installation of malware (malicious code) on the user's smart device through smishing attacks.

Improvement of Endoscopic Image using De-Interlacing Technique (De-Interlace 기법을 이용한 내시경 영상의 화질 개선)

  • 신동익;조민수;허수진
    • Journal of Biomedical Engineering Research / v.19 no.5 / pp.469-476 / 1998
  • When medical images such as ultrasonography and endoscopy are acquired and displayed on the VGA monitor of a PC system, a tear-drop image degradation appears as a result of scan conversion. In this study, we compare several methods for resolving this degradation and implement a hardware system that resolves it in real time on a PC. With a dedicated de-interlacing device and a PCI bridge, our hardware system achieves high-quality image display together with real-time processing and acquisition, and image quality is improved remarkably. Because it is implemented as a PC-based system, acquiring and saving images, attaching text comments to those images, and PACS networking can be implemented easily.
  • ... metabolism. All images were spatially normalized to the MNI standard PET template and smoothed with a 16 mm FWHM Gaussian kernel using SPM96. The mean count in the cerebral region was normalized. The VOIs for 34 cerebral regions were previously defined on the standard template, and 17 different counts of regions mirrored about the hemispheric midline were extracted from the spatially normalized images. A three-layer feed-forward error back-propagation neural network classifier with 7 input nodes and 3 output nodes was used. The network was trained to interpret metabolic patterns and produce diagnoses identical to those of expert viewers. The performance of the neural network was optimized by testing with 5~40 nodes in the hidden layer. Forty randomly selected images from each group were used to train the network, and the remainder were used to test the trained network. The optimized neural network gave a maximum agreement rate of 80.3% with expert viewers; it used 20 hidden nodes and was trained for 1508 epochs. The neural network also gave agreement rates of 75~80% with 10 or 30 nodes in the hidden layer. We conclude that the artificial neural network performed as well as human experts and could potentially be useful as a clinical decision support tool for the localization of epileptogenic zones.

Improved SIM Algorithm for Contents-based Image Retrieval (내용 기반 이미지 검색을 위한 개선된 SIM 방법)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.49-59 / 2009
  • Contents-based image retrieval methods are in general more objective and effective than text-based image retrieval algorithms, since they search by color and texture and avoid having to annotate every image. SIM (Self-organizing Image browsing Map) is a contents-based image retrieval algorithm that uses only the browsable mapping results obtained by a SOM (Self Organizing Map). However, a SOM may select the wrong BMU (best matching unit) in the learning phase if there are similar nodes whose color information is distorted by light intensity or by the movement of objects in the image. Such images may be mapped to other grouping nodes, which can lower the search rate. In this paper, we propose an improved SIM that uses the HSV color model with color quantization to extract image features. To avoid the learning error described above, our SOM consists of two layers. In the learning phase, SOM layer 1 takes the color feature vectors as input. After SOM layer 1 has been trained, its connection weights become the input of SOM layer 2, and learning is repeated. This multi-layered SOM learning avoids mapping errors among similar nodes with different color information. In search, we feed the query image vector into SOM layer 2 and select the nodes of SOM layer 1 that connect with the chosen BMU of SOM layer 2. In experiments, we verified that the proposed SIM performs better than the original SIM and avoids mapping errors effectively.
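The search step described above (query into layer 2, then back to the connected layer-1 nodes) can be sketched roughly as follows. This is an illustrative assumption, not the paper's implementation: the weight vectors are hypothetical, training is omitted, a layer-1 node is treated as "connected" to a layer-2 node when that node is its best matching unit, and plain Euclidean distance is used even though hue is circular in HSV.

```python
import math


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def layer1_candidates(layer1, layer2, query):
    """Find the layer-2 BMU for the query, then return the layer-1 nodes
    whose own weight vectors map to that same layer-2 node."""
    target = best_matching_unit(layer2, query)
    return [
        i for i, w in enumerate(layer1)
        if best_matching_unit(layer2, w) == target
    ]


# Hypothetical, already-trained weights over quantized (H, S, V) features.
layer1 = [
    [0.10, 0.90, 0.80],
    [0.15, 0.85, 0.80],  # similar color under slightly different lighting
    [0.60, 0.50, 0.40],
]
layer2 = [
    [0.10, 0.90, 0.80],
    [0.60, 0.50, 0.40],
]

query = [0.12, 0.88, 0.80]
candidates = layer1_candidates(layer1, layer2, query)
print(candidates)
```

In this toy setup the two similar layer-1 nodes fall under the same layer-2 node, so a query near either one retrieves both, which is the grouping effect the two-layer design aims for.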

A Strategy To Reduce Network Traffic Using Two-layered Cache Servers for Continuous Media Data on the Wide Area Network (이중 캐쉬 서버를 사용한 실시간 데이터의 광대역 네트워크 대역폭 감소 정책)

  • Park, Yong-Woon;Beak, Kun-Hyo;Chung, Ki-Dong
    • The Transactions of the Korea Information Processing Society / v.7 no.10 / pp.3262-3271 / 2000
  • Continuous media objects, due to their large volume and the real-time constraints on their delivery, are likely to consume much network bandwidth. Generally, proxy servers are used to hold frequently requested objects so as to reduce network traffic to the central server, but most of them are designed for text and image data and so do not work well with continuous media data. In this paper, we therefore propose a two-layered network cache management policy for continuous media object delivery on wide area networks. Under the proposed cache management scheme, each LAN has one LAN cache and is further divided into a group of sub-LANs, each of which also has its own sub-LAN cache. Further, each object is partitioned into two parts, the front-end and rear-end partitions, which can be loaded in the same cache or separately in different network caches according to their access frequencies. By doing so, cache replacement overhead can be reduced compared with allocating and replacing full-size data, which eventually reduces the backbone network traffic to the origin server.

A Study on Design of Annotation Database for Visible Human (인체영상 어노테이션 DB 설계에 관한 연구)

  • Ahn, bu-young;Lee, seung-bock;Han, Geon;Lee, sang-ho
    • Proceedings of the Korea Contents Association Conference / 2008.05a / pp.819-822 / 2008
  • As IT and computer network technology develops rapidly, the quantity of digital content is increasing and it is being disseminated more widely. Digital content is generally expressed in 2- or 3-dimensional multimedia formats, and visible human images taken from the human body are very important because of their wide range of uses. KISTI (Korea Institute of Science and Technology Information) is now constructing various kinds of Korean human information such as visible Korean, digital Korean, human bone property, and human models. This information is accessible through the internet. However, these human images are not easy for general users to understand, because they are specialized for the medical imaging field and come with no detailed explanatory data. In this study, we designed an annotation database and search interface for KISTI's visible Korean database. The annotation database contains detailed explanations and special notes on the visible Korean data, and it can link the image and text data of visible Korean with each other. With this database and interface design, KISTI's visible Korean database becomes more accessible and understandable to general users, which can promote wider use of visible Korean data.

Knowledge based Text to Facial Sequence Image System for Interaction of Lecturer and Learner in Cyber Universities (가상대학에서 교수자와 학습자간 상호작용을 위한 지식기반형 문자-얼굴동영상 변환 시스템)

  • Kim, Hyoung-Geun;Park, Chul-Ha
    • The KIPS Transactions:PartB / v.15B no.3 / pp.179-188 / 2008
  • This paper studies a knowledge-based text to facial sequence image system for interaction between lecturer and learner in cyber universities. The system synthesizes a facial sequence image whose lips are synchronized with the text information, based on the grammatical characteristics of Hangul. To implement the system, we propose a method for transforming the text information into phoneme codes, deformation rules for the mouth shape, which changes according to the phoneme code, and a method for synthesizing the facial sequence image using these mouth shape deformation rules. In the proposed method, all Hangul syllables are represented by 10 principal mouth shapes and 78 compound mouth shapes, according to the pronunciation characteristics of the basic consonants and vowels and the characteristics of the articulation rules, respectively. To synthesize facial sequence images in real time on a PC, a database storing the 88 mouth shapes is used instead of synthesizing the mouth shape in each frame. To verify the validity of the proposed method, various facial sequence images were synthesized from text information, and a system that runs on a PC was implemented using the proposed method.

A Comparative Analysis of the Changes in Perception of the Fourth Industrial Revolution: Focusing on Analyzing Social Media Data (4차 산업혁명에 대한 인식 변화 비교 분석: 소셜 미디어 데이터 분석을 중심으로)

  • You, Jae Eun;Choi, Jong Woo
    • KIPS Transactions on Software and Data Engineering / v.9 no.11 / pp.367-376 / 2020
  • The fourth industrial revolution will contribute greatly to the entry of objects into an intelligent society through technologies such as big data and artificial intelligence. Through the revolution, it has become possible to understand human behavior and awareness, and artificial intelligence has established itself as a key tool in various fields such as medicine and science. However, the fourth industrial revolution has a negative side as well as a positive future. In this study, an analysis was conducted using text mining techniques on unstructured big data collected through social media. We examined keywords related to the fourth industrial revolution by year (2016, 2017, and 2018) and interpreted the meaning of each keyword. In addition, we examined how these keywords changed from year to year and used R to conduct a keyword analysis, using the keyword flow to identify how perceptions closely related to the fourth industrial revolution shifted. Finally, people's perceptions of the fourth industrial revolution were identified by looking at the positive and negative sentiment related to it by year. The analysis showed that negative opinions declined year after year, with a more positive outlook on the future.

The Image of Ruralism in Korea through a Text Mining for Online News Media analysis (인터넷 뉴스 데이터 텍스트 분석을 통해 본 우리나라 농촌다움에 대한 이미지 연구)

  • Son, Yong-hoon;Kim, Young-jin
    • Journal of Korean Society of Rural Planning / v.25 no.4 / pp.13-26 / 2019
  • Rural areas in South Korea have changed rapidly in the process of national land development. Rural landscapes have become discoloured, and their attractiveness has decreased as cities have expanded. Yet the attractiveness and multifunctional values of rural areas have become more important in contemporary society around the world. In line with this social demand, conserving the rural landscape is a high priority, and the recovery of ruralism in these areas is required. This study sought to understand how the public image of ruralism in South Korea has been influenced by the news media. News articles were retrieved through a web search portal using six keywords commonly used to refer to ruralism: 'rural landscape', 'rural community', 'rural tourism', 'rural life', 'rural amenity', and 'rural environment'. News data for each of the six keywords were collected for the year-periods 2004-05, 2007-08, 2012-13, and 2016-17. In the text mining analysis, the nouns with high Degree Centrality were identified, and the changes by year-period were examined. Then, LDA topic analysis was performed on the text datasets of the six keywords. As a result, the study found that the news articles focused on only a handful of issues, such as 'poor rural living condition', 'regional or village improvement projects', 'rural tourism promotion projects', and 'other government support projects'. On the other hand, nouns related to virtues and values of the rural landscape appeared less often in news articles. These tendencies have become more apparent in recent years. In the topic analysis, 35 topics were identified. 'Village development projects', 'rural tourism', and 'urban-rural exchange projects' appeared repeatedly across several keywords. Among the topics, there are also some closely related to ruralism, such as 'rural landscape conservation', 'eco-friendly rural areas', 'local amenity resources', 'public interest values of agriculture', and 'rural life and communities'. The study presented an image map showing ruralism in South Korea using a network map between all topics and keywords. At the end of the study, implications for Korean rural area policy and research directions were discussed.
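The abstract does not say how the noun network was built. As a minimal sketch, the code below assumes two nouns are linked when they co-occur in the same article and computes normalized degree centrality (number of neighbors divided by the number of other nodes). The articles are hypothetical toy data, and noun extraction is assumed to have been done already.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(docs):
    """Normalized degree centrality on a noun co-occurrence network.

    Assumption: two nouns are neighbors if they appear in the same document.
    """
    neighbors = defaultdict(set)
    for nouns in docs:
        unique = sorted(set(nouns))
        for noun in unique:
            neighbors[noun]  # register nouns that have no co-occurrences
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy articles, each reduced to its extracted nouns.
articles = [
    ["village", "project", "tourism"],
    ["village", "project"],
    ["landscape", "village"],
]
centrality = degree_centrality(articles)

# Nouns ranked from most to least central.
ranking = sorted(centrality, key=centrality.get, reverse=True)
print(ranking[0], centrality[ranking[0]])
```

Running the same computation separately for each year-period and comparing the rankings would give the kind of change-over-time view the abstract describes.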