• Title/Summary/Keyword: morphological information preprocessing

Ultrasonographic Analysis of the Size and Shape of the Muscles (근육의 크기와 형태의 초음파적 분석)

  • Kim, Kwang-Baek
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.2 / pp.9-15 / 2011
  • In this paper, we propose a method to extract the external oblique muscle from abdominal ultrasound images, which previous methods often miss because of image distortion. In the preprocessing phase, we remove noise from the initial ultrasound images and then emphasize the brightness contrast with the Ends-in search stretching algorithm. We then apply average binarization in the vertical direction to extract candidate fascia areas. After areas other than fascia are removed using morphological characteristics, the parts of the fascia lost during this process are restored using the same characteristic information together with location information. The skin area is then removed using information from the arc that appears in convex scanning, and the candidate muscle areas are extracted by overlapping the two results of a two-way up-down search algorithm. A further noise removal step determines the muscle area. When the result is obscure, the muscle area is restored by a smearing method, and the thickness of the muscle is then measured by the least squares method. The experiment verifies that the proposed method is more effective than previously used methods for analyzing the size and shape of abdominal muscles in ultrasonography.
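
The contrast step named in the abstract above can be illustrated with a minimal sketch of ends-in search stretching. The abstract does not say how the two cut-off gray levels are chosen, so the percentile parameters `low_pct` and `high_pct` here are assumptions, as is the list-of-rows image layout.

```python
def ends_in_stretch(image, low_pct=0.05, high_pct=0.05):
    """Contrast-stretch an 8-bit grayscale image given as a list of rows.

    Assumption: the cut-offs are taken as percentiles of the gray levels;
    the paper may choose them differently.
    """
    pixels = sorted(p for row in image for p in row)
    n = len(pixels)
    # Gray levels at or below `low` saturate to 0, at or above `high` to 255.
    low = pixels[min(n - 1, int(n * low_pct))]
    high = pixels[min(n - 1, int(n * (1.0 - high_pct)))]
    if high <= low:
        # Flat image: there is no range to stretch, return a copy.
        return [row[:] for row in image]

    def stretch(p):
        if p <= low:
            return 0
        if p >= high:
            return 255
        # Linear mapping of (low, high) onto (0, 255).
        return round(255 * (p - low) / (high - low))

    return [[stretch(p) for p in row] for row in image]
```

With both percentiles set to 0, the darkest and brightest pixels become the cut-offs and the function reduces to plain min-max stretching.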

Recognizing Unknown Words and Correcting Spelling errors as Preprocessing for Korean Information Processing System (한국어 정보처리 시스템의 전처리를 위한 미등록어 추정 및 철자 오류의 자동 교정)

  • Park, Bong-Rae;Rim, Hae-Chang
    • The Transactions of the Korea Information Processing Society / v.5 no.10 / pp.2591-2599 / 1998
  • In this paper, we propose a method of recognizing unknown words and correcting spelling errors (including spacing errors) to increase the performance of Korean information processing systems. Unknown words are recognized through comparative analysis of two or more morphologically similar eojeols (spacing units in Korean) that include the same unknown word candidates. Spacing errors and spelling errors are corrected by using lexicalized rules, which are automatically extracted from a very large raw corpus. The extraction of the lexicalized rules is based on morphological and contextual similarities between error eojeols and their correction eojeols, which are confirmed to be used in the corpus. The experimental results show that our system can recognize unknown words with an accuracy of 98.9%, and can correct spacing errors and spelling errors with accuracies of 98.1% and 97.1%, respectively.

Premature Contraction Arrhythmia Classification through ECG Pattern Analysis and Template Threshold (ECG 패턴 분석과 템플릿 문턱값을 통한 조기수축 부정맥분류)

  • Cho, Ik-sung;Cho, Young-Chang;Kwon, Hyeog-soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.2 / pp.437-444 / 2016
  • Most methods for detecting arrhythmia require the PP interval and the diversity of P wave morphology, but the P wave signal is difficult to detect because of various types of noise. It is therefore necessary to use the noise-free R wave. In this paper, we propose an algorithm for premature contraction arrhythmia classification through ECG pattern analysis and a template threshold. For this purpose, we detect the R wave through a preprocessing method that uses a morphological filter and a subtractive operation method. We also developed an algorithm that classifies the premature contraction wave pattern using a weighted average, and that classifies premature ventricular contraction (PVC) and atrial premature contraction (APC) through a template threshold for the R wave amplitude. The performance of R wave detection and PVC classification is evaluated using 6 records of the MIT-BIH arrhythmia database that include over 30 PVCs and APCs. The achieved scores indicate an average of 99.77% in R wave detection and rates of 94.91% and 95.76% in PVC and APC classification, respectively.
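
As a rough illustration of R wave detection with a morphological filter followed by a subtraction, the sketch below estimates the baseline with a 1-D opening and subtracts it from the signal (a top-hat style operation). This is one plausible reading, not necessarily the authors' filter; the window `width`, the `threshold`, and the local-maximum rule are all assumptions.

```python
def erode(signal, width):
    """1-D grayscale erosion: sliding-window minimum, window clipped at the ends."""
    half, n = width // 2, len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(signal, width):
    """1-D grayscale dilation: sliding-window maximum, window clipped at the ends."""
    half, n = width // 2, len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def detect_r_peaks(signal, width=5, threshold=0.5):
    """Return indices of narrow peaks that rise above the estimated baseline."""
    # Opening (erosion then dilation) removes peaks narrower than the window,
    # which leaves an estimate of the slowly varying baseline.
    baseline = dilate(erode(signal, width), width)
    # Subtractive step: what remains is the narrow, peak-like content.
    residual = [s - b for s, b in zip(signal, baseline)]
    peaks = []
    for i in range(1, len(residual) - 1):
        is_local_max = residual[i] >= residual[i - 1] and residual[i] > residual[i + 1]
        if residual[i] > threshold and is_local_max:
            peaks.append(i)
    return peaks
```

On a real ECG the window would be chosen from the sampling rate so that it is wider than a QRS complex but narrower than baseline wander; the defaults here only suit a toy signal.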

Crack Detection on the Road in Aerial Image using Mask R-CNN (Mask R-CNN을 이용한 항공 영상에서의 도로 균열 검출)

  • Lee, Min Hye;Nam, Kwang Woo;Lee, Chang Woo
    • Journal of Korea Society of Industrial Information Systems / v.24 no.3 / pp.23-29 / 2019
  • Conventional crack detection methods consume a lot of labor, time, and cost. To solve these problems, an automatic detection system is needed that detects cracks in images obtained from vehicles or UAVs (unmanned aerial vehicles). In this paper, we study road crack detection using unmanned aerial photographs. The aerial images go through preprocessing and labeling to generate data sets of morphological information on cracks. The generated data set was applied to the Mask R-CNN model to obtain a new model that has learned various crack information. Experimental results show that the proposed method detected cracks in aerial images with an accuracy of 73.5%, and that some of them were predicted in a certain type of crack region.

Detection of Forest Areas using Airborne LIDAR Data (항공 라이다데이터를 이용한 산림영역 탐지)

  • Hwang, Se-Ran;Kim, Seong-Joon;Lee, Im-Pyeong
    • Spatial Information Research / v.18 no.3 / pp.23-32 / 2010
  • LIDAR data are useful for forest applications such as bare-earth DEM generation for forest areas and estimation of tree height and forest biomass. As a core preprocessing procedure for most forest applications, this study attempts to develop an efficient method to detect forest areas from LIDAR data. First, we suggest three perceptual cues, based on multiple return characteristics, height deviation, and spatial distribution, which are expected to be reliable for forest area detection from LIDAR data. We then classify the potential forest areas based on each individual cue and refine them with a bi-morphological process to eliminate falsely detected areas and smooth the boundaries. The final refined forest areas were compared with reference data generated manually from an aerial image. All the methods based on the three types of cues show an accuracy of more than 90%. In particular, the method based on multiple returns is slightly better than the other two in terms of simplicity and accuracy. It is also shown that combining the individual results from each cue can enhance the classification accuracy.
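
The refinement step can be pictured as a binary opening followed by a closing on the classified forest mask. The abstract does not define its "bi-morphological process", so the pairing of opening and closing, the 3x3 structuring element, and the function names below are assumptions made for illustration.

```python
def _apply_3x3(mask, op):
    """Apply `op` (min or max) over each pixel's 3x3 window, clipped at the border."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [mask[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = op(window)
    return out


def binary_erode(mask):
    return _apply_3x3(mask, min)


def binary_dilate(mask):
    return _apply_3x3(mask, max)


def refine_mask(mask):
    """Opening followed by closing on a 0/1 mask."""
    # Opening drops isolated detections smaller than the 3x3 element.
    opened = binary_dilate(binary_erode(mask))
    # Closing fills small gaps and smooths the region boundary.
    return binary_erode(binary_dilate(opened))
```

Because the window is clipped at the image border, regions touching the border are not eroded from the outside; padding with zeros would be the alternative if that behavior is unwanted.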

Pattern Classification of Chromosome Images using the Image Reconstruction Method (영상 재구성방법을 이용한 염색체 영상의 패턴 분류)

  • 김충석;남재현;장용훈
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.4 / pp.839-844 / 2003
  • To improve classification accuracy, we propose an algorithm for chromosome image reconstruction in the image preprocessing stage. We also propose a pattern classification method that uses a hierarchical multilayer neural network (HMNN) to classify the chromosome karyotype. The image reconstruction algorithm was used to reconstruct the chromosome images of twenty normal humans. Four morphological and ten density feature parameters were extracted from the 920 reconstructed chromosome images. The combined feature parameters of ten human chromosome images were used to train the HMNN, and the rest were used to classify the chromosome images. In the experiments, an optimized HMNN was constructed and achieved a recognition ratio of about 98.26%.

Text Region Detection Method in Mobile Phone Video (휴대전화 동영상에서의 문자 영역 검출 방법)

  • Lee, Hoon-Jae;Sull, Sang-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.192-198 / 2010
  • With the popularization of mobile phones with built-in cameras, there has been a lot of effort to provide useful information to users by detecting and recognizing the text in video captured by the phone camera, so there is a need to detect the text regions in such mobile phone video. In this paper, we propose a method to detect the text regions in mobile phone video. We employ a morphological operation as preprocessing and obtain a binarized image using modified k-means clustering. Candidate text regions are then obtained by applying connected component analysis and general text characteristic analysis. In addition, we increase the precision of text detection by examining the frequency of the candidate regions. Experimental results show that the proposed method detects the text regions in mobile phone video with high precision and recall.
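
Connected component analysis on the binarized frame can be sketched as a breadth-first flood fill that returns one bounding box per region. The 4-connectivity and the `text_like` aspect-ratio filter are assumptions; the abstract does not state which text characteristics the authors test.

```python
from collections import deque


def connected_components(binary):
    """Return (top, left, bottom, right) boxes of 4-connected foreground regions."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Flood-fill one region, growing its bounding box as we go.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def text_like(box, min_aspect=1.5):
    """Hypothetical text heuristic: keep boxes that are clearly wider than tall."""
    top, left, bottom, right = box
    return (right - left + 1) / (bottom - top + 1) >= min_aspect
```

A candidate list would then be `[b for b in connected_components(frame) if text_like(b)]`, with further filters added as needed.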

What Concerns Does ChatGPT Raise for Us?: An Analysis Centered on CTM (Correlated Topic Modeling) of YouTube Video News Comments (ChatGPT는 우리에게 어떤 우려를 초래하는가?: 유튜브 영상 뉴스 댓글의 CTM(Correlated Topic Modeling) 분석을 중심으로)

  • Song, Minho;Lee, Soobum
    • Informatization Policy / v.31 no.1 / pp.3-31 / 2024
  • This study aimed to examine the public concerns in South Korea, considering the country's unique context, that were triggered by the advent of generative artificial intelligence such as ChatGPT. To this end, comments on 102 YouTube news videos related to ethical issues were collected using a Python scraper, and morphological analysis and preprocessing were carried out on 15,735 comments using Textom. These comments were then analyzed using a Correlated Topic Model (CTM). The analysis identified six primary topics within the comments: "Legal and Ethical Considerations"; "Intellectual Property and Technology"; "Technological Advancement and the Future of Humanity"; "Potential of AI in Information Processing"; "Emotional Intelligence and Ethical Regulations in AI"; and "Human Imitation." Structuring these topics based on a correlation coefficient value of over 10% revealed three main categories: "Legal and Ethical Considerations"; "Issues Related to Data Generation by ChatGPT (Intellectual Property and Technology, Potential of AI in Information Processing, and Human Imitation)"; and "Fear for the Future of Humanity (Technological Advancement and the Future of Humanity, and Emotional Intelligence and Ethical Regulations in AI)." The study confirmed the coexistence of various concerns along with the growing interest in generative AI like ChatGPT, including worries specific to the historical and social context of South Korea. These findings suggest the need for national-level efforts to ensure data fairness.

An Effective Microcalcification Detection in Digitized Mammograms Using Morphological Analysis and Multi-stage Neural Network (디지털 마모그램에서 형태적 분석과 다단 신경 회로망을 이용한 효율적인 미소석회질 검출)

  • Shin, Jin-Wook;Yoon, Sook;Park, Dong-Sun
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.3C / pp.374-386 / 2004
  • The mammogram gives radiologists a way to observe the detailed internal organization of breasts for early detection. This paper focuses on efficiently detecting the regions of interest (ROIs) of microcalcifications. Breast cancers can arise from either microcalcifications or masses. Microcalcifications appear in a digital mammogram as tiny dots that have slightly higher gray levels than their surrounding pixels. We can roughly determine the areas that possibly contain microcalcifications. In general, it is very challenging to find all the microcalcifications in a digital mammogram, because they are similar to some tissue parts of the breast. To efficiently detect microcalcification ROIs, we used four sequential processes: preprocessing for breast area detection, modified multilevel thresholding, ROI selection using simple thresholding filters, and final ROI selection with two stages of neural networks. The filtering process with boundary conditions removes easily distinguishable tissues while keeping all microcalcifications, so that it cleans the thresholded mammogram images and speeds up the later processing by an average of 86%. The first neural network shows an average recognition rate of 96.66%. The second neural network performs better, with an average recognition rate of 98.26%. By removing all tissues while keeping as many microcalcifications as possible, the subsequent parts of a CAD system for detecting breast cancers can become much simpler.
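
For orientation, plain multilevel thresholding maps each gray level to a class index according to a sorted list of thresholds. The sketch below shows only this textbook form, not the authors' modified variant, and the threshold values used with it are arbitrary illustration values.

```python
from bisect import bisect_right


def multilevel_threshold(image, thresholds):
    """Map each pixel to a class index 0..len(thresholds).

    A pixel equal to a threshold falls into the upper class.
    """
    levels = sorted(thresholds)
    return [[bisect_right(levels, p) for p in row] for row in image]


def brightest_class_mask(image, thresholds):
    """Binary mask of pixels in the top class, e.g. bright candidate dots."""
    top = len(thresholds)
    classes = multilevel_threshold(image, thresholds)
    return [[1 if k == top else 0 for k in row] for row in classes]
```

Since microcalcifications are described as slightly brighter than their surroundings, the top class is the natural place to look for candidates, which later filters would then prune.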

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology to extract answer information about queries from various types of unstructured documents collected from multiple sources on the web in order to expand a knowledge base. The proposed methodology is divided into the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news sources for "subject-predicate" separated queries and classify the proper documents. 2) Determine whether each sentence is suitable for extracting information and derive the confidence. 3) Based on the predicate feature, extract the information in the proper sentence and derive the overall confidence of the information extraction result. In order to evaluate the performance of the information extraction system, we selected 400 queries from the artificial intelligence speaker of SK-Telecom. The evaluation confirmed that the proposed model shows a higher performance index than the baseline model. The contribution of this study is that we develop a sequence tagging model based on bi-directional LSTM-CRF using the predicate feature of the query; with this, we developed a robust model that can maintain high recall performance even on various types of unstructured documents collected from multiple sources. The problem of information extraction for knowledge base expansion should take into account the heterogeneous characteristics of source-specific document types. The proposed methodology proved to extract information effectively from various types of unstructured documents compared to the baseline model. Previous research has the limitation that performance is poor when extracting information from a document type that is different from the training data.
In addition, this study can prevent unnecessary information extraction attempts on documents that do not include the answer information, by predicting the suitability of documents and sentences for information extraction before the information extraction step. It is meaningful that we provide a method by which precision can be maintained even in an actual web environment. The information extraction problem for knowledge base expansion cannot guarantee that a document includes the correct answer, because it targets unstructured documents on the real web. When question answering is performed on the real web, previous machine reading comprehension studies have the limitation of a low level of precision, because they frequently attempt to extract an answer even from a document that contains no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain the performance of information extraction even in a real web environment. The limitations of this study and future research directions are as follows. The first is a problem related to data preprocessing. In this study, the unit of knowledge extraction is classified through morphological analysis based on the open source KoNLPy Python package, and information extraction can fail when the morphological analysis is not performed properly. To enhance the performance of information extraction, it is necessary to develop an advanced morpheme analyzer. The second is the problem of entity ambiguity. The information extraction system of this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the intended query.
In future research, it is necessary to take measures to distinguish people with the same name. The third is a problem of evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed an evaluation data set using 2,800 documents (400 questions × 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether a correct answer is included or not. To ensure the external validity of the study, it is desirable to use more queries to determine the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, in order to build an environment in which results can be evaluated more objectively.
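
A sequence tagging model such as the bi-directional LSTM-CRF mentioned in this abstract outputs one tag per token, and the tags then have to be decoded into answer spans. The sketch below assumes a simple B/I/O tag scheme, which the abstract does not specify; the tokens and tags in the usage note are invented purely for illustration.

```python
def decode_bio(tokens, tags):
    """Collect token spans marked by B (begin) and I (inside) tags."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; close any span that is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span, ends the current span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans
```

For example, `decode_bio(["Seoul", "is", "in", "South", "Korea"], ["B", "O", "O", "B", "I"])` yields the two spans "Seoul" and "South Korea".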