• Title/Summary/Keyword: learning through the image

Development of Web Service for Liver Cirrhosis Diagnosis Based on Machine Learning (머신러닝기반 간 경화증 진단을 위한 웹 서비스 개발)

  • Noh, Si-Hyeong;Kim, Ji-Eon;Lee, Chungsub;Kim, Tae-Hoon;Kim, KyungWon;Yoon, Kwon-Ha;Jeong, Chang-Won
    • KIPS Transactions on Computer and Communication Systems / v.10 no.10 / pp.285-290 / 2021
  • In the medical field, research on disease diagnosis and prediction using artificial intelligence is being actively conducted. A variety of products for disease diagnosis and prediction are being released, and medical imaging is the area where artificial intelligence is most widely applied. Artificial intelligence is used to diagnose diseases, to classify them as benign or malignant, and to segment disease regions for identification or reading according to disease risk. Recently, in combination with cloud technology, its utility as a service product has been increasing. Liver disease, the subject of this paper, carries very high risk because the lack of pain makes early diagnosis difficult. Artificial intelligence based on medical images has been introduced as a non-invasive method for diagnosing such diseases. We describe the development of a web service that supports clinically meaningful reading for liver cirrhosis patients. We then present the web service workflow, the operation screen of each step, and the final result screen. The proposed service is expected to enable early diagnosis of liver cirrhosis and to help patients recover through rapid treatment.

A Deep Learning Method for Cost-Effective Feed Weight Prediction of Automatic Feeder for Companion Animals (반려동물용 자동 사료급식기의 비용효율적 사료 중량 예측을 위한 딥러닝 방법)

  • Kim, Hoejung;Jeon, Yejin;Yi, Seunghyun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.263-278 / 2022
  • With the recent advent of IoT technology, automatic pet feeders are becoming widespread so that owners can feed their companion animals while they are out. However, weight measurement, which is essential to automatic feeding, usually relies on a scale that can easily be damaged or broken by the pets' behavior. The 3D camera method is costly, and the 2D camera method is less accurate than the 3D camera method. Hence, the purpose of this study is to propose a deep learning approach that can accurately estimate feed weight using only a 2D camera. To this end, various convolutional neural networks were evaluated, and among them the ResNet101-based model showed the best performance: an average absolute error of 3.06 grams and an average absolute ratio error of 3.40%, a level that is commercially usable in terms of technical and financial viability. The results of this study can help practitioners predict the weight of a standardized object such as feed from a simple 2D image alone.
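The two error measures quoted in this abstract can be illustrated with a minimal sketch. It assumes that "average absolute error" is the usual mean absolute error and that "average absolute ratio error" is the mean of the absolute error divided by the true weight; the abstract does not give the formulas, and the function names and sample weights below are hypothetical, not data from the paper.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual|, in the unit of the inputs (grams here)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_ratio_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual| / actual, as a percentage.

    Assumes every true weight is strictly positive.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical feed weights in grams (illustrative only).
true_weights = [100.0, 50.0]
predicted_weights = [97.0, 52.0]

mae = mean_absolute_error(true_weights, predicted_weights)
mare = mean_absolute_ratio_error(true_weights, predicted_weights)
print(f"MAE = {mae:.2f} g, ratio error = {mare:.2f}%")
```

Reporting both values is useful because the absolute error depends on portion size, while the ratio error stays comparable across small and large portions.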

Detection of video editing points using facial keypoints (얼굴 특징점을 활용한 영상 편집점 탐지)

  • Joshep Na;Jinho Kim;Jonghyuk Park
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.15-30 / 2023
  • Recently, various services using artificial intelligence (AI) have been emerging in the media field as well. However, most video editing, which involves finding editing points and joining video segments, is still carried out manually and requires a lot of time and human resources. Therefore, this study proposes a methodology that detects the editing points of a video according to whether the person in the video is speaking, using Video Swin Transformer. To this end, the proposed structure first detects facial keypoints through face alignment. Through this process, the temporal and spatial changes of the face are captured from the input video data. Then, the Video Swin Transformer-based model proposed in this study classifies the behavior of the person in the video. Specifically, the feature map generated by Video Swin Transformer from the video data is combined with the facial keypoints detected through face alignment, and utterance is classified through convolution layers. In conclusion, the performance of the video editing point detection model using facial keypoints proposed in this paper improved from 87.46% to 89.17% compared with the model without facial keypoints.

Flood Mapping Using Modified U-NET from TerraSAR-X Images (TerraSAR-X 영상으로부터 Modified U-NET을 이용한 홍수 매핑)

  • Yu, Jin-Woo;Yoon, Young-Woong;Lee, Eu-Ru;Baek, Won-Kyung;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1709-1722 / 2022
  • The rise in temperature induced by global warming has caused El Niño and La Niña events and abnormal changes in seawater temperature. Rainfall concentrates in some locations because of these abnormal variations in seawater temperature, causing frequent abnormal floods. It is important to detect flooded regions rapidly in order to recover from and prevent the human and property damage caused by floods. This is possible with synthetic aperture radar (SAR). This study aims to build a model that directly derives flood-damaged areas from TerraSAR-X images using a modified U-NET based on multiple kernels, which reduces the effect of speckle noise by extracting diverse feature maps, with two images acquired before and after flooding as input data. To that end, two SAR images were preprocessed to generate the model's input data, which was then applied to the modified U-NET structure to train the flood detection deep learning model. With this method, the flooded area could be detected with high accuracy, with an average F1 score of 0.966. This result is expected to contribute to the rapid recovery of flood-stricken areas and the derivation of flood-prevention measures.
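For readers unfamiliar with the F1 score reported here, the following is a minimal sketch of how it is commonly computed for a binary flood mask. It assumes pixels are labeled 1 (flooded) or 0 (not flooded) and that the masks are flattened into lists; the function name and sample masks are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def f1_score(truth: Sequence[int], predicted: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels (1 = flooded)."""
    if len(truth) != len(predicted):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    # Convention chosen here: no flooded pixels in either mask gives 0.0.
    return 0.0 if denominator == 0 else 2 * tp / denominator


# Hypothetical flattened masks (illustrative only).
reference_mask = [1, 0, 0, 1, 1]
predicted_mask = [1, 1, 0, 0, 1]
score = f1_score(reference_mask, predicted_mask)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it is not inflated by the large non-flooded background that usually dominates a SAR scene.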

A Comparative Study on the Possibility of Land Cover Classification of the Mosaic Images on the Korean Peninsula (한반도 모자이크 영상의 토지피복분류 활용 가능성 탐색을 위한 비교 연구)

  • Moon, Jiyoon;Lee, Kwang Jae
    • Korean Journal of Remote Sensing / v.35 no.6_4 / pp.1319-1326 / 2019
  • The Korea Aerospace Research Institute (KARI) operates the government satellite information application consultation to cope with the ever-increasing demand for satellite images in the public sector, and every year it carries out various support projects, including the generation and provision of mosaic images of the Korean Peninsula, to enhance user convenience and promote the use of satellite images. In particular, the government has sought to increase the utilization of these mosaic images by classifying and updating them so that users can easily apply them in their work. However, it is necessary to test and verify whether classification results from the mosaic images can be used in the field, because the original spectral information is distorted during pan-sharpening and color balancing and only the R, G, and B bands are provided. Therefore, this study compared the reliability of the mosaic image classification result with that of a KOMPSAT-3 image. The accuracy of the KOMPSAT-3 classification result was between 81% and 86% (overall accuracy about 85%), while the accuracy of the mosaic image classification result was between 69% and 72% (overall accuracy about 72%). This difference is attributed not only to the distortion of the original spectral information during the pan-sharpening and mosaic processes, but also to the fact that NDVI and NDWI information could be extracted from the KOMPSAT-3 image but not from the mosaic image, which provides only three color bands (R, G, B). Although distributing classification results extracted from mosaic images is deemed inadequate at present, it will be necessary to explore ways to minimize the distortion of spectral information when producing mosaic images, to develop classification techniques suited to mosaic images, and to provide NIR band information. In addition, the utilization of images with limited spectral information could increase in the future if related research continues, such as comparative analysis of classification results by geomorphological characteristics and the development of machine learning methods for image classification by objects of interest.
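The role of the NIR band in this comparison can be made concrete with a minimal sketch of the two indices. NDVI is the standard (NIR - Red) / (NIR + Red); for NDWI the sketch assumes the common green/NIR form, (Green - NIR) / (Green + NIR), because the abstract does not state which NDWI definition was used. The function names and reflectance values are hypothetical.

```python
def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b); returns 0.0 when both bands are zero."""
    total = a + b
    return 0.0 if total == 0 else (a - b) / total


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index (green/NIR form, assumed here)."""
    return normalized_difference(green, nir)


# Hypothetical surface reflectance values for one pixel (illustrative only).
pixel = {"red": 0.1, "green": 0.3, "nir": 0.5}
print(f"NDVI = {ndvi(pixel['nir'], pixel['red']):.3f}")
print(f"NDWI = {ndwi(pixel['green'], pixel['nir']):.3f}")
```

Both indices need a NIR value, so neither can be derived from an image that carries only R, G, and B bands, which is the limitation the abstract points to.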

The Development of Nutrition Education Program for Improvement of body Perception of Middle School Girls (II);Development of Nutrition Education Program (여중생의 체형인식 개선을 위한 영양교육 프로그램 개발(II);여중생 대상 영양교육 프로그램 개발)

  • Soh, Hye-Kyung;Lee, Eun-Ju;Choi, Bong-Soon
    • Journal of the Korean Society of Food Culture / v.23 no.1 / pp.130-137 / 2008
  • Nutrition education planned on the basis of a careful understanding of the inappropriate behavioral determinants of middle school students may be an effective way to change perception and behavior and to correct distorted perceptions of the ideal body shape. We therefore propose the following 8-week body shape perception improvement program for successful nutrition education. The program is a step-by-step group counseling program. In the introduction stage, students learn about the meaning of true beauty, the bodily changes of adolescence, and the formation of sexual identity. In the perception conversion stage, students are given the opportunity to observe the state of body perception among teenagers and to observe themselves. In the correction stage, students critique the distorted body image spread in society by mass media while reflecting on themselves. In the maintenance and evaluation stage, behavioral guidelines are prepared and suggested. On this basis, we included evaluation through cultural sharing events in which students make things and participate directly. Because the target group was middle school students, we considered their learning level and behavioral characteristics and composed the program with methods such as role play, observation of real objects, media production, discussion, and hands-on experience. If the program developed in this study is used in schools, teenagers can correctly recognize the negative self-image they may hold and naturally move away from a uniform view of appearance. If group counseling is held regularly at each school, many students may experience a change in perception, which may improve the health of families and local communities as well as that of individual students.

Development on Identification Algorithm of Risk Situation around Construction Vehicle using YOLO-v3 (YOLO-v3을 활용한 건설 장비 주변 위험 상황 인지 알고리즘 개발)

  • Shim, Seungbo;Choi, Sang-Il
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.7 / pp.622-629 / 2019
  • The government has recently been taking new approaches to address the fact that the construction industry accounts for a high share of the accidents and accident-related deaths across all industries. In particular, it is investing heavily in construction technology fused with ICT, in line with the current trend of the Fourth Industrial Revolution. In response, this paper proposes a concept in which work situation information is recognized and shared between the construction machine driver and surrounding workers to enhance safety where construction machines operate. To realize part of this concept, we applied camera-based image processing technology using artificial intelligence to earth-moving work. Specifically, through an experiment with compaction equipment, we implemented a YOLO-v3-based image processing algorithm that recognizes the circumstances of surrounding workers and identifies risk situations. The algorithm processes video at 15.06 frames per second and recognizes dangerous situations around construction machines with an accuracy of 90.48%. We expect this technology to contribute to the prevention of safety accidents at construction sites in the future.

Evaluation of Usefulness of Assertive Devices to Improve the Accuracy in Skull lateral X-ray Projection (두개골 측방향 X-선 촬영에서 정확도 향상을 위한 촬영 보조 기구의 유용성 평가)

  • Bo-Seok Chang
    • Journal of the Korean Society of Radiology / v.18 no.2 / pp.153-159 / 2024
  • In X-ray projection, unskilled radiologic technologists become skilled through failed examinations, which exposes patients to unnecessary radiation. This study presents ways for unskilled radiologic technologists to improve their clinical proficiency beforehand: a practice method for skull lateral X-ray projection using visual and spatial assistive devices. In addition, the accuracy and usefulness of the assistive devices were evaluated. When X-ray images were taken based on learning, the rotational spacing, which indicates image distortion, was 7.85 ± 1.45 mm and the tilting spacing was 4.84 ± 0.5 mm. When practicing with visual aids, the rotational spacing was 4.4 ± 0.76 mm and the tilting spacing was 3.01 ± 0.87 mm. With a spatial compensation device, the rotational spacing was 5.2 ± 0.69 mm and the tilting spacing was 3.33 ± 0.61 mm. Skull lateral X-ray image distortion under empirical radiography practice decreased by 5.4%, whereas image distortion caused by tilting increased by 1.2%. When practicing with visual assistive devices, the rotational spacing decreased by 40.1% and the tilting spacing by 30.7% compared with empirical X-ray exposure practice. With spatial assistive devices, the rotational spacing decreased by 41.7% and the tilting spacing by 23.7% compared with conventional empirical X-ray exposure practice. Therefore, if unskilled radiologic technologists practice with visual and spatial aids, accuracy in skull lateral X-ray projection will improve.

Study on Extracting Filming Location Information in Movies Using OCR for Developing Customized Travel Content (맞춤형 여행 콘텐츠 개발을 위한 OCR 기법을 활용한 영화 속 촬영지 정보 추출 방안 제시)

  • Park, Eunbi;Shin, Yubin;Kang, Juyoung
    • The Journal of Bigdata / v.5 no.1 / pp.29-39 / 2020
  • Purpose: A growing respect for individual tastes throughout society has changed consumption trends. As a result, the travel industry is seeing customized travel emerge as a new trend that reflects consumers' personal tastes. In particular, there is growing interest in 'film-induced tourism', one area of the travel industry. We aim to satisfy the travel motivation that individuals form while watching movies by offering customized travel proposals, which we expect to be a catalyst for the continued development of the film-induced tourism industry. Design/methodology/approach: In this study, we implemented an OCR-based methodology for extracting and suggesting the filming location information that viewers want to visit. First, we extract a scene from a movie selected by the user with 'OpenCV', a real-time image processing library. Next, we detect the location of characters in the scene image with the 'EAST model', a deep learning-based text area detection model. The detected images are preprocessed with OpenCV built-in functions to increase recognition accuracy. Finally, after the characters in the images are converted into machine-readable text with 'Tesseract', an optical character recognition engine, the 'Google Map API' returns the actual location information. Significance: This research is significant in that it provides personalized tourism content using Fourth Industrial Revolution technologies, going beyond existing film tourism. It could be used to develop film-induced tourism packages with travel agencies in the future. It also suggests the possibility of use for inbound as well as outbound tourism.

Grasping a Target Object in Clutter with an Anthropomorphic Robot Hand via RGB-D Vision Intelligence, Target Path Planning and Deep Reinforcement Learning (RGB-D 환경인식 시각 지능, 목표 사물 경로 탐색 및 심층 강화학습에 기반한 사람형 로봇손의 목표 사물 파지)

  • Ryu, Ga Hyeon;Oh, Ji-Heon;Jeong, Jin Gyun;Jung, Hwanseok;Lee, Jin Hyuk;Lopez, Patricio Rivera;Kim, Tae-Seong
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.363-370 / 2022
  • Grasping a target object among clutter objects without collision requires machine intelligence. This intelligence includes environment recognition, target and obstacle recognition, collision-free path planning, and object grasping intelligence for robot hands. In this work, we implement such a system in simulation and in hardware to grasp a target object without collision. We use an RGB-D image sensor to recognize the environment and objects. Various path-finding algorithms have been implemented and tested to find collision-free paths. Finally, object grasping intelligence for an anthropomorphic robot hand is learned through deep reinforcement learning. In our simulation environment, grasping a target out of five clutter objects showed an average success rate of 78.8% and a collision rate of 34% without path planning, whereas our system combined with path planning showed an average success rate of 94% and an average collision rate of 20%. In our hardware environment, grasping a target out of three clutter objects showed an average success rate of 30% and a collision rate of 97% without path planning, whereas our system combined with path planning showed an average success rate of 90% and an average collision rate of 23%. Our results show that grasping a target object in clutter is feasible with vision intelligence, path planning, and deep RL.