• Title/Summary/Keyword: data augmentation (데이터 증강)


Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired people (언어장애인의 스마트스피커 접근성 향상을 위한 개인화된 음성 분류 기법)

  • SeungKwon Lee;U-Jin Choe;Gwangil Jeon
    • Smart Media Journal / v.11 no.11 / pp.17-24 / 2022
  • With the spread of smart speakers based on voice recognition and deep learning, not only non-disabled people but also blind or physically disabled people can easily control home appliances such as lights and TVs by voice through linked home network services, which has greatly improved quality of life. However, speech-impaired people cannot use these services because articulation or speech disorders make their pronunciation inaccurate. In this paper, we propose a personalized voice classification technique that lets speech-impaired people use some of the functions provided by a smart speaker. The goal is to increase the recognition rate and accuracy for sentences spoken by speech-impaired people, even with a small amount of data and a short training time, so that smart speaker services can actually be used. Data augmentation and a one-cycle learning rate policy were applied while fine-tuning a ResNet18 model. In an experiment in which each of 30 smart speaker commands was recorded 10 times and training took less than 3 minutes, the speech classification recognition rate was about 95.2%.
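The one-cycle learning rate policy mentioned in this abstract can be illustrated with a minimal sketch. The abstract gives no schedule parameters, so the function name `one_cycle_lr` and every default below (`max_lr`, `pct_start`, `div_factor`, `final_div_factor`, cosine ramps) are illustrative assumptions, not the authors' settings.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` for a one-cycle schedule with cosine ramps.

    All default values are assumptions chosen for illustration only.
    """
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))
    if step < warmup_steps:
        # Warm-up phase: rise from initial_lr to max_lr.
        progress = step / warmup_steps
        start, end = initial_lr, max_lr
    else:
        # Annealing phase: decay from max_lr down to min_lr.
        progress = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
        start, end = max_lr, min_lr
    # Cosine interpolation between start (progress=0) and end (progress=1).
    return end + (start - end) * (1 + math.cos(math.pi * progress)) / 2

# One learning rate per training step for a hypothetical 100-step run.
schedule = [one_cycle_lr(s, total_steps=100) for s in range(100)]
```

The rate rises for the first 30% of steps, peaks once, and then decays well below its starting value, which is what makes a single short training run feasible.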

Assessment of Visual Landscape Image Analysis Method Using CNN Deep Learning - Focused on Healing Place - (CNN 딥러닝을 활용한 경관 이미지 분석 방법 평가 - 힐링장소를 대상으로 -)

  • Sung, Jung-Han;Lee, Kyung-Jin
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.3 / pp.166-178 / 2023
  • This study introduces and assesses CNN deep learning methods for analyzing visual landscape images on social media, which embed user perceptions and experiences, focusing on healing places. Seven adjectives related to healing were selected through text mining and a review of previous studies. Subsequently, 50 evaluators were recruited to build a deep learning image dataset. Evaluators were asked to collect from portal sites the three images most suitable for 'healing', 'healing landscape', and 'healing place'. The collected images were refined, and data augmentation was applied to build a CNN model. After that, 15,097 images of 'healing' and 'healing landscape' were collected from portal sites and classified to analyze the visual landscape of healing places. Excluding the 'other' and 'indoor' categories, 'quiet' was the most frequent category with 2,093 images (22%), followed by 'open', 'joyful', 'comfortable', 'clean', 'natural', and 'beautiful'. The results show that CNN deep learning can derive results from visual landscape image analysis. The study also suggests that it can supplement existing visual landscape analysis methods, and that establishing a landscape image training dataset would enable more in-depth and diverse visual landscape analysis in the future.

Establishment of a deep learning-based defect classification system for optimizing textile manufacturing equipment

  • YuLim Kim;Jaeil Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.27-35 / 2023
  • In this paper, we propose a process for increasing productivity by applying a deep learning-based defect detection and classification system to the prepreg fiber manufacturing process, which is in high demand in composite material production. The target is tow prepreg manufacturing equipment, where a large number of defects occur under various conditions. An optimal imaging environment was first established by selecting the cameras and lights needed for defect detection and classification model development. Data for the multi-class classification models were then collected and labeled as normal or defective. The multi-class models are CNN-based and use pre-trained models such as VGGNet, MobileNet, and ResNet; their performance was compared, and directions for improvement were identified using accuracy and loss graphs. Overfitting was identified as the major problem and mitigated with data augmentation and dropout. Model performance was evaluated using the confusion matrix, and performance of more than 99% was confirmed. In addition, the model was applied to the actual process, and its classification results on images acquired in real time were checked to confirm that the predictions were accurate.
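As a rough illustration of confusion-matrix evaluation for a multi-class defect classifier, the sketch below builds the matrix and derives accuracy and per-class recall from it. The class names (`normal`, `defect_a`, `defect_b`) and the toy predictions are hypothetical; the abstract does not list the actual defect classes or any per-class results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0

def per_class_recall(matrix):
    """For each true class, the share of its samples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Hypothetical labels and predictions, for illustration only.
labels = ["normal", "defect_a", "defect_b"]
y_true = ["normal", "normal", "defect_a", "defect_a", "defect_b", "defect_b"]
y_pred = ["normal", "normal", "defect_a", "defect_b", "defect_b", "defect_b"]

cm = confusion_matrix(y_true, y_pred, labels)
overall = accuracy(cm)
recalls = per_class_recall(cm)
```

Reading the off-diagonal cells shows which defect types are confused with each other, which a single accuracy figure hides.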

Development of Deep Recognition of Similarity in Show Garden Design Based on Deep Learning (딥러닝을 활용한 전시 정원 디자인 유사성 인지 모형 연구)

  • Cho, Woo-Yun;Kwon, Jin-Wook
    • Journal of the Korean Institute of Landscape Architecture / v.52 no.2 / pp.96-109 / 2024
  • The purpose of this study is to propose a method for evaluating the similarity of show gardens using the deep learning models VGG-16 and ResNet50. A model for judging show garden similarity based on VGG-16 and ResNet50 was developed and named DRG (Deep Recognition of similarity in show Garden design). The model was built with an algorithm using global average pooling (GAP) and the Pearson correlation coefficient, and similarity accuracy was analyzed by comparing the total number of similar images retrieved at the 1st (Top1), 3rd (Top3), and 5th (Top5) ranks with the original images. The image data used for the DRG model consisted of a total of 278 works from Le Festival International des Jardins de Chaumont-sur-Loire, 27 works from the Seoul International Garden Show, and 17 works from the Korea Garden Show. Image analysis was conducted with the DRG model both within the same group and across different groups, resulting in guidelines for assessing show garden similarity. First, for overall image similarity analysis, applying data augmentation with the ResNet50 model worked best. Second, for image analysis focusing on internal structure and outer form, it was effective to apply a filter of a certain size (16cm × 16cm) to generate images emphasizing form and then compare similarity using the VGG-16 model. An image size of 448 × 448 pixels and the original full-color image were suggested as the optimal settings. Based on these findings, a quantitative method for assessing show gardens is proposed, which is expected to contribute to the continued development of garden culture through interdisciplinary research.
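The combination of GAP and the Pearson correlation coefficient described in this abstract can be sketched as follows, reading GAP as global average pooling. This is a toy illustration under stated assumptions: feature maps are plain nested lists rather than real VGG-16 or ResNet50 outputs, and the helper names (`global_average_pool`, `pearson`, `rank_by_similarity`) are hypothetical, not the DRG implementation.

```python
import math

def global_average_pool(feature_maps):
    """Collapse each channel (a 2-D grid of activations) to its mean."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_maps]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (norm_x * norm_y)

def rank_by_similarity(query_vec, gallery, top_k=5):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, pearson(query_vec, vec)) for name, vec in gallery.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy 3-channel, 2x2 feature maps standing in for CNN outputs.
query = global_average_pool([[[1, 3], [5, 7]], [[0, 0], [2, 2]], [[9, 9], [9, 9]]])
gallery = {
    "garden_a": [4.1, 1.2, 8.8],   # close to the query pattern
    "garden_b": [9.0, 5.0, 1.0],   # roughly the reverse pattern
}
ranking = rank_by_similarity(query, gallery, top_k=2)
```

Taking the first 1, 3, or 5 entries of such a ranking corresponds to the Top1, Top3, and Top5 comparisons mentioned in the abstract.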

Comparative Study of Fish Detection and Classification Performance Using the YOLOv8-Seg Model (YOLOv8-Seg 모델을 이용한 어류 탐지 및 분류 성능 비교연구)

  • Sang-Yeup Jin;Heung-Bae Choi;Myeong-Soo Han;Hyo-tae Lee;Young-Tae Son
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.2 / pp.147-156 / 2024
  • The sustainable management and enhancement of marine resources are becoming increasingly important issues worldwide. This study was conducted in response to these challenges, focusing on the development and performance comparison of fish detection and classification models as part of a deep learning-based technique for assessing the effectiveness of marine resource enhancement projects initiated by the Korea Fisheries Resources Agency. The aim was to select the optimal model by training various sizes of YOLOv8-Seg models on a fish image dataset and comparing each performance metric. The dataset used for model construction consisted of 36,749 images and label files of 12 different species of fish, with data diversity enhanced through the application of augmentation techniques during training. When training and validating five different YOLOv8-Seg models under identical conditions, the medium-sized YOLOv8m-Seg model showed high learning efficiency and excellent detection and classification performance, with the shortest training time of 13 h and 12 min, an of 0.933, and an inference speed of 9.6 ms. Considering the balance between each performance metric, this was deemed the most efficient model for meeting real-time processing requirements. The use of such real-time fish detection and classification models could enable effective surveys of marine resource enhancement projects, suggesting the need for ongoing performance improvements and further research.

Application of Deep Learning for Classification of Ancient Korean Roof-end Tile Images (딥러닝을 활용한 고대 수막새 이미지 분류 검토)

  • KIM Younghyun
    • Korean Journal of Heritage: History & Science / v.57 no.3 / pp.24-35 / 2024
  • Recently, research using deep learning technologies such as artificial intelligence and convolutional neural networks has been actively conducted in various fields, including healthcare, manufacturing, autonomous driving, and security, and is having a significant influence on society. In line with this trend, the present study applied deep learning to the classification of archaeological artifacts, specifically ancient Korean roof-end tiles. Using 100 images of roof-end tiles from each of the Goguryeo, Baekje, and Silla dynasties, for a total of 300 base images, a dataset was formed and expanded to 1,200 images using data augmentation techniques. After building a model using transfer learning from the pre-trained EfficientNetB0 model and conducting five-fold cross-validation, an average training accuracy of 98.06% and validation accuracy of 97.08% were achieved. Furthermore, when model performance was evaluated with a test dataset of 240 images, the model classified the roof-end tile images from the three dynasties with a minimum accuracy of 91%. With a learning rate of 0.0001, the model exhibited its highest performance, with accuracy of 92.92%, precision of 92.96%, recall of 92.92%, and F1 score of 92.93%. This result was obtained by trying various learning rate settings to avoid overfitting and underfitting and to find the optimal hyperparameters. The findings are significant in confirming the potential for applying deep learning technologies to the classification of Korean archaeological materials. It was also confirmed that the existing ImageNet dataset and parameters could be usefully applied to the analysis of archaeological data. This approach could lead to various models for future archaeological database accumulation, the use of artifacts in museums, and the classification and organization of artifacts.
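Five-fold cross-validation, as used in this abstract, can be sketched as an index split. The abstract does not say whether the folds were stratified by dynasty, so stratification here is an assumption, as are the function name `stratified_k_fold`, the seed, and the toy dataset of 20 samples per class.

```python
import random

def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs with classes balanced per fold.

    Stratification is an assumption; the source only states "five-fold".
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the k folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i
                       for idx in fold)
        yield train, val

# Toy label list: 20 samples per dynasty (not the paper's dataset size).
labels = ["Goguryeo"] * 20 + ["Baekje"] * 20 + ["Silla"] * 20
splits = list(stratified_k_fold(labels, k=5))
```

Each sample appears in exactly one validation fold, so averaging the five validation accuracies gives a figure comparable in kind to the reported average validation accuracy.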

3D analysis of soft tissue around implant after flap folding suture (Flap folding suture를 활용한 판막의 고정에 따른 임플란트 주변 연조직 3차원 부피 변화 관찰)

  • Jung, Sae-Young;Kang, Dae-Young;Shin, Hyun-Seung;Park, Jung-Chul
    • Journal of Dental Rehabilitation and Applied Science / v.37 no.3 / pp.130-137 / 2021
  • Purpose: Various suture techniques can be used to maximize keratinized tissue healing around dental implants. The aim of this study is to compare the soft tissue healing pattern between two different suture techniques after implant placement. Materials and Methods: Fifteen patients with 18 implants were enrolled in this study. Simple implant placement without any additional bone graft was performed. Two different suture techniques were used to tuck in the mobilized flap near the healing abutment after paramarginal flap design. Digital intraoral scans were performed at baseline, post-operation, stitch out, and 3 months after operation. The scan data were aligned using multiple points such as cusps, fossae of adjacent teeth, and/or the healing abutment. After subtracting the baseline scan data from the data at the other time points, the closed space indicating the volume increment of the peri-implant mucosa was selected. The volume of the closed space was measured in mm³. The volumes for the two suture techniques at three time points were compared using nonparametric rank-based analysis. Results: Healing was uneventful in both groups. Both suture technique groups showed increased soft tissue volume immediately after surgery. The amount of volume increment significantly decreased after 3 months (P < 0.001). The flap folding suture group showed a higher median volume increment than the interrupted suture group after 3 months, without statistical significance (P > 0.05). Conclusion: After paramarginal flap reflection, raised flaps stabilized by flap folding suture showed relatively higher volume maintenance after a 3-month healing period. However, further studies are warranted.

Signal Change of Iodinated Contrast Agents in MR Imaging (요오드화 조영제가 MR영상에 미치는 신호 변화)

  • Jeong, HK;Kim, Seongho;Kang, Chunghwan;Lee, Suho;Yi, Yun;Kim, Mingi;Kim, Hochul
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.12 / pp.131-138 / 2016
  • In this study, we analyzed the influence of iodinated contrast media (ICM) on MR imaging compared with gadolinium-based contrast agents (GBCA), and on that basis discussed whether the clinical practice of performing an MRI scan after a contrast-enhanced CT scan without a proper time interval is reasonable. We assembled two phantoms, one with iodine and the other with gadolinium, and scanned both with conventional MRI sequences: T1, T2, T2 FLAIR, and 3D angio. A quantitative analysis followed. The results were as follows. SSI (saline's signal intensity) for the respective sequences was 175, 1231, 333, and 37 [a.u.] for iodine and 1297, 123, 757, and 232 [a.u.] for gadolinium. BDEPS (the biggest difference of EPS) was 1297, 123, 757, and 232 [a.u.] for iodine and 793, 6, 1495, and 365 [a.u.] for gadolinium. EPS (enhancement percentage to saline) was 641.1, -90.0, 127.3, and 527% for iodine and 685.1, 99.4, 365.7, and 1077.4% for gadolinium. BP (BDEPS's point) was 900, 900, 477, and 900 mmol for iodine and 4, 0.2, 0.2, and 40 mmol for gadolinium. CPSS (change point of SI to SSI) was 63, 423, 63, and 29 mmol for iodine and [50, 30], [4, 0.2], [4, 1], and 0.2 mmol for gadolinium. This research shows not only that iodine can affect the MR signal but also that the pattern differs across sequences compared with gadolinium. We therefore expect these quantitative data to help in deciding protocols for the order of MRI and CT scans, leading to diagnostically useful MR images in clinical practice.

Quantitative Assessment using SNR and CNR in Cerebrovascular Diseases : Focusing on FRE-MRA, CTA Imaging Method (뇌혈관 질환에서 신호대 잡음비와 대조도대 잡음비를 이용한 정량적평가 : FRE-MRA, CTA 영상기법중심으로)

  • Goo, Eun-Hoe
    • Journal of the Korean Society of Radiology / v.11 no.6 / pp.493-500 / 2017
  • In this study, data were analyzed with the INFINITT program to quantitatively evaluate the signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) of flow-related enhancement MR angiography (FRE-MRA) and computed tomography angiography (CTA) in cerebrovascular diseases. Of 63 patients with cerebrovascular imaging results (January to April 2017, at C University Hospital), we selected 19 patients who underwent both FRE-MRA and CTA. Of these 19 patients, 2 were excluded because of motion artifacts in the cerebrovascular images. Five regions (the anterior cerebral artery, the right and left middle cerebral arteries, and the right and left posterior cerebral arteries) were set as regions of interest for evaluating SNR and CNR, and the results were validated with an independent t-test. The mean SNR/CNR values for FRE-MRA were: anterior cerebral artery ($1500.73{\pm}12.23/970.43{\pm}14.55$), right middle cerebral artery ($1470.16{\pm}11.46/919.44{\pm}13.29$), left middle cerebral artery ($1457.48{\pm}17.11/903.96{\pm}14.53$), right posterior cerebral artery ($1385.83{\pm}16.52/852.11{\pm}14.58$), and left posterior cerebral artery ($1318.52{\pm}13.49/756.21{\pm}10.88$). The mean SNR/CNR values for CTA were: anterior cerebral artery ($159.95{\pm}12.23/123.36{\pm}11.78$), right middle cerebral artery ($236.66{\pm}17.52/202.37{\pm}15.20$), left middle cerebral artery ($224.85{\pm}13.45/193.14{\pm}11.88$), right posterior cerebral artery ($183.65{\pm}13.47/151.44{\pm}11.48$), and left posterior cerebral artery ($177.7{\pm}16.72/144.71{\pm}11.43$) (p < 0.05). In conclusion, MRA had high SNR and CNR values regardless of whether cerebral infarction or cerebral hemorrhage was observed in the five regions. Although FRE-MRA took longer, it had fewer contrast media side effects than CTA.
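The abstract does not state how SNR and CNR were computed, so the sketch below uses one common convention as an assumption: SNR is the mean signal in the region of interest divided by the standard deviation of background noise, and CNR is the absolute difference between the vessel and reference-tissue means divided by the same noise figure. The pixel values are invented for illustration and are unrelated to the reported measurements.

```python
import statistics

def snr(roi, noise):
    """Mean ROI signal over the sample standard deviation of the noise."""
    return statistics.mean(roi) / statistics.stdev(noise)

def cnr(roi, reference, noise):
    """Absolute mean difference between ROI and reference over noise SD."""
    contrast = abs(statistics.mean(roi) - statistics.mean(reference))
    return contrast / statistics.stdev(noise)

# Invented pixel intensities (arbitrary units), for illustration only.
vessel_roi = [100, 102, 98]        # e.g. a region drawn inside an artery
reference_tissue = [40, 40, 40]    # e.g. adjacent brain tissue
background_noise = [2, 4, 6]       # e.g. air outside the head

vessel_snr = snr(vessel_roi, background_noise)
vessel_cnr = cnr(vessel_roi, reference_tissue, background_noise)
```

Other conventions exist (for example, using the population standard deviation or a correction factor for magnitude images), so absolute values are comparable only when the same definition is used throughout.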

A Study on Intuitive IoT Interface System using 3D Depth Camera (3D 깊이 카메라를 활용한 직관적인 사물인터넷 인터페이스 시스템에 관한 연구)

  • Park, Jongsub;Hong, June Seok;Kim, Wooju
    • The Journal of Society for e-Business Studies / v.22 no.2 / pp.137-152 / 2017
  • The decline in the price of IT devices and the development of the Internet have created a new field called the Internet of Things (IoT). IoT, which creates new services by connecting everyday objects to the Internet, is pioneering new forms of business in combination with Big Data, and its potential applications can be said to be unlimited. Standardization organizations are also actively studying how to connect IoT devices smoothly. However, one aspect has been overlooked: to control IoT equipment or acquire information from it, interworking issues (IP address, Wi-Fi, Bluetooth, NFC, etc.) must be resolved and related application software or apps must be developed separately. To solve these problems, existing studies have used augmented reality based on GPS or markers. However, marker-based approaches require a separate marker, which is recognized only at close range. GPS-based studies using a 2D camera had difficulty implementing an active interface because the distance to the target device could not be recognized. In this study, we use a 3D depth camera installed on a smartphone and automatically calculate spatial coordinates by combining the distance measurement with the phone's sensor information, without a separate marker. A coordinate lookup then finds the IoT equipment and enables information acquisition and control of that equipment. From the user's point of view, this reduces the burden of interworking with IoT equipment and installing apps. Furthermore, if this technology is used in public services and smart glasses, it will reduce duplicate investment in software development and expand public services.
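The idea of deriving a target's spatial coordinates from a depth reading plus the phone's orientation, then looking up the nearest registered device, might be sketched as below. Everything here is a hypothetical simplification: the abstract does not give the coordinate model, so the east/north/up frame, the azimuth and elevation inputs, the `tolerance` threshold, and the device registry are all assumptions.

```python
import math

def target_coordinates(phone_pos, distance, azimuth_deg, elevation_deg):
    """Project a depth reading along the phone's pointing direction.

    Assumes an (east, north, up) frame in meters, azimuth measured
    clockwise from north, and elevation measured up from the horizon.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = distance * math.cos(el)
    x = phone_pos[0] + horizontal * math.sin(az)  # east
    y = phone_pos[1] + horizontal * math.cos(az)  # north
    z = phone_pos[2] + distance * math.sin(el)    # up
    return (x, y, z)

def find_device(point, registry, tolerance=0.5):
    """Return the closest registered device within `tolerance` meters."""
    best_name, best_dist = None, tolerance
    for name, pos in registry.items():
        d = math.dist(point, pos)
        if d <= best_dist:
            best_name, best_dist = name, d
    return best_name

# Hypothetical registry of device positions in the same frame.
registry = {"lamp": (2.0, 0.0, 0.0), "tv": (0.0, 3.0, 1.0)}

# Phone at the origin pointing due east, level, with a 2 m depth reading.
point = target_coordinates((0.0, 0.0, 0.0), 2.0, azimuth_deg=90, elevation_deg=0)
device = find_device(point, registry)
```

Because the lookup is by position rather than by marker or network address, the user only needs to point the camera at a device, which matches the marker-free interaction the abstract describes.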