Search | Korea Science

Analysis of Research Trends in Deep Learning-Based Video Captioning (딥러닝 기반 비디오 캡셔닝의 연구동향 분석)

Lyu Zhi;Eunju Lee;Youngsoo Kim
- KIPS Transactions on Software and Data Engineering / v.13 no.1 / pp.35-49 / 2024
Video captioning technology, a significant outcome of the integration of computer vision and natural language processing, has emerged as a key research direction in the field of artificial intelligence. This technology aims to achieve automatic understanding and language expression of video content, enabling computers to transform visual information in videos into textual form. This paper provides an initial analysis of the research trends in deep learning-based video captioning and categorizes the models into four main groups: CNN-RNN-based, RNN-RNN-based, multimodal-based, and Transformer-based models. It explains the concept of each group and discusses its features, advantages, and disadvantages. The paper also lists commonly used datasets and performance evaluation methods in the video captioning field. The datasets encompass diverse domains and scenarios, offering extensive resources for the training and validation of video captioning models. The discussion of performance evaluation covers the major evaluation metrics and provides practical references for researchers to evaluate model performance from various angles. Finally, as future research tasks for video captioning, the paper identifies major challenges that need continued improvement, such as maintaining temporal consistency and accurately describing dynamic scenes, which increase complexity in real-world applications, and presents new tasks to be studied, such as temporal relationship modeling and multimodal data integration.
https://doi.org/10.3745/KTSDE.2024.13.1.35

Deep learning-based automatic segmentation of the mandibular canal on panoramic radiographs: A multi-device study

Moe Thu Zar Aung;Sang-Heon Lim;Jiyong Han;Su Yang;Ju-Hee Kang;Jo-Eun Kim;Kyung-Hoe Huh;Won-Jin Yi;Min-Suk Heo;Sam-Sun Lee
- Imaging Science in Dentistry / v.54 no.1 / pp.81-91 / 2024
Purpose: The objective of this study was to propose a deep-learning model for the detection of the mandibular canal on dental panoramic radiographs. Materials and Methods: A total of 2,100 panoramic radiographs (PANs) were collected from 3 different machines: RAYSCAN Alpha (n=700, PAN A), OP-100 (n=700, PAN B), and CS8100 (n=700, PAN C). Initially, an oral and maxillofacial radiologist coarsely annotated the mandibular canals. For deep learning analysis, convolutional neural networks (CNNs) utilizing U-Net architecture were employed for automated canal segmentation. Seven independent networks were trained using training sets representing all possible combinations of the 3 groups. These networks were then assessed using a hold-out test dataset. Results: Among the 7 networks evaluated, the network trained with all 3 available groups achieved an average precision of 90.6%, a recall of 87.4%, and a Dice similarity coefficient (DSC) of 88.9%. The 3 networks trained using each of the 3 possible 2-group combinations also demonstrated reliable performance for mandibular canal segmentation, as follows: 1) PAN A and B exhibited a mean DSC of 87.9%, 2) PAN A and C displayed a mean DSC of 87.8%, and 3) PAN B and C demonstrated a mean DSC of 88.4%. Conclusion: This multi-device study indicated that the examined CNN-based deep learning approach can achieve excellent canal segmentation performance, with a DSC exceeding 88%. Furthermore, the study highlighted the importance of considering the characteristics of panoramic radiographs when developing a robust deep-learning network, rather than depending solely on the size of the dataset.
https://doi.org/10.5624/isd.20230245
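
The abstract above reports precision, recall, and the Dice similarity coefficient (DSC) for networks trained on every combination of the three device groups. The following is a minimal sketch of how such numbers are typically computed for one pair of binary masks, and of why three groups yield seven training sets. It is not the authors' code: the function names and the toy masks are hypothetical, and masks are assumed to be nested lists of 0/1 values.

```python
from itertools import combinations


def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives
    between two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, truth):
    """Return (precision, recall, dice) for one predicted mask."""
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # DSC = 2|P ∩ T| / (|P| + |T|) = 2TP / (2TP + FP + FN).
    # Two empty masks are treated as a perfect match (an assumption).
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, dice


def training_set_combinations(groups):
    """All non-empty combinations of the given groups."""
    return [
        combo
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]   # toy prediction (hypothetical)
    truth = [[1, 0, 0], [0, 1, 1]]  # toy ground truth (hypothetical)
    print(segmentation_metrics(pred, truth))
    print(len(training_set_combinations(["PAN A", "PAN B", "PAN C"])))
```

For a single mask pair, the DSC equals the harmonic mean of precision and recall; averages reported over a test set need not satisfy that identity exactly.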

Automated Lung Segmentation on Chest Computed Tomography Images with Extensive Lung Parenchymal Abnormalities Using a Deep Neural Network

Seung-Jin Yoo;Soon Ho Yoon;Jong Hyuk Lee;Ki Hwan Kim;Hyoung In Choi;Sang Joon Park;Jin Mo Goo
- Korean Journal of Radiology / v.22 no.3 / pp.476-488 / 2021
Objective: We aimed to develop a deep neural network for segmenting lung parenchyma with extensive pathological conditions on non-contrast chest computed tomography (CT) images. Materials and Methods: Thin-section non-contrast chest CT images from 203 patients (115 males, 88 females; age range, 31-89 years) between January 2017 and May 2017 were included in the study, of which 150 cases had extensive lung parenchymal disease involving more than 40% of the parenchymal area. Parenchymal diseases included interstitial lung disease (ILD), emphysema, nontuberculous mycobacterial lung disease, tuberculous destroyed lung, pneumonia, lung cancer, and other diseases. Five experienced radiologists manually drew the margin of the lungs, slice by slice, on CT images. The dataset used to develop the network consisted of 157 cases for training, 20 cases for development, and 26 cases for internal validation. Two-dimensional (2D) U-Net and three-dimensional (3D) U-Net models were used for the task. The network was trained to segment the lung parenchyma as a whole and segment the right and left lung separately. The University Hospitals of Geneva ILD dataset, which contained high-resolution CT images of ILD, was used for external validation. Results: The Dice similarity coefficients for internal validation were 99.6 ± 0.3% (2D U-Net whole lung model), 99.5 ± 0.3% (2D U-Net separate lung model), 99.4 ± 0.5% (3D U-Net whole lung model), and 99.4 ± 0.5% (3D U-Net separate lung model). The Dice similarity coefficients for the external validation dataset were 98.4 ± 1.0% (2D U-Net whole lung model) and 98.4 ± 1.0% (2D U-Net separate lung model). In 31 cases, where the extent of ILD was larger than 75% of the lung parenchymal area, the Dice similarity coefficients were 97.9 ± 1.3% (2D U-Net whole lung model) and 98.0 ± 1.2% (2D U-Net separate lung model). 
Conclusion: The deep neural network achieved excellent performance in automatically delineating the boundaries of lung parenchyma with extensive pathological conditions on non-contrast chest CT images.
https://doi.org/10.3348/kjr.2020.0318

Development and Validation of a Deep Learning System for Segmentation of Abdominal Muscle and Fat on Computed Tomography

Hyo Jung Park;Yongbin Shin;Jisuk Park;Hyosang Kim;In Seob Lee;Dong-Woo Seo;Jimi Huh;Tae Young Lee;TaeYong Park;Jeongjin Lee;Kyung Won Kim
- Korean Journal of Radiology / v.21 no.1 / pp.88-100 / 2020
Objective: We aimed to develop and validate a deep learning system for fully automated segmentation of abdominal muscle and fat areas on computed tomography (CT) images. Materials and Methods: A fully convolutional network-based segmentation system was developed using a training dataset of 883 CT scans from 467 subjects. Axial CT images obtained at the inferior endplate level of the 3rd lumbar vertebra were used for the analysis. Manually drawn segmentation maps of the skeletal muscle, visceral fat, and subcutaneous fat were created to serve as ground truth data. The performance of the fully convolutional network-based segmentation system was evaluated using the Dice similarity coefficient and cross-sectional area error, for both a separate internal validation dataset (426 CT scans from 308 subjects) and an external validation dataset (171 CT scans from 171 subjects from two outside hospitals). Results: The mean Dice similarity coefficients for muscle, subcutaneous fat, and visceral fat were high for both the internal (0.96, 0.97, and 0.97, respectively) and external (0.97, 0.97, and 0.97, respectively) validation datasets, while the mean cross-sectional area errors for muscle, subcutaneous fat, and visceral fat were low for both internal (2.1%, 3.8%, and 1.8%, respectively) and external (2.7%, 4.6%, and 2.3%, respectively) validation datasets. Conclusion: The fully convolutional network-based segmentation system exhibited high performance and accuracy in the automatic segmentation of abdominal muscle and fat on CT images.
https://doi.org/10.3348/kjr.2019.0470
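
The abstract above evaluates segmentation with a cross-sectional area error in addition to the Dice similarity coefficient. The abstract does not give the formula, so the sketch below assumes the common definition: the absolute difference between predicted and ground-truth area, as a percentage of the ground-truth area. The function names, the toy masks, and the pixel spacing are hypothetical.

```python
def mask_area_cm2(mask, pixel_spacing_mm=(1.0, 1.0)):
    """Area of a binary mask (nested lists of 0/1) in cm^2.

    pixel_spacing_mm is the (row, column) size of one pixel in mm;
    in practice it would come from the CT image header.
    """
    pixel_area_mm2 = pixel_spacing_mm[0] * pixel_spacing_mm[1]
    n_pixels = sum(1 for row in mask for value in row if value)
    return n_pixels * pixel_area_mm2 / 100.0  # 100 mm^2 = 1 cm^2


def area_error_percent(pred, truth, pixel_spacing_mm=(1.0, 1.0)):
    """Assumed definition: |A_pred - A_truth| / A_truth * 100."""
    area_truth = mask_area_cm2(truth, pixel_spacing_mm)
    if area_truth == 0:
        raise ValueError("ground-truth mask is empty")
    area_pred = mask_area_cm2(pred, pixel_spacing_mm)
    return abs(area_pred - area_truth) / area_truth * 100.0


if __name__ == "__main__":
    truth = [[1, 1, 0], [1, 1, 0]]  # 4 labelled pixels (toy data)
    pred = [[1, 1, 1], [1, 1, 0]]   # 5 predicted pixels (toy data)
    print(area_error_percent(pred, truth, pixel_spacing_mm=(0.5, 0.5)))
```

Because the error is a ratio of areas, the pixel spacing cancels out when both masks share the same grid; it matters only when absolute areas are reported.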

A Study on Strategic Development Approaches for Cyber Seniors in the Information Security Industry

Seung Han Yoon;Ah Reum Kang
- Journal of the Korea Society of Computer and Information / v.29 no.4 / pp.73-82 / 2024
In 2017, the United Nations reported that the population aged 60 and above was increasing more rapidly than all younger age groups worldwide, projecting that by 2050 this group would constitute at least 25% of the population in every region except Africa. Growth in the working-age population is slowing because of global aging, and the younger generation tends to avoid difficult and challenging occupations. Although AI can in theory replace humans in all fields, in practical information security, human judgment and expertise remain essential, especially for ethical considerations. Therefore, this paper proposes a method to retrain and reintegrate IT professionals aged 50 and above who are retiring or seeking career transitions, with the aim of bringing them back into the industry. For this research, surveys were conducted with 21 government/public agencies representing demand and 9 security monitoring companies representing supply. The survey results indicated that a large majority on both the demand side (90%) and the supply side (78%) agreed that such measures are necessary. If the results of this research are applied in the field, they could lead to the strategic development of senior information security professionals, laying the foundation for a new market in the Korean information security industry in an era of low birth rates and longevity.
https://doi.org/10.9708/jksci.2024.29.04.073

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
- Journal of Intelligence and Information Systems / v.24 no.3 / pp.21-44 / 2018
In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data. These text data are produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has drawn the interest of many researchers seeking to manage this huge amount of information. It also calls for professionals capable of classifying relevant information, and hence text classification was introduced. Text classification is a challenging task in modern data analysis, in which a text document must be assigned to one or more predefined categories or classes. In the text classification field, various techniques are available, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary according to the type of words used in the corpus and the type of features created for classification. Most previous attempts have proposed a new algorithm or modified an existing one, and this line of research can be said to have reached its limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data upon which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from them.
In this study, we consider that data from different domains, that is, heterogeneous data, may have noise-like characteristics that can be utilized in the classification process. Machine learning algorithms build a classifier on the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined by the vocabulary included in the documents. If the viewpoints of the training data and the target data differ, the features of the two may also differ. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely to be formatted differently. This causes difficulties for traditional machine learning algorithms because they were not developed to recognize different types of data representation at one time and to combine them in the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning in our study. However, unlabeled data may degrade the performance of the document classifier. Therefore, we further propose a method called Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA) to select only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making.
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
https://doi.org/10.13088/jiis.2018.24.3.021

Development of Intelligent Severity of Atopic Dermatitis Diagnosis Model using Convolutional Neural Network (합성곱 신경망(Convolutional Neural Network)을 활용한 지능형 아토피피부염 중증도 진단 모델 개발)

Yoon, Jae-Woong;Chun, Jae-Heon;Bang, Chul-Hwan;Park, Young-Min;Kim, Young-Joo;Oh, Sung-Min;Jung, Joon-Ho;Lee, Suk-Jun;Lee, Ji-Hyun
- Management & Information Systems Review / v.36 no.4 / pp.33-51 / 2017
With the advent of the 'Fourth Industrial Revolution' and the growing demand for quality of life due to economic growth, the need for high-quality medical services is increasing. Artificial intelligence has been introduced in the medical field, but it is rarely used for chronic skin diseases that directly affect quality of life. Moreover, for atopic dermatitis, a representative chronic skin disease, it is difficult to make an objective diagnosis of the severity of lesions. The aim of this study is to establish an intelligent severity recognition model for atopic dermatitis in order to improve patients' quality of life. For this, the following steps were performed. First, image data of patients with atopic dermatitis were collected from the Catholic University of Korea Seoul Saint Mary's Hospital. Refinement and labeling were performed on the collected image data to obtain training and verification data suitable for an objective intelligent atopic dermatitis severity recognition model. Second, various CNN algorithms were trained and verified to select an image recognition algorithm suitable for the model. Experimental results showed that 'ResNet V1 101' and 'ResNet V2 50' achieved the highest performance, with over 90% accuracy for erythema and excoriation, while 'VGG-NET' reached 89% accuracy, lower than the results for those two lesions, owing to a lack of training data. The proposed methodology demonstrates that image recognition algorithms perform well not only in object recognition but also in the medical field, which requires expert knowledge. In addition, this study is expected to be highly applicable in the field of atopic dermatitis because it uses image data of actual atopic dermatitis patients.

The perception of undergraduates of the college of education on the importance of trainee teacher certification areas and sub-factors (사범대학 재학생의 예비 교사 인증 영역 및 하위 요소에 대한 중요도 인식 분석)

Kim, Tae-Hoon;Lee, Tae-Ho
- 대한공업교육학회지 / v.39 no.1 / pp.164-188 / 2014
The purpose of this study is to investigate how undergraduates of the college of education perceive the importance of the certification areas and factors suggested by the certification system, at each department level as well as for the college as a whole, in order to come up with measures for further improvement. The specific objectives are, first, to verify differences in the perceived importance of certification areas and factors by department and, second, to verify differences in the perceived importance of certification areas and factors by grade. The population of this study is undergraduates of the college of education at A University, and the survey on perceived importance was conducted on 758 students in 10 departments. A total of 800 copies of the survey were distributed, and 299 copies (37.3%) were retrieved. First, it was found that undergraduates of the college of education at A University strongly recognize the necessity of a new system to produce excellent teachers. By department, in the area of teaching personality, there are differences in the importance of the teaching aptitude test and completion of the social intelligence development program. In the area of teaching expertise, there are differences in the perceived importance of completing curriculum education subjects per major, completing curriculum contents per major, and participating in teaching demonstration contests. In the area of student guidance expertise, there are differences by department in completion of creative character development education programs and the "teaching practice" course. In the area of communication skills in the information society, the minimum score requirement for a second foreign language is considered less important than the others.
Second, by grade, freshmen rate the following more highly than other grades: the validity of the teacher training course, the integrity of the course, the validity of the teacher training course in producing excellent teachers, the development of graduates' job performance ability as teachers, and the appropriateness of the college of education curriculum in producing excellent teachers. In particular, seniors most strongly recognize the necessity of a new system to produce excellent teachers.

Learning from the USA's Single Emergency Number 911: Policy Implications for Korea (미국 긴급번호 911 운영시스템에 관한 연구: 긴급번호 실질적 통합을 위한 정책 시사점 제시 중심으로)

Kim, Hak-Kyong;Lee, Sung-Yong
- Korean Security Journal / no.43 / pp.67-97 / 2015
In Korea, a single emergency number, such as 911 in the USA and 999 in the UK, does not exist. This issue became highly controversial when the Sewol Ferry sinking disaster occurred last year. The Korean government has therefore planned to adopt a single emergency number, integrating 112 of the Police, 119 of the Fire and Ambulance services, 122 of the Korean Coast Guard, and many other emergency numbers. However, the integration plan recently proposed by the Ministry of Public Safety and Security appears to be a so-called "partial integration model," which repeals the 122 number but still maintains 112, 119, and 110 respectively. In this context, the study looks into the USA's (diverse) 911 operating systems and draws out their general features. Further, the research derives policy implications from these features. If the proposed partial integration model reflects the policy implications, it can operate virtually like the 911 system, i.e., a single emergency number system, creating interoperability between responding agencies such as police, fire, and ambulance, even though it is not a perfect integration model. The features drawn are (1) integration of emergency call-taking, (2) functional separation of call-taking and dispatching, (3) integration of physical facilities for call-taking and dispatching, and (4) professional call-takers and dispatchers. Moreover, the policy implications derived from these characteristics are (1) a user-friendly system with fast but accurate responses, (2) integrated responses to accidents, (3) professional call-taking and dispatching together with objective and comprehensive risk assessment, and finally (4) active organizational learning in emergency call centers. Considering the policy implications, the following suggestions need to be applied to the currently proposed plan:
1. Emergency services' systems should be tightly linked and connected in a systemic way so that they can communicate and exchange intelligence with one another. 2. Public safety answering points (call centers) of each emergency service should share their education and training modules, manuals, etc. Common training and manuals are also needed for interoperability. 3. Personnel management that enables long-term service in public safety answering points (call centers) should be established as one way to promote professionalism.

A Program for Developing Children's Latent Giftedness (아동의 잠재된 영재성 개발 프로그램)

Lee, Sun-Ju
- Journal of Gifted/Talented Education / v.10 no.2 / pp.71-86 / 2000
According to Torrance, about 30% of feeble-headed children possess potential giftedness. Such potentially gifted children are in fact far more numerous than children who visibly display their giftedness. Nevertheless, research aimed at revealing children's potential giftedness has not been very active. Renzulli, Smith, and others have conducted such research, and the strategies they developed are used widely around the world, but these strategies have the shortcoming that children cannot overcome by themselves the interfering factors that negatively affect the expression of their gifts. The Russian scholar Babaeva created an effective education program to develop children's potential giftedness. Her program includes various activity methods, mental correction activities, and psychological training activities; the training is carried out in several steps, and each step has its own purpose and method of implementation. In an experimental study in which children classified as ordinary participated in the program, their levels of intelligence and originality improved to a level similar to that of gifted children. She not only identified factors that considerably influence the development of children's giftedness but also developed concrete learning methods that guide children to overcome these factors by themselves. This paper examines the potential giftedness development program researched in Russia and discusses its theoretical background, concrete activity course, and activity results. Through this, the author seeks meaningful suggestions for educating potentially gifted children who have excellent abilities in various areas but do not display their potential and do not benefit from gifted education programs.

Search Result 738, Processing Time 0.026 seconds
