
Aircraft Motion Identification Using Sub-Aperture SAR Image Analysis and Deep Learning

  • Doyoung Lee (School of Earth and Environmental Sciences, Seoul National University) ;
  • Duk-jin Kim (School of Earth and Environmental Sciences, Seoul National University) ;
  • Hwisong Kim (School of Earth and Environmental Sciences, Seoul National University) ;
  • Juyoung Song (School of Earth and Environmental Sciences, Seoul National University) ;
  • Junwoo Kim (Future Innovation Institute, Seoul National University)
  • Received : 2024.03.28
  • Accepted : 2024.04.15
  • Published : 2024.04.30

Abstract

With advancements in satellite technology, interest in target detection and identification is increasing both quantitatively and qualitatively. Synthetic Aperture Radar (SAR) images, which can be acquired regardless of weather conditions, have been applied to various areas in combination with machine learning based detection algorithms. However, conventional studies have primarily focused on the detection of stationary targets. In this study, we propose a method to identify moving targets using an algorithm that integrates sub-aperture SAR images and cosine similarity calculations. Utilizing a transformer-based deep learning target detection model, we extracted the bounding box of each target, designated that area as a region of interest (ROI), estimated the similarity between sub-aperture SAR images, and determined movement based on a predefined similarity threshold. The proposed algorithm improved the quantitative accuracy of target identification compared to training a model with two separate target classes, signifying the effectiveness of our approach in maintaining detection accuracy while reliably discerning whether a target is in motion.

Keywords

1. Introduction

With the advancement of satellites such as CubeSats and satellite constellations, near real-time remote sensing of the Earth has become feasible. In particular, Synthetic Aperture Radar (SAR) has gained attention for surveillance and reconnaissance due to its capability to operate unaffected by weather conditions. Previous studies have widely focused on Automatic Target Detection (ATD) and Automatic Target Recognition (ATR), leveraging the characteristics of SAR images to automatically detect aircraft or ships (Wang et al., 2023; Tian et al., 2024; Xiao et al., 2022b). Ocean targets exhibit distinct surface scattering signatures compared to terrestrial targets, which makes training data generation and target detection easier. In contrast, land targets are characterized by heterogeneous surface scattering properties, posing challenges for accurate identification and classification (Xiao et al., 2022a). Furthermore, accurate detection of aircraft targets is limited by complex scattering patterns (Zhao et al., 2021). Airport infrastructure, such as hangars or boarding gates, generates these patterns and further complicates recognition by altering the representation of aircraft in SAR imagery. Despite such complexities, the ability to detect aircraft using SAR has significant value for applications in airport management and military target monitoring (Zhao et al., 2021; Guo et al., 2021).

Conventional ATD methodologies relied on algorithms such as the Constant False Alarm Rate (CFAR), which configures a variable threshold for target identification (Kim et al., 2020). Additionally, approaches combining wavelet transformation with Empirical Mode Decomposition (EMD) were used to augment the Signal-to-Noise Ratio (SNR) (Huang et al., 2019). However, real-world SAR images are frequently accompanied by limited auxiliary information and diverse clutter (Li et al., 2021). These factors undermine the general applicability and performance of such traditional ATD methods.

In recent years, deep learning algorithms based on Convolutional Neural Networks (CNNs) have gained significant attention for ATD in SAR, thanks to advancements in computing resources and machine learning techniques that facilitate feature extraction from massive SAR datasets (Zhang and Zhang, 2019; Song et al., 2023). Lately, however, CNNs have been criticized for their reliance on local patterns between neighboring pixels, making them dependent on specific image sizes and resolutions (Dosovitskiy et al., 2020). To address this issue, the transformer architecture (Vaswani et al., 2017), which uses self-attention for context comprehension in natural language processing, was introduced to computer vision. The Vision Transformer (ViT) model divides an image into patches and adds positional information (a positional embedding) to each patch to create sequence data. It then uses self-attention to compute similarity scores between the patch vectors to understand the context. Unlike CNNs, which extract local spatial information through convolution operations, ViT interprets images globally through self-attention, allowing for better context comprehension and feature extraction (Dosovitskiy et al., 2020).

Deep learning-based detection models have been actively researched and applied to aircraft target detection in SAR. Li et al. (2021) proposed a lightweight algorithm based on You Only Look Once (YOLO) v3, introducing a Reuse Block and an Information Correction Block to enhance aircraft feature extraction. Guo et al. (2021) improved aircraft detection accuracy by incorporating scattering information extracted through the Harris-Laplace detector. Studies have also applied transformer architectures to improve detection accuracy for other targets (Feng et al., 2023; Chen et al., 2022; Zhao et al., 2023). Chen et al. (2022) introduced sparse attention to mitigate overfitting in attention operations, enhancing ship detection accuracy compared to conventional CNN algorithms. Zhao et al. (2023) successfully combined a Swin-Transformer backbone with YOLO, achieving robust ship detection under complex backscatter patterns and speckle noise. Furthermore, Feng et al. (2023) showed that object-related queries focus precisely on target locations during the attention process, and proposed an Orientation Enhancement module and a Grouped Relation Contrastive loss function, resulting in significant performance improvement over the Detection Transformer (DETR).

Previous SAR-ATD research primarily focused on improving accuracy for stationary targets rather than moving targets (Xiao et al., 2022a; Zhao et al., 2020; Zhao et al., 2021). For purposes such as identifying military threats and tracking aircraft, it is necessary to go beyond simple detection and understand the state of the targets. Moving targets in SAR pose a challenge because conventional algorithms fail to identify them due to signal distortions caused by the Doppler effect. A SAR sensor transmits high-frequency electromagnetic waves and synthesizes the signals scattered from the surface to generate images. Consequently, if a target moves while the backscattered waves are being synthesized, distortions may occur: the Doppler effect displaces or defocuses the target in the azimuth direction according to its velocity component (Song et al., 2024).
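For reference, the magnitude of this azimuth displacement for a target with a velocity component along the slant range is commonly approximated by the first-order relation below, where R is the slant range, Vp the platform velocity, and vr the target's range velocity; this relation is quoted from standard SAR literature and is not one of this paper's numbered equations.

\(\begin{align}\Delta x \approx -\frac{R}{V_{p}} v_{r}\end{align}\)

For example, at a slant range of 600 km and a platform velocity of 7.5 km/s, a target moving at 5 m/s in the range direction is displaced by roughly 400 m in azimuth, which explains why moving aircraft can appear at implausible locations in the image.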

Non-parametric phase error correction methods, such as Phase Gradient Autofocus (PGA), are predominantly used to overcome these limitations for moving targets (Wahl et al., 1994). PGA-based refocusing algorithms have also been applied to Inverse Synthetic Aperture Radar (ISAR) to compensate for phase distortions caused by motion components, significantly improving image quality (Kim et al., 2014). However, applying PGA to SAR images requires prior detection and identification of target motion, which remains a challenge. Although Ground Moving Target Indication (GMTI) techniques such as Along-Track Interferometry (ATI) have been used for moving target recognition, these methods require high computational resources and cannot be applied to single-channel SAR images (Bae et al., 2017; Ban et al., 2023). Furthermore, deep learning-based moving target identification has not been fully explored due to a deficiency of training data.

To address these issues, this study proposes a method for performing ATD using a transformer-based deep learning algorithm and applying sub-aperture analysis to identify moving targets among the detected objects. The flow of our research is illustrated in Fig. 1. In the Materials and Methods section, we describe 1) image preprocessing and data generation for ATD training, 2) the characteristics and optimization of the DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection (DINO) model used for deep learning based ATD, and 3) the sub-aperture SAR analysis proposed for identifying moving targets. In the Results and Discussion section, we present the detection results for aircraft targets and the determination of their movement, followed by a conclusion.


Fig. 1. Flow chart of the research methodology. Kompsat-5 HH and VV Single Look Complex (SLC) images were used to build the training datasets. The DINO model was trained and tested for aircraft detection, and cosine similarity analysis on sub-aperture SAR images was adopted for moving aircraft recognition.

2. Materials and Methods

2.1. Data Preprocessing and Training Data Preparation

In this study, Level1-A (L1A) images acquired in spotlight mode by the Kompsat-5 (K5) satellite were used to generate Level1-D (L1D) images orthorectified to the ground. The sensor operates in the X-band at a center frequency of 9.66 GHz with a spatial resolution of 1 meter (Korea Aerospace Research Institute, 2012), rendering it suitable for surveillance and reconnaissance studies. Single-polarization imagery in either HH or VV polarization was employed. Preprocessing involved radiometric calibration and geometric correction along with speckle noise filtering to transform the imagery into Level1-D format, and the resulting dataset was converted to the EPSG:4326 geographic coordinate reference system. Supervised deep learning models require a substantial amount of labeled training data for effective performance. Thus, training data were generated by visual inspection using Electro-Optical (EO) images whose acquisition times were close to those of the K5 SAR images; the Computer Vision Annotation Tool (CVAT), an open-source labeling program, was employed for manual labeling. A moving aircraft's shape is distorted by azimuth defocusing, and the azimuth offset makes it appear at locations where an aircraft could not be. Based on these cues, the labeled training data were validated by expert visual inspection, resulting in the final data shown in Fig. 2. A total of 48 K5 SAR images were randomly divided 8:2 for training and validation, and each large scene was divided into patches of 400 × 400 pixels to serve as input data for the deep learning model.
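A minimal sketch of the scene-level split and patch extraction described above is given below, assuming single-band NumPy arrays and non-overlapping tiles; the tiling stride and function names are illustrative assumptions.

```python
import numpy as np

def tile_scene(scene, patch=400):
    """Cut a preprocessed large scene into non-overlapping patch x patch chips."""
    rows, cols = scene.shape
    return [scene[r:r + patch, c:c + patch]
            for r in range(0, rows - patch + 1, patch)
            for c in range(0, cols - patch + 1, patch)]

def split_scenes(scene_ids, train_ratio=0.8, seed=0):
    """Randomly split scene IDs 8:2 into training and validation sets."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(list(scene_ids))
    n_train = int(len(ids) * train_ratio)
    return ids[:n_train], ids[n_train:]

# e.g., 48 K5 scenes -> 38 for training, 10 for validation
train_ids, val_ids = split_scenes(range(48))
```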


Fig. 2. KOMPSAT-5 imagery used in the study. (a) Example of a preprocessed SAR image and (b) its corresponding labeled SAR image.

2.2. Deep Learning-Based Target Detection Algorithm Using DINO

In this study, we employed the DINO model introduced by Zhang et al. (2022) for object detection and optimized it starting from the default hyperparameters. The architecture of the DINO model is illustrated in Fig. 3. DINO is based on DETR (Carion et al., 2020), comprising encoder and decoder components in a transformer structure. Unlike CNN-based detectors, it predicts bounding boxes and classes through bipartite matching between predictions and ground truth. The DINO model introduced three innovations to resolve the slow convergence and low performance on small targets observed in DETR. First, Contrastive De-Noising (CDN) adds controlled noise to boxes during training, governed by a hyperparameter (λ), thereby improving the model's ability to distinguish between background and targets. Second, the look-forward-twice algorithm stabilizes training during backpropagation by considering both the bounding box information from previous layers and the gradients of the current layer when predicting boxes. Lastly, mixed query selection initializes positional queries from only the top-k features after the encoder process, enabling the extraction of contextual information within the image.


Fig. 3. Architecture of DINO used for detecting aircraft targets.

Unlike optical images, SAR images contain speckle noise and complex backscatter patterns arising from surface scattering and texture, making target detection challenging for traditional CNN detection algorithms (Feng et al., 2023). Studies using DETR-based detectors have reported better detection performance than CNN-based detectors (Feng et al., 2023; Zhang et al., 2024). It can therefore be expected that a transformer-based detector such as DINO will show robust performance even where targets overlap or are difficult to distinguish from the background. However, transformer-based detectors typically require a large amount of training data for generalization (Dosovitskiy et al., 2020). Accordingly, we improved detection accuracy by initializing the model with weights pre-trained on the open Common Objects in Context (COCO) dataset. The hyperparameters selected for the training process are presented in Table 1.
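Transferring COCO-pretrained weights follows the usual PyTorch state-dict pattern sketched below. The stand-in module and checkpoint path are illustrative assumptions; the actual model builder and released weights come from the DINO repository.

```python
import torch
from torch import nn

class TinyDetector(nn.Module):
    """Stand-in for the DINO detector, used only to illustrate weight transfer."""
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Linear(256, 256)        # placeholder for the real backbone
        self.class_head = nn.Linear(256, num_classes)

model = TinyDetector(num_classes=1)  # single "aircraft" class

# Checkpoint path is an assumption. strict=False transfers every matching
# backbone/transformer weight and skips the mismatched 91-class COCO head.
state = torch.load("dino_coco_pretrained.pth", map_location="cpu")["model"]
missing, unexpected = model.load_state_dict(state, strict=False)
```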

Table 1. Hyperparameter setting for training DINO to detect airplane target


ReLU: Rectified Linear Unit.

To evaluate the DINO model, the dataset was randomly split into training and testing sets at an 8:2 ratio, on which training and performance evaluation were respectively conducted. Training and testing on the large scene images were performed with the PyTorch framework using two GeForce RTX 3090 GPUs.

2.3. Sub-Aperture SAR Imaging for Moving Target Identification

Sub-aperture SAR images were generated to identify moving targets based on the Doppler centroid frequency shift. The spotlight SAR imaging mode used in this study greatly increases the virtual antenna length synthesized through beam steering to enhance resolution, consequently widening the Doppler frequency bandwidth compared to other SAR imaging modes (Ouchi, 2010). If a SAR image is generated from only a portion of the Doppler bandwidth, the effect is equivalent to the SAR antenna acquiring the image at a squint angle (Fig. 4).


Fig. 4. Illustration of the sub-aperture SAR imaging process.

The spotlight mode has a longer coherent integration time than other imaging modes, which allows the simulation of sub-aperture SAR images with higher squint angles.

First, to create the sub-aperture images, this study defined a window function as shown in Eq. (1). Eq. (1) denotes a Kaiser window, where M is the size of the window, n is a frequency index, and the parameter β determines the shape of the frequencies passed through the window. A window of size (Ra × r) was created, where Ra is the azimuth length of the image and r is the range-direction length. After the SLC image was transformed into the frequency domain, the relevant frequency band was multiplied by the window. Then, by performing an inverse fast Fourier transform (IFFT), positive and negative Doppler frequency sub-aperture SAR images were acquired.

\(\begin{align}\omega(n)=I_{0}\left(\beta \sqrt{1-\frac{4 n^{2}}{(M-1)^{2}}}\right) / I_{0}(\beta)\end{align}\)       (1)
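A minimal NumPy sketch of this procedure is shown below, assuming the azimuth axis is the first array dimension and that each sub-aperture uses half of the Doppler bandwidth; both are assumptions, and β is the Kaiser parameter of Eq. (1).

```python
import numpy as np

def sub_aperture_images(slc, beta=2.0):
    """Split an SLC image into negative/positive Doppler sub-aperture images."""
    n_az, _ = slc.shape
    # Azimuth FFT, shifted so the negative/positive Doppler halves are explicit.
    spec = np.fft.fftshift(np.fft.fft(slc, axis=0), axes=0)

    half = n_az // 2
    win = np.kaiser(half, beta)[:, np.newaxis]      # Kaiser window of Eq. (1)

    neg = np.zeros_like(spec)
    pos = np.zeros_like(spec)
    neg[:half] = spec[:half] * win                  # negative Doppler frequencies
    pos[n_az - half:] = spec[n_az - half:] * win    # positive Doppler frequencies

    to_image = lambda s: np.fft.ifft(np.fft.ifftshift(s, axes=0), axis=0)
    return to_image(neg), to_image(pos)
```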

Target velocity-induced phase distortion in sub-aperture SAR images can be observed through the relationship between the Doppler centroid frequency and the squint angle: a SAR image formed from only a portion of the Doppler frequency bins is subject to a squint angle. In Eq. (2), η denotes azimuth time, fη the Doppler frequency, ν the velocity component of the target scatterer, λ the wavelength, and θs the squint angle for the scatterer.

\(\begin{align}f_{\eta}=\frac{2 v \sin \theta_{s}(\eta)}{\lambda}\end{align}\)       (2)

When a target with range velocity is imaged under different squint angles, the projection of its range velocity changes, producing an azimuth shift of the target position (Davidson and Cumming, 1997). Although the size of the images remains the same, the target position changes when the sub-apertures are visualized on individual bands, as shown in Fig. 5 (Greidanus, 2006).


Fig. 5. Sub-aperture SAR images. (a), (d) Positive Doppler frequency SAR images. (b), (e) Negative Doppler frequency SAR images. (c), (f) Results after merging each sub-aperture SAR image into a color band.

Cosine similarity was used for quantitative assessment of the differences exhibited in sub-aperture images. Cosine similarity estimates the correlation between two vectors as the cosine of the angle between them, allowing quantitative evaluation of the positional changes of aircraft due to Doppler centroid frequency variations. In this study, the cosine similarity between two sub-aperture images was calculated according to Eq. (3) within the target bounding boxes obtained from the preprocessed K5 data, measuring the morphological differences between the sub-aperture images.

\(\begin{align}\operatorname{Cos}(\theta)=\frac{X \cdot Y}{\|X\|\|Y\|}\end{align}\)       (3)

As a higher correlation between two images indicates higher structural similarity of the image matrices (Wang et al., 2004), a high cosine similarity implies a stationary target, while a low value suggests a moving one. Distinguishing target movement in this way requires an empirical threshold. Therefore, to determine the threshold, Section 2.4 estimates the cosine similarity between sub-aperture SAR images and examines its probability distributions.
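The per-target computation reduces to Eq. (3) applied to the flattened ROI chips, as in the NumPy sketch below; using intensity magnitudes of the complex chips and a small epsilon for numerical safety are assumptions.

```python
import numpy as np

def cosine_similarity(roi_a, roi_b):
    """Cosine similarity (Eq. 3) between two sub-aperture ROI chips."""
    x = np.abs(roi_a).ravel()   # magnitude of the complex chip
    y = np.abs(roi_b).ravel()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))

def is_moving(roi_pos, roi_neg, threshold=0.5):
    """Declare the target moving when the similarity between its positive and
    negative Doppler chips falls below the empirical threshold of Section 2.4."""
    return cosine_similarity(roi_pos, roi_neg) < threshold
```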

2.4. Calculation of Cosine Similarity Threshold Between Sub-Aperture Target Images

To determine the movement status of aircraft targets, we analyzed the probability distribution of similarity for each situation and configured the threshold accordingly. We extracted sub-aperture images for each ROI defined by the bounding box coordinates (x, y, w, h). The cosine similarity between each pair of sub-images was calculated, and the distributions were analyzed. Fig. 6 depicts the probability distributions of similarity for moving and stationary aircraft. Moving aircraft showed a distribution with a mean of 0.286 and a standard deviation of 0.189, while stationary aircraft exhibited a mean of 0.731 and a standard deviation of 0.159. These distributions indicate that, for a target with a velocity component along the range direction, the target's position and structure in the image shift as the Doppler centroid frequency changes.


Fig. 6. Identification of moving aircraft. (a) Probability density distributions of cosine similarity between sub-aperture images used as training data. (b) Example of a moving aircraft with high cosine similarity. (c) Example of a stationary aircraft with low cosine similarity.

Consequently, moving targets showed structural changes between sub-aperture images and followed a cosine similarity probability distribution different from that of the stationary targets illustrated in Fig. 5. Based on these distributions, a cosine similarity threshold of 0.5, near the intersection of the two distributions, was selected to minimize missed and false identifications. Within the overlap region of roughly 0.4-0.5, targets are hard to discriminate, as shown in Figs. 6(b) and (c); outside this range, the status of the targets is generally well distinguished.
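As a rough numerical check of this choice, the crossing point of the two reported distributions can be located by scanning between the two means; modeling the empirical histograms as Gaussians with the quoted means and standard deviations is an assumption made only for this sketch.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Normal probability density function."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

s = np.linspace(0.286, 0.731, 10001)                   # scan between the two means
diff = gauss_pdf(s, 0.286, 0.189) - gauss_pdf(s, 0.731, 0.159)
crossing = s[np.argmin(np.abs(diff))]
print(round(float(crossing), 3))                       # ~0.51, near the 0.5 threshold
```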

3. Results and Discussion

The DINO model detected aircraft targets, and their movement was discerned using the cosine similarity between the positive and negative frequency sub-aperture images. Target detection performance was evaluated by applying the mean average precision (mAP@0.5), which is commonly used to evaluate object detection algorithms. For the evaluation of target motion identification, the geometric mean (Gmean) was used. The metrics follow Eqs. (4)-(8).

\(\begin{align}\mathrm{mAP}@0.5=\int_{0}^{1} P \, dR, \; \text {if} \; IoU_{\text {threshold}}=0.5\end{align}\)       (4)

\(\begin{align}G_{\text {mean }}=\sqrt{\text { Sensitivity } \times \text { Specificity }}\end{align}\)       (5)

\(\begin{align}\text{TP}=\frac{\text{prediction}_{\text{true}}}{\text{ground truth}_{\text{true}}}, \; \text{FN}=\frac{\text{prediction}_{\text{false}}}{\text{ground truth}_{\text{true}}}, \; \text{TN}=\frac{\text{prediction}_{\text{false}}}{\text{ground truth}_{\text{false}}}, \; \text{FP}=\frac{\text{prediction}_{\text{true}}}{\text{ground truth}_{\text{false}}}\end{align}\)       (6)

\(\begin{align} \text {Sensitivity} = \frac{TP}{TP+FN} \end{align}\)       (7)

\(\begin{align} \text {Specificity}=\frac{TN}{TN+FP} \end{align}\)       (8)

The mAP@0.5 represents the area under the precision-recall curve, where a higher value indicates superior detection accuracy. The intersection over union (IoU) is the ratio of the intersection to the union of the areas of two bounding boxes; an aircraft was counted as detected when the IoU between the predicted and ground truth bounding boxes exceeded 0.5. Gmean is the square root of the product of sensitivity and specificity, which is advantageous for evaluating classification accuracy in the presence of data imbalance. By employing both mAP@0.5 and Gmean, detection accuracy and moving target identification accuracy were quantitatively evaluated.
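In code, Gmean reduces to a few lines. The confusion counts in the usage example below are those described for Table 3 later in this section (6 of 7 moving aircraft identified, no stationary aircraft misclassified) and are used purely for illustration.

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity, per Eqs. (5), (7), (8)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

print(round(g_mean(tp=6, fn=1, tn=58, fp=0), 3))  # 0.926
```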

Quantitative comparisons were conducted for aircraft detection accuracy and moving target identification. To assess the effectiveness of the proposed method, two experiments were performed. In the first case, the deep learning model was trained to discriminate two classes of aircraft: stationary and moving. In the second case, the model predicted a single aircraft class integrating the two, followed by our proposed identification algorithm based on sub-aperture analysis and cosine similarity. Because training data for moving aircraft are scarce and constructing such datasets is challenging, a purely deep learning-based method can perform poorly, as it cannot extract reliable patterns from so few samples. Compared to the deep learning method, the proposed algorithm is expected to work well under such restricted circumstances.

The input data used for deep learning training comprised image patches of 400 × 400 pixels extracted from the K5 SAR images, containing 2,244 instances of stationary aircraft and 59 instances of moving aircraft. To prevent overfitting, data augmentation techniques such as image resizing and horizontal flipping were applied during training (a minimal flip sketch follows Table 2). The comparative results are presented in Table 2. The mAP@0.5 reached 0.661 when aircraft were treated as a single category, compared to 0.657 when classified into two categories. Regarding Gmean, applying the sub-aperture-based identification algorithm to single-class aircraft detections yielded 0.591, a 1.7% increase over the deep learning-based identification result (0.574).

Table 2. Aircraft detection performance based on category classification

Method | mAP@0.5 | Gmean
Two-class detection (stationary / moving aircraft) | 0.657 | 0.574
One-class detection + sub-aperture identification | 0.661 | 0.591
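For detection data, a horizontal flip must transform the bounding boxes together with the pixels. A minimal sketch is given below, assuming (x, y, w, h) boxes with (x, y) at the top-left corner; the box convention and function name are assumptions.

```python
import numpy as np

def hflip_patch(image, boxes):
    """Horizontally flip a SAR patch and its (x, y, w, h) bounding boxes."""
    _, width = image.shape
    flipped = image[:, ::-1].copy()
    boxes = boxes.copy()
    boxes[:, 0] = width - boxes[:, 0] - boxes[:, 2]  # mirror the top-left x
    return flipped, boxes
```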

Fig. 7(a) depicts the case where single-class targets were detected and moving aircraft were identified using the sub-aperture-based identification algorithm, while (b) shows the results of deep learning detection with separate classes for stationary and moving aircraft. In (a), all four moving aircraft were accurately detected, with two false alarms on stationary aircraft. In (b), a false alarm occurred on a moving aircraft, and false and missed detections were observed for stationary aircraft. In Fig. 7(a), the values on the bounding boxes show the cosine similarity between sub-aperture images; targets exhibiting low similarity values indeed corresponded to moving aircraft. This implies that the generalization performance of the two-class deep learning model was undermined by insufficient training data for moving aircraft. In contrast, single-class aircraft detection combined with the proposed algorithm showed enhanced performance in discerning movement, likely because it generalizes over the shapes of various aircraft and analyzes the differences between the two sub-aperture images.


Fig. 7. Example of aircraft detection and moving aircraft identification. Results of (a) combining one-class detection with the sub-aperture-based identification algorithm and (b) two-class detection (aircraft and moving aircraft). The number near each bounding box in (a) is the cosine similarity; a value lower than 0.5 indicates a moving aircraft.

Table 3 and Fig. 8 present the accuracy and qualitative results of the cosine similarity-based moving target identification algorithm applied to the L1-D large scene image. Among 65 aircraft targets, seven were moving, six of which were successfully identified by the proposed method. The one unrecognized moving aircraft had not been detected by the ATD model in the first place and was therefore counted as non-moving. As shown in Fig. 8, moving aircraft on the runway were accurately detected, and taxiing aircraft were likewise identified. Furthermore, there were no cases of misclassification between stationary and moving aircraft, demonstrating the effective performance of the proposed sub-aperture-based algorithm under general conditions.

Table 3. Confusion matrix results of sub-aperture-based moving target identification algorithm

Actual \ Predicted | Moving | Stationary
Moving | 6 | 1
Stationary | 0 | 58


Fig. 8. Example of moving target identification results from Level1-D images.

4. Conclusions

Conventional studies on aircraft detection have primarily focused on improving accuracy in classifying or detecting aircraft types. The ability to identify the movement status of targets in SAR images, however, provides access to additional information. From this perspective, the contribution of this study is its ability to distinguish target states even when training data for moving targets are limited. We proposed an approach that detects aircraft using the DINO model, known for its strong detection performance, and then applies a sub-aperture-based algorithm to determine whether each aircraft is moving. For the ROIs detected by the DINO model, sub-images were extracted from each sub-aperture image, and the cosine similarity was estimated for each pair.

Training and evaluation were conducted using a total of 48 images divided into training and testing data. In the combined evaluation of detection accuracy and the identification algorithm, the proposed method showed 1.7% higher accuracy than the experiment with two separate classes. The proposed sub-aperture-based identification algorithm, however, required manual threshold selection based on the images extracted from the ROIs, so additional analysis is needed to address robust threshold selection. Furthermore, the proposed algorithm is bounded by the detection accuracy of the ATD model; with a more robust detector, this method has even greater potential for moving target recognition.

Acknowledgments

This work was supported by the Korea Research Institute for Defense Technology Planning and Advancement (KRIT) grant funded by the Korean government's Defense Acquisition Program Administration (DAPA) (KRIT-CT-22-040, Heterogeneous Satellite Constellation-Based ISR Research Center, 2022) and by the Institute of Civil-Military Technology Cooperation funded by DAPA and the Ministry of Trade, Industry and Energy of the Korean government under grant no. 22-CM-16.

Conflict of Interest

No potential conflict of interest relevant to this article was reported.

References

  1. Akosa, J., 2017. Predictive accuracy: A misleading performance measure for highly imbalanced data. In Proceedings of the SAS Global Forum, Cary, NC, USA, Apr. 2-5, pp. 1-12. https://support.sas.com/resources/papers/proceedings17/0942-2017.pdf
  2. Ali, M., Hossein, B., and Iman, K., 2023. Enhancing crop classification accuracy through synthetic SAR-optical data generation using deep learning. ISPRS International Journal of Geo-Information, 12(11), 450. https://doi.org/10.3390/ijgi12110450
  3. Bae, C-S., Jeon, H-M., Yang, D-H., and Yang, H-G., 2017. Ground moving target's velocity estimation in SAR-GMTI. The Journal of Korean Institute of Electromagnetic Engineering and Science, 28(2), 139-146. https://doi.org/10.5515/KJKIEES.2017.28.2.139
  4. Ban, I-M., Jo, H-J., Lee, S-W., Lee, M-J., Lee, W-K., and Yeo, K. G., 2023. High-resolution spotlight mode design for moving target detection and tracking using a single-channel spaceborne SAR system. The Journal of Korean Institute of Electromagnetic Engineering and Science, 34(3), 176-189. https://doi.org/10.5515/KJKIEES.2023.34.3.176
  5. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S., 2020. End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds.), Computer Vision - ECCV 2020, Springer, pp. 213-229. https://doi.org/10.1007/978-3-030-58452-8_13
  6. Chen, Y. Y., Xia, Z. H., Liu, J., and Wu, C. W., 2022. TSDet: End-to-end method with transformer for SAR ship detection. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, July 18-23, pp. 1-8. https://doi.org/10.1109/IJCNN55064.2022.9891879
  7. Davidson, G. W., and Cumming, I., 1997. Signal properties of spaceborne squint-mode SAR. IEEE Transactions on Geoscience and Remote Sensing, 35(3), 611-617. https://doi.org/10.1109/36.581976
  8. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T. et al., 2020. An image is worth 16×16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929. https://doi.org/10.48550/arXiv.2010.11929
  9. Feng, Y., You, Y., Tian, J., and Gang, M., 2023. OEGR-DETR: A novel detection transformer based on orientation enhancement and group relations for SAR object detection. Remote Sensing, 16(1), 106. https://doi.org/10.3390/rs16010106
  10. Girshick, R., Donahue, J., Darrell, T., and Malik, J., 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, June 23-28, pp. 580-587. https://doi.org/10.1109/CVPR.2014.81
  11. Greidanus, H., 2006. Sub-aperture behavior of SAR signatures of ships. In Proceedings of the 2006 IEEE International Symposium on Geoscience and Remote Sensing (IGARSS), Denver, CO, USA, July 31-Aug. 4, pp. 3579-3582. https://doi.org/10.1109/IGARSS.2006.917
  12. Guo, Q., Wang, H., and Xu, F., 2021. Scattering enhanced attention pyramid network for aircraft detection in SAR images. IEEE Transactions on Geoscience and Remote Sensing, 59(9), 7570-7587. https://doi.org/10.1109/TGRS.2020.30277624
  13. Huang, S. Q., Zhao, W. W., and Luo, P., 2019. Target detection of SAR image based on wavelet and empirical mode decomposition. In Proceedings of the 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR), Xiamen, China, Nov. 26-29, pp. 1-6. https://doi.org/10.1109/APSAR46974.2019.9048482
  14. Kim, D. H., Lee, Y-K., and Kim, S-W., 2020. Ship detection based on KOMPSAT-5 SLC image and AIS data. Korean Journal of Remote Sensing, 36(2-2), 365-377. https://doi.org/10.7780/kjrs.2020.36.2.2.11
  15. Kim, K. E., Kim, Y. C., and Park, S. C., 2014. Improvement of ISAR autofocusing performance based on PGA. Journal of the Korea Institute of Military Science and Technology, 17(5), 680-687. https://doi.org/10.9766/KIMST.2014.17.5.680
  16. Korea Aerospace Research Institute, 2012. KOMPSAT-5 (Korea Multi-Purpose Satellite-5) / Arirang-5. Available online: https://www.eoportal.org/satellite-missions/kompsat5#kompsat-5-korea-multi-purpose-satellite-5--arirang-5 (accessed on Feb. 23, 2024).
  17. Li, M. W., Wen, G. J., Huang, X. H., Li, K. H., and Lin, S. Z., 2021. A lightweight detection model for SAR aircraft in a complex environment. Remote Sensing, 13(24), 5020. https://doi.org/10.3390/rs13245020
  18. Ouchi, K., 2010. Principles of synthetic aperture radar for remote sensing. Dasom Publishing Company. https://policy.nl.go.kr/search/searchDetail.do?rec_key=SH1_UMO20170596098
  19. Redmon, J., Divvala, S., Girshick, R., and Farhadi, A., 2016. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, June 27-30, pp. 779-788. https://doi.ieeecomputersociety.org/10.1109/CVPR.2016.91
  20. Song, J. Y., Kim, D-J., Kim, J. W., and Li, C. L., 2022. Exploitation of dual-polarimetric index of Sentinel-1 SAR data in vessel detection utilizing machine learning. Korean Journal of Remote Sensing, 38(5-2), 737-746. https://doi.org/10.7780/kjrs.2022.38.5.2.7
  21. Song, J. Y., Kim, D-J., Hwang, J-H., Kim, H. S., Li, C. L., Han, S. H. et al., 2024. Effective vessel recognition in high resolution SAR images utilizing quantitative and qualitative training data enhancement from target velocity phase refocusing. IEEE Transactions on Geoscience and Remote Sensing, 62, 1-14. https://doi.org/10.1109/TGRS.2023.3346171
  22. Tian, C. Y., Liu, D. C., Xue, F. L., Lv, Z. S., and Wu, X. Y., 2024. Faster and lighter: A Novel ship detector for SAR images. IEEE Geoscience and Remote Sensing Letters, 21, 1-5. https://doi.org/10.1109/LGRS.2024.3351132
  23. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N. et al., 2017. Attention is all you need. arXiv preprint arXiv:1706.03762. https://doi.org/10.48550/arXiv.1706.03762
  24. Wahl, D. E., Eichel, P. H., Ghiglia, D. C., and Jakowatz, C. V., 1994. Phase gradient autofocus - a robust tool for high resolution SAR phase correction. IEEE Transactions on Aerospace and Electronic Systems, 30(3), 827-835. https://doi.org/10.1109/7.303752
  25. Wang, S. Y., Cai, Z. C., and Yuan, J. Y., 2023. Automatic SAR ship detection based on multi-feature fusion network in spatial and frequency domain. IEEE Transactions on Geoscience and Remote Sensing, 61, 1-11. https://doi.org/10.1109/TGRS.2023.3267495
  26. Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P., 2004. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600-612. https://doi.org/10.1109/TIP.2003.819861
  27. Xiao, X. Y., Jia, H. C., Xiao, P. H., and Wang, H. P., 2022a. Aircraft detection in SAR images based on peak feature fusion and adaptive deformable network. Remote Sensing, 14(23), 6077. https://doi.org/10.3390/rs14236077
  28. Xiao, X. Y., Yu, X. P., and Wang, H. P., 2022b. A high-efficiency aircraft detection approach utilizing auxiliary information in SAR images. In Proceeding of the IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, July 17-22, pp. 1700-1703. https://doi.org/10.1109/IGARSS46834.2022.9884883
  29. Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J. et al., 2022. DINO: DETR with improved denoising anchor boxes for end-to-end object detection. arXiv preprint arXiv:2203.03605. https://doi.org/10.48550/arXiv.2203.03605
  30. Zhang, L. B., Li, C. Y., Zhao, L. J., Xiong, B. L., Quan, S., and Kuang, G. Y., 2020. A cascaded three-look network for aircraft detection in SAR images. Remote Sensing Letters, 11, 57-65. https://doi.org/10.1080/2150704X.2019.1681599
  31. Zhang, L., Zheng, J., Li, C., Xu, Z., Yang, J., Wei, Q. et al., 2024. CCDN-DETR: A detection transformer based on constrained contrast denoising for multi-class synthetic aperture radar object detection. Sensors, 24, 1793. https://doi.org/10.3390/s24061793
  32. Zhang, T. W., and Zhang, X. L., 2019. High-speed ship detection in SAR images based on a grid convolutional neural network. Remote Sensing, 11(10), 1206. https://doi.org/10.3390/rs11101206
  33. Zhao, K., Lu, R. T., Wang, S. Y., Yang, X. G., Li, Q. G., and Fan, J. W., 2023. ST-YOLOA: A swin-transformer-based YOLO model with an attention mechanism for SAR ship detection under complex background. Frontiers in Neurorobotics, 17, 1170163. https://doi.org/10.3389/fnbot.2023.1170163
  34. Zhao, Y., Zhao, L. J., Li, C. Y., and Kuang, G. Y., 2020. Pyramid attention dilated network for aircraft detection in SAR images. IEEE Geoscience and Remote Sensing Letters, 18(4), 662-666. https://doi.org/10.1109/LGRS.2020.2981255
  35. Zhao, Y., Zhao, L. J., Liu, Z., Hu, D., Kuang, G., and Liu, L., 2021. Attentional feature refinement and alignment network for aircraft detection in SAR Imagery. IEEE Transactions on Geoscience and Remote Sensing, 60, 1-16. https://doi.org/10.1109/TGRS.2021.3139994