• Title/Summary/Keyword: Resolution of Image

Search Result 3,691, Processing Time 0.028 seconds

Real-time Color Recognition Based on Graphic Hardware Acceleration (그래픽 하드웨어 가속을 이용한 실시간 색상 인식)

  • Kim, Ku-Jin;Yoon, Ji-Young;Choi, Yoo-Joo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.1
    • /
    • pp.1-12
    • /
    • 2008
  • In this paper, we present a real-time algorithm for recognizing the vehicle color from the indoor and outdoor vehicle images based on GPU (Graphics Processing Unit) acceleration. In the preprocessing step, we construct feature victors from the sample vehicle images with different colors. Then, we combine the feature vectors for each color and store them as a reference texture that would be used in the GPU. Given an input vehicle image, the CPU constructs its feature Hector, and then the GPU compares it with the sample feature vectors in the reference texture. The similarities between the input feature vector and the sample feature vectors for each color are measured, and then the result is transferred to the CPU to recognize the vehicle color. The output colors are categorized into seven colors that include three achromatic colors: black, silver, and white and four chromatic colors: red, yellow, blue, and green. We construct feature vectors by using the histograms which consist of hue-saturation pairs and hue-intensity pairs. The weight factor is given to the saturation values. Our algorithm shows 94.67% of successful color recognition rate, by using a large number of sample images captured in various environments, by generating feature vectors that distinguish different colors, and by utilizing an appropriate likelihood function. We also accelerate the speed of color recognition by utilizing the parallel computation functionality in the GPU. In the experiments, we constructed a reference texture from 7,168 sample images, where 1,024 images were used for each color. The average time for generating a feature vector is 0.509ms for the $150{\times}113$ resolution image. After the feature vector is constructed, the execution time for GPU-based color recognition is 2.316ms in average, and this is 5.47 times faster than the case when the algorithm is executed in the CPU. Our experiments were limited to the vehicle images only, but our algorithm can be extended to the input images of the general objects.

Development of an Offline Based Internal Organ Motion Verification System during Treatment Using Sequential Cine EPID Images (연속촬영 전자조사 문 영상을 이용한 오프라인 기반 치료 중 내부 장기 움직임 확인 시스템의 개발)

  • Ju, Sang-Gyu;Hong, Chae-Seon;Huh, Woong;Kim, Min-Kyu;Han, Young-Yih;Shin, Eun-Hyuk;Shin, Jung-Suk;Kim, Jing-Sung;Park, Hee-Chul;Ahn, Sung-Hwan;Lim, Do-Hoon;Choi, Doo-Ho
    • Progress in Medical Physics
    • /
    • v.23 no.2
    • /
    • pp.91-98
    • /
    • 2012
  • Verification of internal organ motion during treatment and its feedback is essential to accurate dose delivery to the moving target. We developed an offline based internal organ motion verification system (IMVS) using cine EPID images and evaluated its accuracy and availability through phantom study. For verification of organ motion using live cine EPID images, a pattern matching algorithm using an internal surrogate, which is very distinguishable and represents organ motion in the treatment field, like diaphragm, was employed in the self-developed analysis software. For the system performance test, we developed a linear motion phantom, which consists of a human body shaped phantom with a fake tumor in the lung, linear motion cart, and control software. The phantom was operated with a motion of 2 cm at 4 sec per cycle and cine EPID images were obtained at a rate of 3.3 and 6.6 frames per sec (2 MU/frame) with $1,024{\times}768$ pixel counts in a linear accelerator (10 MVX). Organ motion of the target was tracked using self-developed analysis software. Results were compared with planned data of the motion phantom and data from the video image based tracking system (RPM, Varian, USA) using an external surrogate in order to evaluate its accuracy. For quantitative analysis, we analyzed correlation between two data sets in terms of average cycle (peak to peak), amplitude, and pattern (RMS, root mean square) of motion. Averages for the cycle of motion from IMVS and RPM system were $3.98{\pm}0.11$ (IMVS 3.3 fps), $4.005{\pm}0.001$ (IMVS 6.6 fps), and $3.95{\pm}0.02$ (RPM), respectively, and showed good agreement on real value (4 sec/cycle). Average of the amplitude of motion tracked by our system showed $1.85{\pm}0.02$ cm (3.3 fps) and $1.94{\pm}0.02$ cm (6.6 fps) as showed a slightly different value, 0.15 (7.5% error) and 0.06 (3% error) cm, respectively, compared with the actual value (2 cm), due to time resolution for image acquisition. In analysis of pattern of motion, the value of the RMS from the cine EPID image in 3.3 fps (0.1044) grew slightly compared with data from 6.6 fps (0.0480). The organ motion verification system using sequential cine EPID images with an internal surrogate showed good representation of its motion within 3% error in a preliminary phantom study. The system can be implemented for clinical purposes, which include organ motion verification during treatment, compared with 4D treatment planning data, and its feedback for accurate dose delivery to the moving target.

A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye;Kim, Min-Seung;Lee, Chan-Ho;Choi, Jung-Hwan;Lee, Jeong-Hee;Sung, Tae-Eung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.131-145
    • /
    • 2020
  • In line with the trend of industrial innovation, IoT technology utilized in a variety of fields is emerging as a key element in creation of new business models and the provision of user-friendly services through the combination of big data. The accumulated data from devices with the Internet-of-Things (IoT) is being used in many ways to build a convenience-based smart system as it can provide customized intelligent systems through user environment and pattern analysis. Recently, it has been applied to innovation in the public domain and has been using it for smart city and smart transportation, such as solving traffic and crime problems using CCTV. In particular, it is necessary to comprehensively consider the easiness of securing real-time service data and the stability of security when planning underground services or establishing movement amount control information system to enhance citizens' or commuters' convenience in circumstances with the congestion of public transportation such as subways, urban railways, etc. However, previous studies that utilize image data have limitations in reducing the performance of object detection under private issue and abnormal conditions. The IoT device-based sensor data used in this study is free from private issue because it does not require identification for individuals, and can be effectively utilized to build intelligent public services for unspecified people. Especially, sensor data stored by the IoT device need not be identified to an individual, and can be effectively utilized for constructing intelligent public services for many and unspecified people as data free form private issue. We utilize the IoT-based infrared sensor devices for an intelligent pedestrian tracking system in metro service which many people use on a daily basis and temperature data measured by sensors are therein transmitted in real time. The experimental environment for collecting data detected in real time from sensors was established for the equally-spaced midpoints of 4×4 upper parts in the ceiling of subway entrances where the actual movement amount of passengers is high, and it measured the temperature change for objects entering and leaving the detection spots. The measured data have gone through a preprocessing in which the reference values for 16 different areas are set and the difference values between the temperatures in 16 distinct areas and their reference values per unit of time are calculated. This corresponds to the methodology that maximizes movement within the detection area. In addition, the size of the data was increased by 10 times in order to more sensitively reflect the difference in temperature by area. For example, if the temperature data collected from the sensor at a given time were 28.5℃, the data analysis was conducted by changing the value to 285. As above, the data collected from sensors have the characteristics of time series data and image data with 4×4 resolution. Reflecting the characteristics of the measured, preprocessed data, we finally propose a hybrid algorithm that combines CNN in superior performance for image classification and LSTM, especially suitable for analyzing time series data, as referred to CNN-LSTM (Convolutional Neural Network-Long Short Term Memory). In the study, the CNN-LSTM algorithm is used to predict the number of passing persons in one of 4×4 detection areas. We verified the validation of the proposed model by taking performance comparison with other artificial intelligence algorithms such as Multi-Layer Perceptron (MLP), Long Short Term Memory (LSTM) and RNN-LSTM (Recurrent Neural Network-Long Short Term Memory). As a result of the experiment, proposed CNN-LSTM hybrid model compared to MLP, LSTM and RNN-LSTM has the best predictive performance. By utilizing the proposed devices and models, it is expected various metro services will be provided with no illegal issue about the personal information such as real-time monitoring of public transport facilities and emergency situation response services on the basis of congestion. However, the data have been collected by selecting one side of the entrances as the subject of analysis, and the data collected for a short period of time have been applied to the prediction. There exists the limitation that the verification of application in other environments needs to be carried out. In the future, it is expected that more reliability will be provided for the proposed model if experimental data is sufficiently collected in various environments or if learning data is further configured by measuring data in other sensors.

A Mobile Landmarks Guide : Outdoor Augmented Reality based on LOD and Contextual Device (모바일 랜드마크 가이드 : LOD와 문맥적 장치 기반의 실외 증강현실)

  • Zhao, Bi-Cheng;Rosli, Ahmad Nurzid;Jang, Chol-Hee;Lee, Kee-Sung;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.1
    • /
    • pp.1-21
    • /
    • 2012
  • In recent years, mobile phone has experienced an extremely fast evolution. It is equipped with high-quality color displays, high resolution cameras, and real-time accelerated 3D graphics. In addition, some other features are includes GPS sensor and Digital Compass, etc. This evolution advent significantly helps the application developers to use the power of smart-phones, to create a rich environment that offers a wide range of services and exciting possibilities. To date mobile AR in outdoor research there are many popular location-based AR services, such Layar and Wikitude. These systems have big limitation the AR contents hardly overlaid on the real target. Another research is context-based AR services using image recognition and tracking. The AR contents are precisely overlaid on the real target. But the real-time performance is restricted by the retrieval time and hardly implement in large scale area. In our work, we exploit to combine advantages of location-based AR with context-based AR. The system can easily find out surrounding landmarks first and then do the recognition and tracking with them. The proposed system mainly consists of two major parts-landmark browsing module and annotation module. In landmark browsing module, user can view an augmented virtual information (information media), such as text, picture and video on their smart-phone viewfinder, when they pointing out their smart-phone to a certain building or landmark. For this, landmark recognition technique is applied in this work. SURF point-based features are used in the matching process due to their robustness. To ensure the image retrieval and matching processes is fast enough for real time tracking, we exploit the contextual device (GPS and digital compass) information. This is necessary to select the nearest and pointed orientation landmarks from the database. The queried image is only matched with this selected data. Therefore, the speed for matching will be significantly increased. Secondly is the annotation module. Instead of viewing only the augmented information media, user can create virtual annotation based on linked data. Having to know a full knowledge about the landmark, are not necessary required. They can simply look for the appropriate topic by searching it with a keyword in linked data. With this, it helps the system to find out target URI in order to generate correct AR contents. On the other hand, in order to recognize target landmarks, images of selected building or landmark are captured from different angle and distance. This procedure looks like a similar processing of building a connection between the real building and the virtual information existed in the Linked Open Data. In our experiments, search range in the database is reduced by clustering images into groups according to their coordinates. A Grid-base clustering method and user location information are used to restrict the retrieval range. Comparing the existed research using cluster and GPS information the retrieval time is around 70~80ms. Experiment results show our approach the retrieval time reduces to around 18~20ms in average. Therefore the totally processing time is reduced from 490~540ms to 438~480ms. The performance improvement will be more obvious when the database growing. It demonstrates the proposed system is efficient and robust in many cases.

Performance Evaluation of Siemens CTI ECAT EXACT 47 Scanner Using NEMA NU2-2001 (NEMA NU2-2001을 이용한 Siemens CTI ECAT EXACT 47 스캐너의 표준 성능 평가)

  • Kim, Jin-Su;Lee, Jae-Sung;Lee, Dong-Soo;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine
    • /
    • v.38 no.3
    • /
    • pp.259-267
    • /
    • 2004
  • Purpose: NEMA NU2-2001 was proposed as a new standard for performance evaluation of whole body PET scanners. in this study, system performance of Siemens CTI ECAT EXACT 47 PET scanner including spatial resolution, sensitivity, scatter fraction, and count rate performance in 2D and 3D mode was evaluated using this new standard method. Methods: ECAT EXACT 47 is a BGO crystal based PET scanner and covers an axial field of view (FOV) of 16.2 cm. Retractable septa allow 2D and 3D data acquisition. All the PET data were acquired according to the NEMA NU2-2001 protocols (coincidence window: 12 ns, energy window: $250{\sim}650$ keV). For the spatial resolution measurement, F-18 point source was placed at the center of the axial FOV((a) x=0, and y=1, (b)x=0, and y=10, (c)x=70, and y=0cm) and a position one fourth of the axial FOV from the center ((a) x=0, and y=1, (b)x=0, and y=10, (c)x=10, and y=0cm). In this case, x and y are transaxial horizontal and vertical, and z is the scanner's axial direction. Images were reconstructed using FBP with ramp filter without any post processing. To measure the system sensitivity, NEMA sensitivity phantom filled with F-18 solution and surrounded by $1{\sim}5$ aluminum sleeves were scanned at the center of transaxial FOV and 10 cm offset from the center. Attenuation free values of sensitivity wire estimated by extrapolating data to the zero wall thickness. NEMA scatter phantom with length of 70 cm was filled with F-18 or C-11solution (2D: 2,900 MBq, 3D: 407 MBq), and coincidence count rates wire measured for 7 half-lives to obtain noise equivalent count rate (MECR) and scatter fraction. We confirmed that dead time loss of the last flame were below 1%. Scatter fraction was estimated by averaging the true to background (staffer+random) ratios of last 3 frames in which the fractions of random rate art negligibly small. Results: Axial and transverse resolutions at 1cm offset from the center were 0.62 and 0.66 cm (FBP in 2D and 3D), and 0.67 and 0.69 cm (FBP in 2D and 3D). Axial, transverse radial, and transverse tangential resolutions at 10cm offset from the center were 0.72 and 0.68 cm (FBP in 2D and 3D), 0.63 and 0.66 cm (FBP in 2D and 3D), and 0.72 and 0.66 cm (FBP in 2D and 3D). Sensitivity values were 708.6 (2D), 2931.3 (3D) counts/sec/MBq at the center and 728.7 (2D, 3398.2 (3D) counts/sec/MBq at 10 cm offset from the center. Scatter fractions were 0.19 (2D) and 0.49 (3D). Peak true count rate and NECR were 64.0 kcps at 40.1 kBq/mL and 49.6 kcps at 40.1 kBq/mL in 2D and 53.7 kcps at 4.76 kBq/mL and 26.4 kcps at 4.47 kBq/mL in 3D. Conclusion: Information about the performance of CTI ECAT EXACT 47 PET scanner reported in this study will be useful for the quantitative analysis of data and determination of optimal image acquisition protocols using this widely used scanner for clinical and research purposes.

The Correction Factor of Sensitivity in Gamma Camera - Based on Whole Body Bone Scan Image - (감마카메라의 Sensitivity 보정 Factor에 관한 연구 - 전신 뼈 영상을 중심으로 -)

  • Jung, Eun-Mi;Jung, Woo-Young;Ryu, Jae-Kwang;Kim, Dong-Seok
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.12 no.3
    • /
    • pp.208-213
    • /
    • 2008
  • Purpose: Generally a whole body bone scan has been known as one of the most frequently executed exams in the nuclear medicine fields. Asan medical center, usually use various gamma camera systems - manufactured by PHILIPS (PRECEDENCE, BRIGHTVIEW), SIEMENS (ECAM, ECAM signature, ECAM plus, SYMBIA T2), GE (INFINIA) - to execute whole body scan. But, as we know, each camera's sensitivity is not same so it is hard to consistent diagnosis of patients. So our purpose is when we execute whole body bone scans, we exclude uncontrollable factors and try to correct controllable factors such as inherent sensitivity of gamma camera. In this study, we're going to measure each gamma camera's sensitivity and study about reasonable correction factors of whole body bone scan to follow up patient's condition using different gamma cameras. Materials and Methods: We used the $^{99m}Tc$ flood phantom, it recommend by IAEA recommendation based on general counts rate of a whole body scan and measured counts rates by the use of various gamma cameras - PRECEDENCE, BRIGHTVIEW, ECAM, ECAM signature, ECAM plus, IFINIA - in Asan medical center nuclear medicine department. For measuring sensitivity, all gamma camera equipped LEHR collimator (Low Energy High Resolution multi parallel Collimator) and the $^{99m}Tc$ gamma spectrum was adjusted around 15% window level, the photo peak was set to 140-kev and acquirded for 60 sec and 120 sec in all gamma cameras. In order to verify whether can apply calculated correction factors to whole body bone scan or not, we actually conducted the whole body bone scan to 27 patients and we compared it analyzed that results. Results: After experimenting using $^{99m}Tc$ flood phantom, sensitivity of ECAM plus was highest and other sensitivity order of all gamma camera is ECAM signature, SYMBIA T2, ECAM, BRIGHTVIEW, IFINIA, PRECEDENCE. And yield sensitivity correction factor show each gamma camera's relative sensitivity ratio by yielded based on ECAM's sensitivity. (ECAM plus 1.07, ECAM signature 1.05, SYMBIA T2 1.03, ECAM 1.00, BRIGHTVIEW 0.90, INFINIA 0.83, PRECEDENCE 0.72) When analyzing the correction factor yielded by $^{99m}Tc$ experiment and another correction factor yielded by whole body bone scan, it shows statistically insignificant value (p<0.05) in whole body bone scan diagnosis. Conclusion: In diagnosing the bone metastasis of patients undergoing cancer, whole body bone scan has been conducted as follow up tests due to its good points (high sensitivity, non invasive, easily conducted). But as a follow up study, it's hard to perform whole body bone scan continuously using same gamma camera. If we use same gamma camera to patients, we have to consider effectiveness of equipment's change by time elapsed. So we expect that applying sensitivity correction factor to patients who tested whole body bone scan regularly will add consistence in diagnosis of patients.

  • PDF

The Evaluation of Reconstruction Method Using Attenuation Correction Position Shifting in 3D PET/CT (PET/CT 3D 영상에서 감쇠보정 위치 변화 방법을 이용한 영상 재구성법의 평가)

  • Hong, Gun-Chul;Park, Sun-Myung;Jung, Eun-Kyung;Choi, Choon-Ki;Seok, Jae-Dong
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.14 no.2
    • /
    • pp.172-176
    • /
    • 2010
  • Purpose: The patients' moves occurred at PET/CT scan will cause the decline of correctness in results by resulting in inconsistency of Attenuation Correction (AC) and effecting on quantitative evaluation. This study has evaluated the utility of reconstruction method using AC position changing method when having inconsistency of AC depending on the position change of emission scan after transmission scan in obtaining PET/CT 3D image. Materials and Methods: We created 1 mL syringe injection space up to ${\pm}2$, 6, 10 cm toward x and y axis based on central point of polystyrene ($20{\times}20110$ cm) into GE Discovery STE16 equipment. After projection of syringe with $^{18}F$-FDG 5 kBq/mL, made an emission by changing the position and obtained the image by using AC depending on the position change. Reconstruction method is an iteration reconstruction method and is applied two times of iteration and 20 of subset, and for every emission data, decay correction depending on time pass is applied. Also, after setting ROI to the position of syringe, compared %Difference (%D) at each position to radioactivity concentrations (kBq/mL) and central point. Results: Radioactivity concentrations of central point of emission scan is 2.30 kBq/mL and is indicated as 1.95, 1.82 and 1.75 kBq/mL, relatively for +x axis, as 2.07, 1.75 and 1.65 kBq/mL for -x axis, as 2.07, 1.87 and 1.90 kBq/mL for +y axis and as 2.17, 1.85 and 1.67 kBq/mL for -y axis. Also, %D is yield as 15, 20, 23% for +x axis, as 9, 23, 28% for -x axis, as 12, 21, 20% for +y axis and as 8, 22, 29% for -y axis. When using AC position changing method, it is indicated as 2.00, 1.95 and 1.80 kBq/mL, relatively for +x axis, as 2.25, 2.15 and 1.90 kBq/mL for -x axis, as 2.07, 1.90 and 1.90 kBq/mL for +y axis, and as 2.10, 2.02, and 1.72 kBq/mL for -y axis. Also, %D is yield as 13, 15, 21% for +x axis, as 2, 6, 17% for -x axis, as 9, 17, 17% for +y axis, and as 8, 12, 25% for -y axis. Conclusion: When in inconsistency of AC, radioactivity concentrations for using AC position changing method increased average of 0.14, 0.03 kBq/mL at x, y axis and %D was improved 6.1, 4.2%. Also, it is indicated that the more far from the central point and the further position from the central point under the features that spatial resolution is lowered, the higher in lowering of radioactivity concentrations. However, since in actual clinic, attenuation degree increases more, it is considered that when in inconsistency, such tolerance will be increased. Therefore, at the lesion of the part where AC is not inconsistent, the tolerance of radioactivity concentrations will be reduced by applying AC position changing method.

  • PDF

DC Resistivity method to image the underground structure beneath river or lake bottom (하저 지반특성 규명을 위한 전기비저항 탐사)

  • Kim Jung-Ho;Yi Myeong-Jong;Song Yoonho;Cho Seong-Jun;Lee Seong-Kon;Son Jeongsul
    • 한국지구물리탐사학회:학술대회논문집
    • /
    • 2002.09a
    • /
    • pp.139-162
    • /
    • 2002
  • Since weak zones or geological lineaments are likely to be eroded, weak zones may develop beneath rivers, and a careful evaluation of ground condition is important to construct structures passing through a river. Dc resistivity surveys, however, have seldomly applied to the investigation of water-covered area, possibly because of difficulties in data aquisition and interpretation. The data aquisition having high quality may be the most important factor, and is more difficult than that in land survey, due to the water layer overlying the underground structure to be imaged. Through the numerical modeling and the analysis of case histories, we studied the method of resistivity survey at the water-covered area, starting from the characteristics of measured data, via data acquisition method, to the interpretation method. We unfolded our discussion according to the installed locations of electrodes, ie., floating them on the water surface, and installing at the water bottom, since the methods of data acquisition and interpretation vary depending on the electrode location. Through this study, we could confirm that the dc resistivity method can provide the fairly reasonable subsurface images. It was also shown that installing electrodes at the water bottom can give the subsurface image with much higher resolution than floating them on the water surface. Since the data acquired at the water-covered area have much lower sensitivity to the underground structure than those at the land, and can be contaminated by the higher noise, such as streaming potential, it would be very important to select the acquisition method and electrode array being able to provide the higher signal-to-noise ratio data as well as the high resolving power. The method installing electrodes at the water bottom is suitable to the detailed survey because of much higher resolving power, whereas the method floating them, especially streamer dc resistivity survey, is to the reconnaissance survey owing of very high speed of field work.

  • PDF

Development of a Method to Measure the Radiation Isocenter Size of Linear Accelerators and Quantitative Analysis of the Radiation Isocenter Size for Clinac 21EX Linear Accelerator (선형가속기 방사선 중심점의 크기 측정 방법 개발과 Clinac 21EX 선형가속기의 방사선 중심점 크기 분석)

  • Jeon, Ho-Sang;Nam, Ji-Ho;Park, Dahl;Kim, Yong-Ho;Kim, Won-Taek;Kim, Dong-Won;Ki, Yong-Kan;Kim, Dong-Hyun
    • Progress in Medical Physics
    • /
    • v.22 no.3
    • /
    • pp.131-139
    • /
    • 2011
  • A method to get a size of the radiation isocenter of linear accelerators using star-shot images was presented and a computer program was developed to automate the method. Accuracy of the method was verified. The developed program was used to measure sizes of the radiation isocenters for a Clinac 21EX (Varian, USA) using data of quality assurance (QA) performed from June 2008 to December 2010. To calculated the size of radiation isocenter, positions of two points on each central ray of the star-shot image were found and the equation of the central ray was determined using the positions of two points. Using the equations of central rays the radius of the minimum circle intersecting all the central rays, which is one half of the size of radiation isocenter, was calculated. The program measured x-intercepts and y-intercepts of the central rays within errors of 0.084 mm and sizes of radiation isocenters within 0.053 mm. All the errors were less than the spatial resolution of star-shot images 0.085 mm. The radiation isocenter sizes of Clinac 21EX were $0.33{\pm}0.27mm$, $0.71{\pm}0.36mm$, $0.50{\pm}0.16mm$ for collimator, gantry and couch respectively. During the measurement period all the measured sizes were less than 2.0 mm and within tolerance. The developed program could calculate the size of radiation isocenters and it would be helpful to routine QA.

An Embedding /Extracting Method of Audio Watermark Information for High Quality Stereo Music (고품질 스테레오 음악을 위한 오디오 워터마크 정보 삽입/추출 기술)

  • Bae, Kyungyul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.21-35
    • /
    • 2018
  • Since the introduction of MP3 players, CD recordings have gradually been vanishing, and the music consuming environment of music users is shifting to mobile devices. The introduction of smart devices has increased the utilization of music through music playback, mass storage, and search functions that are integrated into smartphones and tablets. At the time of initial MP3 player supply, the bitrate of the compressed music contents generally was 128 Kbps. However, as increasing of the demand for high quality music, sound quality of 384 Kbps appeared. Recently, music content of FLAC (Free License Audio Codec) format using lossless compression method is becoming popular. The download service of many music sites in Korea has classified by unlimited download with technical protection and limited download without technical protection. Digital Rights Management (DRM) technology is used as a technical protection measure for unlimited download, but it can only be used with authenticated devices that have DRM installed. Even if music purchased by the user, it cannot be used by other devices. On the contrary, in the case of music that is limited in quantity but not technically protected, there is no way to enforce anyone who distributes it, and in the case of high quality music such as FLAC, the loss is greater. In this paper, the author proposes an audio watermarking technology for copyright protection of high quality stereo music. Two kinds of information, "Copyright" and "Copy_free", are generated by using the turbo code. The two watermarks are composed of 9 bytes (72 bits). If turbo code is applied for error correction, the amount of information to be inserted as 222 bits increases. The 222-bit watermark was expanded to 1024 bits to be robust against additional errors and finally used as a watermark to insert into stereo music. Turbo code is a way to recover raw data if the damaged amount is less than 15% even if part of the code is damaged due to attack of watermarked content. It can be extended to 1024 bits or it can find 222 bits from some damaged contents by increasing the probability, the watermark itself has made it more resistant to attack. The proposed algorithm uses quantization in DCT so that watermark can be detected efficiently and SNR can be improved when stereo music is converted into mono. As a result, on average SNR exceeded 40dB, resulting in sound quality improvements of over 10dB over traditional quantization methods. This is a very significant result because it means relatively 10 times improvement in sound quality. In addition, the sample length required for extracting the watermark can be extracted sufficiently if the length is shorter than 1 second, and the watermark can be completely extracted from music samples of less than one second in all of the MP3 compression having a bit rate of 128 Kbps. The conventional quantization method can extract the watermark with a length of only 1/10 compared to the case where the sampling of the 10-second length largely fails to extract the watermark. In this study, since the length of the watermark embedded into music is 72 bits, it provides sufficient capacity to embed necessary information for music. It is enough bits to identify the music distributed all over the world. 272 can identify $4*10^{21}$, so it can be used as an identifier and it can be used for copyright protection of high quality music service. The proposed algorithm can be used not only for high quality audio but also for development of watermarking algorithm in multimedia such as UHD (Ultra High Definition) TV and high-resolution image. In addition, with the development of digital devices, users are demanding high quality music in the music industry, and artificial intelligence assistant is coming along with high quality music and streaming service. The results of this study can be used to protect the rights of copyright holders in these industries.