• Title/Summary/Keyword: Computer CPU


Performance Comparison of Tilera Many-core and x86-64 Multi-core Systems (Tilera 다중코어와 x86-64 멀티코어 시스템의 성능 비교)

  • Choi, HeeSeok;Lyoo, TaeMuk;Park, JiSu;Jung, Daeyong;Lim, JongBeom;Lee, Jungha;Suh, Teaweon;Yu, Heonchang
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2013.05a
    • /
    • pp.102-105
    • /
    • 2013
  • Multi-core systems have recently been evolving into many-core systems that connect larger numbers of cores to improve computer performance. However, the performance of such systems differs depending on the core architecture and the number of cores used. To analyze the influence of core architecture and core count on performance, this paper compares Tilera's many-core systems, the Tile-Gx36 and TilePro64, with Intel's x86-64 multi-core Core i5 system. To examine how performance changes as more cores are used, we measured single-core performance on each system with the SPEC CPU 2006 benchmark suite, and measured performance over varying input data sizes with an OpenMP benchmark that uses all cores of each system. In the single-core experiments, the Core i5 was about 87% faster than the Tile-Gx36 on integer workloads and about 94% faster on floating-point workloads. When all cores were used, however, the Tile-Gx36 processed integer arrays at or above a certain size about 7.6 times faster on average than the Core i5 system. Thus, although the single-core performance of Tilera's many-core systems is lower owing to their clock speed and architecture, they perform better in high-speed computation that exploits parallel processing.
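
As a rough illustration of the kind of core-scaling experiment this abstract describes, the sketch below times a simple integer workload while varying the number of worker processes and the input size. It is a minimal stand-in written with Python multiprocessing, not the paper's SPEC CPU 2006 / OpenMP setup; the workload, sizes, and worker counts are arbitrary assumptions.

```python
# Minimal core-scaling sketch (not the paper's benchmark): time an
# integer-heavy workload for several input sizes and worker counts.
import time
from multiprocessing import Pool

def integer_work(chunk):
    # Integer kernel standing in for an OpenMP loop body.
    total = 0
    for x in chunk:
        total += (x * x) % 97
    return total

def run(n_workers, chunks):
    start = time.perf_counter()
    with Pool(n_workers) as pool:
        partials = pool.map(integer_work, chunks)
    return sum(partials), time.perf_counter() - start

if __name__ == "__main__":
    for size in (100_000, 1_000_000, 4_000_000):      # input data sizes
        data = list(range(size))
        for n_workers in (1, 2, 4, 8):                # number of cores in use
            step = (size + n_workers - 1) // n_workers
            chunks = [data[i:i + step] for i in range(0, size, step)]
            _, elapsed = run(n_workers, chunks)
            print(f"size={size:>9}  workers={n_workers}  time={elapsed:.3f}s")
```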

Grid Peak Power Limiting / Compensation Power Circuit for Power Unit under Dynamic Load Profile Conditions (Dynamic Load Profile 조건의 전원 장치에 있어서 계통 Peak Power 제한/보상 전력 회로)

  • Jeong, Hee-Seong;Park, Do-Il;Lee, Yong-Hwi;Lee, Chang-Hyeon;Rho, Chung-Wook
    • The Transactions of the Korean Institute of Power Electronics
    • /
    • v.27 no.5
    • /
    • pp.376-383
    • /
    • 2022
  • The improved performance of computer parts such as graphics cards, CPUs, and main boards has led to the need for power supplies with high power output. A dynamic load profile means that power consumption changes rapidly with load operation, as in PC power supplies and air conditioners. Under dynamic load profile conditions, power consumption can be classified into maximum, normal, and standby power. Several problems arise in the maximum-power case: peak power is drawn at the system power source, and frequent peak power events can cause high-frequency problems and shorten the life of high-voltage components (especially high-voltage capacitors). For example, when many PCs are used, peak power events overload the system and cause problems such as power failures and higher electricity bills because contract power is exceeded. To solve these problems, a system peak power limiting/compensation power circuit is proposed for power supplies under dynamic load profile conditions. The proposed circuit senses the system current to determine the load's power situation. When the system current exceeds a set level, the circuit recognizes that peak power is being drawn and compensates for the load power through a converter that uses a supercapacitor as its power source. Thus, the peak power of loads with a dynamic load profile is limited and compensated for, high-frequency problems are resolved, and the life of high-voltage components is extended.
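
To make the control idea concrete, here is a minimal decision-loop sketch, not the authors' circuit: when the sensed system current exceeds a set level, the excess load power is treated as being supplied from a supercapacitor-fed converter so that the grid-side peak stays limited; below that level the supercapacitor recharges. The thresholds and the toy state-of-charge model are invented for illustration.

```python
# Toy peak-limiting logic (illustrative only; not the proposed power circuit).
CURRENT_LIMIT = 6.0      # A, peak-detection level (assumed)
CHARGE_CURRENT = 1.0     # A, recharge rate during normal/standby load (assumed)

def split_power(load_current_a, supercap_soc):
    """Return (grid_current_a, supercap_current_a, new_soc)."""
    if load_current_a > CURRENT_LIMIT and supercap_soc > 0.05:
        # Peak detected: clamp the grid current, converter supplies the rest.
        supercap_current = load_current_a - CURRENT_LIMIT
        grid_current = CURRENT_LIMIT
        supercap_soc = max(0.0, supercap_soc - 0.001 * supercap_current)
    else:
        # Normal/standby: grid carries the load and tops up the supercapacitor.
        supercap_current = -CHARGE_CURRENT if supercap_soc < 1.0 else 0.0
        grid_current = load_current_a + max(-supercap_current, 0.0)
        supercap_soc = min(1.0, supercap_soc + 0.0005)
    return grid_current, supercap_current, supercap_soc

if __name__ == "__main__":
    soc = 0.8
    for load in (0.5, 2.0, 9.0, 12.0, 1.0):           # A, a dynamic load profile
        grid, cap, soc = split_power(load, soc)
        print(f"load={load:4.1f}A  grid={grid:4.1f}A  supercap={cap:5.1f}A  soc={soc:.3f}")
```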

PDF Version 1.4-1.6 Password Cracking in CUDA GPU Environment (PDF 버전 1.4-1.6의 CUDA GPU 환경에서 암호 해독 최적 구현)

  • Hyun Jun, Kim;Si Woo, Eum;Hwa Jeong, Seo
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.12 no.2
    • /
    • pp.69-76
    • /
    • 2023
  • Hundreds of thousands of passwords are lost or forgotten every year, making the protected information unavailable to legitimate owners or authorized law enforcement personnel. Recovering such a password requires a password cracking tool. Using GPUs instead of CPUs for password cracking allows the large amount of computation required during recovery to be processed quickly. This paper presents an optimized GPU implementation using CUDA, focusing on cracking the currently most widely used PDF versions, 1.4-1.6. Techniques such as eliminating unnecessary operations in the MD5 algorithm, implementing 32-bit word integration of the RC4 algorithm, and using shared memory were applied. In addition, an autotuning technique was used to search for the numbers of blocks and threads that affect performance. As a result, we achieved throughputs of 31,460 kp/s (kilo passwords per second) and 66,351 kp/s with a block size of 65,536 and a thread size of 96 on RTX 3060 and RTX 3090 GPUs, respectively, improving throughput by 22.5% and 15.2% over hashcat, the cracking tool that otherwise achieves the highest throughput.
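
The heavy lifting in such a tool is a brute-force loop that computes a hash for every candidate password; the paper moves this loop onto the GPU with CUDA. The CPU-side sketch below only illustrates the loop structure with plain MD5 against an assumed target digest; the real PDF 1.4-1.6 key derivation also mixes in document-specific data (the /O entry, permission flags, file ID) and RC4, none of which is reproduced here.

```python
# CPU-side brute-force skeleton (illustration only; not the PDF key schedule).
import hashlib
import itertools
import string

ALPHABET = string.ascii_lowercase + string.digits
TARGET_DIGEST = hashlib.md5(b"pw42").hexdigest()   # stand-in target (assumed)

def crack(max_len=4):
    """Enumerate candidates by length and return the first MD5 match."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(ALPHABET, repeat=length):
            candidate = "".join(combo).encode()
            if hashlib.md5(candidate).hexdigest() == TARGET_DIGEST:
                return candidate.decode()
    return None

if __name__ == "__main__":
    print("recovered password:", crack())
```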

SDP DB Generator Using XML (XML을 이용한 SDP DB 생성기)

  • Yi, Chang-Hwan;Oh, Se-Man
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2001.10b
    • /
    • pp.1163-1166
    • /
    • 2001
  • Networks have recently been evolving from wired to wireless. Among the many wireless network technologies, Bluetooth is the one suited to short-range, personal use. The Bluetooth standard defines not only a simple communication method for transmitting data but also standards for several applications. A Bluetooth device does not have to implement all of these applications; it may implement only those appropriate for the device. A device therefore needs a way to find out which services another device provides. The Bluetooth technology used to discover the services of other devices is the SDP (Service Discovery Protocol) layer. The SDP layer cannot operate on the protocol alone; it works by consulting an internal database that defines the services and service attributes the Bluetooth device can provide. This internal database is implemented differently by every Bluetooth implementer, so information about Bluetooth services and service attributes is currently defined only in prose and simple tables. A method has been proposed to express Bluetooth services and service attributes in XML documents instead of text and tables. However, running SDP on a Bluetooth device directly from an XML document that describes services and service attributes means embedding an XML parser in the device, which is a heavy burden for Bluetooth devices that generally have low CPU performance and little memory. This paper therefore designs and implements a generator that produces the device-appropriate internal information from an XML document describing commonly usable Bluetooth services and service attributes.
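
A minimal sketch of the generator idea follows: on a development host, an XML description of Bluetooth services and attributes is flattened into a compact internal table, so the target device itself never needs an XML parser. The XML schema, attribute IDs, and record layout below are invented for illustration and are not the SDP record format defined by the Bluetooth specification.

```python
# Host-side generator sketch: XML service description -> flat internal table.
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<services>
  <service handle="0x10000" name="SerialPort">
    <attribute id="0x0001" type="uuid" value="0x1101"/>
    <attribute id="0x0100" type="string" value="COM Port"/>
  </service>
</services>
"""

def generate_internal_db(xml_text):
    """Flatten service records into (handle, attr_id, type, value) rows."""
    rows = []
    for service in ET.fromstring(xml_text).iter("service"):
        handle = int(service.get("handle"), 16)
        for attr in service.iter("attribute"):
            rows.append((handle, int(attr.get("id"), 16),
                         attr.get("type"), attr.get("value")))
    return rows

if __name__ == "__main__":
    for row in generate_internal_db(SAMPLE_XML):
        print(row)   # these rows would be emitted as a static table for the device
```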


Noise Removal using Fuzzy Mask Filter (퍼지 마스크 필터를 이용한 잡음 제거)

  • Lee, Sang-Jun;Yoon, Seok-Hyun;Kim, Kwang-Baek
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.11
    • /
    • pp.41-45
    • /
    • 2010
  • Image processing techniques are fundamental to human vision-based image information processing. Widely studied areas include image transformation, image enhancement, image restoration, and image compression. A common subgoal in these areas is enhancing image information for correct information retrieval. As a fundamental task for image recognition and interpretation, image enhancement includes noise filtering techniques. Conventional filtering algorithms may achieve a high noise removal rate but usually have difficulty preserving boundary information; as a result, they often require additional image processing algorithms, at the cost of more CPU time and a higher possibility of information loss. In this paper, we propose a fuzzy mask filtering algorithm that achieves a high noise removal rate with fewer of these side effects. The algorithm first determines a threshold using fuzzy logic applied to information from the mask, and then decides the output pixel value based on that threshold. In experiments with random impulse noise and salt-and-pepper noise, the proposed algorithm removed noise more effectively without information loss.
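
As a hedged sketch of the general idea (not the authors' exact fuzzy rules), the code below derives a fuzzy "noise likelihood" for each pixel from its 3x3 mask and uses it to blend the original value with the local median, so pixels that look clean, including boundary pixels, are left largely untouched while impulse-like pixels are replaced.

```python
# Illustrative fuzzy-mask filtering sketch (the membership function is assumed).
import numpy as np

def fuzzy_mask_filter(img):
    img = img.astype(float)
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            window = padded[y:y + 3, x:x + 3]
            med = np.median(window)
            spread = window.max() - window.min() + 1e-6
            # Membership in "noisy": 0 = looks clean, 1 = looks like an impulse.
            membership = min(1.0, abs(img[y, x] - med) / spread)
            out[y, x] = (1.0 - membership) * img[y, x] + membership * med
    return out.astype(np.uint8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = np.full((64, 64), 120, dtype=np.uint8)
    idx = rng.integers(0, 64, size=(200, 2))
    noisy[idx[:, 0], idx[:, 1]] = rng.choice([0, 255], size=200)  # salt & pepper
    print("std before:", noisy.std(), "std after:", fuzzy_mask_filter(noisy).std())
```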

Fast GPU Implementation for the Solution of Tridiagonal Matrix Systems (삼중대각행렬 시스템 풀이의 빠른 GPU 구현)

  • Kim, Yong-Hee;Lee, Sung-Kee
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.11_12
    • /
    • pp.692-704
    • /
    • 2005
  • With the improvement of computer hardware, GPUs (Graphics Processing Units) offer tremendous memory bandwidth and computational power, which has led to their use in general-purpose computation. In particular, GPU implementations of compute-intensive, physics-based simulations are actively studied. When solving the differential equations that underlie physics simulations, tridiagonal matrix systems arise repeatedly from finite-difference approximation, so fast solution of tridiagonal systems is an important research topic for physics-based simulation. We propose a fast GPU implementation for the solution of tridiagonal matrix systems. In this paper, we implement the cyclic reduction (also known as odd-even reduction) algorithm, a popular choice for vector processors. We obtained a considerable performance improvement for solving tridiagonal matrix systems over the Thomas method and the conjugate gradient method; the Thomas method is well known for solving tridiagonal systems on the CPU, and the conjugate gradient method has shown good results on the GPU. We evaluated the proposed method by applying it to heat conduction, advection-diffusion, and shallow water simulations, which ran at over 35 frames per second on a 1024x1024 grid.
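
For reference, the sketch below implements the Thomas algorithm, the CPU baseline named in the abstract for tridiagonal systems; the paper's GPU cyclic-reduction solver is not reproduced here.

```python
# Thomas algorithm (tridiagonal solver), the CPU baseline mentioned above.
import numpy as np

def thomas(a, b, c, d):
    """Solve Ax = d for tridiagonal A with sub-diagonal a, diagonal b, super-diagonal c."""
    n = len(b)
    cp = np.empty(n)
    dp = np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

if __name__ == "__main__":
    n = 6
    a = np.full(n, -1.0); a[0] = 0.0      # sub-diagonal (a[0] unused)
    b = np.full(n, 2.0)                   # main diagonal
    c = np.full(n, -1.0); c[-1] = 0.0     # super-diagonal (c[-1] unused)
    d = np.ones(n)
    x = thomas(a, b, c, d)
    A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
    print("residual ok:", np.allclose(A @ x, d))
```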

A Study on Determination of the Number of Work Processes Reflecting Characteristics of Program on Computational Grid (계산 그리드 상에서 프로그램의 특성을 반영한 작업 프로세스 수의 결정에 관한 연구)

  • Cho, Soo-Hyun;Kim, Young-Hak
    • Journal of the Korea Society of Computer and Information
    • /
    • v.11 no.1 s.39
    • /
    • pp.71-85
    • /
    • 2006
  • A computational grid environment consists of LANs/WANs with different performance and heterogeneous network conditions, on which various programs run. In this environment, the role of the resource selection broker is very important, because the work assigned to each node must account for the heterogeneous network environment and the computing power of each node according to the characteristics of the program. In this paper, a new resource selection broker is presented that decides the number of work processes to be allocated to each node by considering network state information and node performance according to the characteristics of the program. The proposed resource selection broker works in three steps. First, the performance ratio of each node is computed from combined latency, bandwidth, and CPU information reflecting the characteristics of the program, and the number of work processes to be run on each node is decided by this ratio. Second, an RSL file is generated automatically based on the number of work processes decided in the previous step. Finally, each node creates work processes from that RSL file and performs the work allocated to it. Experimental results show that, compared with the existing (uniform) method and the latency-bandwidth method, the proposed method, which reflects program characteristics, improves by 278-316%, 524-595%, and 924-954% in terms of the amount of work, the number of work processes, and the number of nodes, respectively.
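
The allocation step can be pictured with the small sketch below: per-node latency, bandwidth, and CPU measurements are combined into a score whose weights are chosen to reflect the program's characteristics, and the work processes are split in proportion to the scores. The node data, weights, and scoring formula are illustrative assumptions, not the broker's actual model.

```python
# Illustrative work-process allocation from latency/bandwidth/CPU information.
NODES = {
    # node: (latency_ms, bandwidth_mbps, cpu_ghz) -- example measurements
    "node-a": (2.0, 940.0, 3.0),
    "node-b": (15.0, 100.0, 2.4),
    "node-c": (40.0, 10.0, 1.6),
}

def node_scores(nodes, w_lat, w_bw, w_cpu):
    """Higher score = better node; lower latency helps, so use its reciprocal."""
    return {name: w_lat / lat + w_bw * bw + w_cpu * cpu
            for name, (lat, bw, cpu) in nodes.items()}

def allocate_processes(total_procs, scores):
    """Split the work processes in proportion to each node's score (>= 1 each)."""
    total = sum(scores.values())
    return {name: max(1, round(total_procs * s / total))
            for name, s in scores.items()}

if __name__ == "__main__":
    # A communication-heavy program would raise w_lat / w_bw; a compute-heavy
    # program would raise w_cpu. These weights are arbitrary.
    scores = node_scores(NODES, w_lat=10.0, w_bw=0.01, w_cpu=1.0)
    for node, count in allocate_processes(16, scores).items():
        print(f"{node}: {count} work processes")  # counts that would go into the RSL file
```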


Evaluating Reverse Logistics Networks with Centralized Centers : Hybrid Genetic Algorithm Approach (집중형센터를 가진 역물류네트워크 평가 : 혼합형 유전알고리즘 접근법)

  • Yun, YoungSu
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.55-79
    • /
    • 2013
  • In this paper, we propose a hybrid genetic algorithm (HGA) approach to effectively solve the reverse logistics network with centralized centers (RLNCC). In the proposed HGA approach, a genetic algorithm (GA) is used as the main algorithm. For implementing the GA, a new bit-string representation scheme using 0 and 1 values is suggested, which makes it easy to construct the initial GA population. As genetic operators, the elitist strategy in enlarged sampling space developed by Gen and Chang (1997), a new two-point crossover operator, and a new random mutation operator are used for selection, crossover, and mutation, respectively. For the hybrid concept of the GA, an iterative hill climbing method (IHCM) developed by Michalewicz (1994) is inserted into the HGA search loop. The IHCM is a local search technique that precisely explores the space to which the GA search has converged. The RLNCC is composed of collection centers, remanufacturing centers, redistribution centers, and secondary markets in reverse logistics networks. Of these centers and secondary markets, only one collection center, one remanufacturing center, one redistribution center, and one secondary market should be opened in the reverse logistics network. Some assumptions are made for effectively implementing the RLNCC. The RLNCC is represented by a mixed integer programming (MIP) model using indexes, parameters, and decision variables. The objective of the MIP model is to minimize the total cost, which consists of transportation cost, fixed cost, and handling cost. The transportation cost arises from transporting the returned products between the centers and secondary markets. The fixed cost is determined by the opening or closing decision at each center and secondary market; for example, if there are three collection centers (with opening costs of 10.5, 12.1, and 8.9 for collection centers 1, 2, and 3, respectively), and collection center 1 is opened while the others are closed, then the fixed cost is 10.5. The handling cost is the cost of treating the products returned from customers at each opened center and secondary market at each RLNCC stage. The RLNCC is solved by the proposed HGA approach. In the numerical experiments, the proposed HGA and a conventional competing approach are compared using various measures of performance. The conventional competing approach is the GA approach of Yun (2013), which does not have any local search technique such as the IHCM used in the proposed HGA approach. CPU time, the optimal solution, and the optimal setting are used as measures of performance. Two types of the RLNCC with different numbers of customers, collection centers, remanufacturing centers, redistribution centers, and secondary markets are presented for comparing the HGA and GA approaches. The MIP models for the two types of RLNCC are programmed in Visual Basic Version 6.0, and the computing environment is an IBM-compatible PC with a 3.06 GHz CPU and 1 GB of RAM running Windows XP. The parameters used in the HGA and GA approaches are a total of 10,000 generations, a population size of 20, a crossover rate of 0.5, a mutation rate of 0.1, and a search range of 2.0 for the IHCM. A total of 20 runs are made to eliminate the randomness of the HGA and GA searches.
With performance comparisons, network representations by opening/closing decisions, and convergence processes for the two types of RLNCC, the experimental results show that the HGA achieves a significantly better optimal solution than the GA, although the GA is slightly quicker than the HGA in terms of CPU time. Finally, the proposed HGA approach proves more efficient than the conventional GA approach on both types of the RLNCC, since the former combines the GA search process with an additional local search process, while the latter has the GA search process alone. In future work, much larger RLNCCs will be tested to assess the robustness of our approach.
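
To make the hybrid structure concrete, the toy sketch below runs a bit-string GA with a two-point crossover, random mutation, and a simple elitist selection, then refines the best individual with a bit-flip hill-climbing pass, on a made-up open/close cost function. It is not the paper's MIP model, operators, or data; the facility costs and the penalty term are invented.

```python
# Toy hybrid GA: bit-string GA + hill-climbing refinement on an invented cost.
import random

random.seed(1)
STAGES = 4          # collection, remanufacturing, redistribution, secondary market
CANDIDATES = 3      # candidate facilities per stage
FIXED = [[10.5, 12.1, 8.9], [7.0, 6.5, 9.3], [5.2, 4.8, 6.1], [3.0, 2.5, 3.8]]

def cost(bits):
    """Fixed cost of opened facilities plus a penalty unless exactly one is open per stage."""
    total = 0.0
    for s in range(STAGES):
        stage = bits[s * CANDIDATES:(s + 1) * CANDIDATES]
        total += sum(f for f, b in zip(FIXED[s], stage) if b)
        total += 50.0 * abs(sum(stage) - 1)
    return total

def two_point_crossover(p1, p2):
    i, j = sorted(random.sample(range(len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:]

def mutate(bits, rate=0.1):
    return [1 - b if random.random() < rate else b for b in bits]

def hill_climb(bits):
    """Local search: accept any single-bit flip that lowers the cost."""
    best, improved = bits[:], True
    while improved:
        improved = False
        for k in range(len(best)):
            cand = best[:]
            cand[k] = 1 - cand[k]
            if cost(cand) < cost(best):
                best, improved = cand, True
    return best

def hga(pop_size=20, generations=200):
    pop = [[random.randint(0, 1) for _ in range(STAGES * CANDIDATES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        elite = pop[: pop_size // 2]                  # simple elitist selection
        children = [mutate(two_point_crossover(random.choice(elite),
                                               random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return hill_climb(min(pop, key=cost))             # hybrid step: local refinement

if __name__ == "__main__":
    best = hga()
    print("best cost:", cost(best), "open/close bits:", best)
```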

A Study on Fast Iris Detection for Iris Recognition in Mobile Phone (휴대폰에서의 홍채인식을 위한 고속 홍채검출에 관한 연구)

  • Park Hyun-Ae;Park Kang-Ryoung
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.2 s.308
    • /
    • pp.19-29
    • /
    • 2006
  • As the security of personal information becomes more important in mobile phones, iris recognition technology is starting to be applied to these devices. Conventional iris recognition requires magnified iris images, which has meant using a large camera with zoom and focus lenses, but because mobile phones must be small and inexpensive, such lenses are difficult to adopt. However, with rapid development and multimedia convergence trends in mobile phones, more and more companies have built mega-pixel cameras into their handsets. These cameras make it possible to capture a magnified iris image without zoom and focus lenses: although the facial image is captured at a distance from the user, the captured iris region still contains sufficient pixel information for iris recognition. In this case, however, the eye region must first be detected in the facial image for accurate iris recognition. We therefore propose a new fast iris detection method based on corneal specular reflection that is suitable for mobile phones. To detect the specular reflection robustly, we present the theoretical background for estimating its size and brightness from eye, camera, and illuminator models. In addition, we use a successive on/off scheme of the illuminator to detect optical/motion blurring and sunlight effects in the input image. Experimental results show that the total processing time for detecting the iris region averages 65 ms on a Samsung SCH-S2300 mobile phone (with a 150 MHz ARM9 CPU). The rate of correct iris detection is 99% for indoor images and 98.5% for outdoor images.
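
The core cue can be sketched as follows: subtract an illuminator-off frame from an illuminator-on frame, keep only very bright residual pixels of a plausible spot size, and take their centroid as the candidate eye position. The thresholds, spot-size bounds, and synthetic frames below are arbitrary stand-ins, not the paper's calibrated eye/camera/illuminator model.

```python
# Illustrative specular-reflection (glint) detection from on/off illuminator frames.
import numpy as np

def find_specular_reflection(frame_on, frame_off,
                             brightness_thresh=150, min_px=4, max_px=400):
    diff = frame_on.astype(int) - frame_off.astype(int)
    mask = diff > brightness_thresh           # only the corneal glint should survive
    count = int(mask.sum())
    if not (min_px <= count <= max_px):       # too small/large: blur or sunlight case
        return None
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())  # candidate eye position (x, y)

if __name__ == "__main__":
    off = np.full((120, 160), 40, dtype=np.uint8)
    on = off.copy()
    on[60:64, 80:84] = 255                    # synthetic glint
    print(find_specular_reflection(on, off))  # roughly (81.5, 61.5)
```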

Development of a real-time surface image velocimeter using an android smartphone (스마트폰을 이용한 실시간 표면영상유속계 개발)

  • Yu, Kwonkyu;Hwang, Jeong-Geun
    • Journal of Korea Water Resources Association
    • /
    • v.49 no.6
    • /
    • pp.469-480
    • /
    • 2016
  • The present study aims to develop a real-time surface image velocimeter (SIV) using an Android smartphone that can measure river surface velocity with its built-in sensors and processors. The SIV system first determines the location of the site using the phone's GPS, and measures the pitch and roll of the device with its orientation sensors to establish the coordinate transform from real-world coordinates to image coordinates; the only parameter to be entered is the height of the phone above the water surface. After this setup, the phone's camera takes a series of images. With the help of OpenCV, an open-source computer vision library, we split the video into frames and analyze them to obtain the water surface velocity field. The image processing algorithm, similar to the traditional STIV (spatio-temporal image velocimetry), is based on a correlation analysis of spatio-temporal images. The SIV system can measure an instantaneous velocity field (a 1-second-averaged velocity field) once every 11 seconds, and averaging these instantaneous measurements over a sufficient amount of time yields a mean velocity field. A series of tests performed in an experimental flume showed that the developed measurement system is highly effective and convenient. Compared with measurements from a traditional propeller velocimeter, the system showed a maximum error of 13.9% and an average error of less than 10%.
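
The correlation step such a velocimeter relies on can be sketched in a few lines: a 1-D intensity strip taken along the flow direction in one frame is shifted against the next frame, the lag with the highest correlation is taken as the surface displacement, and the pixel-to-meter scale and frame interval convert it to a velocity. This is a simplified stand-in for the spatio-temporal correlation analysis described above; the real system also needs the camera-to-world transform from the phone's GPS and orientation sensors.

```python
# 1-D correlation sketch for surface displacement between two frames.
import numpy as np

def strip_velocity(strip_t0, strip_t1, dt_s, m_per_px, max_lag=20):
    a = (strip_t0 - strip_t0.mean()) / (strip_t0.std() + 1e-9)
    b = (strip_t1 - strip_t1.mean()) / (strip_t1.std() + 1e-9)
    best_lag, best_corr = 0, -np.inf
    for lag in range(0, max_lag + 1):          # assume flow in one known direction
        n = len(a) - lag
        corr = float(np.dot(a[:n], b[lag:lag + n]) / n)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag * m_per_px / dt_s          # surface velocity in m/s

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    base = rng.normal(size=300)                # synthetic surface texture
    frame0 = base[7:257]
    frame1 = base[:250]                        # texture advected 7 px downstream
    print(strip_velocity(frame0, frame1, dt_s=0.1, m_per_px=0.01))  # ~0.7 m/s
```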