• Title/Summary/Keyword: Computation Execution


Parallel Distributed Implementation of GHT on Ethernet Multicluster (이더넷 다중 클러스터에서 GHT의 병렬 분산 구현)

  • Kim, Yeong-Soo;Kim, Myung-Ho;Choi, Heung-Moon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.46 no.3
    • /
    • pp.96-106
    • /
    • 2009
  • Extending the scale of distributed processing in a single Ethernet cluster is physically restricted by the maximum number of ports per switch. This paper presents an implementation of an MPI-based multicluster consisting of multiple Ethernet switches for extending the scale of distributed processing, and an asymptotic analysis of the communication overhead through an execution-time analysis model. To determine an optimum task partitioning, we analyzed the processing time for various partitioning schemes, and the AAP (accumulator array partitioning) scheme was finally chosen to minimize the overall communication overhead. The scope of the data partitioned in AAP was modified to fit the increased number of nodes, and a suitable load balancing algorithm was implemented. We tried to alleviate the communication overhead by exploiting pipelined broadcast and flat-tree based result gathering, and by overlapping communication and computation time. We used linear pipeline broadcast to reduce the communication overhead between clusters, which are interconnected by a single link. Experimental results show nearly linear speedup for the proposed parallel distributed GHT implemented on an MPI-based Ethernet multicluster with four 100 Mbps Ethernet switches and up to 128 Pentium PC nodes.
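
The benefit of a linear pipeline broadcast over a single link can be illustrated with a textbook latency-bandwidth cost model. This is a minimal sketch and not the execution-time analysis model of the paper: the function names, the per-message latency `alpha_s`, and the segment count `k` are assumptions introduced here for illustration.

```python
def flat_broadcast_time(p, m_bytes, alpha_s, bandwidth_bytes_per_s):
    """Root sends the whole message to each of the other p - 1 nodes in turn."""
    return (p - 1) * (alpha_s + m_bytes / bandwidth_bytes_per_s)


def pipeline_broadcast_time(p, m_bytes, alpha_s, bandwidth_bytes_per_s, k):
    """Linear chain of p nodes; the message is split into k equal segments.

    The first segment needs p - 1 hops to reach the last node, and the
    remaining k - 1 segments arrive one step apart behind it.
    """
    step = alpha_s + (m_bytes / k) / bandwidth_bytes_per_s
    return (p - 1 + k - 1) * step


if __name__ == "__main__":
    # Assumed values: 100 Mbps link (12.5e6 bytes/s), 100 microsecond latency, 1 MB message.
    bandwidth = 12.5e6
    for nodes in (4, 32, 128):
        flat = flat_broadcast_time(nodes, 1e6, 1e-4, bandwidth)
        piped = pipeline_broadcast_time(nodes, 1e6, 1e-4, bandwidth, k=64)
        print(nodes, round(flat, 4), round(piped, 4))
```

In this model the pipelined cost grows with `p + k` steps of a small segment, while the flat cost grows with `p` transfers of the full message, which is why segmenting pays off as the node count increases.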

MAC-Layer Error Control for Real-Time Broadcasting of MPEG-4 Scalable Video over 3G Networks (3G 네트워크에서 MPEG-4 스케일러블 비디오의 실시간 방송을 위한 실행시간 예측 기반 MAC계층 오류제어)

  • Kang, Kyungtae;Noh, Dong Kun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.3
    • /
    • pp.63-71
    • /
    • 2014
  • We analyze the execution time of Reed-Solomon coding, which is the MAC-layer forward error correction scheme used in CDMA2000 1xEV-DO broadcast services, under different air channel conditions. The results show that the time constraints of MPEG-4 cannot be guaranteed by Reed-Solomon decoding when the packet loss rate (PLR) is high, due to its long computation time on current hardware. To alleviate this problem, we propose three error control schemes. First, our static scheme bypasses Reed-Solomon decoding at the mobile node to satisfy the MPEG-4 time constraint when the PLR exceeds a given boundary. Second, our dynamic scheme corrects errors in a best-effort manner within the time constraint, instead of giving up altogether when the PLR is high; this achieves a further quality improvement. Third, our video-aware dynamic scheme fixes errors in a similar way to the dynamic scheme, but in a priority-driven manner, which makes the video appear smoother. Extensive simulation results show the effectiveness of our schemes compared to the original FEC scheme.
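
The three schemes can be contrasted with a small scheduling sketch. It is an illustration only: the `Block` type, the per-block `decode_cost_ms`, the `priority` field, and the time budget are hypothetical stand-ins, and no actual Reed-Solomon decoding is performed.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    decode_cost_ms: float  # assumed estimate of the decoding time for this block
    priority: int          # hypothetical: lower value means more important video data


def static_scheme(blocks, plr, plr_boundary):
    """Bypass decoding entirely once the packet loss rate exceeds the boundary."""
    if plr > plr_boundary:
        return []
    return [b.block_id for b in blocks]


def dynamic_scheme(blocks, budget_ms):
    """Decode blocks in the given order, best effort, until the budget is used up."""
    decoded, spent = [], 0.0
    for b in blocks:
        if spent + b.decode_cost_ms > budget_ms:
            break
        spent += b.decode_cost_ms
        decoded.append(b.block_id)
    return decoded


def video_aware_scheme(blocks, budget_ms):
    """Same budgeted decoding, but the most important blocks are handled first."""
    return dynamic_scheme(sorted(blocks, key=lambda b: b.priority), budget_ms)
```

With three blocks of equal cost and a budget that fits only two, the dynamic scheme decodes the first two to arrive, while the video-aware variant decodes the two with the highest priority.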

A 3D Magnetic Inversion Software Based on Algebraic Reconstruction Technique and Assemblage of the 2D Forward Modeling and Inversion (대수적 재구성법과 2차원 수치모델링 및 역산 집합에 기반한 3차원 자력역산 소프트웨어)

  • Ko, Kwang-Beom;Jung, Sang-Won;Han, Kyeong-Soo
    • Geophysics and Geophysical Exploration
    • /
    • v.16 no.1
    • /
    • pp.27-35
    • /
    • 2013
  • In this study, we developed a trial product for 3D magnetic inversion, tentatively named 'KMag3D'. We also briefly introduce its functions and graphical user interface, on which development was especially focused, in the form of a user manual. KMag3D consists of two fundamental frameworks for 3D magnetic inversion. First, the algebraic reconstruction technique was selected as the 3D inversion algorithm instead of the least squares method conventionally used in magnetic inversion. By comparison, the algebraic reconstruction algorithm turned out to be more effective and economical than least squares in terms of both computation time and memory. Second, for the effective determination of the 3D initial and a priori information models required to execute our algorithm, we proposed a practical technique based on the assemblage of 2D forward modeling and inversion results for individual user-selected 2D profiles. The initial and a priori information models were then constructed by appropriate interpolation along the strike direction. From this, we concluded that our technique is both suitable and very practical for 3D magnetic inversion problems.
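
The memory advantage of the algebraic reconstruction technique comes from updating the model one equation (row) at a time, so no normal-equation matrix has to be formed. The following is a minimal sketch of the generic row-action (Kaczmarz) update on a small dense system, not the KMag3D implementation; `art_solve`, `relax`, and `x0` are names chosen here, with `x0` playing the role of an initial model.

```python
def art_solve(A, b, sweeps=200, relax=1.0, x0=None):
    """Row-action ART (Kaczmarz) iteration for A x = b.

    A is a list of rows, b a list of data values. Each update projects the
    current model onto the hyperplane defined by a single row of A.
    """
    x = list(x0) if x0 is not None else [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # an all-zero row carries no information
            residual = b_i - sum(a * x_j for a, x_j in zip(row, x))
            step = relax * residual / norm_sq
            x = [x_j + step * a for a, x_j in zip(row, x)]
    return x


if __name__ == "__main__":
    # Small consistent system used only as a demonstration.
    print(art_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

Only the current model vector and one row of the system are needed at each step, which is the property that makes the approach economical for large 3D grids.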

Modeling Environment for Distributed Simulation with Hierarchical Animation (계층적 애니메이션이 가능한 분산 시뮬레이션 모델링 환경)

  • Yi, Mi-Ra;Kim, Hyung-Jong
    • Journal of the Korea Society for Simulation
    • /
    • v.17 no.1
    • /
    • pp.33-42
    • /
    • 2008
  • In general, simulation is used to predict or evaluate systems that are hard to execute in the real world, so the target systems to be modeled are usually large and complex. Observing the dynamics of such systems requires animation of a similar level of complexity: the model and its animation are as complex as the system itself. Displaying all the graphic objects that represent the dynamics of the models being simulated, however, distracts the user's focus, which makes the problems under study harder to solve. The redundant graphic objects also increase the computational overhead. To solve this problem, a hierarchical animation environment was proposed in earlier research a few years ago. In that research, users can focus better on the dynamics of system components by selectively choosing the hierarchical level, and the components within a level, of the hierarchically structured model. However, that research lacks a modeling methodology with which modelers can systematically describe the animation part corresponding to the simulation dynamics in a model. This research defines the modeling methodology of DESHA and defines DESHA-C++, improving on the previous research output, as an execution environment for DESHA models. In addition, to use the hierarchical animation environment for various problems, this research proposes and develops a distributed simulation modeling environment that connects the DESHA environment and HLA.


Fast Planar Shape Deformation using a Layered Mesh (계층 메쉬를 이용한 빠른 평면 형상 변형)

  • Yoo, Kwang-Seok;Choi, Jung-Ju
    • Journal of the Korea Computer Graphics Society
    • /
    • v.17 no.3
    • /
    • pp.43-50
    • /
    • 2011
  • We present a trade-off technique for fast but qualitative planar shape deformation using a layered mesh. We construct a layered mesh that embeds a planar input shape; the upper layer is denoted as a control mesh and the lower layer as a shape mesh that is defined by mean value coordinates relative to the control mesh. First, we try to preserve some shape properties, including user constraints, for the control mesh by means of an existing nonlinear least-squares optimization technique, which produces the deformed positions of the control mesh vertices. Then, we compute the deformed positions of the shape mesh vertices indirectly from the deformed control mesh by means of a simple coordinate computation. The control mesh consists of a small number of vertices, while the shape layer contains a relatively large number of vertices in order to embed the input shape as tightly as possible. Since the time-consuming optimization technique is applied only to the control mesh, the overall execution is extremely fast; however, the quality of the deformation is sacrificed due to the reduced quality of the control mesh and its relation to the shape mesh. In order to change the deformation behavior and thereby compensate for the quality loss, we present a method to control the deformation stiffness by incorporating the orientation into the user constraints. According to our experiments, the proposed technique produces a planar shape deformation fast enough for real-time applications on limited embedded systems such as cell phones and tablet PCs.
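
The "simple coordinate computation" for the shape mesh can be sketched with mean value coordinates of a point relative to a single control polygon. This is a minimal illustration under assumptions: the paper binds shape vertices to a control mesh, whereas the sketch uses one closed polygon, and the function names are chosen here. The point is assumed to lie strictly inside the polygon, away from its vertices and edges.

```python
import math


def mean_value_coordinates(p, polygon):
    """Mean value coordinates of point p with respect to a closed 2D polygon.

    The weight of vertex i is (tan(a_prev / 2) + tan(a_i / 2)) / |v_i - p|,
    where a_i is the signed angle at p between vertex i and vertex i + 1.
    """
    n = len(polygon)
    d = [(vx - p[0], vy - p[1]) for vx, vy in polygon]
    r = [math.hypot(dx, dy) for dx, dy in d]
    half_tan = []
    for i in range(n):
        j = (i + 1) % n
        cross = d[i][0] * d[j][1] - d[i][1] * d[j][0]
        dot = d[i][0] * d[j][0] + d[i][1] * d[j][1]
        half_tan.append(math.tan(math.atan2(cross, dot) / 2.0))
    # half_tan[i - 1] wraps around to the last edge when i == 0.
    w = [(half_tan[i - 1] + half_tan[i]) / r[i] for i in range(n)]
    total = sum(w)
    return [w_i / total for w_i in w]


def apply_coordinates(coords, polygon):
    """Position of the bound point for the current (possibly deformed) polygon."""
    x = sum(c * v[0] for c, v in zip(coords, polygon))
    y = sum(c * v[1] for c, v in zip(coords, polygon))
    return (x, y)
```

The coordinates are computed once against the rest pose; after the optimizer moves the control vertices, each shape vertex is recovered by a weighted sum, which is why the per-frame cost of the lower layer stays small.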

A Digital Twin Software Development Framework based on Computing Load Estimation DNN Model (컴퓨팅 부하 예측 DNN 모델 기반 디지털 트윈 소프트웨어 개발 프레임워크)

  • Kim, Dongyeon;Yun, Seongjin;Kim, Won-Tae
    • Journal of Broadcast Engineering
    • /
    • v.26 no.4
    • /
    • pp.368-376
    • /
    • 2021
  • Artificial intelligence clouds help to efficiently develop autonomous things that integrate artificial intelligence and control technologies by sharing learned models and providing execution environments. Existing development technologies for autonomous things consider only the accuracy of the artificial intelligence models, at the cost of increased model complexity, including more hidden layers and kernels, and they consequently require a large amount of computation. Since resource-constrained computing environments cannot provide sufficient computing resources for such complex models, the autonomous things violate their time-criticality requirements. In this paper, we propose a digital twin software development framework that selects artificial intelligence models optimized for the computing environment. The proposed framework uses a load estimation DNN model to select the optimal model for a specific computing environment by predicting the load of the artificial intelligence models from digital twin data, and then develops the control software. In experiments based on representative CNN models, the proposed load estimation DNN model shows an error rate of up to 20% compared to the formula-based load estimation scheme.
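
The selection step can be sketched as "pick the most accurate model whose predicted load still meets the deadline". Everything below is a hypothetical illustration: the candidate entries and their numbers are placeholders rather than measured results, and `estimate_latency_ms` is a simple formula-based stand-in for the load estimation DNN, which the abstract does not specify.

```python
def estimate_latency_ms(flops, device_flops_per_s):
    """Formula-based stand-in for a load estimator: work divided by throughput."""
    return 1000.0 * flops / device_flops_per_s


def select_model(candidates, device_flops_per_s, deadline_ms,
                 estimator=estimate_latency_ms):
    """Return the most accurate candidate predicted to meet the deadline, or None."""
    feasible = [
        c for c in candidates
        if estimator(c["flops"], device_flops_per_s) <= deadline_ms
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c["accuracy"])


# Placeholder candidates; names and figures are illustrative only.
CANDIDATES = [
    {"name": "small", "flops": 1e9, "accuracy": 0.70},
    {"name": "medium", "flops": 4e9, "accuracy": 0.76},
    {"name": "large", "flops": 16e9, "accuracy": 0.80},
]
```

Passing a different `estimator` callable is where a learned load predictor would plug in, leaving the selection logic unchanged.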

Object Detection Performance Analysis between On-GPU and On-Board Analysis for Military Domain Images

  • Du-Hwan Hur;Dae-Hyeon Park;Deok-Woong Kim;Jae-Yong Baek;Jun-Hyeong Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.8
    • /
    • pp.157-164
    • /
    • 2024
  • In this paper, we discuss the feasibility of deploying a deep learning-based detector on a resource-limited board. Although many studies evaluate detectors on machines with high-performance GPUs, evaluation on boards with limited computation resources is still insufficient. Therefore, in this work, we implement deep learning detectors and deploy them on a compact board by parsing and optimizing each detector. To assess the performance of deep learning-based detectors on limited resources, we monitor the performance of several detectors with different hardware resources. On the COCO detection dataset, we compare and analyze the evaluation results of the detection model on-board and the detection model on-GPU in terms of several metrics: mAP, power consumption, and execution speed (FPS). To demonstrate the effect of applying our detector to the military domain, we evaluate the detectors on our own dataset of thermal images that reflects flight battle scenarios. As a result, we examine the strengths of the deep learning-based on-board detector and show that deep learning-based vision models can contribute to flight battle scenarios.

Real-time Color Recognition Based on Graphic Hardware Acceleration (그래픽 하드웨어 가속을 이용한 실시간 색상 인식)

  • Kim, Ku-Jin;Yoon, Ji-Young;Choi, Yoo-Joo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.1
    • /
    • pp.1-12
    • /
    • 2008
  • In this paper, we present a real-time algorithm for recognizing the vehicle color from indoor and outdoor vehicle images based on GPU (Graphics Processing Unit) acceleration. In the preprocessing step, we construct feature vectors from sample vehicle images with different colors. Then, we combine the feature vectors for each color and store them as a reference texture to be used in the GPU. Given an input vehicle image, the CPU constructs its feature vector, and then the GPU compares it with the sample feature vectors in the reference texture. The similarities between the input feature vector and the sample feature vectors for each color are measured, and then the result is transferred to the CPU to recognize the vehicle color. The output colors are categorized into seven colors: three achromatic colors (black, silver, and white) and four chromatic colors (red, yellow, blue, and green). We construct feature vectors by using histograms that consist of hue-saturation pairs and hue-intensity pairs. A weight factor is given to the saturation values. Our algorithm achieves a successful color recognition rate of 94.67% by using a large number of sample images captured in various environments, by generating feature vectors that distinguish different colors, and by utilizing an appropriate likelihood function. We also accelerate color recognition by utilizing the parallel computation functionality of the GPU. In the experiments, we constructed a reference texture from 7,168 sample images, where 1,024 images were used for each color. The average time for generating a feature vector is 0.509 ms for a 150×113 resolution image. After the feature vector is constructed, the execution time for GPU-based color recognition is 2.316 ms on average, which is 5.47 times faster than when the algorithm is executed on the CPU.
Our experiments were limited to vehicle images only, but our algorithm can be extended to input images of general objects.
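
The feature construction and comparison can be sketched on the CPU in plain Python. This is a rough illustration under assumptions: the bin counts are arbitrary, the V channel of HSV stands in for "intensity", the saturation weight factor is omitted because the abstract does not define it, and histogram intersection replaces the unspecified likelihood function. All function names are chosen here.

```python
import colorsys


def feature_vector(pixels, hue_bins=8, sat_bins=4, val_bins=4):
    """Concatenated, normalized hue-saturation and hue-intensity histograms.

    pixels is an iterable of (r, g, b) tuples with components in 0..255.
    """
    pixels = list(pixels)
    hs = [0.0] * (hue_bins * sat_bins)
    hv = [0.0] * (hue_bins * val_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * hue_bins), hue_bins - 1)
        si = min(int(s * sat_bins), sat_bins - 1)
        vi = min(int(v * val_bins), val_bins - 1)
        hs[hi * sat_bins + si] += 1.0
        hv[hi * val_bins + vi] += 1.0
    n = float(len(pixels)) or 1.0
    return [x / n for x in hs] + [x / n for x in hv]


def similarity(f1, f2):
    """Histogram intersection, used here as a stand-in similarity measure."""
    return sum(min(a, b) for a, b in zip(f1, f2))


def recognize(pixels, references):
    """Return the reference color name whose feature vector is most similar."""
    f = feature_vector(pixels)
    return max(references, key=lambda name: similarity(f, references[name]))
```

In the described system the comparison against all reference vectors is the part offloaded to the GPU; the loop inside `recognize` is the sequential equivalent of that step.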

Implementation and Evaluation of the Electron Arc Plan on a Commercial Treatment Planning System with a Pencil Beam Algorithm (Pencil Beam 알고리즘 기반의 상용 치료계획 시스템을 이용한 전자선 회전 치료 계획의 구현 및 정확도 평가)

  • Kang, Sei-Kwon;Park, So-Ah;Hwang, Tae-Jin;Cheong, Kwang-Ho;Lee, Me-Yeon;Kim, Kyoung-Ju;Oh, Do-Hoon;Bae, Hoon-Sik
    • Progress in Medical Physics
    • /
    • v.21 no.3
    • /
    • pp.304-310
    • /
    • 2010
  • The infrequent use of electron arc treatment can in large part be attributed to the lack of an adequate planning system. Although most linear accelerators provide an electron arc mode, no commercial planning systems for electron arc plans are available at this time. In this work, with the expectation that an easily accessible planning system could promote electron arc therapy, a commercial planning system was commissioned and evaluated for the electron arc plan. For the electron arc plan using a Varian 21-EX, Pinnacle3 (ver. 7.4f), with an electron pencil beam algorithm, was commissioned; the arc consisted of multiple static fields with a fixed beam opening. Film dosimetry and point measurements were performed to evaluate the computation. Beam modeling was not satisfactory for the calculation of lateral profiles. In contrast to the good agreement, within 1%, between the calculated and measured depth profiles, the calculated lateral profiles were underestimated compared with measurements, such that the distance-to-agreement (DTA) was 5.1 mm at the 50% dose level for 6 MeV and 6.7 mm for 12 MeV, with similar results for the measured depths. Point and film measurements for the humanoid phantom revealed that the delivered dose exceeded the calculated dose by approximately 10%. The electron arc plan, based on the pencil beam algorithm, provides qualitative information on the dose distribution. Dose verification before treatment should be mandatory.

A Study on the Intelligent Quick Response System for Fast Fashion(IQRS-FF) (패스트 패션을 위한 지능형 신속대응시스템(IQRS-FF)에 관한 연구)

  • Park, Hyun-Sung;Park, Kwang-Ho
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.163-179
    • /
    • 2010
  • Recently, the concept of fast fashion has been drawing attention as customer needs diversify and supply lead times get shorter in the fashion industry. As competition has intensified, how quickly and efficiently customer needs are satisfied is emphasized as one of the critical success factors in the fashion industry. Because fast fashion is inherently susceptible to trends, it is very important for fashion retailers to make quick decisions regarding the items to launch, the quantity based on demand prediction, and the time to respond. The planning decisions must also be executed through the business processes of procurement, production, and logistics in real time. In order to adapt to this trend, the fashion industry urgently needs support from an intelligent quick response (QR) system. However, the traditional functions of QR systems have not been able to completely satisfy these demands of the fast fashion industry. This paper proposes an intelligent quick response system for fast fashion (IQRS-FF). Models are presented for the QR process, QR principles and execution, and QR quantity and timing computation. The IQRS-FF models support decision makers by providing useful information through automated, rule-based algorithms. If the predefined conditions of a rule are satisfied, the actions defined in the rule are automatically taken or reported to the decision makers. In IQRS-FF, QR decisions are made in two stages: pre-season and in-season. In pre-season, master demand prediction is first performed based on macro-level analysis of factors such as the local and global economy, fashion trends, and competitors. The prediction feeds into master production and procurement planning. After checking the availability and delivery of materials for production, decision makers must make reservations or request procurements. For outsourced materials, they must check the availability and capacity of partners.
With the master plans, the performance of QR during the in-season is greatly enhanced, and the decision to select QR items is made with full consideration of the availability of materials in the warehouse as well as partners' capacity. During in-season, the decision makers must find the right time for QR as actual sales occur in stores. They then decide which items to QR based not only on qualitative criteria, such as opinions from salespersons, but also on quantitative criteria such as sales volume, the recent sales trend, inventory level, the remaining period, the forecast for the remaining period, and competitors' performance. To calculate the QR quantity in IQRS-FF, two calculation methods are designed: QR Index based calculation and attribute similarity based calculation using demographic clusters. In the early period of a new season, the attribute similarity based calculation is preferable because there are not enough historical sales data. The QR quantity can be computed by analyzing the sales trends of categories or items that have similar attributes. On the other hand, when there is enough information to analyze sales trends or make forecasts, the QR Index based calculation method can be used. Having defined the models for QR decision making, we design KPIs (Key Performance Indicators) to test the reliability of the models in critical decision making: the difference in sales volume between QR items and non-QR items, the accuracy rate of QR, and the lead time spent on QR decision-making. To verify the effectiveness and practicality of the proposed models, a case study was performed for a representative fashion company that recently developed and launched IQRS-FF. The case study shows that the average sales rate of QR items increased by 15%, the difference in sales rate between QR items and non-QR items increased by 10%, the QR accuracy was 70%, and the lead time for QR decreased dramatically from 120 hours to 8 hours.
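
The rule-based mechanism, in which an action fires when a rule's predefined conditions hold, can be sketched as follows. The rules, thresholds, and field names (`weekly_sales`, `inventory`, `remaining_weeks`) are hypothetical examples built from the quantitative criteria listed above; they are not the rules of IQRS-FF, and the QR Index itself is not modeled because the abstract does not define it.

```python
def make_rules(min_weekly_sales, max_inventory, min_remaining_weeks):
    """Build a small hypothetical rule set for in-season QR decisions."""
    return [
        {
            "name": "propose_qr",
            "condition": lambda item: (
                item["weekly_sales"] >= min_weekly_sales
                and item["inventory"] <= max_inventory
                and item["remaining_weeks"] >= min_remaining_weeks
            ),
            "action": "notify decision maker: QR candidate",
        },
        {
            "name": "skip_late_season",
            "condition": lambda item: item["remaining_weeks"] < min_remaining_weeks,
            "action": "no QR: season nearly over",
        },
    ]


def evaluate(item, rules):
    """Return the actions of every rule whose condition holds for the item."""
    return [rule["action"] for rule in rules if rule["condition"](item)]
```

Keeping conditions and actions as data makes it straightforward to add or retune rules per season without touching the evaluation loop.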