• Title/Summary/Keyword: instructions

Study of an In-order SMT Architecture and Grouping Schemes

  • Moon, Byung-In;Kim, Moon-Gyung;Hong, In-Pyo;Kim, Ki-Chang;Lee, Yong-Surk
    • International Journal of Control, Automation, and Systems / v.1 no.3 / pp.339-350 / 2003
  • In this paper, we propose a simultaneous multithreading (SMT) architecture that improves instruction throughput by exploiting instruction level parallelism (ILP) and thread level parallelism (TLP). The proposed architecture issues and completes instructions belonging to the same thread in exact program order. The issue and completion policy greatly reduces the design complexity and hardware cost of our architecture, compared with others that employ out-of-order issue and completion. On the other hand, when the instructions belong to different threads, the issue and completion orders for those instructions may not necessarily be identical to the fetch order. The processor issues instructions simultaneously from multiple threads to functional units by exploiting ILP and TLP, and by dynamic resource sharing. That parallel execution notably improves performance and resource utilization with minimal additional hardware cost over the conventional superscalar processors. This paper proposes an SMT architecture with grouping as well as one without grouping. Without grouping, all threads dynamically and flexibly share most resources. On the other hand, in the SMT architecture with grouping, in which resources and threads are divided into several groups for design simplification, resources are shared only among threads belonging to the same group as those resources. Simulation results show that our processors with four and eight threads improve performance by three or more times over the conventional superscalar processor with comparable execution resources and policies, and that reasonable grouping reduces the design complexity of SMT processors with little negative effect on performance.

Instruction-corruption-less Binary Modification Mechanism for Static Stack Protections (이진 조작을 통한 정적 스택 보호 시 발생하는 명령어 밀림현상 방지 기법)

  • Lee, Young-Rim;Kim, Young-Pil;Yoo, Hyuck
    • Journal of KIISE:Computing Practices and Letters / v.14 no.1 / pp.71-75 / 2008
  • Many sensor operating systems run under tight memory constraints; therefore, the stack memory areas of all threads reside in a single memory space. Because most target platforms do not have a hardware MMU (Memory Management Unit), it is difficult to protect each stack area. One way to solve this problem is to replace the original stack-handling instructions in the binary code with branches to wrapper routines that protect the stack area. During this replacement, an instruction corruption problem occurs because stack-handling instructions and branch instructions differ in length. In this paper, we propose an algorithm that calls a target routine without the instruction corruption problem. The algorithm reaches the target routine by chaining short-range branch instructions. Our solution makes it easy to apply security patches and to maintain and upgrade sensor node software.
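
The idea of reaching a distant routine by chaining short-range branches can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function name `plan_branch_chain`, the byte-addressed model, and the symmetric reach `max_range` are all hypothetical, and the sketch ignores where free space for the intermediate branches would actually be found in a real binary.

```python
def plan_branch_chain(source, target, max_range):
    """Return the addresses visited when hopping from source to target.

    Each hop models one short-range branch that can move at most
    max_range bytes forward or backward (a hypothetical, simplified model).
    """
    if max_range <= 0:
        raise ValueError("max_range must be positive")
    hops = [source]
    current = source
    while current != target:
        distance = target - current
        # Clamp the step to the reach of a single short branch.
        step = max(-max_range, min(max_range, distance))
        current += step
        hops.append(current)
    return hops


# Example with made-up addresses: three short branches cover the distance.
chain = plan_branch_chain(source=0x0100, target=0x0500, max_range=0x0180)
print([hex(address) for address in chain])
```

Every address in the returned list except the first and the last stands for a place where an intermediate branch would have to be located.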

SIMD Instruction-based Fast HEVC RExt Decoder (SIMD 명령어 기반 HEVC RExt 복호화기 고속화)

  • Mok, Jung-Soo;Ahn, Yong-Jo;Ryu, Hochan;Sim, Donggyu
    • Journal of Broadcast Engineering / v.20 no.2 / pp.224-237 / 2015
  • In this paper, we introduce a fast decoding method that uses SIMD (Single Instruction Multiple Data) instructions for HEVC RExt (High Efficiency Video Coding Range Extensions). Several tools of HEVC RExt, such as the intra prediction, interpolation, inverse-quantization, inverse-transform, and clipping modules, are well suited to SIMD instructions. To account for the increased bit depth of RExt, these modules are accelerated with SSE (Streaming SIMD Extensions) instructions. In addition, we propose effective implementations of the interpolation filter, inverse-quantization, and clipping modules using AVX2 (Advanced Vector Extensions 2) instructions, which operate on 256-bit registers. The proposed methods were evaluated on a private HEVC RExt decoder developed on the basis of HM 16.0. The experimental results show that the developed RExt decoder reduces average decoding time by 12% compared with the conventional sequential method.

Hazard Communication of Dental Materials for Dental Hygienists in Daegu or Gyeongsangbuk-do Province Area (대구경북 치과위생사들의 치과재료에 대한 유해정보 소통 실태)

  • Kim, Haekyoung;Choi, Sangjun
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.25 no.4 / pp.506-515 / 2015
  • Objectives: This study was conducted to evaluate the status of hazard communication regarding dental materials among dental hygienists in the Daegu Metropolitan City and the North Gyeongsang-do Province area. Materials: A total of 310 dental hygienists were surveyed using self-administered questionnaires to investigate the status of hazard communication on dental materials and their information needs. We collected instructions for use and material safety data sheets (MSDSs) for 67 dental materials frequently used at dental hospitals located in the Daegu Metropolitan City and the North Gyeongsang-do Province area. Results: The questionnaire surveys showed that only 11% of the 310 dental hygienists had knowledge of MSDSs and that 46.8% of respondents never read the instructions for use before using materials. Just 7.4% of dental hygienists had undergone training on hazard information for dental materials. In particular, dental hygienists working at dental clinics had significantly lower response rates on knowledge of MSDSs (p<0.001), reading of instructions for use (p=0.042), and training on the hazard information of dental materials (p=0.004) than those in dental hospitals or general hospitals. The essential information most desired by dental hygienists was hazard identification (82.3%), followed by first-aid measures (53.9%), handling and storage (51%), disposal considerations (49%), and toxicological information (47.1%). All dental materials were foreign products, which came from Japan (59.7%), the USA (26.9%), and Liechtenstein (13.7%). In terms of usage, 56.7% of dental materials were prosthetic, followed by conservation (31.3%), orthodontics (9%), and prevention (3%). We found that dental hygienists had accessed MSDSs for only five of the 67 dental materials. The instructions for use of the 67 dental materials provided hazard identification (64.2%), first-aid measures (83.6%), handling and storage (97%), disposal considerations (20.9%), and toxicological information (26.9%). Conclusions: Based on the results of this study, the hazard communication system for dental hygienists working at dental clinics should be improved.

Branch Misprediction Recovery Mechanism That Exploits Control Independence on Program (프로그램 상의 제어 독립성을 이용한 분기 예상 실패 복구 메커니즘)

  • Yoon, Sung-Lyong;Lee, Won-Mo;Cho, Yeong-Il
    • Journal of KIISE:Computer Systems and Theory / v.29 no.7 / pp.401-410 / 2002
  • Control independence has been put forward as a significant new source of instruction-level parallelism for superscalar processors. With conventional branch prediction, all instructions after a mispredicted branch have to be squashed, and the instructions on the correct path then have to be re-fetched and re-executed. This paper presents a new branch misprediction recovery mechanism that reduces the number of instructions squashed on a misprediction. Control-independent instructions are detected with a static method based on profiling and a dynamic method based on the control flow of program sequences. We show that the suggested branch misprediction recovery mechanism improves performance by 2~7% on a 4-issue processor, by 4~15% on an 8-issue processor, and by 8~28% on a 16-issue processor.

A Systematic Process for Generating Applications in Product Line Engineering (제품계열공학에서 어플리케이션 생성을 위한 체계적인 프로세스)

  • Chang, Chee-Won;Chang, Soo-Ho;Kim, Soo-Dong
    • Journal of KIISE:Software and Applications / v.32 no.8 / pp.717-729 / 2005
  • Product Line Engineering (PLE) consists of two phases: core asset development and application engineering. Core asset development models the common features of the members of a domain and develops them. Application engineering effectively generates an application by instantiating the core assets. Today, PLE research mainly focuses on developing core assets, whereas the activities and instructions for application engineering are weakly defined. Moreover, the existing instructions for application engineering are not detailed enough to be applied in practice. To apply PLE widely in industry, research on systematic and practical methods, such as instantiation processes, instructions, and artifacts, is needed. In this paper, we propose a practical PLE process, along with instructions and artifacts for each activity. We also present a case study to show the applicability and practicality of the proposed process.

Performance Analysis of Caching Instructions on SVLIW Processor and VLIW Processor (SVLIW 프로세서와 VLIW 프로세서의 명령어 캐싱에 따른 성능 분석)

  • Ji, Sung-Hyun;Park, No-Kwang;Kim, Suk-Il
    • Journal of IKEEE / v.1 no.1 s.1 / pp.101-110 / 1997
  • SVLIW processor architectures can resolve resource collisions and data dependencies between instructions while scheduling VLIW instructions at run time. As a result, long NOP instruction words can be removed from the object code produced for the processor. Thus, cache misses on the SVLIW processor would be fewer than on a VLIW processor with the same cache size. Less frequent cache misses on the SVLIW processor would incur less frequent memory accesses, and thus the total execution cycles needed to complete an application would be shorter than on the VLIW processor. This feature eventually offsets the effect of the SVLIW processor having more instruction pipeline stages than the VLIW processor. In this paper, we formulate and compare execution cycle models of the two architectures. Simulation results show that as the memory access cycles incurred on a cache miss grow longer, the total execution cycles of the SVLIW processor become shorter than those of the VLIW processor.

An Efficient Bit Stream Instruction-set for Network Packet Processing Applications (네트워크 패킷 처리를 위한 효율적인 비트 스트림 명령어 세트)

  • Yoon, Yeo-Phil;Lee, Yong-Surk;Lee, Jung-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.10 / pp.53-58 / 2008
  • This paper proposes a new set of instructions to improve the packet processing capacity of a network processor. The proposed instructions achieve more efficient packet processing by accelerating the integration of packet headers. Furthermore, a hardware configuration dedicated to processing overlay instructions was designed to reduce the additional hardware cost. For this purpose, the basic architecture of the network processor was designed using LISA, and the overlay block was optimized based on a barrel shifter. The block was synthesized to compare area and operation delay, and was mapped to a C-level macro function using the compiler-known function (CKF) mechanism. The improvement in performance was confirmed by comparing the execution cycles and the execution time of an application program. Experiments were conducted using the Processor Designer and Compiler Designer tools from CoWare. The result of synthesis with the TSMC (0.25 μm) from Synopsys indicated a 20.7% reduction in operation delay and a 30.8% improvement in performance over the entire execution cycle with the proposed set of instructions.

The Change in Pre-service Chemistry Teachers' Pedagogical Content Knowledge through Mentoring (멘토링을 통한 예비화학교사들의 Pedagogical Content Knowledge 변화)

  • Lee, Song-Yeon;Min, Hee-Jung;Won, Jeong-Ae;Paik, Seoung-Hey
    • Journal of The Korean Association For Science Education / v.31 no.4 / pp.621-640 / 2011
  • The purposes of this study were to analyze the PCK of pre-service chemistry teachers and to identify changes in their PCK before and after the educational practice. For this study, four pre-service teachers majoring in chemistry education were selected as proteges, and one professor from a chemistry education department participated as a mentor. To analyze the pre-service teachers' PCK, data were collected from the proteges' instruction, the mentoring process, and semi-structured interviews. According to the results, most elements of PCK were lacking in the proteges' instruction before the mentoring, because they did not have the practical knowledge necessary for instruction. They also did not know how to apply their knowledge to instruction. However, most of the pre-service teachers developed their PCK through the mentoring. This study shows that pre-service teachers' PCK can be developed effectively through well-designed mentoring programs before and after the educational practice in college education for pre-service teachers.

A Study on Instructions for Access Points Representing Works and Expressions in RDA (RDA의 저작과 표현형의 접근점 규정에 관한 연구)

  • Doh, Tae-Hyeon
    • Journal of Korean Library and Information Science Society / v.43 no.3 / pp.27-48 / 2012
  • This study analyzed the guidelines and instructions for access points representing works and expressions in RDA. The preferred title for the work is used as the basis for constructing an authorized access point to represent a work or expression. If applicable, the authorized access point is constructed by combining the preferred title for the work with the authorized access point for the identity with principal responsibility for the work. The variant titles for the work are used as the basis for constructing variant access points to represent a work or expression. If the authorized access point is constructed by combining the preferred title for the work with the authorized access point for the identity responsible for the work, the variant access points are constructed by combining the variant titles for the work with that authorized access point, and by using the preferred title for the work alone. In addition, RDA provides instructions for constructing controlled access points for special works such as musical works, laws, and religious works, but the general principles for these works are the same as in the instructions above. The authorized access points for works and expressions in RDA are almost the same as the main entry headings in AACR2.
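
The construction rules summarized in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the ". " separator between the responsible identity and the title is an assumed display convention, and the sample record is illustrative data rather than something taken from the study.

```python
def authorized_access_point(preferred_title, creator=None):
    """Combine the creator's authorized access point with the preferred title.

    If no identity has principal responsibility, the preferred title alone
    serves as the access point.
    """
    if creator:
        return f"{creator}. {preferred_title}"  # separator is an assumption
    return preferred_title


def variant_access_points(preferred_title, variant_titles, creator=None):
    """Build variant access points from the variant titles.

    When a creator is part of the authorized access point, each variant
    title is combined with the creator, and the preferred title on its own
    is added as a further variant.
    """
    if not creator:
        return list(variant_titles)
    points = [f"{creator}. {title}" for title in variant_titles]
    points.append(preferred_title)
    return points


# Illustrative record (not from the study).
creator = "Hemingway, Ernest, 1899-1961"
print(authorized_access_point("Sun also rises", creator))
print(variant_access_points("Sun also rises", ["Fiesta"], creator))
```

Keeping the two functions separate mirrors the abstract's distinction between the single authorized access point and the set of variant access points derived from it.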