A new model and testing verification for evaluating the carbon efficiency of server

  • Liang Guo (Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology) ;
  • Yue Wang (Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology) ;
  • Yixing Zhang (Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology) ;
  • Caihong Zhou (Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology) ;
  • Kexin Xu (Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology) ;
  • Shaopeng Wang (Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology)
  • Received : 2023.03.08
  • Reviewed : 2023.09.21
  • Published : 2023.10.31

Abstract

To cope with the risks of climate change and promote the realization of carbon peaking and carbon neutrality, this paper first comprehensively considers the policy background, technical trends and carbon reduction paths of energy conservation and emission reduction in the data center server industry. Second, we propose the concept of computing power carbon efficiency for data center servers and construct the carbon emission per performance of server (CEPS) model. Based on the model, mainstream data center servers are selected for testing. The results show that total carbon emissions rise as server performance improves. However, performance improves faster than carbon emissions grow, so the relative carbon emission per unit of computing power shows a continuously decreasing trend. Moreover, there are differences between products, and the carbon emission per unit performance is calculated to be 20-60 kg over a server service life of five years.

1. Background

The Intergovernmental Panel on Climate Change (IPCC) of the United Nations calculates that, to achieve the 2 ℃ temperature control target of the Paris Agreement, the world must reach net-zero carbon dioxide emissions, also known as carbon neutrality, by 2050. That is, annual carbon dioxide emissions must equal the amount offset through tree planting and other means. By 2067, the world must achieve net-zero greenhouse gas emissions, also known as greenhouse gas neutrality or climate neutrality. That is, in addition to carbon dioxide, the emissions of methane and other greenhouse gases must balance the offset amount. At present, more than 120 countries and regions have proposed carbon neutrality goals. Most of them, such as the European Union (EU), the United Kingdom (UK), Canada, Japan, New Zealand and South Africa, plan to achieve carbon neutrality by 2050.

Under the carbon peaking and carbon neutrality strategy, data centers are accelerating their low-carbon progress. Tackling global warming is the core of the response to climate change, and controlling greenhouse gas emissions is its most important element. Data center enterprises worldwide therefore continue to explore intelligent energy conservation, renewable energy development and resource recycling in order to reduce the carbon emissions of data centers. International internet companies such as Microsoft, Google, Meta and Amazon, Chinese telecom operators such as China Telecom and China Mobile, and domestic data center operators such as Alibaba, Baidu, GDS and Chindata Group all attach great importance to carbon emission-related work [1]. Furthermore, green computing has become increasingly important and is a widely popular field in the data center and information technology (IT) industry. The Environmental Protection Agency (EPA) established energy efficiency specifications for computers in 1992 [2-3].

2. Green and low-carbon technology analysis of data center computing power

2.1 Server green and low-carbon technology

As the most basic computing equipment in IT infrastructure, the server has to undertake more and more computing. As the computing power of servers improves, their energy consumption is also increasing exponentially. Performance improvements increase the processing capacity of the server, but the growth in energy consumption puts considerable pressure on the maintenance and operating costs of the data center. From the perspective of data center technology, the industry has carried out extensive research on how to resolve the contradiction between computationally intensive workloads and low-energy operation, and on how to reduce carbon emissions as much as possible while still meeting technical requirements. With the increasing use of energy, an enterprise data center can account for 50% of a company's energy bill and carbon footprint [4].

2.1.1 Server cabinet technology

The cabinet server is a server solution inspired by modular design. Its system architecture consists of six subsystems: cabinet, network, power supply, server node, centralized cooling and centralized management. It is a significant reform in data center server design. The whole-cabinet server pools the power supply units and heat dissipation units to save space, and the deployment density can usually be doubled [5]. With centralized power supply and heat dissipation, the whole-cabinet server needs only 10% of the power supplies of a traditional cabinet server to meet its power supply needs. Power efficiency can be increased by more than 10%, and the energy consumption of a single server can be reduced by 5%.

2.1.2 Liquid cooling technology

Liquid cooling server technology uses special or specially treated liquid to directly or indirectly cool chips and IT equipment as a whole, and includes cold plate cooling, immersion cooling and other approaches. Liquid-cooled servers provide precise fixed-point cooling for the Central Processing Unit (CPU) and precise control of cooling distribution, which enables high-density deployment at unprecedented levels (such as 20 kW-100 kW high-density data centers). Liquid cooling is therefore one of the development directions of data center energy-saving technology, and it has many advantages. On the one hand, air cooling limits the growth in power density of a single chassis, while liquid cooling can greatly increase the deployment density of a single cabinet and save room space. On the other hand, a liquid cooling configuration can greatly improve heat dissipation efficiency and reduce the data center power usage effectiveness (PUE) [6].

2.1.3 High density technology

Advances in server CPUs enable high performance with lower consumption per unit; however, overall server power consumption continues to increase [7-8]. Consequently, more floor space is required as more servers are installed in the data center. Deploying high-density servers, which is the key to building a high-density data center, significantly increases the computing power per unit area of the data center and reduces its operating costs. On the one hand, it reduces the weight and space occupied by the chassis and improves the computing power per unit area. On the other hand, it improves the efficiency of the power supply and heat dissipation systems. At present, IBM, Cisco, Huawei, Inspur, SUGON and other well-known hardware manufacturers have accelerated the product design and market layout of high-density servers.

2.2 CPU green and low-carbon technology

The CPU is the core component of the server. On the one hand, 70% of the server's energy consumption comes from the CPU, which consumes a large amount of energy itself; on the other hand, the energy consumption of the CPU affects other auxiliary equipment, which is known as the energy cascade concept [9]. One study found that reducing the CPU energy consumption of the server by 1 W reduces the total energy consumption of the CPU and its corresponding auxiliary equipment by 2.84 W [10]. As shown in Fig. 1, a reduction in CPU energy consumption significantly reduces the overall energy consumption of auxiliary systems such as the power supply and distribution system and the cooling system. We can therefore start from the CPU and analyze CPU green energy-saving technology in depth to achieve energy conservation.

Fig. 1. CPU energy consumption path.

2.2.1 CPU product design for energy conservation and emission reduction

Achieving low power consumption requires a comprehensive approach covering process technology, design, chip architecture and software. At present, the industry's leading developers of low-power processors and system-level chips not only gain advantages by optimizing architecture and materials, but also use collaborative design of packaging, power, radio frequency (RF) circuits and software to reduce power consumption. In the CPU product design stage, collaborative design has become a trend. The first step of the design is to confirm the product targets in terms of performance and power consumption, and to select the appropriate process technology. Electronic Design Automation (EDA) tools may become the key to achieving the power consumption target, but the power consumption estimates of traditional EDA tools are accurate only near the end of the design cycle. In the future, accurate power consumption estimation must be performed early in the design cycle.

Using new materials with higher mobility in chip design can also reduce power consumption, for example by adding magnetic materials to standard semiconductor product lines or by using new materials such as carbon nanotubes and graphene. Enpirion integrates different metal alloys to enable magnetic materials to operate at high frequency while maintaining high energy efficiency. Semiconductor Research has funded a joint research project between IBM and Columbia University to integrate inductors into processors. This research makes it possible to adjust the power supply voltage within nanoseconds through on-chip voltage regulation to match the workload, thus reducing energy consumption. A Georgia Tech laboratory has shown that the interconnection performance of graphene exceeds that of copper. IBM's research shows that carbon nanotubes or graphene materials can be used to produce low-power, ultra-high-speed transistors.

2.2.2 CPU production process for energy conservation and emission reduction

The CPU production process includes silicon raw material purification, wafer cutting, photolithography, etching, repetition and layering, packaging, and testing. Silicon purification melts the raw silicon, places it in a quartz furnace and adds seed crystals to form single-crystal silicon. Wafer cutting shapes the silicon ingots into cylinders and then slices them into wafers. The thinner the wafer is cut, the more CPU products can be made from the same amount of silicon. Photolithography irradiates the silicon substrate with ultraviolet light through a template printed with the complex circuit pattern of the CPU. The photoresist in the area exposed to ultraviolet light is dissolved, and the area that should not be exposed is masked. Etching uses ultraviolet light of very short wavelength with a large lens. The short-wavelength light shines on the photoresist film through the openings of the quartz mask to expose it. The exposed silicon is then bombarded by atoms, which partially dopes the exposed silicon substrate and changes the conductive state of the region to produce N wells or P wells. Repetition and layering process a new circuit layer: silicon oxide is grown again, a layer of polysilicon is deposited, photoresist is coated, and photolithography and etching are repeated to obtain a groove structure containing polysilicon and silicon oxide. This is repeated many times to form a 3D structure. Packaging encloses the CPU, which is still in wafer form, in a ceramic or plastic enclosure for installation on the circuit board. Finally, the CPU is tested many times to check whether there are any errors and in which step they occurred. Energy-saving technologies in production focus mainly on the following two areas.

2.2.2.1 Advanced technology processing

The manufacturing process refers to a specific semiconductor manufacturing process and its design rules. Different processes mean different circuit characteristics. Generally, the smaller the process node, the smaller the transistor, the faster its speed and the better its energy consumption. The commonly quoted figures, such as 7 nm and 3 nm, refer to the gate length that controls the electron flow inside the transistor. A transistor is composed of a source, a drain and a gate. Current flowing from the source to the drain through the gate means that the transistor is on, and the gate is the switch that controls whether it is on or off. Advanced manufacturing processes make this switch as narrow as possible. When the gate is narrow, the required control voltage is also lower, which reduces the power consumption and heating of the transistor switch as a whole. At present, Intel still uses the 14 nm process, while AMD uses the more advanced 12 nm process of GlobalFoundries and the 7 nm process of TSMC. TSMC's 7 nm technology means that AMD can build cheaper, faster and denser chips and integrate more cores while still keeping power consumption within a specific range.

2.2.2.2 Advanced packaging technology

With the continuous increase in the integration and complexity of integrated circuits, the design and manufacturing costs of the traditional system on chip (SoC) have risen sharply, which makes it difficult to sustain Moore's Law. More and more attention is being paid to chiplets, an advanced packaging technology that can improve chip operating speed while reducing production costs.

Chiplet technology is a new chip design method that emerged after SoC integration had developed to a certain extent. It divides the SoC into smaller chips, interconnects these modular small chips (bare dies), and uses new packaging technology to package small chips with different functions and processes together into a heterogeneous integrated chip. Chip manufacturers have already carried out research on chiplet technology. In 2019, Intel introduced a 3D CPU platform based on a chip interconnection method called Foveros, which brought a 3D stacked design to the CPU. It can stack chips on top of one another and integrate chips with different processes, structures and uses [12].

Fig. 2. Intel 2.5D and 3D packaging using Intel bridge and Foveros technology.

AMD's first-generation EPYC has one to four homogeneous processor chips. Product models are differentiated by increasing or decreasing the number of CPUs and the cache capacity. In 2019, AMD began to use chiplets to produce the second-generation EPYC (Rome) with a heterogeneous design, separating the CPU and cache from all I/O functions, including the memory controllers. Eight CPUs are connected to the central I/O chip through Infinity Fabric technology [13]. CPU and L3 SRAM account for 86% of the second-generation EPYC compute complex chips, which use 7 nm technology, while the analog circuits and I/O units are separated out. It is an important representative of high-performance desktop and server processors. At present, AMD uses a 3D stacking technology called "3D V-Cache", which is based on micro bump 3D technology combined with through-silicon vias (TSV) and applies the concept of hybrid bonding, so that the distance between micro bumps is only 9 microns.

TSMC integrated its front-end chip stacking technology and back-end packaging technology into a new system-level integration project and registered it as 3DFabric. On the front end, TSMC provides both chip on wafer (CoW) and wafer on wafer (WoW) stacking, that is, system on integrated chips (SoIC). On the back end, TSMC's LSI (local silicon interconnect) is similar to Intel's EMIB (embedded multi-die interconnect bridge) [14].

Recently, Intel, AMD, ARM, Qualcomm, TSMC, Samsung, ASE, Google, Meta, Microsoft and other industry giants established a chiplet standards alliance and formulated Universal Chiplet Interconnect Express (hereinafter referred to as "UCIe"), a high-speed interconnection standard for general-purpose chiplets, to jointly create a chiplet interconnection standard and promote an open ecosystem.

2.2.3 CPU green low carbon operation

The computer system manages hardware resources through the operating system to allocate and scale system resources. At present, software optimization of computer system energy efficiency is implemented mainly at the operating system level. The most common technologies are dynamic power management (DPM) and dynamic voltage frequency scaling (DVFS) [15]. DPM addresses the shortcoming that a traditional computer power management system has only two states, suspend and resume, so that all hardware devices can only be turned off or on at the same time. With DPM, peripherals are managed dynamically according to the state of each device (idle or running), for example by turning off the display and other devices when only the CPU is running, to save energy. DVFS adjusts the processor frequency and system voltage according to the computing power required by the running tasks, avoiding the waste of computing resources and energy caused by a one-size-fits-all frequency setting. The energy efficiency optimization ideas of DPM and DVFS are the same, since both rely on on-demand scaling and allocation of hardware resources, but they differ in implementation details.

Table 1. Similarities and differences between Dynamic Power Management (DPM) and Dynamic Voltage Frequency Scaling (DVFS)

2.2.3.1 Power management

In general, the lower the voltage and clock speed provided by the power supply, the lower the power consumption, but performance also suffers. Therefore, the latest microcontrollers use intelligent power management units to automatically adjust the working voltage and clock speed to match the workload. The basic idea of power management is to independently adjust the supply voltage and clock speed of different parts of the chip so as to match the workload at any specific point in time, and to turn off unused circuits. The power management unit is usually built as a state machine module, which can selectively reduce the voltage and clock speed of non-critical functions.

2.2.3.2 Dynamic energy-saving technology

CPU dynamic energy-saving technology is one of the mainstream technologies for reducing server power consumption [16]. On the one hand, selecting power management strategies with different system idle states reduces server power consumption to different degrees; on the other hand, a lower-power strategy means that the CPU wakes up more slowly, which has a greater impact on performance. For applications with strict latency and performance requirements, the internal master clock of the CPU can be stopped through software while the Bus Interface Unit and the Advanced Programmable Interrupt Controller (APIC) still run at full speed.

3. Research on carbon efficiency per server model

3.1 Energy consumption of server

Research data from the Open Data Center Committee (ODCC) show that in 2020 the total energy consumption of China's data centers was 93.9 billion kWh and their carbon emissions were 64.64 million tons. It is estimated that by 2030 the total energy consumption of China's data centers will reach about 380 billion kWh, and carbon emissions will grow by more than 300% [17]. Servers are an important part of data center energy consumption, so their carbon emissions cannot be underestimated. According to CAICT, in 2021 there were 19 million servers in China's data centers, consuming 110 billion kWh. The annual carbon emission of each server was about 2600 kg.

Carbon emissions are strongly related to server power consumption. To analyze the relationship between carbon emissions and computing power, we can therefore first relate computing power to power consumption. The research team surveyed the mainstream models in the 2017-2021 data center server market, calculated the server thermal design power (TDP) and computing power level weighted by market share, and derived the TDP per unit of computing power.

In this paper, the average floating-point operations per second (FLOPS) of a single server is used to evaluate the general computational power output by the CPU in the data center, and the unit is TFLOPS (FP32).

\(\begin{aligned}CPU_{\text{theoreticalFP}} = \text{CPU core number} \times \text{CPU clock speed} \times \text{FLOPs per cycle}\end{aligned}\)       (1)

\(\begin{aligned}\text {Unit computing power TDP}=\frac{\text { server } T D P}{C P U_{\text {theoreticalFP }}}\end{aligned}\)       (2)

where CPUtheoreticalFP is calculated from the server CPU related parameters, and the server TDP comes from the official website of mainstream server chip manufacturers.
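Equations (1) and (2) can be illustrated with a minimal Python sketch. The core count, clock speed, FLOPs-per-cycle value and TDP below are hypothetical inputs chosen for illustration, not figures from the paper, and the function names are assumptions introduced here.

```python
def cpu_theoretical_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Eq. (1): core count * clock speed * FLOPs per cycle, returned in TFLOPS."""
    gflops = cores * clock_ghz * flops_per_cycle  # GHz * FLOPs/cycle gives GFLOPS
    return gflops / 1000.0                        # GFLOPS -> TFLOPS


def unit_computing_power_tdp(server_tdp_w: float, tflops: float) -> float:
    """Eq. (2): server TDP divided by theoretical FP performance (W per TFLOPS)."""
    if tflops <= 0:
        raise ValueError("theoretical performance must be positive")
    return server_tdp_w / tflops


# Hypothetical server: 32 cores at 2.0 GHz, 32 FP32 FLOPs per cycle, 205 W TDP.
tflops = cpu_theoretical_tflops(cores=32, clock_ghz=2.0, flops_per_cycle=32)
unit_tdp = unit_computing_power_tdp(server_tdp_w=205.0, tflops=tflops)
print(f"{tflops:.3f} TFLOPS, {unit_tdp:.1f} W per TFLOPS")
```

With these inputs the sketch yields 2.048 TFLOPS and roughly 100 W per TFLOPS; a lower value of the second figure corresponds to the declining trend discussed for Fig. 3.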

Analyzing the TDP per unit of computing power in different years shows two things. On the one hand, as server performance improves, the TDP per unit of computing power gradually declines. On the other hand, upgrades in server manufacturing and chip technology have reduced server energy consumption, but their effect is limited, and the rate of decline of the TDP per unit of computing power has gradually slowed. As shown in Fig. 3, the TDP per unit of computing power continues to decline, but the downward trend has slowed significantly since 2019.

Fig. 3. 2017-2021 Server TDP per unit CPU computing power.

3.2 Research on server computing power and energy consumption

SPEC testing is a basic test for comprehensively measuring server performance and is one of the industry-standard, authoritative benchmark tests. In energy consumption research, the most widely used server energy consumption models in the data center are the additive model, the system utilization-based model and other server power models. The additive model formalizes the energy consumption of the whole server as the sum of the energy consumption of its subcomponents. The core idea is to combine fitted local nonparametric functions to establish the target model, so the additive model can be regarded simply as an improved version of linear regression. The simplest version of the model considers the energy consumption of the CPU and memory:

\(\begin{aligned}E(A) = E_{\text{CPU}}(A) + E_{\text{memory}}(A)\end{aligned}\)       (3)

where 𝐸CPU(𝐴) and 𝐸memory(𝐴) respectively represent the energy consumed by CPU and memory when running Algorithm A [18]. The energy model for an algorithm A is a weighted linear combination of the work complexity of A and the number of “parallel” accesses to the memory.

More complex models refine and improve on this, mainly by taking more energy-consuming server components into account, such as disks, I/O devices and network cards [19-21]. In addition, some studies also take the energy consumption of the server motherboard into account, or add it directly to the model as a constant [22].

In addition to the additive model, the other most commonly used server energy consumption model is based on system utilization. Generally, the energy consumption of the server system is composed of static energy consumption and dynamic energy consumption. Considering that CPU is the most energy consuming component in each subsystem of the server, the utilization rate of CPU is usually taken as the variable of the energy consumption model of the server system. This type of model first included the clock frequency of CPU operation as a variable in the energy consumption model for calculation, which can be seen as an extension of the basic digital circuit level power model [23]. The CPU energy consumption can be formalized as:

\(\begin{aligned}P = C_{0} + A C V^{2} f\end{aligned}\)       (4)

where 𝐶0 is the static power of the CPU, 𝐴𝐶𝑉²𝑓 is its dynamic power, 𝐴 is the conversion coefficient, 𝐶 is the capacitance, 𝑉 is the voltage, and 𝑓 is the clock frequency. 𝐴, 𝐶 and 𝐶0 are constants for specific hardware. Because 𝑉 is proportional to 𝑓, the dynamic power consumption of the CPU can be considered to have a cubic relationship with its clock frequency. Because 𝑓 is in turn proportional to the system operating speed 𝑠 [24], the relationship between power 𝑃 and operating speed 𝑠 (𝑠 > 0) is established as follows:

\(\begin{aligned}P(s) = \sigma + \mu s^{\alpha}\end{aligned}\)       (5)

where σ is the static power, and 𝜇 and 𝛼 are constants related to the specific hardware, with 𝛼 > 1. Another common approach in such models is to estimate the power of each component of the system and obtain the functional relationship between server power and the utilization of various resources by means of linear regression. However, this regression-based method requires many experiments on specific servers to obtain their energy consumption parameters. In addition to the above two models, there is also a widely used utilization-based power model, whose proposers found that a linear power model can track the power usage of the server system more accurately. Under the assumption that the power of the server is approximately 0 when it is turned off, the full system power of any server under any CPU utilization 𝑢 can be formalized as:

\(\begin{aligned}P_{u} = (P_{\max} - P_{\text{idle}})\,u + P_{\text{idle}}\end{aligned}\)       (6)

where 𝑃max and 𝑃idle represent the average power of the server at full load and in the idle state, respectively. The evaluation model used in this paper includes three main components, CPU, memory and storage, and also takes into account the operating conditions under different loads. It combines the additive model and the utilization model, and can comprehensively measure the energy efficiency and carbon emissions of the server during operation.
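The speed-based model of Eq. (5) and the linear utilization model of Eq. (6) can be sketched in a few lines of Python. All parameter values below are hypothetical and serve only to show how the two formulas are evaluated; they are not measurements from the paper.

```python
def speed_power(s: float, sigma: float, mu: float, alpha: float) -> float:
    """Eq. (5): P(s) = sigma + mu * s**alpha, with s > 0 and alpha > 1."""
    if s <= 0 or alpha <= 1:
        raise ValueError("requires s > 0 and alpha > 1")
    return sigma + mu * s ** alpha


def utilization_power(u: float, p_idle: float, p_max: float) -> float:
    """Eq. (6): P_u = (P_max - P_idle) * u + P_idle, with u in [0, 1]."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return (p_max - p_idle) * u + p_idle


# Hypothetical parameters (assumptions for illustration only).
p_cubic = speed_power(s=2.0, sigma=10.0, mu=1.5, alpha=3.0)   # 10 + 1.5 * 8
p_half = utilization_power(u=0.5, p_idle=100.0, p_max=300.0)  # midpoint of idle and max
print(p_cubic, p_half)
```

In the linear model the idle power acts as a floor: even at zero utilization the sketch returns `p_idle`, which is why consolidating load onto fewer servers can save energy.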

Furthermore, there are various other server power models. In a work on an operational state-based power model, the researchers concluded that the power consumption of a server at a certain workload is determined solely by the performance setting and is independent of the power consumption in former periods [25].

\(\begin{aligned}p(k) = A\,t(k) + B\end{aligned}\)       (7)

where A and B are two system parameters, p(k) is the power consumption of the server in the kth control period, and t(k) is the performance state of the processors in the kth control period [26].

3.3 Research on Carbon Emission per Performance of Server

3.3.1 Design of server computing power carbon efficiency

At present, there is no fundamental model for server computing power carbon efficiency. Most research focuses on the construction of server energy efficiency models and the selection of energy-saving methods. This paper proposes the concept of server computing power carbon efficiency and constructs a model that combines server computing performance and carbon emissions. Server computing power carbon efficiency is defined as the ratio of the carbon emissions generated during the service life of the server to the computing performance provided, that is, the carbon emission per performance of server (CEPS). The CEPS model is as follows:

\(\begin{aligned}\text {CEPS} =\frac{C }{S}\end{aligned}\)       (8)

where 𝐶 is the carbon emission; 𝑆 is the server computing performance.

3.3.2 Definition of carbon emissions

Carbon emission 𝐶 is the product of power consumption of server service life (based on five years) and carbon emission factor. Hence it can be defined as

\(\begin{aligned}C = \eta \times E_{\text{use}}\end{aligned}\)       (9)

where 𝜂 is the carbon emission factor. According to “the Average Carbon Dioxide Emission Factors of China's Regional Power Grid in 2011 and 2012” issued by the National Climate Strategy Center, 𝜂 equals 0.6808 kg CO2/kWh. 𝐸use is the power consumption of the server under the specified load. Following the calculation method for server power consumption in the service phase given in the ODCC whitepaper, and taking into account the general server benchmark test scenario, the power consumption data cover the server CPU, memory and storage I/O under the specified load. The Benchmark of Server Energy Efficiency (BenchSEE) tool is used to test the server under 7 types of general CPU loads, 2 types of general memory loads and 2 types of general storage loads. The energy consumption over the whole service life is then calculated as follows:

\(\begin{aligned}E_{\text{use}} = P_{\text{server}} \times 24 \times 365 \times 5\end{aligned}\)       (10)

BenchSEE is benchmark software for server product energy efficiency testing, developed under the leadership of the Resources and Environment Branch of the China National Institute of Standardization. Its design draws on the opinions of many server manufacturers, chip manufacturers, energy efficiency certification institutions and scientific research institutions in the field of IT energy conservation around the world, and it can meet the needs of server market applications for energy efficiency evaluation. 𝑃server is calculated as 𝑊power consumption/𝑇, where 𝑊power consumption is the total energy consumption (in kWh) and 𝑇 is the total measurement time (in h) across the 11 benchmark working conditions and load levels of BenchSEE: 7 general CPU loads, 2 memory test benchmarks and 2 storage I/O read/write benchmarks. In line with the product life cycle, the standard period for server carbon emission accounting is 5 years.
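The chain from benchmark measurements to carbon emissions in Eqs. (9) and (10) can be sketched as follows. The emission factor is the value quoted above; the per-workload energy and time readings are hypothetical, and only three workloads are shown for brevity where the paper uses 11. The function names are assumptions introduced for this sketch.

```python
ETA_KG_PER_KWH = 0.6808            # carbon emission factor quoted in the text
SERVICE_HOURS = 24 * 365 * 5       # five-year service life in hours (43,800 h)


def average_server_power_kw(energy_kwh: list[float], hours: list[float]) -> float:
    """P_server = total measured energy (kWh) / total measurement time (h)."""
    total_hours = sum(hours)
    if total_hours <= 0:
        raise ValueError("total measurement time must be positive")
    return sum(energy_kwh) / total_hours


def use_phase_energy_kwh(p_server_kw: float) -> float:
    """Eq. (10): E_use = P_server * 24 * 365 * 5."""
    return p_server_kw * SERVICE_HOURS


def carbon_emission_kg(e_use_kwh: float, eta: float = ETA_KG_PER_KWH) -> float:
    """Eq. (9): C = eta * E_use."""
    return eta * e_use_kwh


# Hypothetical readings for three workloads (not data from the paper).
p_server = average_server_power_kw(energy_kwh=[0.2, 0.3, 0.1], hours=[0.5, 0.5, 0.5])
e_use = use_phase_energy_kwh(p_server)
carbon = carbon_emission_kg(e_use)
print(f"P_server = {p_server:.2f} kW, E_use = {e_use:.0f} kWh, C = {carbon:.1f} kg CO2")
```

Note that 𝑃server must be expressed in kW for 𝐸use to come out in kWh, which is the unit the emission factor expects.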

3.3.3 Definition of Server computing performance

Server computing power carbon efficiency is defined as the ratio of the carbon emissions generated to the computing performance provided by the server, that is, the carbon emissions per unit of computing performance. For the computing performance of the server, this paper first calculates the theoretical computing performance over the server's service life (based on five years), taking sustained performance to be 50% of the theoretical peak floating-point computing power, as shown in the following formula:

\(\begin{aligned}S_{\text{theoretical}} = CPU_{\text{theoreticalFP}} \times 60 \times 60 \times 24 \times 365 \times 5 \times 50\%\end{aligned}\)       (11)

We then measured the actual server computing power using SPEC, an industry-standard CPU-intensive benchmark suite that tests computing performance across different applications and gives a comprehensive score. SPECrate 2017 is used to test integer and floating-point throughput, and the SPECrate 2017 Integer and SPECrate 2017 Floating Point scores are each weighted by 50% to obtain a score representing computing performance. The specific formula is as follows:

\(\begin{aligned}S_{\text{test}} = 50\% \times \text{SPECrate2017}_{\text{Int}} + 50\% \times \text{SPECrate2017}_{\text{Float}}\end{aligned}\)       (12)
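Equations (8), (11) and (12) can be combined into a short Python sketch of the CEPS calculation. All input values are hypothetical and chosen only for illustration. The sketch also assumes that the five-year theoretical total is expressed in units of 10^18 floating-point operations, which is how the "EFLOPS" figures reported later in the paper are interpreted here.

```python
SERVICE_SECONDS = 60 * 60 * 24 * 365 * 5   # five-year service life in seconds


def theoretical_performance(cpu_tflops: float, utilization: float = 0.5) -> float:
    """Eq. (11): five-year output at 50% of peak, in units of 1e18 operations."""
    total_tflop = cpu_tflops * SERVICE_SECONDS * utilization  # TFLOPS * s = TFLOP
    return total_tflop / 1e6                                  # 1e12 ops -> 1e18 ops


def measured_performance(spec_int: float, spec_fp: float) -> float:
    """Eq. (12): equal-weight mean of SPECrate 2017 Integer and Floating Point."""
    return 0.5 * spec_int + 0.5 * spec_fp


def ceps(carbon_kg: float, performance: float) -> float:
    """Eq. (8): CEPS = C / S."""
    if performance <= 0:
        raise ValueError("performance must be positive")
    return carbon_kg / performance


# Hypothetical server (assumed values, not results from the paper).
carbon_kg = 12000.0
s_theoretical = theoretical_performance(cpu_tflops=0.5)
s_test = measured_performance(spec_int=400.0, spec_fp=360.0)
print(f"theoretical CEPS: {ceps(carbon_kg, s_theoretical):.1f} kg per 1e18 operations")
print(f"measured CEPS:    {ceps(carbon_kg, s_test):.1f} kg per SPEC point")
```

The two CEPS values are not directly comparable because their denominators use different units; the theoretical value is per unit of floating-point work and the measured value is per benchmark score point.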

The mainstream server chip products in the industry are selected for the test. In this paper, the test results of different types of server CPUs in the latest series of AMD and Intel are selected for display and analysis. See the following table for specific server CPU models and parameters.

Table 2. The latest server CPU models and parameters of Intel and AMD

3.4 Model experiment and test result analysis

3.4.1 Analysis of the theoretical server computing power carbon efficiency

The carbon dioxide emissions of servers based on the latest AMD and Intel CPUs are estimated at between 8 and 21 tons over the 5-year service life, and the theoretical computing power delivered over the same period is between 22 and 93 EFLOPS. As shown in Fig. 4, the theoretical value of server computing power carbon efficiency lies between 200 and 450 kg/EFLOPS, and high-end products have lower theoretical values than low-end products. Intel accounts for a relatively large share of the products whose theoretical value is below 300 kg/EFLOPS, which may be related to the fact that current AMD server chips do not support the AVX-512 instruction set. According to documents publicly released by Intel, the base frequency of Intel CPUs is reduced to some extent when running AVX-512 instructions, and there are still differing views on how a CPU's theoretical computing performance should be calculated when AVX-512 is in use.

[Image: E1KOBZ_2023_v17n10_2682_f0004.png]

Fig. 4. The theoretical value of computing power carbon efficiency during the server usage phase.

3.4.2 Analysis of the measured value of CEPS

FLOPS represents the theoretical computing power of the server. In practical data center applications, however, measured values are a more useful reference, so this paper focuses mainly on the measured results. We recommend using the measured computing power carbon efficiency as the indicator when analyzing servers in actual use. Based on the test results, the correlation between SPEC scores and carbon emissions is shown in Fig. 5:

[Image: E1KOBZ_2023_v17n10_2682_f0005.png]

Fig. 5. Correlation between SPEC scores and carbon emissions during server use.

As the SPEC score increases, total carbon emissions rise, but performance grows faster than carbon emissions; that is, the slope of the curve relating carbon emissions to the computing performance score becomes progressively smaller. A data center can therefore choose servers with performance appropriate to its business needs in order to minimize its overall carbon emissions. Taking a total SPEC score of 8000 as an example, 21 Intel® Xeon® Gold 6342 servers can be replaced with 16 Intel® Xeon® Platinum 8380 servers or 11 AMD EPYC™ 7763 servers. The number of servers is reduced by up to 10, and the carbon emissions over the service life by up to 43%, which is equivalent to the carbon absorbed by more than 8100 trees in a year.
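The server-count comparison amounts to a ceiling division of the target score by the per-server score. The sketch below uses hypothetical per-server scores, chosen only so that the resulting counts match the 21, 16 and 11 quoted in the text; they are not the measured SPEC results from this paper, and the function name is illustrative.

```python
import math


def servers_needed(target_score, score_per_server):
    """Smallest number of servers whose combined score reaches the target."""
    if score_per_server <= 0:
        raise ValueError("score_per_server must be positive")
    return math.ceil(target_score / score_per_server)


TARGET = 8000
# Hypothetical per-server scores (assumptions, not measured values).
scores = {"Gold 6342": 385, "Platinum 8380": 500, "EPYC 7763": 730}

counts = {name: servers_needed(TARGET, s) for name, s in scores.items()}
for name, n in counts.items():
    print(f"{name}: {n} servers")
print("largest reduction:", max(counts.values()) - min(counts.values()))
```

Multiplying each count by that server's lifetime emissions would give the fleet-level comparison behind the 43% figure; those per-server emission values are not reproduced here.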

Because manufacturers use different technologies, their products differ, and the relationship between performance and carbon emissions shows different characteristics from one product line to another. Chip manufacturers are committed to moving computing toward greener, more energy-efficient designs.

[Image: E1KOBZ_2023_v17n10_2682_f0006.png]

Fig. 6. Measured value of computing power carbon efficiency during the server usage phase.

Starting from the definition of server computing power carbon efficiency, the calculation results show that the better the CPU performance, the more computing power the server provides, the more energy it consumes and the more greenhouse gases it emits; yet the computing power carbon efficiency, that is, the carbon emissions per unit of computing performance, decreases. The line chart shows that over the server's 5-year service life, the carbon emissions per unit of computing performance score are between 20 and 60 kg, and that the values for AMD servers are relatively lower, almost all below 30 kg.
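The measured indicator is simply lifetime carbon emissions divided by the measured score. The following sketch uses two hypothetical servers to show how total emissions can rise while emissions per score point fall; the figures are illustrative assumptions, not data from Fig. 6.

```python
def carbon_per_score(carbon_kg, score):
    """Carbon emissions per unit of measured computing performance (kg)."""
    if score <= 0:
        raise ValueError("score must be positive")
    return carbon_kg / score


# Hypothetical servers: (5-year emissions in kg, measured score).
low_end = (12_000, 300)
high_end = (18_000, 720)

for label, (kg, score) in (("low-end", low_end), ("high-end", high_end)):
    print(f"{label}: {kg} kg total, {carbon_per_score(kg, score):.1f} kg per score point")
```

In this example the higher-performing server emits more in total but less per score point, which is the pattern described in the text.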

4. Development prospect

4.1 Energy consumption of IT equipment is the core element of energy conservation and carbon reduction in the future

The energy efficiency metric PUE is widely used in the data center industry at home and abroad. It is defined as the ratio of the total energy consumption of the data center to the energy consumption of its IT equipment. The lower the PUE, the smaller the share of energy consumed by non-IT equipment. However, PUE only reflects the ratio between total energy consumption and IT equipment energy consumption; it cannot show whether the total power consumption of the data center has actually fallen. An analysis of the shares of IT equipment, cooling, power supply and distribution, and auxiliary lighting in data center energy consumption shows that IT equipment accounts for about 50% to 60% of the total [27]. As IT equipment becomes more energy efficient, its own power consumption falls, and so does the energy consumed by the power supply and cooling equipment that supports it. This reduces the total energy consumption of the data center and lowers its costs while raising its efficiency.
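A short numerical sketch makes the limitation of PUE concrete. It assumes, for illustration only, that non-IT overhead scales in proportion to IT energy so that PUE stays constant; the energy figures are hypothetical.

```python
def pue(total_kwh, it_kwh):
    """PUE = total data center energy / IT equipment energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_kwh / it_kwh


def total_energy(it_kwh, pue_value):
    """Total energy implied by a given IT energy and PUE."""
    return it_kwh * pue_value


# Hypothetical data center where IT equipment uses 50% of the energy.
before_it, before_total = 1000.0, 2000.0
ratio = pue(before_total, before_it)

# IT equipment becomes 20% more efficient; overhead assumed to scale with it.
after_it = before_it * 0.8
after_total = total_energy(after_it, ratio)

print(f"PUE before and after: {ratio}")
print(f"total energy: {before_total} kWh -> {after_total} kWh")
```

Here PUE is unchanged even though total energy falls by 400 kWh, which is why the energy efficiency of IT equipment has to be tracked separately from PUE.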

4.2 Server computing power carbon efficiency should be considered when selecting server equipment

Performance and price-performance have long been among the core concerns of customers purchasing computer systems. With rising energy costs and growing demand for computing power, electricity bills have become a major expense for data centers [28]. Data centers should still pay close attention to computing performance when selecting server equipment, to ensure that the equipment can supply the computing power the business needs. However, higher server performance also brings higher power consumption and carbon emissions. Because servers are the core IT equipment in data center operation, the carbon footprint and computing power carbon efficiency of server products over their life cycle are critical to controlling operational greenhouse gas emissions. Government agencies, social organizations and enterprise investors are increasingly treating quantified environmental impact as one of their core evaluation indicators. It is estimated that servers account for more than 90% of the energy consumed by IT equipment in the data center. Future work should study the process technology and intelligent power management of each server component (CPU, memory, fans, disks, network cards, motherboard components, etc.), find breakthrough points in each, improve the computing power carbon efficiency of servers, and ultimately minimize data center carbon emissions.

4.3 Strengthen research on carbon efficiency model matching different business scenarios

Computing resources have penetrated all aspects of life, and both infrastructure and applications have steadily diversified. Computing infrastructure now covers high-performance computing, edge computing, intelligent computing, general-purpose computing and other scenarios. Based on current market shipments, this study mainly analyzes general-purpose x86 CPU servers using the BenchSEE and SPEC benchmark software; servers based on other architectures, such as ARM, will be tested and analyzed in follow-up work. The two benchmark suites define dozens of stress scenarios for CPU, memory and storage, but still cannot fully cover today's diverse application requirements. In actual use, performance scores and carbon emission data based on these two benchmarks cannot reflect server performance in every scenario. Future work should promote research on data center efficiency models for different business scenarios, which will lay a better theoretical foundation for energy efficiency optimization.

References

  1. Y. Wang, Y. X. Zhang, and J. Li, "Analysis and prospect of low-carbon development of data centers," Communications World, 15, 42-44, 2021.
  2. Wilbanks, L., "Green: My favorite color," IT Professional, 10(6), 64-64, 2008. https://doi.org/10.1109/MITP.2008.122
  3. Harmon, R. R., & Auseklis, N, "Sustainable IT services: Assessing the impact of green computing practices," in Proc. of PICMET'09-2009 Portland International Conference on Management of Engineering & Technology, pp. 1707-1717, August 2009.
  4. Boccaletti, G., Loffler, M., & Oppenheim, J. M., "How IT can cut carbon emissions," McKinsey Quarterly, 37, 37-41, 2008.
  5. Y. R. Liang, M. Lei, L. Liu, Y. M. Cao, and Q. Liu, "Research on Application of Customized Server for the Whole Cabinet," Telecom Power Technology, 2020(01), pp. 190-191, 2020. 
  6. L. N. Xie, and L. Guo, "Discussion of liquid cooling technology and development," Information and Communications Technology and Policy, vol. 45(02), pp. 22-25, 2019.
  7. Stanford, E., "Environmental trends and opportunities for computer system power delivery," in Proc. of 2008 20th International Symposium on Power Semiconductor Devices and IC's, pp. 1-3, May 2008.
  8. Wang, D., "Meeting green computing challenges," in Proc. of 2008 10th Electronics Packaging Technology Conference, pp. 121-126, December 2008.
  9. Judge, J., Pouchet, J., Ekbote, A., & Dixit, S., "Reducing data center energy consumption," Ashrae Journal, 50(11), 14-26, 2008.
  10. V. W. Freeh, D. K. Lowenthal, "Using multiple energy gears in MPI programs on a powerscalable cluster," in Proc. of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP, pp. 164-173, June 15-17, 2005.
  11. L. Tong, "Analysis of energy-saving technology of various components of the server," Communication & Information Technology, (03), pp. 86-88+104, 2015.
  12. B. Bailey, "Chiplets for the masses," Semiconductor Engineering, 2021-03-03. [Online]. Available: https://semiengineering.com/chiplets-for-the-masses/
  13. Scansen, D., "AMD, TSMC & IMEC show their Chiplet playbooks at ISSCC," EE Times, 2021, February 26. [Online]. Available: https://www.eetimes.com/amd-tsmc-imec-show-theirchiplet-playbooks-at-isscc/
  14. Y. X. Li, "The state-of-the-art of Chiplet and problems need to be solved," Microelectronics & Computer, vol. 39(05), pp. 1-9, 2022.
  15. W. Luo, "Research on processor energy efficiency optimization for mobile games," Thesis for academic degree of master, Southeast University.
  16. J. Zhao, R. G. Li, and Y. Liu, "Energy-Saving Technologies and Standard Requirements of Server," Information Technology & Standardization, (10), 37-39+59, 2018.
  17. China Academy of Information and Communications, Open Data Center Committee, 2020. Data Center Whitepaper, 2020. whitepaper- results release - ODCC.
  18. S. Roy, A. Rudra, A. Verma, "An energy complexity model for algorithms," in Proc of the 4th Conf on Innovations in Theoretical Computer Science, New York, ACM, pp. 283-304, 2013. 
  19. A. Lewis, J. Simo, N. F. Tzeng, "Chaotic attractor prediction for server run-time energy consumption," in Proc of the Int Conf on Power Aware Computing and Systems, Berkeley, CA, USENIX Association, pp. 116, 2010.
  20. A. W. Lewis, N. F. Tzeng, S. Ghosh, "Runtime energy consumption estimation for server workloads based on chaotic time-series approximation," ACM Transactions on Architecture and Code Optimization, vol. 9, no. 3, pp. 1-26, 2012. https://doi.org/10.1145/2355585.2355588
  21. Song, S. L., Barker, K., & Kerbyson, D., "Unified performance and power modeling of scientific workloads," in Proc. of the 1st international workshop on energy efficient supercomputing, pp. 1-8, November 2013.
  22. A. Chatzipapas, D. Pediaditakis, C. Rotsos, et al., "Challenge: Resolving data center power bill disputes: The energy-performance trade-offs of consolidation," in Proc of the 6th ACM Int Conf on Future Energy Systems, New York, ACM, pp. 89-94, 2015.
  23. R. Ge, X. Feng, W. C. Kirk, "Performance-constrained distributed DVS scheduling for scientific applications on power-aware clusters," in Proc of the 18th ACM/IEEE Conf on Supercomputing, Piscataway, NJ, pp. 34-45, 2005.
  24. S. Albers, "Energy-efficient algorithms," Communications of the ACM, vol. 53, no. 5, pp. 86-96, 2010. https://doi.org/10.1145/1735223.1735245
  25. Lefurgy, C., Wang, X., & Ware, M, "Server-level power control," in Proc. of Fourth International Conference on Autonomic Computing (ICAC'07), pp. 4-4, June 2007.
  26. Dayarathna, M., Wen, Y., & Fan, R, "Data center energy consumption modeling: A survey," IEEE Communications Surveys & Tutorials, 18(1), 732-794, 2016. https://doi.org/10.1109/COMST.2015.2481183
  27. Z. G. Li, "Energy consumption analysis and energy saving research of data center IT equipment," Science Technology and Industry, (04), pp. 124-126+153, 2014.
  28. Poess, M., & Nambiar, R. O, "Energy cost, the key challenge of today's data centers: a power consumption analysis of TPC-C results," Proceedings of the VLDB Endowment, 1(2), 1229-1240, 2008. https://doi.org/10.14778/1454159.1454162