
Composite Measures of Supercomputer Technology

  • Kim, Nam-Gyu (Division of National Supercomputing, Korea Institute of Science and Technology Information) ;
  • On, Noo Ri (Division of National Supercomputing, Korea Institute of Science and Technology Information) ;
  • Koh, Myoung-Ju (Division of National Supercomputing, Korea Institute of Science and Technology Information) ;
  • Lee, JongSuk Ruth (Division of National Supercomputing, Korea Institute of Science and Technology Information) ;
  • Cho, Keun-Tae (Department of Management of Technology, Sungkyunkwan University)
  • Received : 2019.03.03
  • Accepted : 2019.06.06
  • Published : 2019.08.31

Abstract

We have developed composite measures of supercomputer technology, reflecting various factors of supercomputers using Martino's scoring model. CPUs, accelerators, memory, interconnection networks, and power consumption are chosen as factors of the model. The weight values of the factors are derived based on a survey of 129 domestic and international experts. The measured values are then standardized to integrate measurement units of the factors in the model. This model has been applied to 50 supercomputers, and rank correlation analysis was performed using representative measures. As a consequence, the ranking drastically changes except for the 1st and 2nd supercomputers on the TOP500. In addition, the characteristics of memory and interconnection networks influence the ranking, and the results demonstrate that the proposed model has low correlations with HPL and HPCG but a high correlation with Green500. This indicates that power consumption is an important factor that has a significant effect on the measures of supercomputer technology. In addition, it is determined that the differences between the HPL ranking and the proposed model ranking are influenced by power consumption, CPU theoretical peak performance, and main memory bandwidth in order of significance. In conclusion, the composite measures proposed in this study are more suitable for comprehensively describing supercomputer technology than existing performance measures. The findings of this study are expected to support decision making related to management and policy in the procurement and operation of supercomputers.

1. Introduction

Scientific computing or computational science and engineering is a multidisciplinary field comprising algorithms, software, computers, information technology, and computing infrastructure required to understand and solve complex problems in science and engineering [1,2,3]. Supercomputers are computer systems capable of the most powerful computation performance for solving such problems. Therefore, the performance of supercomputers is an important concern for researchers and organizations that own and operate such computer systems [3,4].

The TOP500 Project, which began in 1993, ranks the top 500 supercomputers in the world and announces the ranking twice a year based on the results of the benchmark program High-Performance LINPACK (HPL) [5,6]. HPL is easy to measure, use, and update because it indicates the performance of supercomputers based on the number of floating-point operations per second (FLOPS). However, accelerators, such as GPUs and many-core processors, are currently being employed for data-based artificial intelligence applications such as deep learning and big data analysis that require massive data processing [7]. In the life sciences field, memory capacity and memory bandwidth limitations present considerable challenges, as a significant amount of memory is needed for large-scale genetic analysis [7,8]. Consequently, HPL has been strongly criticized for low correlations with the operating performance of applications because the measurements of HPL do not reflect the characteristics of other important factors of supercomputers [9-12].

The High Performance Conjugate Gradients (HPCG) benchmark, which complements HPL, reflects the characteristics of data access patterns of current applications. However, HPCG exhibits a shortcoming in which benchmark results are greatly affected by actual memory bandwidth [11,12]. In addition, other supercomputer benchmarking projects for specific purposes are in progress, such as Graph500, which focuses on supercomputer performance measurements for data-intensive applications, and Deep500, which is intended for performance measurement of deep learning applications.

Meanwhile, as supercomputer performance increases, power consumption also increases, raising concerns associated with “The Energy Crisis in Supercomputing” [13,14]. Therefore, power consumption is a critical factor that must be considered in the procurement and operation of supercomputers, and as a consequence, accelerators are increasingly being used in many supercomputers because of their power efficiency and high parallel computing performance [15-20]. The Green500, which measures the energy efficiency of supercomputers based on FLOPS per watt, has been launched through the TOP500 Project [21]. Green500 thus denotes the HPL measurement per unit of power; therefore, it also inherits the disadvantages of HPL.

This study proposes composite measures of supercomputer technology using a scoring method that indicates the technology level of supercomputers by comprehensively considering the utilization characteristics of users and the primary factors of supercomputers. This paper suggests that the proposed model is better suited to comprehensively describing the technology level of supercomputers than existing indicators based on simple performance benchmark programs, and that it can help to inform management decisions and policies for the procurement and operation of supercomputers.

The remainder of this study is organized as follows. Section 2 discusses prior studies on representative performance measurement methods for supercomputers, including their limitations. Section 3 presents the proposed research methods and composite measures of supercomputer technology. Section 4 evaluates supercomputers based on the proposed composite measures and compares the ranking order of the supercomputers derived from the proposed model with the existing ranking orders of supercomputers using traditional benchmarks. Section 5 summarizes the research results and presents a discussion on the limitations and future work.

2. Prior Studies on Supercomputer Performance Measurement

The performance of supercomputers is a critical concern for researchers who use them for solving complex and large problems in science and engineering. Various benchmarks have been developed to measure the performance of supercomputers, but the most widely used benchmarks are HPL and HPCG.

HPL is a benchmark program that determines the solution to Ax = b, a large-scale dense matrix problem of a linear equation. The performance of HPL is determined by the 64-bit floating-point operations used in the multiplication of the dense matrix, which is the major calculation in the methodology of the benchmark program [5,22]. The FLOPS value obtained from HPL is used as the measure of supercomputing performance in the TOP500 Project, which has presented a list of the top 500 fastest supercomputers in the world since 1993. When HPL was initially applied in the TOP500 Project, most benchmark applications were based on dense matrix calculations similar to HPL. However, the majority of current applications compute differential equations that require high memory bandwidth and irregular data access. As a consequence, there is a low correlation between the performance of HPL and that of such applications [22,23,24].
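As a toy illustration of the FLOPS metric (not the HPL code itself), the following sketch times a dense solve with NumPy and converts the nominal LU operation count into a rate; HPL does the same at massive scale with a distributed LU factorization. The problem size n is a hypothetical choice.

```python
import time
import numpy as np

n = 2000                              # hypothetical problem size
A = np.random.rand(n, n)
b = np.random.rand(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)             # dense LU-based solve of Ax = b
elapsed = time.perf_counter() - start

flops = (2 / 3) * n**3 + 2 * n**2     # nominal HPL operation count
print(f"~{flops / elapsed / 1e9:.2f} GFLOPS in {elapsed:.3f} s")
```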

To complement this disadvantage of HPL, HPCG was developed by Jack Dongarra and the researchers who developed HPL. HPCG is a benchmark program that solves a partial differential equation on a three-dimensional grid discretized with a 27-point stencil. Similar to HPL, HPCG determines the solution to Ax = b, but unlike HPL, the matrix A is sparse, and the solution process uses the conjugate gradient algorithm with a Gauss-Seidel preconditioner [23]. HPCG compensates for the shortcoming of HPL by reflecting the characteristics of data access patterns seen in current applications. However, the result of HPCG depends on the actual bandwidth of memory because HPCG requires an excessively large memory bandwidth compared to real applications, which is a shortcoming of HPCG [13,14].
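For contrast, a minimal sparse conjugate gradient solve in the spirit of HPCG can be sketched with SciPy. The 1D Poisson matrix below is a stand-in assumption, not HPCG's actual 27-point, three-dimensional problem, and the Gauss-Seidel preconditioner is omitted.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# A 1D Poisson matrix as a small stand-in for HPCG's sparse A
# (HPCG itself uses a 27-point stencil in three dimensions and a
# Gauss-Seidel preconditioner, both omitted here).
n = 10_000
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = cg(A, b)                      # CG iterations are dominated by
print(info, np.linalg.norm(A @ x - b))  # memory traffic, not FLOPS
```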

Compared to supercomputer performance evaluations obtained from benchmark programs, which are based on the concept that performance is denoted by calculation speed, the Green500 Project focuses on the power efficiency of supercomputers and derives its ranking based on FLOPS per watt. Because the increase in the calculation performance of supercomputers has caused a rapid increase in power consumption, power efficiency is an important factor that must be considered when procuring and operating supercomputers, leading to the concept of “Green Supercomputing” [13,14]. For example, the Earth Simulator in Japan, which was ranked No. 1 for two years from 2002, incurred an annual cost of 10 million US dollars (USD) to run at 12 MW power consumption [13]. Moreover, Nurion, the 5th-generation supercomputer introduced at KISTI in Korea in 2018, has a peak power consumption of 3,809 kW, and the annual electricity cost incurred by Nurion is estimated to be 4.4 billion KRW. In general, 1 MW-year of electricity costs approximately one million USD, and power consumption costs are increasing rapidly as the performance of supercomputers increases, owing to competition among developed countries [13,14,19,20]. However, Green500 is also a straightforward metric that presents the computation performance per unit power by dividing the HPL benchmark results of supercomputers by their power consumption. Therefore, Green500 also inherits the shortcomings of HPL.

3. Research Method and Model

The HPL, HPCG, and Green500 rankings in the TOP500 project are largely influenced by specific components of supercomputers and hold the disadvantage of not reflecting the features of various supercomputer components. Supercomputers consist of various factors, and supercomputer technology depends on the combination of these factors. In addition, there is no established analysis method that combines factors into composite measures of supercomputer technology. Therefore, developing comprehensive composite measures that overcome the limitations of previous studies is essential. In this paper, we use Martino's general scoring model, which has been widely used to develop composite measures based on a combination of several factors [25,26].

To develop a scoring model, the factors must first be identified. All factors that are necessary for the technology to function properly are listed. Duplicate factors are excluded, and only measurable factors are chosen. Similar factors are then grouped together. Weights are then assigned to each factor by considering the relative importance of the grouped factors. Finally, standardization is performed to convert all measurements to a common scale, on which all factors can be evaluated and measured.

3.1 Selection and Identification of Factors

In this paper, we focus on supercomputer hardware to measure the technology level of a supercomputer regardless of its specific applications. Therefore, we select several major hardware components of the supercomputer as factors for the scoring model. Major hardware components such as processors, memory, interconnection networks, I/O systems, and storage are considered based on several previous studies [4,12,27,28]. As described above, the performance of a factor must be objectively measurable. Because most supercomputer manufacturers do not publish information on the internal I/O systems in the compute node, I/O systems were excluded. Storage systems were also excluded because it is difficult to measure the storage capacity used by a specific HPC system, as several supercomputers share multiple storage systems in most organizations. In conclusion, we selected processors, memory, and interconnection networks as the major factors, for which objective data are available from the TOP500 project, a credible statistical data repository for supercomputers dating back to 1993. Furthermore, to reflect recent trends, processors are divided into CPUs and accelerators such as graphics processing units (GPUs) and the many integrated core (MIC) architecture.

In addition, as described in the previous section, the power consumption of supercomputers has been added as a major factor because it is an important consideration for supercomputer procurement and operation. Information on accelerators and power consumption is also included in the TOP500 project.

3.2 Weights of the Factors

To derive the weights assigned to the major factors presented in this paper, we conducted a survey on the importance of major supercomputer components among domestic and foreign researchers. In total, 129 questionnaires were collected during supercomputing conferences from computational science and engineering researchers who use high-performance computers (clusters, clouds, or supercomputers). Fig. 1 presents the descriptive statistics of the survey respondents.

Fig. 1. Respondents’ affiliations and regions

Based on the survey, we performed a multiple ranked response analysis to determine the importance of CPUs, accelerators, memory, and interconnection networks. The frequencies of the 1st, 2nd, 3rd, and 4th rankings assigned to each factor are counted, the frequencies are multiplied by significance scores of 4, 3, 2, and 1, respectively, and the proportions of the resulting sums are used as weights [29]. Table 1 summarizes the results, and the sketch below illustrates the calculation.
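A minimal sketch of this weighting scheme; the frequency table is hypothetical (the actual survey frequencies appear in Table 1).

```python
# Hypothetical frequency table: freq[f][r] is the number of the 129
# respondents who ranked factor f in position r (1st..4th).
factors = ["CPU", "Accelerator", "Memory", "Interconnect"]
freq = {
    "CPU":          [60, 30, 25, 14],
    "Accelerator":  [25, 40, 34, 30],
    "Memory":       [30, 35, 40, 24],
    "Interconnect": [14, 24, 30, 61],
}
rank_points = [4, 3, 2, 1]  # 1st place scores 4 points, ..., 4th scores 1

scores = {f: sum(n * p for n, p in zip(freq[f], rank_points)) for f in factors}
total = sum(scores.values())
weights = {f: round(scores[f] / total, 3) for f in factors}
print(weights)  # these proportions serve as the factor weights a, b, c, d
```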

Table 1. Rankings and weights based on importance of major supercomputer components

In addition, we consider the performance of the supercomputer in terms of its effectiveness and its power consumption in terms of costs, and we derive weights for performance and power consumption accordingly. The procurement costs and electricity costs of the top 10 supercomputers among the 50 target supercomputers were calculated. Procurement costs were sourced from the Internet, and the electricity cost is estimated at 1 million USD per 1 MW-year [11-14]. As described above, because supercomputer performance is increasing rapidly, the lifetime of a supercomputer is taken to be 5 years, so the electricity cost was calculated over 5 years [4,7]. The ratio of procurement cost to operating cost is then determined by assuming that the sum of the procurement price and the 5-year electricity cost is the total cost of each supercomputer. Based on the average across the 10 supercomputers, we determined the ratios of procurement cost and operating cost to total cost to be 0.81 and 0.19, respectively. Therefore, the system performance weight is 0.81 and the power consumption weight is 0.19. Table 2 summarizes the results, and the sketch below reproduces the calculation with hypothetical figures.
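A sketch of this cost-based weighting; the (procurement cost, peak power) pairs are hypothetical, not the actual top-10 figures of Table 2.

```python
# Hypothetical (procurement cost in USD, peak power in MW) pairs.
systems = [(200e6, 10.0), (150e6, 8.0), (100e6, 4.0)]
LIFETIME_YEARS = 5
USD_PER_MW_YEAR = 1e6          # ~1 million USD per MW-year of electricity

ratios = []
for procurement, power_mw in systems:
    electricity = power_mw * USD_PER_MW_YEAR * LIFETIME_YEARS
    total = procurement + electricity
    ratios.append((procurement / total, electricity / total))

# Average the per-system ratios, as done for the paper's top 10
perf_w = sum(r[0] for r in ratios) / len(ratios)
power_w = sum(r[1] for r in ratios) / len(ratios)
print(round(perf_w, 2), round(power_w, 2))  # the paper reports 0.81 and 0.19
```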

Table 2. Procurement costs and electricity costs of the top 10 supercomputers

3.3 Construction of the Model

The composite measure model for reflecting major supercomputer factors is based on the general scoring model developed by Joseph Martino. The model proposed in this study is expressed as follows:

$$ T \;=\; \frac{X_1^{\,w_1}\, X_2^{\,w_2} \cdots X_m^{\,w_m}}{Y_1^{\,v_1}\, Y_2^{\,v_2} \cdots Y_n^{\,v_n}} \qquad (1) $$

Here, $X_1, \ldots, X_m$ denote the desirable factors, $Y_1, \ldots, Y_n$ denote the cost or other undesirable factors, and the lowercase letters $w_i$ and $v_j$ denote the weights applied to them.

In equation (1), uppercase letters denote factors, and lowercase letters denote the weights applied to the factors. The uppercase letters in the numerator indicate the desirable factors; the larger the values of these factors, the higher the score of the technology. The uppercase letters in the denominator denote costs or other undesirable factors; if these values increase, the score of the technology decreases. Table 3 presents a description of the factors in the model.

Table 3. Factors Considered in the Model

P, A, MS, MB, AS, AB, and I are placed in the numerator because the performance of supercomputers increases as their values increase. PW is placed in the denominator because it denotes the cost of these supercomputers.

I is an overriding factor, which is the most important and is expressed as a multiplication in the equation. The total score is computed as zero when an overriding factor is zero.

P and A, which are related to computation performance, and MS, MB, AS, and AB, which are related to memory performance, are tradable factors. Even if the value of any of these factors is zero, the total score is not zero. After being merged with related factors by weighted addition, the combined terms enter the equation multiplicatively.

The weight values a, b, c, d, α, β, γ, and δ are determined by expert judgment or statistics. Their sum must be 1 because the weight values of the factors need to be normalized. This study also considers a model that does not include power consumption, for comparison with the rankings of the conventional HPL and HPCG benchmarks. Consequently, the weights are different between the model that considers power consumption and the one that does not. The method of determining the weight values of factors a, b, c, and d is described in Table 4.

Table 4. Weights of the Model Factors

α, β, γ, and δ are the weight values of the tradable factors. The sum of the theoretical peak performance of the CPUs and accelerators is the theoretical peak performance of a supercomputer. Therefore, P and A are combined by addition, and their weight values α and β are given by their proportions of the total theoretical performance. Furthermore, the sum of the main memory performance and the accelerator memory performance is the total memory performance of the supercomputer. γ and δ are the weight values that denote their importance, and they are equal to α and β because they are likewise determined by the proportions of the CPU and accelerator contributions to the total theoretical performance of the supercomputer. The equations for calculating the weights of the tradable factors are listed in Table 5.
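In the notation of Table 3, with P and A denoting the CPU and accelerator theoretical peak performance, these relations can be written explicitly as:

$$ \alpha = \frac{P}{P + A}, \qquad \beta = \frac{A}{P + A}, \qquad \gamma = \alpha, \qquad \delta = \beta, \qquad \alpha + \beta = \gamma + \delta = 1. $$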

Table 5. Weights between the tradable factors

Finally, we propose the power model (P-Model) by integrating the factors and their weights. In addition, to verify the validity of the P-Model, we also propose the non-power model (NP-Model), which excludes the power consumption factor, for comparison against representative supercomputer performance measures. Writing S for the composite of the seven desirable factors (P, A, MS, MB, AS, AB, and I) combined according to equation (1) with the weights in Tables 4 and 5, the models are defined as follows:

(P-Model) $\displaystyle T_{P} = \frac{S^{\,0.81}}{PW^{\,0.19}}$

(NP-Model) $\displaystyle T_{NP} = S$
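As an illustration only, the following sketch computes P-Model- and NP-Model-style scores under one reading of the description above; the exact grouping of the memory terms, the exponent form of the weights, and all input values are assumptions for illustration, not the paper's exact implementation.

```python
def p_model_score(z, w, perf_weight=0.81, power_weight=0.19):
    """A hedged sketch of a P-Model-style score.

    z: standardized factor scores on the 1-10 scale of Section 3.4,
       keyed "P", "A", "MS", "MB", "AS", "AB", "I", "PW".
    w: survey weights "a".."d" (Table 4) and tradable weights
       "alpha"/"beta" (Table 5), with gamma = alpha and delta = beta.
    """
    # Tradable factors are merged by weighted addition, so a zero in any
    # one of them does not zero the total score ...
    compute  = w["alpha"] * z["P"]  + w["beta"] * z["A"]
    mem_size = w["alpha"] * z["MS"] + w["beta"] * z["AS"]
    mem_bw   = w["alpha"] * z["MB"] + w["beta"] * z["AB"]
    # ... while I is overriding: it enters multiplicatively, so a zero
    # interconnect score forces the whole score to zero. The grouping and
    # exponents below are illustrative assumptions.
    composite = (compute ** (w["a"] + w["b"])
                 * (mem_size * mem_bw) ** (w["c"] / 2)
                 * z["I"] ** w["d"])
    # Performance and power are traded off with the 0.81/0.19 cost weights.
    return composite ** perf_weight / z["PW"] ** power_weight

def np_model_score(z, w):
    """NP-Model sketch: the same composite without the power term."""
    return p_model_score(z, w, perf_weight=1.0, power_weight=0.0)

# Example with hypothetical scores and weights:
z = {"P": 6.1, "A": 7.3, "MS": 5.8, "MB": 6.4,
     "AS": 7.0, "AB": 6.9, "I": 5.5, "PW": 6.2}
w = {"a": 0.30, "b": 0.25, "c": 0.25, "d": 0.20, "alpha": 0.4, "beta": 0.6}
print(p_model_score(z, w), np_model_score(z, w))
```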

3.4 Centering and Standardizing the Measured Values

Each factor uses a different measurement unit, and even the same factor can yield different values depending on the unit used. For example, the measurement unit of the CPU is FLOPS, which differs from B/s, the measurement unit of interconnection networks. In addition, a value can differ by a factor of 1,000 depending on whether GFLOPS or TFLOPS is used as the measurement unit of CPU peak performance. Therefore, the measurements of factors with different measurement units and values must be standardized to construct a composite measure. The mean and standard deviation of the measured values are calculated for each factor, and the measured values of each factor are standardized based on them. Finally, to center the standardized values onto a common scale, the standardized values were multiplied by 0.7, and 5 was added, so that the values fall between 1 and 10.
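A minimal sketch of this standardization step, assuming NumPy; the extreme CPU peak-performance values are those quoted in Section 4.4, while the middle two are hypothetical.

```python
import numpy as np

def standardize_factor(values):
    """Z-score a factor's measured values, then rescale as in Section 3.4:
    multiply by 0.7 and add 5 so the scores fall roughly between 1 and 10."""
    v = np.asarray(values, dtype=float)
    z = (v - v.mean()) / v.std()
    return 0.7 * z + 5.0

# CPU peak performance expressed in a single fixed unit for all systems;
# mixing units (e.g., GFLOPS vs. TFLOPS) would distort the scores.
print(standardize_factor([29_120, 500_000, 3_000_000, 128_446_365]))
```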

4. Analysis and Results

4.1 Target Selection

To apply the scoring model proposed in this paper, we selected supercomputers with measured values for HPL, HPCG, and Green500 on the TOP500 list as of June 2018 so that our proposed model could be compared with representative supercomputer performance measures. Moreover, Nurion, the latest supercomputer introduced at the KISTI National Supercomputing Division in Korea, was added as an analysis target. Fifty supercomputers, excluding machines with no values for the major factors among the selected supercomputers, were finally selected for analysis.

4.2 Analysis Results

The P-Model was used as the composite measure model of supercomputer technology, and the NP-Model was used for comparisons with representative measures of the selected 50 supercomputers. Table 6 presents the rankings of the 50 supercomputers based on the P-Model score and the NP-Model score, including their HPL, HPCG, and Green500 rankings and the differences between the HPL ranking and the P-Model ranking. Summit and Sunway TaihuLight, which ranked 1st and 2nd in the P-Model ranking, were also ranked 1st and 2nd in the TOP500, respectively. These are overwhelmingly larger in scale than the other supercomputers, as Summit and Sunway TaihuLight achieved maximal performance of 122.3 PF and 93 PF on HPL, respectively [6]. Therefore, their scores were much higher than those of the other supercomputers. Trinity, which ranked 6th in the HPL ranking, was 3rd in the P-Model ranking. Trinity contains 9,984 nodes equipped with Intel Xeon Phi Knights Landing processors with high-bandwidth memory (MCDRAM) of 16 GB capacity and a peak memory bandwidth of over 450 GB/s, nominally five times faster than DDR4; consequently, Trinity has a large memory capacity of over 2 PB [30]. Trinity was the third-ranked supercomputer in the P-Model because its main memory size, accelerator peak performance, and accelerator memory size were each ranked 2nd among the 50 supercomputers. Plasma Simulator, which ranked 4th in the P-Model, consists of the Fujitsu PRIMEHPC FX100 with HMC (Hybrid Memory Cube), which has a high memory bandwidth of 240 GB/s in each direction (in/out) [6,31]. In addition, it has the Tofu2 interconnection network with a high bandwidth of 50 GB/s per node [32]. The supercomputer at the Information Technology Center, Nagoya University, which ranked 5th, and SORA-MA, which ranked 6th in the P-Model ranking, share the same hardware features as Plasma Simulator [33]. Based on these results, the systems ranked 4th, 5th, and 6th in the P-Model, which rose significantly from their HPL rankings, are systems whose main memory bandwidth and interconnection network bandwidth are ranked 1st among the 50 supercomputers.

In other words, the results demonstrate that the model presented in this paper reflects not only the performance of CPUs and accelerators but also the memory bandwidth that is important in current applications and the interconnection network bandwidth for data transfer between nodes that is important in massively parallel applications.

In the case of Nurion, the HPL ranking and the HPCG ranking were 8th and 6th among the 50 analysis targets, respectively. By contrast, it was ranked 16th in the Green500 ranking, meaning that the power efficiency of Nurion was lower than that of other supercomputers. Nurion ranked 12th in the P-Model, as both the performance and the power consumption of Nurion were properly reflected.

Table 6. Scores and rankings of the proposed models and representative supercomputer rankings

4.3 Comparisons with Representative Supercomputer Measures

To compare the proposed composite measures with existing measures of supercomputers, a rank correlation analysis was performed on the rankings from Table 6; the results are listed in Table 7.
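Such an analysis can be sketched with SciPy's Spearman rank correlation; the short rank lists below are hypothetical stand-ins for the 50-entry columns of Table 6.

```python
from scipy.stats import spearmanr

# Hypothetical rank lists standing in for two columns of Table 6;
# the paper runs the same test over all 50 systems.
hpl_rank    = [1, 2, 3, 4, 5, 6, 7, 8]
pmodel_rank = [1, 2, 6, 3, 8, 4, 5, 7]

rho, p_value = spearmanr(hpl_rank, pmodel_rank)
print(f"rho = {rho:.3f}, p = {p_value:.3f}")  # explanatory power ~= rho**2
```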

Table 7. Results of rank correlation analysis

We first examined the rankings of the conventional measures of supercomputers. The correlation between the HPL ranking and the HPCG ranking showed a correlation coefficient of 0.93 at a significance level of less than 0.01, indicating a very high correlation [34]. Therefore, a supercomputer with a high HPL ranking is expected to have a high HPCG ranking as well, and the explanatory power is 86%. However, the correlation between the HPL ranking and the Green500 ranking showed a correlation coefficient of 0.271 at a significance level of 0.061, indicating a low correlation. In addition, the correlation coefficient between the HPCG ranking and the Green500 ranking was not significant. This means that the ranking of a supercomputer can change considerably when power consumption is included in the measures, and measures that include power consumption have a different explanatory power than simple performance benchmarks.

The correlation between the P-Model ranking and the HPL ranking showed a correlation coefficient of 0.29 and a significance level of 0.41. In addition, the correlation between the P-Model ranking and the HPCG ranking showed a correlation coefficient of 0.271 and a significance level of 0.057. This means that the P-Model ranking is very different from the rankings derived from the HPL and HPCG benchmark results. In contrast, there was a high correlation between the P-Model ranking and the Green500 ranking, with an explanatory power of 51.4%, as the correlation coefficient between them was 0.717 at a significance level of less than 0.01.

The ranking of the NP-Model, used for comparison with existing measures, showed correlations with the HPL ranking and the HPCG ranking, with correlation coefficients of 0.692 and 0.725, respectively, at a significance level of less than 0.01. However, there was a low correlation between the rankings of the NP-Model and Green500, with a correlation coefficient of 0.29 at a significance level of less than 0.01.

In summary, the HPL ranking and the HPCG ranking showed no correlation with the Green500 ranking because power consumption is included only in the latter measure. Likewise, the correlations of the HPL ranking and the HPCG ranking with the P-Model were very low because power consumption is included only in the P-Model. By contrast, the correlations of the HPL ranking and the HPCG ranking with the NP-Model were high, as none of these measures considers power consumption. This suggests that the power consumption of a supercomputer has a greater effect on rankings than the other factors. In other words, power consumption is a critical factor in the evaluation of supercomputer technology.

The above analysis demonstrates that the composite measures of supercomputers proposed in this study have high explanatory power for the overall technology level of supercomputers because they include a wider variety of influencing factors than the simple measures used by conventional benchmarks.

4.4 Analysis of the Differences between the HPL Ranking and the P-Model Ranking

To identify the factors that explain the differences between the P-Model and existing HPC performance measures, the differences between the P-Model ranking and existing rankings were analyzed. For this purpose, only the variation between the HPL ranking and the P-Model ranking was analyzed. In the above section, the rank correlation analysis demonstrated that the HPL and HPCG rankings were similar, and the ranking variation between the P-Model and the Green500 had approximately 50% explanatory power. Therefore, it is easier to determine which factors have a significant influence on rank changes by analyzing rankings that differ considerably. In addition, it is meaningful to identify the cause of the ranking differences between the P-Model proposed in this paper and the most representative existing measure.

First, the differences between the P-Model ranking and the HPL ranking of the 50 supercomputers were used to derive ranking change values. Then, to evaluate the overall tendency of the ranking changes, these values were divided into five levels based on quartiles: significant drop, slight drop, no change, slight rise, and significant rise. As independent variables, we used all eight major factors employed in the P-Model. Because the ranking change level, as the dependent variable, is ordinal with more than two categories, ordinal logistic regression analysis was performed [34]; a sketch of such an analysis follows.
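A sketch of such an ordinal (proportional-odds) logistic regression with statsmodels; the data below are synthetic stand-ins, not the paper's Table 6 values.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Synthetic stand-in data: one row per supercomputer, the eight P-Model
# factors as independent variables, and the quartile-based rank-change
# level (0 = significant drop ... 4 = significant rise) as the ordinal
# dependent variable.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(50, 8)),
                 columns=["P", "A", "MS", "MB", "AS", "AB", "I", "PW"])
levels = pd.Series([i % 5 for i in range(50)])
y = levels.astype("category").cat.as_ordered()

model = OrderedModel(y, X, distr="logit")   # proportional-odds model
result = model.fit(method="bfgs", disp=False)
print(result.summary())  # coefficient estimates with z (Wald) statistics
```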

Prior to conducting the ordinal logistic regression analysis, the data must satisfy the basic assumption that each independent variable has the same effect for each one-unit change of the dependent variable. As shown in Table 8, the parallel lines test yielded a significance level of 0.527, higher than 0.05, and so it is valid to perform ordinal logistic regression analysis.

Table 8. Results of the parallel line test

The results of the model fitting test are presented in Table 9. The log-likelihoods of the intercept-only model and the final model with independent variables were 148.491 and 98.360, respectively. The chi-square statistic, the difference between the two models, was 50.131, at a significance level of 0.000. Therefore, this model is valid because the final model, in which the independent variables are added, is significantly better at explaining ranking variations.

Table 9. Result of model fitting test

The goodness-of-fit test was performed to confirm that the analysis data are well suited to the given model. The significance probabilities of the Pearson and Deviance statistics were .914 and 1.000, respectively, both greater than 0.05. Therefore, the data are well suited to the given model, as shown in Table 10.

Table 10. Results of goodness-of-fit

The pseudo R-square, which is the coefficient of determination in a logistic regression model, generally has a lower value than in an ordinary regression model, even if the model is well fitted. The pseudo R-square values for the Nagelkerke, Cox and Snell, and McFadden measures were 0.667, 0.633, and 0.338, respectively, as shown in Table 11.

Table 11. Results of pseudo R-square

The results of the ordinal logistic analysis are shown in Table 12. Only CPU theoretical peak performance, main memory bandwidth, and power consumption were found to be significant independent variables at a significance level of less than 0.01. A larger Wald statistic means that an independent variable has a greater influence on the dependent variable. Therefore, the differences between the HPL ranking and the P-Model ranking were influenced by power consumption, CPU theoretical peak performance, and main memory bandwidth, in order of significance.

CPU theoretical peak performance has a positive coefficient estimate: if CPU theoretical peak performance increases by one unit, the odds of being included in the one-level-higher category of the ranking change increase by 0.000014249335%. This result occurred because the measured values of CPU theoretical peak performance were extreme, ranging from a minimum of 29,120 to a maximum of 128,446,365. In addition, in ordinal logistic regression analysis, when continuous variables are used as the independent variables, the odds ratio indicating changes in the dependent variable is very small per unit. Main memory bandwidth has a positive coefficient estimate: if main memory bandwidth increases by one unit, the odds of being included in the one-level-higher category of the ranking change increase by 3.2%. Power consumption has a negative coefficient estimate: if power consumption increases by one unit, the odds of being included in the one-level-higher category of the ranking change are reduced by 0.002%.
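These percentage changes follow from exponentiating the coefficient estimates: for a coefficient estimate $\hat{\beta}_k$, a one-unit increase in factor $k$ multiplies the odds of falling into a higher rank-change category by the odds ratio

$$ \mathrm{OR}_k = e^{\hat{\beta}_k}, \qquad \%\Delta\,\text{odds} = \big(e^{\hat{\beta}_k} - 1\big) \times 100. $$

For example, the 3.2% increase reported for main memory bandwidth corresponds to a coefficient estimate of approximately $\ln(1.032) \approx 0.031$.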

Table 12. Results of the ordinal logistic analysis

In summary, the results of the ordinal logistic analysis on the differences between the HPL ranking and the P-Model ranking demonstrated that CPU theoretical peak performance and main memory bandwidth have a positive (+) effect, whereas power consumption has a negative effect. In addition, power consumption has the greatest influence on ranking changes, followed by CPU theoretical peak performance and main memory bandwidth. Therefore, supercomputers with low power consumption, high main memory bandwidth, and high CPU theoretical peak performance are more likely to rank higher in the P-Model than in HPL. This is consistent with the finding that the systems ranked 4th, 5th, and 6th in the P-Model, higher than in the HPL ranking, had main memory bandwidths ranked first among the 50 supercomputers.

5. Conclusion

In this paper, we developed composite measures of supercomputer technology that reflect the characteristics of a variety of supercomputer components to overcome the fragmentation of traditional supercomputer performance measures. For this purpose, major supercomputer hardware components and characteristics, namely CPUs, accelerators, memory, interconnection networks, and power consumption, were chosen as factors. Weights were then derived by surveying domestic and foreign experts regarding which factors are important. Furthermore, we applied the concept of cost-effectiveness to derive weights representing the relative importance of supercomputer performance and power consumption. Finally, we developed the model by integrating the factors and weights based on Martino's general scoring model.

As a consequence of applying this model to selected supercomputers on the TOP500 list, the rankings changed significantly except for the supercomputers ranked 1st and 2nd. In addition, we found that the characteristics of the memory and the interconnection networks, which are not reflected in the conventional measures, influenced the ranking. A rank correlation analysis of the proposed model against the representative measures' rankings demonstrated that the proposed model has low correlations with the HPL and HPCG rankings but a high correlation with the Green500 ranking. This indicates that power consumption is a critical factor that significantly influences supercomputer technology. Furthermore, the differences between the HPL ranking and the P-Model ranking were found to be influenced by power consumption, CPU theoretical peak performance, and main memory bandwidth, in order of significance. In conclusion, we demonstrated that the composite measures presented in this study are better suited to comprehensively measuring supercomputer technology than conventional benchmark-based measures. The proposed composite measures are expected to contribute to management decisions and policies in the procurement and operation of supercomputers in the future.

However, it is important to note that this model may produce biased results depending on the targets selected when it is applied in practice. Supercomputers are state-of-the-art facilities whose price and performance change rapidly with time. Therefore, including older systems in the analysis may introduce bias resulting from significant performance gaps. For example, when calculating the ratio of the procurement cost to the operating cost, the K computer incurred significant development costs; thus, its procurement cost ratio was higher than for other supercomputers. Care must therefore be taken in interpreting the results, as distortion may occur depending on the selection of the systems to be compared.

The limitations of this study are as follows. First, some major hardware factors such as I/O systems and storage were not reflected owing to a lack of data. Therefore, additional factors should be further explored, investigated, and analyzed. We also did not consider the cost of the facilities in which a supercomputer is installed as part of the procurement cost. In addition, we did not consider other operating costs such as labor costs, software purchase costs, and other indirect expenses. Therefore, a more accurate model that reflects overall procurement and operation costs may be developed after collecting data from other supercomputer centers. Furthermore, the weights in this study were derived from the survey responses of experts; statistical methods such as factor analysis and principal component analysis, or other survey methods such as the Delphi method or the analytic hierarchy process, could be considered instead. Finally, we compared the models presented in this paper only with the supercomputer performance measures provided by TOP500; we plan to examine these models against various other measures such as Graph500 and IO500 in future work.

Acknowledgement

This research was supported by the KISTI Program (No. K-19-L02-C05-S01), the EDISON Program through the National Research Foundation of Korea (NRF) (No. NRF-2011-0020576), and the NRF and the Center for Women In Science, Engineering and Technology (WISET-2019-233) grant funded by the Ministry of Science and ICT (MSIT) under the Program for Returners for R&D.

References

  1. Gene H. Golub and James M. Ortega, Scientific Computing and Differential Equations: An Introduction to Numerical Methods, 1st Edition, Academic Press, Orlando, 1991.
  2. E. Gallopoulos and A. Sameh, "CSE: content and product," IEEE Computational Science and Engineering, vol. 4, no. 2, pp. 39-43, June, 1997. https://doi.org/10.1109/99.609830
  3. PITAC (President's Information Technology Advisory Committee), Computational Science: Ensuring America's Competitiveness, the National Coordination Office for Information Technology Research and Development, Arlington, 2005.
  4. National Research Council, Getting Up to Speed: The Future of Supercomputing, The National Academies Press, Washington, DC, 2005.
  5. Jack J. Dongarra, Piotr Luszczek and Antoine Petitet, "The LINPACK benchmark: past, present and future," Concurrency and Computation: Practice and Experience, vol. 15, no. 9, pp. 803-820, August, 2003. https://doi.org/10.1002/cpe.728
  6. The TOP500 Supercomputer Sites. https://www.top500.org
  7. National Academies of Sciences, Engineering, and Medicine, Future Directions for NSF Advanced Computing Infrastructure to Support U.S. Science and Engineering in 2017-2020, The National Academies Press, Washington, DC, 2016.
  8. Xiang Chen, Meizhuo Zhang, Minghui Wang, Wensheng Zhu, Kelly Cho and Heping Zhang, "Memory management in genome-wide association studies," BMC Proceedings, vol. 3, no. 7, pp. S54, December, 2009. https://doi.org/10.1186/1753-6561-3-s7-s54
  9. Tzu-Yi Chen, Omid Khalili, Roy L. Campbell, Laura Carrington, Mustafa M. Tikir and Allan Snavely, "Performance Prediction and Ranking of Supercomputers," Advances in Computers, Vol. 72, pp. 135-172, 2008. https://doi.org/10.1016/S0065-2458(08)00003-X
  10. John L. Gustafson and Rajat Todi, "Conventional Benchmarks as a Sample of the Performance Spectrum," Journal of Supercomputing, vol. 13, no. 3, pp. 321-342, May, 1999. https://doi.org/10.1023/A:1008013204871
  11. Ryusuke Egawa, Kazuhiko Komatsu, Yoko Isobe, Toshihiro Kato, Souya Fujimoto, Hiroyuki Takizawa, Akihiro Musa and Hiroaki Kobayashi, "Performance and Power Analysis of SX-ACE using HP-X Benchmark Programs," in Proc. of IEEE Int. Conf. on Cluster Computing, pp. 693-700, September 5-8, 2017.
  12. Vladimir Marjanovic, Jose Gracia and Colin W. Glass, "HPC Benchmarking: Problem Size Matters," in Proc. of Int. Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, pp. 1-10, November 14, 2016.
  13. Wu-chun Feng, Xizhou Feng and Rong Ge, "Green Supercomputing Comes of Age," IT Professional, vol. 10, no. 1, pp. 17-23, January, 2008. https://doi.org/10.1109/MITP.2008.8
  14. Jeffrey S. Vetter, Contemporary High Performance Computing: From Petascale toward Exascale, 1st Edition, Chapman and Hall, New York, 2013.
  15. Francesco Fraternali, Andrea Bartolini, Carlo Cavazzoni and Luca Benini, "Quantifying the Impact of Variability and Heterogeneity on the Energy Efficiency for a Next-Generation Ultra-Green Supercomputer," IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 7, pp. 1575-1588, July, 2018. https://doi.org/10.1109/TPDS.2017.2766151
  16. Stephen W. Keckler, William J. Dally, Brucek Khailany, Michael Garland and David Glasco, "GPUs and the Future of Parallel Computing," IEEE Micro, vol. 31, no. 5, pp. 7-17, September, 2011. https://doi.org/10.1109/MM.2011.89
  17. David Rohr, Sebastian Kalcher, Matthias Bach, Abdulqadir A. Alaqeeliy, Hani M. Alzaidy, Dominic Eschweiler, Volker Lindenstruth, Sakhar B. Alkhereyfy, Ahmad Alharthiy, Abdulelah Almubaraky, Ibraheem Alqwaizy and Riman Bin Suliman, "An Energy-Efficient Multi-GPU Supercomputer," in Proc. of IEEE Int. Conf. on High Performance Computing and Communications, pp.42-45, August 20-22, 2014.
  18. David Rohr, Gvozden Neskovic and Volker Lindenstruth, "The L-CSC cluster: Optimizing power efficiency to become the greenest supercomputer in the world in the Green500 list of November 2014," The Journal of Supercomputing Frontiers and Innovations, vol. 2, no. 3, pp.41-48, July, 2015.
  19. Tapasya Patki, David K. Lowenthal, Barry Rountree, Martin Schulz, and Bronis R. de Supinski, "Exploring hardware overprovisioning in power-constrained, high performance computing," in Proc. of Int. ACM Conf. on supercomputing, pp.173-182, June 10-14, 2013.
  20. J.R. Neely, "The U.S. Government Role in HPC: Technology, Mission, and Policy," in Proc. of Conf. on Comparing High Performance Computing in the U.S. and China, pp.1-16, April 29-30, 2014.
  21. The Green500. https://www.top500.org/green500
  22. Everett Phillips and Massimiliano Fatica, "A CUDA Implementation of the High Performance Conjugate Gradient Benchmark," in Proc. of High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, Springer International Publishing, pp. 68-84, 2015.
  23. Jack Dongarra, Michael A Heroux and Piotr Luszczek, "High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems," The International Journal of High Performance Computing Applications, vol. 30, no. 1, pp. 3-10, February, 2016. https://doi.org/10.1177/1094342015593158
  24. JaeHyuk Kwack and Gregory H. Bauer, "HPCG and HPGMG benchmark tests on multiple program, multiple data (MPMD) mode on Blue Waters-A Cray XE6/XK7 hybrid system," Concurrency and Computation: Practice and Experience, vol. 30, no. 1, October, 2017.
  25. Joseph P. Martino, "A comparison of two composite measures of technology," Technological Forecasting and Social Change, vol. 44, no. 2, pp. 147-159, September, 1993. https://doi.org/10.1016/0040-1625(93)90024-2
  26. Joseph P. Martino, Technological Forecasting for Decision Making, 3rd Edition, McGraw-Hill, New York, 1993.
  27. Shinobu Yoshimura, Muneo Hori and Makoto Ohsaki, High-Performance Computing for Structural Mechanics and Earthquake/Tsunami Engineering, 1st Edition, Springer International Publishing, Switzerland, 2016.
  28. Thomas Sterling, Matthew Anderson and Maciej Brodowicz, High Performance Computing - Modern Systems and Practices, 1st Edition, Morgan Kaufmann, San Francisco, 2017.
  29. Sun Yeong Heo, Duk Joon Chang, Jae Kyoung Shin, "Ordering Items from Ranking Procedures in Survey Research," Survey Research, vol. 9, no. 2, pp.29-49, July, 2008.
  30. Solmaz Salehian and Yonghong Yan, "Evaluation of Knight Landing High Bandwidth Memory for HPC Workloads," in Proc. of Workshop on Irregular Applications: Architectures and Algorithms, pp.10:1-10:4, November 12-17, 2017.
  31. FUJITSU Supercomputer PRIMEHPC FX100. http://www.fujitsu.com/global/products/computing/servers/supercomputer/primehpc-fx100/
  32. Yuichiro Ajima, Takahiro Kawashima, Takayuki Okamoto, Naoyuki Shida, Kouichi Hirai, Toshiyuki Shimizu, Shinya Hiramoto, Yoshiro Ikeda, Takahide Yoshikawa, Kenji Uchida and Tomohiro Inoue, "The Tofu Interconnect D," in Proc. of IEEE Int. Conf. on Cluster Computing, pp.646-654, September 10-13, 2018.
  33. Supercomputer at the Information Technology Center of Nagoya University. http://www.icts.nagoya-u.ac.jp/en/sc/
  34. Alfred P. Rovai, Jason D. Baker and Michael K. Ponton, Social Science Research Design and Statistics: A Practitioner's Guide to Research Methods and IBM SPSS, 2nd Edition, Watertree Press, Chesapeake, 2013.
  35. Kim Soon-Kwi, Logistic regression analysis, Kyowoo, Seoul, 2014.