• Title/Abstract/Keywords: Error classification pattern

Search results: 94 items, processing time 0.02 s

Learning Data Model Definition and Machine Learning Analysis for Data-Based Li-Ion Battery Performance Prediction

  • 김병욱;박지수;장홍준
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 12, No. 3 / pp. 133-140 / 2023
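  • The performance of a lithium-ion battery depends on its operating environment and on the composition ratio of its cathode materials. Developing a high-performance lithium-ion battery requires fabricating batteries with varied cathode material ratios and measuring their performance. However, fabricating and measuring a battery for every combination of variables consumes a great deal of time and money. For this reason, research on predicting battery performance with data-driven artificial intelligence models has recently been active. However, existing public battery datasets were measured on identical batteries, so the cathode material ratio was fixed and was not included as a data attribute. In this paper, we define the learning data model needed to develop an artificial intelligence model that can predict battery performance as a function of the cathode material composition ratio. We analyzed the factors that can affect lithium-ion battery performance and defined the mass of each cathode material and the battery operating environment as input data, and the battery's power output and capacity as target data. Because no public battery dataset includes cathode material ratios, we performed multiple linear regression, support vector regression, multiple logistic regression, and LSTM analyses on limited data in which the cathode material ratios were all set to the same value. Across battery data from different experimental environments, each battery's data retained its own distinct pattern, and the battery classification model classified each battery with an error of about 2%.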
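
The input/target split described in this abstract can be sketched as a small record type. This is only an illustration under stated assumptions: the field names, the two environment attributes (`temperature_c`, `discharge_rate_c`), and the material labels in the usage note are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BatterySample:
    # Inputs: mass of each cathode material, keyed by a material label (hypothetical labels).
    cathode_mass_g: Dict[str, float]
    # Inputs: operating environment (hypothetical attributes, not named in the abstract).
    temperature_c: float
    discharge_rate_c: float
    # Targets: battery power output and capacity.
    power_w: float
    capacity_ah: float


def cathode_ratio(sample: BatterySample) -> Dict[str, float]:
    """Convert per-material masses into composition ratios that sum to 1."""
    total = sum(sample.cathode_mass_g.values())
    if total <= 0:
        raise ValueError("total cathode mass must be positive")
    return {name: mass / total for name, mass in sample.cathode_mass_g.items()}


def to_xy(samples: List[BatterySample],
          materials: List[str]) -> Tuple[List[List[float]], List[List[float]]]:
    """Build feature rows (masses in a fixed material order, then environment) and target rows."""
    features, targets = [], []
    for s in samples:
        row = [s.cathode_mass_g.get(m, 0.0) for m in materials]
        row += [s.temperature_c, s.discharge_rate_c]
        features.append(row)
        targets.append([s.power_w, s.capacity_ah])
    return features, targets
```

For example, `to_xy([BatterySample({"Ni": 6.0, "Co": 2.0, "Mn": 2.0}, 25.0, 1.0, 10.0, 2.5)], ["Ni", "Co", "Mn"])` yields one feature row and one target row that could then be passed to any regression routine.
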

MRI Predictors of Malignant Transformation in Patients with Inverted Papilloma: A Decision Tree Analysis Using Conventional Imaging Features and Histogram Analysis of Apparent Diffusion Coefficients

  • Chong Hyun Suh;Jeong Hyun Lee;Mi Sun Chung;Xiao Quan Xu;Yu Sub Sung;Sae Rom Chung;Young Jun Choi;Jung Hwan Baek
    • Korean Journal of Radiology / Vol. 22, No. 5 / pp. 751-758 / 2021
  • Objective: Preoperative differentiation between inverted papilloma (IP) and its malignant transformation to squamous cell carcinoma (IP-SCC) is critical for patient management. We aimed to determine the diagnostic accuracy of conventional imaging features and histogram parameters obtained from whole tumor apparent diffusion coefficient (ADC) values to predict IP-SCC in patients with IP, using decision tree analysis. Materials and Methods: In this retrospective study, we analyzed data generated from the records of 180 consecutive patients with histopathologically diagnosed IP or IP-SCC who underwent head and neck magnetic resonance imaging, including diffusion-weighted imaging; 62 patients were included in the study. To obtain whole tumor ADC values, the region of interest was placed to cover the entire volume of the tumor. Classification and regression tree analyses were performed to determine the most significant predictors of IP-SCC among multiple covariates. The final tree was selected by cross-validation pruning based on minimal error. Results: Of 62 patients with IP, 21 (34%) had IP-SCC. The decision tree analysis revealed that the loss of convoluted cerebriform pattern and the 20th percentile cutoff of ADC were the most significant predictors of IP-SCC. With these decision trees, the sensitivity, specificity, accuracy, and C-statistics were 86% (18 out of 21; 95% confidence interval [CI], 65-95%), 100% (41 out of 41; 95% CI, 91-100%), 95% (59 out of 62; 95% CI, 87-98%), and 0.966 (95% CI, 0.912-1.000), respectively. Conclusion: Decision tree analysis using conventional imaging features and histogram analysis of whole volume ADC could predict IP-SCC in patients with IP with high diagnostic accuracy.
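
The reported sensitivity, specificity, and accuracy follow directly from the counts in the Results. The sketch below recomputes them and attaches a confidence interval. The Wilson score interval with z = 1.96 is an assumption here, since the abstract does not state how its intervals were computed, and the helper name `wilson_interval` is illustrative.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method, see lead-in)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Counts taken from the abstract: 21 IP-SCC cases, 41 IP cases without malignancy.
true_pos, false_neg = 18, 3
true_neg, false_pos = 41, 0

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
accuracy = (true_pos + true_neg) / (true_pos + false_neg + true_neg + false_pos)

for name, k, n in [("sensitivity", 18, 21), ("specificity", 41, 41), ("accuracy", 59, 62)]:
    low, high = wilson_interval(k, n)
    print(f"{name}: {100 * k / n:.0f}% (95% CI {100 * low:.0f}-{100 * high:.0f}%)")
```
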

Studies on the Time Distribution of Heavy Storms

  • 이근후
    • 한국농공학회지 / Vol. 26, No. 2 / pp. 69-84 / 1984
  • This study was carried out to investigate the time distribution of single storms and to establish a model of storm patterns in Korea. Rainfall recording charts collected from 42 meteorological stations covering the Korean peninsula were analyzed. A single storm was defined as a rain period separated from preceding and succeeding rainfall by 6 hours or more. Among the defined single storms, 1199 storms exceeding a total rainfall of 80 mm were qualified for the study. Storm patterns were classified by the quartile classification method, and the relationship between cumulative percent of rainfall and cumulative storm time was established for each quartile storm group. Time distribution models for each station were prepared through various analytical and inferential procedures. The results are summarized as follows: 1. The percentile frequencies of quartile storms for the first to the fourth quartile were 22.0%, 26.5%, 28.9% and 22.6%, respectively. A large variation of percentile frequency was shown between the same quartile storms. The advanced type storm pattern was predominant in the west coastal [...] type storm patterns predominantly when compared to the single storms with small total rainfalls. 3. The single storms with long storm durations tended to show delayed type storm patterns predominantly when compared to the single storms with short storm durations. 4. The percentile time distribution of quartile storms for 42 rain gauging stations was estimated. Large variations were observed between the percentiles of time distributions of different stations. 5. No significant differences were generally found between the time distribution of rainfalls with greater total rainfall and with less total rainfall. This fact suggests that the size of the total rainfall of single storms was not the main factor affecting the time distribution of heavy storms. 6. Also, no significant differences were found between the time distribution of rainfalls with long duration and with short duration. This fact indicates that the storm duration was not the main factor affecting the time distribution of heavy storms. 7. In Korea, among all single storms, 39.0% showed 80 to 100 mm of total rainfall, which is the mode of the frequency distribution of total rainfalls. The median value of rainfalls for all single storms from the 42 stations was 108 mm. The frequency distribution of total rainfalls was right-skewed. No significant differences were shown in the shape of distribution histograms for total rainfall of quartile storms. The mode of rainfalls for the advanced type quartile storms was 80~100 mm and their frequencies were 39~43% for the respective quartiles. For the delayed type quartile storms, the mode was 80~100 mm and their frequencies were 36~38%. 8. In Korea, 29% of all single storms showed storm durations of 720 to 1080 minutes, which was the highest frequency in the frequency distribution of storm durations. The median of the storm duration for all single storms from 42 stations was 1026 minutes. The frequency distribution was right-skewed. For the advanced type storms, the higher frequency of occurrence was shown by the single storms with short durations, whereas for the delayed type quartile storms, the higher frequency was shown by the long duration single storms. 9. The total rainfall of single storms was positively correlated with storm duration at all the stations throughout the nation. This was also true for most of the quartile storms. 10. Third-order polynomial regression models were established for estimating the time distribution of quartile storms at different stations. The model test by the relative error method showed good agreement between estimated and observed values, with a relative error of less than 0.10 on average.
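
The quartile classification and the cumulative rainfall curve mentioned in this abstract can be illustrated with a short sketch. It assumes that a storm is assigned to the quarter of its duration that received the most rainfall and that the record uses equal-length time steps; the abstract does not spell out either detail, and the function names are illustrative.

```python
from typing import List


def cumulative_percent(increments: List[float]) -> List[float]:
    """Cumulative rainfall as a percentage of the storm total, one value per time step."""
    total = sum(increments)
    if total <= 0:
        raise ValueError("storm total must be positive")
    running = 0.0
    curve = []
    for depth in increments:
        running += depth
        curve.append(100.0 * running / total)
    return curve


def quartile_type(increments: List[float]) -> int:
    """Return 1-4: the quarter of the storm duration that received the most rainfall.

    Assumes equal-length time steps; a step is assigned to the quarter in which it starts.
    """
    n = len(increments)
    if n < 4:
        raise ValueError("need at least four time steps")
    quarter_totals = [0.0, 0.0, 0.0, 0.0]
    for i, depth in enumerate(increments):
        quarter_totals[(4 * i) // n] += depth
    return quarter_totals.index(max(quarter_totals)) + 1


# Synthetic example (not data from the study): an 80 mm storm recorded in 8 equal steps.
storm_mm = [30.0, 25.0, 10.0, 5.0, 4.0, 3.0, 2.0, 1.0]
print(quartile_type(storm_mm), cumulative_percent(storm_mm))
```
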

Development of Statewide Truck Traffic Forecasting Method by Using Limited O-D Survey Data

  • 박만배
    • 대한교통학회:학술대회논문집 / 대한교통학회 27th Conference (1995) / pp. 101-113 / 1995
  • The objective of this research is to test the feasibility of developing a statewide truck traffic forecasting methodology for Wisconsin by using Origin-Destination surveys, traffic counts, classification counts, and other data that are routinely collected by the Wisconsin Department of Transportation (WisDOT). Development of a feasible model will permit estimation of future truck traffic for every major link in the network. This will provide the basis for improved estimation of future pavement deterioration. Pavement damage rises exponentially as axle weight increases, and trucks are responsible for most of the traffic-induced damage to pavement. Consequently, forecasts of truck traffic are critical to pavement management systems. The Pavement Management Decision Supporting System (PMDSS) prepared by WisDOT in May 1990 combines pavement inventory and performance data with a knowledge base consisting of rules for evaluation, problem identification, and rehabilitation recommendation. Without a reasonable truck traffic forecasting methodology, PMDSS is not able to project pavement performance trends in order to make assessments and recommendations in future years. However, none of WisDOT's existing forecasting methodologies has been designed specifically for predicting truck movements on a statewide highway network. For this research, the Origin-Destination survey data available from WisDOT, including two stateline areas, one county, and five cities, are analyzed and the zone-to-zone truck trip tables are developed. The resulting Origin-Destination Trip Length Frequency (OD TLF) distributions by trip type are applied to the Gravity Model (GM) for comparison with comparable TLFs from the GM. The gravity model is calibrated to obtain friction factor curves for the three trip types, Internal-Internal (I-I), Internal-External (I-E), and External-External (E-E). Both "macro-scale" calibration and "micro-scale" calibration are performed.
The comparison of the statewide GM TLF with the OD TLF for the macro-scale calibration does not provide suitable results because the available OD survey data do not represent an unbiased sample of statewide truck trips. For the "micro-scale" calibration, "partial" GM trip tables that correspond to the OD survey trip tables are extracted from the full statewide GM trip table. These "partial" GM trip tables are then merged and a partial GM TLF is created. The GM friction factor curves are adjusted until the partial GM TLF matches the OD TLF. Three friction factor curves, one for each trip type, resulting from the micro-scale calibration produce a reasonable GM truck trip model. A key methodological issue for GM calibration involves the use of multiple friction factor curves versus a single friction factor curve for each trip type in order to estimate truck trips with reasonable accuracy. A single friction factor curve for each of the three trip types was found to reproduce the OD TLFs from the calibration database. Given the very limited trip generation data available for this research, additional refinement of the gravity model using multiple friction factor curves for each trip type was not warranted. In traditional urban transportation planning studies, the zonal trip productions and attractions and region-wide OD TLFs are available. However, for this research, the information available for the development of the GM model is limited to Ground Counts (GC) and a limited set of OD TLFs. The GM is calibrated using the limited OD data, but the OD data are not adequate to obtain good estimates of truck trip productions and attractions. Consequently, zonal productions and attractions are estimated using zonal population as a first approximation. Then, Selected Link based (SELINK) analyses are used to adjust the productions and attractions and possibly recalibrate the GM.
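
The forward step of the calibration loop described above (distribute trips with a gravity model, then compare trip length frequencies) can be sketched as follows. This is a generic production-constrained gravity model, not the authors' implementation. The friction function, the zone data in the usage note, and the bin width are hypothetical.

```python
from typing import Callable, Dict, List


def gravity_trips(productions: List[float],
                  attractions: List[float],
                  impedance: List[List[float]],
                  friction: Callable[[float], float]) -> List[List[float]]:
    """Production-constrained gravity model: T_ij = P_i * A_j * F_ij / sum_k(A_k * F_ik)."""
    trips = []
    for i, produced in enumerate(productions):
        weights = [attractions[j] * friction(impedance[i][j])
                   for j in range(len(attractions))]
        total = sum(weights)
        trips.append([produced * w / total if total > 0 else 0.0 for w in weights])
    return trips


def trip_length_frequency(trips: List[List[float]],
                          impedance: List[List[float]],
                          bin_width: float) -> Dict[int, float]:
    """Share of all trips falling in each impedance bin (bin index -> fraction)."""
    counts: Dict[int, float] = {}
    for i, row in enumerate(trips):
        for j, t in enumerate(row):
            b = int(impedance[i][j] // bin_width)
            counts[b] = counts.get(b, 0.0) + t
    total = sum(counts.values())
    return {b: c / total for b, c in sorted(counts.items())}
```

As a usage example, `gravity_trips([100.0, 50.0], [30.0, 70.0], [[10.0, 20.0], [20.0, 10.0]], lambda t: 1.0 / t ** 2)` returns a trip table whose row sums equal the productions. Calibration would repeat this step with adjusted friction factors until the modeled TLF matches the surveyed one.
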
The SELINK adjustment process involves identifying the origins and destinations of all truck trips that are assigned to a specified "selected link" as the result of a standard traffic assignment. A link adjustment factor is computed as the ratio of the actual volume for the link (ground count) to the total assigned volume. This link adjustment factor is then applied to all of the origin and destination zones of the trips using that "selected link". Selected link based analyses are conducted by using both 16 selected links and 32 selected links. The result of the SELINK analysis using 32 selected links provides the least %RMSE in the screenline volume analysis. In addition, the stability of the GM truck estimating model is preserved by using 32 selected links with three SELINK adjustments, that is, the GM remains calibrated despite substantial changes in the input productions and attractions. The coverage of zones provided by 32 selected links is satisfactory. Increasing the number of repetitions beyond four is not reasonable because the stability of the GM model in reproducing the OD TLF reaches its limits. The total volume of truck traffic captured by 32 selected links is 107% of total trip productions. But more importantly, SELINK adjustment factors for all of the zones can be computed. Evaluation of the travel demand model resulting from the SELINK adjustments is conducted by using screenline volume analysis, functional class and route specific volume analysis, area specific volume analysis, production and attraction analysis, and Vehicle Miles of Travel (VMT) analysis. Screenline volume analysis using four screenlines with 28 check points is used for evaluation of the adequacy of the overall model. The total trucks crossing the screenlines are compared to the ground count totals. LV/GC ratios of 0.958 by using 32 selected links and 1.001 by using 16 selected links are obtained.
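
The link adjustment factor defined above is a simple ratio, and its application to zones can be sketched for a single selected link. How the factor is applied (productions of origin zones, attractions of destination zones) and how factors from several links would be combined are assumptions here. The abstract does not give those details, and all names and numbers are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def link_adjustment_factor(ground_count: float, assigned_volume: float) -> float:
    """Ratio of the observed link volume (ground count) to the total assigned volume."""
    if assigned_volume <= 0:
        raise ValueError("assigned volume must be positive")
    return ground_count / assigned_volume


def apply_selink(productions: Dict[str, float],
                 attractions: Dict[str, float],
                 od_pairs_on_link: Iterable[Tuple[str, str]],
                 factor: float) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Scale productions of origin zones and attractions of destination zones
    whose trips use the selected link (assumed rule; single link only)."""
    pairs = list(od_pairs_on_link)
    origins = {o for o, _ in pairs}
    destinations = {d for _, d in pairs}
    new_p = {z: v * factor if z in origins else v for z, v in productions.items()}
    new_a = {z: v * factor if z in destinations else v for z, v in attractions.items()}
    return new_p, new_a


# Hypothetical numbers: 1200 trucks counted on the link, 1000 assigned by the model.
factor = link_adjustment_factor(1200.0, 1000.0)
new_p, new_a = apply_selink({"A": 100.0, "B": 200.0, "C": 50.0},
                            {"A": 80.0, "B": 150.0, "C": 120.0},
                            [("A", "B")], factor)
```
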
The %RMSE for the four screenlines is inversely proportional to the average ground count totals by screenline. The magnitude of %RMSE for the four screenlines resulting from the fourth and last GM run by using 32 and 16 selected links is 22% and 31% respectively. These results are similar to the overall %RMSE achieved for the 32 and 16 selected links themselves of 19% and 33% respectively. This implies that the SELINK analysis results are reasonable for all sections of the state. Functional class and route specific volume analysis is possible by using the available 154 classification count check points. The truck traffic crossing the Interstate highways (ISH) with 37 check points, the US highways (USH) with 50 check points, and the State highways (STH) with 67 check points is compared to the actual ground count totals. The magnitude of the overall link volume to ground count ratio by route does not provide any specific pattern of over- or underestimation. However, the %RMSE for the ISH shows the least value while that for the STH shows the largest value. This pattern is consistent with the screenline analysis and the overall relationship between %RMSE and ground count volume groups. Area specific volume analysis provides another broad statewide measure of the performance of the overall model. The truck traffic in the North area with 26 check points, the West area with 36 check points, the East area with 29 check points, and the South area with 64 check points is compared to the actual ground count totals. The four areas show similar results. No specific patterns in the LV/GC ratio by area are found. In addition, the %RMSE is computed for each of the four areas. The %RMSEs for the North, West, East, and South areas are 92%, 49%, 27%, and 35% respectively, whereas the average ground counts are 481, 1383, 1532, and 3154 respectively. As for the screenline and volume range analyses, the %RMSE is inversely related to average link volume.
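
The two validation measures used above, %RMSE and the link volume to ground count (LV/GC) ratio, can be sketched as below. The abstract does not give its exact %RMSE formula. This version divides the root mean square error (computed with N in the denominator, an assumption) by the mean ground count, and the example numbers are synthetic.

```python
import math
from typing import Sequence


def percent_rmse(assigned: Sequence[float], counts: Sequence[float]) -> float:
    """Root mean square error of assigned volumes, as a percentage of the mean ground count."""
    n = len(counts)
    if n == 0 or n != len(assigned):
        raise ValueError("need two non-empty sequences of equal length")
    mean_count = sum(counts) / n
    rmse = math.sqrt(sum((a - c) ** 2 for a, c in zip(assigned, counts)) / n)
    return 100.0 * rmse / mean_count


def volume_to_count_ratio(assigned: Sequence[float], counts: Sequence[float]) -> float:
    """Total assigned link volume divided by total ground count (LV/GC)."""
    return sum(assigned) / sum(counts)


# Synthetic check points: the same absolute error on low- and high-volume links.
low_volume = percent_rmse([90.0, 110.0], [100.0, 100.0])
high_volume = percent_rmse([990.0, 1010.0], [1000.0, 1000.0])
print(low_volume, high_volume)
```

Because the mean ground count is in the denominator, the same absolute error gives a larger %RMSE on low-volume links, which is consistent with the inverse relationship reported in the abstract.
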
The SELINK adjustments of productions and attractions resulted in a very substantial reduction in the total in-state zonal productions and attractions. The initial in-state zonal trip generation model can now be revised with a new trip production rate (total adjusted productions/total population) and a new trip attraction rate. Revised zonal production and attraction adjustment factors can then be developed that only reflect the impact of the SELINK adjustments that cause increases or decreases from the revised zonal estimate of productions and attractions. Analysis of the revised production adjustment factors is conducted by plotting the factors on the state map. The east area of the state, including the counties of Brown, Outagamie, Shawano, Winnebago, Fond du Lac, and Marathon, shows comparatively large values of the revised adjustment factors. Overall, both small and large values of the revised adjustment factors are scattered around Wisconsin. This suggests that more independent variables beyond just population are needed for the development of the heavy truck trip generation model. More independent variables, including zonal employment data (office employees and manufacturing employees) by industry type, zonal private trucks owned, and zonal income data, which are not currently available, should be considered. A plot of the frequency distribution of the in-state zones as a function of the revised production and attraction adjustment factors shows the overall adjustment resulting from the SELINK analysis process. Overall, the revised SELINK adjustments show that the productions for many zones are reduced by a factor of 0.5 to 0.8 while the productions for relatively few zones are increased by factors from 1.1 to 4 with most of the factors in the 3.0 range. No obvious explanation for the frequency distribution could be found. The revised SELINK adjustments overall appear to be reasonable.
The heavy truck VMT analysis is conducted by comparing the 1990 heavy truck VMT that is forecasted by the GM truck forecasting model, 2.975 billion, with the WisDOT computed data. This gives an estimate that is 18.3% less than the WisDOT computation of 3.642 billion VMT. The WisDOT estimates are based on sampling the link volumes for USH, STH, and CTH. This implies potential error in sampling the average link volume. The WisDOT estimate of heavy truck VMT cannot be tabulated by the three trip types, I-I, I-E (E-I), and E-E. In contrast, the GM forecasting model shows that the proportion of E-E VMT out of total VMT is 21.24%. In addition, tabulation of heavy truck VMT by route functional class shows that the proportion of truck traffic traversing the freeways and expressways is 76.5%. Only 14.1% of total freeway truck traffic is I-I trips, while 80% of total collector truck traffic is I-I trips. This implies that freeways are traversed mainly by I-E and E-E truck traffic while collectors are used mainly by I-I truck traffic. Other tabulations such as average heavy truck speed by trip type, average travel distance by trip type, and the VMT distribution by trip type, route functional class, and travel speed are useful information for highway planners to understand the characteristics of statewide heavy truck trip patterns. Heavy truck volumes for the target year 2010 are forecasted by using the GM truck forecasting model. Four scenarios are used. For better forecasting, ground count-based segment adjustment factors are developed and applied. ISH 90 & 94 and USH 41 are used as example routes. The forecasting results by using the ground count-based segment adjustment factors are satisfactory for long range planning purposes, but additional ground counts would be useful for USH 41. Sensitivity analysis provides estimates of the impacts of the alternative growth rates including information about changes in the trip types using key routes.
The network-based GM can easily model scenarios with different rates of growth in rural versus urban areas, small versus large cities, and in-state zones versus external stations.
