• Title/Summary/Keyword: statistical test (통계적 테스트)


Automatic Word Spacing of the Korean Sentences by Using End-to-End Deep Neural Network (종단 간 심층 신경망을 이용한 한국어 문장 자동 띄어쓰기)

  • Lee, Hyun Young;Kang, Seung Shik
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.441-448 / 2019
  • Previous research on automatic spacing of Korean sentences has corrected spacing errors by using n-gram based statistical techniques or a morpheme analyzer to insert blanks at word boundaries. In this paper, we propose end-to-end automatic word spacing using a deep neural network. The automatic word spacing problem can be defined as a tag classification problem at the syllable level rather than the word level. For contextual representation between syllables, a Bi-LSTM encodes the dependency relationships between syllables into a fixed-length vector in a continuous vector space using forward and backward LSTM cells. To perform automatic word spacing of Korean sentences, the fixed-length contextual vector produced by the Bi-LSTM is classified into an auto-spacing tag (B or I), and a blank is then inserted in front of each B tag. For tag classification, we build three types of classification networks: a feedforward neural network, a neural network language model, and a linear-chain CRF. To compare the models, we measure automatic word spacing performance for each of the three classification networks. Of these, the linear-chain CRF shows better performance than the other models. We used the KCC150 corpus as training and test data.
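The B/I tagging scheme in this abstract can be illustrated with a small sketch. It covers only the last step (turning per-syllable tags into a spaced sentence) and the inverse step that derives tags from a correctly spaced sentence. It does not implement the Bi-LSTM or the CRF. The function names are hypothetical, and the sketch assumes that the first syllable is tagged B but receives no leading blank.

```python
def tags_from_spaced(sentence: str) -> tuple[str, list[str]]:
    """Derive (unspaced syllables, B/I tags) from a correctly spaced sentence.

    The first syllable of every word is tagged 'B', all others 'I'.
    """
    syllables, tags = [], []
    for word in sentence.split():
        for position, syllable in enumerate(word):
            syllables.append(syllable)
            tags.append("B" if position == 0 else "I")
    return "".join(syllables), tags


def apply_spacing(syllables: str, tags: list[str]) -> str:
    """Insert a blank in front of every 'B' syllable except the first one."""
    if len(syllables) != len(tags):
        raise ValueError("exactly one tag per syllable is required")
    pieces = []
    for index, (syllable, tag) in enumerate(zip(syllables, tags)):
        if tag == "B" and index > 0:  # assumption: no blank before the sentence start
            pieces.append(" ")
        pieces.append(syllable)
    return "".join(pieces)
```

In a full system the tags would come from the classifier rather than from a reference sentence. The two functions are inverses of each other on well-formed input, which makes them convenient for building training labels and for checking predictions.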

3-stage Portfolio Selection Ensemble Learning based on Evolutionary Algorithm for Sparse Enhanced Index Tracking (부분복제 지수 상향 추종을 위한 진화 알고리즘 기반 3단계 포트폴리오 선택 앙상블 학습)

  • Yoon, Dong Jin;Lee, Ju Hong;Choi, Bum Ghi;Song, Jae Won
    • Smart Media Journal / v.10 no.3 / pp.39-47 / 2021
  • Enhanced index tracking is the problem of optimizing an objective function to generate returns above the index, building on index tracking, which follows the market return. To avoid problems such as large transaction costs and illiquidity, we construct a portfolio by selecting only some of the stocks included in the index. Commonly used enhanced index tracking methods try to find the optimal portfolio with a single objective function over all tested periods, but it is almost impossible to find one strategy that always works well in a volatile financial market. In addition, because the statistical characteristics of financial markets change significantly over time, it is important to improve generalization performance beyond optimizing the objective function on the training data, yet existing methods do not address this directly. To solve these problems, this paper proposes ensemble learning that composes a portfolio by combining several objective functions, together with a 3-stage portfolio selection algorithm that selects a portfolio by applying criteria other than the objective function to the training data. In an experiment using the S&P500 index, the proposed method achieves a Sharpe ratio 27% higher than those of the index and the existing methods, showing that the 3-stage portfolio selection algorithm and ensemble learning are effective in selecting an enhanced index portfolio.
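For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a Sharpe ratio is commonly computed from a series of periodic returns. The abstract does not state the risk-free rate, the return frequency, or the annualization convention, so `risk_free` and `periods_per_year` here are assumptions, and the function name is hypothetical.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a list of per-period returns.

    risk_free is the per-period risk-free return (assumed 0 by default);
    periods_per_year=252 assumes daily returns.
    """
    excess = [r - risk_free for r in returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("returns have zero variance")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)
```

A claim such as "27% higher" then corresponds to the ratio `sharpe_ratio(portfolio) / sharpe_ratio(index)` being about 1.27 when both are computed under the same conventions.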

Identification of Sweet Pepper Greenhouse by Analysis of Environmental Data in Greenhouse (온실 내 환경데이터 분석을 통한 파프리카 온실의 식별)

  • Kim, Na-eun;Lee, Kyoung-geun;Lee, Deog-hyun;Moon, Byeong-eun;Park, Jae-sung;Kim, Hyeon-tae
    • Journal of Bio-Environment Control / v.30 no.1 / pp.19-26 / 2021
  • In this study, principal component analysis (PCA) and linear discriminant analysis (LDA) were used to identify three greenhouses located in the same area. The environmental data came from 3 farms in the same area, and the values collected at 1-hour intervals over a total of 4 weeks, from April 1 to April 28, were used. Before analysis, the data were pre-processed by normalization and then divided into 80% training data and 20% test data. The classification accuracy was 57.51% with PCA and 67.06% with LDA, indicating that the data can be classified by greenhouse. Based on farm data classified in advance, data from a new environment can be assigned to a specific group to determine its tendency. Such an approach is judged to increase the utilization of data by facilitating identification.
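The pre-processing described here (normalize, then split 80/20) can be sketched in a few lines of standard-library Python. The abstract does not say which normalization was used, so z-score scaling per column is an assumption, as are the function names and the fixed random seed. The PCA and LDA steps themselves are not shown.

```python
import random
import statistics


def zscore_columns(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    spreads = [statistics.pstdev(col) for col in columns]
    return [
        [(value - m) / s if s else 0.0 for value, m, s in zip(row, means, spreads)]
        for row in rows
    ]


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out test_fraction of them as test data."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (
        [rows[i] for i in train_idx], [labels[i] for i in train_idx],
        [rows[i] for i in test_idx], [labels[i] for i in test_idx],
    )
```

Each row would hold one hourly observation (for example temperature and humidity) and each label the greenhouse it came from.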

Assessment of water resources availability considering complex water use in upstream of the Hantan River Dam (한탄강댐 상류 상세 물이용체계를 고려한 수자원가용량 평가)

  • Jang, Cheol Hee;Kim, Hyeon Jun;Kim, Deok Hwan
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.252-252 / 2020
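  • Areas served by regional water supply systems around large rivers can use water reliably even during droughts, whereas upstream areas that draw on small and medium-sized streams have vulnerable water supply stability during droughts. Therefore, to prepare comprehensive measures for small and medium-sized streams, such as efficient operation of water supply facilities during droughts, assessment of water shortage risk, and optimal use of available water resources, reliable prediction of hydrologic quantities (streamflow and water resources availability) is needed. Existing assessments of streamflow prediction accuracy during droughts have focused on predicting drought conditions from drought indices through statistical regression analysis, so their uncertainty is large. Because they do not consider the complex water use systems of Korean watersheds, their results differ with temporal and spatial scale, and their accuracy is mostly 60% or lower when compared with streamflow based on observed data (Lee Sang-eun et al., 2015). In this study, the watershed upstream of the Hantan River Dam in the Han River basin was selected as a test bed for assessing water resources availability accurately while considering the detailed water use system. Prediction accuracy for water resources availability in this watershed is very low because of the operation of many complex agricultural hydraulic facilities, so it was judged to be a suitable watershed for securing accurate predictions through this study. The model used to assess water resources availability was CAT 3.1 (Catchment hydrologic cycle Assessment Tool 3.1), developed by the Korea Institute of Civil Engineering and Building Technology. CAT 3.1 can assess and predict hydrologic quantities (streamflow and water resources availability) while reflecting artificial water use systems within small and medium-sized stream watersheds (regional water supply, reuse, groundwater abstraction, stream water intake and drainage, etc.). It was developed to combine, as far as possible, the advantages of existing lumped hydrologic models based on conceptual parameters and distributed hydrologic models based on physical parameters. The physical parameters of the watershed upstream of the Hantan River Dam were extracted, as far as possible, from previously built GIS data. Land use is dominated by forest and agricultural areas, so the water use system consists mostly of agricultural water supply. Accordingly, the intake status and specifications of 11 agricultural reservoirs managed by the Korea Rural Community Corporation, groundwater use within the watershed from the National Groundwater Center, and sewage and wastewater treatment volumes were used as basic input data. For the agricultural reservoirs in particular, the model was applied by subdividing the irrigated service areas, based on the watershed upstream of each reservoir outlet point and on the irrigated-area spatial data previously built by the Korea Rural Community Corporation.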


Statistical Evaluation of Moisture Resistance by Mixing Method of Recycled Asphalt Mixtures (혼합방법에 따른 순환아스팔트 혼합물의 수분저항성 통계검정 평가)

  • Kim, Sungun;Kim, Yeongsam;Jo, Youngjin;Kim, Kwangwoo
    • Journal of the Korean Recycled Construction Resources Institute / v.9 no.2 / pp.167-176 / 2021
  • When producing a recycled asphalt mix, it is important that the old binder of the reclaimed asphalt pavement (RAP) is well melted during blending in the mixer. Recycled asphalt mix is conventionally produced by instant mixing (IM) of all materials (RAP, virgin asphalt, and new aggregates) together in the mixer. However, within the same recycled mix, the binder around RAP aggregate was found to show a higher oxidation level than the binder coated around the virgin aggregate, because the old RAP binder was not rejuvenated properly during instant mixing. The partially rejuvenated RAP binder is assumed to be a high-stiffness point in the IM recycled mix. In this study, the stage mixing (SM) method was introduced: RAP and virgin asphalt are blended in the first stage, and then everything is mixed together with hot new aggregates in the second stage. To compare the effect of the two mixing methods on the moisture resistance of recycled mixes, a statistical t-test was performed between SM and IM using indirect tensile strength (ITS) and tensile strength ratio (TSR). Three conditioning methods were used: 16-h freezing followed by 24-h submerging, 48-h submerging, and 72-h submerging in 60℃ water. The TSR (= ITSwet/ITSdry) values of the mixes prepared by SM were clearly higher than those of the IM mixes, and the coefficients of variation of the SM mixes were lower than those of the IM mixes. The ITSwet of SM was also significantly different from that of IM at the α=0.05 level by the statistical t-test. Under more severe conditioning, the ITSwet of the SM mix was reduced less than that of the IM mix. Therefore, it was concluded that stage mixing is an important blending technique for producing better-quality recycled asphalt mixes, which show higher moisture resistance than recycled mixes produced by conventional instant mixing.
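The quantities compared in this abstract (TSR, coefficient of variation, and a two-sample t statistic) can be sketched as follows. The abstract does not say which variant of the t-test was used, so the pooled-variance (Student) form is an assumption here, and all function names are hypothetical. No measured values from the study are reproduced.

```python
import math
import statistics


def tensile_strength_ratio(its_wet, its_dry):
    """TSR = ITSwet / ITSdry, as defined in the abstract."""
    return its_wet / its_dry


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def pooled_t_statistic(sample_a, sample_b):
    """Student's two-sample t statistic and its degrees of freedom.

    Assumes equal variances in both groups (pooled variance).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    return t, dof
```

To decide significance at α=0.05, the absolute value of `t` would be compared with the critical value of the t distribution for `dof` degrees of freedom, taken from a table or a statistics library.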

Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan;Kim, Youngmin
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.263-286 / 2022
  • This study acquires NBA statistical information for a total of 32 years, from 1990 to 2022, using web crawling, observes variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed from the input data through a data cleansing process, and correlation analysis, t-tests, and ANOVA were performed on the remaining variables. For each variable of interest, the difference in means between the teams that advanced to the playoffs and those that did not was tested; to complement this, the mean difference among three groups (upper/middle/lower) based on ranking was then reconfirmed. Of the input data, only the current season's data was used as the test set, and 5-fold cross-validation was performed by dividing the rest into training and validation sets for model training. Overfitting was addressed by comparing the cross-validation result with the final result on the test set and confirming that there was no difference in the performance metrics. Because the quality of the raw data is high and the statistical assumptions are satisfied, most of the models showed good results despite the small data set. This study not only predicts NBA game results or classifies whether a team advances to the playoffs using machine learning, but also examines whether the variables of interest are among the major variables with high importance by analyzing the importance of the input attributes. Visualizing SHAP values made it possible to overcome the limits of interpreting feature importance alone and to compensate for the lack of consistency in importance calculations as variables are added or removed. A number of variables related to three-point shots and turnovers, the subjects of interest in this study, were found to be among the major variables affecting advancement to the NBA playoffs. Like existing work in sports data analysis, this study covers topics such as match results, playoffs, and championship prediction and compares several machine learning models. It differs in that the features of interest are set in advance and statistically verified, and are then compared with the machine learning results. It is also differentiated from existing studies by presenting explanatory visualizations using SHAP, one of the XAI methods.
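The evaluation protocol described here (hold out the latest season as a test set, then run 5-fold cross-validation on the rest) can be sketched with a plain index generator. This is an illustrative sketch only: the function name is hypothetical, folds are contiguous and unshuffled, and the abstract does not say whether the authors shuffled or stratified their folds.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k contiguous folds.

    Fold sizes differ by at most one sample when n_samples is not
    divisible by k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size
```

Rows belonging to the held-out season would be removed before calling `k_fold_indices`. The mean validation score across the five folds is then compared with the score on the held-out season, and a large gap between the two would suggest overfitting.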

A Study of Changes of Inversion Time Effect on Brain Volume of Normal Volunteers (반전 시간의 변화가 정상인의 뇌 체적에 미치는 영향에 대한 고찰)

  • Kim, Ju Ho;Kim, Seong-Hu;Shin, Hwa Seon;Kim, Ji-Eun;Na, Jae Boem;Park, Kisoo;Choi, Dae Seob
    • Investigative Magnetic Resonance Imaging / v.17 no.4 / pp.286-293 / 2013
  • Purpose: The objective of this study was to analyze brain volume in brain images of healthy adults in their 20s acquired with different inversion times (TI). Materials and Methods: Brain images of healthy adults in their 20s were acquired using a magnetization prepared rapid acquisition gradient echo (MPRAGE) pulse sequence with a slice thickness of 1.5 mm and four inversion times (1100 ms, 1000 ms, 900 ms, 800 ms). The acquired brain images were analyzed to measure the volume of white matter (WM), gray matter (GM), and intracranial volume (ICV). The statistical differences according to brain volume and gender were analyzed for each TI. Results: The mean brain volume calculated using Freesurfer was WM = 486.52 ± 48.64 cm³ and GM = 646.83 ± 57.12 cm³ when adjusted by mean ICV = 1278.94 ± 154.92 cm³. Men's brain volumes (WM, GM, ICV) were larger than women's. In the intrarater reliability test, all of the intraclass correlation coefficients were high (0.992 for WM, 0.988 for GM, and 0.997 for ICV). In the repeated measures analysis of variance, GM and ICV did not show a significant difference across TI (GM p=0.143, ICV p=0.052), but WM showed a significant difference (p=0.001). In the linear structure relation analysis, all of the Pearson correlation coefficients were high. Conclusion: WM, GM, and ICV showed high reliability and solid linear structure relations, but WM differed significantly across TI. The brain volumes of healthy adults in their 20s could be used as a reference for comparison with those of patients and to predict structural changes in the brain. Additional studies are needed to examine contrast, SNR, and lesion detection ability at different TI values.

Changes of Qualities of Green Asparagus Packed with Different Types during Low Temperature Storage (포장 방법에 따른 아스파라거스의 저온저장 중 품질변화)

  • Wang, Lixia;Choi, In-Lee;Kang, Ho-Min
    • Journal of Bio-Environment Control / v.29 no.3 / pp.239-244 / 2020
  • The effects on asparagus sensory quality of a 6 kg large unit packed in a carton box (20% open ratio) or an MA box (modified atmosphere package with an oxygen transmission rate of 10,000 cc·m⁻²·day⁻¹·atm⁻¹), and of a 100 g small unit packed with MA film, were evaluated. The CO2 concentration depended largely on the packing unit: it was maintained at around 3% in small MA packages, whereas it increased to 12% in the MA box. Ethylene concentration increased rapidly during the first 3 days of storage in MA packages and then decreased to remain at 5 μL·L⁻¹. Regardless of unit size, weight loss was lower in MA packages. A significant difference in visual quality appeared from the 15th day; on the final day, the MA box was the best and the small MA package the worst. Off-odor was highest in small MA packages and lowest in the carton box (< 3.0). Although there was no significant difference in firmness among all treatments, the packages showed the highest firmness in tips and stems, respectively. The sugar content and hue angle decreased during storage, but there was no statistical difference among treatments. EL was lowest in the small MA package and highest in the carton box. On the 10th day, total aerobic bacteria were lowest in small MA packages, but there was no significant difference on the 20th day. E. coli was not found in any treatment on the 10th day, and it was lowest in the MA box on the 20th day. Mold and yeast were not observed during the whole storage period. Based on these results, the carton box packaged with 10,000 cc OTR film was more effective in maintaining the quality of green asparagus, with a CO2 concentration suitable for asparagus cold storage.

The Statistical Approach-based Intelligent Education Support System (통계적 접근법을 기초로 하는 지능형 교육 지원 시스템)

  • Chung, Jun-Hee
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.109-123 / 2012
  • Many kinds of education systems are offered to students. Content such as school subjects, licenses, and job training is delivered through media such as text, images, and video. Students apply the knowledge they have learned and use it when they learn other things. In existing education systems, the system is often not really helpful because the content delivered is too hard, or because it is too easy and students relearn what they already know. To address this, this paper suggests a method that delivers the most suitable lecture content to each student. Difficulty is relative: content A can be easier than content B for one group of students, while content B can be easier than content A for another group. It is therefore not easy to measure the difficulty of lecture content, and the suggested method takes this into account. The whole lecture content is divided into many lecture modules. Before studying, students answer pattern recognition questions, a kind of pre-test, and the system selects and provides the most suitable lecture module according to their scores. When the system selects a lecture module and delivers it to a student, both the student's answers and the difficulty of the lecture modules are considered. In existing education systems, one kind of content is delivered to all students, and identical content delivered to varied students is not delivered efficiently. The suggested system instead selects suitable content using the students' pattern recognition answers. A pattern recognition question is a pre-test question developed on the basis of a lecture module and used to recognize whether the student already knows the content of that module. Because the difficulty of a lecture module reflects the scores of all students' answers, the difficulty changes whenever a student submits an answer. The suggested system measures the relative knowledge of the students from their answers and assigns the difficulty accordingly. The suggested method applies only when the order of the lecture content has nothing to do with the progress of the lecture; if the content of unit 1 must be studied before unit 2, the method is not applied. In this paper, the method is introduced using the subject "English grammar", in which order is not important. If the method is applied properly, students who lack basic knowledge will learn the basic content well and build a foundation for harder lecture content, while students who already know the content will not study it again and will save time to learn a wider range of content. If the method, introduced here with "English grammar", is applied to other education systems such as primary education, secondary education, and job education, further improvements can be expected, and the paper suggests a direction for realizing this. The suggested method is implemented with a MySQL database and Java/JSP programs. It is hoped that further research on the method will contribute to the development of education in Korea.
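The selection idea in this abstract (module difficulty derived from all students' pre-test answers, updated on every submission, then used together with the individual student's answers to pick a module) can be sketched as below. The abstract does not give the actual formulas, so everything here is a hypothetical illustration: difficulty is taken as the share of wrong answers so far, and the selection rule picks the easiest module whose pre-test question the student answered incorrectly. The original system uses MySQL and Java/JSP, while this sketch uses in-memory Python objects.

```python
from dataclasses import dataclass


@dataclass
class LectureModule:
    name: str
    correct: int = 0   # correct pre-test answers from all students so far
    attempts: int = 0  # total pre-test answers from all students so far

    @property
    def difficulty(self) -> float:
        """Share of wrong answers so far; 0.5 when there is no data yet (assumption)."""
        if self.attempts == 0:
            return 0.5
        return 1.0 - self.correct / self.attempts

    def record(self, is_correct: bool) -> None:
        """Update the running counts, which changes the difficulty."""
        self.attempts += 1
        self.correct += int(is_correct)


def select_module(modules, answers):
    """Record one student's answers, then pick a module for that student.

    answers maps module name -> True if the student answered its
    pre-test question correctly. Returns the easiest module the student
    does not yet know, or None if every answer was correct.
    """
    for module in modules:
        module.record(answers[module.name])
    unknown = [m for m in modules if not answers[m.name]]
    return min(unknown, key=lambda m: m.difficulty, default=None)
```

Because `record` runs on every submission, the same module can rank as easy for one cohort and hard for a later one, which reflects the abstract's point that difficulty is relative.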

DIFFERENCE IN BOND STRENGTH ACCORDING TO FILLING TECHNIQUES AND CAVITY WALLS IN BOX-TYPE OCCLUSAL COMPOSITE RESIN RESTORATION (박스 형태의 복합레진 수복시 충전법 및 와동벽에 따른 결합력 차이에 관한 연구)

  • Ko, Eun-Joo;Shin, Dong-Hoon
    • Restorative Dentistry and Endodontics / v.34 no.4 / pp.350-355 / 2009
  • Bond strength depends on the characteristics of the bonding surface and on the restorative technique. Most studies of dentin bond strength have been carried out on flat bonding surfaces, so the difference in bond strength between the axial wall and the pulpal wall is not yet clear. This study evaluated the difference in bonding between cavity walls in class I composite resin restorations with different filling techniques. Twenty extracted caries-free human third molars were used. Box-type cavities of 6 × 4 × 3 mm were prepared and divided into four groups according to filling technique and bonding surface: Group I; bulk filling - pulpal wall, Group II; bulk filling - axial wall, Group III; incremental filling - pulpal wall, Group IV; incremental filling - axial wall. Cavities were filled with Filtek Z250® (3M/ESPE, USA) and Clearfil SE Bond® (Kuraray, Japan). After 24-hour storage in 37°C water, the resin-bonded teeth were sectioned bucco-lingually at the center of the cavity. Specimens were vertically sectioned into 1.0 × 1.0 mm thick serial sticks perpendicular to the bond surface using a low-speed diamond saw (Accutom 50, Struers, Copenhagen, Denmark) under water cooling. The trimmed specimens were attached to the testing device, which in turn was placed in a universal testing machine (EZ test, Shimadzu Co., Kyoto, Japan) for micro-tensile testing at a cross-head speed of 1 mm/min. The results were statistically analyzed using 2-way ANOVA and t-test at a 95% confidence level. The results were as follows: 1. There was no significant difference between bulk filling and incremental filling. 2. There was no significant difference between the pulpal wall and the axial wall, either. Within the limits of this study, it was concluded that microtensile bond strength was not affected by the filling technique or by the site of the cavity wall.