Title/Summary/Keyword: experimental techniques


Reverse engineering technique on the evaluation of impression accuracy in angulated implants (경사진 임플란트에서 임플란트 인상의 정확도 평가를 위한 역공학 기법)

  • Jung, Hong-Taek; Lee, Ki-Sun; Song, So-Yeon; Park, Jin-Hong; Lee, Jeong-Yol
    • The Journal of Korean Academy of Prosthodontics, v.59 no.3, pp.261-270, 2021
  • Purpose. The aim of this study was (1) to compare the reverse engineering technique with other existing measurement methods and (2) to analyze the effect of implant angulation and impression coping type on implant impression accuracy using the reverse engineering technique. Materials and methods. Three master models were fabricated, and the distance between the two implant center points in the parallel master model was measured with three different methods: digital caliper measurement (Group DC), optical measurement (Group OM), and reverse engineering (Group RE). Ninety experimental models were fabricated with three types of impression copings for the three implant angulations, and the angular and distance error rates were calculated. One-way ANOVA was used for comparison among the evaluation methods (P < .05), and the error rates of the experimental groups were analyzed by two-way ANOVA (P < .05). Results. While there was a significant difference between Groups DC and RE (P < .05), Group OM showed no significant difference from the other groups (P > .05). The standard deviations of the reverse engineering measurements were much lower than those of the digital caliper and optical measurements. The hybrid groups showed no significant difference from the pick-up groups in distance error rates (P > .05). Conclusion. The reverse engineering technique demonstrated its potential as a technique for evaluating the 3D accuracy of impression methods.
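For readers who want to reproduce the statistical design, the comparison among the three evaluation methods is a standard one-way ANOVA; a minimal Python sketch follows, with placeholder distance measurements standing in for the study's data:

```python
# Minimal sketch of the study's method comparison using SciPy.
# The measurement values below are illustrative placeholders, not the paper's data.
from scipy import stats

# Inter-implant distances (mm) measured on the same parallel master model
group_dc = [10.02, 10.05, 9.98, 10.07, 10.01]   # digital caliper
group_om = [10.01, 10.03, 10.00, 10.02, 10.01]  # optical measurement
group_re = [10.00, 10.01, 10.00, 10.00, 10.01]  # reverse engineering

# One-way ANOVA across the three evaluation methods (alpha = .05)
f_stat, p_value = stats.f_oneway(group_dc, group_om, group_re)
print(f"F = {f_stat:.3f}, P = {p_value:.4f}")
if p_value < 0.05:
    print("At least one method differs; follow with post-hoc pairwise tests.")
```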

Application of Automated Microscopy Equipment for Rock Analog Material Experiments: Static Grain Growth and Simple Shear Deformation Experiments Using Norcamphor (유사물질 실험을 위한 자동화 현미경 실험 기기의 적용과 노캠퍼를 이용한 입자 성장 및 단순 전단 변형 실험의 예)

  • Ha, Changsu; Kim, Sungshil
    • Economic and Environmental Geology, v.54 no.2, pp.233-245, 2021
  • Many studies of microstructures in rocks have used experimental methods with various equipment, alongside studies of natural rocks, to observe the development of microstructures and understand their mechanisms. Grain boundary migration in mineral aggregates, one of the main recrystallization mechanisms, can cause grain growth or grain size changes during metamorphism or deformation. This study presents improved methods for rock analog material experiments, using modified equipment that allows sequential observation of grain boundary migration; the approach can be more efficient than existing techniques and supports appropriate microstructure analysis. The modified equipment enables optical manipulation by mounting rotatable polarizing plates on a stereoscopic microscope together with a deformation rig for analog materials. Microcontrollers and accompanying software automatically control the temperature and strain rate of the deformation rig and capture digital photomicrographs at constant time intervals during the experiment, so that any microstructural changes can be observed. Composite images synthesized from photographs taken at different polarizer rotations reveal grain boundaries more accurately. Norcamphor (C7H10O), which has a birefringence similar to that of quartz, was used as the rock analog material. Static grain growth and simple shear deformation experiments were performed with norcamphor to verify the effectiveness of the equipment. The static grain growth experiments showed typical grain growth behavior: the number of grains decreased and the average grain size increased over time, and the growth curves differed clearly among the three temperature conditions. The simple shear deformation experiment under medium temperature and a low strain rate showed no significant change in average grain size, but the elongation of grain shapes increased in a direction of about 53° from the direction perpendicular to the shear direction as shear strain accumulated over time. These microstructures are interpreted as reflecting a balance, under the given experimental conditions, between plastic deformation and internal recovery within the grains. These experiments demonstrate that the modified equipment can sequentially record microstructural changes throughout an analog material experiment as desired.
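The automation described in this abstract follows a simple control-and-capture loop; a minimal Python sketch follows, with the rig and camera classes written as simulated stubs standing in for whatever microcontroller bindings the actual equipment uses, and all settings illustrative:

```python
# Sketch of the automated capture loop described above. The hardware classes
# are simulated stubs; real equipment would replace them with microcontroller
# bindings (e.g., over a serial link). Timing and temperature values are illustrative.
import time

class DeformationRig:
    """Stub for the microcontroller-driven deformation rig."""
    def set_temperature(self, celsius: float) -> None:
        print(f"rig: holding {celsius:.1f} C")
    def set_strain_rate(self, rate_per_s: float) -> None:
        print(f"rig: strain rate {rate_per_s:.1e} /s")

class PolarizerCamera:
    """Stub for the stereomicroscope camera with rotatable polarizing plates."""
    def capture(self, polarizer_deg: float) -> str:
        name = f"frame_t{time.time():.0f}_p{polarizer_deg:.0f}.png"
        print(f"camera: captured {name}")
        return name

def run_experiment(rig, cam, temp_c, strain_rate, interval_s, n_steps, angles):
    rig.set_temperature(temp_c)
    rig.set_strain_rate(strain_rate)
    for step in range(n_steps):
        # One composite per time step: a photo at each polarizer rotation,
        # later merged so grain boundaries show regardless of grain orientation.
        frames = [cam.capture(a) for a in angles]
        print(f"step {step}: composite from {len(frames)} rotations")
        time.sleep(interval_s)

run_experiment(DeformationRig(), PolarizerCamera(),
               temp_c=28.0, strain_rate=1e-5, interval_s=1.0,
               n_steps=3, angles=[0, 45, 90, 135])
```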

A Comprehensive Review of Geological CO2 Sequestration in Basalt Formations (현무암 CO2 지중저장 해외 연구 사례 조사 및 타당성 분석)

  • Hyunjeong Jeon; Hyung Chul Shin; Tae Kwon Yun; Weon Shik Han; Jaehoon Jeong; Jaehwii Gwag
    • Economic and Environmental Geology, v.56 no.3, pp.311-330, 2023
  • Development of Carbon Capture and Storage (CCS) techniques is becoming increasingly important as a means to mitigate global warming driven by the unprecedented increase in anthropogenic CO2 emissions. In recent years, the characteristics of basaltic rocks (i.e., large volume, high reactivity, and a surplus of cation components) have been recognized as potentially favorable for CCS; accordingly, research on using basaltic formations for underground CO2 storage is ongoing in various fields. This study investigated the feasibility of underground CO2 storage in basalt, based on an examination of subsurface CO2 storage mechanisms, an assessment of basalt characteristics, and a review of global research on basaltic CO2 storage. The studies examined were classified as experimental, modeling, or field demonstration work according to the methods used. Experimental conditions spanned temperatures of 20 to 250 ℃, pressures of 0.1 to 30 MPa, and rock-fluid reaction times from several hours to four years. Modeling studies constructed models resembling potential storage sites and examined changes in fluid dynamics and geochemical factors before and after CO2-fluid injection. The investigation showed that basalt has a large potential CO2 storage capacity along with the capacity for rapid mineralization reactions; these factors lessen the environmental constraints (i.e., temperature, pressure, and geological structure) generally required for CO2 storage. The success of the major field demonstration projects, the CarbFix project and the Wallula project, indicates that basalt is a promising geological formation for CCS. However, using basalt as a storage formation involves additional conditions that must be carefully considered: the mineralization mechanism can vary significantly with factors such as basalt composition and injection zone properties; for instance, precipitation of carbonate and silicate minerals can reduce injectivity into the formation. In addition, there is a risk of polluting the subsurface environment through the combination of pressure increase and induced rock-CO2-fluid reactions upon injection. Because CO2 must be dissolved into fluids prior to injection, monitoring techniques different from conventional methods are needed. Hence, to achieve efficient and stable underground CO2 storage in basalt, it is necessary to select a suitable storage formation, accumulate diverse field databases, and conduct systematic research combining experiments, modeling, and field studies to develop a comprehensive understanding of the potential storage site.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim; Kim, Ji Hui; Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems, v.26 no.1, pp.1-21, 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been used in many industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting business strategies, and there is continuous demand in many fields for market information at the level of specific products. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and appropriate figures. We therefore propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network-based semantic word embedding model, to estimate market sizes automatically from individual companies' product information in a bottom-up manner. The overall process is as follows. First, data related to product information is collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales figures of the extracted products are summed to estimate the market size of each product group. As experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) was mapped into a multidimensional vector space by Word2Vec training. After parameter optimization, a vector dimension of 300 and a window size of 15 were used in further experiments. Index words of the Korean Standard Industry Classification (KSIC) were employed as a product name dataset to cluster product groups more efficiently: product names similar to KSIC index words were extracted based on cosine similarity, and the market size of each extracted product category was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that depend on sampling or on multiple assumptions. In addition, the level of market category can be adjusted easily and efficiently to the purpose of the analysis by changing the cosine similarity threshold. Furthermore, the method has high potential for practical application, since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be used in technology evaluation and technology commercialization support programs run by governmental institutions, as well as in business strategy consulting and market analysis reports published by private firms. The limitation of our study is that the presented model needs to be improved in accuracy and reliability. The semantic word embedding module could be advanced by imposing a proper ordering on the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec. The product group clustering could also be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect them to further improve the performance of the basic model conceptually proposed here.
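The estimation pipeline maps naturally onto off-the-shelf tooling; a minimal sketch using gensim's Word2Vec follows, where the corpus, seed word, similarity threshold, and sales table are toy placeholders for the Statistics Korea microdata and KSIC index words:

```python
# Sketch of the bottom-up estimation pipeline. The corpus, seed keyword, and
# sales table are toy placeholders; the paper used Statistics Korea microdata
# with vector_size=300 and window=15 after parameter optimization.
from gensim.models import Word2Vec

product_names = [                      # tokenized product-name records
    ["stainless", "kitchen", "knife"],
    ["ceramic", "kitchen", "knife"],
    ["chef", "knife", "set"],
    ["office", "desk", "chair"],
]
sales = {"stainless kitchen knife": 1.2e6, "ceramic kitchen knife": 8.0e5,
         "chef knife set": 2.5e6, "office desk chair": 3.0e6}

model = Word2Vec(product_names, vector_size=300, window=15, min_count=1)

# Cluster a product group around a seed word (a KSIC index word in the paper)
# by cosine similarity; the threshold controls how broad the group is. On this
# toy corpus the grouping is illustrative only.
seed, threshold = "knife", 0.2
group = {seed} | {w for w, sim in model.wv.most_similar(seed, topn=20)
                  if sim >= threshold}

# Sum the sales of every product whose name contains a clustered term.
market_size = sum(v for name, v in sales.items()
                  if any(tok in name.split() for tok in group))
print(f"estimated market size for '{seed}' group: {market_size:,.0f}")
```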

A New Exploratory Research on Franchisor's Provision of Exclusive Territories (가맹본부의 배타적 영업지역보호에 대한 탐색적 연구)

  • Lim, Young-Kyun; Lee, Su-Dong; Kim, Ju-Young
    • Journal of Distribution Research, v.17 no.1, pp.37-63, 2012
  • In franchise business, exclusive sales territory protection is a very important issue from economic, social, and political points of view. It affects the growth and survival of both franchisor and franchisee and often raises social and political conflicts. When a franchisee is not familiar with the related laws and regulations, the franchisor has a strong incentive to exploit this. Exclusive sales territory protection by the manufacturer and distributors (wholesalers or retailers) means a sales area restriction under which only certain distributors have the right to sell products or services. A distributor who has been granted an exclusive sales territory can protect its own territory but may be prohibited from entering other regions. Even though exclusive sales territory is a critical problem in franchise business, there is little rigorous research, based on empirical data, on its reasons, results, and evaluation, or on future directions. This paper addresses the problem not only through logical and nomological validity but also through empirical validation. In pursuing an empirical analysis, we take into account the difficulties of real data collection and of statistical analysis techniques: we use a set of disclosure document data collected by the Korea Fair Trade Commission instead of the conventional survey method, which is often criticized for measurement error. Existing theories about exclusive sales territories can be summarized into two groups. The first concerns the effectiveness of exclusive sales territories from both the franchisor's and the franchisee's points of view: the outcome can be positive for franchisors but negative for franchisees, and positive in terms of sales but negative in terms of profit, so variables and viewpoints must be set carefully. The second concerns the motives for protecting exclusive sales territories, which can be classified into four groups: industry characteristics, franchise system characteristics, the capability to maintain an exclusive sales territory, and strategic decisions, each with more specific variables and theories. Based on these theories, we develop nine hypotheses. To validate them, data was collected from the FTC's publicly available homepage. The sample consists of 1,896 franchisors and contains about three years of operating data, from 2006 to 2008. Within the sample, 627 franchisors have an exclusive sales territory protection policy, and those with the policy are not evenly distributed over the 19 representative industries. Additional data was collected from other government sources, such as Statistics Korea, and combined with various secondary sources to create meaningful variables. All variables were dichotomized by mean or median split if not inherently dichotomous, since each hypothesis involves multiple variables and there is no solid statistical technique that incorporates all these conditions in a single test. This paper uses a simple chi-square test because the hypotheses and theories are built upon quite specific conditions, such as industry type, economic condition, company history, and various strategic purposes. It is almost impossible to find samples that satisfy all of these conditions, and they cannot be manipulated in experimental settings; more advanced statistical techniques work well on clean data without exogenous variables, but not on complex real data. The chi-square test is applied by grouping samples into four cells using two criteria: whether they protect an exclusive sales territory or not, and whether they satisfy the conditions of each hypothesis. The test then asks whether the proportion of franchisors that satisfy the conditions and protect an exclusive sales territory significantly exceeds the proportion that satisfy the conditions and do not. In fact, the chi-square test is equivalent to a Poisson regression, which allows more flexible application. As a result, only three hypotheses are accepted. When attitude toward risk is high, so that the royalty fee is determined by sales performance, exclusive territory protection yields poor results, as expected. When the franchisor protects exclusive territories in order to recruit franchisees more easily, protection yields better results. And when protection is intended to improve the efficiency of the franchise system as a whole, it shows better performance: high efficiency is achieved because an exclusive territory prevents free riding by franchisees who would exploit others' marketing efforts, encourages proper investment, and distributes franchisees evenly across regions. The other hypotheses are not supported by the significance tests. Exclusive sales territories should be protected for proper motives and administered for mutual benefit. Legal restrictions driven by a government agency like the FTC can be misused and cause misunderstandings, so real practices need more careful monitoring and more rigorous study by both academics and practitioners.
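The hypothesis tests reduce to a 2×2 chi-square test of independence; a minimal sketch with SciPy follows, where the row totals match the paper's sample (627 protecting, 1,269 not) but the within-row counts are invented for illustration:

```python
# Sketch of the paper's hypothesis-testing design: a 2x2 chi-square test of
# independence. The within-row counts are illustrative placeholders.
from scipy.stats import chi2_contingency

#                         satisfies condition   does not satisfy
observed = [[180, 447],   # protects exclusive sales territory   (627 total)
            [260, 1009]]  # does not protect                     (1,269 total)

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, dof = {dof}, P = {p:.4f}")
if p < 0.05:
    print("Condition and territory protection are not independent.")
```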


Establishment and Application of Molecular Genetic Techniques for Preimplantation Genetic Diagnosis of Osteogenesis Imperfecta (골형성부전증의 착상전 유전진단을 위한 분자유전학적 방법의 조건 확립과 적용)

  • Kim, Min-Jee; Lee, Hyoung-Song; Choi, Hye-Won; Lim, Chun-Kyu; Cho, Jae-Won; Kim, Jin-Young; Song, In-Ok; Kang, Inn-Soo
    • Clinical and Experimental Reproductive Medicine, v.35 no.2, pp.99-110, 2008
  • Objectives: Preimplantation genetic diagnosis (PGD) has become an assisted reproductive technique for couples carrying genetic conditions that may affect their offspring. Osteogenesis imperfecta (OI) is an autosomal dominant disorder of connective tissue characterized by bone fragility and low bone mass. At least 95% of cases are caused by dominant mutations in COL1A1 or COL1A2. In this study, we report our experience and clinical outcomes with 5 PGD cycles for OI in two couples. Methods: Before clinical PGD, we assessed the amplification rate and allele drop-out (ADO) rate of an alkaline lysis and nested PCR protocol using single lymphocytes from heterozygous patients in pre-clinical diagnostic tests for OI. We performed 5 cycles of PGD for OI by nested PCR for the causative mutation loci, COL1A1 c.2452G>A in case 1 and c.3226G>A in case 2. The PCR products were analyzed by agarose gel electrophoresis, restriction fragment length polymorphism (RFLP) analysis with the HaeIII restriction enzyme in case 1, and direct DNA sequencing. Results: We confirmed the causative mutation loci, COL1A1 c.2452G>A in case 1 and c.3226G>A in case 2. In the pre-clinical tests, the amplification rate was 94.2% and the ADO rate 22.5% in case 1, versus 98.1% and 1.9%, respectively, in case 2. In case 1, a total of 34 embryos were analyzed and 31 (91.2%) were successfully diagnosed over 3 PGD cycles. Eight of the 19 embryos diagnosed as unaffected were transferred across the 3 cycles; in the third cycle, pregnancy was achieved and a healthy baby was delivered without complications in July 2005. In case 2, all 19 embryos (100.0%) were successfully diagnosed and 4 of the 11 unaffected embryos were transferred over 2 cycles. Pregnancy was achieved in the second cycle and a healthy baby was delivered in March 2008. The causative locus was confirmed as normal by amniocentesis and postnatal diagnosis. Conclusions: To our knowledge, these two cases are the first successful PGD for OI in Korea. Our experience further demonstrates that PGD is a reliable and effective clinical technique and a useful option for couples at high risk of transmitting a genetic disease.
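The two pre-clinical metrics are simple proportions over single-cell PCR outcomes; a minimal sketch of their definitions follows, with placeholder counts rather than the study's raw numbers:

```python
# Sketch of how the pre-clinical single-cell PCR metrics are defined. The
# counts below are illustrative placeholders, not the study's raw data; the
# paper reports rates of 94.2%/22.5% (case 1) and 98.1%/1.9% (case 2).
def amplification_rate(cells_amplified: int, cells_tested: int) -> float:
    """Fraction of single lymphocytes yielding a PCR product."""
    return cells_amplified / cells_tested

def ado_rate(one_allele_only: int, heterozygous_amplified: int) -> float:
    """Allele drop-out: amplified heterozygous cells showing only one allele."""
    return one_allele_only / heterozygous_amplified

tested, amplified, dropouts = 52, 49, 11   # placeholder counts
print(f"amplification rate: {amplification_rate(amplified, tested):.1%}")
print(f"ADO rate: {ado_rate(dropouts, amplified):.1%}")
```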

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems, v.24 no.1, pp.167-181, 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, CNNs (Convolutional Neural Networks), known as an effective solution for recognizing and classifying images or voices, have been widely applied to classification and prediction problems. In this study, we investigate how to apply CNNs to business problem solving. Specifically, we propose applying a CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNNs are strong at interpreting images, so the model proposed in this study adopts a CNN as a binary classifier that predicts the stock market direction (upward or downward) from time series graphs given as inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future price movements. Our proposed model, CNN-FG (Convolutional Neural Network using Fluctuation Graph), consists of five steps. In the first step, it divides the dataset into intervals of 5 days. It then creates time series graphs for the divided dataset in step 2; each graph is drawn as a 40×40-pixel image, with each independent variable drawn in a different color. In step 3, the model converts the images into matrices: each image becomes a combination of three matrices expressing its color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the graph-image dataset into training and validation sets; we used 80% of the total dataset for training and the remaining 20% for validation. Finally, the CNN classifiers are trained on the training images. Regarding the parameters of CNN-FG, we adopted two convolution filters (5×5×6 and 5×5×9) in the convolution layers and a 2×2 max pooling filter in the pooling layer. The two hidden layers had 900 and 32 nodes, respectively, and the output layer had 2 nodes (one for the prediction of an upward trend and the other for a downward trend). The activation function for the convolution and hidden layers was ReLU (Rectified Linear Unit), and that for the output layer was the softmax function. To validate CNN-FG, we applied it to the prediction of the KOSPI200 over 2,026 days in eight years (2009 to 2016). To match the proportions of the two classes of the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. We then built the training dataset from 80% of the total (1,560 samples) and the validation dataset from the remaining 20% (390 samples). The independent variables of the experimental dataset were twelve technical indicators popularly used in previous studies, including Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), the A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), and CCI (commodity channel index). To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on those graphs can be effective in terms of prediction accuracy. Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
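The architecture is specified closely enough in the abstract to sketch; a Keras version follows, with the caveat that the abstract does not state where the pooling layer sits or whether padding is used, so the ordering below is an assumption:

```python
# Sketch of the CNN-FG architecture as described in the abstract, in Keras.
# Pool placement after the first convolution is an assumption; the abstract
# only specifies the filter shapes, pooling size, and layer widths.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(40, 40, 3)),               # 40x40 RGB fluctuation graph
    layers.Conv2D(6, (5, 5), activation="relu"),   # "5x5x6" filter bank
    layers.MaxPooling2D((2, 2)),                   # 2x2 max pooling
    layers.Conv2D(9, (5, 5), activation="relu"),   # "5x5x9" filter bank
    layers.Flatten(),
    layers.Dense(900, activation="relu"),          # first hidden layer
    layers.Dense(32, activation="relu"),           # second hidden layer
    layers.Dense(2, activation="softmax"),         # upward vs. downward
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
# Training would use the 80/20 split described above:
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=...)
```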

The efficacy and safety of transcatheter closure of atrial septal defect with Amplatzer septal occluder in young children less than 3 years of age (3세 미만 심방중격결손 소아에서 Amplatzer 기구 폐쇄술의 안전성 및 효용성)

  • Lee, Soo Hyun; Choi, Deok Young; Kim, Nam Kyun; Choi, Jae Young; Sul, Jun Hee
    • Clinical and Experimental Pediatrics, v.52 no.4, pp.494-498, 2009
  • Purpose : The applicability of transcatheter closure of atrial septal defect (ASD) has expanded with accumulating clinical experience and the evolution of the device. This study was performed to evaluate the safety and efficacy of transcatheter closure of ASD with the Amplatzer septal occluder (ASO) in young children less than 3 years of age. Methods : From May 2003 to December 2005, 295 patients underwent transcatheter closure of ASD with the ASO at the Severance Cardiovascular Hospital, Yonsei University Health System. Among them, 51 patients less than 3 years of age were enrolled in this study. We investigated the procedural success rate, rate of residual shunt, frequency of complications, procedure and fluoroscopy times, and the need for modified implantation techniques. Results : The median age was 2.1 years and the median body weight was 12 kg. Device implantation was successful in 50 patients (98%). Seven patients (15%) showed a small residual shunt 1 day after the procedure, but complete occlusion was documented at the 6-month follow-up in all patients (100%). The pulmonary-to-systemic flow ratio (Qp/Qs), peak systolic pulmonary artery pressure, and peak systolic right ventricular pressure decreased significantly after closure of the ASD. There were 2 complications: device embolization (1, 2%) and temporary groin hematoma (1, 2%). Conclusion : Transcatheter closure of ASD with the ASO can be performed with satisfactory results and acceptable risk even in young children less than 3 years of age, suggesting that even very young children with ASD need not wait until they reach a sufficient size for transcatheter closure.

GENE EXPRESSION PATTERNS INDUCED BY TAXOL® AND CYCLOSPORIN A IN ORAL SQUAMOUS CELL CARCINOMA CELL LINE USING CDNA MICROARRAY (cDNA Microarray를 이용한 구강편평세포암종 세포주에서 Taxol®과 Cyclosporin A로 유도된 유전자 발현양상)

  • Kim, Yong-Kwan; Lee, Jae-Hoon; Kim, Chul-Hwan
    • Maxillofacial Plastic and Reconstructive Surgery, v.28 no.3, pp.202-212, 2006
  • It is well known that paclitaxel (Taxol®), which is extracted from the Pacific and English yews, has been used as a chemotherapeutic agent for ovarian carcinoma and advanced breast carcinoma, and that Cyclosporin A, a highly lipophilic cyclic peptide isolated from a fungus, has been used as a useful immunosuppressive drug after transplantation and is associated with cellular apoptosis. Since 1953, when James Watson, Rosalind Franklin, and Francis Crick discovered the double helical structure of DNA, several techniques for identifying gene expression have been developed. In the postgenomic period, many researchers have used DNA microarrays, a high-throughput screening technique, to screen the expression of large numbers of genes simultaneously. In this study, we screened gene expression in oral squamous cell carcinoma cell lines treated with Taxol®, Cyclosporin A, or Cyclosporin A combined with Taxol®, using a cDNA microarray. The results were as follows. 1. The appropriate concentrations of Cyclosporin A and Taxol® for the oral squamous cell carcinoma cell line were under 1 μg/mL and 3 μg/mL. 2. In the experimental groups treated with Taxol® and with Taxol® + Cyclosporin A, cell growth was extremely decreased. 3. In the group treated with Cyclosporin A alone, the MTT assay was only slightly decreased, meaning that succinyl dehydrogenase activity remained in the mitochondria, but in the group treated with the mixture of Cyclosporin A and Taxol®, the MTT assay was extremely decreased. 4. In the groups treated with Cyclosporin A (3 μg/mL) and Taxol® (1 μg/mL), cell arrest appeared in the G2/M phase, and in the group treated with Taxol® (3 μg/mL), cell arrest appeared in both the S and G2/M phases. 5. In the oral squamous cell carcinoma cell line treated with Taxol®, several genes were detected at increased levels: ANGPTL4, RALBP1, and TXNRD1, associated with apoptosis; SUI1, MAC30, RRAGA, and CTGF, related to cell growth; HUS1 and DUSP5, related to cell cycle and proliferation; ATF4 and CEBPG, associated with transcription factors; BTG1 and VEGF, associated with angiogenesis; FDPS, FCER1G, GPA33, and EPHA4, associated with signal transduction and receptor activity; and AKR1C2 and UGTA10, related to carcinogenesis. The genes that showed increased expression in the cell line treated with Cyclosporin A were CYR61, SERPINB2, SSR3, and UPA3A, which are known to be associated with cell growth, carcinogenesis, receptor activity, and transcription factors. The genes expressed in the HN22 cell line treated with Cyclosporin A combined with Taxol® were ALCAM and GTSE1, associated with cancer invasiveness and cell cycle regulation.

Latent topics-based product reputation mining (잠재 토픽 기반의 제품 평판 마이닝)

  • Park, Sang-Min; On, Byung-Won
    • Journal of Intelligence and Information Systems, v.23 no.2, pp.39-70, 2017
  • Data-driven analytics techniques have recently been applied to public surveys. Instead of simply gathering survey results or expert opinions to research the preference for a recently launched product, enterprises need a way to collect and analyze various types of online data and accurately figure out customer preferences. In existing data-based survey methods, the sentiment lexicon for a particular domain is first constructed by domain experts, who judge the positive, neutral, or negative meanings of the words frequently used in the collected text documents. To research the preference for a particular product, the existing approach (1) collects review posts related to the product from several product review web sites; (2) extracts sentences (or phrases) from the collection after pre-processing steps such as stemming and stop-word removal; (3) classifies the polarity (positive or negative) of each sentence (or phrase) based on the sentiment lexicon; and (4) estimates the positive and negative ratios of the product by dividing the numbers of positive and negative sentences (or phrases) by the total number of sentences (or phrases) in the collection. The existing approach also automatically finds important sentences (or phrases) carrying positive or negative meaning toward or against the product. As a motivating example, given a product like the Sonata made by Hyundai Motors, customers often want a summary note of the positive points in the 'car design' aspect as well as the negative points in the same aspect. They also want useful information on other aspects such as 'car quality', 'car performance', and 'car service'. Such information enables customers to make good choices when purchasing brand-new vehicles, and automobile makers can figure out the preferences and positive/negative points for new models on the market; in the near future, the weak points of the models can be improved based on the sentiment analysis. For this, the existing approach computes the sentiment score of each sentence (or phrase) and selects the top-k sentences (or phrases) with the highest positive and negative scores. However, the existing approach has several shortcomings that limit its application to real problems: (1) The main aspects (e.g., car design, quality, performance, and service) of a product (e.g., the Hyundai Sonata) are not considered. With sentiment analysis that ignores aspects, the summary note reported to customers and car makers contains only the overall positive and negative ratios of the product and the top-k sentences (or phrases) with the highest sentiment scores in the entire corpus; this is not enough, and the main aspects of the target product need to enter the sentiment analysis. (2) Since the same word has different meanings in different domains, a sentiment lexicon proper to each domain must be constructed, and an efficient way to do so is required because lexicon construction is labor intensive and time consuming. To address these problems, in this article we propose a novel product reputation mining algorithm that (1) extracts topics hidden in the review documents written by customers; (2) mines the main aspects based on the extracted topics; (3) measures the positive and negative ratios of the product using those aspects; and (4) presents a digest in which a few important sentences with positive and negative meanings are listed for each aspect. Unlike the existing approach, using hidden topics lets experts construct the sentiment lexicon easily and quickly, and by reinforcing topic semantics we can improve the accuracy of product reputation mining well beyond that of the existing approach. In the experiments, we collected large sets of review documents on domestic vehicles such as the K5, SM5, and Avante; measured the positive and negative ratios of the three cars; showed the top-k positive and negative summaries per aspect; and conducted statistical analysis. Our experimental results clearly show the effectiveness of the proposed method compared with the existing one.
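The abstract does not name the topic model used to extract the hidden topics; LDA is the standard choice for this step, so the sketch below uses gensim's LdaModel on a toy review corpus to illustrate steps (1) and (2), topic extraction and aspect mining:

```python
# Sketch of steps (1)-(2) of the pipeline above: extract latent topics from
# review text and treat the top topic words as candidate aspects. LDA is used
# here as a stand-in for the unnamed topic model, and the three-line corpus
# is a toy placeholder for the car review data.
from gensim import corpora
from gensim.models import LdaModel

reviews = [
    "sleek design and modern exterior design".split(),
    "engine performance strong acceleration performance".split(),
    "friendly service quick repair service".split(),
]
dictionary = corpora.Dictionary(reviews)
bow = [dictionary.doc2bow(doc) for doc in reviews]

lda = LdaModel(bow, num_topics=3, id2word=dictionary,
               passes=20, random_state=42)

# Top words per topic serve as candidate aspects (e.g., design/performance/service);
# sentiment ratios would then be computed per aspect rather than per corpus.
for topic_id in range(3):
    words = [w for w, _ in lda.show_topic(topic_id, topn=3)]
    print(f"topic {topic_id}: {words}")
```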