• Title/Summary/Keyword: Candidate List

Search Result 104, Processing Time 0.021 seconds

An Expert System for the Estimation of the Growth Curve Parameters of New Markets (신규시장 성장모형의 모수 추정을 위한 전문가 시스템)

  • Lee, Dongwon;Jung, Yeojin;Jung, Jaekwon;Park, Dohyung
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.17-35
    • /
    • 2015
  • Demand forecasting is the activity of estimating the quantity of a product or service that consumers will purchase for a certain period of time. Developing precise forecasting models are considered important since corporates can make strategic decisions on new markets based on future demand estimated by the models. Many studies have developed market growth curve models, such as Bass, Logistic, Gompertz models, which estimate future demand when a market is in its early stage. Among the models, Bass model, which explains the demand from two types of adopters, innovators and imitators, has been widely used in forecasting. Such models require sufficient demand observations to ensure qualified results. In the beginning of a new market, however, observations are not sufficient for the models to precisely estimate the market's future demand. For this reason, as an alternative, demands guessed from those of most adjacent markets are often used as references in such cases. Reference markets can be those whose products are developed with the same categorical technologies. A market's demand may be expected to have the similar pattern with that of a reference market in case the adoption pattern of a product in the market is determined mainly by the technology related to the product. However, such processes may not always ensure pleasing results because the similarity between markets depends on intuition and/or experience. There are two major drawbacks that human experts cannot effectively handle in this approach. One is the abundance of candidate reference markets to consider, and the other is the difficulty in calculating the similarity between markets. First, there can be too many markets to consider in selecting reference markets. Mostly, markets in the same category in an industrial hierarchy can be reference markets because they are usually based on the similar technologies. However, markets can be classified into different categories even if they are based on the same generic technologies. Therefore, markets in other categories also need to be considered as potential candidates. Next, even domain experts cannot consistently calculate the similarity between markets with their own qualitative standards. The inconsistency implies missing adjacent reference markets, which may lead to the imprecise estimation of future demand. Even though there are no missing reference markets, the new market's parameters can be hardly estimated from the reference markets without quantitative standards. For this reason, this study proposes a case-based expert system that helps experts overcome the drawbacks in discovering referential markets. First, this study proposes the use of Euclidean distance measure to calculate the similarity between markets. Based on their similarities, markets are grouped into clusters. Then, missing markets with the characteristics of the cluster are searched for. Potential candidate reference markets are extracted and recommended to users. After the iteration of these steps, definite reference markets are determined according to the user's selection among those candidates. Then, finally, the new market's parameters are estimated from the reference markets. For this procedure, two techniques are used in the model. One is clustering data mining technique, and the other content-based filtering of recommender systems. The proposed system implemented with those techniques can determine the most adjacent markets based on whether a user accepts candidate markets. Experiments were conducted to validate the usefulness of the system with five ICT experts involved. In the experiments, the experts were given the list of 16 ICT markets whose parameters to be estimated. For each of the markets, the experts estimated its parameters of growth curve models with intuition at first, and then with the system. The comparison of the experiments results show that the estimated parameters are closer when they use the system in comparison with the results when they guessed them without the system.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful for them. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors in an effort to guide the users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role for document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask the authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could not benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle; manually assigning keywords to all documents is a daunting task, or even impractical in that it is extremely tedious and time-consuming requiring a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are mainly two approaches to achieving this aim: keyword assignment approach and keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given set of vocabulary, and the aim is to match them to the texts. In other words, the keywords assignment approach seeks to select the words from a controlled vocabulary that best describes a document. Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. On the other hand, in the latter approach, the aim is to extract keywords with respect to their relevance in the text without prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted based on supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems such as Extractor and Kea were developed using keyword extraction approach. Most indicative words in a document are selected as keywords for that document and as a result, keywords extraction is limited to terms that appear in the document. Therefore, keywords extraction cannot generate implicit keywords that are not included in a document. According to the experiment results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Inversely, it also means that 10% to 36% of the keywords assigned by the authors do not appear in the article, which cannot be generated through keyword extraction algorithms. Our preliminary experiment result also shows that 37% of keywords assigned by the authors are not included in the full text. This is the reason why we have decided to adopt the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment namely IVSM(Inverse Vector Space Model). The model is based on a vector space model. which is a conventional information retrieval model that represents documents and queries by vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: IVSM system for Web-based community service and stand-alone IVSM system. Firstly, the IVSM system is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers, and, indeed, it has been tested through a number of academic papers including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precisions of IVSM applied to Web-based community service and academic journals were 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. Also, IVSM shows comparable performance to Extractor that is a representative system of keyword extraction approach developed by Turney. As electronic documents increase, we expect that IVSM proposed in this paper can be applied to many electronic documents in Web-based community and digital library.

Effect of Moutan Cortex Radicis on gene expression profile of differentiated PC12 rat cells oxidative-stressed with hydrogen peroxide (모단피의 PC12 cell 산화억제 효과 및 neuronal 유전자 발현 profile 분석에 대한 연구)

  • Kim Hyun Hee;Rho Sam Woong;Na Youn Gin;Bae Hyun Su;Shin Min Kyu;Kim Chung Suk;Hong Moo Chang
    • Journal of Physiology & Pathology in Korean Medicine
    • /
    • v.17 no.2
    • /
    • pp.529-541
    • /
    • 2003
  • Yukmijihwang-tang has been widely used as an and-aging herbal medicine for hundred years in Asian countries. Numerous studies show that Yukmijihwangtang has anti-oxidative effect both in vivo and in vitro. It has been reported that Moutan Cortex Radicis extract (MCR) was the most effective herb in Yukmijihwang-tang on undifferentiated PC12 cells upon oxidative-stressed with hydrogen peroxide. The purpose of this study is to; 1) evaluate the recovery of neuronal damage by assessing the anti-oxidant effect of MCR on PC12 cells differentiated with nerve growth factor (NGF), 2) identify candidate genes responsible for anti-oxidative effect on differentiated PC12 cells by oligonucleotide chip microarray. PC12 cells, which were differentiated by treating with NGF, were treated without or with hydrogen peroxide in the presence or absence of various concentration of MCR. Cell survival was determined by using MTS assay. Measurement of intracellular reactive oxygen species (ROS) generation was determined using the H2DCFDA assay The viability of cells treated with MCR was significantly recovered from stressed PC12 cell. In addition, wide rage of concentrations of MCR shows dose-dependent inhibitory effect on ROS production in oxidative-stressed cells. Total RNAs of cells without treatment(Control group), only treated with H₂O₂ (stressed group) and treated with both H₂O₂ and of MCR (MCR group) were isolated, and cDNAs was synthesized using oligoT7(dT) primer. The fragmented cRNAs, synthesized from cDNAs, were applied to Affymetrix GeneChip Rat Neurobiology U34 Array. mRNA of Calcium/calmodulin-dependent protein kinase II delta subunit(CaMKII), neuron glucose transporter (GLUT3) and myelin/oligodendrocyte glycoprotein(MOG) were downregulated in Stressed group comparing to Control group. P2X2-5 receptor (P2X2R-5), P2X2-4 receptor (P2X2R-4), c-fos, 25 kDa synaptosomal attachment protein(SNAP-25a) and GLUT3 were downregulated, whereas A2 adenosine receptor (A2AR), cathechol-O-methyltransferase(COMT), glucose transporter 1 (GLUT1), EST223333, heme oxygenase (HO), VGF, UI-R-CO-ja-a-07-0-Ul.s1 and macrophage migration inhibitory factor (MIF) were upregulated in MCA group comparing to Control group. Expression of Putative potassium channel subunit protein (ACK4), P2X2A-5, P2X2A-4, Interferon-gamma inducing factor isoform alpha precursor (IL-18α), EST199031, P2XR, P2X2 purinoceptor isoform e (P2X2R-e), Precursor interleukin 18 (IL-18) were downregulated, whereas MOO, EST223333, GLUT-1, MIF, Neuronatin alpha, UI-R-C0-ja-a-07-0-Ul.s1, A2. adenosine receptor, COMT, neuron-specific enolase (NSE), HO, VGF, A rat novel protein which is expressed with nerve injury (E12625) were upregulated in MCR group comparing to Stressed group. The results suggest that decreased viability and AOS production of PC12 cell by H₂O₂ may be, at lease, mediated by impaired glucose transporter expression. It is implicated that the MCR treatment protect PC12 cell from oxidative stress via following mechanisms; improving glucose transport into the cell, enhancing expression of anti-oxidative genes and protecting from dopamine cytotoxicity by increment of COMT and MIF expression. The list of differentially expressed genes may implicate further insight on the action and mechanism behind the anti-oxidative effects of herbal extract Moutan Cortex Radicis.

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.141-156
    • /
    • 2013
  • Social media is a representative form of the Web 2.0 that shapes the change of a user's information behavior by allowing users to produce their own contents without any expert skills. In particular, as a new communication medium, it has a profound impact on the social change by enabling users to communicate with the masses and acquaintances their opinions and thoughts. Social media data plays a significant role in an emerging Big Data arena. A variety of research areas such as social network analysis, opinion mining, and so on, therefore, have paid attention to discover meaningful information from vast amounts of data buried in social media. Social media has recently become main foci to the field of Information Retrieval and Text Mining because not only it produces massive unstructured textual data in real-time but also it serves as an influential channel for opinion leading. But most of the previous studies have adopted broad-brush and limited approaches. These approaches have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system to capture the trend in real-time processing big stream datasets of Twitter. The system offers the functions of term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes of topical trend, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets which contain candidates' name and election on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects the trend of society effectively. The system also retrieves the list of terms co-occurred by given query terms. We compare the results of term co-occurrence retrieval by giving influential candidates' name, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn' as query terms. General terms which are related to presidential election such as 'Presidential Election', 'Proclamation in Support', Public opinion poll' appear frequently. Also the results show specific terms that differentiate each candidate's feature such as 'Park Jung Hee' and 'Yuk Young Su' from the query 'Guen Hae Park', 'a single candidacy agreement' and 'Time of voting extension' from the query 'Jae In Moon' and 'a single candidacy agreement' and 'down contract' from the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic can show one of two types of patterns-Rising tendency and Falling tendencydepending on the change of the probability distribution. To determine the relationship between topic trends in Twitter and social issues in the real world, we compare topic trends with related news articles. We are able to identify that Twitter can track the issue faster than the other media, newspapers. The user network in Twitter is different from those of other social media because of distinctive characteristics of making relationships in Twitter. Twitter users can make their relationships by exchanging mentions. We visualize and analyze mention based networks of 136,754 users. We put three candidates' name as query terms-Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn'. The results show that Twitter users mention all candidates' name regardless of their political tendencies. This case study discloses that Twitter could be an effective tool to detect and predict dynamic changes of social issues, and mention-based user networks could show different aspects of user behavior as a unique network that is uniquely found in Twitter.