• Title/Summary/Keyword: 벡터모델 (vector model)


Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.149-169
    • /
    • 2020
  • "The Urban Renewal New Deal project", one of the government's major national projects, is about developing underdeveloped areas by investing 50 trillion won in 100 locations on the first year and 500 over the next four years. This project is drawing keen attention from the media and local governments. However, the project model which fails to reflect the original characteristics of the area as it divides project area into five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type," According to keywords for successful urban regeneration in Korea, "resident participation," "regional specialization," "ministerial cooperation" and "public-private cooperation", when local governments propose urban regeneration projects to the government, they can see that it is most important to accurately understand the characteristics of the city and push ahead with the projects in a way that suits the characteristics of the city with the help of local residents and private companies. In addition, considering the gentrification problem, which is one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suitable for the characteristics of the area. In order to supplement the limitations of the 'Urban Regeneration New Deal Project' methodology, this study aims to propose a system that recommends urban regeneration types suitable for urban regeneration sites by utilizing various machine learning algorithms, referring to the urban regeneration types of the '2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan' promoted based on regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017). In order to identify regional characteristics, approximately 100,000 text data were collected for 22 regions where the project was carried out for a total of four types of urban regeneration. Using the collected data, we drew key keywords for each region according to the type of urban regeneration and conducted topic modeling to explore whether there were differences between types. As a result, it was confirmed that a number of topics related to real estate and economy appeared in old residential areas, and in the case of declining and underdeveloped areas, topics reflecting the characteristics of areas where industrial activities were active in the past appeared. In the case of the historical and cultural resource area, since it is an area that contains traces of the past, many keywords related to the government appeared. Therefore, it was possible to confirm political topics and cultural topics resulting from various events. Finally, in the case of low-use and under-developed areas, many topics on real estate and accessibility are emerging, so accessibility is good. It mainly had the characteristics of a region where development is planned or is likely to be developed. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning technology was used to implement the model, and training data and test data were randomly extracted at an 8:2 ratio and used. 
To compare performance across models, the input variables were prepared in two ways, as a Count Vector and as a TF-IDF Vector, and five classifiers were applied: SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting, yielding a comparison of ten models in total. The best-performing model was Gradient Boosting with TF-IDF Vector input, with an accuracy of 97%. Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new project sites in the course of carrying out urban regeneration projects.
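The two-vectorizer, five-classifier comparison described above maps naturally onto a standard scikit-learn loop. The following is a minimal sketch of that grid, assuming scikit-learn; the `texts` and `labels` lists are hypothetical placeholders for the regional documents and their regeneration-type labels, not the authors' actual data or pipeline.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-ins for the ~100,000 regional documents and their
# four urban regeneration type labels.
texts = ["old housing rent redevelopment", "factory closure industrial decline",
         "palace heritage festival tourism", "subway access vacant lot project"] * 10
labels = ["deteriorated", "abandonment", "historical", "low-use"] * 10

# 8:2 random split between training and test data, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

vectorizers = {"Count": CountVectorizer(), "TF-IDF": TfidfVectorizer()}
classifiers = {
    "SVM": SVC(),
    "DecisionTree": DecisionTreeClassifier(),
    "RandomForest": RandomForestClassifier(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "GradientBoosting": GradientBoostingClassifier(),
}

# 2 vectorizers x 5 classifiers = the 10 models compared in the study.
for vec_name, vec in vectorizers.items():
    X_tr = vec.fit_transform(X_train)  # vocabulary fitted on training data only
    X_te = vec.transform(X_test)
    for clf_name, clf in classifiers.items():
        clf.fit(X_tr, y_train)
        acc = accuracy_score(y_test, clf.predict(X_te))
        print(f"{vec_name} + {clf_name}: accuracy = {acc:.2f}")
```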

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

  • Kim Yu-Seop;Chang Jeong-Ho
    • The KIPS Transactions: Part B
    • /
    • v.11B no.6
    • /
    • pp.749-758
    • /
    • 2004
  • In this paper, we propose a new method that utilizes only a raw corpus, without additional human effort, for disambiguating target word selection in English-Korean machine translation. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). These techniques can represent complex semantic structures in given contexts, such as text passages. We construct linguistic semantic knowledge using the two techniques and apply it to target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection, estimating the distance between instances based on these models. In the experiments, we use the TREC AP news data to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by more than 10%, and PLSA showed better accuracy than LSA. Finally, using correlation analysis, we showed the relationship between accuracy and two important factors: the dimensionality of the latent space and the k value of k-NN learning.
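As a rough illustration of the LSA half of this approach, the sketch below builds a latent semantic space from a raw corpus with truncated SVD and picks the candidate target word whose latent vector lies closest to a grammatically related context word. It assumes scikit-learn and numpy; the corpus, context word, and candidates are hypothetical toy data, and the PLSA model, the dictionary lookup, and the full k-NN smoothing are omitted.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical raw corpus standing in for the TREC AP news data.
corpus = ["stock market prices rose", "river bank water flooded",
          "bank loan interest rates", "market share of the company"]

# Latent semantic space: truncated SVD of the term-document matrix.
vec = CountVectorizer()
X = vec.fit_transform(corpus)                     # documents x terms
svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(X)
term_vecs = svd.components_.T                     # one latent vector per term

def term_vector(word):
    return term_vecs[vec.vocabulary_[word]]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Choose the candidate closest to the grammatically related context word
# (k=1 nearest neighbor for brevity).
context = "interest"
candidates = ["river", "loan"]   # hypothetical markers for two senses
best = max(candidates, key=lambda w: cosine(term_vector(w), term_vector(context)))
print(best)   # expected: "loan" (it shares a document with "interest")
```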

The pattern of movement and stress distribution during retraction of maxillary incisors using a 3-D finite element method (상악 전치부 후방 견인 시 이동 양상과 응력 분포에 관한 삼차원 유한요소법적 연구)

  • Chung, Ae-Jin;Kim, Un-Su;Lee, Soo-Haeng;Kang, Seong-Soo;Choi, Hee-In;Jo, Jin-Hyung;Kim, Sang-Cheol
    • The Korean Journal of Orthodontics
    • /
    • v.37 no.2 s.121
    • /
    • pp.98-113
    • /
    • 2007
  • Objective: The purpose of this study was to evaluate the displacement pattern and stress distribution, shown on a 3-D finite element model of a dry human skull visualized using CT, during retraction of the upper anterior teeth. Methods: Experiments were divided into 8 groups according to corticotomy, anchorage (buccal: a mini-implant between the maxillary second premolar and first molar, or the second premolar reinforced with a mini-implant; palatal: a mini-implant between the maxillary first and second molars, or a mini-implant on the midpalatal suture), and force application point (use of a power arm or not). Results: When the anterior teeth were retracted by a conventional T-loop archwire, the anterior teeth tipped postero-inferiorly and the posterior teeth moved slightly mesially. When the anterior teeth were retracted with corticotomy, stress at the anterior bone segment was distributed widely, and the anterior teeth showed a smaller degree of tipping but a greater amount of displacement. When the anterior teeth were retracted from the buccal side, applying force from the mini-implant placed between the maxillary second premolar and first molar to a canine power arm generated less tipping than applying force from the second premolar reinforced with a mini-implant to the canine bracket. When the anterior teeth were retracted from the palatal side, applying force from the mini-implant on the midpalatal suture resulted in more tipping than applying force from the mini-implant between the maxillary first and second molars. Conclusion: The results of this study verify the effects of corticotomies and of controlling orthodontic force vectors during tooth movement.

Hepatitis B Virus-Induced TNF-α Expression in Hepa-1c1c7 Mouse Hepatoma Cell Line (마우스 Hepa-1c1c7 세포주에서 B형 간염 바이러스에 의한 tumor necrosis factor-α의 발현 유도)

  • Yea Sung Su;Jang Won Hee;Yang Young-Il;Lee Youn Jae;Kim Mi Seong;Seog Dae-Hyun;Park Yeong-Hong;Paik Kye-Hyung
    • Journal of Life Science
    • /
    • v.15 no.1 s.68
    • /
    • pp.38-44
    • /
    • 2005
  • Infection with hepatitis B virus (HBV) is a major health problem worldwide. Although a tremendous amount is known about HBV, its study has been hampered by the narrow host range of the virus, which is limited to humans and primates. In the present study, we investigated the susceptibility of the mouse hepatoma cell line Hepa-1c1c7 to HBV infection. In addition, based on the finding that human hepatocytes infected with HBV increase expression of the pro-inflammatory cytokine TNF-α, we determined whether HBV induces TNF-α expression in these cells. HBV surface antigen (HBsAg) secretion was measured by microparticle enzyme immunoassay, and steady-state mRNA expression was analyzed by quantitative competitive RT-PCR. Transient transfection of Hepa-1c1c7 cells with an HBV expression vector resulted in a dose-dependent induction of TNF-α expression. Infection of Hepa-1c1c7 cells with serum from an HBV carrier also increased TNF-α mRNA expression. In both the transfected and infected cells, HBV mRNA was expressed and significant HBsAg secretion was detected. There was no significant variation in β-actin mRNA expression by HBV. These results demonstrate that HBV is infectious to Hepa-1c1c7 cells in vitro and that the viral infection induces TNF-α expression, suggesting that Hepa-1c1c7, a mouse hepatoma cell line, may be a possible model system for analyzing various molecular aspects of HBV infection.

On Method for LBS Multi-media Services using GML 3.0 (GML 3.0을 이용한 LBS 멀티미디어 서비스에 관한 연구)

  • Jung, Kee-Joong;Lee, Jun-Woo;Kim, Nam-Gyun;Hong, Seong-Hak;Choi, Beyung-Nam
    • 한국공간정보시스템학회:학술대회논문집
    • /
    • 2004.12a
    • /
    • pp.169-181
    • /
    • 2004
  • SK Telecom constructed the GIMS system as the common base framework of its LBS/GIS service system, based on the OGC (OpenGIS Consortium) international standard, for the first mobile vector map service in 2002. However, as service content has grown more complex, renovation has been needed to satisfy multi-purpose, multi-function, and maximum-efficiency requirements. This research prepares a GML3-based platform to upgrade the service from the GML2-based GIMS system, making it possible for a variety of application services to access location and geographic data easily and freely. From GML 3.0, animation, event handling, resources for style mapping, topology specification for 3D, and telematics services were selected for the mobile LBS multimedia service, and the schema and transfer protocol were developed and organized to optimize data transfer to the MS (Mobile Station). The upgrade to the GML 3.0-based GIMS system has provided an innovative framework, in terms of both construction and service, which has been implemented and applied to previous research and systems. A GIMS channel interface has also been implemented to simplify access to the GIMS system, and the internal service components of GIMS, WFS and WMS, have been enhanced and their functions expanded.
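For flavor, here is a minimal sketch of what consuming a GML 3.0 geometry on the client side might look like, using only the Python standard library. The snippet uses the generic OGC gml namespace and a plain gml:Point; it is an assumption-laden illustration, not SK Telecom's actual GIMS schema, channel interface, or transfer protocol.

```python
import xml.etree.ElementTree as ET

# A generic GML 3.0 point (gml:pos replaced gml:coordinates in GML 3);
# the coordinates and srsName here are hypothetical example values.
gml_doc = """<gml:Point xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
  <gml:pos>37.5665 126.9780</gml:pos>
</gml:Point>"""

ns = {"gml": "http://www.opengis.net/gml"}
point = ET.fromstring(gml_doc)
lat, lon = map(float, point.find("gml:pos", ns).text.split())
print(point.get("srsName"), lat, lon)
```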


Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.1-13
    • /
    • 2015
  • As opinion mining in big data applications has been highlighted, a great deal of research on unstructured data has been carried out. Social media on the Internet generate unstructured or semi-structured data every second, often in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses, which makes it very difficult for computers to extract useful information from such datasets. Traditional web search engines are usually based on keyword search, which can yield incorrect results far from users' intentions. Even though much progress has been made over recent years in enhancing the performance of search engines so as to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation plays a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of the examples in existing dictionaries, thereby avoiding expensive sense-tagging processes. It tests the effectiveness of the method with the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary contains approximately 57,000 sentences; the Sejong Corpus contains about 790,000 sentences tagged with both part-of-speech and senses. For the experiments, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both combined and as separate entities, using cross-validation. Only nouns, the target subjects in word sense disambiguation, were selected: 93,522 word senses among 265,655 nouns, together with 56,914 sentences from related proverbs and examples, were combined into the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because it is tagged with the sense indices defined by that dictionary. Sense vectors were formed after the merged corpus was created. The terms used in creating the sense vectors were added to the named-entity dictionary of a Korean morphological analyzer. Using the extended named-entity dictionary, term vectors were extracted from the input sentences. Given an extracted term vector and the sense vector model built during pre-processing, sense-tagged terms were determined by vector-space-model-based word sense disambiguation. In addition, this study shows the effectiveness of merging the examples in the Korean standard unabridged dictionary with the Sejong Corpus: the experiments show that better precision and recall are obtained with the merged corpus. The study suggests that this approach can practically enhance the performance of Internet search engines and help extract more accurate meaning from sentences in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem, with the assumption that all senses are independent.
Even though this assumption is not realistic and ignores correlations between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research is needed to consider all possible combinations, and/or partial combinations, of the senses in a sentence. The effectiveness of word sense disambiguation may also be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
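A minimal sketch of the vector-space-model sense tagging outlined above, assuming scikit-learn: each sense of an ambiguous noun is represented by a pooled "sense document" built from dictionary examples, and an input sentence is tagged with the sense whose vector is most cosine-similar. The English sense documents and sentences are hypothetical stand-ins for the merged Korean corpus, and morphological analysis is replaced by whitespace tokenization.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# One pooled "sense document" per sense of the ambiguous noun (here: "bank"),
# standing in for the merged dictionary/Sejong example sentences.
sense_docs = {
    "bank/finance": "money loan deposit interest account teller",
    "bank/river": "river water shore flood erosion slope",
}

vec = TfidfVectorizer()
sense_matrix = vec.fit_transform(sense_docs.values())   # the sense vectors
senses = list(sense_docs)

def disambiguate(sentence):
    term_vector = vec.transform([sentence])              # term vector for input
    sims = cosine_similarity(term_vector, sense_matrix)[0]
    return senses[sims.argmax()]                         # best-matching sense

print(disambiguate("the loan interest at the bank rose"))   # bank/finance
print(disambiguate("the river bank eroded after the rain")) # bank/river
```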

Evaluating Global Container Ports' Performance Considering the Port Calls' Attractiveness (기항 매력도를 고려한 세계 컨테이너 항만의 성과 평가)

  • Park, Byungin
    • Journal of Korea Port Economic Association
    • /
    • v.38 no.3
    • /
    • pp.105-131
    • /
    • 2022
  • Even after its improvement in 2019, UNCTAD's Liner Shipping Connectivity Index (LSCI), which evaluates the performance of the global container port market, has limited use. In particular, since the LSCI evaluates performance based only on the distance of relationships, a performance index that also incorporates the attractiveness of a port for calls would be more useful. This study applied a modified Huff model, the hub-authority algorithm and eigenvector centrality from social network analysis, and correlation analysis to 2007, 2017, and 2019 data from Ocean-Commerce, Japan. The findings are as follows. Firstly, the attractiveness of a port for calls and the overall performance of the port did not always match; according to the attractiveness analysis, Busan remained within the top 10, while the attractiveness of the other Korean ports improved only slowly from a low level during the study period. Secondly, global container ports are generally specialized over the long term as inbound or outbound ports by route and grow while maintaining that specialization throughout the entire period, whereas the Korean ports kept changing roles from one analysis period to the next. Lastly, cargo volume by period and the extended port connectivity index (EPCI) presented in this study showed correlations from 0.77 to 0.85; even though the Atlantic data were excluded from the analysis and ships' operable capacity was used instead of port throughput, the correlation remains high. These results should help in evaluating and analyzing global ports. According to the study, Korean ports need a long-term strategy to improve performance while maintaining specialization. To maintain and develop a port's desirable role, it is necessary to cooperate and form partnerships with complementary ports and to attract the services of shipping companies calling at those ports. Although this study carried out a complex analysis using extensive data and methodologies over a long period, further work is needed to cover ports around the world, to conduct a long-term panel analysis, and to estimate the parameters of the attractiveness model scientifically.
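The network measures named above (hub-authority scores and eigenvector centrality) are standard and easy to reproduce on toy data. Below is a minimal sketch, assuming networkx and numpy; the port-call graph, the cargo volumes, and the use of authority scores as a stand-in for the study's EPCI are all hypothetical, and the modified Huff attractiveness model is not reproduced.

```python
import networkx as nx
import numpy as np

# Hypothetical directed port-call graph: an edge means a liner service
# calls at the first port and then the second.
calls = [("Busan", "Shanghai"), ("Shanghai", "Singapore"),
         ("Singapore", "Busan"), ("Busan", "Kaohsiung"),
         ("Kaohsiung", "Shanghai")]
G = nx.DiGraph(calls)

hubs, authorities = nx.hits(G)                      # hub-authority (HITS) algorithm
eig = nx.eigenvector_centrality(G, max_iter=1000)   # eigenvector centrality

for port in G:
    print(f"{port}: hub={hubs[port]:.3f} auth={authorities[port]:.3f} "
          f"eig={eig[port]:.3f}")

# Pearson correlation between a connectivity-style index and cargo volume,
# mirroring the study's 0.77-0.85 correlation check (volumes are made up).
index = np.array([authorities[p] for p in G])
volume = np.array([3.1, 2.7, 4.0, 1.2])
print("Pearson r:", np.corrcoef(index, volume)[0, 1])
```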

Therapeutic Angiogenesis by Intramyocardial Injection of pCK-VEGF165 in Pigs (돼지에서 pCK-VEGF165의 심근내 주입에 의한 치료적 혈관조성)

  • Choi Jae-Sung;Han Woong;Kim Dong Sik;Park Jin Sik;Lee Jong Jin;Lee Dong Soo;Kim Ki-Bong
    • Journal of Chest Surgery
    • /
    • v.38 no.5 s.250
    • /
    • pp.323-334
    • /
    • 2005
  • Background: Gene therapy is a new and promising option for the treatment of severe myocardial ischemia through therapeutic angiogenesis. The goal of this study was to elucidate the efficacy of therapeutic angiogenesis using VEGF165 in large animals. Material and Method: Twenty-one pigs that underwent ligation of the distal left anterior descending coronary artery were randomly allocated to one of two treatments: intramyocardial injection of pCK-VEGF (VEGF) or intramyocardial injection of pCK-Null (Control). Injections were administered 30 days after ligation. Seven pigs died during the trial; eight pigs from VEGF and six from Control survived. Echocardiography was performed on day 0 (preoperative) and on days 30 and 60 following coronary ligation. Gated myocardial single photon emission computed tomography (SPECT) imaging with 99mTc-labeled sestamibi was performed on days 30 and 60. Myocardial perfusion was assessed from the uptake of 99mTc-labeled sestamibi at rest. Global and regional myocardial function, as well as post-infarction left ventricular remodeling, were assessed from segmental wall thickening, left ventricular ejection fraction (EF), end systolic volume (ESV), and end diastolic volume (EDV) using gated SPECT and echocardiography. Myocardium of the ischemic border zone into which the pCK plasmid vector had been injected was also sampled to assess micro-capillary density. Result: Micro-capillary density was significantly higher in VEGF than in Control (386±110/mm² vs. 291±127/mm²; p<0.001). Segmental perfusion increased significantly from day 30 to day 60 after intramyocardial injection of the plasmid vector in VEGF (48.4±15.2% vs. 53.8±19.6%; p<0.001), while no significant change was observed in Control (45.1±17.0% vs. 43.4±17.7%; p=0.186). This resulted in a significant difference in percentage change between the two groups (11.4±27.0% increase vs. 2.7±19.0% decrease; p=0.003). Segmental wall thickening increased significantly from day 30 to day 60 in both groups; the increments did not differ between groups. ESV measured by echocardiography increased significantly from day 0 to day 30 in VEGF (22.9±9.9 mL vs. 32.3±9.1 mL; p=0.006) and in Control (26.3±12.0 mL vs. 36.8±9.7 mL; p=0.046). EF decreased significantly in VEGF (52.0±7.7% vs. 46.5±7.4%; p=0.004) and in Control (48.2±9.2% vs. 41.6±10.0%; p=0.028). There was no significant change in EDV. The interval changes (days 30 to 60) of EF, ESV, and EDV did not differ significantly between groups by either gated SPECT or echocardiography. Conclusion: Intramyocardial injection of pCK-VEGF165 induced therapeutic angiogenesis and improved myocardial perfusion. However, post-infarction remodeling and global myocardial function were not improved.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors in an effort to guide users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task, as it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are mainly two approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its entries to the texts; in other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. Although this approach is domain dependent and not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, the aim is to extract keywords with respect to their relevance in the text, without a prior vocabulary. Here, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques: keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as its keywords, so keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords. According to the experimental results of Turney, about 64% to 90% of the keywords assigned by authors can be found in the full text of an article. Conversely, this means that 10% to 36% of author-assigned keywords do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experimental results also show that 37% of author-assigned keywords are not included in the full text. This is why we adopted the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model,
a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculate the vector length of each keyword set based on each keyword weight; (2) preprocess and parse a target document that does not have keywords; (3) calculate the vector length of the target document based on term frequency; (4) measure the cosine similarity between each keyword set and the target document; and (5) generate the keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiments, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to Extractor, a representative keyword extraction system developed by Turney. As electronic documents proliferate, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
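The five-step keyword assignment process lends itself to a compact sketch. The following, assuming scikit-learn, illustrates the cosine-matching idea on hypothetical keyword sets represented as weighted term profiles; it is an illustration of IVSM's matching step, not the authors' implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# (1) Each candidate keyword has a weighted set of associated terms,
# represented here as a hypothetical pseudo-document in the shared term space.
keyword_sets = {
    "logistics": "shipping port cargo supply chain freight",
    "information retrieval": "query document index ranking relevance",
    "text mining": "corpus term frequency clustering extraction",
}

# (2)-(3) Preprocess the target document and build its term-frequency vector
# in the same space as the keyword-set vectors.
target_doc = "the port handles cargo and freight for the shipping network"
vec = CountVectorizer()
matrix = vec.fit_transform(list(keyword_sets.values()) + [target_doc])
keyword_vecs, doc_vec = matrix[:-1], matrix[-1]

# (4) Cosine similarity between the document and every keyword-set vector.
sims = cosine_similarity(doc_vec, keyword_vecs)[0]

# (5) Generate the keywords with the highest similarity scores.
ranked = sorted(zip(keyword_sets, sims), key=lambda kv: -kv[1])
print(ranked[:2])   # "logistics" should rank first for this document
```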