• Title/Summary/Keyword: Word space

A Study on the Change of Paradigm and Analysis of Qualitative Space in Public Space - Focused on the Entrance Floor in General Hospital over 500 beds in Korea - (공용 공간의 패러다임 변화와 질적 공간 분석 - 500병상 이상 국내 종합병원 진입층을 중심으로 -)

  • Son, Ji-Hye;Yang, Nae-Won
    • Korean Institute of Interior Design Journal
    • /
    • v.23 no.6
    • /
    • pp.212-220
    • /
    • 2014
  • The entrance floor of a hospital has become an important space for holistic medical service and for enhancing the hospital's image. However, as the paradigm of public space changes, there has been little discussion of the qualitative properties and role of public space. To address this problem, this study examines the paradigm change in public space based on earlier studies and establishes classification criteria for space. Using these criteria, the G/D ratio and the qualitative spatial area ratio of 26 general hospitals planned with over 500 beds are analyzed through case studies. The conclusions are as follows. 1) Space allocated according to medical function is a variable element, so public space should in future be planned as self-reliant space rather than function-subordinate space. 2) There is no correlation between a high G/D ratio and a high qualitative spatial area ratio; in other words, a public space with a high G/D ratio is not necessarily a qualitative space. 3) Since 2000, various types based on the circulation system have been applied to public space, and the qualitative spatial area ratio is relatively high in the street type and the concourse type. 4) The qualitative spatial area ratio of stay space is higher than that of passage space.

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.127-146
    • /
    • 2022
  • Recently, as word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Among these topics, cross-language transfer, which enables semantic exchange between different languages, is growing alongside the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized and general domains. In other words, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, linear vector alignment, which has been the main focus of academic study, assumes statistical linearity and therefore tends to oversimplify the vector space. It essentially assumes that different vector spaces are geometrically similar, which inevitably causes distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology sequentially trains a skip-connected autoencoder and a regression model to align the specialized word embedding expressed in each space to the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space. To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology outperformed existing linear vector alignment in terms of cosine similarity.
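
For orientation, the kind of linear alignment that this abstract contrasts against can be sketched with orthogonal Procrustes, one common linear baseline. The abstract does not say which linear method was compared, so the choice of Procrustes, the function names, and the synthetic data below are all assumptions for illustration; this is not the authors' nonlinear autoencoder-plus-regression method.

```python
import numpy as np

def fit_orthogonal_map(source, target):
    """Orthogonal Procrustes: the orthogonal W minimizing ||source @ W - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

def mean_cosine(a, b):
    """Average row-wise cosine similarity between two sets of vectors."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a_n * b_n, axis=1)))

# Synthetic stand-ins (assumption): the "specialized" space is an exact
# rotation of the "general" space, the ideal case for a linear map.
rng = np.random.default_rng(0)
dim = 8
general = rng.normal(size=(50, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
specialized = general @ rotation.T

W = fit_orthogonal_map(specialized, general)
aligned = specialized @ W
score = mean_cosine(aligned, general)
print(f"mean cosine similarity after alignment: {score:.4f}")
```

When the two spaces differ by more than a rotation, a map of this kind can no longer match them exactly, which is the distortion the abstract attributes to linear alignment.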

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

  • Kim Yu-Seop;Chang Jeong-Ho
    • The KIPS Transactions:PartB
    • /
    • v.11B no.6
    • /
    • pp.749-758
    • /
    • 2004
  • In this paper, we propose a new method for disambiguating target word selection in English-Korean machine translation that uses only a raw corpus, without additional human effort. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). These techniques can represent complex semantic structures in given contexts such as text passages. We construct linguistic semantic knowledge with the two techniques and use that knowledge for target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection, and estimate the distance between instances based on these models. In the experiments, we use TREC AP news data to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by over 10%, and PLSA showed better accuracy than LSA. Finally, using correlation calculation, we show the relationship between accuracy and two important factors: the dimensionality of the latent space and the k value of k-NN learning.
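
The two ingredients named in this abstract, a latent semantic space and k-nearest-neighbor lookup inside it, can be illustrated with a small sketch. The toy term-document counts, the word list, and the function names are assumptions made up for the example; the paper's actual features (grammatical relations from a dictionary) and its PLSA variant are not reproduced here.

```python
import numpy as np

def lsa_word_vectors(term_doc, k):
    """Project words into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k] * s[:k]

def nearest_neighbors(vectors, query_index, k):
    """Indices of the k words most cosine-similar to the query word."""
    norms = np.linalg.norm(vectors, axis=1)
    sims = vectors @ vectors[query_index] / (norms * norms[query_index] + 1e-12)
    sims[query_index] = -np.inf  # exclude the query itself
    return [int(i) for i in np.argsort(-sims)[:k]]

# Toy term-document counts (rows: words, columns: documents); hypothetical data.
words = ["bank", "money", "loan", "river", "water"]
term_doc = np.array([
    [2, 1, 0, 1],  # bank
    [3, 2, 0, 0],  # money
    [1, 2, 0, 0],  # loan
    [0, 0, 3, 2],  # river
    [0, 0, 2, 3],  # water
], dtype=float)

vectors = lsa_word_vectors(term_doc, k=2)
neighbors = nearest_neighbors(vectors, words.index("money"), k=2)
print([words[i] for i in neighbors])
```

Both the latent dimensionality (`k` in `lsa_word_vectors`) and the neighbor count (`k` in `nearest_neighbors`) are free parameters, which are the two factors the abstract relates to accuracy.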

A Research on the Concept of Emptiness that Appears in Seung, Hyo-sang and Louis I. Kahn's Architecture (승효상과 Louis I. Kahn의 건축에 나타난 '비어있음(Emptiness)'의 개념에 대한 연구)

  • Choi, Jee-Yeon;Lee, Young-Soo
    • Korean Institute of Interior Design Journal
    • /
    • v.25 no.5
    • /
    • pp.157-166
    • /
    • 2016
  • This study examines the various kinds of emptiness shown in the empty spaces created by Hyo-sang Seung and Louis I. Kahn. Architecture creates the usefulness of a space through its emptiness. The word 'emptiness' carries meanings such as 'non-fulfillment', 'non-existence', 'emptiness', 'blankness', 'non-visibility', and 'non-limitation'. These concepts can be broadly divided into 'ideological', 'aesthetic', and 'architectural spatial' emptiness. This study compares how these concepts are expressed in the architectural spaces created by Hyo-sang Seung and Louis I. Kahn. Hyo-sang Seung tried to leave a space with numerous open possibilities that can contain everything, since 'emptiness' is what is not being used at the moment. Louis I. Kahn tried to contain in a space the essence and silence of that space before anything that exists is created. The 'emptiness' created by Hyo-sang Seung and Louis I. Kahn is a specific kind of space that can contain more non-visible spaces than visible ones. Thus, such spaces may contain various potentials that are possible because they are empty, because of changes in nature and time, and because of silence and intentions that go beyond the artist's original intent.

Performance Analysis of Space-Time Codes in Realistic Propagation Environments: A Moment Generating Function-Based Approach

  • Lamahewa Tharaka A.;Simon Marvin K.;Kennedy Rodney A.;Abhayapala Thushara D.
    • Journal of Communications and Networks
    • /
    • v.7 no.4
    • /
    • pp.450-461
    • /
    • 2005
  • In this paper, we derive analytical expressions for the exact pairwise error probability (PEP) of a space-time coded system operating over spatially correlated fast (constant over the duration of a symbol) and slow (constant over the length of a code word) fading channels using a moment-generating function-based approach. We discuss two analytical techniques that can be used to evaluate the exact PEPs (and therefore approximate the average bit error probability (BEP)) in closed form. These analytical expressions are more realistic than previously published PEP expressions as they fully account for antenna spacing, antenna geometries (uniform linear array, uniform grid array, uniform circular array, etc.) and scattering models (uniform, Gaussian, Laplacian, von Mises, etc.). Inclusion of spatial information in these expressions provides valuable insights into the physical factors determining the performance of a space-time code. Using these new PEP expressions, we investigate the effect of antenna spacing, antenna geometries and azimuth power distribution parameters (angle of arrival/departure and angular spread) on the performance of a four-state QPSK space-time trellis code proposed by Tarokh et al. for two transmit antennas.

Legal Issues Regarding the Launch Vehicle by DPRK: the Scope and Limit of the UN Security Council Resolution (북한의 발사체발사에 따른 법적 쟁점 : UN 안전보장이사회 결의의 성격과 한계)

  • Shin, Hong-Kyun
    • The Korean Journal of Air & Space Law and Policy
    • /
    • v.31 no.1
    • /
    • pp.145-167
    • /
    • 2016
  • The UN Security Council is empowered to determine the existence of a threat to the peace. A resolution specifying provisions adopted in accordance with Chapter VII of the UN Charter is deemed a document confirming the Council's decision about the threat to the peace. In general, resolutions adopted by the Security Council acting under Chapter VII of the Charter are considered binding, in accordance with Article 25 of the Charter. Regarding the terms of the Resolutions to be interpreted, the word "decide" is used for the suspension of the ballistic missile program, the word "demand" is used for stopping the launch of ballistic missiles, and the word "demand" is used for the return to the missile test moratorium. These provisions may be deemed to determine specific obligations imposed upon States in accordance with the 1967 Outer Space Treaty. On the other hand, the Resolutions may be limited to a decision whose main purpose is to provide a legal basis for international sanctions against North Korea, without amounting to a form of international legislation. The North Korean missile test case is a reminder of the continuing discussion about whether Security Council decisions lack legislative authority because of the Council's decision process. Furthermore, with regard to outer space and space activities, the outer space law regime would not be compatible with the Security Council decision process, in that the former presupposes agreement among all States parties while the latter is based upon agreement among Council member States. Therefore, it is premature to consider the Security Council decision as the lex specialis of the space law regime.

Chatbot Design Method Using Hybrid Word Vector Expression Model Based on Real Telemarketing Data

  • Zhang, Jie;Zhang, Jianing;Ma, Shuhao;Yang, Jie;Gui, Guan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.4
    • /
    • pp.1400-1418
    • /
    • 2020
  • In commercial promotion, the chatbot is known as a significant application of natural language processing (NLP). Conventional design methods use the bag-of-words (BOW) model alone, based on the Google database and other online corpora. On the one hand, in the bag-of-words model the vectors are unrelated to one another. Although this method suits discrete features, it does not help the machine understand continuous statements, because the connection between words is lost in the encoded word vector. On the other hand, existing methods are tested on state-of-the-art online corpora but are hard to apply to real applications such as telemarketing data. In this paper, we propose an improved chatbot design method using a hybrid of the bag-of-words model and the skip-gram model, based on real telemarketing data. Specifically, we first collect real data in the telemarketing field and perform data cleaning and data classification on the constructed corpus. Second, the word representation adopts the hybrid bag-of-words and skip-gram model. The skip-gram model maps synonyms close together in the vector space. Because the correlation between words is expressed, the amount of information contained in the word vector increases, making up for the shortcomings of using the bag-of-words model alone. Third, we use the term frequency-inverse document frequency (TF-IDF) weighting method to increase the weight of key words, then output the final word expression. Finally, the answer is produced using a hybrid of a retrieval model and a generative model. The retrieval model can accurately answer questions in the field. The generative model supplements it by answering open-domain questions, with the final reply produced through long short-term memory (LSTM) training and prediction. Experimental results show that the hybrid word vector expression model can improve the accuracy of the response and that the whole system can communicate with humans.
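
The TF-IDF weighting step mentioned in this abstract can be sketched in a few lines. The toy corpus, the function name `tf_idf`, and the specific formula variant (relative term frequency times the natural log of inverse document frequency) are assumptions chosen for illustration; the paper may use a different variant.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Per-document TF-IDF weights: (count / doc length) * ln(N / document frequency)."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    weights = []
    for doc in corpus:
        counts = Counter(doc)
        weights.append({
            tok: (count / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, count in counts.items()
        })
    return weights

# Hypothetical tokenized telemarketing utterances.
corpus = [
    ["hello", "plan", "price"],
    ["hello", "plan", "upgrade"],
    ["hello", "refund"],
]
weights = tf_idf(corpus)
for doc_weights in weights:
    print(doc_weights)
```

A token that appears in every utterance, such as "hello" here, receives zero weight, while rarer tokens are weighted up; these weights could then scale the corresponding word vectors before they are combined.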

Sentiment Analysis using Robust Parallel Tri-LSTM Sentence Embedding in Out-of-Vocabulary Word (Out-of-Vocabulary 단어에 강건한 병렬 Tri-LSTM 문장 임베딩을 이용한 감정분석)

  • Lee, Hyun Young;Kang, Seung Shik
    • Smart Media Journal
    • /
    • v.10 no.1
    • /
    • pp.16-24
    • /
    • 2021
  • Existing word embedding methods such as word2vec represent only the words that occur in the raw training corpus as fixed-length vectors in a continuous vector space, so the out-of-vocabulary (OOV) problem often arises in morphologically rich languages. Even for sentence embedding, where the meaning of a sentence is represented as a fixed-length vector by composing the word vectors that constitute it, OOV words make it challenging to represent the sentence meaningfully. In particular, because Korean is an agglutinative language whose morphology combines lexical and grammatical morphemes, handling OOV words is an important factor in improving performance. In this paper, we propose a parallel Tri-LSTM sentence embedding that is robust to the OOV problem by extending the use of the morphological information of words to the sentence level. From a sentiment analysis task on a Korean corpus, we found empirically that the character unit is better than the morpheme unit as an embedding unit for Korean sentence embedding. We achieved 86.17% accuracy on the sentiment analysis task with the parallel bidirectional Tri-LSTM sentence encoder.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been used in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the specific product level. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, data related to product information is collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data of the extracted products is summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. We optimized the training parameters and then applied a vector dimension of 300 and a window size of 15 in further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to KSIC indexes were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be adjusted easily and efficiently according to the purpose of use by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be used in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module could be advanced by imposing a proper order on the preprocessed dataset or by combining another algorithm such as Jaccard similarity with Word2Vec. The product group clustering method could also be changed to other types of unsupervised machine learning algorithm. Our group is currently working on follow-up studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
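
The last two steps of the process described in this abstract, selecting product names by cosine similarity to an index term and summing their sales, can be sketched as follows. The three-dimensional vectors, product names, sales figures, threshold value, and the function `estimate_market_size` are all hypothetical; in the study the vectors come from Word2Vec training with 300 dimensions, which is not reproduced here.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def estimate_market_size(index_vec, product_vecs, sales, threshold):
    """Group products whose similarity to the index term meets the threshold, then sum sales."""
    members = [
        name for name, vec in product_vecs.items()
        if cosine(index_vec, vec) >= threshold
    ]
    return members, sum(sales[name] for name in members)

# Hypothetical embeddings and sales figures, for illustration only.
product_vecs = {
    "led lamp": np.array([0.9, 0.1, 0.0]),
    "desk lamp": np.array([0.8, 0.2, 0.1]),
    "steel pipe": np.array([0.0, 0.1, 0.9]),
}
sales = {"led lamp": 120.0, "desk lamp": 80.0, "steel pipe": 300.0}
index_vec = np.array([1.0, 0.0, 0.0])  # stands in for a KSIC index word vector

members, size = estimate_market_size(index_vec, product_vecs, sales, threshold=0.8)
print(members, size)
```

Raising or lowering `threshold` narrows or widens the product group, which corresponds to the adjustable market category level described in the abstract.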

Geometrical Uniformity For Space-Time Codes (시공간 부호의 기하학적 균일성)

  • 정영석;이재홍
    • Proceedings of the IEEK Conference
    • /
    • 2003.07a
    • /
    • pp.89-92
    • /
    • 2003
  • A geometrically uniform code in an AWGN channel has strong symmetry properties: a) the distance profiles from each codeword in C to all other codewords are the same, and b) all Voronoi regions of codewords in C have the same shape. These properties make the word error probability of geometrically uniform codes independent of the transmitted codeword. In this paper, we extend geometrically uniform codes in the AWGN channel to geometrically uniform codes in the fading channel with multiple transmit antennas.
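
Property a) above can be checked numerically for small signal sets. The sketch below compares distance profiles for a QPSK constellation and a 4-PAM constellation; the choice of these two signal sets and the function names are assumptions for illustration, and the check covers only the AWGN distance-profile property, not the paper's extension to fading channels.

```python
import numpy as np

def distance_profiles(points):
    """Sorted Euclidean distances from each point to every point in the set."""
    diffs = np.abs(points[:, None] - points[None, :])
    return np.sort(diffs, axis=1)

def has_uniform_distance_profile(points):
    """True if every point sees the same multiset of distances."""
    profiles = distance_profiles(points)
    return bool(np.allclose(profiles, profiles[0]))

qpsk = np.exp(1j * np.pi / 2 * np.arange(4))      # four points on the unit circle
pam4 = np.array([-3.0, -1.0, 1.0, 3.0]) + 0j      # four points on a line

qpsk_uniform = has_uniform_distance_profile(qpsk)
pam4_uniform = has_uniform_distance_profile(pam4)
print(qpsk_uniform, pam4_uniform)
```

Equal distance profiles are necessary for geometric uniformity but are only one of the two properties listed, so a passing check here should be read as a consistency test rather than a proof.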
