• Title/Summary/Keyword: index development

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or for broad categories based on classification standards, making it difficult to obtain specific and proper information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than those previously offered. We applied the Word2Vec algorithm, a neural-network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and the product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products is summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training.
We optimized the training parameters and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters in further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson's correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied for the first time to market size estimation, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application since it can resolve unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order to the preprocessed dataset or by combining another algorithm such as Jaccard similarity with Word2Vec.
Also, the method of product group clustering can be changed to other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect that they can further improve the performance of the basic model conceptually proposed in this study.
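
The last two steps of the process described in this abstract (grouping product names by cosine similarity to an index word, then summing sales) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-dimensional vectors, product names, sales figures, and the 0.8 threshold are all hypothetical stand-ins for trained 300-dimensional Word2Vec embeddings and real microdata, and the function names are invented for this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def estimate_market_size(index_vec, products, threshold):
    """Return (member names, summed sales) for one product group.

    products: list of (name, embedding, sales) tuples.
    A product joins the group when its similarity to the index word
    is at least `threshold`.
    """
    members = [(name, sales) for name, vec, sales in products
               if cosine(index_vec, vec) >= threshold]
    return [name for name, _ in members], sum(sales for _, sales in members)

# Hypothetical toy data: embeddings and sales are made up for illustration.
products = [
    ("lithium battery", [0.9, 0.1, 0.0], 120.0),
    ("li-ion cell",     [0.8, 0.2, 0.1], 80.0),
    ("office chair",    [0.0, 0.1, 0.9], 50.0),
]
index_vec = [1.0, 0.0, 0.0]  # stands in for the embedding of one index word

names, size = estimate_market_size(index_vec, products, threshold=0.8)
print(names, size)
```

Raising the threshold narrows the group and lowering it broadens the group, which is the adjustment of market category level that the abstract attributes to the cosine similarity threshold.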

Intake of Snacks, and Perceptions and Use of Food and Nutrition Labels by Middle School Students in Chuncheon Area (춘천지역 중학생들의 간식 섭취 실태와 식품·영양표시에 대한 인식 및 이용실태)

  • Kim, Yoon-Sun;Kim, Bok-Ran
    • Journal of the Korean Society of Food Science and Nutrition / v.41 no.9 / pp.1265-1273 / 2012
  • The purpose of this study was to investigate the BMI, intake of snacks, and perceptions and use of food and nutrition labels by middle school students (144 boys and 189 girls) in the Chuncheon area. The average height and weight of boys were 171.0±6.4 cm and 61.0±11.4 kg, respectively, whereas those of girls were 160.0±4.8 cm and 50.8±6.6 kg, respectively. The average body mass index (BMI) of boys and girls was 20.8±3.3 and 19.8±2.4, respectively (p<0.01). The dietary intake attitude score of girls (34.39±5.66) was higher than that of boys (33.92±5.40) (p<0.05). Subjects bought and ate snacks 1 to 3 times per week (40.2%) by themselves, and the most consumed snacks were cookies (23.1%), instant noodles (16.2%), ice cream (13.2%), and candy and chocolates (13.2%). The most important factor in purchasing snacks was 'taste' (4.49±0.67). When subjects bought processed foods, the rate of reading food labels was 86.6%. The most important item on the food labels was 'expiration date' (42.9%). The degree of reading food labels on processed foods by girls (22.70±5.72) was higher than that of boys (20.96±5.35) (p<0.01). Of the 13.2% of subjects who did not read food labels, half (50.0%) gave lack of interest as the reason. Of the 78.4% of subjects who read nutrition labels, the most important component of the nutrition labels was 'calories' (75.9%). The main reason for reading nutrition labels was 'to control weight' (45.6%). In general, use of food labels correlated positively with dietary intake attitude score (p<0.05) and use of nutrition labels (p<0.01). Using multiple regression analysis, we found that 'usefulness of dietary life' was the most significant variable affecting the importance of food and nutrition labels. Therefore, development of an educational program on food and nutrition labels for adolescents will be effective in improving dietary life.

Cooperation Strategy in the Business Ecosystem and Its Healthiness: Case of Win - Win Growth of Samsung Electronics and Partnering Companies (기업생태계 상생전략과 기업건강성효과: 삼성전자와 협력업체의 상생경영사례를 중심으로)

  • Sung, Changyong;Kim, Ki-Chan;In, Sungyong
    • The Journal of Small Business Innovation / v.19 no.4 / pp.19-39 / 2016
  • With increasing adoption of smart products and complexity, companies have shifted their strategies from stand alone and competitive strategies to business ecosystem oriented and cooperative strategies. The win-win growth of business refers to corporate efforts undertaken by companies to pursue the healthiness of business between conglomerates and partnering companies such as suppliers for mutual prosperity and a long-term corporate soundness based on their business ecosystem and cooperative strategies. This study is designed to validate a theoretical proposition that the win-win growth strategy of Samsung Electronics and cooperative efforts among companies can create a healthy business ecosystem, based on results of case studies and surveys. In this study, a level of global market access of small and mid-sized companies is adopted as the key achievement index. The foreign market entry is considered as one of vulnerabilities in the ecosystem of small and mid-sized enterprises (SMEs). For SMEs, the global market access based on the research and development (R&D) has become the critical component in the process of transforming them into global small giants. The results of case studies and surveys are analyzed mainly based on a model of a virtuous cycle of Creativity, Opportunity, Productivity, and Proactivity (the COPP model) that features the characteristics of the healthiness of a business ecosystem. In the COPP model, a virtuous circle of profits made by the first three factors and Proactivity, which is the manifestation of entrepreneurship that proactively invests and reacts to the changing business environment of the future, enhances the healthiness of a given business ecosystem. With the application of the COPP model, this study finds major achievements of the win-win growth of Samsung Electronics as follows. First, Opportunity plays a role as a parameter in the relations of Creativity, Productivity, and creating profits. 
Namely, as companies export more (with more Opportunity), they are more likely to link their R&D efforts to Productivity and profitability. However, companies that do not export tend to fail to link their R&D investment to profitability. Second, this study finds that companies with huge investment in R&D for the future, which is the result of Proactivity, tend to hold a large number of patents (Creativity). Companies with significant numbers of patents also tend to be large exporters (Opportunity), and companies with a large amount of exports tend to record high profitability (Productivity and profitability), thus forming the virtuous cycle of the COPP model. In addition, to access global markets for sustainable growth, SMEs need to build and strengthen their competitiveness. This study concludes that companies with a high level of proactivity in investing for the future can create a virtuous circle of Creativity, Opportunity, Productivity, and Proactivity, thereby providing a strategic implication that SMEs should invest time and resources in forming such a virtuous cycle, which is a sure way for SMEs to grow into global small giants.

Categorizing Quality Features of Franchisees: In the case of Korean Food Service Industry (프랜차이즈 매장 품질요인의 속성분류: 국내 외식업을 중심으로)

  • Byun, Sook-Eun;Cho, Eun-Seong
    • Journal of Distribution Research / v.16 no.1 / pp.95-115 / 2011
  • Food service is the major part of the franchise business in Korea, accounting for 69.9% of the brands in the market. As the food service industry matures, many franchisees have struggled to survive in the market. In general, consumers have higher expectations of the service quality of franchised outlets compared with that of (non-franchised) independent ones. They also tend to believe that franchisees deliver standardized service at a uniform food price, regardless of their locations. Such beliefs seem to be important reasons that consumers prefer franchised outlets to independent ones. Nevertheless, few studies have examined the impact of the quality features of franchisees on customer satisfaction so far. To this end, this study examined the characteristics of various quality features of franchisees in the food service industry, regarding their relationship with customer satisfaction and dissatisfaction. The quality perception of heavy users was also compared with that of light users in order to find insights for developing differentiated marketing strategies for the two segments. Customer satisfaction has been understood as a one-dimensional construct, while recent studies argue for the two-dimensional nature of the construct. In this regard, Kano et al. (1984) suggested categorizing the quality features of a product or service into five types, based on their relation to customer satisfaction and dissatisfaction: Must-be quality, Attractive quality, One-dimensional quality, Indifferent quality, and Reverse quality. According to the Kano model, customers are more dissatisfied when Must-be quality (M) features are not fulfilled, but their satisfaction does not rise above neutral no matter how fully the quality is fulfilled. In comparison, customers are more satisfied with a full provision of Attractive quality (A) but manage to accept its dysfunction. One-dimensional quality (O) results in satisfaction when fulfilled and dissatisfaction when not fulfilled.
For Indifferent quality (I), its presence or absence influences neither customer satisfaction nor dissatisfaction. Lastly, Reverse quality (R) refers to features whose high degree of achievement results in customer dissatisfaction rather than satisfaction. Meanwhile, the basic guidelines of the Kano model have a limitation in that the quality type of each feature is simply determined by calculating the mode statistics. In order to overcome this limitation, the relative importance of each feature for customer satisfaction (Better value; b) and dissatisfaction (Worse value; w) was calculated following the formulas below (Timko, 1993). The Better value indicates how much customer satisfaction is increased by providing the quality feature in question. In contrast, the Worse value indicates how much customer dissatisfaction is decreased by providing the quality feature. $\text{Better} = (A + O)/(A + O + M + I)$, $\text{Worse} = -(O + M)/(A + O + M + I)$. An online survey was performed in order to understand the nature of the quality features of franchisees in the food service industry by applying the Kano model. A total of twenty quality features (see Table 2) were identified as the result of a literature review on franchise business and a pre-test with fifty college students in Seoul. The potential respondents of our main survey were limited to customers who had visited more than two restaurants/stores of the same franchise brand. Survey invitation e-mails were sent out to the panels of a market research company, and a total of 257 responses were used for analysis. Following the guidelines of the Kano model, each of the twenty quality features was classified into one of the five types based on customers' responses to a set of questions: "(1) how do you feel if the following quality feature is fulfilled in the franchise restaurant that you visit," and "(2) how do you feel if the following quality feature is not fulfilled in the franchise restaurant that you visit."
The analyses revealed that customers' dissatisfaction with franchisees is commonly associated with a poor level of cleanliness of the store (w=-0.872), kindness of the staff (w=-0.890), conveniences such as parking lot and restroom (w=-0.669), and expertise of the staff (w=-0.492). Such quality features were categorized as Must-be quality in this study. While standardization or uniformity across franchisees has been emphasized in franchise business, this study found that consumers are interested only in uniformity of price across franchisees (w=-0.608), but not in standardization of menu items, interior designs, customer service procedures, and food tastes. Customers appeared to be more satisfied when the franchise brand has promotional events such as giveaways (b=0.767), good accessibility (b=0.699), customer loyalty programs (b=0.659), award-winning history (b=0.641), and outlets in the overseas market (b=0.506). The results are summarized in a matrix form in Table 1. The Better (b) and Worse (w) indices indicate the relative importance of each quality feature for customer satisfaction and dissatisfaction, respectively. Meanwhile, there were differences in perceiving the quality features between light users and heavy users of any specific franchise brand in the food service industry. Expertise of the staff was labeled as Must-be quality for heavy users but Indifferent quality for light users. Light users seemed indifferent to overseas expansion of the brand and offering new menu items on a regular basis, while heavy users appeared to perceive them as Attractive quality. Such differences may come from their different levels of involvement when they eat out. The results are shown in Table 2. The findings of this study help practitioners understand the quality features they need to focus on to strengthen their competitive power in the food service market. Above all, removing the factors that cause customer dissatisfaction seems to be the most critical for franchisees.
To retain loyal customers of the franchise brand, franchisors are also advised to invest resources in the development of new menu items as well as training programs for the staff. Lastly, if resources allow, promotional events, loyalty programs, overseas expansion, and award-winning history can be considered as tools for attracting more customers to the business.
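
The Better and Worse values quoted in this abstract follow directly from the response counts per Kano category. A minimal sketch of that arithmetic is given below; the function name and the example counts are hypothetical, and only the two formulas attributed to Timko (1993) come from the abstract.

```python
def kano_coefficients(a, o, m, i):
    """Compute (Better, Worse) for one quality feature.

    a, o, m, i: number of respondents who classified the feature as
    Attractive, One-dimensional, Must-be, and Indifferent.
    Better = (A + O) / (A + O + M + I)
    Worse  = -(O + M) / (A + O + M + I)
    """
    total = a + o + m + i
    if total == 0:
        raise ValueError("at least one response is required")
    better = (a + o) / total
    worse = -(o + m) / total
    return better, worse

# Hypothetical counts for a single feature (not taken from the study).
better, worse = kano_coefficients(a=30, o=20, m=40, i=10)
print(f"Better = {better:.3f}, Worse = {worse:.3f}")
```

Better lies between 0 and 1 and Worse between -1 and 0, so a feature with a Worse value near -1, such as staff kindness (w=-0.890) in the abstract, is one whose absence dissatisfies nearly all respondents.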

Studies on Development of Prediction Model of Landslide Hazard and Its Utilization (산지사면(山地斜面)의 붕괴위험도(崩壞危險度) 예측(豫測)모델의 개발(開發) 및 실용화(實用化) 방안(方案))

  • Ma, Ho-Seop
    • Journal of Korean Society of Forest Science / v.83 no.2 / pp.175-190 / 1994
  • To obtain fundamental information for predicting landslide hazard, both forest and site factors affecting slope stability were investigated in many areas of active landslides. Twelve descriptors were identified and quantified to develop the prediction model by multivariate statistical analysis. The main results can be summarized as follows: The main factors influencing large-scale landslides were, in order, precipitation, age group of forest trees, altitude, soil texture, slope gradient, position of slope, vegetation, stream order, vertical slope, bed rock, soil depth, and aspect. According to the partial correlation coefficients, the order was age group of forest trees, precipitation, soil texture, bed rock, slope gradient, position of slope, altitude, vertical slope, stream order, vegetation, soil depth, and aspect. The main factors influencing landslide occurrence were, in order, age group of forest trees, altitude, soil texture, slope gradient, precipitation, vertical slope, stream order, bed rock, and soil depth. Two prediction models were developed, by magnitude and by frequency of landslide. In particular, the prediction method by landslide magnitude was converted to a score for convenience of use. If the total score of the various factors exceeds 9.1636, the area is evaluated as very dangerous. The mean scores of the landslide and non-landslide groups were 0.1977 and -0.1977, and the variances were 0.1100 and 0.1250, respectively. The boundary value between the two groups related to slope stability was -0.02, and its predicted rate of discrimination was 73%. In the score range of the degree of landslide hazard based on the boundary value of discrimination, class A was above 0.3132, class B was 0.3132 to -0.1050, class C was -0.1050 to -0.4196, and class D was below -0.4195. The rank of landslide hazard could thus be divided into classes A, B, C, and D by the boundary values.
By number of slopes, class A had 68, class B had 115, class C had 65, and class D had 52. The rate of landslide occurrence in class A and class B was predicted at a high rate of 83%. Therefore, dangerous areas selected by the landslide prediction method could be mapped for land-use planning and as a criterion for disaster districts. It could also be applied as an administrative index for disaster prevention.
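
The class boundaries reported in this abstract amount to a simple threshold lookup on the discriminant score. The sketch below illustrates that lookup only. It assumes the lower edge of class C is -0.4196 (the abstract gives both -0.4196 and -0.4195), and the choice of which class a score exactly on a boundary falls into is also an assumption; the function name is hypothetical.

```python
# Class boundaries taken from the abstract; boundary inclusivity is assumed.
A_LOWER = 0.3132
B_LOWER = -0.1050
C_LOWER = -0.4196  # assumption: the abstract also gives -0.4195 for class D

def hazard_class(score):
    """Map a discriminant score to a landslide hazard class A-D."""
    if score > A_LOWER:
        return "A"
    if score > B_LOWER:
        return "B"
    if score > C_LOWER:
        return "C"
    return "D"

# Slope counts per class as reported in the abstract.
reported_counts = {"A": 68, "B": 115, "C": 65, "D": 52}
total_slopes = sum(reported_counts.values())

for s in (0.5, 0.0, -0.2, -0.9):
    print(s, hazard_class(s))
print("slopes classified:", total_slopes)
```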

A Comparison of Body Image and Dietary Behavior in Middle and High School girls in Gyeongbuk Area (경북 일부지역 여자 중·고등학생의 체형인식도 및 식생활 행동 비교)

  • Kim, Hye-Jin;Lee, Kyung-A
    • Korean journal of food and cookery science / v.31 no.4 / pp.497-504 / 2015
  • The purpose of this study was to compare body image and dietary behavior in middle and high school girls in the Gyeongbuk area in September, 2014. Data were collected from a total of 194 middle school and 170 high school girls through a self-reported questionnaire. A total of 364 completed questionnaires were collected and used for the final analysis. The mean body mass index (BMI) of respondents was normal at 21.29. Generally, high school girls had greater height, weight and BMI than middle school girls. Height (p<0.001) and weight (p<0.001) were significantly different, while BMI was not. The ratio of students who perceived their body size as 'Fat' was significantly (p<0.05) higher in high school (43.9%) than in middle school (31.6%). The ratio of dissatisfaction with their current body image was significantly (p<0.001) higher in high school girls (64.1%) than in middle school girls (44.0%). Among respondents who perceived their body size as 'Fat', many high school girls actually (53.3%) had normal or low body weight and this was significantly (p<0.001) higher than in middle school girls (39.3%). Experience with weight control was higher in high school girls (67.3%) than in middle school girls (60.6%), but there was no significant difference. Regarding the weight control methods, respondents selected 'combination diet and exercise' (22.2%), 'diet control' (20.9%), 'exercise' (18.7%), and 'reduce snacks and midnight snack' (17.4%). 15 items under obesity-related dietary behavior were measured with 5-point scales and lower scores indicated obesity diet behavior. The mean score for all respondents was 3.19/5.00, and high school girls (3.06) scored significantly (p<0.001) higher than middle school girls (3.33). Our study suggests that the development of effective nutrition and health education for diet control is crucial for adolescent girls. This study will enable educators to plan more effective strategies to improve the dietary knowledge of adolescent girls.

An Ontology Model for Public Service Export Platform (공공 서비스 수출 플랫폼을 위한 온톨로지 모형)

  • Lee, Gang-Won;Park, Sei-Kwon;Ryu, Seung-Wan;Shin, Dong-Cheon
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.149-161 / 2014
  • The export of domestic public services to overseas markets contains many potential obstacles, stemming from different export procedures, the target services, and socio-economic environments. In order to alleviate these problems, the business incubation platform as an open business ecosystem can be a powerful instrument to support the decisions taken by participants and stakeholders. In this paper, we propose an ontology model and its implementation processes for the business incubation platform with an open and pervasive architecture to support public service exports. For the conceptual model of platform ontology, export case studies are used for requirements analysis. The conceptual model shows the basic structure, with vocabulary and its meaning, the relationship between ontologies, and key attributes. For the implementation and test of the ontology model, the logical structure is edited using the Protégé editor. The core engine of the business incubation platform is the simulator module, where the various contexts of export businesses should be captured, defined, and shared with other modules through ontologies. It is well known that an ontology, with which concepts and their relationships are represented using a shared vocabulary, is an efficient and effective tool for organizing meta-information to develop structural frameworks in a particular domain. The proposed model consists of five ontologies derived from a requirements survey of major stakeholders and their operational scenarios: service, requirements, environment, enterprise, and country. The service ontology contains several components that can find and categorize public services through a case analysis of the public service export. Key attributes of the service ontology are composed of categories including objective, requirements, activity, and service.
The objective category, which has sub-attributes including operational body (organization) and user, acts as a reference to search and classify public services. The requirements category relates to the functional needs at a particular phase of system (service) design or operation. Sub-attributes of requirements are user, application, platform, architecture, and social overhead. The activity category represents business processes during the operation and maintenance phase. The activity category also has sub-attributes including facility, software, and project unit. The service category, with sub-attributes such as target, time, and place, acts as a reference to sort and classify the public services. The requirements ontology is derived from the basic and common components of public services and target countries. The key attributes of the requirements ontology are business, technology, and constraints. Business requirements represent the needs of processes and activities for public service export; technology represents the technological requirements for the operation of public services; and constraints represent the business law, regulations, or cultural characteristics of the target country. The environment ontology is derived from case studies of target countries for public service operation. Key attributes of the environment ontology are user, requirements, and activity. A user includes stakeholders in public services, from citizens to operators and managers; the requirements attribute represents the managerial and physical needs during operation; the activity attribute represents business processes in detail. The enterprise ontology is introduced from a previous study, and its attributes are activity, organization, strategy, marketing, and time. 
The country ontology is derived from the demographic and geopolitical analysis of the target country, and its key attributes are economy, social infrastructure, law, regulation, customs, population, location, and development strategies. The priority list of target services for a certain country and/or the priority list of target countries for a certain public service are generated by a matching algorithm. These lists are used as input seeds to simulate the consortium partners and the government's policies and programs. In the simulation, the environmental differences between Korea and the target country can be customized through a gap analysis and work-flow optimization process. When the process gap between Korea and the target country is too large for a single corporation to cover, a consortium is considered an alternative choice, and various alternatives are derived from the capability index of enterprises. For financial packages, a mix of various foreign aid funds can be simulated during this stage. It is expected that the proposed ontology model and the business incubation platform can be used by various participants in the public service export market. It could be especially beneficial to small and medium businesses that have relatively fewer resources and less experience with public service export. We also expect that the open and pervasive service architecture in a digital business ecosystem will help stakeholders find new opportunities through information sharing and collaboration on business processes.

Geochemical Equilibria and Kinetics of the Formation of Brown-Colored Suspended/Precipitated Matter in Groundwater: Suggestion to Proper Pumping and Turbidity Treatment Methods (지하수내 갈색 부유/침전 물질의 생성 반응에 관한 평형 및 반응속도론적 연구: 적정 양수 기법 및 탁도 제거 방안에 대한 제안)

  • 채기탁;윤성택;염승준;김남진;민중혁
    • Journal of the Korean Society of Groundwater Environment / v.7 no.3 / pp.103-115 / 2000
  • The formation of brown-colored precipitates is one of the serious problems frequently encountered in the development and supply of groundwater in Korea, because it causes the water to exceed the drinking water standard in terms of color, taste, turbidity, and dissolved iron concentration, and often results in scaling problems within the water supply system. In groundwaters from the Pajoo area, brown precipitates typically form in a few hours after pumping-out. In this paper we examine the process of the brown precipitates' formation using equilibrium thermodynamic and kinetic approaches, in order to understand the origin and geochemical pathway of the generation of turbidity in groundwater. The results of this study are used to suggest not only the proper pumping technique to minimize the formation of precipitates but also the optimal design of water treatment methods to improve the water quality. The bed-rock groundwater in the Pajoo area belongs to the Ca-$HCO_3$ type that evolved through water/rock (gneiss) interaction. Based on SEM-EDS and XRD analyses, the precipitates are identified as amorphous, Fe-bearing oxides or hydroxides. By the use of multi-step filtration with pore sizes of 6, 4, 1, 0.45 and 0.2 $\mu$m, the precipitates mostly fall in the colloidal size (1 to 0.45 $\mu$m) but are concentrated (about 81%) in the range of 1 to 6 $\mu$m in terms of mass (weight) distribution. Large amounts of dissolved iron possibly originated from dissolution of clinochlore in cataclasite, which contains high amounts of Fe (up to 3 wt.%). The calculation of the saturation index (using the computer code PHREEQC), as well as the examination of pH-Eh stability relations, also indicates that the final precipitates are Fe-oxy-hydroxide formed by the change of water chemistry (mainly oxidation) due to exposure to oxygen during the pumping-out of Fe(II)-bearing, reduced groundwater.
After pumping-out, the groundwater shows progressive decreases of pH, DO and alkalinity with elapsed time. However, turbidity increases and then decreases with time. The decrease of dissolved Fe concentration as a function of elapsed time after pumping-out is expressed by the regression equation $\mathrm{Fe(II)} = 10.1\,\exp(-0.0009t)$. The oxidation reaction due to the influx of free oxygen during the pumping and storage of groundwater results in the formation of brown precipitates, which depends on time, $P_{O_2}$ and pH. In order to obtain drinkable water quality, therefore, the precipitates should be removed by filtering after stepwise storage and aeration in tanks with sufficient volume for sufficient time. Particle size distribution data also suggest that stepwise filtration would be cost-effective. To minimize scaling within wells, continuous (if possible) pumping within the optimum pumping rate is recommended because this technique will be most effective for minimizing the mixing between deep Fe(II)-rich water and shallow $O_2$-rich water. The simultaneous pumping of shallow $O_2$-rich water in different wells is also recommended.
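
The regression reported in this abstract is a first-order decay, so the time needed for dissolved Fe(II) to fall to a given fraction of its initial value follows from the rate constant alone. The sketch below works that out. It assumes the pre-exponential coefficient reads 10.1 (it appears as "10.l" in the listing), and it leaves the units of t and of concentration unspecified because the abstract does not state them; the function names are hypothetical.

```python
import math

FE0 = 10.1   # assumed reading of the garbled coefficient "10.l"
K = 0.0009   # decay constant from the abstract, per unit of t (units not stated)

def fe2(t):
    """Dissolved Fe(II) at elapsed time t, per Fe(II) = FE0 * exp(-K * t)."""
    return FE0 * math.exp(-K * t)

def time_to_fraction(fraction):
    """Elapsed time at which Fe(II) has dropped to `fraction` of its initial value."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log(fraction) / K

half_life = time_to_fraction(0.5)   # ln(2) / K
t_10_percent = time_to_fraction(0.1)
print(f"half-life: {half_life:.0f} time units")
print(f"time to 10% remaining: {t_10_percent:.0f} time units")
```

Under these assumptions the half-life is ln(2)/0.0009, roughly 770 time units, which gives a feel for the storage and aeration time the abstract recommends before filtration.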

Annual Reproductive Cycle and Changes in Plasma Levels of Sex Steroid Hormones of the Female Korean Dark Sleeper, Odontobutis platycephala (Iwata et Jeon) (동사리, Odontobutis platycephala (Iwata et Jeon) 암컷의 생식주기와 혈중 성스테로이드 호르몬의 변화)

  • LEE Won-Kyo
    • Korean Journal of Fisheries and Aquatic Sciences / v.31 no.4 / pp.599-607 / 1998
  • To clarify the annual reproductive cycle of the Korean dark sleeper, Odontobutis platycephala (Iwata et Jeon), we examined seasonal changes in the gonadosomatic index (GSI), the proportional frequency of oocyte developmental stages in the ovary, and changes in sex steroid hormone levels in the blood from December 1995 to November 1997. In July and August, GSI was 0.35 to 0.72 and most oocytes in the ovary were at the chromatin-nucleolus stage and perinucleolar stage (proportional frequency: 87% to 96%). In September, GSI was 1.20 ± 0.12; some oocytes in the ovary were at the yolk vesicle stage (proportional frequency: 22.8%), and the vitellogenic stage appeared very rarely (proportional frequency: 2.2%). GSI increased gradually from October and reached 4.59 ± 0.61 in December. During this period, oocytes at the vitellogenic stage increased slightly (proportional frequency in December: 22.1%). In January, GSI was 4.32 ± 0.72, but the proportional frequency of oocytes at the vitellogenic stage increased (proportional frequency: 51.2%). From February, GSI increased sharply and reached 10.51 ± 1.04 in March, the highest value of the year, and the proportional frequency of oocytes at the vitellogenic stage also reached its highest level (proportional frequency: 60%). From April, GSI gradually decreased and fell to 1.11 ± 0.35 in June. During this period, the proportional frequency of mature oocytes was the highest of the year in April (proportional frequency of the mature oocyte stage: 40% in April, 12% in May, 5% in June), and atretic ovarian follicles appeared. The blood level of estradiol-17β (E2), which stimulates the hepatic synthesis and secretion of vitellogenin, was 0.84 ± 0.20 ng/mL in August and did not change thereafter until December.
From January, it increased sharply and reached its annual peak of $2.85{\pm}0.35\;ng/m{\ell}$ in March, then fell to $0.14{\pm}0.02\;ng/m{\ell}$ in July (P<0.05). 17$\alpha$-hydroxyprogesterone (17$\alpha$-OHP) peaked at $13.37{\pm}0.52\;ng/m{\ell}$ in March and showed no significant changes in other periods (below $3\;ng/m{\ell}$, P<0.05). 17$\alpha$,20$\beta$-dihydroxy-4-pregnen-3-one (17$\alpha$,20$\beta$-P), known as the final maturation-inducing hormone in teleosts, was $0.74{\pm}0.09\;ng/m{\ell}$ in April and $0.54{\pm}0.07\;ng/m{\ell}$ in May, and showed no significant changes in other periods (below $0.26\;ng/m{\ell}$, P<0.05). Taken together, these results divide the annual reproductive cycle of O. platycephala into four periods: 1) a ripe and spawning period from April to June, with main spawning from April to May; 2) a resting period from July to August; 3) a growing period from September to December; and 4) a maturing period from January to March. The results also showed that changes in blood sex steroid hormone levels play an important role in the annual reproductive cycle of O. platycephala.
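
The index and the four-period division described in this abstract can be sketched in a few lines of Python. The abstract does not give the GSI formula, so the sketch assumes the conventional definition (gonad weight as a percentage of body weight). The function names and the example weights are hypothetical; only the month-to-period mapping comes from the abstract.

```python
# Minimal sketch: GSI under the conventional definition (an assumption;
# the abstract does not state the formula) and the month-to-period
# mapping reported for O. platycephala.

def gonadosomatic_index(gonad_weight_g: float, body_weight_g: float) -> float:
    """Return gonad weight as a percentage of body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * gonad_weight_g / body_weight_g

# Periods as listed in the abstract, keyed by calendar month number.
PERIODS = {
    "ripe and spawning": (4, 5, 6),
    "resting": (7, 8),
    "growing": (9, 10, 11, 12),
    "maturing": (1, 2, 3),
}

def reproductive_period(month: int) -> str:
    """Map a calendar month (1-12) to the reproductive period in the abstract."""
    for name, months in PERIODS.items():
        if month in months:
            return name
    raise ValueError(f"month must be in 1..12, got {month}")

if __name__ == "__main__":
    # Hypothetical fish: 20 g body weight with 2.1 g ovaries, sampled in March.
    gsi = gonadosomatic_index(2.1, 20.0)
    print(f"GSI = {gsi:.2f}, period = {reproductive_period(3)}")
```

Keeping the month mapping in a single table makes it easy to compare against the reported GSI and hormone peaks, which both fall in March, the last month of the maturing period.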


Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.111-136
    • /
    • 2018
  • In this paper, we propose a methodology for extracting answers to queries from various types of unstructured documents collected from multiple web sources, in order to expand a knowledge base. The proposed methodology consists of the following steps: 1) collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries separated into "subject-predicate" form, and classify the suitable documents; 2) determine whether each sentence is suitable for information extraction and derive a confidence; 3) based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed system showed a higher performance index than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even across various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information effectively from various types of unstructured documents compared with the baseline model. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data.
In addition, by predicting the suitability of documents and sentences before the extraction step, this study can prevent unnecessary extraction attempts on documents that do not contain the answer. It is meaningful that we provide a method by which precision can be maintained even in an actual web environment. Information extraction for knowledge base expansion cannot guarantee that a document contains the correct answer, because it targets unstructured documents on the real web. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and extraction can fail when morphological analysis is not performed properly. To improve extraction results, a more advanced morpheme analyzer needs to be developed. The second concerns entity ambiguity. The system cannot distinguish entities that share a name but refer to different things. If several people with the same name appear in the news, the system may not extract information about the one intended by the query.
Future research needs measures to disambiguate people with the same name. The third concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system, and built an evaluation data set of 2,800 documents (400 questions × 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether each document contains a correct answer. To ensure the external validity of the study, it is desirable to evaluate the system on more queries, but this is a costly activity that must be done manually. Future research should therefore evaluate the system on a larger query set. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, so that results can be evaluated more objectively.
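
The three-step flow with suitability gating described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the scoring callables stand in for their classifiers and bi-directional LSTM-CRF tagger, the thresholds are arbitrary, the naive split on periods is a placeholder for real sentence segmentation, and multiplying the three scores into an overall confidence is an assumption, since the abstract does not say how the confidences are combined. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

@dataclass
class Extraction:
    answer: str
    confidence: float  # overall confidence (product of scores: an assumption)

def extract_with_gating(
    documents: Iterable[str],
    doc_score: Callable[[str], float],        # hypothetical document suitability model
    sentence_score: Callable[[str], float],   # hypothetical sentence suitability model
    extract: Callable[[str], Optional[Tuple[str, float]]],  # hypothetical tagger
    doc_threshold: float = 0.5,               # arbitrary illustrative threshold
    sentence_threshold: float = 0.5,          # arbitrary illustrative threshold
) -> Optional[Extraction]:
    """Return the highest-confidence extraction, or None if nothing is suitable."""
    best: Optional[Extraction] = None
    for doc in documents:
        d = doc_score(doc)
        if d < doc_threshold:
            continue  # skip documents unlikely to contain the answer
        # Naive sentence split on periods; a placeholder for real segmentation.
        for sentence in (s.strip() for s in doc.split(".")):
            if not sentence:
                continue
            s = sentence_score(sentence)
            if s < sentence_threshold:
                continue  # skip sentences unsuitable for extraction
            result = extract(sentence)
            if result is None:
                continue
            answer, e = result
            overall = d * s * e
            if best is None or overall > best.confidence:
                best = Extraction(answer, overall)
    return best
```

Returning `None` when no document or sentence passes the gates reflects the point made in the abstract: the system declines to answer rather than forcing an extraction from a document that does not contain one, which is what protects precision on the open web.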