Title/Summary/Keyword: Extraction Time

An Analysis of IT Trends Using Tweet Data (트윗 데이터를 활용한 IT 트렌드 분석)

  • Yi, Jin Baek;Lee, Choong Kwon;Cha, Kyung Jin
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.143-159 / 2015
  • Predicting IT trends has long been an important subject for information systems research. IT trend prediction makes it possible to recognize emerging areas of innovation and to allocate budgets in preparation for rapidly changing technological trends. Toward the end of each year, various domestic and global organizations predict and announce IT trends for the following year. For example, Gartner predicts the top 10 IT trends for the next year, and these predictions shape the basic assumptions of IT and industry leaders and organizations about technology and the future of IT; yet the accuracy of these reports is difficult to verify. Social media data can be a useful tool for verifying that accuracy. As social media services have gained in popularity, they are used in a variety of ways, from posting about daily life to keeping up to date with news and trends. In recent years, social media activity in Korea has reached unprecedented levels. Hundreds of millions of users now participate in online social networks and share their opinions and thoughts with colleagues and friends. In particular, Twitter is currently the major microblog service; its defining feature, the 'tweet', lets users report their current thoughts and actions, comment on news, and engage in discussions. We chose tweet data for our analysis of IT trends because it not only produces massive unstructured textual data in real time but also serves as an influential channel of opinion leadership on technology. Previous studies found that tweet data provides useful information and detects societal trends effectively; these studies also showed that Twitter can track issues faster than other media such as newspapers. This study therefore investigates how frequently the IT trends that public organizations predict for the following year are mentioned on social network services such as Twitter.
IT trend predictions for 2013, announced near the end of 2012 by two domestic organizations, the National IT Industry Promotion Agency (NIPA) and the National Information Society Agency (NIA), were used as the basis for this research. The present study analyzes Twitter data generated in Seoul, Korea, and compares it with the two organizations' predictions to identify differences. Twitter data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. To overcome these challenges, we used SAS Information Retrieval Studio (IRS) to capture trends from the big streaming datasets of Twitter in real time. The system offers a framework for crawling, normalizing, analyzing, indexing, and searching tweet data. We crawled the entire Twitter sphere in the Seoul area and obtained 21,589 tweets from 2013 to review how frequently the IT trend topics announced by the two organizations were mentioned by people in Seoul. The results show that most of the IT trends predicted by NIPA and NIA were frequently mentioned on Twitter, except for topics such as 'new types of security threat', 'green IT', and 'next-generation semiconductor'; these topics are non-generalized compound words, so they may appear on Twitter phrased in other words. To determine whether IT trend tweets from Korea are related to the following year's IT trends in the real world, we compared Twitter's trending topics with those in Nara Market, Korea's nationwide web-based e-procurement system, which handles the entire procurement process of all public organizations in Korea. The correlation analysis shows that tweet frequencies on the IT trend topics predicted by NIPA and NIA are significantly correlated with the frequencies of IT topics mentioned in Nara Market project announcements in 2012 and 2013.
The main contributions of our research are the following: i) the IT topic predictions announced by NIPA and NIA can provide an effective guideline for IT professionals and researchers in Korea who are looking for verified IT topic trends for the following year; ii) researchers can use Twitter to gather useful ideas for detecting and predicting dynamic trends in technological and social issues.
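The correlation check described above, between tweet mentions of predicted IT topics and mentions in procurement announcements, can be sketched in a few lines of Python. The per-topic counts below are hypothetical illustrations, not the study's data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical counts per predicted IT trend topic: tweet mentions vs.
# Nara Market project-announcement mentions (illustrative numbers only).
tweet_counts = [1200, 950, 430, 310, 150]
nara_counts = [85, 70, 40, 22, 9]

r = pearson_r(tweet_counts, nara_counts)  # strongly positive for this data
```

A significance test on `r` (e.g. a t-test with n-2 degrees of freedom) would complete the analysis the abstract describes.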

Effects of Soil Temperature on Biodegradation Rate of Diesel Compounds from a Field Pilot Test Using Hot Air Injection Process (고온공기주입 공법 적용시 지중온도가 생분해속도에 미치는 영향)

  • Park Gi-Ho;Shin Hang-Sik;Park Min-Ho;Hong Seung-Mo;Ko Seok-Oh
    • Journal of Soil and Groundwater Environment / v.10 no.4 / pp.45-53 / 2005
  • The objective of this study is to evaluate the effects of changes in soil temperature on the biodegradation rate of diesel compounds in a field pilot test using a hot air injection process. The total remediation time was estimated from the in-situ biodegradation rate and the temperature for optimum biodegradation. All tests were conducted by measuring in-situ respiration rates approximately every 10 days in a highly contaminated area where an accidental diesel release had occurred. The applied remediation method was a hot air injection/extraction process to volatilize and extract diesel compounds, followed by a bioremediation process to degrade the residual diesel in soils. The oxygen consumption rate varied from 2.2 to 46.3%/day over the range of 26 to 60°C, and the maximum O2 consumption rate was observed at 32.0°C. The zero-order biodegradation rate estimated from the oxygen consumption rates varied from 6.5 to 21.3 mg/kg-day, and the maximum biodegradation rate was likewise observed at 32°C; outside this temperature the values decreased. The first-order kinetic constants (k) estimated from the periodically measured in-situ respiration rates were 0.0027, 0.0013, and 0.0006 d⁻¹ at 32.8, 41.1, and 52.7°C, respectively. The estimated remediation time was 2 to 9 years, provided that the final TPH concentration in soils was set to 870 mg/kg.
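The remediation-time estimate follows directly from the first-order model C(t) = C0·e^(-kt), i.e. t = ln(C0/Cf)/k. A minimal sketch using the rate constants from the abstract (the initial TPH concentration is an assumed illustrative value, not reported above):

```python
import math

def remediation_time_days(c0, cf, k):
    """Days to decay from concentration c0 to cf under first-order kinetics,
    from C(t) = c0 * exp(-k * t) solved for t."""
    return math.log(c0 / cf) / k

c0 = 5000.0  # assumed initial TPH (mg/kg); illustrative, not from the study
cf = 870.0   # final TPH target from the abstract (mg/kg)

# Field-measured rate constants (d^-1) at 32.8, 41.1, and 52.7 degrees C.
for k in (0.0027, 0.0013, 0.0006):
    t_years = remediation_time_days(c0, cf, k) / 365
```

Under this assumed starting concentration the three constants give roughly 1.8 to 8 years, consistent with the 2-to-9-year range the abstract reports.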

Effect of antioxidation and antibacterial activity on crude extract and Characterization of American Cockroaches (Periplaneta americana L.) in Korea (국내 서식 미국바퀴(Periplaneta americana L.)의 특성 및 추출물의 항산화·항균 효과)

  • Kim, Jung-Eun;Kim, Seon-Gon;Kang, Sung-Ju;Kim, Chun-Sung;Choi, Yong-Soo
    • Journal of Sericultural and Entomological Science / v.53 no.2 / pp.135-142 / 2015
  • The American cockroach, Periplaneta americana L., is one of the most important pest species worldwide and a public health problem. We determined the life cycle of P. americana and prepared crude extracts from the cockroaches using chemical reagents. In a radial diffusion assay, the crude extract showed antibacterial activity against a gram-negative bacterium (Pseudomonas aeruginosa, 6.44±1.03 mm), a gram-positive bacterium (Bacillus subtilis, 1.88±0.40 mm), and a fungus (Candida albicans, 5.61±0.57 mm). We also analyzed the up-regulation of glutathione-S-transferases (GSTs), indicating that antioxidant proteins of various classes are simultaneously expressed in a single insect upon infection or injury. The GST gene from P. americana was cloned and sequenced, and its expression was measured by real-time PCR (polymerase chain reaction).

Improvement of Radiosynthesis Yield of [11C]acetate ([11C]아세트산의 방사화학적 수율 증가를 위한 연구)

  • Park, Jun Young;Son, Jeongmin
    • The Korean Journal of Nuclear Medicine Technology / v.22 no.2 / pp.74-78 / 2018
  • Purpose: [11C]acetate has proved useful in assessing myocardial oxygen metabolism and in detecting various malignancies, including prostate cancer, hepatocellular carcinoma, renal cell carcinoma, and brain tumors. The purpose of this study was to improve the radiosynthesis yield of [11C]acetate on an automated radiosynthesis module. Materials and Methods: [11C]acetate was prepared by carboxylation of a Grignard reagent, methylmagnesium chloride, with [11C]CO2 gas, followed by hydrolysis with 1 mM acetic acid and purification using solid-phase extraction cartridges. The effects of reaction temperature (0°C, -10°C, and -55°C) and cyclotron beam time (10, 15, 20, and 25 min) on the radiosynthesis yield were investigated for the [11C]acetate labeling reaction. Results: The maximum radiosynthesis yield was obtained at a reaction temperature of -10°C; the radioactivity of [11C]acetate acquired at -10°C was 2.4 times higher than that acquired at -55°C. The radiosynthesis yield of [11C]acetate increased with increasing cyclotron beam time. Conclusion: This study shows that the radiosynthesis yield of [11C]acetate depends strongly on reaction temperature. The best yield was obtained by reacting the Grignard reagent with [11C]CO2 at -10°C. These radiolabeling conditions should be suitable for routine clinical application.

A Study on the Effect of Improving Permeability by Injecting a Soil Remediation Agent in the In-situ Remediation Method Using Plasma Blasting, Pneumatic Fracturing, and Vacuum Suction Method (플라즈마 블라스팅, 공압파쇄, 진공추출이 활용된 지중 토양정화공법의 정화제 주입에 따른 투수성 개선 연구)

  • Geun-Chun Lee;Jae-Yong Song;Cha-Won Kang;Hyun-Shic Jang;Bo-An Jang;Yu-Chul Park
    • The Journal of Engineering Geology / v.33 no.3 / pp.371-388 / 2023
  • A stratum with a complex composition and a distributed low-permeability soil layer is difficult to remediate quickly because soil remediation does not proceed easily. For efficient purification, permeability should be improved and the soil remediation agent (H2O2) should be injected into the contaminated section so that it makes sufficient contact with the TPH (total petroleum hydrocarbons). This study analyzed a method for crack formation and effective delivery of the soil remediation agent based on pneumatic fracturing, plasma blasting, and vacuum suction (the PPV method) and compared its improvement effect with that of chemical oxidation. A demonstration test confirmed the effective delivery of the soil remediation agent to a site contaminated with TPH. The injection amount and injection time were monitored to calculate the delivery characteristics and the range of influence, and electrical resistivity surveying qualitatively confirmed changes in the underground environment. Permeability tests were also used to evaluate and compare the permeability changes for each method. The amount of soil remediation agent injected increased by about 4.74 to 7.48 times in the experimental group (PPV method) compared with the control group (chemical oxidation), and the PPV method allowed injection rates per unit time (L/min) about 5.00 to 7.54 times higher than the control method. Electrical resistivity measurements indicated that in the PPV method the diffusion of H2O2 and other fluids lowered the resistivity: the horizontal change ratio between the injection well and the extraction well showed resistivity decreases of about 1.12 to 2.38 times. Quantitative evaluation of hydraulic conductivity at the end of the test found that the control group retained 21.1% of its initial hydraulic conductivity, whereas the experimental group retained 81.3% of its initial value, close to the initial permeability coefficient. Radii of influence calculated from the survey results showed that the PPV method improved on the control group by 220% on average.
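One conventional way to quantify the hydraulic-conductivity changes reported above is a falling-head permeability test, k = (aL/At)·ln(h1/h2). The sketch below uses assumed sample dimensions and head readings, not the study's measurements, to show how a retained-conductivity percentage is computed:

```python
import math

def falling_head_k(a, L, A, t, h1, h2):
    """Hydraulic conductivity (cm/s) from a falling-head test:
    k = (a*L)/(A*t) * ln(h1/h2), with standpipe area a, sample length L,
    sample area A, elapsed time t, and head drop from h1 to h2."""
    return (a * L) / (A * t) * math.log(h1 / h2)

# Assumed illustrative test geometry and readings.
a = 0.5    # standpipe cross-section (cm^2)
L = 10.0   # sample length (cm)
A = 30.0   # sample cross-section (cm^2)

k_before = falling_head_k(a, L, A, t=600.0, h1=100.0, h2=60.0)  # pre-treatment
k_after  = falling_head_k(a, L, A, t=600.0, h1=100.0, h2=66.0)  # post-treatment

# Fraction of the initial conductivity retained, as in the abstract's 21.1%/81.3%.
retained_pct = 100.0 * k_after / k_before
```

With these illustrative readings the sample retains roughly 81% of its initial conductivity, the kind of figure the experimental (PPV) group showed.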

Region of Interest Extraction and Bilinear Interpolation Application for Preprocessing of Lipreading Systems (입 모양 인식 시스템 전처리를 위한 관심 영역 추출과 이중 선형 보간법 적용)

  • Jae Hyeok Han;Yong Ki Kim;Mi Hye Kim
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.189-198 / 2024
  • Lipreading is an important part of speech recognition, and several studies have sought to improve lipreading performance in lipreading systems for speech recognition. Recent studies have improved recognition performance by modifying the model architecture of the lipreading system. Unlike such previous research, we aim to improve recognition performance without any change to the model architecture. To do so, we draw on the cues used in human lipreading and set regions such as the chin and cheeks as regions of interest, alongside the lip region that is the conventional region of interest in lipreading systems, and compare the recognition rate of each region of interest to identify the best-performing one. In addition, on the assumption that differences in normalization results caused by the choice of interpolation method when normalizing the size of the region of interest affect recognition performance, we interpolate the same region of interest using nearest-neighbor, bilinear, and bicubic interpolation and compare the recognition rate of each method to identify the best-performing one. Each region of interest was detected by training an object detection neural network, and dynamic time warping (DTW) templates were generated by normalizing each region of interest, extracting and combining features, and mapping the dimensionality-reduced combined features into a low-dimensional space. The recognition rate was evaluated by comparing the distance between the generated DTW templates and the data mapped to the low-dimensional space.
In the comparison of regions of interest, the region containing only the lips achieved an average recognition rate of 97.36%, 3.44 percentage points higher than the 93.92% average recognition rate of the previous study. In the comparison of interpolation methods, bilinear interpolation achieved 97.36%, which is 14.65 percentage points higher than nearest-neighbor interpolation and 5.55 percentage points higher than bicubic interpolation. The code used in this study can be found at https://github.com/haraisi2/Lipreading-Systems.
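The bilinear interpolation used when normalizing region-of-interest sizes can be sketched in pure Python: each output pixel is a distance-weighted average of the four nearest source pixels. The 2×2 patch below is an illustrative input, not data from the study:

```python
def bilinear_resize(img, new_h, new_w):
    """Resize a 2-D list-of-lists grayscale image with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = [[0.0] * new_w for _ in range(new_h)]
    for i in range(new_h):
        for j in range(new_w):
            # Map the output pixel back to fractional source coordinates.
            y = i * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
            x = j * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = y - y0, x - x0
            # Weighted average of the four surrounding pixels.
            out[i][j] = (img[y0][x0] * (1 - dy) * (1 - dx)
                         + img[y0][x1] * (1 - dy) * dx
                         + img[y1][x0] * dy * (1 - dx)
                         + img[y1][x1] * dy * dx)
    return out

# A 2x2 patch enlarged to 3x3: the new centre pixel becomes the mean
# of the four corners, while the corners themselves are preserved.
patch = [[0.0, 100.0], [100.0, 200.0]]
resized = bilinear_resize(patch, 3, 3)
```

Nearest-neighbor interpolation would instead copy the closest source pixel (blocky output), and bicubic would fit cubic polynomials over a 4×4 neighborhood; the study compares all three.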

Anti-wrinkle effects of solvent fractions from Jubak on CCD-986sk (CCD-986sk 세포 내 주박 분획물의 항주름 효능)

  • Young-Ah Jang;Hyejeong Lee
    • Journal of the Korean Applied Science and Technology / v.41 no.2 / pp.508-519 / 2024
  • In this study, to evaluate the potential of Jubak (rice wine lees) as a functional cosmetic material, we evaluated the antioxidant activity of its solvent fractions and their anti-wrinkle efficacy in CCD-986sk cells, a human fibroblast line. When antioxidant activity was assessed by measuring ABTS+ radical scavenging ability, Jubak's ethyl acetate fraction reached 75.5% at a concentration of 1,000 ㎍/ml, the highest antioxidant activity among the extraction solvents. The wrinkle improvement effect was assessed by measuring the inhibition of elastase and collagenase activity; in both tests, the ethyl acetate fraction showed the highest efficacy at 1,000 ㎍/ml. When the synthesis rate of pro-collagen type I was measured in UVB-induced CCD-986sk cells at the same concentration of 20 ㎍/ml, the fractions were effective in the order ethyl acetate, water, acetonitrile, and hexane. In measurements of the inhibition rate of MMP-1, a collagen-degrading enzyme, all four solvent fractions showed efficacy of more than 70% at 20 ㎍/ml. In real-time PCR experiments measuring the mRNA expression levels of pro-collagen type I, MMP-1, and MMP-3, treatment with the Jubak fractions increased pro-collagen type I expression compared with the UVB-only group, while the mRNA expression levels of MMP-1 and MMP-3 decreased; the ethyl acetate fraction was the most effective at improving wrinkles after the control (EGCG). These results confirm that the ethyl acetate fraction of Jubak has an anti-wrinkle effect against photoaging caused by UVB stimulation, and it is expected to be usable as a natural material for cosmetics.
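The ABTS+ radical scavenging activity reported above is conventionally computed from absorbance readings as (1 - A_sample/A_control) × 100. The sketch below uses hypothetical absorbance values chosen to reproduce a 75.5% figure; they are not the study's measurements:

```python
def scavenging_pct(a_sample, a_control):
    """ABTS+ radical scavenging activity (%) from absorbance readings:
    the more radical the extract quenches, the lower a_sample falls."""
    return (1.0 - a_sample / a_control) * 100.0

a_control = 0.800        # absorbance of the ABTS+ solution without extract
a_ethyl_acetate = 0.196  # hypothetical reading with extract at 1,000 ug/ml

activity = scavenging_pct(a_ethyl_acetate, a_control)
```

The elastase, collagenase, and MMP-1 inhibition rates in the abstract are percentage calculations of the same form, with enzyme activity in place of absorbance.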

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.125-148 / 2018
  • Recently, as demand for big data analysis has increased, cases of analyzing unstructured data and using the results have also increased. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields, and it interests many analysts because the amount of data is very large and relatively easy to collect compared with other unstructured and structured data. Among the many text analysis applications, active research areas include document classification, which classifies documents into predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies the emotions or opinions contained in texts; and text summarization, which summarizes the main contents of one or several documents. The text summarization technique in particular is actively applied in business through news summary services, privacy policy summary services, etc. Academic research has likewise pursued two approaches: the extraction approach, which selectively presents the main elements of a document, and the abstraction approach, which extracts the elements of a document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed nearly as much as automatic text summarization itself. Most existing studies on summarization quality evaluation manually summarized documents, used them as reference documents, and measured the similarity between the automatic summary and the reference document. Specifically, automatic summarization is performed from the full text using various techniques, and the quality of the automatic summary is measured by comparison with the reference document, which is an ideal summary.
Reference documents are provided in two major ways. The most common is manual summarization, in which a person creates an ideal summary by hand. Because this method requires human intervention, writing the summary takes considerable time and cost, and the evaluation result may differ depending on who writes it. To overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. A representative attempt is a recently devised method that reduces the size of the full text and measures the similarity between the reduced full text and the automatic summary: the more the frequent terms of the full text appear in the summary, the better the quality of the summary is judged to be. However, since summarization essentially means minimizing length while minimizing content omissions, it is unreasonable to claim that a "good summary" based only on frequency is always a good summary in this essential sense. To overcome the limitations of these previous studies of summarization evaluation, this study proposes an automatic quality evaluation method for text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness is defined as an element indicating how little content is omitted from the summary. We propose a method for the automatic quality evaluation of text summarization based on these two concepts.
To evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor hotel reviews, the reviews for each hotel were summarized, and the quality of the summaries was evaluated according to the proposed methodology. We also provide a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and propose a method for performing optimal summarization by changing the sentence-similarity threshold.
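One way the proposed completeness/succinctness F-score could be realized is with a simple word-overlap (Jaccard) sentence similarity: succinctness penalizes duplication among summary sentences, completeness rewards coverage of the source sentences, and an F-measure combines the two. This is an illustrative sketch under that similarity assumption, not the paper's exact formulation:

```python
def jaccard(s1, s2):
    """Word-overlap similarity between two sentences."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

def succinctness(summary):
    """1 minus the mean pairwise similarity among summary sentences:
    the less the sentences duplicate one another, the higher the score."""
    pairs = [(a, b) for i, a in enumerate(summary) for b in summary[i + 1:]]
    if not pairs:
        return 1.0
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def completeness(source, summary):
    """Mean best-match similarity of each source sentence to the summary:
    the less source content the summary omits, the higher the score."""
    return sum(max(jaccard(s, t) for t in summary) for s in source) / len(source)

def f_score(source, summary):
    """Harmonic combination of the two scores in trade-off."""
    c, s = completeness(source, summary), succinctness(summary)
    return 2 * c * s / (c + s) if c + s else 0.0

# Hypothetical hotel-review sentences (stand-ins for the TripAdvisor data).
source = [
    "the room was clean",
    "the staff was friendly",
    "breakfast was great",
]
good_summary = ["the room was clean", "the staff was friendly"]
redundant_summary = ["the room was clean", "the room was very clean"]
```

A redundant summary scores low on succinctness and, by omitting topics, on completeness as well, so the F-score separates the two candidates.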

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media built on highly interactive Web 2.0 applications has provided a very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content conveying their opinions and interests on social media such as blogs, forums, chat rooms, and discussion boards, and this content is released on the Internet in real time. For that reason, many researchers and marketers regard social media content as a source of information for business analytics, and many studies have reported results on mining business intelligence from it. In particular, opinion mining and sentiment analysis, techniques to extract, classify, understand, and assess the opinions implicit in text, are frequently applied to social media content analysis because they focus on determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques, and tools have been presented by these researchers. However, we have found weaknesses in their methods, which are often technically complicated and insufficiently user-friendly for supporting business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to opinion mining with visual deliverables. First, we describe the entire cycle of practical opinion mining on social media content, from the initial data gathering stage to the final presentation. Our proposed approach consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts choose the target social media; each target requires a different means of access, such as open APIs, search tools, DB-to-DB interfaces, or purchased content. The second phase is preprocessing to generate useful material for meaningful analysis.
If garbage data is not removed, the results of social media analysis will not provide meaningful and useful business insights; to clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, in which the cleansed social media content is analyzed. The qualified dataset includes not only user-generated content but also content identification information such as creation date, author name, user ID, content ID, hit counts, reviews or replies, favorites, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool: topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is used for reputation analysis. There are also various other applications, such as stock prediction, product recommendation, and sales forecasting. The last phase is visualization and presentation of the analysis results. Its major purpose is to explain the results and help users comprehend their meaning; therefore, to the extent possible, deliverables from this phase should be simple, clear, and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company, NS Food, which holds a 66.5% market share and has kept the No. 1 position in the Korean "ramen" business for several decades. We collected a total of 11,869 pieces of content, including blog posts, forum contents, and news articles. After collecting the social media content, we generated instant-noodle-business-specific language resources for data manipulation and analysis using natural language processing. In addition, we classified content into more detailed categories such as marketing features, environment, and reputation.
In these phases, we used free software: the TM, KoNLP, ggplot2, and plyr packages of the R project. As a result, we present several useful visualization outputs, such as domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, valence tree maps, and other visualized images, providing vivid, full-color examples built with the R project's open library packages. Business actors can detect at a glance which areas are weak, strong, positive, negative, quiet, or loud. A heat map can show the movement of sentiment or volume in a category-by-time matrix, with color density indicating activity in each period. The valence tree map, one of the most comprehensive and holistic visualization models, should be very helpful for analysts and decision makers to quickly understand the "big picture" business situation, since its hierarchical structure can present buzz volume and sentiment visually for a given period. This case study offers real-world business insights from market sensing, demonstrating to practically minded business users how they can use these kinds of results for timely decision making in response to ongoing market changes. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.
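The sentiment-scoring step of the analyzing phase can be sketched as a minimal lexicon-based counter. The lexicon entries and posts below are hypothetical stand-ins for the domain-specific language resources the study built (the study itself used R packages such as KoNLP for Korean text):

```python
from collections import Counter

# Hypothetical domain lexicons; a real run would load the lexicon built
# during the qualifying phase for the instant-noodle domain.
POSITIVE = {"delicious", "tasty", "love", "best"}
NEGATIVE = {"salty", "expensive", "bland", "worst"}

def score_post(text):
    """Return (positive hits, negative hits) for one social-media post."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos, neg

# Illustrative posts, not data from the case study.
posts = [
    "this ramen is delicious and tasty",
    "too salty and too expensive",
    "best noodle ever love it",
]

totals = Counter()
for p in posts:
    pos, neg = score_post(p)
    totals["positive"] += pos
    totals["negative"] += neg
```

Aggregating such counts by category and time period yields exactly the kind of matrix the heat map and valence tree map visualizations described above are drawn from.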

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as the development of information and communication technology has brought various kinds of online news media, news about events occurring in society has increased greatly, so automatically summarizing key events from massive amounts of news data will help users view many events at a glance. In addition, building and providing an event network based on the relevance of events can greatly help readers understand current events. In this study, we propose a method for extracting event networks from large news text corpora. To this end, we first collected Korean political and social articles from March 2016 to March 2017 and, in preprocessing with NPMI and Word2Vec, kept only meaningful words and merged synonyms. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the topic distribution by date and to detect events by finding peaks in the topic distributions. A total of 32 topics were extracted from the topic modeling, and the time of each event was deduced from the points at which each topic's distribution surged. As a result, a total of 85 events were detected, from which the final 16 events were filtered and presented using Gaussian smoothing. We then calculated relevance scores between the detected events to construct the event network: using the cosine coefficient between co-occurring events, we calculated the relevance between events and connected them. Finally, we built the event network by assigning each event to a vertex and each relevance score to the edge connecting the corresponding vertices.
The event network constructed with our method allowed us to sort out the major events in the political and social fields in Korea over the past year in chronological order and, at the same time, to identify which events are related to which. Our approach differs from existing event detection methods in that LDA topic modeling makes it easy to analyze large amounts of data and to identify relations between events that were difficult to detect with existing methods. In text preprocessing, we applied various text mining techniques together with Word2Vec to improve the accuracy of extracting proper nouns and compound nouns, which has been difficult in analyzing Korean text. The event detection and network construction techniques in this study have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily extract topics, topic words, and their distributions from huge amounts of data, and by using the date information of the collected news articles, the distribution by topic can be expressed as a time series. Second, by calculating relevance scores from the co-occurrence of topics, which is difficult to capture with existing event detection, we can present the connections between events in a summarized form. This is supported by the fact that the relevance-based event network proposed in this study was actually constructed in order of occurrence time; the network also makes it possible to identify the event that served as the starting point of a series of events. A limitation of this study is that LDA topic modeling gives different results depending on the initial parameters and the number of topics, and the topic and event names in the analysis results must be assigned by the researcher's subjective judgment.
Also, since each topic is assumed to be exclusive and independent, the relevance between topics is not taken into account. Subsequent studies should calculate the relevance between events not covered in this study, or between events belonging to the same topic.
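The relevance-score step described above can be sketched directly: compute cosine coefficients between event co-occurrence vectors and add an edge whenever the score passes a threshold. All counts and the threshold below are illustrative, not values from the study:

```python
import math

def cosine(u, v):
    """Cosine coefficient between two co-occurrence count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical daily topic-occurrence counts for three detected events.
events = {
    "event_A": [5, 9, 2, 0, 0],
    "event_B": [4, 8, 3, 0, 1],   # co-occurs with A: surges on the same days
    "event_C": [0, 0, 1, 7, 6],   # surges later, largely independent
}

# Connect two events when their relevance score exceeds the threshold.
threshold = 0.8
edges = [
    (a, b)
    for i, a in enumerate(events)
    for b in list(events)[i + 1:]
    if cosine(events[a], events[b]) > threshold
]
```

Here only the two events that surge on the same days end up connected, which is how the resulting network reflects chronological co-occurrence.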