Search | Korea Science

Kim, A-Yong;Park, Man-Seub;Kim, Jong-Moon;Jeong, Dae-Jin;Jung, Hoe-kyung
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.619-621 / 2014
With the advance of information technology, the latest IT developments have become a topic of interest. Many Web users struggle to find the information they need in the large volume of data they have to sift through. In this paper, we propose a user-centered search system. We design and implement a Lucene-based search system using the Apache projects Nutch, Solr, and HDFS together with Hadoop's MapReduce. The system collects and indexes data according to the intentions of the Web search user, and is expected to be useful in the search field.

Song, Hyun-Ok;Kim, A-Yong;Jung, Hoe-Kyung
- Journal of the Korea Institute of Information and Communication Engineering / v.19 no.3 / pp.552-557 / 2015
The development of IT and the increased use of smart devices have led to a large amount of data being produced and consumed on the Internet. As a result, information retrieval technology is becoming more important, but it is difficult to approach because it requires considerable background knowledge. With the emergence of Lucene, however, a search engine can be implemented even without extensive background knowledge of search technology. In this paper, we propose a Lucene-based framework for implementing search engines. The proposed framework uses Hadoop, Nutch, Solr, and Zookeeper to support distributed processing and distributed storage and to guarantee high availability in a server environment.
https://doi.org/10.6109/jkiice.2015.19.3.552

Kim, Heejung
- Proceedings of the Korean Society for Information Management Conference / 2011.08a / pp.95-98 / 2011
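Techniques and strategies for the long-term preservation of digital resources have been developed amid sustained interest. In particular, as dependence on web resources grows, web archiving is becoming more important. This study examined the web archiving tools, and their characteristics, for each of the four stages of the web archiving chain proposed by the IIPC. Nine web archiving tools were examined: Heritrix, DeepArc, Web Curator Tool, NetarchiveSuite, BnFArcTools, Wayback, NutchWAX, WERA, and Xinq.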

Kim, A-Yong;Jeong, Dae-Jin;Gye, Min-Suk;Kim, Chang-Su;Jung, Hoe-kyung
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.603-605 / 2014
With the spread of the Internet and the increased use of smart devices, consumers have shifted from physical stores to the online shopping market. This has changed users' consumption patterns and consumer culture. Open markets attract consumers with a wide variety of events, lowest-price policies, and safe transactions, and are expanding their distribution channels across the web and mobile. In this paper, we design a search system that collects and analyzes information on products sold in open markets and provides users with lowest-price product information.

Park, Ki-Sung;Choi, Jae-Hyun;Kim, Jong-Bae;Park, Jae-Won
- Journal of the Korea Institute of Information and Communication Engineering / v.21 no.1 / pp.17-28 / 2017
Recently, research on data has been actively conducted as data has become more valuable. Web crawlers, programs that collect data, have come into the spotlight because they can be applied in various fields. A web crawler can be defined as a tool that analyzes web pages and collects URLs by traversing web servers in an automated manner. For processing big data, distributed web crawlers based on Hadoop MapReduce are widely used, but they are difficult to use and have performance constraints. Apache Spark, an in-memory computing platform, is an alternative to MapReduce. A search engine, one of the main applications of a web crawler, displays information gathered by the crawler in response to keyword searches. If search engines used a Spark-based web crawler instead of a traditional MapReduce-based one, data collection would be faster.
https://doi.org/10.6109/jkiice.2017.21.1.17
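
The last abstract defines a web crawler as a tool that analyzes web pages and collects URLs by automated traversal. The sketch below illustrates only that definition in a single process, using the Python standard library. It is not the MapReduce-based or Spark-based crawler the paper discusses. The names `LinkCollector`, `extract_links`, and `crawl` are hypothetical, and `fetch` is an assumed caller-supplied function that returns the HTML text for a URL.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Analyze one page and return the URLs it links to, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first traversal from start_url.

    `fetch` is an assumed callable (url -> HTML string) supplied by the caller,
    so the sketch stays independent of any particular HTTP client.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in extract_links(fetch(url), url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Passing `fetch` as a parameter keeps network access out of the sketch. A distributed crawler of the kind the abstract describes would instead partition the URL queue across workers.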