Deep Learning for Sequential Label Decision in Natural Language

  • Published: 2015.10.18

Abstract

Keywords

References

  1. C. D. Manning and H. Schutze, "Foundations of statistical natural language processing", MIT Press, 1999.
  2. D. Jurafsky and J. H. Martin, "Speech and Language Processing", Pearson Education India, 2000.
  3. B. Arias, N. Bel, B. Fisas, M. Lorente, M. Marimon, C. Morell, and J. Vivaldi, "The IULA Spanish LSP Treebank: building and browsing".
  4. M. P. Marcus, M. A. Marcinkiewicz, and B. Santorini, "Building a large annotated corpus of English: The Penn Treebank", Computational Linguistics, 19(2), pp. 313-330, 1993.
  5. H. Tseng, D. Jurafsky, and C. Manning, "Morphological features help POS tagging of unknown words across language varieties", In Proceedings of the fourth SIGHAN workshop on Chinese language processing, pp. 32-39, October 2005.
  6. L. R. Rabiner and B. H. Juang, "An introduction to hidden Markov models", ASSP Magazine, IEEE, 3(1), pp. 4-16, 1986. https://doi.org/10.1109/MASSP.1986.1165351
  7. E. Charniak, C. Hendrickson, N. Jacobson, and M. Perkowitz, "Equations for part-of-speech tagging", In AAAI, pp. 784-789, July 1993.
  8. J. Kupiec, "Robust part-of-speech tagging using a hidden Markov model", Computer Speech & Language, 6(3), pp. 225-242, 1992. https://doi.org/10.1016/0885-2308(92)90019-Z
  9. A. McCallum, D. Freitag, and F. C. Pereira, "Maximum Entropy Markov Models for Information Extraction and Segmentation", In ICML, Vol. 17, pp. 591-598, June 2000.
  10. A. Ratnaparkhi, "A maximum entropy model for part-of-speech tagging", In Proceedings of the conference on empirical methods in natural language processing, Vol. 1, pp. 133-142, May 1996.
  11. J. D. Lafferty, A. McCallum, and F. C. N. Pereira, "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data", In Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282-289, June 28-July 01, 2001.
  12. I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun, "Large Margin Methods for Structured and Interdependent Output Variables", Journal of Machine Learning Research, 6, pp. 1453-1484, December 2005.
  13. G. D. Forney, Jr., "The Viterbi algorithm", Proceedings of the IEEE, 61(3), pp. 268-278, March 1973. https://doi.org/10.1109/PROC.1973.9030
  14. D. H. Ackley, G. E. Hinton, and T. J. Sejnowski, "A learning algorithm for Boltzmann machines", Cognitive Science, 9(1), pp. 147-169, 1985. https://doi.org/10.1207/s15516709cog0901_7
  15. Y. Le Cun, "Learning process in an asymmetric threshold network", In Disordered systems and biological organization, Springer Berlin Heidelberg, pp. 233-240, 1986.
  16. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation", In Parallel distributed processing: explorations in the microstructure of cognition, vol. 1, MIT Press, Cambridge, MA, USA, pp. 318-362, 1986.
  17. Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult", IEEE Transactions on Neural Networks, 5(2), pp. 157-166, 1994. https://doi.org/10.1109/72.279181
  18. G. E. Hinton, S. Osindero, and Y. W. Teh, "A fast learning algorithm for deep belief nets", Neural computation, 18(7), pp. 1527-1554, 2006. https://doi.org/10.1162/neco.2006.18.7.1527
  19. D. Erhan, Y. Bengio, A. Courville, P. A. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?", The Journal of Machine Learning Research, 11, pp. 625-660, 2010.
  20. C. Dyer, M. Ballesteros, W. Ling, A. Matthews, and N. A. Smith, "Transition-Based Dependency Parsing with Stack Long Short-Term Memory", arXiv preprint arXiv:1505.08075, 2015.
  21. D. Weiss, C. Alberti, M. Collins, and S. Petrov, "Structured training for neural network transition-based parsing", arXiv preprint arXiv:1506.06158, 2015.
  22. H. C. Carneiro, F. M. França, and P. M. Lima, "Multilingual part-of-speech tagging with weightless neural networks", Neural Networks, 66, pp. 11-21, 2015. https://doi.org/10.1016/j.neunet.2015.02.012
  23. R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, "Natural language processing (almost) from scratch", The Journal of Machine Learning Research, 12, pp. 2493-2537, 2011.
  24. Z. S. Harris, "Distributional structure", Word, 1954.
  25. Y. Bengio, R. Ducharme, P. Vincent, and C. Janvin, "A neural probabilistic language model", The Journal of Machine Learning Research, 3, pp. 1137-1155, 2003.
  26. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition", Proceedings of the IEEE, 86(11), pp. 2278-2324, 1998. https://doi.org/10.1109/5.726791
  27. V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines", In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807-814, 2010.
  28. P. Sibi, S. A. Jones, and P. Siddarth, "Analysis of different activation functions using back propagation neural networks", Journal of Theoretical and Applied Information Technology, 47(3), pp. 1264-1268, 2013.
  29. B. Karlik and A. V. Olgac, "Performance analysis of various activation functions in generalized MLP architectures of neural networks", International Journal of Artificial Intelligence and Expert Systems, 1(4), pp. 111-122, 2011.
  30. M. T. Luong, I. Sutskever, Q. V. Le, O. Vinyals, and W. Zaremba, "Addressing the rare word problem in neural machine translation", In Proceedings of the Association for Computational Linguistics (ACL), 2015.
  31. S. Hochreiter and J. Schmidhuber, "Long short-term memory", Neural computation, 9(8), pp. 1735-1780, 1997. https://doi.org/10.1162/neco.1997.9.8.1735
  32. D. Koller and N. Friedman, "Probabilistic graphical models: principles and techniques", MIT press, 2009.