1 |
D. Becker and K. Riaz, A study in Urdu corpus construction, in Proc. Workshop Asian Lang. Resour. Int. Stand. vol. 12, (Stroudsburg, PA, USA), Aug. 2002, pp. 1-5.
|
2 |
S. Urooj et al., CLE Urdu digest corpus, in Proc. Conf. Lang. Technol. (SNLP), (Lahore, Pakistan), 2012, pp. 47-53.
|
3 |
F. Baseer, A. Habib, and J. Ashraf, Romanized Urdu corpus development (RUCD) model: Edit-distance based most frequent unique unigram extraction approach using real-time interactive dataset, in Proc. Int. Conf. Innov. Comput. Technol. (INTECH), (Dublin, Ireland), Aug. 2016, pp. 513-518.
|
4 |
Q. Abbas, Building a hierarchical annotated corpus of Urdu: The URDU.KON-TB treebank, in International Conference on Intelligent Text Processing and Computational Linguistics, Springer, Berlin, Germany, 2012, pp. 66-79.
|
5 |
M. Ijaz and S. Hussain, Corpus based Urdu lexicon development, in Proc. Conf. Lang. Technol. (CLT07), vol. 73, (Peshawar, Pakistan), Aug. 2007.
|
6 |
M. Karthikeyan and P. Aruna, Probability based document clustering and image clustering using content-based image retrieval, Appl. Soft Comp. 13 (2013), no. 2, 959-966.
|
7 |
M. Humayoun et al., Urdu summary corpus, in Proc. Int. Conf. Lang. Resour. Eval. (Reykjavik, Iceland), May 2014, pp. 796-800, https://github.com/humsha/USCorpus
|
8 |
Q. A. Akram, A. Naseer, and S. Hussain, Assasband, an affix-exception-list based Urdu stemmer, in Proc. Workshop Asian Lang. Resour. (Suntec, Singapore), Aug. 2009, pp. 40-47.
|
9 |
I. Rasheed et al., Urdu text classification: A comparative study using machine learning techniques, in Proc. Int. Conf. Digit. Inf. Manag. (ICDIM) (Berlin, Germany), Sept. 2018, pp. 274-278.
|
10 |
A. AleAhmad et al., Hamshahri: A standard Persian text collection, Knowl. Based Syst. 22 (2009), no. 5, 382-387.
|
11 |
S. Hussain, Resources for Urdu language processing, in Proc. Workshop Asian Lang. Resour. IJCNLP, (Hyderabad, India), Jan. 2008, pp. 99-100, https://www.aclweb.org/anthology/I08-7017.pdf
|
12 |
A. Kanapala and S. Pal, Test collection for legal IR from online discussion forums, in Proc. Forum Inf. Retr. Eval. (Bangalore, India), Dec. 2014, pp. 126-129.
|
13 |
I. Ounis et al., Terrier information retrieval platform, in Advances in Information Retrieval, vol. 3408, Springer, Berlin, Germany, 2005, pp. 517-519.
|
14 |
E. M. Voorhees, Overview of TREC 2003, in Proc. Text Retr. Conf. (TREC), (Gaithersburg, MD, USA), Nov. 2003, pp. 1-13, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=150467
|
15 |
S. E. Robertson et al., Okapi at TREC-4, in Proc. Text REtrieval Conf. (London, UK), Oct. 1996, pp. 73-96, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.3342
|
16 |
C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, vol. 999, MIT Press, Cambridge, MA, USA, 1999, https://nlp.stanford.edu/fsnlp/.
|
17 |
I. Rasheed and H. Banka, Query expansion in information retrieval for Urdu language, in Proc. Int. Conf. Inf. Retr. Knowl. Manag. (CAMP), (Kota Kinabalu, Malaysia), Mar. 2018, pp. 171-176.
|
18 |
I. Rasheed, H. Banka, and H. M. Khan, Pseudo-relevance feedback based query expansion using boosting algorithm, Artif. Intell. Rev. (2021), https://doi.org/10.1007/s10462-021-09972-4
|
19 |
P. Clough and M. Sanderson, Evaluating the performance of information retrieval systems using test collections, Inf. Res. 18 (2013), no. 2.
|
20 |
A. K. McCallum, Mallet: A machine learning for language toolkit, 2002, http://mallet.cs.umass.edu/.
|
21 |
R. Rahimi, A. Shakery, and I. King, Extracting translations from comparable corpora for cross-language information retrieval using the language modeling framework, Inf. Process. Manage. 52 (2016), no. 2, 299-318.
|
22 |
M. Humayoun, H. Hammarström, and A. Ranta, Urdu morphology, orthography and lexicon extraction, M.S. thesis, Department of Computer Science and Engineering, Chalmers tekniska högskola, Göteborg, Sweden, 2006.
|
23 |
A. Hardie, Developing a tag-set for automated part-of-speech tagging in Urdu, in Proc. Corpus Linguistics (Lancaster, UK), Mar. 2003.
|
24 |
P. Baker et al., Corpus data for South Asian language processing, in Proc. Workshop South Asian Lang. Process. (EACL), (Budapest, Hungary), Apr. 2003, pp. 1-8.
|
25 |
K. Riaz, Baseline for Urdu IR evaluation, in Proc. ACM workshop Improving non english web searching (Napa Valley, CA, USA), Oct. 2008, pp. 97-100.
|
26 |
A. Daud, W. Khan, and D. Che, Urdu language processing: A survey, Artif. Intell. Rev. 47 (2017), 279-311.
|
27 |
M. Sharjeel, R. M. A. Nawab, and P. Rayson, COUNTER: Corpus of Urdu news text reuse, Lang. Res. Eval. 51 (2017), 777-803.
|
28 |
V. Gupta, N. Joshi, and I. Mathur, Design & development of rule based inflectional and derivational Urdu stemmer, in Proc. Int. Conf. Futuristic Trends Comput. Anal. Knowl. Manag. (ABLAZE), (Greater Noida, India), Feb. 2015, pp. 7-12.
|
29 |
K. Riaz, Concept search in Urdu, in Proc. PhD workshop Inf. Knowl. Manag. (Napa Valley, CA, USA), Oct. 2008, pp. 33-40.
|
30 |
S. A. Ali et al., Salience analysis of news corpus using heuristic approach in Urdu language, Int. J. Comput. Sci. Netw. Secur. (IJCSNS), 16 (2016), no. 4, 28-36.
|
31 |
I. Hanif et al., Cross-language Urdu-English (CLUE) text alignment corpus, in Proc. Working notes CLEF (Toulouse, France), Sept. 2015.
|
32 |
Z. Ahmad et al., Urdu Nastaleeq optical character recognition, World Acad. Sci., Eng. Technol. 26 (2007), 249-252.
|
33 |
G. Salton, A. Wong, and C. S. Yang, A vector space model for automatic indexing, Commun. ACM 18 (1975), no. 11, 613-620.
|
34 |
G. Amati and C. J. Van Rijsbergen, Probabilistic models of information retrieval based on measuring the divergence from randomness, ACM Trans. Inf. Syst. (TOIS), 20 (2002), no. 4, 357-389.
|
35 |
K. Batri, S. Lakshmi, and B. Sathiyabhama, Trade-off between the number of index-terms and the information retrieval system's performance, Kuwait J. Sci. 44 (2017), no. 4, 49-56.
|
36 |
N. Craswell et al., Overview of the TREC-2003 web track, in Proc. Text Retr. Conf. (TREC), vol. 3, (Gaithersburg, MD, USA), 2002.
|
37 |
J. M. Ponte and W. B. Croft, A language modeling approach to information retrieval, in Proc. Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. (Melbourne, Australia), Aug. 1998, pp. 275-281.
|
38 |
E. Frank et al., Weka-a machine learning workbench for data mining, in Data Mining and Knowledge Discovery Handbook, Springer, Boston, MA, USA, 2009, pp. 1269-1277.
|
39 |
I. Haneef et al., Design and development of a large cross-lingual plagiarism corpus for Urdu-English language pair, Sci. Program. 2019 (2019), 1-11.
|
40 |
L. Cohen, L. Manion, and K. Morrison, The ethics of educational and social research, in Research Methods in Education, 8th ed., Routledge, London, UK, 2013, https://doi.org/10.4324/9780203720967
|
41 |
W. B. Croft, D. Metzler, and T. Strohman, Search Engines: Information Retrieval in Practice, Pearson Education, Boston, MA, USA, 2010.
|
42 |
T. Zia, M. P. Akhter, and Q. Abbas, Comparative study of feature selection approaches for Urdu text categorization, Malaysian J. Comput. Sci. 28 (2015), no. 2, 93-109.
|
43 |
N. Khan, M. P. Bakht, and R. A. Wagan, Corpus construction and structure study of Urdu language using empirical laws, in Proc. Int. Conf. Data Sci. (Karachi, Pakistan), Feb. 2019, pp. 9-14.
|