Precision Improvement in Information Storage and Retrieval System by Document Length Normalization

D. Sharma, H Nagar

Section:Review Paper, Product Type: Isroset-Journal
Vol.4 , Issue.1 , pp.1-5, Feb-2016

Online published on Apr 01, 2016

Copyright © D. Sharma, H Nagar . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: D. Sharma, H Nagar, “Precision Improvement in Information Storage and Retrieval System by Document Length Normalization,” International Journal of Scientific Research in Computer Science and Engineering, Vol.4, Issue.1, pp.1-5, 2016.

MLA Style Citation: D. Sharma, H Nagar "Precision Improvement in Information Storage and Retrieval System by Document Length Normalization." International Journal of Scientific Research in Computer Science and Engineering 4.1 (2016): 1-5.

APA Style Citation: D. Sharma, H Nagar, (2016). Precision Improvement in Information Storage and Retrieval System by Document Length Normalization. International Journal of Scientific Research in Computer Science and Engineering, 4(1), 1-5.

BibTex Style Citation:
@article{Sharma_2016,
author = {Sharma, D. and Nagar, H.},
title = {Precision Improvement in Information Storage and Retrieval System by Document Length Normalization},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {2 2016},
volume = {4},
number = {1},
month = {2},
year = {2016},
issn = {2347-2693},
pages = {1-5},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=248},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=248
TI - Precision Improvement in Information Storage and Retrieval System by Document Length Normalization
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Sharma, D.
AU - Nagar, H.
PY - 2016
DA - 2016/04/01
PB - IJCSE, Indore, INDIA
SP - 1
EP - 5
IS - 1
VL - 4
SN - 2347-2693
ER -

Abstract :
A huge amount of information is available on the internet in electronic document format, but retrieving the right document for a user's information need remains a critical task. The relevance of a document can vary with its length, and an automatic information storage and retrieval system has to deal with documents of varying length in a text collection. In this paper we present a document term weighting scheme based on document length. Our method increases the rank of relevant documents in the ordered set of retrieved documents. In our results, the method moved the document rank from a precision of 0.83 to a precision of 0.16.
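To make the idea of length-based term weighting concrete, here is a minimal Python sketch. It is not the weighting scheme proposed in the paper, which the abstract does not specify; it assumes a generic pivoted-style normalization in which each term frequency is divided by a factor that grows with document length relative to the collection average. The function name `length_normalized_scores`, the `slope` parameter, the whitespace tokenizer, and the IDF formula are all illustrative assumptions.

```python
import math
from collections import Counter


def length_normalized_scores(docs, query, slope=0.75):
    """Score each document for a query with length-normalized TF-IDF.

    Assumption: pivoted-style normalization, not the paper's own scheme.
    slope=0 disables length normalization; slope=1 divides the term
    frequency by (document length / average length).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Fall back to 1.0 so a collection of empty documents does not divide by zero.
    avg_len = (sum(len(toks) for toks in tokenized) / n_docs) or 1.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = query.lower().split()
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        # Longer-than-average documents get a larger divisor.
        norm = (1.0 - slope) + slope * (len(toks) / avg_len)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1.0 + n_docs / df[term])
            score += (tf[term] / norm) * idf
        scores.append(score)
    return scores


if __name__ == "__main__":
    collection = [
        "retrieval precision",
        "retrieval precision plus many unrelated filler words here today",
    ]
    print(length_normalized_scores(collection, "retrieval"))
```

With the default `slope`, the short document outranks the long one even though both contain the query term once; setting `slope=0` removes the length effect and the two scores become equal.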

Key-Words / Index Term :
Document length, Normalization, Rank, Storage system, Precision, Information Storage and Retrieval System
