Procedure for building semantic indexes based on domain-specific ontologies
Keywords:
Semantic indexing, ontology, information retrieval, collaborative market
Abstract
Current online search systems are still far from providing users with contextualized and accurate answers: users must make additional efforts to filter and evaluate the information supplied to them. One way to improve the results is to create semantic indexes that incorporate knowledge and intelligent processing of resources. When it comes to implementing semantic indexes, however, the many existing research studies each follow their own procedures, with lengthy conceptualization, implementation, and refinement processes. It therefore becomes important to define an instrument for creating these structures in a more systematic and efficient manner. This work proposes a procedure for creating semantic indexes based on domain-specific ontologies. The methodology consisted of reviewing the state of the art of the existing proposals and deriving a general procedure that incorporates the best practices for creating semantic indexes. A semantic index was then created for the domain of plants and their components. The results show that the defined procedure is a useful instrument for guiding the implementation of these structures with a high degree of customization. They also show, however, that the procedure depends on other variables involved in building and processing the index, so the design needs to be re-examined until the desired results are obtained.