Using Wikipedia for extracting hierarchy and building geo‐ontology

Quoc‐Hung Ngo (Faculty of Computer Science, University of Information Technology, HoChiMinh City, Vietnam)
Son Doan (University of California, San Diego, San Diego, California, USA)
Werner Winiwarter (Research Group Data Analytics and Computing, University of Vienna, Vienna, Austria)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 16 November 2012



Purpose

This paper has two main aims. First, it provides an overview of the location hierarchy, from the highest divisions (continents) to the lowest divisions (wards, villages), both in reality and as represented in Wikipedia pages. Second, it introduces an approach to building a geographical ontology from Wikipedia.


Design/methodology/approach

The paper first reviews existing applications that extract information from Wikipedia and use it as a data resource for developing natural language processing tools. It also reviews the structure of the Wikipedia pages that present location information. Based on this analysis, the paper then proposes an approach to extracting the location hierarchy, as well as geographical characteristics, for the geo‐ontology. The approach also rebuilds the relations between locations in the ontology.
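
As an illustration of what rebuilding the relations between locations can involve, the following minimal Python sketch links extracted records into a parent-child hierarchy. It is not the authors' implementation: the `Location` class, the `build_hierarchy` function, the `(name, level, parent)` record layout and the three sample records are all assumptions made for this example, and keying locations by name alone ignores the ambiguity of real place names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)
class Location:
    """One node of a hypothetical location hierarchy."""
    name: str
    level: str  # e.g. "continent", "country", "city"
    parent: Optional["Location"] = field(default=None, repr=False)
    children: List["Location"] = field(default_factory=list, repr=False)

    def ancestors(self) -> List[str]:
        """Return names from the immediate parent up to the root."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain


def build_hierarchy(
    records: List[Tuple[str, str, Optional[str]]]
) -> Dict[str, Location]:
    """Link (name, level, parent_name) records into a tree keyed by name."""
    # First pass: create one node per record.
    nodes = {name: Location(name, level) for name, level, _ in records}
    # Second pass: attach each node to its parent, if the parent is known.
    for name, _, parent_name in records:
        if parent_name is not None and parent_name in nodes:
            child = nodes[name]
            child.parent = nodes[parent_name]
            nodes[parent_name].children.append(child)
    return nodes


# Illustrative records only; a real pipeline would extract these from pages.
records = [
    ("Asia", "continent", None),
    ("Vietnam", "country", "Asia"),
    ("Ho Chi Minh City", "city", "Vietnam"),
]
geo = build_hierarchy(records)
print(geo["Ho Chi Minh City"].ancestors())  # ['Vietnam', 'Asia']
```

The two-pass design means records can arrive in any order, since every node exists before any parent link is made.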


Findings

Existing location name systems are mainly based on probabilistic locations mined from data, and they lack administrative relations between locations at all levels and for all countries and territories. The literature review of geographical hierarchies and of the use of Wikipedia for natural language processing tasks suggests an approach to building a geographical ontology from Wikipedia pages. The proposed approach is believed to be the first to provide a full geo‐ontology for all countries.

Practical implications

The paper builds a geo‐ontology that covers all administrative levels for all countries and territories. Such administrative relations between locations are needed for real‐world applications.


Originality/value

The comprehensive overview of existing work on geo‐ontologies provides a valuable reference for researchers and system developers in related research communities. The proposed approach to building a geographical ontology from Wikipedia offers a promising alternative for building a knowledge system from a free, online, multi‐language encyclopedia.



