(Apologies: the English in this post has not been proofread yet.)
While working on semantic concepts and the role they can play in our digital exploration of educational content, we naturally became interested in DBpedia.
I will present an experimental approach that allowed me to find, through a series of automatic processes, knowledge that is not explicitly available in DBpedia: the association between a type of building and a religion. I will address a generalization of this approach in a future post.
In connection with the French curriculum for Art History, we identified that, at various levels, monuments and other architectural elements have an important role to play in education. In this context one can address questions about materials, shapes, the relationship between the arrangement of space and power, the role of the spiritual and the religious, and many other subjects.
This led us to ‘snoop’ around DBpedia (with the ‘follow your nose’ approach) to see what benefits we could gain. This note describes some aspects of this exploration.
First, let’s look at religious architecture. Art History quickly brings up topics such as abbey, Gothic church, mosque, Romanesque church and synagogue.
We will look for the corresponding resources in DBpedia and the properties they have in common, then see whether other elements have the same properties.
Searching for resources in DBpedia
http://fr.dbpedia.org/resource/<category name>
select reduced * where {?s rdfs:label "Abbaye"@fr } LIMIT 100
which results in
This suggests 4 sets containing the Abbaye concept (and, as a “bonus”, gives me a purely technical property about readers of Wikipedia).
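The label lookup above can also be scripted. The following is a minimal sketch, assuming the French chapter exposes a standard SPARQL endpoint at http://fr.dbpedia.org/sparql that accepts GET requests and can return the SPARQL JSON results format; the function names (`label_query`, `rows_from_json`, `run_query`) are hypothetical helpers, not part of any DBpedia tooling.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public SPARQL endpoint of the French DBpedia chapter.
ENDPOINT = "http://fr.dbpedia.org/sparql"


def label_query(label, lang="fr", limit=100):
    """Build the label lookup query shown above for any label."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
        "SELECT REDUCED * WHERE { ?s rdfs:label "
        f'"{escaped}"@{lang} }} LIMIT {limit}'
    )


def rows_from_json(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query over HTTP GET and return the flattened rows."""
    params = urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return rows_from_json(response.read().decode("utf-8"))
```

Calling `run_query(label_query("Abbaye"))` would then return one dictionary per result row, provided the endpoint is reachable and behaves as assumed.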
For the other religious buildings I am looking for, two properties are commonly used:
On dbpedia-fr, the query
select distinct ?s where { ?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type> . } LIMIT 100
returns a set of 257 elements linked to Catégorie:Édifice-type, mainly types of buildings, plus an owl:sameAs pointing to an equivalent concept in dbpedia:
http://dbpedia.org/page/Category:Buildings_and_structures_by_type
Among the types of buildings, we find:
http://fr.dbpedia.org/resource/Synagogue
http://fr.dbpedia.org/resource/Église_(édifice)
http://fr.dbpedia.org/resource/Mosquée
Note that for ‘église’ (church in French), we only get an approximate match (Église_(édifice)).
What are the predicates shared by several categories?
To find the predicates and the objects shared by some entities of type ‘Synagogue’ and ‘Abbaye’, I use the query
select distinct ?p where { ?s1 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Synagogue> . ?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> . ?s1 ?p ?o . ?s2 ?p ?o . } LIMIT 100
and obtain 33 items, from which I have extracted some non-geographical predicates
and some geographical ones
Predicates most used in a category
The query
SELECT count(distinct ?s2) WHERE { ?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> . ?s2 ?p ?o } LIMIT 100
shows that, at the time it was run, there were 631 elements of type dbpedia-fr:Abbaye
The query
SELECT ?p (COUNT(?p) AS ?pTotal) WHERE { ?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> . ?s2 ?p ?o } GROUP BY ?p ORDER BY DESC(?pTotal) LIMIT 100
shows the predicates defining properties of instances of Abbaye, sorted by number of occurrences. Here is the start of the results:
So we get the properties that are shared by many instances of dbpedia-fr:Abbaye.
We find a lot of http://dbpedia.org/ontology/wikiPageWikiLink triples, which are only links to other pages in Wikipedia. I have observed this for many DBpedia categories (more details in a later post on the blog http://onsem.wp.imt.fr). Now the question is: are there some (?p, ?o) pairs that are more frequent among the entities of a given category?
Searching for links shared by many instances of a category
The query
SELECT ?o (COUNT(?o) AS ?oTotal) WHERE { ?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> . ?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o } GROUP BY ?o ORDER BY DESC(?oTotal) LIMIT 100
gives us an answer.
http://fr.dbpedia.org/resource/Abbaye appears 629 times for 631 instances, which is natural: it says ‘that entity of type Abbaye has a link with the concept Abbaye’.
http://fr.dbpedia.org/resource/Catholicisme appears 377 times, which seems normal and useful.
Then we have http://fr.dbpedia.org/resource/Ordre_de_Saint-Benoît 220 times; why? Suggestions welcome.
The next value is
http://fr.dbpedia.org/resource/Révolution_française
which is probably understandable, but is probably also a sign of how Wikipedia describes the French Revolution.
The next value is http://fr.dbpedia.org/resource/Monument_historique_(France), which can be really useful (it says something like ‘this building is a French historical monument’).
All these values are interesting.
A similar query on Synagogue, for 78 synagogues:
With similar queries on other buildings, we see that the most frequent value is often the type of the building. As said above, this is not surprising.
The second value seems to designate a religion. This may be a way to automatically associate a religion with a type of building. I tried it on some samples:
http://fr.dbpedia.org/page/Mosquée is associated with http://fr.dbpedia.org/page/Islam
http://fr.dbpedia.org/resource/Cathédrale is associated with http://fr.dbpedia.org/resource/Église_catholique_romaine and the next value is http://fr.dbpedia.org/resource/Catholicisme
http://fr.dbpedia.org/resource/Temple is associated with http://fr.dbpedia.org/resource/Protestantisme and next http://fr.dbpedia.org/resource/Bouddhisme
In all cases, the second result is acceptable; but in some cases, like Temple, further results are also acceptable values for the predicate.
This suggests that all the religions in the result are acceptable values. So we need to find a common property that lets us identify which results are religions.
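Repeating the link-count query for several building types is easier with a small query builder. The sketch below only reproduces the query used above for Abbaye with the type made a parameter; `link_count_query` is a hypothetical helper name, and the list of types is simply the one tried in this section.

```python
DBO = "http://dbpedia.org/ontology/"
FR = "http://fr.dbpedia.org/resource/"


def link_count_query(type_name, limit=100):
    """Rank the wiki-link targets of all instances of one building type.

    type_name is the local name of a dbpedia-fr resource, e.g. "Mosquée".
    """
    return (
        "SELECT ?o (COUNT(?o) AS ?oTotal) WHERE { "
        f"?s2 <{DBO}type> <{FR}{type_name}> . "
        f"?s2 <{DBO}wikiPageWikiLink> ?o "
        "} GROUP BY ?o ORDER BY DESC(?oTotal) "
        f"LIMIT {limit}"
    )


# One query per building type tried in this section.
queries = {
    name: link_count_query(name)
    for name in ["Abbaye", "Synagogue", "Mosquée", "Cathédrale", "Temple"]
}
```

Each generated string can be pasted into the endpoint's query form or sent programmatically.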
Qualify the links
A first manual look at the pages associated with some of these values gives us
http://fr.dbpedia.org/page/Judaïsme dcterms:subject dbpedia-fr:Catégorie:Religion_au_Moyen-Orient and a property ‘is prop-fr:religion of’
http://fr.dbpedia.org/page/Catholicisme dcterms:subject dbpedia-fr:Catégorie:Branche_du_christianisme and a property ‘is prop-fr:religion of’
The predicate ‘is prop-fr:religion of’ (and its inverse, prop-fr:religion) seems to be a good candidate.
The query
SELECT ?o (COUNT(?o) AS ?oTotal) WHERE { ?s <http://fr.dbpedia.org/property/religion> ?o } GROUP BY ?o ORDER BY DESC(?oTotal) LIMIT 20
gives us the values most often associated with the predicate religion.
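Once the values of prop-fr:religion are known, they can act as a filter on the ranked links of a building type. Here is a minimal sketch of that step; `religions_among` is a hypothetical helper, the Abbaye counts are the ones reported earlier in this post, and the religion set is only an illustrative subset of the values that appear in this post.

```python
def religions_among(ranked_links, religion_values):
    """Keep only the link targets that are also values of prop-fr:religion.

    ranked_links: list of (resource, count) pairs, most frequent first.
    religion_values: set of resources returned by the prop-fr:religion query.
    """
    return [(res, n) for res, n in ranked_links if res in religion_values]


# Counts reported above for dbpedia-fr:Abbaye (first three link targets).
abbaye_links = [
    ("dbpedia-fr:Abbaye", 629),
    ("dbpedia-fr:Catholicisme", 377),
    ("dbpedia-fr:Ordre_de_Saint-Benoît", 220),
]

# Illustrative subset only: the full set comes from the prop-fr:religion query.
known_religions = {
    "dbpedia-fr:Catholicisme",
    "dbpedia-fr:Judaïsme",
    "dbpedia-fr:Islam",
    "dbpedia-fr:Protestantisme",
}

print(religions_among(abbaye_links, known_religions))
```

The building type itself and the Benedictine order are dropped, and only the candidate religion remains.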
Association of buildings and religion
Now we can try to propose an association between a type of building and a religion with the help of the following query
SELECT ?s ?o (COUNT(?o) AS ?oTotal) WHERE { ?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type> . ?s2 <http://dbpedia.org/ontology/type> ?s . ?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o . ?s3 <http://fr.dbpedia.org/property/religion> ?o } GROUP BY ?s ?o ORDER BY DESC(?oTotal) LIMIT 100
The first results are as follows (lightly edited to fit the page):
s | o | oTotal |
dbpedia-fr:Abbaye | dbpedia-fr:Catholicisme | 599430 |
dbpedia-fr:Église_(édifice) | dbpedia-fr:Catholicisme | 418170 |
dbpedia-fr:Basilique_(christianisme) | dbpedia-fr:Catholicisme | 357750 |
dbpedia-fr:Chapelle | dbpedia-fr:Catholicisme | 206700 |
dbpedia-fr:Mosquée | dbpedia-fr:Islam | 167970 |
dbpedia-fr:Église_(édifice) | dbpedia-fr:Église_catholique_romaine | 132086 |
dbpedia-fr:Collégiale | dbpedia-fr:Catholicisme | 89040 |
dbpedia-fr:Basilique_(christianisme) | dbpedia-fr:Église_catholique_romaine | 78914 |
dbpedia-fr:Abbaye | dbpedia-fr:Église_catholique_romaine | 71740 |
dbpedia-fr:Couvent | dbpedia-fr:Catholicisme | 62010 |
dbpedia-fr:Chapelle | dbpedia-fr:Église_catholique_romaine | 56126 |
dbpedia-fr:Monastère | dbpedia-fr:Catholicisme | 34980 |
dbpedia-fr:Collégiale | dbpedia-fr:Église_catholique_romaine | 21944 |
dbpedia-fr:Synagogue | dbpedia-fr:Judaïsme | 17640 |
dbpedia-fr:Monastère | dbpedia-fr:Église_catholique_romaine | 11394 |
dbpedia-fr:Couvent | dbpedia-fr:Église_catholique_romaine | 8862 |
dbpedia-fr:Abbaye | dbpedia-fr:Protestantisme | 8550 |
dbpedia-fr:Synagogue | dbpedia-fr:Catholicisme | 7950 |
dbpedia-fr:Basilique_(christianisme) | dbpedia-fr:Christianisme | 7436 |
dbpedia-fr:Baptistère | dbpedia-fr:Catholicisme | 6360 |
dbpedia-fr:Mosquée | dbpedia-fr:Sunnisme | 5066 |
dbpedia-fr:Temple | dbpedia-fr:Protestantisme | 4788 |
dbpedia-fr:Église_(édifice) | dbpedia-fr:Protestantisme | 4104 |
dbpedia-fr:Église_(édifice) | dbpedia-fr:Christianisme | 4004 |
dbpedia-fr:Abbaye | dbpedia-fr:Christianisme | 4004 |
dbpedia-fr:Cloître | dbpedia-fr:Catholicisme | 3180 |
These associations seem good, and some buildings are associated with several religions (like Temple).
The associated count could help to evaluate the relevance of the association.
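One caveat about these counts: every Catholicisme row in the table is a multiple of 1590, and 599430 is exactly 377 × 1590, where 377 is the number of Abbaye instances linking to Catholicisme found earlier. This suggests (it is an interpretation, not something checked against the endpoint) that the pattern on ?s3 multiplies each link by the number of subjects using that religion. The toy example below, built on made-up data, shows the effect, and `DISTINCT_QUERY` is an untested variant that counts each instance once.

```python
# Hypothetical toy data, not taken from DBpedia.
# (instance, link target) pairs for instances of one building type.
links = [("abbey_1", "Catholicisme"), ("abbey_2", "Catholicisme")]
# (subject, religion) pairs, i.e. matches for ?s3 prop-fr:religion ?o.
religion_uses = [
    ("person_1", "Catholicisme"),
    ("person_2", "Catholicisme"),
    ("person_3", "Catholicisme"),
]

# Naive join, as in the query above: one row per (instance, ?s3) combination.
joined = [(s2, o, s3) for s2, o in links for s3, r in religion_uses if r == o]
naive_count = len(joined)

# Counting distinct instances removes the multiplication by the ?s3 matches.
distinct_count = len({s2 for s2, _, _ in joined})

print(naive_count, distinct_count)  # 6 2

# Untested variant of the query: each instance is counted once per religion.
DISTINCT_QUERY = """
SELECT ?s ?o (COUNT(DISTINCT ?s2) AS ?oTotal) WHERE {
  ?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type> .
  ?s2 <http://dbpedia.org/ontology/type> ?s .
  ?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o .
  FILTER EXISTS { ?s3 <http://fr.dbpedia.org/property/religion> ?o }
} GROUP BY ?s ?o ORDER BY DESC(?oTotal) LIMIT 100
"""
```

With distinct counts, the ranking between religions no longer depends on how often each religion is used elsewhere in DBpedia.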
The query
SELECT ?s (COUNT(?s) AS ?sTotal) WHERE { ?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type> . ?s2 <http://dbpedia.org/ontology/type> ?s . ?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o . ?s3 <http://fr.dbpedia.org/property/religion> ?o } GROUP BY ?s ORDER BY DESC(?sTotal) LIMIT 100
gives us the count of buildings associated with a religion, as follows (I’ve cut the list; the missing values are so small that they are not representative):
We need to divide the count of each (building, religion) association by the count of buildings of each type to see whether an association is relevant.
I don’t immediately see how to do it with SPARQL. Perhaps I will do it quickly by combining the results in JavaScript (for example, if I get a JSON result) or by driving the queries from a Java program. To be continued.
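The division can be done outside SPARQL once both result sets are available. Here is a minimal sketch in Python (chosen here for brevity, as an assumption, in place of JavaScript or Java); `association_ratios` is a hypothetical helper, and the sample figures are the 377 Catholicisme links and 631 Abbaye instances reported earlier in this post.

```python
def association_ratios(pair_counts, type_counts):
    """Divide each (building type, religion) count by the size of the type.

    pair_counts: {(building_type, religion): number of instances of that
                  type linking to that religion}
    type_counts: {building_type: total number of instances of that type}
    """
    ratios = {}
    for (building, religion), n in pair_counts.items():
        total = type_counts.get(building)
        if total:  # skip types with no known (or zero) instance count
            ratios[(building, religion)] = n / total
    return ratios


# Figures reported earlier in this post for dbpedia-fr:Abbaye.
pair_counts = {("dbpedia-fr:Abbaye", "dbpedia-fr:Catholicisme"): 377}
type_counts = {"dbpedia-fr:Abbaye": 631}

ratios = association_ratios(pair_counts, type_counts)
for (building, religion), ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{building} -> {religion}: {ratio:.2f}")
```

A threshold on the ratio could then separate the plausible associations from the marginal ones; where to set that threshold is left open here.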
In short
I have seen that most of the predicates used with the entities I was looking for are http://dbpedia.org/ontology/wikiPageWikiLink. By looking at the links shared by most of the entities of a category, I identified a property that seems important for the category. Then, by looking at how the values of that property are commonly used, I can propose a type for the property, that is, qualify it. Finally, as the original category is not associated with that (property, value) pair, I can propose an association between a type of building and a religion.
That’s all, folks: I’ve illustrated how, with some simple SPARQL queries, I can produce (propose?) some new knowledge to be included in DBpedia and the DBpedia ontology (or added to a knowledge graph importing DBpedia).
In a future post, I will try to generalize the method and give it a more abstract formulation.