ILOT – Semantic and education

Exploring DBpedia for monuments with the help of SPARQL

In our work on semantics and the role it can play in digital exploration, we are naturally interested in DBpedia.

I will present an experimental approach that allowed me to find, through a series of automatic processes, knowledge that is not explicitly available in DBpedia: the association between a type of building and a religion. I will discuss how to generalize this approach in a future post.

In connection with the French curriculum on the History of Arts, we identified that, at various levels, monuments and other architectural elements play an important role in the transmission of knowledge. In this context, one can address questions about materials, shapes, the relationship between the arrangement of space and power, the role of the spiritual and the religious, and many other subjects.
This led us to ‘snoop’ around DBpedia (with the ‘follow your nose’ approach) to see what we could get out of it. This post describes some aspects of this exploration.
First, let’s look at religious architecture. The History of Arts quickly yields topics such as Abbey, Gothic church, Mosque, Romanesque church and Synagogue.
We will look for the corresponding resources in DBpedia, identify the properties they have in common, and see whether other elements share these properties.

Finding a DBpedia resource

We know that the URIs of many resources of type ‘Category’ follow the template
http://fr.dbpedia.org/resource/<resource name capitalized>
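
As an illustration, the template can be written as a small Python helper. This is only a sketch: the name `resource_uri` is hypothetical, and replacing spaces with underscores is an assumption based on the URIs that appear later in this post (for example Ordre_de_Saint-Benoît).

```python
BASE = "http://fr.dbpedia.org/resource/"

def resource_uri(name: str) -> str:
    """Build a candidate dbpedia-fr resource URI from a plain name."""
    # Assumption: spaces are written as underscores in resource names.
    cleaned = name.strip().replace(" ", "_")
    # Capitalize only the first character; the rest is kept as typed.
    return BASE + cleaned[:1].upper() + cleaned[1:]

print(resource_uri("abbaye"))
```

The result is only a candidate URI: whether a resource actually exists at that address still has to be checked on the endpoint.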

A possible starting point is a SPARQL query (on fr.dbpedia.org/sparql)

So I looked up
http://fr.dbpedia.org/resource/Abbaye
On the page that is displayed, I could see that this resource is associated, through the predicate dcterms:subject, with the following values:

This suggests four sets encompassing the concept Abbey (and, as a bonus, gives me one unrelated link: a functional link intended for readers).

For the other types of religious architecture that I selected, it appears that two categories apply to all of them:

On dbpedia-fr, the query

select distinct ?s where {
?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type>  .
} LIMIT 100

returns a set of 257 items related to Catégorie:Édifice-type (the LIMIT has to be raised to retrieve them all): essentially building types, plus an equivalent category (owl:sameAs) on dbpedia.org:

http://dbpedia.org/page/Category:Buildings_and_structures_by_type

Among the building types, we found:

http://fr.dbpedia.org/resource/Synagogue

http://fr.dbpedia.org/resource/Église_(édifice)

http://fr.dbpedia.org/resource/Mosquée

However, for ‘church’ we do not get strictly the same category, and ‘Romanesque church’ and ‘Gothic church’ are missing.

Which predicates are shared by multiple categories?

Let us search for the predicates that a resource typed ‘Synagogue’ and a resource typed ‘Abbey’ use with the same object:

select distinct ?p where {
?s1 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Synagogue> .
?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> .
?s1 ?p ?o .
?s2 ?p ?o .
} LIMIT 100

returns 33 items, from which a manual selection highlights the following non-geographical features:

p
http://dbpedia.org/ontology/architecturalStyle
http://dbpedia.org/ontology/wikiPageWikiLink
http://fr.dbpedia.org/property/année
http://fr.dbpedia.org/property/auteur
http://fr.dbpedia.org/property/débutConstr
http://fr.dbpedia.org/property/finConst
http://fr.dbpedia.org/property/style
http://fr.dbpedia.org/property/titre
http://dbpedia.org/ontology/depictionDescription

and the following geographic features

p
http://dbpedia.org/ontology/city
http://dbpedia.org/ontology/country
http://dbpedia.org/ontology/department
http://dbpedia.org/ontology/region
http://fr.dbpedia.org/property/département
http://fr.dbpedia.org/property/géolocalisation
http://fr.dbpedia.org/property/lieu
http://fr.dbpedia.org/property/région
http://fr.dbpedia.org/property/ville
http://fr.dbpedia.org/property/lienRégion
http://fr.dbpedia.org/property/pays
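
To make the semantics of that query explicit, here is a minimal Python sketch of the same join on in-memory triples. The function name and the sample triples are hypothetical and are not DBpedia data. The point it illustrates is that a predicate is returned only when a synagogue and an abbey share the same object for it.

```python
def shared_predicates(triples_a, triples_b):
    """Return predicates p for which some (p, o) pair occurs in both sets."""
    pairs_a = {(p, o) for _, p, o in triples_a}
    pairs_b = {(p, o) for _, p, o in triples_b}
    return sorted({p for p, _ in pairs_a & pairs_b})

# Hypothetical (subject, predicate, object) triples, for illustration only.
synagogues = [
    ("syn1", "dbo:country", "France"),
    ("syn1", "prop-fr:culte", "Judaïsme"),
]
abbeys = [
    ("abb1", "dbo:country", "France"),
    ("abb1", "prop-fr:culte", "Catholicisme"),
]

print(shared_predicates(synagogues, abbeys))  # ['dbo:country']
```

In this toy data prop-fr:culte is used on both sides but with different objects, so it is not returned. This may be one reason why geographic predicates, whose values are easily shared, are so present in the result.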

Predicates most used by a category

The query

SELECT (COUNT(DISTINCT ?s2) AS ?total)
WHERE
{ 
?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> .
?s2 ?p ?o
}
LIMIT 100

tells me that, at the time I ran this query, there were 631 elements of type dbpedia-fr:Abbaye.

The query

SELECT ?p (COUNT(?p) AS ?pTotal)
WHERE
{ 
?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> .
?s2 ?p ?o
}
GROUP BY ?p
ORDER BY DESC(?pTotal)
LIMIT 100

gives me the predicates used on abbeys, ranked by number of occurrences. Here is the beginning of the results:

p pTotal
http://dbpedia.org/ontology/wikiPageWikiLink 35996
http://www.w3.org/1999/02/22-rdf-syntax-ns#type 3975
http://purl.org/dc/terms/subject 3383
http://www.w3.org/2002/07/owl#sameAs 2963
http://dbpedia.org/ontology/wikiPageExternalLink 968
http://xmlns.com/foaf/0.1/name 870
http://fr.dbpedia.org/property/type 656
http://dbpedia.org/ontology/type 648
http://www.w3.org/2000/01/rdf-schema#label 631
http://xmlns.com/foaf/0.1/isPrimaryTopicOf 631
http://www.w3.org/ns/prov#wasDerivedFrom 631
http://dbpedia.org/ontology/wikiPageRevisionID 631
http://dbpedia.org/ontology/wikiPageLength 631
http://dbpedia.org/ontology/wikiPageID 631
http://dbpedia.org/ontology/wikiPageOutDegree 631
http://www.w3.org/2000/01/rdf-schema#comment 627
http://dbpedia.org/ontology/abstract 627
http://fr.dbpedia.org/property/nommonument 622
http://fr.dbpedia.org/property/ville 618
http://www.georss.org/georss/point 616
http://www.w3.org/2003/01/geo/wgs84_pos#lat 616
http://www.w3.org/2003/01/geo/wgs84_pos#long 616
http://fr.dbpedia.org/property/longitude 615
http://fr.dbpedia.org/property/latitude 615
http://dbpedia.org/ontology/city 609
http://fr.dbpedia.org/property/géolocalisation 592
http://fr.dbpedia.org/property/culte 588
http://dbpedia.org/ontology/religiousOrder 585
http://xmlns.com/foaf/0.1/depiction 552
http://dbpedia.org/ontology/thumbnail 540
http://fr.dbpedia.org/property/photo 530
http://fr.dbpedia.org/property/région 489
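
The grouping done by this query can be mimicked in Python to see what is being counted. The sketch below uses hypothetical triples and a hypothetical function name. It counts triples per predicate, which is what COUNT(?p) does, so a predicate used several times on the same resource can exceed the number of resources.

```python
from collections import Counter

def predicate_counts(triples):
    """Count triples per predicate, like GROUP BY ?p with COUNT(?p)."""
    return Counter(p for _, p, _ in triples).most_common()

# Hypothetical triples for two abbeys, for illustration only.
triples = [
    ("abb1", "rdfs:label", "Abbaye A"),
    ("abb1", "dbo:wikiPageWikiLink", "Catholicisme"),
    ("abb1", "dbo:wikiPageWikiLink", "Cloître"),
    ("abb2", "rdfs:label", "Abbaye B"),
    ("abb2", "dbo:wikiPageWikiLink", "Catholicisme"),
]

print(predicate_counts(triples))
```

Read this way, the 35996 wikiPageWikiLink triples in the table, spread over 631 abbeys, amount to about 57 links per abbey on average.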

This helps us move forward in the search for the ‘properties’ described for most dbpedia-fr:Abbaye resources.

We see that there are many http://dbpedia.org/ontology/wikiPageWikiLink triples, which correspond to links to other pages. This finding can be generalized to many categories of DBpedia. I will come back to it in a future post. One may wonder whether some links appear more often than others.

Searching for links shared by many elements of a category

The query

SELECT ?o (COUNT(?o) AS ?oTotal)
WHERE
{ 
?s2 <http://dbpedia.org/ontology/type> <http://fr.dbpedia.org/resource/Abbaye> .
?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o
}
GROUP BY ?o
ORDER BY DESC(?oTotal)
LIMIT 100

gives us the answer:

http://fr.dbpedia.org/resource/Abbaye appears 629 times, which is easily explained.

http://fr.dbpedia.org/resource/Catholicisme appears 377 times, which is also logical.

Then we have http://fr.dbpedia.org/resource/Ordre_de_Saint-Benoît, 220 times; the explanation remains to be found. Suggestions are welcome.

The fourth link is

http://fr.dbpedia.org/resource/Révolution_française

which makes sense, although it suggests a strong historical bias in the sources of DBpedia.

The fifth is http://fr.dbpedia.org/resource/Monument_historique_(France), which could be useful to us as well.

We can consider that all the elements at the top of this result can contribute to our exploration/exploitation of DBpedia.

 

o oTotal
http://fr.dbpedia.org/resource/Abbaye 629
http://fr.dbpedia.org/resource/Catholicisme 377
http://fr.dbpedia.org/resource/Ordre_de_Saint-Benoît 220
http://fr.dbpedia.org/resource/Révolution_française 191
http://fr.dbpedia.org/resource/Monument_historique_(France) 182
http://fr.dbpedia.org/resource/Église_catholique_romaine 170
http://fr.dbpedia.org/resource/Catégorie:Abbaye_du_Moyen_Âge 156
http://fr.dbpedia.org/resource/Ordre_cistercien 137
http://fr.dbpedia.org/resource/Bénédictin 137
http://fr.dbpedia.org/resource/Architecture_romane 120
http://fr.dbpedia.org/resource/France 116
http://fr.dbpedia.org/resource/Catégorie:Abbaye_bénédictine_en_France 111
http://fr.dbpedia.org/resource/Règle_de_saint_Benoît 106
http://fr.dbpedia.org/resource/Architecture_gothique 100
http://fr.dbpedia.org/resource/Catégorie:Abbaye_monument_... 98
http://fr.dbpedia.org/resource/Liste_des_abbayes_et_monastères 94
http://fr.dbpedia.org/resource/Cloître 89
http://fr.dbpedia.org/resource/Monastère 86

Compare this with what we get for Synagogue, using a similar query over 78 synagogues:

 

o oTotal
http://fr.dbpedia.org/resource/Synagogue 78
http://fr.dbpedia.org/resource/Judaïsme 63
http://fr.dbpedia.org/resource/Arche_sainte_(synagogue) 25
http://fr.dbpedia.org/resource/Juifs 25
http://fr.dbpedia.org/resource/Catégorie:Synagogue_monument_historique_(France) 20
http://fr.dbpedia.org/resource/Paris 18
http://fr.dbpedia.org/resource/Hébreu 18
http://fr.dbpedia.org/resource/Seconde_Guerre_mondiale 18
http://fr.dbpedia.org/resource/Rabbin 17
http://fr.dbpedia.org/resource/Monument_historique_(France) 17
http://fr.dbpedia.org/resource/Bimah 17
http://fr.dbpedia.org/resource/Ashkénaze 16
http://fr.dbpedia.org/resource/Île-de-France 15
http://fr.dbpedia.org/resource/Tables_de_la_Loi 15
http://fr.dbpedia.org/resource/Shoah 14
http://fr.dbpedia.org/resource/Sefer_Torah 13
http://fr.dbpedia.org/resource/Alsace 13
http://fr.dbpedia.org/resource/Architecte 13

In almost all cases, though not all, the most common link points to the resource type that was used in the query.
The second link seems to be a URI designating a religion. Perhaps this gives us a way to automatically associate a type of religious building with a religion. Let’s try a few examples:

http://fr.dbpedia.org/page/Mosquée is thus associated with http://fr.dbpedia.org/page/Islam

http://fr.dbpedia.org/resource/Cathédrale is thus associated with http://fr.dbpedia.org/resource/Église_catholique_romaine and, in third position, with http://fr.dbpedia.org/resource/Catholicisme

http://fr.dbpedia.org/resource/Temple is thus associated with http://fr.dbpedia.org/resource/Protestantisme and, in third position, with http://fr.dbpedia.org/resource/Bouddhisme

In all these cases, the second answer is admissible; the last two examples show situations where, for different reasons, the third answer is also relevant.

In fact, this suggests that every religion referenced after the first link would be an admissible answer. So we need to see whether these resources have a property in common that identifies them as religions in DBpedia.

 

Qualifying the links

A first visual exploration gives us

http://fr.dbpedia.org/page/Judaïsme has dcterms:subject dbpedia-fr:Catégorie:Religion_au_Moyen-Orient and carries the inverse property ‘is prop-fr:religion of’

http://fr.dbpedia.org/page/Catholicisme has dcterms:subject dbpedia-fr:Catégorie:Branche_du_christianisme and also carries ‘is prop-fr:religion of’

It appears that the common feature is to be the object of the predicate prop-fr:religion, which the DBpedia page displays as ‘is prop-fr:religion of’.

The query

 

SELECT ?o (COUNT(?o) AS ?oTotal)
WHERE
{
 ?s <http://fr.dbpedia.org/property/religion> ?o  
}
GROUP BY ?o
ORDER BY DESC(?oTotal)
LIMIT 20

lets us find the elements that are objects of that predicate:

o oTotal
http://fr.dbpedia.org/resource/Catholicisme 795
http://fr.dbpedia.org/resource/Islam 509
http://fr.dbpedia.org/resource/Christianisme 286
http://fr.dbpedia.org/resource/Église_catholique_romaine 211
http://fr.dbpedia.org/resource/Protestantisme 171
http://fr.dbpedia.org/resource/Sunnisme 149
http://fr.dbpedia.org/resource/Judaïsme 140
http://fr.dbpedia.org/resource/Luthéranisme 93
“Autres”@fr 89
http://fr.dbpedia.org/resource/Bouddhisme 86
http://fr.dbpedia.org/resource/Animisme 78
http://fr.dbpedia.org/resource/Hindouisme 75
“Christianisme orthodoxe”@fr 72
http://fr.dbpedia.org/resource/Christianisme_orthodoxe 69
http://fr.dbpedia.org/resource/Anglicanisme 65
“hindoue”@fr 60
“Aucune”@fr 58
http://fr.dbpedia.org/resource/Religions_traditionnelles_africaines 56
http://fr.dbpedia.org/resource/Presbytérianisme 55
http://fr.dbpedia.org/resource/Église_catholique 51
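
Before combining everything in a single query, the filtering idea can be sketched in Python: keep, among the ranked links of a building type, only those whose target is used as an object of prop-fr:religion. The function name is hypothetical, the URIs are shortened to their last segment, and the data is a hand-copied excerpt of the two tables shown in this post.

```python
def religion_links(ranked_links, religion_uris):
    """Keep only the ranked links whose target is a known religion URI."""
    return [(uri, n) for uri, n in ranked_links if uri in religion_uris]

# Top links for Abbaye, copied from the link table earlier in this post.
abbaye_links = [
    ("Abbaye", 629),
    ("Catholicisme", 377),
    ("Ordre_de_Saint-Benoît", 220),
    ("Révolution_française", 191),
    ("Monument_historique_(France)", 182),
    ("Église_catholique_romaine", 170),
]
# A few objects of prop-fr:religion, copied from the table just above.
religions = {"Catholicisme", "Islam", "Christianisme",
             "Église_catholique_romaine", "Protestantisme", "Judaïsme"}

print(religion_links(abbaye_links, religions))
```

On this excerpt the filter keeps Catholicisme (377) and Église_catholique_romaine (170), in that order, and drops the links that do not designate a religion.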

Associating buildings with a religion

Let us try to associate a religion with each building type using the query:

SELECT ?s ?o (COUNT(?o) AS ?oTotal)
WHERE
{ 
?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type>  .
?s2 <http://dbpedia.org/ontology/type> ?s .
?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o .
?s3 <http://fr.dbpedia.org/property/religion> ?o  
}
GROUP BY ?s ?o
ORDER BY DESC(?oTotal)
LIMIT 100

Here are the first results (slightly edited to fit the page width):

 

s o oTotal
dbpedia-fr:Abbaye dbpedia-fr:Catholicisme 599430
dbpedia-fr:Église_(édifice) dbpedia-fr:Catholicisme 418170
dbpedia-fr:Basilique_(christianisme) dbpedia-fr:Catholicisme 357750
dbpedia-fr:Chapelle dbpedia-fr:Catholicisme 206700
dbpedia-fr:Mosquée dbpedia-fr:Islam 167970
dbpedia-fr:Église_(édifice) dbpedia-fr:Église_catholique_romaine 132086
dbpedia-fr:Collégiale dbpedia-fr:Catholicisme 89040
dbpedia-fr:Basilique_(christianisme) dbpedia-fr:Église_catholique_romaine 78914
dbpedia-fr:Abbaye dbpedia-fr:Église_catholique_romaine 71740
dbpedia-fr:Couvent dbpedia-fr:Catholicisme 62010
dbpedia-fr:Chapelle dbpedia-fr:Église_catholique_romaine 56126
dbpedia-fr:Monastère dbpedia-fr:Catholicisme 34980
dbpedia-fr:Collégiale dbpedia-fr:Église_catholique_romaine 21944
dbpedia-fr:Synagogue dbpedia-fr:Judaïsme 17640
dbpedia-fr:Monastère dbpedia-fr:Église_catholique_romaine 11394
dbpedia-fr:Couvent dbpedia-fr:Église_catholique_romaine 8862
dbpedia-fr:Abbaye dbpedia-fr:Protestantisme 8550
dbpedia-fr:Synagogue dbpedia-fr:Catholicisme 7950
dbpedia-fr:Basilique_(christianisme) dbpedia-fr:Christianisme 7436
dbpedia-fr:Baptistère dbpedia-fr:Catholicisme 6360
dbpedia-fr:Mosquée dbpedia-fr:Sunnisme 5066
dbpedia-fr:Temple dbpedia-fr:Protestantisme 4788
dbpedia-fr:Église_(édifice) dbpedia-fr:Protestantisme 4104
dbpedia-fr:Église_(édifice) dbpedia-fr:Christianisme 4004
dbpedia-fr:Abbaye dbpedia-fr:Christianisme 4004
dbpedia-fr:Cloître dbpedia-fr:Catholicisme 3180

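We see that relevant associations are obtained, and that there are sometimes several associations for the same building type.

The resulting count can be used to assess the reliability of the association, but only as a relative score, not as a number of buildings: each join multiplies the rows. For example, 599430 = 2 × 377 × 795, where 377 is the number of abbey links to Catholicisme and 795 the number of uses of prop-fr:religion with Catholicisme as object, both found above; the factor 2 presumably comes from the unbound ?p matching two predicates.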
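
How these totals decompose can be checked with a few lines of Python. The sketch assumes that each oTotal is the product of three factors: the number of links from the building type to the religion, the number of uses of prop-fr:religion with that religion as object, and a factor 2. That last factor is a guess (the unbound ?p may match two predicates) and the function name is hypothetical.

```python
def links_behind_total(o_total, religion_uses, predicate_paths=2):
    """Undo the join multiplication; fail loudly if the total does not factor."""
    quotient, remainder = divmod(o_total, religion_uses * predicate_paths)
    if remainder:
        raise ValueError("total is not a multiple of the assumed factors")
    return quotient

# (oTotal from the association table, uses from the prop-fr:religion table)
rows = {
    ("Abbaye", "Catholicisme"): (599430, 795),
    ("Abbaye", "Église_catholique_romaine"): (71740, 211),
    ("Synagogue", "Judaïsme"): (17640, 140),
    ("Mosquée", "Islam"): (167970, 509),
}

for pair, (o_total, uses) in rows.items():
    print(pair, links_behind_total(o_total, uses))
```

Under this assumption the first three rows give back 377, 170 and 63, which are exactly the link counts found earlier for Abbaye and Synagogue, and the Mosquée row suggests 165 links to Islam.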

The query

SELECT ?s  (COUNT(?s) AS ?sTotal)
WHERE
{ 
?s ?p <http://fr.dbpedia.org/resource/Catégorie:Édifice-type>  .
?s2 <http://dbpedia.org/ontology/type> ?s .
?s2 <http://dbpedia.org/ontology/wikiPageWikiLink> ?o .
?s3 <http://fr.dbpedia.org/property/religion> ?o  
}
GROUP BY ?s 
ORDER BY DESC(?sTotal)
LIMIT 100

returns the building types associated with a religion; here is the list (shortened, discarding the least represented answers):

s sTotal
http://fr.dbpedia.org/resource/Abbaye 688350
http://fr.dbpedia.org/resource/Église_(édifice) 567408
http://fr.dbpedia.org/resource/Basilique_(christianisme) 450166
http://fr.dbpedia.org/resource/Chapelle 268082
http://fr.dbpedia.org/resource/Mosquée 177246
http://fr.dbpedia.org/resource/Collégiale 113068
http://fr.dbpedia.org/resource/Couvent 72060
http://fr.dbpedia.org/resource/Monastère 59322
http://fr.dbpedia.org/resource/Synagogue 31322
http://fr.dbpedia.org/resource/Temple 9684
http://fr.dbpedia.org/resource/Baptistère 6782
http://fr.dbpedia.org/resource/Cloître 3576
http://fr.dbpedia.org/resource/Bibliothèque 3348
http://fr.dbpedia.org/resource/Calvaire_(édifice) 2012
http://fr.dbpedia.org/resource/Tombeau_(architecture) 1956
http://fr.dbpedia.org/resource/Hôpital 1590
http://fr.dbpedia.org/resource/Campanile 1590
http://fr.dbpedia.org/resource/Temple-montagne 1374
http://fr.dbpedia.org/resource/Église-halle 950
http://fr.dbpedia.org/resource/Prison 424
http://fr.dbpedia.org/resource/Clocher 422
http://fr.dbpedia.org/resource/Citadelle 342
http://fr.dbpedia.org/resource/Ferme_fortifiée 342

It would probably be useful to divide the number of associations (religion-building) by the number of representatives of each building type, to obtain a measure of how representative each association is.

I do not know (as of the publication date of the French version of this post) how to do that in SPARQL to obtain the best associations. But does it have to be done in SPARQL? In JavaScript, for example, I could easily run the first query and retrieve the results as JSON, then run the other query, and finally compute the relevance from the two JSON results. To be continued.
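
The last step of that plan (two result sets, then a division) is simple enough to sketch. Python is used here rather than JavaScript purely for illustration, and the function name is hypothetical. The inputs are the plain link counts and type sizes quoted earlier in this post (631 abbeys, 78 synagogues), not the larger totals of the combined query.

```python
def association_scores(link_counts, type_sizes):
    """Share of a building type's members that link to each religion."""
    scores = {}
    for (building, religion), n_links in link_counts.items():
        scores[(building, religion)] = n_links / type_sizes[building]
    # Highest share first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

type_sizes = {"Abbaye": 631, "Synagogue": 78}
link_counts = {
    ("Abbaye", "Catholicisme"): 377,
    ("Abbaye", "Église_catholique_romaine"): 170,
    ("Synagogue", "Judaïsme"): 63,
}

for (building, religion), score in association_scores(link_counts, type_sizes):
    print(f"{building} -> {religion}: {score:.2f}")
```

With these figures, Synagogue and Judaïsme score about 0.81, Abbaye and Catholicisme about 0.60, and Abbaye and Église_catholique_romaine about 0.27, so the ranking no longer depends on how many buildings of each type DBpedia happens to describe.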

In summary

I found that the predicate most used on the set of elements I was considering was http://dbpedia.org/ontology/wikiPageWikiLink. By searching for links shared by many elements of a class, I highlighted a ‘feature’ of this category. Then, by looking at the predicates commonly used with this feature, I could qualify it. Finally, although my original class did not carry this knowledge, I could make an association between a category such as Abbey and one or more religions. This treatment is partly statistical. The relevance of the associations still has to be assessed, but the first results are interesting.

That’s all for now. I have shown how, with some fairly simple queries, you can generate knowledge that is not explicitly available in DBpedia.

In a future post, I will try to systematize and generalize the approach.

(The French version of this post was first published here: https://onsem.wp.imt.fr/2015/05/15/creer-des-connaissances-formalisees-pour-le-web-semantique-a-partir-de-dbpedia/)
