Calculate length of path between nodes?

This is based on the same technique used to compute the position of an element in an RDF list using SPARQL that is described in: Is it possible to get the position of an element in an RDF Collection in SPARQL?

If you have data like this:

@prefix : <http://example.org> .

:orgA :hasSuborganization :orgB, :orgC, :orgD.
:orgB :hasSuborganization :orgE, :orgF.
:orgE :hasSuborganization :orgG.
:orgG :hasSuborganization :orgH.

which describes a hierarchy like this:

organization hierarchy

then you can use a query like this:

prefix : <http://example.org> 

select ?super ?sub (count(?mid) as ?distance) { 
  ?super :hasSuborganization* ?mid .
  ?mid :hasSuborganization+ ?sub .
}
group by ?super ?sub 
order by ?super ?sub

to get results like these:

$ sparql --query query.rq --data subs.n3
----------------------------
| super | sub   | distance |
============================
| :orgA | :orgB | 1        |
| :orgA | :orgC | 1        |
| :orgA | :orgD | 1        |
| :orgA | :orgE | 2        |
| :orgA | :orgF | 2        |
| :orgA | :orgG | 3        |
| :orgA | :orgH | 4        |
| :orgB | :orgE | 1        |
| :orgB | :orgF | 1        |
| :orgB | :orgG | 2        |
| :orgB | :orgH | 3        |
| :orgE | :orgG | 1        |
| :orgE | :orgH | 2        |
| :orgG | :orgH | 1        |
----------------------------

The trick here is to recognize that any path from X to Y can be viewed as a (possibly empty) path from X to some intermediate node Z (nonempty means that you can choose X as Z) concatenated with a (non empty) path from Z to Y. The number of possible ways of picking Z indicates the length of the path.

Leave a Comment