Temporal nearness – data model

In my master thesis, one important attribute is the temporal nearness of concepts in the domain, such as moving persons and equipment. Special concerns for the case is that the geography is indoors in potentially complex buildings, such as large hospitals. The geography is considered to be known to the actors – assuming that the persons know where they are, where they are headed to and how to get there. This may argue that a floor map depicting where you are, your target and fastest route there is unnecessary – and I agree. However what may be of interest is to get information on how far away you are from your interest points and how far your interest points are from each other. For instance if you are 15 minutes away from your office, you have planned to meet your colleague there at 15 minutes, but your colleague is 20 minutes away. Provided by this information is 5 minutes that you can finish what you are doing, walk to your office, and still meet your colleague in time! (pretty lame example I know – working on my exemplifying skills:). Anyhow, the driving information is thus the temporal nearness. This is highly dependant on the geography in the context and it’s navigational abilities. Building generally has rooms with doors and corridors which in turns comprise and constrains the ability to move in that space (i.e. walking). My attempt at finding the temporal nearness comprise a spatial model of such a geography, a data structure for navigational abilities, implementation of this model in a spatial database (PostGIS) and calculation of the temporal nearness based on this. This blog post tries to explain just this. I plan to post a separate post on how to manage the data for this model - so watch out:)

Spatial model and data structure for indoor temporal nearness

Spatial model and data structure

Figure1: Spatial model and data structure

Figure 1 shows the ER model that stores the spatial data as well as the data structure for nearness calculations. It is very simple, which I believe is a strength. Rooms/corridors are contained in the Area table – which holds just the spatial information of the room as a polygon. This is sufficient for the case in question, however it could (and should) be extended with more information, especially which floor it resides.

To support temporal nearness I needed a graph-like structure, nodes being start and target points, related to each others by edges that assume a cost of traversing them. This is stored in node and node_edge. Nothing really fancy here either. Plain database modelling of graph’s spiced with some spatial data.

More interesting is the potential issues of modelling indoor geographies. First of all – graph theory and fastest route algorithms only deals with start and end points in nodes. In the real world these nodes does not exist. So an arbitrary point (a persons location) would have to “snap” to the best node to be able to “traverse the graph” (i.e. navigate). The best node is probably the nearest node in Euclidean distance (i.e. common nearest in meters) – however depending on the density of nodes and the geography there are some pitfalls. Wiggly and skewed rooms/corridors with low density of nodes could give harsh errors by finding for instance that the nearest node is one that is actually far away or even worse in another floor or another room. Possible solutions to the problem of “snapping” an arbitrary point to the graph are; 1. Constrain the search space of nodes to rooms. This is enabled by the modelling of rooms in the Area table. 2. Increase the node density (i.e. nodes “everywhere”, for instance every 10 cm). This decrease the performance of the algorithms, possibly pretty bad. Making of the nodes and edges between them becomes so complex or time consuming or both that it must be automated. However one could foresee that automation of this process is possible, and probably not that difficult. Although the model in figure 1 supports this I will assume a low density mesh of nodes, but design algorithms that easily scale for high density.

Edges in the model are stored as geographical lines which, from a computer science perspective makes no sense. However it does make sense from a GIS perspective, where edges could act as possible walking routes as well as edges in a graph data structure. Although I have only experimented with conceptual edges that neglects the route of walking but retains the time it takes as the cost of the edge.

As mentioned, floors are not included in the model, and you might think; “Haha, this will never work in a real life situation – floors makes it all to complex!”. Well, you might have some truth in that thought. However, it has not been forgotten and it has been discussed with fellow students and thesis supervisors. The conclusion to modelling floors is to just “wrap out” the building and just include nodes and edges where the floors connect. For instance an elevator is a fast way of moving from one floor to another, compared to stairs, this is modelled as nodes in the start/end of the elevator and the elevator as an edge with a cost (smaller than that of the stair-edge). And whops, solved! (or really?) Well, I fear that much is still to be figured out here, and I suspect that some issues lurk for the floor issue (such as coordinat systems used and similar). I believe however that this idea may prove to have something to it.

So over to the more technical side.

Implementation

Theory has a tendency to make complex problems seem less complex, however, for the theory to be successful it needs to be proven in real life. So I have tested my ideas by a proof of concept implementation illustrating important solutions. I decided to use PostGIS, which is a spatial extension to the PostgreSQL database that is quite powerful, in addition I needed to compute the fastest route based on the nodes and edges. I had two choices here, either implement a known fastest route algorithm, such as Djikstra’s algorithm, myself, or find a third-party that did it for me. After some searching I ended up with the latter, namely, yet another extension to PostgreSQL/PostGIS, namely pgRouting which proved to be exactly what I needed, and even better it integrated directly in the infrastructure I chose – nice! Then all I needed was to design some (simple) algorithms that was suited for the task at hand.

Firstly I needed to enable spatial functionality on the fields of the tables holding the spatial information, this is fairly straight forward and includes commands like;

SELECT AddGeometryColumn('polygon','area',32631,'POLYGON',2);

Which adds a POLYGON geometry to the field area.polygon, with the coordinate system (actually projection) 32631, which is short for a UTM zone that I chose. Actually the projection is not that important in my case as it is indoors, and thus don’t have so much to do with the world as most maps do. However UTM is nice as it uses meters as metric and the zone I chose natively starts at N=0. Additionally I needed to add some functionality for the pgRouting as well, it proved no difficulty, however some issues with the installation of pgRouting occured, but that was mostly because of a tweaked installation of PostgreSQL that was in another location than default (including altering some make files), so I recommend to go for default installation paths:) Well, the topology creation that pgRouting needs goes something like this (not sure if it is absolutely correct, see pgRouting docs for more);

SELECT assign_vertex_id(node_edge, 0.01, line, edge_id);

So, no worries there, following the documentation and examples was straight forwards. Then I was ready to begin the fun stuff:)

First I needed to find the nearest node from an arbitrary point [eg. (0.2, 3)], given the constraints mentioned earlier. This ment that I needed to find the polygon the point was in, this proved very easy in PostGIS using the function ST_Within() which checks whether a point is inside a polygon or not, very handy for a fairly complex task. Secondly I needed to find the nodes within the polygon that the point is in. Some set-theoretic thinking about it and the solution is (in SQL);

SELECT node.id AS node_id, area.id AS area_id,
 distance(GeomFromText('POINT(0.2 3)', 32631), node.point) AS distance
 FROM node, area
 WHERE ST_Within(GeomFromText('POINT(0.2 3)', 32631), area.polygon) = 't'
 ORDER BY distance ASC LIMIT 1

Which should get all nodes and all areas constrained by the areas that have the point (o.2 3) inside them (i.e. 1 polygon = 1 room). Additionally it calculates the distance from the point to all nodes in the set, orders it and limits the result to 1 row (i.e. the nearest node). Distance is straight forward with the PostGIS distance() function which performs an Euclidean distance calculation and the issues associated with it, as mentioned earlier.

As the task of finding the nearest node is needed several times in different contexts it is preferred to encapsulate the logic in some method-like approach. As I didn’t have any experience with PL/PGSQL, I thought that it would be nice to try it out. I knew that it was a language for creating functions in the PostgreSQL DBMS, which was exactly what I needed. The definition for the function that encapsulate the above logic is;

CREATE OR REPLACE FUNCTION alexanno_find_nearest_node("X" double precision, "Y" double precision)
  RETURNS integer AS
$BODY$SELECT node.id AS node_id
FROM node, area
WHERE
  ST_Within(GeomFromText('POINT('||CAST($1 AS text)||' '||CAST($2 AS text)||')', 32631), area.polygon) = 't'
ORDER BY
  distance(GeomFromText('POINT('||CAST($1 AS text)||' '||CAST($2 AS text)||')', 32631), node.point) ASC
LIMIT 1
$BODY$
  LANGUAGE 'sql' VOLATILE
  COST 100;
ALTER FUNCTION alexanno_find_nearest_node(double precision, double precision) OWNER TO postgres;

The function is of course specific to my case, and is to be used as an example only.

Next up was the calculation of temporal nearness, which consisted mainly of computing the shortest path in the graph. Using pgRouting for this proved to be very nice, however the method for obtaining shortest path consists of a “stand-alone” query, so I needed to exhibit some subquerying, which resultated in several queries that does different things, for example finding the shortest path from one node to everyone else:

SELECT *, (SELECT SUM(cost) FROM shortest_path('SELECT edge_id AS id, source::int4,
         target::int4, cost::double precision
        FROM node_edge',(SELECT id FROM node AS n WHERE n.id = 4), node.id, false, false) AS subQ) AS subq2
         FROM node

Other SQL queries that I note down as I go forward are documented at my wiki page on PostGIS which of course are subject to radical change at any given time:)

One additional method that I believe may be useful is to create a view that contains all shortest path available in the graph. However this can pose some performance issues as it will generate n^2 rows, which potentially is several millions. Although leaving performance heavy tasks up to DBMS’s is something I believe is a good thing as DBMS’s traditionally implements performance tweaks way beyond what, at least I can achieve:) The definition of the view is found at my wiki

So that was my current investigations and thoughts on temporal nearness and spatial modelling of indoors environments. Do you have any thoughts on this? Am I on some right track with this? Any experience in performance issues with PostgreSQL/PostGIS/pgRouting and similar problems like this one?

Shortly I will post a follow-up to this post and share my experience with data management of the implementation described here. This will include mostly experience with µDig connected to PostGIS.


Del på:
  • Facebook
  • Twitter
  • LinkedIn
  • Add to favorites

1 Comment »

  1. [...] the promised follow-up to the post on data modelling in PostGIS. In this post I will cover how I populated the database using a graphical tool called [...]

RSS feed for comments on this post. TrackBack URI

Leave a comment

This work is licensed under a Creative Commons Attribution 3.0 Unported License.
(c) 2010 What's Sound? | powered proudly by WordPress