How do I design a database to store properties, selecting attributes by synonyms

I would take a different approach to your attribution scheme. Rather than treating different attributions as synonyms, I would treat them as overlapping, or more specifically, nested descriptions of a property. This would handle your business case while at the same time acknowledging the astute observation made by Mike Sherrill.

Here is a quick ERD sketch:

ERD

By way of a very fast data dictionary:

PROPERTY is a piece of real estate.

CATEGORY is a collection of descriptive attributes. The point of this table is more as an organizer of attributes than anything else. It could include things like “type of property”, “ownership structure”, “number of bathrooms”, and whatever else might be of interest.

ATTRIBUTE is a specific quality of interest. Note the involuted relationship on this entity type. I’ll deal more with that later. The main point is that attributes can be more general or more specific and some attributes can be seen as refinements of other attributes.

DESCRIPTOR is the intersection of a PROPERTY and the ATTRIBUTEs that have been associated with that particular piece of real estate.

So how is this supposed to help?

The key is how attributes work. If you use a nested set model, then you can address more or less specific attribution and searching criteria. Consider the following diagram of one potential CATEGORY with its associated ATTRIBUTEs:

enter image description here

In this example the CATEGORY is “type of property”. You can see from the diagram that there is a hierarchical breakdown of attributes in this category. Each box in the diagram is a record in ATTRIBUTE. Boxes that contain other boxes have child attributes. Boxes that are inside another box have an FK to their containing box and so forth.

In this way, you could say “I want to find a property that is a Penthouse”. You can then find PROPERTY records with a related DESCRIPTOR that points at the “Penthouse” ATTRIBUTE. That is pretty easy. But what if your search comes up empty?

The advantage of this approach is that you can then loosen your criteria by saying, “let’s go up the attribution hierarchy to the next less-specific thing than penthouse”. In my example, that would be “Highrise”. Now you try your search again and you might have better luck.

A system like this gives you the ability to be as specific as you want in each category of attribution while relaxing the others far enough to start getting search hits. This is really what a real estate agent’s work is about is it not? Helping the client to make the necessary compromises to find the best fit to their most important criteria?

Handling Nested Sets

The only tricky part of this approach is how to handle the nested sets. There are lots of ways to do this, many of which have been thoroughly documented elsewhere. I myself like the visitation number technique, especially for relatively static data sets. This makes it very easy to find matches for some given ATTRIBUTE or any of its children without having to do anything exotic in your SQL.

EDIT: So How Does This Work?

OP asked how do you handle things like number of bedrooms and what do the queries look like? Let’s take another example for illustration:

Bedroom Example

The above shows the nested sets for the CATEGORY “Number of Bedrooms”. I’ve also added the visitation numbers to the diagram. Note the way the visitation numbers work, in particular, note that the left (green) and right (red) numbers for any given attribute value contain the left and right visitation numbers for any subordinate attributes. For example, “2+ Bedrooms” has left and right numbers 6 and 15 respectively. Every attribute that falls under “2+ Bedrooms” has left and right numbers that fall within this range.

So how would you query for a properties with a given descriptor? Let’s say we want to find all properties with two or more bedrooms. The SQL for such a query might look something like this:

select P.* 
from PROPERTY P
  inner join DESCRIPTOR D
    on P.id = D.property_id
  inner join ATTRIBUTE A
    on D.attribute_id = A.id
where A.left >= (select X.left from ATTRIBUTE X
                 where X.name="2+ Bedrooms")
  and A.right <= (select Y.right from ATTRIBUTE Y
                  where Y.name="2+ Bedrooms")

Note that the above query is a little different that what you might actually use. For example, you’d probably look up the filtering attribute using its int identity key instead of its string name. However, I thought I’d leave it as shown for clarity around the main point, which is you filter by looking not for a specific related attribute, but for any related attributes that fall within your filter range.

If you wanted to filter on multiple attributes, then just add more sub-clauses to your where clause.

Leave a Comment