Introduction

Lately we had several projects where we had to store in a database very different items that shared a common state.

As an example take the RocketLab website you are reading: Events and BlogPosts are aggregated in the LabLog list as if they were similar items. And indeed they all have a Title, a Date and a Description.

But if you get the detail page of an Event or a BlogPost you can see that they actually don't contain the same information: a BlogPost contains essentially formatted text when an Event contains more structured information such as the place where the event will take place, the type of event it is, if people need to register to attend, etc..

Still we have to access those entities sometimes as similar items (in the LabLog list) or as different items (in the events list and in the blog posts list).

NaĆÆve database model

Our first idea, and it was not that bad, Drupal does just the same, was to have a database table with the common fields, a field containing the type of item (it's either an event or a blog post) and a data field where we serialized the corresponding PHP object. This approach was ok until we had to filter or search LabLog items based on fields that were contained in the serialized data.

Indeed SQL does not know anything about PHP serialized data, thus you cannot use any of it's features on that data.

So how do you get all the LabLog items that are Events, happen in April 2012 and are ā€œtechtalksā€? The only way is to go through all the Events records of April, unserialize the data and check if it's a techtalk event. In SQL you would normally only do a single request to find those items.

A better database model

There is a better way to model this in a database, it's called table inheritance. It exists in two forms: single table inheritance and multiple table inheritance.

Multiple table inheritance

Multiple table inheritance requires to use three tables instead of a single one. The idea is to keep the common data in a ā€œparentā€ table, which will reference items either in the Event table or in the BlogPost table. The type column (called the discriminator) helps to find out if the related item should be searched in the Event table or in the BlogPost table. This is called multiple table inheritance because it tries to model the same problem as object inheritance using multiple database tables.

Multiple table inheritance

When you have a LabLogItem you check the type field to know in which table to find the related item, then you look for that item with the ID equals to related_id.

Single table inheritance

Alternatively the same can be modelled in a single table. All the fields are present for all the types of LabLogItem but the one that do not pertain to this particular type of item are left empty. This is called single table inheritance.

Single table inheritance

Single or multiple table inheritance

The difference is really only in how the data is stored in the database. On the PHP side this will not change anything. One may notice that single table inheritance will promote performance because everything is in a single table and there is no need to use joins to get all the information. On the other hand, multiple table inheritance will allow a cleaner separation of the data and will not introduce ā€œdead data fieldsā€, i.e. fields that will remain NULL most of the time.

Table inheritance with Symfony and Doctrine

Symfony and Doctrine make it extremely easy to use table inheritance. All you need to do is to model your entities as PHP classes and then create the correct database mapping. Doctrine will take care of the hassle of implementing the inheritance in the database server.

Please note that the code I present here is not exactly what we use in RocketLab; we are developers and as such we always have to make things harder. But the idea is thereā€¦

The parent entity

In the case of RocketLab we created a parent (abstract) entity, called LabLogItem, that contains the common properties.

/**
 * This class represents a LabLog item, either a BlogPost or an Event.
 * It is abstract because we never have a LabLog entity, it's either an event or a blog post.
 * @ORMEntity
 * @ORMTable(name="lablog")
 * @ORMInheritanceType("SINGLE_TABLE")
 * @ORMDiscriminatorColumn(name="type", type="string")
 * @ORMDiscriminatorMap( {"event" = "Event", "blogpost" = "BlogPost"} )
 */
abstract class LabLogItem
{
    /**
     * @ORMId
     * @ORMColumn(type="integer")
     * @ORMGeneratedValue(strategy="AUTO")
     */
    protected $id;

    /**
     * @ORMColumn(type="date")
     */
    protected $date;

    /**
     * @ORMColumn(type="string")
     */
    protected $title;

    /**
     * @ORMColumn(type="text")
     */
    protected $description;

    /***** Getters and setters *****/

    public function getId()
    {
        return $this->id;
    }

    public function setDate($date)
    {
        $this->date = $date;
    }

    public function getDate()
    {
        return $this->date;
    }

    // And so on...
}

There are several things to note about the mapping:

  • @ORMInheritanceType: indicates that this entity is used as parent class in the table inheritance. This example uses single table inheritance, but using multiple tables inheritance is as easy as setting the parameter to ā€œJOINEDā€. Doctrine will create an manage the unique or multiple database tables for you !
  • @ORMDiscriminatorColumn: indicates which column will be used as discriminator (i.e. to store the type of item). You don't have to define this column in the entity, it will be automagically created by Doctrine.
  • @ORMDiscriminatorMap: this is used to define the possible values of the discriminator column as well as associating a specific entity class with each type of item. Here the discriminator columns may contain the string ā€œeventā€ or ā€œblogpostā€. When its value is ā€œeventā€ the class Event will be used, when its value is ā€œblogpostā€, the class BlogPost will be used.

Basically that's the only thing you need to use table inheritance, but let's have a look at the children entities.

The children entities

We have two regular entities to model the events and blog posts. Those entities extend LabLogItem.

/**
 * Represent a blog post item.
 * Note that this class extends LabLogItem
 */
class LabLogItemBlog extends LabLogItem
{
    /**
     * @ORMColumn(type="text")
     */
    protected $content;

    /***** Getters and setters *****/

    public function getContent()
    {
        return $this->content;
    }

    public function setContent($content)
    {
        $this->content = $content;
    }
}

/**
 * Represent an event item.
 * Note that this class extends LabLogItem
 */
class LabLogItemEvent extends LabLogItem
{
    /**
     * @ORMColumn(type="string")
     */
    protected $eventType;

    /**
     * @ORMColumn(type="string")
     */
    protected $location;

    /**
     * @ORMColumn(type="boolean")
     */
    protected $requiresRegistration;

    /***** Getters and setters *****/

    public function getEventType()
    {
        return $this->eventType;
    }

    public function setEventType($type)
    {
        $this->eventType = $type;
    }

    // And so on...
}

There is not much special in the children entities. An important thing to note is that the common fields defined in the parent entity LabLogItem SHOULD NOT be repeated here. Also you may notice that there is no annotations in the children such as @ORMEntity to indicate that they are entities. Indeed they will inherit the annotations of LabLogItem and become entities.

From now on, when you create a PHP object of type Event and ask the entity manager to persist it, Doctrine will automatically do the complex work for you. From the developper point of view, Events and BlogPosts are just entities like any other.

It's easy to do operations on items which you don't know exactly the type:

$item = $entityManager->getRepository('RocketLabBundle:LabLogItem')->findOneByDate($someDate);

// Here we don't know exactly whether $item contains a blog post or an event...

if ($item instanceof Event) {
    // Then it's an Event
    echo $item->getEventType();
} else {
    // Otherwise it's a BlogPost
    echo $item->getContent();
}

But, if you know the type of item you are using you still can use them as regular entities:

$item = $entityManager->getRepository('RocketLabBundle:Event')->findOneByDate($someDate);

//  We have searched the Event entity repository so what we get in $item MUST BE an Event
echo $item->getEventType();

Conclusion

As you can see above using table inheritance with Symfony and Doctrine is very easy. It's just a matter of creating the parent class and the correct mapping. Furthermore you can switch from single to multiple table inheritance by modifying one line of code.

This technique should be used whenever you need to store items with a common state but that are very different in their nature.