Incremental XML DataSet for PHPUnit

November 19, 2009

I recently got started with proper database testing with PHPUnit and I was disappointed with my available options for fixture files.

For those not familiar with the documented options, there are:

From reading the documentation I got the impression that Flat XML is kind of like half-assing it, CSV is nice but you lose control over null values and XML is the way you should probably do it.  I was OK with this, I not really crazy about half-assing my code (especially test cases), and I like verbosity, something that you get with XML and contrarily non-existent in CSV.  So I buckled down and started working on test cases and their associated XML fixture files.  I quickly came to realize a big disadvantage to using XML fixture files for my datasets and that’s that I was continuously repeating myself; I was wasting time cutting and pasting from one fixture to another.  This really bothered me, I thought it would be useful to be able to construct an XML dataset out of multiple XML ‘partials’ so I set out to define a dataset class that would allow me to do just that.

I wrote a class called IncrementalXmlDataSet, it extends PHPUnit_Extensions_Database_DataSet_XmlDataSet and implements a fluid interface.  It’s not a lot of code but it gets the job done and I find it really useful.

<?php

class IncrementalXmlDataSet extends PHPUnit_Extensions_Database_DataSet_XmlDataSet
{
  const XML_START = '<?xml version="1.0" encoding="UTF-8" ?><dataset>';
  const XML_END = "</dataset>";

  protected $_xmlFiles;
  protected $_xml;
  protected $_basePath;

  public function __construct(array $xmlFiles = array(), $basePath = null)
  {
    $this->_xmlFiles = $xmlFiles;
    $this->_basePath = !is_null($basePath) ? $this->setBasePath($basePath) : null;
    $this->_xml = null;
  }

  public function initDataSet()
  {
    if (is_null($this->_xml)) {
      $this->generateXML();
    }

    $this->xmlFileContents = @simplexml_load_string($this->_xml);

    if ($this->xmlFileContents === FALSE) {
      throw new InvalidArgumentException("String is not valid xml: {$this->_xml}");
    }

    $tableColumns = array();
    $tableValues = array();

    $this->getTableInfo($tableColumns, $tableValues);
    $this->createTables($tableColumns, $tableValues);

    return $this;
  }

  public function setBasePath($basePath)
  {
    /**
     * append the directory separator to the end of the base path if it's not
     * already there
    */
    if ($basePath[strlen($basePath)-1] !== DIRECTORY_SEPARATOR) {
      $basePath .= DIRECTORY_SEPARATOR;
    }

    $this->_basePath = $basePath;

    return $this;
  }

  public function addFile($xmlFile)
  {
    $xmlFile = $this->_basePath . $xmlFile;
    if (!is_file($xmlFile)) {
      throw new InvalidArgumentException("Could not find xml file: {$xmlFile}");
    }

    $this->_xmlFiles[] = $xmlFile;

    return $this;
  }

  public function generateXML()
  {
    $contents = array();
    foreach ($this->_xmlFiles as $fname) {
      $contents[] = file_get_contents($fname);
    }

    $this->_xml = self::XML_START . implode('',$contents) . self::XML_END;

    return $this;
  }

}

Using the class is really simple, here is an example:

protected function getDataSet()
{
    $ds = new IncrementalXmlDataSet()
      ->setBasePath(TESTS_PATH . '/data/fixtures/partials')
      ->addFile('payments.xml')
      ->addFile('payments_by_credit_cards.xml')
      ->initDataSet();

    return $ds;
}

And that’s it, now I’m able to create partial XML files with table data in them and glue them together as I please.

Yay.


Join Table Inheritance With Doctrine

June 26, 2009

I’m sure I’m not alone when I say that besides my regular development job, I also have a few side projects that I like to work now and then. One such project I’m working on with a friend, it’s in the early stages of design/development so we’re still hashing the database schema. For this project we determined that it would be important to keep track of the emails that were sent by the system.

At first this problem seemed pretty mundane, to keep track of the emails I’d simply just map the pertinent email fields to columns in an email message table. However, after giving it more thought I found one serious weaknesses in this quick and dirty design.

  • email messages generally have more than one recipient in the To:, Cc, and Bcc fields. Having a single column for each of fields is then problematic.
    • I’ll have to extend the size of the column to something irrational (a clob perhaps)
    • I’ll have to determine some method of delimiting the addresses
    • Searching for emails sent to various users becomes fairly complicated

Giving it further thought I decided it would be better to associate email messages to the users themselves. That was a good idea but it sounded like I would have a lot of code duplication keeping track of the email recipients (To, Cc, Bcc) since they’re all very similar. Then it dawned on me, what if the join table associating the email messages to the users used column aggregation inheritance? It sounded like a pretty good idea, almost too good. I wasn’t sure if would work … good news, it does.

Here’s what it looks like.

// schema.yml
User:
  columns:
    id:
      type: integer
      autoincrement: true
      primary: true
    username:
      type: string(255)
    email:
      type: string(255)

EmailMessage:
  columns:
    id:
      type: integer
      autoincrement: true
      primary: true
    from_user_id:
      type: integer
      notnull: true
    subject:
      type: string(255)
    message:
      type: clob
  relations:
    Sender:
      type: one
      class: User
      local: from_user_id
      foreign: id
    Recipient:
      class: User
      local: recipient_id
      foreign: message_id
      refClass: EmailRecipient
    To:
      class: User
      local: recipient_id
      foreign: message_id
      refClass: ToRecipient
    Cc:
      class: User
      local: recipient_id
      foreign: message_id
      refClass: CcRecipient
    Bcc:
      class: User
      local: recipient_id
      foreign: message_id
      refClass: BccRecipient

EmailRecipient:
  columns:
   message_id:
     type: integer
   type:
     type: string(5)

ToRecipient:
  inheritance:
    extends: EmailRecipient
    type: column_aggregation
    keyField: type
    keyValue: to
  columns:
    recipient_id:
    type: integer

CcRecipient:
  inheritance:
    extends: EmailRecipient
    type: column_aggregation
    keyField: type
    keyValue: cc
  columns:
    recipient_id:
      type: integer

BccRecipient:
  inheritance:
    extends: EmailRecipient
    type: column_aggregation
    keyField: type
    keyValue: bcc
  columns:
    recipient_id:
    type: integer

Now that the schema is taken care of, lets load some data.

// fixtures.yml
User:
  user_1:
    username: cdecarlo
    email: cdecarlo@example.com
  user_2:
    username: bbuilder
    email: bbuilder@example.com
  user_3:
    username: wshatner
    email: wshatner@example.com
  user_4:
    username: ssnake
    email: snake@example.com

EmailMessage:
  message_1:
    Sender: user_1
    subject: This is an email
    message: This is the message body of the email.
    To: [user_2]
  message_2:
    Sender: user_1
    subject: This is another email
    message: Wow, two emails already! Better slow down.
    To: [user_3]
    Cc: [user_2]
  message_3:
    Sender: user_4
    subject: Party at my place
    message: Hey everyone, there is a party at my place tonight! Be there or be square!
    To: [user_4]
    Bcc: [user_1, user_2, user_3]

Ok, now that the models are built and data is loaded it’s time to see how this works in practice.  Here are some sample queries to get you started.

// find all the messages with there recipients grouped together (regardless of To, Cc, Bcc)
$ ./doctrine dql "from EmailMessage e, e.Recipients"
dql - executing: "from EmailMessage e, e.Recipients" ()
dql -   id: '1'
dql -   from_user_id: '1'
dql -   subject: 'This is an email'
dql -   message: 'This is the message body of the email.'
dql -   Recipients:
dql -     -
dql -       id: '2'
dql -       username: bbuilder
dql -       email: bbuilder@example.com
dql - -
dql -   id: '2'
dql -   from_user_id: '1'
dql -   subject: 'This is another email'
dql -   message: 'Wow, two emails already! Better slow down.'
dql -   Recipients:
dql -     -
dql -       id: '3'
dql -       username: wshatner
dql -       email: wshatner@example.com
dql -     -
dql -       id: '2'
dql -       username: bbuilder
dql -       email: bbuilder@example.com
dql - -
dql -   id: '3'
dql -   from_user_id: '4'
dql -   subject: 'Party at my place'
dql -   message: 'Hey everyone, there is a party at my place tonight! Be there or be square!'
dql -   Recipients:
dql -     -
dql -       id: '4'
dql -       username: ssnake
dql -       email: snake@example.com
dql -     -
dql -       id: '1'
dql -       username: cdecarlo
dql -       email: cdecarlo@example.com
dql -     -
dql -       id: '2'
dql -       username: bbuilder
dql -       email: bbuilder@example.com
dql -     -
dql -       id: '3'
dql -       username: wshatner
dql -       email: wshatner@example.com

// find email all the email messages with completely hydrated Sender, To, Cc, Bcc
$ ./doctrine dql "from EmailMessage e, e.Sender s, e.To to, e.Cc cc, e.Bcc bcc"
dql - executing: "from EmailMessage e, e.Sender s, e.To to, e.Cc cc, e.Bcc bcc" ()
dql -   id: '1'
dql -   from_user_id: '1'
dql -   subject: Thisisanemail
dql -   message: Thisisthemessagebodyoftheemail.
dql -   Sender:
dql -     id: '1'
dql -     username: cdecarlo
dql -     email: cdecarlo@example.com
dql -   To:
dql -     -
dql -       id: '2'
dql -       username: bbuilder
dql -       email: bbuilder@example.com
dql -   Cc: {  }
dql -   Bcc: {  }
dql - -
dql -   id: '2'
dql -   from_user_id: '1'
dql -   subject: Thisisanotheremail
dql -   message: 'Wow,twoemailsalready!Betterslowdown.'
dql -   Sender:
dql -     id: '1'
dql -     username: cdecarlo
dql -     email: cdecarlo@example.com
dql -   To:
dql -     -
dql -       id: '3'
dql -       username: wshatner
dql -       email: wshatner@example.com
dql -   Cc:
dql -     -
dql -       id: '2'
dql -       username: bbuilder
dql -       email: bbuilder@example.com
dql -   Bcc: {  }
dql - -
dql -   id: '3'
dql -   from_user_id: '4'
dql -   subject: Partyatmyplace
dql -   message: 'Heyeveryone,thereisapartyatmyplacetonight!Bethereorbesquare!'
dql -   Sender:
dql -     id: '4'
dql -     username: ssnake
dql -     email: snake@example.com
dql -   To:
dql -     -
dql -       id: '4'
dql -       username: ssnake
dql -       email: snake@example.com
dql -   Cc: {  }
dql -   Bcc:
dql -     -
dql -       id: '1'
dql -       username: cdecarlo
dql -       email: cdecarlo@example.com
dql -     -
dql -       id: '2'
dql -       username: bbuilder
dql -       email: bbuilder@example.com
dql -     -
dql -       id: '3'
dql -       username: wshatner
dql -       email: wshatner@example.com

// find messages where the sender and the To recipient are the same
$ ./doctrine dql "from EmailMessage e, e.Sender s, e.To t where s.username = t.username"
dql - executing: "from EmailMessage e, e.Sender s, e.To t where s.username = t.username" ()
dql -   id: '3'
dql -   from_user_id: '4'
dql -   subject: 'Party at my place'
dql -   message: 'Hey everyone, there is a party at my place tonight! Be there or be square!'
dql -   Sender:
dql -     id: '4'
dql -     username: ssnake
dql -     email: snake@example.com
dql -   To:
dql -     -
dql -       id: '4'
dql -       username: ssnake
dql -       email: snake@example.com

That’s it.  The best thing is that Doctrine handles all the heavy lifting for you, no custom code.  It’s just a nice way to peice together some core behaviours.  What’s also great about this approach is that it’s easily generalised.  For instance instead of Email Messages you could be modeling:

  • Forums and Users where users can be Administrators, Moderators, and General Users
  • Departments and Employees where employees can be Directors, Managers, and Grunts
  • Boxes and Shapes where shapes can be squares, triangles, and circles

It’s getting drafty in here

November 22, 2008

I just started work on a little project of mine. The overall goal is to figure out how Symfony Forms work, I’m writing it all down and by the end I hope to have a pretty thorough tutorial. I’m still trying to come up with a title, I was thinking maybe “Symfony Forms: What they wanted you to know but didn’t have the time to tell you”. It’s a little wordy, but I think it gets the point across.