/usr/portage

Proof of Concept: Binary packed UUIDs as primary keys with Doctrine2 and MySQL 8

The Problem

For a project I need non-guessable synthetic primary keys. I will use them to construct URIs and these URIs need to be non-guessable. If I would use the traditional way of doing so, going the down the route of integer primary keys with auto increments, or using a sequence table an attacker could easily increment or decrement the integer to find some similar items. Next idea was to use UUIDs or GUIDs. These identifiers are globally unique, so this would work for primary keys too. Reading some documentation on the topic brought up the interesting issue of space usage. Storing the UUIDs in a CHAR column would be a huge waste of space compared to an integer primary key. As primary keys are referenced in related table, this would be a huge issue. Finally I found a trick storing there binary representation in a BINARY column. Doing that in MySQL is fairly easy:

INSERT INTO items SET id = UNHEX(REPLACE(UUID(), '-', '');

Selecting a human readable reasult is easy too:

SELECT HEX(id) FROM items;

Achieving the same thing in PHP is pretty straightforward too. You need the PECL extension UUID (pecl install uuid) and pack()/unpack():

<?php
$uuid = uuid_create(UUID_TYPE_TIME);
$uuid = str_replace("-", "", $uuid);
var_dump(pack('H*', $uuid));
string(16) "?Irp??ߐ
                   )??m"

Converting them back into there hex representation is similar:

<?php
var_dump(array_shift(unpack('H*', $binaryUuid)));
string(32) "d2f268509db211df9010000c29abf06d"

Doctrine2 integration

Next step would be integration with Doctrine2. To do so, we need to create a custom mapping type. I’m not using Doctrine2 for database abstraction, but for it’s object relational mapping capabilities so I ignore portability and concentrate on MySQL.

<?php
namespace Lars\Doctrine2\Types\Mysql;
use Doctrine\DBAL\Types\Type;
use Doctrine\DBAL\Platforms\AbstractPlatform;
 
class BinaryType extends Type
{
    const BINARY = 'binary';
 
    public function getSqlDeclaration(array $fieldDeclaration, AbstractPlatform $platform)
    {
        return sprintf('BINARY(%d)', $fieldDeclaration['length']);
    }
 
    public function getName()
    {       
        return self::BINARY;
    }   
      
    public function convertToPhpValue($value, AbstractPlatform $platform)
    {
        if ($value !== null) {
            $value= unpack('H*', $value);
            return array_shift($value);
        }
    }
 
    public function convertToDatabaseValue($value, AbstractPlatform $platform)
    {
        if ($value !== null) {
            return pack('H*', $value);
        }
    }
}

Now we are introducing the new type to Doctrine2 somewhere in our setup logic:

<?php
use Doctrine\DBAL\Types\Type;
Type::addType('binary', 'Lars\Doctrine2\Types\Mysql\BinaryType');

One issue I stumbled upon was the default Doctrine2 does. With MySQL it maps binary types to intermediate blob types (in the Doctrine2 type system). This default behavior is not configurable, so we need to patch Doctrine\DBAL\Schema\MySqlSchemaManager. I’m sure there is a more elegant way and I would love to receive some remarks here:

            case 'tinyblob':
            case 'mediumblob':
            case 'longblob':
            case 'blob':
            /** 
             * Commented out to make our custom mapping work
             * case 'binary':
             */         
            case 'varbinary':
                $type = 'blob';
                $length = null;
                break;

Last part is our entity:

<?php
namespace Lars\User\Domain;
 
/**
 * @Entity
 * @Table(name="user",indexes={@Index(name="user_email_idx",columns={"user_email"})})
 * @HasLifecycleCallbacks
 */
class User
{
    /**
     * @Id
     * @Column(type="binary",length=16,name="user_id")
     * @GeneratedValue(strategy="NONE")
     */
    protected $_id;
 
    /**
     * @Column(type="string",length=32,name="user_email")
     */
    protected $_email;
 
    public function changeEmail($email)
    {
        $this->_email = $email;
        return $this;
    }
 
    public function getId()
    {
        return $this->_id;
    }
 
    /**
     * @PrePersist
     */
    public function generateUuid()
    {
        $this->_id = str_replace('-', '', uuid_create(UUID_TYPE_TIME));
    }
}

The important part here is the createUuid()-method to generate the UUID once before persisting the domain object. With GeneratedValue(strategy="NONE") we told Doctrine not to generate the ID by itself and with HasLifecycleCallbacks we configure Doctrine to scan for lifecycle callback methods, so that generateUuid() will be called before persisting the entity.

Fetching an object by ID is as easy as ever, but don’t forget to convert the ID:

$user = $em->find(
    'Lars\User\Domain\User',
    pack('H*', '16aec29e9db011df8013000c29abf06d')
);

Further ideas

The whole UUID should be refactored towards an UUID value object to encapsulate UUID creation and binary conversion.

Filed on 01-08-2010, 23:11 under , , , & eight comments & no trackbacks

Trackbacks

Trackback specific URI for this entry

No Trackbacks

Comments

  1. Gerd Riesselmann responses:
    published on August 2nd 2010, 12:06:18 pm *

    Interesting. I usually use two fields: An autoincrementing integer as primary key and a char field filled with some hashed randomness – where both are required to access a ressource, like /users/[id]/[hash]. Since id ensures uniqueness, hash only must be unpredictable. And id beeing a POIID (Plain Old Integer ID) gives the usual performance when applying data integrity constraints.

    Reply

  2. Lars Strojny says:
    published on August 2nd 2010, 02:18:35 pm *

    Good idea with the random column. I’m a little bit about lookup performance on the non-primary index there. MySQL really loves it’s primary keys, the other keys, not so much.

    Reply

  3. Gerd Riesselmann means:
    published on August 17th 2010, 06:28:04 pm *

    Just stumbled upon a similar question here: http://blog.koehntopp.de/archives/2938-Master-Master-und-Distributed-Transactions.html#extended. See the comments: InnoDB clusters by primary key, which in case of UUIDs may lead to a performance loss.

    Regarding lookup: I usually just lookup by id, retrieve the record, and compare the hash after that. No need for an index on that field.

    Reply

  4. beberlei supposes:
    published on August 2nd 2010, 08:06:38 pm *

    Hey Lars,

    awesome post, i like the usage of @PrePersist to generate the UUID instead of hacking in a custom generator strategy. Really elegant :-)

    what version of Doctrine2 are you using? In the current master you can change the behaviour of MySQL Schema Manager mapping the varbinary to your custom type, see:

    http://www.doctrine-project.org/projects/dbal/2.0/docs/reference/types/en#detection-of-database-types

    This will be included in ORM Beta3 which will be released in the next days.

    Reply

  5. Dennis Gearon opines:
    published on January 17th 2011, 09:09:13 am *

    Postgres supports UUIDs as the string for I/O, and stored already as binary 16 bytes. But then, I’m a postgres lover.

    Different subject, anyone ever gotten UUIDs (or even strings) to work as primary keys in Doctrine 1.4.x?

    Reply

  6. mlaw answers:
    published on March 9th 2011, 07:15:35 pm *

    I noticed that when specifying mappings on a binary type that the length field is incorrectly set to 0 for the generated foreign keys… Just wondering if you had the same issue or found a work around.

    Reply

  7. Lars Strojny supposes:
    published on March 20th 2011, 03:17:15 am *

    Indeed, we used columnDefinition="binary(16)" to enforce a specific schema. Nowadays we use a custom database migration tool (dbdeploy.jar), so this doesn’t matter too much anyway.

    Reply

  8. Wim Mostmans replys:
    published on February 22nd 2013, 02:55:09 pm *

    Thanks for this Lars, I’ve used it as a base to implement a uuid type.
    I thought this could be useful for other people too so I’ve created a gist for it:
    https://gist.github.com/Sitebase/5013494

    Reply

Add a Comment & let me know what you think