Over the years I have written many PHP applications that require SQL databases of some sort. I often reused my code, and developed some handy helpers over time. My current go-to solution reduces my development times massively and I want to share it with the world.

Problem

Executing an SQL request in PHP is usually done using PHP Data Objects (PDO). This abstraction layer allows to write code which is more or less independent of the database used. Nevertheless, the whole concept of using SQL inside PHP seems somewhat archaic. Similar to regular expressions, using a different syntax or even language for a very specialized task can sometimes be justified. But what if the query has to be dynamic? We basically write code to create a string for a different code (the SQL query). At some point, database access starts to look a lot like compiler design.

Most SQL queries I use contain very simple INSERT, UPDATE, SELECT or DELETE statements. The code to create the required queries can easily be wrapped in convenient functions. Ideally, those functions should handle escaping values, do basic error handling and provide some flexibility like allowing to add ORDER BY or LIMIT.

Security

PHP has a long tradition of being considered an insecure language. That sentiment resulted mostly from earlier versions allowing to overwrite global variables by modifying the request query by default. Ease of development attracted many newcomers with little security awareness, which led to many vulnerable PHP applications. Another very common attack vector is SQL code injection, enabled by not properly handling user input.

While it is certainly possible to write secure code in PHP, the language doesn’t always make it easy. When creating quick prototypes, even experienced developers sometimes refrain from escaping all their variables, just to keep the code short and readable. But bad prototype code can still somehow end up in productive systems. So how can we get other developers and ourselves to write secure code from the beginning? By making it easier than the insecure way!

Creating SQL from Dictionaries

Let’s start with with a simple table in SQLite.

CREATE TABLE books (
  id INTEGER PRIMARY KEY,
  title TEXT,
  published INTEGER,
  pages INTEGER
);

Relational databases map very well to arrays of dictionaries. By using the PDO::FETCH_ASSOC fetch style, a row from the table above might look like this:

{
  "id": 3,
  "title": "Don Quixote",
  "published": 1605,
  "pages": 992
}

But how do we insert such a row into the table? The SQL we want is simple. But why do we have to write it at all, when we already have the dictionary with all the information? Many WHERE statements have the form key1=val1 AND key2=val2 AND etc. which can be represented by a dictionary equally well.

INSERT INTO books (id, title, published, pages) 
   VALUES (3, "Don Quixote", 1605, 992);
UPDATE books
   SET name = "Old and Huge", pages = 9001
   WHERE published = 1605 AND pages = 992;
SELECT name FROM books WHERE id = 3;
DELETE FROM books WHERE published = 1605;

My desire to use dictionaries for database access led to my first SQL API, which I named SJMTable.

class SJMTable {
    public function __construct($pdo, $table);
    public function insert($data, $args=array());
    public function update($where, $data, $args=array());
    public function select($keys, $where, $args=array());
    public function delete($where, $args=array());
}

The constructor needs a PDO object connected to a database and the name of a table. $data contains data to insert or update, and $where is used to create a WHERE statement by connecting the entries with AND (key1 = value1 AND key2 = value2 etc.). $args allows to optionally add order keys, limits and offsets, or simply append some SQL to the end of the query. The select function returns an array containing the fetched rows, insert returns the primary key of the new row and the other functions return the number of affected rows.

A New Syntax

The simple API covers a surprisingly large amount of SQL scenarios. The generic $args argument adds some flexibility, but the solution seems a bit “hacky” and still doesn’t cover all possibilities. Especially joins would be very complex to add support for. Instead of adding all this complexity to the API, we can solve all our problems by introducing a single function that works similar to printf and a format syntax to replicate the previous functionality.

// Syntax:
// %@ -> quoted value/list
// %K -> unquoted value/list
// %W -> where (WHERE %W), connected with 'AND'
// %S -> assign (for UPDATE SET %A)
// %I -> insert (INSERT INTO %K %I)
$dbl = new SJMSQLayer($pdo);
$dbl->query('INSERT INTO books %I', $data);
$dbl->query('UPDATE foo SET %S WHERE %W', $data, $where);
$dbl->query('SELECT name FROM books WHERE %W', $where);
$dbl->query('DELETE FROM books WHERE %W', $where);

The resulting code is compact and very easy to read for everyone who knows SQL. %W, %S and %I provide the same conveniences as the previous API and %@ and %K allow to insert additional quoted or unquoted variables. When using an array as argument, things like WHERE id IN (%@) lead to valid and perfectly quoted SQL.

Method Chaining

The query function returns a SJMSQLayerStatement object. That object handles a couple of common things you might want to do with SQL queries or allows you to simply get the generated SQL string.

class SJMSQLayerStatement {
	public $sql = null;
	public function exec();
	public function get($key=false);
	public function getAll($key=false);
	public function getDict($dictKey, $valueKey=false);
	public function getGroup($groupKey, $valueKey=false);
}

By using method chaining, the functionality of SJMTable can still be achieved by single lines.

$dbl->query("DROP TABLE %K", "books")->exec();

get returns a single row from a SELECT query, getAll returns an array with all rows. getDict will also return all rows, but instead of an array, the result will be a dictionary where each row is addressed by its value of $dictKey. getGroup is used if the value by which the rows are addressed is not unique. It will return a dictionary of arrays. Using the optional parameters $key and $valueKey will fetch single values instead of the whole rows.

$stm = $dbl->query("SELECT * FROM books");
// get array of all book id's
$stm->getAll("id");
// get a dictionary of book titles by id
$stm->getDict("id", "title");
// get a dictionary of book id's by year
$stm->getDict("published", "id");

Source Code

The source code is available on GitHub. The SJMSQLayer class is very small (less than 100 lines); partially due to the lack of inline documentation. For now, just consider it an example implementation on how to improve your SQL experience. I might add some proper documentation at a later time, especially if people other than me want to use it. I’m also not sure which license I should use for it. I’d be happy to get some recommendations!

If my code saves you a significant amount of time, consider a small donation to my PayPal account.