restifydb is a framework that is able to automatically map databases (from all major database engine providers) to REST web services. The web services can output XML or JSON. It leverages data discovery. It has advanced capabilities as automatic query expansion (joining based on foreign keys), automatic inward references, automatic data discovery, filtering, sorting and out of the box CRUD operations (create, read, update and delete data). Most of these features are used by the demo applications built using restifydb. It also features an administration panel which makes configuring it quite easy.
Let's assume that you have in your data warehouse a database containing information about the world continents, countries and largest cities. One customer of yours is interested in using this data in order to build a nice HTML5/Android/iOS (see an example ) application which will synthesize this information into graphs, tables, etc., and present it to the users. One solution would be for you to offer the entire database for download (via an SQL dump, for instance). However, this is not the best approach because data usually changes and you would need to publish these data dumps every time you update the information. If you have one thousand customers using you database, each one of them would have to re-download the database every time you publish a patch. restifydb can alleviate this problem by creating a single point of access to your data. This is how it works - in five simple steps:
restifydb operates will the following types of entities, all of which can be references by an unique identifier (except for one special case, please see below):
system: this is the root entity. it represents the system as a whole and
contains a collection of data sources. Its URL correspondent is the root framework URL,
e.g.: /api
(),
given that the application is installed in the api
folder.
data source: it usually corresponds to a classical database or schema. It is parent for a
collection of tables. It is identifiable by its context name (as specified when adding the
data source). Its URL counterpart would be, for instance: /api/countrylicious
()
given that the data source name is countrylicious.
table: it corresponds to a classical table within a database. It is a
collection of rows which have all the same structure. It is identifiable via its name. Its
URL counterpart could be, for instance: /api/countrylicious/countries
()
given that the
table name is countries.
row: it usually corresponds to a classical row within a table and it is a collection of
fields.
record: this corresponds to a row which can be identified via an unique value. Most often
this unique value is represented by the primary key of the table containing the record. Its URL
counterpart could be, for instance: /api/countrylicious/countries/233
()
given that the row ID is 233. If there is no primary key defined, the framework
will not be able to expose records.
field: this corresponds to a classical field (or column) within a table. Each field of a
record/row has an associated value (even if this is sometimes set to nil).
When developing restifydb, I used a paradigm called HATEOS (Hypermedia as the
Engine of Application State). This means that the client connecting to the service does not need
to know anything about the structuring of the data and its interconnections. These connections are
described by means of linkage, meaning that every relation is defined via a link to the entity
at the other end of the relation. This is achieved through hrefs.
E.g.: if the data source countrylicious contains a table called cities,
when accessing countrylicious (through, its associated URL - /api/countrylicious
()),
this is described by a href containing the location of the table object, like:
"href": "/api/countrylicious/cities". So, the client doesn't need to know
about the existence of the cities table or how to access it. The service will
provide all these information as hypermedia contained in the response.
restifydb is a PHP web application. It thus needs a typical software stack.
Apache Web Server with mod_rewrite support.
Optionally, if you would like to enable GZip compression, mod_deflate should
also be enabled.
PHP greater then 5.4 with FileInfo support |this
is usually enabled by
default - the extension is named php_fileinfo).| Database engine | PHP extension |
|---|---|
| MySQL through Mysqli | mysqli |
| MySQL through PDO | pdo_mysql |
| Oracle 10g | oci8 |
| MSSQL Server | sqlsrv |
| SQLite through PDO | pdo_sqlite |
| PostgreSQL | pgsql |
| PostgreSQL through PDO | pdo_pgsql |
| IBM DB2 | ibmdb2 |
For the sake of simplicity, we will consider that the URL of the application is
/api
().
restifydb folder)
and the restifydb demos (located in the demos folder).
api folder inside the document root folder.
The way you organise your web application and manage URLs and virtual hosts is beyond the
scope of this document.
cache and config.
/api/admin. You will have to log in to the
administration panel.
admin.
Now that you have configured restifydb, it is time to start adding data sources. This is easily
doable via the "Edit Data Source" dialog (Configure/Configured Data Sources/Add new
data source). This allows adding a new data source. You will be able to set table
permission - this allows exposing only certain tables from the current data source. Fill in all
the required fields. Make sure the database connection parameters are properly set. There are a
few considerations here:
/home/someuser/myds.db127.0.0.1:1521/db. The user name field will actually
contain the schema name.
In order to check whether the connection to the data source can be established, open the
"Configured Data Sources" dialog (Configure/Configured Data Sources). Now
click the little icon next to the desired data source. Once
the process is done, you will receive an information saying whether the connection could be
established or not. If this failed, please re-check the connection parameteres.
If you open the edit data source dialog again, you will see you are not yet able to set table permissions. This is because the metadata information has not yet been retrieved for the selected data source. Reading metadata is a slow process which might take minutes in case of large databases. That is why restifydb caches this information in order to speed up the process.
In order to trigger the re-caching process, open the "Configured Data Sources" dialog
(Configure/Configured Data Sources). Now click the little
icon next to the desired data source. Once the process is done, you will receive a message
confirming the outcome of the operation. The first thing to check for when this fails is whether
the cache directory is writable by the user running PHP.
Please bare in mind to trigger the re-caching mechanism after making changes to the data source structure. This is not required when making changes to data.
restifydb allows you to specify both per data-source and per-table visibility options. You can change these settings in the "Edit Data Source" dialog for the desired data source. By default, access is permitted to every data source and every table. If you mark a data source as disabled, clients will no longer be able to connect to it making there applications fail.
By default, the framework enables clients to perform all the four data operations (creating,
reading, updating and deleting data). It is desirable in some cases to limit these operations.
For this, go to the "General Options" menu (Configure/General Options)
and specify the operations you wish to disable. Currently, this is a global setting and it is
not configurable per data source or table.
Also, in the aforementioned dialog you can specify the maximum data length when outputting large fields (represented by CLOB or TEXT data types). This setting is bypassed when presenting entities which represent records.
This allows the system's administrator to see what went wrong while clients connected to the system. There are many cases in which restifydb's error handling mechanism will not be able to cope with the error at hand. In this case, a generic error message will be displayed to the user, asking him to contact the administrator. A unique (autogenerated) token will be presented to the user. This is stored in the application's log toghether with a lot of useful information (the URL) Please see the Error handling section for additional details.
The configuration is stored in a Sqlite database inside the /config folder. Due to
how the configuration is kept, it is not currently possible to easily recover the administrator
password. If this is lost, you will have to reinitialise restifydb. You can do this by simply
removing the config/restify_config.db file. Also, I encourage you to always keep
a backup of this file.
It is good practice to protect the administration area by other means as well: IP-based filtering,
HTTP Basic Authentication (via the mod_auth_basic module), and so on.
restifydb exposes data via REST web services. The representational state transfer (REST)
paradigm has been introduced as a lightweight, simpler alternative to SOAP (and its description
language - WSDL). In REST, data access methods are mapped to URLs and HTTP verbs. For instance,
reading the countries data would translate into GET /api/countrylicious/countries
().
One could easily guess that deleting a specific country is as simple as doing a
DELETE /api/countrylicious/countries/233 (where 233 is the unique
identifier of the country to be deleted).
restifydb supports the following HTTP verbs, each one corresponding to a CRUD operation:
restifydb can currently serialise the output into two different highly used formats:
JSON (JavaScript Object Notation) -
- this is the preferred data format for
clients integrated into browser applications because it can be natively de-serialized into
JavaScript objects by any modern browser.
XML (Extensible Markup Language) -
.
When outputting data to the calling parties, restifydb needs to decide in which format it should serialise this output. This is called content negotiation. In the framework it works as follows:
_view exists and its value is
set to either json or xml.
_view URL parameter is missing, the framework will check if the caller
is sending the HTTP_ACCEPT HTTP header. If this matches to either
application/xml or application/json, this value will be used as
output type.
There are several URL parameters which can be specified when calling the restifydb web services. Except for
the _view parameter, all of
them only refer to reading data and are preserved when building URLs in the same context (e.g.: if you
specify a _count parameter to limit the number of rows retrieved when reading a table, this
parameter will be preserved when restifydb constructs the URLs needed to navigate through the given table:
previous page, next page, etc.).
Example: the following URl GET /api/countrylicious/countries?_view=json&_count=20&_start=120
()
will trigger a read operation on the countries table. It instructs the framework to serialize the
output as JSON, and to fetch 20 rows starting from the 120th row.
It is important to URL-encode these parameter values when making calls to the web services. By doing this, you
avoid errors which are usually hard to spot. Example: if you would like to search for a city named
Aalborg in the cities table, you should do GET /api/countrylicious/cities?_filter=name%3D%3DAalborg
().
Please note the URL-encoded filter expression which translates into name==Aalborg.
As previously stated, in order for restifydb to be able to work with records (both for reading and writing), these records need to be uniquely identifiable. This means that the underlying table needs to have a primary key defined. This will be used to generate the record's access URL.
Here's an example: the countries table is constructed in such a way that it has a primary key
consisting of the country's identifier. Thus, single countries can be referred to as records:
GET /api/countrylicious/countries/164
().
There are cases, however, in which the primary key is composite - composed by more than one column. In this
situation restifydb will expose records from such a table by concatenating the composite key's field value.
As an example: GET /api/datasource/table/21_and_34 - the primary key is composed of two columns.
The field values are in this case, 21 and 34. Please note that when accessing such
records, the order of the composing fields needs to be respected. The current separator is _and_.
For the sake of simplicity, we will consider that the URL of the application is
/api.
Please note that if you disable the "read operation" from the administrative panel, clients will not be able to do read requests.
| Operation | Method and URL | Explanation |
|---|---|---|
| Read the list of configured data sources from the system | GET /api
()
|
Retrieves the list of all the configured data sources in the system. Please note that if you mark a data source as disabled, it will not be accessible nor will it appear in this list. |
| Read the list of tables from a data source | GET /api/countrylicious
()
|
Retrieves the list of all the tables (and views) in the data source. Please note that if you mark a table as disabled, it will not be accessible nor will it appear in this list. |
| Read the list of rows from a table
Please also see the next sections of the current chapter for details about paging, filtering, and sorting data. |
GET /api/countrylicious/regions
()
|
Retrieves the list of all the tables (and views) in the data source. Please note that if you mark a table as disabled, it will not be accessible nor will it appear in this list. |
| Read a record
Please also see the Primary key support section. |
GET /api/countrylicious/regions/8
()
|
Retrieves record with the unique identifier equals to idvalue. Please note that
if the parent table is marked as disabled, the record will not be accessible.
|
Query expansion refers to a feature in restifydb which allows to eagerly load matching records from
tables connected with the current table by the means of foreign key constraints. As an example, if you
talk have a table containing cities, one possible field in the cities table might be the id
of the country this city is found in, countryid. If an countries table exists and
the countryid field in the cities table is connected to the countryid
field in the countries table via a foreign key constraint, this means that when fetching rows
or records from cities, the framework could automatically load the corresponding row or
record from the countries table
(
- please see the outReference property of the countryId field).
Thus, the composite record will be retrieved containing all
the needed information in one request (no need to make a separate request to get the artist information
from the artists table).
_expand
URL parameter by specifying one of the following values: yes and no to enable
it, respectively disable it.
Example: GET /api/countrylicious/cities?_expand=no
()
will retrieve the rows from the
cities table without any information about the country (except for the countryid
which is part of the cities table).
The opposite of query expansion is called inward referencing. When a foreign key constraint is defined
between tables cities and countries (meaning that a city points to / belongs to
a country), it is said that the countries
(
- please see the countryId field) table has an inward reference from the
cities table. This is represented in restifydb via the inRreferences
collection property attached to every field which has such pointers. This is represented by a collection
because multiple foreign inward references might point to the same field.
The output of the data-access web services exposed by restifydb is easy to understand. The output will
have a similar structure every time, irrespective on the entity being presented. The root property is
always called restifydb. This will always contain three properties:
self - defining the properties of the current entity, name and access URL.rows - a collection of child entities. Each row has a values property.rowCount - the total number of child entities. When reading tables, this is very
important as it represents the total number of rows retrieved when applying the curren filtering
criteria.
When the type entity of the entity being retrieved is other than system, a parent
property will always be present. This defines the property of the parent entity, name and access URL.
When reading data from tables, restifydb will expose two convenience properties:
ownFields - returns the comma-separated names of all the fields from the current table.
This is useful for avoiding introspections on the output structure (XML or JSON), thus optimizing
your code.
foreignFields - when query expansion is enabled and
connections to other tables exist, restifydb will give the list of comma-separated fields names from
the connected tables. This is useful for avoiding introspections on the output structure (XML or
JSON), thus optimizing your code. This is expressed as a collection, as multiple fields might have
connections.
The XML structure is fairly similar.
The only semantic difference between the JSON and XML versions is the fact that XML does not support
unnamed nodes (the equivalent of JSON arrays). In this case, the serialisation engine performs a small
trick and replaces the unnamed nodes with nodes named as the singular of the parent node. For example,
the <rows> parent node will contain subnodes named <row>.
restifydb will automatically help clients in paging data. It does so by providing hypermedia links to paging resources such as the first and last page, previous and next page. Paging is only enabled when the row count is greater than the current page size.
In the context of paging, the output will contain several additional properties:
rowCount - the total number of rows availablecurrentPage - the number of the page currently being presentedpageCount - the total number of pages. This is computed by dividing the
rowCount to offset.nextPage, previousPage, firstPage and lastPage
represent the cursors needed to navigate through the entire rows collection. The are expressed via
URL resources.
When reading table data without specifying URL control parameters, the framework will by default retrieve only the first twenty rows. If you wish to control this behaviour, you can use the following URL parameters:
_start: this parameter specifies the index of the first row to fetch. The default value
is 0. This parameter useful for paging data.
_count: this parameter controls how many rows are being fetched. The default value is
20. The maximum value for this parameter is 50.
Example: if you want to fetch 10 rows starting with the row with the index 940 from the cities table, you
can do this: GET /api/countrylicious/cities?_start=20&_count=940
().
By default, when making a table read call, restifydb will fetch every single field in the table, plus
every single field in all of the connected tables, if query expansion is turned on. This behaviour can
be controlled using the _fields URL parameter. The syntax of the _fields
parameters is straightforward: field1[,field2,[,field3...]]. All you have to do is specify
which fields the system needs to fetch.
Example: in order to retrieve only the country names and its capitals, you need do
GET /api/countrylicious/countries?_fields=name,capital
().
Please note that the field (referred in the fields expressions) must be a valid field
from the table being read. Currently, restifydb does not allow limiting foreign fields. When using the
_fields parameter, the query expansion is automatically disabled.
When reading table rows, clients can apply filtering, thus limiting the number of rows being fetched.
This is controlled using the _filter URL parameter. The filtering mechanism supports
concatenating filtering conditions using the && and || logical
operators (corresponding to AND and OR operators). A filtering condition is a
boolean expression of the form field[operator]value. There are several operator supported
in restifydb:
Please note that the field (referred in the filter expressions) must be a valid field
from the table being read. Currently, restifydb does not allow filtering on foreign fields, even
when query expansion is enabled.
Please do not forget to URL-encode the _filter parameter's value when making calls to the
web services.
The filtering conditions can be combined in any way. However, as the framework does not currently
support paranthesis in the filter expression, it does not make much sense combining AND
and OR logical operators. However, chaining ANDs or ORs makes
perfect sense. Here's a more complex example for filtering: area>500000&&population>50000000&®ionid==8
()
- this searches for all the European countries which have an area larger then 500,000 sq km
and a population of more then 50,000,000 inhabitants.
Sorting rows in using restifydb is easy. All you need to do is use the _sort URL parameter.
The syntax for this parameter is similar to the one of the ORDER BY SQL command
- the two behave alike. The syntax of the _sort parameter is:
field1[ ASC|DESC][,field2[ ASC|DESC]...]. The sorting direction can be controlled by the
ASC or DESC keywords which stand for "ascending" and
"descending". By default, the sorting order is ascending - when the sort direction is not specified.
The filter conditions can be chained in order to specify multiple sorting criteria.
Example: if you would like to get countries sorted by continent and in descending order by area, you should
call GET /api/countrylicious/countries?_sort=regionid+asc%2Carea+desc
().
Please note that the field (referred in the sorting expressions) must be a valid field
from the table being read. Currently, restifydb does not allow sorting on foreign fields, even
when query expansion is enabled.
Please do not forget to URL-encode the _sort parameter's value when making calls to the
web services.
If you would like to create a new record, you must do a POST on the table you wish to create
this record specifying the field values in the POST body. The parameter containing these values
should be called _data and should represent an valid JSON object with members representing
field names and values representing field values.
Here is an example: in order to insert a new record into the cities table, you should do a
POST /api/countrylicious/cities and set the method's body to
_data={"id": 4999, "countryid": 155, "name": "Some City in Andorra", "population": 100}
(this should be URL-encoded, please see below).
If the operation succeeds, the response will be populated with two important fields:
affectedRows - returns the number of rows successfully inserted. In case of success, it
should be set to 1.lastInsertedValue - if there is a primary key on the table, it will return the value of the
field representing it in the newly inserted record. This is useful when dealing with auto generated IDs,
where the database takes care of populated the primary key fields (auto increment, sequence, etc).
Please make sure to send the Content-Type: application/x-www-form-urlencoded header when
creating records. Please do not forget to URL-encode to value of the _data parameter.
If you would like to modify an existing record, you must do a PUT on the record you wish to
modify specifying the field values in the PUT body. The parameter containing these values
should be called _data and should represent an valid JSON object with members representing
field names and values representing field values.
The modify (update) operation is idempotent, meaning it will produce the same result no matter how many times it has been invoked.
Here is an example: in order to modify a record from the cities table, you should do a
PUT /api/countrylicious/cities/4999 and set the method's body to
_data={"countryid": 155, "name": "Another City in Andorra", "population": 101}
(this should be URL-encoded, please see below). This will modify the city having the unique identifier set
to 4999.
Please note that in order to be able to modify records, the underlying table needs to have a primary key defined. Please see the Primary key support section for additional details.
Please make sure to send the Content-Type: application/x-www-form-urlencoded header when
creating records. Please do not forget to URL-encode to value of the _data parameter.
Deleting a record in restifydb is easy. All you need to know is the URL uniquely identifying the record and
issue a DELETE action. For example, if you wish to delete the city of Klagenfurt, identified by
its id (341), you need to do DELETE /api/countrylicious/cities/341. If the
operation succeeds, the response will contain a field called affectedRows, showing the number
of successfully deleted rows.
Please note that in order to be able to delete records, the underlying table needs to have a primary key defined. Please see the Primary key support section for additional details.
checked - these map to foreseeable situations in which the framework might fail. The
output will be similar to this:
ERROR #109: This table does not exist. [Date and time: 2015-03-23 13:15:52].
unchecked - these are situations when an unforeseen situation occurs. This might be
associated with a bug in the framework, a data-related issues, infrastructure problems, and so on.
When this happens, restifydb will provide the client with a message similar to the following:
ERROR #701: A fatal exception has occurred. Execution of the script was terminated. Please
contact the system administrator and communicate the following information: [Error token:
506F70C3] [Date and time: 2015-03-23 13:19:23]
| Error code | Description | HTTP code |
|---|---|---|
101 |
An unsupported HTTP method was used. | 405 |
102 |
No data source name was specified. | 400 |
103 |
This data source cannot be found. | 404 |
104 |
This table cannot be found | 404 |
105 |
No primary key found. Cannot navigate a table without a primary key. Please use the _filter parameter instead. | 412 |
108 |
Invalid URL. | 404 |
109 |
This table does not exist. | 404 |
110 |
The number of parameters should match the number of columns from the primary key. | 412 |
111 |
The record with the specified id(s) does not exist. | 404 |
112 |
The ID is mandatory for DELETE and UPDATE operations. | 412 |
113 |
Error deleting specified row. | 412 |
114 |
The column values are missing from the POST data. Please construct the request with the _data parameter in the POST body and make sure the _data parameter is a well-formed JSON string. | 400 |
115 |
One of columns specified does not exist. | 412 |
116 |
Error insert specified row. | 412 |
117 |
The column values are missing from the PUT data. Please construct the request with the _data parameter in the PUT body and make sure the _data parameter is a well-formed JSON string. | 400 |
118 |
Error updating specified row. | 412 |
119 |
No data source is currently configured. | 412 |
120 |
Invalid number of parameters in _filter clause. | 412 |
121 |
The _filter clause refers to a non-existing field. | 412 |
122 |
Syntax error in _filter clause. | 412 |
123 |
Invalid operators in _filter clause. | 412 |
124 |
The _fields clause refers to a non-existing field. | 412 |
125 |
The _sort clause refers to a non-existing field. | 412 |
900 |
restifydb is not properly configured. Please contact the system administrator. | 412 |
901 |
restifydb is not properly configured. Please contact the system administrator. | 412 |
902 |
restifydb is not properly configured. Please contact the system administrator. | 412 |
903 |
restifydb is not properly configured. Please contact the system administrator. | 412 |
201 |
No download token was specified. | 400 |
202 |
Invalid token. | 400 |
203 |
Invalid column. | 400 |
204 |
The specified column is not a valid primary key. | 400 |
205 |
The object for download was empty. | 400 |
301 |
The read operation has been disabled by the administrator. | 403 |
302 |
The update operation has been disabled by the administrator. | 403 |
303 |
The create operation has been disabled by the administrator. | 403 |
304 |
The delete operation has been disabled by the administrator. | 403 |
305 |
The access to this table has been disabled by the administrator. | 403 |
I have built restifydb with speed in mind. However, there is a lot of room for improvement. There are currently some implementation details which need to be considered:
HATEOAS (Hypermedia as the Engine of
Application State,) concept. This leverages data discovery through hypermedia.
ETag support in restifydb. This improves web caching
support. The ETag is computed as the checksum of the output. Of course, same
input will produce the same checksum, meaning it can be read from cache.
GZip compression enabled will greatly reduce the network traffic, as
restifydb sends only text data (in JSON or XML format). Please consider enabling the output
compression if you wish to save bandwidth.