Writing an xDDL Specification
Before We Start
In this document, we use the command line version of xddl, available from JCenter.
The document was written against version 0.9.0 of the project, so you might need to adjust version numbers to match the release you are using.
Table of Contents
- Core Concepts
- Conventions and Practices
- Referencing Other Data
- Patching Between Versions
- Using Command Line Tools
Core Concepts
An xDDL Specification is composed of a set of Structures. Structures have Properties, which can be Types, Lists, or other Structures. Each of these can have “Extensions” which contain extended data specific to a particular runtime, language, environment, data store, etc.
Let’s start by looking at a minimal example:
{
"structures": [
{ "@type": "Structure", "name": "Person",
"properties": [
{"@type": "Type", "core": "STRING", "name": "firstName"},
{"@type": "Type", "core": "STRING", "name": "lastName"}
]
}
]
}
Here we have created an xDDL specification with a single Structure definition, called “Person”, that has two Properties, firstName and lastName. These are Types, or basic values with a “core” type of STRING. The core type carries with it some implicit settings for the various xDDL plugins, so picking the right core type can be important.
The core value for a type should be one of:
- STRING: A short string value, however you choose to define it.
- TEXT: A long text value, however you choose to define it.
- DATE: A calendar date.
- TIME: A time of day.
- DATETIME: A date and time.
- INTEGER: A 32 bit integer.
- LONG: A 64 bit integer.
- BOOLEAN: A flag.
- FLOAT: A 32 bit floating point value.
- DOUBLE: A 64 bit floating point value.
- BIG_INTEGER: An exact and arbitrary integer value.
- BIG_DECIMAL: An exact and arbitrary decimal value.
- BINARY: A collection of bytes representing a binary value.
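Because a misspelled core value is easy to introduce by hand, it can help to check a specification before handing it to a plugin. The following is a minimal sketch in Python, not part of xDDL itself: the function name find_bad_core_types and the sample value VARCHAR are made up for illustration, and the only facts it relies on are the thirteen core values listed above.

```python
# The thirteen core values listed above.
CORE_TYPES = {
    "STRING", "TEXT", "DATE", "TIME", "DATETIME", "INTEGER", "LONG",
    "BOOLEAN", "FLOAT", "DOUBLE", "BIG_INTEGER", "BIG_DECIMAL", "BINARY",
}


def find_bad_core_types(node, path="$"):
    """Collect (path, core) pairs for every Type whose core is not a known value."""
    problems = []
    if isinstance(node, dict):
        if node.get("@type") == "Type" and node.get("core") not in CORE_TYPES:
            problems.append((path, node.get("core")))
        for key, value in node.items():
            problems.extend(find_bad_core_types(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_bad_core_types(item, f"{path}[{index}]"))
    return problems


# Hypothetical specification with one deliberate mistake (VARCHAR).
spec = {
    "structures": [
        {"@type": "Structure", "name": "Person",
         "properties": [
             {"@type": "Type", "core": "STRING", "name": "firstName"},
             {"@type": "Type", "core": "VARCHAR", "name": "lastName"},
         ]}
    ]
}

print(find_bad_core_types(spec))
```

The walk is deliberately generic, so it also reaches Types nested inside Lists or other Structures.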
Next, let’s add another Structure to our specification:
{
"structures": [
{ "@type": "Structure", "name": "Person",
"properties": [
{"@type": "Type", "core": "STRING", "name": "firstName"},
{"@type": "Type", "core": "STRING", "name": "lastName"}
]
},
{ "@type": "Structure", "name": "OrganizationalUnit",
"properties": [
{"@type": "Type", "core": "STRING", "name": "name"},
{"@type": "List", "name": "members",
"contains": { "@type": "Reference", "ref": "Person"}}
]
}
]
}
Now we have an OrganizationalUnit that has a name, and a List called members. The List type has an attribute called contains that specifies what it is a list “of”. Here we use "@type": "Reference" to say we are going to reference something else in the specification and "ref": "Person" to refer to the structure we created before.
- Lists should “contain” Structures, Types, or References to Structures and Types.
- Lists may contain nested Lists, but this is discouraged as it is problematic to support in all places where you might want to use your specification.
- References may refer to Structures or Types at the root level of the specification. (More on this below)
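The rule that References point at root-level Structures or Types can be checked mechanically. This is a rough sketch under the assumption that root-level definitions live in the structures and types lists, as in the examples in this document. The function name unresolved_references and the Robot reference are invented for the example.

```python
def unresolved_references(spec):
    """Return every "ref" value that names no root-level Structure or Type."""
    known = {
        item["name"]
        for group in ("structures", "types")
        for item in spec.get(group, [])
    }
    missing = []

    def walk(node):
        # Visit every dict and list in the specification tree.
        if isinstance(node, dict):
            if node.get("@type") == "Reference" and node.get("ref") not in known:
                missing.append(node.get("ref"))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(spec)
    return missing


spec = {
    "structures": [
        {"@type": "Structure", "name": "Person", "properties": []},
        {"@type": "Structure", "name": "OrganizationalUnit",
         "properties": [
             {"@type": "List", "name": "members",
              "contains": {"@type": "Reference", "ref": "Person"}},
             # Deliberately broken: no Structure or Type is called Robot.
             {"@type": "List", "name": "helpers",
              "contains": {"@type": "Reference", "ref": "Robot"}},
         ]},
    ]
}

print(unresolved_references(spec))
```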
The final core concept is the idea of an “Extension”. Extensions, held in the ext attribute, allow you to pass extended information at any level of the specification to the plugins that generate artifacts.
Now we can use the “generate” command to create a JSON Schema equivalent for our specification:
xddl generate --input-file ./step2.xddl.json --format json --output-directory .
Which gives us:
{
"definitions" : {
"OrganizationalUnit" : {
"title" : "OrganizationalUnit",
"type" : "object",
"properties" : {
"members" : {
"title" : "members",
"type" : "array",
"items" : {
"$ref" : "#/definitions/Person"
}
},
"name" : { "title" : "name", "type" : "string"}
}
},
"Person" : {
"title" : "Person",
"type" : "object",
"properties" : {
"firstName" : { "title" : "firstName", "type" : "string" },
"lastName" : {"title" : "lastName", "type" : "string"}
}
}
},
"$schema" : "http://json-schema.org/draft-07/schema#",
"$ref" : "#/definitions/null"
}
This looks remarkably close to our original specification file, but things are renamed a bit. We also ended up with "$ref" : "#/definitions/null", because the specification does not yet say which Structure is the top level of a document. Let’s iterate on our specification again…
{
"entryRef": "OrganizationalUnit",
"types": [
{"@type": "Type", "name": "human_name", "core": "STRING",
"ext": {
"json": {
"minLength": 1, "maxLength": 255, "pattern": "[A-z-']*"
}
}
}
],
"structures": [
{ "@type": "Structure", "name": "Person",
"properties": [
{"@type": "Reference", "ref": "human_name", "name": "firstName", "required": true},
{"@type": "Reference", "ref": "human_name", "name": "lastName", "required": true}
]
},
{ "@type": "Structure", "name": "OrganizationalUnit",
"properties": [
{"@type": "Type", "core": "STRING", "name": "name","required": true},
{"@type": "List", "name": "members",
"contains": { "@type": "Reference", "ref": "Person"}}
]
}
]
}
Here we have added an entryRef, which is the name of the Structure that represents the top level of a document. We have marked a few properties as required, and we have created in the types list a new type called human_name, which we use as a Reference for firstName and lastName.
xddl generate --input-file ./step3.xddl.json --format json --output-directory .
This gives us:
{
"definitions": {
"OrganizationalUnit": {
"title": "OrganizationalUnit",
"type": "object",
"properties": {
"members": {
"title": "members",
"type": "array",
"items": {
"$ref": "#/definitions/Person"
}
},
"name": {
"title": "name", "type": "string"
}
},
"required": ["name"]
},
"Person": {
"title": "Person", "type": "object",
"properties": {
"firstName": {
"title": "firstName", "type": "string", "minLength": 1, "maxLength": 255, "pattern": "[A-z-']*"
},
"lastName": {
"title": "lastName", "type": "string", "minLength": 1, "maxLength": 255, "pattern": "[A-z-']*"
}
},
"required": ["firstName", "lastName"]
}
},
"$schema": "http://json-schema.org/draft-07/schema#",
"$ref": "#/definitions/OrganizationalUnit"
}
Now we see the values from our ext: { json: tree copied into the JSON schema for firstName and lastName. Since each of these is referenced as human_name, we don’t need to duplicate the extended configuration in multiple places.
You also now have a JSON Schema document that you can use to validate OrganizationalUnit documents.
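To make the validation step concrete, here is a hand-rolled sketch that checks a document against a trimmed copy of the generated schema. It only understands the handful of keywords used in this example (type, properties, required, items, minLength, maxLength and local $ref) and is in no way a complete JSON Schema implementation. In a real project you would use an established JSON Schema validator instead. The function names and the sample document are invented for illustration.

```python
import json

# Trimmed copy of the generated schema above (titles and "pattern" omitted).
SCHEMA = json.loads("""
{
  "definitions": {
    "OrganizationalUnit": {
      "type": "object",
      "properties": {
        "members": {"type": "array", "items": {"$ref": "#/definitions/Person"}},
        "name": {"type": "string"}
      },
      "required": ["name"]
    },
    "Person": {
      "type": "object",
      "properties": {
        "firstName": {"type": "string", "minLength": 1, "maxLength": 255},
        "lastName": {"type": "string", "minLength": 1, "maxLength": 255}
      },
      "required": ["firstName", "lastName"]
    }
  },
  "$ref": "#/definitions/OrganizationalUnit"
}
""")


def resolve(schema, node):
    """Follow local references such as "#/definitions/Person"."""
    while "$ref" in node:
        name = node["$ref"].rsplit("/", 1)[-1]
        node = schema["definitions"][name]
    return node


def check(schema, node, value, path="$"):
    """Return a list of human-readable problems found in value."""
    node = resolve(schema, node)
    errors = []
    kind = node.get("type")
    if kind == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        for name in node.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required")
        for name, sub in node.get("properties", {}).items():
            if name in value:
                errors += check(schema, sub, value[name], f"{path}.{name}")
    elif kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        for index, item in enumerate(value):
            errors += check(schema, node["items"], item, f"{path}[{index}]")
    elif kind == "string":
        if not isinstance(value, str):
            return [f"{path}: expected string"]
        if len(value) < node.get("minLength", 0):
            errors.append(f"{path}: shorter than minLength")
        if "maxLength" in node and len(value) > node["maxLength"]:
            errors.append(f"{path}: longer than maxLength")
    return errors


# Hypothetical document: the second member has an empty firstName and no lastName.
document = {
    "name": "Engineering",
    "members": [
        {"firstName": "Ada", "lastName": "Lovelace"},
        {"firstName": ""},
    ],
}

errors = check(SCHEMA, SCHEMA, document)
for error in errors:
    print(error)
```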
Conventions and Practices
While it is not required that you do so, you are encouraged to follow these naming conventions within your xDDL specification. Doing so will give you the best possible results when generating artifacts from the various plugins.
- Structures should have UpperCamelCase names.
- Property names should be lowerCamelCase.
- Specification-level types should be lower_snake_case.
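These conventions are simple enough to lint. The sketch below is one possible approximation in Python, using regular expressions of my own choosing. xDDL does not publish exact patterns in this document, so treat the three expressions and the function name naming_warnings as assumptions.

```python
import re

# Approximate patterns for the three conventions above (assumed, not official).
UPPER_CAMEL = re.compile(r"[A-Z][A-Za-z0-9]*")
LOWER_CAMEL = re.compile(r"[a-z][A-Za-z0-9]*")
LOWER_SNAKE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def naming_warnings(spec):
    """Return one warning per name that breaks the suggested conventions."""
    warnings = []
    for type_def in spec.get("types", []):
        if not LOWER_SNAKE.fullmatch(type_def["name"]):
            warnings.append(f"type '{type_def['name']}' should be lower_snake_case")
    for structure in spec.get("structures", []):
        if not UPPER_CAMEL.fullmatch(structure["name"]):
            warnings.append(f"structure '{structure['name']}' should be UpperCamelCase")
        for prop in structure.get("properties", []):
            if not LOWER_CAMEL.fullmatch(prop["name"]):
                warnings.append(
                    f"property '{structure['name']}.{prop['name']}' should be lowerCamelCase"
                )
    return warnings


# Hypothetical specification that breaks each convention once.
spec = {
    "types": [{"@type": "Type", "name": "HumanName", "core": "STRING"}],
    "structures": [
        {"@type": "Structure", "name": "person",
         "properties": [
             {"@type": "Type", "core": "STRING", "name": "first_name"},
             {"@type": "Type", "core": "STRING", "name": "lastName"},
         ]}
    ],
}

for warning in naming_warnings(spec):
    print(warning)
```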
As you can imagine, keeping all your definitions in a single file can become overwhelming if your definition is large. You can break this up by using the --include-dir option on the command line. This will scan a directory for files named *.xddl.json and place them into the types or structures groups where appropriate. You can then break your specification down to:
{
"title": "My Specification",
"version": "1.0",
"entryRef": "OrganizationalUnit"
}
And separate your files into, for example…
human_name.xddl.json
{"@type": "Type", "name": "human_name", "core": "STRING",
"ext": {
"json": {
"minLength": 1, "maxLength": 255, "pattern": "[A-z-']*"
}
}
}
This makes developing and updating your specifications in an IDE much easier, since you no longer have to search within a specification that is hundreds (or thousands) of lines long to locate what you need to edit.
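Conceptually, the include step can be pictured as follows. This Python sketch is only an illustration of the idea: it assumes each included file holds a single definition and that the "@type" value decides which group the definition joins, which is my reading of “where appropriate” rather than documented behavior. The function name merge_includes is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def merge_includes(spec, include_dir):
    """Return a copy of spec with every *.xddl.json file in include_dir added."""
    merged = dict(spec)
    merged["types"] = list(spec.get("types", []))
    merged["structures"] = list(spec.get("structures", []))
    for path in sorted(Path(include_dir).glob("*.xddl.json")):
        item = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: the "@type" value selects the target group.
        if item.get("@type") == "Structure":
            merged["structures"].append(item)
        elif item.get("@type") == "Type":
            merged["types"].append(item)
    return merged


# Demonstration with a throwaway include directory.
with tempfile.TemporaryDirectory() as include_dir:
    Path(include_dir, "human_name.xddl.json").write_text(
        json.dumps({"@type": "Type", "name": "human_name", "core": "STRING"}),
        encoding="utf-8",
    )
    Path(include_dir, "Person.xddl.json").write_text(
        json.dumps({"@type": "Structure", "name": "Person", "properties": []}),
        encoding="utf-8",
    )
    merged = merge_includes(
        {"title": "My Specification", "version": "1.0",
         "entryRef": "OrganizationalUnit"},
        include_dir,
    )

print([t["name"] for t in merged["types"]])
print([s["name"] for s in merged["structures"]])
```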
Looking at the help text for the generate command, we see:
xddl generate --help
Usage: generate [options]
Options:
* --format, -f
The output plugin to generate
--help
Show this help text
--include-dir, -d
Directory(ies) to scan for *.xddl.json files to include.
* --input-file, -i
The specification file.
* --output-directory, -o
The directory to output generated artifacts to.
--stacktrace
Show the stacktrace of an error
Default: false
--vals-file, -v
JSON file of values
So if we run
xddl generate -i step4.xddl.json -d step4includes -f json -o .
Then, because we have provided a title and version in our specification document, we now get My_Specification_1.0.schema.json. This naming convention is common among plugins. If we wanted to generate a table definition for Apache Hive we could run…
xddl generate -i step4.xddl.json -d step4includes -f hive -o .
…and we will generate a file called My_Specification_1.0.hive with our Hive table definition.
CREATE EXTERNAL TABLE IF NOT EXISTS My_Specification_1.0 (
name varchar(255),
members ARRAY<STRUCT<firstName:varchar(255), lastName:varchar(255)>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ('serialization.format' = '1', 'ignore.malformed.json' = 'true')
LOCATION ''
TBLPROPERTIES ('has_encrypted_data'='false');
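The two file names seen so far suggest a simple pattern: the title with spaces turned into underscores, then the version, then a plugin-specific extension. The one-line Python function below merely restates that observation. How xDDL treats other characters in a title is not covered in this document, so the helper artifact_name is an assumption-laden illustration, not the tool's actual rule.

```python
def artifact_name(title, version, extension):
    """Build a file name from the pattern observed in the examples above."""
    # Only the space-to-underscore replacement is visible in the examples.
    return f"{title.replace(' ', '_')}_{version}.{extension}"


print(artifact_name("My Specification", "1.0", "schema.json"))
print(artifact_name("My Specification", "1.0", "hive"))
```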
Referencing Other Data
Field values in xDDL can be interpreted as OGNL expressions, which allows you to reference values from inside and outside the specification. Let’s look at an example.
In our previous specification file, we hard-coded the version to be “1.0”, but this might not be specific enough. Our project is probably being built on a server somewhere, and we might want the version to reflect that build. In this case we can say:
{
"title": "My Specification",
"version": "1.0.${vals.buildNumber}",
"entryRef": "OrganizationalUnit"
}
Now we can create a values.json file that looks like:
{"buildNumber": "1234" }
And use the “unify” command to generate a version of our spec that has the external value substituted into it…
xddl unify -i step5.xddl.json -d step4includes -v values.json -o current.xddl.json
… giving us:
{
"title" : "My Specification",
"version" : "1.0.1234",
"entryRef" : "OrganizationalUnit",
"types" : [ {
"@type" : "Type", "name" : "human_name", "core" : "STRING"
.. and so forth.
Let’s look at a VERY common case:
{
"@type": "Type",
"core": "STRING",
"name": "version",
"description": "The version",
"required": true,
"ext": {
"java": {
"initializer": "\"${specification.version}\""
}
}
}
Here we are copying the version from the specification file to the initializer of the Java variable. You can now see the two major context objects you have access to from OGNL:
- specification: the actual specification itself
- vals: the external values object. This can be read from a JSON file, as we are doing here, or defined in the build.gradle file if you are using the Gradle plugin.
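To build intuition for what happens to a ${...} placeholder, the following Python sketch resolves simple dotted paths against a context holding specification and vals. It is emphatically not OGNL: real OGNL supports method calls, operators and much more. The sketch only mimics the plain property lookups used in this document, and all function names are invented.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(context, dotted_path):
    """Resolve a path such as "vals.buildNumber" by walking nested dicts."""
    value = context
    for key in dotted_path.split("."):
        value = value[key]
    return value


def substitute(text, context):
    """Replace each ${path} in text with the value found in context."""
    return PLACEHOLDER.sub(
        lambda match: str(lookup(context, match.group(1).strip())), text
    )


def evaluate(node, context):
    """Apply substitute() to every string in a nested dict/list structure."""
    if isinstance(node, dict):
        return {key: evaluate(value, context) for key, value in node.items()}
    if isinstance(node, list):
        return [evaluate(item, context) for item in node]
    if isinstance(node, str):
        return substitute(node, context)
    return node


spec = {
    "title": "My Specification",
    "version": "1.0.${vals.buildNumber}",
    "entryRef": "OrganizationalUnit",
}
context = {"specification": spec, "vals": {"buildNumber": "1234"}}

unified = evaluate(spec, context)
print(unified["version"])
```

Strings without a placeholder pass through untouched, which is why title and entryRef are unchanged in the result.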
Patching Between Versions
With the unify command, in addition to our “includes” directory, we have the option to specify a “patches” directory. This is a directory that contains *.patch.json or *.xddl.json files that we will use to modify an existing specification. These are simply more individual structure files that contain only the changes between versions. For example, if we wanted to rename the “firstName” field on our “Person” structure, we could create Person.patch.json:
{ "@type": "Structure", "name": "Person",
"properties": [
{"@type": "PATCH_DELETE", "name": "firstName"},
{"@type": "Reference", "ref": "human_name", "name": "givenName", "required": true}
]
}
Now we can specify a patches directory and a new version on the command line:
xddl unify -i step5.xddl.json -d step4includes -o current.xddl.json -p ./step6patches --new-version 2.0
Giving us:
{
"title" : "My Specification",
"version" : "2.0",
"entryRef" : "OrganizationalUnit",
// ...
"structures" : [
// ...
{
"@type" : "Structure",
"name" : "Person",
"properties" : [ {
"@type" : "Reference",
"name" : "lastName",
"required" : true,
"ref" : "human_name"
}, {
"@type" : "Reference",
"name" : "givenName",
"required" : true,
"ref" : "human_name"
} ]
} ]
}
The new version from the command line is now populated, the “firstName” field has been removed, and the “givenName” field has been added to the Person structure. This is part of a larger conversation about data migration between versions. For that, you should consult the section on ElasticSearch migrations.
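The effect of a patch on a single Structure can be sketched in a few lines of Python. The sketch reproduces the result shown above for the Person example, but the details are assumptions on my part: in particular, I assume a non-delete entry replaces any existing property of the same name and is appended at the end. The function name apply_structure_patch is hypothetical.

```python
def apply_structure_patch(structure, patch):
    """Return a copy of structure with the patch's property changes applied."""
    properties = list(structure.get("properties", []))
    for change in patch.get("properties", []):
        # Remove any existing property with this name (for delete or replace).
        properties = [p for p in properties if p["name"] != change["name"]]
        if change["@type"] != "PATCH_DELETE":
            properties.append(change)
    return {**structure, "properties": properties}


person = {
    "@type": "Structure", "name": "Person",
    "properties": [
        {"@type": "Reference", "ref": "human_name", "name": "firstName", "required": True},
        {"@type": "Reference", "ref": "human_name", "name": "lastName", "required": True},
    ],
}
person_patch = {
    "@type": "Structure", "name": "Person",
    "properties": [
        {"@type": "PATCH_DELETE", "name": "firstName"},
        {"@type": "Reference", "ref": "human_name", "name": "givenName", "required": True},
    ],
}

patched = apply_structure_patch(person, person_patch)
print([p["name"] for p in patched["properties"]])
```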
Using Command Line Tools
While most people use the Gradle plugins, the xDDL Command Line is fully featured and can be used to integrate the functionality with whatever build chain you might have. Here we have mostly used the generate command to create artifacts using the xDDL plugins. The other commands are:
unify: Takes a collection of includes and/or patches, and generates a single xddl file that contains all the defined structures.
xddl unify --help
Output:
Usage: unify [options]
Options:
--no-evaluate-ognl, -no-eval
Disables OGNL evaluation.
--help
Show this help text
--include-dir, -d
Directory(ies) to scan for *.xddl.json files to include.
* --input-file, -i
The specification file.
--new-version, -nb
The version string of the unified file
* --output-file, -o
The file to output generated artifacts to.
--patches-dir, -p
Directory(ies) to scan for *.patch.json files to include.
--scrub-patch, -s
scrubs patch-delete operations from the original
Default: false
--stacktrace
Show the stacktrace of an error
Default: false
--vals-file, -v
JSON file of values
glide: Takes a specification and a directory of versioned patches and generates the interim xddl file for each version. (You can learn more about this in the models or elasticsearch documentation.)
xddl glide --help
Output:
Usage: glide [options]
Options:
--glide-patches, -g
Directory(ies) to scan for 'vXXXX' directories containing *.patch.json
files to include.
--help
Show this help text
--include-dir, -d
Directory(ies) to scan for *.xddl.json files to include.
* --input-file, -i
The specification file.
* --output-directory, -o
The file to output generated artifacts to.
--stacktrace
Show the stacktrace of an error
Default: false
--vals-file, -v
JSON file of values
diff: Outputs the diff between different xddl specs.
xddl diff --help
Output:
Usage: diff [options]
Options:
--comparision
Show comparision rather than just missing fields
Default: false
--help
Show this help text
* --left-file, -l
The left hand file.
--left-include-dir, -ld
Directory(ies) to scan for *.xddl.json files to include.
* --right-file, -r
The left hand file.
--right-include-dir, -rd
Directory(ies) to scan for *.xddl.json files to include.
--stacktrace
Show the stacktrace of an error
Default: false
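The help text indicates that diff reports missing fields by default. As a rough mental model only, the Python sketch below lists the structure properties that appear in one specification but not the other. The actual output format and matching rules of xddl diff are not described in this document, so everything here, including the function names, is an assumption.

```python
def property_names(spec):
    """Return the set of (structure, property) name pairs in a specification."""
    return {
        (structure["name"], prop["name"])
        for structure in spec.get("structures", [])
        for prop in structure.get("properties", [])
    }


def missing_fields(left, right):
    """Return (only_in_left, only_in_right) as sorted lists of name pairs."""
    left_names = property_names(left)
    right_names = property_names(right)
    return sorted(left_names - right_names), sorted(right_names - left_names)


# Hypothetical before/after specifications mirroring the rename above.
version_1 = {"structures": [{"name": "Person", "properties": [
    {"name": "firstName"}, {"name": "lastName"}]}]}
version_2 = {"structures": [{"name": "Person", "properties": [
    {"name": "lastName"}, {"name": "givenName"}]}]}

only_left, only_right = missing_fields(version_1, version_2)
print("only in left: ", only_left)
print("only in right:", only_right)
```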