How can I validate my step inputs and outputs?
Version 3 of opscotch introduces a small but powerful "preprocessor" called doc for JavaScript processors. The doc functions let you add documentation and programmatic declarations of the inputs your step expects and the outputs it returns. It's a way to validate and type-check data, and it also declares your expectations to callers.
See the doc documentation here
Previously, opscotch did not support any formal documentation of JavaScript files (you'd need to use comments etc.) or type safety of processor inputs and outputs.
In a JavaScript processor, when you evaluate the inputs via context.getBody() or context.getData(), you need to check them yourself. For example, if you expect your context.getBody() input to be a JSON object with a string property called foo, you need to do something like this:
/*
This processor expects a JSON object that has a string property foo.
This processor does important work and stuff...
*/
const raw = context.getBody();
// context.getBody() is expected to be a JSON string
let input;
try {
    input = JSON.parse(raw);
} catch (err) {
    throw new Error('Input is not valid JSON');
}
// Validate the parsed input
if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    throw new Error('Expected input to be a JSON object');
}
if (typeof input.foo !== 'string') {
    throw new Error('Expected input.foo to be a string');
}
// Safe to use
context.log(`foo = ${input.foo}`);
This is pretty verbose, and it gets long and difficult to maintain, especially for complex objects with conditional requirements.
Using the new doc functions you could just do:
doc
    .description("This processor does important work and stuff...")
    .inSchema({
        type: "object",
        properties: {
            foo: {
                type: "string"
            }
        },
        required: ["foo"]
    })
    .run(() => {
        // Safe to use
        context.log(`foo = ${JSON.parse(context.getBody()).foo}`);
    });
While still somewhat verbose, this is the simplest way to perform arbitrarily complex validations, AND the file can now be read by a human or a machine to learn how to use it.
You can use the same technique to pre-validate context.getData(), and to apply schema checking to the output of the processor.
How does this work?
The doc.run(() => {}); function should completely enclose the JavaScript you are running, such that the doc functions are the only statements at the root of the JavaScript.
At a minimum your entire processor script would be:
doc.run(() => { /* your javascript here */ });
However this example is a bit useless, as it's not adding anything of value. To make it useful you need to add the schema functions.
The schema functions take a JSON Schema object, and enforce schema compliance.
- .description("...") - provides a clear description of the file
- .inSchema({...}) - this schema is applied to the data from context.getBody() BEFORE the processor executes
- .dataSchema({...}) - this schema is applied to the data from context.getData()
- .outSchema({...}) - this schema is applied to the outgoing message body set by context.setBody(...) AFTER the processor executes
- .asUserErrors() - when this function is added, validation errors are marked as "user" errors (i.e., the user should fix the inputs); otherwise they are marked as "system" errors (i.e., the developer should fix them)
opscotch pre-compiles the schemas, and any problems with the schema will be raised during workflow load.
What is JSON Schema?
JSON Schema is a widely used method for declaring the expected shape of JSON entities. With JSON Schema you can do things like:
- set the type of properties, e.g. "this is a string"
- mark properties as required
- declare dependent properties
- express "any of these" and "all of these" semantics
- express conditional semantics, e.g. "if that then this"
- assert that no extra properties are present
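As an illustration, here is a schema combining several of these features. It is a hypothetical fragment, not tied to any particular opscotch step: it requires a mode property, forbids extra properties, and conditionally requires path or url depending on the mode.

```javascript
// Illustrative JSON Schema fragment (hypothetical names).
// If mode is "file", path is required; otherwise url is required.
const fileOrUrlSchema = {
    type: "object",
    required: ["mode"],
    additionalProperties: false,
    properties: {
        mode: {
            type: "string",
            enum: ["file", "url"]
        },
        path: { type: "string" },
        url: { type: "string" }
    },
    if: {
        properties: { mode: { const: "file" } }
    },
    then: { required: ["path"] },
    else: { required: ["url"] }
};
```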
Runtime behaviour
If the incoming body does not match the inSchema, or the data payload does not match the dataSchema, opscotch will reject the run with a clear error before your code executes.
If your output does not match the outSchema, opscotch will catch that too.
You get immediate feedback, a self-documenting contract for your step, and safer workflows with almost no extra code.
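Conceptually, the schema check is a guard that runs before (and after) your code. This hand-rolled sketch shows what enforcing the earlier "object with a string foo" schema amounts to; it is NOT opscotch's implementation, just an illustration of the behaviour:

```javascript
// Minimal illustrative validator for the "object with string foo" schema.
// Not opscotch's implementation: it only shows what schema enforcement
// does conceptually, i.e. reject bad input before user code runs.
function validateFooSchema(raw) {
  let input;
  try {
    input = JSON.parse(raw);
  } catch (err) {
    throw new Error("schema violation: input is not valid JSON");
  }
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    throw new Error("schema violation: expected a JSON object");
  }
  if (typeof input.foo !== "string") {
    throw new Error("schema violation: expected string property foo");
  }
  return input; // validated; safe for the processor body to use
}
```

With doc, opscotch performs the equivalent checks for you from the declared schema, so your run() body never sees unvalidated data.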
JSON Schema supports "descriptions". Why should I use them?
Because the schemas live next to your code, adding description fields turns them into living documentation for whoever reads (or debugs) the step. Over time, those descriptions can be parsed to give humans and AI more context about what a workflow expects and produces, power UI generation for building forms, and even visualize how files and inputs wire together through a workflow.
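For instance, the inSchema from the earlier example could carry description fields at both the object and property level (the wording of the descriptions here is hypothetical):

```javascript
// The same "foo" schema, annotated with descriptions.
// The description text is illustrative, not from any real workflow.
{
    type: "object",
    description: "a request naming a single item to process",
    required: ["foo"],
    properties: {
        foo: {
            type: "string",
            description: "the name of the item to process"
        }
    }
}
```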
An example
Here is a more detailed sketch. The schemas specify what is required coming in and what is produced going out. Body and data validation is done before the JavaScript is executed, so you can safely assume the structure is correct without additional guards in your code:
doc
    .description("Produces a diff between files")
    .inSchema({
        type: "object",
        required: ["path", "body"],
        properties: {
            path: {
                type: "string",
                description: "the path to the file"
            },
            body: {
                type: "string"
            }
        }
    })
    .dataSchema({
        type: "object",
        required: ["fileId"],
        properties: {
            fileId: {
                type: "string"
            }
        }
    })
    .run(() => {
        // safely access getBody() and getData() structures
        const request = JSON.parse(context.getBody());
        const contents = context.files(context.getData("fileId")).read(request.path);
        // ... more stuff is done here
        // set the output, which is validated against the following outSchema
        context.setBody(JSON.stringify([{ path: "example", type: "modified" }]));
    })
    .outSchema({
        type: "array",
        items: {
            type: "object",
            required: ["path", "type"],
            properties: {
                path: {
                    type: "string"
                },
                type: {
                    type: "string",
                    enum: ["added", "deleted", "modified", "unsupported"]
                }
            }
        }
    });