Tasks in Depth

source and js fields


The source and js fields are the functional parts of your periodic task. Together they form a pipeline that generates drops, which are written back into flowthings.io.

The task's source field describes the data source: how and where queries will be executed and, optionally, how the returned data will be parsed. Your query can be executed against a URL or against the flow.js drop.find() API. source also provides shortcuts for common formats such as XML, JSON, and RSS feeds, which then need no further manual manipulation. When run, source will generate some combination of drops (if you've specified parsing or a dropFind query) and raw data (if you've specified an external URL).

The second part of the task, js, specifies a JavaScript function that lets you add manual parsing to further manipulate your raw input data. This function has access to the raw response data from your source; the only requirement is that it outputs valid drops. This is similar to how JavaScript works in Tracks.

Together the source and js fields describe a task that will query an external data source and write data to flowthings.

source

The source field is a map that specifies how your task will retrieve data. You can think of the source as a step that queries some datasource and returns some combination of drops and raw http response data or, if you like, as a function:

datasource -> (drops, response)

If your source outputs drops you can omit the js field - the resulting drops will be written to flowthings. If you have a js field the resulting drops will simply be made available to that function for further processing.
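The hand-off described above can be pictured with a small local simulation. This is only a sketch under stated assumptions: runTask, the shape of its arguments, and the rule for telling a single drop from a map of paths are hypothetical and invented for illustration; they are not part of the flowthings platform.

```javascript
// Hypothetical local model of one task run; not the platform's implementation.
// sourceResult: { drops, response } as produced by the source step (may be null).
// jsFn: the task's optional js function.
// destination: the task's destination path.
function runTask(sourceResult, jsFn, destination) {
  var drops = sourceResult ? sourceResult.drops : null;
  var response = sourceResult ? sourceResult.response : null;
  var written = {};

  // No js field: the drops produced by the source go straight to the destination.
  if (!jsFn) {
    written[destination] = drops || [];
    return written;
  }

  // With a js field, the drops are only handed to the function.
  var result = jsFn(drops, response);
  if (result == null) {
    return written; // the function produced nothing to write
  }
  if (result.elems) {
    // Assumption: an object with "elems" is a single drop for the destination.
    written[destination] = [result];
    return written;
  }
  return result; // otherwise treat the result as a map of path -> drops
}
```

Calling runTask({ drops: someDrops, response: null }, null, "/path/to/my/flow") returns a map with someDrops under the destination path, which mirrors the "omit the js field" case.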

Currently there are two source types supported:

  • dropFind which allows you to base your task on a query from the flow platform, and
  • http which allows you to base your task on any external http data source.

dropFind

A dropFind source executes using the same syntax and permissions as a Flow.Drop.find() executed in a Track. It simply retrieves the drops from flowthings, in the same form that you would receive them in an api query:

{
  "source": {
    "dropFind": {
      "flowId": "f000000000000000000000006", // some flow you have permissions to read
                                             // or, alternatively
      "flowPath":"/path/to/my/flow",         // some flow you have permissions to read
      "options": {
        "filter": "a > 1", // some query in Flow filter language
        "limit":  20,      // limit of how many drops to retrieve. If omitted this defaults to 20
        "start":  10,      // a number that specifies the query offset
        "sort":   "a",     // specified field to sort on. 
        "order":  "asc"    // 'asc' or 'desc'
      }
    }
  }
}

One of flowId or flowPath is mandatory, while options and all values within it are optional. If your task also includes a js member, the drops output by this query will be passed as the first parameter of your supplied function for additional processing. If you do not supply a js member, the query results will be forwarded to the destination specified in the Task.
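As an illustration of post-processing dropFind results, the sketch below sums a numeric field over the drops it receives. The function name summarize, the output keys count and total, and the assumption that elems hold plain values (as in the examples on this page) are all hypothetical.

```javascript
// Receives the drops returned by the dropFind query as the first argument.
// The second argument is null because no http source is involved.
function summarize(drops, response) {
  var total = 0;
  for (var i = 0; i < drops.length; i++) {
    total += drops[i].elems.a; // assumes "a" is a plain number
  }
  // Return a single drop, to be written to the task's destination.
  return { elems: { count: drops.length, total: total } };
}

// Local check with hand-made drops (sample data, not real query output).
var sample = [{ elems: { a: 2 } }, { elems: { a: 5 } }];
var summary = summarize(sample, null);
```

In a real task the function body would be supplied as a string in the js field; running it locally against sample drops first is a cheap way to catch mistakes.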

http

An http map tells us how to retrieve data via an http request. If you use this type of source and provide a JavaScript function, the raw response data will be available in the second (response) parameter. In this case your source map would look like:

{
  "http": {
    "get": {
      "url" : "http://my.url.com"
    }
  }
}

An http source supports the get, post, put, and delete methods, along with some optional parameters:

{
  "http":{
    "get": {                         // can be "get","post","put","delete"
      "url": "https://example.com",  // mandatory
      "headers":{                    // map of headers, optional
        "Content-Type": "text/plain"
      },
      "data": "hello world"  //  data to be sent along with the request, optional
    }
  }
}

Another example:

{
  "http": {
    "post": {
      "url": "http://my.url.com",
      "headers": {
        "Content-Type": "text/plain"
      },
      "data":  "some data value"
    }
  }
}

While you can then use this to manually extract data from the response to construct drops, we've also provided shortcuts for some commonly used parsing tasks.
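Manual extraction might look like the following sketch, which uses the statusCode and content fields of the response object described in the js section. The payload shape (a JSON array of objects with name and temp keys), the function name parseReadings, and the path /my/readings are made up for illustration.

```javascript
// Turns a raw JSON response body into drops by hand.
// Assumes the body is a JSON array such as [{"name": "a", "temp": 1}, ...].
function parseReadings(drops, response) {
  // Without a successful response there is nothing to parse, so write nothing.
  if (!response || response.statusCode !== 200) {
    return undefined;
  }
  var items = JSON.parse(response.content);
  var out = [];
  for (var i = 0; i < items.length; i++) {
    out.push({ elems: { name: items[i].name, temp: items[i].temp } });
  }
  // Map of path -> list of drops to write under that path.
  return { "/my/readings": out };
}
```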

RSS

Because RSS is a (reasonably) standardized format, if you specify that a source should be parsed as rss we'll make a best-guess effort to do that. If your RSS source contains customized fields, you can handle them with an optional JavaScript function, but for the most part you can simply write:

{
  "http": {
    "get": {
      "url": "http://my.url.com"
    },
    "parser":"rss"
  }
}

This is all you need to do to periodically poll an external RSS feed - in this case you don't need to provide a js processing step.

xpath

If your http data source provides well-formed XML you can simply provide a map of xpaths to extract the data. To do this, you provide two parts:

  • a root path that identifies the node in the document that delimits drops. For each node matching the root xpath, we'll create a new drop using the nodes map.

  • a nodes map, where the keys are the drop keys that will be written, and the values specify the xpaths to evaluate relative to each root.

For example, given the xml:

<slideshow>
    <slide>
        <title>First title</title>
        <item>1</item>
    </slide>
    <slide>
       <title>Second title</title> 
    </slide>
</slideshow>

and the source

{
  "http": {
    "get": {
      "url": "http://my.url.com"
    },
    "parser": {
      "xpath": {
        "root": "/slideshow/slide",
        "nodes": {
          "elems": {
            "a": "title",
            "b": "item"
          }
        }
      }
    }
  }
}

The drops output would look like:

[
  {
    "elems": {
      "a": "First title",
      "b": "1"
    }
  }, {
    "elems": {
      "a": "Second title"
    }
  }
]

There is also a longer, more detailed way of specifying nodes. Within the nodes map, you can optionally specify the type of the output and whether or not the value is mandatory. If you specify a value as mandatory and it's not available or not coercible into the requested type, the drop will not be output. In the above example, if we had specified the xpath nodes as:

"nodes": {
  "elems": {
    "a": "title",
    "b": {
      "type": "integer",
      "required": true,
      "field": "item"
    }
  }
}

the drops output would look like:

[
  {
    "elems": {
      "a": "First title",
      "b": 1
    }
  }
]

Note that the value for "b" is returned as an integer, and the second drop is omitted because it doesn't have the mandatory field "b".

Just as with RSS parsing, this is sufficient to query an external XML source, parse into drops and write data into flowthings - the js step is not necessary.
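The type and required rules can be modeled locally with a short sketch. It does not evaluate XPath; it assumes the string values for each root match have already been extracted into plain objects. The function buildDrops and its argument shapes are hypothetical, and the coercion rule shown is a guess at reasonable behavior, not the platform's documented logic.

```javascript
// rawRecords: one object per root match, holding already-extracted string values.
// elemSpecs: map of drop key -> field name (short form) or
//            { field, type, required } (long form).
function buildDrops(rawRecords, elemSpecs) {
  var out = [];
  rawRecords.forEach(function (record) {
    var elems = {};
    var keep = true;
    Object.keys(elemSpecs).forEach(function (key) {
      var spec = elemSpecs[key];
      if (typeof spec === "string") {
        spec = { field: spec }; // short form: the value is just the field to read
      }
      var value = record[spec.field];
      if (value !== undefined && spec.type === "integer") {
        var n = String(value).trim() === "" ? NaN : Number(value);
        value = Number.isInteger(n) ? n : undefined; // undefined: not coercible
      }
      if (value === undefined) {
        if (spec.required) {
          keep = false; // a mandatory value is missing, so skip the whole drop
        }
        return;
      }
      elems[key] = value;
    });
    if (keep) {
      out.push({ elems: elems });
    }
  });
  return out;
}

// The two slides from the XML example, as pre-extracted values.
var records = [{ title: "First title", item: "1" }, { title: "Second title" }];
var specs = { a: "title", b: { type: "integer", required: true, field: "item" } };
var result = buildDrops(records, specs);
```

With these inputs result holds a single drop with "a" set to "First title" and "b" set to the integer 1, matching the output shown above.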

jsonpath

This works identically to the xpath parser, except that instead of XPaths you'll specify JsonPaths (http://goessner.net/articles/JsonPath/). For example, the document:

{
  "store":  {
    "book": [
      { "author": "Tom" },
      { "author": "Dick" },
      { "author": "Harry"}
    ]
  }
}

with the source parser specified as:

{
  "source": {
    "http":...,
    "parser": {
      "jsonpath": {
        "root": "$.store.book",
        "nodes": {
          "elems": {
            "a": "author"
          }
        }
      }
    }
  }
}

returns the drops:

[{
  "elems": {
    "a": "Tom"
  }
}, {
  "elems": {
    "a": "Dick"
  }
}, {
  "elems": {
    "a": "Harry"
  }
}]

Again, this is sufficient to query an external JSON data source and write drops to flowthings. The js step is not necessary unless you'd like to do further processing.
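To make the root and nodes mechanics concrete, here is a rough local model. It resolves only plain dotted property paths, a tiny subset of JsonPath, and the helper names lookup and extractDrops are hypothetical; the platform's own parser is not shown here and may behave differently on edge cases.

```javascript
// Resolves a dotted path such as "store.book" against an object.
function lookup(obj, path) {
  return path.split(".").reduce(function (cur, key) {
    return cur == null ? undefined : cur[key];
  }, obj);
}

// Builds one drop per value found at the root, using elemPaths for its elems.
function extractDrops(doc, root, elemPaths) {
  // Strip the leading "$" or "$." of the root expression.
  var rootPath = root.replace(/^\$\.?/, "");
  var matches = rootPath === "" ? doc : lookup(doc, rootPath);
  if (matches == null) {
    return []; // nothing matched the root
  }
  if (!Array.isArray(matches)) {
    matches = [matches];
  }
  return matches.map(function (node) {
    var elems = {};
    Object.keys(elemPaths).forEach(function (key) {
      var value = lookup(node, elemPaths[key]);
      if (value !== undefined) {
        elems[key] = value;
      }
    });
    return { elems: elems };
  });
}

var doc = {
  store: { book: [{ author: "Tom" }, { author: "Dick" }, { author: "Harry" }] }
};
var drops = extractDrops(doc, "$.store.book", { a: "author" });
```

Here drops contains three drops whose "a" values are "Tom", "Dick", and "Harry", the same result as the example above.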

js

The js object member defines a function that accepts two inputs (drops and raw http response data) and returns a map of paths to the drops to be written under each path. In general it will look like:

function(drops, response) {
  ...
  return {
    "path1": [drop1, drop2, ...],
    "path2": [drop3, drop4, ...],
    ...
  }
}

where the function argument drops represents any drops that may have been created by the source and is provided as an array:

[
  {"elems":{ "elem1":.., "elem2":..}},
  {"elems":{ "elem1":.., "elem2":..}},
]

and the response object represents the response received from the specified http request:

{
   statusCode : 200, 
   contentType: "application/json",
   content: "the raw content string of the http response",
   headers: {...} // a map of response headers
}

The same JavaScript, flow.js, and lodash functions that are available in Tracks are available in the Task environment. An example function may look like this:

function(drops, response) {

  // "drops" contains the drops returned from a drop.find() query
  // or parsed from a HTTP source with an rss, xpath, or jsonpath specifier
  // If you haven't specified enough information to generate drops (either 
  // parsing or a dropFind query) this parameter will be null.
  // 
  // "response" contains the response object that will be available
  // in the function if you specify an HTTP(S) source. If you haven't specified
  // an http data source this parameter will be null

  // below we stop processing and report to a Flow we
  // define at runtime that might trigger additional actions
  // like a slack POST or an SMS message
  if (response.statusCode !== 200) {
    return {
      "/my/error/flow": [{
        "elems": {
          "body": response.content,
          "status": response.statusCode
        }
      }]
    };
  }
}

The above function would only generate a drop if the http response failed.

A js function can also run without any source. This (not terribly interesting) function returns the current date (note that no source is specified, and the drops and response parameters are ignored):

function(drops, response) {
  return {"elems": {
    "the_date": new Date().toISOString()
  }};
}

It would create the following drop in the specified destination each time the Task was executed:

{"elems": {"the_date": "2016-03-09T19:34:34.144Z"}}

A summary of these settings appears below:

Format    Source    Parser    Example                             Js function
RSS       http      rss       {"parser": "rss"}                   optional
XML       http      xpath     {"parser": {"xpath": {...}}}        optional
JSON      http      jsonpath  {"parser": {"jsonpath": {...}}}     optional
RAW       http      -         no parser - javascript is required  mandatory
DropFind  dropFind  -         no parser                           optional
-         none      -         -                                   function that generates values with null input

Some examples of Periodic Tasks

RSS task

The following task is sufficient to query an RSS feed once per minute, find any new posts, and post them to the specified flow. No other programming is required:

{
  "source": {
    "http": {
      "parser": "rss",
      "get": {
        "url": "http://my.blog/feed/"
      }
    }
  },
  "periodicity": 60000,
  "destination": "/path/to/my/flow"
}

The same task, optionally processed with a bit of JavaScript, can set an additional title field:

{
  "source": {
    "http": {
      "parser": "rss",
      "get": {
        "url": "http://my.blog/feed/"
      }
    }
  },
  "js": "function(drops,response){ var out=[]; for (var i in drops){ out.push({ elems:{ mytitle : drops[i].elems.title }}); } return { \"/my/altered-drops\": out, \"/my/original-drops\": drops}; }",
  "periodicity": ...,
  "destination": ...
}

Notice that, just as with Tracks, if your function returns multiple drops it should do so in the form of a map of path -> list of drops. In this case the altered drops (which contain only the title) would be written to /my/altered-drops, while the original drops would be written to /my/original-drops. If you manually specify a path in your JavaScript function, it will override the task's destination flow.

The drops as generated by the rss parser are fed into the js function through the first (drops) parameter.
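Before saving a task like this, it can help to run the function locally against hand-made drops. The sketch below writes the js function from the task above on multiple lines under the hypothetical name retitle; the sample drops and their link field are invented and only stand in for whatever the rss parser actually produces.

```javascript
// The js function from the RSS task, formatted for readability.
function retitle(drops, response) {
  var out = [];
  for (var i in drops) {
    out.push({ elems: { mytitle: drops[i].elems.title } });
  }
  return { "/my/altered-drops": out, "/my/original-drops": drops };
}

// Hand-made drops standing in for parsed feed entries (not real parser output).
var fakeDrops = [
  { elems: { title: "Post one", link: "http://my.blog/1" } },
  { elems: { title: "Post two", link: "http://my.blog/2" } }
];
var written = retitle(fakeDrops, null);
```

The returned map has one entry per path, so written["/my/altered-drops"] holds the title-only drops and written["/my/original-drops"] holds the untouched input.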

XML Task

This task will retrieve XML from the specified url and return a drop created from the values for each occurrence of the root xpath "/slideshow/slide". As with the RSS task above, if you include a JavaScript function, the drops this generates will be fed into the function as the first argument. If you are simply extracting values from an XML data source, this is much simpler than manually parsing XML data:

{
  "source": {
    "http": {
      "get": {
        "url":"http://httpbin.org/xml"
      },
      "parser": {
        "xpath": {
          "root": "/slideshow/slide",
          "nodes": {
            "elems": {
              "a": "title",
              "b": "item"
            }
          }
        }
      }
    }
  },
  "periodicity": ...,
  "destination": ...
}

JSON Task

A JSON task is identical to an XML task except that JsonPaths are specified:

{
  "source": {
    "http": {
      "get": {
        "url": "http://httpbin.org/get?a=b"
      },
      "parser": {
        "jsonpath": {
          "root": "$",
          "nodes": {
            "elems": {
              "a": "headers.Accept",
              "b": {
                "type": "string",
                "required": true,
                "field": "headers.Host"
              }
            }
          }
        }
      }
    }
  },
  "periodicity": ...,
  "destination": ...
}

Drop Find Task

This task will return the results of a query on the specified flow. You can specify either the flowId or flowPath, but not both:

{
  "source": {
    "dropFind": {
      "options": {
        "filter": "EXISTS A",
        "limit": 1
      },
      "flowId": "f000000000000000000000006"
    }
  },
  "periodicity": ...,
  "destination": ...
}

Manually parsed task

In this task an http source has been provided, but no parser. The raw response data will be available as the second argument of the js function, which will be responsible for generating valid drops:

{
  "source": {
    "http": {
      "get": {
        "url": "http://httpbin.org/get?a=b"
      }
    }
  },
  "js": "function f(drops,response) { return { 'elems': { 'a': response.statusCode, 'b': response.content } }; }",
  "periodicity": ...,
  "destination": ...
}

A "sourceless" task

Run without a source, this task will simply output the same value once per minute, generating a "heartbeat" value:

{
  // source field is omitted
  "js": "function f(drops,response) { return { 'elems': { 'a': 1 } } }",
  // since there is no source, both the drops and the response
  // parameters of the js function will be null
  "periodicity":...,
  "destination":...
}