Split URL directive

The SPLIT-URL directive splits a URL into protocol, authority, host, port, path, filename, and query.


Syntax

split-url :column

The column argument is the name of the column that contains the URL.

Usage Notes

The SPLIT-URL directive parses the URL into its constituent parts and creates seven new columns, each named by appending a suffix to the original column name:

  • column_protocol

  • column_authority

  • column_host

  • column_port

  • column_path

  • column_filename

  • column_query

If the URL cannot be parsed correctly, an exception is thrown. If the URL column does not exist, the seven columns are added to the record with null values.
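
The behavior described above can be approximated with a short Java sketch. It is an illustration only and not the directive's actual implementation: the class name SplitUrlSketch and the method splitUrl are hypothetical, and it assumes that the seven parts correspond to the getters of java.net.URL (in particular, that the filename part corresponds to getFile(), which includes the query string).

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mimics the documented behavior of split-url.
final class SplitUrlSketch {

    private static final String[] SUFFIXES = {
        "protocol", "authority", "host", "port", "path", "filename", "query"
    };

    // Returns the original column plus seven derived columns named <column>_<suffix>.
    static Map<String, Object> splitUrl(String column, String value) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put(column, value);

        // Missing or null URL: add the seven columns with null values.
        if (value == null) {
            for (String suffix : SUFFIXES) {
                out.put(column + "_" + suffix, null);
            }
            return out;
        }

        URL url;
        try {
            url = new URI(value).toURL();
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            // Unparseable URL: surface the failure as an exception.
            throw new IllegalArgumentException("Cannot parse URL: " + value, e);
        }

        out.put(column + "_protocol", url.getProtocol());
        out.put(column + "_authority", url.getAuthority());
        out.put(column + "_host", url.getHost());
        out.put(column + "_port", url.getPort());      // -1 when the URL has no explicit port
        out.put(column + "_path", url.getPath());
        out.put(column + "_filename", url.getFile());  // assumption: path plus "?query"
        out.put(column + "_query", url.getQuery());
        return out;
    }

    public static void main(String[] args) {
        String sample =
            "http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING";
        splitUrl("url", sample).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Under these assumptions, the sample URL in main yields protocol http, authority example.com:80, host example.com, port 80, path /docs/books/tutorial/index.html, and query name=networking. The fragment (#DOWNLOADING) is not among the seven parts. Whether the directive's filename column matches getFile() exactly should be checked against the directive itself.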


Example

Using this record as an example:

{ "url": "http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING" }

Applying this directive:

split-url :url

results in this record:

When the URL field in the record is null:

the directive will generate:


Created in 2020 by Google Inc.