Split URL directive
The SPLIT-URL directive splits a URL into protocol, authority, host, port, path, filename, and query.
Syntax
split-url :column
The column is a column containing the URL.
Usage Notes
The SPLIT-URL directive will parse the URL into its constituents. Upon splitting the URL, the directive creates seven new columns by appending to the original column name:
column_protocolcolumn_authoritycolumn_hostcolumn_portcolumn_pathcolumn_filenamecolumn_query
If the URL cannot be parsed correctly, an exception is throw. If the URL column does not exist, columns with a null value are added to the record.
Examples
Using this record as an example:
{
"url": "http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING"
}Applying this directive:
split-url :urlresults in this record:
{
"url": "http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING",
"url_protocol": "http",
"url_authority": "example.com:80",
"url_host": "example.com",
"url_port": 80,
"url_path": "/docs/books/tutorial/index.html",
"url_filename": "/docs/books/tutorial/index.html?name=networking",
"url_query": "name=networking"
}When the URL field in the record is null:
{
"url": null
}the directive will generate:
{
"url": null,
"url_protocol": null,
"url_authority": null,
"url_host": null,
"url_port": null,
"url_path": null,
"url_filename": null,
"url_query": null
}
Created in 2020 by Google Inc.