Webserver Log Parser Transformation
Plugin version: 2.11.0
Parses logs from any input source for relevant information such as URI, IP, browser, device, HTTP status code, and timestamp.
Use this transform when you need to parse log entries. For example, you can read log files from S3 using the Amazon S3 source, parse them with the Webserver Log Parser transformation, and then store the IP and URI information in a Cube dataset.
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Log Format | No | Required. Log format to parse. Default is CLF. |
Input Name | No | Optional. Name of the field in the input schema which encodes the log information. The given field must be of type |
Output Schema | No | Required. The output schema for the data. |
Conditions
If an error dataset is configured, all erroneous rows present in the input are committed to that error dataset. If no error dataset is configured, the pipeline still completes, but with warnings in the logs.
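The routing rule above can be illustrated with a minimal Python sketch. It is not the plugin's implementation: `route_rows`, `parse_status`, and the list standing in for an error dataset are all hypothetical names chosen for illustration, and the sketch assumes that rows which fail to parse are simply left out of the output.

```python
import logging

logger = logging.getLogger("logparser_sketch")


def route_rows(rows, parse, error_dataset=None):
    """Parse each row; failures go to error_dataset if one is given,
    otherwise they are only reported as warnings in the log."""
    parsed = []
    for row in rows:
        try:
            parsed.append(parse(row))
        except ValueError as exc:
            if error_dataset is not None:
                # Error dataset configured: keep the bad row for inspection.
                error_dataset.append({"row": row, "error": str(exc)})
            else:
                # No error dataset: the run still completes, with a warning.
                logger.warning("Could not parse row %r: %s", row, exc)
    return parsed


def parse_status(row):
    # Hypothetical stand-in parser: accepts only an integer status code.
    return int(row)
```

Calling `route_rows(["200", "oops", "404"], parse_status, errors)` with `errors = []` collects the one bad row in `errors`; omitting the third argument produces a warning instead.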
Example
This example looks for an input schema field named ‘body’ and attempts to parse the Combined Log Format entries found in that field for the URI, IP, browser, device, HTTP status code, and timestamp:
Property | Value |
---|---|
Log Format | CLF |
Input Name | body |
The Webserver Log Parser transformation will emit records with this schema:
field name | type |
---|---|
uri | string |
ip | string |
browser | string |
device | string |
httpStatus | int |
ts | long |
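To make the mapping from a log line to this schema concrete, here is a rough Python sketch that parses one Combined Log Format entry. It is an illustration under stated assumptions, not the plugin's code: `parse_combined` and the regular expression are hypothetical, `ts` is assumed to be milliseconds since the Unix epoch, and the raw User-Agent string is returned under a made-up `userAgent` key because deriving `browser` and `device` from it would need a user-agent parsing library that is not shown here.

```python
import re
from datetime import datetime

# Combined Log Format:
#   host ident user [time] "method uri protocol" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+)(?: \S+)?" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_combined(line):
    """Parse one Combined Log Format line into a dict, or raise ValueError."""
    match = COMBINED_RE.match(line.strip())
    if match is None:
        raise ValueError("not a Combined Log Format entry")
    # Example timestamp: 10/Oct/2000:13:55:36 -0700
    when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "uri": match.group("uri"),
        "ip": match.group("ip"),
        "userAgent": match.group("agent"),  # hypothetical field, see lead-in
        "httpStatus": int(match.group("status")),
        "ts": int(when.timestamp() * 1000),  # assumed unit: epoch milliseconds
    }


SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
record = parse_combined(SAMPLE)
```

For the sample line, `record["uri"]` is `/apache_pb.gif`, `record["ip"]` is `127.0.0.1`, and `record["httpStatus"]` is the integer `200`. A line that does not match the pattern raises `ValueError`, which is the kind of row that would count as erroneous.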
Created in 2020 by Google Inc.