Requirement | Use Case | Approach |
---|
Add new column | BigQuery schema: a, b, c New schema: a, b, c, d Write data for column d. | ALLOW_FIELD_ADDITION value can be set to SchemaUpdateOptions in JobConfigurationLoad, will always add nullable field. |
Remove existing column | BigQuery schema: a int, b int not null New schema: a int | |
Change column mode to less restrictive | BigQuery schema: a int not null New schema: a int | ALLOW_FIELD_RELAXATION value can be set to SchemaUpdateOptions in JobConfigurationLoad. |
Change column mode to more restrictive | BigQuery schema: a int New schema: a int not null | BigQuery does not allow column mode change to more restrictive, only full table reload can be used in this case. |
Change column type | BigQuery schema: a int New schema: a double | BigQuery does not allow changing column type, only full table reload can be used in this case. |
Full table reload | Remove existing data, load new data, possibly change schema | WRITE_TRUNCATE value can be set to WriteDisposition in JobConfigurationLoad. This option will always overwrite the schema along with the existing data. |
Use custom default for missing columns | BigQuery schema: a int, b int New schema: a int Add b column with custom default. | BigQuery Schema does not support default notion for columns, null value is written if value for column is missing. This can be handled using transformation plugin. |
Add custom default to new column for already existing data | BigQuery has data in columns a and b. New column c is added. Need to insert data into column c for already existing data. | BigQuery does not support default notion for columns. If new column is added, its value in existing rows will be null. In order to change it, only full table reload can be used. |