CloudSQL sink can write duplicates

Description

I have seen cases where a task is retried after a transient failure, which results in duplicate rows being written to the Postgres table. It seems that the sink writes records as it receives them instead of committing everything at the end. That may be desirable in situations where you don't want to hold a lock for a long time, but the behavior should at least be configurable.
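To illustrate the suspected failure mode, here is a minimal sketch that contrasts committing after every record with committing once at the end. It is not the sink's actual code: it uses Python's built-in sqlite3 as a stand-in for Postgres, and the table name `events`, the helpers `write_records` and `run_with_retry`, and the `commit_each` flag are all hypothetical names made up for this example.

```python
import sqlite3


def write_records(conn, records, fail_at=None, commit_each=True):
    """Insert records; raise a simulated transient failure before index fail_at."""
    try:
        for i, rec in enumerate(records):
            if fail_at is not None and i == fail_at:
                raise RuntimeError("simulated transient failure")
            conn.execute("INSERT INTO events (payload) VALUES (?)", (rec,))
            if commit_each:
                # Incremental mode: each record becomes durable immediately.
                conn.commit()
        # Single-transaction mode: everything becomes durable here, or not at all.
        conn.commit()
    except Exception:
        # Discards only uncommitted work; already committed rows stay in the table.
        conn.rollback()
        raise


def run_with_retry(commit_each):
    """Run one failing attempt plus one retry and return the final row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (payload TEXT)")
    conn.commit()
    records = ["a", "b", "c"]
    try:
        write_records(conn, records, fail_at=2, commit_each=commit_each)
    except RuntimeError:
        # The task is retried from the start, as a retried task would be.
        write_records(conn, records, commit_each=commit_each)
    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return count


print(run_with_retry(commit_each=True))   # 5 rows: "a" and "b" are written twice
print(run_with_retry(commit_each=False))  # 3 rows: the failed attempt was rolled back
```

In this sketch the single-transaction mode avoids the duplicates at the cost of holding the transaction open for the whole write, which is the trade-off a configuration option would expose.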

Another way to potentially solve this is to allow upserts (which would be a useful feature anyway) instead of only inserts.
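As a rough sketch of the upsert idea, the example below re-runs the same write twice and ends up with one row per key. It assumes the target table has a primary key or unique constraint to conflict on. It uses Python's built-in sqlite3 (SQLite 3.24 or newer) as a stand-in; Postgres accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form. The table `users`, its columns, and the helpers `upsert_all` and `demo` are hypothetical and are not part of the sink.

```python
import sqlite3

# On a key collision, overwrite the existing row instead of inserting a second one.
UPSERT_SQL = """
INSERT INTO users (id, name) VALUES (?, ?)
ON CONFLICT (id) DO UPDATE SET name = excluded.name
"""


def upsert_all(conn, rows):
    """Write all rows with upsert semantics, then commit."""
    conn.executemany(UPSERT_SQL, rows)
    conn.commit()


def demo():
    """Write the same rows twice (simulating a retry) and return the table contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    rows = [(1, "alice"), (2, "bob")]
    upsert_all(conn, rows)
    upsert_all(conn, rows)  # simulated retry of the same task
    result = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    conn.close()
    return result


print(demo())  # [(1, 'alice'), (2, 'bob')]
```

With upserts a retried write is idempotent for keyed tables, so duplicates are avoided even if the sink keeps committing incrementally.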

Release Notes

None

Activity

Albert Shau
March 25, 2021, 4:25 PM

I’m not sure whether this is specific to the CloudSQL PostgreSQL sink or whether it also applies to the other DB sinks.

Assignee

Bhooshan Mogal

Reporter

Albert Shau

Labels

Product Requirement Doc

None

Reviewer

None

Dev Complete Date

None

Publish Date

None

Docs Impact

None

UX Impact

None

Fix versions

Priority

Major