Description

The Bigtable source plugin is broken in 6.3, possibly due to the protobuf version bump in this change: https://github.com/data-integrations/google-cloud/commit/9265fd6607bcb932d62b0a0f7164246d640e1f97#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8


The same pipeline works fine in 6.2.3.

Stack trace below:


java.lang.Exception: org.apache.hadoop.hbase.util.ByteStringer.wrap([B)Lorg/apache/hadoop/hbase/shaded/com/google/protobuf/ByteString;
    at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$6(AbstractContext.java:629) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:584) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:621) ~[na:na]
    at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.initialize(SparkRuntimeService.java:433) ~[io.cdap.cdap.cdap-spark-core2_2.11-6.3.0.jar:na]
    at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.startUp(SparkRuntimeService.java:208) ~[io.cdap.cdap.cdap-spark-core2_2.11-6.3.0.jar:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$5$1.run(SparkRuntimeService.java:404) [io.cdap.cdap.cdap-spark-core2_2.11-6.3.0.jar:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_275]
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.ByteStringer.wrap([B)Lorg/apache/hadoop/hbase/shaded/com/google/protobuf/ByteString;
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:1060) ~[na:na]
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:596) ~[na:na]
    at io.cdap.plugin.gcp.bigtable.source.BigtableSource.getConfiguration(BigtableSource.java:178) ~[na:na]
    at io.cdap.plugin.gcp.bigtable.source.BigtableSource.prepareRun(BigtableSource.java:104) ~[na:na]
    at io.cdap.plugin.gcp.bigtable.source.BigtableSource.prepareRun(BigtableSource.java:63) ~[na:na]
    at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:51) ~[na:na]
    at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[na:na]
    at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:50) ~[na:na]
    at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:36) ~[na:na]
    at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:71) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$execute$3(AbstractContext.java:539) ~[na:na]
    at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:224) ~[na:na]
    at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:211) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:536) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:524) ~[na:na]
    at io.cdap.cdap.app.runtime.spark.BasicSparkClientContext.execute(BasicSparkClientContext.java:333) ~[io.cdap.cdap.cdap-spark-core2_2.11-6.3.0.jar:na]
    at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:69) ~[na:na]
    at io.cdap.cdap.etl.common.submit.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:149) ~[na:na]
    at io.cdap.cdap.etl.spark.AbstractSparkPreparer.prepare(AbstractSparkPreparer.java:87) ~[na:na]
    at io.cdap.cdap.etl.spark.batch.SparkPreparer.prepare(SparkPreparer.java:88) ~[na:na]
    at io.cdap.cdap.etl.spark.batch.ETLSpark.initialize(ETLSpark.java:120) ~[na:na]
    at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:131) ~[na:na]
    at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:33) ~[na:na]
    at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$2.initialize(SparkRuntimeService.java:167) ~[io.cdap.cdap.cdap-spark-core2_2.11-6.3.0.jar:na]
    at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$2.initialize(SparkRuntimeService.java:162) ~[io.cdap.cdap.cdap-spark-core2_2.11-6.3.0.jar:na]
    at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$6(AbstractContext.java:624) ~[na:na]
    ... 7 common frames omitted
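A NoSuchMethodError like the one above usually means the class was resolved from an unexpected jar on the classpath. A generic diagnostic sketch (not part of the fix; the class name and the WhichJar helper are illustrative) for checking which jar a class is actually loaded from:

```java
import java.security.CodeSource;

public class WhichJar {
    /** Returns a human-readable description of where a class was loaded from. */
    static String locate(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return c.getName() + " loaded from "
                + (src == null ? "the bootstrap class path" : src.getLocation());
    }

    public static void main(String[] args) throws Exception {
        // On the affected cluster this would be run with the suspect class:
        //   org.apache.hadoop.hbase.util.ByteStringer
        // We default to a JDK class here so the snippet is self-contained.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        System.out.println(locate(Class.forName(name)));
    }
}
```

Running it with org.apache.hadoop.hbase.util.ByteStringer on the affected classpath would show whether the class came from hbase-protocol or from a shaded-client jar.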

Release Notes

The Bigtable source plugin that was broken in 6.3 is now fixed.

Activity

Greeshma Swaminathan March 15, 2021 at 11:36 PM

google-cloud plugin version 0.16.1 is available in the Hub as a workaround for 6.3.

Greeshma Swaminathan March 11, 2021 at 6:11 PM
Edited

For posterity: the Bigtable source plugin was broken regardless of the sink when running with Spark 2 and MapReduce. ITN did not catch this because the Spark compatibility was set to Spark 1 (see PLUGIN-613).

Greeshma Swaminathan February 24, 2021 at 12:33 AM

Cherry-picked for 6.3.1.

Greeshma Swaminathan February 24, 2021 at 12:23 AM

Greeshma Swaminathan February 23, 2021 at 9:37 PM

org.apache.hadoop.hbase.util.ByteStringer was being loaded from hbase-protocol-1.4.12.jar, pulled in transitively by the hbase-common 1.4.12 dependency. This class should instead come from hbase-shaded-client-1.4.13 to avoid the dependency on unshaded Google protobuf.
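Assuming a standard Maven build, one way the dependency change described above could look (a sketch; the exact artifact coordinates and versions in the plugin's pom may differ) is to exclude hbase-protocol from hbase-common and depend on hbase-shaded-client explicitly:

```xml
<!-- Hypothetical pom.xml fragment; versions mirror those named in the comment above. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-common</artifactId>
  <version>1.4.12</version>
  <exclusions>
    <!-- Keep the unshaded-protobuf ByteStringer off the classpath. -->
    <exclusion>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-protocol</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <!-- Provides ByteStringer backed by the shaded protobuf ByteString. -->
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-shaded-client</artifactId>
  <version>1.4.13</version>
</dependency>
```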


Details

Reporter

Greeshma Swaminathan

Priority

Blocker

Created February 22, 2021 at 3:06 AM
Updated March 15, 2021 at 11:36 PM