Description
ExternalAccountCredentials declares the field protected transient HttpTransportFactory transportFactory, which becomes null once the object has been serialized and deserialized. The design intent appears to be described in #67, but the implementation in ExternalAccountCredentials lacks the crucial part, quoted below:
"When serializing an option object we only transmit the class name for the transport factory and try to instantiate the factory from its classname upon deserialization."
The same problem was seen and fixed in #132. I believe we need to bring the same fix to ExternalAccountCredentials.
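The approach quoted above can be sketched in plain Java as follows. This is only an illustration of the pattern, not the library's code: `TransportFactory`, `DefaultTransportFactory`, and `SketchCredentials` are hypothetical stand-ins, and the sketch assumes the factory class has a public no-argument constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for HttpTransportFactory (not the library's interface).
interface TransportFactory {
  String create();
}

// Hypothetical factory; the public no-arg constructor lets it be rebuilt by class name.
class DefaultTransportFactory implements TransportFactory {
  public DefaultTransportFactory() {}

  @Override
  public String create() {
    return "default-transport";
  }
}

// Hypothetical credentials class: only the factory's class name is serialized.
class SketchCredentials implements Serializable {
  private static final long serialVersionUID = 1L;

  private final String transportFactoryClassName;
  private transient TransportFactory transportFactory;

  SketchCredentials(TransportFactory factory) {
    this.transportFactory = factory;
    this.transportFactoryClassName = factory.getClass().getName();
  }

  TransportFactory getTransportFactory() {
    return transportFactory;
  }

  // Called by Java serialization; restores the transient field from the class name.
  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    try {
      transportFactory =
          (TransportFactory)
              Class.forName(transportFactoryClassName).getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IOException("Cannot re-create " + transportFactoryClassName, e);
    }
  }

  // Serializes and deserializes the object in memory.
  static SketchCredentials roundTrip(SketchCredentials c)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(c);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (SketchCredentials) in.readObject();
    }
  }
}

class SketchCredentialsDemo {
  public static void main(String[] args) throws Exception {
    SketchCredentials restored =
        SketchCredentials.roundTrip(new SketchCredentials(new DefaultTransportFactory()));
    System.out.println(restored.getTransportFactory().create());
  }
}
```

One limitation of this pattern is that a factory supplied as a lambda or anonymous class generally cannot be re-instantiated from its class name, so a real fix would need a fallback for that case.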
More details
NPE happens at the following call site:
google-auth-library-java/oauth2_http/java/com/google/auth/oauth2/AwsCredentials.java, line 151 at commit 98fc7e1:
HttpRequestFactory requestFactory = transportFactory.create().createRequestFactory();
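The failure mode can be reproduced without the library. The following is a minimal sketch using hypothetical names (`ReproFactory`, `ReproCredentials`), relying only on the standard Java behavior that transient fields are left at their default value (null) after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for HttpTransportFactory.
interface ReproFactory {
  String create();
}

// Hypothetical credentials class with a transient factory and no readObject hook.
class ReproCredentials implements Serializable {
  private static final long serialVersionUID = 1L;

  private transient ReproFactory transportFactory;

  ReproCredentials(ReproFactory factory) {
    this.transportFactory = factory;
  }

  // Mirrors the failing call site: dereferences the transient field.
  String refresh() {
    return transportFactory.create();
  }

  // Serializes and deserializes the object in memory.
  static ReproCredentials roundTrip(ReproCredentials c)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(c);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (ReproCredentials) in.readObject();
    }
  }
}

class ReproDemo {
  public static void main(String[] args) throws Exception {
    ReproCredentials original = new ReproCredentials(() -> "transport");
    System.out.println(original.refresh()); // works before serialization

    ReproCredentials restored = ReproCredentials.roundTrip(original);
    try {
      restored.refresh();
    } catch (NullPointerException e) {
      System.out.println("NPE after round trip: transient field is null");
    }
  }
}
```

In the Spark case reported here, the round trip presumably happens when the credentials object is shipped from the driver to the executors.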
Full stack trace below:
com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: UNAUTHENTICATED: Failed computing credential metadata
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:116)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.ExceptionResponseObserver.onErrorImpl(ExceptionResponseObserver.java:82)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.StateCheckingResponseObserver.onError(StateCheckingResponseObserver.java:84)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcDirectStreamController$ResponseObserverAdapter.onClose(GrpcDirectStreamController.java:148)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:546)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:489)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:453)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:486)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:567)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:71)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:735)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:716)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Suppressed: java.lang.RuntimeException: Asynchronous task failed
at com.google.cloud.bigquery.connector.common.StreamCombiningIterator.hasNext(StreamCombiningIterator.java:152)
at com.google.cloud.bigquery.connector.common.ReadRowsResponseInputStreamEnumeration.loadNextResponse(ReadRowsResponseInputStreamEnumeration.java:57)
at com.google.cloud.bigquery.connector.common.ReadRowsResponseInputStreamEnumeration.<init>(ReadRowsResponseInputStreamEnumeration.java:37)
at com.google.cloud.spark.bigquery.v2.context.ArrowColumnBatchPartitionReaderContext.makeSingleInputStream(ArrowColumnBatchPartitionReaderContext.java:234)
at com.google.cloud.spark.bigquery.v2.context.ArrowColumnBatchPartitionReaderContext.<init>(ArrowColumnBatchPartitionReaderContext.java:224)
at com.google.cloud.spark.bigquery.v2.context.ArrowInputPartitionContext.createPartitionReaderContext(ArrowInputPartitionContext.java:89)
at com.google.cloud.spark.bigquery.v2.Spark32BigQueryPartitionReaderFactory.createColumnarReader(Spark32BigQueryPartitionReaderFactory.java:21)
at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:79)
at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hasNext(Unknown Source)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:968)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:205)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:138)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1516)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
... 3 more
Caused by: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: UNAUTHENTICATED: Failed computing credential metadata
at com.google.cloud.spark.bigquery.repackaged.io.grpc.Status.asRuntimeException(Status.java:537)
... 17 more
Caused by: java.lang.NullPointerException: Cannot invoke "com.google.cloud.spark.bigquery.repackaged.com.google.auth.http.HttpTransportFactory.create()" because "this.transportFactory" is null
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.AwsCredentials.retrieveResource(AwsCredentials.java:213)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.AwsCredentials.retrieveResource(AwsCredentials.java:202)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.AwsCredentials.getAwsRegion(AwsCredentials.java:338)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.AwsCredentials.retrieveSubjectToken(AwsCredentials.java:173)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.AwsCredentials.refreshAccessToken(AwsCredentials.java:152)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.OAuth2Credentials$1.call(OAuth2Credentials.java:269)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.OAuth2Credentials$1.call(OAuth2Credentials.java:266)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at com.google.cloud.spark.bigquery.repackaged.com.google.auth.oauth2.OAuth2Credentials$RefreshTask.run(OAuth2Credentials.java:633)
... 3 more