Description
Thanks for stopping by to let us know something could be better!
PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.
Please run down the following list and make sure you've tried the usual "quick fixes":
- Search the issues already opened: https://github.com/googleapis/python-bigquery/issues
- Search StackOverflow: https://stackoverflow.com/questions/tagged/google-cloud-platform+python
If you are still having issues, please be sure to include as much information as possible:
Environment details
(bigquery_reproduce) jupyter@gke-db-executor:~/bigquery_reproduce$ python --version
Python 3.9.0
(bigquery_reproduce) jupyter@gke-db-executor:~/bigquery_reproduce$ pip --version
pip 23.1.2 from /opt/conda/envs/bigquery_reproduce/lib/python3.9/site-packages/pip (python 3.9)
(bigquery_reproduce) jupyter@gke-db-executor:~/bigquery_reproduce$ pip show google-cloud-bigquery
Name: google-cloud-bigquery
Version: 3.11.0
Summary: Google BigQuery API client library
Home-page: https://github.com/googleapis/python-bigquery
Author-email: googleapis-packages@google.com
Location: /opt/conda/envs/bigquery_reproduce/lib/python3.9/site-packages
Requires: google-api-core, google-cloud-core, google-resumable-media, grpcio, packaging, proto-plus, protobuf, python-dateutil, requests
(bigquery_reproduce) jupyter@gke-db-executor:~/bigquery_reproduce$ pip show google-cloud-bigquery-storage
Name: google-cloud-bigquery-storage
Version: 2.20.0
Summary: Google Cloud Bigquery Storage API client library
Home-page: https://github.com/googleapis/python-bigquery-storage
Author-email: googleapis-packages@google.com
License: Apache 2.0
Location: /opt/conda/envs/bigquery_reproduce/lib/python3.9/site-packages
Requires: google-api-core, proto-plus, protobuf
(bigquery_reproduce) jupyter@gke-db-executor:~/bigquery_reproduce$ pip list
Package Version
----------------------------- --------
cachetools 5.3.1
certifi 2023.5.7
charset-normalizer 3.1.0
db-dtypes 1.1.1
google-api-core 2.11.0
google-auth 2.19.1
google-cloud-bigquery 3.11.0
google-cloud-bigquery-storage 2.20.0
google-cloud-core 2.3.2
google-crc32c 1.5.0
google-resumable-media 2.5.0
googleapis-common-protos 1.59.0
grpcio 1.55.0
grpcio-status 1.55.0
idna 3.4
numpy 1.24.3
packaging 23.1
pandas 2.0.2
pip 23.1.2
proto-plus 1.22.2
protobuf 4.23.2
pyarrow 12.0.0
pyasn1 0.5.0
pyasn1-modules 0.3.0
python-dateutil 2.8.2
pytz 2023.3
requests 2.31.0
rsa 4.9
setuptools 67.7.2
six 1.16.0
tzdata 2023.3
urllib3 1.26.16
wheel 0.40.0
Steps to reproduce
When BigQuery returns an empty result set, the Python SDK still has to build an Arrow schema from the BigQuery schema (https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/table.py#L1844-L1853), so it calls bq_to_arrow_data_type for each field (https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/_pandas_helpers.py#L225-L246).
It looks like there is currently no JSON entry in that type mapping: https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/_pandas_helpers.py#L147-L162
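A minimal sketch of the gap, using the private _pandas_helpers module (internal API, shown for illustration only, so it may change between releases):

from google.cloud import bigquery
from google.cloud.bigquery import _pandas_helpers

json_field = bigquery.SchemaField("field2", "JSON")
# No "JSON" entry in the scalar mapping, so this returns None; the caller then
# emits the "Unable to determine type" warning and leaves the Arrow schema unset.
print(_pandas_helpers.bq_to_arrow_data_type(json_field))  # -> None

With no Arrow schema and no record batches to infer one from, pyarrow.Table.from_batches has nothing to work with, which produces the ValueError below.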
Code example
Error:
/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/_pandas_helpers.py:267: UserWarning: Unable to determine type for field 'json'.
  warnings.warn("Unable to determine type for field '{}'.".format(bq_field.name))
Traceback (most recent call last):
  File "/home/jupyter/smapi/test_bq.py", line 12, in <module>
    df = bqexecutor.execute_query(sql)
  File "/home/jupyter/smapi/smapi/core/data_providers/sql/bq_executor.py", line 119, in execute_query
    querydf = rows.to_dataframe()
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 2151, in to_dataframe
    progress_bar_type=progress_bar_type,
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1840, in to_arrow
    return pyarrow.Table.from_batches(record_batches)
  File "pyarrow/table.pxi", line 3747, in pyarrow.lib.Table.from_batches
ValueError: Must pass schema, or at least one RecordBatch
Python version: 3.9.0
Package versions:
- google-cloud-bigquery 3.10.0
- google-cloud-bigquery-storage 2.19.1
- pyarrow 8.0.0
- pandas 1.5.3
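For reference, the final call in the traceback fails for any empty batch list when no schema is passed, which is exactly the empty-result case:

import pyarrow

# Zero record batches and no explicit schema: pyarrow cannot infer a table layout.
pyarrow.Table.from_batches([])
# ValueError: Must pass schema, or at least one RecordBatch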
How to reproduce?
test_bq.py
from google.cloud import bigquery

client = bigquery.Client()

table_id = "gdw-dev-gdml.abehsu.reproduce_issue"

# Define schema
schema = [
    bigquery.SchemaField("field1", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("field2", "JSON", mode="REQUIRED"),
]

table = bigquery.Table(table_id, schema=schema)
table = client.create_table(table)  # Make an API request.
print(
    "Created table {}.{}.{}".format(table.project, table.dataset_id, table.table_id)
)

# Perform a query against the (empty) table.
QUERY = f"""
select * from {table_id}
"""

query_job = client.query(QUERY)  # API request
df = query_job.to_dataframe()
print(df)
python test_bq.py
Created table gdw-dev-gdml.abehsu.reproduce_issue
Stack trace
/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/_pandas_helpers.py:267: UserWarning: Unable to determine type for field 'field2'.
  warnings.warn("Unable to determine type for field '{}'.".format(bq_field.name))
Traceback (most recent call last):
  File "/home/jupyter/smapi/test_bq.py", line 41, in <module>
    df = query_job.to_dataframe()
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/job/query.py", line 1800, in to_dataframe
    return query_result.to_dataframe(
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 2150, in to_dataframe
    record_batch = self.to_arrow(
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1848, in to_arrow
    return pyarrow.Table.from_batches(record_batches, schema=arrow_schema)
  File "pyarrow/table.pxi", line 3747, in pyarrow.lib.Table.from_batches
ValueError: Must pass schema, or at least one RecordBatch
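Until a proper mapping lands, a stopgap on the caller's side could be to patch the scalar lookup table linked above so JSON columns materialize as Arrow strings. This touches a private module, so treat it as a sketch of a workaround, not a supported fix:

import pyarrow
from google.cloud.bigquery import _pandas_helpers

# Assumption: BQ_TO_ARROW_SCALARS is the mapping linked above; its values are
# pyarrow type factories (e.g. pyarrow.bool_), so we register the factory
# pyarrow.string itself, not an instance.
_pandas_helpers.BQ_TO_ARROW_SCALARS["JSON"] = pyarrow.string

With this in place, to_dataframe() on an empty JSON result should return an empty DataFrame with a string-typed column instead of raising.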
Making sure to follow these steps will guarantee the quickest resolution possible.
Thanks!