Closed
Labels
api: bigquery (Issues related to the googleapis/python-bigquery API.), external (This issue is blocked on a bug with the actual product.), status: blocked (Resolving the issue is dependent on other work.), type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)
Description
Currently, load jobs accept only a single string value in the null_marker param. It would be great if this could also take an array of strings, since a CSV may contain several string representations that should all be loaded as null, e.g.
LoadJobConfig(
    schema=schema,
    skip_leading_rows=1,
    source_format=SourceFormat.CSV,
    write_disposition=WriteDisposition.WRITE_TRUNCATE,
    null_marker=['NA', '', '#NA', 'na'],
)
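Until something like the above is supported, one client-side workaround (a sketch, not part of the library) is to rewrite every null-marker string to a single canonical marker before upload, then pass that one value as null_marker. The marker set and the canonical value below are illustrative:

```python
import csv
import io

# Illustrative set of strings we want BigQuery to treat as NULL.
NULL_MARKERS = {"NA", "", "#NA", "na"}


def normalize_nulls(csv_text: str, markers=NULL_MARKERS, canonical="\\N") -> str:
    """Rewrite every cell matching one of `markers` to the single
    `canonical` marker, so the load job only needs null_marker=canonical."""
    out = io.StringIO()
    writer = csv.writer(out)
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow(canonical if cell in markers else cell for cell in row)
    return out.getvalue()
```

The normalized text can then be uploaded with `null_marker="\\N"` in the LoadJobConfig; only one marker reaches BigQuery, so the existing single-string parameter suffices.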
An alternative that many have proposed is to load all of the data into a staging table and run a process that "cleans" it, writing the corrected values into a second, final table. This works, but it adds time and an extra pipeline step; supporting multiple null markers would be a simple way to solve common problems with loading CSVs into BigQuery under a strict schema (e.g. a field with a numeric data type).
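The staging-plus-cleaning alternative can be sketched by generating a SQL expression that maps each marker string to NULL. The table names, column name, and helper below are hypothetical, for illustration only:

```python
# Hypothetical helper: build a SQL expression that converts any of the
# marker strings in a column to NULL when copying staging -> final.
NULL_MARKERS = ["NA", "", "#NA", "na"]


def null_clean_expr(column: str, markers=NULL_MARKERS) -> str:
    in_list = ", ".join(f"'{m}'" for m in markers)
    return f"IF({column} IN ({in_list}), NULL, {column}) AS {column}"


# Example cleaning query over an assumed staging table:
query = f"""
CREATE OR REPLACE TABLE dataset.final AS
SELECT {null_clean_expr('amount')}
FROM dataset.staging
"""
```

Each column that may contain marker strings needs its own expression, which is exactly the per-column bookkeeping a multi-valued null_marker would remove.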