
[Bigquery] LoadJobConfiguration does not correctly handle preserveAsciiControlCharacters #3873

@edgao


It looks like LoadJobConfiguration doesn't pass CsvOptions.preserveAsciiControlCharacters through to the BigQuery API correctly. Even with this flag set to true, BigQuery still rejects CSV files containing null bytes.

(preserveAsciiControlCharacters is documented here - https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfigurationload)

Environment details

  1. Bigquery API
  2. OS type and version: MacOS Sequoia 15.5
  3. Java version: 21
  4. Version(s): com.google.cloud:libraries-bom 26.57.0 and 26.62.0 (26.62.0 is the latest version as of 2025/06/30)

Steps to reproduce

  1. Create a one-column, one-row CSV file containing a null character (e.g. fo<null>o) and upload it to GCS (example: test.csv)
  2. Create a BigQuery table: create table foo.bar (the_column STRING)
  3. Run a load job with setPreserveAsciiControlCharacters(true) from the GCS file into the table (see code example below)
  4. BigQuery fails the job with: Bad character (ASCII 0) encountered.
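
For anyone reproducing step 1, here is a minimal sketch of one way to generate the test file locally. It assumes the value fo<null>o from the step above, no header row, and a trailing newline; the class name NullByteCsv and the default file name test.csv are only illustrative. Uploading the file to GCS is left to whichever tool you normally use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class NullByteCsv {
    // One row, one column: the value "fo\0o" (contains ASCII 0), then a newline.
    static byte[] csvBytes() {
        return new byte[] {'f', 'o', 0, 'o', '\n'};
    }

    // Writes the bytes as-is so the null byte is not altered by any text encoding step.
    static Path write(Path target) throws IOException {
        return Files.write(target, csvBytes());
    }

    public static void main(String[] args) throws IOException {
        // The output path is an assumption; pass a different one as the first argument.
        Path out = write(Path.of(args.length > 0 ? args[0] : "test.csv"));
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```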

By contrast, the same load succeeds with the bq CLI (this can be verified with select the_column, to_code_points(the_column) from foo.bar):

bq load --source_format=CSV \
  --preserve_ascii_control_characters \
  foo.bar \
  gs://<snip>/test.csv \
  'the_column:STRING'
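
As a cross-check for the verification query, this small sketch computes the code points one would expect to_code_points(the_column) to return if the null byte was preserved. It assumes the loaded value is exactly fo<null>o; the class name ExpectedCodePoints is illustrative.

```java
import java.util.Arrays;

public class ExpectedCodePoints {
    // Returns the Unicode code points of the given string, in order.
    static int[] codePoints(String value) {
        return value.codePoints().toArray();
    }

    public static void main(String[] args) {
        // "\0" is the null character (ASCII 0) sitting between the two o's.
        System.out.println(Arrays.toString(codePoints("fo\0o")));
        // prints [102, 111, 0, 111]
    }
}
```

A 0 in the third position of the query result indicates that the control character survived the load.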

Code example

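// All classes below are from com.google.cloud.bigquery.
// Assumes application default credentials are configured.
BigQuery bigQueryClient = BigQueryOptions.getDefaultInstance().getService();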
TableId tableId = TableId.of("foo", "bar");
String gcsUri = "gs://path/to/the/file.csv";

CsvOptions csvOptions =
    CsvOptions.newBuilder()
        .setPreserveAsciiControlCharacters(true)
        .build();

LoadJobConfiguration configuration =
    LoadJobConfiguration.builder(tableId, gcsUri)
        .setFormatOptions(csvOptions)
        .setSchema(Schema.of(Field.of("the_column", StandardSQLTypeName.STRING)))
        .setWriteDisposition(JobInfo.WriteDisposition.WRITE_APPEND)
        .setJobTimeoutMs(600000L)
        .build();

Job job = bigQueryClient.create(JobInfo.of(configuration));
job.waitFor();

Stack trace

For more details see Big Query Error collection: BigQueryError{reason=invalid, location=null, message=Error while reading data, error message: CSV processing encountered too many errors, giving up. Rows: 8; errors: 1; max bad: 0; error percent: 0},
 BigQueryError{reason=invalid, location=gs://<snip>.csv, message=Error while reading data, error message: Bad character (ASCII 0) encountered.; line_number: 2 byte_offset_to_start_of_line: 243 column_index: 6 column_name: "string" column_type: STRING value: "f\000oo" File: gs://<snip>.csv}:


Thanks!
