[Bug]: Managed IO for Iceberg cannot query by partition · Issue #33497 · apache/beam · GitHub
@saathwik-tk

Description

What happened?

Let's say field1 (string type) is my partition field, and the table has many other fields: field2, field3, field4, ...
Querying with "select * from iceberg_table where field1 = 'field1_example_value';" returns no data,
but querying with "select * from iceberg_table where field2 = 'field2_example_value';" returns the data as expected.
I ingested the data through Managed IO for Iceberg:
pipeline.apply(Managed.write(Managed.ICEBERG).withConfig(config_map));
Please look into this issue.
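
For anyone trying to reproduce this, here is a minimal sketch of what the config map passed to withConfig might look like. The report does not show the actual config_map, so every value below (table name, catalog name, catalog type, warehouse path) is a hypothetical placeholder. The keys "table", "catalog_name" and "catalog_properties" are the ones I believe the Managed Iceberg sink accepts; please check them against the Managed IO docs for your Beam version. The Beam call itself is left as a comment so the snippet compiles without the SDK on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class IcebergWriteConfigSketch {
    // Builds a config map of the kind handed to
    // Managed.write(Managed.ICEBERG).withConfig(...).
    // All values are hypothetical placeholders, not taken from the report.
    static Map<String, Object> buildConfig() {
        Map<String, String> catalogProps = new HashMap<>();
        catalogProps.put("type", "hadoop");              // assumed catalog type
        catalogProps.put("warehouse", "/tmp/warehouse"); // assumed warehouse location

        Map<String, Object> config = new HashMap<>();
        config.put("table", "db.iceberg_table");         // assumed table identifier
        config.put("catalog_name", "example_catalog");   // assumed catalog name
        config.put("catalog_properties", catalogProps);
        return config;
    }

    public static void main(String[] args) {
        Map<String, Object> configMap = buildConfig();
        // With the Beam SDK available, the write from the report would be:
        // rows.apply(Managed.write(Managed.ICEBERG).withConfig(configMap));
        System.out.println(configMap.get("table"));
    }
}
```

Sharing the real config_map along with how the table's partition spec on field1 was created would make the report easier to reproduce.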

Issue Priority

Priority: 1 (data loss / total loss of function)
