1
Module 7
Ingesting New Datasets into Google BigQuery
In this module we will:
• Query from External Data Sources
• Avoid Data Ingesting Pitfalls
• Ingest New Data into Permanent Tables
• Discuss Streaming Inserts
© 2017 Google Inc. All rights reserved. Google and the Google logo are trademarks of Google Inc. All other
company and product names may be trademarks of the respective companies with which they are associated.
2
Ingest data permanently into BigQuery from a variety of formats
[Diagram labels: Cloud Storage, Google Drive, Cloud Dataflow, Cloud Dataprep, Cloud Bigtable; formats CSV, JSON, Avro, … and more formats; BigQuery Query Engine; BigQuery Managed Storage]
Data ingested into BigQuery is stored in permanent tables, and this storage is scalable and fully managed.
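As a rough illustration of loading a file into a permanent table, the sketch below builds the JSON body that could be sent to the BigQuery REST API `jobs.insert` method. The function name, project, dataset, table, and bucket values are hypothetical placeholders, and the choice to append rows and to skip one CSV header row is an assumption made for the example, not something the slide specifies.

```python
import json

# Map the formats named on the slide to load-job sourceFormat values.
SOURCE_FORMATS = {
    "csv": "CSV",
    "json": "NEWLINE_DELIMITED_JSON",
    "avro": "AVRO",
}


def build_load_job(project, dataset, table, gcs_uri, file_format):
    """Return a jobs.insert request body that loads one Cloud Storage file.

    Hypothetical helper: it only builds the request body and sends nothing.
    """
    fmt = file_format.lower()
    load = {
        "sourceUris": [gcs_uri],
        "sourceFormat": SOURCE_FORMATS[fmt],
        "destinationTable": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        # Assumption: append to the permanent table instead of replacing it.
        "writeDisposition": "WRITE_APPEND",
    }
    if fmt == "csv":
        # Assumption: the file has a single header row.
        load["skipLeadingRows"] = 1
        load["autodetect"] = True
    elif fmt == "json":
        load["autodetect"] = True
    # Avro files carry their own schema, so no autodetect flag is set.
    return {"configuration": {"load": load}}


if __name__ == "__main__":
    body = build_load_job(
        "my-project", "my_dataset", "new_table",
        "gs://my-bucket/data.csv", "csv",
    )
    print(json.dumps(body, indent=2))
```

The body would be posted by an authenticated client; once the job completes, the rows live in BigQuery managed storage.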
3
BigQuery can query external data sources in GCS and Drive directly
[Diagram labels: Cloud Storage, Google Drive, Cloud Dataflow, Cloud Dataprep, Cloud Bigtable; formats CSV, JSON, Avro; BigQuery Query Engine; BigQuery Managed Storage]
You can query external data sources directly from BigQuery, which bypasses managed storage.
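To make the idea of an external data source concrete, the sketch below builds a table resource whose `externalDataConfiguration` points at a CSV file in Cloud Storage, as could be sent to the REST API `tables.insert` method. The helper name and all identifiers are hypothetical, and schema autodetection with one skipped header row is an assumption for the example.

```python
import json


def build_external_table(project, dataset, table, gcs_uri):
    """Return a tables.insert body for a table backed by a CSV file in GCS.

    Hypothetical helper: the data stays in Cloud Storage and is read at
    query time, so nothing is copied into BigQuery managed storage.
    """
    return {
        "tableReference": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        "externalDataConfiguration": {
            "sourceUris": [gcs_uri],
            "sourceFormat": "CSV",
            # Assumption: let BigQuery infer the schema from the file.
            "autodetect": True,
            # Assumption: the file has a single header row.
            "csvOptions": {"skipLeadingRows": 1},
        },
    }


if __name__ == "__main__":
    resource = build_external_table(
        "my-project", "my_dataset", "external_sales",
        "gs://my-bucket/sales.csv",
    )
    print(json.dumps(resource, indent=2))
```

Once such a table exists, it can be queried by name like any other table, but each query rereads the underlying file.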
4
Pitfalls: Querying from External Data Sources Directly
Limitations
● Strong Performance Disadvantages
● Data Consistency not Guaranteed
● Can’t Use Table Wildcards (cool feature we will introduce shortly)
5
Streaming records into BigQuery through the API
[Diagram labels: Streaming Record Inserts; BigQuery Managed Storage; BigQuery Query Engine]
Streaming data allows you to query data without waiting for a full batch load.
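As a sketch of what a streaming insert looks like at the API level, the code below builds the request path and body for the REST API `tabledata.insertAll` method. The helper names and sample rows are hypothetical. Deriving each `insertId` from a hash of the row content is an assumption of this example, chosen so that a retried request carries the same IDs and BigQuery can attempt best-effort de-duplication.

```python
import hashlib
import json


def insert_all_path(project, dataset, table):
    """Return the REST path of the tabledata.insertAll method for a table."""
    return (
        f"/bigquery/v2/projects/{project}/datasets/{dataset}"
        f"/tables/{table}/insertAll"
    )


def build_insert_all_body(rows):
    """Wrap plain dict rows in a tabledata.insertAll request body.

    Hypothetical helper: it only builds the body and sends nothing.
    """
    wrapped = []
    for row in rows:
        # Assumption: identical rows are duplicates, so hash the content.
        digest = hashlib.sha256(
            json.dumps(row, sort_keys=True).encode("utf-8")
        ).hexdigest()
        wrapped.append({"insertId": digest, "json": row})
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        # Reject the whole request if any row is invalid.
        "skipInvalidRows": False,
        "ignoreUnknownValues": False,
        "rows": wrapped,
    }


if __name__ == "__main__":
    sample = [
        {"user_id": 1, "event": "login"},
        {"user_id": 2, "event": "purchase"},
    ]
    print(insert_all_path("my-project", "my_dataset", "events"))
    print(json.dumps(build_insert_all_body(sample), indent=2))
```

Rows sent this way become available to queries shortly after the request succeeds, without a batch load job.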
6
Summary: Ingest new datasets into BigQuery managed storage
GCP offers you the ability to:
● Store data permanently in BigQuery, where it is fully managed (performance, backups, redundancy).
● Load new data from a variety of formats.
● Set up streaming ingestion into BigQuery through APIs.
7
Lab 6
Ingesting and Querying New Datasets
Ingesting and Querying New Datasets
In this lab, you will ingest new data sources into Google BigQuery and learn how to query external data sources directly.