This document describes the steps to set up your first working project with Google BigQuery. The platform authenticates to Google BigQuery using a Google Cloud service account credential. Because Google BigQuery authentication differs from other warehouses, read How Google BigQuery authentication differs from other warehouses. For Hybrid SaaS deployments, read the Hybrid SaaS BigQuery setup guide.

How Google BigQuery authentication differs from other warehouses

For most warehouses, authentication is configured directly on the environment itself. For example, Snowflake environments typically use username/password or key-pair authentication configured as part of the warehouse connection.

Google BigQuery doesn’t follow this model. Instead, it uses Google Cloud credentials provided as JSON Google Cloud service account key files. The Google Cloud service account acts as a principal when accessing Google Cloud resources. For more information, read the Google Cloud documentation on service accounts.

Because of this, Google BigQuery environments don’t contain warehouse authentication settings directly. To be fully configured, a Google BigQuery environment must have access to a Google Cloud service account credential. For Full SaaS deployments, the credential is supplied as a configured cloud credential on the environment.

Example Google Cloud service account key

The following is an example of a Google Cloud service account key structure:
{
  "type": "service_account",
  "project_id": "example-project",
  "private_key_id": "1234567890abcdef1234567890abcdef12345678",
  "private_key": "-----BEGIN PRIVATE KEY-----\nEXAMPLEKEY\n-----END PRIVATE KEY-----\n",
  "client_email": "matillion-sa@example-project.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/matillion-sa%40example-project.iam.gserviceaccount.com"
}
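Before uploading a key file as a cloud credential, it can help to check that it has the expected shape. The following is a minimal sketch, assuming the fields shown in the example above are the ones worth checking. The function name `check_service_account_key` and the choice of required fields are illustrative assumptions, not part of any product or Google API.

```python
import json

# Fields taken from the example key above. Treating these as required is an
# assumption of this sketch, not a documented rule.
REQUIRED_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "token_uri",
)


def check_service_account_key(raw_json: str) -> list[str]:
    """Return a list of problems found in a key file; empty if none were found."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]

    # Service account keys carry the literal type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")

    # Service account addresses end in gserviceaccount.com.
    email = key.get("client_email")
    if isinstance(email, str) and email and not email.endswith("gserviceaccount.com"):
        problems.append("client_email does not look like a service account address")
    return problems
```

An empty list means the key passed these structural checks only. It does not prove that the key is still active or that the service account has the required IAM roles.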

Prerequisites

Google BigQuery requirements

Connectivity requirements


Permissions

The Google Cloud service account must have IAM roles or permissions sufficient for the operations performed against your data. Typical operations include:
  • Create, update, and delete tables and views.
  • Query tables and views.
  • Retrieve metadata for datasets, tables, and views.
  • List projects, datasets, tables, and views.
  • Insert or load data into tables.
  • Run Google BigQuery jobs.
Depending on your use case, Google recommends assigning a combination of the following roles. At a minimum, grant either roles/bigquery.jobUser or roles/bigquery.user, as both include the bigquery.jobs.create permission required for the service account to interact with BigQuery. For BigQuery IAM guidance specific to runner deployments, read GCP IAM permissions for runner deployment.
Role                         Purpose
roles/bigquery.jobUser       Submit and run BigQuery jobs. Must be project-level; can’t be scoped to a dataset.
roles/bigquery.user          Run BigQuery jobs, and query data.
roles/bigquery.dataEditor    Read and write data. Only required if the pipeline writes back to BigQuery.
roles/bigquery.dataViewer    Read data from BigQuery datasets and tables.
roles/bigquery.admin         Full administrative access to Google BigQuery resources.
Use the principle of least privilege wherever possible.
For the full list of Google BigQuery IAM roles and permissions, read Access control.
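As an illustration of least privilege with the roles above, the following sketch picks one possible role combination and prints the `gcloud projects add-iam-policy-binding` commands that would grant it at project level. The helper names `minimal_bigquery_roles` and `grant_role_command`, the project ID, and the service account address are hypothetical; the combination chosen is an assumption that may not fit every use case.

```python
import shlex


def minimal_bigquery_roles(writes_data: bool) -> list[str]:
    """Pick one possible least-privilege combination from the roles table.

    Assumption: roles/bigquery.jobUser covers running jobs, and a single data
    role covers table access. Adjust to match your own use case.
    """
    roles = ["roles/bigquery.jobUser"]
    # dataEditor is only needed when pipelines write back to BigQuery.
    roles.append("roles/bigquery.dataEditor" if writes_data else "roles/bigquery.dataViewer")
    return roles


def grant_role_command(project_id: str, service_account_email: str, role: str) -> str:
    """Build a gcloud command that grants a role to a service account at project level."""
    args = [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account_email}",
        f"--role={role}",
    ]
    return shlex.join(args)


if __name__ == "__main__":
    # Placeholder values taken from the example key earlier in this document.
    for role in minimal_bigquery_roles(writes_data=True):
        print(grant_role_command(
            "example-project",
            "matillion-sa@example-project.iam.gserviceaccount.com",
            role,
        ))
```

The script only prints the commands, so they can be reviewed before anyone runs them. The data roles can also be granted on individual datasets rather than on the whole project, which narrows access further.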

Google Cloud Storage permissions

Many Google BigQuery workflows use Google Cloud Storage (GCS) as a staging location before loading data into Google BigQuery. If your pipelines interact with GCS buckets, the Google Cloud service account also requires appropriate Storage IAM permissions. For more information, read Basic roles. Commonly used roles include:
Role                          Purpose
roles/storage.objectViewer    Read staged files.
roles/storage.objectCreator   Upload staged files.
roles/storage.objectAdmin     Full access to bucket objects.
For more information about IAM permissions, read Google Cloud IAM permissions for runner deployment.
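The Storage roles above can be expressed as IAM policy bindings, each pairing one role with a list of members. The following sketch builds such bindings for a staging bucket from a description of what the pipelines need. The function name `staging_bindings` and its parameters are hypothetical, and the mapping from needs to roles is an assumption based on the table above.

```python
import json


def staging_bindings(
    service_account_email: str,
    reads: bool = False,
    uploads: bool = False,
    needs_full_access: bool = False,
) -> list[dict]:
    """Build IAM policy bindings for a staging bucket.

    Assumption: objectAdmin is used only when full access to objects is
    needed; otherwise the narrower viewer and creator roles are combined.
    """
    if needs_full_access:
        roles = ["roles/storage.objectAdmin"]
    else:
        roles = []
        if reads:
            roles.append("roles/storage.objectViewer")
        if uploads:
            roles.append("roles/storage.objectCreator")
    member = f"serviceAccount:{service_account_email}"
    return [{"role": role, "members": [member]} for role in roles]


if __name__ == "__main__":
    bindings = staging_bindings(
        "matillion-sa@example-project.iam.gserviceaccount.com",
        reads=True,
        uploads=True,
    )
    print(json.dumps({"bindings": bindings}, indent=2))
```

Granting these roles on the staging bucket only, rather than on the whole project, keeps the service account's Storage access limited to the files it stages.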

Setup steps

  1. Register for an account.
  2. Create accounts for the users and admins who will be active on the platform.
  3. Create a project, making the following choices:
    • Select managed.
  4. Create an environment, and configure a cloud credential using your Google Cloud service account key.
  5. Select BigQuery defaults for your environment, such as the default GCP project and dataset.
  6. Create a Git branch in which to begin pipeline work.
  7. Create your first pipeline.