Guide for Iceberg Integration (Snowflake on GCP)

This guide outlines the steps required to configure your Snowflake environment on Google Cloud Platform (GCP) to share data with Optimove using Iceberg tables. This modern data sharing architecture allows Optimove to securely and efficiently access required data directly from your cloud storage, giving you full control and visibility.

Prerequisites

PrerequisiteDescription
GCS BucketA Google Cloud Storage bucket to store Iceberg metadata and data files.
Snowflake External VolumeA Snowflake object that stores the GCP service account identity used for reading from and writing to the GCS bucket.
Optional: Open Catalog (Polaris)A REST catalog that Snowflake can use to register and discover Iceberg tables.
Snowflake Catalog IntegrationAn object in your Snowflake account that instructs Snowflake on how to communicate with the remote catalog.

Client-Side Setup Steps

1.1 Create or Choose a GCS Bucket

First, please provision a Google Cloud Storage (GCS) bucket. This bucket will securely store the data and metadata files for the Iceberg tables you intend to share.

1.2 Create a Writable External Volume

In Snowflake, create an external volume that points to your GCS bucket. This will generate a service account that needs permission to access the bucket.

CREATE EXTERNAL VOLUME client_ext_vol
     STORAGE_LOCATIONS = ((
       NAME = 'primary'
       STORAGE_PROVIDER = 'GCS'
       STORAGE_BASE_URL = 'gcs://<bucket>/data/'
     ))
     ALLOW_WRITES = TRUE;

-- Verify the volume is writable
CALL SYSTEM$VERIFY_EXTERNAL_VOLUME('client_ext_vol');
GRANT USAGE ON EXTERNAL VOLUME client_ext_vol TO ROLE <writer_role>;

After creating the volume, run DESC EXTERNAL VOLUME client_ext_vol; and copy the STORAGE_GCP_SERVICE_ACCOUNT value. This is the service account identity you will grant IAM permissions to in the next step.

1.3 Grant IAM Permissions to the GCS Bucket

In the Google Cloud Console, grant the service account from the previous step access to your GCS bucket. You can create a custom IAM role with the following permissions:

  • storage.objects.get
  • storage.objects.list
  • storage.objects.create
  • storage.objects.delete
  • storage.objects.update
  • storage.buckets.get

1.4 (Optional) Create an Open Catalog

If you use a REST-based open catalog like Polaris, you can integrate it with Snowflake. This allows for easier discovery and management of your Iceberg tables.

1.5 Create a Catalog Integration

Create a catalog integration to connect Snowflake to your Iceberg catalog.

CREATE OR REPLACE CATALOG INTEGRATION client_catalog_int
  CATALOG_SOURCE = POLARIS
  TABLE_FORMAT   = ICEBERG
  REST_CONFIG = (
    CATALOG_URI  = 'https://<polaris-endpoint>'
    CATALOG_NAME = '<catalog-name>'
  )
  REST_AUTHENTICATION = (
    TYPE                = OAUTH
    OAUTH_CLIENT_ID     = '<client-id>'
    OAUTH_CLIENT_SECRET = '<client-secret>'
    OAUTH_ALLOWED_SCOPES = ('PRINCIPAL_ROLE:ALL')
  )
  ENABLED = TRUE;

GRANT USAGE ON INTEGRATION client_catalog_int TO ROLE <writer_role>;

1.6 (Optional) Create a Catalog-Linked Database

For accounts on Snowflake Enterprise Edition or higher, you can link the catalog integration to a database. This enables automatic discovery of new tables.

CREATE DATABASE client_lake
  CATALOG = 'client_catalog_int'
  EXTERNAL_VOLUME = 'client_ext_vol';

1.7 Create Iceberg Tables

You can now create your Iceberg tables. If you did not link the database in the previous step, you must specify the EXTERNAL_VOLUME and CATALOG parameters in your CREATE TABLE statement.

Partitioning Recommendation: It is strongly recommended to define a PARTITION BY clause when creating Iceberg tables for efficient data processing. Please coordinate with the Optimove onboarding team to determine the best partitioning strategy for your data.

CREATE ICEBERG TABLE SAMPLE_TABLE (
  ID INT,
  NAME STRING,
  CREATED_AT DATE
)
PARTITION BY (CREATED_AT)
EXTERNAL_VOLUME = 'client_ext_vol'
CATALOG        = 'client_catalog_int';

1.8 Share the Following with Optimove

To finalize the integration, please provide the Optimove team with the following details:

ItemValue
GCS pathgcs://<bucket>/data/
External volume nameclient_ext_vol
Catalog integration nameclient_catalog_int
Table liste.g. client_db.sample_schema.sample_table
Snowflake catalog nameclient_catalog
Snowflake GCP service account email<long‑id>@gcp-sa-snowflake-prod.iam.gserviceaccount.com

Reference Documentation

For more detailed technical information, you can refer to the official Snowflake documentation: