Guide for Iceberg Integration (Snowflake on GCP)
This guide outlines the steps required to configure your Snowflake environment on Google Cloud Platform (GCP) to share data with Optimove using Iceberg tables. This modern data sharing architecture allows Optimove to securely and efficiently access required data directly from your cloud storage, giving you full control and visibility.
Prerequisites
| Prerequisite | Description |
|---|---|
| GCS Bucket | A Google Cloud Storage bucket to store Iceberg metadata and data files. |
| Snowflake External Volume | A Snowflake object that stores the GCP service account identity used for reading from and writing to the GCS bucket. |
| Optional: Open Catalog (Polaris) | A REST catalog that Snowflake can use to register and discover Iceberg tables. |
| Snowflake Catalog Integration | An object in your Snowflake account that instructs Snowflake on how to communicate with the remote catalog. |
Client-Side Setup Steps
1.1 Create or Choose a GCS Bucket
First, please provision a Google Cloud Storage (GCS) bucket. This bucket will securely store the data and metadata files for the Iceberg tables you intend to share.
1.2 Create a Writable External Volume
In Snowflake, create an external volume that points to your GCS bucket. This will generate a service account that needs permission to access the bucket.
CREATE EXTERNAL VOLUME client_ext_vol
STORAGE_LOCATIONS = ((
NAME = 'primary'
STORAGE_PROVIDER = 'GCS'
STORAGE_BASE_URL = 'gcs://<bucket>/data/'
))
ALLOW_WRITES = TRUE;
-- Verify the volume is writable
CALL SYSTEM$VERIFY_EXTERNAL_VOLUME('client_ext_vol');
GRANT USAGE ON EXTERNAL VOLUME client_ext_vol TO ROLE <writer_role>;After creating the volume, run DESC EXTERNAL VOLUME client_ext_vol; and copy the STORAGE_GCP_SERVICE_ACCOUNT value. This is the service account identity you will grant IAM permissions to in the next step.
1.3 Grant IAM Permissions to the GCS Bucket
In the Google Cloud Console, grant the service account from the previous step access to your GCS bucket. You can create a custom IAM role with the following permissions:
storage.objects.getstorage.objects.liststorage.objects.createstorage.objects.deletestorage.objects.updatestorage.buckets.get
1.4 (Optional) Create an Open Catalog
If you use a REST-based open catalog like Polaris, you can integrate it with Snowflake. This allows for easier discovery and management of your Iceberg tables.
1.5 Create a Catalog Integration
Create a catalog integration to connect Snowflake to your Iceberg catalog.
CREATE OR REPLACE CATALOG INTEGRATION client_catalog_int
CATALOG_SOURCE = POLARIS
TABLE_FORMAT = ICEBERG
REST_CONFIG = (
CATALOG_URI = 'https://<polaris-endpoint>'
CATALOG_NAME = '<catalog-name>'
)
REST_AUTHENTICATION = (
TYPE = OAUTH
OAUTH_CLIENT_ID = '<client-id>'
OAUTH_CLIENT_SECRET = '<client-secret>'
OAUTH_ALLOWED_SCOPES = ('PRINCIPAL_ROLE:ALL')
)
ENABLED = TRUE;
GRANT USAGE ON INTEGRATION client_catalog_int TO ROLE <writer_role>;1.6 (Optional) Create a Catalog-Linked Database
For accounts on Snowflake Enterprise Edition or higher, you can link the catalog integration to a database. This enables automatic discovery of new tables.
CREATE DATABASE client_lake
CATALOG = 'client_catalog_int'
EXTERNAL_VOLUME = 'client_ext_vol';1.7 Create Iceberg Tables
You can now create your Iceberg tables. If you did not link the database in the previous step, you must specify the EXTERNAL_VOLUME and CATALOG parameters in your CREATE TABLE statement.
Partitioning Recommendation: It is strongly recommended to define a PARTITION BY clause when creating Iceberg tables for efficient data processing. Please coordinate with the Optimove onboarding team to determine the best partitioning strategy for your data.
CREATE ICEBERG TABLE SAMPLE_TABLE (
ID INT,
NAME STRING,
CREATED_AT DATE
)
PARTITION BY (CREATED_AT)
EXTERNAL_VOLUME = 'client_ext_vol'
CATALOG = 'client_catalog_int';1.8 Share the Following with Optimove
To finalize the integration, please provide the Optimove team with the following details:
| Item | Value |
|---|---|
| GCS path | gcs://<bucket>/data/ |
| External volume name | client_ext_vol |
| Catalog integration name | client_catalog_int |
| Table list | e.g. client_db.sample_schema.sample_table |
| Snowflake catalog name | client_catalog |
| Snowflake GCP service account email | <long‑id>@gcp-sa-snowflake-prod.iam.gserviceaccount.com |
Reference Documentation
For more detailed technical information, you can refer to the official Snowflake documentation:
Updated about 8 hours ago
