Guide for Iceberg Integration (Databricks)

This guide outlines the steps required to configure your Databricks environment to share data with Optimove using Iceberg tables. Please follow the instructions for the cloud provider that hosts your Databricks workspace (Azure, GCP, or AWS).

For Databricks on Azure

1. Prepare Storage

  • Choose or create an Azure Data Lake Storage (ADLS) Gen2 account and a container for the data.
  • Ensure that blob versioning is enabled on the storage account.
  • Note the container URL, which will be in the format azure://<account>.blob.core.windows.net/<container>/ (for example, azure://myacct.blob.core.windows.net/databricks/).

2. Create or Convert Tables to Iceberg Format

Within your Databricks workspace, create or alter your tables with the necessary properties to enable Iceberg compatibility.

For new tables:

CREATE TABLE uc_iceberg_table_test(c1 INT) TBLPROPERTIES(
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

For existing Delta tables:

ALTER TABLE my_catalog.my_schema.old_delta_tbl
SET TBLPROPERTIES('delta.universalFormat.enabledFormats' = 'iceberg');
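Note that on some Databricks Runtime versions, enabling Iceberg reads on an existing Delta table also requires column mapping and IcebergCompatV2. If the ALTER statement above is rejected for that reason, the following sketch (using the REORG ... APPLY (UPGRADE UNIFORM ...) syntax from the Databricks UniForm documentation; the table name is illustrative) may help:

```sql
-- Upgrade an existing Delta table to IcebergCompatV2 and enable UniForm in one step.
-- REORG rewrites any data files that are incompatible with Iceberg readers.
REORG TABLE my_catalog.my_schema.old_delta_tbl
  APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION = 2));
```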

3. Create an Identity for Snowflake Access

Create a service principal in Microsoft Entra ID (formerly Azure Active Directory) or a Personal Access Token (PAT) in your Databricks workspace. This credential will be used by Optimove to authenticate to the Unity Catalog API.

4. Grant Unity Catalog Permissions

Grant the service principal or user associated with the token SELECT permissions on each table you plan to share.

GRANT SELECT ON TABLE my_catalog.my_schema.uc_iceberg_table_test TO `<sp-app-id>`;
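SELECT on the table alone is typically not sufficient in Unity Catalog: the principal also needs USE CATALOG and USE SCHEMA on the parent objects to resolve the table. A sketch, using the same placeholder names as above:

```sql
-- Allow the principal to resolve the catalog and schema containing the shared table
GRANT USE CATALOG ON CATALOG my_catalog TO `<sp-app-id>`;
GRANT USE SCHEMA ON SCHEMA my_catalog.my_schema TO `<sp-app-id>`;

-- Verify the effective grants on the table
SHOW GRANTS ON TABLE my_catalog.my_schema.uc_iceberg_table_test;
```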

5. Share the Following with Optimove

To finalize the integration, please provide the Optimove team with the following details:

  • Fully-qualified table names (e.g., my_catalog.my_schema.uc_iceberg_table_test)
  • The PAT or service principal secret
  • The ADLS Gen2 storage container URL
  • Your Azure Tenant ID
  • Your Databricks external location name (if you are not using 'default')

For Databricks on GCP

1. Prepare Storage

  • Choose or create a Google Cloud Storage (GCS) bucket.
  • Ensure that object versioning is enabled on the bucket.
  • Note the bucket URI, which will be in the format gs://my-bucket/.

2. Create or Convert Tables to Iceberg Format

Within your Databricks workspace, create or alter your tables with the necessary properties to enable Iceberg compatibility.

For new tables:

CREATE TABLE uc_iceberg_table_test(c1 INT) TBLPROPERTIES(
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

For existing Delta tables:

ALTER TABLE my_catalog.my_schema.old_delta_tbl
SET TBLPROPERTIES('delta.universalFormat.enabledFormats' = 'iceberg');
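To confirm that UniForm/Iceberg is active on a table before sharing it, you can inspect its properties; a quick check using the example table name from above:

```sql
-- The Iceberg-related Delta properties set earlier should appear in the output
SHOW TBLPROPERTIES my_catalog.my_schema.uc_iceberg_table_test;

-- DESCRIBE EXTENDED also surfaces table metadata, including the storage location
DESCRIBE EXTENDED my_catalog.my_schema.uc_iceberg_table_test;
```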

3. Create an Identity for Snowflake Access

Create a Databricks service principal or a Personal Access Token (PAT) in your Databricks workspace. This credential will be used by Optimove to authenticate to the Unity Catalog API.

4. Grant Unity Catalog Permissions

Grant the service principal or user associated with the token SELECT permissions on each table you plan to share.

GRANT SELECT ON TABLE my_catalog.my_schema.uc_iceberg_table_test TO `<sp-app-id>`;

5. Share the Following with Optimove

To finalize the integration, please provide the Optimove team with the following details:

  • Fully-qualified table names (e.g., my_catalog.my_schema.uc_iceberg_table_test)
  • The PAT or service principal secret
  • Your GCS bucket URI

For Databricks on AWS

1. Prepare Storage

  • Sign in to your AWS account.
  • Choose or create an Amazon S3 bucket in the same region as your Databricks workspace.
  • Enable Bucket Versioning on the bucket; versioning protects the table's data and metadata files from accidental deletion or overwrite and allows them to be recovered.

2. Create or Convert Tables to Iceberg Format

Within your Databricks workspace, create or alter your tables with the necessary properties to enable Iceberg compatibility.

For new tables:

CREATE TABLE uc_iceberg_table_test(c1 INT) TBLPROPERTIES(
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

For existing Delta tables:

ALTER TABLE my_catalog.my_schema.old_delta_tbl
SET TBLPROPERTIES('delta.universalFormat.enabledFormats' = 'iceberg');

3. Create an Identity for Snowflake Access

Create a Databricks service principal or a Personal Access Token (PAT) in your Databricks workspace. This credential will be used by Optimove to authenticate to the Unity Catalog API.

4. Grant Unity Catalog Permissions

Grant the service principal or user associated with the token SELECT permissions on each table you plan to share.

GRANT SELECT ON TABLE my_catalog.my_schema.uc_iceberg_table_test TO `<sp-app-id>`;
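As on the other clouds, SELECT on the table alone is typically not sufficient in Unity Catalog: the principal also needs USE CATALOG and USE SCHEMA on the parent objects. A sketch, using the same placeholder names as above:

```sql
-- Allow the principal to resolve the catalog and schema containing the shared table
GRANT USE CATALOG ON CATALOG my_catalog TO `<sp-app-id>`;
GRANT USE SCHEMA ON SCHEMA my_catalog.my_schema TO `<sp-app-id>`;

-- Verify the effective grants on the table
SHOW GRANTS ON TABLE my_catalog.my_schema.uc_iceberg_table_test;
```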

5. Create an IAM Role for S3 Data Access

To securely grant Optimove read access to the underlying data in S3, you will create an IAM role.

Note: To complete this step, you will need two values that the Optimove team will provide after you share your initial details:

  • A Snowflake IAM User ARN (e.g., arn:aws:iam::123456789012:user/...)
  • A unique External ID string

Once you receive these values from Optimove, perform the following actions in the AWS IAM console:

  1. Create a new role with a "Custom trust policy". Paste the JSON below, replacing the placeholder values with the ones provided by Optimove.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::123456789012:user/extvol/UC_EXV_S3_ab12cd345e"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "SNOWFLAKE-ACME-2025-07-31"
            }
          }
        }
      ]
    }
    
  2. Attach a permissions policy to the role that grants read-only access to your S3 bucket.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::customer-bucket",
            "arn:aws:s3:::customer-bucket/*"
          ]
        }
      ]
    }
    
  3. Save the role and copy its ARN.

6. Share the Following with Optimove

To finalize the integration, please provide the Optimove team with the following details:

  • Fully-qualified table names (e.g., my_catalog.my_schema.uc_iceberg_table_test)
  • The PAT or service principal secret for the Unity Catalog API
  • The S3 bucket URI (e.g., s3://customer-bucket/)
  • The ARN of the IAM role you just created