Ingesting Guidewire Self-Managed H2 Tables into Databricks Delta Tables
Prerequisites
1. Import sample data in all Guidewire xCenters.
2. Create an account in Databricks Community Edition, which is free.
Step 1: Query the H2 tables
Step 2: Save the resultsets into CSV files and prefix the file names with clm_
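If you prefer to script Steps 1 and 2 instead of exporting by hand, here is a minimal sketch. It assumes the jaydebeapi package, an H2 driver jar on disk, and hypothetical connection details (URL, user, password, table list) that you would adapt to your own xCenter database:

import csv
import jaydebeapi

# Hypothetical list of xCenter tables to export -- adjust to your needs
tables = ["cc_claim", "cc_contact", "cc_user"]

# Connect to the H2 database (URL, credentials, and jar path are placeholders)
conn = jaydebeapi.connect(
    "org.h2.Driver",
    "jdbc:h2:tcp://localhost/~/ccdb",  # hypothetical H2 connection URL
    ["sa", ""],                        # hypothetical user / password
    "h2-2.2.224.jar",                  # path to the H2 JDBC driver jar
)
try:
    cursor = conn.cursor()
    for table in tables:
        # Step 1: query the H2 table
        cursor.execute(f"SELECT * FROM {table}")
        rows = cursor.fetchall()
        columns = [d[0] for d in cursor.description]
        # Step 2: save the resultset as a CSV file with the clm_ prefix
        with open(f"clm_{table}_h2.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(columns)
            writer.writerows(rows)
        print(f"Exported {table} -> clm_{table}_h2.csv")
finally:
    conn.close()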
Step 3: Upload all of these CSV files into the Databricks FileStore
Click Create Table; the upload pop-up shown below will open, where you can upload all the CSV files.
Step 4: Check Catalog --> FileStore --> all the clm_ prefixed files are available!
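As a quick sanity check, you can also list the uploaded files from a notebook cell (dbutils is available by default in Databricks notebooks):

# List everything under FileStore and keep only the clm_ prefixed CSV files
clm_files = [f.name for f in dbutils.fs.ls("dbfs:/FileStore/")
             if f.name.startswith("clm_") and f.name.endswith(".csv")]
print(clm_files)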
Step 5: Ingest the clm_ prefixed CSV files; they become Delta tables in Databricks and are then moved to a new Guidewire raw database
The PySpark code is below:
import os
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType
from pyspark.sql.functions import col

# Initialize Spark session (usually pre-initialized in Databricks)
spark = SparkSession.builder.appName("CreateTablesFromCSV").getOrCreate()

# Define the base path for FileStore
base_path = "dbfs:/FileStore/"

# Define file prefixes to look for
file_prefixes = ["clm_"]

# Iterate over file prefixes and create tables based on file names
for prefix in file_prefixes:
    # List all files in the FileStore
    csv_files = dbutils.fs.ls(base_path)

    # Filter for CSV files that match the prefix
    prefix_files = [f.path for f in csv_files if f.name.startswith(prefix) and f.name.endswith(".csv")]

    # Load each CSV file and create a table
    for file_path in prefix_files:
        # Extract the table name from the file name (remove ".csv")
        table_name = os.path.basename(file_path).replace(".csv", "")

        # Load the CSV file into a DataFrame
        df = spark.read.format("csv").option("header", "true").load(file_path)

        # Change the ID column data type from string to Integer, if it exists
        if 'ID' in df.columns:
            df = df.withColumn("ID", col("ID").cast(IntegerType()))

        # Create a table with the same name as the file (without .csv extension)
        df.write.format("delta").mode("overwrite").saveAsTable(table_name)

        # Print success message
        print(f"Table '{table_name}' created successfully from file '{file_path}'.")
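Note: only the ID column is cast here; alternatively, spark.read's inferSchema option (.option("inferSchema", "true")) can infer types for every column, at the cost of an extra pass over each file.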
Log output:
Table 'clm_cc_claim_h2' created successfully from file 'dbfs:/FileStore/clm_cc_claim_h2.csv'.
Table 'clm_cc_contact_h2' created successfully from file 'dbfs:/FileStore/clm_cc_contact_h2.csv'.
Table 'clm_cc_credential_h2' created successfully from file 'dbfs:/FileStore/clm_cc_credential_h2.csv'.
Table 'clm_cc_group_h2' created successfully from file 'dbfs:/FileStore/clm_cc_group_h2.csv'.
Table 'clm_cc_user_h2' created successfully from file 'dbfs:/FileStore/clm_cc_user_h2.csv'.
Table 'clm_cctl_claimstate_h2' created successfully from file 'dbfs:/FileStore/clm_cctl_claimstate_h2.csv'.
Table 'clm_cctl_grouptype_h2' created successfully from file 'dbfs:/FileStore/clm_cctl_grouptype_h2.csv'.
Table 'clm_cctl_lobcode_h2' created successfully from file 'dbfs:/FileStore/clm_cctl_lobcode_h2.csv'.
Now these tables are available in the default database. Let's create a new Guidewire database and then move all of these clm_ prefixed tables into the Guidewire claims raw database.
The PySpark code for this is below:
from pyspark.sql import SparkSession

# Initialize Spark session (usually pre-initialized in Databricks)
spark = SparkSession.builder.appName("MoveTablesToNewDatabase").getOrCreate()

# Define the new database name
new_database_name = "gw_claims_raw_db"

# Step 1: Create a new database (if it doesn't exist)
spark.sql(f"CREATE DATABASE IF NOT EXISTS {new_database_name}")

# Step 2: List the clm_ prefixed tables in the default database
default_tables = [row.name for row in spark.catalog.listTables('default')
                  if row.name.startswith('clm_')]
print(f"Tables found in default database: {default_tables}")

# Step 3: Copy each table from the default database to the new database
for table_name in default_tables:
    # Read the table from the default database
    df = spark.table(f"default.{table_name}")

    # Write the table to the new database
    df.write.format("delta").mode("overwrite").saveAsTable(f"{new_database_name}.{table_name}")

    # Print success message
    print(f"Table '{table_name}' moved successfully from 'default' to '{new_database_name}'.")
The log output looks like this:
Table 'clm_cctl_lobcode_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
The notebook "GW Tables Created and Moved to Guidewire Raw Database v1.0" is published, and it looks like this:
This was all done with the Guidewire Self-Managed version:
- from H2 tables to CSV files
- from CSV files to Delta tables in a Databricks notebook
Finally, create a new notebook and explore the Guidewire claims raw tables with simple SQL queries.
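For example, a few starter queries as a minimal sketch (the table name clm_cc_claim_h2 comes from the ingestion log above; any column names beyond that depend on your ClaimCenter schema):

# List all tables in the Guidewire raw database
spark.sql("SHOW TABLES IN gw_claims_raw_db").show()

# Peek at the first few rows of the claims table
spark.sql("SELECT * FROM gw_claims_raw_db.clm_cc_claim_h2 LIMIT 10").show()

# Row counts per table are a useful first validation after the move
print(spark.table("gw_claims_raw_db.clm_cc_claim_h2").count())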
In the next blog post, we will look at more Databricks SQL queries for the Guidewire claims raw tables.
Additional notes: This post used the Guidewire Self-Managed version. In Guidewire Cloud, the data lands as Parquet files in an S3 bucket, and there are options (such as OSS tools) to convert the Parquet files to CSV.
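As a rough sketch of that cloud variant (the bucket path and table name are placeholders, and the S3 credentials setup is omitted), the Parquet files could even be loaded into Delta tables directly, skipping the CSV step:

# Read Guidewire Cloud parquet files from a hypothetical S3 location ...
df = spark.read.parquet("s3://your-gw-cda-bucket/cc_claim/")

# ... and write them straight into the raw database as a Delta table
df.write.format("delta").mode("overwrite").saveAsTable("gw_claims_raw_db.clm_cc_claim")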
On Guidewire Cloud, enterprise firms typically have a data engineering team that automates loading this kind of data into Delta tables, whereas here we created the CSV files manually.
The data science team can then run ML experiments in an environment like Databricks.