
Guidewire Self-Managed Version: Ingest H2 Tables into Databricks Delta Tables

Prerequisites

1. Import sample data into all Guidewire xCenters.

2. Create an account in Databricks Community Edition (this edition of Databricks is free).



Step 1: Query H2 tables

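Query each xCenter's embedded H2 database and pull the tables needed downstream. As a minimal Python sketch of scripting this step, assuming the H2 JDBC jar and the ClaimCenter database file path on your machine (both paths and the default sa credentials are assumptions here):

import jaydebeapi

# Assumed locations: the H2 driver jar and the ClaimCenter H2 database file
H2_JAR = "/path/to/h2.jar"             # hypothetical jar location
DB_URL = "jdbc:h2:file:/path/to/ccdb"  # hypothetical ClaimCenter DB path

# Default H2 credentials ("sa" / empty password) are an assumption; adjust to your install
conn = jaydebeapi.connect("org.h2.Driver", DB_URL, ["sa", ""], H2_JAR)
curs = conn.cursor()

# Example: pull the claim table that later becomes clm_cc_claim_h2.csv
curs.execute("SELECT * FROM CC_CLAIM")
columns = [d[0] for d in curs.description]
rows = curs.fetchall()
print(columns)
print(rows[:5])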




Step 2: Save the result sets as CSV files and prefix each file name with clm_
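One way to script the export and the clm_ prefixing, continuing from the jaydebeapi cursor in the sketch above (the table list is illustrative):

import csv

# Tables to export; extend this list with whatever your queries need
tables = ["cc_claim", "cc_contact", "cc_user"]

for table in tables:
    curs.execute(f"SELECT * FROM {table.upper()}")
    # Write each result set as clm_<table>_h2.csv so the later
    # Databricks steps can filter on the clm_ prefix
    with open(f"clm_{table}_h2.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([d[0] for d in curs.description])
        writer.writerows(curs.fetchall())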



Step 3: Upload all these CSV files into the Databricks FileStore




Click Create Table; the pop-up below will open, where you can upload all the CSV files.






Step 4: Check in Catalog → FileStore that all the clm_-prefixed files are available!
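The same check can be scripted from a notebook cell instead of clicking through the Catalog UI, using the same dbutils listing call as Step 5:

# List the FileStore and keep only the clm_-prefixed CSV uploads
files = dbutils.fs.ls("dbfs:/FileStore/")
clm_files = sorted(f.name for f in files
                   if f.name.startswith("clm_") and f.name.endswith(".csv"))
for name in clm_files:
    print(name)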



Step 5: Ingest the clm_-prefixed CSV files as Delta tables in Databricks, and migrate them to a new Guidewire raw database

The PySpark code is below:

import os
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType
from pyspark.sql.functions import col

# Initialize Spark session (usually pre-initialized in Databricks)
spark = SparkSession.builder.appName("CreateTablesFromCSV").getOrCreate()

# Define the base path for FileStore
base_path = "dbfs:/FileStore/"

# Define file prefixes to look for
file_prefixes = ["clm_"]

# Iterate over file prefixes and create tables based on file names
for prefix in file_prefixes:
    # List all CSV files that start with the given prefix in the FileStore
    csv_files = dbutils.fs.ls(base_path)
   
    # Filter for CSV files that match the prefix
    prefix_files = [f.path for f in csv_files if f.name.startswith(prefix) and f.name.endswith(".csv")]
   
    # Load each CSV file and create a table
    for file_path in prefix_files:
        # Extract the table name from the file name (remove ".csv")
        table_name = os.path.basename(file_path).replace(".csv", "")
       
        # Load the CSV file into a DataFrame
        df = spark.read.format("csv").option("header", "true").load(file_path)
       
        # Change the ID column data type from string to Integer, if it exists
        if 'ID' in df.columns:
            df = df.withColumn("ID", col("ID").cast(IntegerType()))
       
        # Create a table with the same name as the file (without .csv extension)
        df.write.format("delta").mode("overwrite").saveAsTable(table_name)
       
        # Print success message
        print(f"Table '{table_name}' created successfully from file '{file_path}'.")

Log output:

Table 'clm_cc_claim_h2' created successfully from file 'dbfs:/FileStore/clm_cc_claim_h2.csv'.
Table 'clm_cc_contact_h2' created successfully from file 'dbfs:/FileStore/clm_cc_contact_h2.csv'.
Table 'clm_cc_credential_h2' created successfully from file 'dbfs:/FileStore/clm_cc_credential_h2.csv'.
Table 'clm_cc_group_h2' created successfully from file 'dbfs:/FileStore/clm_cc_group_h2.csv'.
Table 'clm_cc_user_h2' created successfully from file 'dbfs:/FileStore/clm_cc_user_h2.csv'.
Table 'clm_cctl_claimstate_h2' created successfully from file 'dbfs:/FileStore/clm_cctl_claimstate_h2.csv'.
Table 'clm_cctl_grouptype_h2' created successfully from file 'dbfs:/FileStore/clm_cctl_grouptype_h2.csv'.
Table 'clm_cctl_lobcode_h2' created successfully from file 'dbfs:/FileStore/clm_cctl_lobcode_h2.csv'.



 

These tables are now available in the default database. Next, let's create a new Guidewire database and move all the clm_-prefixed tables into it as the Guidewire claims raw tables.

The PySpark code for this is below:

from pyspark.sql import SparkSession

# Initialize Spark session (usually pre-initialized in Databricks)
spark = SparkSession.builder.appName("MoveTablesToNewDatabase").getOrCreate()

# Define the new database name
new_database_name = "gw_claims_raw_db"

# Step 1: Create a new database (if it doesn't exist)
spark.sql(f"CREATE DATABASE IF NOT EXISTS {new_database_name}")

# Step 2: List all tables in the default database
default_tables = [row.name for row in spark.catalog.listTables('default')]

print(f"Tables found in default database: {default_tables}")

# Step 3: Copy each table from default database to the new database
for table_name in default_tables:
    # Read the table from the default database
    df = spark.table(f"default.{table_name}")
   
    # Write the table to the new database
    df.write.format("delta").mode("overwrite").saveAsTable(f"{new_database_name}.{table_name}")
   
    # Print success message
    print(f"Table '{table_name}' moved successfully from 'default' to '{new_database_name}'.")


Log output:


Tables found in default database: ['clm_cc_claim_h2', 'clm_cc_contact_h2', 'clm_cc_credential_h2', 'clm_cc_group_h2', 'clm_cc_user_h2', 'clm_cctl_claimstate_h2', 'clm_cctl_grouptype_h2', 'clm_cctl_lobcode_h2']
Table 'clm_cc_claim_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cc_contact_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cc_credential_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cc_group_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cc_user_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cctl_claimstate_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cctl_grouptype_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
Table 'clm_cctl_lobcode_h2' moved successfully from 'default' to 'gw_claims_raw_db'.
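Note that saveAsTable copies each table, so the originals still exist in default. If a true move is desired, an optional cleanup (not part of the notebook above) is to drop the sources after verifying the copies:

# Optional cleanup: drop the originals from 'default' once the copies
# in gw_claims_raw_db have been verified (reuses default_tables from above)
for table_name in default_tables:
    spark.sql(f"DROP TABLE IF EXISTS default.{table_name}")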



The PySpark code above is published at the link below for reuse:

GW Tables Created and Moved to Guidewire Raw Database v1.0 Published 

It will look like this:


This is done with the Guidewire Self-Managed version:

from H2 tables to CSV files

from CSV files to Delta tables in a Databricks notebook

Finally

Create a new notebook and explore the Guidewire claims raw tables with simple SQL queries.
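For example, a first exploration cell might look like this (the table name follows the log output above):

# Starter queries against the raw claims database
spark.sql("SHOW TABLES IN gw_claims_raw_db").show(truncate=False)

spark.sql("""
    SELECT COUNT(*) AS claim_count
    FROM gw_claims_raw_db.clm_cc_claim_h2
""").show()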

In the next blog post, we will see more Databricks SQL queries for the Guidewire claims raw tables.


Additional notes: This is the Guidewire Self-Managed version. In Guidewire Cloud, the data lands as Parquet files in an S3 bucket, and there are other options, such as OSS tooling, to convert the Parquet files to CSV.
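With Parquet already in S3, the CSV detour can even be skipped; a minimal sketch, assuming hypothetical bucket and table paths:

# Hypothetical paths: read a Parquet export from S3...
df = spark.read.parquet("s3://<your-bucket>/cda/cc_claim/")

# ...and either convert it to CSV for the manual flow above...
(df.coalesce(1)
   .write.option("header", "true")
   .mode("overwrite")
   .csv("dbfs:/FileStore/clm_cc_claim_csv"))

# ...or write it straight to a Delta table, skipping CSV entirely
df.write.format("delta").mode("overwrite").saveAsTable("gw_claims_raw_db.clm_cc_claim")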

For enterprise firms on Guidewire Cloud, a data engineering team would typically automate this kind of CSV-to-Delta load, whereas here we created the CSV files and tables manually. A hedged sketch of what such automation might look like is below.
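A minimal sketch using Databricks Auto Loader, assuming a recent Databricks runtime and hypothetical landing, schema, and checkpoint paths:

# Incrementally ingest any new clm_*.csv files that land in cloud storage
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("header", "true")
      .option("cloudFiles.schemaLocation", "dbfs:/FileStore/_schemas/clm")
      .load("s3://<your-bucket>/landing/clm_*.csv"))

# Process everything currently available, then stop, writing to a Delta table
(df.writeStream
   .option("checkpointLocation", "dbfs:/FileStore/_checkpoints/clm")
   .trigger(availableNow=True)
   .toTable("gw_claims_raw_db.clm_autoloaded"))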

The data science team can then work on ML experiments in the same Databricks environment.




 


