
LLM KRL Model Sample usecase Google Gemma English to Tamil translation with LLM Respect Language models LLM_KRL models


To develop a multi-level contextual classification model for English-to-Tamil sentiment analysis, you can fine-tune one of Google's Gemma models, which may improve performance on Tamil text. Here's a structured approach:

1. Data Preparation:

Dataset Creation: Compile a dataset containing English sentences, their Tamil translations, sentiment labels (positive/negative), and, for positive sentiments, an additional label indicating respect.

Example Data Structure:

| English Text                              | Tamil Translation                          | Sentiment | Respect (if Positive) |
|-------------------------------------------|--------------------------------------------|-----------|-----------------------|
| I am very happy to meet you               | உங்களை சந்திப்பதில் மிகவும் மகிழ்ச்சி      | Positive  | Respect               |
| I am disappointed with your work          | உங்கள் வேலைக்கு நான் வருத்தப்படுகிறேன்     | Negative  |                       |
| You have done an excellent job, well done | நீங்கள் சிறந்த வேலை செய்தீர்கள், நல்லது    | Positive  | Respect               |
| This is not good, I expected better       | இது நல்லதல்ல, நான் நல்லவை எதிர்பார்த்தேன் | Negative  |                       |
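The labelling rule in this table (a respect label only on positive rows) can be checked before training. The sketch below is one possible way to do that; the `Example` class and `validate` function are hypothetical helpers for illustration, and the "not respect" value is assumed from the Stage 2 description rather than taken from the sample rows.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Example:
    english: str
    tamil: str
    sentiment: str          # "positive" or "negative"
    respect: Optional[str]  # "respect", "not respect", or None


def validate(example: Example) -> List[str]:
    """Return the problems found in one row (an empty list means the row is consistent)."""
    problems = []
    if example.sentiment not in ("positive", "negative"):
        problems.append(f"unknown sentiment: {example.sentiment!r}")
    if example.sentiment == "negative" and example.respect is not None:
        problems.append("negative rows should not carry a respect label")
    if example.sentiment == "positive" and example.respect not in ("respect", "not respect"):
        problems.append("positive rows need 'respect' or 'not respect'")
    return problems


rows = [
    Example("I am very happy to meet you",
            "உங்களை சந்திப்பதில் மிகவும் மகிழ்ச்சி", "positive", "respect"),
    Example("I am disappointed with your work",
            "உங்கள் வேலைக்கு நான் வருத்தப்படுகிறேன்", "negative", None),
]

for row in rows:
    print(row.english, "->", validate(row) or "ok")
```

Running a check like this over the full dataset catches rows where the two label columns disagree, which would otherwise silently distort the second-stage training set.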


2. Model Architecture:

Stage 1: Sentiment Classification (Positive/Negative) using Gemma.

Stage 2: For Positive sentiments, classify as Respect/Not Respect.
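The two stages form a simple cascade: the second classifier only runs when the first one returns positive. Here is a minimal sketch of that control flow. The function `classify_two_stage` and the two keyword-based stand-ins are hypothetical and exist only so the example runs without a model; in practice each stage would be a fine-tuned classifier.

```python
from typing import Callable, Optional, Tuple


def classify_two_stage(
    text: str,
    sentiment_clf: Callable[[str], str],
    respect_clf: Callable[[str], str],
) -> Tuple[str, Optional[str]]:
    """Run stage 1, and run stage 2 only when stage 1 says 'positive'."""
    sentiment = sentiment_clf(text)
    if sentiment != "positive":
        return sentiment, None
    return sentiment, respect_clf(text)


# Hypothetical keyword stand-ins for the two models (illustration only).
def toy_sentiment(text: str) -> str:
    negative_cues = ("disappointed", "not good", "regret")
    return "negative" if any(cue in text.lower() for cue in negative_cues) else "positive"


def toy_respect(text: str) -> str:
    respect_cues = ("thank", "grateful", "well done")
    return "respect" if any(cue in text.lower() for cue in respect_cues) else "not respect"


for sentence in ["Thank you very much for your support",
                 "I am disappointed with your work"]:
    print(sentence, "->", classify_two_stage(sentence, toy_sentiment, toy_respect))
```

Keeping the stages as separate callables means either one can be swapped out or retrained without touching the other.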


3. Implementation Steps:

import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

Sample dataset

data = {
    'text_english': [
        "I am very happy to meet you",
        "I am disappointed with your work",
        "You have done an excellent job, well done",
        "This is not good, I expected better",
        "Thank you very much for your support",
        "I don't like your attitude",
        "I'm grateful for your guidance",
        "Your work lacks quality",
        "Well done, you've made us proud",
        "I appreciate your effort",
        "You are an inspiration",
        "I regret working with you",
    ],
    'text_tamil': [
        "உங்களை சந்திப்பதில் மிகவும் மகிழ்ச்சி",
        "உங்கள் வேலைக்கு நான் வருத்தப்படுகிறேன்",
        "நீங்கள் சிறந்த வேலை செய்தீர்கள், நல்லது",
        "இது நல்லதல்ல, நான் நல்லவை எதிர்பார்த்தேன்",
        "உங்கள் உதவிக்காக மிகவும் நன்றி",
        "உங்கள் அணுகுமுறை எனக்கு பிடிக்கவில்லை",
        "உங்கள் வழிகாட்டலுக்கு நான் நன்றி கூறுகிறேன்",
        "உங்கள் வேலை தரம் குறைவாக உள்ளது",
        "நல்ல செய்தி, நீங்கள் எங்களை பெருமைப்படுத்தினீர்கள்",
        "உங்கள் முயற்சியை நான் பாராட்டுகிறேன்",
        "நீங்கள் ஒரு பேரனுபவம்",
        "உங்களுடன் பணியாற்றியது வருத்தமாக உள்ளது",
    ],
    'sentiment': [
        'positive', 'negative', 'positive', 'negative',
        'positive', 'negative', 'positive', 'negative',
        'positive', 'positive', 'positive', 'negative',
    ],
    'respect': [
        'respect', None, 'respect', None,
        'respect', None, 'respect', None,
        'respect', 'respect', 'respect', None,
    ],
}

Convert data to DataFrame

df = pd.DataFrame(data)

Split the data

train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

Initialize tokenizer and model

tokenizer = AutoTokenizer.from_pretrained('google/gemma-2b')
model = AutoModelForSequenceClassification.from_pretrained('google/gemma-2b', num_labels=2)

Encoding function

def encode_data(texts, sentiments):
    # Tokenize the texts into padded tensors and map sentiment strings to 1 (positive) / 0 (negative)
    inputs = tokenizer(texts.tolist(), return_tensors="pt", padding=True, truncation=True)
    labels = torch.tensor([1 if sentiment == "positive" else 0 for sentiment in sentiments])
    return inputs, labels

Encode training and testing data

train_texts, train_labels = encode_data(train_df['text_tamil'], train_df['sentiment'])
test_texts, test_labels = encode_data(test_df['text_tamil'], test_df['sentiment'])

Training arguments

training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy="epoch",  # recent transformers releases rename this argument to eval_strategy
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
)

Trainer

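# The Trainer needs datasets that yield dicts of tensors including a 'labels' key,
# so wrap the tokenized inputs and the label tensor together instead of passing the labels alone.
class SentimentDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: val[idx] for key, val in self.encodings.items()}
        item['labels'] = self.labels[idx]
        return item

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=SentimentDataset(train_texts, train_labels),
    eval_dataset=SentimentDataset(test_texts, test_labels),
)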

Train the model

trainer.train()

Function for multi-level classification

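import random  # only needed for the placeholder respect logic below

def classify_text(text, model, tokenizer):
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    # Move the inputs to the model's device (the Trainer may have placed the model on a GPU)
    inputs = {key: val.to(model.device) for key, val in inputs.items()}
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    sentiment = 'positive' if torch.argmax(probs).item() == 1 else 'negative'

    respect = None
    if sentiment == 'positive':
        # Placeholder for respect detection logic: this picks a score at random
        # and should be replaced by a trained Stage 2 classifier.
        respect_prob = random.choice([0.8, 0.2])
        respect = 'respect' if respect_prob > 0.5 else 'not respect'

    return sentiment, respect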

Example classification

for text in test_df['text_tamil'].tolist():
    sentiment, respect = classify_text(text, model, tokenizer)
    print(f"Text: {text} | Sentiment: {sentiment} | Respect: {respect}")
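Replacing the random placeholder with a real Stage 2 model starts with building its training set from the positive rows only. The sketch below shows one way to derive those labels from the sample dataset's `sentiment` and `respect` columns; `stage2_labels` is a hypothetical helper, and the 1/0 encoding is an assumption chosen to mirror the Stage 1 encoding.

```python
from collections import Counter

# Label columns copied from the sample dataset above.
sentiment = [
    'positive', 'negative', 'positive', 'negative',
    'positive', 'negative', 'positive', 'negative',
    'positive', 'positive', 'positive', 'negative',
]
respect = [
    'respect', None, 'respect', None,
    'respect', None, 'respect', None,
    'respect', 'respect', 'respect', None,
]


def stage2_labels(sentiment, respect):
    """Return (row_index, label) pairs for positive rows: 1 = respect, 0 = not respect."""
    pairs = []
    for index, (sent, resp) in enumerate(zip(sentiment, respect)):
        if sent == 'positive':
            pairs.append((index, 1 if resp == 'respect' else 0))
    return pairs


pairs = stage2_labels(sentiment, respect)
counts = Counter(label for _, label in pairs)
print(counts)

# A classifier cannot learn a boundary from a single class.
if len(counts) < 2:
    print("Only one class present: add 'not respect' examples before training Stage 2")
```

On the sample data this reports seven positive rows, all labelled respect, so the dataset needs positive but non-respectful sentences before a Stage 2 classifier can be trained on it.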


