Friday, 31 March 2023

How to encode text to numeric using fit_transform

 Hi all,

Qn) How do we encode text as numeric features using fit_transform?

Ans)

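import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# Load the tab-separated dataset
df = pd.read_csv("hupassg.tsv", sep='\t', encoding='ISO-8859-1')

# Count unigrams to trigrams, keep at most 10000 features, drop English stop words
vectorizer1 = CountVectorizer(max_features=10000, ngram_range=(1, 3), stop_words='english')
count_vector1 = vectorizer1.fit_transform(df['clean_assg'])  # sparse document-term matrix
feature_names1 = vectorizer1.get_feature_names_out()
data1 = df[['assg_set', 'clean_assg', 'final_score']].copy()
X = count_vector1.toarray()  # dense array; can use a lot of memory on a large corpus
y = data1['final_score'].to_numpy()
# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)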
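
To see what fit_transform does with the text, here is a minimal pure-Python sketch of the same idea: "fit" learns a vocabulary and "transform" counts each vocabulary word per document. It is a simplified stand-in, not scikit-learn's implementation: it splits on whitespace only and ignores stop words, n-grams and max_features. The function names and the toy sentences are made up for illustration.

```python
from collections import Counter


def fit_vocabulary(docs):
    """Learn a sorted vocabulary and map each token to a column index."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def transform(docs, vocabulary):
    """Turn each document into a row of token counts, one column per token."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows


# Toy documents, made up for illustration
docs = ["the cat sat", "the cat saw the dog"]
vocab = fit_vocabulary(docs)     # the "fit" step
matrix = transform(docs, vocab)  # the "transform" step

print(list(vocab))  # column order: ['cat', 'dog', 'sat', 'saw', 'the']
print(matrix)       # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each row is one document and each column is one word, which is the same shape of data that X holds above.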

Save and load model in python using pickle

 Dear all,

Qn) How do we save and load a model using pickle in Python?

Ans)

import pickle
filename = 'hupscoringsvm.sav'
# model is the trained estimator; the with block closes the file after writing
with open(filename, 'wb') as f:
    pickle.dump(model, f)

--------
import pickle
filename = 'hupscoringsvm.sav'
# Only unpickle files that come from a trusted source
with open(filename, 'rb') as f:
    loaded_model = pickle.load(f)
y_pred = loaded_model.predict(X_test)
print(y_pred)
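
If you want to try the save and load round trip without training anything, the sketch below may help. It assumes a plain dictionary as a hypothetical stand-in for the model (any picklable object behaves the same way) and writes to a temporary folder, so the file name here is only an example.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model; not a real estimator
model_state = {"weights": [0.5, -1.2, 3.0], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.sav")  # example file name

    # Save: open in binary write mode and dump the object
    with open(path, "wb") as f:
        pickle.dump(model_state, f)

    # Load: open in binary read mode and rebuild the object
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == model_state)  # the loaded copy has the same contents
```

The loaded object is a new copy with the same contents, which is why a loaded model can call predict just like the original one.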


Save Feature using Pickle

 Dear all,

Qn. How can we save features using pickle in Python?

Ans. 

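import pickle
from sklearn.model_selection import train_test_split
# Save the feature matrix and the labels
with open('X.pkl', 'wb') as f:
    pickle.dump(X, f)
with open('y.pkl', 'wb') as f:
    pickle.dump(y, f)
# Load them back later
with open('X.pkl', 'rb') as f:
    X = pickle.load(f)
with open('y.pkl', 'rb') as f:
    y = pickle.load(f)
# Split again into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)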
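
One possible variation is to keep X and y in a single file so they cannot get out of sync. The sketch below assumes small made-up lists in place of the real feature matrix and scores, and the file name features.pkl is hypothetical.

```python
import os
import pickle
import tempfile

# Toy feature rows and labels, made up for illustration
X = [[1, 0, 2], [0, 1, 1], [3, 0, 0]]
y = [4.0, 2.5, 5.0]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "features.pkl")  # hypothetical file name

    # Store both objects together in one dictionary
    with open(path, "wb") as f:
        pickle.dump({"X": X, "y": y}, f, protocol=pickle.HIGHEST_PROTOCOL)

    with open(path, "rb") as f:
        bundle = pickle.load(f)

X_loaded, y_loaded = bundle["X"], bundle["y"]
print(len(X_loaded), len(y_loaded))  # rows and labels stay paired
```

Whether one file or two is better depends on how the data is reused; two files are handy when only X or only y is needed later.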