Scikit-learn is an open-source machine learning library for Python. It provides simple and efficient tools for data mining and data analysis, and it includes algorithms for tasks such as classification, regression, clustering, and dimensionality reduction.
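The typical Scikit-learn workflow is to create an estimator, fit it on training data, and then predict or score on held-out data. Here is a minimal sketch of that pattern; the iris dataset and LogisticRegression are arbitrary choices used purely for illustration:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
# Load a small example dataset and split it into train and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Fit the estimator, then report accuracy on the held-out test set
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))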
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It offers a wide range of plot types, such as line plots, scatter plots, bar plots, and histograms, and it lets users customize every aspect of a figure, including colors, labels, and axes.
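As a quick illustration of that plotting interface (a minimal sketch, not tied to the examples below), a customized line plot takes only a few lines:
import numpy as np
import matplotlib.pyplot as plt
# Plot a sine curve with a custom color, axis labels, legend, and title
x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x), color='tab:blue', label='sin(x)')
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('A Basic Matplotlib Line Plot')
plt.legend()
plt.show()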
Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for creating attractive and informative statistical graphics, and it simplifies the creation of complex visualizations such as box plots, violin plots, and heatmaps.
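For example, a box plot of several columns of data takes a single Seaborn call (a minimal sketch using randomly generated data, purely for illustration):
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Draw three samples from normal distributions and show them as box plots
rng = np.random.default_rng(0)
df = pd.DataFrame({'A': rng.normal(0, 1, 100),
                   'B': rng.normal(1, 2, 100),
                   'C': rng.normal(-1, 0.5, 100)})
sns.boxplot(data=df)
plt.title('A Basic Seaborn Box Plot')
plt.show()
The examples that follow combine these libraries to visualize Scikit-learn models in three ways: the decision boundary of an SVM classifier, the feature importances of a random forest, and the clustering results of K-Means.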
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# Generate a synthetic dataset
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a SVM classifier
clf = SVC(kernel='linear')
clf.fit(X_train, y_train)
# Create a meshgrid to plot the decision boundary
h = 0.02  # step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
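# Predict the class of every point in the mesh and reshape the result to the grid shape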
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
# Plot the decision boundary
plt.contourf(xx, yy, Z, alpha=0.4)
plt.scatter(X[:, 0], X[:, 1], c=y, alpha=0.8)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Decision Boundary of SVM Classifier')
plt.show()
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Train a random forest classifier
clf = RandomForestClassifier(random_state=42)
clf.fit(X, y)
# Get feature importances
importances = clf.feature_importances_
feature_names = iris.feature_names
# Plot feature importances
plt.bar(feature_names, importances)
plt.xlabel('Feature')
plt.ylabel('Importance')
plt.title('Feature Importance of Random Forest Classifier')
plt.show()
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
# Generate synthetic data
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
# Perform clustering
kmeans = KMeans(n_clusters=4, random_state=0).fit(X)
labels = kmeans.labels_
# Visualize the clustering results using Seaborn
sns.scatterplot(x=X[:, 0], y=X[:, 1], hue=labels, palette='viridis')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Clustering Results of K-Means')
plt.show()
Visualizing Scikit-learn models with Matplotlib and Seaborn is a powerful way to gain insight into how machine learning models behave. By plotting decision boundaries, feature importances, and clustering results, you can better understand the performance of your models and make more informed decisions. However, it is important to be aware of common pitfalls and follow best practices to create effective visualizations.