Matplotlib

Matplotlib lets you create all kinds of plots in Python: line, bar, pie, scatter, histogram, and more. You first generate the data (usually with NumPy), then build the plot by adding elements such as labels, titles, legends, and grids. Finally, plt.show() displays it. Each plotting command adds a new visual element, which you can customize with colors, line styles, marker shapes, and other style options. You can also create multiple figures and subplots to show different plots in the same space, which makes it easier to compare data and visualizations.

Basics: Common Plot Types

Visualize data using line plots, scatter plots, histograms, bar charts, and pie charts.

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(3)

Line Plot

  • Use plt.plot() for continuous data.
  • Add labels, title, grid, and legend.
x = np.linspace(0, 2 * np.pi, 200)
y1 = np.sin(x)
y2 = np.cos(x)

plt.plot(x, y1, label='sin(x)')
plt.plot(x, y2, label='cos(x)', linestyle='--')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.title('Sine and Cosine Waves')
plt.legend()
plt.grid(True)
plt.show()

Scatter Plot

  • Use plt.scatter() to show individual data points.
  • Color and size can represent additional dimensions.
x = np.random.rand(100)
y = np.random.rand(100)
colors = np.random.rand(100)
sizes = 100 * np.random.rand(100)

plt.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap='viridis')
plt.colorbar(label='Color scale')
plt.title('Bubble Scatter Plot')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

Histogram

  • Use plt.hist() to visualize distributions.
  • Set the number of bins with bins; pass density=True to normalize the bar areas so they sum to 1.
data = np.random.randn(1000)
plt.hist(data, bins=30, density=True, alpha=0.7)
plt.title('Normalized Histogram of Gaussian Data')
plt.xlabel('Value')
plt.ylabel('Density')
plt.show()
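
The effect of density=True can be checked numerically. The sketch below assumes only NumPy and uses np.histogram, which plt.hist relies on for binning. The generator seed and the variable names are illustrative choices, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(3)  # seed chosen only for reproducibility
data = rng.standard_normal(1000)

# Raw counts per bin, and the same bins normalized to a density
counts, edges = np.histogram(data, bins=30)
density, _ = np.histogram(data, bins=30, density=True)

# With density=True, sum(height * bin width) should equal 1
widths = np.diff(edges)
area = float(np.sum(density * widths))

print('Total count:', counts.sum())
print('Area under density histogram:', round(area, 6))
```

With density=True the bar heights are rescaled so the total bar area is 1, which is why the y-axis of the plot is labeled 'Density' rather than a count.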

Bar Chart

  • Use plt.bar() for categorical comparisons.
  • Use plt.barh() for horizontal bars.
categories = ['A', 'B', 'C', 'D']
values = [5, 7, 3, 4]

plt.bar(categories, values, alpha=0.8)
plt.title('Vertical Bar Chart')
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()

# Horizontal bar chart
plt.barh(categories, values, alpha=0.8)
plt.title('Horizontal Bar Chart')
plt.xlabel('Value')
plt.ylabel('Category')
plt.show()

Pie Chart

  • Use plt.pie() to show the composition of a whole.
labels = ['W', 'X', 'Y', 'Z']
sizes = [15, 30, 45, 10]
explode = (0, 0.1, 0, 0)  # only 'explode' the 2nd slice

plt.pie(sizes, labels=labels, autopct='%1.1f%%', explode=explode, shadow=True)
plt.title('Pie Chart Example')
plt.axis('equal')  # Equal aspect ensures pie is drawn as a circle.
plt.show()

Saving Figures

  • Use plt.savefig() to export plots.
# Example: Save the sine wave plot
import os

x = np.linspace(0, 2 * np.pi, 200)  # Redefine x for the sine wave
y1 = np.sin(x)  # Recompute y1 from the new x

os.makedirs('data', exist_ok=True)  # savefig does not create missing directories
plt.figure()
plt.plot(x, y1)
plt.title('Sine Wave')
plt.savefig('data/sine_wave.png', dpi=150)
print('Saved figure as data/sine_wave.png')
Saved figure as data/sine_wave.png
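
As a further sketch, the same figure can be exported to several formats, since savefig infers the format from the file extension. The temporary output directory, the file names, and the bbox_inches='tight' setting are assumptions made for illustration. The Agg backend line is only there so the snippet runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # Assumption: headless run; omit in a notebook or interactive session
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))
ax.set_title('Sine Wave')

out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = []
for ext in ('png', 'pdf', 'svg'):
    path = os.path.join(out_dir, f'sine_wave.{ext}')
    # bbox_inches='tight' trims surrounding whitespace; dpi matters for raster output
    fig.savefig(path, dpi=150, bbox_inches='tight')
    saved.append(path)
plt.close(fig)

print([os.path.basename(p) for p in saved])
```

PNG is a raster format, so dpi controls its resolution. PDF and SVG are vector formats and scale without loss.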

Subplots & Advanced Plots

Learn to create complex layouts and advanced visualizations such as error bars and images.

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(4)

Creating Subplots

  • Use plt.subplots(nrows, ncols) to create a grid of axes.
  • Adjust the layout with figsize and plt.tight_layout().
fig, axs = plt.subplots(2, 2, figsize=(10, 8))
axs[0, 0].plot(np.sin(np.linspace(0, 2*np.pi, 100)))
axs[0, 0].set_title('Sine Wave')

axs[0, 1].bar(['A', 'B', 'C'], [3, 5, 2])
axs[0, 1].set_title('Bar Plot')

axs[1, 0].hist(np.random.randn(500), bins=20)
axs[1, 0].set_title('Histogram')

axs[1, 1].scatter(np.random.rand(50), np.random.rand(50), c=np.random.rand(50), cmap='plasma')
axs[1, 1].set_title('Scatter Plot')

plt.tight_layout()
plt.show()
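
The indexing used above (axs[0, 0], axs[1, 1]) depends on what plt.subplots returns for a given grid. The following sketch, with illustrative variable names, shows the shapes for a few layouts and how axs.flat iterates over a 2D grid. The Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # Assumption: headless run; omit in a notebook or interactive session
import matplotlib.pyplot as plt

fig1, ax = plt.subplots()                          # a single Axes object, not an array
fig2, row = plt.subplots(1, 3)                     # 1D array of Axes, shape (3,)
fig3, grid = plt.subplots(2, 2)                    # 2D array of Axes, shape (2, 2)
fig4, forced = plt.subplots(1, 3, squeeze=False)   # squeeze=False keeps it 2D: shape (1, 3)

print(row.shape, grid.shape, forced.shape)

# .flat visits the grid row by row, which is convenient in loops
for i, a in enumerate(grid.flat):
    a.set_title(f'Panel {i}')
titles = [a.get_title() for a in grid.flat]

for f in (fig1, fig2, fig3, fig4):
    plt.close(f)
```

Passing squeeze=False is useful when the grid size is computed at run time, because the result can always be indexed as axs[row, col].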

Shared Axes and Figure-Level Settings

  • Share x or y axes across subplots.
  • Add a main title with fig.suptitle().
fig, axs = plt.subplots(2, 1, sharex=True, figsize=(8, 6))
x = np.linspace(0, 10, 100)
axs[0].plot(x, np.sin(x))
axs[0].set_title('Sine')
axs[1].plot(x, np.cos(x))
axs[1].set_title('Cosine')
fig.suptitle('Shared X-Axis Example')
plt.show()
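
To see what sharing an axis means in practice, the sketch below changes the x-limits on one subplot and reads them back from the other. The limit values and variable names are illustrative, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # Assumption: headless run; omit in a notebook or interactive session
import matplotlib.pyplot as plt
import numpy as np

fig, axs = plt.subplots(2, 1, sharex=True, figsize=(8, 6))
x = np.linspace(0, 10, 100)
axs[0].plot(x, np.sin(x))
axs[1].plot(x, np.cos(x))

# Changing the limits on one shared axis updates the other as well
axs[1].set_xlim(2, 6)
limits_top = axs[0].get_xlim()
limits_bottom = axs[1].get_xlim()
print(limits_top, limits_bottom)

plt.close(fig)
```

The same coupling applies to interactive zooming and panning, which is the main reason to share axes when comparing signals over a common range.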

Error Bars

  • Use plt.errorbar() to show uncertainties.
x = np.arange(5)
y = np.random.rand(5)
yerr = 0.1 + 0.2 * np.random.rand(5)
plt.errorbar(x, y, yerr=yerr, fmt='o-', capsize=5)
plt.title('Error Bars Example')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
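
The example above draws random error values. In real use, yerr usually comes from the data. The sketch below assumes a hypothetical experiment with 20 repeated measurements per condition and uses the standard error of the mean as the error bar. The data, the noise level, and all names are made up for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # Assumption: headless run; omit in a notebook or interactive session
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: 5 conditions, 20 noisy measurements each
true_means = np.array([1.0, 1.5, 1.2, 2.0, 1.8])
samples = true_means[:, None] + 0.3 * rng.standard_normal((5, 20))

x = np.arange(5)
y = samples.mean(axis=1)
# Standard error of the mean: sample std (ddof=1) divided by sqrt(n)
yerr = samples.std(axis=1, ddof=1) / np.sqrt(samples.shape[1])

fig, ax = plt.subplots()
ax.errorbar(x, y, yerr=yerr, fmt='o-', capsize=5)
ax.set_title('Mean with Standard Error')
ax.set_xlabel('Condition')
ax.set_ylabel('Measured value')
plt.close(fig)
```

Whether to plot the standard error, the standard deviation, or a confidence interval depends on what the figure is meant to communicate, so the caption or axis label should state which one is shown.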

Image Plot

  • Display 2D arrays as images with plt.imshow().
img = np.random.rand(10, 10)
plt.imshow(img, interpolation='nearest')
plt.title('Random Image')
plt.colorbar()
plt.show()
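
One detail worth knowing about imshow is the orientation: by default, row 0 of the array is drawn at the top, as in an image. The sketch below compares the default with origin='lower' on a small array whose values and names are illustrative. The Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # Assumption: headless run; omit in a notebook or interactive session
import matplotlib.pyplot as plt
import numpy as np

img = np.arange(6).reshape(2, 3)  # 2 rows, 3 columns

fig, (ax_upper, ax_lower) = plt.subplots(1, 2, figsize=(8, 3))

# Default: row 0 at the top, so the y-axis runs downward
im_upper = ax_upper.imshow(img, interpolation='nearest')
ax_upper.set_title("origin='upper' (default)")

# origin='lower': row 0 at the bottom, matching a conventional x-y plot
im_lower = ax_lower.imshow(img, interpolation='nearest', origin='lower')
ax_lower.set_title("origin='lower'")

fig.colorbar(im_upper, ax=ax_upper)
fig.colorbar(im_lower, ax=ax_lower)

upper_inverted = ax_upper.yaxis_inverted()
lower_inverted = ax_lower.yaxis_inverted()
print(upper_inverted, lower_inverted)

plt.close(fig)
```

For arrays that represent a function sampled on a grid, origin='lower' usually gives the orientation people expect.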

Advanced: Contour and Heatmap

  • Use plt.contour() for contour lines and plt.imshow() for heatmaps.
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))

plt.figure(figsize=(6,5))
contours = plt.contour(X, Y, Z, levels=6)
plt.clabel(contours, inline=True)
plt.title('Contour Plot')
plt.show()

# Heatmap
plt.figure(figsize=(6,5))
plt.imshow(Z, origin='lower', extent=[-3,3,-3,3])
plt.title('Heatmap')
plt.colorbar()
plt.show()
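
A common variation combines both views: filled contours for the color and contour lines on top for the labels. The sketch below reuses the same Gaussian surface and passes an explicit list of levels instead of a level count. The level values, colormap, and line styling are illustrative assumptions, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # Assumption: headless run; omit in a notebook or interactive session
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))

levels = [0.1, 0.3, 0.5, 0.7, 0.9]  # illustrative values inside Z's range

fig, ax = plt.subplots(figsize=(6, 5))
# Filled bands between levels; extend='both' also colors values outside the list
filled = ax.contourf(X, Y, Z, levels=levels, cmap='viridis', extend='both')
# Thin labeled lines drawn at the same levels
lines = ax.contour(X, Y, Z, levels=levels, colors='black', linewidths=0.5)
ax.clabel(lines, inline=True, fontsize=8)
fig.colorbar(filled, ax=ax, label='Z')
ax.set_title('Filled Contours with Explicit Levels')
plt.close(fig)
```

Passing a list to levels fixes the exact contour values, whereas an integer only requests an approximate number of automatically chosen levels.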

Example project

This section guides you through generating, analyzing, and visualizing a synthetic dataset for classification.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
np.random.seed(5)

Generate Synthetic Data

  • Use make_classification to create a dataset.
X, y = make_classification(
    n_samples=200,
    n_features=4,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    flip_y=0.01,
    class_sep=1.5,
    random_state=5
)
print('Features shape:', X.shape)
print('Labels distribution:', np.bincount(y))
Features shape: (200, 4)
Labels distribution: [ 99 101]

Explore Feature Distributions

  • Plot histograms for each feature.
fig, axs = plt.subplots(2, 2, figsize=(10, 8))
for i in range(4):
    ax = axs.flat[i]
    ax.hist(X[:, i], bins=20, color=f'C{i}', alpha=0.7)
    ax.set_title(f'Feature {i} Distribution')
plt.tight_layout()
plt.show()

Pairwise Scatter Plots

  • Visualize relationships between pairs of features.
fig, axs = plt.subplots(2, 3, figsize=(12, 8))
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
for ax, (i, j) in zip(axs.flat, pairs):
    ax.scatter(X[:, i], X[:, j], c=y, cmap='coolwarm', alpha=0.6)
    ax.set_xlabel(f'Feature {i}')
    ax.set_ylabel(f'Feature {j}')
    ax.set_title(f'{i} vs {j}')
plt.tight_layout()
plt.show()
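
The hard-coded pairs list can also be generated, which helps when the number of features changes. This is a small sketch using itertools.combinations from the standard library. The variable names n_features, ncols, and nrows are illustrative.

```python
import math
from itertools import combinations

n_features = 4

# All unordered feature pairs (i, j) with i < j
pairs = list(combinations(range(n_features), 2))
print(pairs)  # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

# Grid size for one subplot per pair, three columns wide
ncols = 3
nrows = math.ceil(len(pairs) / ncols)
print(nrows, ncols)  # 2 3
```

For n features there are n * (n - 1) / 2 pairs, so the grid grows quickly and a subset of pairs may be more readable for larger datasets.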

Compute Basic Statistics

  • Calculate means and standard deviations per class.
classes = np.unique(y)
for cls in classes:
    cls_data = X[y == cls]
    print(f'Class {cls}: mean={cls_data.mean(axis=0)}, std={cls_data.std(axis=0)}')
Class 0: mean=[-1.51904518 -0.13961004 -1.51551924 -0.10388006], std=[0.94565383 0.92078405 0.53456453 0.88192832]
Class 1: mean=[-1.62540423  0.03701214  1.51590761  0.1083452 ], std=[0.85187348 0.96153734 0.53164293 0.91483422]

Train/Test Split & Visualization

  • Split the data 80/20 by shuffling indices with NumPy, then plot the train and test points.
indices = np.arange(len(y))
np.random.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# Visualize train vs. test in features 0 and 1.
# Both sets use the same colormap so each class keeps one color; the marker tells the sets apart.
plt.figure(figsize=(6,5))
plt.scatter(X_train[:,0], X_train[:,1], c=y_train, cmap='coolwarm', marker='o', label='Train', alpha=0.6)
plt.scatter(X_test[:,0], X_test[:,1], c=y_test, cmap='coolwarm', marker='x', label='Test', alpha=0.6)
plt.xlabel('Feature 0')
plt.ylabel('Feature 1')
plt.legend()
plt.title('Train vs Test Data')
plt.show()
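
The index-shuffling split can be wrapped in a small helper. The function below is a hypothetical sketch, not part of NumPy or scikit-learn: the name split_indices, its parameters, and the use of np.random.default_rng instead of the global seed are all assumptions made for illustration.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Return shuffled (train_idx, test_idx) arrays covering range(n_samples)."""
    rng = np.random.default_rng(seed)       # local generator, leaves global state alone
    indices = rng.permutation(n_samples)    # shuffled copy of 0..n_samples-1
    split = int(round(train_fraction * n_samples))
    return indices[:split], indices[split:]

train_idx, test_idx = split_indices(200, train_fraction=0.8, seed=5)
print(len(train_idx), len(test_idx))  # 160 40

# Sanity checks: no index appears in both sets, and none is lost
overlap = np.intersect1d(train_idx, test_idx)
covered = np.sort(np.concatenate([train_idx, test_idx]))
```

A local generator makes the split reproducible from the seed argument alone, regardless of other random calls made earlier in the session.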

Save Processed Data

  • Optionally save arrays to disk for later use.
import os

os.makedirs('data', exist_ok=True)  # np.save does not create missing directories
np.save('data/X_train.npy', X_train)
np.save('data/X_test.npy', X_test)
np.save('data/y_train.npy', y_train)
np.save('data/y_test.npy', y_test)
print('Saved .npy files in data directory.')
Saved .npy files in data directory.
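
For completeness, here is a sketch of reading the arrays back with np.load, plus np.savez as an alternative that bundles several arrays into one file. It uses stand-in random arrays and a temporary directory so it runs on its own. The array contents, file names, and paths are assumptions for illustration.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(5)
X_train = rng.standard_normal((160, 4))   # stand-in for the real training features
y_train = rng.integers(0, 2, size=160)    # stand-in for the real training labels

out_dir = tempfile.mkdtemp()  # hypothetical output location

# Round trip with one .npy file per array
path = os.path.join(out_dir, 'X_train.npy')
np.save(path, X_train)
X_loaded = np.load(path)

# Alternative: several named arrays in a single .npz archive
bundle_path = os.path.join(out_dir, 'train.npz')
np.savez(bundle_path, X_train=X_train, y_train=y_train)
with np.load(bundle_path) as bundle:
    X_from_bundle = bundle['X_train']
    y_from_bundle = bundle['y_train']

print(X_loaded.shape, X_from_bundle.shape, y_from_bundle.shape)
```

The .npy format preserves dtype and shape exactly, so the loaded arrays compare equal to the originals.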
Last change: commit ec88a2f