Getting Started¶
Installation¶
Install ggplotly using pip:
pip install ggplotly
Dependencies¶
ggplotly requires:
- Python 3.8+
- pandas
- plotly
- numpy
Optional dependencies for specific features:
# For geographic plots (geom_map, geom_sf)
pip install geopandas shapely
# For smoothing (geom_smooth, stat_smooth)
pip install scipy scikit-learn
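Before using the geographic or smoothing geoms, it can help to confirm that the optional packages are importable. The following is a minimal sketch using only the standard library. The helper name `missing_optional` and the feature grouping are illustrative and not part of ggplotly. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the optional packages listed above.
# scikit-learn installs under the import name "sklearn".
OPTIONAL = {
    "geographic plots": ["geopandas", "shapely"],
    "smoothing": ["scipy", "sklearn"],
}

def missing_optional(features=OPTIONAL):
    """Return {feature: [modules that cannot be found]} (hypothetical helper)."""
    missing = {}
    for feature, modules in features.items():
        absent = [m for m in modules if find_spec(m) is None]
        if absent:
            missing[feature] = absent
    return missing

if __name__ == "__main__":
    for feature, modules in missing_optional().items():
        print(f"{feature}: missing {', '.join(modules)}")
```

An empty result means every optional package in the table was found.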
Basic Usage¶
Import¶
from ggplotly import *
import pandas as pd
import numpy as np
Your First Plot¶
# Create some data
df = pd.DataFrame({
    'x': [1, 2, 3, 4, 5],
    'y': [2, 4, 3, 5, 4]
})
# Create a scatter plot
ggplot(df, aes(x='x', y='y')) + geom_point()
The Grammar of Graphics¶
Every ggplotly visualization is built from these components:
- Data - A pandas DataFrame
- Aesthetics (aes()) - Map columns to visual properties
- Geoms - Geometric objects that represent data (points, lines, bars)
- Optional layers - Scales, themes, facets, labels
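To make the layering idea concrete, here is a toy sketch of how components can be combined with the `+` operator in Python through `__add__`. The `Plot` and `Layer` classes are hypothetical stand-ins written for illustration; they are not ggplotly's classes and say nothing about how the library is implemented.

```python
class Layer:
    """A named plot component, e.g. a geom, scale, or theme (toy stand-in)."""
    def __init__(self, kind, name):
        self.kind = kind
        self.name = name

class Plot:
    """Toy plot: holds data, aesthetic mappings, and an ordered list of layers."""
    def __init__(self, data, mapping, layers=()):
        self.data = data
        self.mapping = dict(mapping)
        self.layers = list(layers)

    def __add__(self, layer):
        # Return a new Plot so the original object stays unchanged.
        return Plot(self.data, self.mapping, self.layers + [layer])

    def describe(self):
        return [f"{layer.kind}:{layer.name}" for layer in self.layers]

base = Plot(data=[{"x": 1, "y": 2}], mapping={"x": "x", "y": "y"})
p = base + Layer("geom", "point") + Layer("theme", "minimal")
print(p.describe())  # ['geom:point', 'theme:minimal']
```

Because each `+` returns a new object in this sketch, a base plot can be reused with different layers added on top.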
# More complete sample data
np.random.seed(42)
df = pd.DataFrame({
    'x': np.random.randn(50),
    'y': np.random.randn(50),
    'category': np.random.choice(['A', 'B', 'C'], 50),
    'values': np.random.rand(50) * 10
})
(
    ggplot(df, aes(x='x', y='y', color='category'))    # Data + aesthetics
    + geom_point(size=5)                               # Geom
    + scale_color_brewer(palette='Set1')               # Scale
    + theme_minimal()                                  # Theme
    + labs(title='My Plot', x='X Axis', y='Y Axis')    # Labels
)
ggplot(df, aes(x='x', y='y', color='category')) + geom_point()
Multiple Geoms¶
ggplot(df, aes(x='x', y='y')) + geom_point() + geom_smooth()
Faceting¶
ggplot(df, aes(x='x', y='y')) + geom_point() + facet_wrap('category')
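Conceptually, faceting partitions the rows by the distinct values of the faceting column and draws each subset in its own panel. The plain-Python sketch below shows only that partitioning step on a few made-up rows. The helper `split_into_panels` is hypothetical and is not how ggplotly implements `facet_wrap`.

```python
from collections import defaultdict

# Made-up rows for illustration only.
rows = [
    {"x": 0.1, "y": 1.2, "category": "A"},
    {"x": 0.4, "y": 0.7, "category": "B"},
    {"x": 0.9, "y": 1.5, "category": "A"},
    {"x": 1.3, "y": 0.2, "category": "C"},
]

def split_into_panels(rows, column):
    """Group rows by the value in `column`, one group per panel."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[column]].append(row)
    return dict(panels)

panels = split_into_panels(rows, "category")
for name in sorted(panels):
    print(name, len(panels[name]))
```

With the sample rows above, category A holds two rows while B and C hold one each, so the faceted plot would have three panels.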
Using Index as X-Axis¶
# Create time series data with a DatetimeIndex
ts_df = pd.DataFrame({
    'values': np.cumsum(np.random.randn(50)) + 50
}, index=pd.date_range('2024-01-01', periods=50, freq='D'))
# No x aesthetic is mapped, so the DataFrame index is used for the x-axis
ggplot(ts_df, aes(y='values')) + geom_line()
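If you prefer an explicit mapping, the index can be moved into a regular column with pandas first. This is a small sketch on made-up values; the column name `date` is an assumption chosen for illustration.

```python
import pandas as pd

# Small made-up series with a daily DatetimeIndex.
ts_df = pd.DataFrame(
    {"values": [50.0, 50.5, 49.8, 51.2, 50.9]},
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

# Name the index and move it into a regular column called "date".
explicit_df = ts_df.rename_axis("date").reset_index()

print(list(explicit_df.columns))  # ['date', 'values']
```

With the column in place, the mapping could be written out as `aes(x='date', y='values')`.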
Built-in Datasets¶
ggplotly includes classic datasets for learning:
from ggplotly import data
# See all available datasets
data()
Out[8]:
['commodity_prices', 'diamonds', 'economics', 'economics_long', 'faithfuld', 'iris', 'luv_colours', 'midwest', 'mpg', 'msleep', 'mtcars', 'presidential', 'seals', 'txhousing', 'us_flights', 'us_flights_edges', 'us_flights_nodes']
# Load the iris dataset
iris = data('iris')
ggplot(iris, aes(x='sepal_length', y='sepal_width', color='species')) + geom_point()
# Add borders to points
ggplot(iris, aes(x='sepal_length', y='sepal_width', color='species')) + geom_point(size=10, stroke=2)
Labels with Background (geom_label)¶
# Text with background boxes for better readability
label_df = pd.DataFrame({
    'x': [1, 2, 3], 'y': [2, 4, 3],
    'label': ['Alpha', 'Beta', 'Gamma']
})
(ggplot(label_df, aes(x='x', y='y'))
 + geom_point(size=12)
 + geom_label(aes(label='label'), fill='white', alpha=0.8))
Fixed Aspect Ratio (coord_fixed)¶
# Ensure equal scaling - circles appear as circles
theta = np.linspace(0, 2*np.pi, 50)
circle = pd.DataFrame({'x': np.cos(theta), 'y': np.sin(theta)})
ggplot(circle, aes(x='x', y='y')) + geom_path(size=2) + coord_fixed(ratio=1)
Saving Plots¶
# Create plot
p = ggplot(df, aes(x='x', y='y')) + geom_point()
# Save as HTML (interactive)
p.write_html('plot.html')
# Save as image (requires kaleido)
p.write_image('plot.png')
# Or use ggsave
ggsave(p, 'plot.png', width=800, height=600)
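When a script saves to several formats, the choice between the two methods can be driven by the file extension. The sketch below assumes only that the plot object exposes `write_html` and `write_image`, as in the snippet above. The helper `save_plot` and the `FakePlot` stand-in are hypothetical, added so the example runs without ggplotly installed.

```python
from pathlib import Path

def save_plot(plot, filename):
    """Pick a save method from the file extension (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix in {".html", ".htm"}:
        plot.write_html(filename)   # interactive output
        return "html"
    plot.write_image(filename)      # static image, requires kaleido
    return "image"

class FakePlot:
    """Stand-in that records calls instead of writing files."""
    def __init__(self):
        self.calls = []

    def write_html(self, filename):
        self.calls.append(("write_html", filename))

    def write_image(self, filename):
        self.calls.append(("write_image", filename))

fake = FakePlot()
save_plot(fake, "plot.html")
save_plot(fake, "plot.png")
print(fake.calls)  # [('write_html', 'plot.html'), ('write_image', 'plot.png')]
```

In real use, `fake` would be replaced by a plot such as `p` from the example above.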
Next Steps¶
- Aesthetics Guide - Learn about mapping data to visual properties
- Geoms Reference - Explore all available geometric objects
- Themes - Customize the look of your plots