Pallas Athena

Understanding the Basis of the Separating Axis Theorem (SAT)

Nick Nagel's SVG Scene Graph Series

  1. Understanding and using the SVG Scene Graph
  2. Bases and Affine Transformations
  3. The Separating Axis Theorem

Rationale

This article is part of a series I've been working on centered around the scene-graph conceptualization of Scalable Vector Graphics (SVG). The motivation for the series stems from work I've been undertaking to add virtual camera functionality to my SVG Creator's framework. Specifically, I needed to simulate camera features (like zoom, panning and tilt) to facilitate the creation of interactive SVG artworks.

In the previous article in the series I shared some insights that occurred to me relating the concept of mathematical bases and basis vectors to a generalized process for understanding and using SVG transformations within the scene-graph. The present piece dovetails off the previous by looking at how projection and basis reasoning lead to robust geometric constraints definitions (e.g., collisions, containment, and panning limits).

The Problem

The problem at hand concerned the development of an SVG virtual camera system. One of the things I enjoy about working with SVG is the ability to create artworks with varying levels of detail and set up systems that allow viewers to navigate the work and appreciate features from different perspectives.

For example, below you see some SVG artwork which illustrates the concept of the "World Tree" from Norse mythology. In it, I've set up a camera system to enable viewers to zoom in an pan around to explore some of the details hidden away in the more distant perspective. The problem for this piece (and it's a general problem for any number of applications like games, sims, or advertisements) is that virtual camera panning has to be constrained or the viewer will get lost. Think about it for a minute. How would you express constraints preventing the user from panning "out of bounds"?

Live Interactive Illustration

A touch! 'tis but a touch!
Figure 1: The world Tree by 𝒩. An SVG artwork depicting Yggdrasil -- the world tree from Norse mythology. The figure shows the artwork along with debug scaffolding (the red, green and blue box outlines) used to develop virtual camera functionality.

The artwork is intended to let the viewer zoom in and pan around to look at details. But if the user pans too far, they'll fall off the edge of the world, right? We can't have that! So we need to constrain the pannable area whenever the user is in a zoomed in state.

For this article, I've left visible the test scaffolding I used to develop the camera functionality. I've drawn the SVG world coordinate system axes (shown in blue) and also the local coordinate system axes for the camera (shown in red). The point of this article is I want you to think of these coordinate systems as bases in the geometric sense: each basis defines how vectors, distances, and shapes are interpreted in its own frame of reference.

I've also drawn rectangles within each basis.

  • The blue rectangle corresponds to my target viewBox dimensions in world coordinates.

  • The red rectangle corresponds to the transformed viewBox under the camera's affine transform.

Initially, when the artwork loads, the global basis and the camera basis coincide. But as soon as the user zooms in, the camera basis is scaled up, and its rectangle becomes larger than the SVG viewport in world coordinates.

To express the panning constraints in this larger context, I added a green rectangle. This perimeter defines the set of legal panning positions -- boundaries within which the camera is allowed to move without exposing empty space. At the initial zoom level, the green box sits neatly inside the blue world box. But once the user zooms in, the camera's viewBox becomes larger in world units. The green region becomes the area to which the camera view must be constrained in order to keep the world fully visible.

Whew! Now that we've established these two bases -- the world basis and the scaled camera basis we're in the perfect position to enforce panning constraints using the Separating of Axis Theorem.

The Separating Axis Theorem

The Separating Axis Theorem (SAT) sets up a computer graphics algorithm that is often used in gaming and simulation to test for the intersection of convex polygons for, e.g., collision detection. Here, I'll extend it to test for containment.

In its classic form, SAT provides a means to determine whether two convex shapes overlap. I first applied this algorithm developing a flight simulation for Nellis Airforce Base. Back then I used it to identify when pilots flew out of bounds in a virtual training course. Here, I'll extend the idea to add functionality to the camera system in the SVG scene graph.

Go ahead and play with the world tree for a bit.

  • Double click (or spread on mobile) to "zoom" the camera in order to see the camera basis scale up.

  • Once zoomed in drag or swipe to pan. Notice how the green camera constraint box moves relative to the blue world-box.

At first, the green constraint box is fully inside the world. But once you've zoomed in, the camera basis grows and the constraint box now overflows the world box. The system is only "happy" when the entire green box is contained inside the blue box.

So how do we test for that?

The naive conditional approach

The most obvious way (and probably the first thing any artist-engineer thinks of) is to check vertices manually:

IF
   CAM.upper_left   inside WORLD &&
   CAM.lower_left   inside WORLD &&
   CAM.upper_right  inside WORLD &&
   CAM.lower_right  inside WORLD
THEN
   all good
ELSE
   not so much ...

Sure that'll work. Here. For this specific case. Because it involves only axis-aligned rectangles. But it's also brittle, verbose and limited.

It does not generalize to:

  • rotated rectangles,
  • arbitrary convex shapes,
  • concave shapes (without preprocessing),
  • or any situation where the axes don't line up.

The better general purpose solution

What we really want is one clean, elegant test that works for any convex polygons -- of any number of vertices, at any orientation, under any transform. Standing on the shoulders of Hermann Minkowski, we arrive at the clean, mathematical version using the Separating Axis Theorem.

The Separating Axis Theorem:

Two convex polygons $A$ and $B$ are disjoint (i.e., they do not overlap or touch) if, and only if, there exists at least one line (let that be called a separating axis) such that the projections of the two polygons onto that line are also disjoint.

Simple, right? No, really it's a mouthful. On paper, that statement looks compact, but it's carrying a lot of meaning. It's one of those deceptively simple phrases that packs an entire geometric worldview into a single sentence.

If you're a visual thinker (and if you're reading this blog, you probably are), the best way to understand SAT is to see it. To that end I've created an interactive diagram hoping to to make the idea tangible, and to make the algorithm feel less like abstract algebra and more like geometry you can touch.

Bases in Action: SAT Illustrated

In this section, we'll unpack the theorem step-by-step using an explicit example: two triangles. We'll use math to create sweeping axes -- 1-dimensional basis directions onto which we can project the "shadows" of each polygon and look for separation. Think of it as a taste of linear algebra for artists.

Figure 2: Two triangles A and B. Are they overlapping? Easy enough to see, right? But can we prove it to the machine ... ?

Let's start with the two triangles I've shown in Figure 2.

$$ \begin{align*} \text{Triangle } A: & \quad A_1=(0, 0), \quad A_2=(2, 0), \quad A_3=(1, 2) \\ \text{Triangle } B: & \quad B_1=(3, 1), \quad B_2=(5, 1), \quad B_3=(4, 3) \end{align*} $$

On visual inspection it's pretty easy to see they're separated, right? So how do we prove that with separating axes?

Step 1. Pick an axis -- any axis ...

Well, maybe not just any axis (the choice is, after all, infinite). SAT tells us that for our purposes it's enough to test axes that come from the normals of the polygon edges. That's a finite and tractable set.

Let's start with the edge $B_1 \rightarrow B_2$. Try tapping $Triangle B$ to see this edge as a vector:

$$ \vec{E}_{B_{1}B_{2}} = B_2 - B_1 = (5-3, 1-1) = (2, 0) $$

2. Use the normal as a basis for projection

Now this brings us to the heart of the discussion. Once we have an edge, we can obtain its normal (a line indicating a direction perpendicular to the edge) and use it as a basis for the projection. What this gives us is a way of mapping from our 2D svg world coordinates to a one-dimensional basis on which we can project our triangles -- and look for separation.

Insight!

For me, the key insight to understanding the SAT is to realize that the separating axes are geometric bases for projection.

Here's the normal for edge $\vec{E_{B_1B_2}}$ .

$$ \vec{N}_1 = (-E_y, E_x) = (0, 2) $$

Tap $Triangle B$ again to see the formula in action. And tap once more to pull the normal off to the side.

3. Project the triangles on the normal basis

With our edge normal obtained, we now have a basis on which to project the triangles. Conceptually, projection is easy to grock. We project the polygon by mapping its vertices, $V_i$, onto the normal basis using the dot-product:

$$ Proj( V_i ) = V_i \cdot \vec n $$

Projecting a single vertex gives us a scalar value. Projecting all the vertices in a polygon gives us an interval along the basis. Think of it as a shadow cast by the polygon.

The shadows of the two polygons are useful measures. We can use them to look for a gap indicating separation! And to satisfy the Separating Axis Theorem all we need to find is one (gap that is). So for $Triangle B$ we project:

$$ \begin{align*} \text{Proj}(B_1) &= (3, 1) \cdot (0, 2) = 2 \\ \text{Proj}(B_2) &= (5, 1) \cdot (0, 2) = 2 \\ \text{Proj}(B_3) &= (4, 3) \cdot (0, 2) = 6 \end{align*} $$

And the projections give us interval:

$$ I_B = [2, 6] $$

Now for Triangle $A$ :

\begin{align*} \text{Proj}(A_1) &= (0, 0) \cdot (0, 2) = 0 \\ \text{Proj}(A_2) &= (2, 0) \cdot (0, 2) = 0 \\ \text{Proj}(A_3) &= (1, 2) \cdot (0, 2) = 4 \end{align*}

$$ I_A = [0, 4] $$

The Story So Far ...

Figure 2 now illustrates the projected shadows of our triangles. Since the shadows overlap we can conclude that $\vec{N}_1$ is not a separating axis.

Intuitively, this should make sense. If we were to project our triangles onto the Y axis of the global 2D basis it's easy to intuit we'd get similar overlapping shadows. And that's exactly what our normal, $\vec{N}_1$, is -- a line in the $Y$ direction.

Bottom line? Just keep in mind that for SAT, the normal gives us:

  1. A direction line, and

  2. An implied zero point of reference.

This basis gives us the key to measure the intervals (shadows) of our polygons.

Finding a Gap

But we're not done yet! We still need to find a gap to prove the triangles are not overlapping. Look again at our figure. Can you spot an axis along which we might find a gap?

Edge $B_1 \rightarrow B3$ appears to be a likely candidate. Its normal roughly points diagonally between the shapes. Continue tapping $Triangle B$ to complete the projection on this axis and you'll see that, yes, indeed we'll find our gap on this one. Just for fun, let's walk through the computations.

Step 1. Compute the edge vector and obtain the normal basis:

$$ \vec{E}_{B_{1}B_{3}} = B_3 - B_1 = (4-3, 3-1) = (1, 2) $$
$$ \vec{N}_2 = (-E_y, E_x) = (-2, 1) $$

Step 2. Project the triangles and get the intervals.

Triangle $B$ :

$$ \begin{align*} \text{Proj}(B_1) &= (3, 1) \cdot (-2, 1) = -6 + 1 = -5 \\ \text{Proj}(B_2) &= (5, 1) \cdot (-2, 1) = -10 + 1 = -9 \\ \text{Proj}(B_3) &= (4, 3) \cdot (-2, 1) = -8 + 3 = -5 \end{align*} $$

$$ \text{Interval } I_B = [-9, -5] $$

Triangle $A$ :

$$ \begin{align*} \text{Proj}(A_1) &= (0, 0) \cdot (-2, 1) = 0 \\ \text{Proj}(A_2) &= (2, 0) \cdot (-2, 1) = -4 \\ \text{Proj}(A_3) &= (1, 2) \cdot (-2, 1) = -2 + 2 = 0 \end{align*} $$

$$ \text{Interval } I_A = [-4, 0] $$

Step 3. Look for separation. Once again inspect our figure and we can clearly see it. We've found our gap and satisfied the SAT!

Application: Thinking with Bases

Having worked through the linalg underlying the Separating Axis Theorem let's turn now to an implementation of the algorithm suitable for the SVG Creative Collab ™ framework. Recall the problem I presented at the outset of the discussion that led us down this rabbit hole in the first place. As an SVG artist needing to set up a scene-graph camera I need to enforce panning constraints to enable viewers to explore the artwork.

Thinking in terms of geometric bases might sound like overkill for this problem. But it actually makes it easy to formulate and enforce the constraints using the SAT. Remember the constraint box defined for the camera (the green box in the World Tree Illustration)? In the context of the SAT it's easy to state the constraints:

When zoomed into the world-tree the camera constraint-box must remain entirely contained within the SVG world-box.

Sounds simple enough, right? But it actually turns out easy to get wrong when you have to deal with nested transforms and basis changes. The art of the algorithm is that the SAT gives an elegant way to test for containment of any convex shapes of arbitrary complexity -- be they scaled rotated skewed or whatever.

Testing for Containment with SAT

Again, the Separating Axis Theorem tells us that polygons are not overlapping if there is at least one separating axis. To test for containment we can simply invert the logic. Given two polygons (which our SVG rectangles are) we can assert:

Polygon A is contained inside polygon B if there is no axis on which A's projection extends outside B's projection.

In other words, no gaps allowed. Anywhere.

In the case of the World Tree camera:

  • Polygon A = the SVG viewBox
  • Polygon B = the camera constraint box (transformed into world coordinates)
  • On user pan apply SAT to check for containment.

Note: It's worth noting here that SAT is agnostic with regard to the coordinate system used for the computations as long as both polygons are expressed in the same space .

Pro-Tip

Use the inverse transformation matrix to ensure the polygons are in the same coordinate space for comparison.

The Algorithm

The following pseudo code reflects how the camera implementation for the SVG Creators Collaborative™ framework applies the SAT. I use pseudo code here intentionally to remain language agnostic and for readability.

function is_contained (cameraPoly, worldPoly):

    # 1. Ensure both polygons are compared on 
    #    equivalent space...

    C = transform_to_world_space(cameraPoly)
    W = worldPoly   # already in world coordinates

    # 2. Build a list of axes (4 from each rect)

    axes = normals_of_edges(W)
    axes += normals_of_edges(C)

    # 3. Project both polygons onto every axis

    for axis in axes:
        intervalC = project_polygon(C, axis)
        intervalW = project_polygon(W, axis)

   # 4. Containment test: IS THERE A GAP?

        if 
         intervalC.min < intervalW.min   OR 
         intervalC.max > intervalW.max:
            return false

    return true

With this implementation, the camera object can check for overlap on all panning interactions and readily enforce the panning constraints. Beyond that, we now have a generic function to enable multiple future enhancements (think collision detection, culling, occlusion, spatial indexing, collision physics -- the list goes on).

A Note on Performance

I've heard enough criticism leveled against SVG on performance grounds. So I guess it's reasonable to be concerned with that here. But the beautiful thing about SAT is that while it sounds heavyweight the actual cost is quite small. Think about it. For two polygons with $n$ and $m$ edges to test for containment SAT requires testing $(n + m)$ axes. For each axis, both polygons get projected (a linear pass over the verts). So overall complexity is:

$$ O( (n+m)^2 ) $$

In this case (checking two rectangles = 64 dot products) the cost is tiny. Also, since it uses the inverse transformation matrix the SVG Creative Collab™ framework collapses all transforms into a single affine transform before the SAT loop, meaning no overhead no matter the depth of the scene graph. So the actual runtime for this case? effectively constant time.

Discussion

In this article I approached SAT as a way to implement camera panning constraints in the context of the SVG scene graph. But the deeper win with this solution is the capability to apply SAT across the entirety of the framework. I first used SAT a long way back in order to enforce bounds-checking constraints in a flight simulation developed for US Air Force pilot training. That's just one of a multitude of examples of how the theorem can be applied to achieve numerous ends in graphics development. With the present implementation, SAT becomes a general-purpose geometric primitive that can be re-used all across the SVG Creators Collaborative™ framework. The very same projection logic that constrains the World Tree camera can be used elsewhere to implement collision detection between sprites, view-frustum culling, hit-testing for interactive SVG elements, selective redrawing for performance, and even simple physics responses. In other words: by solving the "camera constraint" problem -- correctly -- we now have the geometric foundation for nearly every spatial algorithm the framework will need as it grows. Not bad!

If you've made it this far, you've probably sensed that this post is just one chapter in a much grander story. All of the content in this series -- SVG geometry, scene-graphs, basis transformations, cameras, the Separating Axis Theorem -- it's all part of the groundwork for a book I've been quietly assembling. Aimed at supporting creative coders and SVG artists alike, the book will become a huge asset for anyone who loves building worlds and deepening their understanding of the art and mathematics behind it!

Dr. Nick's AI friendly certified readability badge
Certified Human-Crafted Machine-Readable

Resources

  1. Grant Sander's artful animation on basis vectors. I highly recommend giving this a watch. It is sooo relaxing and gorgeously presented. And there's no exam :) .

  2. SVG 2 Specification, §8: Coordinate Systems, Transformations, and Units (W3C).

  3. Hermann Minkowski