Transforming Mouse Coordinates to Canvas Coordinates

I wanted to add intuitive zooming and panning to an image editing web app I was working on, but this turned out to be a bit harder than I expected. The examples and info I found online followed bad practices or relied on outdated features. They also all required you to track your own copy of the transformations applied to the canvas, but now with getTransform we can make use of the fact that the canvas already stores all of its transforms. So I set out to create a nice modern version that worked, that I could understand, and that made use of the canvas’s built-in transformation tracking. This article goes into why and how this works, but you don’t really need to know that to accomplish this. If you just want a fast answer to “how do I transform a point to the transformed canvas space”, click here to go right to the secret sauce. You can also see a fully working JSFiddle here.

Check out the example below. Clicking and dragging in the canvas will move the image around, and scrolling your mouse wheel while over the canvas will zoom in on or out of the image at the exact position of your cursor. The upper left corner of the image will always be the origin 0,0.


[Interactive example: a canvas with a readout of the original and transformed cursor coordinates]

Translating the Canvas

We’ll start with the simple part, translating. Calling translate on a canvas moves the canvas and its origin to a new point on a grid (MDN).

The origin of the canvas starts out in the upper left corner, making it 0,0 in canvas space by default, as shown in the graph above. The lowercase x and y represent a translate distance in the x and y direction. The red lines and red zero mark the new origin of the canvas, meaning the red lines are the new X and Y zero, with the red zero being 0,0. So if we translate the canvas 20 pixels in the x direction and 20 pixels in the y direction, the new origin starts 20 pixels down and 20 pixels over from the original origin.

That means that as far as the canvas itself is concerned, the coordinate system now looks like this:

Here the red zero is the new origin and if you wanted to draw back to the upper left corner of the canvas, you’d need to draw at -20, -20.

So we have a new origin for the canvas, but while the canvas knows this, the browser does not. This means that if you add a mousemove event listener to the canvas and get back the coordinates, the upper left corner will still be 0,0. All the event cares about is where the element is on the page and where your mouse is within the element. It does not care that the element happens to be a canvas, or that the canvas happens to be translated by 20px in both directions.

At this point, the solution for transforming the mouse cursor into canvas space is simple: subtract the translation amount from the mouse cursor position. As long as we know the amount that was translated in both directions, we can subtract that from the mouse cursor position and we’ll get the correct coordinates. In this example, after translating the canvas 20px down and to the right, we need to translate the mouse cursor 20px up and to the left to get the proper coordinates. That way mouse position 0,0 becomes -20,-20.
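As a quick illustration of that subtraction, here is a minimal sketch that tracks the translation by hand. The translation object and the toCanvasSpace helper are hypothetical names used only for this example.

```javascript
// Hypothetical, manually tracked translation (20px right, 20px down).
const translation = { x: 20, y: 20 };

// Undo the translation by subtracting it from the mouse position.
function toCanvasSpace(mouseX, mouseY) {
  return { x: mouseX - translation.x, y: mouseY - translation.y };
}

console.log(toCanvasSpace(0, 0));   // { x: -20, y: -20 }, the element's upper left corner
console.log(toCanvasSpace(20, 20)); // { x: 0, y: 0 }, the translated origin
```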

While we could track the translate amount ourselves, the canvas already knows how much it’s translated. We can get that by calling getTransform(), which returns an object with the following properties:

Property   Meaning
a          Horizontal scaling
b          Vertical skewing
c          Horizontal skewing
d          Vertical scaling
e          Horizontal translation
f          Vertical translation

While that is only part of the full matrix that is returned, this is everything we need. We can write a simple function to get the transformed mouse coordinates.

const canvas = document.getElementById('canvas');
const context = canvas.getContext('2d');

function getTransformedPoint(x, y) {
  const transform = context.getTransform(); 
  const transformedX = x - transform.e; 
  const transformedY = y - transform.f;
  return { x: transformedX, y: transformedY };
}

// We can use our function with a canvas event
canvas.addEventListener('mousemove', event => {
  const transformedCursorPosition = getTransformedPoint(event.offsetX, event.offsetY);
  console.log(transformedCursorPosition);
}); 

We get the canvas translation from the DOMMatrix that getTransform() returns. Per the docs above, property e is the horizontal translation and f is the vertical translation. So we simply take those and subtract them from the mouse position, and our mouse cursor position has been transformed to the same coordinate system as the canvas. If all you’re doing is translating and not scaling or skewing, this is all you need.

Scaling the Canvas

Things get more complicated when we scale the canvas, but the concept is the same. By default, one unit in a canvas is one pixel. Scaling changes that unit. Scaling to 2.0 means one unit is 2 pixels, so when an image is drawn, it’s twice as large as normal. Again, as far as the mouse cursor position is concerned, the unit is always one pixel. It doesn’t know about your canvas scale. Also as before, we need to apply the opposite of the transformation we applied to the canvas. With translating we translate the mouse cursor in the opposite direction, and with scaling we need to scale it by the amount it would take to undo the scale.

If you draw an image to the canvas at a scale of 1.0 every pixel you move your mouse will move to a new pixel as drawn in the canvas. But if you draw an image to the canvas at a scale of 10.0 every pixel you move your mouse actually only covers 1/10 of a pixel as drawn to the canvas, so moving one pixel down and to the right would have an in-canvas coordinate of 0.1,0.1. Similarly if you draw to the canvas at a scale of 0.1 every pixel you move your mouse actually covers 10 pixels as drawn in the canvas.

So for whatever scale we apply to the canvas, we need to transform our mouse cursor by the “opposite” scale, which is whatever amount it would take to put the scale back to 1.0 when multiplied by the scale. So we have the formula scale * x = 1.0, or solved, x = 1 / scale. Simply take whatever scale amount we have and divide one by it, and we know how to undo our scale.
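To make the numbers from the previous paragraphs concrete, here is a small sketch of the inverse scale on its own. The toCanvasUnits helper is a hypothetical name, and it assumes a uniform scale with no translation.

```javascript
// Hypothetical helper: converts a mouse distance in pixels into canvas units
// for a given scale, using x = 1 / scale.
function toCanvasUnits(pixels, scale) {
  const invertedScale = 1 / scale;
  return pixels * invertedScale;
}

console.log(toCanvasUnits(1, 10));  // 0.1 (one mouse pixel covers a tenth of a canvas unit)
console.log(toCanvasUnits(1, 0.1)); // 10  (one mouse pixel covers ten canvas units)
console.log(toCanvasUnits(1, 1));   // 1   (unscaled)
```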

With that knowledge, we can actually find the position of our mouse cursor in scaled canvas coordinates by doing:

function getTransformedPoint(x, y) {
  const transform = context.getTransform();
  const invertedScaleX = 1 / transform.a;
  const invertedScaleY = 1 / transform.d;

  const transformedX = x * invertedScaleX;
  const transformedY = y * invertedScaleY;

  return { x: transformedX , y: transformedY };
}

Translating and Scaling Simultaneously

Each of those functions works on its own with a canvas that is only translated or only scaled, but if we want to scale and translate, we have to go one extra step. When the canvas is scaled, the amount it’s been translated also scales. So we actually need to apply our inverted scale to the translation as well.

function getTransformedPoint(x, y) {
  const transform = context.getTransform();
  const invertedScaleX = 1 / transform.a;
  const invertedScaleY = 1 / transform.d;

  const transformedX = invertedScaleX * x - invertedScaleX * transform.e;
  const transformedY = invertedScaleY * y - invertedScaleY * transform.f;

  return { x: transformedX, y: transformedY };
}

All I’ve done here is combine the two previous functions. I get the inverse-scaled mouse position first, then subtract the translation. The difference here is that I also apply the inverse scale to the x and y translations to account for the fact that the translations themselves are also scaled.
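One way to convince yourself that this is right is a round trip: push a canvas point forward to screen pixels, then bring it back with the combined formula. This is only a sketch. The transform values are made up to stand in for what getTransform() might report, and canvasToScreen and screenToCanvas are hypothetical helper names.

```javascript
// Made-up values standing in for getTransform(): scale of 2, translation of 30,40.
const transform = { a: 2, d: 2, e: 30, f: 40 };

// Forward direction: where a canvas point ends up on screen (no skew).
function canvasToScreen(x, y) {
  return { x: transform.a * x + transform.e, y: transform.d * y + transform.f };
}

// Inverse direction: the same math as the combined function above.
function screenToCanvas(x, y) {
  const invertedScaleX = 1 / transform.a;
  const invertedScaleY = 1 / transform.d;
  return {
    x: invertedScaleX * x - invertedScaleX * transform.e,
    y: invertedScaleY * y - invertedScaleY * transform.f,
  };
}

const onScreen = canvasToScreen(50, 100);                    // { x: 130, y: 240 }
const backInCanvas = screenToCanvas(onScreen.x, onScreen.y); // { x: 50, y: 100 }
console.log(onScreen, backInCanvas);
```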

Skewing and the Built-In Matrix Inversion Method

Once we get to skewing, trying to invert everything ourselves and then apply it to a point starts to get out of hand. Luckily, the DOMMatrix that is returned to us already has a built-in method that inverts it. getTransform() hands back a copy of the current transform, so we can call invertSelf() on it without touching the canvas, and the result holds the inverse of the whole transform. That means that not only are the horizontal and vertical translations negated, they’ve also already been adjusted by the inverse zoom, which we also get automatically. So now we don’t have to find the inverted scale ourselves, and we don’t subtract the translations, we add them since they’re already inverted. We also add in the skew terms of the inverted matrix, each multiplied by the other coordinate, and we end up with a fully transformed point.

function getTransformedPoint(x, y) {
  const inverseTransform = context.getTransform().invertSelf();
  const transformedX = inverseTransform.a * x + inverseTransform.c * y + inverseTransform.e;
  const transformedY = inverseTransform.b * x + inverseTransform.d * y + inverseTransform.f;
 
  return { x: transformedX, y: transformedY };
}

This is, not coincidentally, the formula for 2D affine transformation:

x2 = a*x1 + c*y1 + e
y2 = b*x1 + d*y1 + f
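If you’re curious what the inversion involves, here is a rough sketch of inverting that formula by hand with plain objects. It illustrates the math only and is not a claim about how browsers implement invertSelf(). The invertAffine and applyAffine helpers are hypothetical names, and throwing on a zero determinant is my own choice for the example.

```javascript
// Hypothetical helper: inverts a 2D affine matrix given as { a, b, c, d, e, f }.
function invertAffine(m) {
  const det = m.a * m.d - m.b * m.c;
  if (det === 0) {
    throw new Error('Matrix is not invertible');
  }
  return {
    a: m.d / det,
    b: -m.b / det,
    c: -m.c / det,
    d: m.a / det,
    e: (m.c * m.f - m.d * m.e) / det,
    f: (m.b * m.e - m.a * m.f) / det,
  };
}

// Applies x2 = a*x1 + c*y1 + e and y2 = b*x1 + d*y1 + f.
function applyAffine(m, x, y) {
  return { x: m.a * x + m.c * y + m.e, y: m.b * x + m.d * y + m.f };
}

// Made-up matrix with scale, horizontal skew, and translation.
const matrix = { a: 2, b: 0, c: 1, d: 2, e: 30, f: 40 };
const screenPoint = applyAffine(matrix, 10, 20); // { x: 70, y: 80 }
const canvasPoint = applyAffine(invertAffine(matrix), screenPoint.x, screenPoint.y);
console.log(screenPoint, canvasPoint); // canvasPoint is back at 10,20
```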

Built-In Point Transformation

The reality is, DOMMatrix can do this for you. While it’s certainly good to know how this works, all you really need is to turn your position into a DOMPoint and have DOMMatrix transform it for you:

function getTransformedPoint(x, y) {
  const originalPoint = new DOMPoint(x, y);
  return context.getTransform().invertSelf().transformPoint(originalPoint);
}

Wrapping Up and Zooming to Cursor

With this in our tool belt, we can freely interact with items in the canvas regardless of the scale, translation, or skew. To wrap up what I set out to do, we need to zoom into the image with the mouse wheel at the point that our mouse cursor is currently over. Scaling scales out from the current origin, so we need to move the origin to the in-canvas position of the mouse cursor and zoom in there. However, since we don’t want to actually translate everything over, we need to move the origin back to wherever it was before we did the zoom.

function onWheel(event) {
  const currentTransformedCursor = getTransformedPoint(event.offsetX, event.offsetY);

  const zoom = event.deltaY < 0 ? 1.1 : 0.9;
  
  context.translate(currentTransformedCursor.x, currentTransformedCursor.y);
  context.scale(zoom, zoom);
  context.translate(-currentTransformedCursor.x, -currentTransformedCursor.y);
  
  // Redraws the image after the scaling    
  drawImageToCanvas();

  // Stops the whole page from scrolling
  event.preventDefault();
}

I’m doing this in a mouse wheel event handler and normalizing the mouse wheel value to either 1.1 or 0.9 to zoom in or out.
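To check that the translate, scale, translate sequence really keeps the point under the cursor in place, here is a sketch that replays it with plain-object matrices instead of a real canvas. All helper names (apply, multiply, translationMatrix, scaleMatrix, zoomAt) and the starting values are hypothetical. The sketch assumes each canvas call multiplies the current transform on the right, which is how translate and scale compose.

```javascript
// Applies an affine matrix { a, b, c, d, e, f } to a point.
const apply = (m, p) => ({ x: m.a * p.x + m.c * p.y + m.e, y: m.b * p.x + m.d * p.y + m.f });

// Returns m * n, mirroring how canvas calls append to the current transform.
const multiply = (m, n) => ({
  a: m.a * n.a + m.c * n.b,
  b: m.b * n.a + m.d * n.b,
  c: m.a * n.c + m.c * n.d,
  d: m.b * n.c + m.d * n.d,
  e: m.a * n.e + m.c * n.f + m.e,
  f: m.b * n.e + m.d * n.f + m.f,
});

const translationMatrix = (x, y) => ({ a: 1, b: 0, c: 0, d: 1, e: x, f: y });
const scaleMatrix = (s) => ({ a: s, b: 0, c: 0, d: s, e: 0, f: 0 });

// Same order as onWheel: translate to the cursor, scale, translate back.
function zoomAt(current, cursor, zoom) {
  let next = multiply(current, translationMatrix(cursor.x, cursor.y));
  next = multiply(next, scaleMatrix(zoom));
  return multiply(next, translationMatrix(-cursor.x, -cursor.y));
}

const before = { a: 2, b: 0, c: 0, d: 2, e: 30, f: 40 };
const cursor = { x: 50, y: 100 }; // cursor position in canvas space
const after = zoomAt(before, cursor, 1.1);

console.log(apply(before, cursor)); // screen position before zooming
console.log(apply(after, cursor));  // same screen position, up to rounding
```

The scale grows from 2 to about 2.2, but the canvas point under the cursor still lands on the same screen pixel, which is what makes the zoom feel anchored to the mouse.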

And that’s it. You can see the full code in JSFiddle here, or check out the code on GitHub here.

If you have questions, comments, or suggestions on how to make this better, please leave a comment below!