I recently came across one of the more complicated programming challenges I've had in a real-world scenario. There are plenty of programming puzzlers that are hard to solve, but this was an actual problem that I bet some of you have encountered before: how to properly transform a video or camera stream to fit a TextureView of a different size.

While there are code snippets in various samples and Stack Overflow posts, I felt it was about time that I fully grasped how this worked so that I wouldn't have to copy code I didn't fully understand.

In this post I'll try to explain how the transformation on TextureView works in a way that doesn't require full knowledge about the computer graphics theory that is often needed.


Textures 101

In modern computer graphics, a texture is just a bitmap that is mapped onto a plane defined by some points. The plane can be a triangle, rectangle, or any other valid 2D shape. The points can be in 2D space or 3D space, but the actual mapping of the texture to this plane should be considered happening on a flat plane. To make things simpler, we'll stick to rectangular planes for the rest of this story.

When we map a texture to this plane, we say that the top left corner of the texture has the mapping coordinate of 0, 0 and the bottom right is 1, 1. The size of our rectangular plane when displayed on a screen is in pixels, like 1440 x 810, and the resolution of the texture might be something different, like 1600 x 1200.

The texture has an aspect ratio of 4:3 and the view 16:9. Scaling is needed, but how?

This means that if we want to fit the texture inside the rectangle exactly, without cropping, the 0, 0 mapping coordinate corresponds to the top left corner of the texture, and the 1, 1 mapping coordinate corresponds to the bottom right corner. However, since the texture has a different aspect ratio (in this case, 4:3) from the rectangle on the screen (which is 16:9), the texture will look skewed.

Illustrating how the texture will be skewed. Notice the squares are now rectangles.

If we instead mapped the bottom right corner of the texture to the mapping coordinate 0.5, 0.5, the texture would be displayed in the top left quarter of the rectangle and still look skewed. We can also use mapping coordinates outside the range 0 to 1. If the bottom right were mapped to 2, 2, only the top left quarter of the texture would be displayed, filling the entire rectangle. Alternatively, by mapping the top left corner to -1, -1 and the bottom right to 1, 1, we would display the bottom right quarter of the texture. We can also mirror the texture around the vertical axis by mapping the top left to 1, 0 and the bottom right to 0, 1.

Basically, the mapping coordinates tell the GPU which relative pixel of the texture to map to which relative rendered pixel of the rectangle. These are the basics of textures, and we need them to understand how to properly display a camera feed or a video on a TextureView. Getting the scaling of the texture right is the key to displaying a perfect view finder for a camera application.
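The mapping examples above can be checked with a few lines of plain Kotlin. This is only a sketch of the idea, not how the GPU or Android implements it; `CornerMapping` and `texturePositionAt` are made-up names for illustration.

```kotlin
// Hypothetical helper for illustration; not an Android API.
// Describes where the texture's top-left and bottom-right corners are placed,
// expressed in the rectangle's own 0..1 mapping coordinates.
data class CornerMapping(val left: Float, val top: Float, val right: Float, val bottom: Float)

// Returns which relative texture position ends up at the relative rectangle
// position (x, y). A result outside 0..1 means that no part of the texture
// covers that point of the rectangle.
fun texturePositionAt(mapping: CornerMapping, x: Float, y: Float): Pair<Float, Float> {
    val u = (x - mapping.left) / (mapping.right - mapping.left)
    val v = (y - mapping.top) / (mapping.bottom - mapping.top)
    return Pair(u, v)
}

// Bottom right mapped to 2, 2: the rectangle's bottom-right corner shows the
// centre of the texture, so only the texture's top-left quarter is visible.
val stretched = texturePositionAt(CornerMapping(0f, 0f, 2f, 2f), 1f, 1f)

// Top left mapped to -1, -1: the rectangle's top-left corner shows the centre
// of the texture, so only the texture's bottom-right quarter is visible.
val shifted = texturePositionAt(CornerMapping(-1f, -1f, 1f, 1f), 0f, 0f)

// Top left mapped to 1, 0 and bottom right to 0, 1: the rectangle's top-left
// corner shows the texture's top-right corner, which is a horizontal mirror.
val mirrored = texturePositionAt(CornerMapping(1f, 0f, 0f, 1f), 0f, 0f)
```

Here `stretched` and `shifted` both come out as (0.5, 0.5), and `mirrored` as (1.0, 0.0).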

Correctly scaled texture where we choose to crop it to match the entire output view.

Transforming these mapping coordinates is done using a 3x3 transformation matrix. What this means is that every mapping coordinate is transformed by multiplying it with this matrix, giving it the final position.
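To make the matrix multiplication concrete, here is a minimal sketch in plain Kotlin. The `Mat3` class is a made-up stand-in for illustration only; on Android, `android.graphics.Matrix` does this work for you.

```kotlin
// Hypothetical 3x3 matrix for illustration; not part of the Android SDK.
// The nine values are stored row by row.
class Mat3(private val m: FloatArray) {
    init {
        require(m.size == 9) { "A 3x3 matrix needs 9 values" }
    }

    // Multiplies the matrix with the column vector (x, y, 1) and returns
    // the transformed 2D coordinate.
    fun map(x: Float, y: Float): Pair<Float, Float> {
        val outX = m[0] * x + m[1] * y + m[2]
        val outY = m[3] * x + m[4] * y + m[5]
        val w = m[6] * x + m[7] * y + m[8]
        return Pair(outX / w, outY / w)
    }

    companion object {
        // Leaves every coordinate where it is.
        fun identity() = Mat3(floatArrayOf(1f, 0f, 0f, 0f, 1f, 0f, 0f, 0f, 1f))

        // Multiplies x by sx and y by sy.
        fun scale(sx: Float, sy: Float) = Mat3(floatArrayOf(sx, 0f, 0f, 0f, sy, 0f, 0f, 0f, 1f))
    }
}

// The identity matrix leaves the bottom-right mapping coordinate at 1, 1.
val untouched = Mat3.identity().map(1f, 1f)

// Scaling by 0.5 on both axes moves it to 0.5, 0.5.
val halved = Mat3.scale(0.5f, 0.5f).map(1f, 1f)
```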

TextureView

Let's assume our view has an aspect ratio of 16:9 and fills the entire width of the screen in portrait mode. In a ConstraintLayout the XML for our TextureView would be something like this.

<TextureView
    android:id="@+id/viewFinder"
    android:layout_width="0dp"
    android:layout_height="0dp"
    app:layout_constraintDimensionRatio="16:9"
    app:layout_constraintEnd_toEndOf="parent"
    app:layout_constraintStart_toStartOf="parent"
    app:layout_constraintTop_toTopOf="parent" />

On my Pixel 3 XL, the physical size of this view will be 1440 x 810 pixels. The actual size doesn't really matter, but we usually use it to calculate the aspect ratio of the view in runtime.

The TextureView has a method called setTransform() which takes a 3x3 Matrix which is used to transform the mapping coordinates. Initially, the transform matrix for a TextureView is an identity matrix, meaning that no scaling, translation, or rotation happens.  

Modifying the mapping coordinates of a TextureView is done by transforming them using this matrix. For instance, if the matrix is simply a scaling by 0.5 on both the x and y axes, all the mapping coordinates will be multiplied by 0.5 and the result will be a texture covering the top left quadrant of the TextureView. Naturally, the identity matrix, which doesn’t alter the original values, will simply result in a texture covering the entire TextureView without being cropped and leaving no area in the TextureView empty.

The problem you will encounter is that even if the aspect ratio and size of the preview frames from the camera (or the video you're trying to play) and the TextureView are the same, it still won't display correctly. The reason for this is how textures work, as I described above. This is why your camera feed tends to look skewed when the transform matrix for the TextureView is incorrectly calculated (or just left as the default identity matrix).

Scale and crop

In this case, the aspect ratio of the preview frames from the camera is 4:3 and the resolution is 1600 x 1200 pixels. Since the view finder (our TextureView) is wider than it is tall, we will make the preview frames fit horizontally (x axis), meaning that the scaling factor for this axis should be exactly 1.0. Vertically, we need to scale it so it won't be skewed, which basically means we will crop the preview frames along the y axis.

In code, it will look like this.

val scaleX = 1f
val scaleY: Float = TODO("Calculate this!")

val matrix = Matrix()
matrix.setScale(scaleX, scaleY)

textureView.setTransform(matrix)
Find Y!

The challenge here is to calculate the exact value of scaleY. If we were to use 1.0 for the height as well, our texture would get squashed vertically, with some rather odd results. If the value is off by just a few decimals, the user will notice the effect as shown in the video below.

Example of skewed camera view finder on Android

Let's start with a function where we will calculate the correct transform.

fun calculateTransform(viewFinderSize: Size, previewSize: Size): Matrix {
    val matrix = Matrix()
    
    // TODO: Calculate the correct transformation matrix!
    
    return matrix
}
Function for calculating the correct transform matrix

One way to think about this is that the texture should maintain the same aspect ratio, regardless of the aspect ratio of the TextureView. Let's calculate the aspect ratios for both.

val previewAspectRatio = previewSize.width / previewSize.height.toFloat()
val viewFinderRatio = viewFinderSize.width / viewFinderSize.height.toFloat()
Aspect ratios of the texture and the view
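The `.toFloat()` in the lines above matters. Here is a minimal illustration using the 1600 x 1200 preview size from this post; the variable names are just for the example.

```kotlin
// Dividing two Ints truncates the result, so the fraction is lost.
val truncatedRatio = 1600 / 1200

// Converting one operand to Float first keeps the fractional part.
val previewAspectRatio = 1600 / 1200.toFloat()
```

`truncatedRatio` is 1, while `previewAspectRatio` is roughly 1.3333.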

My immediate idea was that since the aspect ratio should remain the same, why not just set the scaling along the y axis to the value of the aspect ratio for the preview?

val scaleX = 1f
val scaleY = previewAspectRatio

matrix.setScale(scaleX, scaleY)
This is wrong.

If you try this, you'll see that it doesn't work and the previews will still be squashed, although a bit less. This is because we also depend on the aspect ratio of the TextureView, not just the preview frames. The correct way to get the scale factor in this scenario would be this:

val scaleX = 1f
val scaleY = previewAspectRatio * viewFinderRatio

matrix.setScale(scaleX, scaleY)
This works, but only for view finders and preview frames that are wider than their height!

This will work fine when the view is wider than it is tall, but what if we have the opposite? Imagine if the aspect ratio of the view finder were 2:3 instead. The size of the TextureView in portrait mode on a Pixel 3 XL would be 1440 x 2160 pixels. In those cases we need to swap the axis along which we scale, and also calculate the aspect ratios as height divided by width.

val scaleFactors = if (viewFinderSize.height <= viewFinderSize.width) {
    val previewRatio = previewSize.width / previewSize.height.toFloat()
    val viewFinderRatio = viewFinderSize.width / viewFinderSize.height.toFloat()
    val scaling = viewFinderRatio * previewRatio
    PointF(1f, scaling)
} else {
    val previewRatio = previewSize.height / previewSize.width.toFloat()
    val viewFinderRatio = viewFinderSize.height / viewFinderSize.width.toFloat()
    val scaling = viewFinderRatio * previewRatio
    PointF(scaling, 1f)
}

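// Scale around the center of the view finder (assumed pivot point)
val centerX = viewFinderSize.width / 2f
val centerY = viewFinderSize.height / 2f
matrix.preScale(scaleFactors.x, scaleFactors.y, centerX, centerY)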
Correct scaling transform for all aspect ratios of the TextureView

There are two differences between the cases. First, we swap the x and y scaling, since we scale by 1.0 along the opposite axis. Secondly, both aspect ratios are inverted: in the second case we divide height by width, for the view finder as well as for the preview.
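To sanity-check the two branches with the sizes used in this post, here is the same logic in plain Kotlin so it can run outside Android. The `Size` data class is a stand-in for `android.util.Size`, and `scaleFactors` is a name chosen for this sketch.

```kotlin
// Stand-in for android.util.Size so the sketch runs without Android.
data class Size(val width: Int, val height: Int)

// Same branching as the snippet above, returning (scaleX, scaleY).
fun scaleFactors(viewFinderSize: Size, previewSize: Size): Pair<Float, Float> =
    if (viewFinderSize.height <= viewFinderSize.width) {
        // View finder is wider than tall: keep x at 1.0, scale y.
        val previewRatio = previewSize.width / previewSize.height.toFloat()
        val viewFinderRatio = viewFinderSize.width / viewFinderSize.height.toFloat()
        Pair(1f, viewFinderRatio * previewRatio)
    } else {
        // View finder is taller than wide: keep y at 1.0, scale x.
        val previewRatio = previewSize.height / previewSize.width.toFloat()
        val viewFinderRatio = viewFinderSize.height / viewFinderSize.width.toFloat()
        Pair(viewFinderRatio * previewRatio, 1f)
    }

val preview = Size(1600, 1200)

// 16:9 view finder: roughly (1.0, 2.37)
val wide = scaleFactors(Size(1440, 810), preview)

// 2:3 view finder: (1.125, 1.0)
val tall = scaleFactors(Size(1440, 2160), preview)
```

Multiplying those factors with the view sizes gives a rendered image of 1440 x 1920 pixels in the first case and 1620 x 2160 pixels in the second, both with the same 3:4 shape.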

The result of a correct transformation of camera preview frames.

The code above works great as long as the preview frames are wider than they are tall, and the device is in portrait mode. If either of those parameters changes, you will need to introduce another case to cover that as well.

Rotation

One variable that I haven't covered is rotation. There are two kinds of rotation that can happen in Android with regard to camera applications: rotation of preview frames and rotation of the device. Fortunately, it turns out you only have to deal with rotation of the device when working with the camera on Android. First, you need a function that gives you the rotation of the device display in degrees.

fun getDisplaySurfaceRotation(display: Display): Int {
    return when (display.rotation) {
        Surface.ROTATION_0 -> 0
        Surface.ROTATION_90 -> 90
        Surface.ROTATION_180 -> 180
        Surface.ROTATION_270 -> 270
        else -> throw IllegalArgumentException("Invalid rotation!")
    }
}
Code for getting rotation in degrees of the device display

Now all you need to do is to compensate for it by appending a rotation to the transformation matrix that is equal to the display rotation with the opposite sign.

matrix.postRotate(-viewFinderRotation.toFloat(), centerX, centerY)
Appending a compensating rotation to the view finder based on display rotation

If your Activity displaying the camera view finder will always be in portrait mode, this means rotating by 0 degrees, so in many cases you might be able to skip this step. However, with all the variations of devices and screen combinations (Hello, foldable screens!), I recommend still keeping this around.
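To see what the pivot arguments in the rotation call are for, here is a minimal sketch of a rotation around a pivot point in plain Kotlin. It uses the standard 2D rotation formula; `rotateAround` is a hypothetical helper, and the sketch makes no claim about how Android implements `postRotate()` internally.

```kotlin
// Hypothetical helper: rotates the point (x, y) by `degrees` around the pivot
// (pivotX, pivotY) using the standard 2D rotation formula.
fun rotateAround(x: Float, y: Float, degrees: Float, pivotX: Float, pivotY: Float): Pair<Float, Float> {
    val radians = Math.toRadians(degrees.toDouble())
    val dx = x - pivotX
    val dy = y - pivotY
    val rotatedX = pivotX + dx * kotlin.math.cos(radians) - dy * kotlin.math.sin(radians)
    val rotatedY = pivotY + dx * kotlin.math.sin(radians) + dy * kotlin.math.cos(radians)
    return Pair(rotatedX.toFloat(), rotatedY.toFloat())
}

// Center of a 1440 x 810 view finder.
val pivotX = 1440 / 2f
val pivotY = 810 / 2f

// Compensating for a display rotation of 90 degrees means rotating by -90.
// The pivot itself does not move.
val rotatedCenter = rotateAround(pivotX, pivotY, -90f, pivotX, pivotY)

// Every other point travels around the pivot, here the middle of the right edge.
val rotatedEdge = rotateAround(1440f, pivotY, -90f, pivotX, pivotY)
```

Rotating around the center keeps the view finder content in place, whereas rotating around the origin (0, 0) would swing it out of the view.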

The math

You might be wondering why we need to multiply the two aspect ratios to get the correct scaling. I was wondering the same thing at first and I wasn't able to find a good and simple explanation for this. The best explanation I got was from my friend Ryan Harter who described it as "It's just mapping the two aspect ratios". While that might be enough for those who have worked with computer graphics before, I think it still might be a bit abstract for novices to this area.

The first step to understanding this is to recognise that a scaling of 1.0 along each axis doesn't mean that the texture isn't scaled at all when rendered. The nature of how textures get displayed on a TextureView is that there is an implicit scaling happening.

Let’s consider the simplest case. We have a TextureView and a texture, both with the same aspect ratio of 1:1. The actual size in pixels doesn't matter and we don't need to do any further scaling since the implicit scaling matches perfectly. But what if the texture had a different aspect ratio, say 4:3? The implicit scaling would now be wrong for the y axis and the result would look stretched. How much do we need to scale along the y axis to compensate?

4 divided by 3 is 1.33333, so what if we use that for the scaling along the y axis? It turns out that this works perfectly and we no longer have a stretched result. But how about when the aspect ratio of the TextureView isn't 1:1? What happens if it is 16:9?

First, let's assume that our texture has an aspect ratio of 1:1 again. If we use the same approach as earlier and just scale by the aspect ratio of the texture, the result will be a squashed texture in our 16:9 TextureView. If we instead use the aspect ratio of the TextureView (16 divided by 9 is 1.777777), you’ll notice that the scaling is now correct.

By now you have probably figured out why that is. We need to consider both aspect ratios when compensating for the implicit scaling that happens on a TextureView. If the texture has an aspect ratio of 4:3 and the TextureView 16:9, multiplying the two ratios (1.3333 x 1.7778) gives 2.3703703704. That's the scaling factor we need in our case, so the generic formula would be:

scalingFactor = (textureSize.x / textureSize.y) * (viewSize.x / viewSize.y)

This still assumes that your scaling factor along the other axis is 1.0.
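The three cases walked through in this section can be reproduced with a direct translation of the formula. This is a minimal sketch; the function and parameter names are only illustrative.

```kotlin
// Direct translation of the formula above. Only the ratios matter, so the
// arguments can be pixel sizes or plain aspect ratio numbers such as 4 and 3.
fun scalingFactor(textureWidth: Int, textureHeight: Int, viewWidth: Int, viewHeight: Int): Float {
    val textureRatio = textureWidth / textureHeight.toFloat()
    val viewRatio = viewWidth / viewHeight.toFloat()
    return textureRatio * viewRatio
}

// 4:3 texture on a 1:1 view: about 1.3333
val wideTextureSquareView = scalingFactor(4, 3, 1, 1)

// 1:1 texture on a 16:9 view: about 1.7778
val squareTextureWideView = scalingFactor(1, 1, 16, 9)

// 4:3 texture on a 16:9 view: about 2.3704
val wideTextureWideView = scalingFactor(4, 3, 16, 9)
```

Passing the pixel sizes from earlier, `scalingFactor(1600, 1200, 1440, 810)`, gives the same result as the last case.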

Conclusions

This gave me a lot of headaches before I managed to wrap my head around it. Once you do, you'll see that making a completely generic solution is far from simple and requires covering several different cases.

My suggestion when doing this is to limit your application to as few cases as possible. If you're implementing a view finder for the camera, it helps a lot if you can lock the orientation of the Activity so that you don't have to consider device rotation as well.

The transformation matrix is useful besides scaling the texture correctly. If you're doing image analysis like text recognition or QR/barcode scanning, chances are that you might want to display an overlay on the detected text or barcode. To do that properly, you need this transformation matrix when drawing using the Canvas. Drawing overlays on a TextureView is outside the scope of this article, but it may be a topic I'll return to later.

The great thing about getting the transformation of your TextureView correct is that it allows you to use some unorthodox view finder sizes that are better suited to the use case of your application. For instance, if the user is supposed to scan a single line of text, it might be useful to provide a view finder with a size that makes it super clear how to focus on that.

A correctly transformed view finder with a "weird" aspect ratio

An informed reader might note that there are other ways to calculate the transformations required. If you need to fit the entire preview inside the TextureView, you need a slightly different solution. The reason I chose this solution over others is that it is easier to explain and reason about.
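As a rough idea of what a fit-inside variant could look like, here is a minimal sketch in plain Kotlin. It assumes the same portrait setup as the rest of this post and has not been verified on a device; `Size` is a stand-in for `android.util.Size` and `fitInsideScaleFactors` is a hypothetical name. Instead of scaling one axis up so the image is cropped, it scales the other axis down so the whole image stays visible.

```kotlin
// Stand-in for android.util.Size so the sketch runs without Android.
data class Size(val width: Int, val height: Int)

// Returns (scaleX, scaleY) so that the whole preview stays inside the view.
// Assumption: the product of the two aspect ratios is the ratio between the
// y and x scale that keeps the image undistorted, as in the crop case.
fun fitInsideScaleFactors(viewFinderSize: Size, previewSize: Size): Pair<Float, Float> {
    val previewRatio = previewSize.width / previewSize.height.toFloat()
    val viewFinderRatio = viewFinderSize.width / viewFinderSize.height.toFloat()
    val product = previewRatio * viewFinderRatio
    return if (product >= 1f) {
        // Cropping would scale y up by `product`; shrink x instead.
        Pair(1f / product, 1f)
    } else {
        // Cropping would scale x up by 1 / product; shrink y instead.
        Pair(1f, product)
    }
}

val preview = Size(1600, 1200)

// 16:9 view finder: roughly (0.42, 1.0)
val fitWide = fitInsideScaleFactors(Size(1440, 810), preview)

// 2:3 view finder: roughly (1.0, 0.89)
val fitTall = fitInsideScaleFactors(Size(1440, 2160), preview)
```

Neither factor exceeds 1.0, so part of the TextureView is left empty, which is the trade-off for not cropping.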

I hope this post will be helpful for those of you implementing your own view finder for the camera using a TextureView (or rendering a video), where you have problems with the video getting squashed or stretched.

Many thanks to Sebastiano who helped me with proof-reading this post.