I recently came across something that proved to be one of the more complicated programming challenges I've had in a real-world scenario. There are plenty of tough programming puzzles that are hard to solve, but this was an actual problem that I bet some of you have encountered before: how to properly transform a video or camera stream onto a TextureView of a different size.
While there are code snippets to be found in various samples and StackOverflow posts, I felt it was about time that I fully grasped how this worked so that I wouldn't have to copy code I didn't fully understand.
In this post I'll try to explain how the transformation on
TextureView works in a way that doesn't require full knowledge about the computer graphics theory that is often needed.
In modern computer graphics, a texture is just a bitmap that is mapped onto a plane defined by a set of points. The plane can be a triangle, rectangle, or any other valid 2D shape. The points can be in 2D or 3D space, but the actual mapping of the texture onto the plane can be thought of as happening on a flat surface. To make things simpler, we'll stick to rectangular planes for the rest of this story.
When we map a texture to this plane, we say that the top left corner of the texture has the mapping coordinate of
0, 0 and the bottom right is
1, 1. The size of our rectangular plane when displayed on a screen is in pixels, like
1440 x 810, and the resolution of the texture might be something different, like
1600 x 1200.
This means that if we want to fit the texture inside the rectangle exactly, without cropping, the
0, 0 mapping coordinate corresponds to the top left corner of the texture, and the
1, 1 mapping coordinate corresponds to the bottom right corner. However, since the texture has a different aspect ratio (in this case,
4:3) from the rectangle on the screen (which is
16:9), the texture will look skewed.
If we instead mapped the bottom right corner of the texture to the mapping coordinate
0.5, 0.5, the texture would be displayed in the top left quarter of the rectangle and still look skewed. We can also use mapping coordinates outside the
0 to 1 range. If the bottom right were mapped to
2, 2, we would only display the top left quarter of the texture across the entire rectangle. Alternatively, by mapping the top left corner to
-1, -1 and the bottom right to
1, 1, we would display the bottom right quarter of the texture. We can also flip the texture along the vertical axis by mapping the top left to
1, 0 and the bottom right to
0, 1.
Basically, the mapping coordinates tell the GPU which relative pixel of the texture to map to the relative rendered pixel of the rectangle. This is the basics of textures, and we need this to understand how to properly display a camera feed or a video on a
TextureView. Getting the scaling of the texture correctly is the key to displaying a perfect view finder for a camera application.
Transforming these mapping coordinates is done using a
3x3 transformation matrix. What this means is that every mapping coordinate is transformed by multiplying it with this matrix, giving its final position.
Let's assume our view has an aspect ratio of
16:9 and fills the entire width of the screen in portrait mode. In a
ConstraintLayout the XML for our
TextureView would be something like this.
<TextureView
    android:id="@+id/viewFinder"
    android:layout_width="0dp"
    android:layout_height="0dp"
    app:layout_constraintDimensionRatio="16:9"
    app:layout_constraintEnd_toEndOf="parent"
    app:layout_constraintStart_toStartOf="parent"
    app:layout_constraintTop_toTopOf="parent" />
On my Pixel 3 XL, the physical size of this view will be
1440 x 810 pixels. The actual size doesn't really matter, but we usually use it to calculate the aspect ratio of the view at runtime.
TextureView has a method called
setTransform() which takes a
Matrix which is used to transform the mapping coordinates. Initially, the transform matrix for a
TextureView is an identity matrix, meaning that no scaling, translation, or rotation happens.
Modifying the mapping coordinates of a
TextureView is done by transforming them using this matrix. For instance, if the matrix is simply a scaling by
0.5 on both x and y axis, all the mapping coordinates will be multiplied by
0.5 and the result will be a texture covering the top left quadrant of the
TextureView. Naturally, the identity matrix, which doesn’t alter the original values, will simply result in a texture covering the entire
TextureView without being cropped and leaving no area in the TextureView empty.
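To make the scaling example above concrete, here is a minimal sketch. The function name is mine, and `viewFinder` is assumed to be the TextureView from the layout shown earlier.

```kotlin
import android.graphics.Matrix
import android.view.TextureView

// A sketch: scaling the transform by 0.5 on both axes leaves the texture
// covering only the top left quadrant of the view, as described above.
fun scaleToTopLeftQuadrant(viewFinder: TextureView) {
    val matrix = Matrix().apply { setScale(0.5f, 0.5f) }
    viewFinder.setTransform(matrix)
}
```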
The problem you will encounter is that even if the aspect ratio and size of the preview frames from the camera (or the video you're trying to play) and the
TextureView are the same, you'll notice that it still won't display correctly. The reason for this is how textures work, as I described above. This is why your camera feed tends to look skewed when your transform matrix for the
TextureView is incorrectly calculated (or just left to the default identity matrix).
Scale and crop
In this case, the aspect ratio of the preview frames from the camera is
4:3 and the resolution
1600 x 1200 pixels. Since the width of the view finder (our
TextureView) is larger than the height, we will make the preview frames fit horizontally (the x axis), meaning that the scaling factor for this axis should be exactly
1.0. Vertically, we need to scale it so it won't be skewed, which basically means we will crop the preview frames along the y axis.
In code, it will look like this.
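A sketch of that scale-and-crop transform could look like the following; the names are mine, `viewFinder` is assumed to be our TextureView, and the value of `scaleY` is still to be calculated.

```kotlin
import android.graphics.Matrix
import android.view.TextureView

// A sketch: fit the preview frames horizontally (scale 1.0 along x) and
// crop them vertically by scaling along y.
fun applyScaleAndCrop(viewFinder: TextureView, scaleY: Float) {
    val matrix = Matrix().apply { setScale(1f, scaleY) }
    viewFinder.setTransform(matrix)
}
```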
The challenge here is to calculate the exact value of
scaleY. If we were to use
1.0 for the height as well, our texture would get squashed vertically, with some rather odd results. If the value is off by just a few decimals, the user will notice the effect as shown in the video below.
Let's start with a function where we will calculate the correct transform.
One way to think about this is that the texture should maintain the same aspect ratio, regardless of the aspect ratio of the
TextureView. Let's calculate the aspect ratios for both.
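A sketch of such a function, with hypothetical names: `previewSize` is the resolution of the preview frames (e.g. 1600 x 1200) and `viewSize` is the size of the TextureView in pixels (e.g. 1440 x 810).

```kotlin
import android.graphics.Matrix
import android.util.Size

// A sketch: start by calculating the aspect ratios of the preview frames
// and of the view finder. The actual scaling is filled in further down.
fun calculateTransform(previewSize: Size, viewSize: Size): Matrix {
    val previewAspectRatio = previewSize.width.toFloat() / previewSize.height
    val viewAspectRatio = viewSize.width.toFloat() / viewSize.height
    val matrix = Matrix()
    // The scaling derived in the rest of this post goes here.
    return matrix
}
```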
My immediate idea was that since the aspect ratio should remain the same, why not just set the scaling along the
y axis to the value of the aspect ratio for the preview?
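In code, that naive attempt would be something like this (plain arithmetic, names are mine):

```kotlin
// The naive attempt: use the aspect ratio of the preview frames alone as
// the y scale. With 1600 x 1200 preview frames this gives roughly 1.3333.
val previewAspectRatio = 1600f / 1200f
val scaleY = previewAspectRatio
```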
If you try this, you'll see that it doesn't work and the previews will still be squashed, although a bit less. This is because we also depend on the aspect ratio of the
TextureView, not just the preview frames. The correct way to get the scale factor in this scenario would be this:
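As a sketch (the function name is mine), with the preview frames at 1600 x 1200 and the view at 1440 x 810:

```kotlin
// Multiply the aspect ratio of the preview frames by the aspect ratio of
// the view finder to get the y scale factor.
fun cropScaleY(
    previewWidth: Float, previewHeight: Float,
    viewWidth: Float, viewHeight: Float
): Float = (previewWidth / previewHeight) * (viewWidth / viewHeight)

// cropScaleY(1600f, 1200f, 1440f, 810f) gives roughly 2.37
```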
This will work fine when the view is wider than it is tall, but what if we have the opposite? Imagine if the aspect ratio of the view finder were
2:3 instead. The size of the
TextureView in portrait mode on a Pixel 3 XL would be
1440 x 2160 pixels. In those cases we need to flip the axis along which we scale, but also calculate the scale factor differently.
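A sketch covering both cases might look like this. The names are hypothetical, and it still assumes landscape preview frames (wider than tall) shown on a device in portrait mode.

```kotlin
import android.graphics.Matrix

// A sketch covering both view finder shapes; all sizes are in pixels.
fun calculateScaleMatrix(
    previewWidth: Float, previewHeight: Float,
    viewWidth: Float, viewHeight: Float
): Matrix {
    val previewAspectRatio = previewWidth / previewHeight
    val matrix = Matrix()
    if (viewWidth > viewHeight) {
        // Wide view finder (e.g. 16:9): fit along x, scale and crop along y.
        matrix.setScale(1f, previewAspectRatio * (viewWidth / viewHeight))
    } else {
        // Tall view finder (e.g. 2:3): fit along y, scale and crop along x.
        matrix.setScale((viewHeight / viewWidth) / previewAspectRatio, 1f)
    }
    return matrix
}
```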
There are two differences between the cases. First, we flip the axis that gets the calculated scaling, since we will scale by
1.0 along the opposite axis. Secondly, the aspect ratio of the view finder (but not the texture!) is different.
The code above works great as long as the preview frames are wider than they are tall and the device is in portrait mode. If either of those parameters changes, you will need to introduce another case to cover that as well.
One variable that I haven't covered is rotation. There are two kinds of rotation that can happen in Android with regards to camera applications: rotation of preview frames and rotation of the device. Fortunately, it turns out you only have to deal with rotation of the device when working with the camera on Android. First, you need a function that gives you the rotation of the device display in degrees.
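A sketch of such a function (`Surface.ROTATION_*` and `View.getDisplay()` are real Android APIs; the helper name is mine):

```kotlin
import android.view.Surface
import android.view.View

// Translate the display's Surface.ROTATION_* constant into degrees.
// Assumes the view is attached to a window so that `display` is available.
fun displayRotationDegrees(view: View): Int =
    when (view.display.rotation) {
        Surface.ROTATION_90 -> 90
        Surface.ROTATION_180 -> 180
        Surface.ROTATION_270 -> 270
        else -> 0
    }
```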
Now all you need to do is to compensate for it by appending a rotation to the transformation matrix that is equal to the display rotation with the opposite sign.
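As a sketch (names are mine), rotating around the center of the view so that the texture stays centered:

```kotlin
import android.graphics.Matrix

// Append a rotation that cancels the display rotation. The pivot point is
// the center of the view; sizes are in pixels.
fun compensateDisplayRotation(
    matrix: Matrix, displayRotationDegrees: Int,
    viewWidth: Float, viewHeight: Float
) {
    matrix.postRotate(-displayRotationDegrees.toFloat(), viewWidth / 2f, viewHeight / 2f)
}
```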
If the Activity displaying the camera view finder is locked to portrait mode, the display rotation will always be
0 degrees, so in many cases you might be able to skip this step. However, with all the variations of devices and screen combinations (Hello, foldable screens!), I recommend still keeping this around.
You might be wondering why we need to multiply the two aspect ratios to get the correct scaling. I was wondering the same thing at first and I wasn't able to find a good and simple explanation for this. The best explanation I got was from my friend Ryan Harter who described it as "It's just mapping the two aspect ratios". While that might be enough for those who have worked with computer graphics before, I think it still might be a bit abstract for novices to this area.
The first step to understanding this is to recognise that a scaling of
1.0 along each axis doesn't mean that the texture isn't scaled at all when rendered. The nature of how textures get displayed on a
TextureView is that there is an implicit scaling happening.
Let's consider the simplest case. We have a
TextureView and a texture, both with the same aspect ratio of
1:1. The actual size in pixels doesn't matter and we don't need to do any further scaling since the implicit scaling matches perfectly. But what if the texture had a different aspect ratio, say
4:3? The implicit scaling would now be wrong for the y axis and the result would look stretched. How much do we need to scale along the y axis to compensate? The aspect ratio of the texture is
4 divided by
3, roughly
1.33333, so what if we use that for the scaling along the y axis? It turns out that this works perfectly and we no longer have a stretched result. But how about when the aspect ratio of the
TextureView isn't
1:1? What happens if it is
16:9?
First, let's assume that our texture has an aspect ratio of
1:1 again. If we use the same approach as earlier and just scale by the aspect ratio of the texture, the result will be a squashed texture in our
TextureView. If we instead use the aspect ratio of the
TextureView (
16 divided by
9, roughly
1.777777), you'll notice that the scaling is now correct.
By now you have probably figured out why that is. We need to consider both aspect ratios when compensating for the implicit scaling that happens on a
TextureView. If the texture has an aspect ratio of
4:3 and the TextureView is
16:9, the result is
2.3703703704. That's the scaling factor we need in our case, so the generic formula would be:
scalingFactor = (textureSize.x / textureSize.y) * (viewSize.x / viewSize.y)
This still assumes that your scaling factor along the other axis is
1.0.
This gave me a lot of headaches before I managed to wrap my head around it. Once you do, you'll see that making a completely generic solution is far from simple and requires covering several different cases.
My suggestion when doing this is to limit your application to as few cases as possible. If you're implementing a view finder for the camera, it helps a lot if you can lock the orientation of the
Activity so that you don't have to consider device rotation as well.
The transformation matrix is useful for more than scaling the texture correctly. If you're doing image analysis like text recognition or QR/barcode scanning, chances are that you might want to display an overlay on the detected text or barcode. To do that properly, you need this transformation matrix when drawing using the
Canvas. Drawing overlays on a
TextureView is outside the scope of this article, but it may be a topic I'll return to later.
The great thing about getting the transformation of your
TextureView correct is that it allows you to use some unorthodox view finder sizes that are better suited to the use case of your application. For instance, if the user is supposed to scan a single line of text, it might be useful to provide a view finder with a size that makes it super clear what to focus on.
An informed reader might note that there are other ways to calculate the transformations required. If you need to fit the entire preview inside the
TextureView, you need a slightly different solution. The reason I chose this solution over the others is that it is easier to explain and reason about.
I hope this post will be helpful for those of you implementing your own view finder for the camera using a
TextureView (or rendering a video), where you have problems with the video getting squashed or stretched.
Many thanks to Sebastiano who helped me with proof-reading this post.