
Wanted: advice on fast bitmap scaling

Hello Haxe community,

I have to bid a project in a couple days and I am trying to figure out an approach and how much effort I will have to spend on image processing performance. I’m not sure how to target the GPU or even if I should.

I have tried to figure out where Haxe and OpenFL or additional libraries use hardware acceleration, but it is taking me too long. I’ve found a few image libraries, but again haven’t figured out if they use the GPU.

Thanks in advance for advice and pointing me towards articles/documentation.

The shortish version:
What is a good approach using Haxe and probably OpenFl to have very fast bitmap scaling for viewing and zooming in and out on individual very large (2GB) HDR images? This is for a kiosk application and I can specify the GPU, CPU, SSD, and amount of RAM as well as the monitor resolution—probably 4K maybe 8K.

Should I target the GPU or a CPU with many cores/threads (initiating multiple transform threads at once, knowing that I will likely need one or more of them to zoom smoothly)? Would a game engine be overkill? That's okay if it is the fastest development path.

I’m quite new to Haxe and I haven’t yet figured out when the GPU is or isn’t targeted with Haxe and OpenFl. So far I have never directly used OpenGL or DirectX, so I am guessing that bindings such as OpenFl’s OpenGLRenderer might involve a greater learning curve than I have time for.

The touchscreen interface will need to be as minimal as possible, so I don’t need a lot of controls/components—at least not visible components.

The longer version—in case someone is curious and has plenty of excess time:
I need to figure out an approach for a potential touch screen kiosk application. I have done several kiosk projects using AS3/AIR to generate 64-bit Windows captive runtime applications. As a potential replacement for AS3/AIR/FlashDevelop, I started dabbling with Haxe, HaxeDevelop, and OpenFl back in March and then dropped it as other things took priority. Before dropping it I converted a small piece of an AS3/AIR kiosk application to Haxe/OpenFL and after much frustration was able to create a working EXE that utilized most of the common elements I tend use in kiosk applications.

The key aspect of this kiosk application will be to view large gorgeous HDR images on a 4K monitor (potentially 8K monitor down the road). A sample image I have is close to 2GB in size and has a pixel resolution of 12,752 x 19,136 (monitor will be in portrait orientation).

The kiosk needs to allow visitors to zoom from a scaled version of the entire image, down to having one image pixel to one monitor pixel, and any level in between. Visitors will also need to be able to pan at any resolution. This needs to be done as elegantly as possible.

I have done something similar with aerial images of a city using AIR. I started with a few hundred large GIS aerial images shot at 0.5 ft per pixel and created around 20,000 images (2,000 x 2,000 pixels each) at 9 different levels of resolution (ft per pixel that is, ending with 128 feet per pixel), for zooming in and panning around the city. However, that zoom isn’t particularly elegant. From whatever resolution (feet per pixel) the viewer is on, I merely scale the current view of the bitmap(s), not the bitmap data, according to the amount of zoom, and then cut to bitmaps most closely matching the final resolution (ft per pixel) of the zoom. Zooming from the entire city view, down to street level view gives you a very ugly scaled up image while you are zooming in (spreading two fingers) until you let go and the new bitmaps appear. Zooming out leaves you with a lot of blank space as the area you are zooming out of gets smaller and smaller. Pulling images in for continual panning was plenty fast.
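The level-switching step described above can be sketched as a small helper. This is a hypothetical Haxe fragment, not code from the project; the names and the doubling scheme are assumptions based on the 0.5-to-128 ft-per-pixel levels mentioned:

```haxe
class PyramidLevel {
    // baseFtPerPixel: resolution of the sharpest level (0.5 in the project above)
    // zoom: current on-screen scale factor applied to that sharpest level
    // Levels are assumed to double in ft-per-pixel: 0.5, 1, 2, ... 128 (maxLevel = 8)
    public static function pick(baseFtPerPixel:Float, zoom:Float, maxLevel:Int):Int {
        // At zoom 0.5 each screen pixel covers twice the ground distance,
        // so the best-matching level is base/zoom ft per pixel.
        var targetFtPerPixel = baseFtPerPixel / zoom;
        var level = Math.round(Math.log(targetFtPerPixel / baseFtPerPixel) / Math.log(2));
        if (level < 0) level = 0;
        if (level > maxLevel) level = maxLevel;
        return level;
    }
}
```

Cutting to the tiles from `pick(...)` only after the gesture ends is what causes the ugly scale-up; swapping levels whenever `pick` changes mid-gesture is the usual fix.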

Note that in that project I did compress the aerial images, finding a tradeoff between speed of execution and image quality. This time I want to maximize image quality.

Also, this time I’m trying to avoid the pre-production of generating images at varying resolutions.

I can specify the computer hardware for the kiosk. It will be a Windows machine in part because I don’t have time to learn something else.

Repeating from above: I have never worked directly with DirectX or OpenGL and at this point I am hoping not to have to. I have tried to figure out where Haxe and OpenFL or additional libraries use hardware acceleration, but it is taking me too long. I’ve found a few image libraries, but again haven’t figured out if they use the GPU. Also, as scaling will likely be my only transformation, I’m not sure if the libraries provide any benefit.

A few things I’ve glanced at along the way:

  • Core Architecture
  • bitmapData.draw() (this would be the multiple thread approach)
  • Haxe “image” library with GraphicsMagick installed
  • Haxe “bitmap” library
  • Haxe “magic” library with ImageMagick installed
  • Haxe “heaps.io” game engine
  • Feathers UI
  • Starling
  • Haxeui.org

Closing:
Again, thanks in advance for advice and pointing me towards articles and documentation.

I can’t answer about which Haxe library to use, but I can tell you that you do need GPU acceleration for this task, so you must pick a library that provides it. Even with the fastest CPU you’d never get a smooth framerate at those resolutions if you tried to render in software mode.

I think the dimensions of your image (even if it is HDR) will allow you to have it completely uploaded into video memory if you get a GPU with enough RAM. If that’s the case, your job shouldn’t be too hard.
If it doesn’t fully fit in VRAM you’ll have to create multiple versions of different quality and only load the high quality versions of the image when you zoom in. That can get quite complicated.
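For what it’s worth, the numbers work out: at 16-bit-float RGBA (8 bytes per pixel, an assumption about the final texture format), the 12,752 × 19,136 image needs about 1.95 GB of VRAM, plus roughly a third more if mipmaps are generated for clean minification. A quick Haxe back-of-envelope:

```haxe
class VramEstimate {
    // bytesPerPixel: 8 for RGBA16F (a typical HDR format), 4 for truecolor RGBA8
    public static function bytesFor(w:Int, h:Int, bytesPerPixel:Int, withMips:Bool):Float {
        var base:Float = 1.0 * w * h * bytesPerPixel;
        // a full mipmap chain adds roughly one third on top of the base level
        return withMips ? base * 4.0 / 3.0 : base;
    }
}
// 12752 x 19136 at 8 bytes/pixel ≈ 1.95 GB base, ≈ 2.6 GB with mipmaps,
// so a card in the 10-12 GB class leaves comfortable headroom.
```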

Since your image is HDR, do you intend to use an HDR display? If not, then you can convert your image to truecolor instead to save VRAM. If you do intend to use an HDR display, you will need to make sure your engine supports HDR rendering.


@basro Thank you for the reply.

I intend to have an HDR display but have not yet started looking at what I need to do in software to support HDR images. I appreciate your note on this.

I did find some information on GPU usage under OpenFl–not sure how I missed it previously.

Forum info on OpenFl and GPU

It appears as if OpenFl uses the GPU most of the time for my target, but I still have some digging to do, because the bitmapData.draw() method apparently does not use the GPU and I need to figure out how to best utilize the GPU.
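For anyone landing here later, the distinction plays out roughly like this in OpenFL: scaling a `Bitmap` through its display-object transform is handled by the hardware renderer, while `bitmapData.draw()` re-rasterizes pixels on the CPU. A hedged sketch (the class and method names beyond the OpenFL API itself are illustrative):

```haxe
import openfl.display.Bitmap;
import openfl.display.BitmapData;
import openfl.display.Sprite;

class ZoomView extends Sprite {
    var bitmap:Bitmap;

    public function new(data:BitmapData) {
        super();
        // The third constructor argument enables smoothing when scaled;
        // the renderer samples the uploaded texture on the GPU.
        bitmap = new Bitmap(data, null, true);
        addChild(bitmap);
    }

    // Zoom about an image-space point by adjusting the transform only;
    // no pixels are copied, unlike drawing into a new BitmapData buffer.
    public function setZoom(scale:Float, centerX:Float, centerY:Float):Void {
        bitmap.scaleX = bitmap.scaleY = scale;
        bitmap.x = -centerX * scale;
        bitmap.y = -centerY * scale;
    }
}
```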

OpenFl seems to be a good direction to pursue, which is nice because I am quite familiar with AS3 and have tried out OpenFl. Also, there are libraries that sit on top of OpenFl that might be of use.

I hadn’t thought about maybe having to have multiple resolution images even with the use of the GPU. I will make sure we have plenty of VRAM. I already have a system for creating and utilizing such images, but for long term maintenance simpler is better.

Regards

Hi,

Loading 2 GB of image data (in an already uncompressed format?) seems like overkill. The tile-pyramid approach you seem to have already used, and which is commonly used for such applications (web cartography, ultra-high-res imagery), is probably much simpler, more scalable, and more performant, for minimal overhead. The tiles could be generated and cached by an existing script or via Haxe.

See for example OpenSeadragon, an older library that allows smooth scrolling of high-res images on common hardware. It supports pinch-zooming, so if running a browser in kiosk mode is compatible with your other needs, it might be a simple solution.

If a native binary or more control is needed, a prototype in OpenFL and/or Kha (a lower-level graphics library) to test smooth zooming would be informative and maybe not too long to produce. I guess both make use of the GPU as much as possible nowadays. You could take care of preloading and displaying the lower-resolution tile while zooming to prevent scale-up artifacts, and perfect the display once the bid is accepted.
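The tile-preloading idea boils down to knowing which tiles intersect the viewport at the level you are zooming toward. A hypothetical Haxe helper (the 2,000-pixel tile size echoes the GIS project mentioned earlier; all names are illustrative):

```haxe
class TileMath {
    // viewX/viewY: top-left of the viewport in pixel coordinates of the target level
    // Returns [firstCol, lastCol, firstRow, lastRow] of the tiles to preload
    public static function visibleTiles(viewX:Float, viewY:Float,
                                        viewW:Float, viewH:Float,
                                        tileSize:Int):Array<Int> {
        var firstCol = Math.floor(viewX / tileSize);
        var lastCol  = Math.floor((viewX + viewW - 1) / tileSize);
        var firstRow = Math.floor(viewY / tileSize);
        var lastRow  = Math.floor((viewY + viewH - 1) / tileSize);
        return [firstCol, lastCol, firstRow, lastRow];
    }
}
```

Requesting that range for the destination level at the start of the gesture, instead of at the end, is what hides the resolution switch.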

@tokiop Thank you for the good input. Sorry for my slow reply; I didn’t have email notification set.

I like OpenSeadragon and have seen sites that use it–without knowing what was underneath. I didn’t even realize that there is such a thing as a zooming image format–much less that there are many of them. The photographer is starting a website and I will pass this on to him.

Viewing these I realized that a pre-determined zoom is more aesthetically pleasing than zooming to match finger gestures. Hmmm, I tested a button interface for zooming on my GIS project, but people are very accustomed to finger pinching and I had to go that route.

I had noticed Kha, but didn’t look very closely at it. I will take another peek.

We are targeting a museum quality experience, or at least being able to get there in two versions. So we are entertaining overkill for now–but we have not been assigned a budget yet.

I started looking for a single image solution after zooming in and out on a huge .ARQ file under Affinity Photo on a computer with an RTX3080 graphics card (not at my desk unfortunately). It is pretty smooth. It does show a little temporary blocking when you zoom fast, but it is not bad. I figured there has to be a way for me to do this.

For that GIS project, to create the thousands of images at multiple resolution levels, I wrote code that spit out data that was then used as input to JavaScript routines I wrote for Photoshop batch processing. That production would be much less complex for this project, but the customer wouldn’t be able to just drop in additional images on their own in the future if we use multiple resolutions (unless, of course, the application has a mode that creates them when the customer drops them in).

What I didn’t do in that project, was switch imagery (resolution) while zooming. That would have improved the aesthetic appeal.

I might try a test with OpenFl tiles, and see where that gets me.

Thanks again.

Hi. I think this could be solved quite easily using Kha. It is quite simple to use and is more low level with less overhead than OpenFL and gives you excellent access to handling drawing bitmaps, scaling etc on the GPU - at native speeds.
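To make that concrete, a minimal Kha render loop might look roughly like the following. This is a sketch based on recent Kha versions (for example, `System.notifyOnFrames` replaced the older `notifyOnRender` callback); check the official samples for the API of your checkout, and note the asset name is made up:

```haxe
import kha.Assets;
import kha.Framebuffer;
import kha.Image;
import kha.System;

class Main {
    static var image:Image;
    static var zoom:Float = 1.0;

    public static function main():Void {
        // Portrait 4K window, per the kiosk described above
        System.start({title: "HDR Zoom", width: 2160, height: 3840}, function(_) {
            // "photo" is a placeholder asset name declared in khafile.js
            Assets.loadImage("photo", function(img:Image) {
                image = img;
                System.notifyOnFrames(render);
            });
        });
    }

    static function render(frames:Array<Framebuffer>):Void {
        var g = frames[0].g2;
        g.begin();
        if (image != null) {
            // drawScaledImage samples the texture on the GPU;
            // no CPU-side resampling happens at any zoom level
            g.drawScaledImage(image, 0, 0, image.width * zoom, image.height * zoom);
        }
        g.end();
    }
}
```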

@d0oo0p Thank you for the insight on Kha. I had looked at Kha briefly, but haven’t done any of that type of low level graphics programming. The Kha API documentation (top level) has very little description of how to use the library. Many classes merely have their methods listed, with no explanation.

Is there a good documentation resource somewhere for a noob in low level graphics programming getting started with Kha? Something with an overview?

Here are some nice starting points:

And Lewis Lepton’s Kha tutorial series is perfect for starters:


@d0oo0p Those are good starting points. I haven’t installed Kha yet, but I’ve watched a handful of Lewis Lepton’s tutorial videos. I like the pace and level of detail. The samples link appears to have some samples that will be great for me (e.g., HDR).

Thanks!

This reminded me of a product called Zoomify. I used it a bit a long time ago, but I don’t know what it would take to integrate it with your project.

@Confidant I checked out Zoomify. Zoomify uses multiple images (tens to tens of thousands of them). It has the same issues I’m trying to avoid that my own GIS implementation has, but Zoomify does handle them more elegantly. I could at least implement a bit of the nicer technique if I end up using a multiple-image method (e.g., when zoomed out a long way, they use a static blurred version of the image as the background rather than a “blank” background as I had done).

I see Zoomify has “unconverted image viewing” in a couple of paid versions and drag and drop conversion too.

If we land the project, I’m looking forward to trying to use image processing on the fly with OpenFl and/or Kha, but Zoomify might be a good backup plan.

Thanks!

I might show some ignorance here, so bear with me as I ramble, since I find this interesting. Doing something “on the fly” will require maximum performance; I would think you would need a lot of RAM and would benefit greatly from a GPU. I am wondering if it’s better to use something that supports caching, i.e., a server solution (but you’d run it locally), which would save you some electricity in the long run. If you compiled Haxe to Python you could perhaps integrate a library like Dirpy. There are also the old standards ImageMagick and GD, with whatever server has a cache.

Be sure to report back when you arrive at a solution!

@Confidant I appreciate the additional feedback–good stuff to consider.

The beauty of the situation is that this is a stand-alone kiosk project and I get to specify the computer hardware. I am thinking roughly the equivalent of: RTX 3080, i9 processor, as much RAM as I need, and an M.2 SSD.

Its first deployment would be at an international event in New York, the second at a conference at a museum. We are hoping the museum will be interested in a permanent version, so we are putting budget towards elegant performance.

As it is a single kiosk, the hardware expense isn’t as large a factor as it would be for a multiple kiosk project. Of course we might get the project with less budget than asked for, and then I go back to the drawing board (and Zoomify is open source).
