Photobash plugin development

Grum999 · October 25, 2021, 11:41am

QThread is not so difficult to use for parallelization once you’ve understood how it works

I’ve used it for BuliCommander:

Analysing ~10000 files (~80GiB) took a couple of seconds (the analysis reads each file’s size and image dimensions, computes a hash, …)
Generating thumbnails (512, 256, 128 and 64px thumbnails for each file) takes a bit longer: between ~50s and ~300s, depending on computer activity, to generate 40000 thumbnails from 80GiB of images (tested from an SSD, running with 24 threads)
Using threads keeps the UI responsive while the computer is doing intensive computation
Thumbnails are also loaded asynchronously in the treeview to keep the UI responsive
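
To give a rough idea of the kind of parallel analysis described above, here is a minimal sketch. It is not the BuliCommander code: it swaps QThread for Python’s standard `concurrent.futures` so it runs outside Krita, it only covers file size and hash (not image dimensions), and `analyse_file` / `analyse_files` are hypothetical names.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def analyse_file(path):
    """Return (path, size in bytes, SHA-1 hex digest) for one file."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so a large image is never fully held in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return path, os.path.getsize(path), digest.hexdigest()


def analyse_files(paths, workers=None):
    """Analyse files with a pool of worker threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyse_file, paths))
```

In a plugin, the call to `analyse_files` would itself have to run off the UI thread (for example in a QThread) for the interface to stay responsive.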

But performance is not only about multithreading: you also have to generate a cache and use it.

Once the cache is generated, it takes around 1.2s to load the thumbnails of 10000 files in a treeview

Example:

If you’re interested, you can take a look at the WorkerPool class I’ve written to simplify my use of parallel jobs

Grum999

slightlyangrydodo · October 25, 2021, 11:44am

Very interesting! I’ll look into it later. Right now, in terms of priority for development, I’m going back to the Krita Redesign, as I always prefer to drive something to “near-completion”, and then change context. In truth, it’s me who has problems parallelizing tasks

However, I’m saving this information for later!

Grum999 · October 25, 2021, 11:51am

I understand, I also have this problem

Anyway, the day you start to work on this part of your plugin, don’t hesitate to ask for help with the provided class if you need it

Grum999

slightlyangrydodo · October 25, 2021, 11:53am

Will do! This would improve the usability of the plugin quite a bit in scenarios with large folders, but I want to produce a larger release, and I already have two other killer features that I’m going to try and implement, namely quick auto-masking to generate a selection, and fetching images from CC0 sources. I don’t know how I’m going to do either, but they’re great additions for a v3.

EyeOdin · October 26, 2021, 5:15pm

I did a quick test yesterday, and I don’t think caching so many images is a very good idea, even with QThreads: it will eat all the RAM regardless, no? Krita still needs memory to operate while you use Photobash.

On my computer, caching 700 images is stretching it: keeping them in memory stalls Krita’s other operations. So supporting folders of 60k seems impossible unless you filter the search and cache afterwards, and even then it would lag. I don’t even have folders that size to test with; the most populated folders I have hold about 1k images, and Photobash goes through them a bit slower, but it goes. The ideal for me was 200 images or less for no effort.

I can make it faster, but it will lose features or look more pixelated.

Grum999 · October 26, 2021, 5:57pm

I don’t know what kind of test you ran, or how the cache was implemented…

On my side, results are clear (tests made on 10000 images ~80GB)

Without cache: loading and creating thumbnails takes between 50s and 300s
With cache: loading takes 1.2s

I can’t talk about your tests.
In my case, during processing, once an image has been processed its data stays in memory until the GC cleans it up; you can force a GC run if you want, but that’s useless: just let Python do it when needed.

So for memory, I don’t really care.
You never have the 80GB of images loaded in memory; the only thing that takes up memory is the QTreeView filled with 10000 items. Its size in memory depends on the thumbnail size (from what I see: 64x64px thumbnails took ~300MB of memory, 512x512px took ~6GB of RAM)

Also, take into consideration that displaying 10000 images in a single view might be useless: nobody can look through the result (or it might take hours of scrolling to find one picture in such a view), so you have to filter it.

It will depend on your computer (mine is 24 CPU / 60GB RAM; on my laptop with 4 CPU / 16GB it’s slower :)) and on how you’ve implemented things; also, Python is not the best language for this (a compiled language will give better results in memory usage and speed)

On my side, even with all CPUs working to resize images and save the results in the cache:

Krita’s UI was still responsive (but OK, not possible “to draw”)
Other software was running properly (browsing or screen recording without any problem)

Also, I currently let my plugin use all available threads, but it’s possible to tune this and tell it to use only 75%, or 50% or less, of the threads (meaning you leave resources available for other tasks on the computer)
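
Limiting the share of threads can be sketched as below. This is only an illustration, not the WorkerPool class itself; `worker_count` is a hypothetical helper, and rounding down with a minimum of one worker is an assumption.

```python
import os


def worker_count(fraction=0.75, total=None):
    """Number of worker threads for a given share of the available CPU threads."""
    if total is None:
        total = os.cpu_count() or 1  # os.cpu_count() may return None
    # Round down, but always keep at least one worker.
    return max(1, int(total * fraction))
```

On a 24-thread machine, `worker_count(0.75, 24)` gives 18 workers and `worker_count(0.5, 24)` gives 12. With Qt, a value like this could be passed to `QThreadPool.setMaxThreadCount()`.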

As for a plugin like Photobash, do as you prefer; it’s not mine.
I provided a possible solution, use it or not; on my side I’m comfortable with the results I get for my own case

Grum999

EyeOdin · October 26, 2021, 6:36pm

I don’t have 80 or 60GB of RAM to spare though
Photobash is also not mine, I am just trying to help.

TheTwo · October 26, 2021, 7:26pm

I think speed is not the problem at this time, because handling that many images this way is often not very practical.
When you have so many pictures, you often need professional software to manage them (Eagle, digiKam…). You need to label them to make them easy to find. And often there will be a separate display to place them on.
This is the opposite of the plug-in idea (a docker should not take up too much space). The plug-in is also unlikely to create a database to store thumbnails for the images in the folder

KnowZero · October 26, 2021, 8:41pm

Caching is not limited to RAM. You can cache as files too. That is how most file managers do it. Anyone who used older versions of Windows probably remembers randomly seeing hidden thumbs.db files in each folder. Those were the thumbnail cache to speed up loading (they have since been moved to a centralized folder).

By making cache files, you cut down the most demanding operation, which is resizing all the images, while also cutting down the second slowest operation, which is reading the files from disk. You may even see a reduction in RAM usage too.
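
A file-based thumbnail cache along these lines could look like the sketch below. Everything here is an assumption made for illustration: the names `cache_path` and `get_thumbnail`, the key built from path, modification time and file size, the `.png` extension, and the `render` callback standing in for the real (expensive) resize.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, source, size):
    """Cache file for `source` at thumbnail `size`, keyed on path, mtime and file size."""
    source = Path(source)
    stat = source.stat()
    key = f"{source.resolve()}|{stat.st_mtime_ns}|{stat.st_size}"
    name = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return Path(cache_dir) / str(size) / f"{name}.png"


def get_thumbnail(cache_dir, source, size, render):
    """Return cached thumbnail bytes; call `render(source, size)` only on a cache miss."""
    target = cache_path(cache_dir, source, size)
    if target.exists():
        return target.read_bytes()
    data = render(source, size)  # the expensive resize happens only here
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return data
```

Because the key includes the modification time and size, a modified file simply misses the cache and gets a fresh thumbnail; stale entries would still need a separate cleanup.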

EyeOdin · October 26, 2021, 10:48pm

I need to explore that too; making a cache file sounds like an interesting idea.
I will also look more into QThreads, since my previous implementation stopped working after 4.4.7.

Grum999 · October 27, 2021, 6:26pm

I’m hesitant to provide additional answers because the original topic about the Photobash plugin is drifting a little…
@raghukamath maybe all posts about improving the plugin with cache, QThread & parallelization could be split into a dedicated topic in the Plugins Development category?

I ran a test on Windows 10 in a VM with 4 CPU / 8GB RAM
I think that’s a “normal” configuration, enough to run Windows 10 properly and available to most users

No surprise, generating the cache with 4 CPUs instead of 24 is slower
Also, note that running in a VM is naturally slower (it’s not normal for it to be this slow on my computer, but it is normal in a VM)

But it works without any problem (test made on 14000 files, 91GB; average JPEG files are 7~9MB, with some up to 170MB)

At its maximum, I think Krita took 1.8GB of RAM during cache generation. That’s a peak I think is related to the converted images (some of the JPEG files in my test directory are 40MB, 80MB, 170MB, so loading them naturally takes more space in memory)

Once finished, cache on disk for thumbnails takes:

110MB for 64x64 pixel size
435MB for 128x128
1.7GB for 256x256
6.84GB for 512x512
(I generate multiple sizes; this allows optimizing loading according to the current thumbnail size in the QTreeView)
So it generated ~9GB of data in 56000 files from the initial 91GB in 14000 files
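
Choosing which cached size to load for a given display size could be done as in this sketch. The selection rule (smallest cached size that is at least as large as the requested one) is an assumption, not something stated above, and `pick_cached_size` is a hypothetical name.

```python
CACHED_SIZES = (64, 128, 256, 512)  # thumbnail sizes kept in the disk cache


def pick_cached_size(requested, sizes=CACHED_SIZES):
    """Smallest cached size that covers `requested`; the largest one if none does."""
    for size in sorted(sizes):
        if size >= requested:
            return size
    return max(sizes)
```

A view showing 100px thumbnails would then read the 128px files instead of the much larger 512px ones.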

Once generated, there’s no need to generate it again (except for files that have been modified, but regenerating the cache on the fly for one file is not noticeable, or if the cache has been entirely deleted)

You can see here a screen record during the process, with CPU/Memory/Disk usage:

The video has been cut (the original process ran for 5 minutes, and it wasn’t possible to generate a webm file small enough to be posted here)
You can see that during the cache generation, I’m able to create a new document and draw on it; don’t expect to be able to paint, it’s a bit laggy for sure, but the point is to show that the computer is still responsive during the process: the user interface, and more globally the computer, is not frozen (thanks to QThread for that)
You can also see, at the end, the memory used by Krita to display 14000 files in the treeview when I change from the small thumbnail size to a larger one:
– it took a couple of seconds to reload the thumbnails for 14000 files, and memory usage grows (because the thumbnails are bigger)
– after closing the window, you can see the memory usage decrease a few seconds later; it takes some seconds because of how Python manages memory (Python’s garbage collector cleans up unused items in memory when it determines it’s the right moment to do so)

Sorry for the video encoding quality; to encode 3’49 of video in a 2.5MB file, there’s no other choice

Grum999

EyeOdin · October 27, 2021, 11:03pm

@Grum999
Considering what you said, I made some changes and did some more tests today after I managed to get QThread working properly for this case. It was not quite what I was expecting it to become. I think this can still be improved.

All values are adjustable but:

thumbnails at 256px so it is not too pixelated
when thread is activated the loading bar appears. Activation is done by Ctrl+LMB click currently.
the thread loads 500 pages to the left and to the right of the current page, while cleaning up what is too far away
the Photobash UI locks while the thread is working so it does not call things that are not built yet.
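
The load-around-the-current-page idea from the list above can be sketched as a sliding window. This is a synchronous illustration only, not the plugin’s code: `window`, `update_cache`, the dict-based cache and the `load` callback are hypothetical, and in the plugin this work would run inside the thread.

```python
def window(current, total, radius=500):
    """Indices to keep loaded: `radius` items on each side of `current`, clamped to the folder."""
    start = max(0, current - radius)
    stop = min(total, current + radius + 1)
    return range(start, stop)


def update_cache(cache, current, total, load, radius=500):
    """Load missing items inside the window and drop those that are now too far away."""
    wanted = window(current, total, radius)
    for index in list(cache):
        if index not in wanted:
            del cache[index]  # clean up what is too far from the current page
    for index in wanted:
        if index not in cache:
            cache[index] = load(index)  # only build what is not cached yet
    return cache
```

Memory then depends on `radius` and the thumbnail size rather than on the size of the folder.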

I found a folder on my drive that has 12k images for concept art, and it is the one I am using in the example. The frame rate of the gif really does not do it justice. It does not load everything, only a short distance around the current page. I could make it load the whole folder, but I feel Krita would get too heavy with very big folders and crash for most users in that case. So a 1k image limit? I don’t know, it was a round number I chose and it did not behave badly. With smaller thumbnails you should be able to make this range a lot higher. The increase in memory this way was not as extreme as what I was getting before.

thread_pre_load

slightlyangrydodo · October 28, 2021, 6:28am

I actually limited the maximum number of images to 10k, because I didn’t think that the average user would have more than that!

hulmanen · October 28, 2021, 6:45am

Off on a tangent: for organizing, I like a canvas approach, like PureRef uses. Spatial memory is a beast, so if you know your nature photos are somewhere top right, and maple trees are at the center of the tree cluster, you can find things quite quickly even with a lot of material. I would just load all my reference material into PureRef if it could handle that amount of files… Unfortunately it doesn’t seem to do any LODs or caching as far as I can tell, it just loads everything into memory at full res.

So… Infinite canvas for the photobash plugin? Any chance?

EyeOdin · October 28, 2021, 7:38am

@hulmanen
I am not sure if that would be possible or how to make that.

slightlyangrydodo · October 28, 2021, 8:52pm

That is firmly in the “nope” category, sorry! The plugin is only meant to assist in the photobashing process, it can never be as good as PureRef, since that program does A LOT more than just storing references, and is quite excellent at it. I strongly believe that the simplicity of this plugin is one of its greatest features

EyeOdin · October 29, 2021, 8:00pm

When you make a change to an image, the page is not reset (unlike when the filter is activated by adding a keyword), so you can see the change and don’t have to scroll back to it again.

no_page_reset_on_edit

EyeOdin · October 31, 2021, 6:29pm

Expand, Random and Current Folder Display:
expand_random

Also, the context menu is disabled on buttons without images.

Also, I have been toying with a very odd idea. Since this works by searching for keywords in the filename, how about opening a dialog window where you can easily add keywords to a given file? You would write the possible keywords for a folder, activate a combination of them, and when you click an image in Photobash, its keywords are replaced by the ones you have set. Kinda like:

Keywords:
o tree
x rock
x clouds
o water
o mountain

Names:
“name.ext” > into > “rock-clouds_name.ext”

I think it is possible, but I am not sure it is worth the trouble of making something like this.
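
The renaming scheme from the example above (“name.ext” into “rock-clouds_name.ext”) could be sketched like this. The function name `tagged_name`, the `-` and `_` separators taken from the example, and the rule for recognising an existing keyword prefix are all assumptions; the sketch only computes the new name and does not rename anything on disk.

```python
from pathlib import Path


def tagged_name(filename, selected, known):
    """Build the new file name: selected keywords joined by '-', then '_', then the old name."""
    name = Path(filename).name
    prefix, sep, rest = name.partition("_")
    # Drop an existing prefix only if it is made up entirely of known keywords.
    if sep and all(part in known for part in prefix.split("-")):
        name = rest
    if not selected:
        return name
    return "-".join(selected) + "_" + name
```

With `known = {"tree", "rock", "clouds", "water", "mountain"}`, calling `tagged_name("name.ext", ["rock", "clouds"], known)` gives `rock-clouds_name.ext`, and calling it again on that result with `["water"]` gives `water_name.ext`.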

hulmanen · October 31, 2021, 6:58pm

I think tagging would be a great idea, but I wonder if there’s a better way than altering the filename. Exif tags come to mind?

EyeOdin · October 31, 2021, 7:59pm

It seems Exif was deprecated in Qt5 and you have metaData as a replacement, but regardless, that sort of image “text” for keys and comments is only valid for certain formats like BMP, JPG and PNG, I think.

Also, doing that would force an annoying rewrite of the filter that would make it slower, which I am not too keen on, because I like being able to search by format as much as by properties like “rock” or “mountain”. To me, putting this in the filename makes sense, as it catalogs images regardless of software.

I will keep an eye open for this, but I am unsure it is doable.
I need to write it to even see if it is worth it.

I have expanded Photobash and it now opens more formats, though it is still incomplete because of SVG, which I still cannot load correctly into Krita because it uses vector layers.