Why do we shoot too many images (good and bad reasons)?
Fear of missing the right moment, burst mode, the fact that two similar images cost nothing extra in a digital world: there are many reasons to shoot a lot of images per year. Some of them are good reasons, some are not, but it does not matter. At the end of the shoot, it is time to cull all these images.
Why is it a problem?
I have already discussed this in a previous blog post.
I have even estimated the carbon footprint of not culling, and explained why it is something to consider for the planet!
In summary, eradicating similar images is just one part of streamlining the pre-processing of a photo session.
How to fix it?
First, we can do it manually. For years I have been culling 90 to 95% of my images this way. Nowadays, thanks to AI, it can be done faster and even better. AI culling software has limits but also tangible capabilities, and it has been available for a few years.
Let’s have a closer look at how to eradicate similar images specifically.
Why can it be so difficult?
NB: we are not talking about duplicates created by the user on the computer (different thumbnails, copies of the JPEG and so on). We are talking about very similar to quasi-identical images created with the camera during the photoshoot.
First, there are two main kinds of “duplicates” or “similar images” created during the photoshoot:
- Similar images created in a burst mode,
- Similar images of the same topic.
Similar images from burst mode will be discussed in another blog post: that is about choosing the best one(s) from 10, 20, or even dozens of images shot within a few seconds.
Duplicate by kind: similar images of the same topic means “the same topic shot at different times or from different angles”. This is also synonymous with “I don’t want to miss anything, so I try many things before thinking about what I need”. Again, it does not matter whether the duplication happens for good or bad reasons.
How AI can help in the eradication of similar images
To cull these efficiently, images must be grouped by similarity, whatever the creation time of the image or any other metadata. It is a question of similarity (in burst mode, timing is paramount, not similarity).
A lot of work has been done over the past several years on estimators capable of detecting two similar images.
It is easy to define a threshold for the level of similarity, but I don’t think it is a relevant approach. At the end of the day, either two images are close enough to be considered the same topic or they are not. This looks more like an absolute value. Of course, some thresholding might be necessary in some cases, but the idea then is to define one threshold for the use case and to stick to it, not one threshold per photoshoot.
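To make the idea of grouping by similarity with a single, fixed threshold more concrete, here is a minimal Python sketch. It is not how Futura Photo works internally: the feature vectors, the cosine similarity measure, the value 0.95 and the names `SIMILARITY_THRESHOLD` and `group_similar` are all assumptions chosen for illustration. The point is only that capture time plays no role and that the threshold is set once for the use case.

```python
from itertools import combinations
from math import sqrt

# Assumed value: chosen once for the use case, not tuned per photoshoot.
SIMILARITY_THRESHOLD = 0.95


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_similar(features, threshold=SIMILARITY_THRESHOLD):
    """Group image names whose vectors are similar, ignoring capture time.

    `features` maps an image name to a (hypothetical) feature vector.
    Images are linked when their similarity reaches the threshold, and
    linked images end up in the same group (simple union-find).
    """
    parent = {name: name for name in features}

    def find(name):
        # Follow parent links up to the group representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(features, 2):
        if cosine_similarity(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in features:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Made-up vectors: IMG_001 and IMG_057 show the same topic, shot far apart in time.
features = {
    "IMG_001": [0.9, 0.1, 0.0],
    "IMG_002": [0.1, 0.9, 0.1],
    "IMG_057": [0.88, 0.12, 0.01],
    "IMG_058": [0.0, 0.1, 0.95],
}
groups = group_similar(features)
print(groups)
```

With these made-up vectors, IMG_001 and IMG_057 land in the same group even though they are far apart in the sequence, while the two other images each stay alone.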
That being said, the other difficulty comes from our ability to choose between several images. We are limited by several parameters: thumbnails should be big enough, and at the same time we should be able to see all thumbnails at once. I have discussed this already and defined what I call “the rule of 8”.
This means that when you have more than 8 similar images, you will inevitably have “false negatives” that should be culled too. But you can cull up to 7/8 of the images (typically about 85%) very efficiently.
To cull more than 85%, you need to cull the result of the first analysis done with this rule of 8. Hence the need for an analysis of the analysis… In Futura Photo, this is a new rule to be released with the next version, 2.17. This means you can cull up to 63/64 (98%) of the images very quickly. By the way, as the assets (thumbnails, estimators, …) have already been calculated for the first analysis, the second one is much faster.
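The arithmetic behind the 7/8 and 63/64 figures can be checked with a few lines of Python. This is only a best-case sketch under one assumption: that each pass keeps exactly one image from every batch of up to 8 similar images. The names `GROUP_SIZE`, `survivors_after_pass` and `cull_rate` are hypothetical and are not part of Futura Photo.

```python
from math import ceil

# The "rule of 8": at most 8 thumbnails are compared at once.
GROUP_SIZE = 8


def survivors_after_pass(count, group_size=GROUP_SIZE):
    """Best case: one image is kept from each batch of up to `group_size`."""
    return ceil(count / group_size)


def cull_rate(total, passes, group_size=GROUP_SIZE):
    """Fraction of images culled after the given number of passes."""
    remaining = total
    for _ in range(passes):
        remaining = survivors_after_pass(remaining, group_size)
    return 1 - remaining / total


one_pass = cull_rate(64, passes=1)    # 8 images kept out of 64
two_passes = cull_rate(64, passes=2)  # 1 image kept out of 64

print(f"{one_pass:.1%}")    # 87.5%
print(f"{two_passes:.1%}")  # 98.4%
```

In practice the rate is a little lower than this best case, since batches are not always full and sometimes more than one image is worth keeping.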
If you don’t know how this can work, I suggest having a look at this short video:
When version 2.17, which will embed this second analysis feature, becomes available, a video will be released to explain in more detail how the new rule works and what value it can bring to the photographer.
It is now very much possible to automate some steps after the shoot, and software like Futura Photo can achieve a culling rate of duplicates of up to 80-85% with the existing version, and up to 98% with the upcoming one.
So the time spent adding this (mostly) automated step is wisely invested, as it allows the photographer to post-process only the images needed and avoids archiving thousands of useless JPEG or RAW files.