For a long time, I didn’t know how I should sort my image collection. My pictures directory was always this messy dimension where all my wallpapers, anime pictures, charts and memes were sent. Sure, everything was there but how could I find a particular image one week, month or year later? When I asked people how they sorted their own files, almost all of them said they were using directories. The others simply didn’t bother and were throwing all their stuff in their Pictures or Downloads directories.
Obviously, it didn’t solve my issue. I had thousands of pictures to sort, and manually sorting them into directories or even subdirectories was a daunting task. An alternative to this hierarchical organization was to use tags. Popular on anime imageboards called “boorus”, tagging looked like a perfect way to sort all my files. Fortunately, I found a CLI program to tag files: TMSU. You just had to run one command to add a tag to a file (the association is stored in a SQLite database) and another to retrieve the pictures associated with the queried tags. It seemed to be the perfect tool, but I’d still have to manually tag each image and establish a naming convention…
I partially solved this issue by using an AI-based program to tag anime pictures: DeepDanbooru. Basically, it takes an image as input and outputs tags based on a Danbooru dataset. So, I followed the installation instructions and quickly wrote a script.
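#!/bin/bash
# Usage: pass the directory to scan as the first argument.
# Read file names line by line so that names containing spaces stay intact.
tmsu untagged "$1" | while IFS= read -r file
do
    # Keep the tags scoring at least 0.95, strip the "(score) " prefix and
    # the blank lines, then join the remaining tags with spaces.
    tags="$(python -m deepdanbooru evaluate "$file" --threshold 0.95 --project-path \
        deepdanbooru-v3-20211112-sgd-e28 | awk -F ') ' '{print $2}' - \
        | sed '/^[[:space:]]*$/d' | tr '\n' ' ' | head -c -1)"
    echo "$file"
    echo "$tags"
    # $tags is left unquoted on purpose: word splitting yields one tag per loop.
    for tag in $tags
    do
        tmsu tag "$file" "$tag"
    done
done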
It loops through every untagged file in a directory, stores the output tags and runs a TMSU command to add each tag to the corresponding file.
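To make the text-processing part of the script easier to follow, here is a minimal sketch that runs the same kind of filter chain on hard-coded lines instead of real DeepDanbooru output. It assumes the program prints one "(score) tag" pair per line, which is what the awk separator in the script implies. The sample lines, the scores and the extract_tags name are made up for the example. The separator is written as a bracket expression, which I'd expect to match the same text while avoiding an unmatched parenthesis in the regex.

```shell
#!/bin/bash
# Hypothetical helper wrapping the filter chain from the script above.
extract_tags() {
    # Keep what follows ") ", drop blank lines, join with spaces and
    # cut the trailing space (a negative count for head is a GNU extension).
    awk -F '[)] ' '{print $2}' - \
        | sed '/^[[:space:]]*$/d' | tr '\n' ' ' | head -c -1
}

# Made-up sample in the assumed "(score) tag" format.
# The header line and the empty line contain no ") " and get dropped.
sample='Tags of image.png:
(0.998) 1girl
(0.971) cup

(0.953) bottle'

tags="$(printf '%s\n' "$sample" | extract_tags)"
echo "$tags"
```

The result is a single space-separated line, which is why the script can simply iterate over $tags with a for loop.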
I was really surprised by the accuracy of the results. It could even “recognize”
anime characters. On a quick note, I ended up with over 3500 tags and 1000
tagged files according to tmsu tags | wc -l and tmsu files | wc -l. I could
decrease the number of tags by using TMSU’s tag implication or merge features
and by removing useless or badly formatted tags.
For instance, tags like character_name_(series_name!) have to be put between
quotes, and some of their characters need to be escaped, like this:
“character_name_(series_name!)”. The naming convention wasn’t perfect, but
the tags provided by this dataset were fine.
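On the shell side, the quoting problem can be sketched as follows. This only covers what bash itself does with such a name: unquoted parentheses are shell syntax, and in an interactive session an exclamation mark can trigger history expansion. Whatever extra escaping TMSU's own query syntax may want is not covered here. The tag name is the placeholder from the text, and the variable names are mine.

```shell
#!/bin/bash
# Placeholder tag name in the booru "character_(series)" style.
tag='character_name_(series_name!)'

# Single quotes keep the parentheses and the exclamation mark literal.
printf '%s\n' "$tag"

# bash's printf %q prints an escaped form that can be pasted back into a shell.
quoted="$(printf '%q' "$tag")"
printf '%s\n' "$quoted"

# Reading the escaped form back gives the original name again.
eval "roundtrip=$quoted"
printf '%s\n' "$roundtrip"
```

Single quotes are usually the least surprising choice when typing such a tag by hand, since nothing inside them is interpreted by the shell.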
The last piece of the puzzle was the viewing part: how could I quickly
view the images matching one or multiple tags? It was time for
sxiv to shine!
Sxiv is a small image viewer with some pretty handy flags like -i and -o. They
respectively read file names from the standard input and write marked files to
the standard output. They can be combined with other flags and pipes to craft
the following command:
tmsu files cup bottle | sxiv -abftio | xargs tmsu tags
It searches the TMSU database for files with both the “cup” and “bottle” tags.
Then, it opens the matching files in a fullscreen sxiv window in thumbnail mode,
hides the infobar and plays animated images thanks to the -abft flags. In this
gallery-like mode, images are shown side by side like in a file manager. It
allows us to mark images by pressing m. When we quit sxiv with q, it writes the
names of the marked images to its standard output. xargs takes these file names
and passes them as arguments to the tmsu tags command to show which tags are
associated with these images. Sxiv can also be used to perform operations on
images, as explained on the ArchWiki.
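One caveat with the last stage: plain xargs splits its input on any whitespace, so a file name containing a space arrives as two arguments. The sketch below shows the difference with stand-ins only. marked_files imitates the one-name-per-line output described above, printf takes the place of tmsu tags, and both file names are made up.

```shell
#!/bin/bash
# Stand-in for the marked files that sxiv -o would print, one name per line.
# Both file names are made up for this example.
marked_files() {
    printf '%s\n' 'wallpapers/blue sky.png' 'memes/cat.jpg'
}

# Plain xargs splits on every blank, so the first name is cut in two.
# printf stands in for "tmsu tags" and prints each argument in brackets.
marked_files | xargs printf '[%s]\n'

# Turning newlines into NUL bytes and using xargs -0 keeps one argument per line.
marked_files | tr '\n' '\0' | xargs -0 printf '[%s]\n'
```

Applied to the command above, the same idea would give tmsu files cup bottle | sxiv -abftio | tr '\n' '\0' | xargs -0 tmsu tags (untested here, and only needed if some file names contain spaces or quotes).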
This sorting issue was an opportunity for me to learn more about shell pipes and standard input/output. It also demonstrates how easy and beautiful combining CLI programs can be.
See you again, have a nice day!