Memory shards — The magic of combining command-line programs

For a long time, I didn’t know how I should sort my image collection. My pictures directory was always this messy dimension where all my wallpapers, anime pictures, charts and memes were sent. Sure, everything was there but how could I find a particular image one week, month or year later? When I asked people how they sorted their own files, almost all of them said they were using directories. The others simply didn’t bother and were throwing all their stuff in their Pictures or Downloads directories.

Obviously, that didn’t solve my issue. I have thousands of pictures, and manually sorting them into directories or even subdirectories was a daunting task. An alternative to this hierarchical organization was to use tags. Popular on anime image boards called “boorus”, tags would be a perfect way to sort all my files. Fortunately, I found a CLI program to tag files: TMSU. You just had to run a command to add a tag to a file, which would be stored in a SQLite database, or to retrieve the pictures associated with the queried tags. It seemed to be the perfect tool, but I’d still have to manually tag each image and establish a naming convention…

I partially solved this issue by using an AI-based program to tag anime pictures: DeepDanbooru. Basically, it takes an image as input and outputs tags based on a Danbooru dataset. So, I followed the installation instructions and quickly wrote a script.

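#!/bin/bash

# Disable globbing so that tags such as "?" are not expanded against file names.
set -f

# Read the untagged files line by line so names containing spaces stay intact.
tmsu untagged "$1" | while IFS= read -r file
do
    # Keep only the tag column of the DeepDanbooru output and join the tags
    # with spaces. stdin is redirected so python cannot eat the file list.
    tags="$(python -m deepdanbooru evaluate "$file" --threshold 0.95 --project-path \
        deepdanbooru-v3-20211112-sgd-e28 < /dev/null | awk -F ') ' '{print $2}' - \
        | sed '/^[[:space:]]*$/d' | tr '\n' ' ' | head -c -1)"
    echo "$file"
    echo "$tags"
    # $tags is left unquoted on purpose: each word is one tag.
    for tag in $tags
    do
        tmsu tag "$file" "$tag"
    done
done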

It looped through every untagged file in a directory, stored the output tags and ran a TMSU command to add each tag to the corresponding file.
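The parsing step in the middle of that pipeline is easier to follow on its own. The sketch below assumes DeepDanbooru prints a header line followed by one "(score) tag" line per tag, which is what the awk call in the script relies on. extract_tags is a hypothetical helper name, not part of either tool.

```shell
#!/bin/bash

# extract_tags: read DeepDanbooru-style output on stdin, print one tag per line.
# Assumed input format: a header line, then lines such as "(0.998) 1girl".
extract_tags() {
    # Strip the leading "(score) " and print only the lines where it matched,
    # so the header and blank lines are dropped.
    sed -n 's/^([0-9.]*) //p'
}

# Canned input instead of a real DeepDanbooru run.
printf 'Tags of sample.png:\n(0.998) 1girl\n(0.971) cup\n\n' | extract_tags
```

Keeping one tag per line, rather than joining them with spaces, makes it possible to feed the result to a while read loop instead of relying on word splitting.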

I was really surprised by the accuracy of the results. It could even “recognize” anime characters. As a quick note, I ended up with over 3500 tags and 1000 tagged files according to tmsu tags | wc -l and tmsu files | wc -l. I could reduce the number of tags by using TMSU’s tag implications or merges, or by removing useless or badly formatted tags. In fact, tags like character_name_(series_name!) have to be put between quotes and some of their characters need to be escaped like this: “character_name_(series_name!)”. The naming convention wasn’t perfect but the tags provided by this dataset were fine.
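To get an idea of which tags need that extra care, the tag list can be filtered before any cleanup. This is only a rough sketch: odd_tags is a made-up helper, and the "plain" character set (lowercase letters, digits and underscores) is my own assumption about what is safe to type unquoted.

```shell
#!/bin/bash

# odd_tags: read tag names on stdin (one per line) and print those containing
# anything other than lowercase letters, digits or underscores.
odd_tags() {
    # LC_ALL=C keeps the [a-z] range limited to ASCII lowercase letters.
    LC_ALL=C grep -v -E '^[a-z0-9_]+$'
}

# Canned example; with a real database the input would be: tmsu tags | odd_tags
printf '%s\n' 'cup' 'character_name_(series_name!)' 'bottle' | odd_tags
```

The tags it prints are the candidates for a merge or a removal. The exact syntax of those subcommands is best checked with tmsu help.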

The last piece of the puzzle was the viewing part: how could I quickly view the images matching one or multiple tags? It was time for sxiv to shine! Sxiv is a small image viewer and has some pretty handy flags like -i and -o. They respectively read file names from the standard input and write the names of marked files to the standard output. They can be combined with other flags and pipes to craft the following command: tmsu files cup bottle | sxiv -abftio | xargs tmsu tags

It searches the TMSU database for files with both the “cup” and “bottle” tags. Then, thanks to the -abft flags, it opens the matching files in a fullscreen sxiv window in thumbnail mode, hides the info bar and plays animated images. In this gallery-like mode, images are shown side by side like in a file manager, and we can mark images by pressing m. When we quit sxiv with q, it prints the names of the marked files and the last part of the command takes over: xargs passes these file names as arguments to the tmsu tags command to show which tags are associated with these images. Sxiv can also be used to perform operations on images, as explained on the ArchWiki.
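One caveat: plain xargs splits its input on any whitespace, so a marked file with a space in its name would reach tmsu as two arguments. Here is a minimal sketch of a more forgiving variant, assuming GNU xargs (the -d and -r options are GNU extensions). each_line and browse are hypothetical helper names.

```shell
#!/bin/bash

# each_line CMD [ARGS...]: read names on stdin, one per line, and append them
# as arguments to CMD. -d '\n' splits on newlines only; -r skips CMD when the
# input is empty (for example when no image was marked).
each_line() {
    xargs -d '\n' -r "$@"
}

# browse TAG...: the pipeline from the article wrapped in a function.
browse() {
    tmsu files "$@" | sxiv -abftio | each_line tmsu tags
}

# Stand-in demo that needs neither tmsu nor sxiv: each name stays one argument.
printf 'a b.png\nc.png\n' | each_line printf '<%s>\n'
```

With this in a shell startup file, browse cup bottle would open the gallery and print the tags of whatever was marked.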

This sorting issue was an opportunity for me to learn more about shell pipes and standard input/output. It also demonstrates how easy and beautiful combining CLI programs can be.

See you again, have a nice day!