Organizing digital photos

From Electron Cloud

Revision as of 22:24, 9 September 2009 by Ecloud

Problems to be solved

  • Current-model Canon cameras have gotten more stupid than old ones regarding the naming of photos on the memory card. My PowerShot A50 would remember (via in-camera memory, apparently) how many pictures it had taken during its entire life; so if aut_3902 was the last picture I took and I then inserted a different memory card, the next picture would still be aut_3903. This is as it should be. The SD300, however, always starts over from zero, so you end up with img_0001.jpg over and over again, every time you delete anything from the SD card. It's otherwise a decent camera, but this is rather stupid of them.
  • The typical filesystem (on Linux or otherwise) doesn't go to any great effort to preserve the original date and time of a file. E.g. in the past I'd copy my pictures like this:
cd ~/pics
mkdir 200607-timbuktoo-trip-1
mount /mnt/sd0
cp -a /mnt/sd0/dcim/100canon/* 200607-timbuktoo-trip-1/
cp -a /mnt/sd0/dcim/101canon/* 200607-timbuktoo-trip-1/
...

so, initially, the files all have the correct dates and times, because -a preserves date, time, permissions, and ownership. Now suppose I touch up a picture in GIMP and re-save it: the file's date and time change to when I did the touch-up. Linux filesystems don't preserve the original date and time at all. It continues to exist in the EXIF tags (as long as those are preserved during all the editing operations), but that is beside the point; if I do ls -lrt, the files are listed in the wrong order.

  • ReiserFS v4 is still immature, and nobody has figured out how to finally absorb exif tags (and other kinds of metadata) into the filesystem itself, rather than keeping them inside the files. And basic tools like "ls" do not access the metadata inside the files. More advanced file managers (like Konqueror, or Explorer) do use them to an extent.
  • Exif tags include an "orientation"; some cameras (including my Canon) can indicate whether the picture was shot in portrait mode. Unfortunately, many programs pay no attention to this tag, so sometimes when viewing you will see such an image rotated, but often you will not (the portrait-mode image is shown in landscape mode instead).

Note that all of these are open, unsolved problems. The first one is Canon's fault. The rest are the result of a philosophical mistake: the lack of extended metadata in the filesystem, and of tools to manipulate it. Apple showed us the way, more than 20 years ago, with the resource fork. Yet we still do not have an equivalent system in any modern OS. EACH date and time should be accessible via ls. The ORIGINAL date and time should remain the same even when the file is changed. And extra metadata (exposure, lens, camera model and so on) should be extended metadata stored on the filesystem, not in the file. But then there would have to be a cross-platform replacement for the FAT filesystem, one that supports extended metadata, and everyone would have to use it. It's not hard, it just hasn't been done.

  • There is no cross-platform method of doing drag-and-drop of files to a browser, to upload to an online album service, or printing service. Consequently getting images printed is usually a tedious one-at-a-time upload process, or else using a Java applet, or ActiveX control, or stupid one-service-only, one-platform-only special image uploader application. For the old-fashioned method of uploading into a web form though, there is a Firefox extension to make it easier (described below).

Current workarounds

ExifTool

ExifTool is quite useful for renaming files (among other things), so that the filename contains the date and time, and a normal ls command, sorting by filename, will show them in the right order.

cd ~/pics/200607-timbuktoo-trip-1
exiftool -r  -d %Y%m%d-%H.%M.%S.%%e "-filename<CreateDate" -o . /mnt/sd0

I can do this repeatedly for each SD card that I used, and as long as the date and time on the camera were correct, I will get files of the form 20060721-11.50.12.jpg. (Using colons as separators in the time field would be nice, rather than dots, e.g. 20060721-11:50:12.jpg, but such files are not portable to Windows: it has the concept of the drive letter, so a colon is not allowed in a filename. How lame.) Now that the Canon numbering is completely useless (due to endless repetition) I don't see any point in preserving it. And if I cannot preserve the date and time on the filesystem, at least I can preserve it by putting it in the name. If I want to rename a picture to something more descriptive, I can still just add descriptive text onto the end of the name. (But usually I don't bother.)
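
Since the timestamp now lives in the name, it can also be pushed back onto the filesystem after an edit has reset it. The following is only a sketch under some assumptions: it needs bash, the function name restore_mtime is made up for illustration, and it expects names produced by the template above (descriptive text after the timestamp is fine).

```shell
# restore_mtime FILE...
# Hypothetical helper: for names like YYYYMMDD-HH.MM.SS[anything].ext,
# set the file's modification time to the timestamp in the name.
restore_mtime() {
    local f base
    local re='^([0-9]{8})-([0-9]{2})\.([0-9]{2})\.([0-9]{2})'
    for f in "$@"; do
        base=$(basename "$f")
        if [[ $base =~ $re ]]; then
            # touch -t expects CCYYMMDDhhmm.ss, interpreted as local time
            touch -t "${BASH_REMATCH[1]}${BASH_REMATCH[2]}${BASH_REMATCH[3]}.${BASH_REMATCH[4]}" "$f"
        else
            echo "skipping $f: no timestamp in name" >&2
        fi
    done
}
```

After running restore_mtime *.jpg in a trip folder, ls -lrt should list the pictures in shooting order again, at least until the next edit.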

However there is a divide-by-zero bug in exiftool when $x_res and $y_res do not end up being defined (I guess these were exif tags it expected would always exist). I worked around it like this (edit /usr/lib/perl5/site_perl/5.8.6/Image/ExifTool/Exif.pm):

# calculate focal plane size in mm
$w *= $units / ($x_res > 0 && $x_res < 1000000 ? $x_res : 10142.86);
$h *= $units / ($y_res > 0 && $y_res < 1000000 ? $y_res : 10142.86);

The default value is the one my camera usually provides.

Now what if the date and time on the camera are not correct? exiftool can offset all of the timestamps by a fixed amount, but here it gets to be a pain, because apparently it cannot fix all three timestamps (DateTimeOriginal, CreateDate and ModifyDate), rename, and copy, all at the same time. So I have to copy the files, then fix the timestamps, then rename the files again. In this example, the camera clock was ahead by 1 day and 1 minute.

exiftool -DateTimeOriginal-='0:0:1 0:1:0' .
exiftool -CreateDate-='0:0:1 0:1:0' .
exiftool -ModifyDate-='0:0:1 0:1:0' .
exiftool -d %Y%m%d-%H.%M.%S.%%e "-filename<CreateDate" .
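
The three shift commands can probably be collapsed into one by using ExifTool's AllDates shortcut, which stands for DateTimeOriginal, CreateDate and ModifyDate. The rename still has to be a second command, because it must read the corrected date. A minimal sketch, where the function name fix_clock is made up and the name template is the dot-separated one from the first example:

```shell
# fix_clock SHIFT DIR
# SHIFT is in ExifTool's "Y:M:D h:m:s" form, e.g. '0:0:1 0:1:0' = 1 day, 1 minute.
# Hypothetical helper; assumes the camera clock was AHEAD, so the shift is subtracted.
fix_clock() {
    local shift_by=$1 dir=$2
    # Step 1: move all three timestamps back by the same amount
    exiftool "-AllDates-=$shift_by" "$dir" || return
    # Step 2: rename each file from its corrected CreateDate
    exiftool -d '%Y%m%d-%H.%M.%S.%%e' '-filename<CreateDate' "$dir"
}
```

Usage would be fix_clock '0:0:1 0:1:0' . from inside the folder holding the copied files. ExifTool keeps a backup of each file it rewrites with an _original suffix, so check the result before deleting those.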

ExifTool does not rename AVI files, but it can get the DateTimeOriginal timestamp out of the ones from the SD300. So my script "get-photos" now looks like this:

Script: get-photos (cut to the chase)

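# Copy the photos, naming each one after its EXIF creation time
exiftool -r -d %Y%m%d-%H.%M.%S.%%e "-filename<CreateDate" -o . /media/CANON_DC
# ExifTool does not rename AVI files, so copy those one by one
# (-iname matches .avi and .AVI; the quotes stop the shell expanding the pattern)
find /media/CANON_DC -iname '*.avi' | while read -r f
do
        cp "$f" "$(exiftool -d '%Y%m%d-%H.%M.%S' -DateTimeOriginal -S -s "$f").avi"
done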

When I plug in the SD card, hald auto-mounts it at /media/CANON_DC, and then I can run "get-photos" to copy all the photos and videos from the SD card into the current directory, renaming on-the-fly.

RenRot

Another way, which also solves the rotation problem by losslessly (!) rotating the actual JPEG image, is the Perl script RenRot, which uses exif tags for only two purposes - RENaming and ROTating images. renrot *.jpg does more or less the right thing by default, but the names will be like 20070111140601.jpg without any separators. You can optionally specify a template for the renaming.

Uploading images to a web album

  • DragDropUpload is a nice Firefox extension, which is useful for any multiple-file-upload form: if there are 10 file-upload fields, you can drag up to 10 files to the first one and it will fill them going downwards from there!
  • It's also possible to use FUSE to mount a WebDAV service and just copy files into place. Then you need to use a local album HTML generator rather than something like Gallery (which is server-side PHP). For personal use, server-side code is kind of silly... it saves cycles to just generate the HTML once and let it be static after that. Nevertheless I keep using Gallery out of habit. Gallery is supposed to have WebDAV support too, but I haven't gotten it to work yet.

Geotagging

Tools:

First I corrected the photo timestamps like this:

exiftool -DateTimeOriginal-=0:0:46 *
exiftool -createdate-=0:0:46 *

because my camera was set 46 seconds into the future, relative to the GPS clock, at the time I took the pictures. (The SD300 has a decent clock; it had not been set for many months, and the error was only 46 seconds.)

Then I used exiftool as above to rename the files and copy them into their final location.

Then I used gpsPhoto.pl like this:

~/bin/gpsPhoto.pl --dir . --gpsfile ~/gps/20080315-south-mtn.gpx --timeoffset 25200 --kml hiking.kml

25200 is the number of seconds to add to the photo time to get GMT: Arizona is 7 hours behind GMT, and 7 × 3600 = 25200.
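
To avoid doing that arithmetic by hand, the offset can be derived from the zone's distance from GMT. This is just a sketch: the function name seconds_to_gmt is made up, and it only handles whole-hour zones.

```shell
# seconds_to_gmt UTC_OFFSET_HOURS
# Hypothetical helper: given the camera's zone as hours relative to GMT
# (e.g. -7 for Arizona), print the seconds to ADD to photo time to get GMT.
seconds_to_gmt() {
    echo $(( -($1) * 3600 ))
}
```

For Arizona, seconds_to_gmt -7 prints 25200, the value passed to --timeoffset above.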

I did not have my GPS running all the time so some photos didn't get tagged. (GPS was running out of track memory and I shut it off to avoid having the first part of the track get erased.)

The kml file can be viewed in Google Earth, and will show the pictures at their correct locations.

  • geophoto: I haven't tried this yet.
  • GPSCorrelate: has a GUI, but I didn't manage to get either the GUI or the command-line version to work:
[proton][02:48:04 PM] gpscorrelate --timeadd -7 -g ~/gps/20080315-south-mtn.gpx -v 20080315-19.12.52.jpg
EXIF-GPS Photo matching program.
Daniel Foote, 2005.

Reading GPS Data...

Correlate:
20080315-19.12.52.jpg: No match.

Completed correlation process.
Matched:     0 (0 Exact, 0 Interpolated, 0 Rounded).
Failed:      1 (1 Not matched, 0 Write failure, 0 Too Far,
                0 No Date, 0 GPS Already Present.)

Ideas for future tools

Online gallery

WebDAV is great, and beats needing special server-side code. So we're talking about a tool to manage photos, and generate static pages. Something like gphoto or iphoto, but smarter and better.

  • Arrange photos taking their aspect ratios into account (cassowary maybe?) E.g. a panorama could span across the entire "page" while 3 or so regular photos are underneath; 3 rows of landscape-mode photos could fit alongside 2 rows of portrait-mode; etc. The algorithm should be flexible rather than assuming certain aspect ratios. Thumbnail sizes could be tweaked a bit to achieve "justification", within a user-specified min/max range, and assuming that bigger is better as long as it fits.
  • Use the same method Slashdot does to load more stuff when you get to the bottom of the page. Having to hit "next page" really sucks, especially when you have to scroll on the first page, then scroll back to the top just to be able to hit "next". So it should be on one page, yet at the same time people on slow connections should be able to view the first few photos while the rest are loading.
  • Hover to zoom somewhat, click to zoom to full-size
  • Store captions inside photos, or in filesystem metadata. Allow modifying metadata without modifying filesystem "creation time". Then, when generating the static HTML, use those captions.
  • backgrounds that look like paper photo albums, different types from around the world (hold a contest to encourage people to submit scans of blank photo album pages)
  • captions on slips of "paper" under the photos (one possible mode)
  • If the browser is resized, photos could flow so there are fewer rows... but that interferes with using algorithmic optimization and "justification". Would work fine for stupid grid or row-based layouts (which do not do anything special with different aspect ratios); yet, I've never seen an online gallery which worked that way, for some reason.
  • Alternatively, if the browser is resized, could thumbnails be shown larger? It would have to be done in JavaScript by modifying the img size attribute, so therefore thumbnails would have to be bigger to begin with, so there is something worth scaling. Probably 2x larger than "normal", then if you have a really big monitor they would become larger, or if you have a small one they become smaller. But there is more work for the client then (rescaling images).
  • Same tool can run with a GUI, or in command-line mode to just generate a gallery from all the photos in a directory (or recursively from a top-level directory)
  • WebDAV support needs to be tolerant/persistent: treat it as a background task to sync the new album into place, rather than a modal process with a progress bar. E.g. if you fail to remove a file or folder, try again. Try removing files in a different order before removing folders, etc. (I have noticed that rm -rf fails on DreamHost, but rm -r is OK. Why?!?) If the connection is intermittent, it should still work, ideally; it just might become a background task.
  • At the same time, do not require generation of the entire album on the local disk before uploading: do the scaling (of thumbnails and medium-sized zoomed versions) just-in-time, at the speed the items get uploaded. So plain old rsync is not adequate. (I have noticed that iPhoto fails to upload large quantities to DreamHost via WebDAV. So I have to generate the whole thing locally first. That takes too much space.)
  • Importing photos from a flash card needs to be automatic when I insert it, so I can quit using a command-line script for that. Should organize them the way I want them to be organized. (I.e. udev can open this photo manager program when certain cards are inserted)
  • Ideas for preferences:
    • Management style
      • Leave my photos alone! Just view them and allow editing metadata, generating albums etc.
      • Full-blown ownership of photos, for Bubba 6-pack who has no idea how to organize these "file" things into these "folder" things (like iphoto takes ownership of photos)
      • When I create "event" folders, move the photos into them, and put them here: ___________ Otherwise leave them in the same folder.
    • How to handle photo modifications
      • When I make material changes (cropping, retouching etc), back up the originals with suffix _________
      • When I make material changes, leave the original with the same name and prompt for a new name
    • Naming convention for importing photos
      • Put each imported set into a new folder with the date
      • Rename each photo according to its exif data, with the date and time
      • When I rename a photo, prepend the date and time
    • Preservation
      • Keep the file date the same as the creation date from EXIF
      • Original files are read-only, period. Any modification whatsoever creates a copy in subfolder _______
    • Importing photos
      • Open automatically when these cards are inserted: __________
      • Stay open after importing / exit after importing
      • Use this template for names: ___________
      • If photos contain GPS coordinates, group them into events based on proximity in time and space
      • Download GPS coordinates from this separate GPS device: __________
      • If GPS is not connected:
        • Prompt for it
        • Ignore it and just import the photos
      • Time difference between camera clock and GPS is ______
    • HTML
      • Scaling the window causes:
        • Photos to scale to fit
        • Rows of photos to reflow to fit
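
The aspect-ratio-aware arrangement from the first idea in the list above could start out as something much simpler than a constraint solver like cassowary. The following is a rough greedy sketch, not a finished design: the function name justify_rows and its arguments are made up, it needs bash, it ignores gaps between thumbnails, and it does not yet enforce a min/max thumbnail size. Photos are added to a row at a target height until the next one would overflow the page, then the row height is rescaled so the row spans the page width exactly.

```shell
# justify_rows PAGE_WIDTH TARGET_HEIGHT WxH [WxH ...]
# Hypothetical helper. Prints one line per row: the row height in pixels,
# followed by the photos (given as pixel dimensions) placed in that row.
justify_rows() {
    local page=$1 target=$2
    shift 2
    local -a row=()
    local sum=0 dims w h scaled
    for dims in "$@"; do
        w=${dims%x*}
        h=${dims#*x}
        scaled=$(( w * target / h ))   # photo width when scaled to the target height
        if (( ${#row[@]} > 0 && sum + scaled > page )); then
            # Row is full: rescale its height so the photos span the page width
            echo "$(( target * page / sum )) ${row[*]}"
            row=()
            sum=0
        fi
        row+=("$dims")
        sum=$(( sum + scaled ))
    done
    # The last row stays at the target height instead of being stretched
    if (( ${#row[@]} > 0 )); then
        echo "$target ${row[*]}"
    fi
}
```

For example, justify_rows 1000 100 400x100 400x100 400x100 prints "125 400x100 400x100" and then "100 400x100": the first two photos are enlarged to fill the page width, and the leftover photo keeps the target height. A photo that is wider than the page at the target height gets a row to itself and is shrunk to fit, which is roughly the panorama case.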

Simulation of real-world photo handling

not documented here :-) But something was already done on BeOS anyway (so I've heard)...
