Image Description – What I want

I have about 1500 images that I’ve taken over the last 1.5 years since I got a digital camera. These photos vary wildly in what they contain, from parties with friends to buildings to family photos to stuff around the house. There is a variety of images depicting people from all walks of life, from my college friends to my family now.

I want to search them.

I’ve become convinced that the best way to store the information about these photos is in RDF: it’s the only data description framework that lets me extend and expand my descriptions to include everything I need. (There’s also the fact that I’m an RDF nut, so everything is a case for RDF.) Once the data is created, adding it to an RDF store and asking that store questions will not be difficult; the main problem is getting the data created.

There are some web-based tools for photo/image annotation: mortenf uses a tool he created which spits out N-Triples, and there is also the Java app attached to the FOAF codepiction project. However, neither of these lets me describe my images very easily; there’s a lot of effort involved.

My ideal photo annotation tool works like this:

Provide a URL to a photo.
This photo may or may not have EXIF data included. If it does, use the EXIF data to generate the appropriate RDF; something along the lines of the EXIF metadata created by Masahide’s exif2rdf web app would be good.
Allow me to describe the image with a title and description: dc:title and dc:description.
Allow me to describe the person who took the photo: foaf:maker. This part can work from the existing triple store I have: let me enter a name, and then use that name, along with an mbox_sha1sum, to describe the person.
Offer me a license list: a list of CC licenses to choose from.
Allow me to describe “things” that are included in the photo: enter a term, check it against WordNet, then use the right sense of the word.
Allow me to describe “people” in the photo: enter a name or nickname, and then present me a list of options from my triple store, or the option to describe a new person. Once I choose a person, a name + IFP should be included in the output so that I can smush the data together.
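
As a rough illustration of what the output for one photo might look like, here is a minimal Python sketch that builds N-Triples by hand. It is not the Redland-based approach described later in this post; the function names (`describe_photo`, `mbox_sha1sum`, `literal`), the blank-node labels, and any example.org URLs are hypothetical. The only parts taken from the vocabularies themselves are dc:title, foaf:maker, foaf:depicts, foaf:name, and foaf:mbox_sha1sum, where the last is the SHA-1 of the person’s mailto: URI.

```python
import hashlib

DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"


def mbox_sha1sum(email):
    """SHA-1 hex digest of the mailto: URI, which is how FOAF defines mbox_sha1sum."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


def literal(text):
    """Quote a string as an N-Triples literal (minimal escaping only)."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"%s"' % escaped


def person_triples(node, name, email):
    """Name plus an inverse functional property, so the data can be smushed later."""
    return [
        "%s <%sname> %s ." % (node, FOAF, literal(name)),
        "%s <%smbox_sha1sum> %s ." % (node, FOAF, literal(mbox_sha1sum(email))),
    ]


def describe_photo(photo_url, title, maker_name, maker_email, depicted=()):
    """Return N-Triples lines for one photo.

    depicted is a sequence of (name, email) pairs for people in the photo.
    """
    photo = "<%s>" % photo_url
    lines = [
        "%s <%stitle> %s ." % (photo, DC, literal(title)),
        "%s <%smaker> _:maker ." % (photo, FOAF),
    ]
    lines += person_triples("_:maker", maker_name, maker_email)
    for i, (name, email) in enumerate(depicted):
        node = "_:p%d" % i  # hypothetical blank-node label, one per person
        lines.append("%s <%sdepicts> %s ." % (photo, FOAF, node))
        lines += person_triples(node, name, email)
    return lines
```

Calling `describe_photo` with a photo URL, a title, the photographer’s name and address, and a list of people returns a list of strings that can be joined with newlines and written to a file. EXIF data, licenses, and WordNet terms are left out of this sketch.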

I’d like for the entire interface to be text based. I don’t want a web or graphical app: with 1500 photos, that’s just too damn much. I want to be able to describe these things quickly and easily, so very little should get in the way, with as few keystrokes as possible. If I want to skip something, I can just hit enter to skip over it.

I’m lucky in some respects, because some of the hard work of gathering information about people is already done for me: I have a large triple store at hand that can help me with this project. I also have some limited experience in working with the Redland framework, so I may even have the experience I need to make this work. So, here’s how I see it working on the inside:

Create an interface along the lines of what Config::Tiny (in Perl) does: Question [default] ? Then allow the user either to skip past (and use the default, which may be none in some cases) or to enter their own answer; if they enter one, possibly provide a list of choices after that. For example, if I enter “Jessica” as a name, it should tell me “There is no name Jessica in the database: would you like to create a new user?” with a yes or no option; if no, it would go back to the previous step and let me enter a different name.

Dump the data, as it’s created, to an in-memory Redland model. Once the photo description is complete, serialize the model to a file, which can then be run through cwm to clean it up for more permanent storage.
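
The prompt-and-retry flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ask` and `ask_person` are hypothetical names, the “database” is a plain dict standing in for the triple store, and the `read` parameter exists only so the prompts can be driven by a script instead of the keyboard.

```python
def ask(question, default=None, read=input):
    """Prompt as 'Question [default] ? ' and return the answer.

    Just hitting enter returns the default, which may be None.
    """
    shown = "" if default is None else " [%s]" % default
    answer = read("%s%s ? " % (question, shown)).strip()
    return answer if answer else default


def ask_person(known_people, read=input):
    """Ask for a name until it matches, a new person is confirmed, or it is skipped.

    known_people maps a lowercase name to an identifier such as an
    mbox_sha1sum (a stand-in for a lookup against the real triple store).
    Returns (name, identifier), (name, None) for a new person, or None if skipped.
    """
    while True:
        name = ask("Person in photo", read=read)
        if name is None:
            return None  # enter with no input skips this field
        identifier = known_people.get(name.lower())
        if identifier is not None:
            return name, identifier
        create = ask(
            "There is no name %s in the database: "
            "would you like to create a new user? (yes/no)" % name,
            default="no",
            read=read,
        )
        if create.lower().startswith("y"):
            return name, None
        # "no": go back to the previous step and ask for a different name
```

Passing a different `read` function makes the flow easy to exercise without typing, which matters when the same questions get asked 1500 times.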

I’m probably going to actually get off my butt and write this pretty soon: I have a lot of interest in describing this data, in part because it’s another case where a nifty project can lead to a great demonstration of why and how RDF can work. I’m most interested in hearing from anyone who has suggestions on what kind of tools would be good for creating the interface: the last time I did any kind of user input other than command-line arguments was when I was working with cin >> statements in C++. I’m probably going to write it in Python (I feel that I accomplish things much more quickly in that language than in either Perl or PHP), so tips on how to create command-line user interfaces in Python are appreciated.

This entry was posted on Sunday, December 19th, 2004 at 9:31 am and is filed under Image Description.