30 August 2006

Maps added to worldinpictures.org

I've just added a "map view" to worldinpictures.org. There are now a couple of links below the search results to switch between "gallery view" and "map view". Select "map view" to see the location of the photos on a Google Map. Hover over a marker to see the photo title, click the marker to see the image.

There are a couple of things I need to improve but I thought it was useful enough to release as is. It needs to cope better when several photos have the same location - at present only one of the photos will be available on the map view. I might have a look at using tabbed info windows to cope with this.
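
One possible way to prepare the data for that: group the photos by coordinate first, so each marker knows about every photo at its location. This is only a sketch. The photo dictionaries and their "title", "lat" and "lon" keys are hypothetical, not the site's real data model.

```python
from collections import defaultdict


def group_by_location(photos):
    """Map each (lat, lon) pair to the titles of all photos taken there."""
    groups = defaultdict(list)
    for photo in photos:
        groups[(photo["lat"], photo["lon"])].append(photo["title"])
    return dict(groups)


# Hypothetical sample data: two photos share one location.
photos = [
    {"title": "Cathedral", "lat": 50.9413, "lon": 6.9583},
    {"title": "Cathedral at night", "lat": 50.9413, "lon": 6.9583},
    {"title": "Bridge", "lat": 50.9414, "lon": 6.9650},
]
markers = group_by_location(photos)
```

Each key would then become one marker, and a marker with more than one title could get one tab per photo.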

Python unicode function weirdness

Python has a built-in function called unicode which is intended to convert strings to unicode.

When called with only one argument (the string to convert) it will assume the string is encoded in the default encoding. This is normally ASCII but can be overridden in site.py.

I would normally want to write code that would work regardless of the default encoding. Thankfully unicode can take an additional argument to allow you to specify an encoding rather than using the default. Unfortunately, and for no reason I can think of, supplying this argument causes the function to behave differently when given a unicode string as input:

>>> unicode(u"abc")
u'abc'

>>> unicode(u"abc", "ascii")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: decoding Unicode is not supported

This strikes me as rather bizarre behavior. Surely unicode(s) ought to behave exactly the same as unicode(s, sys.getdefaultencoding())?
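
A small wrapper can hide the difference. The following is a minimal sketch: the name to_unicode and the utf-8 default are assumptions for illustration. The version check is only there so the same code also runs on Python 3, where the text type is called str.

```python
import sys

# Python 2 calls the text type `unicode`; Python 3 calls it `str`.
# Only the selected branch is evaluated, so this runs on either version.
text_type = unicode if sys.version_info[0] == 2 else str


def to_unicode(value, encoding="utf-8"):
    """Return value as text, decoding it only if it is a byte string."""
    if isinstance(value, text_type):
        # Already text: return unchanged rather than raising TypeError.
        return value
    return value.decode(encoding)
```

With this, to_unicode(u"abc", "ascii") and to_unicode(b"abc", "ascii") both return u'abc', and the result never depends on the default encoding.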

Unexpected ElementTree behavior

I've been using the Python ElementTree library for parsing web service responses for my worldinpictures.org site and generally found it reliable and easy to use.

Character encoding issues have caused me a number of problems recently and I've come across another one with ElementTree:

>>> from elementtree import ElementTree as ET
>>> ET.XML('<?xml version="1.0" encoding="utf-8" ?><title>Good morning Mazatl\xc3\xa1n!</title>').text
u'Good morning Mazatl\xe1n!'

>>> ET.XML('<?xml version="1.0" encoding="utf-8" ?><title>Good morning Mazatln!</title>').text
'Good morning Mazatln!'

It seems that if the element contains any non-ASCII characters the result will be a unicode string; otherwise it will be a plain string.

It would be preferable to have a consistent return type (e.g. always unicode or always in the input encoding).

So, in my case, I pass the result through unicode() to ensure I always get a unicode result.

(There's an issue here with the unicode function and its reliance on the default encoding but that belongs in another post...)
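
A sketch of that normalisation step, written so that it does not rely on the default encoding. Two assumptions: it uses xml.etree.ElementTree (the packaging of ElementTree in the standard library) rather than the standalone elementtree package shown above, and element_text is a made-up helper name. On Python 3 the parser always returns text, so the decode branch only matters on Python 2.

```python
import sys
import xml.etree.ElementTree as ET

# Python 2 calls the text type `unicode`; Python 3 calls it `str`.
text_type = unicode if sys.version_info[0] == 2 else str


def element_text(xml_bytes):
    """Parse an XML byte string and return the root element's text as text."""
    text = ET.XML(xml_bytes).text
    if text is None:
        return u""
    if isinstance(text, text_type):
        return text
    # A plain string is only returned for pure-ASCII content,
    # so decoding it as ASCII is safe and explicit.
    return text.decode("ascii")


title = element_text(
    b'<?xml version="1.0" encoding="utf-8" ?>'
    b'<title>Good morning Mazatl\xc3\xa1n!</title>'
)
```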

25 August 2006

worldinpictures.org - see more

I've added nearer and further links to worldinpictures.org, so now you can see more than just the 12 nearest images to your chosen place.

24 August 2006

Google maps UTF-8 problem

A while ago I came across a problem with the google geocoder apparently returning Latin1 encoded characters rather than UTF-8. I posted an enquiry to the Google Maps API group but didn't get any responses.

Now I've had time to look at this in more detail and found how to fix it. From my investigations I found that:


  1. wget, curl and requests made with Python urllib2 all returned responses encoded in Latin1. Requests made with Firefox returned responses encoded in UTF-8.

  2. Regardless of the actual encoding returned, the XML always stated encoding="UTF-8".

  3. The Content-Type header in the HTTP response correctly gave the returned encoding (either UTF-8 or ISO-8859-1).


So it looked like this had something to do with the headers sent in the HTTP request. I used curl to play around with these and see if I could get a UTF-8 response. The obvious ones (e.g. Accept-Charset: utf-8) didn't work. But what did work was changing the User-Agent header. So, if you want to ensure you get a UTF-8 response, pretend to be Firefox:

curl -H'User-Agent: Mozilla/5.0' 'http://maps.google.com/maps/geo?key=&q=cologne&output=xml'

All this means that you can now search for cologne on worldinpictures.org and it will display Köln rather than K�ln.
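
The same workaround from Python might look something like the sketch below. It sets the User-Agent header on the request and, as a second line of defence, decodes the body using the charset from the Content-Type header (which finding 3 says is the reliable one) instead of trusting the XML declaration. The function names are made up for illustration, and nothing here actually contacts the geocoder.

```python
from email.message import Message

try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2


def build_geocoder_request(url):
    """Build a request that presents itself with a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def decode_body(body, content_type, fallback="utf-8"):
    """Decode response bytes using the charset named in Content-Type."""
    header = Message()
    header["Content-Type"] = content_type
    # get_content_charset() returns None when no charset is given.
    charset = header.get_content_charset() or fallback
    return body.decode(charset)
```

The request object would be passed to urlopen, and the response's Content-Type header and raw body to decode_body.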

23 August 2006

Ubuntu xserver problem

If you find X suddenly stopped working on Ubuntu in the last few days you may have the same problem I had. Yesterday a routine update installed xserver-xorg-core 1:1.0.2-0ubuntu10.3. Turns out this "breaks PCI setup for many users", so the solution is to do another update and get xserver-xorg-core 1:1.0.2-0ubuntu10.4 which reverts the change.

Unfortunately I spent several hours trying to fix the problem before doing the sensible thing and checking if a new update was available.

Note to self: always check for an update first.

22 August 2006

worldinpictures gets UK postcodes

With the help of the geocoder web service available at worldkit, worldinpictures.org now understands UK postcodes.

The geocoder only gives accuracy down to the first part of the postcode but I allow searches on full ("SW1A 0AA") or partial codes ("SW1A").
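
Reducing a full postcode to the part the geocoder understands could be done roughly as follows. This is a sketch, not the site's actual code: outward_code is a made-up name, and it relies on the assumption that the second half of a full UK postcode is always a digit followed by two letters.

```python
def outward_code(postcode):
    """Return the first part of a UK postcode, e.g. 'SW1A' for 'SW1A 0AA'."""
    compact = "".join(postcode.upper().split())
    # A full postcode ends with a three-character second part:
    # one digit followed by two letters. First parts have at most
    # four characters, so anything longer is treated as a full postcode.
    if len(compact) > 4 and compact[-3].isdigit() and compact[-2:].isalpha():
        return compact[:-3]
    return compact
```

Both outward_code("SW1A 0AA") and outward_code("SW1A") give "SW1A", so either form of search ends up as the same geocoder query.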

Unfortunately the geocoder doesn't give results for all postcodes - "N1" for example doesn't get resolved. Not sure if this is due to the worldkit web service or the source data it uses (which comes from jibble.org).

Still this makes searching within the UK that bit easier - at least until the google geocoder supports it...

17 August 2006

Detabifying

If that was a word I'm not sure that would be the way to spell it. I'm referring to the process of replacing tabs by spaces.

The unix command expand does this job - taking a file and replacing tabs with a specified number of spaces.

What's the best way to run this command over a number of files? find is likely to be useful here to return the files to process. In my case these are all .php and .css files in and below the current directory:

find . -name '*.php' -o -name '*.css'

will get me a list of these.

I thought perhaps I could use find's -exec option to run expand on the resulting files. Unfortunately expand will only send its output to stdout and as far as I can see there's no way of specifying that you want the output of an -exec'd command redirected.

However, I can iterate over the results with for and this does the trick (copying the results to /tmp/x in this case):

cp -r . /tmp/x
for f in `find . -name '*.php' -o -name '*.css'`; do expand -i -t4 "$f" > "/tmp/x/$f"; done

The initial cp is just a crude way of creating the directory structure in the destination location to stop expand failing when sub-directories don't exist.
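
For comparison, here is a rough Python equivalent that creates the destination directories as it goes, so no initial copy is needed. It is a sketch under a few assumptions: the function names are made up, files are handled as bytes to sidestep encoding questions, and, like expand -i, only tabs in the leading whitespace of each line are converted. The destination should be outside the source tree, otherwise the walk could descend into its own output.

```python
import os


def expand_leading_tabs(line, tabsize=4):
    """Expand tabs in the leading whitespace of a byte-string line."""
    stripped = line.lstrip(b" \t")
    indent = line[:len(line) - len(stripped)]
    return indent.expandtabs(tabsize) + stripped


def detab_tree(src, dst, suffixes=(".php", ".css"), tabsize=4):
    """Write detabbed copies of matching files under src into dst."""
    for dirpath, dirnames, filenames in os.walk(src):
        target_dir = os.path.join(dst, os.path.relpath(dirpath, src))
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            # Create the sub-directory only when there is a file to put in it.
            if not os.path.isdir(target_dir):
                os.makedirs(target_dir)
            with open(os.path.join(dirpath, name), "rb") as f:
                lines = f.readlines()
            with open(os.path.join(target_dir, name), "wb") as f:
                f.writelines(expand_leading_tabs(l, tabsize) for l in lines)
```

Calling detab_tree(".", "/tmp/x") would then cover the same files as the shell loop.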

15 August 2006

worldinpictures.org to do list

There are plenty of improvements I'd like to make to worldinpictures.org in the (hopefully) near future. These include, in no particular order:


  • Improve geocoding: The google service I currently use doesn't support all countries. Specifically it doesn't support the UK, so if you want to view images from the UK you have to find out the latitude and longitude yourself. Not very convenient. I think worldkit.org and/or http://www.jibble.org/ukpostcodes/ may help here.

  • Add a "more" button to view more than the initial 12 images currently displayed.

  • Add a feed feature so that users can subscribe to a feed of photos for a particular location.

worldinpictures.org launched

My new site worldinpictures.org is now up and running. The concept is fairly simple - you enter an address or location and in return get to see photos taken nearby.
Photos are obtained from flickr using the flickr API. Only "geotagged" photos can be used though. This is a convention whereby photos in flickr are tagged with the latitude and longitude at which they were taken. Such photos should have the following three tags:


  • geotagged

  • geo:lon=XXXX

  • geo:lat=YYYY


where XXXX and YYYY are the decimal longitude and latitude values (see geotagging on wikipedia).
With this information I can build up a database of photos indexed by latitude and longitude. This can then be queried for photos within a certain distance of a specified coordinate.
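
As a rough illustration of those two steps, the sketch below reads the coordinates out of a photo's tags and then finds the photos within a given distance of a point. Everything about it is an assumption rather than the site's real implementation: the function names are made up, it scans a plain list instead of querying an indexed database, and it uses the haversine great-circle formula with a mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, good enough for "nearby" searches


def parse_geotags(tags):
    """Return (lat, lon) from a list of tags, or None if not geotagged."""
    if "geotagged" not in tags:
        return None
    values = {}
    for tag in tags:
        if tag.startswith("geo:lat=") or tag.startswith("geo:lon="):
            key, _, raw = tag.partition("=")
            try:
                values[key] = float(raw)
            except ValueError:
                return None
    if "geo:lat" in values and "geo:lon" in values:
        return values["geo:lat"], values["geo:lon"]
    return None


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def photos_near(photos, centre, radius_km):
    """Return titles of photos within radius_km of centre, nearest first.

    photos is a list of (title, (lat, lon)) pairs.
    """
    hits = [(distance_km(centre, position), title) for title, position in photos]
    return [title for dist, title in sorted(hits) if dist <= radius_km]
```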
Next step is to allow a user to enter a location, address or placename rather than a latitude or longitude. Google comes to the rescue here with their geocoder (which is part of the Google Maps API). I supply it with an address and get a latitude/longitude in return.