Archive for the 'Software' Category

Learning new things

Posted in Software on December 3rd, 2014 at 00:06:07

One kind of sad thing about learning new things about software while working for Google: It is unlikely anything that you learn about specific tools outside of Google will be relevant to the tools you use for work inside, and vice versa.

In general, Google has a massive set of awesome tools for everything from deployment to monitoring; and a broad codebase with libraries that can do everything under the sun. However, that means that when you go outside of the Google environment, you’re suddenly stuck a bit out in the cold — the tools that you use inside Google can’t be used outside, so you have to have a completely separate infrastructure (both literal infrastructure, and code infrastructure).

In previous jobs, tools I learned on my own time could be useful to me at work, and vice versa. That isn’t true anymore.

In most cases, this is overall a positive thing. At the moment, I’m learning about ansible and vagrant, tools for spinning up VMs locally and provisioning them. However, my work at Google can’t really use this, since Google’s toolchain and build/deployment process are totally different. In the other direction, I’ve been setting up monitoring of my services inside Google, but the massive investment in monitoring infrastructure that this work builds on is completely unavailable outside of Google.

It’s not a big deal, but it does make the motivation to learn new things outside of Google a bit lower. It also likely contributes to the well-known phenomenon of people who disappear into Google no longer generating as much code outside of Google: once you live in a completely different environment which has some nice things going for it, the idea of learning new tools to replace what you’ve already got seems kinda silly, doubly so since you can’t re-use those tools for work.

On the “scale of things to be upset about at your job”, of course, this is a pretty minor complaint. 😉 It’s just something I realized while doing some experimentation with new tools: sharing my “new shiny” at work tomorrow isn’t really going to have the same effect it might have in the past.

FC40 Camera – Reverse Engineering

Posted in Photography, Software on April 27th, 2014 at 21:42:16

So, the Phantom FC40 comes with a wireless camera, controllable from an iPhone or Android phone. It’s a small device, about two inches by two inches, similar to the size of a GoPro if you cut off the ‘lens’ part. I think the camera is a stripped-down version of another wifi camera; at least, the specs, shape, etc. seem to match.

It records 720p video to a built-in SD Card, and is controllable via an App on the iPhone or Android. However, there’s no way to control it from a laptop, which sort of annoys me. I’d also like to experiment with linking the camera and doing live-streaming straight from the camera to the web — I think that would rock.

After doing some research, I found that it looks like the software on this camera is very similar to the wifi support on the GoPro: It uses an “Ambarella streaming” web-app hosted by Apache Coyote on the device. The device provides an RTSP stream, and in theory allows you to browse the files on the device and get access to them.

However, it appears that DJI stripped most of this functionality out — in a pretty hacky way, from what I can tell, leaving half-complete stuff and stubs. It has instead implemented its own hacky interface that you can use to control the camera, though it still does RTSP streaming to the phone application.

In trying to work with the camera, I learned the following things:

– The camera sets up a wifi access point, using 192.168.1.1 as its IP, and serves DHCP addresses starting with 192.168.1.100 to clients. It serves an HTTP server on port 80. (This is different than the GoPro which defaults to a different IP setup and runs on port 8080.)
– Most interaction with the camera is through posting XML to /CGI/ calls.
– Most HTTP calls/functions in the camera require a “Cookie” called “Session” with a value in it. If they don’t have it, they simply return an Error code.
– To get a session, you post to “/CGI/CameraLogin?Device=Mobile&Stream=RTP_H264_WQVGA” with a ‘password=’ (blank) form value. This returns a set of data in XML (‘Login OKFC40_S7NO4621102U40_NO_CODECA06‘) and also has a Set-Cookie header: ‘ Set-Cookie: Session=750997680’. However, the Set-Cookie header is preceded by a space, and therefore is not recognized by regular browsers; it seems the FC40 app knows to look for this, but this prevents trivial use of forms in browsers to replicate the functionality. (No idea if this is intentional or not.)
– There is a CGI for RemoteControl
– There is a CGI for status.
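
Because of that stray leading space, a standard HTTP client may fold the cookie line into the previous header or ignore it. As a minimal sketch (not the FC40 app’s actual logic, which is not documented here), one way to recover the session value is to scan the raw response head directly. The function name and the sample response are made up for illustration; only the ‘ Set-Cookie: Session=750997680’ line reflects the behaviour described above.

```python
import re
from typing import Optional

# Accept optional leading whitespace before the header name, since the
# camera is described as sending " Set-Cookie: Session=...".
SESSION_RE = re.compile(
    rb"^[ \t]*Set-Cookie:[ \t]*Session=([^;\r\n]+)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_session(raw_response: bytes) -> Optional[str]:
    """Return the Session cookie value from raw HTTP response bytes, or None."""
    # Only look at the header block, not the XML body.
    head, _, _ = raw_response.partition(b"\r\n\r\n")
    match = SESSION_RE.search(head)
    if match is None:
        return None
    return match.group(1).decode("ascii").strip()


# Hypothetical response, shaped like the login reply described above.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/xml\r\n"
    b" Set-Cookie: Session=750997680\r\n"
    b"\r\n"
    b"<xml/>"
)
print(extract_session(sample))
```

The value returned here would then be sent back as a “Session” cookie on later /CGI/ calls.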

I’ve put together some documentation for these and put them into a fc40 camera github repo. I’ll probably end up expanding it a bit more.

The most interesting thing, of course, is the live streaming of video. After poking it a bit, I did find out that it does RTSP streaming, and I was able to discover that VLC does, in theory, play RTSP streams. However, although the setup mostly works, when VLC tries to PLAY a stream, the camera immediately closes the connection. Having inspected packets, I believe the only difference is that the User-Agent is VLC instead of being blank/not included, which is apparently a not-unheard-of trick for security-by-obscurity for RTSP streams. I have not yet gotten to the point where I can test this theory, but I’m working towards it.
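
One way to test the User-Agent theory would be to hand-build RTSP requests that leave the header out and send them over a plain TCP socket. The sketch below only builds the request bytes. The helper name is invented, and the stream URL is a placeholder, since the camera’s real RTSP path is not given here.

```python
from typing import Dict, Optional


def build_rtsp_request(method: str, url: str, cseq: int,
                       headers: Optional[Dict[str, str]] = None) -> bytes:
    """Build a bare RTSP/1.0 request that never includes a User-Agent header."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        if name.lower() == "user-agent":
            continue  # deliberately dropped to test the theory
        lines.append(f"{name}: {value}")
    # RTSP, like HTTP, ends the header block with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Placeholder URL: only the camera's IP address comes from the notes above.
request = build_rtsp_request("OPTIONS", "rtsp://192.168.1.1/", 1,
                             {"User-Agent": "VLC"})
print(request.decode("ascii"))
```

If the camera keeps the connection open for a PLAY built this way but closes it when a User-Agent is added, that would support the theory.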

I wish that DJI was a bit more open about these things, but maybe the reason they’re not is because they took the hardware from some OEM who didn’t want people to get this instead of a more expensive model… or instead of upgrading to a GoPro, which does have a pretty open setup out of the box where all these things ‘just work’. In any case, I figured I’d share what I learned for others to play with.

VSI Curl Support

Posted in GDAL/OGR, Locality and Space, Software on October 4th, 2010 at 06:14:47

In a conversation at FOSS4G, Schuyler and I sat down with Frank Warmerdam to chat about the possibility of extending GDAL to be able to work more cleanly when talking to files over HTTP. After some brief consideration, he agreed to do some initial work on getting a libcurl-based VSIL wrapper built.

VSIL is an API inside of GDAL that essentially allows you to treat files accessed through different streaming protocols as if they were normal files; it is used to support accessing content inside zipped containers, and other similar data access patterns.

GDAL’s blocking strategy (that is, the knowledge of how to read sub-blocks of files in order to obtain the information it needs, rather than needing to read a larger part of the file) is designed to limit the amount of disk I/O that’s needed for rendering large rasters. A properly set up raster can significantly limit the amount of data that needs to be read, which helps improve tile rendering time. This type of access would also allow you to fetch metadata about remote images without the need to access an entire (possibly large) image.

As a result, we thought it might be possible to use HTTP-based access to images through this mechanism, for metadata access and other similar information over the web. Frank thought it was a reasonable idea, though he was concerned about performance. Upon returning from FOSS4G, Frank mentioned in #gdal that he was planning on writing such a thing, and Even popped up mentioning ‘Oh, right, I already wrote that, I just had it sitting around.’

When Schuyler dropped by yesterday, he mentioned that he hadn’t heard anything from Frank on the topic, but I knew that I’d seen something go by in SVN, and said so. We looked it up and found that the support had been checked into trunk, and we both sat down and built a version of GDAL locally with curl support — and were happy to find out that the /vsicurl/ driver works great!

Using the Range: header to do partial downloads, and parsing some directory listing style pages for READDIR support to find out what files are available, the libcurl VSIL support means that I can easily get the metadata about a 1.2GB TIF file with only 64kb of data transferred; with a properly overlaid file, I can pull a 200 by 200 overview of the same file while using only 800kb of data transfer.
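
The partial-download trick itself is ordinary HTTP. As a rough illustration (this is not GDAL’s code, and the URL and helper name are placeholders), a request for just the first 64kb of a file can be built like this:

```python
import urllib.request


def ranged_request(url: str, offset: int, length: int) -> urllib.request.Request:
    """Build a GET request for `length` bytes starting at byte `offset`."""
    last_byte = offset + length - 1  # Range end positions are inclusive
    return urllib.request.Request(
        url, headers={"Range": f"bytes={offset}-{last_byte}"}
    )


# Placeholder URL; any server that supports byte ranges would do.
req = ranged_request("http://example.com/big.tif", 0, 64 * 1024)
print(req.get_header("Range"))

# To actually fetch the bytes: urllib.request.urlopen(req).read()
# A server that honours the header answers with status 206 (Partial Content).
```

A reader that knows where the TIFF header and directory live can issue a handful of requests like this instead of downloading the whole file.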

People sometimes talk about “RESTful” services on the web, and I’ll admit that there’s a lot to that that I don’t really understand. I’ll admit that the tiff format is not designed to have HTTP ‘links’ to each pixel — but I think the fact that by fetching a small set of header information, GDAL is then able to find out where the metadata is, and request only that data, saving (in this case) more than a gigabyte of network bandwidth… that’s pretty frickin’ cool.

Many thanks to EvenR for his initial work on this, and to Frank for helping get it checked into GDAL.

I’ll leave with the following demonstration — showing GDAL’s ability to grab an overview of a 22000px, 1.2GB tiff file in only 12 seconds over the internet:

$ time ./apps/gdal_translate -outsize 200 200  /vsicurl/http://haiticrisismap.org/data/processed/google/21/ov/22000px.tif 200.tif
Input file size is 22586, 10000
0...10...20...30...40...50...60...70...80...90...100 - done.

real	0m11.992s
user	0m0.052s
sys	0m0.128s

(Oh, and what does `time` say if you run it on localhost? From the HaitiCrisisMap server:

real	0m0.671s
user	0m0.260s
sys	0m0.048s

)

Of course, none of this counts as a real performance test, but to give a sense of the relative performance for a single simple operation:

$ time ./apps/gdal_translate -outsize 2000 2000 \
     /vsicurl/http://haiticrisismap.org/data/processed/google/21/ov/22000px.tif 2000.tif
Input file size is 22586, 10000
0...10...20...30...40...50...60...70...80...90...100 - done.

real	0m1.851s
user	0m0.556s
sys	0m0.272s

$ time ./apps/gdal_translate -outsize 2000 2000 \
    /geo/haiti/data/processed/google/21/ov/22000px.tif 2000.tif
Input file size is 22586, 10000
0...10...20...30...40...50...60...70...80...90...100 - done.

real	0m1.452s
user	0m0.508s
sys	0m0.124s

That’s right, in this particular case, the difference between doing it via HTTP and doing it via the local filesystem is only .4s — less than 30% overhead, which is (in my personal opinion) pretty nice.
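
For anyone who wants to check the arithmetic, here is the calculation using the two “real” times from the runs above (the variable names are just for illustration):

```python
remote_seconds = 1.851  # "real" time reading via /vsicurl/
local_seconds = 1.452   # "real" time reading from the local filesystem

difference = remote_seconds - local_seconds
overhead = difference / local_seconds

print(f"difference: {difference:.3f}s, overhead: {overhead:.1%}")
```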

Sometimes, I love technology.

New Mailing List: tiling; Feedback On WMTS

Posted in FOSS4G 2010, OSGeo, TileCache on September 9th, 2010 at 03:07:15

In the past, we discussed tiling on an EOGEO list. In the meantime, OSGeo has grown up, EOGEO has moved on, and it seems that there isn’t a very good home for future tiling discussions.

As a result, I have added a tiling list to the OSGeo mailing list server.

Tiling List @ OSGeo

Projects that I hope to see people joining from: TileCache, Tirex, Mapproxy, GWC, others, etc.

This list will be discussing general tiling ideas — how to cache tiles, how to manage caches, how to work with limited caches, where to put your tiles, things like S3, etc. etc. If you are at all interested in tiling — not at the level of a specific application, but in general — please join the list.

Additionally, if you are interested in discussing providing feedback to the OGC regarding the WMTS spec — especially if you are an implementer, but also if you are a user — I would encourage you to join the standards list at OSGeo:

http://lists.osgeo.org/mailman/listinfo/standards

Several people have expressed interest in coordinating a response to the OGC regarding the spec, and we would like to work together on this list to coordinate.

Are you more generative than consumptive in your field?

Posted in Locality and Space, OpenLayers, Social, Software on May 26th, 2009 at 10:57:47

Anselm just posted what appears to be a random thought on twitter:

Are you more generative than consumptive in your particular field? … Create more than you consume?

In open source, I often rephrase this question as “Are you a source, or a sink?”

There are many people in the community who contribute more than they consume. Organizations, individuals, etc. There are also many sinks in the community (since entropy is ever increasing, this seems a foregone conclusion), and one of the key things that causes an open source project to succeed or fail is the number of sources relative to sinks.

I personally try very hard to be a source in all that I do, rather than a sink. One way that I do this is that I try very hard to always follow up any question I ask (for example, on a mailing list, on an IRC channel, or what have you) with at least two answers of my own. This means that, for example, when I hopped into #django to ask about best practices for packaging apps, I stuck around and helped out two more people: one who was asking a question about PIL installation, and one about setting up foreign keys to different models.

Now, in the end, my answers were simple — no one with even a basic knowledge of Django would have had problems answering them. But by sticking around and answering them, I was able to make up to some extent for the time/energy that I consumed from someone more familiar with the project, by saving them from needing to answer as well.

It is often the case that users trying to get help will claim that once they get help, they will ‘contribute back’ to the community by, for example, writing documentation. This never happens. Though there are exceptions to every rule, it is almost always the case that users who ask a question, prefacing it with “I will document this for other users”, never follow through on the latter half. The exceptions to this (or rather, the alternate cases) are cases where a user has already invested significant research, and likely already started writing documentation. Unless the process is started before the problem is solved, it is almost universally true, in my experience, that the user will act as a sink, taking the information from the source and disappearing with it.

I work very hard on supporting a number of open source projects that I work on. Though my involvement lately has been more hands off — by doing things like writing documentation instead of answering questions, acting as a release manager instead of fixing bugs, and so on — I work very hard to keep the karmic balance of my work on the positive side. I believe that this pays off in the long run — I have somewhat of a reputation of being helpful, which is beneficial to me since it means I’m more likely to receive help when I need it. I also work to keep karmic balance high on the part of the organization I work for, since many of the other people in the organization are less able to keep karmic balance high.

These rules don’t apply solely to open source — I have the same karmic balance issues going on in my work inside of MetaCarta — but I maintain the same attitude there. Coming in with the idea that it is okay to be a sink can lead to a nasty precedent. In the end, I think that everyone loses. Sinks — both in open source and other karmic ventures — will eventually use up the karma they start with, and be left out to dry. It is the case for more than one person that they have extended their information seeking without contributing back beyond the point where I am willing to continue to support their information entropy.

I joke sometimes about giving out “crschmidt karma points”. Though I don’t have an actual system in this regard, I do quite clearly delineate between constant sinks, and regular sources, and grey areas in-between. I try to stay on the source side, and I encourage anyone else to do the same — even if it’s only by answering easy questions on the mailing list, or doing a bit more research on your own. Expecting other people to fix your problems, in open source or otherwise, is simply a false economy of help, since in the end, it simply doesn’t work.

Toronto Code Sprint: Day 2

Posted in Locality and Space, Mapserver, OSGeo, PostGIS, Toronto Code Sprint on March 8th, 2009 at 22:44:32

Day 2 of the code sprint seemed to be much more productive. With much of the planning done yesterday, today groups were able to sit down and get to work.

Today, I accomplished two significant tasks:

  • Setting up the new OSGeo Gallery, which is set to act as a repository for demos of OSGeo software users in the same way that the OpenLayers Gallery already does for OpenLayers. We’ve even added the first example.
  • TMS Minidriver support for the GDAL WMS Driver: Sitting down and hacking out a way to access OSM tiles as a GDAL datasource, Schuyler and I built something which is reasonably simple/small — an 18k patch including examples and docs — but allows for a significant change in the ability to read tiles from existing tileset datasources on the web.
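
For context on what “reading tiles from a tileset” involves, here is a small sketch of the usual web-mercator tile arithmetic used by OSM-style tilesets. It is not taken from the GDAL patch; the function names are made up, and the formula is the standard one for spherical-mercator tiles.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Return the OSM-style (x, y) tile indices containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the far edge still land in a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def flip_y(y: int, zoom: int) -> int:
    """Convert between OSM rows (counted from the top) and TMS rows (from the bottom)."""
    return (2 ** zoom) - 1 - y


x, y = lonlat_to_tile(0.0, 0.0, 1)
print(x, y, flip_y(y, 1))
```

The row flip matters because the TMS spec numbers rows from the bottom of the map, while OSM tile URLs number them from the top.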

Other things happening at the sprint today were more WKT Raster discussions, liblas hacking, and single-pass MapServer discussions, as well as some profiling of MapServer performance with help from Paul and Shark. Thanks to the participation of the LizardTech folks, I think there will also be some performance testing done with MrSID rendering within MapServer, and there was, as always, more of the “proj strings are expensive to look up!” discussion.

Other than that, it was a quiet day; lots of work getting done, but not much excitement in the ranks.

We then had a great dinner at Baton Rouge, and made it home.

This evening, I’ve been doing a bit more hacking, opening a GDAL Trac ticket for an issue Schuyler bumped into with the sqlite driver, and pondering the plan for OpenLayers tomorrow.

As before, a special thanks to the conference sponsors for today: Coordinate Solutions via David Lowther, and the lovely folks at SJ Geophysics Ltd.. Thanks for helping make this thing happen! I can guarantee that neither of those GDAL tickets would have happened without this time.

Toronto Code Sprint: Day 1

Posted in Mapserver, OSGeo, PostGIS, Toronto Code Sprint on March 8th, 2009 at 07:55:43

I’m here at the OSGeo Code Sprint in Toronto, where more than 20 OSGeo hackers have gathered to work on all things OSGeo — or at least MapServer, GDAL/OGR, and PostGIS.

For those who might not know, a code sprint is an event designed to gather a number of people working on the same software together with the intention of working together to get a large amount of development work done quickly. In this case, the sprint is a meeting of the “C tribe”: Developers working on the C-based stack in OSGeo.

After some discussion yesterday, there ended up being approximately 3 groups at the sprint:

  • People targeting MapServer development
  • PostGIS developers
  • liblas developers

(As usual, I’m a floater, but primarily concentrating on OpenLayers; Schuyler will be joining me in this pursuit, and I’ve got another hacker coming Monday and Tuesday to sprint with us.)

The MapServer group was the most lively discussion group (and is also the largest). It sounded like there were three significant development discussions that were taking place: XML Mapfiles, integration of pluggable rendering backends, and performance enhancements, as well as work on documentation.

After a long discussion on the benefits/merits of XML mapfiles, it came down to one main target use case for the XML mapfile: encouraging the creation and use of more editing clients. With a format that can be easily round-tripped between client and server, you might see more editors able to really speak the same language. In order to test this hypothesis, a standard XSLT transform will be created and documented, with a tool to do the conversion; this will allow MapServer to test out the development before integrating XML mapfile support into the library itself.

I didn’t listen as closely to the pluggable renderers discussion, but I am aware that there’s a desire to improve support and reduce code duplication of various sorts, and the primary author of the AGG rendering support is here and participating in the sprint. Recently, there has been a proposal to the list to add OpenGL based rendering support to MapServer, so this is a step in that direction.

The PostGIS group was excited to have so many people in the same place at the same time, and I think came close to skipping lunch in order to get more time working together. In the end, they did go, but it seemed to be a highly productive meeting. Among their discussions was a small amount of discussion of the WKTRaster project, which is currently ongoing, I believe.

After our first day of coding, we headed to a Toronto Marlies hockey game. This was, for many of us, the first professional hockey we’d ever seen. (The Marlies are the equivalent of AAA baseball; one step below the major leagues.) The Canadians in the audience, especially Jeff McKenna, who played professional hockey for a time, helped keep the rest of us informed. The Marlies lost 6-1, sadly, but as a non-Canadian, I had to root a bit for the Hershey team. (Two fights did break out; pictures forthcoming.)

We finished up with a great dinner at East Side Mario’s.

A special thanks to our two sponsors for the day, Rich Greenwood of Greenwood Map and Steve Lehr from QPUBLIC! Our sprint was in a great place, very productive, and had great events, thanks to the support of these great people.

Looking forward to another great day.

Selenium IDE: getCurrentWindow() problems

Posted in Software on December 25th, 2008 at 21:11:08

After the past 4 hours of fighting with this, I figured it was worth me posting.

I’ve recently started using Selenium IDE in some testing. I had found that I was unable to access what seemed like perfectly normal Javascript variables, despite every other tutorial on the web that I could read indicating that it should be possible.

After a lot of messing around, I finally searched the OpenQA forum (which is apparently not adequately indexed by Google, since I did search on Google a number of times). There, I found a forum thread linking the issue to Issue 558. A change in the way that Selenium treats windows means that only ‘safe’ properties can be touched (things like .body, .title, etc.), which explains why I could test window.location, but not window.map.

It seems that there is a fix in their repository for this, but no further release has been made yet, nor do I see any immediate plans for one. Specifically, the change they made adds a getUserWindow() function, which must be used in order to get the window with the properties on it that were added in a ‘non-safe’ way — such as by Javascript.

In any case, while investigating this, I built a new version of Selenium IDE, which adds the getUserWindow function, and repackaged it. This is a change directly from the 1.0b2 XPI I installed, and implements the fix described in the forum thread linked above.

What this means is: if you are using Selenium IDE 1.0b2 and having problems with getCurrentWindow() not letting you access the properties of the window that are added by your Javascript, this XPI should help provide the getUserWindow() function you need: selenium-ide-1.02b2-mc.xpi.

This applies especially to functions like assertEval / verifyEval / getEval and their partners.

In order to take advantage of this, you must change any instances of ‘window’ or ‘this.browserbot.getCurrentWindow()’ to ‘this.browserbot.getUserWindow()’ where they need to access user-set properties. This simply acts as a transition tool for people needing 1.0b2 support and unable to wait for another release for this function to become available.
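
If you have a lot of saved expressions to update, the swap can be scripted. This is only a sketch: the helper name and the sample expression (including ‘map.getZoom()’) are invented, and it handles only the explicit ‘this.browserbot.getCurrentWindow()’ form, since a bare ‘window’ can’t be safely replaced by blind text substitution.

```python
OLD_CALL = "this.browserbot.getCurrentWindow()"
NEW_CALL = "this.browserbot.getUserWindow()"


def migrate_expression(expression: str) -> str:
    """Swap the 'safe' window accessor for the one exposing user-set properties."""
    return expression.replace(OLD_CALL, NEW_CALL)


# Hypothetical expression of the kind passed to assertEval / getEval.
before = "this.browserbot.getCurrentWindow().map.getZoom()"
print(migrate_expression(before))
```

Expressions that only touch ‘safe’ properties such as the window’s location would keep working either way, so they can be left alone.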

Jython + TileCache/FeatureServer: It Just Works

Posted in default, ESRI, FeatreServer, FeatureServer, spatialreference.org, TileCache on December 14th, 2008 at 10:37:04

Earlier today, I tried Jython for the first time, because I’m doing some work that may involve interactions with Java libraries in the near future. Jython, which I’ve always avoided in the past due to an irrational fear of Java, is “an implementation of the high-level, dynamic, object-oriented language Python written in 100% Pure Java, and seamlessly integrated with the Java platform.” (I love projects that have great one-liners that I can copy paste.)

My goal for Jython was to do some work with the GeoTools EPSG registry code related to SpatialReference.org. Sadly, I didn’t get that working, but in the process, I learned that Jython now has a beta version which is up to Python 2.5 — much newer than the 2.2 that had previously been available.

With that in hand, I decided to see if I could get some of my other Python projects running under Jython. I’m the maintainer for both TileCache and FeatureServer — two pure Python projects. Theoretically, these projects should both work trivially under Jython, but I’ve always had my doubts/fears about this being the case. However, it turns out that my fears here are entirely unfounded.

I downloaded the FeatureServer ‘full’ release from featureserver.org: this includes the supporting libraries needed to get a basic FeatureServer up and running. I then tried to run the FeatureServer local HTTP server… and it worked out of the box. I was able to load the layer, save data to it, query it, etc. with no problems whatsoever. Java has support for the DBM driver that FeatureServer uses by default, so out of the box, I was able to use FeatureServer with Jython without problems.

Next came TileCache. TileCache was originally built to support Python all the way back to 2.2, so I wasn’t expecting many problems. Getting it running turned out to be almost as easy: the only code modification that was needed was a minor change to the disk cache, because Jython doesn’t seem to support the ‘umask’ method. Once I changed that (now checked into SVN), Jython worked just as well with TileCache as it did with FeatureServer.
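
The general pattern for this kind of portability fix is to treat the call as optional. The sketch below is not the actual TileCache change; the helper name is made up, and which exception a given interpreter raises for a missing ‘umask’ is an assumption, so it guards against a few likely ones.

```python
import os
from typing import Optional


def safe_umask(mask: int) -> Optional[int]:
    """Set the process umask if the platform supports it.

    Returns the previous mask, or None when umask is unavailable.
    """
    umask = getattr(os, "umask", None)
    if umask is None:
        return None  # interpreter does not expose os.umask at all
    try:
        return umask(mask)
    except (NotImplementedError, OSError):
        return None  # exposed but not usable on this platform


previous = safe_umask(0o022)
if previous is not None:
    safe_umask(previous)  # restore the original mask
print(previous is None or isinstance(previous, int))
```

Code that creates cache directories can then carry on with default permissions when the call isn’t available, instead of failing outright.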

Clearly, there are some things which are less trivial. The reason that these libraries were so easy to use is that they were designed to be low-dependency: TileCache and FeatureServer default paths are both entirely free of compiled code. Using something like, for example, GDAL Layers in TileCache, would be much more difficult (if it’s even possible).

However, this presents some interesting capabilities I had not previously thought of.

For FeatureServer, this means that it may be possible to write a DataSource which accesses SDE using the ArcSDE Java API, ESRI’s supported method for accessing their SDE databases. One of the purported “holy grails” of the GIS world is RESTful access to SDE databases via lightweight servers — Jython may provide a path to that, if someone is interested in it. (It may be that this has become a moot point with the release of the ESRI 9.3 REST API — I’m not really sure.) This may be a waste of time, but the fact that it *can* be done is interesting to me. Edit: Howard points out that ArcSDE read/write support exists in OGR 1.6, so this is a moot point; you can simply use OGR to do this without involving Jython/Java.

I think this might also speak to a possibility of having better answers available for people who want to use things like FeatureServer from Java platforms (though I don’t know enough about Jython to be sure): the typical answer of “use GeoServer” is great, but to be able to provide something a bit more friendly would be interesting. Thankfully, the Java world is largely catching up to the advances made in TileCache/FeatureServer, so this is also less urgent than it has been in the past.

In the end, this was likely simply an interesting experiment. However, it’s nice to know that the capabilities to do things like this within Jython are improving, and that Jython continues to catch up with newer versions of Python. It is disappointing that 2.2 is still the ‘current’ release, but seeing a 2.5 beta available is an exciting development.

As I said, the current version of FeatureServer works out of the box with Jython, and I’ll be doing a TileCache release shortly that will work with Jython out of the box as well. It’s neat to see more possibilities for using these libraries I’ve spent so much time on.

DjangoGraphviz: Visualizing Django Models

Posted in Django, Software on June 25th, 2008 at 16:47:38

Earlier today, a coworker was trying to work out a diagram for the models in the Django app that I’ve been working on internally, to visualize the relationships between them. I did a quick Google, and found a reference to DjangoGraphviz, a super-handy little chunk of code.

DjangoGraphviz did exactly what I needed it to, with no problems at all. (My only complaint is that it requires the DJANGO_SETTINGS_MODULE to be defined in order to get the --help output, which is somewhat unintuitive.) The software quickly generated a .dot file which I was able to turn into a lovely PDF, and print. I’ve now got a copy on the desk of each of the developers using the codebase, and I think it’ll prove a lovely piece of reference.
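
For the .dot-to-PDF step, one option is Graphviz’s ‘dot’ command with its PDF output format. The sketch below wraps that call; the file names and helper names are placeholders, and it assumes a Graphviz build with PDF output is installed and on the PATH.

```python
import subprocess
from typing import List


def dot_to_pdf_command(dot_path: str, pdf_path: str) -> List[str]:
    """Build the Graphviz command line that renders a .dot file to PDF."""
    return ["dot", "-Tpdf", dot_path, "-o", pdf_path]


def render_pdf(dot_path: str, pdf_path: str) -> None:
    """Run Graphviz; raises CalledProcessError if dot exits with an error."""
    subprocess.run(dot_to_pdf_command(dot_path, pdf_path), check=True)


# Placeholder file names; call render_pdf("models.dot", "models.pdf") to render.
command = dot_to_pdf_command("models.dot", "models.pdf")
print(" ".join(command))
```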

So, if you want a quick visualization of your Django models, and you can install graphviz, I highly recommend DjangoGraphviz to do it.

(Note that the wiki page itself recommends a couple other things more ‘built in’ to Django, which are new to me: I didn’t try these, I just stuck with DjangoGraphviz, which did what I wanted.)