Thursday, December 10, 2015

exactEARTH on Satellite AIS

As I listen to this video, I'm amused by the claim that satellite AIS has been around for 5 years.  Really?  I first got data from SpaceQuest in 2009 and I know that there were receivers up before 2009.  The US spooks had one up pre-2007.  I am still bothered by the ABSEA Class B-ish thing, as they don't publish a spec that tells me anything useful about what it really is.  And yes, I've talked to the folks doing this, and I still don't know what it really is at a technical level.

The discussion of the other sorts of data is interesting.  Finally... IMO 289/290 came out back in 2010. exactEARTH was not at the RTCM SC121 meetings in 2007-2009, despite the meetings being open to anyone in the community.  It's great that they are talking about icebergs and fish catch, but it's not new.  There is also the Voluntary Observing Ship (VOS) program, which demonstrates how this works for sending back super important data about our world.

The preso does have some nice info about satellite AIS.  Just remember that ERMA had Orbcomm satellite AIS data integrated for the Deepwater Horizon oil spill response back in 2010, so this stuff isn't that new.  But it is getting easier and better.

Wednesday, December 9, 2015

The joys of being on the internet

The endless brute force attacks that make up the intertubes...  so kind of them to rate limit the attempts.

Jan 25 08:08:55 tide3 sshd[8192]: Failed password for invalid user alex from port 50959 ssh2
Jan 25 08:10:52 tide3 sshd[8195]: Invalid user arbab from
Jan 25 08:10:52 tide3 sshd[8195]: pam_unix(sshd:auth): check pass; user unknown
Jan 25 08:10:52 tide3 sshd[8195]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= 
Jan 25 08:10:54 tide3 sshd[8195]: Failed password for invalid user arbab from port 42321 ssh2
Jan 25 08:12:52 tide3 sshd[8198]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=  user=backup
Jan 25 08:12:54 tide3 sshd[8198]: Failed password for backup from port 52016 ssh2
Jan 25 08:14:51 tide3 sshd[8201]: Invalid user bob from
Jan 25 08:14:51 tide3 sshd[8201]: pam_unix(sshd:auth): check pass; user unknown
Jan 25 08:14:51 tide3 sshd[8201]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= 
Jan 25 08:14:53 tide3 sshd[8201]: Failed password for invalid user bob from port 38871 ssh2
Jan 25 08:16:52 tide3 sshd[8204]: Invalid user christian from
Jan 25 08:16:52 tide3 sshd[8204]: pam_unix(sshd:auth): check pass; user unknown
Jan 25 08:16:52 tide3 sshd[8204]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= 
Jan 25 08:16:54 tide3 sshd[8204]: Failed password for invalid user christian from port 36075 ssh2
Jan 25 08:17:01 tide3 CRON[8207]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 25 08:17:01 tide3 CRON[8207]: pam_unix(cron:session): session closed for user root
Jan 25 08:18:51 tide3 sshd[8210]: Invalid user cisco from
Jan 25 08:18:51 tide3 sshd[8210]: pam_unix(sshd:auth): check pass; user unknown
Jan 25 08:18:51 tide3 sshd[8210]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= 
Jan 25 08:18:53 tide3 sshd[8210]: Failed password for invalid user cisco from port 58544 ssh2
Jan 25 08:20:51 tide3 sshd[8213]: Invalid user cusadmin from
Jan 25 08:20:51 tide3 sshd[8213]: pam_unix(sshd:auth): check pass; user unknown
Jan 25 08:20:51 tide3 sshd[8213]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= 
Jan 25 08:20:53 tide3 sshd[8213]: Failed password for invalid user cusadmin from port 53667 ssh2
Jan 25 08:22:51 tide3 sshd[8216]: Invalid user david from
Jan 25 08:22:51 tide3 sshd[8216]: pam_unix(sshd:auth): check pass; user unknown
Jan 25 08:22:51 tide3 sshd[8216]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=
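A quick sketch of how I might tally the guessed usernames out of logs like the above (this assumes the stock OpenSSH "Invalid user NAME from ..." wording; the source IPs in my excerpt were scrubbed):

```python
import re

# Matches the "Invalid user NAME from ..." lines that sshd emits for
# guessed usernames that don't exist on the box.
INVALID_USER_RE = re.compile(r'Invalid user (\S+) from')

def invalid_user_counts(log_lines):
    """Tally how often each nonexistent username was tried."""
    counts = {}
    for line in log_lines:
        match =
        if match:
            name =
            counts[name] = counts.get(name, 0) + 1
    return counts
```
<test>
lines = [
    'Jan 25 08:10:52 tide3 sshd[8195]: Invalid user arbab from',
    'Jan 25 08:14:51 tide3 sshd[8201]: Invalid user bob from',
    'Jan 25 08:14:51 tide3 sshd[8201]: pam_unix(sshd:auth): check pass; user unknown',
assert invalid_user_counts(lines) == {'arbab': 1, 'bob': 1}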

Monday, November 30, 2015

Partially back online

Using Google's App Engine (GAE) static content serving from the Python SDK, I've gotten partially back online.  There is lots more work to do, but at least most of the articles are back.

Wednesday, October 7, 2015

Python ctypes and statically linked libraries

I had no idea this would work.
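The trick, as best I understand it, is that passing None to ctypes.CDLL is the dlopen(NULL) idiom: it loads the current process itself, so any symbol already linked into the interpreter binary is reachable, including functions from statically linked libraries (Unix-like systems only):

```python
import ctypes

# CDLL(None) hands back the running process itself, so any symbol
# already linked into the interpreter binary is callable -- including
# functions that were statically linked in.
whole_process = ctypes.CDLL(None)

# libc's abs() is reachable through the interpreter binary.
print(whole_process.abs(-42))  # 42
```
<test>
assert whole_process.abs(-42) == 42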


Marine science / tech note taking

RVTEC has a discussion going on about note taking techniques.

My techniques...  hurray for no real quota on storing video and pictures on the web for free.  The downside is the whole high speed internet connection requirement :(

  • camera phone videos + youtube. I've got > 800 videos online; most by count are not public, but most by time are public (yes, I have cat videos in there).  Plus a web cam with video recording where I can grab clips from the last couple of days, e.g. core sampling and dock context
  • lab notebook and pen (but that doesn't back itself up) + camera phone + flickr & google photos... maybe someday machine learning will let me ask for all photos of box corers in my collection.  Today, I'll have to stick to asking which photos include my cat
  • my blog (I'll have it back online soon) and my backup blog ... +1 for the prior let the search engines sort it out comment
  • inkscape & gimp,  emacs org-mode, google drawings (no more investing my time into Adobe craziness)
  • git + (gdrive & dropbox) for private tracking of code bits and org-mode notes.  github for public stuff
Not yet really enough in the toolbox:

  • IPython/Jupyter notebooks + github + nbviewer.  E.g., Camille Cobb made this when she was working for me: Argo floats and BigQuery.  Matplotlib basemap is great for things that need location
  • networkx, dia and graphviz for work flows
  • docker containers for capturing processing tools in a run-anywhere mode

Grad students have it so easy these days... they can see what they are getting into ahead of time:

Monday, October 5, 2015

UN Location codes - Making CSV difficult

The UN LOCODE CSV was done without even a little thought... how about a README, some column labels, and normal machine readable decimal degrees? You shouldn't have to read anything to be able to load a *single* CSV file into a GIS (QGIS, ArcGIS, etc.).

 So San Francisco, US doesn't even get Long/Lat?

grep US *.csv | grep 'San Francisco'
2015-1 UNLOCODE CodeListPart3.csv:,"US","EMB","Embarcadero/San Francisco","Embarcadero/San Francisco","CA","---4----","AI","0401",,,
2015-1 UNLOCODE CodeListPart3.csv:,"US","SFO","San Francisco","San Francisco","CA","1--45---","AI","9601",,,
2015-1 UNLOCODE CodeListPart3.csv:,"US","SYF","South San Francisco","South San Francisco","CA","--3-----","RQ","9307",,,
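For reference, when the coordinate column is populated it packs degrees and minutes into strings like "3752N 12225W".  A minimal converter sketch (the function name is mine, not anything official):

```python
import re

def unlocode_to_decimal(coord):
    """Convert a UN/LOCODE coordinate string like '3752N 12225W'
    (ddmm hemisphere, dddmm hemisphere) to (lat, lon) in decimal
    degrees.  Returns None if the string doesn't parse."""
    m = re.match(r'(\d{2})(\d{2})([NS])\s+(\d{3})(\d{2})([EW])$', coord)
    if not m:
        return None
    lat = int(m.group(1)) + int(m.group(2)) / 60.0
    if m.group(3) == 'S':
        lat = -lat
    lon = int(m.group(4)) + int(m.group(5)) / 60.0
    if m.group(6) == 'W':
        lon = -lon
    return lat, lon
```
<test>
lat, lon = unlocode_to_decimal('3752N 12225W')
assert abs(lat - (37 + 52 / 60.0)) < 1e-9
assert abs(lon + (122 + 25 / 60.0)) < 1e-9
assert unlocode_to_decimal('not a coordinate') is None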

Thursday, September 17, 2015

It isn't quite as cool as kodos, but it is definitely handy for getting a regular expression correct in Python.

That made adding FSR support to libais pretty easy.

Monday, September 7, 2015

Spelling, grammar, and other mistakes in latex or org-mode writing

Sigh.  I just found yet another typo in a published journal paper this evening.  "the there" -> "there" in my 2006 Santa Barbara paper.  I had started to create scripts to check for these sorts of issues, but it doesn't look like that was good enough.  This is one of the few ways that Microsoft Word, Apple Pages, and Google Docs are better than text based tools.  But it's nowhere near enough to get me to switch.  I'm not giving up emacs and org-mode for the majority of my writing.
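A doubled-word check is the sort of thing those scripts could do (a toy sketch, not what I actually had, and it wouldn't have caught "the there"):

```python
import re

# \b(\w+)\s+\1\b matches a word immediately repeated, even across a
# line break -- a classic typo in LaTeX/org-mode source.
DOUBLED = re.compile(r'\b(\w+)\s+\1\b', re.IGNORECASE)

def doubled_words(text):
    """Return each accidentally repeated word found in text."""
    return [ for m in DOUBLED.finditer(text)]
```
<test>
assert doubled_words('over the the hill') == ['the']
assert doubled_words('nothing repeated in this sentence') == []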

Sunday, August 30, 2015

Done with Generic Sensor Format

I think I am at the point where it is time to set aside the work on the sonar Generic Sensor Format (GSF) that I've been doing.  My personal goal with this was to demonstrate what direction(s) I think GSF should go in.  I think I've done that by showing:

  • Adding unit tests to the old C code and continuous integration testing
  • Auditing the C code with tools like ASAN, MSAN, Coverity, etc.
  • Creating the beginnings of a modern C++ library that is designed with testing from the start
  • Starting a python utility library to facilitate creating tests for the C and C++ code
  • Identifying files that would make the beginnings of a good test suite
  • Showing that history comments belong in the revision history and changelog file, not the actual source code
  • Starting a list of issues with the code and showing solutions to some of them
  • Demonstrating payback to Leidos (formerly SAIC) for open sourcing GSF
At this point, I have put in quite a bit of time, squashed a lot of bugs, and set the stage for what I think the direction should be.  However, looking at GSF in depth, it is clear that this is not a technology that the community should rely on.  While the idea of GSF is great, it's fundamentally broken in many of the same ways as AIS that ESR and I identified in our toils paper.  There are so many better technologies that could help build a format that is actually robust and capable of long term support of the community.  I appreciate the people who helped get me here: Evan Robertson, who went through the NGDC catalog to find files from older versions of GSF, and Shannon Bryne at Leidos, for the open sourcing process that ran from 2008-2014.

There has not been any feedback from the community and no uptake of any of the code.  With my goals met, it's time to hang it up.  My hope is that eventually a group of people will pick up GSF where I left off, finish the fixes to the old C code, and finish writing gsfxx and gsf-py.  And beyond that, I hope yet more people will work on the same process for MB-System.

An incognito Google search shows that my github repo for GSF appears high on the list:


Today I listened to FLOSS Weekly Episode 350 on the Network Time Protocol (NTP) while Lincoln was passed out on me for his afternoon nap.  I still have this massive frustration with time.  I don't feel like I know enough to be able to write software that reasonably logs time for scientific applications: I don't understand how to really specify time correctly, I don't know the issues that I should be aware of, and I really don't know how to specify the error that is involved.  Having a properly set up NTP network configuration on a device is a great start, but that really isn't very good on its own.  Most people have 1 to 3 hard coded NTP servers, which is a pretty crummy initial setup.  And to top it off, after the recent security issues with ntp, my two primary machines won't let me run "ntpq -p -n" to see how ntp is doing.  It seems like any good text on geophysical data analysis should have precision timing near the beginning of the discussion.  But if I were to write such a text, I know enough to know that I couldn't do a decent job of writing that section.  Very frustrating.  Listening to FLOSS Weekly, there were a bunch of topics that I don't remember ever hearing about before.

International Atomic Time (TAI) - I think this is what the USCG RDC meant when they said that UTC was 32 seconds off from GMT back when I visited them in 2007 (32 is from memory.  YMMV).

General Timestamp API Project - I should really look into what this project says before saying anything more about how time should be logged.  Should it be in TAI?

DCF77 - The German radio broadcast of time, which is the same basic concept as the WWV broadcasts of time in the US, NPL from England, and TDF from France.  There is apparently tons of time information out there.

I know just a little about Precision Time Protocol (PTP V2 / IEEE 1588-2008), but not enough to be useful and have never had a chance to try it.

It would be a great project to build an open data logging computer that could use NTP network time if nothing else worked well, GNSS/GPS time(s), PTP, radio time signals, and/or anything else that was available, and that was designed to record data coming into the device from sensors as accurately as possible.

BTW, I took a quick peek at the NTP github repo and was sad to see that the NTP bug list is hidden behind a login in a bugzilla database.  Not very accessible.  I don't see a continuous integration testing setup.  And to top it off, changes show up as from "unknown."

Saturday, August 8, 2015

And never mind

I'm pretty much giving up on trying to blog this month.

Monday, August 3, 2015

Python Testing Cookbook review

I had high hopes for this book.  It is well written and I very much appreciate the detail and dedication that went into it.  I am only 4 chapters in, but I already have to say that this book would have been great 3-4 years ago.  The tools have improved so much since then that it needs a major rework.  The biggest change: down with doctest, up with IPython notebooks.  I dislike the use of getopt, and avoiding the initial configuration needed to allow "python test" is a bummer.  I think a full mini project would be a better focus for a book like this.  And now we have awesome and easy to use continuous integration (CI) tools like Travis-CI.  It no longer matters if your full tests take 15 minutes to run.  They always get run.

I am definitely learning from this book, but I have to do a lot of modifications of methods to apply them to my world.

Sunday, August 2, 2015

Badges/Shields for software projects

These are kind of fun and sometimes useful, so here are some notes on using badges for a Python / C++ project.  While badges seem a little silly at first, they do convey key information in a very obvious way and add a splash of color to otherwise very dull README files.  I'm sure that more than a few is too many, but here are some examples that I played with this weekend.  I still need to push a new version of libais for some of these that go through PyPI to work.  Right off, I found it weird that there were 3k downloads of libais a month.  That seemed really high.  But I think that may be coming from virtualenvs being built by SkyTruth and myself.

And it's fun to be able to just make whatever random thing I want...

GAO - Maritime Critical Infrastructure Protection

I recently skimmed this GAO report on maritime security.  I have to conclude that it totally misses the mark.  But that didn't surprise me in the least.  I would have been surprised by an insightful and intelligently written document that prioritized the real issues and strategies that will make a big difference.

There is a list of threats in the document that seems totally out of line: "Table 1: Sources of Cyber-based Threats"  Their threats are:
  • Bot-network operators 
  • Business competitors 
  • Criminal groups 
  • Hackers 
  • Insiders
  • Nations
  • Phishers 
  • Spammers 
  • Spyware or malware authors 
  • Terrorists

While all of those groups are real, their categories are somewhat nonsensical.  I can't figure out what they used as criteria for the categories.  For example, a nation (e.g. North Korea) may employ or buy from an author of malicious software (e.g. the Hacking Team), but does that make two sources of threats?

And without trying to figure out the ontology issues, there are a couple of changes to that list that I would make right off.  First, my number one source of threats is software developers.  I've been working on auditing and fixing the Generic Sensor Format (GSF) that is used for sonar mapping, and I'll use that as an example.  This is C code developed by professional programmers at SAIC for the US Navy, and it has been around since the early 1990's.  I took the code (now that it is open sourced under the LGPL 2.1 license) and threw it into Coverity.  Right off the bat, I got a whole pile of coding issues, including multiple buffer overflows and all sorts of uses of unsanitized data from files.  Many of these issues have been in the code for > 25 years.  If this is in open code that has been used by many companies for ages, what is hiding in all the closed source code in the maritime industry?  There wasn't a good testing strategy for the GSF C code.  Does your ECDIS have decent automated testing?  That situation is likely way worse.  I talked to a maritime professor teaching ECDIS about 10 years ago.  His number one lesson to students was to make sure that the ECDIS computer had not stopped updating by watching the seconds of the on screen clock.  And the students were supposed to do this on every sweep of their watch (so multiple times per minute).

In addition to bad code, there is also bad design.  These are things like inventing your own encryption or not validating data or patches that go into a system.  A nice example of this is with digital charts.  The rules say that a US chart (e.g. an S-57 file) is valid only if you got it directly from NOAA or an authorized retailer.  That really doesn't mean anything.  What if someone man-in-the-middled the download or it got corrupted somewhere along the way?  A cryptographically signed file is worth more than knowing the source of the download.
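Even short of full signatures, publishing a digest alongside a chart would let software verify that the bits survived the trip.  A sketch (a digest alone detects corruption; a signature over the digest is still needed to prove who published it):

```python
import hashlib

def sha256_of(path):
    """Hash a downloaded file so it can be compared against a
    published digest.  Catches corruption in transit, but not a
    hostile server that also controls the published digest."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read in chunks so large chart files don't need to fit in RAM.
        for chunk in iter(lambda:, b''):
            digest.update(chunk)
    return digest.hexdigest()
```
<test>
import hashlib, tempfile
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b'S-57 chart bytes')
assert sha256_of( == hashlib.sha256(b'S-57 chart bytes').hexdigest()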

The next change is with hacking.  I'd call this category cracking.  And I'd split it into two groups.  The first are the smart ones doing things themselves.  They are doing real work and really discovering things.  The second group is "script kiddies".  These folks really have no idea what they are doing and just blindly apply tools that are available on the internet.  They often have no idea what they are breaking into or what the consequences are.

Another change to that list would be to add the lack of reasonable support to mariners from the world's "competent authorities".  If the Hydrographic Offices (HOs) and Coast Guards (CGs) around the world can't give reasonable guidance to software developers and mariners using the gear, then all is lost.  This boils down to people making decisions they shouldn't (e.g. that they are not trained for - electrical engineers and lawyers defining software) and/or closed specs that don't have a way to be audited by professionals.  E.g., the IEC specs for AIS gear.

August challenge to myself - blog at least 1x per day on average for the month

I used to blog at least once per day pretty much every day.  I amassed > 3000 posts using nanoblogger.  I haven't gotten around to getting set back up in the last year, so I might as well just try to use the blogger interface and get back into it.  My son is close to 1 year old and he has dominated everything this last year.  And then I lost my father when he was hit in a crosswalk by a driver who didn't see him.  I'm not so sure I will be able to pull this off, but it would be nice to get back into it.  I've had lots and lots of ideas in the last year that have never made it anywhere concrete (not even into my private logs).

I do have to say that I really think my blogger account is really really ugly, but in my typical minimalist strategy, I'm just not going to worry about it.

Friday, July 17, 2015

AIS VHF Data Exchange System (VDES) looks like yet more of the same AIS frustration

If you are going to get a new, higher bandwidth set of channels to pass marine data around, choosing the AIS encoding (binary AIS over the VDL [VHF Data Link] or NMEA VDM/VDO) is just a terrible idea.  We have so many better serialization formats (Protobuf, MessagePack, etc.).
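Part of why the AIS encoding is so unpleasant: the VDM payload is 6-bit ASCII armored, so every consumer starts by undoing this before any bit unpacking can even begin (a sketch of the standard de-armoring step):

```python
def vdm_payload_to_bits(payload):
    """Undo the 6-bit ASCII armoring of an NMEA VDM payload:
    subtract 48 from each character's code, and 8 more if the result
    is over 40, leaving one 6-bit value per character."""
    bits = []
    for ch in payload:
        val = ord(ch) - 48
        if val > 40:
            val -= 8
        bits.append(format(val, '06b'))
    return ''.join(bits)
```
<test>
assert vdm_payload_to_bits('0') == '000000'
assert vdm_payload_to_bits('1') == '000001'
assert vdm_payload_to_bits('w') == '111111'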

240 kbps or 307.2 kbps, at what frequencies?  Details are very thin.

From the RTCM comes this fairly empty statement:

SC-123 Chairman Norsworthy is very familiar with both programs. "I believe that AIS opens the door for efficient communication, navigation and operation in the maritime services," he says. "AIS is the essential core for e-navigation, the harmonized collection, integration, exchange, presentation and analysis of marine information onboard and ashore by electronic means to enhance berth to berth navigation and related services for safety and security at sea and protection of the marine environment.

"The next development is the VDES, which contains integral AIS, and is 'AIS on steroids.' VDES will be to AIS as 4G is to cellphones. VDES is designed to support all the apps needed for e-navigation and GMDSS (Global Maritime Distress and Safety System) modernization. VDES provides all the functionality, bandwidth and linkages for the efficient exchange of information with ships, shore stations and satellites."
William D. Kautz, USCG says:

VDES uses Recommendation ITU-R M.1842-1 techniques to solve the limitations of AIS data exchange.

I've got little hope for anything other than an endless need for grunt coding contracts to come out of the e-Navigation world.

Oh the acronyms and buzz phrases...

E-Navigation Strategic Implementation Plan (SIP)

Thursday, July 2, 2015

elog feature requests

elog is pretty cool and is used on the US Coast Guard ice breaker Healy.  My feature requests...
  • The ability to track the location of the server and/or user for mobile usage, e.g. by using gpsd NMEA GPS locations on servers.  The ability to export with gdal to KML/Shapefiles/etc. would allow logs to be put on maps.
  • Ability to submit from Android devices with Open Data Kit (ODK).
  • Simile Timeline view of logs
  • Export to msgpack and/or protobuf 3 for ease of writing add ons
  • Integration with flowdock
  • Integration with github for issue tracking and travis-ci for software build/test status
  • Ability to sync multiple servers that are not always connected
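On the gpsd item above, the location side is not much work, since gpsd speaks newline-delimited JSON and reports position in "TPV" messages.  A parsing sketch (wiring it to a live gpsd socket is left out):

```python
import json

def first_tpv_fix(lines):
    """Pull the first lat/lon out of a stream of gpsd JSON lines.
    gpsd reports position fixes in messages with "class": "TPV"."""
    for line in lines:
        msg = json.loads(line)
        if msg.get('class') == 'TPV' and 'lat' in msg:
            return msg['lat'], msg['lon']
    return None
```
<test>
lines = [
    '{"class":"VERSION","release":"3.9"}',
    '{"class":"TPV","mode":3,"lat":43.07,"lon":-70.71}',
assert first_tpv_fix(lines) == (43.07, -70.71)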

Monday, June 15, 2015

Generic Sensor Format (gsf) on github

I'm finally at a point where this is worth talking about.  I started with GSF version 03.06 downloaded from the Leidos website (Leidos split off from SAIC last year).  I used that to start my gsf github repo.  I've given it a really simple GNU Makefile build system and added some basic read testing using Google's gunit/gmock testing suite.  I've set up travis-ci to run the tests every time I push changes.  I even set up Coverity Scan to do static analysis.  I started doing some initial cleanup (spelling, consistently capitalizing GSF in comments), but I need to flesh out the unit tests before I start working on the 46 Coverity issues and on making GSF compile without warnings.  To help with creating tests, I've started writing a pure Python GSF reader so that I can pick packets out of test files and assemble minimally sized test files from real data.  I also need to use the C GSF library to write out some small test cases that cover all the key corner cases.
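The core of a pure Python reader is small.  As I read the format, each GSF record starts with a 4-byte big-endian payload size and a 4-byte record identifier word; this simplified sketch ignores the checksum and reserved flags that real files pack into the identifier's high bits:

```python
import struct

def iter_gsf_records(path):
    """Yield (record_id_word, payload) pairs from a GSF file.
    Assumes each record is a big-endian 4-byte payload size, a
    4-byte identifier word, then the payload.  Real GSF also packs
    checksum/reserved flags into the identifier's high bits, which
    this sketch does not separate out."""
    with open(path, 'rb') as f:
        while True:
            header =
            if len(header) < 8:
                return  # clean EOF (or a truncated file)
            size, rec_id = struct.unpack('>II', header)
            yield rec_id,
```
<test>
import struct, tempfile
data = struct.pack('>II', 4, 1) + b'abcd' + struct.pack('>II', 2, 12) + b'hi'
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
assert list(iter_gsf_records( == [(1, b'abcd'), (12, b'hi')]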

My original starting point was the gsf code from Leidos.  It's a shame that they don't have a public repo that I could have cloned with the whole history.

A list of things I'd like to do for GSF:

First green build with Travis-CI:

Coverity summary:

Friday, April 17, 2015

My new ais stream processing code for libais

For a long time, I've had the not-so-great code in noaadata to manage multi-line NMEA AIS VDM messages.  It worked for the old USCG format, but it was brittle, crufty code.  Egil added aisdecode to libais based on ais_normalize, but that was building on a terrible foundation.  Today, I pushed the final patch for my nmea_queue module that can handle text, bare NMEA, TAG Block (including Orbcomm extensions), and the old USCG CSV metadata.  I still need to add a couple of non-VDM messages to the system to work out how to parse those cleanly.  With the new gpsd_format library (led by SkyTruth), libais and gpsd are starting to really play well together.  The design goals of the two are extremely different, so think carefully about which you want to use.  In my case, I use both.
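The heart of any such queue is reassembling multi-sentence groups: a VDM line carries (total, sentence number, sequence id) in fields 1-3 and the payload in field 5.  A minimal assembler, ignoring the channel mixing, timeouts, and checksum validation that nmea_queue has to worry about:

```python
def assemble_vdm_payloads(lines):
    """Join multi-sentence !AIVDM payloads, keyed by sequence id.
    No timeout/eviction: a dropped fragment strands its group."""
    pending = {}
    for line in lines:
        fields = line.split(',')
        total, num, seq = int(fields[1]), int(fields[2]), fields[3]
        pending.setdefault(seq, {})[num] = fields[5]
        if len(pending[seq]) == total:
            group = pending.pop(seq)
            # Emit fragments in sentence order, not arrival order.
            yield ''.join(group[i] for i in range(1, total + 1))
```
<test>
lines = [
    '!AIVDM,2,1,3,B,AAA,0*00',
    '!AIVDM,1,1,,A,XYZ,0*00',
    '!AIVDM,2,2,3,B,BBB,2*00',
assert list(assemble_vdm_payloads(lines)) == ['XYZ', 'AAABBB']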

Overall, I'm much happier with the state of libais and excited to talk to many folks in the AIS community at the AISSummit 2015 conference in Hamburg next month.  Paul Woods of SkyTruth and I will be giving a pair of talks that will go over how to use "Big Data" techniques and technologies to tackle larger AIS databases with ease.  We will talk about many of the specifics of what has gone into Global Fishing Watch.

I've also been putting work into gpsd, BitVector, and ais-areanotice, with a little bit on noaadata (noaadata is a mess).  More to come.

Thursday, March 19, 2015

Where to put the optional marker in a regex?

I'm looking back at my AIS VDM python regular expressions this morning.  It seems obvious now that I look at this code block, but when a field is missing, I'd rather get None back, so the "?" goes outside the named regex group.


In [1]: import re

In [2]: a = re.compile(r'(?P<seq_id>[0-9]?)')

In [3]: b = re.compile(r'(?P<seq_id>[0-9])?')

In [4]: a.match('').groupdict()
Out[4]: {'seq_id': ''}

In [5]: b.match('').groupdict()
Out[5]: {'seq_id': None}

Tuesday, March 17, 2015

Dynamic Ocean Management

I'm too lazy to switch to a machine that has Photoshop and fix up the images, but check out this article on "Dynamic Ocean Management: Identifying the Critical Ingredients of Dynamic Approaches to Ocean Resource Management", which talks about WhaleALERT!  doi:10.1093/biosci/biv018

Thursday, February 26, 2015

The Robert Schwehr Memorial Fund

My dad:

The Robert Schwehr Memorial Fund has just been established at Hidden Villa. It is a beautiful nonprofit educational farm that my dad loved and one of his favorite hiking spots. If you are interested in making a donation, please include the phrase "Robert Schwehr Memorial Fund" in the comment section of the online donation page. Here is the link

Friday, January 9, 2015

NOAA CSC and USCG do a lame job with Marine Cadastre AIS

This is a serious waste of our taxpayer dollars.  Public data about vessel locations is being removed from datasets.  This stuff is broadcast in the clear for anyone to receive.  And what do they mean by "encrypted"?  Did they just run a hash function (e.g. md5) on it?  If that's the case, and we can figure out the hash function, it's trivial to build a rainbow table style lookup.  Once that's done, there are lots of sources of MMSI to vessel name and call sign mappings.  Oh, and by the way, it's 2015 and only 2009, 2010, and 2011 are available.  And then they publish it in the ESRI File Geodatabase (FGDB) format, which really isn't open.  And I don't really feel like getting a username and password again (or maybe my old one still works).  Sigh.

"Note: Ship name and call sign fields have been removed, and the MMSI (Maritime Mobile Service Identity) field has been encrypted for the 2010 and 2011 data at the request of the U.S. Coast Guard."
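If "encrypted" really just means an unsalted hash, the scheme is hopeless: an MMSI is only 9 decimal digits, so the whole keyspace is 10^9 hashes.  A sketch assuming md5 for the demo (we don't actually know what function they used):

```python
import hashlib

def hash_mmsi(mmsi):
    """Stand-in for whatever unsalted hash the dataset might use."""
    return hashlib.md5(str(mmsi).encode()).hexdigest()

def build_lookup(candidate_mmsis):
    """Rainbow-table-style reversal: precompute hash -> MMSI for every
    candidate.  The full 9-digit space is only 1e9 entries, which is
    trivial to enumerate on a laptop."""
    return {hash_mmsi(m): m for m in candidate_mmsis}

# Tiny demo slice of the US-flag MMSI range (366xxxxxx).
table = build_lookup(range(366000000, 366001000))
```
<test>
assert table[hash_mmsi(366000123)] == 366000123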

Saturday, January 3, 2015

Java and Eclipse on the mac

At one time, Apple seemed into Java and things were only slightly painful.  Then Java went out of favor with Apple and things went from bad to worse.  I downloaded the 64-bit Mac build of Eclipse and got this after cd /Applications and tar xf ~/Downloads/eclipse-standard-luna-SR1-macosx-cocoa-x86_64.tar.gz and then double clicking the Eclipse icon.

I tried editing /Applications/eclipse/ to look like this:

Looking at my system, I find this for java:

ls -l /Library/Java/JavaVirtualMachines
drwxr-xr-x  3 root  wheel  102 Apr  4  2014 jdk1.7.0_51.jdk
drwxr-xr-x  3 root  wheel  102 May  7  2014 jdk1.7.0_55.jdk
drwxr-xr-x  3 root  wheel  102 Jul  9 13:46 jdk1.7.0_60.jdk
drwxr-xr-x  3 root  wheel  102 Jul 30 13:27 jdk1.7.0_65.jdk
drwxr-xr-x  3 root  wheel  102 Sep 18 05:50 jdk1.7.0_67.jdk
drwxr-xr-x  3 root  wheel  102 Dec 10 08:45 jdk1.7.0_71.jdk

But no luck.  I finally found that this succeeds:

cd /Applications/eclipse/
./eclipse -vm /Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home/bin/

And now I have eclipse running:

So I now have Eclipse Luna Service Release 1 (4.4.1), build id 20140925-1800, on Mac OS X 10.9.5 with Xcode 6.1.1.  It took me way too long to get this figured out.