Wednesday, June 29, 2016

How bad is Comcast?

So, not that this is news to anyone, but Comcast is a bunch of morons.  They called me.  They can find 3 accounts for me, but they can't find my internet account.  I have business class internet from them at the same address as cable TV.  I can't effectively do anything with their website, as I have to log in and out of several accounts, all of which are f-ed up and randomly log me in or out as I try to check bills and the refund of $150 they say they owe me but haven't yet paid on one of the accounts.  So I'm stuck with their internet for a year until my contract runs out, but it's time to cancel cable TV with them.  Now I'll just be stuck with 24/7 guaranteed support that only works during Mountain Time business hours.  Last I called, the lady said she had accidentally kicked me off the internet and couldn't fix it until I called back during business hours.  She couldn't fix what she had just done... which turned out to be nothing.  Yay for customer service.

Tuesday, June 28, 2016

libtiff security bug

I just had a chance to work on a security bug behind the scenes that might end up having a CVE (Update: CVE-2016-5875).  All the stuff I did in GDAL was so much of a torrent that it hardly seemed worth noting.  While I was just a reviewer and connecting people behind the scenes, it still feels good to help out.  The log entry by Even Rouault:

cvs log -r1.44 tif_pixarlog.c | egrep -i '^[a-z]'
RCS file: /cvs/maptools/cvsroot/libtiff/libtiff/tif_pixarlog.c,v
Working file: tif_pixarlog.c
head: 1.45
total revisions: 51; selected revisions: 1
revision 1.44
date: 2016-06-28 08:12:19 -0700;  author: erouault;  state: Exp;  lines: +9 -1;  commitid: 2SqWSFG5a8Ewffcz;
PixarLogDecode() on corrupted/unexpected images (reported by Mathias Svensson)

The patch (also in GDAL as r34459):

cvs diff -r1.43 -r1.44 -u tif_pixarlog.c
Index: tif_pixarlog.c
RCS file: /cvs/maptools/cvsroot/libtiff/libtiff/tif_pixarlog.c,v
retrieving revision 1.43
retrieving revision 1.44
diff -u -r1.43 -r1.44
--- tif_pixarlog.c 27 Dec 2015 20:14:11 -0000 1.43
+++ tif_pixarlog.c 28 Jun 2016 15:12:19 -0000 1.44
@@ -1,4 +1,4 @@
-/* $Id: tif_pixarlog.c,v 1.43 2015-12-27 20:14:11 erouault Exp $ */
+/* $Id: tif_pixarlog.c,v 1.44 2016-06-28 15:12:19 erouault Exp $ */

  * Copyright (c) 1996-1997 Sam Leffler
@@ -459,6 +459,7 @@
 typedef struct {
  TIFFPredictorState predict;
  z_stream stream;
+ tmsize_t tbuf_size; /* only set/used on reading for now */
  uint16 *tbuf;
  uint16 stride;
  int state;
@@ -694,6 +695,7 @@
  sp->tbuf = (uint16 *) _TIFFmalloc(tbuf_size);
  if (sp->tbuf == NULL)
  return (0);
+ sp->tbuf_size = tbuf_size;
  if (sp->user_datafmt == PIXARLOGDATAFMT_UNKNOWN)
  sp->user_datafmt = PixarLogGuessDataFmt(td);
  if (sp->user_datafmt == PIXARLOGDATAFMT_UNKNOWN) {
@@ -783,6 +785,12 @@
  TIFFErrorExt(tif->tif_clientdata, module, "ZLib cannot deal with buffers this size");
  return (0);
+ /* Check that we will not fill more than what was allocated */
+ if (sp->stream.avail_out > sp->tbuf_size)
+ {
+ TIFFErrorExt(tif->tif_clientdata, module, "sp->stream.avail_out > sp->tbuf_size");
+ return (0);
+ }
  do {
  int state = inflate(&sp->stream, Z_PARTIAL_FLUSH);
  if (state == Z_STREAM_END) {

There is still tons of room for even beginners to find bugs.  So grab your fuzzers (e.g. AFL), static analyzers, and Mark 1 eyeballs.  Then go find an open source package and get to work!

The CVE summary: "Heap-based buffer overflow in LibTIFF when using the PixarLog compression format"
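
The guard added by the patch boils down to "never ask the decompressor for more output than the buffer you allocated."  Here is a minimal Python sketch of the same idea using the standard zlib module.  It is not libtiff code: the function name bounded_inflate is made up for illustration, and the parameter names just mirror the avail_out and tbuf_size fields in the diff.

```python
import zlib


def bounded_inflate(compressed, avail_out, tbuf_size):
    """Inflate at most avail_out bytes, refusing to exceed tbuf_size.

    Hypothetical helper mirroring the patch's check; not part of libtiff.
    """
    # Same guard as the patch: the requested output must fit the allocation.
    if avail_out > tbuf_size:
        raise ValueError("avail_out > tbuf_size")
    if avail_out <= 0:
        # In Python's zlib, max_length=0 means "unlimited", so handle it here.
        return b""
    decompressor = zlib.decompressobj()
    # max_length caps how many bytes a single call may produce.
    return decompressor.decompress(compressed, avail_out)


data = zlib.compress(b"a" * 1000)
print(len(bounded_inflate(data, 100, 100)))
```

The point is that the size check happens before any decompression starts, so a corrupted or hostile image cannot talk the decoder into writing past the end of the buffer.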

Thursday, June 23, 2016

Public review of Handling and Analyzing Marine Traffic Data

This morning's reading: Handling and Analyzing Marine Traffic Data, a Master's thesis by Eric Ahlberg and Joakim Danielsson.  I hate to be harsh in public, but this thesis is more of a tease than anything else.  I was hoping for more, and I hope that those involved follow up with more depth and, next time, give better background to increase the value of the research.  This thesis shows that there is a start to interesting work.  From the abstract:
With the emergence of the Automatic Identification System (AIS), the ability to track and analyze vessel behaviour within the marine domain was introduced. Nowadays, the ubiquitous availability of huge amounts of data presents challenges for systems aimed at using AIS data for analysis purposes regarding computability and how to extract valuable information from the data. This thesis covers the process of developing a system capable of performing AIS data analytics using state of the art Big data technologies, supporting key features from a system called Marine Traffic Analyzer 3. The results show that the developed system has improved performance, supports larger files and is accessible by more users at the same time. Another problem with AIS is that since the technology was initially constructed for collision avoidance-purposes, there is no solid mechanism for data validation. This introduces several issues, among them is what is called identity fraud, that is when a vessel impersonates another vessel for various malicious purposes. This thesis explores the possibility of detecting identity fraud by using clustering techniques for extracting voyages of vessels using movement patterns and presents a prototype algorithm for doing so. The results concerning the validation show some merits, but also exposes weaknesses such as time consuming tuning of parameters.
I skimmed to the reference section and conclusion and, while they reference some key relevant papers, they are missing a lot of references that you might expect.  No reference to ITU, IEC, IMO, IALA, or other relevant specifications.  No references to papers, presentations, or blog posts by me, ESR, or SkyTruth about AIS troubles or using "Big Data" type methods for AIS.  I'm uncomfortable tooting my own horn here, but come on.

Reading through the thesis, I couldn't find any real meat to the introduction and, when I got to the evaluation section, I was disappointed by the passage quoted below.  No references to even what model they used.  They could have easily reached out to a number of folks with AIS data and stats about data errors.  The thesis never describes how AIS messages really work, nor gives any background on what perfectly functioning AIS message traffic might look like and its error characteristics.  Their one reference to spoofing was to the annoying web hack of injecting AIS messages into a company's feed, which no other ships would even see on their bridge.
The problem of AIS validation has been studied before, but to the knowledge of the authors of this thesis, there is no data consisting of documented cases of invalid data openly accessible. In addition, there is no measure of how often the specific problem occurs in real situations, which means that it might be too time consuming to use real data. Therefore, the evaluation focused on constructing dummy data to realistically model interesting scenarios which could be a sign of invalid AIS messages, and thereby get an indication of how well the solution performs.
Hey guys, check out my 2012 blog post: AIS Security and Integrity.

It was nice to see them go through various computing platforms, but the analysis was rather weak.  I have to wonder what they mean when they say that a command line interface is hard to upgrade.  That, to me, seems easier than updating web apps.

Later we get to 4.2.2 AIS message validation.  When they refer to "Static validation, i.e. checking that the messages conform to the syntax of an AIS message" I really have no idea what they mean.  They haven't even defined a syntax for AIS nor told the reader where it might be defined.

The clustering stuff is okay, but the figures are very difficult to read until you get to 4.10.  Just when things are starting to get interesting, the thesis ends.  There is a section on ethical concerns that appears to be an afterthought and provides no new information, analysis, or opinions (and not even a reference to the IMO announcement of more than 15 years ago on the topic).  There was a whole pile of comments submitted to the US federal government in response to a request some years ago.  Both sides of the argument submitted opinions.

Wishing for more...

Wednesday, June 22, 2016

How many slots can an AIS message have?

Since up to 9 VDM messages can be chained together, there is some confusion as to how many bits can be in an AIS message.  Over the radio, you are allowed to have up to 5 slots.  See ITU-R M.1371-5.

The first slot is short, with just 128 bits available.  The following 4 slots are 256 bits each.  That gives a total payload over the VHF radio of 128 + 4 x 256 = 1152 bits.  Once that is armored into NMEA VDM characters at 6 bits per character, that gives 192 characters.  Those 192 characters can be spread out across multiple NMEA lines that are grouped together.  Each NMEA sentence can be up to 80 characters long.  Those sentences are defined in the NMEA standard and IEC 61162, both of which are paywalled.  According to NMEA 4.0, there can be up to 62 characters of armored payload per sentence.
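
To make "armored at 6 bits per character" concrete, here is a small Python sketch of un-armoring a VDM payload back into bits.  It follows the usual AIVDM mapping (subtract 48 from the ASCII code, then subtract another 8 if the result is above 40).  The function names are made up for illustration and this is not how libais is structured internally.

```python
def unarmor_char(ch):
    """Return the 6-bit value (0-63) carried by one armored payload character."""
    code = ord(ch)
    # Valid armor characters are '0'..'W' and '`'..'w'.
    if not (48 <= code <= 87 or 96 <= code <= 119):
        raise ValueError("invalid armor character: %r" % ch)
    value = code - 48
    if value > 40:
        value -= 8
    return value


def payload_to_bits(payload, pad=0):
    """Expand an armored payload into a string of '0'/'1' characters.

    pad is the number of fill bits to drop from the end of the last character.
    """
    bits = "".join(format(unarmor_char(ch), "06b") for ch in payload)
    return bits[:len(bits) - pad] if pad else bits


print(payload_to_bits("15M"))
```

Every payload character always contributes exactly 6 bits, which is why 1152 bits comes out to 192 characters.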

While it is possible to chain 9 sentences (1 to 9) together of VDM armored NMEA with 558 characters of armored data at 6 bits per character for a total of 3348 bits, the VDL (VHF Data Link) will only let you send those 5 slots or 1152 bits.  A full length 1152 bit message could be packed in as little as 4 NMEA lines if fully using the 62 characters per sentence limit.
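
The numbers above are easy to check.  Here is a quick Python sketch of the arithmetic, using only the limits stated in this post (128-bit first slot, 256-bit following slots, 5 slots, 6 bits per character, 62 payload characters per sentence, 9 chained sentences).  The constant and function names are just for illustration.

```python
FIRST_SLOT_BITS = 128
FOLLOWING_SLOT_BITS = 256
MAX_SLOTS = 5
BITS_PER_CHAR = 6
MAX_PAYLOAD_CHARS_PER_SENTENCE = 62
MAX_CHAINED_SENTENCES = 9


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def vdl_max_bits(slots=MAX_SLOTS):
    """Bits available over the VHF Data Link for a message using this many slots."""
    return FIRST_SLOT_BITS + (slots - 1) * FOLLOWING_SLOT_BITS


def armored_chars(bits):
    """Armored characters needed to carry the given number of bits."""
    return ceil_div(bits, BITS_PER_CHAR)


def sentences_needed(bits):
    """Fewest NMEA sentences that can carry the given number of bits."""
    return ceil_div(armored_chars(bits), MAX_PAYLOAD_CHARS_PER_SENTENCE)


def nmea_chain_max_bits():
    """Bits that 9 fully packed chained sentences could carry."""
    return MAX_CHAINED_SENTENCES * MAX_PAYLOAD_CHARS_PER_SENTENCE * BITS_PER_CHAR


print(vdl_max_bits(), armored_chars(1152), sentences_needed(1152), nmea_chain_max_bits())
```

So the NMEA framing could carry almost three times what the radio link will actually let through.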

And to sound like a broken record, paywalled specifications are evil.  In the case of specs for maritime systems, closed specs are detrimental to safety of life at sea and reduce the quality of the tools available to mariners.  I take strong issue with NMEA, IEC, ITU and ISO for paywalling so many specification documents.

[Figure: an example of multi-sentence TAG BLOCK encoded NMEA, shown as an image and as text.]


Thursday, June 16, 2016

Git for Ages 4 and up

If you are interested in git (and why wouldn't you be?), and you haven't watched this video, you should watch it!

Wednesday, June 15, 2016

Using libais C++ to dump a human readable form of AIS NMEA messages

I was just asked this via email, so here is the answer for all.
I am trying to use your library to decode AIS in C++. I can’t seem to figure out how to actually decode the body into meaningful string data, or if your library has that functionality.

The libais library is designed to convert the NMEA to bits and then break down the bits into a C++ struct.  From Python you get a dict that you can print.  Otherwise, you can extend the stream operators to print more for C++, but it's really application specific what format you might want for the output: XML, JSON, msgpack, CSV, SQL for database insertion, etc., plus variations on those themes like gpsd JSON.  And do you want to mix message 5 data into 1, 2, 3 position reports?  It's up to you how you want to print them.
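
As a rough illustration of why the output format is left to the application, here is a Python sketch that takes a decoded message dict and renders it as either JSON or a CSV row.  The format_message helper is hypothetical, and the sample dict is hand-written with made-up values rather than real decoder output.  With the libais Python layer installed, the dict would instead come from something like ais.decode(body, pad).

```python
import csv
import io
import json


def format_message(msg, fmt="json", fields=None):
    """Render a decoded-message dict as a JSON string or a single CSV row."""
    if fmt == "json":
        return json.dumps(msg, sort_keys=True)
    if fmt == "csv":
        # Choose the columns explicitly, or fall back to the sorted keys.
        fields = fields or sorted(msg)
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerow(
            [msg.get(name, "") for name in fields])
        return buf.getvalue().rstrip("\n")
    raise ValueError("unsupported format: %r" % fmt)


# Hand-written stand-in for a decoded position report (illustrative values).
sample = {"id": 1, "mmsi": 123456789, "x": -70.5, "y": 42.25}

print(format_message(sample))
print(format_message(sample, "csv", ["mmsi", "x", "y"]))
```

Even in this toy version you have to decide which fields to keep and in what order, and that is exactly the kind of choice libais stays out of.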

libais just tries to nail the details of the low level packets in C++.  Everything else is considered out of scope.  Pull requests welcome if you want to add a C++ layer above that does more.  So far, I have only added the Python layer.

If you just want a quick look at JSON output, you can pass the data through gpsd.  It's missing some of the messages and field components (e.g. commstate), but it does commit to an output format.