The Gowanus Data Dredger

Doubling-down on hyperlocal data after my recent drawbridge-sensing project, I present the Gowanus Data Dredger, a streaming real-time repository of maritime-skewed neighborhood data about everyone’s favorite Superfund site.

The Gowanus Data Dredger may look on the surface like a simple thermal receipt printer, but it’s a lot more than that! It pulls data from a wide array of neighborhood and city sources, through an array of APIs and data-scraping scripts. I made it to celebrate the biodiversity and vibrancy of a place I dearly love, a place that could just be the punchline of a long-forgotten, long-polluted industrial waterway but instead supports an incredible community of neighbors and artists and seafarers who invest a ton of energy into its restoration and access.

Data includes:

These very disparate data sources are all brought together and transmitted via a Furuno NX-500 NAVTEX receiver built in the early ’90s, inside which is a little embedded Raspberry Pi Zero 2W.

ZCZC YA52
171628 UTC JUL
NAVTEX, YOU ASK? NAVTEX IS A WIRELESS TELETYPE-ESQUE SERVICE FIRST DEVELOPED IN THE '70S TO HELP BROADCAST EMERGENCY BULLETINS TO MARINERS AT SEA. ESSENTIALLY AN ENCODED TEXT MESSAGE IS BROADCAST EVERY 4 HOURS OVER A STANDARD FM FREQUENCY (GENERALLY 518MHZ). THE PROTOCOL PIGGYBACKS ON A FORMAT CALLED SITOR-B, AND BOTH USE A CHARACTER ENCODING CALLED CCIR-476. NAVTEX IS STILL USED INTERNATIONALLY, THOUGH IT'S MOSTLY BEEN OBSOLESCED BY MORE MODERN TECH LIKE SATELLITES. IN THE US, THE USCG MANAGES NAVTEX FROM A HANDFUL OF BROADCAST STATIONS UP AND DOWN THE SEABOARD. PRIMARILY IT IS NOW USED FOR WEATHER BULLETINS.
NNNN
These are the operating NAVTEX transmitters along the Eastern seaboard. Each one transmits a forecast every 4 hours!

The data

Some of the sources listed above have APIs, but many don’t. It brings me no end of grief that it’s still so hard to get data out of official government sources. NotifyNYC has an email feed but signup was broken when I tried, and an RSS feed that only features active alerts, so I borrowed an unofficial JSON feed from their home page that has a longer tail. NOAA / NWS offer weather data including tide & current predictions, but accessing all of this requires several different APIs and/or parsing text files! The NYC DEP has a wonderful public-facing tool offering waterbody alerts when a combined sewer overflow leads to a poonami, but no way to access the underlying data. (I had to figure out how to make HTTP POST requests to their server to call the raw data, then figure out how to properly interpret it.)

Snapshots from a day of heavy rains: a combined sewer advisory scraped from the NYC DEP alongside traffic and transit alerts from NotifyNYC

Other sources like water quality required more footwork. Many organizations, including the EPA-sponsored contractors performing Superfund remediation on the canal, take regular water samples. But it’s tough to get this data. The official Superfund water quality data is often one-to-two months out-of-date. Other local groups only perform monitoring haphazardly, e.g. when schoolchildren pass through. Eventually I reached out to the wonderful folks at the Interstate Environmental Commission who put me in touch with folks at the Gowanus Canal Conservancy who perform volunteer water monitoring during the summer season, and they were kind enough to grant me access to their Google Sheet. The status of the turbines that power the flushing tunnel is another example: there’s no way to officially know whether the tunnel is on or off, but Gowanus Dredgers captain Gary Francis posts regular Instagram reels monitoring its status. (I scrape these with a headless Chromium install, and even then I get them from Threads rather than IG since Meta does less gatekeeping there! My code is pretty manual; attempting to use libraries like Instaloader only led to my test account getting banned.)

Homebrew data: @standupvirgin‘s flushing tunnel updates alongside LLM-transcribed VHF radio transmissions, paired with vessel AIS metadata

In most cases, I feed data to the gpt-4o API before logging it, to either classify it as relevant or irrelevant to this project (necessary with broad data feeds, like the city-wide NotifyNYC alerts), or to recast the data into a format more appropriate for the short NAVTEX bulletins (necessary for long-form data like the Gowanus CAG blog posts, video data like Captain Gary’s flushing tunnel updates, and of course my homebrew VHF radio transmissions). This is such a wonderfully reliable LLM use case (though maybe I should just give up and learn regex rather than destroying the world by supporting AI).

Superfund updates synthesized into a NAVTEX-appropriate voice (via GPT-4o) from the Gowanus CAG blog

The VHF radio

This whole project originated when I experimented with monitoring VHF radio transmissions to flag when the Hamilton Avenue drawbridge was about to go up, so of course I had to bring that delightfully ephemeral data into the fold here.

My original setup uses two Raspberry Pi 3As attached to RTL-SDR dongles, one sited at the Gowanus Dredgers Boathouse and the other at Big Reuse a bit further down-canal. (I should take a moment and deeply thank the Dredgers and the folks at Big Reuse for letting me set up these radio receivers at their facilities!)

I never quite got the radio reception I wanted to with my RTL-SDRs, though. This was particularly a problem at the Dredgers Boathouse, which is way up-canal. It couldn’t even get transmissions from the Hamilton Avenue Bridge, no matter how much jury-rigging I did with the antenna. The radio at Big Reuse has worked much better, but my installation is naturally a bit lower-profile there, which makes it harder to optimize for reception. Still, it’s been so much fun monitoring the real-time transmissions!

Eventually I turned my attention to the Dredgers Bunker, a small gated-off launchpad at the mouth of the Gowanus Bay, on a dead-end street behind the neighborhood’s Home Depot. This location offers fantastic open vistas, with far fewer buildings to interfere with my precious radio signal.

Hamilton Avenue Bridge going up, as seen from the Gowanus Dredgers Bunker

The only problem? Unlike Big Reuse or the Boathouse, I wouldn’t have reliable wifi or electricity here, so running the power-hungry Pi-based RTL-SDR was out.

Enter the kv4p-ht.

The kv4p-ht is a dongle built around an ESP32 microprocessor and a DRA818V 137-174MHz radio transceiver. It’s designed by and for ham radio users, and was specifically made to interface over USB with an Android phone.

Testing power consumption of the kv4p-ht ESP32 board (v2.0d), along with a lipoly battery and 10w solar panel

After a few days of making sense of the firmware, I adapted it to meet my use case. I run a custom script that scans through marine-band VHF frequencies and sends any transmissions over UDP, thanks to some questionably-accessed and very-locked-down nearby wifi. And because the DRA818V is so much less resource-intensive than the RTL-SDR, I can power the whole thing with a 6600mAh LiPoly battery and a 6v / 5w solar panel. (Lots of light at the bunker! And my setup maxes out at only ~160mA.)

Field testing next to the Bunker

Since I can’t forward any ports or control the network the kv4p is attached to, and since the wifi network I’m using is so locked down, it took a long time to figure out a way to send VHF transmissions through the firewall. I tried setting up a WireGuard deployment to get the ESP32 onto the same VPN network as my computer (and the NAVTEX receiver), but the firewall blocked every proxy. Eventually I discovered that I could convert each transmission into tiny chunks of optimized Opus audio and then send them over a websocket, using Ably Realtime. Each little chunk is nestled in a custom JSON packet that also includes some other useful metadata, such as the VHF frequency of the transmission and both the lipoly battery life and ambient temperature of the ESP32 (since it’s gonna be sitting in the hot sun, I added a simple thermistor). Seems the wifi firewall can’t tell the difference between this and standard SSL traffic! I have a Python listener running that ties the chunks back together and transcribes any transmissions that are a) long enough and b) appear to have spoken words in them, using Cobra’s voice activity detector. For fun, I also implemented some functionality that allows me to use the same websocket protocol in reverse to flash the ESP32 with firmware updates.

I’m impressed by the meager power consumption of the ESP32+DRA818V setup, and pleased to be able to send VHF signals over lil websocket chunks. But it’s been a bear to optimize the antenna. I’ve been dealing with lots of interference from the ESP32’s 2.4ghz wifi, which I currently have sitting at the opposite end of the enclosure separated by a big divider of RF-isolating tape. (In the image above, the two antennas are right next to each other. Needless to say the VHF receiver did not like this.) Still working on this!

The NAVTEX receiver

Deciding to pass my data through the old NAVTEX receiver was one thing. Actually getting it working was completely different.

The spec isn’t that complex, but the transmission must be properly encoded in order to be properly decoded. Since it’s maritime tech from the ’70s, there aren’t a ton of resources out there, even among ham radio freaks. Fldigi was a critical resource for testing my encoding protocols as well as the various test .wav files I found online. But the real killer app is Baltic Lab‘s CCIR476 encoder/decoder library. Thanks to this work, I was able to rig up an Arduino Nano and an SI5351A clock generator to transmit a test message at 518MHz.

It took forever to get this to work. (Most of a weeklong family vacation in Florida, to my shame.) I was sure the Furuno NX-500 was busted. But after poring over the operator’s manual, I discovered that my receiver had an expansion interface installed that basically broke the NAVTEX functionality. (The model requires certain audio-in and audio-out pins to be bridged using a jumper, and the expansion interface interferes with this.)

Of course the manual spelled out why nothing was working. Maybe I should’ve started by reading it?

As a bonus, though, I learned that the NX-500 accepts pre-recorded audio signals for diagnostic purposes! So much better than having to set up an actual transmitter and try to shield it enough to not bother the FCC.

So my final NAVTEX setup includes an embedded Raspberry Pi Zero 2W, mounted to the PCB. It borrows 5v from the receiver, and modulated audio is transmitted out via a GPIO pin to the audio input pin on the PCB. The antenna is entirely bypassed.

Pi Zero nestled inside the NAVTEX receiver, sending a stream of encoded audio directly to the receiver in place of antenna input

Final thoughts

After spending a month trying to crack the NAVTEX protocol, I was so excited when it finally worked.

And then I realized… I had just invented a printer. The slowest, most inefficient printer imaginable.

But a few weeks later, having rigged up the rest of the system, I’ve done a complete 180°. It’s thrilling to hear the transmitter activate and get to say things like “huh, let’s see what’s coming in over the wire….” It also makes me feel like I’m reading a ticker-tape telegraph message from 1940 or something.

As an added bonus, I’ve learned so much about radio that I’ve finally decided to take the ham operator’s licensing exam. DID YOU KNOW that ham operators are allowed to reach out to astronauts on the ISS??? So I’ll undoubtedly have more wacky radio projects to share soon enough.

If you’re curious to learn more about this project, you can check out all the code — including ESP32 Arduino code for VHF radio transmissions and the Python data loggers and NAVTEX transmitter — on github.

If you’re curious to learn more about the Gowanus Canal, I recommend joining a free walk-up paddle with the Gowanus Dredgers, volunteering with Big Reuse and/or the Gowanus Canal Conservancy, and reading historian Joseph Alexiou’s work!