A system for generating and saving per-connection metadata in real time on M-Lab's edge systems.
It generates a JSON file for every connection, containing the geolocation and network location metadata for the IP addresses in the connection. It will eventually add all other annotations concerning the "local environment" as well.
The datatype it generates will be "annotation" and it will generate filenames like:
/ndt/annotation/2009/03/18/${UUID}.json
where ${UUID} is the actual UUID of the connection under consideration. It will follow both our uniform names best practices and our pusher best practices.
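The naming scheme above can be sketched in a few lines. This is only an illustration, not the annotator's actual code: the function name annotation_path and its parameters are hypothetical, and which date is used for the directory (for example, the connection's start date) is an assumption the text does not settle.

```python
from datetime import date


def annotation_path(experiment: str, day: date, uuid: str) -> str:
    """Build /<experiment>/annotation/YYYY/MM/DD/<uuid>.json.

    Hypothetical helper: the caller decides which date applies to the
    connection; the source text does not specify this.
    """
    return f"/{experiment}/annotation/{day.strftime('%Y/%m/%d')}/{uuid}.json"


if __name__ == "__main__":
    # "example-uuid" is a placeholder, not a real tcp-info UUID.
    print(annotation_path("ndt", date(2009, 3, 18), "example-uuid"))
```

Zero-padding the month and day keeps directory names sorted lexicographically in date order.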
The columns in the JSON file will initially be a subset of our standard columns:
client.Geo.*
server.Geo.*
client.Network.ASNumber
server.Network.ASNumber
Later versions can (and should!) add columns that include real-time switch counters, local machine load, and other indicators of measurement quality, but v1 will concentrate on location data. Each new column added to the annotator output should be added to our set of standard columns.
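As a rough picture of the v1 column layout, the sketch below builds one record with the four column groups listed above and serializes it to JSON. The function make_annotation is hypothetical, the subfields placed under Geo (CountryCode) are placeholders because the text only says client.Geo.* and server.Geo.*, and the AS numbers come from the range reserved for documentation.

```python
import json
from typing import Any, Dict


def make_annotation(client_geo: Dict[str, Any], server_geo: Dict[str, Any],
                    client_asn: int, server_asn: int) -> Dict[str, Any]:
    """Assemble a record with client/server Geo and Network.ASNumber columns."""
    return {
        "client": {"Geo": dict(client_geo), "Network": {"ASNumber": client_asn}},
        "server": {"Geo": dict(server_geo), "Network": {"ASNumber": server_asn}},
    }


if __name__ == "__main__":
    # Geo subfields here are illustrative only; the real schema may differ.
    record = make_annotation({"CountryCode": "US"}, {"CountryCode": "US"},
                             client_asn=64496, server_asn=64497)
    print(json.dumps(record, indent=2))
```

Nesting the fields this way means a later column, such as a load indicator, can be added as a new key without disturbing the existing ones.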
The location annotation service will read its data from a MaxMind file stored in a GCS bucket. It will poll the bucket periodically, at memoryless (randomly spaced) intervals, to discover whether the file has changed.
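One way to read "memoryless" polling is to draw each wait from an exponential distribution, so the time until the next poll does not depend on how long it has been since the last one. The sketch below illustrates that idea under several assumptions: the function names, the default timing values, the clamping to a minimum and maximum wait (which makes the process only approximately memoryless), and the use of a content hash to detect changes are all hypothetical, not taken from the annotator.

```python
import hashlib
import random
from typing import Callable, Optional, Tuple


def next_wait(rng: random.Random, expected: float,
              min_wait: float, max_wait: float) -> float:
    """Draw an exponential wait with mean `expected`, clamped to [min, max]."""
    wait = rng.expovariate(1.0 / expected)
    return min(max(wait, min_wait), max_wait)


def has_changed(previous_digest: Optional[str], content: bytes) -> Tuple[bool, str]:
    """Compare a SHA-256 digest of the content with the previous digest."""
    digest = hashlib.sha256(content).hexdigest()
    return digest != previous_digest, digest


def poll(fetch: Callable[[], bytes], reload: Callable[[bytes], None],
         sleep: Callable[[float], None], rng: random.Random, rounds: int,
         expected: float = 3600.0, min_wait: float = 600.0,
         max_wait: float = 14400.0) -> None:
    """Fetch `rounds` times, reloading only when the content differs.

    The first fetch always counts as a change, which performs the
    initial load. Timing values are in seconds and are placeholders.
    """
    digest: Optional[str] = None
    for _ in range(rounds):
        content = fetch()
        changed, digest = has_changed(digest, content)
        if changed:
            reload(content)
        sleep(next_wait(rng, expected, min_wait, max_wait))
```

Passing fetch, reload, and sleep in as callables keeps the loop testable without touching GCS or the real clock; a real deployment would pass time.sleep and a function that downloads the file.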
This service will depend on tcp-info's UUID notification service, but no local service should depend on the annotator. As such, we do not need to worry about the annotator slowing down an integrated service; we only need to worry about the annotator keeping up with the creation rate of TCP connections. We do not anticipate that being too difficult.
This service is a core service and needs to be highly available, just like tcp-info, packet-headers, traceroute-caller, and DISCO. It represents our one chance to annotate UUIDs with metadata. As such, the health of the experiment service should depend on the health of the UUID annotation service, just like it should depend on the other core services.
If only the local ipservice socket is needed to provide annotations for specific
IPs, the uuid-annotator may be run in a "stand-alone" mode. This mode does not
require the tcp-info -tcpinfo.eventsocket, -siteinfo.url, or -datadir flags.
docker build -t local-annotator .
docker run -v $PWD/testdata:/testdata -it local-annotator \
-ipservice.sock=/local/uuid-annotator.sock \
-maxmind.url=file:///testdata/GeoLite2-City-real.tar.gz \
-routeview-v4.url=file:///testdata/RouteViewIPv4.pfx2as.gz \
-routeview-v6.url=file:///testdata/RouteViewIPv6.pfx2as.gz
If using uuid-annotator data as part of the autoloader pipeline, you may
generate the data type schemas using the generate-schemas command:
docker run -v $PWD:/schemas --entrypoint /generate-schemas -it local-annotator \
-ann2 /schemas/ann2.json -hop2 /schemas/hop2.json