Monday, April 18, 2016

Prometheus Metric Endpoint Parser for Java

I needed a Java-based parser that can read metric data from any Prometheus endpoint.

Prometheus has two main data formats - a binary format and a text format. You can read about those formats here. That document says that "Clients must support at least one of these two alternate formats." Since any given endpoint might offer only one of them, I needed a Java-based parser that can handle both.

The Prometheus team has published parsers for several different languages (e.g. C++, Go, Python, Ruby, and Java). Some support only the binary format, and the Java parser is one of those. In addition, the Prometheus team may delete the Java parser entirely since it is relatively unused by the community. As of this writing, the latest release of the Prometheus Java parser is version 0.0.2 from July 2013, which also doesn't support histograms (though version 0.0.3-SNAPSHOT in the master branch does support them - so if/when 0.0.3 is released, histogram support will be available).

So I needed to write my own Java-based parser for the text format to ensure I could read any Prometheus metric endpoint. Even though the documentation says clients must support one format or the other, in practice it seems all endpoints support the text format and only some (mainly Go endpoints) support the binary format. In other words, endpoints with binary-format support should have text-format support as well. So even if the Java-based binary parser goes away, a text parser should still let me read all Prometheus endpoints.
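To give a feel for what parsing the text format involves, here is a minimal sketch that parses a single sample line of the form `name{label="value",...} value [timestamp]`. It is not the scraper's actual parser: the class name `TextSampleLine` and its methods are made up for this illustration, the optional timestamp is ignored, and the `# HELP` / `# TYPE` lines that group samples into metric families are simply skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: holds one parsed sample line of the Prometheus text format. */
class TextSampleLine {
    final String name;
    final Map<String, String> labels;
    final double value;

    TextSampleLine(String name, Map<String, String> labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    /** Returns null for blank lines and for lines starting with '#' (HELP, TYPE, comments). */
    static TextSampleLine parse(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) {
            return null;
        }
        Map<String, String> labels = new LinkedHashMap<>();
        String name;
        String rest;
        int open = s.indexOf('{');
        if (open >= 0) {
            // Only numbers follow the label block, so the last '}' closes it.
            int close = s.lastIndexOf('}');
            if (close < open) {
                throw new IllegalArgumentException("Unbalanced braces: " + line);
            }
            name = s.substring(0, open).trim();
            parseLabels(s.substring(open + 1, close), labels);
            rest = s.substring(close + 1).trim();
        } else {
            String[] parts = s.split("\\s+", 2);
            if (parts.length < 2) {
                throw new IllegalArgumentException("Missing value: " + line);
            }
            name = parts[0];
            rest = parts[1];
        }
        // rest is "value" or "value timestamp"; this sketch ignores the timestamp.
        String[] fields = rest.split("\\s+");
        return new TextSampleLine(name, labels, parseValue(fields[0]));
    }

    /** Parses label pairs such as method="post",code="200", honoring backslash escapes. */
    static void parseLabels(String body, Map<String, String> out) {
        int i = 0;
        int n = body.length();
        while (i < n) {
            while (i < n && (body.charAt(i) == ',' || Character.isWhitespace(body.charAt(i)))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int eq = body.indexOf('=', i);
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in labels: " + body);
            }
            String key = body.substring(i, eq).trim();
            i = eq + 1;
            while (i < n && Character.isWhitespace(body.charAt(i))) {
                i++;
            }
            if (i >= n || body.charAt(i) != '"') {
                throw new IllegalArgumentException("Label value must be quoted: " + body);
            }
            i++; // skip the opening quote
            StringBuilder value = new StringBuilder();
            boolean closed = false;
            while (i < n) {
                char c = body.charAt(i++);
                if (c == '\\' && i < n) {
                    char escaped = body.charAt(i++);
                    value.append(escaped == 'n' ? '\n' : escaped);
                } else if (c == '"') {
                    closed = true;
                    break;
                } else {
                    value.append(c);
                }
            }
            if (!closed) {
                throw new IllegalArgumentException("Unterminated label value: " + body);
            }
            out.put(key, value.toString());
        }
    }

    /** The text format writes infinities as +Inf / -Inf, which Double.parseDouble rejects. */
    static double parseValue(String v) {
        switch (v) {
            case "+Inf":
            case "Inf":
                return Double.POSITIVE_INFINITY;
            case "-Inf":
                return Double.NEGATIVE_INFINITY;
            default:
                return Double.parseDouble(v);
        }
    }

    public static void main(String[] args) {
        TextSampleLine s = parse("http_requests_total{method=\"post\",code=\"200\"} 1027 1395066363000");
        System.out.println(s.name + " " + s.labels + " = " + s.value);
    }
}
```

A real parser also has to track the preceding `# TYPE` line so that histogram and summary samples (`_bucket`, `_sum`, `_count`) can be grouped into one metric, which this sketch does not attempt.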

Here is my Java-based Prometheus Metrics Scraper code. There is a README for a quick synopsis. It supports both binary and text formats and uses content negotiation with the URL endpoint to determine what format to expect. You can also programmatically process files as opposed to URL endpoints.
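As a rough illustration of the content negotiation idea, the sketch below shows an Accept header that asks for the binary format first and the text format second, plus a helper that decides which parser to use from the Content-Type the endpoint sends back. The class name `FormatNegotiation`, the `DataFormat` enum, and the fall-back-to-text behavior are assumptions made for this example, not necessarily what the scraper does internally.

```java
import java.util.Locale;

/** Hypothetical helper illustrating format selection from an HTTP Content-Type header. */
class FormatNegotiation {

    enum DataFormat { BINARY, TEXT }

    /** Prefer the delimited protobuf format, but accept the text format too. */
    static final String ACCEPT_HEADER =
            "application/vnd.google.protobuf;proto=io.prometheus.client.MetricFamily;encoding=delimited;q=0.7,"
            + "text/plain;version=0.0.4;q=0.3";

    /**
     * Picks a format from the response's Content-Type value.
     * Assumption of this sketch: anything that is not protobuf is treated as text.
     */
    static DataFormat fromContentType(String contentType) {
        if (contentType == null) {
            return DataFormat.TEXT;
        }
        // Drop parameters such as "; version=0.0.4" and compare the bare media type.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (mediaType.equals("application/vnd.google.protobuf")) {
            return DataFormat.BINARY;
        }
        return DataFormat.TEXT;
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("text/plain; version=0.0.4"));
        System.out.println(fromContentType(
                "application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=delimited"));
    }
}
```

With the standard library, the header would be sent with `URLConnection.setRequestProperty("Accept", FormatNegotiation.ACCEPT_HEADER)` before reading the response, and the value returned by `URLConnection.getContentType()` would then be passed to `fromContentType`.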

This Prometheus Metrics Scraper comes with a CLI that you can run via a simple Java command:
java -jar prometheus-scraper*-cli.jar [--simple | --xml | --json] {url}
It can output any URL endpoint's metric data in several formats (JSON and XML being the two more interesting ones). If you'd like to try it out, grab the latest release from here and run it. For example, you can download the 0.17.1Final CLI jar here.

Programmatically you use this by simply passing a URL (or File) to PrometheusScraper and calling its scrape() method. This will return a list of MetricFamily objects, which contain all the metric data found in the endpoint URL.

See the code's Javadoc for more complete documentation.

A few things are still missing and would be nice to add in the future.

First is histogram support for binary-formatted data. Once the jar artifact "io.prometheus.client:model" version 0.0.3 is released by the Prometheus team, it should just be a matter of uncommenting one block of code for my Java-based parser to begin supporting it. Of course, histograms are fully supported in the text parser.

Second, the URL endpoint is assumed to be unsecured. If SSL certificates or authentication are required to access the metric data at the given URL, the scraper will fail to process the data.