BQuery is the native Karmasphere query protocol. All the Karmasphere client libraries use BQuery. Compared to DNS, these queries return a richer response, describing exactly which feeds matched, and why the feedset reached the verdict it did.
A BQuery packet is a hash table (Python: dictionary, Perl: hash, Java: Map) which is Bencoded, and sent over a network. Bencoding is a simple encoding format for data structure serialization, borrowed from BitTorrent.
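The BitTorrent protocol specification defines Bencoding in terms of four types. Byte strings are encoded as the length in base ten, a colon, and then the contents (4:spam). Integers are encoded as an i, the number in base ten, and an e (i3e). Lists are encoded as an l, the Bencoded elements, and an e (l4:spam4:eggse). Dictionaries are encoded as a d, alternating Bencoded keys and values, and an e (d3:cow3:moo4:spam4:eggse); keys must be strings and appear in sorted order.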
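The encoding rules from the BitTorrent specification can be sketched in a few lines of Python. This is a minimal illustration, not the encoder used by any Karmasphere client library; the function name `bencode` and the choice of UTF-8 for text strings are assumptions.

```python
def bencode(value):
    """Encode ints, strings, bytes, lists and dicts as Bencoded bytes."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; Bencoding has no boolean type.
        raise TypeError("Bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("Bencoded dictionary keys must be strings")
            items.append((key, item))
        # Dictionary keys are emitted in sorted order of their raw bytes.
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot Bencode %s" % type(value).__name__)
```

For example, `bencode({"fl": 1, "_": 12345})` yields `b"d1:_i12345e2:fli1ee"`, with the `_` key first because it sorts before `fl`.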
Each BQuery packet itself contains a number of key-value pairs. Most keys have both a short form and a long form. When constructing a packet, either the short form or the long form may be used, but not both; the short form is preferred to save network bandwidth. When receiving and decoding a packet, both forms must be checked. If the short form is found, the long form may be ignored.
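The lookup rule for received packets (check the short form first, fall back to the long form) might look like the following sketch. The helper name `get_field` is hypothetical, and the packet is assumed to have already been decoded into a Python dictionary with string keys.

```python
def get_field(packet, short, long, default=None):
    """Return the value stored under the short key, else the long key."""
    if short in packet:
        # When the short form is present the long form may be ignored.
        return packet[short]
    return packet.get(long, default)
```

A receiver would call `get_field(packet, "i", "ids")` rather than indexing either key directly, so that packets built with either form are handled the same way.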
Here is an example query, before Bencoding:
{
"_" : 12345,
"a" : [ "il4l4brs2ksyrf36", "rwf8l7oj" ],
"i" :
[
["127.0.0.1", "ip4", "smtp.client-ip"],
["antony@karmasphere.com", "email", "smtp.env.mail-from"],
["karmasphere.com", "domain", "smtp.env.helo"]
],
"s" : "karmasphere.email-sender",
"fl": 1
}
Formally, a BQuery query is a Bencoded map containing the following key-value pairs:
| Short name | Long name | Value |
| --- | --- | --- |
| _ | _ | (required) A unique cookie for the query. This will be returned in the response. |
| a | auth | (required) A list of Query Credentials (see below), usually a username and a password. |
| i | ids | (required) A list of Identities (see below). |
| s | composites | (recommended) Either a single string, which is the name of the composite to query, or a list of strings, which are the names of several composites to query. |
| fl | flags | (optional) A bit-significant number (see below). |
| f | feeds | (optional) A list of additional feeds by id number to query against. |
| c | combiners | (optional) Either a single string, which is the name of an additional combiner to query, or a list of strings, which are the names of several additional combiners to query. |
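As an illustration of the fields above, the following sketch assembles a query dictionary using the short key names. The function `build_query` and its parameter names are hypothetical; only the keys and their meanings come from the table.

```python
def build_query(cookie, username, password, identities, composites=None, flags=None):
    """Assemble a BQuery query map using short key names."""
    query = {
        "_": cookie,                  # returned unchanged in the response
        "a": [username, password],    # query credentials
        "i": [list(identity) for identity in identities],
    }
    # Optional and recommended fields are only included when supplied.
    if composites is not None:
        query["s"] = composites
    if flags is not None:
        query["fl"] = flags
    return query
```

Called with the values from the example query, this returns the same map as shown there; the result would then be Bencoded before being sent.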
Cookie: This is an arbitrary value which is returned in the response. This enables the application client to keep track of query / response pairs in an asynchronous situation. It is recommended that a simple string or integer be used.
Query Credentials: A list containing a username and a password. These credentials are (mostly) required. They are assigned by the system; you can find your credentials here. They may be different from the username and password that you use to sign in to the website. (more: FAQ)
Identities: A list of identities. Each identity is itself a list in the following format:
identity string, identity type [, tag1 [, tag2 ...]]
Identity types are represented by strings or numbers:
| Numeric | String | Type |
| --- | --- | --- |
| 0 | ip4 | IP4 Address |
| 1 | ip6 | IP6 Address |
| 2 | domain | Domain name |
| 3 | email | Email Address |
| 4 | url | URL |
| 5 | opaque | Opaque identity string |
Tags communicate the context of an identity:
| Tag name | Meaning |
| --- | --- |
| smtp.client-ip | IP address of a smtp client |
| smtp.env.helo | HELO string sent from a smtp client |
| smtp.env.mail-from | MAIL FROM: email address from a smtp client |
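An identity entry can be built and checked with a small helper such as the sketch below. The names `IDENTITY_TYPES` and `make_identity` are hypothetical, and the string form `email` for type 3 is taken from the example query rather than from the type table.

```python
# String and numeric forms of the identity types listed above.
# "email" is inferred from the example query (an assumption).
IDENTITY_TYPES = {"ip4": 0, "ip6": 1, "domain": 2, "email": 3, "url": 4, "opaque": 5}

def make_identity(value, id_type, *tags):
    """Return [identity string, identity type, tag1, tag2, ...]."""
    known = id_type in IDENTITY_TYPES or id_type in IDENTITY_TYPES.values()
    if not known:
        raise ValueError("unknown identity type: %r" % (id_type,))
    return [value, id_type, *tags]
```

For instance, `make_identity("karmasphere.com", "domain", "smtp.env.helo")` produces the third identity of the example query.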
Composites: A list of the names of feedsets to query the identities provided against. For example, “karmasphere.email-sender”.
Feeds: This is a list of additional feed names to query. Generally, this parameter is not required.
Combiners: This is a list of additional combiners to use. Generally, this parameter is not required. Permissible values for this parameter will depend on the configuration of the slave which you are querying; local administrators may install additional combiners for you.
Flags: A bit-significant number. Currently only the lowest bit has any meaning:
| Bit | Semantic |
| --- | --- |
| 0 | Include all facts in the response. |
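Because the flags value is bit-significant, it is best built and tested with bitwise operators. In this sketch the constant name `FLAG_ALL_FACTS` and the helper `wants_all_facts` are hypothetical; only the meaning of bit 0 comes from the table.

```python
FLAG_ALL_FACTS = 1 << 0  # bit 0: include all facts in the response

def wants_all_facts(flags):
    """Report whether bit 0 is set, ignoring any other bits."""
    return bool(flags & FLAG_ALL_FACTS)
```

The example query's `"fl": 1` therefore asks for all facts to be included.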
The BQuery response is a Bencoded map with the following keys:

| Short key | Long key | Value |
| --- | --- | --- |
| _ | _ | The cookie which was sent with the query. This field will not be present if no cookie was sent. |
| f | facts | A list of facts (matches against each feed for each identity). |
| c | combiners | A map from combination name to combination data. |
| t | time | The time the query took in milliseconds. |
| error | error | An error flag, set to 1 when an error has occurred. Other fields may be absent in this case. |
| message | message | An informational message, sent when an error has occurred. |
Facts: This is a list of maps with the following data:
| Key | Value |
| --- | --- |
| f | The feed name. |
| v | The value returned by the feed (a 32-bit signed integer). |
| i | The identity that was matched by this feed. |
| d | An optional string data value returned by the feed. |
Combiners: This is a map of maps. The outer map is keyed by feedset or combiner name. The inner map is as follows:
| Key | Value |
| --- | --- |
| v | The value returned by the combiner (for feedsets this is a number in the range -1000 to 1000). |
| d | An optional string explaining the value. |
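Before any of these fields can be read, the response bytes have to be Bencode-decoded. The following is a minimal decoder sketch, not the one used by the Karmasphere libraries; the names `bdecode` and `_decode_at` are assumptions, and byte strings (including map keys) are returned as Python `bytes` rather than text.

```python
def bdecode(data):
    """Decode one complete Bencoded value from a bytes object."""
    value, offset = _decode_at(data, 0)
    if offset != len(data):
        raise ValueError("trailing data after Bencoded value")
    return value

def _decode_at(data, pos):
    """Decode the value starting at pos; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            result[key], pos = _decode_at(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        start = colon + 1
        end = start + int(data[pos:colon])
        if end > len(data):
            raise ValueError("truncated byte string")
        return data[start:end], end
    raise ValueError("unexpected byte at offset %d" % pos)
```

A client that prefers text keys would decode the resulting `bytes` keys and values itself, for example as UTF-8, if it knows the peer sends text.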
Here is a sample response, decoded by Perl and dumped:
{
"_" : 12345,
"facts" :
[
{
"f" : 4000,
"v" : -1,
"d" : "Invalid Source IP Address (cymru)",
"i" : "127.0.0.1"
}
],
"combiners" :
{
"karmasphere.email-sender" :
{
"v" : -1000,
"d" : "<f4000: if-fail(0) => return bad(1.0)>"
}
}
}
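A decoded response like the one above can be walked as in the following sketch. The function `summarize` is hypothetical, and the response is assumed to be a dictionary with text keys, as in the dump; both short and long key forms are checked.

```python
def summarize(response):
    """Return one human-readable line per fact and per combiner result."""
    lines = []
    # Check the short key first, then fall back to the long key.
    facts = response.get("f", response.get("facts", []))
    combiners = response.get("c", response.get("combiners", {}))
    for fact in facts:
        lines.append("feed %s matched %s with value %d" % (fact["f"], fact["i"], fact["v"]))
    for name, result in sorted(combiners.items()):
        lines.append("%s scored %d" % (name, result["v"]))
    return lines
```

For the sample response this gives two lines: one for feed 4000 matching 127.0.0.1 with value -1, and one for karmasphere.email-sender scoring -1000.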
Here’s a sample error response, decoded by Perl and dumped:
{
"_" : 1655274485,
"error" : 1,
"message" : "java.lang.IllegalArgumentException: Not an IP4 address: 'wrong': Wrong number of octets: 1"
}
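Since other fields may be absent from an error response, a client should test the error flag before reading facts or combiners. A minimal sketch, in which the exception class `BQueryError` and the function `raise_on_error` are hypothetical names:

```python
class BQueryError(Exception):
    """Raised when a response carries the error flag (hypothetical class)."""

def raise_on_error(response):
    """Return the response unchanged, or raise if its error flag is set."""
    if response.get("error"):
        # The message field is informational and may itself be missing.
        raise BQueryError(response.get("message", "unknown error"))
    return response
```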
BQuery is capable of both UDP and TCP modes. In general, UDP is much faster than TCP, and should be preferred. We recommend that you use UDP by default and only consider TCP for large queries or responses, or if your firewall cannot handle UDP connection tracking. There is currently no automatic failover from UDP to TCP for large queries or responses.
Responses to UDP queries are sent to the originating IP address and port.
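In UDP mode one datagram carries one Bencoded packet, and the response arrives on the socket that sent the query. The sketch below uses Python's standard `socket` module; the helper names, the 2-second timeout, and the 65535-byte receive buffer are assumptions, and the server address must be supplied by the caller because none is given here.

```python
import socket

MAX_DATAGRAM = 65535  # receive buffer size; the real BQuery limit is not stated

def open_udp_client(timeout=2.0):
    """Create a UDP socket that gives up waiting after `timeout` seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    return sock

def send_query(sock, payload, server):
    """Send one already-Bencoded query to server, a (host, port) tuple."""
    sock.sendto(payload, server)

def receive_response(sock):
    """Wait for one datagram and return its (still Bencoded) bytes."""
    data, _sender = sock.recvfrom(MAX_DATAGRAM)
    return data
```

Because UDP gives no delivery guarantee, `receive_response` raises a timeout error from the socket when no reply arrives, and the caller decides whether to retry.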
In TCP mode, each packet, whether query or response, MUST be prefixed with a 4-byte field containing the length of the encoded data in network byte order. Responses to TCP queries are sent back over the TCP connection.
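The TCP length prefix can be produced and consumed with Python's `struct` module, as in this sketch. The function names are hypothetical, and the length is assumed to be an unsigned 32-bit integer, since only its size and byte order are specified.

```python
import socket
import struct

def frame(payload: bytes) -> bytes:
    """Prefix the encoded data with its length as 4 bytes, network byte order."""
    return struct.pack("!I", len(payload)) + payload

def _recv_exactly(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes, since recv may return fewer than requested."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed mid-packet")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def read_frame(sock: socket.socket) -> bytes:
    """Read one length-prefixed packet and return the encoded data."""
    (length,) = struct.unpack("!I", _recv_exactly(sock, 4))
    return _recv_exactly(sock, length)
```

For example, `frame(b"abc")` is `b"\x00\x00\x00\x03abc"`.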
Multiple queries can be sent with each TCP connection. However, if multiple queries are sent without waiting for each response individually, the order of the responses may differ from the order of the queries. Clients are advised to use cookies in order to reassociate each response with its query. The standard C libkarmaclient library contains packet reordering code for an asynchronous TCP mode.
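Reassociating out-of-order responses with their queries comes down to a table keyed by cookie. The class below is an illustrative sketch only, unrelated to the reordering code in libkarmaclient; the class name, its methods, and the use of a simple counter for cookies are all assumptions.

```python
import itertools

class PendingQueries:
    """Track outstanding queries by cookie (hypothetical helper)."""

    def __init__(self):
        self._cookies = itertools.count(1)
        self._pending = {}

    def register(self, query):
        """Assign a fresh cookie to the query map and remember it."""
        cookie = next(self._cookies)
        query["_"] = cookie
        self._pending[cookie] = query
        return cookie

    def resolve(self, response):
        """Return the query that produced this response and forget it."""
        return self._pending.pop(response["_"])

    def outstanding(self):
        """Number of queries still waiting for a response."""
        return len(self._pending)
```

A response whose cookie is unknown raises `KeyError` here; a real client would decide whether to log or discard such packets.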
Karmasphere provides a number of developer libraries which implement the BQuery protocol.