

eBPF and XDP support is one of the latest evolutions of the Suricata engine’s performance capabilities. These two technologies were introduced recently in the Linux kernel, and Suricata is one of the first well-established, mature projects to make use of them. eBPF was introduced in the Linux kernel to safely run user-provided code inside the kernel, and XDP runs eBPF code on the network data path, as close as possible to the Network Interface Card.

Initial support for eBPF and XDP became available in Suricata 4.1, released in November 2018, and it has been greatly enhanced in Suricata 5.0. The development team at Stamus Networks, led by Éric Leblond, has been the primary developer of eBPF and XDP support within Suricata. The latest additions, the list of features, and the potential use cases enabled by eBPF and XDP are becoming more significant, even vital, making this an opportune time to provide a high-level overview for the community. That is the purpose of this article, written by Éric Leblond and Peter Manev.

Download the whitepaper: Introduction to eBPF and XDP support in Suricata

Stamus Networks

Stamus Networks has created a Network Traffic Analyzer (NTA), which uses network communications as a foundational data source for detecting security events, and married it with an Intrusion Detection System (IDS), which provides deep packet inspection of network traffic using a rules based engine. The combination of these two approaches, done simultaneously in a single solution, provides a level of correlated data never previously achieved. We then added a Threat Hunting interface to allow security practitioners the ability to quickly and efficiently search through this security data to examine, validate, and resolve the security incidents that they face on a daily basis.


Suricata is a free and open source, mature, fast and robust network threat detection engine.
The Suricata engine is capable of real time intrusion detection (IDS), inline intrusion prevention (IPS), network security monitoring (NSM) and offline pcap processing.
Suricata inspects the network traffic using a powerful and extensive rules and signature language, and has powerful Lua scripting support for detection of complex threats.

Suricata’s fast paced community driven development focuses on security, usability and efficiency. The Suricata project and code is owned and supported by the Open Information Security Foundation (OISF), a non-profit foundation committed to ensuring Suricata’s development and sustained success as an open source project.


Suricata 4.0 is out, and the switch from 3.x to 4.x is not marketing-driven: the changes really are important. This post is not an exhaustive list of the changes; it is Stamus Networks’ take on some of the important ones introduced in this version.

Rust addition

This is the big step forward on the technology side. Suricata is written in the C language, which gives performance and good control over memory. But it comes with a series of well-known problems: buffer overflows, use-after-free bugs, and so on.

And the worst part is that Suricata parses traffic content, which is a kind of supercharged hostile user input. If one should not trust user input, guess how careful we should be with network traffic. At Suricon 2016, Pierre Chifflier presented a proof-of-concept implementation of protocol parsers in Rust. The idea is to exploit the fact that Rust has been designed to eliminate entire classes of memory-handling attacks. But there is more to the approach, as the implementation uses Nom, a Rust parser combinator framework. It allows you to write protocol parsers easily and in a reusable way. Thus the addition of Rust brings two things at the same time: more security and easier code. Which means a lot of new protocols should be added in the near future.

Suricata 4.0 Rust support comes with NFS, DNS and NTP. NTP support is implemented via an external crate (read: library): ntp-parser.

As mentioned before, the code uses Nom, and the syntax is very different from traditional code. For instance, here is the code of ntp-parser parsing an NTP extension:

named!(pub parse_ntp_extension<NtpExtension>,
    do_parse!(
           ty: be_u16
        >> len: be_u16 // len includes the padding
        >> data: take!(len)
        >> (NtpExtension { field_type: ty, length: len, value: data })
    )
);

This defines a parsing function that reads the stream of data. The code says: take 16 bits and store them as an unsigned integer in ty. Then store the next 16 bits as an unsigned integer in len. Then store in data a chunk of data of length len. And with that, build an NTP extension structure. The writing is concise and efficient, but the best thing about Nom is under the hood: Nom takes care of detecting invalid input. For instance, we could have a chunk of data of length 50 with len set to 1000 (remember Heartbleed?). Nom will see that there is not enough data available in the chunk and return that it wants more data.
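To make that behavior concrete, here is a minimal Python sketch of the same type/length/value idea, with the length check Nom performs for us. Function and field names here are illustrative, not part of ntp-parser:

```python
# Parse a ty (u16) / len (u16) / value record and refuse to read past
# the available data, mirroring what Nom does for the Rust parser above.
import struct

def parse_ntp_extension(buf: bytes):
    """Returns ('ok', ext, rest) or ('incomplete', bytes_needed)."""
    if len(buf) < 4:
        return ("incomplete", 4 - len(buf))
    ty, length = struct.unpack(">HH", buf[:4])
    if len(buf) - 4 < length:
        # Heartbleed-style case: the announced length exceeds the data
        # actually present, so we ask for more instead of over-reading.
        return ("incomplete", length - (len(buf) - 4))
    ext = {"type": ty, "length": length, "value": buf[4:4 + length]}
    return ("ok", ext, buf[4 + length:])

# A well-formed extension parses cleanly...
print(parse_ntp_extension(b"\x00\x01\x00\x02ab"))
# ...while a lying length field (1000) only yields a request for more data.
print(parse_ntp_extension(b"\x00\x01\x03\xe8ab"))
```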

Better alerts

As you may know, the preferred output of Suricata is the EVE JSON format. It is flexible, easy to extend and easy to read for both humans and tools. Suricata 4.0 introduces some major changes here:

  • ‘vars’ extraction mechanism
  • The new target keyword
  • HTTP bodies logging

HTTP body output

Suricata is able to decompress HTTP bodies on the fly and match on the uncompressed content. This means that if you look at the payload of the stream triggering an alert in your event, you will just see compression noise and won’t be able to analyze why the alert was triggered. Suricata is now able to include the HTTP bodies in the alert, so the analyst can see directly from the event the content that triggered the alert.

The following event shows how payload_printable is pure compression noise while http_response_body_printable is readable:
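For reference, body logging is controlled from the alert section of the eve-log output in suricata.yaml. A sketch, with option names as found in recent configs (check your own suricata.yaml for the exact spelling):

```yaml
- eve-log:
    enabled: yes
    filename: eve.json
    types:
      - alert:
          # include the (decompressed) HTTP bodies with each alert
          http-body: yes            # Base64-encoded body
          http-body-printable: yes  # printable, lossy rendering of the body
```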

Target keyword

The new target keyword is a fix for a very old problem: it was not possible to know from an alert event which of the source or destination is the target of the attack, and that lack of information prevents automation. The target keyword allows the rule writer to specify which side is the target, so automated analysis and better visualization can be built.

Usage is simple: the signature has to contain the target keyword with the value dest_ip or src_ip. For example, in a simple scan alert we have:

alert tcp $EXTERNAL_NET any -> $HOME_NET 3306 (msg:"ET POLICY Suspicious inbound to mySQL port 3306"; flow:to_server; flags:S; threshold: type limit, count 5, seconds 60, track by_src; reference:url,; classtype:bad-unknown; target: dest_ip; sid:2010937; rev:2;)

If target is present in a signature, alert.source and fields are added to the alert event:
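The added objects look like this (a hand-written sketch; the addresses and ports are illustrative, not taken from a real event):

```json
"alert": {
    "signature_id": 2010937,
    "signature": "ET POLICY Suspicious inbound to mySQL port 3306",
    "source": {
        "ip": "",
        "port": 35621
    },
    "target": {
        "ip": "",
        "port": 3306
    }
}
```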

For example, on a visualization where nodes are IP addresses and links are alerts between them, we can get an idea of the possible compromise paths. With the target addition, we can switch from a non-oriented graph:

To an oriented graph that shows which paths were really possible:

If you know French, you can learn more about this subject in Éric Leblond’s talk at SSTIC 2017.

Vars extraction

This is one of the most anticipated features of Suricata 4.0. It has been described by Victor Julien in an extensive blog post. The concept is to be able to define, in a signature, data to extract and store in key-value form. There are a lot of possible usages, ranging from application version extraction to capturing exfiltrated data. For example, suppose there is a domain we are interested in. One useful piece of information is the list of email addresses mail is sent to. To do so we can use the following signature:

alert smtp any any -> any any (msg:"Mail to stamus"; content:"rcpt to|3A|"; nocase; content:""; within: 200; fast_pattern; pcre:"/^RCPT TO\x3a\s*<([\w-\.]+)>/ism, pkt:email"; flow:established,to_server; sid:1; rev:1;)

The magic here is the group ([\w-\.]+) in the regular expression, which is saved into a packet var named email by the pkt:email statement in the pcre keyword.

Using that signature we get this kind of alerts:

The key point here is the vars sub object:

  "vars": {
    "pktvars": [
      {
        "email": ""
      }
    ]
  }

We have an extraction of the data, and it can easily be searched with tools like Elasticsearch or Splunk.


Suricata 4.0 is really an important milestone for the project. The introduction of Rust opens a really interesting path, and the alert improvements may change the way signatures are written, helping provide really accurate information to analysts.

Suricata 4.0 is already available in SELKS and it will be available in Stamus Probe by the end of August. To conclude on a personal note, we at Stamus Networks are really happy to have contributed to this release with features such as HTTP body logging and the target keyword.



Stamus Networks has been working on a new Suricata feature named bypass. It has just been merged into the Suricata sources and will be part of the upcoming 3.2 release. The Stamus team initially presented its work on the Suricata bypass code at Netdev 1.1, the technical conference on Linux networking that took place in Seville in February 2016.

In most cases an attack happens at the start of a TCP session, and generating requests prior to the attack is not common. Furthermore, multiple requests are often not even possible on the same TCP session. Suricata reassembles TCP sessions up to a configurable size (stream.reassembly.depth, in bytes). Once the limit is reached, the stream is no longer analyzed.
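The depth is set in the stream section of suricata.yaml. A sketch (the value shown is just an example; the default may differ in your version):

```yaml
stream:
  reassembly:
    depth: 1mb        # reassemble at most 1 MB per stream; 0 disables the limit
```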

Considering that Suricata is no longer really inspecting the traffic, it could be interesting to stop receiving packets for a flow that enters that state. This is the main idea behind bypass.

A second use case consists of doing the same with encrypted flows. Once Suricata sees that a flow is encrypted, it stops inspecting it, so the packets of these flows can be bypassed the same way as packets past the stream depth.

In some cases, network traffic is mostly due to sessions we don’t really care about from a security standpoint. This is, for example, the case for Netflix or Youtube traffic. This is why we have added the bypass keyword to the Suricata rules language. A user can now write a signature using this keyword, and all packets of the matching flow will be bypassed. For instance, to bypass all traffic to the Stamus Networks website, one can use:

alert http any any -> any any (msg:"Stamus is good"; content:""; http_host; bypass; sid:1; rev:1;)

This is of course just an example and, as you may have seen, our website is served over HTTPS only.

Currently, the Netfilter IPS mode is the only capture method supporting bypass. The Stamus team, represented by Éric Leblond, will be at Netdev 1.2, in the first week of October 2016, to present an implementation of bypass for the Linux AF_PACKET capture method based on the extended Berkeley Packet Filter.

And if you can’t make it to Japan, you will have another chance to hear about it during Suricon, the Suricata user conference that will take place in Washington, DC at the beginning of November.

Suricata bypass concepts

Suricata bypass techniques

Suricata now implements two bypass methods:

  • A Suricata-only bypass, called local bypass
  • A capture-handled bypass, called capture bypass

The idea is simply to stop processing packets of a flow we no longer want to inspect, as early as possible. Local bypass does this internally, while capture bypass uses the capture method to do so.

Test with iperf on localhost with a MTU of 1500:

  • standard IPS mode: 669Mbps
  • IPS with local bypass: 899Mbps
  • IPS with NFQ bypass: 39 Gbps

Local bypass

The concept of local bypass is simple: Suricata reads a packet, decodes it, and looks it up in the flow table. If the corresponding flow is locally bypassed, it simply skips all streaming, detection and output, and the packet goes directly out in IDS mode or to verdict in IPS mode.

Once a flow has been locally bypassed, a specific timeout strategy is applied to it. The idea is that we can’t cleanly detect the end of the flow, since we are no longer doing stream reassembly, so Suricata simply times the flow out when it sees no more packets. As the flow is supposed to be genuinely active, we can set a timeout shorter than the established timeout; that’s why the default value is equal to the emergency established timeout value.
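The packet path and timeout strategy described above can be sketched like this; names and timeout values are illustrative examples, not Suricata’s actual C code:

```python
# Illustrative sketch of the local-bypass packet path and timeout strategy.
import time

TIMEOUT_ESTABLISHED = 300   # normal established-flow timeout (seconds)
TIMEOUT_BYPASSED = 20       # shorter timeout used once a flow is bypassed

class Flow:
    def __init__(self):
        self.local_bypassed = False
        self.last_seen = time.monotonic()

def run_stream_detect_output(packet):
    # Stand-in for the full streaming / detection / output pipeline.
    return "inspected"

def handle_packet(flow, packet, ips_mode=False):
    flow.last_seen = time.monotonic()
    if flow.local_bypassed:
        # Skip streaming, detection and output entirely: the packet goes
        # straight out (IDS) or straight to verdict (IPS).
        return "verdict" if ips_mode else "out"
    return run_stream_detect_output(packet)

def flow_timed_out(flow, now):
    # Bypassed flows use the shorter timeout, since no more packets are
    # expected to be inspected for them.
    limit = TIMEOUT_BYPASSED if flow.local_bypassed else TIMEOUT_ESTABLISHED
    return now - flow.last_seen > limit
```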

Capture bypass

In capture bypass, when Suricata decides to bypass a flow, it calls a function provided by the capture method to declare the bypass in the capture. For NFQ this is a simple mark that will be used by the ruleset. For AF_PACKET this is a call that adds an element to an eBPF hash table stored in the kernel.

If the call to capture bypass is successful, we set a short timeout on the flow to give already queued packets time to get out of Suricata without creating a new entry; once the timeout is reached, we remove the flow from the table and log the entry.

If the call to capture bypass is not successful then we switch to local bypass.
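This decision and its fallback can be sketched as follows; the capture_bypass callback stands in for the NFQ mark or the AF_PACKET eBPF map insertion, and all names are illustrative:

```python
# Try to push the bypass down to the capture layer; if the capture method
# cannot do it (or fails), fall back to Suricata's own local bypass.
def bypass_flow(flow, capture_bypass=None):
    if capture_bypass is not None and capture_bypass(flow):
        # Success: keep the flow just long enough for already queued
        # packets to drain, then remove and log the flow table entry.
        flow["state"] = "capture_bypassed"
        flow["timeout"] = 5  # short drain timeout, value illustrative
    else:
        flow["state"] = "local_bypassed"
    return flow["state"]

# Simulated AF_PACKET case: a successful bypass stores the flow key in a
# hash table (in the real implementation, an eBPF map in the kernel).
kernel_map = set()
def ebpf_insert(flow):
    kernel_map.add(flow["key"])
    return True

print(bypass_flow({"key": ("", "", 6)}, ebpf_insert))
print(bypass_flow({"key": None}, capture_bypass=lambda f: False))
```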

The difference between local and capture bypass

When Suricata is used with capture methods that do not offer the bypass functionality of eBPF/NFQ mark – pcap, netmap, pfring – it will switch to local bypass mode as explained above. Bypass is available for Suricata’s IDS/IPS and NSM modes alike.

Handling capture bypass failure

Due to misconfiguration or other unknown problems, it is possible that a capture-bypassed flow still sends us packets. In that case, Suricata switches the flow back to local bypass so it is handled more correctly.


Suricata stats in EVE JSON format

Suricata 3.0 will come with a lot of improvements on the output side. One of them is the ability to output Suricata internal statistics in the EVE JSON format.

Stats event in EVE JSON format

This functionality is already used by Scirius to display statistics graphs of Suricata running in SELKS, Amsterdam or Stamus Networks’ appliances:

Stats in scirius

These statistics sometimes help visualize the impact of a configuration change. For example, in the next screenshot, generic receive offload on the capture interface was disabled at 23:33:

Impact of iface offloading

The impact is crystal clear, as the invalid-decoding counter stopped increasing.

Using Kibana Timelion plugin

Amsterdam comes with Kibana 4 and the Timelion plugin preinstalled. Timelion is a plugin providing a new interface and language to graph timelines.

As Suricata stats data are fed into Elasticsearch, we can use it to graph Suricata performance data.

For example to graph DNS and HTTP memory usage, one can use the following syntax:

.es(metric='avg:stats.dns.memuse').label('DNS') .es(metric='avg:stats.http.memuse').label('HTTP')

Result is the following graph:

If you have a counter and want to graph rate, then you can use:

.es(metric='avg:stats.capture.kernel_packets').derivative().label('PPS') .es(metric='avg:stats.capture.kernel_drops').derivative().label('Drops')

And you get the following graph:


One interesting thing with Timelion is that you can use a Lucene query to get a count of something really easily. For example, to get a view of the rate of different event types, one can use:

.es(q='event_type:http') .es(q='event_type:tls') .es(q='event_type:dns')

Rate of different event types

Both methods can be mixed, so if you have different probes (say probe-1 and probe-2) you can do something like:

.es(q='host.raw:"probe-1"', metric='avg:stats.dns.memuse').label('Probe-1 DNS') .es(q='host.raw:"probe-2"', metric='avg:stats.dns.memuse').label('Probe-2 DNS')


The new Suricata statistics output really improves the information we can use when doing performance analysis. Combined with Timelion, we get a really easy and powerful solution. If you want to give all these technologies a try, one of the easiest ways is to use Amsterdam, which comes with the latest Suricata and a preinstalled Timelion.



This is a short tutorial on how you can detect, and store to disk, self-signed TLS certificates with Suricata IDPS and Lua (LuaJIT scripting).

What does a self-signed TLS certificate mean? The quick version from Wikipedia: a “certificate that is signed by the same entity whose identity it certifies” – in other words, anyone can create and deploy such a certificate. These events are a sign of some poorly set up TLS servers, and that’s why it is good to keep an eye on them in your network – for contingency monitoring at the very least.

TLS support in Suricata allows you to match on the TLS subject, the TLS issuer DN and things like the TLS fingerprint. For instance, one can use

alert tls any any -> any any (msg:"forged ssl google user";
            tls.issuerdn:!"CN=Google-Internet-Authority"; sid:8; rev:1;)

to detect that the TLS issuer DN is not the one we expected for a given TLS subject. But there is no way to compare two different fields. So how do we catch all those self-signed certificates without knowing any details about them and/or any network/domain/port specifics? And we want them all caught and stored to disk!

This case is one example where the Lua support in Suricata IDPS shines – more than shines, actually, because it empowers you to do much more than chasing packet bytes with rule keywords and PCREs inside a rule – and all of those still deliver limited functionality in this particular scenario.

Lua and Suricata

Since version 2.0, Suricata has support for Lua scripting. The idea is to be able to decide whether an alert matches based on the return value of a Lua script. The script takes some fields extracted by Suricata (the magic of it all) as parameters and returns 1 in case of a match and 0 otherwise. Lua scripting allows rules to implement complex logic that would be impossible with the standard rule language. For instance, Victor Julien was able to write a performant Heartbleed detection with Lua scripting on the afternoon of the very day the problem/exploit was announced.

The syntax is the following: you prefilter with the standard signature language, and you add a lua keyword whose parameter is the script to run in case of a partial match:

alert tls any any -> any any ( \
    msg:"TLS HEARTBLEED malformed heartbeat record"; \
    content:"|18 03|"; depth:2; lua:tls-heartbleed.lua; \
    classtype:misc-attack; sid:3000001; rev:1;)

For more information on Suricata Lua scripting, please read the documentation on how to write Lua scripts for Suricata.

For self-signed certificate detection, you need to write a script – shall we say “self-signed-cert.lua” – and save it in your /etc/suricata/rules directory. Then you can use it in a rule like so:

alert tls any any -> any any (msg:"SURICATA TLS Self Signed Certificate"; \
  flow:established; luajit:self-signed-cert.lua; \
  classtype:protocol-command-decode; sid:999666111; rev:1;)

Now let us explain that in a bit more detail below.

At the time of this writing we are using this branch in particular: TLS Lua rebased. It is going to be merged into the Suricata git (latest dev) soon and later into the beta and stable editions.

You need to make sure Suricata is compiled with Lua enabled:

# suricata --build-info
This is Suricata version 2.1dev (rev b5e1df2)
Prelude support:                         no
PCRE jit:                                yes
LUA support:                             yes
libluajit:                               yes
libgeoip:                                yes

In suricata.yaml make sure the tls section is enabled:

# a line based log of TLS handshake parameters (no alerts)
- tls-store:
  enabled: yes  # Active TLS certificate store.
  certs-log-dir: certs # directory to store the certificates files

We have the following rule file (self-sign.rules) content located in /etc/suricata/rules/ :

alert tls any any -> any any (msg:"SURICATA TLS Self Signed Certificate"; \
  flow:established; luajit:self-signed-cert.lua; \
  classtype:protocol-command-decode; sid:999666111; rev:1;)

Make sure you add the rule to your rules files loaded in suricata.yaml and that you copy the associated lua script in the same directory. Here you can find the self-signed-cert.lua script.

Then you can start Suricata IDPS the usual way you always do.

The active part of the lua script is the following:

function match(args)
    version, subject, issuer, fingerprint = TlsGetCertInfo();
    if subject == issuer then
        return 1
    else
        return 0
    end
end

When Suricata sees a TLS handshake (regardless of IP/port), it runs the Lua script. This one uses the fact that equality between the subject and issuer DN characterizes most self-signed certificates. When it finds such an equality, it returns 1 and an alert is generated.

This script shows really basic but useful code. However, you can use the full power of Lua with the given info to do whatever you want/need.

The result


Extracted self signed SSL certificate

Some metadata about the certificate (in /var/log/suricata/certs/ you will find the .meta and .pem files):

TIME:              07/14/2015-14:45:16.757001
PROTO:             6
SRC PORT:          49966
DST PORT:          443
TLS SUBJECT:       C=FR, ST=IDF, L=Paris, O=Stamus, CN=SELKS
TLS ISSUERDN:      C=FR, ST=IDF, L=Paris, O=Stamus, CN=SELKS
TLS FINGERPRINT:   80:e7:af:49:c3:fe:9a:73:78:29:6b:dd:fd:28:9e:d9:c9:15:3e:18


Furthermore, from the alert info in JSON format (the /var/log/suricata/eve.json log) we also get extra TLS info for the generated alert:

"alert": {"action":"allowed","gid":1,"signature_id":999666111,"rev":1,"signature":"SURICATA TLS Self Signed Certificate","category":"Generic Protocol Command Decode","severity":3},
"tls": {"subject":"C=FR, ST=IDF, L=Paris, O=Stamus, CN=SELKS","issuerdn":"C=FR, ST=IDF, L=Paris, O=Stamus, CN=SELKS","fingerprint":"80:e7:af:49:c3:fe:9a:73:78:29:6b:dd:fd:28:9e:d9:c9:15:3e:18","version":"TLS 1.2"}

To get the extra TLS info in the JSON alert you need to enable it in the suricata.yaml like so:

  # Extensible Event Format (nicknamed EVE) event log in JSON format
  - eve-log:
      enabled: yes
      filetype: regular #regular|syslog|unix_dgram|unix_stream
      filename: eve.json
      types:
        - alert:
            #payload: yes           # enable dumping payload in Base64
            #payload-printable: yes # enable dumping payload in printable (lossy) format
            #packet: yes            # enable dumping of packet (without stream segments)
            http: yes              # enable dumping of http fields
            tls: yes               # enable dumping of tls fields <<---
            ssh: yes               # enable dumping of ssh fields

so you can get all the JSON data in Kibana/Elasticsearch as well.


If you would like to learn more about what you can do with Lua and Suricata IDPS, the links below will give you a good start:


Some words about PRscript

PRscript is a script that runs a series of builds and tests on a given branch. It was reserved for some developers so they could check the quality of their work before submission. The test builds run on the Suricata buildbot, which is composed of several dedicated hardware systems. Buildbot is an open-source framework for automating software build, test, and release processes. In the case of the Suricata instance, it is set up to run various builds and unit tests as well as functional tests (such as pevma’s regression script).

The fact that this script was reserved for some users was a limitation, as many contributors are not registered as Suricata buildbot users. Likewise, the fact that the code had to be public was not convenient, as you could have to expose code before it was ready (with shameful TODOs inside). Another point is that you were not able to customize your build. For instance, if you were introducing a new library as a dependency, it was not possible to test it before a global modification of the buildbot.

PRscript with docker support

To get over these limitations, Victor Julien and I discussed using Docker to allow developers to simply run a Suricata-dedicated buildbot. As you may/should already know, Docker is an open platform for distributed applications for developers and sysadmins. It allows you to quickly install, manage and run containers. In our case, the idea was to start a pre-configured buildbot container using your local git tree as the reference code. This way you can simply run test builds on your private code.

So, I worked on this Docker-based buildbot installation dedicated to Suricata, and it has been merged into Suricata mainstream by Victor Julien.

It is now possible to use the prscript locally via Docker. Installation has been made simple, so you should only have a few commands to run before being ready.

The buildbot will run various builds (gcc and clang, different build options) and run Suricata against some pcaps to check for possible crashes.




You need to have docker and python-docker installed on your system. Optionally you can install pynotify to get desktop notification capability. On recent Debian-based distributions you can use:

sudo apt-get install docker python-docker python-notify

Create the container

This operation only has to be done once. From the root of the Suricata sources, run:

sudo qa/ -C

It will take some time, as the download is several hundred MB. The result will be a docker container named ‘suri-buildbot’.

Using the buildbot

Start the buildbot

When you need to use the buildbot, you can start it from the command line:

sudo qa/ -s

You can check it is running via:

sudo docker ps

And you can connect to the buildbot web interface via http://localhost:8010

Start a build

Once the buildbot is active, you can start a build:

qa/ -d -l YOUR_BRANCH

This will start a build of the local branch YOUR_BRANCH without requiring any connectivity.

To get warned of the result of the builds via a desktop notification:

qa/ -d -l YOUR_BRANCH -n

Stop the buildbot

When you don’t need the buildbot anymore, you can stop it from the command line

 sudo qa/ -S

For further details, check Suricata docker QA page on OISF redmine.

Advanced usage

Build customisation

The buildbot makes Suricata read all the pcap files available in qa/docker/pcaps/, so you can use this directory to add your own test pcaps.

Buildbot configuration is stored inside your suricata sources. It is the file qa/docker/buildbot.cfg. So, you can change the Buildbot configuration by editing this file. Then stop and start the docker container to get the new version used. This can be for example used when you need to add a flag to the configure command to activate a new feature.

What is great about this Docker way of doing things is that it easily solves some complex points. For instance, if the buildbot configuration came from the Docker image, it would not be possible to edit it easily, and developers would lose any changes made in case of an image upgrade. Also, the configure flags used by the buildbot will always reflect the current state of the code, so there will be no issue running builds even on older code, as your buildbot configuration will be synchronized first.

Connect via ssh

The docker instance can be accessed via SSH using the admin account (the password being ‘admin’ too). To get the port to use for SSH, run the following command:

$ sudo docker port suri-buildbot
22/tcp ->
8010/tcp ->

and then connect:

ssh admin@localhost -p 49156

This can be used to install new dependencies inside the container. For instance if you are introducing a new library in Suricata, you may have to install the library in the docker instance.

Customizing the Docker image

On the Docker side, the build recipe is available from GitHub. Feel free to modify it or propose updates and fixes.


Conky is a cool, lightweight desktop monitoring tool. SELKS comes with a ready-to-use Conky config (also part of the selks-scripts-stamus package).

With the installation of Conky, you get the possibility to do system monitoring on your desktop. Out of the box there are no Conky config files that are very useful for SELKS; that is why we created one. The trick is that we used some info Conky is capable of reading by itself for the system in general, but we added some stats harvested from the Suricata unix socket on the SELKS distro. That way you get runtime, capture method, running mode and Suricata version right on your desktop, among other system stats like memory/CPU/network usage.

The Conky config itself is already installed in /etc/conky/conky.conf, but it is also present at /opt/selks/Scripts/Configs/Conky as part of the selks-scripts-stamus package, for record keeping (backup).

So if you are using the Desktop edition of SELKS, you can use Conky easily by running:

conky -d

in a terminal. Then you can close the terminal if you wish. That’s all 🙂


The SELKS Conky config is best utilized when used with a screen resolution of 1680 x 1050 or more.

You can get further ideas for Conky configs – just google “conky templates”; there is plenty of stuff out there.




Elasticsearch and Kibana are wonderful tools, but as with all tools you need to know their limits. This article explains how careful you must be when reading the data, and how to improve the situation by using an existing Elasticsearch feature.

The Problem

It all started with the analysis of an SSH bruteforce attack coming from Vietnam. This attack was interesting because of the announced SSH client, “PuTTY-Local: Mar 19 2005 07:19:17”, which really looks like a correct PuTTY software version, whereas most attacks don’t spoof their software version and thus reveal what they are using.

The Kibana dashboard showed all the information needed to get a good idea of the attacks:


But when looking at the least used and most used passwords, there was something really strange:


For example, webmaster is seen in the two panels with different values which is not logical.

By adding a filter on this value, the result was a bit surprising:


When looking at the detail of the events, it was obvious this last result was correct. This SSH bruteforce had tried 10 different logins and had always used the same dictionary of 23 passwords.

To a solution

So the panels with the top passwords and least seen passwords display incorrect data in some circumstances. They have been set up in Kibana using the terms type.

This corresponds in Elasticsearch to a facets query. Here is the content of the query, with the filter removed for readability:

 "facets": {
    "terms": {
      "terms": {
        "field": "password.raw",
        "size": 10,
        "order": "count",
        "exclude": []
      }
    }
  }

So we have a simple request and it is not returning the correct data. The explanation of this problem can be found in Elasticsearch Issue #1305.

Adrien Grand explains that an algorithm returning possibly inaccurate values was chosen to avoid an overly memory- and network-intensive search. The default algorithm is mainly wrong when there are more distinct values than the number of values asked for.
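The effect is easy to reproduce with a few lines of Python: each shard reports only its local top-N, and a term that is frequent overall but never in any single shard’s top-N simply disappears from the merged result. The password lists are made up for the demonstration:

```python
# Simulate the per-shard top-N merge that the default terms facet performs.
from collections import Counter

def sharded_top_terms(shards, size):
    merged = Counter()
    for shard in shards:
        # each shard only reports its local top-`size` terms
        merged.update(dict(Counter(shard).most_common(size)))
    return merged.most_common(size)

# "webmaster" is the most frequent password overall (6 hits), but it is
# only third on each shard, so it never reaches the coordinating node.
shard1 = ["admin"] * 5 + ["root"] * 4 + ["webmaster"] * 3
shard2 = ["test"] * 5 + ["oracle"] * 4 + ["webmaster"] * 3

exact = Counter(shard1 + shard2).most_common(2)
approx = sharded_top_terms([shard1, shard2], size=2)
print(exact)   # "webmaster" leads the exact result with a count of 6
print(approx)  # "webmaster" is missing entirely from the sharded result
```

Asking each shard for more terms than we display (as the article does next with 30 values) restores the correct answer.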

We can confirm that behavior in our case by asking for 30 values (on the 23 different passwords we have):


The result is correct this time.

If we continue reading Adrien Grand’s comment on the issue, we see that a shard_size parameter has been introduced to be able to improve the algorithm’s accuracy.
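Applied to the facets query shown earlier, the request would look like this (the shard_size value of 100 is just an illustrative choice):

```json
"facets": {
    "terms": {
      "terms": {
        "field": "password.raw",
        "size": 10,
        "shard_size": 100,
        "order": "count",
        "exclude": []
      }
    }
  }
```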

So we can use this parameter to improve the accuracy of the queries. Patching this into Kibana is trivial:

diff --git a/src/vendor/elasticjs/elastic.js b/src/vendor/elasticjs/elastic.js
index ba9c8ee..8daa72a 100644
--- a/src/vendor/elasticjs/elastic.js
+++ b/src/vendor/elasticjs/elastic.js
@@ -3085,6 +3085,7 @@
         facet[name].terms.size = facetSize;
+        facet[name].terms.shard_size = 10 * facetSize;
         return this;

Here we just choose a shard_size far larger than the number of elements asked for in the query. We could also have used the special value 0 (or Integer.MAX_VALUE) for shard_size to get a perfect result, but in our test setup Elasticsearch failed to honor the request with this parameter. And furthermore, the result was already correct:


This patch has been proposed to Elasticsearch as PR 2106.

That was a small patch, but it fixed our dashboard, as the values in the terms panels are now correct:
