CSV to Elasticsearch: replacing Excel with Kibana

Posted on November 25, 2018 in Python

Last night, I was asked if I could set up a frontend to compute some stats out of a CSV, in a more interactive and collaborative way than Excel.

I was first asked to build a small Django project, since it was my go-to technology at the time.

But for this need, Elasticsearch was a perfect fit, and Kibana spared me from developing any frontend, saving a lot of time.

The CSV export looked like this, except the real one had at least 30 columns and 200k lines:

Column1;Column2;Column3
1;2;3
a;b;c
blih;blah;bluh

Pushing the CSV to Elasticsearch with Python

Elasticsearch requires JSON documents, so the first step was to convert the CSV to JSON.

Instead of writing a CSV-to-JSON parser myself, I used the pandas library, which makes the whole process a lot easier and faster (the CSV file had hundreds of thousands of lines).

And by looking at the official Elasticsearch Python SDK, I realized I just needed to transform the whole CSV into a list of dicts.

import sys
import pandas as pd
import argparse
from elasticsearch import Elasticsearch, helpers

def main():
    parser = argparse.ArgumentParser(description='Push a CSV file to Elasticsearch.')
    parser.add_argument('filename', type=str,
            help='filename to parse')
    parser.add_argument('index', type=str,
            help='index name to use')
    args = parser.parse_args()

    filename = args.filename
    index_name = args.index

    # initiate Elasticsearch connection
    es = Elasticsearch()

    # parse the csv with pandas
    df = pd.read_csv(filename, sep=';', error_bad_lines=False)
    # trim whitespace from string columns
    data_frame_trimmed = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
    # replace `NaN` values with an empty string
    data_frame_trimmed = data_frame_trimmed.fillna('')
    # transform the whole data frame into a huge python dict
    records = data_frame_trimmed.to_dict(orient='records')

    # use bulk actions to push the data
    actions = []
    for i, r in enumerate(records):
        actions.append({"_index": index_name,
                        "_type": "vuln",
                        "_id": i,
                        "_source": r})
    ret = helpers.bulk(es, actions=actions)
    print(ret)

if __name__ == '__main__':
    main()
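
Assuming the script is saved as csv2es.py (the name is hypothetical), pushing the export into a dated index looks like this:

$ python csv2es.py export.csv vuln_2018-11-25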

And voilà!

$ curl http://localhost:9200/vuln_2018-11-25/_search?pretty | jq -r .
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "vuln_2018-11-25",
        "_type": "vuln",
        "_id": "0",
        "_score": 1,
        "_source": {
          "Column1": 1,
          "Column2": 2,
          "Column3": 3
        }
      },
      {
        "_index": "vuln_2018-11-25",
        "_type": "vuln",
        "_id": "1",
        "_score": 1,
        "_source": {
          "Column1": "a",
          "Column2": "b",
          "Column3": "c"
        }
      },
      {
        "_index": "vuln_2018-11-25",
        "_type": "vuln",
        "_id": "2",
        "_score": 1,
        "_source": {
          "Column1": "blih",
          "Column2": "blah",
          "Column3": "bluh"
        }
      }
    ]
  }
}

After that, you just need to install Kibana and enjoy your graphs, tables, dynamic filters and so on, in a much more collaborative way.



Salt

Posted on April 24, 2014 in System

For a few months now, I've been inclined to test and use Salt Stack. I manage a lot of heterogeneous platforms, but each one is composed of similar machines that do the same things.

For example, once every three months, I'm asked to install new packages or configure a new printer on the desktop machines of our datacenter's collaborators. What a great use case :)

Introduction

Salt is like Puppet and Chef, which are also deployment and automation tools, but I find it more lightweight.

Installation

It seems that Salt Stack is not yet in the official Ubuntu repositories, so you need to add the Salt Stack PPA first.

Things to do on your master host:

apt-get install python-software-properties
add-apt-repository ppa:saltstack/salt

apt-get update
apt-get install salt-master

Things to do on your client host:

apt-get install python-software-properties
add-apt-repository ppa:saltstack/salt

apt-get update
apt-get install salt-minion

By default, a Salt Minion will try to connect to the DNS name "salt"; if the Minion is able to resolve that name correctly, no configuration is needed. If the DNS name "salt" does not resolve, you need to point the Minion at the Master in /etc/salt/minion:

master: 192.168.0.2

Restart everything

Master

/etc/init.d/salt-master restart

Minion

/etc/init.d/salt-minion restart

Communication

Communication between the Master and your Minions is encrypted with AES. But before they can communicate, a Minion's key must be accepted by the Master.

List all keys:

$ salt-key -L
Accepted Keys:
Unaccepted Keys:
NOC1-VTY2
NOC2-VTY2
NOC3-VTY2
NOC4-VTY2
Rejected Keys:

Accept all keys:

$ salt-key -A

Accept one key:

$ salt-key -a NOC1-VTY2

If you list your keys again you should get an output like this:

$ salt-key -L
Accepted Keys:
NOC1-VTY2
NOC2-VTY2
NOC3-VTY2
NOC4-VTY2
Unaccepted Keys:
Rejected Keys:

You can now test the communication between your Master and one or all of your Minions:

$ salt 'NOC1-VTY2' test.ping
NOC1-VTY2:
    True
$ salt '*' test.ping
NOC3-VTY2:
    True
NOC4-VTY2:
    True
NOC1-VTY2:
    True
NOC2-VTY2:
    True
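
Beyond test.ping, the same mechanism runs any execution module remotely; for example, installing a package on every minion (the package name is only an illustration):

$ salt '*' pkg.install htop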

Deployment

Now, I want to be able to add another computer to our NOC team without having to manually push all the configuration (NIS/NFS/packages, etc.).

There are two major pieces: the file_roots directive and the top.sls file. According to the documentation, an SLS (or SaLt State) file is a representation of the state a system should be in.

file_roots

In your /etc/salt/master file, you need to uncomment the file_roots directive. It defines the location of the Salt file server and the SLS definitions. Mine looks like this:

file_roots:
  base:
    - /srv/salt/

After this modification, restart your master.

top.sls

Applying specific configuration to specific machines is the main purpose of Salt. This is defined within the top.sls file.

This can be done by:

Ways                  Example
Globbing              "webserver*prod*"
Regular Expressions   "^(memcache|web).(qa|prod).loc$"
Lists                 "dev1,dev2,dev3"
Grains                "os:CentOS"
Pillar
Node Groups
Compound Matching
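
The same targeting methods are also available from the command line; grain matching, for instance, uses the -G flag:

$ salt -G 'os:Ubuntu' test.ping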

This is my top.sls file:

base:
  '*':
    - nagios.client
  'os:Ubuntu':
    - match: grain
    - repos.online
  '^NOC(\d)+-VTY2$':
    - match: pcre
    - yp.install
    - yp.nsswitch
    - nfs.mount_noc

base:

base:
  '*':
    - nagios.client

This block declares the base environment every minion must apply. In this case, every machine will be assigned the nagios.client state, which executes /srv/salt/nagios/client.sls.
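
As a rough sketch, a state file like /srv/salt/nagios/client.sls could look like this (the package and service names are assumptions, not the actual content):

# install the NRPE agent and keep its service running
nagios-nrpe-server:
  pkg.installed: []
  service.running:
    - require:
      - pkg: nagios-nrpe-server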

os:Ubuntu

This section matches machines using the Salt "grain" system, i.e. on system attributes; the explicit match: grain line tells Salt to interpret the key as a grain instead of a hostname glob. It will execute /srv/salt/repos/online.sls.

'^NOC(\d)+-VTY2$'

This section matches using Perl-compatible regular expressions (match: pcre). If the hostname of the machine matches this regex, it will be assigned the listed states, executing /srv/salt/yp/install.sls, /srv/salt/yp/nsswitch.sls and /srv/salt/nfs/mount_noc.sls.
