Automating with IFTTT Maker

A few months ago I decided to work part-time and semi-remote instead of full-time, and I quickly discovered that manually tracking my working hours and manually entering them into a separate system was definitely NOT the way to go. Like any other lazy developer, I don’t like uninteresting, time-consuming tasks, so I decided to see how much of this I could automate with IFTTT.

I’ve used IFTTT for a while, so I feel quite comfortable with its simplicity. After some looking around I found Maker, which basically lets you connect anything that can make or receive HTTP requests.

Automating The Report

At first I had to code a way to report my hours, and since the service used by this company does not have an API, I just had to wing it by replaying some HTTP requests and figuring out the right parameters.

I ended up with something like this:

def login(self):
    res = self.session.post(urljoin(SITE_URL, 'punch2.php'),
                            data={
                                'comp': self.company,
                                'name': self.employee,
                                'pw': self.password
                            }, allow_redirects=False)
    self.ix_employee = \
        parse_html(res.text).find('input', {'id': 'ixemplee'})['value']
    return True

def punch(self, option, remark=''):
    res = self.session.post(urljoin(SITE_URL, 'punch3.php'),
                            data={
                                'comp': self.company,
                                'name': self.employee,
                                'remark': remark,
                                'B1': option,
                                'ix': self.ix_employee,
                                'ts': '',
                                'allowremarks': 1,
                                'msgfound': 0,
                                'thetask': 0,
                                'teamleader': 0,
                                'tflag': ''
                            })
    if res.status_code != 200:
        raise TimeWatchError(res.text)
    return True

def punch_in(self, remark=''):
    return self.punch(PUNCH_IN, remark)

def punch_out(self, remark=''):
    return self.punch(PUNCH_OUT, remark)

Not the most beautiful piece of code, but it seems to get the job done quite nicely.

Creating an endpoint for IFTTT’s webhook

For this one I wanted something free with minimal overhead, so Heroku it was.

Setting up an endpoint wasn’t much trouble at all; a few lines and a requirements.txt are all it really takes:

from flask import Flask, request
import main  # assuming the punch logic lives in main.py

app = Flask(__name__)

@app.route('/log', methods=['POST'])
def log():
    params = request.get_json(force=True)
    main.main(params['company'], params['employee'], params['pw'], params['action'])
    return 'OK'

@app.errorhandler(404)
def page_not_found(error):
    """Custom 404 page."""
    return '<html><head><title>Go away</title></head></html>'

Really can’t imagine it getting simpler than this. :)

Configuring the triggers

Well, the first scenario, working remotely, was easy, and hardly any setup was needed. All I had to do was set up a login script for my VPN connection which asks me whether I want to clock in or clock out when I close the session, and then curl my Heroku endpoint to log it. (This way I never have to remember it myself.)
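The curl from the VPN script is just a POST carrying the same JSON fields the Flask /log handler reads. A minimal Python equivalent, as a sketch: the Heroku app name is a placeholder, and clock/build_payload are names made up for this example.

```python
import json
from urllib import request as urlrequest

# Placeholder URL -- substitute your own Heroku app name.
ENDPOINT = 'https://my-punch-app.herokuapp.com/log'

def build_payload(company, employee, pw, action):
    # Field names match what the Flask /log handler reads.
    return {'company': company, 'employee': employee, 'pw': pw, 'action': action}

def clock(company, employee, pw, action):
    # POST the JSON body to the endpoint and return the response text.
    data = json.dumps(build_payload(company, employee, pw, action)).encode()
    req = urlrequest.Request(ENDPOINT, data=data,
                             headers={'Content-Type': 'application/json'})
    with urlrequest.urlopen(req) as res:
        return res.read().decode()
```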

The next step was to make sure I don’t have to manually log my hours at the office either. Since I’d already set up an API endpoint, all that was left was to get IFTTT to call it when I enter or leave the office.

It’s just as simple as setting the API call together with any trigger you want:

IFTTT Maker configuration

The Best Part

  • This was a fun and cool opportunity to try out some new things like Heroku and Maker.
  • The entire coding process probably took about as long as tracking and managing my hours in an average month, so it definitely saves me time from here on.

The Code

If you’re interested in the rest of the code or trying it yourself, have a look at the project on GitHub.


Elasticsearch Configurations I Learned The Hard Way

Sadly some things just aren’t in the basic manual, and you just need to figure them out as you go along…

Naming

The most commonly missed setting is node.name. Yeah, I know it’s fun to see all those Marvel character names come up every time you restart a node, but it’s much harder to make sense of stats when you don’t know which server they are coming from.

Split brain

Always set discovery.zen.minimum_master_nodes to (at least) number_of_master_eligible_nodes // 2 + 1; otherwise you risk a situation where nodes that cannot communicate with the cluster (for any reason) create another cluster with the same name, completely separate from the original, with its own master and shards.
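As a quick sanity check of the quorum formula (this is just the arithmetic, not an Elasticsearch API):

```python
def minimum_master_nodes(master_eligible):
    # Quorum: a strict majority of the master-eligible nodes.
    return master_eligible // 2 + 1

# 3 master-eligible nodes need 2, 4 need 3, 5 need 3.
```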

Turn off the self-destruct button

Don’t wait until someone on your team accidentally sends a DELETE * to the wrong server. Just don’t let it happen in the first place.
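For reference, this is a one-line config change. The key below is the pre-5.x setting as I recall it (newer versions use action.destructive_requires_name instead), so treat the exact key as an assumption and check the docs for your version:

```yaml
# elasticsearch.yml -- refuse DELETE on _all / wildcard index names
action.disable_delete_all_indices: true
```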

Memory

You probably already know that swapping kills performance. To disable it, set the bootstrap.mlockall: true config option, which locks the process address space into RAM.

If you do this you should also define the ES_HEAP_SIZE variable. Note that Elastic recommends giving Elasticsearch no more than half of the machine’s memory (to leave some room for Lucene), and no more than 32 GB (to keep compressed 32-bit object pointers).
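Putting both together, assuming a 64 GB machine (the values are illustrative, and where ES_HEAP_SIZE is set depends on how you launch the service):

```yaml
# elasticsearch.yml
bootstrap.mlockall: true

# ...and in the environment (e.g. /etc/default/elasticsearch):
# ES_HEAP_SIZE=31g
```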

Monitoring

In order to detect cluster issues it is important to have a visual tool to see cluster state - shard allocation, master nodes, and data nodes. Either include this in your monitoring visualizations or use an existing tool; personally, I like head.

Recovery time

For large clusters with many shards, shard recovery can take plenty of time; this can be tweaked with the cluster.routing.allocation.node_concurrent_recoveries and indices.recovery.max_bytes_per_sec settings. Keep in mind that this is an IO-intensive operation, so the speeds should match your hardware limitations.
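As a sketch, with values you would tune to your own disks and network:

```yaml
# elasticsearch.yml -- illustrative values only
cluster.routing.allocation.node_concurrent_recoveries: 4
indices.recovery.max_bytes_per_sec: 100mb
```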

Dedicated Master Nodes

For large clusters with many concurrent queries, keep your master nodes separate from the data nodes. This is done by setting node.master: false or node.data: false in the proper node configurations, and has multiple benefits:

  • Lower load on data nodes, so they’ll be more available to answer queries - each node that is an eligible master must maintain the cluster state (if the current master goes down) - and this takes resources.
  • Makes sure the master node is free to keep the cluster healthy.
  • No split brain problems when expanding your cluster - you won’t need to change the minimum master node and restart all nodes every time you add a node to the cluster.

Additional Reading Material


CaptionBot AI vs. ReCaptcha2

Recently Skype has been pushing its bots; one of them, “CaptionBot”, is an AI that can describe what it sees in an image. For the past few months I have been working at a company that does web crawling, and ReCaptcha is never fun. Sure, there are some services that will tell you where to click (for a fee, of course), but they still leave something to be desired. I decided to have a look at CaptionBot and see what results I get.

First of all, we need a way to get the captions from captionbot. Just some web debugging will give us all the code we need for this to work.

Recreating the steps:

> curl 'https://www.captionbot.ai/api/init'
"BVPVg1MWjEM"

> curl 'https://www.captionbot.ai/api/upload' -H 'Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryNAK95DhgT8WjR8' --data-binary $'------WebKitFormBoundaryNAK95DhgT8WjR8\r\nContent-Disposition: form-data; name="file"; filename="Dell-inspiron-3542-ubuntu-os-1.jpg"\r\nContent-Type: image/jpeg\r\n\r\n\r\n------WebKitFormBoundaryNAK95DhgT8WjR8--\r\n'
"https://captionbot.blob.core.windows.net/images-container/trejmded.jpg"

And lastly:

> curl 'https://www.captionbot.ai/api/message' -H 'Content-Type: application/json; charset=UTF-8' --data-binary '{"conversationId":"BVPVg1MWjEM","waterMark":"","userMessage":"https://captionbot.blob.core.windows.net/images-container/trejmded.jpg"}'
"{\"ConversationId\":null,\"WaterMark\":\"131113912716453514\",\"UserMessage\":\"I think it's a flat screen tv. \",\"Status\":null}"

So I wrote a small Python utility to help me with this; you can find it here. (Also available on PyPI.)
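One wrinkle worth noting from the transcript above: the /api/message response body is JSON-encoded twice, so extracting the caption takes two json.loads passes. A small helper (parse_caption is just a name I made up for this sketch):

```python
import json

def parse_caption(body):
    # The body is a JSON string whose value is itself a JSON document,
    # so it has to be decoded twice before we can index into it.
    inner = json.loads(json.loads(body))
    return inner['UserMessage'].strip()
```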

Next, we need the images and challenges from recaptcha. Easy. All we need to do is to find a test page and open a browser to it, click the checkbox, and then save all the images that come up:

import os
from os.path import join as pathjoin
from io import BytesIO
from hashlib import md5

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from PIL import Image

here = os.path.dirname(os.path.abspath(__file__))
dataset_dir = pathjoin(here, 'dataset')

def crop(image, vertical, horizontal):
    imgwidth, imgheight = image.size
    step_vertical = imgheight // vertical
    step_horizontal = imgwidth // horizontal
    for j in range(vertical):
        for i in range(horizontal):
            box = (i * step_horizontal, j * step_vertical,
                   (i + 1) * step_horizontal, (j + 1) * step_vertical)
            yield image.crop(box)

driver = webdriver.Firefox()
driver.implicitly_wait(10)
driver.get('https://www.google.com/recaptcha/api2/demo')
while True:
    WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it(
        (By.XPATH, './/iframe[@title="recaptcha widget"]')
    ))
    # Click the checkbox
    driver.find_element_by_id("recaptcha-anchor").click()
    driver.switch_to.parent_frame()
    WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it(
        (By.XPATH, './/iframe[@title="recaptcha challenge"]')
    ))
    challenge = driver.find_element_by_xpath(
        './/div[@class="rc-imageselect-desc-no-canonical"]/strong').text
    image_grid = driver.find_elements_by_xpath(
        './/div[@class="rc-image-tile-target"]//img')
    # Get the dimensions of the grid (e.g. "rc-imageselect-table-33" -> (3, 3))
    grid_size = \
        tuple(map(int, list(driver.find_element_by_xpath(
            './/table[starts-with(@class, "rc-imageselect-table-")]').
            get_attribute('class').split('-')[-1])))
    current_dir = pathjoin(dataset_dir, '%d-%s' % (len(image_grid), challenge))
    os.makedirs(current_dir, exist_ok=True)
    # Get the captcha image so we can crop it
    main_image = Image.open(BytesIO(
        requests.get(image_grid[0].get_attribute('src')).content))
    for i, im in enumerate(crop(main_image, *grid_size)):
        out_stream = BytesIO()
        im.save(out_stream, format='PNG')
        out_stream.seek(0)
        hashval = md5()
        hashval.update(out_stream.getbuffer())
        im.save(pathjoin(current_dir, '%s.png' % hashval.hexdigest()))
    driver.refresh()

I left this running for about a day, though it seems that to get a better dataset one should run it once every few days. I only got 4 different challenge types: 4×4 street signs, 3×3 rivers, 2×4 store fronts, and 3×3 street numbers. Yet now I also get captchas for trees and mountains. Since the street numbers and street signs didn’t have enough data, I decided to test only store fronts and rivers.

Naturally, Microsoft’s AI describes the image rather than tagging it, so I had to analyze some of the data to find the relevant keywords for each category:

mapping = {
    'rivers': ['river', 'water', 'lake', 'harbor', 'ocean', 'boat', 'moat', 'pond'],
    'store front': ['store', 'in front', 'restaurant', 'side of', 'sign', 'city',
                    'shop', 'door', 'screen', 'street'],
}
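Matching a caption against these keyword lists is then a simple substring check. A sketch (classify is a hypothetical helper; note that with overlapping keywords, the first matching category wins):

```python
mapping = {
    'rivers': ['river', 'water', 'lake', 'harbor', 'ocean', 'boat', 'moat', 'pond'],
    'store front': ['store', 'in front', 'restaurant', 'side of', 'sign', 'city',
                    'shop', 'door', 'screen', 'street'],
}

def classify(caption, mapping):
    # Return the first category whose keyword list matches the caption.
    caption = caption.lower()
    for category, keywords in mapping.items():
        if any(word in caption for word in keywords):
            return category
    return None
```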

The results I got for this are:

store front
Pd/Recall: 0.6564885496183206
Precision: 0.593103448275862
rivers
Pd/Recall: 0.9041095890410958
Precision: 0.9295774647887324

Amazing! (For rivers, this means that if there is a river image, there is a 90% chance you will detect it, and a 7.1% chance that you will mistakenly select an image which isn’t a river.) I have also started testing this on the “trees” and “mountains” categories; hopefully the results will be similar to “rivers”.

AI is progressing fast, but it’s not there yet. It still has some trouble describing images where there is too much going on, but does very well with landscapes.

Warning: If you try this yourself, be willing to accept that Google will treat you as a bot for a while (requiring captchas for searches, etc.)

Enjoy!


Kibana the way YOU want it

The ELK stack has been adopted rapidly in the last few years - and for good reason. It can be configured and deployed fast and without many dependencies, and it can take care of all your monitoring needs.

However, Kibana has a rather simple interface, and the default visualizations do not always support everything we need (or just want).

I was recently working on a project where we needed to monitor the balance left in an account. Initially we used the metric visualization, yet that wasn’t enough. People who weren’t familiar enough with the dashboard couldn’t make sense of all the numbers floating around - so I decided to write my own visualization for that. (Based upon the official metric visualization here.)

As it turns out, all you need is some basic AngularJS to get started.

Defining your plugin

First, we need to define our plugin so Kibana knows what we are exporting:

package.json:

{
    "name": "health_vis_metric",
    "version": "0.3.0"
}

index.js:

module.exports = function (kibana) {
    return new kibana.Plugin({
        uiExports: {
            visTypes: [
                'plugins/health_metric_vis/health_metric_vis'
            ]
        }
    });
};

In this case we are exporting a visType, defined in the “health_metric_vis” plugin file.

The Visualization

Next, we define the visualization itself; this includes the AngularJS view and controller for our plugin:

health_metric_vis.html:

<div ng-controller="KbnHealthMetricVisController" class="health-metric-vis">
    <div class="health-metric-container" ng-repeat="metric in metrics">
        <div class="health-metric-value" ng-style="{'font-size': vis.params.fontSize+'pt', 'color': metric.color }">{{metric.formattedValue}}</div>
        <div>{{metric.label}}</div>
    </div>
</div>

health_metric_vis_controller.js:

define(function (require) {
    let _ = require('lodash');
    const module = require('ui/modules').get('health_metric_vis');

    module.controller('KbnHealthMetricVisController', function ($scope, Private) {
        const tabifyAggResponse = Private(require('ui/agg_response/tabify/tabify'));
        const metrics = $scope.metrics = [];

        function isInvalid(val) {
            return _.isUndefined(val) || _.isNull(val) || _.isNaN(val);
        }

        function getColor(val, visParams) {
            if (!visParams.invertScale) {
                if (val <= visParams.redThreshold) {
                    return visParams.redColor;
                }
                else if (val < visParams.greenThreshold) {
                    return visParams.yellowColor;
                }
                else {
                    return visParams.greenColor;
                }
            }
            else {
                if (val <= visParams.greenThreshold) {
                    return visParams.greenColor;
                }
                else if (val < visParams.redThreshold) {
                    return visParams.yellowColor;
                }
                else {
                    return visParams.redColor;
                }
            }
        }

        $scope.processTableGroups = function (tableGroups) {
            tableGroups.tables.forEach(function (table) {
                table.columns.forEach(function (column, i) {
                    const fieldFormatter = table.aggConfig(column).fieldFormatter();
                    let value = table.rows[0][i];
                    let formattedValue = isInvalid(value) ? '?' : fieldFormatter(value);
                    let color = getColor(value, $scope.vis.params);

                    metrics.push({
                        label: column.title,
                        formattedValue: formattedValue,
                        color: color
                    });
                });
            });
        };

        $scope.$watch('esResponse', function (resp) {
            if (resp) {
                metrics.length = 0;
                $scope.processTableGroups(tabifyAggResponse($scope.vis, resp));
            }
        });
    });
});

health_metric_vis.less:

@import (reference) "~ui/styles/mixins.less";

.health-metric-vis {
    width: 100%;
    display: flex;
    flex-direction: row;
    flex-wrap: wrap;
    justify-content: space-around;
    align-items: center;
    align-content: space-around;

    .health-metric-value {
        font-weight: bold;
        .ellipsis();
    }

    .health-metric-container {
        text-align: center;
        padding: 1em;
    }
}

Configuration

We don’t want our visualization to be limited by hardcoded limits and colors - that’s what configuration is for! All you need for this is setting up a group of input fields:

health_metric_vis_params.html:

<div class="form-group">
    <label>Font Size - {{ vis.params.fontSize }}pt</label>
    <input type="range" ng-model="vis.params.fontSize" class="form-control" min="12" max="120" />
</div>
<div class="form-group">
    <label>Red threshold <span ng-bind-template="({{!vis.params.invertScale ? 'below':'above'}} this value will be red)"></span></label>
    <input type="number" ng-model="vis.params.redThreshold" class="form-control"/>
</div>
<div class="form-group">
    <label>Green threshold <span ng-bind-template="({{!vis.params.invertScale ? 'above':'below'}} this value will be green)"></span></label>
    <input type="number" ng-model="vis.params.greenThreshold" class="form-control"/>
</div>
<div class="form-group">
    <label>
        <input type="checkbox" ng-model="vis.params.invertScale">
        Invert scale
    </label>
</div>
<div class="form-group">
    <label>Green color:</label>
    <input type="color" ng-model="vis.params.greenColor" class="form-control"/>
</div>
<div class="form-group">
    <label>Yellow color:</label>
    <input type="color" ng-model="vis.params.yellowColor" class="form-control"/>
</div>
<div class="form-group">
    <label>Red color:</label>
    <input type="color" ng-model="vis.params.redColor" class="form-control"/>
</div>

The end result of this would be:

options

Piecing it together

Now that we have all the components, all that’s left is to tell Kibana how all these different pieces interact with each other.

health_metric_vis.js:

define(function (require) {
    // Load the required css files
    require('plugins/health_metric_vis/health_metric_vis.less');
    // Load the controller
    require('plugins/health_metric_vis/health_metric_vis_controller');
    // Register our provider with kibana (so it shows up in the menu)
    require('ui/registry/vis_types').register(HealthMetricVisProvider);

    function HealthMetricVisProvider(Private) {
        // This means we are creating a visualization that uses a template.
        const TemplateVisType = Private(require('ui/template_vis_type/TemplateVisType'));
        const Schemas = Private(require('ui/Vis/Schemas'));

        // Here we set up our visualization
        return new TemplateVisType({
            name: 'health-metric',
            title: 'Health Metric',
            description: 'A numeric health metric, can show a number and color it accordingly.',
            icon: 'fa-calculator',
            // Here we load the template file we created
            template: require('plugins/health_metric_vis/health_metric_vis.html'),
            params: {
                // Setting up defaults
                defaults: {
                    handleNoResults: true,
                    fontSize: 60,
                    invertScale: false,
                    redThreshold: 0,
                    greenThreshold: 0,
                    redColor: "#fd482f",
                    yellowColor: "#ffa500",
                    greenColor: "#6dc066"
                },
                // This is the configuration page
                editor: require('plugins/health_metric_vis/health_metric_vis_params.html')
            },
            // Here you can configure what kind of query is built for your vis
            schemas: new Schemas([
                {
                    group: 'metrics',
                    name: 'metric',
                    title: 'Metric',
                    min: 1,
                    max: 1,
                    defaults: [
                        { type: 'count', schema: 'metric' }
                    ]
                }
            ])
        });
    }

    // export the provider so that the visType can be required with Private()
    return HealthMetricVisProvider;
});

Installation

Finally, we can install our plugin using the kibana plugin command, either from a local directory or from a url.

In our case:
kibana plugin -i health_metric_vis -u https://github.com/DeanF/health_metric_vis/archive/master.zip

Results

After installing the plugin, we can now see it in the visualization screen:
visoptions

After setting up your metric, you’re done.

example

The full code can be found on GitHub. I hope this article helps you get more out of ELK.


The Grand Opening

Hi everybody,

I’ve recently had the (amazingly original) idea to start my own tech blog, where I’ll describe possible solutions to common problems I run into at work.

Hope you’ll find it useful!

Dean
