Real-time stats for Heartbleed on Twitter : with Logstash, Elasticsearch and Kibana

What is Heartbleed?

http://heartbleed.com/

Heartbleed stats from Twitter (7 days and 571K tweets):

  • 7 days and more than half a million tweets about #heartbleed


  • How many times was OpenSSL mentioned? 49K


  • Different cloud vendors


  • Google vs Apple vs Samsung


  • iOS vs Android


  • Netflix vs the rest


  • wordpress.com vs blogspot.com


 

A quick summary of how I did it:

  • 1 medium EC2 instance (even though I thought a small would have been more than enough)

How much did it cost me?

$29.00

Step 1 : Launch a new EC2 instance
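
The post gives no code for this step; as a rough sketch, the same thing can be done with boto (the library used in the S3 posts below). The AMI ID, key pair and security group here are placeholders — substitute your own:

#!/usr/bin/python

"""
- A sketch only, not part of the original write-up
- Launches one m1.medium instance with boto
"""

import boto.ec2

if __name__ == "__main__":
   conn = boto.ec2.connect_to_region("us-east-1")
   # ami-xxxxxxxx, mykey and default are placeholders
   reservation = conn.run_instances("ami-xxxxxxxx",
                                    key_name="mykey",
                                    instance_type="m1.medium",
                                    security_groups=["default"])
   print reservation.instances[0].id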

Step 2 : Download Logstash 1.4, untar it, and create a twitter.conf like the following (fill in your Twitter API credentials):

input {
  twitter {
    consumer_key => ""
    consumer_secret => ""
    keywords => ["#heartbleed", "heartbleed", "heartbleed.com"]
    oauth_token => ""
    oauth_token_secret => ""
    tags => ["#heartbleed"]
    type => "heartbleed"
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    embedded => true
  }
}

Step 3 : Start Logstash:

  • nohup bin/logstash -f twitter.conf -v --debug &

Step 4 : Access Kibana at:

  • http://{your ec2 instance IP}:9200/_plugin/kibana/src/index.html
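
Not part of the original steps, but since embedded => true runs Elasticsearch inside the Logstash process (which is why Kibana is served from port 9200 above), you can also query it directly to check that tweets are being indexed. A minimal sketch, assuming port 9200 on the instance is reachable:

#!/usr/bin/python

"""
- A sketch only, not part of the original write-up
- Counts the tweets Logstash has indexed under type "heartbleed"
"""

import json
import urllib2

def counttweets(host):
   # Elasticsearch count API; type:heartbleed matches the type set in twitter.conf
   url = "http://%s:9200/_count?q=type:heartbleed" % host
   print json.load(urllib2.urlopen(url))["count"]

if __name__ == "__main__":
   # replace localhost with your EC2 instance IP
   counttweets("localhost")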

 

Cricket World Cup 2014 (T20) – Twitter Analysis

Here is what happened on Twitter during the T20 World Cup 2014.

Most Popular Country : Sri Lanka


Most Popular Captain : Dhoni


Sachin vs Others


Caught vs Wicket


Overall Trend


Technical Details

Data : Stored in Elasticsearch 1.0.1

Source : Twitter

Ingest : Logstash 1.4

Env : AWS EC2

Cost : $3.84

$0.120 per M1 Standard Medium (m1.medium) Linux/UNIX instance-hour (or partial hour) × 32 hrs = $3.84

 

 

AWS S3 : The bucket you are attempting to access must be addressed using the specified endpoint

You will get the error 'The bucket you are attempting to access must be addressed using the specified endpoint' when you try to delete a bucket that lives in a different region.

Solution :

Connect to the region the bucket is in.

Here is an example of connecting to an S3 bucket in a different region; the trick is to pass the host parameter when making the connection.

#!/usr/bin/python

"""
- Author : Nag m
- Hack   : Search for a bucket in a different AWS region
- Info   : Look up a bucket named
           * eu-west-1.101-s3-aws in AWS region eu-west-1
           ** Did you know it still works even if you do not specify the host? It does.
"""

import boto

def searchabucket():
   # conn is created in __main__ before this runs
   bucket = conn.lookup("eu-west-1.101-s3-aws")
   print bucket.name

if __name__ == "__main__":
   # pass the region endpoint as the host parameter
   conn = boto.connect_s3(host="s3-eu-west-1.amazonaws.com")
   searchabucket()
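
Depending on your boto version, you can also skip the host parameter and let boto pick the endpoint — a sketch, assuming boto 2.x where boto.s3.connect_to_region is available:

import boto.s3

# connect straight to the eu-west-1 endpoint without spelling out the host
conn = boto.s3.connect_to_region("eu-west-1")
print conn.lookup("eu-west-1.101-s3-aws").name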

AWS S3 : Convert S3 tags to a Python dictionary or key/value pairs

AWS S3 Tags

When you fetch a bucket's tags from AWS S3, the output is XML. Here is a quick way to convert it to a dictionary of key/value pairs.

Here is the tag set XML from boto (printed via to_xml() in the script below):

<Tagging>
  <TagSet>
    <Tag><Key>name</Key><Value>nag</Value></Tag>
    <Tag><Key>owner</Key><Value>test</Value></Tag>
  </TagSet>
</Tagging>

To convert this to key/value pairs:

#!/usr/bin/python

"""
- Author : Nag m
- Hack   : Convert tags to key value pairs
- Info   : Convert tags to key value pairs
            * 101-s3-aws
"""

import boto
import xml.etree.ElementTree as ET

def tag(name):
   tag_k = []
   tag_v = []
   bucket = conn.get_bucket(name)
   tagset = bucket.get_tags()
   # parse the tag set XML shown above
   root = ET.fromstring(tagset.to_xml())
   print tagset.to_xml()
   for val in root.iter('Tag'):
      for child in val:
         if child.tag == "Key":
            tag_k.append(child.text)
         if child.tag == "Value":
            tag_v.append(child.text)
   # pair every <Key> with its <Value>
   s3tags = dict(zip(tag_k, tag_v))
   print s3tags

if __name__ == "__main__":
   conn = boto.connect_s3()
   bucketname = "101-s3-aws"
   tag(bucketname)
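
With the tag set above, the script prints the XML followed by something like (dict ordering may vary):

{'owner': 'test', 'name': 'nag'}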