Wednesday 25 March 2015

WHITE-BOX APPLICATION MONITORING WITH DOCKER AND BOSUN - AT HOME !

In my previous post I showed you how to perform black-box monitoring of your web application.
Today I am going to show you how to perform white-box monitoring using Bosun. Bosun is an advanced, open-source monitoring and alerting system created by Stack Exchange. We are going to use it to perform some white-box monitoring on the car service application I used in the previous post.

The first thing to do is to start that Docker image again and start JBoss if it is not already running:


/opt/wildfly/bin/standalone.sh -c standalone-full.xml -b 0.0.0.0

Now we can pull the Bosun Docker image from Docker Hub with the following command:

docker pull stackexchange/bosun

Once this image has been downloaded we can run it with:

docker run -d -p 4242:4242 -p 8070:8070 stackexchange/bosun

Bosun should now be accessible in your browser at 192.168.59.103:8070, where 192.168.59.103 is the Boot2Docker host IP address.


Let’s go back for a moment to our web application Docker container. Here we have to download the Bosun collector (scollector) and start it:


./scollector-linux-amd64 -h 192.168.59.103:8070

(where 192.168.59.103 is the IP address of the Bosun machine)

You may have to first make the collector executable with:

chmod +x scollector-linux-amd64 

The collector gave me some problems, so I had to install postfix first with:

apt-get install postfix

Now your www docker container is sending data to the Bosun docker container.
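
To give an idea of what "sending data" means here, the following is a minimal Python sketch of a single datapoint in the OpenTSDB-style JSON format. It assumes that the Bosun container accepts such datapoints over HTTP on an /api/put endpoint; that endpoint, the metric name example.requests and the helper names are assumptions made for illustration, not something taken from the collector itself.

```python
import json
import time
import urllib.request


def build_datapoint(metric, value, host, timestamp=None):
    """Build one OpenTSDB-style datapoint tagged with the sending host."""
    if timestamp is None:
        timestamp = time.time()
    return {
        "metric": metric,
        "timestamp": int(timestamp),
        "value": value,
        "tags": {"host": host},
    }


def send_datapoints(bosun_address, datapoints, timeout=5.0):
    """POST a list of datapoints to the (assumed) /api/put endpoint."""
    body = json.dumps(datapoints).encode("utf-8")
    request = urllib.request.Request(
        "http://%s/api/put" % bosun_address,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical metric name and timestamp, only to show the payload shape.
point = build_datapoint("example.requests", 42, host="www", timestamp=1427241600)
print(json.dumps(point, sort_keys=True))
```

Calling send_datapoints("192.168.59.103:8070", [point]) would then push that datapoint to the Bosun machine used in this post, provided the endpoint assumption holds.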

Let’s check the Bosun graphical interface in the browser.

Go to http://192.168.59.103:8070/ again and choose your www Docker container from the “hosts” tab:



This data is being collected and aggregated by Bosun.




You can see CPU, memory, network and disk space usage.
There are many other features. Let’s explore the Expression tab.

In this tab you can write an expression to keep a specific metric of interest under control.

I wrote an expression to check whether the average CPU load over the last 5 minutes went above a threshold of 80%.
In the Results tab you can see that this expression has not been satisfied, which means that the average CPU usage over the last 5 minutes did not go over 80%.
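
The logic behind such a check can be sketched in a few lines of Python. This is not Bosun's expression language, only an illustration of the idea: take the samples from the last 5 minutes, average them, and compare the result with the threshold. The sample values and function names are made up for the example.

```python
from statistics import mean


def window_average(samples, now, window_seconds=300):
    """Average the values of (timestamp, value) samples inside the window."""
    recent = [value for ts, value in samples if now - window_seconds <= ts <= now]
    if not recent:
        raise ValueError("no samples in the window")
    return mean(recent)


def exceeds_threshold(samples, now, threshold=80.0, window_seconds=300):
    """Return 1 if the windowed average is above the threshold, else 0."""
    average = window_average(samples, now, window_seconds)
    return 1 if average > threshold else 0


# Hypothetical CPU usage samples in percent, one per minute.
samples = [
    (0, 95.0),   # older than 5 minutes at now=360, so it is ignored
    (60, 20.0),
    (120, 25.0),
    (180, 30.0),
    (240, 22.0),
    (300, 28.0),
    (360, 25.0),
]
print(exceeds_threshold(samples, now=360))
```

A result of 0 corresponds to the "not satisfied" case described above, and a result of 1 would mean the threshold was exceeded.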

A specific metric can also be selected in the Available metric tab:



Here I selected the metric “linux.loadavg_1_min”
  



For this metric I also set a threshold of 80%, which has not been reached in the last minute, so under the Normals tab you can read 1, as shown in the following image:


If I set the threshold to 0, we can see that one Critical is triggered. When this happens, an email will be sent. The template of the email can be fully customised, as can be seen in the image above.
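
As a rough Python sketch of what happens with the two thresholds: on Linux the 1-minute load average is the first field of /proc/loadavg, and the check simply compares it with the threshold. The sample line and the function names below are hypothetical, and the classification is a simplification of what Bosun does.

```python
def parse_loadavg_1_min(line):
    """Return the 1-minute load average from a /proc/loadavg-style line."""
    return float(line.split()[0])


def evaluate(value, crit_threshold):
    """Classify a value as 'critical' when it is above the threshold."""
    return "critical" if value > crit_threshold else "normal"


# Hypothetical contents of /proc/loadavg on a lightly loaded machine.
sample_line = "0.42 0.30 0.25 1/123 4567"
load = parse_loadavg_1_min(sample_line)

print(evaluate(load, crit_threshold=80))  # high threshold: stays normal
print(evaluate(load, crit_threshold=0))   # threshold 0: any load is critical
```

With a threshold of 0 any non-zero load is above the limit, which is why lowering the threshold is a quick way to trigger a Critical and test the email notification.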

Finally I ran some load tests against the web application with:




It is clear from the image below that the average load started very low, because there was no processing on the web application in the www Docker container. The very first peak is the first execution of the load test. Then I waited 30 seconds, so the average load went down again, and it finally went up again when I ran the test a second time.








Saturday 14 March 2015

BLACK BOX APPLICATION MONITORING WITH NAGIOS AND DOCKER - AT HOME !

This guide assumes you have a working installation of Docker!

This blog post will show how to monitor a web service running in a Docker container using black box monitoring techniques with Nagios.


Monitoring is:

Examining your software in the act of running in production;
Looking for error conditions and recording data for trending purposes;
Being able to correlate events for troubleshooting;
Alerting and possibly recovering when things go wrong.

The ultimate goal of monitoring an application service is to lower your MTTDetect, MTTDiagnose and MTTRepair, where MTT == Mean Time To.

There are two kinds of monitoring techniques: black box and white box.

Black box techniques take into consideration the global state of the service as a whole.

Black-box is:

Simulated queries.
Verifies end-to-end operation.
You really don’t have anything except the input and output. 
It is important to monitor all sets of dependencies, e.g. DNS: "single points of failure."
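
A black-box check of this kind can be sketched in Python as a simulated query that only looks at the output. This is a minimal sketch using the standard library, not what Nagios runs internally; the function name and the rule "any status from 200 to 399 is healthy" are assumptions made for the example.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Send one GET request and report (ok, status, elapsed_seconds).

    Only the response is inspected: the probe knows nothing about the
    inside of the application.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        status = error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout: no answer at all.
        status = None
    elapsed = time.monotonic() - started
    ok = status is not None and 200 <= status < 400
    return ok, status, elapsed
```

For the web application used in this post, the call would be probe("http://192.168.59.103:8080/"). Because the URL is resolved and fetched end to end, a broken dependency such as DNS shows up as a failed probe as well.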

White-box is:
Monitoring where you know about the inside of the app. 
You may have access to the source or other instrumentation 
You can evaluate metrics derived from instrumentation.
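
By contrast, white-box monitoring relies on instrumentation inside the application. The following Python sketch shows the idea with a couple of counters and a metric derived from them; the Metrics class, the handler and the metric names are hypothetical and are not part of the car service application.

```python
import threading


class Metrics:
    """A tiny thread-safe registry of named counters."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}

    def increment(self, name, amount=1):
        with self._lock:
            self._counters[name] = self._counters.get(name, 0) + amount

    def snapshot(self):
        """Return a copy of the counters, safe to read without the lock."""
        with self._lock:
            return dict(self._counters)


metrics = Metrics()


def handle_request(fail=False):
    """Hypothetical request handler instrumented with counters."""
    metrics.increment("requests.total")
    if fail:
        metrics.increment("requests.errors")
        return 500
    return 200


def error_rate(snapshot):
    """Derive a metric from the raw counters."""
    total = snapshot.get("requests.total", 0)
    if total == 0:
        return 0.0
    return snapshot.get("requests.errors", 0) / total


for fail in (False, False, True, False):
    handle_request(fail)

print(metrics.snapshot())
print(error_rate(metrics.snapshot()))
```

A collector can then read such counters periodically and ship them to the monitoring system, which is the role the collector plays in the white-box setup.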

In this post we'll recreate a production environment using three Docker virtual machines.
The first virtual machine, which we are going to call www, will contain a car service web application. Technically speaking, this is a Docker container running JBoss with a Car Service web application deployed in it.
The first step is to pull this image from Docker Hub:

docker pull jwasilewski/lab4-www

and run it with:

docker run -i -t --name www -p 8080:8080 jwasilewski/lab4-www /bin/bash

and then start JBoss inside the container:

/opt/wildfly/bin/standalone.sh -c standalone-full.xml -b 0.0.0.0

We can get the IP address of boot2docker (if you are running Docker on a Mac like me) with the following command:

boot2docker ip

Which returns on my machine:

192.168.59.103

If we open the browser at 192.168.59.103:8080 we should now see the web application up and running: 



The second step will be to pull a Nagios docker container from docker hub with:

docker pull cpuguy83/nagios

Nagios is an open source computer system monitoring, network monitoring and infrastructure monitoring software application. Nagios offers monitoring and alerting services for servers, switches, applications, and services. It alerts the users when things go wrong and alerts them a second time when the problem has been resolved.
When starting this Docker container, we want to link it to our web application container, because Nagios needs to reach it continuously:

sudo docker run -i -t  --name nagios -p 80:80 --link www:www cpuguy83/nagios /bin/bash

The username and password for the Nagios Docker image are:

username = nagiosadmin
password = nagios

Let’s make sure that the Nagios container can communicate with the web application Docker container (www) by pinging the latter from the Nagios container shell:




Let’s also make sure that the Nagios Docker container can directly contact the JBoss web application with a simple wget.
The index.html page it downloads is just the WildFly landing page:



It is now time to create a couple of checks in the Nagios container.
In the objects folder, let’s create a new file called www.cfg. This file will contain our Nagios checks:


cd /opt/nagios/etc/objects

vi www.cfg


Now we have to tell Nagios that there is a new configuration file to read, so we have to edit the file /opt/nagios/etc/nagios.cfg:



We can finally start Nagios:

/usr/local/bin/start_nagios




We should now be able to see the Nagios user interface at http://192.168.59.103:80, where 192.168.59.103 is the Boot2Docker host IP address:




The image below is a snapshot of the “Hosts” Nagios page, which shows that localhost and www (our web Docker container) are both up at the moment. This is a “ping” check !

The next image is a snapshot of the “Services” page, which shows that we are monitoring the JBoss service inside the “www” Docker container.
This is an HTTP check on port 8080, which is the port where JBoss is currently running:




Then we go back to our “www” virtual machine and shut down JBoss. In the next check iteration, Nagios will show JBoss as down, as shown in the image below:




Many other checks can be performed using Nagios.
Some of them are listed here. 

Monday 9 March 2015

@QConLondon - To the moon - Russ Olsen

The first talk at QConLondon was also the best.
Russ Olsen, author of "Eloquent Ruby" and vice-president at Cognitect, literally brought us to the moon with his talk.

"We all have moments that change the way we think, the way we look at the world, the things we want to do with our lives. On July 20, 1969 millions of people had one of those transforming experiences: Two men landed on the Moon and nothing was ever the same again. Why did we go to the Moon? How did we get there? What was it like to witness it all? And what does any of this have to do with writing software 40 years later?"

Check out the talk on YouTube at: To the moon.

It was so good that I wanted to remember this moment:


Thanks Russ !



Friday 6 March 2015

Tuesday 3 March 2015

@QCONLONDON - DAY 1

Very excited to take part in QCon London tomorrow !

This is my personal schedule:


Red is good, yellow is great !
It is going to be a long day but well worth it !

Stay QCon
Luca