Agile / Waterfall / Monitoring

 

Agile and Waterfall have been around for many years and are two different approaches to delivering IT projects, including monitoring solutions.

I have worked on many systems and application monitoring projects and found that for out-of-the-box (shrink-wrapped) software it is better to use the waterfall methodology than agile. This is because the functional requirements, non-functional requirements and features can be defined during the design phase and then delivered as expected.

In my experience this worked much better than agile, where the requirements and sprints tended to get out of control because of unreasonable demands for feature-rich capabilities up front. This hampered the build and delivery process, so the requirements could not be met in the time available. Given longer they perhaps could have been, and I believe this is because enterprise monitoring solutions take many years to mature: you start small, build in the key features and functions, then grow the solution and tailor it to how you want it. Agile is suited to DevOps and software development lifecycles; for out-of-the-box software I would choose waterfall, as it tends to work better in my opinion.

opcmsg test script

Although OM is reaching end of life and the new platform, OMi, is taking its place, this script is still useful during OM to OMi migration work and can be used when you want to send test events from OML to OMi.

It generates opcmsg events (Operations Manager for Linux only). If message storm detection is active on the OML server, it may well stop the events after a certain number because it thinks they are a message storm. So the script can act as a test for both messages and storms.

You will need to create an opcmsg policy and add the application, object and message group, or deploy one with no conditions for test purposes.

================================================================

#!/bin/bash
# Script to generate test opcmsg events (Operations Manager for Linux only).
# Each pass sends one message per severity, in the background.

date    # start time
cnt=1
for (( i=1; i <= 10; i++ ))
do
    # Straight double quotes are required; typographic quotes break msg_text.
    /opt/OV/bin/opcmsg severity=normal application=tstmsg object=tstmsg msg_grp=UAT msg_text="Test message $cnt" &
    /opt/OV/bin/opcmsg severity=warning application=tstmsg object=tstmsg msg_grp=UAT msg_text="Test message $cnt" &
    /opt/OV/bin/opcmsg severity=minor application=tstmsg object=tstmsg msg_grp=UAT msg_text="Test message $cnt" &
    /opt/OV/bin/opcmsg severity=major application=tstmsg object=tstmsg msg_grp=UAT msg_text="Test message $cnt" &
    /opt/OV/bin/opcmsg severity=critical application=tstmsg object=tstmsg msg_grp=UAT msg_text="Test message $cnt" &
    cnt=$((cnt + 1))

    sleep 3    # pause between passes
done
wait    # let any remaining background opcmsg calls finish
date    # end time

================================================================
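When testing message storm detection it can help to vary the number of passes and the delay, and to check the exact command lines before sending anything. Below is a minimal sketch of that idea. The settings OPCMSG, COUNT, DELAY and DRY_RUN are names I made up for this example; the opcmsg attributes are the same ones used in the script above.

```shell
#!/bin/bash
# Sketch: opcmsg test burst with a dry-run mode.
# OPCMSG, COUNT, DELAY and DRY_RUN are hypothetical settings for this example.

OPCMSG="${OPCMSG:-/opt/OV/bin/opcmsg}"  # path taken from the script above
COUNT="${COUNT:-10}"                    # number of passes
DELAY="${DELAY:-3}"                     # seconds between passes (live mode only)
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands, 0 = send events

# send_round N: one message per severity for pass N
send_round() {
    n="$1"
    for sev in normal warning minor major critical; do
        if [ "$DRY_RUN" = "1" ]; then
            # Dry run: show the command that would be executed
            printf '%s severity=%s application=tstmsg object=tstmsg msg_grp=UAT msg_text="Test message %s"\n' \
                "$OPCMSG" "$sev" "$n"
        else
            "$OPCMSG" "severity=$sev" application=tstmsg object=tstmsg \
                msg_grp=UAT "msg_text=Test message $n" &
        fi
    done
    wait    # returns at once in dry-run mode, since nothing was backgrounded
}

i=1
while [ "$i" -le "$COUNT" ]; do
    send_round "$i"
    if [ "$DRY_RUN" != "1" ] && [ "$i" -lt "$COUNT" ]; then
        sleep "$DELAY"
    fi
    i=$((i + 1))
done
```

By default this only prints the commands. To actually send a fast burst you could run it with something like DRY_RUN=0 COUNT=50 DELAY=0, which should be more likely to trigger storm detection than the three second pause.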

Monitoring Deployment Tips

OM/OMi Monitoring Tips

The tips below will help you with monitoring outcomes. I have used these steps to make sure the deployment and the ongoing process of monitoring are successful, and I hope they will help you.

These were based on OM (Operations Manager) deployments.

  1. Define the monitoring strategy and its lifecycle.
  2. Establish the operating model: ensure roles and responsibilities have been assigned and that at least two administrators have been trained.
  3. Provide formal OM training. The tools are comprehensive and require good education.
  4. Employ service management processes (incident and problem) to help refine the solution.
  5. Apply strict governance to the solution: adding and removing nodes, policy lifecycle management, threshold tuning, ongoing documentation, patch management and agent updates.
  6. Enhance and customise the solution. OM/OMi is not a tool that can be deployed and then left alone; for the first couple of years it requires ongoing tuning and maintenance.
  7. Enforce change controls, and explore business-as-usual vs major changes.
  8. Have a development environment to test patch, agent and SPI updates.
  9. Adopt Smart Plug-ins (SPIs) slowly: test and refine in development before deploying to production.
  10. Discuss the SPIs / Management Packs with subject matter experts and monitor what is important to the business. Tune the thresholds; this is an ongoing process and can sometimes take months or years.
  11. Build in as many standards as possible, for example severity levels. This will help with threshold baselines.
  12. Policy lifecycle: develop, test, deploy and refine the SPI policies, then remove them when they expire.
  13. Keep track of all managed nodes (use Excel/RTSM). If nodes are decommissioned, remove them from OM/OMi.
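
The node tracking in the last tip can be partly checked with a small script. This is only a sketch under my own assumptions: it expects two plain-text files with one node name per line, an inventory export and a list of the nodes currently managed in OM. How you produce those files is up to you, and the function name stale_nodes is made up for this example.

```shell
#!/bin/bash
# Sketch: list nodes still managed in OM that are missing from the inventory.
# Both input files are hypothetical: plain text, one node name per line.

# stale_nodes INVENTORY_FILE MANAGED_FILE
# Prints every name in MANAGED_FILE that has no match in INVENTORY_FILE.
stale_nodes() {
    # -i ignore case, -v invert the match, -x match whole lines only,
    # -F treat names as fixed strings, -f read the names from a file
    grep -i -v -x -F -f "$1" "$2" | sort -u
}

# Run the comparison only when two file names are given on the command line
if [ "$#" -eq 2 ]; then
    stale_nodes "$1" "$2"
fi
```

Any name it prints is a candidate for removal from OM/OMi. Both files need to use the same form of name (all short names or all fully qualified), otherwise every node will look stale.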

 

After following these key steps, you will have a mature monitoring solution.

Windows PC / Laptop Reporter Script

This script runs on a laptop or PC running Windows and reports useful information such as the OS, resources, events and configured settings. It uses PowerShell scripting, HTML and CSS.

I wrote this to help me document the configuration of my laptop and PCs quickly, without downloading any software from the net. I used the Microsoft PowerShell forums and script repository to help build the script.

Reference: https://technet.microsoft.com/en-us/scriptcenter/bb410849.aspx

There are two files: the PowerShell script and a CSS file. Download them from the GitHub link below and run the PowerShell script as administrator; you should get an HTML page with all the details. (Tested only on Windows 8.x running PowerShell.)

 

https://github.com/iopsmon/windows_reporter

 

 

OMi Quick Scan Logs – Log4j Script

This is my script to quickly collect all OMi 10.x logs (log4j) and then scan through them for ERROR, WARNING and DEBUG events.

The script works relative to the time you run it and uses the current date. This limits the amount of data, as there are many logs and entries. It is designed for the OMi administrator to quickly see whether there are any ERRORs across the many log files and to help investigate any problems.

Script Functions:

  • Copies all the logs into a temp folder
  • Shows the size of the logs (can be very large)
  • Scans through the logs and finds ERROR, WARNING and DEBUG entries, then copies them into a new file which contains the log name and the log entry.

You can then look at the files created, there are three files:

  • omi_error_report.txt (Main one used, good for finding ERRORS)
  • omi_warning_report.txt
  • omi_debug_report.txt

(If you want debug info, then you will need to enable this within OMi)
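
To give an idea of how such a scan can work, here is a minimal sketch of the steps listed above. It is not the script from the repository. The log location in LOG_DIR, the temp folder in WORK_DIR, and the line format (entries that start with a YYYY-MM-DD date and have the level surrounded by whitespace) are all my assumptions, so adjust them to your environment. log4j itself writes the level as WARN, so the pattern accepts both WARN and WARNING.

```shell
#!/bin/bash
# Sketch of a "today only" log4j scan. LOG_DIR, WORK_DIR and the line format
# (date at the start of the line, level between spaces) are assumptions.

LOG_DIR="${LOG_DIR:-/opt/HP/BSM/log}"       # assumed OMi log location
WORK_DIR="${WORK_DIR:-/tmp/omi_logs_scan}"  # temp folder for copies and reports

# scan_level PATTERN REPORT_NAME
# Writes "<log file>:<entry>" lines for today's entries at the given level.
scan_level() {
    today="$(date +%Y-%m-%d)"
    grep -r -H -E "^${today}.*[[:space:]]($1)[[:space:]]" "$WORK_DIR/logs" \
        > "$WORK_DIR/$2"
}

run_scan() {
    mkdir -p "$WORK_DIR/logs"
    cp -r "$LOG_DIR"/. "$WORK_DIR/logs"/   # copy all logs into the temp folder
    du -sh "$WORK_DIR/logs"                # show how much data was copied
    scan_level 'ERROR'      omi_error_report.txt
    scan_level 'WARN(ING)?' omi_warning_report.txt
    scan_level 'DEBUG'      omi_debug_report.txt
    wc -l "$WORK_DIR"/omi_*_report.txt     # entries found per report
}

if [ -d "$LOG_DIR" ]; then
    run_scan
else
    echo "Log directory $LOG_DIR not found; set LOG_DIR first" >&2
fi
```

Working on a copy keeps the scan away from the live log files, and anchoring the date at the start of the line keeps the reports small on a busy server.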

The environment was OMi 10.6 / RHEL 6.5.

The script is located on GitHub:

https://github.com/iopsmon/omi_logs_scan