Friday, 2 December 2016

Devops and traditional HPC

Last April I co-presented a talk at Saudi HPC 2016 titled "What HPC can learn from DevOps." It was meant to raise awareness of the DevOps culture and mindset among HPC practitioners, and it was complemented later by an introductory tutorial on containers. This talk was my second attempt at promoting DevOps. The first attempt was at Saudi HPC 2013 with the DevOps afternoon, in which we had speakers from Puppet and Ansible with good examples, back then, of how automation ("infrastructure as code") frameworks encourage communication, visibility, and feedback loops within the organization.

Talk Abstract: 

Cloud, web, big data operations, and DevOps mindsets are rapidly changing the Internet, IT, and enterprise operations and applications scene. What can the HPC community learn from these technologies, processes, and culture? From the IT unicorns (Google, Facebook, Twitter, LinkedIn, and Etsy) that are in the lead? What could be applied to tackle HPC operations challenges, such as the problems of efficiency and better use of resources? A use case of automation and version control in an HPC enterprise data center, as well as a proposal for utilizing containers and new schedulers to drive better utilization and diversify the data center workloads: not just HPC but big data, interactive, batch, short- and long-lived scientific jobs.

Here are my personal notes from that time. Apparently, they did not fit the 15-minute window I was given.



Talk reflections and thought points:


Definitions: Presenting the different possible HPC workloads: HTC, HPC, HSC, and the recent trend of data center convergence by considering big data "analytics" and, more recently, MLDM (machine learning and data mining). Highlighting the diversity and variability of HPC workloads, then moving on to what DevOps means for HPC and why it did not pick up. What can HPC learn from web, cloud, and big data operations?

The disconnect: Traditional HPC software changes are infrequent; HPC does not need to be agile at handling frequent continuous deployments. Each HPC deployment is a cluster snowflake, unique in its own way, making it hard for user groups to port their work to other clusters, a process that takes days, weeks, often months. The concept of application instrumentation and performance monitoring is not the norm, nor are the plumbing and CI/CD pipelines.

The motivation: However, HPC infrastructures inevitably have to grow, and innovations in HPC hardware require a new look at HPC software deployment and development. HPC data centers will need their few highly skilled operations engineers to scale operations efficiently with fewer resources. The fragmented use of system resources needs to be optimized. The scientific and business applications might be rearranged, refactored, and reworked to consider better workflows: analysing application and data processing stages and dependencies, looking at them as a whole of connected parts, while avoiding compartmentalization and infrastructure silos.

The scalability challenge: What could be the primary HPC driver for introducing DevOps culture and tooling? Stressing scalability (the imminent growth due to initiatives like national grids and international exascale computing).

DevOps tools: Emphasize the richness of the tool set and the culture that has driven it. Point out that it is not about the tools so much as the concepts the tools enable: not just automation and the build, ship, and delivery workflows, but the ever-engaging feedback loops, the collaboration, and the ease of integration. Highlight that communication is not just human face-to-face but also meaningful metrics on dashboards, the importance of code reviews, the rich APIs, and the natural UX. Such a comprehensive set of tools, unlike the current fragmented HPC alternatives.

Use case of differences: The case of provisioning, and how the terminology differs between the HPC and web/cloud communities. Taking this example further to pivot to the correct assumption that HPC is no longer just bare-metal provisioning.

Validation: Validating the hypothesis of serious HPC workloads running in the cloud, and recent use cases of container deployments in HPC, drawn from surveys and the production-ready vendor solutions trending over the last couple of years.

2nd generation data center provisioning tools: Present open source alternatives to traditional HPC provisioning tools and highlight their diversity in handling bare metal as well as virtual image instances and containers; possibilities are still there for diskless nodes as well.

The current state of the HPC data center: Highlight the problem of static partitioning and the various workloads needed to either support or complement the bigger business/scientific application, and discuss the valid reasons for partitioning.

Resource abstraction: What if we abstract the data center resources and break down the silos? How should that be done? Which core components need to be addressed, and why? Present a proposal for such tooling with the reasoning behind it.

Unit of change: Container technology is a useful enabler for such problems. It does not carry the performance overhead that made HPC shy away from virtualization-based solutions, and it will enable portability for the various HPC workload deployments. Not to mention the richness of its ecosystem to enhance the current status quo of scheduling, resource, and workload management to greater levels of efficiency and better utilization of data center resources.
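As a toy illustration of that kind of cgroups-backed control (the image, CPU set, and memory limit below are arbitrary placeholders, not a benchmark or recommendation), a container can be pinned to a CPU set and given a memory budget straight from the Docker CLI; a container-aware scheduler would set the same limits per job rather than per node:

$ docker run --rm --cpuset-cpus="0-3" --memory="8g" centos:7 \
      /bin/sh -c "grep Cpus_allowed_list /proc/self/status"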

The software-defined data center: Everything so far can either be code or be managed and monitored by code. How flexible is that, and what opportunities does it bring? How can everything be broken down into components? How do the parts integrate and fit together, enabling a "Lego style" composable infrastructure driven and managed by code, policies, and desired-state models? How has code opened new possibilities for all stakeholders?


Some Docker evaluation use cases:

Challenges ahead: What to expect on the road ahead? The unique differences and requirements? Which underlying container technologies need to be in place, and for what? The right amount of namespace isolation vs. cgroups control; how about LXC, LXD, Singularity, Docker? What would we see coming next?

Conclusion: The importance of having the right mindset to evaluate and experiment with new paradigms and technologies, and eventually deploy and utilize them in production; to introduce new workflows; and to enable better communication between the different teams (developers, users, security, operations, business stakeholders). The concept of indirection and abstraction to solve computing problems, in this case the two-level scheduling indirection for granular resource management. The container unit of abstraction for the workload is not just for applications; it could be data units as well.

References:

https://blog.ajdecon.org/the-hpc-cluster-software-stack/
http://sebgoa.blogspot.com/2012/11/i-was-asked-other-day-what-was.html
http://qnib.org/data/isc2016/2_docker_drivers.pdf


Saturday, 22 March 2014

What if an Ansible Run Hangs?

Running Ansible against thousands of nodes, I was not fully aware of the status of some of the nodes before the run: some were heavily loaded and busy, some were down. Such heavily loaded or OOM nodes, or even playbook tasks that are prone to waiting and blocking, will cause Ansible to hang. Below are some of the steps that I followed, or that were collected from the Ansible mailing list*, to help debug such a hang:

Is it the initial connection?

use -vvvv to troubleshoot the connection
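A minimal sketch of what that looks like (the inventory file, playbook name, and host pattern are illustrative):

# test the raw connection to one suspect node first
$ ansible node0042 -i production -m ping -vvvv

# then rerun the playbook against that node only, with full SSH debugging
$ ansible-playbook -i production site.yml --limit node0042 -vvvv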

What you call a hang could be normal behavior, even if it is not what you intended:

From the Ansible playbooks async documentation:

By default tasks in playbooks block, meaning the connections stay open until the task is done on each node. This may not always be desirable, or you may be running operations that take longer than the SSH timeout.
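In playbooks this is controlled with the async and poll task keywords; from the ad-hoc CLI the equivalent is the -B/-P flags. A hedged sketch (inventory, host pattern, and the long-running command are illustrative):

# fire and forget: allow up to 3600 seconds, never poll for the result
$ ansible webservers -i production -B 3600 -P 0 -m command -a "/opt/scripts/long_job.sh"

# or run it in the background but check on it every 30 seconds
$ ansible webservers -i production -B 3600 -P 30 -m command -a "/opt/scripts/long_job.sh"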

Is it the remotely executed task?

  • Run ansible-playbook with ANSIBLE_KEEP_REMOTE_FILES=1
  • create a Python trace file
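A sketch of the first step (the playbook name, host, and exact temp path are illustrative; the generated module files normally land under the remote user's ~/.ansible/tmp):

# keep the generated module files on the target instead of cleaning them up
$ ANSIBLE_KEEP_REMOTE_FILES=1 ansible-playbook -i production site.yml -l node0042 -vvvv

# then, on the target node, locate the copied module for the hung task
$ ls ~/.ansible/tmp/ansible-*/

The trace run below shows what the second step looks like: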

$ python -m trace --trace /home/jtanner/.ansible/tmp/ansible-1387469069.32-4132751518012/command 2>&1 | head 
  --- modulename: command, funcname: <module> 
command(21): import sys 
command(22): import datetime 
command(23): import traceback 
command(24): import re 
command(25): import shlex 
  --- modulename: shlex, funcname: <module> 
shlex.py(2): """A lexical analyzer class for simple shell-like syntaxes.""" 
shlex.py(10): import os.path 
shlex.py(11): import sys 



Possible causes of hangs :

  • a stale shared file system on the remote targeted node
  • a yum-related task while another yum process is already running on the targeted node
  • module dependencies, such as the requirement to add the host to known_hosts in advance or to forward SSH credentials
  • issues with sudo, where the ssh user and the sudo user are the same but sudo_user is not specified
  • some command module tasks expecting input from stdin
  • the setup module hanging due to a hardware or OS related issue; updated firmware or drivers could help
  • network or firewall related issues, or a change to the network/firewall/load balancing caused by the Ansible run itself
  • a lookup issue (e.g. DNS or user lookup)


* Thanks to Michael DeHaan and the Ansible developers for an awesome codebase, and thanks to James Tanner for his help and pointers on the Ansible users mailing list and IRC.

* This was written at the time of Ansible 1.4.2 in a RHEL/CentOS based environment; ssh connections could be further improved by enabling ControlPersist or pipelining mode.
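For reference, a minimal sketch of what that tuning looks like in ansible.cfg (the section name is the standard one; whether pipelining is honoured depends on your Ansible version, and it requires requiretty to be disabled in sudoers):

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
pipelining = True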

Thursday, 30 January 2014

DFIR Dec. 2013 Memory Forensics Challenge notes :

This is my first memory forensics investigation outside of the SANS 508 SIFT workstation labs: investigating Timothy Dungan's workstation in the "Stark Research Labs intrusion by Hydra" case. So even though I believe that I have answered the questions asked on the SANS DFIR blog, there is still a lot to learn and more skills to sharpen. With lots of curiosity, Volatility, Redline, and the SIFT workstation, it is easy to run a memory investigation, especially if one is equipped with the SANS 508 course material and the Volatility IRC channel. Below are my scattered notes from three separate sessions; the overall time it took is over 7-8 hours, and it could have been done in one session with more focus and less distraction from the kids.

[ note to self: collect more reports and screenshots next time, and write the report as you go along ]

Using Mandiant Redline:

I used Redline whitelisting to filter out a large amount of data that is not likely to be interesting: data that corresponds to unaltered, known-good software components. However, I was not successful at finding red flags (rogue processes) straight away. There were three suspicious processes I was targeting, but I could not find the obvious anomaly the malware introduced to the system, so I started looking for other low-hanging fruit/signals that could give me a good pivot point, also using the low frequency of occurrence technique and focusing on the DFIR challenge questions to stay on track.

  • Suspicious untrusted handle pork_bun associated with the explorer.exe process (pid: 1672)




Possible global rootkit cloaking activity via a System Service Descriptor Table (SSDT) hook:
The hooking module name looks suspicious: irykmmww.sys is hooked into ntoskrnl.exe at NtEnumerateKey, NtEnumerateValueKey, and NtQueryDirectoryFile, which are used to hide things:
  • NtEnumerateValueKey: Allows an application to identify and interact with registry values. Malware uses this to insert itself between any registry value request and filter out the values it wants to hide.
  • NtEnumerateKey: Allows an application to identify and interact with registry keys. Malware uses this to insert itself between any registry key request and filter out any registry keys it may want to hide.
  • NtQueryDirectoryFile: Gives the application the ability to perform a directory listing. By hooking this function malware can hide directories or files from normal file managers as well as anti-malware tools.
  • NtDeviceIoControlFile: the API Windows uses for network-related operations, widely mentioned in malware behavior analysis papers. Malware can use it to replay network traffic; how cool is that?!


Not to mention my company campus ISP blocks me from doing some more research ;-)




Not that it cannot be overridden with any vpn connection.

I tried to acquire the driver for further analysis; however, Redline couldn't dump it. You will see later that I was able to dump it with Volatility, which proves why you need to know more than one tool: most likely one tool will not fit all situations, and tools will fail you most when you need them.

Using Volatility to cross check and dig deeper:

Treating it as a real case, I preserved the initial image as a read-only file and recorded its hash value:

                 $ sudo chattr +i dfir-challenge/APT.img
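The hashes can be recorded with something along these lines (the choice of algorithms and file names is up to you):

                 $ md5sum dfir-challenge/APT.img > APT.img.md5
                 $ sha1sum dfir-challenge/APT.img > APT.img.sha1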

To start processing we need to know more about the image file profile, so we run imageinfo:

sansforensics@SIFT-Workstation:/cases/dfir-challenge$ vol32.py -f ./APT.img imageinfo
Volatile Systems Volatility Framework 2.1_alpha
Determining profile based on KDBG search...

Suggested Profile(s) : WinXPSP3x86, WinXPSP2x86
AS Layer1 : JKIA32PagedMemoryPae (Kernel AS)
AS Layer2 : FileAddressSpace (/cases/dfir-challenge/APT.img)
PAE type : PAE
DTB : 0x319000
KDBG : 0x80545b60L
KPCR : 0xffdff000L
KUSER_SHARED_DATA : 0xffdf0000L
Image date and time : 2009-05-05 19:28:57
Image local date and time : 2009-05-05 19:28:57
Number of Processors : 1
Image Type : Service Pack 3

PROFILE : WinXPSP3x86

A normal process listing shows the processes that are not hidden by unlinking from the doubly linked process list structure.
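In Volatility that is the pslist plugin, which walks the doubly linked list, run with the profile suggested by imageinfo above; something along these lines:

$ vol32.py -f ./APT.img --profile=WinXPSP3x86 pslist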


Cross-examining the processes seen normally via the doubly linked list vs. the ones carved out of memory structures:
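psscan carves process objects out of memory regardless of list linkage, so diffing it against pslist highlights unlinked processes; newer Volatility builds also ship psxview, which does the cross-view in one shot:

$ vol32.py -f ./APT.img --profile=WinXPSP3x86 psscan
$ vol32.py -f ./APT.img --profile=WinXPSP3x86 psxview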


Scanning for network artifacts: since this is assumed to be an APT (advanced persistent threat) case, one good lead would be that, if the box was infected, at some point the malware would have to connect to its covert command-and-control (C2) channels; or, if this was not the originally infected machine, data exfiltration activity should leave some breadcrumbs for us to trace.
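For an XP image the usual plugins are connections, connscan, and sockets; connscan in particular carves out connection objects even if they have been closed or unlinked:

$ vol32.py -f ./APT.img --profile=WinXPSP3x86 connections
$ vol32.py -f ./APT.img --profile=WinXPSP3x86 connscan
$ vol32.py -f ./APT.img --profile=WinXPSP3x86 sockets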



Interestingly enough, in the connection scan above we see port 443, usually a firewall-friendly port, appearing to be either inactive or stealthy. However, it is from the same process to the same IP; the process is explorer.exe (1672). Trying to locate that IP using whois on 222.128.1.2, as seen below, we find out that the IP belongs to our friends at a state-owned ISP in Beijing, China.

Usually malware will set a mutant so that it does not cause issues to the system or to itself by trying to install or configure itself more than once; that is done by checking whether a certain mutant already exists. One interesting mutant I have seen in both Redline and Volatility was the pork_bun mutant.
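Volatility can confirm it either by carving mutant objects or by listing the handles of the suspect process (the grep pattern is just what I was hunting for here):

$ vol32.py -f ./APT.img --profile=WinXPSP3x86 mutantscan | grep -i pork
$ vol32.py -f ./APT.img --profile=WinXPSP3x86 handles -p 1672 -t Mutant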


Now that I am quite confident that explorer.exe (pid: 1672) is the rogue process, finding which file carries the malware, in case it was injected or hollowed, is quite a tedious task. However, I double-checked the least frequent, strangely named, unsigned handles, starting with the executable DLLs, as well as the SSDT hooks:




Both the DLL search and the SSDT hooks via Volatility arrived at the same conclusion as Redline, and this time I was able to dump the driver irykmmww.sys and confirm it is rogue using VirusTotal.
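The commands were along these lines (the egrep filter just hides the legitimate ntoskrnl/win32k entries, and the dump directory is arbitrary):

$ vol32.py -f ./APT.img --profile=WinXPSP3x86 dlllist -p 1672
$ vol32.py -f ./APT.img --profile=WinXPSP3x86 ssdt | egrep -v '(ntoskrnl|win32k)'
$ vol32.py -f ./APT.img --profile=WinXPSP3x86 moddump --regex=irykmmww -D dump/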




Most of the VirusTotal findings point to a generic trojan/backdoor rootkit installed using an exploit rather than spreading like a virus, delivered via social engineering, probably phishing, as is the norm with APT; however, I am not able to tell with the research done so far.




VirusTotal also confirmed that a variant of the notorious Poison Ivy trojan was used, which was famously used to attack RSA's SecurID infrastructure in 2011; it is going strong after eight years and is still being used in targeted attacks.

Another finding is that the malware logs its activity to:

C:\DOCUME~1\demo\LOCALS~1\Temp\irykmmww.log

So, running filescan and saving the output to a file for further analysis, I can see another suspicious explorer-like file or two, for example:
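Something along these lines (the output file name and grep pattern are arbitrary):

$ vol32.py -f ./APT.img --profile=WinXPSP3x86 filescan > filescan.txt
$ grep -i explo filescan.txt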


'\\WINDOWS\\system32\\exploder.exe' does not make sense to be running under system32?!



And with that I have almost answered the 5 DFIR questions: the process was 1672 explorer.exe; irykmmww.sys is what is hiding the malware artifacts from the system; and persistence was most likely achieved with DLL injection via irykmmww.dll.

There is more for me to follow up on and research, and more notes that I should have collected in real time and posted. Hopefully the next investigation will prove more conclusive and complete, and I will be more familiar with Windows internals by then.

Final note: SANS highly recommends that intrusion/incident reports not state personal opinions and present facts only; however, for my own learning process I have included some of my opinions, and hopefully I will validate them soon if SANS DFIR publish their solution.

Saturday, 7 December 2013

Dynamic Test/Evaluation Environment 

Vagrant, Ganeti, and OpenStack are great tools for a dynamic, data-driven test environment. Couple them with a configuration management tool (CFEngine 3, Chef, Puppet, Ansible, or SaltStack) and you will start having more time on your hands and appreciating life around you. The possibilities are endless. If you are looking for a highly available backend infrastructure, Ganeti is your solution, already used by the "Open Source Labs", Google, Mozilla, and the Greek Research and Technology Network, among others, to manage clusters of virtual environments with resilience in mind. If you are looking for flexibility and for providing your users with a private cloud solution, OpenStack will do. For testing new administration tools, policies, cookbooks, manifests, playbooks, and blueprints, Vagrant is the way to go. Add the combination of these three together and you have dynamic solutions that scale from a few virtual nodes on your own laptop or workstation to Amazon EC2 or your own company's private cluster environment.
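For the Vagrant part, a minimal round trip looks something like this (the box name and URL are placeholders; wire in your provisioner of choice through the Vagrantfile):

$ vagrant box add centos64 http://example.com/boxes/centos64.box
$ vagrant init centos64
$ vagrant up
$ vagrant ssh -c "hostname"
$ vagrant destroy -f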

Devops afternoon in Khobar, Saudi Arabia


DevOps and web operations did not pick up in the Middle East as they did in the US, Europe, China, and India. We had a chance to present at the HPC Saudi 2013 user group conference that was coordinated by our technology planning engineer Khalid Chatilla and Intel/IDC. We decided to check with CFEngine, PuppetLabs, AnsibleWorks, and Opscode whether they could participate, and they showed interest even though it was already the end of the year and budgets were already consumed, not to mention the short notice, logistics, and planning needed to secure their coming to Saudi Arabia. In the end AnsibleWorks and PuppetLabs managed to come and delivered an awesome afternoon. My colleague and friend Ahmed bu Khamessin, with his limited video resources, was able to capture some of these moments with his camera, and even though the sound quality is not great, he made it public to the world. You can see my intro slides and Ahmed's videos below.

Prezi introduction to Saudi DevOps Days, with use cases from CFEngine and Chef.

Ansible presentation :



PuppetLabs presentation on YouTube


Wednesday, 19 June 2013

Software packages and repositories

Software packages and repositories are my first stop in automating the OS life cycle. The OS image, including all the software stacks (OS, middleware, management, and application), should represent a fixed state, and that would be difficult to track if installs were done ad hoc outside of a packaging system. We mainly use RHEL-based distros, so you would think the answer is simply yum and RPMs! Well, there are Java applications shipped as JARs, there are Ruby gems, there are Python eggs, and there are git clones and tarballs. One answer is to use fpm to convert from any format to RPM.
  • One challenge is the diversity of packaging types and how to standardise on one.
  • Second comes the Internet isolation and state: at work we are not allowed to download directly from the net.
For this second problem I need a way to mirror publicly accessible or enterprise-provided repos to internal repos. The easiest choice is to mirror everything and copy/rsync it over to work periodically.
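A hedged sketch of both pieces (the package names, repo id, and paths are illustrative): fpm converting foreign formats into RPMs, and reposync/createrepo building a local yum mirror that can later be rsynced inside:

# convert a Ruby gem, or any supported source, into an RPM
$ fpm -s gem -t rpm geminabox
$ fpm -s dir -t rpm -n mytool -v 1.0 /opt/mytool

# pull down a public yum repo and turn the copy into a usable repository
$ reposync --repoid=epel --download_path=/srv/mirror/
$ createrepo /srv/mirror/epel/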

For Ruby gems, here is the simplest way to do it:

http://stackoverflow.com/questions/8411045/how-to-build-a-rubygems-mirror-server

$ gem install rubygems-mirror

Edit the YAML configuration file ~/.gem/.mirrorrc:

---
- from: http://rubygems.org
  to: ~/.gem/mirror

(the to: field above can just as well point to USB storage or wherever you want the mirror to live)

$ mkdir ~/.gem/mirror

Start mirroring:

$ gem mirror

Once mirroring finishes, edit ~/.gem/mirror/config.ru:

require "rubygems"
require "geminabox"

Geminabox.data = "./"
run Geminabox

Install Gem in a Box:

$ gem install geminabox

Start the gem server:

$ cd ~/.gem/mirror
$ rackup

Edit your application's Gemfile to use your gem server:

source "http://your.servers.ip:9292"

Tuesday, 21 May 2013

Virtualbox guest host NATed


After installing CentOS 6.4 as a guest OS on Windows 8.0 and configuring the single network interface in NAT mode, I could not, at first, ssh with PuTTY to the guest OS's DHCP-assigned IP address of 10.0.2.15.

I had to power off the guest and enable port forwarding first, as described in the NATFORWARD section under the NAT networking mode in chapter 6 of the user manual.

https://www.virtualbox.org/manual/ch06.html#natforward

Below are the commands I used to configure and check port forwarding:

 .\VBoxManage list vms
 .\VBoxManage modifyvm "CentOS01" --natpf1 "guestssh,tcp,,2222,,22"
 .\VBoxManage.exe showvminfo CentOS01 | findstr "2222"
NIC 1 Rule(0):   name = guestssh, protocol = tcp, host ip = , host port = 2222, guest ip = , guest port = 22
In PuTTY, the host is localhost and the port, in this case, will be 2222.
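Equivalently, from any OpenSSH client on the host (the guest user name is just an example):

$ ssh -p 2222 demo@localhost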

The above was done to test RDO, the OpenStack Red Hat distribution. I had several failures before I was able to install it successfully using the quickstart: first, due to the disk size (the disk should be over 22 GB so that Cinder can create its 20 GB volume by default); second, SELinux needs to be enabled. And every time it fails you need to remove the Cinder packages and the logical volume manually, cleaning up the bits and pieces from the old installation, before restarting it.

A successful install should not take more than 20-30 minutes.