Monday, 12 June 2017

Concurrency graphs

Just last week we were discussing one of the JMeter test scripts in our team, and it occurred to me that it would be useful to see the concurrency of the requests, meaning which requests run in parallel. That should very roughly represent which threads on the server are doing the same thing at the same time, and it is interesting as another way to look at what's going on during the test.

I knew there wasn’t such a chart in JMeter, because I had thought about it briefly a couple of times before and couldn't find it, but I was surprised to find that there wasn’t any ready-to-use tool to make that kind of chart. The closest I found was Gantt charts, but they are not designed to deal with thousands of requests and hundreds of threads. So I set out to make my own tool. In R, of course.

Here’s how the result looks (generated from one of the internal tests, legend cut off not to disclose anything):
The chart above shows which requests are currently running on each of the concurrent JMeter threads. It gives a very high level picture of what’s going on with the concurrency (requests are shown in different colors each, so you can assess the concurrency just by looking here). It is kinda hard to read though, especially in a case like this where we have a lot of requests, long duration of the test and long user thinking times in between. So here's another way to look at the same data:
This chart shows calculated concurrency level for different points of time (points of time being request start times). Concurrency is calculated as number of “overlaps” - i.e. if in the first graph you were to draw a vertical line and count the number of same-colored horizontal blocks intersecting that line - that would be the concurrency level for the corresponding request, and that value is displayed on the chart as (request_start_time, concurrency). Calculation is pretty rough, for more precision we'd like to calculate concurrency at regular time intervals, not just at request start times - that does mean longer running time and even harder to read chart though, so I decided I'm happy with what I've got here.
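To make the "count the overlaps at each request start time" idea concrete, here is a minimal Python sketch. It is not the R script from this post: the function name and the tuple layout are made up for illustration, and it assumes each sample has a label, a start time and a positive elapsed time in milliseconds (as in JMeter's `label`, `timeStamp` and `elapsed` CSV columns).

```python
from collections import defaultdict


def concurrency_at_starts(samples):
    """Rough concurrency per request, measured at request start times.

    samples: iterable of (label, start_ms, elapsed_ms) tuples (hypothetical
    layout). Returns (label, start_ms, concurrency) tuples sorted by start.
    """
    # Group the [start, end) intervals by request label ("same colour").
    by_label = defaultdict(list)
    for label, start, elapsed in samples:
        by_label[label].append((start, start + elapsed))

    result = []
    for label, intervals in by_label.items():
        for start, _end in intervals:
            # Draw a "vertical line" at this start time and count the
            # same-label requests in flight (the request itself included).
            level = sum(1 for s, e in intervals if s <= start < e)
            result.append((label, start, level))
    return sorted(result, key=lambda r: (r[1], r[0]))


samples = [("login", 0, 100), ("login", 50, 100),
           ("search", 60, 10), ("login", 120, 30)]
for row in concurrency_at_starts(samples):
    print(row)
```

The comparison is quadratic per label, which is fine for a sketch but would need a sweep over sorted start and end times for very large result files.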
Of course, even this chart isn't easy to read, so if you want to drill into the data, I suggest running the R scripts in interactive mode and either looking at the text data, or charting only the specific request you are interested in. The code is there, and is hopefully easy to understand.
It's all good and nice to look at the data from the tests, but for it to be really useful you want to know whether it resembles production. You can check that by producing similar graphs from production usage data, e.g. from access.log. Please note that access.log can be much noisier than JMeter output files, since in JMeter we get to name the samplers, whereas access.log will have each request with all the URL parameters included. So before graphing or calculating concurrency, you'll need to clean the data and filter out requests you are not interested in (e.g. you might want to remove static resources and/or remove dynamic user- or session-dependent parameters from other URLs). An example of such filtering is in the script.
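As an illustration of that kind of clean-up (again in Python, and not the filtering from the actual script), the sketch below drops static resources and strips query strings, session suffixes and numeric IDs from request paths. The list of static extensions, the `{id}` placeholder and the function name are assumptions; a real access.log will need its own rules.

```python
import re

# Assumed list of "static resource" extensions; adjust for your application.
STATIC_RE = re.compile(r"\.(css|js|png|jpe?g|gif|ico|svg|woff2?)$", re.IGNORECASE)
# A purely numeric path segment is treated as a user- or record-specific ID.
ID_RE = re.compile(r"/\d+(?=/|$)")


def normalise_path(url):
    """Return a cleaned request label, or None if the request should be dropped."""
    path = url.split("?", 1)[0]   # drop the query string
    path = path.split(";", 1)[0]  # drop suffixes such as ;jsessionid=...
    if STATIC_RE.search(path):
        return None               # static resource: not interesting here
    return ID_RE.sub("/{id}", path)


print(normalise_path("/app/order/12345/view?session=abc"))
print(normalise_path("/static/site.css?v=3"))
```

After this step, requests that differ only in parameters end up with the same label, which is what the concurrency calculation needs.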

Feel free to grab the scripts, and please let me know in the comments if they were useful, or if you see some error in the calculations.

Monday, 13 July 2015

Security ramble

All the recent hacks where massive amounts of personal data have been exposed made me wonder whether, in time, public perception of privacy and data security will change.
What I mean is people nowadays seem very much surprised and distressed whenever their data gets stolen, be it photos from iCloud, or SSN and address info from governmental databases, or PHI from health and insurance providers. It's almost like your average Joe or Jane does not expect it to ever happen... but data gets stolen all the time. And I don't see any reason for these hacks to stop in the near future.

Wouldn't it be more reasonable to assume that every information storage system will be hacked, and any data will be stolen? This assumption gives you the state of mind and the tools to concentrate on active monitoring and a mitigation plan, whereas nowadays it looks like people mostly concentrate on preventing the hack (some big hacks went unnoticed for many months!).

I would much prefer if people responsible for the systems where my personal data is stored:

  • Assumed they are gonna be hacked.
  • Made sure when it happens they will notice (automated smart monitoring systems).
  • Made sure it is complicated and/or expensive to use stolen data to harm me (block bank accounts, make it possible to cancel ID easily, make it hard to make sense of my PHI without some key that is also easy to cancel/revoke, make sure devices that can physically harm me have inbuilt protection against that physical harm - e.g. it shouldn't be possible to program a heart pacemaker to murder its wearer).
  • Worked on making the attack expensive (we are gonna be hacked, but it will be an annoying, frustrating and expensive process for a hacker) and long (store unrelated data in different disconnected places, so you have to do a separate hack for each of the pieces).
And I myself am assuming my data can be stolen at any point, so I am trying to behave with that assumption in mind:
  • There are no private emails or photos that, if made public, will harm me - I do not put stuff that can harm me on the internet. I don't say shitty things about people behind their backs. I do not lie. Not that I naturally feel the need to do all that stuff, but assuming you can get exposed at any moment does provide additional motivation to refrain from being a dick.
  • My money is stored in different places, and my cards are not connected to my savings.
  • My most important email account is behind 2FA, and it is connected to my phone, so if it is compromised, I will notice, and I can block it fast.
  • And last but not least, I am mentally prepared that it can all fail me. If that happens it will mess me up a bit and create some hassle to block/change/restore cards, accounts and IDs, but it will not be the end of the world.

Thursday, 9 April 2015

Oracle troubles and findings: tuning experience

I am not an Oracle DBA. But with performance testing I more often than not end up creating the whole environment, which means setting up Oracle server as well. Since I am not testing Oracle server specifically, I usually only tune it enough for it to not be the bottleneck. Lucky for me Oracle server is actually pretty cool compared to applications I test, and only gets to be a bottleneck in scalability testing where it serves multiple application servers. Still... recently I've bumped into a set of correlated problems that led me to tuning effort on the Oracle server itself.
Sponsored by the Internet - meaning that everything I've done was googled on the internet and applied with fingers crossed.

So, here it goes.

The first problem was the following: when I went from 4 application server nodes to 6 application server nodes, response times skyrocketed. It was obviously an infrastructural problem, and after a while Oracle was the only suspect left. Unusually, CPU usage wasn't that high on the Oracle server, so I had to dig a bit deeper, and guess what I found: concurrency issues such as "cursor pin S on X"!

To avoid doing hard parses, Oracle puts any new query and its execution plan into a shared cursors tree. Access to that tree is controlled by a mutex pins algorithm. Only one session can grab a mutex pin for a specific cursor at a time. Also, similar queries are being put as leaves with a common root, and my understanding is the whole root is being pinned during any updates in the tree...

Anyway, it was happening for two reasons:
1. Application under test was using queries with literals where it should've been using prepared statements.
2. My oracle version (11.2.0.1) had known issues around shared cursors tree.

So I've updated to 11.2.0.4 and set CURSOR_SHARING=FORCE. What this option does is it replaces all literals in all queries by system variables, which effectively means that all the queries that only differ in literals are now treated as the same query, they have the same cursor, and cursors tree doesn't need to be constantly updated. This took care of concurrency issues. It also created another problem.

Suddenly one of my other queries which was never a problem before, a very simple and well behaved query, became a huge bottleneck. It would take thousands of CPU cycles to execute where before it was tens of cycles! This one took days of my time, numerous experiments that slightly improved the situation but didn't solve the main issue, and in the end I had to go to DBAs for help.

Turned out that innocent "1=2" in that query (which was there because the query was dynamically generated with optional conditions) was replaced by something like ":SYS_0=:SYS_1", and that meant Oracle was grabbing those variables and evaluating the clause again and again for each row in a huge table (I would think it would do it once, understand it's FALSE and leave it at it - but no).
This was of course the result of CURSOR_SHARING=FORCE. I'll say in advance that I got exactly the same behaviour with CURSOR_SHARING=SIMILAR.
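The effect is easy to picture with a toy rewrite. The Python sketch below is only an illustration and is not how Oracle implements CURSOR_SHARING: it swaps every string or numeric literal for a numbered placeholder (the `:SYS_n` naming just mirrors the "something like" example above, and the table and column names are invented), so queries that differ only in literals collapse into one text, and the constant "1=2" turns into a comparison of two variables.

```python
import re

# Matches a quoted string literal ('...') or a standalone number.
LITERAL_RE = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def force_like_rewrite(sql):
    """Toy imitation of literal replacement; hypothetical helper, not Oracle's algorithm."""
    counter = 0

    def repl(_match):
        nonlocal counter
        name = f":SYS_{counter}"
        counter += 1
        return name

    return LITERAL_RE.sub(repl, sql)


# Two queries that differ only in a literal now share one text (one cursor).
print(force_like_rewrite("SELECT * FROM t WHERE id = 42"))
print(force_like_rewrite("SELECT * FROM t WHERE id = 7"))
# The constant condition is no longer recognisable as a constant.
print(force_like_rewrite("SELECT * FROM orders WHERE 1=2 OR name = 'x'"))
```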

The suggested fix in my case was either to switch to prepared statements everywhere so that we don't need to use CURSOR_SHARING=FORCE/SIMILAR, or to remove "1=2" from the query that suffered from that setting. Can't have it both ways.

Other useful tuning:

  • Increasing shared_pool_size.
  • Increasing session_cached_cursors.
  • Weirdly enough, locking statistics on selected columns helped.

Sunday, 1 February 2015

HAProxy balancing https backends

Recently I needed to configure load balancing in my environment, where I needed to balance between a few https servers with sticky sessions enabled. I looked in the haproxy manual, I googled, I asked - and for days there was no making it work.

Most of the haproxy configuration examples out there are for the case when a client connects to haproxy via https, and then haproxy decrypts it and balances requests between http backends. The few examples around https backends assumed that no sticky sessions are needed, so they all sit on top of tcp. To this day I have not found a guide or an example of how to configure what I need, so once I figured out how to do that, I thought I'd share.

So the way you do it is:
0) You need haproxy 1.5+. haproxy before that did not support https on its own.
1) A client connects to haproxy via https. There needs to be a certificate+private key combination (that the client would trust) on the haproxy server.
2) HAProxy decrypts the traffic and attaches a session cookie. If the cookie is already there, it knows where to send the request further.
3) HAProxy encrypts the traffic again before sending it to backend (where backend can decrypt it).
4) and the other way around.

And the configuration for that is:

  • For both backend and frontend you should have mode http.
  • In the bind line you need to add ssl crt <path to haproxy certificate + private key file>.
  • In the backend section you need to set load balancing algorithm - e.g. roundrobin or leastconn.
  • In the backend section you also need to set a cookie - e.g. cookie JSESSIONID insert indirect nocache.
  • For each server you need to say "ssl" after the ip, and then also set a cookie.
For me the one part I couldn't find in any guides was to put "ssl" in the server line (as well as in the bind line). I might have missed it somewhere in the not-so-helpful haproxy manual.

One thing I didn't go into was setting up proper certificates on the backend servers in my environment, because of course in a test environment they are self-signed and all that. To work around it, just add another global setting to the haproxy settings: ssl-server-verify none.

And here's the example of the config file frontend & backend sections to make it work:

frontend  main
    mode http
    bind :443 ssl crt /etc/haproxy/cert.pem
    default_backend app

#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    mode http
    balance     roundrobin
    option httpchk GET /concerto/Ping
    cookie JSESSIONID insert indirect nocache
    server  app1 10.0.1.11:443 ssl check cookie app1
    server  app2 10.0.1.12:443 ssl check cookie app2
    server  app3 10.0.1.13:443 ssl check cookie app3
    server  app4 10.0.1.14:443 ssl check cookie app4

Monday, 15 December 2014

We suck in security

And by “we” I mean humanity, at least the part of humanity that uses computers to create, store and share information. This is my main take from this year’s kiwicon 8. And this is the story of how I got there.
Disclaimer: I’m just gonna assume that presenters knew what they were talking about and use what they said shamelessly, and then give details in the further blog posts (or for now you can probably look at the posts of other attendees).
According to Rich Smith, Computer Security appears where Technology intersects with People. Makes sense. So let's look at the technology and at people separately.

1. Technology.
At this two-day conference alone, vulnerabilities were demonstrated in the very software that we rely on to keep us safe: it turns out that a well-established Cisco firewall and most of the antiviruses are pretty easy to hack (for people who do that professionally). That’s like a wall with many holes. You get relaxed because wow, wall, and the next thing you know all your sheep are stolen.

Firewalls and antiviruses are software, but the problem runs deeper. Protocols and languages! The Internet has not been designed for security, and it looks like all the crutches we have built for it since don’t help as much as you would hope. JavaScript with its modern capabilities is always a ticking bomb in your living room, and now there is also WebRTC, which was designed for peer-to-peer browser communications, and which helps tools like BeEF hide themselves. BeEF is stealthy as it is, so maybe it doesn’t make an awful lot of difference, but you can see the potential: where earlier a BeEF server would control a bunch of browsers directly, now it can also make browsers control other browsers. Thank you, WebRTC. You would think that technologies arising these days would try to be secure by design, but oh well…

Wanna go even deeper? Ian “MCP” Latter presented a proof of concept for a new protocol that allows transferring information through the screen and through a programmable keyboard. He also demonstrated how an exploitation framework built on top of these protocols allows a perpetrator to steal information, bypassing all the secure infrastructure around the target. The idea is you are not passing files, you are showing temporary pictures on the monitor screen, you capture this stream of pictures, and you decipher information from those pictures. Sounds like something out of “Chuck”, yet this is a very real technology. As its creator said, “By the nature of the component protocols, TCXf remains undetected and unmitigated by existing enterprise security architectures.”
Then there are also internet-spread vulnerabilities that got known in the last year, that for me as a bystander sound mostly like: a lot of people build their products on top of some third-party libraries. When those libraries get compromised, half of the internet is compromised. And they do get compromised.
Then there is also the encryption problem, where random numbers aren’t really as random as people using them think. But compared to all above it sounds like the least of our problems.
Okay, so technology isn’t as secure as we want, what about people?

2. People
People are the weakest link. Forget technology, even if we were to make it perfect, people would still get security compromised. And according to many speakers at Kiwicon, so far the security field sucks at dealing with people. There is a widespread default blame culture: when someone falls victim to social engineering, they get blamed and fired. That is hardly how people learn, but that is exactly how you create an atmosphere in which no one goes to the security team when in doubt, for fear of getting fired. Moreover, we don’t test people. We don’t measure their “security”, and we don’t know how to train them so that training would stick - because we don’t know what works, and what doesn’t.
So, we have problems with technology and with people. What else is bad?

There are plenty of potential attackers out there. Governments, enforcement agencies, corporations, individuals with various goals from getting money to getting information to personal revenge… they have motivation, they have skills and tools, and it is so much cheaper to attack than it is to defend (the so-called “asymmetric defence”).
To make it even easier for attackers, targets don’t talk to each other. They don’t share information when they have been attacked, because reporting such a thing is seen as compromising yourself. And even when information is willingly shared, we don’t have good mechanisms to do that, so we do it the slowest way: manually. It might be easy enough in simple cases, but as @hypatia and @hashoctothorpe said, complex systems often mean complex problems, which makes them hard to describe and to share.

So, we suck in security. This is quite depressing. To make it a bit less depressing, let's talk about solutions that were also presented at Kiwicon (in some cases).

Most solutions were for the “People” part of the problem. Not one but three speakers talked about that.
The short answer is (in words of Etsy’s Rich Smith): ComSec should be Enabling, Transparent and Blameless.
The slightly longer answer is:
  • Build culture that encourages people to seek assistance from the Security specialists and to report breaches (don’t blame people, don’t try to fix people - fix the system).
  • Share information between departments and between organizations.
  • Proactive reach: for security team to reach to development and help them develop secure products.
  • Build trust.
  • Recognise that complex system will have complex problems.
  • Do realistic drills and training, measure the impact of training and adjust it.
People are reward driven and trustful by default. What makes it a problem is that people are thus highly susceptible to social engineering methods which are many. This can’t be fixed (do we even want it fixed?), but at least we can make it super easy to ask professionals for help without feeling threatened.

Okay, so situation in the People area can be improved (significantly if everyone were to follow Etsy’s culture guidelines) - at least for some organizations. What about the Technology area? Well… this is what I found in the presentations:
  • Use good random numbers.
  • Compartmentalize (don’t keep all eggs in one basket, don’t use flat networks, don’t give one user permissions to all servers, etc.).
  • Make it as expensive as possible for attackers to hack you: anti-kaizen for attackers, put bumps and huge rolling stones in their way, make it not worth the effort.
  • Know what you are doing (e.g. don’t just use third-party libraries for your product without verifying how secure they are).
  • …?
This is depressing, okay. In fact, I’m gonna stop here and let you feel how depressing it is. And then in the next posts I’ll write about more cheerful things. Kiwicon was really a lot of fun and epicness (I was in a room full of my childhood heroes, yeeey!). And there was a DeLorean. Doesn’t get much more fun than that. :-D

Wednesday, 19 November 2014

R for processing JMeter output CSV files

So, I'm working as a performance engineer, and I run a lot of tests in JMeter. Of course most of those tests I run in non-GUI mode. To get results out of non-GUI mode there are two basic ways:
  1. Configure Aggregate report to save results to a file. Afterwards load that file to a GUI JMeter to see aggregated results.
  2. Use a special JMeter plugin to save aggregated results to a database.
I already wrote about the second way, so today I'll write about the first one. There are probably better ways to do it, but until recently this was how I processed results:
  1. Run a test, have it save results to a file.
  2. Open that file in Aggregate Report component in GUI JMeter to get aggregated results.
  3. Click "Save Table Data" to get new csv with aggregated results.
  4. Edit that new CSV to get rid of samplers I am not interested in (mostly the ones that I didn't bother to name - e.g. separate URLs that compose a page), and to also get rid of the columns I am not interested in.
  5. Sort the data in the CSV by Sampler - this is because I run many tests, and I need to compare results between runs. For that reason I create a spreadsheet and copy response times and throughput data to that spreadsheet, adding more and more columns for the table with rows labeled as samplers. Whatever, works for me.
  6. Copy results from csv to a big spreadsheet and graph the results.
At some point I used macros and regexps in Notepad++ to do stage 4. Then my laptop died and I lost the macro, and I couldn't be bothered to write it again, even though it was a big help. Still, even with the macro there were a lot of manual steps just to get to meaningful results.

But hey, guess what, I've been learning stuff recently - in particular Data Science and programming in R. So I used the little I know and created this little script in R to do steps 2-5 above for me.

Now all I have to do is to place JMeter output files in a folder, start R Studio (which is a free tool, and I have it anyway; you can probably do it with pure R, without R Studio even), set the working directory to the folder with the files and run the script. The script goes through all csv files in the folder (or you can set up whatever list of filenames you want in the script), and for each file:
  • Calculates Median, 90% Line and Throughput per minute for each sampler
  • Removes the samplers starting with "/" - i.e. samplers I didn't bother to give proper names, so I am probably not very interested in their individual results.
  • Removes delays (that's the thing with our scripts - we use Debug sampler usually named as "User Delay" to set up a realistic load model).
  • Orders results by sample name.
  • Saves results to a separate file.
One button and voilà - all the processing done. Now I only need to copy data to my big spreadsheet and graph it as I choose.
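For readers who don't use R, the per-file processing described in the list above could look roughly like the Python sketch below. It is not a port of the R script: the function name is made up, and it assumes the JMeter CSV has a header row with `timeStamp`, `elapsed` and `label` columns, that the 90% line can be approximated with a nearest-rank percentile, and that throughput is the sample count divided by the time between the first and last sample of each sampler.

```python
import csv
import io
import math
import statistics
from collections import defaultdict


def summarise(csv_text, delay_label="User Delay"):
    """Return (label, median_ms, p90_ms, throughput_per_min) rows, ordered by label."""
    elapsed = defaultdict(list)
    stamps = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["label"]
        # Skip unnamed samplers (raw URLs) and the artificial user delays.
        if label.startswith("/") or label == delay_label:
            continue
        elapsed[label].append(int(row["elapsed"]))
        stamps[label].append(int(row["timeStamp"]))

    summary = []
    for label in sorted(elapsed):
        values = sorted(elapsed[label])
        # Nearest-rank 90th percentile (an assumption; JMeter may differ slightly).
        p90 = values[max(1, math.ceil(0.9 * len(values))) - 1]
        minutes = (max(stamps[label]) - min(stamps[label])) / 60000
        throughput = len(values) / minutes if minutes > 0 else float("nan")
        summary.append((label, statistics.median(values), p90, throughput))
    return summary
```

To process a real file, pass its contents in, e.g. `summarise(open("results.csv").read())` (the file name here is only an example), and write the returned rows out with `csv.writer`.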

The script is on GitHub - feel free to grab and use.

Sunday, 14 September 2014

Clouds for performance testing

Cloud computing was a buzzword in IT a few years ago, and now it is rapidly becoming an industry standard rather than some new thing. Clouds have matured quite a lot since they went public. Now, I do not know the full history of clouds, but in the last few months I had an opportunity to work with a few of them and to assess them for my very particular purpose: performance testing in the cloud. What you need for performance testing is consistent performance (CPU, memory, disks, network) and, if that's a cloud, also the ability to quickly and easily bring an environment up and down.

These are the clouds I tried out:
  • HP Cloud
  • Rackspace
  • AWS
  • MS Azure
Without going much into details (to avoid breaking any NDAs), let's just say that in each cloud I deployed a multi-tiered web application using either puppet master or (in the case of Azure, where I only really looked at the vanilla Cassandra database, see explanation below) internal cloud tooling. Then I loaded the solution and monitored resource utilisation, response times, throughput, and I noted down any problems that got in the way.

And here’s what I’ve got.

HP Cloud. The worst cloud I’ve seen. The biggest problems I’ve encountered are the following:
  • Unstable VM hosts: twice a VM we used suddenly lost the ability to attach disks, which practically caused the DB server to die, and us to lose an extra day on creating and configuring a new DB server.
  • Unstable network: ping time between VMs inside the cloud would occasionally jump from 1ms to 16-30ms.
  • High steal CPU time - which means that VM would not get the requested CPU time from the host. During testing it got as high as 80% on the load generating nodes, 69% at the database server, 15% on the application servers.
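Steal time is easy to check yourself on a Linux VM. The following Python sketch is just an illustration, not the monitoring used in these tests: it takes two snapshots of the aggregate `cpu` line from `/proc/stat` and returns the share of CPU time reported as stolen between them. The function name is made up; the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented for `/proc/stat`.

```python
def steal_percent(before, after):
    """Percentage of CPU time stolen between two 'cpu ...' lines from /proc/stat."""
    def fields(line):
        parts = line.split()
        # Take user..steal; the guest fields that follow are already
        # included in user/nice, so they are left out of the total.
        return [int(x) for x in parts[1:9]]

    deltas = [b - a for a, b in zip(fields(before), fields(after))]
    total = sum(deltas)
    return 100.0 * deltas[7] / total if total else 0.0


# Made-up snapshots for illustration: 640 of 1000 ticks were stolen.
before = "cpu 100 0 50 800 10 0 0 40 0 0"
after = "cpu 200 0 100 1000 20 0 0 680 0 0"
print(steal_percent(before, after))
```

On a real machine you would read the first line of `/proc/stat`, sleep for a few seconds, read it again and pass both lines in; tools such as top show the same number as "st".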
There were also minor inconveniences such as:
  • It is impossible to resize a live VM: you’ll need to destroy it, and then to recreate, if you need to add RAM or CPU.
  • There is no option to get dedicated resources for a VM.
  • Latest OS versions were not available in the library of images, which means that if you need a new OS version, you’ll have to install it manually, create a customised VM image, and pay separately for each license.
  • Sometimes HP Cloud would have a maintenance that puts VMs offline for several hours.

HP Cloud was the starting point of my cloud investigation, and it was obvious we could not use it for performance testing. So next I moved to Rackspace - another Openstack provider, more mature and powerful than HP Cloud. More expensive, as well. In Rackspace I didn’t have any problems with steal CPU time, nor with resizing VMs on the fly. It was a stable environment that allowed us to do benchmarking and load testing. However, it also had a bunch of problems:
  • Sometimes a newly provisioned VM wouldn’t have any network connectivity except through the Rackspace web console. Far more often a new VM wouldn’t have network connectivity for a limited amount of time (2-5 minutes) after the provisioning, which caused our Puppet scripts to fail and thus caused a lot of trouble in provisioning test environments. Rackspace tech support was aware of the issue, but they weren’t able to fix it in the time I was on the project (if they fixed it later, I wouldn’t know).
  • There were occasional spikes in the ping times up to 32 ms.
  • Hardware in Rackspace wasn’t up to our standards: CPU we got didn’t have a lot of cache, so our application would stress out CPU much more than on the hardware we used “at home”. That practically meant that to get the performance we wanted we’d need at least twice as much hardware, which was quite expensive.

After Rackspace we moved on to AWS (my colleagues did more stuff on AWS than me, thus “we”), and we were amazed at how good it was. In AWS we didn’t have any of the problems we had in Openstack. AWS runs on good hardware (including SSD disks), allows you to pay for dedicated resources (but we didn’t have to do it, because even non-dedicated VMs gave consistent results with zero steal time!), shows consistently small ping times between VMs, and has a quite cool RDS service for running Amazon-managed easy-to-control relational database servers.

Yet, AWS is not cheap. So we thought we'd quickly try MS Azure to see if it can provide comparable results for a lower price. Because I was to compare Azure vs AWS in a few specific performance-related areas (mostly I was interested to see CPU steal times, disk and network performance), I ran a few scalability tests for the Cassandra database. Cassandra is a NoSQL database that is quite easy to install and start using. What was cool for my purposes, it has a built-in performance measuring tool named cassandra-stress. It's a test that is fast to set up and extremely easy to run, and also Puppet just wouldn't work with Azure, so instead of the multi-tiered web application I went with the Cassandra scalability test.

MS Azure wasn’t actually that bad, but it is nowhere near AWS as an environment for running high loads:
  • The biggest problem seemed to be network latency. Where AWS was doing perfectly fine, Azure had about 40% failures on timeouts on high loads. Ping times between nodes during tests were as high as 74 ms at times (compared to 0.3 ms in AWS under similar load). From time to time my SSH connection to this or that VM would break for no apparent reason.
  • Concurrently provisioning VMs from the same image is tricky: part of the resources is actually locked during VM creation, and no other thread can use it. That caused a few "The operation cannot be performed at this time because a conflicting operation is underway. Please retry later." errors when I was creating my environment.
  • Unlike AWS, Azure doesn’t allow you to use SSD, which means a lower disk IO performance. Also in Azure there are limitations on the number of IOPS you can have per storage account (though to be fair there is no practical limitation on how many storage accounts you can have in your environment). Even using RAID-0 of 8 disks didn’t allow me to reach the performance we easily had in AWS without a RAID.
  • For some reason (I am not entirely sure it was MS Azure fault) CPU usage was very uneven between the Cassandra nodes, even though the load on each node was pretty much the same.
  • I was not able to use Puppet because the special Puppet module for MS was out of sync with the Azure API.
This being said, Azure is somewhere near Rackspace (if not better) in terms of performance, and is quite easy to use. For a non-technical person who wants a VM in the cloud for personal use I’d recommend Azure.

For running performance testing in the cloud, AWS is so far the best I've seen. I also went through a few of the Amazon courses, and it looks to me like the best way to utilise AWS powers is to write an application that would use AWS services (such as queues and messages) for communicating between nodes.

As a summary: from my experience I would recommend staying away from the HP Cloud, using MS Azure for simple tasks, and using AWS for complicated time-critical tasks. And if you are a fan of Openstack - the Internet says Rackspace is considered to be the most mature of the Openstack providers and to run the best hardware.

Sunday, 20 July 2014

Introvertic ramble on the trap of openspaces and office spaces in general


Hi, my name is Viktoriia, and I’m an introvert.

The weekend before last I spent two awesome days socializing with some of the best testers in New Zealand. After that I spent another three days trying to recover from all the joy. I was exhausted emotionally and physically, and had to spend a full Sunday being sick and miserable because that’s how my body reacts to over-socialization - it goes into hibernation. Humans are not built to spend time in hibernation. That got me thinking…

Every day working in the office I get a bit more socialization than I would voluntarily choose to. And then when I get one little spike (like a testing conference), it becomes a butterfly that broke the camel’s back.

Don’t get me wrong, co-location is awesome and critical for agile teams and all that. But there are also problems that come from the way we implement it (by placing everyone into these huge openspaces), and not only problems relevant to introverts exclusively:
  • The constant humming noise. Even if we forget about people who talk loud because that’s how they talk - the typing, and moving, and clicking, and talking, and whatever else is always there. Noise is stress. We even had it as a topic in school and university in Russia: even though the human brain is pretty good at filtering out non-changing signals, human-produced complicated noise still makes it do a lot of work to maintain those filters. The nervous system is always working extra hard just to preserve your ability to concentrate.
  • The cold going round. When someone is sick, everyone is sick. Someone is always sick. Sneezing and coughing never really stops. It’s like a kindergarten for IT - if you don’t have an iron-made immune system, you are bound to go in and out of colds non-stop. Nothing serious, but pretty annoying.
  • The temperature. Since we are all sharing the same space, we cannot possibly set temperature so that it’s good for everyone. For me it’s always freezing in the office. Judging from the number of people in jackets around, I guess I’m not the only one. 
  • The socialization itself. For introverts like me it’s additional stress just to be around this many people all the time. It makes it harder to concentrate, and it means that I’m always under just a little extra bit of stress. Immune system works badly when you are under stress, so that feeds into constantly being in and out of sickbay, which feeds into concentration problems again.
  • Commuting. This one applies to working from office in general, not just to openspaces. Every day so much time is being lost on getting from home to office and back. This makes roads overloaded, makes air worse, makes us all spend our precious time doing what really isn't necessary. Would be cool to free up roads for people who actually do have a good reason for being there. In IT in many cases it can be avoided - we have enough collaboration tools to go from 5 days a week working side by side to 1 day when everyone's physically in the office to align their actions and adjust plans as necessary and 4 days when everyone is where they choose to be, being online and connected via internet.
  • Multitasking. There have actually been research done* about the efficiency of office workers in different settings. It was shown that even extraverts work more efficiently and more creatively when they have a little bit of privacy (even if that’s a cubicle or a smaller room with just your team - but not the openspace). We also all know that exploratory testing recommends uninterrupted test sessions. The thing is, humans suck in multitasking. We can only really do one thing at a time. We can switch between tasks fast, that’s true, but imagine the overhead! When part of your resources is spent on ignoring the noise (I guess, headphones somewhat help, but in my experience you just get touched a lot when people want to talk to you), part on fighting the cold and part on switching between different tasks (passersby wanting to chat, for example) - you cannot possibly work at your fullest.
While having separate rooms for each team instead of openspace would make things much better, I personally would still prefer to have a choice to work from home. I found that few days when I was sick and worked from home turned out to be no less productive than an average day in the office, and most times even more productive. Always more comfortable.

It would be awesome to have an opportunity to work from home and be judged by results, not by hours in the chair. Especially since many IT companies seem to be already evaluating performance by results. The company I currently work for has a thorough system of logging and evaluating successes and results, and no one is really a stickler for hours as far as I know. Yet it is not a common practice to allow employees to work from home, aside from emergencies and special cases. I wish it was. One of the reasons I want to go to contracting in a few years is to have an opportunity to live out of Auckland in a nice house with good internet and do all the work from there. In my book it beats both living in the center of Auckland to be near the office and living outside of Auckland and spending a few hours every work day on commuting. I'd rather work 9 hours from home than work 8 hours in the office and spend another hour on getting there and back.

*about research and more, there is an awesome book “Quiet: The Power of Introverts” by Susan Cain. It quotes and references quite a lot of scientific research in the area. I highly recommend it to anyone who’s interested in how people work.

Monday, 14 July 2014

#KWST4, day two

It just so happened that I was presenting the first ER of the day. Now let me explain: when I'm on any kind of stage, I go autopilot. Very realistic and well-behaved autopilot, but autopilot nonetheless. So I kinda panicked through the presentation and the discussion, and because I wasn't able to take notes, I don't remember much. On the other hand, it was my ER, so I'm gonna write about it in detail.

I talked about my experience of dealing with emergencies on my last job in Russia, where I was testing mobile applications for Android - think "Google Maps" but better. I won't mention the name of the company not to attract searching bots and unnecessary attention to this blog post, but if you're interested, it's on my LinkedIn account. Now, this company is amazing in many ways, and it's got a huge user audience and a reputation to maintain. From time to time due to different circumstances I would be in a situation where I got a new build, it was going into production in a few hours, and a full proper retest took about a week. Bear in mind, I still had enough time to do the testing between the craziness, but the craziness still happened from time to time.

The way I dealt with it was using these three tools:

  • Cluster (environments, contexts, test cases).
  • Prioritize.
  • Parallelize.
The secret is that you cannot possibly use these tools if you are not ready. So the solution really is "Be prepared". This is what you need to be able to use those tools:
  • Know your environments (what are they, how are they different, which differences matter and why, are there any special reasons to be doing testing on a particular environment).
  • Know how your application is being used and how it is likely to be used after the release (popular workflows, past statistics, core users, geek users, marketing effort).
  • Know high risk areas (bug-rich areas, understanding of the codebase, functions that make application useless or annoying to use when broken, what changed since last release and since last build).
  • Easy to read documented coverage blocks - you should be able to give a coworker who doesn't know your application well a piece of paper and ask to test areas mentioned there. I like to use coverage matrix in a spreadsheet where each row is a short test idea e.g. "make sure the field doesn't let through special characters" or "try two users doing this concurrently".
  • Know on a deep level how your application works (architecture in diagrams and mindmaps, dependencies between different parts, server-client protocols).
A few tips on how to get there:
  • Ask questions early - make sure you understand whats, hows and whys about every feature. Or at least whats and whys.
  • Document knowledge in diagrams and bullet-point lists: structure!
  • Do risk analysis with the team, and use results both in development and in testing.
  • Keep up to date with sales/marketing. If they spent a month advertising some shiny new feature you wanna make sure that feature is spotless.
  • Gather and analyze usage statistics - there are plenty of libraries nowadays (Google Analytics and the likes) that allow you to do it. You wanna know how users are actually using your application, not just how you think they are using it.
  • Work closely with development - they are an irreplaceable source of information. They can also back you up when you are explaining to higher powers why you absolutely need to test this feature, but can release without testing that one.
  • Learn differences between environments.
I talked more, of course, but hopefully these lists make sense on their own. To give you an example of how this worked in real life, this is how it usually went. Instead of testing the full scope of the functionality on all representative devices in different contexts and conditions (on the move, in building, in different map setups, with network fluctuations, etc.) I would choose 3-5 devices that are most suitable for the final round (hardware differences, OS versions, popularity, known problems), I would choose a subset of tests, I would cluster my tests by the context. I would e.g. go and take one ride on a bus to the subway and back, carrying three devices with me and switching between them to do all the testing that needs to be done on the move. I would ask a coworker to do the easier part of the testing to increase the scope. I would only test functionality that had a chance to be affected by the new build (because I tested everything else in the previous build), and also core functionality. I would do part of the testing with stubs - e.g. on a test environment or using server simulators like Fiddler - because I understood the impact of that and how to emulate the server properly. I would make sure severe bugs that were fixed and tested a few builds back did not come back (happens sometimes when the code is merged from one branch to another in Git).
And after I was done, I would write a letter to a PM (CCed to certain team members) where I would specify which areas were left untested, what risks were involved, what my recommendation was (safe to release/need more time to test/need to fix known bugs) and why.
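
To make the "cluster, prioritize" part a bit more concrete, here is a minimal Python sketch of how such a final-round plan could be pulled out of a coverage matrix. Everything in it is an assumption made up for illustration: the test ideas, the area names, the 1-5 risk scale and the `plan_final_round` helper are hypothetical and not taken from the real project.

```python
from collections import defaultdict

# Hypothetical coverage-matrix rows: (test idea, context, area, risk from 1 to 5)
TESTS = [
    ("route recalculation while moving", "on the move", "routing", 5),
    ("GPS loss in a tunnel", "on the move", "positioning", 4),
    ("search by address", "at desk", "search", 3),
    ("map rendering after rotation", "at desk", "rendering", 2),
    ("offline map download", "at desk", "downloads", 4),
]


def plan_final_round(tests, changed_areas, core_areas, min_risk=3):
    """Keep tests for changed or core areas, then cluster them by context."""
    clusters = defaultdict(list)
    for idea, context, area, risk in tests:
        affected = area in changed_areas or area in core_areas
        if affected and risk >= min_risk:
            clusters[context].append((risk, idea))
    # Within each context the highest-risk ideas come first.
    return {
        context: [idea for _, idea in sorted(items, key=lambda item: -item[0])]
        for context, items in clusters.items()
    }


plan = plan_final_round(
    TESTS,
    changed_areas={"positioning", "downloads"},
    core_areas={"routing"},
)
for context, ideas in plan.items():
    print(context, "->", ideas)
```

Each resulting cluster is something you can do in one go (one bus ride, one desk session) or hand to a coworker, which is where the parallelizing comes from.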

There was certainly an interesting discussion after I was done with my ER, but as I said, I don't remember it very well. This is what I do remember.
  • A few people were worried that after seeing a project successfully going into production without going through full testing, PM would decide that full testing isn't actually necessary. We did not have this problem in my company, but I think it's a valid point. I can only suggest educating your managers each release on what the risks are, and why they matter (e.g. what would be the consequences of the failure).
  • I think it was Adam Howard who asked whether I used any of the crisis techniques in normally paced testing. I totally did, especially the clustering: I mean, once you figure out a way to do something efficiently, you can't really go back. Well, I am personally too lazy to go back from being efficient. :-) And also, once you went through the emergency, you learn to be ready for the next one, so it's a self-feeding circle, really.
  • We talked a bit about how important co-location is, in particular for developers, testers and product owners. I also mentioned, and I think a few people agreed, that it's also important to have the right to not be disturbed when you choose to. Sean also mentioned that in their company they dealt with teams being geographically distributed by setting up a constant video thread between locations through huge TVs on the wall.
Next ER was presented by Rachel Carson. She talked about speeding up testing (and the whole project) by switching from Waterfall to "Jet Waterfall" - kind of a Waterfall-Agile hybrid if I understood that correctly: cross-functional teams, co-location, more communication within the team. According to Rachel, testing activities didn't change much, they were pretty good to begin with. But the mindset of testers shifted to the more realistic side: now instead of reporting all the bugs, they started to think whether it's useful to report the bug. Sometimes it was better to talk to development and either get it fixed right away or learn that it will never be fixed with a reasonable explanation. They also started to use the Definition of Done rather than pure guts to drive the testing.

These are some of my notes from that ER and discussion afterwards:
  • Important developer's skill - to be able to share information with non-technical people (Rachel).
  • When you are forced into defending design decisions, be sure to stay critical about them (Rachel).
  • When there are many existing bugs on a project, you can use them as a set of data rather than one-by-one, to extract useful info. E.g. areas where bugs are clustering might need more attention and/or redesign. (Adam and Chris).
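
The "bugs as a set of data" idea is easy to try out. Below is a minimal Python sketch, assuming you can export bug records with some kind of area or component field; the records and the `bug_hotspots` helper are hypothetical and only there to show the shape of the analysis.

```python
from collections import Counter

# Hypothetical export from a bug tracker: (bug id, area)
BUGS = [
    (101, "checkout"),
    (102, "search"),
    (103, "checkout"),
    (104, "checkout"),
    (105, "profile"),
    (106, "search"),
]


def bug_hotspots(bugs, top=2):
    """Count bugs per area and return the most bug-rich areas first."""
    counts = Counter(area for _, area in bugs)
    return counts.most_common(top)


hotspots = bug_hotspots(BUGS)
print(hotspots)
```

Areas at the top of that list are the candidates for extra attention or redesign; a real analysis would probably also want to weigh severity, not just count.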
The last ER at KWST was done by Adam Howard. He talked about speeding up the whole process of development on a very challenging bound-to-fail project. Which thanks to Adam and his team did not fail in the end. :-)
This is what I got from it:
  • They used visual models (mostly mindmaps) to see/show the big picture - that allowed the team to make a decision to rewrite some functionality rather than to try and fix numerous existing bugs there.
  • Fixing all the bugs didn't work because the results of those fixes didn't play well together - another reason to use visual models.
  • Adam tried to switch the testing team to the ways of exploratory testing. To make it easier on them, Adam didn't require them to switch - he just showed the way. Some teams accepted the new way, some didn't.
  • It's very challenging to change when you have to do it fast. As a result, after the project was out of the door, and Adam and his team went back to Assurity (it was a contract project), remaining testers went back to the old ways. It's not easy to make even good practices stick if they are new.
  • Adam and his team partially took BA role on themselves in order to create the big picture view of the project. Some (e.g. Oliver and Katrina) argued that instead of doing it we should rather push BAs to do their job better.
And I believe that's about it. We discussed the exercises from the day before, played some games, said goodbyes and went our ways. It was a lot of fun, and I've learned from every single person who was there. Not everyone is quoted in my two posts (I think I might have missed Ben here, but mentioned him on Twitter. Or vice versa), but everyone was brilliant.

I'd like to finish this by mentioning the amazing people who made it happen. Thanks to Oliver, Rich, Janice, Nikki (I hope I spelled their names right) and to our sponsors!

Saturday, 12 July 2014

#KWST4 day one

I just got back from the awesome two-day-long testing workshop KWST4. It was a bit overwhelming for me, with the amount of intense thinking activity and socialization, and I'm still a bit out of it, but nonetheless I'll try to write down all the highlights while I still remember them.

There were at most 17 people in the room, KWST is a pretty small workshop, but that's likely what made it so good. Everyone was participating in discussions, and everyone got to share their experiences and pains. :-) We had four ERs (ER = experience report) on the first day, and three more ERs on the second day. There were also exercises, testing games and a lot of socialization during coffee/lunch breaks. The way ERs worked was someone presented their experience, and then we all had a facilitated discussion around that experience. The main topic of the workshop was "How to speed up testing, and why we shouldn't".

First ER was presented by Sean Cresswell, a test manager from Trademe. Sean shared his experience on how changing the way risk-based testing was perceived in the team allowed them to speed up testing. The business unit was already doing what they called risk analysis, but somehow in the end tests were still prioritized with specification coming first. That led to the most important bugs being found closer to the end. Changing the way the BU did risk analysis and reordering tests changed the perception of the testing. It might have taken about the same time as before, but now the most critical bugs were found in the beginning of the testing, which gave the impression the testing itself sped up. And come on, we all know that finding serious bugs early is good for so many reasons besides the perception of testing. :-)
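
Here is a tiny Python sketch of why reordering changes the picture even when the total effort stays the same. The tests, the risk scores and which of them hit a critical bug are all made-up assumptions for illustration, not anything from Sean's ER.

```python
# Hypothetical tests in specification order:
# (name, risk score from 1 to 5, whether the test exposes a critical bug)
SPEC_ORDER = [
    ("login form layout", 1, False),
    ("profile editing", 2, False),
    ("item search", 3, False),
    ("payment processing", 5, True),
    ("order cancellation", 4, True),
]


def critical_find_positions(tests):
    """Return the 1-based positions in the run where critical bugs surface."""
    return [
        position
        for position, (_, _, critical) in enumerate(tests, start=1)
        if critical
    ]


# Same tests, same total effort, but highest risk first.
risk_order = sorted(SPEC_ORDER, key=lambda test: test[1], reverse=True)

print("spec order:", critical_find_positions(SPEC_ORDER))
print("risk order:", critical_find_positions(risk_order))
```

With this made-up data the critical bugs move from the last two tests of the run to the first two, while the number of tests executed does not change at all.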

Some ideas that came out of this discussion were:
  • Risk analysis should affect priorities in development as well as in testing (I think that was by Rachel Carson, but not sure).
  • Plan the work so that everyone (devs, BAs, testers) is busy all the time - as a planning strategy (Oliver Erlewein).
  • Instead of step by step howto guides it's sometimes useful to write down lists of questions, answers to which will lead a person through the howto (Andrew Robins).

There were also some pretty heated discussions around the definition of "risk" and such. I personally can recommend Rex Black's book on risk-based testing. Funny enough, when I tweeted about the book and jokingly mentioned that it's good despite Rex's evilness (of being involved with ISTQB. ISTQB is evil), I found myself in a Twitter argument around Rex's evilness. Huh. Sometimes life is weird.

Moving on, the second ER was from Chris Rolls, and he talked about examples of successful and unsuccessful usage of test automation. One of the examples I found interesting was using automation to quickly do regression testing of a web app after a security patch was applied.
Chris also formulated a pretty cool (if you ask me) approach to test automation: when automating existing tests, first transfer the test objective. He talked about the common perception of automated tests as step by step repetition of existing manual tests. It's easy to lose the "why" when thinking like this: why are we automating this test, and why do we test it this way. It might very well be that after transferring the test objective first, you'll find you can reach that test objective in a different, more efficient way in automation.

Discussion went mostly about automation in general, and testing roles. I liked this idea from Oliver: he said he uses roles to protect people from being pulled away from their main tasks. The example he gave was that one of his testers was responsible for developing a testing framework, and Oliver had to protect him from being used as a manual tester when there was a lack of those. Apparently, naming the guy "test automation architect" or something like that makes a huge difference to the management. :-)

After the lunch break Thomas Recker presented his ER on coming to do the test automation late in the project. He talked about his experience of being called to "come and automate" when it was too late to influence test design or decisions around testability of the application. He was also asked both to do the testing and to fit some existing list of functionality coverage which didn't play well with how the tests worked. In the end he had to balance actual testing with the bureaucratic part and with the "make it stick together" parts of the job. I think we could all agree that coming into a project early and having the opportunity to do test design and such with automation in mind helps with implementing test automation.

There was an interesting discussion around test automation strategies: when does it make sense to automate, and how do you choose what to automate. A few things I found interesting came out of that:
  • Andrew suggested trying to see the existing test as a measurement: if it can be seen as such, it's a good candidate to automate.
  • Till Neunast mentioned it makes sense to automate testing around business logic when it's implemented in files/modules that change often. No sense in having an automated test that always succeeds because the part of the application it checks never changes.
  • Aaron Hodder insists that you cannot automate testing, because "once it's automated, it's a different 'thing'".
  • Oliver described the system they have in his workplace: an internal Twitter-like tool that is connected with the test automation framework. It works both ways: you can tweet to the system to make it start some tests, and the system constantly tweets about what it's doing, and what the results of the testing are. Pretty cool!
  • Joshua Raine raised a question about differentiating between test automation that helps you here and now and test automation that requires investment now with a potential pay off in future, and about maybe finding some other types of test automation in the same classification.
At the end of the day Andrew Robins presented the ER number four. His topic was "Enabling the team as a technique to speed up testing". Andrew worked in a challenging environment where the company produced not just software, but hardware as well. That meant that the test environment would consist of specially made prototypes, and would be extremely expensive. Previously (before Andrew did his magic) that led to a huge bottleneck where testers only had one environment, and it had to be reconfigured for different tests. Reconfiguration took three days. You can see how it's not good to make 40 people wait while the only existing test environment is unavailable.
The way Andrew solved that and sped up testing significantly was to plan for testing in advance. Months before the testing was to begin, he started on the task to get more environments. Because there was enough time, the company was able to budget and plan for two test environments, which meant far fewer reconfiguration bottlenecks as well as the opportunity to run different tests in parallel.

Important lesson here for me was to plan for testing as well as plan testing. To be fair we are doing it in OrionHealth, but to have it as a consciously formulated strategy is so much better than just doing it intuitively!

One interesting part of the discussion that I managed to note down was about other ways of handling a limited environment when there is no way to get a new one. The consensus was that one great technique is to stub/emulate parts of the environment you need. I might also add that it might be a good idea to challenge the "we cannot get another environment" notion and go into the cloud. Clouds allow you to spin up additional environments pretty fast, securely and really cheaply, so unless some specialized hardware is needed, there's no reason not to use them.

There was also plenty of stuff tracked on Twitter, I do recommend going through the #KWST4 hashtag there. And of course discussions were pretty intense, so I didn't have a chance to note down/tweet most of it, and probably left out a lot. It would be interesting to compare with other blog posts on the event. :-)

At the end of day one we also did a group exercise: we had 10-15 minutes to come up with a common answer for each of the given 4 questions. I teamed up with Adam Howard and Nigel Charman. I like the answers our team gave, and I would totally hire anyone who demonstrates the kind of thinking we demonstrated. But funny enough it turned out that some of the anonymous test managers who assessed our answers called asking reasonable questions "being evasive" and were more concerned about the formatting and punctuation than about the actual content and what skills it showed. Well, what can I say... I wouldn't hire those test managers and wouldn't go work for them either. It also interestingly shows that the hiring process in the testing industry is not at the top of its game in many companies.

Unrelated to the exercise, sadly enough some are still looking for drones instead of testers and wouldn't even consider a really good tester if he doesn't have an ISTQB certification, or N years of experience with some tool. Luckily for us, there are also plenty of smart IT companies on the market who hire people to do the job, not to fit in a cell. :-)

Day 2 to follow.