Just a tester

Monday, 15 December 2014

We suck in security

And by “we” I mean humanity, at least the part of humanity that uses computers to create, store and share information. This is my main take from this year’s kiwicon 8. And this is the story of how I got there.

Disclaimer: I’m just gonna assume that presenters knew what they were talking about and use what they said shamelessly, and then give details in the further blog posts (or for now you can probably look at the posts of other attendees).

According to Rich Smith, Computer Security appears where Technology intersects with People. Makes sense. So lets look at the technology and at people separately.

1. Technology.

Just at this two-days conference vulnerabilities have been demonstrated in the very software that we rely on in keeping us safe: turns out well established Cisco firewall and most of the anti viruses are pretty easy to hack (for people who do that professionally). That’s like a wall with many holes. You get relaxed because wow wall, and next thing you know is all your sheep are stolen.

Firewalls and anti viruses are software, but the problem runs deeper. Protocols and languages! Internet has not been designed for security, and looks like all the crutches we built for it since don’t help as much as you would hope. JavaScript with its modern capabilities is always a ticking bomb in your living room, and now there is also WebRTC that was designed for peer-to-peer browser communications, and that helps tools like BeEF hide themselves. BeEF is stealth as it is, so maybe it doesn’t make an awful lot of difference, but you can see the potential: where earlier BeEF server would control a bunch of browsers directly, now it can also make browsers control other browsers. Thank you, WebRTC. You would think that technologies arising these days would try to be secure by design, but oh well…

Wanna go even deeper? Ian “MCP” Latter presented proof of concept for new protocol that allows to transfer information through screen and through programmable keyboard. He also demonstrated how exploitation framework built on top of these protocols allows perpetrator to steal information bypassing all the secure infrastructure around the target. The idea is you are not passing files, you are showing temporary pictures on the monitor screen, you capture this stream of pictures, and you decipher information from those pictures. Sounds like something out of “Chuck”, yet this is a very real technology. As its creator said, “By the nature of the component protocols, TCXf remains undetected and unmitigated by existing enterprise security architectures.”

Then there are also internet-spread vulnerabilities that got known in the last year, that for me as a bystander sound mostly like: a lot of people build their products on top of some third-party libraries. When those libraries get compromised, half of the internet is compromised. And they do get compromised.

Then there is also the encryption problem, where random numbers aren’t really as random as people using them think. But compared to all above it sounds like the least of our problems.

Okay, so technology isn’t as secure as we want, what about people?

2. People

People are the weakest link. Forget technology, even if we were to make it perfect, people would still get security compromised. And according to many speakers on the kiwicon, so far security area sucks in dealing with people. There is wide-spread default blame culture: when someone falls a victim to social engineering, they are getting blamed and fired. That is hardly how people learn, but that is exactly how you create atmosphere in which no one would go to security team when in doubt because of the fear of getting fired. Moreover, we don’t test people. We don’t measure their “security”, and we don’t know how to train them so that training would stick - because we don’t know what works, and what doesn’t.

So, we have problems with technology and with people. What else is bad?

There are plenty of potential attackers out there. Governments, enforcement agencies, corporations, individuals with various goals from getting money to getting information to personal revenge… they have motivation, they have skills and tools, and it is so much cheaper to attack than it is to defend (so called “Asymmetric defence”).

To make it even easier for attackers, targets don’t talk to each other. They don’t share information when they were attacked, to report such a thing is seen as to compromise yourself. And even when information is willingly shared, we don’t have good mechanisms to do that, so we do it the slowest way: manually. It might be easy enough in simple cases, but as @hypatia and @hashoctothorpe said, complex systems often mean complex problems, which would make them hard to describe and to share.

So, we suck in security. This is quite depressing. To make it a bit less depressing, lets talk about solutions that were also presented on the kiwicon (in some cases).

Most solutions were for the “People” part of the problem. Not one but three speakers talked about that.

The short answer is (in words of Etsy’s Rich Smith): ComSec should be Enabling, Transparent and Blameless.

The slightly longer answer is:

Build culture that encourages people to seek assistance from the Security specialists and to report breaches (don’t blame people, don’t try to fix people - fix the system).
Share information between departments and between organizations.
Proactive reach: for security team to reach to development and help them develop secure products.
Build trust.
Recognise that complex system will have complex problems.
Do realistic drills and training, measure the impact of training and adjust it.

People are reward driven and trustful by default. What makes it a problem is that people are thus highly susceptible to social engineering methods which are many. This can’t be fixed (do we even want it fixed?), but at least we can make it super easy to ask professionals for help without feeling threatened.

Okay, so situation in the People area can be improved (significantly if everyone were to follow Etsy’s culture guidelines) - at least for some organizations. What about the Technology area? Well… this is what I found in the presentations:

Use good random numbers.
Compartmentalize (don’t keep all eggs in one basket, don’t use flat networks, don’t give one user permissions to all servers, etc.).
Make it as expensive as possible for attackers to hack you: anti-kaizen for attackers, put bumps and huge rolling stones in their way, make it not worth the effort.
Know what you are doing (e.g. don’t just use third-party libraries for your product without verifying how secure they are).
…?

This is depressing, okay. In fact, I’m gonna stop here and let you feel how depressing it is. And then in the next posts I’ll write about more cheerful things. Kiwicon was really a lot of fun and epicness (I was in a room full of my childhood heroes, yeeey!). And there was a DeLorean. Doesn’t get much more fun than that. :-D

Wednesday, 19 November 2014

R for processing JMeter output CSV files

So, I'm working as a performance engineer, and I run a lot of tests in JMeter. Of course most of those tests I run in non-GUI mode. To get results out of non-GUI mode there are two basic ways:

Configure Aggregate report to save results to a file. Afterwards load that file to a GUI JMeter to see aggregated results.
Use a special JMeter plugin to save aggregated results to a database.

I already wrote about the second way, so today I'll write about the first one. There are probably better ways to do it, but until recently this was how I processed results:

Run a test, have it save results to a file.
Open that file in Agregate Report component in GUI JMeter to get aggregated results.
Click "Save Table Data" to get new csv with aggregated results.
Edit that new CSV to get rid of samplers I am not interested in (mostly the ones that I didn't bother to name - e.g. separate URLs that compose a page), and to also get rid of the columns I am not interested in.
Sort the data in the CSV by Sampler - this is because I run many tests, and I need to compare results between runs. For that reason I create a spreadsheet and copy response times and throughput data to that spreadsheet, adding more and more columns for the table with rows labeled as samplers. Whatever, works for me.
Copy results from csv to a big spreadsheet and graph the results.

At some point I used macros and regexps in Notepad++ to do stage 4. Then my laptop died and I lost it, couldn't be bothered to write it again, even though it was big help. Still, even with the macro there were a lot of manual steps just to get to meaningful results.

But hey, guess what, I've been learning stuff recently - in particular Data Science and programming in R. So I used little I know and created this little script in R to do steps 2-5 above for me.

Now all I have to do is to place JMeter output files in a folder, start R Studio (which is a free tool, and I have it anyway) (you can probably do it with pure R, no need in R Studio even), set working directory to the folder with files and run the script. Script goes through all csv files in the folder (or you can setup whatever filenames list you want in the script), and for each file:

Calculates Median, 90% Line and Throughput per minute for each sampler
Removes the samplers starting with "/" - i.e. samplers I didn't bother to give proper names, so I am probably not very interested in their individual results.
Removes delays (that's the thing with our scripts - we use Debug sampler usually named as "User Delay" to set up a realistic load model).
Orders results by sample name.
Saves results to a separate file.

One button and voilà - all the processing done. Now I only need to copy data to my big spreadsheet and graph it as I choose.

Script is in github, feel free to grab and use.

Sunday, 14 September 2014

Clouds for performance testing

Cloud computing has been a buzz word in IT few years ago, and now it is rapidly becoming industry standard rather than some new thing. Clouds have matured quite a lot since they got public. Now, I do not know the full history of clouds, but in the last few months I had an opportunity to work with few of them and to assess them for my very particular purpose: performance testing in the cloud. What you need for performance testing is a consistent performance (CPU, memory, disks, network) and if that's a cloud - also an opportunity to quickly and easily bring environment up and down.

These are the clouds I tried out:

HP Cloud
Rackspace
AWS
MS Azure

Without going much into details (to avoid breaking any NDAs), lets just say, that in each cloud I deployed a multi-tiered web application using either puppet master or (in case of Azure where I only really looked at the vanilla Cassandra database, see explanation below) internal cloud tooling. Then I loaded the solution and monitored resource utilisation, response times, throughput, and I noted down any problems that got in the way.

And here’s what I’ve got.

HP Cloud. The worst cloud I’ve seen. The biggest problems I’ve encountered are the following:

Unstable VM hosts: two times a VM we used suddenly lost the ability to attach disks, which practically caused DB server to die, and us to lose extra day on creating and configuring a new DB server.
Unstable network: ping time between VMs inside the cloud would occasionally jump from 1ms to 16-30ms.
High steal CPU time - which means that VM would not get the requested CPU time from the host. During testing it got as high as 80% on the load generating nodes, 69% at the database server, 15% on the application servers.

There were also minor inconveniences such as:

It is impossible to resize a live VM: you’ll need to destroy it, and then to recreate, if you need to add RAM or CPU.
There is no option to get dedicated resources for a VM.
Latest OS versions were not available in the library of images, which means that if you need a new OS version, you’ll have to install it manually, create a customised VM image, and pay separately for each license.
Sometimes HP Cloud would have a maintenance that puts VMs offline for several hours.

HP Cloud was the starting point of my cloud investigation, and it was obvious we cannot use it for performance testing. So next I moved to Rackspace - another Openstack provider, more mature and powerful than HP Cloud. More expensive, as well. In Rackspace I didn’t have any problems with steal CPU time, nor with resizing VMs on the fly. It was a stable environment allowing to do benchmarking and load testing. However, it also had a bunch of problems:

Sometimes a newly provisioned VM wouldn’t have any network connectivity but through Rackspace web console. Far more often a new VM wouldn’t have network connectivity for a limited amount of time (2-5 minutes) after the provisioning, which caused our Puppet scripts to fail and thus caused a lot of trouble in provisioning test environments. Rackspace tech support has been aware of the issue, but they weren’t able to fix it in the time I was on a project (if they fixed it later, I wouldn’t know).
There were occasional spikes in the ping times up to 32 ms.
Hardware in Rackspace wasn’t up to our standards: CPU we got didn’t have a lot of cache, so our application would stress out CPU much more than on the hardware we used “at home”. That practically meant that to get the performance we wanted we’d need at least twice as much hardware, which was quite expensive.

After Rackspace we moved onto AWS (my colleagues did more stuff on AWS, than me, thus “we”), and we were amazed at how good it was. In AWS we didn’t have any of the problems we had in Openstack. AWS runs on good hardware (including SSD disks), allows to pay for dedicated resources (but we didn’t have to do it, because even non-dedicated VMs gave consistent results with zero steal time!), shows consistent small ping times between VMs, has a quite cool RDS service for running Amazon-managed easy-to-control relational database servers.

Yet, AWS is not cheap. So we thought we'd quickly try MS Azure to see if it can provide comparable results for a lower price. Because I was to compare Azure vs AWS in few specific performance-related areas (mostly I was interested to see CPU steal times, disk and network performance), I ran few scalability tests for the Cassandra database. Cassandra is a noSQL database, that is quite easy to install and start using. What was cool for my purposes, it has a built in performance measuring tool named cassandra-stress. It's a fast to setup and extremely easy to run test, and also Puppet just wouldn't work with Azure, so instead of the multi-tiered web application I went with Cassandra scalability test.

MS Azure wasn’t actually that bad, but it is nowhere near AWS as an environment for running high loads:

The biggest problem seemed to be network latency. Where AWS was doing perfectly fine, Azure had about 40% failures on timeouts on high loads. Ping times between nodes during tests were as high as 74 ms at times (compared to 0.3 ms in AWS under similar load). From time to time my SSH connection to this or that VM would break for no apparent reason.
Concurrently provisioning VMs from the same image is tricky: part of the resources is actually locked during VM creation, and no other thread can use it. That caused few "The operation cannot be performed at this time because a conflicting operation is underway. Please retry later.” errors when I was creating my environment.
Unlike AWS, Azure doesn’t allow you to use SSD, which means a lower disk IO performance. Also in Azure there are limitations on the number of IOPS you can have per storage account (though to be fair there is no practical limitation on how many storage accounts you can have in your environment). Even using RAID-0 of 8 disks didn’t allow me to reach the performance we easily had in AWS without a RAID.
For some reason (I am not entirely sure it was MS Azure fault) CPU usage was very uneven between the Cassandra nodes, even though the load on each node was pretty much the same.
I was not able to use Puppet because the special Puppet module for MS was out of sync with the Azure API.

This being said, Azure is somewhere near Rackspace (if not better) in terms of performance, and is quite easy to use. For a non-technical person who wants a VM in the cloud for personal use I’d recommend Azure.

For running performance testing in the cloud, AWS is so far the best I've seen. I also went through few of Amazon courses, and it looks to me like the best way to utilise AWS powers is to write an application that would use AWS services (such as queues and messages) for communicating between nodes.

As a summary: from my experience I would recommend to stay away from the HP Cloud, to use MS Azure for simple tasks, to use AWS for complicated time-critical tasks. And if you are a fan of Openstack - Internet says Rackspace is considered to be the most mature of the Openstack providers and to run the best hardware.

Sunday, 20 July 2014

Introvertic ramble on the trap of openspaces and office spaces in general

Hi, my name is Viktoriia, and I’m an introvert.

The weekend before last I spent two awesome days socializing with some of the best testers in New Zealand. After that I spent another three days trying to recover from all the joy. I was exhausted emotionally and physically, and had to spend full Sunday being sick and miserable because that’s how my body reacts to over-socialization - it goes to hibernate. Humans are not built to spend time in hibernate. That got me thinking…

Every day working in the office I get a bit more socialization that I would voluntarily choose to. And then when I get one little spike (like a testing conference), it becomes a butterfly that broke the cammel’s back.

Don’t get me wrong, co-location is awesome and critical for agile teams and all that. But there are also problems that come from the way we implement it (by placing everyone into these huge openspaces), and not only problems relevant to introverts exclusively:

The constant humming noise. Even if we forget about people who talk loud because that’s how they talk - the typing, and moving, and clicking, and talking, and whatelse is always there. Noise is stress. We even had it as a topic in school and university in Russia: even though human brain is pretty good with filtering out non-changing signals, human-produced complicated noise still makes it to do a lot of work to maintain those filters. Nervous system is always working extra hard just to save you the ability to concentrate.
The cold going round. When someone is sick, everyone is sick. Someone is always sick. Sneezing and coughing never really stops. It’s like a kindergarden for IT - if you don’t have iron-made immune system, you are bound to go in and out of colds non-stop. Nothing serious, but pretty annoying.
The temperature. Since we are all sharing the same space, we cannot possibly set temperature so that it’s good for everyone. For me it’s always freezing in the office. Judging from the number of people in jackets around, I guess I’m not the only one.
The socialization itself. For introverts like me it’s additional stress just to be around this many people all the time. It makes it harder to concentrate, and it means that I’m always under just a little extra bit of stress. Immune system works badly when you are under stress, so that feeds into constantly being in and out of sickbay, which feeds into concentration problems again.
Commuting. This one applies to working from office in general, not just to openspaces. Every day so much time is being lost on getting from home to office and back. This makes roads overloaded, makes air worse, makes us all spend our precious time doing what really isn't necessary. Would be cool to free up roads for people who actually do have a good reason for being there. In IT in many cases it can be avoided - we have enough collaboration tools to go from 5 days a week working side by side to 1 day when everyone's physically in the office to align their actions and adjust plans as necessary and 4 days when everyone is where they choose to be, being online and connected via internet.
Multitasking. There have actually been research done* about the efficiency of office workers in different settings. It was shown that even extraverts work more efficiently and more creatively when they have a little bit of privacy (even if that’s a cubicle or a smaller room with just your team - but not the openspace). We also all know that exploratory testing recommends uninterrupted test sessions. The thing is, humans suck in multitasking. We can only really do one thing at a time. We can switch between tasks fast, that’s true, but imagine the overhead! When part of your resources is spent on ignoring the noise (I guess, headphones somewhat help, but in my experience you just get touched a lot when people want to talk to you), part on fighting the cold and part on switching between different tasks (passersby wanting to chat, for example) - you cannot possibly work at your fullest.

While having separate rooms for each team instead of openspace would make things much better, I personally would still prefer to have a choice to work from home. I found that few days when I was sick and worked from home turned out to be no less productive than an average day in the office, and most times even more productive. Always more comfortable.

It would be awesome to have an oportunity to work from home and be judged by results, not by hours in the chair. Especially since many IT companies seem to be already evaluating performance by results. Company I currently work for has a thorough system of logging and evaluating successes and results, and no one really sticks for hours as far as I know. Yet it is not a common practice to allow employees to work from home, aside from emergencies and special cases. I wish it was. One of the reasons I want to go to contracting in few years is to have an opportunity to live out of Auckland in a nice house with good internet and do all the work from there. In my book it beats both living in the center of Auckland to be near office and living outside of Auckland and spending few hours every work day on commuting. I'd rather work 9 hours from home than work 8 hours in the office and spend another hour on getting there and back.

*about research and more, there is an awesome book “Quiet: The Power of Introverts” by Susan Cain. It quotes and references quite a lot of scientific research in the area. I highly recommend it to anyone who’s interested in how people work.

Monday, 14 July 2014

#KWST4, day two

It just so happened, that I was presenting the first ER of the day. Now let me explain: when I'm on any kind of stage, I go autopilot. Very realistic and well-behaved autopilot, but autopilot nonetheless. So I kinda panicked through the presentation and the discussion, and because I wasn't able to take notes, I don't remember much. On the other hand, it was my ER, so I'm gonna write about it in details.

I talked about my experience of dealing with emergencies on my last job in Russia, where I was testing mobile applications for android - think "google maps" but better. I won't mention the name of the company not to attract searching bots and unnecessary attention to this blog post, but if you're interested, it's on my LinkedIn account. Now, this company is amazing in many ways, and it's got huge user auditory and a reputation to maintain. From time to time due to different circumstances I was in a situation when I get a new build, it goes into production in few hours, and full proper retest takes about a week. Bear in mind, I still had enough time to do the testing between the craziness, but the craziness still happend from time to time.

The way I dealt with it was using these three tools:

Cluster (environments, contexts, test cases).
Prioritize.
Parallelize.

The secret is that you cannot possible use these tools if you are not ready. So the solution really is "Be prepared". This is what you need to be able to use those tools:

Know your environments (what are they, how are they different, which differences matter and why, are there any special reasons to be doing testing on a particular environment).
Know how your application is being used and how is it likely to be used after the release (popular workflows, past statistics, core users, geek users, marketing effort).
Know high risk areas (bug-rich areas, understanding of the codebase, functions that make application useless or annoying to use when broken, what changed since last release and since last build)
Easy to read documented coverage blocks - you should be able to give a coworker who doesn't know your application well a peace of paper and ask to test areas mentioned there. I like to use coverage matrix in a spreadsheet where each row is a short test idea e.g. "make sure the field doesn't let through special characters" or "try two users doing this concurrently".
Know on a deep level how your application works (architecture in diagrams and mindmaps, dependencies between different parts, server-client protocols).

Few tips on how do you get there:

Ask questions early - make sure you understand whats, hows and whys about every feature. Or at least whats and whys.
Document knowledge in diagrams and bullet-point lists: structure!
Do risks analysis with the team, and use results both in development and in testing.
Keep up to date with sales/marketing. If they spent a month advertising some shiny new feature you wanna make sure that feature is spotless.
Gather and analyze usage statistics - there are plenty of the libraries nowadays (google analytics and the likes) that allow you to do it. You wanna know how users are actually using your application, not just how you think they are using it.
Work close with development - they are irreplaceable source of information. They can also back you up when you are explaining to higher powers why you absolutely need to test this feature, but can release without testing that one.
Learn differences between environments.

I talked more, of course, but hopefully these lists make sense on their own. To give you an example of how this worked in real life, this is how it usually went. Instead of testing the full scope of the functionality on all representative devices in different contexts and conditions (on the move, in building, in different map setups, with network fluctuations, etc.) I would choose 3-5 devices that are most suitable for the final round (hardware differences, OS versions, popularity, known problems), I would choose a subset of tests, I would cluster my tests by the context. I would e.g. go and take one ride on a bus to the subway and back, carrying three devices with me and switching between them to do all the testing that needs to be done on the move. I would ask a coworker to do the easier part of the testing to increase the scope. I would only test functionality that had a chance to be affected by the new build (because I tested everything else in the previous build), and also core functionality. I would do part of the testing with stubs - e.g. on a test environment or using server simulators like Fiddler - because I understood the impact of that and how to emulate the server properly. I would make sure severe bugs that were fixed and tested few builds back did not come back (happens sometimes when the code is merged from one branch to another in Git).

And after I am done, I would write a letter to a PM (CCed to certain team members) where I would specify which areas are left untested, what are the risks involved, what is my recommendation (safe to release/need more time to test/need to fix known bugs) and why.

There was certainly an interesting discussion after I was done with my ER, but as I said, I don't remember it very well. This is what I do remember.

Few people were worried that after seeing a project successfully going into production without going through full testing, PM would decide that full testing isn't actually necessary. We did not have this problem in my company, but I think it's a valid point. I can only suggest to educate your managers each release on what are the risks, and why they matter (e.g. what would be the consequences of the failure).
I think it was Adam Howard who asked whether I used any of the crisis techniques in normally paced testing. I totally did, especially the clustering: I mean, once you figure out a way to do something efficiently, you can't really go back. Well, I am personally too lazy to go back from being efficient. :-) And also, once you went through the emergency, you learn to be ready to the next one, so it's a self-feeding circle, really.
We talked a bit about how important is co-location, in particular for developers, testers and product owners. I also mentioned, and I think few people agreed, that it's also important to have the right to not be disturbed when you choose to. Sean also mentioned that in their company they dealt with teams being geographically distributed by setting up a constant video thread between locations through huge TVs on the wall.

Next ER was presented by Rachel Carson. She talked about speeding up testing (and the whole project) by switching from Waterfall to "Jet Waterfall" - kind of a Waterfall-Agile hybrid if I understood that correctly: cross-functional teams, co-location, more communication within the team. According to Rachel, testing activities didn't change much, they were pretty good to begin with. But the mindset of testers shifted to the more realistic side: now instead of reporting all the bugs, they started to think whether it's useful to report the bug. Sometimes it was better to talk to development and either get it fixed right away or learn that it will never be fixed with a reasonable explanation. They also started to use the Definition of Done rather than pure guts to drive the testing.

These are some of my notes from that ER and discussion afterwards:

Important developer's skill - to be able to share information with non-technical people (Rachel).
When you are forced into defending design decisions, be sure to stay critical about them (Rachel).
When there are many existing bugs on a project, you can use them as a set of data rather than one-by-one, to extract useful info. E.g. areas where bugs are clustering might need more attention and/or redesign. (Adam and Chris).

The last ER on the KWST was done by Adam Howard. He talked about speeding up the whole process of development on a very challenging bound-to-fail project. Which thanks to Adam and his team did not fail in the end. :-)

This is what I got from it:

They used visual models (mostly mindmaps) to see/show the big picture - that allowed the team to make a decision to rewrite some functionality rather than to try and fix numerous existing bugs there.
Fixing all the bugs didn't work because the results of those fixes did't play well together - another reason to use visual models.
Adam tried to switch testing team to the ways of exploratory testing. To make it easier on them, Adam didn't require them to switch - he just showed the way. Some teams accepted new way, some didn't.
It's very challenging to change when you have to do it fast. As a result, after the project was out of the door, and Adam and his team went back to Assurity (it was a contract project), remaining testers went back to the old ways. It's not easy to make even good practices stick if they are new.
Adam and his team partially took BA role on themselves in order to create the big picture view of the project. Some (e.g. Oliver and Katrina) argued that instead of doing it we should rather push BAs to do their job better.

And I believe that's about it. We discussed the exercises from the day before, played some games, said goodbyes and went our ways. It was a lot of fun, and I've learned from every single person who was there. Not everyone is quoted in my two posts (I think I might have missed Ben here, but mentioned him on twitter. Or vice versa), but everyone was brilliant.

I'd like to finish this with mentioning mazing people who made it happen. Thanks to Oliver, Rich, Janice, Nikki (I hope I spelled their names right) and to our sponsors: @AST_News @SoftEdMan and @AssurityNZ!

Saturday, 12 July 2014

#KWST4 day one

I just got back from the awesome two-days long testing workshop KWST4. It was a bit overwhelming for me, with the amount of intense thinking activity and socialization, and I'm still a bit out of it, but nonetheless I'll try to write down all the highlights while I still remember them.

There were at most 17 people in the room, KWST is a pretty small workshop, but that's likely what made it so good. Everyone was participating in discussions, and everyone got to share their experiences and pains. :-) We had four ERs (ER = experience report) on the first day, and three more ERs on the second day. There were also exercise, testing games and a lot of socialization on coffee/lunch breaks. The way ERs worked was someone presented their experience, and then we all had a facilitated discussion around that experience. Main topic of the workshop was "How to speed up testing, and why we shouldn't".

First ER was presented by Sean Cresswell, a test manager from Trademe. Sean shared his experience on how changing the way Risks based testing was perceived in the team allowed to speed up testing. Business unit was already doing what they called Risks analysis, but somehow in the end tests were still prioritized with specification coming first. That led to most important bugs being found closer to the end. Changing the way BU did Risks analysis and reordering tests allowed to change the perception of the testing. It might have taken about the same time as before, but now most critical bugs were found in the beginning of the testing, which gave the impression the testing itself sped up. And come on, we all know that finding serious bugs early is good for so many reasons besides the perception of testing. :-)

Some ideas that came out of this discussion, were:

Risks analysis should affect priorities in development as well as in testing (I think that was by Rachel Carson, but not sure).
Plan the work so that everyone (devs, BAs, testers) are busy all the time - as a planning strategy (Oliver Erlewein).
Instead of step by step howto guides it's sometimes useful to write down lists of questions, answers to which will lead a person through the howto (Andrew Robins).

There were also some pretty heated up discussions around definition of "risk" and such. I personally can recommend Rex Black's book on Risks based testing. Funny enough, when I twitted about the book and jokingly mentioned that it's good despite Rex's evilness (of being involved with ISTQB. ISTQB is evil), I found myself in a twitter argument around Rex's evilness. Huh. Sometimes life is weird.

Moving on, second ER was from Chris Rolls, and he talked about examples of successful and unsuccessful usage of test automation. One of the examples I found interesting was using automation to quickly do regression testing of a web app, after security patch was applied.

Chris also formulated a pretty cool (if you ask me) approach to test automation: when automating existing tests, first transfer test objective. He talked about the common perception of automated tests as step by step repetition of existing manual tests. It's easy to lose the "why" when thinking like this: why are we automating this test, and why do we test it this way. It might very well be that after transferring test objective first, you'll find you can reach that test objective in a different, more efficient way in automation.

Discussion went mostly about automation in general, and testing roles. I liked this idea from Oliver: he said he uses roles to protect people from being pulled away from their main tasks. Example he gave was one of his testers was responsible for developing a testing framework, and Oliver had to protect him from being used as a manual tester when there were lack of those. Apparently, naming the guy "test automation architect" or something like that makes a huge difference to the management. :-)

After the lunch break Thomas Recker presented his ER on coming to do the test automation late in the project. He talked about his experience of being called to "come and automate" when it was too late to influence test design or decisions around testability of the application. He was also asked both to do the testing and to fit some existing list of functionality coverage which didn't play well with how the tests worked. In the end he had to balance actual testing with the bureaucratic part and with the "make it stick together" parts of the job. I think we could all agree that coming into project early and having the opportunity to do test design and such with automation in mind helps with implementing test automation.

There were an interesting discussion around test automation strategies: when does it make sense to automate, and how do you choose what to automate. Few interesting for me things came out of that:

Andrew suggested to try and see the existing test as a measurement: if it can be seen as such, it's a good candidate to automate.
Till Neunast mentioned it makes sense to automate testing around business logic when it's implemented in files/modules that change often. No sense in having automated test that always succeeds because part of the application it checks never changes.
Aaron Hodder insists that you cannot automate testing, because "once it's automated, it's a different 'thing'".
Oliver described the system they have in his workplace: an internal twitter-like tool that is connected with the test automation framework. It works both ways: you can twit to the system to make it start some tests, and system constantly twits about what it's doing, and what are the results of the testing. Pretty cool!
Joshua Raine raised a question about differentiating between test automation that helps you here and now and test automation that requires investment now with a potential pay off in future, and about maybe finding some other types of test automation in the same classification.

In the end of the day Andrew Robins presented the ER number four. His topic was "Enabling the team as a technique to speed up testing". Andrew worked in a challenging environment where the company produced not just software, but hardware as well. That meant that test environment would consist of specially made prototypes, and would be extremely expensive. Previously (before Andrew did his magic) that led to a huge bottleneck where testers only had one environment, and it had to be reconfigured for different tests. Reconfiguration took three days. You can see how it's not good to make 40 people wait while the only existing test environment is being unreachable.
The way Andrew solved that and sped up testing significantly was to plan for testing in advance. Months before the testing was to begin, he started on the task to get more environments. Because there was enough time, the company was able to budget and plan for two test environment, which meant far less reconfiguration bottlenecks as well as opportunity to run different tests in parallel.

Important lesson here for me was to plan for testing as well as plan testing. To be fair we are doing it in OrionHealth, but to have it as a consciously formulated strategy is so much better than just do it intuitively!

One interesting part of the discussion that I managed to note down was about other ways of handling limited environment when there is no way to get a new one. The consensus was that one great technique is to stub/emulate parts of the environment you need. I might also add that it might be a good idea to challenge the "we cannot get another environment" notion and go into the cloud. Clouds allow to spin up additional environments pretty fast, secure and really cheap, so unless some specialized hardware is needed, no reason not to use it.

There was also plenty of stuff tracked on twitter, I do recommend to go through #KWST4 hashtag there. And of course discussions were pretty intense, so I didn't have a chance to note down/twit most of it, and probably left out a lot. It would be interesting to compare with other blog posts on the event. :-)

In the end of the day one we also did a group exercise: we had 10-15 minutes to come up with a common answer for each of the given 4 questions. I teamed up with Adam Howard and Nigel Charman. I like the answers our team gave, and I would totally hire anyone who demonstrates kind of thinking we demonstrated. But funny enough it turned out that part of anonymous test managers who assessed our answers called asking reasonable questions "being evasive" and were more concerned about the formatting and punctuation than about the actual content and what skills it showed. Well, what can I say... I wouldn't hire those test managers and wouldn't go work for them either. It also interestingly shows that hiring process in the testing industry is not at its top game in many companies.

Unrelated to the exercise, sadly enough some are still looking for drones instead of testers and wouldn't even consider a really good tester if he doesn't have ISTQB certification, or N years of experience with some tool. Luckily for us, there are also plenty of smart IT companies on the market who hire people to do the job, not to fit in a cell. :-)

Day 2 to follow.

Wednesday, 4 June 2014

A bit about measuring client-side performance for web applications

What the heck is client-side performance for web applications? Well, it's about the time browser on your user's computer takes to download all the elements on a page and render that page. It's also about the speed of JavaScript on the page. It's not, strictly speaking, about the network between a user and a server, but that usually gets measured too.

Why do you care about it? Because no matter how fast are your servers in, well, serving responses to user's requests - if user still ends up with a slow application, (s)he wouldn't care about the reasons and would rather just go to someone else's site. Sometimes milliseconds matter in how much revenue you'll get, or how many clients your site will bring to you.

I've been doing a bit of research on client-side performance of web applications today, and I found quite a lot of interesting stuff that I didn't know about earlier. I thought it can be useful to put it all in one place. :-)

First of all, amongst all the cool tools I found these two free online analyzers:

Both of them provide pretty detailed statistics and analysis of the page performance, and they also provide simple recommendations on how to fix the discovered issues. First tool has a fair amount of settings allowing you to emulate different browsers, different locations, network throttling, browser settings etc. Second tool doesn't have that much settings, but it shows results for both desktop and mobile site versions. Pretty cool, look for yourself!

Awesome as they are, these two tools can only be applied to publicly available pages. Luckily there are also few simple non-intrusive tools that can help developers and testers in measuring client-side performance of their applications in testing environments - i.e. without having to build some libraries into a code base (as you have to do with boomerang, Google.analytics or countless paid frameworks with similar functionality).

Here we go:

Fiddler for any Windows browser (Charles for Mac browsers?) - measures time spent on downloading different resources, can show a timeline for a page. If you select few requests and look at Timeline tab, you'll see which requests are called in parallel, and which are not.
Google Insights browser extensions (for Chrome and Firefox) - show detailed analysis on page performance, advice on actions to improve it.
AJAX View - ancient proxy from Microsoft, gives performance statistics for JavaScript on a page down to the function level. Looks like the project has been abandoned in 2009, but the downloadable package is still there.
APM DynaTrace AJAX edition - paid tool with free trial, that gives page performance analysis down to function level.

As everyone knows, it's much cheaper to prevent than to fix, so here's few articles I bumped into that can help developers in designing a page so as to maximize perceived client-side performance:

Cool thing about both of these lists is that it's all easily applicable on a stage of functional testing. And it should be! In my experience functional testers are perfectly good in noticing when the software under test is slow, but it rarely gets measured or reported, so it stays on the level "I feel this could be faster". With these tools it's quite easy to look under the hood, get some measurement and communicate with developers in a language of facts rather than feelings.

To go further down the rabbit hole you'll need active participation from development team. There are plenty frameworks out there that allow you to gather information on performance on real users' computers. The only catch is that you have to actually include some additional code into your application to do that. The amount of code would differ between frameworks, and so will the price. I haven't actually tried to use any of those frameworks myself so I cannot compare them, but development team should be able to choose the one that suits your case the best.

That's just those I found today, free ones are marked green:

https://github.com/HubSpot/BuckyClient - "Bucky is a client and server for sending performance data from the client into statsd+graphite, OpenTSDB, or any other stats aggregator of your choice." That's the only self-hosted tool I found - i.e. tool which won't send any information whatsoever to a third party servers.
Google Analytics
- https://developers.google.com/analytics/devguides/platform/
- https://developers.google.com/analytics/devguides/collection/analyticsjs/
http://newrelic.com/real-user-monitoring - New Relic Browser, has free limited features offering and 2-weeks full features trial
App Dynamics - http://www.appdynamics.com/ - a paid tool, similar to google analytics
http://www.compuware.com/en_us/application-performance-management/products/user-experience-management/real-user-monitoring-web-and-mobile.html
http://www.compuware.com/en_us/application-performance-management/products/dynatrace-free-trial.html
http://www.lognormal.com/boomerang/doc/
http://www.real-user-monitoring.com/

Aaaaand that's probably it. I hope it'll give someone a start. :-)

Sunday, 1 June 2014

Darkwing Duck about testing

"Darkwing Duck" is my favorite Disney TV series. It's awesome. It has superheroes, cool variety of villains, puns, cultural references, a very active extra smart girl and crazy plots.

Recently I've been watching episode 7 "Dirty money" again first time in few years, and realized: that's a saga about ISTQB vs Context-driven testers! :-D Agent Grizzlikof tries to force DW to do everything by the book, but instead our hero succeeds using his own genuine methods and situational awareness.

And in the end of the episode Darkwing Duck shows us one of the few ways in which ISTQB Foundation textbook can be useful. Yep, that's how it goes - it's only as useful as the paper on which it has been printed. Ignore the letters, use the paper... works for crime fighters!

And more on this topic, I believe that detective work is a very cool area to learn from for a tester. It's a similar business: go into the unknown and find information you seek using scientific methods, reasoning and wide domain and technical knowledge.

P.S.: I'm sure in actual law enforcement structures there are a lot of useful rules, because in their case people get hurt when someone makes a mistake, so inventing the bicycle each time has price too high to pay. And there are plenty of useful rules in testing (e.g. "Ask questions to understand") too, but ISTQB book is not about those nice flexible recommendations-rules. So my metaphor stands. :-)

Wednesday, 28 May 2014

Fair pay for IT specialists

I have this dream to get my own testing consultancy going in few years. Find good thinkers, find clients in need of smart testing, do the right thing... nothing fancy, just good ol' testing. There are plenty of challenges ahead of me on the way to this dream (it's not even a SMART goal yet), of course, but from time to time I like to think about little details as if I was already there.

Today I got to this: how would I organize payment system in my dream company?

I've been in the industry for almost 11 years now, and everywhere I see more or less similar picture:
- people get hired from outside - vs promoting capable people on the inside;
- a newly hired specialist gets higher salary that someone on a similar position who's already working for the company;
- companies don't work hard enough to keep their people as they consider them easily replaceable.

Now, I don't know about you, but to me it seems unfair and straight dumb. IT is an industry mostly getting its profit from innovation, and people are the basic material making that profit possible. People are important. Software company is not a factory! It can never work as a factory for a very simple reason: factories deal with repetitive tasks, whereas software development always deals with new tasks in a constantly changing environment.
An equally simple fact follows: if you want to be successful, you cannot treat people who work for you as replaceable parts of the system. It means both high expectations and high rewards.

So, going back to the initial problem... this is what I'm thinking to apply to my future business:
- a newly hired specialist will get salary not higher than someone already working in the company who is doing the same job. Which automatically means that salaries for anyone inside the company are at least on par with the market;
- all salary levels will be transparent as well as requirements for moving between those levels;
- structure of profits and spends will be available for anyone inside the company to see and comment on;
- when we need someone to fill a position inside the company with the higher level of competence than there already exists, we would first look for anyone who is ready to upskill and take it as a challenge inside the company.

I strongly believe that if a company wants loyalty, it has to be loyal to its employees first. And it also makes perfect commercial sense to me: the more happy, motivated and high qualified people I gather - the more efficient and high quality work we can do, the more happy clients will be, the more clients will want us. Profit!

So, does anyone think my payment rules would work? Do you think it's fair? Would you like to work in a company with rules like these? Do not hesitate to tell me if you think that's an awful idea.

Thursday, 8 May 2014

Lessons learned in non-functional testing

Yesterday I gave a lightning talk at my favorite WeTest Auckland meetup hosted by Assurity. It was a five minutes (okay, maybe 6 minutes) long presentation with the 15 minutes of discussion afterwards. So, for five minutes I had to make it really short, yet making sense.

The main idea I was trying to express is really simple: plan for non-functional quality characteristics right from the beginning.

As an IT graduate I had this assumption when I first started to work as a software tester, that developers would do proper technical and database design according to the task at hand, and then perform testing of the product from the very beginning. In other words, I thought it was obvious that someone would start testing as soon as you get your first requirements from a customer. In reality it wasn't like this at all. Testing usually started at the latest stages of development, which cost projects lots of work hours, effort and money, and often reputational losts. As years passed by, situation seemed to improved. Nowadays most IT companies accept that testing is important, and that it should begin as soon as possible. With one "but" - it only applies to functional testing.

When I switched from functional to performance testing last September, I didn't have that naive assumption about functional testing beeing done the right way everywhere... turns out, I still had even more naive assumption that non-functional testing is done the right way. I haven't done any serious programming in 7 years, yet when occasionally I think about implementing this or that, I still operate by the algorithm I learned in uni: I ask myself questions about a problem I'm trying to solve, and about the future of the software that would solve it. I thought it was obvious: when you write e.g. a web application, you think about bunch of stuff on top of functionality: how would it scale, how would you plug in new functionality, how would you localize it, how would you customize it, what questions will users ask about it, how would you test it, and so on. Sometimes most of the answers would be "it doesn't matter because..." - but important thing is to ask questions and know that for sure.

Well, in real life it happens rarely. I'm not talking about companies like Google or Amazon - surely those guys know what they are doing if only because they had enough experience in the area. But most companies aren't Google. What I see in real life is fire fighting: application is created, then it goes into production, and then you have problems in production, which are extremely expensive to fix. Some issues are obvious: e.g. security breaches and performance and/or scalability breakdowns. Some issues are more subtle: e.g. bad supportability would mean a lot of time and effort spent by field agents (support line, or operators, or whatever you call people who talk with unhappy users). Bad maintainability would mean additional problems when you need to install updates or reconfigure something in live system. Bad testability would mean more time necessary for testing e.g. new release, and potentialy worse quality because some areas testers won't be able to reach. Bad initial usability will bite you in the ass when you decide to implement shiny new GUI which would be different from the existing one, because users will be resistant to changes (remember MS Office 2003 -> 2007 outcry). And what's even worse, to fix non-functional issue often requires to implement fundamental architecture changes.

To summarize, non-functional quality is important, it costs a lot to fix, it isn't being given enough attention.
So, the question is, if you work in one of those companies, what can you do?

As part of the team, regardless of your role, you can communicate and underline the importance of the following:

When doing tech design for a new application or a new feature, think about the future. What would happen with your application in a year? What new functionality it might have? What would be the environment and the load (e.g. number of users per hour)? Are you gonna store any sensitive data? Would you need to create additional applications (e.g. mobile) to coordinate with it? What additional data you might need to store? How would you localize/customize it if necessary? How often would you need to release a new version?
Do risks analysis. Always do risks analysis - better be prepared for the trouble that might come your way.
Involve specialists in specific fields if your team lacks those: there are consultancies, that can review tech design, or do security testing, or performance testing, for you.
Start the actual testing as soon as possible - not only functional testing, but non-functional too. On early stages it would look differently from how it loos on later stages: e.g. it makes little to no sense to do a full performance testing for a half-made release, but it is possible to do performance testing of existing separate components, especially of the core ones. It pays off because the earlier you fix the issue, the cheaper it is.
Treat non-functional issues fair. If the team doesn't fix e.g. testability issues in time, you will pay for it later with much more complicated testing and higher price for building testability into the project. If performance or security issues are not being treated with respect because "it still works, no one will notice it's slow/insecure", you will pay with your reputation and money when it manifests on production. And if your product is a success, it will manifest in production sooner or later. #too_much_painful_experience

After I did the talk, one of the testers on the meetup asked if all of this makes sense for startups and such - she implied that in the beginning you don't want to spend your time thinking about all this stuff, because you are under pressure to deliver fast. This is another point of view, but I strongly disagree with it. The only case in which you can forget about quality is when you are making proof of concept, then throw it out the window and start anew. If you are gonna use the codebase of your first release further, it pays off greatly to spend few extra hours (and in the very beginning you would only need few extra hours) on analysing the future of your application and doing your tech design (database design and application design) properly.

Remember: there is nothing more permanent than a temporary solution.

Slides for the presentation are shared here:
https://docs.google.com/file/d/0Bxi4eMT3I97ea3luRU9nemRpODg/edit