On Software Development: October 2008

Sunday, October 26, 2008

Joel on Software

If you have not read his blog, you need to (http://joelonsoftware.com).
He is one of the best writers on software development, and my favorite blogger.

Not all the posts are about software management. The best posts are from a few years back. I guarantee it is worth your time to go through his archive.

A great recent pearl on standard APIs: Martian Headsets
Enjoy!

coding and testing pipeline

This is a good Agile/Scrum concept that most teams rarely use. I recommend using it even if you're working "waterfall".

Instead of releasing the full feature set for testing on beta/hand-over-to-test day, make small releases, each one stable and adding testable features.
Benefits:

It allows the test team to start testing before the regular date (more time to test means more bugs are found).
It reduces the instability of the version, which saves time for both developers and QA.
It allows faster feedback on bugs for the developers.

There should be multiple releases in each version. Don't go too far and release one each week (remember that each release has an overhead which can never be reduced to zero, no matter what the "Agile" methodology says).
I believe one release a month is sufficient for a 3-month version period.

Tester expertise

For each new feature, make sure the QA (quality assurance) team gets the same info/training material as the developer does.

If the tester learns what to test from the original developer, the testing level will be very poor, and will only verify what the developer already knows.

Anecdote regarding this:

In one of my first jobs, I was responsible for an old feature that someone else (who was long gone) had written. The tester did not know the feature and asked me, the responsible developer, what the feature does.

After asking around and finding that no one knew this feature, I had to delve into the code and describe what the code does.

The tester happily wrote it down and then went back to his office and started testing it.

Does anyone see the problem here?

The tester tested my understanding of the code and not what the feature should do. He found a bug and showed it to me. I then realized I had misread what the code does in a corner case. Neither of us knew whether the actual code behavior was correct, and we ended up concluding that I had misunderstood the code, not that the code had a bug.

Developer Expertise

A trivial principle, yet usually ignored because of lack of time in the short run.

Developers must have expertise in the subject domain they are working on, whether it is finance, networking or music.
They should also have a minimal level of knowledge in the programming language they are using.
Sounds trivial? It is not. Most teams have at least one "novice" programmer straight from university who usually doesn't have the expertise I mentioned. You can:

(1) Let him code new features.
(2) Let him only fix old features.
(3) Let him watch but never touch production code. He can touch non-critical scripts.
(4) Not hire inexperienced employees at all.

What happens in your team?
I bet most of you answer (2), some answer (1) knowing that it is bad but having tight deadlines, only a small percentage answer (3), and almost no one answers (4).

What is the right answer? I have to go with (3) and in some cases even (4).
I think the best analogy to this question is a father and son who need to paint the house together.
Asking your 3-year-old boy to help you paint the house is not helpful. Yes, he is cute and eager to help, but we both know you can do it alone in half the time it will take you together, because you need to supervise him all the time and make sure he does not "eat the paint" or paint the furniture.
When your kid is 10 years old you may start involving him, knowing that although you will not save time yet, this will help your kid learn the craft.
When your kid is 12, you still supervise him, but already gain a good time boost from his help.
When your kid is 15, he can do the job himself, and if you work together, you get the job done in half the time.

Setting the analogy aside, you can let your new employee write non-production code. This is either internal company code or even unit-test code. In the first week or two you probably won't gain any time from him doing that, but like the 10-year-old kid, the developer learns the craft.
For those of you who answered (1) because of tight deadlines, please understand that it will only cost you more time in the medium to long run. What good is code that is written fast but does not pass the first QA tests?
For those who answered (2), another analogy: let's say you missed a very small spot when painting the ceiling. Will you give your 3-year-old boy the brush and a ladder and let him loose? You will probably, again, lose time. Fixing bugs is not a good way to make new employees learn the craft. It is only the "easy" (uncaring) way to do it.

One last emphasis: expertise in a new subject is needed even for an engineer with a dozen years of experience. For example, multi-threading expertise is hard to gain and quite rare.
Don't let any programmer write multi-threaded code before s/he has taken a course or read a book on the matter and has a few weeks (months?) of experience writing non-production multi-threaded code.

Increasing Quality - review and unit-tests

These principles aim to increase code quality and thus overall productivity. They aim to reduce the number of bugs which slip past the development team. We know that bugs take (exponentially) longer to solve the later they are traced.

The 3 principles are:

design review
unit-testing
code review

Design review

The most important review is not the code review but the design review. This stage is ignored in most companies. After the design is ready (I'm implicitly assuming you are doing some sort of design before starting to code), let the tech leads/seniors/smart people review it and give remarks.
The benefits of design review:

(a) A change made at this stage, and not after the code review, will save days of work later.

(b) A change requested here will most probably actually be made. A lot of code reviews end up saying "it should have been done differently, but we do not have time to fix it and re-test it now, so leave it be". In the design-review stage this does not happen.

(c) The review communicates the initial API between different components, helping other developers understand what to expect.

Note that design review must be enforced by the managers on the tech leads/seniors, as it is usually done in the period when the tech leads have the least time. The manager must press them on this issue.

unit test

The best way to know the quality of a component is to ask the developer what level of functionality he tested. Let there be the following answers: "It may work", "It probably works even in hard cases", "It definitely works in all cases".

The good result should be: "It definitely works in the average case and it probably works even in corner cases". Less than that and they did not test enough; more than that and they wasted time on unnecessary tests.

Developers should do extensive unit testing, manual or, better yet, automated (in Java, JUnit is the most familiar framework). The goal of this developer testing should be to test the high-risk sections of the code. 80-90% coverage is better than 100% coverage if it leaves you more tests of error-prone code. We must understand that for 99% of features, full code coverage is both impossible and a waste of resources!

It is OK for a developer to leave some tedious test cases to the quality-assurance people, as long as the developer makes an effort to test a few of these cases and knows that the code "probably works".
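
As a rough illustration of "definitely works in the average case, probably works in corner cases", here is a minimal sketch using Python's built-in unittest module instead of JUnit. The function split_bill and its test cases are hypothetical, invented only for this example. The tests cover one average case and the two riskiest spots (leftover cents and a zero divisor) rather than chasing full coverage.

```python
import unittest


def split_bill(total_cents, people):
    """Hypothetical function under test: split a bill evenly,
    handing any leftover cents to the first few people."""
    if people <= 0:
        raise ValueError("people must be positive")
    share, leftover = divmod(total_cents, people)
    return [share + 1 if i < leftover else share for i in range(people)]


class SplitBillTest(unittest.TestCase):
    def test_average_case(self):
        # The everyday case: the total divides evenly.
        self.assertEqual(split_bill(900, 3), [300, 300, 300])

    def test_leftover_cents_are_not_lost(self):
        # High-risk spot: rounding must not lose or invent money.
        shares = split_bill(1000, 3)
        self.assertEqual(sum(shares), 1000)
        self.assertEqual(shares, [334, 333, 333])

    def test_zero_people_is_rejected(self):
        # High-risk spot: division by zero must fail loudly.
        with self.assertRaises(ValueError):
            split_bill(1000, 0)
```

Saved in a file, the tests can be run with "python -m unittest". Tedious variations (very large totals, many people) are the kind of cases that could reasonably be left to QA.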

code review

This is done quite commonly and it's a good practice. This stage can help trace dozens of bugs very fast. It also helps spread knowledge between team members and provides coding feedback for the coder.

One uncommon practice in code review I want to recommend is doing the code review "offline", without the original developer sitting next to the reviewer.

This ensures the developer writes code that another human being can understand (usually using comments, good naming conventions etc.), as he knows someone will read his code in a few weeks and will suffer otherwise.

It also ensures that no bugs are swept under the rug, as the developer does not talk with the reviewer directly during the review.

The coder and the reviewer meet twice: once before the code review for a high-level explanation, and once after the reviewer has read the whole code, to share his remarks.

Bug fixing assignment - the greatest dilemma of them all

Close to the end of the first version's development, the team finds a few bugs in the version. It always happens. When should these bugs be fixed?
Don't listen to the engineer who tells you: "I'm your best engineer, developing the cutting-edge features the fastest in the company. Let the new programmers solve the bugs. This way they will learn the system and free me up for new development".
Does the engineer sound reasonable to you? A lot of managers fall into this pit. It's not that the engineer is evil; s/he usually really believes that this is the best way to work.
It is not.

I've got two simple principles for you regarding bug fixing:

The bug-fixer should be the original coder if possible (and it is the manager's job to make it possible).
The bug should be fixed close to the time it was found. It does not have to be the same day, but shortly after (and not in the next bug-fix stage, which is 3 months later).

There is a great debate on these two principles; lots of managers will say the complete opposite. I will explain why I believe they are true.

Principle 1 - The bug-fixer should be the original coder (if possible)

Feedback - the coder understands his own coding mistakes and becomes a better coder, who produces fewer bugs.
It will take the original coder a small fraction of the time it would take another programmer to solve it.
When solving a bug, there is always a small chance of introducing a new, different one. When the fixer is not the original coder, the chance becomes much bigger.
Not all bugs, but the nasty ones, are only symptoms of other ("chronic") bugs. The original coder will find the real bug and treat it. Another programmer will probably cure the symptoms, not the problem. One simple example: a race-condition bug can be solved locally, but can be a symptom of a system-wide multi-threading problem.
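
To make the race-condition example a bit more concrete, here is a minimal Python sketch. The names SafeCounter and run_workers are hypothetical. A "symptom" fix would add a lock around the one call site where the bug was reported. The "chronic" fix shown here puts the lock inside the shared object, so every caller, including the ones nobody has reported yet, is protected.

```python
import threading


class SafeCounter:
    """Shared state that guards every access with a single lock,
    so no caller can forget to synchronize."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, amount):
        # The read-modify-write happens entirely under the lock.
        with self._lock:
            self._value += amount

    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, workers=8, iterations=10_000):
    """Hammer the counter from several threads; each iteration nets +1."""

    def work():
        for _ in range(iterations):
            counter.add(1)
            counter.add(-1)
            counter.add(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value()
```

With the lock owned by the counter itself, run_workers should always return workers * iterations. Someone who did not write the original code is more likely to patch only the failing call site and leave the unprotected shared state in place.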

Principle 2 - The bug should be fixed close to the time it was found
If you wait until the end of the current version to fix the bug, you can have a few problems:

When all the bug fixes are concentrated at the end of the version, it is guaranteed that the bug assignment will never be as good.
An unstable development version until that time. Finding workarounds for multiple developers will usually waste more time than fixing the bug in the first place.
An unstable test version - the QA team may not be able to test a feature well, or may waste time rediscovering an already known bug. This will further reduce the quality of the version.

Focal points

This is a very simple principle which takes zero effort to implement. Surprisingly, a lot of teams do not implement it.

The problem:
When outsiders (people from outside the group) want to talk with insiders about a bug/feature/just-an-explanation, they rarely know whom to turn to in the group.
In a lot of teams, the outsider needs to go through all the group members, one by one, to find out who can help him with the relevant subject. When this one-by-one walk is done by email, the outsider will probably give up after three or four forwarded mails.
Add to that that sometimes the first-line or second-line manager is also an outsider, and you can see what a horrific waste of time and what bad intra-group communication this causes.
In some cases (aka "Agile") this can also happen inside the group, as a feature/bug does not have a clear go-to guy and instead the "whole group" is responsible for everything.

Solution:
It does not matter whether you use the "circles-of-knowledge" principle, Agile, Scrum or any other methodology: you must assign one "focal point" person for each feature/area.

The focal point knows which features/bugs are being developed, can say who developed them and which bugs may be related, and can pinpoint the exact person who will be able to give you the final answer.
They must know the field well, but do not need to be the one who actually writes the code or knows every small detail.

In a flat-hierarchy team there can be a lot of focal points, each responsible for his own area of expertise.
In a traditional-hierarchy team, the group/team leaders tend to be the focal points for all the areas their teams work on.

It takes a MAX of 3 hops to get to the person who knows the answer:
First hop - Danny points you to the focal point of this subject, Joe.
Second hop - Joe either knows the answer right away and we are done here, or knows exactly who solved the bug a month ago (Jenny) and sends her an email, cc-ing himself, regarding the problem.
Third hop - Jenny answers the question and never "passes it on".
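
The three hops can be sketched as a simple lookup. This is only an illustration: the area name "billing", the topic "rounding bug" and the two tables are made up, and in real life the tables live in people's heads or on a wiki page.

```python
# Hypothetical data: one focal point per area, plus what that
# focal point knows about who handled which topic.
FOCAL_POINTS = {"billing": "Joe"}
EXPERTS = {("billing", "rounding bug"): "Jenny"}


def route(area, topic, first_contact="Danny"):
    """Return the chain of people asked, ending with whoever answers."""
    hops = [first_contact]            # first hop: anyone in the group
    focal = FOCAL_POINTS[area]        # ...who only needs to know the focal point
    hops.append(focal)                # second hop: the focal point
    expert = EXPERTS.get((area, topic), focal)
    if expert != focal:
        hops.append(expert)           # third hop: the person with the answer
    return hops
```

For example, route("billing", "rounding bug") goes through Danny, Joe and Jenny, while a topic Joe can answer himself stops after two hops. The chain can never be longer than three, because the only knowledge anyone besides the focal point needs is the focal point's name.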

A final note: DO NOT MISTAKE THE FOCAL POINT FOR THE TEAM LEADER.
The focal point can be a non-senior programmer too, if he is the main expert in his field (or the field is so uninteresting that no one wants to get near it).

How to fight the "job-security" employee principle using the "circles-of-knowledge" manager principle

This is a great principle I learned from a great manager in one of my recent jobs.
I'll start describing the problem that this principle solves with a true example:

A startup company, let's call it "the guild", has 20 good, dedicated workers who build a great and complex piece of software. As the guild's business is booming, they want to recruit 20 new employees each year.
Some of the existing employees, usually the less bright ones, fear for their job/status when the new employees arrive and think of a clever way to have "job security". They do not teach the other employees anything about the systems they are working on and remain the only ones who know how to operate these systems.
In the first year, the company still manages fine. The new employees' work tends to "flow" around the old employees' systems, and the new and the old work in harmony.
In the second year, the old employees understand that they are the only ones who know how to operate the core systems, and they demand a better salary, a promotion etc. The manager asks himself (and some of his workers) whether anyone else knows how to do their jobs. As the answer is negative (they did not allow anyone to get near those core systems), the manager promotes the non-bright people to team-leader jobs.
In the third year, those team leaders continue to use the principle that helped them advance so well, both on their previous jobs (the core systems) and on their new ones: their whole new teams don't allow anyone near their code.
In the fourth year, the "job security" principle has gotten so bad that no team can work with another team, and no new feature which needs inter-team cooperation can be done.
The employees demand a raise again and, as they are now totally irreplaceable, get it.
In the fifth year, the employees are getting bored and decide to move to another company. They give one month's notice, in which they are willing to explain all the core systems that they wrote to the new employees replacing them, but one month is never enough to pass on a five-year body of knowledge.
In the sixth year, things in the core systems, which are now not maintained by anyone, start to break. No one knows how to fix them, and the core systems slowly become obsolete and are finally shut down.

This is a true story which happened in various teams in the same company. I'll bet you have met at least one employee creating his "job security" in every big company you worked at.

Some of you are probably running to the "comments" to tell me about the great software development methodologies you are working with (like Agile/XP), which successfully solve this problem by making sure that no one person is ever solely responsible for one component.
So let me say one thing: Agile/XP solves this problem but creates another, even more serious one. Instead of only one person knowing how to work with the core system, Agile/XP makes sure that NO ONE knows how to work with the core system.

Let me describe the "circles-of-knowledge" solution (again, a young manager I worked with created and perfected it; I'm only describing his idea). It can be summed up in three simple principles:

Inner circle principle - Every big feature has a few (1-2) people who are responsible for it and have excellent knowledge of the code/bugs of the feature.
Outer circle principle - When coding/bug-fixing a big feature, a few (1-2) other people, who are not in its inner circle, get tasks related to the feature.
Make sure the circles overlap and intertwine as much as you can, creating good redundancy among your employees.

Usage example:
On our team there are 5 people, let's call them A, B, C, D and E. The team's task is to develop a client-server system with 4 components, with the following man-hour percentages:
server-side framework (20%), server-side application (big feature, 40%), client-side framework (10%), client-side application (30%).
This is an example of work division according to the circles-of-knowledge principle:
A: inner circle - server-side FW; outer circle - server-side apps
B: inner circle - server-side apps; outer circle - server-side FW
C: inner circle - server-side apps; outer circle - client-side FW, client-side apps
D: inner circle - client-side FW; outer circle - server-side apps
E: inner circle - client-side apps; outer circle - client-side FW

This is only one example of a possible division. The important thing to note is that big features have more than one person in the inner circle and a few in the outer circle too, and that even the smallest feature has someone in the outer circle who has knowledge of it.
And as there are more employees (imagine the same team with 10 people), the overlaps between the circles become clearer.
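
A division like this can be checked mechanically. The sketch below encodes the A-E example as two Python dictionaries and lists which groups of departing people would leave a component that nobody remaining knows. The helper names uncovered and risky_groups are made up for this illustration.

```python
from itertools import combinations

# The example division: person -> components in their inner/outer circle.
INNER = {
    "A": {"server-side FW"},
    "B": {"server-side apps"},
    "C": {"server-side apps"},
    "D": {"client-side FW"},
    "E": {"client-side apps"},
}
OUTER = {
    "A": {"server-side apps"},
    "B": {"server-side FW"},
    "C": {"client-side FW", "client-side apps"},
    "D": {"server-side apps"},
    "E": {"client-side FW"},
}
COMPONENTS = set().union(*INNER.values())


def uncovered(missing):
    """Components that no remaining person knows (inner or outer circle)."""
    known = set()
    for person in set(INNER) - set(missing):
        known |= INNER[person] | OUTER[person]
    return sorted(COMPONENTS - known)


def risky_groups(size):
    """Groups of `size` people whose departure leaves a component uncovered."""
    return [g for g in combinations(sorted(INNER), size) if uncovered(g)]
```

For this particular five-person division, risky_groups(1) is empty, so any single person can leave without orphaning a component. risky_groups(2) flags the pairs (A, B) and (C, E), which share the only knowledge of the server-side FW and the client-side apps respectively. A check like this hints at where another outer-circle member would add the most redundancy.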

Using these principles, it is easy to see the end result:
Redundancy - with enough overlap, you can "lose" 2 employees and still have excellent knowledge coverage of all the features. Even if you "lose" half of your team, you will still have excellent knowledge of the remaining people's inner-circle features, and reasonable knowledge of the other features they covered as part of their outer circle.
Responsibility - at any given time, each employee is directly responsible for a feature of his own. It is clear who can help with a problem. It is clear whom to praise for a job well done and whom to blame for system defects.
Efficiency - as all the features are covered with an excellent level of knowledge, it is easy to develop new code or solve bugs. There is never a "black-box" feature which everyone dreads to touch.
But you never extend the outer circle to the whole team, which prevents the problem of "all need to know everything but end up knowing nothing".

A final note to clear things up: the division into circles of knowledge is created over a period of time. Don't go and divide each two-week feature among 4 different people (that's the mistake Agile/XP guys tend to make). Instead, make sure that over a long period of time, like 3-6 months, you slowly create this coverage. Example: the first team (inner circle) builds the core feature in the first 2 months. In the 3rd month a small feature needs to be added: assign one guy from the inner circle and one from the outer circle. In the 4th month another small feature is added; again, put one from the inner circle and one from the outer circle on it, and so on.

Using this method is not easy at first for the manager, because it requires a lot of planning on his part, but it is worth it. Who said the work of a manager is easy...

Good Manager

You need a lot of character to become a good manager. I can't teach you that (other blogs claim they can...).
I can give you simple work principles, a to-do and don't-do list, which I collected from the good/bad managers I had the privilege/horror to work with.

Recruiting: small excellent groups are far more productive than big mediocre groups.
Following this rule seemingly contradicts a known principle: "The more people the manager has, the more important the manager is". Yes, it is a known problem.
I can only suggest setting your ego aside when recruiting new employees and thinking about what is really best in the long run for the business, and thus for you. I hope you will choose wisely.
Flatten the hierarchy of your team.
Find out how many people you can effectively manage by yourself. The number is usually capped at 5-7 people. If you have that many people, treat all as equals (in importance, not in knowledge). If you have more than that, create a sub-hierarchy by splitting the team into 2-4 groups and appointing a leader for each group. Tell the leaders to treat each of their employees as an equal [this is a recursive principle :)]
Do a one-hour weekly review with the whole team. Share managerial decisions (those you can) in the weekly meetings. Don't let rumors rise, and send a summary of the weekly review afterwards, so people who could not join won't waste a full day asking what they missed.
When dividing tasks/pay you can never be equal with everyone. That's a fact.
So don't hesitate to tell your employees that they will be compensated in the next tasks/pay division. It will usually change their mood from a sad "I wanna quit" to a joyful "The boss owes me a great thing next time". Remember that they will wait for you to keep your promise.
Keep your promises to your employees.

Hello World

In this blog, I will talk about software development.
The good
The bad
and the ugly things in software development.

It is not intended for a specific programming language, but focuses on principles related to managing a software project as a whole.
I am trying to summarize here the work experience I have had in quite a few projects, in small companies (start-ups) and very big companies.

So, let's start.
