IT Woes: Blown Deadlines.
I’ve seen everything in corporate IT over 25+ years: servers crashing every five minutes, runaway defect rates, key developers leaving due to a hostile environment… I am not going to name the people or companies, as I am sure they learned their lessons. I am an engineer, so for me the 70–90% IT failure rate is an opportunity to solve the problem rather than to place blame and climb the corporate ladder.
Every IT Problem Is Technical
Before I get into blown deadlines, overrun costs, and other chronic estimation issues, let’s pause for a moment and think about those estimates. What is your process, exactly? Other than guessing. Is it… technical? Not in a vague “high-level enterprise architecture” kind of way. How well do you understand what you are building, from a programming standpoint?
Having spent 25+ years in IT, I am well aware of all the pitfalls: scope creep, the developer’s eternal “two more weeks” promises, integration issues between seemingly straightforward third-party pieces, etc. Unpredictable as they are, even in a generously padded project those things are hardly an excuse for repeating estimation mistakes. The pressure from above? All customer (stakeholder, etc.) expectations are based on some scope and are more or less reasonable.
E.g. if something looks like 10 screens backed by 20 database entities (aka “tables” in relational databases), the amount of work to implement those 10 browser screens in HTML, plus the underlying business logic to map 20 tables to 10 screens, is finite. 10–15% leeway at best. No surprises. Good developers do one (tested and ready to ship) screen per day.
How do I know it takes a day? I am an engineer. That’s how long it would take me.
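For what it’s worth, the arithmetic behind such an estimate fits in a few lines. The numbers below (10 screens, one screen per day, 15% leeway) are the hypothetical figures from the example above, not a universal formula:

```java
// Back-of-the-envelope project estimate: screens x days-per-screen,
// padded by the small fixed leeway (the 10-15% mentioned above).
public class Estimate {
    // Returns total developer-days, rounded up after applying the leeway.
    public static int devDays(int screens, double daysPerScreen, double leeway) {
        double raw = screens * daysPerScreen;          // e.g. 10 screens x 1 day
        return (int) Math.ceil(raw * (1.0 + leeway));  // pad and round up
    }

    public static void main(String[] args) {
        // 10 screens, one tested-and-ready screen per day, 15% leeway
        System.out.println(Estimate.devDays(10, 1.0, 0.15) + " developer-days");
    }
}
```

The point is not the trivial multiplication; it is that every input to it is a technical fact you can verify, not a guess.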
Three things are required to estimate any software development effort.
- Knowing what you are building, down to the smallest (technical) detail of the smallest part/module of your system.
- Knowing how you intend to build it: again, down to every little framework and API call. Only then will you know how long it would take you.
- Confidence that your team is as capable as you are, so the effort will be similar time- and cost-wise to an ideal project staffed by experts like you.
Team Capabilities
Let’s start with #3 above. Add testing and release activities and adjust for your team’s capabilities, meaning the people able to complete the task. There is a finite number of capable programmers on Earth. Let’s call them “senior” for the sake of discussion: engineers able to work w/o supervision and w/o rework/cleanup — i.e. exhibiting positive performance.
Take my 25 years in the industry and 20+ multi-million-dollar enterprise projects for what they are worth. Call it 80/20 or another heuristic: the few aforementioned experts write all of the code, while the rest write code destined to be rewritten by true programmers during bug fixes. Ever thought of coding this way? Who ends up authoring all of the code over time?
Talented juniors have negative performance too, i.e. they create unneeded work for the senior developers babysitting them. They are an investment — that is all I am going to say for now, as the topic deserves a separate article. Everyone else, e.g. the mythical “mid-level” engineers (not-so-talented eternal juniors unable to learn despite their “years of experience”), simply drags the project down due to negative performance.
So it’s either one to two days per moderate screen, or the project is never delivered at all, as evidenced by blown deadlines, cost overruns, and other things leading to a slow death at the eternal corporate IT 70–90% project failure rate. There is no middle ground: no linear, skill-proportional expectation of e.g. five lower-skilled (and lower-paid) developers producing as much as one “rockstar”. They won’t produce working code at all, resulting in the “rockstar” redoing 100% of their assignments at the end, like I said — in addition to his/her own code. After lots of expensive QA cycles, finger-pointing, and other unproductive commotion.
Captain Obvious for some (engineers); unfortunately I’ve seen lots of naive attempts to replace one with five (10, 20, etc.), “find some work for them” (the code monkeys), and otherwise treat creative, individual-centric occupations like software engineering as some kind of assembly line staffed by minimum-wage workers overseen by a foreman, typically called an “enterprise architect”. I want to keep the discussion pure and logical, so I won’t get into irrational reasons like man-hour fraud — the specialty of the Great IT Consulting Food Chain.
It is also not a “geeks vs. suits” standoff. Business schools teach all the right things, but there are tasks and times to be, you know… technical. Business and engineering are two different activities. Classic MBA-level “business” is largely statistical: focused on trends and long-term strategy. Engineering is precise, i.e. tactical: focused on one thing — killing the enemy in front of you while the generals plan future battles.
The enterprise software development process needs that precision, doesn’t it? Starting with the time and cost estimates. So what’s your method? From the technical standpoint.
I can tell you mine. I am an engineer. I do the three things above, the “what” and the “how”: clarifying and streamlining the requirements, then mapping them to technologies, which is an elaborate and non-trivial process compared to mainstream “solution architecture” (more on that below). The third step is building/adjusting a capable team.
There are many ways to plan in terms of “what” and “how”. I do classic technical design: write down detailed bullet point lists of “todos” based on careful research of every little part. Is that how you plan your projects too? What is your exact process? Padding guesstimates? Let’s get to the bottom of the problem, like good engineers, instead of treating the symptoms, so to speak, with such padding.
Knowing What and How. “Architect’s” Responsibility?
I am not going to speak for Accounts Receivable, Manufacturing Control, and other examples of rather “general” management. Software engineering is a bit too complex to be treated like the assembly line I mentioned above, with its foremen: the so-called “architects” to whom all the annoying “technical mumbo-jumbo” is typically “delegated”.
Normal (construction-related) architecture is more akin to UI design in IT than to useless “enterprise architecture” “deliverables”: five-year-old’s-coloring-book-worthy “diagrams” of boxes and arrows, as detached from reality as a non-technical manager’s guesses about how many specific “resources” are needed and how long things will take.
Semi-technical architects don’t know “whats” and “hows” — no universal books to build inherently custom systems by. Every business process is unique and challenging. There are tons of conflicting workflow requirements and convoluted regulations to start with, let alone incompatible technologies, not immediately apparent to the typical “architect” with zero coding experience.
Having fixed many tough bugs related to conflicting business rules, the lowest code monkey becomes the de facto Subject Matter Expert (SME), knowing the domain better than any power user. He/she also becomes THE architect, choosing the final set of frameworks and services: the combination that actually works after lots of tricky adjustments and workarounds — almost always entirely different from the “technology stack” vaguely recommended by the official “architect”.
Even if a non-engineer (manager or architect) wants to do everything by the book, e.g. plans to hire front end, “middle tier”, and back end specialists, how much does he/she know about any of these activities? How many of those specialists are needed? What exactly will they produce? And when, interacting with each other? You need to know what every team member is doing before drafting a Gantt chart. These are 100% technical questions, don’t you think?
There is no such thing as a “technical background”. How would you feel about a surgeon with a “medical background” operating on you? Or a general with a “military background” leading troops into battle? They are doctors and soldiers, period. Is it unreasonable to require everyone building software, at all levels, to be an engineer?
Embracing that, and empowering every engineer on the team to be a top generalist, is key to the project’s success, because every engineer will function as a (bad) generalist, e.g. SME and architect, anyway due to the creative nature of software development. Unless you want those assembly line workers to constantly bug the foreman (who’d in turn bug the shift supervisor, and so on and so forth) about every little detail, e.g. “You told us to use this framework. It’s not working with another framework. We are stuck. Please pick different frameworks for us… Thank you, it’s working now, but we started having problems with the third framework…”
Ever heard of Work to Rule, aka the Italian Strike? I doubt it’s intentional sabotage in the IT context. Nevertheless it’s just as devastating in creative occupations like software engineering. That’s what all IT “communication problems” are about. People need to grow professionally to function as an effective team. No formal “team-building”, let alone magical “oversight”, can make it happen.
The benefit is not linear. It’s exponential. Empowering everyone in the team to become a generalist (founder-level developer if you will) leads to 10–100x productivity (per my experience), meaning you won’t need as many people, further reducing things lost in translation.
Modeling IT after late-1800s factories with thousands of workers and lengthy chains of command probably worked at the dawn of the computer era. It is no longer an option due to runaway business process complexity. Software development is extremely vulnerable to incompetence. I am talking about technology, not management mistakes. Your team is as weak as its lowest code monkey, not as capable as its credentials-clad “architect”. That is, if you can even find a real engineer working in that capacity, since it tends to be an all-talk pre-sales occupation: advising on buying IBM or Oracle, or vaguely explaining how Kafka works.
Y2K Deadline: the Perfect Managerial Solution. Did It Work?
Was the Y2K aversion a success or a failure? It’s obvious that the original creators of 1970s mainframe software didn’t expect it to live that long. Hence only two digits for the year, to save the then-precious disk space and RAM. All so-called “legacy” systems were supposed to be completely rewritten in newer languages (e.g. C++ at the time), storing data in newer (compared to 1970s flat files) relational databases.
It didn’t happen in the late 90s. The engineers tried. The technology wasn’t ready to support the complexity of business automation: robust information flows, convoluted government reporting and compliance regulations, etc., which had increased a hundred-fold since the 70s.
I mean the conventional technology sold by SAP, IBM, and Oracle, and pushed by the aforementioned “architects”, wasn’t. It still isn’t, as evidenced by lots of COBOL and other mainframe software in production. Go to any big department store or any auto dealership and see those terminal screens for yourself. Embedded in the browser or staring at you in all their green-on-black alphanumeric glory, it doesn’t matter.
After the engineers tried and failed, their non-technical bosses got their chance, using what they were taught in business school: brute force — hiring the cheapest code monkeys to change every year field from two digits to four. I know, not the most elegant solution engineering-wise, let alone viable long-term (I’ll give it another 10 years), but it nevertheless shows the real value of so-called “architecture”, don’t you think?
No “Architects” at Google
How should it have been done? Around the same time (the late 90s) “big tech” was born, starting with Google. It doesn’t have architects or functional managers. Just engineers. A lot of outsiders picture Google and Amazon as mysterious scientific facilities staffed by MIT and Caltech PhDs in lab coats, while any programmer knows they are normal software engineering companies. So how do their products compare to IT projects, e.g. multi-million-dollar ERPs? Why doesn’t Google have the 70%+ project failure rate? Because its products are less complex than business process automation?
No: because no one has applied Google-level programming to IT problems. Read my OOP vs FP post here if you want more technical details. It’s not rocket science. Google’s normal-quality programming shines only because IT has never known it. After 20+ years in American IT alone, I have yet to see a single piece of C++ or Java code that is minimally object-oriented, i.e. not a “struct with functions”.
The industry was whole 20 years ago. Now it is split into the space-age B2C “big tech” and “outsourced” IT, still celebrating the Y2K aversion and continuing the decades-old crusade to eradicate “expensive” programmers through mythical DIY tools and “almost-turnkey” ERP packages. Why can’t it embrace normal programming like Google did? No “scientific” magic. Even the same programming languages, like Java. Isn’t it time to move on from the 1970s COBOL-inspired architectures and 1980s Oracle Forms that all of today’s conventional business software development tools mimic? Simply take Amazon, Google, and Facebook inventions, available for free as open source. Read my other, more technical articles here if you are curious what I mean exactly.
In any case, tell me: what’s preventing an average “IT organization” from becoming a Google? Hard to find top talent? Google values smart grads. Different pay? Google didn’t pay its engineers $500K until recently. It is the attitude and the process, nothing else. Engineers empowered to be top generalists.
Most IT problems, including seemingly organizational ones (communication issues, poor motivation, etc.), are 100% technical. I am an engineer, and that’s what we engineers do for a living: unconditionally solve problems instead of facilitating and mitigating.
Case Study One: A Moderately-Scoped IT Project
I’m not going to talk about $100M ERPs. Let’s pick something smaller and remove as many uncertainties as we can. The case study I chose for this post was a straightforward project estimated at two months. I was brought in after nine months, when it had been hopelessly falling behind.
Shockingly, from the pure engineering standpoint the initial estimate was 100% correct. Armed with today’s Px100 Platform or a similar tool, I could have done it alone in three weeks, meaning a team of two or three could do it within two months. What went wrong?
There was no scope creep. The project was an automated data processing pipeline, typical for financial companies exchanging files of trades and other data in archaic mainframe-era formats. It was replacing an old mainframe system that exhibited severe performance issues, to say nothing of extensibility. I’m not going to bore you with further details. The most important thing about those requirements was their 100% clarity. All input and output formats were known, along with 100% of the data transformation and consolidation/reporting logic.
Isn’t it typically the case for any blown deadline? If something is estimated low, it means a straightforward task with no surprises, doesn’t it? Or put another way, no excuses. Not from the technical perspective.
Professional Atrophy
Was that project staffed by the wrong people? Of course it was. The question is, did the team have enough right people? It had more than enough: three smart engineers. So why didn’t those three deliver? Only one reason: professional atrophy.
See, “IT organizations”, that is, the world outside of the handful of “big tech” companies like Google and Netflix, have been dumbing down and eradicating “expensive” programming through “business-oriented languages” and mythical DIY tools for decades. It was going on long before the infamous “outsourcing” of the early 2000s. Anyone left in IT after that (who hasn’t gone to the Googles) has simply lost the ability to program.
The issue manifests itself big time in the recruiter emails I keep receiving, since it is impossible to unsubscribe from recruiters. Everyone wants an expert with a laundry list of qualifications. Alright, you go to one of those interviews, endure the technical grilling, and prove your credentials and experience. Then what? Putting those expert skills to use? No: fixing spaghetti mindlessly typed by generations of code monkeys.
I get it. IT managers desperately seek a messiah versed in e.g. multi-threading and other commonly known CS wizardry, hoping that general programming talent and knowledge alone will be enough to magically put a failing project back on track. The question is, what would such a talented developer be doing if hired? One of two things, right? Untangling messy spaghetti or integrating mediocre third-party “packages”.
That’s without taking other factors into account — a much longer discussion about a software engineer’s career future outside of the handful of engineering heavens (and havens) like Google. E.g. the initial life-or-death negotiation over already lower-than-average pay. Both sides, “budget-conscious” managers and underpaid developers, may be just soldiers: casualties of that needless war. Nevertheless, enemies from day one.
There are two choices for the developer: either leave within a couple of months or give up. Neither is good. The first, job-hopping, relies on learning allegedly “hot” abbreviations at the tutorial level to get an extra $5K (after the aforementioned life-or-death negotiation). As a result, no one does anything deeper than tutorials with one technology or another, meaning no real problem-solving experience or skills.
The second, giving up, is even worse: stopping programming and doing the bare minimum, like the main Office Space character, Peter, working only 15 minutes a week on some “TPS report”. Anyone who’s spent a year in IT knows how accurate that movie is.
I treat every problem as “technical”, if you still remember. Believe it or not, the right technology can eliminate unproductive commotion. I am not talking about “collaboration apps”. It’s the work itself: a tight project that has zero margin for BS. Of course done by fewer, highly skilled pros.
The three gentlemen I am talking about had given up a long time before I took over their project. I know what you are thinking: a matter of proper “motivation”, right? Not really. Once someone gives up, professional atrophy starts. They needed technical coaching first. Remember, the problem is always technical.
They estimated and even planned the project correctly. I am a bit biased towards precise and minimalistic custom solutions, and Java was the right choice in general IMO. How they used it is a different story. They couldn’t follow their own estimates, having lost most of their programming proficiency.
I pay attention to red flags. I knew what I didn’t like at the interview, however kept my conclusions to myself until I carefully studied everything: the nature of the outstanding defects, their historical trends and patterns, the requirements change/discovery patterns, and of course the team.
Trimming the Team
Half of it had to go: clearly unfit to be programmers. I am talking about things like opening database transactions in a loop. It doesn’t get any lower than that, by my standards. Could I explain it to them? I did, giving everyone a chance to improve. As I suspected, the bad part of the team couldn’t learn at all.
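To make the transaction-in-a-loop mistake concrete, here is a schematic sketch — my illustration, not the project’s actual code. A simple counter stands in for the expensive begin/commit round-trips a real database connection would perform:

```java
import java.util.List;

// Schematic illustration of the transaction-in-a-loop anti-pattern.
// A counter stands in for the real (expensive) begin/commit round-trips.
public class TxDemo {
    public static int commits = 0;

    static void beginTx() { /* real code: connection.setAutoCommit(false) */ }
    static void commitTx() { commits++; /* real code: connection.commit() */ }
    static void insertRow(String row) { /* real code: an INSERT statement */ }

    // Anti-pattern: one transaction (and one commit round-trip) per row.
    public static void saveRowsNaive(List<String> rows) {
        for (String row : rows) {
            beginTx();
            insertRow(row);
            commitTx();
        }
    }

    // Fix: one transaction around the whole batch.
    public static void saveRowsBatched(List<String> rows) {
        beginTx();
        for (String row : rows) insertRow(row);
        commitTx();
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c", "d", "e");
        commits = 0;
        saveRowsNaive(rows);
        System.out.println("naive commits:   " + commits);   // 5
        commits = 0;
        saveRowsBatched(rows);
        System.out.println("batched commits: " + commits);   // 1
    }
}
```

With a real database the naive version also loses atomicity: a failure halfway through leaves half the rows committed, which is exactly the kind of bug that surfaces months later.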
They weren’t “junior”, so they got no break. Nor would I have been sympathetic if they were paid pennies by their bodyshops, which BTW wasn’t the case. Talk about adding insult to injury: a couple of big “reputable” American recruiting chains placed them — unfortunately as technically clueless as foreign bodyshops. In any case, it was apparent they were unable to learn and were dragging the project down. But most importantly, the project didn’t need that many developers. Especially ones with clearly negative performance.
It’s way worse than the typical “you get what you pay for”. It’s how much extra you need to spend (after paying someone for their negative performance) to undo the damage.
Technology
The good part of the team was lost as well. They jumped at the chance to do the normal Java programming (of the multi-threaded and other kinds) that Google and Netflix engineers enjoy daily while IT devs drown in spaghetti. Unfortunately, the excitement of finally writing good code after years of low-level maintenance and integration wasn’t enough. Having been exposed to messes and tasked with coping with the limitations of outdated and buggy software, they had forgotten how to write normal code.
If that wasn’t bad enough, they had to deal with management mistakes. My (semi-technical) predecessor had decided to mix .Net and Java to utilize someone he inherited from another team who only knew .Net. Why accommodate someone’s skillset, introducing huge moving parts and points of failure into the project? Not surprisingly, it resulted in tricky deployment, which, yes, you guessed it correctly, required a dedicated DevOps guy.
Sadly, today’s IT tries to mitigate the dependency hell (Java + .Net + “legacy” + whatever) with “clever” containers like Docker, supposed to swallow deployment messes. I address the problem at the root, striving for minimal moving parts and dependencies. It is the engineer who should learn the best tools for the job and adapt, not the other way around. Hey, I gave that guy a chance too: to learn Java. He had to go. He wasn’t exactly a “rockstar”.
The rest of their (beginner) mistakes were pretty typical, e.g. storing fine-grained configuration in an inherently coarse-grained relational database: something I learned (not to do) at the age of 20 on my very first project. It was another time and another country. I was fortunate to be trusted with architecting and delivering a complete financial system at that age. The fat American IT I joined at 25 after immigrating to the US? One can work there until retirement and never deliver anything complete. “Years of experience” mean nothing if all you do is write glue code and fix bugs in it: both eliminated by good design.
See, saving that project was simply a matter of NOT writing unnecessary code. It required a vanilla Java+Spring application with a minimalistic admin front end and reporting hooks. Spring alone (which they didn’t use — talk about technical atrophy in the late 2000s) did a lot of the heavy lifting. I made sure they utilized it to the fullest, including the expression language that came out in v3, perfect for injecting precise data transformation rules where they belonged.
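To illustrate the idea of injecting transformation rules rather than hard-coding plumbing, here is a dependency-free sketch. It uses plain Java function objects in place of Spring’s actual expression language (with SpEL, each rule would be an expression string configured per field), and every name and format in it is hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// The idea behind injecting data transformation rules: each rule is a small
// expression attached to an output field, not hand-written plumbing code.
// (With Spring, the rules would be SpEL strings configured per field;
// plain Java lambdas play that role in this dependency-free sketch.)
public class PipelineSketch {
    // output field name -> rule computing it from the raw input record
    static final Map<String, Function<Map<String, String>, String>> RULES = new LinkedHashMap<>();
    static {
        RULES.put("account", rec -> rec.get("acct_no").trim());
        RULES.put("amountCents", rec -> {
            // e.g. a mainframe file carries dollars; normalize to cents
            double dollars = Double.parseDouble(rec.get("amt"));
            return String.valueOf(Math.round(dollars * 100));
        });
    }

    public static Map<String, String> transform(Map<String, String> rawRecord) {
        Map<String, String> out = new LinkedHashMap<>();
        RULES.forEach((field, rule) -> out.put(field, rule.apply(rawRecord)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> raw = Map.of("acct_no", " 12345 ", "amt", "10.50");
        System.out.println(transform(raw)); // {account=12345, amountCents=1050}
    }
}
```

The pipeline engine is written once; adding or changing a business rule means editing one declared expression, not another layer of glue code.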
With all of today’s AI hype, you may think that Google does something “scientific”. A few data scientists there do. 99.9% of Googlers and Netflixers write normal-quality code that works, compared to IT’s eternal 70–90% failure rate.
Everyone loves “big tech”, thinking up fairytales about Caltech and MIT grads in white lab coats and other “geniuses”. Normal people work there, making normal software engineering shine: not due to being “scientific”, but because it is always art, performed by the few true programmers, backdropped by all the spaghetti typed by code monkeys who shouldn’t be in the industry. There are no tasks for them in a creative project.
When I talk about Px100 (Productivity x100): a mix of pro “power tools” and a process that frees developers from unproductive IT commotion, I am not exaggerating the 100x part. Cut off one bad piece, and you’ll find out that you can cut off two more. Then four more. The headcount/cost implosion is as exponential as the explosions resulting from clueless brute-force tactics.
I understand it may sound like gibberish to you. That team was doing everything by the book: capturing the processing pipeline via a workflow engine (jBPM) and implementing business rules via a third-party “rules engine” (Drools), on top of the outdated J2EE (that is, the circa-2002 second version of Enterprise Edition Java) data persistence. Compared to that, the amount of code to write (the Java code itself, XML and other configuration, front-end markup, SQL scripts, etc.) was reduced 30x. I am not kidding you. Nine months in, they hadn’t even gotten to implementing the actual business logic. It was never-ending (glue code) plumbing, which I simply threw away.
Getting Everyone Onboard
The problem was 100% technical, but it did require some “soft skills”. I could have exercised my authority to push my technical vision, but this is not how one talks to a smart person. I mean, this is not how two expert engineers discuss a project. That’s who I was for that team after I took over the project: a fellow engineer, not the boss.
I started with one-on-ones with all three: off site, over lunch or coffee. Friendly interviews, if you will. Recalling tough situations, listening to them vent about BS and injustice. I wasn’t “establishing rapport”. I am an engineer. I unconditionally side with fellow geeks, but I also put the project first, because the project (meaning the customer) pays everyone’s salaries.
Luckily, two of the three devs were contractors. All it took to “align” their interests with the company’s was a reminder that if the project failed, they’d be out. Vs. infinite business process extensibility (and thus the need for automation) if it succeeded. Just two engineers talking the technical stuff: how the project wouldn’t survive w/o an expert contribution. Not some empty managerial promises. Sure, I didn’t own that huge corporation; anything could happen. But the service was mission-critical, it worked well, and it employed just three people. Well below the radar to be “outsourced”, if you ask me. There was (and last time I checked, still is) enough work there for years.
The third guy was a perm employee: smart, but lazy/comfortable. I didn’t count much on him, but he was sucked into the new fast-paced process by the two contractors. See, it’s not just money (bonuses, etc.) or promotions. Many times it’s the joy of engineering, long forgotten in “buy over build” and “outsourced” IT. Seeing your tricky code work flawlessly. No motivational speeches. No do-what-you-want “Google Fridays”. Actually establishing (or restoring) an enjoyable project. I don’t want to bash non-technical IT managers too hard, but as everyone can see, managing a dev team is a technical activity that involves more than verbal facilitation and mitigation: building things for the team to be productive.
The rest was rather boring and uneventful. We reassessed the project and re-estimated it, returning to the original two-month term: three weeks to redo the core frameworks, bringing in Spring and a couple of other tools, and then six weeks of straightforward development. No surprises. Not a single (two-week iteration) deadline missed. Clean, self-sufficient deployment. It is still running there like clockwork, with the two contractors extending and maintaining it.
I know, it’s not very impressive resume-wise. I’ve “managed” bigger teams, if you are wondering: also shrinking them by at least half. In my world (armed with Px100), five experts can support thousands of SaaS subscribers with unique needs, each thinking they have a dedicated team working just for them. I am not talking about online email clients, spreadsheets, or generic form-filling apps. I mean $100M ERPs. Start cutting unneeded internal pieces (w/o cutting a single customer requirement), and you’ll find out that you can cut two after cutting one, like I explained above.
Case Study Two: SaaS MVP
I connected with my LionStack partner Jason Barber shortly after I finished the fourth, pure-JavaScript version of Px100. Jason wanted to see it in action, along with my claims of 100x productivity. LionStack itself needed a CRM. I developed the initial Fetch in five days. Jason didn’t bother me about it for a year, while we worked on other projects for paying customers. It was, however, obvious that the initial minimalistic Fetch had to evolve: for him to be more productive selling our products, and to sell it to others. Why not?
That’s the difference between internal projects implemented by IT departments and something you sell to real-world customers. A POC vs. an MVP, if you will.
Fetch proved that Px100 can churn out Zoho-level apps in a matter of man-days. Woo-hoo: the classic “proof of concept”. It was time to offer something more substantial than dumb data entry and rudimentary analytics.
I am finishing it up right now. My estimate is three weeks, along with writing for our blog, hiring, and other CTO duties. I am “pedaling in a higher gear” (explained here). It’s hard. I have higher standards for the first versions of our products I call “minimally viable”. A SaaS MVP needs a lot of plumbing: e.g. robust self-signup with a choice of subscription plans and automatic billing. Any 21st-century CRM needs to be connected to major email and calendar services like Google’s and Microsoft’s. It needs elaborate workflows and security: who sees what. It needs to be extensible, since every sales process is unique. Yet it needs to stay simple and intuitive instead of turning into yet another Salesforce clone with tens of menus and checkboxes to configure everything.
Overall, it looks like we’ll conceptually retain less than 20% of the original Fetch. That’s the difference between a POC (a production-ready system) and an MVP (a sales-ready one).
Why Does Programming Take So Damn Long?
There are 10–100x more invisible requirements behind high-level screen mockups.
- Invisible business plumbing: robust billing, DDoS protection, etc. addressing all catastrophic “what if” scenarios.
- Any record that can be created can also be deleted. Users can make mistakes that need to be “undone” (rolled back). E.g. any payment should be reversible. A customer can be given a credit for any reason. Oh, and both the initial action and the reversal need to be traceable via a detailed audit trail: 100% automatic in Px100.
- Tricky relationships between records. No, even a “relational” database is not going to take care of them automatically. Let’s say some record (account, deal, employee) is assigned to a specific user. What happens after deleting that user? Should all his/her records be reassigned? To whom? You need to explicitly ask the admin deleting the user, which means another UI screen backed by the server-side logic.
- Database “denormalization” for performance reasons e.g. displaying complex analytics and reports in real-time.
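As an illustration of the reversibility bullet above: a reversible, auditable operation is usually modeled as an append-only log with compensating entries, where nothing is ever deleted and a reversal references the entry it undoes. The sketch below is hypothetical (my illustration, not Px100 code), with illustrative names like `Ledger` and `Entry`:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the "any payment should be reversible" requirement: nothing is
// ever deleted; a reversal is a new compensating entry, so the audit trail
// stays complete. Names (Ledger, Entry) are illustrative, not from any product.
public class Ledger {
    record Entry(long id, long amountCents, String reason, Long reverses) {}

    private final List<Entry> entries = new ArrayList<>();
    private long nextId = 1;

    // Record a payment (or credit); returns its id for later reference.
    public long post(long amountCents, String reason) {
        entries.add(new Entry(nextId, amountCents, reason, null));
        return nextId++;
    }

    // Reversal = compensating entry referencing the original, never a delete.
    public long reverse(long entryId, String reason) {
        Entry original = entries.stream()
                .filter(e -> e.id() == entryId).findFirst().orElseThrow();
        entries.add(new Entry(nextId, -original.amountCents(), reason, entryId));
        return nextId++;
    }

    public long balanceCents() {
        return entries.stream().mapToLong(Entry::amountCents).sum();
    }

    public int auditTrailSize() { return entries.size(); }

    public static void main(String[] args) {
        Ledger ledger = new Ledger();
        long payment = ledger.post(10_000, "invoice #42");
        ledger.reverse(payment, "entered twice by mistake");
        System.out.println(ledger.balanceCents());    // 0
        System.out.println(ledger.auditTrailSize());  // 2: both steps traceable
    }
}
```

None of this appears on a screen mockup, yet every line of it is a real requirement: the invisible plumbing that makes the original spec 10–100x bigger than it looks.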
Should I continue? None of that can be predicted and put into the original spec. It would take years to write. And that’s just 100% pure business logic, aka business plumbing. Typically 90% of all work (eliminated by Px100) is doing something generic over and over: the glue code, aka technical plumbing.
And that was just the technology, assuming everything was covered process-wise. It rarely is. LionStack doesn’t have personal conflicts, rework due to communication problems, or expensive QA cycles due to untested code mindlessly thrown over the wall, simply because of our extremely lean team. We intend to operate this way — per product. But even pedaling in a higher gear, we have a lot of ground to cover business-plumbing-wise. There is no way around business process complexity.