Vardan Torosyan

There is too much

2026-06-22T00:00:00+00:00

I keep hearing from engineers lately: there is too much going on. Too many parallel threads. Too much code being generated to actually read. Big features merged that nobody could honestly claim to fully understand. On-call for code you didn’t write and don’t fully trust. And underneath all of it, a quieter admission: the work isn’t teaching me much anymore, and it’s stopped being fun.

I know that feeling well. Not from the AI era - from the day my peers became my reports.

When I transitioned to management at SoundCloud, the people I’d been coding alongside were suddenly on my team. The cognitive load was immediate and brutal. I was stretched in every direction and convinced I had to be on top of all of it: every PR, every decision, every conversation. I spent the first few months in a kind of frantic vigilance, trying to hold everything at once. I was doing everything poorly, and eventually I had to admit that to myself, which is a specific kind of awful.

What actually helped wasn’t a mindset shift. It was aggressive prioritization - Shreyas Doshi’s LNO framework in particular - and the uncomfortable acceptance that I had to let some things go badly in order to do the important things well. That took years, not weeks. Five-ish, if I’m honest.

This is just the job description of an engineering manager

Take the word “AI” out and read that list again:

Accountable for output you didn’t personally produce.
Reviewing and steering all day instead of making.
More threads in flight than you can hold in your head.
Signing off on things you don’t fully control or understand.
Grieving the loss of the craft that made you good in the first place.

That’s not a description of agentic coding. That’s a description of becoming a manager. What AI did was give every engineer a small team of tireless, fast, occasionally-wrong direct reports. And with the team came the manager’s problem. The discomfort engineers are feeling right now isn’t an AI problem. It’s a delegation problem, and delegation is the oldest unsolved problem in our discipline.

The good news: it’s not unsolved because nobody tried. Managers have been failing at it and slowly adapting for decades. There’s actually a playbook.

The timeline nobody talks about

Here is the part that bothers me most. It took me roughly five years to get comfortable with the cognitive load of management. Five years of doing it poorly, admitting it, and slowly building the instincts to triage, delegate, and let go without losing the thread. Nobody demanded I be good at it in month two.

Engineers don’t get that grace period. The token-maxxing era is over, ROI conversations are everywhere, and the pressure to show results from AI-augmented work is immediate. Companies bought into the narrative that AI solves the productivity problem - and that narrative doesn’t leave much room for “we need time to figure out how to work with this well.”

What I rarely see discussed is what this costs. Not in tokens, but in engineers who are quietly overwhelmed, building habits under pressure that don’t actually scale, and losing trust in their own judgment because the tools move faster than they can make sense of. The delegation problem is hard. It took managers decades of collective failure to develop even a rough playbook. Expecting engineers to solve it in a quarter, while also shipping, is not a plan. It’s just pressure with a narrative on top.

Start with what’s actually in your control

Before the tactics, there’s an older idea underneath all of them. The Stoics split the world in two: the things that are up to you, and the things that aren’t. Epictetus opens with it because everything else depends on getting it right.

“There is too much” is a true statement about the second pile. The volume of code your agents can generate is not in your control. The number of features other teams need is not in your control. You have limited time and limited power, and no amount of anxiety expands either.

What is in your control is small and it is everything: where you point your attention, what standard you hold, what you decide not to do, and whether you’re honest about which is which. The whole reason “there is too much” feels like drowning is that we keep trying to exert control over the size of the ocean. You can’t. You can only decide where to swim.

What actually helps

None of this makes the load disappear. It makes it carriable.

Separate ownership from authorship. You own the outcome, not the lines. Ownership is having a model of where this thing breaks and what you’d do when it does - not having typed it. You can own code you didn’t write. You cannot own code you refuse to understand. Those are different statements, and the gap between them is the whole job.
Decide what you must understand deeply - then triage the rest without guilt. You cannot review thousands of lines of generated code at equal depth, and pretending you can is how things slip. This is just LNO applied to review: the leverage code - the 3am on-call path - gets read line by line. The overhead code - rarely-hit, easily-reversible - gets sampled, and an OK-tier pass is the correct amount of effort, not a guilty compromise. This isn’t laziness; it’s saving your depth for the 1% that matters.
Externalize. The doc is not a crutch, it’s the method. Stop trying to hold it in your head. Every good manager runs on lists, not memory. Offload the state so your head is free for judgment.
Build verification, not total verification, and calibrate it to track record. You will not personally check everything. So you invest in the things that check for you: tests, types, a teammate’s review, a smell test you trust. And how hard you check should scale with what Andy Grove called task-relevant maturity. On an unfamiliar class of problem, spell it out and read every line. Once an agent has been right on that class a dozen times, step back to agreeing on the goal and spot-checking. The mistake is a fixed level of trust: reviewing everything forever and drowning, or nothing and getting burned.
Letting something fail is a valid output. Not everything in front of you has to get done, and the discipline is saying which things won’t - explicitly, to the people affected, rather than letting them quietly rot. An unmet need you’ve named is a decision others can plan around. One you’ve buried under “we’ll get to it” is a trap waiting to spring. The skill was never doing everything. It’s choosing what not to do, out loud.
The discomfort is the job, not a bug in it. Acting on incomplete information, sitting with the unease of not-fully-knowing, and committing anyway - that is judgment. Managers don’t feel more certain than you; they’ve made peace with feeling uncertain and moving regardless.
Keep something you understand deeply. In my first year of management I made the mistake of letting go of everything technical at once. Don’t. Pick one area - a service, a domain, a class of problems - and stay close to it. Not because you need to, but because it keeps your technical identity intact while everything else is in flux. It also gives you a reference point for evaluating everything else.
Track what you’re learning, not just what you’re shipping. It’s easy to stay busy and stop growing without noticing. I wrote about this in the last post: keeping a small log of observations, surprises, decisions and why you made them. Not polished documentation, just traces. The noticing is happening anyway. Most of the time it just evaporates.
Talk to someone who has been through it. The playbook transfers faster through conversation than through trial and error. Find a manager, a senior engineer, anyone who adapted to a similar shift and is willing to be honest about how long it actually took. Hearing “it took me two years and here’s what helped” is more useful than most frameworks.
Manage yourself first. All of the above is really self-management wearing different hats. That’s not soft advice - it’s the hardest and most neglected part of operating under load. I’ve been writing little meditations on exactly this: keeping your composure while entropy wins, and the quiet relief of accepting what you don’t control. What started as notes to myself is starting to feel like everyone’s problem now.
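The triage and calibration points in the list above can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `Depth`, `review_depth`, and `trust_threshold` are hypothetical, the threshold of 12 stands in for “a dozen times”, and the order in which the rules are checked is my own reading of how the two ideas combine, not a rule from LNO or from Grove.

```python
from enum import Enum


class Depth(Enum):
    LINE_BY_LINE = "read every line"
    SPOT_CHECK = "agree on the goal, then spot-check"
    SAMPLE = "sample it and lean on automated checks"


def review_depth(on_critical_path: bool,
                 easily_reversible: bool,
                 correct_runs_on_class: int,
                 trust_threshold: int = 12) -> Depth:
    """Pick a review depth from criticality and track record (a sketch)."""
    # Leverage code (the 3am on-call path) always gets the full read.
    if on_critical_path:
        return Depth.LINE_BY_LINE
    # Unfamiliar class of problem: no track record yet, so read every line.
    if correct_runs_on_class < trust_threshold:
        return Depth.LINE_BY_LINE
    # Overhead code with an established track record: sampling is enough.
    if easily_reversible:
        return Depth.SAMPLE
    # Everything else with a good track record gets a spot-check.
    return Depth.SPOT_CHECK
```

The point of writing it down is not the function itself. It is that trust becomes an explicit input you revisit, rather than a fixed level you set once.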

The part the playbook doesn’t fix

The management playbook teaches you to carry the load. It does not give you back the deep, hands-in-the-dirt understanding that made the work feel like yours. I had that grief the year I stopped writing code. You trade depth for leverage, and on the bad days it feels like a worse trade than it is. I don’t think that ever fully goes away.

But if engineers are now living the manager’s life whether they chose it or not, the least we can do is hand them the lessons we paid for the hard way. A lot of what feels new about cognitive debt isn’t new. It’s just management, arriving early and uninvited.

Revisiting performance in the age of AI

2026-06-12T00:00:00+00:00

About a year and a half ago I wrote a framework for quantifying software engineer performance. The idea was simple: performance is a weighted combination of input, output, outcome, and impact, and if you score yourself weekly against agreed categories, you remove a lot of the friction and anxiety from review season.

AI agents are taking over more and more engineering tasks, and I kind of assumed this framework would make no sense anymore. However, after thinking about it, I still stand behind the structure.

What AI actually broke

I think AI does not break the four dimensions. It breaks the cost structure between them.

Input got invisible. Time spent coding used to be a rough signal of effort. Now the most valuable input often looks like nothing: reading code you didn’t write, sitting with a confusing system, writing down your understanding before opening a tool. I wrote about this in What to do when AI is quietly making you worse: the work that builds the 1% knowledge is exactly the work that doesn’t produce a visible artifact.

Output got cheap. This is the big one. PRs merged, docs written, features shipped and so forth - volume of deliverables no longer tells you much about the person who delivered them. When output is cheap, measuring output is measuring nothing. Worse: it’s measuring willingness to generate, which is not the same thing as engineering. This is Goodhart’s Law playing out in fast-forward: the moment a measure becomes a target, and becomes nearly free to produce, it stops being a good measure.

There’s also a sharper problem underneath: we are bad at judging our own productivity with these tools. METR’s 2025 study of experienced open-source developers became famous for finding that AI tools slowed them down by 19%. METR later walked back the headline number after methodology critiques, but the part that held up is the one that matters here: developers consistently believed they were ~20% faster regardless of what the clock said. If individual engineers can’t subjectively tell whether AI is helping them, a performance system built on counting their outputs has no chance.

Outcome and impact are still expensive. Did the thing solve the problem? Did it move the business? AI hasn’t made these cheaper, because they depend on the parts of the job that haven’t been automated: choosing the right problem, understanding the customer, making good tradeoffs. If anything, they got more expensive relative to everything else, because now there’s a lot more output competing for the same finite outcome.

In the old post I wrote that in fast-paced environments output may dominate the weights. I don’t believe that anymore in any environment. If your performance system today still has a heavy w2 (the output weight), you’re paying people to run a text generator.

The dimension that is missing

Here’s the bigger problem, and it’s not a weighting problem. The framework measured what you produced. It never measured what you learned. I realized this after reading Why We Still Suck at Resilience by Adrian Hornsby.

Learning was a defensible omission in 2024, because learning and producing were coupled. You couldn’t ship a hard feature without building a mental model of the system along the way. The learning came for free as a byproduct of the output. The framework could ignore learning because output smuggled it in.

AI decoupled them. You can now produce a lot while learning almost nothing. The code ships, the PR merges, everything looks fine, and your understanding of the system is exactly where it was six months ago. This isn’t just my intuition anymore; the research is piling up. A Microsoft and Carnegie Mellon study of 319 knowledge workers found that the more confidence people place in GenAI, the less critical thinking they apply, and that workers stop thinking critically precisely when they lack the skills to inspect and challenge the AI’s output. Which is a brutal loop: the less you know, the less you question, the less you learn. MIT researchers studying AI-assisted writing coined a term for the long-run cost: cognitive debt. It’s the right metaphor. Like tech debt, it doesn’t show up in this sprint’s metrics.

The framework can’t see the difference between an engineer who is compounding and an engineer who is accumulating cognitive debt. Both score the same. One of them is a problem you’ll discover at 2am during an incident.

So if I were rewriting the equation today, it would look something like:

Performance = w1(Learning) + w2(Judgment) + w3(Outcome) + w4(Impact)

Where learning is the rate at which your mental models of the system, the customer, and the domain are growing, and judgment is the quality of your decisions when the answer isn’t obvious - which problem to solve, which tradeoff to take, when to trust the model and when to be suspicious.

I’m aware this looks like I replaced two measurable things with two unmeasurable things. That’s sort of the point. The measurable half of performance got automated. What’s left is the hard half, and pretending otherwise is how you end up with a team that’s very busy and quietly getting worse.

Can you quantify learning and judgment at all?

Partially, and imperfectly, and I think imperfect-but-honest beats precise-but-wrong. Some proxies that I am thinking about:

Incident reasoning. During incidents and post-incident reviews, can the person reason about the system from first principles, or only pattern-match on dashboards? This shows up clearly if you look for it. On-call is becoming one of the last honest performance signals we have.
Explanation ownership. Can you explain a PR you merged, including the AI-generated parts, to a teammate without opening it? If yes, you own it. If no, you approved it. Those are different jobs and should be evaluated differently.
Prediction calibration. Before a project: what do you think will be hard, what’s your instinct on the design, where will it break? After: how close were you? Engineers who write down predictions and review them build judgment visibly. This is the “write before you look” practice turned into an evaluation signal.
Knowledge traces. Things noticed, surprises captured, edges documented. Not polished docs but traces. An engineer leaving a steady trail of “this surprised me, here’s why” is learning. An engineer leaving only merged PRs might be, or might not be, and you can’t tell.
Questions asked. This one is uncomfortable and I’ll come back to it in the next post. The engineers learning fastest are the ones saying “I don’t understand this” most often. Most performance cultures punish exactly that sentence.

None of these fit neatly into a points table, and I’m resisting the urge to build one, because that urge is what got the original framework into trouble. The mechanics I’d keep from the old post are the cadence ones: weekly self-reflection, a brag document, regular check-ins with your manager. The thing being tracked changes; the discipline of tracking doesn’t.

What this means if you’re an engineer

The practical advice from the original post still mostly holds: clarify expectations, agree on the system with your manager, document, check in regularly. What changes is what you should be optimizing for inside that system.

If your company still rewards output volume, you face a genuinely uncomfortable choice: optimize for the metric and hollow out, or optimize for learning and look slower on paper. I don’t have a clean answer for that, because it’s not a problem you can solve individually - it’s a problem your organization’s incentives create. Which is exactly what the next post is about: what happens to companies that keep rewarding performance theater (especially its most celebrated form, heroism) while the actual value of their engineers shifts to something their review systems can’t see.

The Prioritization Trap

2026-06-05T00:00:00+00:00

In the last couple of weeks I’ve had a few separate conversations that, looking back, were all the same conversation. One was about community issues vs. customer asks. One was about whether we invest in new features or pay down engineering foundations. One was the usual short-term vs. long-term tension. Different details - but the same underlying problem every time: two things that both clearly matter, not enough capacity to fully do both, and a group trying to decide which one wins.

And in every one of them, sooner or later, someone said the obvious thing: “we need both.”

“We need both” is almost always the correct answer. But it’s also useless in the moment, because you can’t do “both.” You can’t schedule “both.” You can’t tell an engineer on Monday morning to go work on “both.” The answer is true and unactionable at the same time, and that combination is exactly what makes these conversations go in circles.

“We need both” is a signal, not an answer

When the honest answer to a prioritization question is “we need both,” that’s not indecision. It’s a signal that you’re trying to decide at the wrong level of abstraction.

“Features vs. foundations” is not a decision. Nobody can actually pick between two categories, because both categories contain real, important work. The moment you try to choose one, your gut correctly objects and you’re stuck.

The mistake is treating the big question as the thing to answer. It isn’t. The big question is the thing to break down.

What merge sort knows that we forget

There’s an idea from computer science that I think about more than I’d like to admit, and it’s the most boring-sounding one: divide and conquer.

Take sorting. You cannot sort a million items in your head. The problem is too big, there are too many moving parts, you have nowhere to even start. But you can always sort two items. That’s trivial. So merge sort doesn’t try to solve the big problem at all. It splits the list in half, and splits the halves, and keeps splitting until it’s left with pieces so small they’re embarrassingly easy. Then it composes the sorted pieces back together, two at a time, all the way up.

The trick was to make the unit small enough that the problem stopped being hard.
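For readers who haven’t looked at it in a while, here is a minimal sketch of merge sort in Python. It is a textbook top-down version written for readability, not performance, and the function names `merge` and `merge_sort` are just the conventional ones.

```python
from typing import List


def merge(left: List[int], right: List[int]) -> List[int]:
    """Compose two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller head element; <= keeps equal items in their original order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; whatever remains on the other side is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items: List[int]) -> List[int]:
    """Split until the pieces are trivial, then merge them back up."""
    if len(items) <= 1:
        return list(items)  # zero or one item is already sorted
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))
```

Notice that neither function ever reasons about the whole list. `merge_sort` only knows how to split, and `merge` only knows how to combine two pieces that are already in order.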

“Features vs. foundations” at the level of a quarter is a million items. But “what does this one engineer work on this week” is two items. That’s a question a human can actually answer.

Because at the level of a single “sprint”, you’re no longer choosing features or foundations. You’re choosing a mix. Maybe it’s 70% feature work, 20% foundations, 10% community. Suddenly the unanswerable binary has turned into a ratio, and ratios are something we’re actually good at reasoning about. You can argue about whether it should be 70/20/10 or 50/30/20.

Decompose far enough and “either/or” becomes “what’s the right mix” - and that is a dramatically easier question.
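To make the ratio concrete, here is a rough sketch of turning a mix into whole engineer-days for one sprint. The function name `split_capacity`, the category labels, and the 50-day sprint are hypothetical, and the largest-remainder rounding is just one reasonable way to keep the parts summing to the total.

```python
from typing import Dict


def split_capacity(total_days: int, mix: Dict[str, int]) -> Dict[str, int]:
    """Split whole engineer-days across categories by percentage weights.

    Uses largest-remainder rounding so the parts always sum to total_days.
    """
    if sum(mix.values()) != 100:
        raise ValueError("mix percentages must sum to 100")
    exact = {name: total_days * pct / 100 for name, pct in mix.items()}
    days = {name: int(value) for name, value in exact.items()}  # round down first
    leftover = total_days - sum(days.values())
    # Hand the remaining days to the categories with the largest fractional part.
    by_remainder = sorted(exact, key=lambda name: exact[name] - days[name], reverse=True)
    for name in by_remainder[:leftover]:
        days[name] += 1
    return days


# Hypothetical sprint: 5 engineers x 10 working days = 50 engineer-days.
plan = split_capacity(50, {"features": 70, "foundations": 20, "community": 10})
```

With these inputs `plan` comes out as 35 days of feature work, 10 of foundations, and 5 of community. Changing the mix to 50/30/20 is a one-line edit, which is roughly the point: the argument is now about numbers a team can revisit every sprint.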

If you want the formal version: this is just induction. You don’t prove the whole theorem at once. You prove the base case, then you prove that each step follows from the one before. You never have to hold the entire thing in your head. Prioritization works the same way - get the unit right, get the next step right, and the quarter takes care of itself.

When it works: the parts have to be independent

Decomposition is not magic and I’ve seen it sold as if it were. The reason merge sort works is a property I’ll loosely call optimal substructure, borrowing the term from dynamic programming: the solution to the whole is genuinely composable from the solutions to the parts.

Sorting has it: sorted halves merge cleanly into a sorted whole. When your subproblems have this property - when they’re independent enough that you can interleave them without one corrupting the other - decomposition works. This is the world of capacity splitting, sprint ratios, the 70/20/10 models. You slice the work, assign the slices, and the slices don’t fight each other.

When it’s harder: dependencies

Sometimes the parts aren’t independent. The foundation work genuinely has to land before the feature can be built on top of it. You can’t interleave those in a single sprint, because one is a hard prerequisite for the other.

That doesn’t break decomposition. It just changes what it buys you. When subproblems have sequential dependencies, you still break the big thing into small things, but instead of parallelizing the pieces, you sequence them. The output isn’t a ratio, it’s an order. “First this, then that, then the other.” Which, again, is something a team can actually execute against, in a way that “features vs. foundations” never was.
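In code terms, “sequence them” is a topological sort over the dependency graph. Here is a small sketch using `graphlib.TopologicalSorter` from Python’s standard library (3.9+). The work items and the `delivery_order` helper are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical work items: each key maps to the items that must land before it.
depends_on = {
    "new billing feature": {"extract payments module"},
    "extract payments module": {"add integration tests"},
    "add integration tests": set(),
    "community bug triage": set(),
}


def delivery_order(dependencies):
    """Return one valid order in which every prerequisite comes first."""
    return list(TopologicalSorter(dependencies).static_order())


order = delivery_order(depends_on)
```

Items with no dependencies between them, like the bug triage here, can land anywhere in the order, so they are still candidates for the ratio approach. If the graph contains a cycle, `static_order()` raises `graphlib.CycleError`, which is a useful signal in its own right: two pieces of work that each claim to need the other first have not been decomposed far enough.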

When it doesn’t work at all

And then there are the decisions that genuinely resist decomposition, and it’s important to recognize them, because trying to decompose them is its own trap.

“Should we go upmarket?” is not a question you can answer 70/20/10. You can’t serve enterprise and not-serve-enterprise 70% of the time. These are strategic stance questions - they set the frame that everything else gets decomposed inside of. The substructure isn’t optimal; the answer to the whole is not the sum of small answers, because a half-committed strategy is often worse than either direction taken fully.

These need a macro answer first. A real decision, made by someone with the authority to make it, ideally written down. Once that stake is in the ground, the work underneath can be decomposed. But if you skip the macro decision and try to ratio your way through a stance question, you get the worst of all worlds: a team busily interleaving subtasks that point in contradictory directions.

So the skill is partly knowing which kind of problem you’re holding.

The question to actually ask

If there’s one thing to take from all of this, it’s a single diagnostic question to bring into the next “we need both” conversation:

At what granularity do these competing priorities stop competing?

For most of them, the answer is smaller than you think - usually a single sprint, or a single engineer’s week. Zoom in to that level and the tension you were arguing about for an hour often just resolves into “okay, roughly this much of that, this much of the other, let’s adjust in two weeks.”

It connects to something I keep coming back to in other posts - that planning for impact beats planning for predictability, and that the traps in our work usually come from operating at the wrong altitude. This is the same lesson wearing different clothes. The big question feels important, so we keep trying to answer it directly. But the leverage is almost never in answering the big question. It’s in breaking it down until the answer becomes obvious.

What to do when AI is quietly making you worse

2026-05-27T00:00:00+00:00

I’ve been thinking about this for a while now, and a recent Root Cause podcast conversation pushed me to finally write it down. In the podcast they talk about something that I suspect a lot of engineers feel but rarely say out loud: you might only need deep system knowledge 1% of the time, but that 1% is often the moment that matters most. And AI is quietly eroding exactly that.

This isn’t an “AI bad” post. I’ve been using it heavily - for coding, for analysis, for thinking through hard problems. The throughput gains are real. But throughput and understanding are not the same thing. And I think we are, collectively, choosing throughput in a way that we’ll regret.

What actually erodes

The risks aren’t dramatic and they’re not immediately visible; they’re slow and invisible.

When you stop fighting with hard problems directly, the mental models fade. You stop building intuition. You start pattern-matching on outputs instead of reasoning from first principles. And the worst part: you don’t notice it happening. The code still ships. The PR still merges. Everything looks fine until the incident at 2am where you genuinely cannot reason about what the system is doing because you never really had to learn it.

There’s a good analogy here from aviation. Pilots trained heavily on autopilot gradually lose the ability to fly manually, and this isn’t theoretical: it has contributed to real crashes. Air France 447 is the well-documented case. The response from regulators wasn’t to ban autopilot, it was to mandate manual flying hours. The industry recognized that certain skills only stay alive through deliberate practice, and built that practice back in artificially. We haven’t done anything equivalent in software yet.

The junior problem is where this gets most acute. New engineers joining today may never develop system intuition the way the previous generation did, because the friction that forced learning is gone. Getting stuck for hours on a weird race condition, debugging by reading logs, writing something from scratch and not understanding why it was slow - those were annoying, but they put knowledge in your head in a way that reading the answer doesn’t. We’re removing that forcing function without replacing it with anything. The people who need the reps most are the ones who never get them.

The three things that actually still matter

A few days ago I organized some thoughts on this in an internal talk, and I keep coming back to the same framing: engineering right now boils down to three things.

What problem should you solve? This requires taste, judgment, closeness to the customer. Cannot be outsourced to AI.

How should you solve it? Architecture, tradeoffs, understanding the system. Fundamentals compound here: a 25-year engineer’s grasp of OS internals still makes them dramatically better even as AI does more of the actual coding.

Actually solving it. AI is handling more and more of this. It’s the part changing the fastest.

The trap is thinking the third thing is the whole job. It was never the whole job. It just happened to be the most visible part. Now that it’s getting automated, the first two things are getting exposed as the actual leverage, and a lot of engineers aren’t ready for that.

What nobody talks about: how do you actually develop judgment?

Everyone says judgment will matter more. Agents will do the thinking, humans will make the decisions. Fine. But nobody really explains how you develop that skill, or how you avoid losing it.

I think judgment is built from a specific loop: you form a view, you commit to it, you see what happens, and you update. That cycle, repeated enough times, is what builds calibration. The problem with AI is that it short-circuits the first step. You skip forming your own view and go straight to evaluating someone else’s. Do that enough and the muscle atrophies - again, not dramatically, just quietly. You become a better reviewer and a worse thinker.

Two practices I’ve found actually help, and they’re related.

Write before you look. Before opening a tool, before asking the model, write down what you think. Not a design doc necessarily, just your current understanding of the problem, your instinct about the solution, where you think the tricky part is. Even a few sentences. This forces you to articulate your reasoning rather than pattern-match on someone else’s output. It’s also surprisingly useful as a diagnostic: if you can’t write anything, you probably don’t understand the problem well enough to evaluate any answer.
Form a view before reading the suggestion. When reviewing AI-generated code or design, read it critically with your own opinion already in hand. What would you have done? Where does this differ? Why might the model have gone this direction and is it right? This sounds small but it’s the difference between passive consumption and active evaluation. One builds judgment, the other just builds familiarity with AI output.

These aren’t perfect solutions. But the broader point is: judgment is a skill that requires reps of deciding, not just approving. If your job becomes mostly approving, you need to manufacture the deciding reps somewhere else deliberately.

What you can do (what I’m trying to do)

This is the part I want to focus on, because the problem is obvious enough. What’s less obvious is what to actually do about it.

Own the problem before touching the tool. This sounds basic but I’m surprised how often I catch myself reaching for the editor before I’ve really understood what’s broken. Spend more time with the customer: their pain, their workflow, their confusion. The real leverage is there, not in how fast you can generate code.
Debug production manually. On-call rotations and post-incident reviews are now, weirdly, one of the last places where you’re forced to understand the system as it actually behaves. Treat them seriously. Don’t rush to close the incident. Understand what happened.
Read code you didn’t write. Especially AI-generated code. If you can’t explain a PR to a teammate, you don’t own it yet. Ownership is not clicking merge - it’s having the mental model.
Write the doc before the editor. Design documents are where thinking happens. Not as bureaucracy, as a forcing function for understanding. If you can’t write a coherent design doc, you probably don’t understand the problem well enough to solve it. AI won’t tell you that. It’ll just generate something plausible.
Develop taste for where the model breaks. This is becoming a core skill on its own - knowing when to trust the output and when to be suspicious. It requires deliberately trying to break things, chopping problems small, iterating in environments where failure is cheap. You rebuild this intuition with every new model release.
Protect your cognitive load. Stop doing for the sake of doing. There’s a kind of energy around AI tooling right now where everyone is just… moving fast. Sometimes the most valuable thing you can do is slow a teammate down and ask “do you actually understand what this does?” That’s not friction. That’s engineering.
Start an experimental knowledge log. This is something I’ve been building for myself and I think it works well enough to propose as a team practice. The problem with documentation is that it requires intent: someone has to decide to write it, keep it current, and remember it exists. That bar is too high, and everyone knows docs go stale. What I’m doing instead is capturing traces: small observations, things that surprised me, edges I hit, decisions I made and why. Not polished, not structured. Just logged. The mechanic that makes it actually stick is keeping the friction near zero. A Slack reaction on a thread, for example, can trigger an update to a shared knowledge base - you’re already reacting to things that matter, you’re just routing that signal somewhere it doesn’t disappear. The key insight is that the noticing is happening anyway. Most of the time it just evaporates. An experimental knowledge log is a cheap way to leave traces without turning every observation into a documentation task. It also does something else: it forces you to stay in the habit of noticing. Of thinking “this is worth remembering.” That instinct is exactly what erodes when you outsource too much to AI.
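As a rough illustration of the reaction-to-log idea in the last item, here is a sketch of the routing step only. It assumes the incoming event is shaped like a Slack `reaction_added` payload (`type`, `user`, `reaction`, and an `item` with `channel` and `ts`). The `record_trace` function, the `bulb` emoji convention, and the JSON Lines file are hypothetical choices for this example, not a description of my actual setup. Receiving events from Slack and fetching the message text are left out.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

TRACE_REACTION = "bulb"  # hypothetical emoji the team agrees means "worth remembering"


def record_trace(event: dict, log_path: Path) -> bool:
    """Append a trace to a JSON Lines log if the event is the agreed reaction.

    Returns True if a trace was written, False if the event was ignored.
    """
    if event.get("type") != "reaction_added" or event.get("reaction") != TRACE_REACTION:
        return False
    item = event.get("item", {})
    trace = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "noticed_by": event.get("user"),
        "channel": item.get("channel"),
        "message_ts": item.get("ts"),  # enough to find the thread again later
    }
    # Append-only: traces are never edited, so there is nothing to keep current.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(trace) + "\n")
    return True
```

The design choice that matters is the append-only log. Nobody has to decide where a trace belongs or whether it is polished enough, which is what keeps the friction near zero.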

The thing that worries me most

Most engineers I know who are heavy AI users are thoughtful about it. They’re asking these questions. But the tooling is being adopted way faster than the culture of careful use is developing. And organizations are rewarding throughput metrics - PRs merged, features shipped - in ways that actively discourage the slower, deeper work that builds the 1% knowledge.

That 1% is not evenly distributed. It lives in the engineers who’ve debugged enough production incidents, read enough weird kernel behavior, sat with enough confusing systems long enough to develop genuine intuition. We’re not replacing those people, we’re just not making new ones.

The real gap right now is not execution. It’s clarity of thought and closeness to the problem. AI is making execution cheap. That should mean we invest more in the other two. Instead, a lot of teams are just doing more execution.

Thoughts on the Agentic SDLC

2026-04-20T00:00:00+00:00

I was reading The Software Development Lifecycle Is Dead and had some loud thoughts I want to share.

We still build software in stages

The article is mostly right: we have learned to build software in certain stages. There is usually a moment where we try to understand why something should exist in the first place (Product DNA). Then we move into design (RFC/Design Document), where we compare approaches and make sense of tradeoffs. Then we plan the delivery, break the work into pieces, and start building. And eventually we decide whether something is ready to ship (Definition of Done).

I strongly believe this basic shape still holds. I do not think it is going away.

What has changed, at least for me, is the kind of certainty we can expect at each stage. The structure is still there, but the meaning of the stages is shifting. What used to feel like a fairly direct path from problem to solution now feels more like a process of narrowing, testing, and discovering what the problem really is.

Product Requirements

The first place I notice this is in Product DNA / Requirements. In the old sense, the document is supposed to help us define the problem clearly enough that the rest of the work can follow. That is still true, but with more AI-shaped work, the early stage feels less like a definition and more like an exploration. You start with a direction, a rough understanding of the user need, maybe even a strong intuition about the outcome you want. But you are often not writing down a truth so much as writing down the best version of your current understanding. The point is not to lock in the answer, it is to make the question explicit enough that the team can reason about it together.

That distinction matters. When the thing you are building has a more probabilistic or adaptive quality to it, the problem rarely reveals itself all at once. You might think you are solving one thing, and then the first real interactions show you the actual problem lives somewhere else. Or that the original problem was real, but incomplete. Or that the right framing is slightly different from what everyone first assumed.

So Product DNA becomes less about saying “this is exactly what we are building,” and more about saying “this is the space we are trying to understand, and this is the outcome we care about.” It is still useful for alignment. It just needs to leave room for discovery.

RFC / Design Docs

Design docs change in a similar way. They still matter, because they are where we make tradeoffs visible and force ourselves to be honest about the implications of different approaches. But in a world where system behavior is not always fully predictable from the outset, design docs feel less like a final specification and more like a way to structure the investigation.

That does not mean they become vague. It means they should be more explicit about uncertainty. A good design doc in this world should say not only what we think we will build, but what we still need to learn before we trust that direction. It should capture the options we considered, the assumptions we are making, and the places where we expect to change our minds once we have more signal.

That leads naturally into the part of the process that often gets treated as secondary, but should probably be treated as central: exploration. Call it a POC, a prototype, an experiment, or just “trying it out.” The label does not matter much. What matters is that some questions cannot be answered by documents alone.

You can write a convincing product brief and still be wrong about whether the experience is actually useful. You can produce a thoughtful design and still miss how the system behaves once a real user gets involved. You can plan a delivery perfectly and still discover that the thing you thought was a milestone was really just the beginning of the work.

That is why the exploratory phase starts to feel less optional. It is not a side quest before the “real” project begins. In many cases, it is where the real project becomes visible. The role of exploration is not just to de-risk implementation, it is to reveal what the system actually needs to be. That is a different job.

Delivery Plans

Delivery plans also carry a slightly different meaning now. They still help us sequence work, manage dependencies, and create a sensible path forward. But when the thing being built requires more learning along the way, milestones become less about completion and more about confidence. A milestone is not just “the code exists.” It is also “we know enough now to make the next decision with less uncertainty than before.”

That is a subtle but important shift. Progress is not just measured by output, it is measured by what we have learned. Sometimes the most valuable thing a milestone gives us is not momentum, but a correction. It tells us the original idea was close, but not quite right. Or that the system needs a different shape than we expected. Or that the thing we thought would be central turns out to matter less than something we discovered along the way.

Definition of Done / Readiness

This is the part of the workflow that tends to stay the most recognizable, because there is still a real need to ask whether something can be operated safely and reliably. That does not go away. If anything, it becomes more important. But the standard for readiness shifts too. It is not enough to know that something runs. We also need to know whether we can understand its behavior, measure its quality, and notice when it begins to drift.

That is where observability and evaluation start to matter in a deeper way, not just as operational hygiene, but as a way of owning the behavior of the system after it ships. The more adaptive or probabilistic the system is, the more important it becomes to understand not only whether it is up, but whether it is still doing the right thing. That means closing feedback loops earlier, and making sure the system can teach us when it is getting worse rather than just failing loudly.
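As a rough sketch of what “noticing drift” can mean in practice, the snippet below compares a rolling average of evaluation scores against a launch baseline. The names and numbers are assumptions for illustration: scores are taken to be in the range 0 to 1, and `baseline`, `tolerance`, and `window` are hypothetical parameters a team would have to choose for its own system.

```python
from collections import deque
from statistics import mean


class DriftMonitor:
    """Flag drift when the recent average quality score falls below baseline.

    All parameters are illustrative assumptions: `baseline` is the average
    score measured at launch, `tolerance` is how much degradation we accept,
    and `window` is how many recent scores we average over.
    """

    def __init__(self, baseline: float, tolerance: float = 0.05, window: int = 5):
        self.baseline = baseline
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)

    def record(self, score: float) -> bool:
        """Record one evaluation score; return True if the system has drifted."""
        self.recent.append(score)
        # Wait for a full window so a single bad sample does not raise a flag.
        if len(self.recent) < self.recent.maxlen:
            return False
        return mean(self.recent) < self.baseline - self.tolerance


monitor = DriftMonitor(baseline=0.9, tolerance=0.05, window=3)
flags = [monitor.record(s) for s in [0.9, 0.9, 0.9, 0.8, 0.8]]
print(flags)
```

Nothing in this example ever fails loudly. The system stays “up” the whole time, and only the comparison against a baseline shows that it is getting worse.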

Closing thoughts

I do not think the answer is to invent an entirely new lifecycle (though there are attempts, like asdlc.io). The old structure still gives us something valuable. It creates shared language. It creates checkpoints. It keeps teams from skipping the hard conversations.

The process stays, but the meaning shifts.

Walking the DEM Lifecycle: What I Learned by Using Grafana

2026-03-28T00:00:00+00:00

In my previous post, I wrote about how I approached onboarding after moving from Identity and Access to Synthetic Monitoring and Session Replay. One of the most useful things I did during that onboarding was build a small playground and use it to walk the DEM lifecycle end to end.

The app is intentionally simple: no database, a fake checkout flow, a status page, a few API failure modes, and Grafana Faro for frontend telemetry. That made it a good environment for one question I kept coming back to:

if something breaks in production, how do I move from a signal to actual understanding?

What I ended up learning was that Digital Experience Monitoring is not really about collecting more signals. It is about how quickly you can move through the steps that matter: detect, scope, select a session, view what happened, diagnose, fix, and validate.

From signals to understanding

Digital Experience Monitoring (DEM) is often described in terms of signals: synthetic checks, frontend observability, logs, traces, and so forth. Coming from the outside, that framing made sense. But once I started using the system, something became clear very quickly: the problem isn’t collecting signals, it’s making sense of them when something breaks.

You can have all the right data and still feel stuck.

A synthetic check is failing
Frontend errors are slightly elevated
Logs look noisy

Individually, each of these is useful. Together, they should tell a coherent story, but in practice that story is not always obvious.

Walking the lifecycle

To make sense of this, I started walking the system the way a user would. Not as an insider, but as someone trying to debug a real issue.

What emerged was a fairly consistent pattern:

Detect → Scope → Understand → Diagnose → Fix → Validate

This is not a new framework. Most teams already operate this way implicitly. But walking through it step by step exposed where things feel smooth—and where they don’t.

A concrete example

One of the simplest ways I tested my understanding was by creating a synthetic browser check against a demo app and then intentionally breaking it.

Detect

The synthetic check failed. That part worked exactly as expected.

Synthetic Monitoring is very effective at this because it gives you a controlled, repeatable version of a user flow. When it fails across multiple probes, you know something is off.

But at that moment, I realized something: I knew that something was broken, but I didn’t know if it mattered.

Scope

The next step was figuring out whether this failure showed up in real user behavior.

Looking at frontend signals helped answer that:

Are users hitting this flow?
Are error rates increasing?
Is the issue localized?

This is where the picture started to form. The failure wasn’t just synthetic, it was visible in real usage. That shift—from “signal” to “impact”—felt like a key transition.

Understand

This was the most interesting part of the experience. Knowing that users are affected still leaves a gap: what actually happens when the flow breaks?

Logs and metrics can help, but they require interpretation.

What helped more was looking at behavior:

where users drop off
which step fails
whether the issue is consistent

Instead of correlating multiple graphs, I could follow the flow itself. That made the problem much more concrete.

At this point, something clicked for me:

Synthetic Monitoring tells you where to look, Frontend Observability shows you what actually happens.

Those two together reduce a lot of guesswork.

Diagnose

Once I had a clear picture of the failure, moving into logs and traces felt very different.

Instead of exploring the system broadly, I was looking for something specific:

what happens during this step?
why does it fail here?

That narrowed the search space significantly.

Fix and validate

After applying a fix, I followed the same path again:

synthetic check returns to green
frontend signals stabilize
flows complete successfully

This closed the loop. And it highlighted something I hadn’t fully appreciated before:

Synthetic Monitoring isn’t just about detecting failures. It’s also a reliable way to confirm recovery.

What changed for me

Before this exercise, I thought about DEM mostly in terms of capabilities (checks, signals, integrations, etc.). After walking the lifecycle, I started thinking about it differently: not as a collection of signals, but as a workflow.

Each part answers a different question:

is the system behaving as expected?
are users actually impacted?
what does the failure look like?
where is the root cause?

The value is not just in having these answers, but in how quickly you can move between them.

Where things still feel hard

This exercise also made some tensions more visible. As systems grow, more flows get monitored and more teams interact with them. At that point, small inconsistencies start to matter (naming, time ranges, missing context). None of these are new problems. But during an incident, they become very noticeable.

Closing thought

Walking the product as a user gave me a much clearer lens on DEM. The challenge is not collecting more data. It’s maintaining a clear path from detection to resolution.

Detect → Scope → Understand → Diagnose → Fix → Validate

Most systems have the pieces, but the difference is how well they connect.

Switching Teams as an Engineering Manager: How I Structured My Onboarding

2026-02-27T00:00:00+00:00

At Grafana Labs, we don’t just support internal mobility, we actively encourage it. So here’s some long-overdue news: after five years supporting the Identity and Access team, I’ve moved to lead the Synthetic Monitoring and Session Replay squads.

When I made the decision, I expected complexity. What I didn’t expect was how disorienting the transition would feel.

In the first week, I almost felt like I hadn’t just changed teams - I had changed companies. A new domain. A new vocabulary. New product surfaces. New stakeholders. Different operational realities.

And somewhere in that first week came a quiet realization: if I didn’t approach this intentionally, I’d spend months reacting instead of learning.

So I treated onboarding as a project. Not something that “just happens,” but something I would design.

My Onboarding Phases

I split my onboarding into four concrete phases. Not because I thought reality would follow the plan perfectly, but because without structure, everything feels urgent and nothing feels clear.

Phase 0 - Pre-Day 1: Build a Map

Before officially starting, I focused on building a rough mental map. I read strategy documents, OKRs, roadmaps, past retros, Slack threads. I tried to understand how the department talked about itself.

The goal was not depth. It was orientation. By the time Day 1 arrived, I didn’t understand the system, but I knew the vocabulary.

Phase 1 - First 2 Weeks: People Before Software

In the first two weeks, I deliberately biased toward conversations over diagrams.

I asked:

What frustrates you today?
What are we pretending is fine?
Where does work get stuck?
If you could fix one thing this quarter, what would it be?

I resisted the urge to dive deep into software too early. Understanding how the team experiences the system is more important than understanding the system itself. You can always learn code. Trust is slower.

Phase 2 - Weeks 3-4: Walk the Product

This is where I forced myself to stop reading and start using.

I walked the lifecycle end-to-end as a user.

instrumented a small demo app.
created checks.
triggered failures.
followed alerts.
wrote k6 tests.

Not to test the product, but to test my understanding. This phase revealed more gaps in my mental model than any document could. It also surfaced something important: friction often lives at the boundaries between products, not within them.

Phase 3 - End of Month 1: Reflect Publicly

At the four-week mark, I wrote and shared a reflection. Not a polished summary, a real one.

What did I learn?
What surprised me?
What still confused me?
Where was I behind?
What would I focus on next?

One uncomfortable realization was this: my job at that moment was to ask better questions, not to provide fast answers. If that sounds obvious - it’s not.

As a manager, there’s pressure to demonstrate clarity quickly. But clarity in a new domain takes time. Pretending to have it only creates fragility later.

Naming what I didn’t understand reduced the pressure to fake confidence. It also created space for the team to help me build context. Transparency during onboarding builds trust faster than posturing.

Reverse Engineering the Goals (Even If I Later Threw It Away)

One of the exercises I tried early on was to reverse engineer the squad’s goals. I synthesized what I thought the real objectives were and asked the team and PM: “Did I get this right?”

In hindsight, the document was probably too verbose and too heavy. It’s not something I would keep long term. But the exercise forced me to confront misalignment early.

Sometimes the value of a document isn’t in keeping it. It’s in thinking through it.

Treat Onboarding Like Data Collection

For the first month, I aggressively took notes.

Every 1:1.
Every architectural explanation.
Every Slack clarification.
Every moment where I realized I had misunderstood something.

I didn’t try to be elegant. I tried to be exhaustive. Over time, I started structuring those notes more intentionally. Onboarding, especially in a technical domain, is a data collection phase. You’re mapping a system that already exists, with its history, assumptions, and tensions.

Here are some patterns that helped me structure that data:

1. One-sentence summaries.
After each major conversation or topic, I tried to write one sentence I could rely on later. If I couldn’t summarize it clearly, I probably didn’t understand it well enough yet.

2. Facts and open questions per domain.
For each product or sub-system, I separated what I knew from what I didn’t. Writing down open questions made uncertainty explicit instead of vaguely uncomfortable.

3. An observation log - who said what.
Not in a political way, but in a pattern-detection way. Different perspectives reveal different parts of the system. Over time, themes start to emerge.

4. Hypotheses and tensions.
As patterns formed, I wrote down emerging hypotheses:

“Is private probes actually a product within a product?”
“Are we optimizing for enterprise customers over small ones?”

Capturing tensions early helped me avoid jumping to conclusions too quickly.

5. A decisions log.
Some decisions I intentionally deferred during onboarding. I wrote them down explicitly. That way, deferring wasn’t avoidance - it was conscious sequencing.

Later, I used AI tools to summarize and cluster my notes. Not to replace thinking, but to accelerate synthesis. Onboarding is not about having opinions quickly. It’s about building a clear mental model over time.

First gather signal. Then refine.

Ask for Direct Feedback

Rather than assuming alignment, I asked explicitly:

Is there anything you were expecting to see from me that you’re not seeing?

That question changed the tone of onboarding. It surfaced implicit expectations that no checklist would have captured.

Feedback during onboarding isn’t about evaluation. It’s about adjusting trajectory.

What Worked

Phased onboarding gave structure to uncertainty.
Walking the product exposed real friction.
Shadowing on-duty revealed operational reality.
Writing reflections forced synthesis.

Most importantly, being intentional created momentum.

What I’d Do Differently

Reverse engineering goals was useful but time-consuming.

I underestimated how much institutional knowledge lives only in people’s heads.
I delayed parts of hands-on exploration longer than I should have.
I still lack full business context in certain areas.

But onboarding is not about finishing. It’s about trajectory.

Closing Thought

Switching teams as an Engineering Manager isn’t about proving yourself quickly. It’s about building clarity - of the domain, the people, the risks, and the direction.

You don’t need all the answers in the first month. You need structure. You need signal. And you need the humility to admit what you don’t yet understand.

In a follow-up post, I’ll write about what I learned when I tried to understand Digital Experience Monitoring by walking the lifecycle as a user, and why that exercise reshaped how I think about product clarity.

The Three Levers of Growth

2026-01-01T00:00:00+00:00

I’ve been thinking a lot recently about personal growth and, in the process, realized something important: there have always been periods in my career where I wasn’t growing. Sometimes I understood it too late. Other times, the lack of growth resulted in stagnation that ultimately harmed my team, my family, and the business.

Growth is different from performance. High performers, more often than is ideal, go through longer growth pauses than others. It’s critical not to confuse the two: growth is not the same as doing your job well. Growth is also widely treated as crucial for retention, which is why you usually find the “Am I growing?” question on annual surveys.

For many years, I’ve done a quarterly assessment to measure growth by answering three questions.

Are you learning? Are you growing?
Are you gaining transferable skills or just learning how to cope with your current job?
How confident and capable do you feel in your role?

I wrote about this more here. This framework served me well for a long time, but as I grew as a leader, it started to feel insufficient. I was searching for a new lens when I came across this idea while reading Ego Is the Enemy by Ryan Holiday. In the book, there’s a reflection (originally from Epictetus) that stayed with me:

“It is impossible to learn that which one thinks one already knows.” - Epictetus, as cited in Ego Is the Enemy.

This resonated because it reminded me of a simple truth: growth depends on how you orient yourself relative to others.

A Meta Framework for Growth

In this context, I started thinking about growth not only as what you do, but who you relate to:

A Teacher: Someone you can learn from (someone more experienced or wiser than you). This means you are intentionally seeking knowledge and growth.
A Peer: Someone your knowledge can be tested against (someone equal). You have someone who challenges your thinking and helps you refine your understanding through dialogue.
A Student: Someone you can teach (someone less experienced). A student forces you to articulate and apply your knowledge in new ways.

This framework struck me as a useful meta model for growth. These levers don’t measure performance. They measure growth conditions.

I didn’t have a teacher for a very long time. Could that be why my growth felt stalled?

We often ask, especially in leadership roles, whether someone has mentored others, and sometimes even make it a prerequisite for promotion. There is an obvious aspect to this: when you mentor, you create leverage and maximize your impact in the organization. That’s what is expected from a leader.

But it’s less discussed that mentoring is fundamentally about teaching. When you teach someone, you are practicing the third component of the framework.

So what about the other two components?

Learning and growing with peers is the most obvious and visible lever, yet it is often overlooked. Who are you learning with?
Having a teacher is critical. If you don’t have someone to learn from, that can itself be a sign that growth has slowed or stopped.

The subtle trap here is ego. The more experience you have, the more likely it is that ego can convince you that you don’t need a teacher. But that’s exactly when you need one most.

A teacher doesn’t have to be from your company or professional circle, it can be anyone: someone from the internet, a neighbor, a former colleague, anyone whose perspective challenges you.

The Mathematics of Growth

In many careers (especially technical ones), almost everyone who passes the basic bar eventually reaches a “senior” level. Some companies even treat this as a terminal point with a defined progression timeline. But after that, growth becomes hard to define.

You can learn more stuff, but does that necessarily mean you’re growing? For me personally, merely accumulating knowledge is not enough.

So what’s the formula? I don’t think there is one. But having a teacher, a peer, and a student creates the necessary, if not fully sufficient, conditions for growth to happen.

The two clocks ticking behind every performance issue

2025-12-11T00:00:00+00:00

I have talked in the past about the two metrics that quietly drive a huge portion of our professional outcomes: TTA (Time to Awareness) and TTR (Time to Resolution).

We often treat performance as a static grade: “She is a high performer,” or “He is struggling.” But performance is actually a function of speed. Specifically, the speed at which you detect gaps and the speed at which you close them.

Over and over, I see people make the same mistake: treating their own or their team’s performance as a “grade.” The picture changes when you start looking at the clocks of performance instead.

The First Clock: TTA (Time to Awareness)

Time to Awareness is the latency between reality changing and you realizing it. This is the most dangerous clock because it ticks silently. You can have a TTA of six months and feel perfectly fine the entire time.

In engineering systems, we have tools (like the Grafana ecosystem) and alerts to keep TTA near zero. If a server goes down, we know in seconds. In human systems, we have… “politeness.” We rely on social signals, which are often noisy. Managers delay feedback to “gather more data.” Peers don’t want to be “mean.” We avoid self-reflection because it’s uncomfortable.

High TTA is usually a symptom of a low-feedback environment (or a low-receptivity mindset).

The Second Clock: TTR (Time to Resolution)

Time to Resolution is the latency between knowing the truth and fixing it. Once the first clock stops (you know there is an issue), the second clock starts. This is where the work happens.

High TTR is rarely about laziness, it’s usually about ambiguity. “Improve system design” is a terrifying, vague goal. It’s too big to fix on a Tuesday afternoon. So we procrastinate. We wait for a “perfect time” to study that never comes.

High TTR is a symptom of poor definition.

The Zone of Performance

If you plot these two against each other, you can map every stage of your career:

High TTA / High TTR (The Danger Zone): You don’t know you’re failing, and even if you did, you wouldn’t fix it fast enough. This leads to PIPs and surprise firings.
Low TTA / High TTR (The Frustrated Stagnation): You are painfully aware of your gaps (maybe you have Imposter Syndrome), but you feel stuck. You analyze endlessly but don’t ship improvements.
High TTA / Low TTR (The Blind Executor): You can fix anything you see, but you don’t see enough. You need a strong manager to point you in the right direction constantly.
Low TTA / Low TTR (The High Performer): You have a tight feedback loop. You know when you drift off course within days (or hours), and you correct it immediately.
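The four zones above can be sketched as a small classifier. This is only an illustration of the two-by-two, under one loud assumption: that a single cutoff in days (the hypothetical `threshold_days`) separates “low” from “high” on both clocks. In reality the cutoff depends on the role and the kind of gap.

```python
def performance_zone(tta_days: float, ttr_days: float, threshold_days: float = 14.0) -> str:
    """Map the two clocks onto the four zones.

    `threshold_days` is an assumed cutoff: anything slower than this
    counts as "high" latency for both awareness and resolution.
    """
    high_tta = tta_days > threshold_days
    high_ttr = ttr_days > threshold_days

    if high_tta and high_ttr:
        return "Danger Zone"
    if high_ttr:
        # Aware of the gap quickly, but slow to close it.
        return "Frustrated Stagnation"
    if high_tta:
        # Quick to fix what is visible, but slow to see it.
        return "Blind Executor"
    return "High Performer"


# Example: aware within 3 days, but the fix takes 2 months.
print(performance_zone(tta_days=3, ttr_days=60))  # Frustrated Stagnation
```

The useful part is not the label. It is that each zone points at a different clock to work on: awareness first if TTA is high, definition of the fix if TTR is high.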

How to stop the clocks

Reducing TTA (Know Sooner)

You cannot wait for your manager to tell you the truth. You have to hunt for it. Quantify your output: Don’t just “work.” Log what you shipped. Look at it at the end of the week. Does it look like a Senior Engineer’s week? If you don’t know, ask.

Ask for “leading” feedback: Instead of “How am I doing?”, ask “What is the one thing I could have done better this week?”

Reducing TTR (Fix Faster)

The only way to reduce TTR is to break the fix down until it looks like a ticket. Turn “Growth” into “Input”: Don’t put “Get better at communication” on your to-do list. Put “Write a 1-page RFC for the API migration” on your calendar.

Measure the inputs: If you want to improve, track the effort, not just the result.

The ultimate hack

The best engineers I know treat their career like a production system. They hate latency. They don’t wait for the quarterly review to find out if they are on track. They look in the mirror every week. They catch the drift early and course-correct before anyone else notices.

5 years at Grafana Labs

2025-11-28T00:00:00+00:00

It has been five years since I joined Grafana Labs. In my entire career, I have never worked at any company for this long. This is a new situation for me. Turns out, you can stay in the same company, with the same people, for years and still feel fulfillment, joy, and growth.

But it hasn’t all been rainbows. I had moments where I wondered if I was actually growing or just hanging around. So, I’ve been reflecting on the reality of staying put in a hyper-growth environment. To stay in the “cliché” zone, I decided to write down 5 learnings and 5 challenges from the last five years.

5 Learnings

You can’t cheat evolution: I see many startups today scaling to $100M revenue in six months with zero infrastructure or telemetry to sustain it. Then they hit the wall. You can cheat many things, but you can’t cheat evolution. It takes time to mature. Time works for you, but only if you let it. Scaling isn’t a magic trick; it requires repetition, mistakes, and corrections.
The ROI of people: I always knew the theory, but seeing it play out in real life is different. People are the most expensive investment and the biggest return. Scaling from 4 to 1200 doesn’t happen without continuous investment in people. Grafana Labs is remote, but the level of care I’ve seen here restored my belief that corporations can actually be humanistic.
Diplomacy is not politics: We overuse the word “politics” so much that everything becomes politics. But that isn’t true. Politics is power-playing; diplomacy is navigating complexity in a way that preserves trust. Scaling organizations require the right diplomacy.
Community is the strategy: Customers and community are not “third parties” you occasionally interact with and need to sell a product to. They are part of the strategy. Honestly, they are part of the family. Long-term success comes from a symbiosis of internal and external collaboration; you cannot build in a vacuum.
Your career is built, not given: No matter how supportive your environment is, no one will steer your career except you. The moments where I grew the most were the ones where I actively shaped my own path—changing scopes, redefining responsibilities, and asking for alignment. Your career does not “happen.” It is built, step by step, by you.

5 Challenges

The tectonic plates of scale: The complexity of scale sneaks up on you. At 200 people, everything feels manageable. At 800, invisible cracks appear. At 1200, the cracks turn into tectonic plates. Navigating that shift required relearning what “alignment” even means.
Emotional asymmetry: Remote work is incredible, but it is emotionally asymmetric. You don’t see people in the hallway. You don’t sense tension early. Sometimes you discover problems only when they are fully grown trees, not when they are seedlings. You have to work twice as hard to keep your finger on the pulse.
Uneven growth pace: The speed of growth means not everyone grows at the same pace. Some people outgrow their roles, some fall behind, and some freeze. Managing that dynamic in a fair and humane way is one of the hardest parts of leadership.
Adrenaline vs. Systems: The temptation to optimize for short-term output is always there. The industry chases “ship faster” and “deliver more.” But great teams don’t survive on adrenaline; they survive on systems. Resisting the adrenaline rush was often harder than embracing it.
Uncomfortable truths: Being honest about culture is uncomfortable. Every company has blind spots, and pointing at them can feel like poking the bear, but avoiding it only makes the culture smaller. Learning how to name uncomfortable truths without being destructive was the real challenge.

Here is to the next chapter. Grafana Labs is here to stay! Also, we are almost always hiring.