Observability tools act as safety nets for engineers, providing essential support for maintaining system stability and resolving issues swiftly. These tools play a critical role in incident management, a process that includes monitoring, alerting, triaging, investigating, and remediating.
My team specifically focuses on the investigating stage, where effective analysis helps engineers pinpoint the root cause of problems and ensures quick, efficient incident resolution.
Our product's multiple correlation tools, including RCA and Change Intelligence, lacked a unified workflow, creating a fragmented user experience.
Each correlation tool supported different telemetry types, creating limitations for users trying to investigate issues comprehensively. This inconsistency restricted users' ability to leverage correlations effectively across various data sets.
User feedback indicated that the UI, particularly for Change Intelligence, was complex and difficult to navigate. Tight deadlines and technical constraints contributed to a steep learning curve, reducing the feature's overall value.
To streamline our workflows and make our correlations feature more usable and useful, we needed to enhance and upgrade our latest feature, Change Intelligence, while phasing out the older correlation features, RCA and the correlation panel.
To uncover the key pain points and opportunities for improvement, I conducted extensive research through user interviews, session analysis, competitive benchmarking, and collaborative discussions with stakeholders.
To understand how users interacted with the existing correlation features, I reviewed approximately 200 FullStory sessions and conducted interviews with 12 users (4 internal and 8 external) who regularly used the product.
To better understand where our product stood in the market, I analyzed Honeycomb's BubbleUp feature, which users frequently compared our product to, with input from our GTM experts.
To ensure our research findings translated into actionable strategies, I facilitated collaborative sessions with stakeholders, engineers, and designers to prioritize our goals and roadmap.
To understand how research findings influenced design decisions, please continue reading the motivations in the iterative implementation section ⬇️.
Given the magnitude and complexity of the problem, we planned out four milestones. These milestones were designed to address and solve the problem incrementally, so that each part received thorough consideration. This kept us organized and on track towards our main goal.
Following implementation and enhancement of the feature, daily active users increased by 130%. User engagement metrics showed higher average session duration as people explored the tool's expanded capabilities. Success stories from users confirmed the tool effectively met their specific needs, validating its improved functionality and user experience.