Cisco recently experimented with using AI to automate security incident report writing, revealing both the potential and limitations of AI in cybersecurity operations. While the AI successfully generated structured reports and saved time on routine documentation, it also produced inaccuracies, missed critical context, and required significant human oversight. This experiment highlights the current reality of AI in security: a useful tool for augmentation, not replacement, of human analysts.
Introduction
The cybersecurity industry faces a persistent challenge: too many incidents, too few analysts, and never enough time. Cisco’s Talos security team decided to address this by experimenting with AI-generated security incident reports. The results paint a nuanced picture of where AI currently stands in security operations.
This isn’t just another “AI will replace security analysts” story. Instead, Cisco’s honest assessment provides valuable insights into what AI can realistically accomplish in incident response today, and where human expertise remains irreplaceable.
For organizations considering AI integration into their security workflows, Cisco’s experience offers crucial lessons about deployment strategies, quality control, and the hybrid human-AI model that appears most promising.
Background & Context
Security teams worldwide struggle with documentation overhead. Every incident requires detailed reports covering timeline, indicators of compromise (IoCs), root cause analysis, remediation steps, and lessons learned. This documentation is critical for compliance, knowledge sharing, and improving defenses, but it’s time-consuming work that pulls analysts away from active threat hunting and response.
Large language models (LLMs) have shown impressive capabilities in generating structured text from unstructured data. Tools like ChatGPT, Claude, and specialized security AI platforms promise to automate repetitive writing tasks while maintaining consistency and completeness.
Cisco’s security operations center (SOC) handles thousands of incidents annually. The team saw an opportunity to leverage AI for initial report drafting, allowing human analysts to focus on investigation and decision-making rather than documentation. They deployed the system across multiple incident types, from routine malware detections to complex intrusion investigations.
The experiment ran for several months, generating reports that were then reviewed and edited by senior analysts. Cisco tracked metrics including time savings, accuracy rates, required edits, and analyst satisfaction with the AI-generated content.
Technical Breakdown
Cisco’s implementation used an LLM fine-tuned on historical incident reports from their environment. The system followed a structured workflow:
Data Collection Phase:
Input Sources:
- SIEM alerts and raw logs
- EDR telemetry data
- Network flow records
- Threat intelligence feeds
- Analyst notes and timestamps
The AI ingested structured data from security tools alongside unstructured analyst notes. It parsed timestamps, IP addresses, file hashes, and user accounts to build a chronological incident timeline.
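As a minimal sketch of the parsing step described above (not Cisco's actual implementation), the following Python pulls IPv4 addresses and SHA-256 hashes out of log text and orders events by timestamp. The regular expressions, the function names `extract_iocs` and `build_timeline`, and the sample events are all illustrative assumptions.

```python
import re
from datetime import datetime

# Assumed patterns for illustration only. The IPv4 pattern does not validate
# octet ranges, and real deployments would also need IPv6, domains, MD5, etc.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")


def extract_iocs(text):
    """Return the IPv4 addresses and SHA-256 hashes found in a piece of log text."""
    return {
        "ipv4": sorted(set(IPV4_RE.findall(text))),
        "sha256": sorted({h.lower() for h in SHA256_RE.findall(text)}),
    }


def build_timeline(events):
    """Sort (iso_timestamp, message) pairs chronologically and attach extracted IoCs."""
    parsed = [(datetime.fromisoformat(ts), msg) for ts, msg in events]
    parsed.sort(key=lambda pair: pair[0])
    return [
        {"time": ts.isoformat(), "message": msg, "iocs": extract_iocs(msg)}
        for ts, msg in parsed
    ]


# Hypothetical sample events using documentation-range IP addresses.
events = [
    ("2024-05-01T10:15:00", "EDR: blocked connection to 203.0.113.7"),
    ("2024-05-01T10:02:30", "SIEM: login failure for user jdoe from 198.51.100.4"),
]
timeline = build_timeline(events)
```

Deterministic extraction like this is the kind of task the experiment found reliable; the narrative interpretation of the resulting timeline is where human review matters most.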
Report Generation Process:
The LLM followed templates matching Cisco’s existing report structure. It populated sections systematically:
- Executive Summary: High-level incident description and business impact
- Technical Details: IoCs, attack vectors, and affected systems
- Timeline: Chronological event sequence from detection to containment
- Root Cause: Analysis of how the incident occurred
- Remediation: Actions taken and recommendations
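The section list above maps naturally onto a fixed data structure. The sketch below is one hypothetical way to represent such a template in Python; the class name `IncidentReport`, its field names, and the `missing_sections` helper are assumptions, not Cisco's schema.

```python
from dataclasses import dataclass, fields


@dataclass
class IncidentReport:
    """Report scaffold with one field per template section (names are assumed)."""
    executive_summary: str = ""
    technical_details: str = ""
    timeline: str = ""
    root_cause: str = ""
    remediation: str = ""

    def missing_sections(self):
        """Return the names of sections that are still empty, in template order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical partially populated draft.
draft = IncidentReport(
    executive_summary="Commodity malware detected on one laptop.",
    timeline="10:02 alert raised; 10:15 host isolated.",
)
```

Here `draft.missing_sections()` lists the three sections that remain empty, which gives a reviewer an explicit checklist of what the draft still lacks.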
Quality Control Integration:
Cisco implemented a review workflow where AI-generated reports were flagged with confidence scores. Reports below certain thresholds received priority human review before publication.
The system also highlighted sections where it lacked sufficient data or detected inconsistencies, prompting analysts to verify specific details.
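A threshold-based routing rule of this kind could look roughly like the following. The article does not state Cisco's actual thresholds, so the cut-off value, the function name `review_priority`, and the queue labels are assumptions chosen for illustration.

```python
# Assumed cut-off; the source does not give the thresholds Cisco used.
PRIORITY_THRESHOLD = 0.7


def review_priority(confidence, flagged_sections=()):
    """Route a draft to 'priority' or 'standard' human review.

    Every draft is still reviewed by a human; this only decides the queue.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Low confidence or any section the model flagged as uncertain escalates review.
    if confidence < PRIORITY_THRESHOLD or flagged_sections:
        return "priority"
    return "standard"
```

Note that neither branch skips review: the score changes how soon a human looks at the draft, not whether one does.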
Impact & Risk Assessment
Positive Outcomes:
The experiment demonstrated measurable benefits in specific scenarios. For routine incidents with clear patterns (known malware detections, policy violations, failed login attempts), AI reduced report writing time by 40-60%. The system excelled at:
- Extracting and formatting IoCs consistently
- Generating accurate timelines from timestamped data
- Applying standard remediation procedures for common incident types
- Maintaining consistent report structure and formatting
Junior analysts particularly benefited from AI-generated drafts as learning templates, seeing how senior analysts typically structure and phrase incident reports.
Critical Failures:
However, the AI struggled significantly with complex incidents requiring contextual understanding:
- Missed Attack Context: The AI failed to recognize when seemingly unrelated events were part of coordinated attack campaigns
- Incorrect Causality: It sometimes reversed cause-and-effect relationships in attack chains
- Hallucinated Details: The system occasionally generated plausible-sounding but factually incorrect technical details
- Lost Nuance: Subtle indicators that experienced analysts would flag received inadequate attention
One concerning example involved an intrusion where the AI correctly documented individual events but completely missed that the attacker had maintained persistence through multiple techniques, presenting it as a simple malware infection rather than a sophisticated breach.
Security Implications:
The most significant risk emerged around over-reliance. Analysts who trusted AI-generated reports without thorough review sometimes missed critical details. One incident was nearly closed prematurely when the AI report understated the scope of data access an attacker had obtained.
Vendor Response
Cisco publicly shared their findings through blog posts and conference presentations, taking a refreshingly transparent approach. Rather than marketing AI as a solution to the analyst shortage, they positioned it realistically as an augmentation tool requiring careful implementation.
Key recommendations from Cisco’s security team:
Appropriate Use Cases:
- Initial draft generation for routine, well-understood incident types
- IoC extraction and formatting from raw logs
- Report structure scaffolding that analysts complete
- Documentation consistency enforcement
Inappropriate Use Cases:
- Final report generation without human review
- Complex incident analysis requiring contextual understanding
- Root cause determination for novel attack techniques
- Strategic security recommendations
Cisco emphasized that their implementation required significant upfront investment in training data curation, prompt engineering, and integration with existing tools. Organizations considering similar deployments should expect 3-6 months of tuning before production use.
The company also noted that different AI models performed better on different tasks. Extracting structured data worked well with smaller, specialized models, while narrative sections benefited from larger, more capable LLMs.
Mitigations & Workarounds
Organizations interested in AI-assisted incident reporting should implement these safeguards:
Mandatory Human Review:
Review Requirements:
- All AI reports: Technical accuracy verification
- Medium+ severity: Senior analyst approval
- High/critical: Peer review by 2+ analysts
- Novel incidents: Complete analyst rewrite
Never publish AI-generated security reports without qualified human review. The cost of inaccurate incident documentation far exceeds any time savings.
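The review requirements above can be encoded as a small policy function so that tooling, rather than memory, decides which sign-offs a report needs. This is a sketch under assumed severity labels ("low", "medium", "high", "critical"); the function name `required_reviews` is hypothetical.

```python
def required_reviews(severity, novel=False):
    """Return the review steps a report needs, based on the requirements listed above."""
    severity = severity.lower()
    if severity not in ("low", "medium", "high", "critical"):
        raise ValueError(f"unknown severity: {severity}")

    steps = ["technical accuracy verification"]  # applies to all AI reports
    if severity in ("medium", "high", "critical"):
        steps.append("senior analyst approval")
    if severity in ("high", "critical"):
        steps.append("peer review by 2+ analysts")
    if novel:
        steps.append("complete analyst rewrite")
    return steps
```

Because the checks accumulate, a critical incident inherits every lower-severity requirement instead of replacing it.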
Confidence Scoring:
Implement automated flagging for reports requiring extra scrutiny:
- Low confidence scores from the AI itself
- Incidents involving unfamiliar IoCs or techniques
- Reports with missing data in critical fields
- Inconsistencies between sections
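The four flagging conditions above could be checked automatically along these lines. The report layout (a dict with `confidence`, `iocs`, and `sections` keys), the `CRITICAL_FIELDS` tuple, the default threshold, and the deliberately crude consistency check are all assumptions made for this sketch.

```python
# Assumed set of fields that must never be empty.
CRITICAL_FIELDS = ("timeline", "root_cause")


def scrutiny_flags(report, known_iocs, min_confidence=0.7):
    """Return the reasons a report needs extra scrutiny (empty list if none)."""
    flags = []
    sections = report["sections"]

    if report["confidence"] < min_confidence:
        flags.append("low_confidence")
    # Any IoC not already in the organization's known set counts as unfamiliar.
    if set(report["iocs"]) - set(known_iocs):
        flags.append("unfamiliar_iocs")
    if any(not sections.get(name, "").strip() for name in CRITICAL_FIELDS):
        flags.append("missing_critical_field")
    # Crude consistency check: every listed IoC should appear in the timeline text.
    if any(ioc not in sections.get("timeline", "") for ioc in report["iocs"]):
        flags.append("section_inconsistency")
    return flags


# Hypothetical draft with a low score, a new IoC, and an empty root cause.
example = {
    "confidence": 0.55,
    "iocs": ["203.0.113.7"],
    "sections": {"timeline": "10:15 connection to 203.0.113.7 blocked", "root_cause": ""},
}
example_flags = scrutiny_flags(example, known_iocs=[])
```

Returning the reasons rather than a single boolean lets the reviewer see why a report was escalated.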
Hybrid Workflows:
The most successful approach combined AI and human strengths:
- AI generates initial timeline and IoC extraction
- Human analyst investigates and validates technical details
- AI formats findings according to template structure
- Human analyst writes analysis, conclusions, and recommendations
- AI performs consistency and completeness checks
- Human analyst performs final review and approval
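One way to make this alternation explicit in tooling is to model it as an ordered list of owned steps. The sketch below is purely illustrative: the `HYBRID_WORKFLOW` structure and the helper names are assumptions, with step descriptions taken from the list above.

```python
# Owners and tasks mirror the list above; the data structure itself is assumed.
HYBRID_WORKFLOW = [
    ("ai", "generate initial timeline and extract IoCs"),
    ("human", "investigate and validate technical details"),
    ("ai", "format findings according to template structure"),
    ("human", "write analysis, conclusions, and recommendations"),
    ("ai", "run consistency and completeness checks"),
    ("human", "final review and approval"),
]


def next_step(completed_steps):
    """Return the (owner, task) pair that comes next, or None when the workflow is done."""
    if 0 <= completed_steps < len(HYBRID_WORKFLOW):
        return HYBRID_WORKFLOW[completed_steps]
    return None


def can_publish(completed_steps):
    """A report is publishable only after every step, ending with human approval."""
    return completed_steps == len(HYBRID_WORKFLOW) and HYBRID_WORKFLOW[-1][0] == "human"
```

The design point is that the last step is always owned by a human, so no path through the workflow ends with unreviewed AI output.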
Training and Calibration:
Analysts need training to use AI tools effectively:
- Understanding AI limitations and common failure modes
- Recognizing hallucinated technical details
- Knowing when to override AI suggestions
- Properly reviewing and editing AI-generated content
Detection & Monitoring
Organizations deploying AI for security reporting should monitor for quality degradation:
Quality Metrics:
Tracking Dashboard:
- Edit percentage (how much human revision required)
- Error rates by incident type
- Time from AI draft to analyst approval
- Reopened incidents due to incomplete reports
- Analyst satisfaction scores
Track these metrics over time to identify when AI performance degrades or when specific incident types consistently require extensive revision.
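Edit percentage can be approximated by comparing the AI draft with the approved report. The sketch below uses Python's standard `difflib.SequenceMatcher` on whitespace-separated words; this similarity-based definition is one assumed way to compute the metric, not the one Cisco reported using.

```python
import difflib


def edit_percentage(ai_draft, final_report):
    """Approximate share of the text changed during review, from 0.0 to 100.0.

    Compares word sequences; 0.0 means the analyst changed nothing.
    """
    matcher = difflib.SequenceMatcher(None, ai_draft.split(), final_report.split())
    return round((1 - matcher.ratio()) * 100, 1)
```

Aggregating this value per incident type over time shows where drafts consistently need heavy revision.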
Feedback Loops:
Implement mechanisms for analysts to flag problematic AI outputs:
- Specific errors (hallucinations, missed context, incorrect causality)
- Incident types where AI performs poorly
- Suggested improvements to templates or training data
Use this feedback to continuously refine the system. AI reporting tools require ongoing maintenance, not one-time deployment.
Audit Trails:
Maintain complete records showing:
- Original AI-generated report versions
- All human edits with timestamps and analyst IDs
- Approval chain for final publication
- Discrepancies between AI and final versions
This documentation proves invaluable for improving the system and demonstrating due diligence during audits or post-incident reviews.
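A minimal audit record for the items above might store a content hash, author, action, and timestamp for each report version. The field names, the `audit_entry` helper, and the sample identifiers below are hypothetical; a production system would also need tamper-resistant storage.

```python
import hashlib
from datetime import datetime, timezone


def audit_entry(report_id, version_text, author, action):
    """Build one audit record for a report version (all field names are assumed)."""
    return {
        "report_id": report_id,
        "author": author,  # model identifier or analyst ID
        "action": action,  # e.g. "generated", "edited", "approved"
        "sha256": hashlib.sha256(version_text.encode("utf-8")).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical trail: AI draft, analyst edit, then approval of the edited text.
trail = [
    audit_entry("IR-0001", "AI draft text", "ai-model", "generated"),
    audit_entry("IR-0001", "Edited text", "analyst-17", "edited"),
    audit_entry("IR-0001", "Edited text", "analyst-03", "approved"),
]
```

Comparing the hashes of the first and last entries shows at a glance whether the approved report differs from the original AI draft.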
Best Practices
Based on Cisco’s experience and broader AI security research, follow these guidelines:
Start Small:
Begin with low-risk, high-volume incident types where mistakes have minimal consequences. Phishing reports, failed login attempts, and routine malware detections make good initial use cases. Gain confidence before expanding to complex incidents.
Maintain Human Expertise:
AI augmentation works only when human analysts possess the expertise to catch AI errors. Don’t reduce analyst training or headcount based on AI efficiency gains. The technology makes skilled analysts more productive, but cannot replace their judgment.
Template Discipline:
AI performs best with consistent, well-structured templates. Document your report formats explicitly, including required sections, data fields, and quality standards. The clearer your structure, the better AI can follow it.
Version Control:
Treat AI prompts, templates, and configurations like code. Use version control, document changes, and maintain rollback capability. Track which AI model versions generated which reports for future auditing.
Transparency:
Be clear about AI involvement in report generation, both internally and to stakeholders who receive reports. Maintain trust by being honest about the technology’s role and limitations.
Continuous Validation:
Regularly audit AI-generated reports even after initial deployment. Select random samples for detailed human review to catch systematic errors that might not appear in normal workflows.
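Random sampling for these audits can be done with the standard library. In this sketch the 10% sampling rate, the function name `sample_for_audit`, and the report ID format are assumptions; the optional seed only makes a given selection reproducible for later verification.

```python
import random


def sample_for_audit(report_ids, rate=0.1, seed=None):
    """Pick a random subset of report IDs for detailed human review.

    Always selects at least one report when any exist.
    """
    if not report_ids:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(report_ids) * rate))
    return sorted(rng.sample(list(report_ids), k))


# Hypothetical month of 50 published reports.
ids = [f"IR-{n:04d}" for n in range(50)]
picked = sample_for_audit(ids, rate=0.1, seed=42)
```

Sampling uniformly across all published reports, rather than only those analysts already flagged, is what allows the audit to surface systematic errors.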
Escalation Procedures:
Define clear criteria for when incidents should bypass AI assistance entirely:
Manual-Only Criteria:
- Nation-state attribution
- Data breach notifications
- Legal/regulatory incident reports
- Novel attack techniques
- Business-critical system compromises
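These criteria can be enforced as a gate in front of the drafting step. The tag names and the `bypass_ai` function below are hypothetical labels for the criteria listed above, assuming incidents are tagged during triage.

```python
# Assumed tag vocabulary corresponding to the manual-only criteria above.
MANUAL_ONLY_TAGS = {
    "nation_state_attribution",
    "data_breach_notification",
    "legal_regulatory_report",
    "novel_technique",
    "business_critical_system",
}


def bypass_ai(incident_tags):
    """Return True if any tag on the incident requires fully manual reporting."""
    return bool(MANUAL_ONLY_TAGS & set(incident_tags))
```

For example, `bypass_ai(["phishing", "novel_technique"])` returns `True`, so that incident would skip AI drafting entirely.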
Key Takeaways
Cisco’s AI security reporting experiment reveals the current state of AI in cybersecurity operations:
AI as Augmentation: The technology works best as an analyst assistant, handling routine tasks and initial drafts while humans provide expertise, context, and judgment.
Quality Control is Essential: Without rigorous human review, AI-generated reports risk missing critical details, introducing inaccuracies, and creating false confidence in incident understanding.
Use Case Matters: AI excels at structured data extraction and formatting but struggles with complex analysis, contextual understanding, and novel situations.
Investment Required: Successful implementation demands significant effort in training, integration, workflow design, and ongoing monitoring. It is not a plug-and-play solution.
Human Expertise Remains Critical: AI makes skilled analysts more productive but cannot replace their expertise. Organizations need to maintain and develop analyst capabilities even as they deploy AI tools.
The promise of AI in security operations is real but bounded. Organizations that recognize both capabilities and limitations will gain the most value from these technologies.