{"@context":"https://schema.org","@type":"CreativeWork","@id":"https://froggit.ai/public/capsules/166294c1-b1f5-49ee-861b-6a6dc22dba68","identifier":"166294c1-b1f5-49ee-861b-6a6dc22dba68","url":"https://froggit.ai/public/capsules/166294c1-b1f5-49ee-861b-6a6dc22dba68","name":"CLAD: Efficient Log Anomaly Detection Directly on Compressed Representations","text":"# CLAD: Efficient Log Anomaly Detection Directly on Compressed Representations\n\nSource-backed public reference for arXiv:2604.13024.\n\n**Authors:** Benzhao Tang, Shiyu Yang\n**Primary source:** https://arxiv.org/abs/2604.13024\n**Published:** 2026-04-14T17:57:01Z\n**Updated:** 2026-04-14T17:57:01Z\n**Categories:** cs.LG, cs.DB\n\n## Abstract Summary\nThe explosive growth of system logs makes streaming compression essential, yet existing log anomaly detection (LAD) methods incur severe pre-processing overhead by requiring full decompression and parsing. We introduce CLAD, the first deep learning framework to perform LAD directly on compressed byte streams. CLAD bypasses these bottlenecks by exploiting a key insight: normal logs compress into regular byte patterns, while anomalies systematically disrupt them. To extract these multi-scale deviations from opaque bytes, we propose a purpose-built architecture integrating a dilated convolutional byte encoder, a hybrid Transformer-mLSTM, and four-way aggregation pooling. This is coupled with a two-stage training strategy of masked pre-training and focal-contrastive fine-tuning to effectively handle severe class imbalance. Evaluated across five datasets, CLAD achieves a state-of-the-art average F1-score of 0.9909 and outperforms the best baseline by 2.72 percentage points. It delivers superior accuracy while completely eliminating decompression and parsing overheads, offering a robust solution that generalizes to structured streaming compressors.\n\n## Public Use Notes\n- This capsule summarizes the paper's arXiv metadata and abstract; it is not an independent replication or endorsement of the paper's claims.\n- Use it as a cited research reference for discovery, retrieval, and agent context.\n- For clinical, security, operational, or deployment-sensitive topics, treat the paper as research context rather than medical, legal, safety, or engineering advice.\n\n## Source\n- https://arxiv.org/abs/2604.13024","keywords":["cs.LG","cs.DB"],"about":[],"citation":["https://arxiv.org/abs/2604.13024"],"isPartOf":{"@type":"Dataset","name":"Forge Cascade Knowledge Graph","url":"https://froggit.ai"},"publisher":{"@type":"Organization","name":"Forge Cascade","url":"https://froggit.ai"},"dateCreated":"2026-04-15T06:00:04.575000Z","dateModified":"2026-06-19T13:48:06Z","isBasedOn":"https://arxiv.org/abs/2604.13024","additionalProperty":[{"@type":"PropertyValue","name":"trust_level","value":40},{"@type":"PropertyValue","name":"verification_status","value":"sources_verified"},{"@type":"PropertyValue","name":"provenance_status","value":"valid"},{"@type":"PropertyValue","name":"evidence_level","value":"primary_source"},{"@type":"PropertyValue","name":"content_hash","value":"9dccd3b67f48c1602277604728f35dab72fadb6424920c0a821d5cfd6fe96de9"}]}