Common Log Formats

Stellar Cyber supports numerous log formats, and combinations of formats, that log sources on the network send to modular sensors. This page presents several of the most common log formats that Stellar Cyber parsers support. Each format has a brief introduction, an example log, and a breakdown of the components of the log, followed by a summary of the differences between the formats. The final section covers how log formats can be combined with each other and with parsing and processing techniques such as regex and Logstash, and includes an example with a breakdown of its components.

The following are the major formats of data logs sent to Stellar Cyber. For each format, there’s an example of the same event so you can more easily spot format differences: a successful user login from 192.168.1.10 to 192.168.1.20 at 12:00:01 PM UTC on August 4, 2024.

CEF (Common Event Format)

CEF is a log format designed for interoperability between different security products and Security Information and Event Management (SIEM) systems. It is structured and easy to parse, with fields like severity, event name, and source/destination IP addresses.
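As a rough illustration of that structure, the following Python sketch parses a CEF line into its seven pipe-separated header fields and its key=value extension. The sample line, including the vendor, product, and event class ID, is hypothetical and is not taken from a Stellar Cyber parser; escape handling is deliberately simplified.

```python
import re

# Hypothetical CEF line for the sample login event. Vendor, product, and
# event class ID are invented for illustration.
SAMPLE = (
    "CEF:0|ExampleVendor|ExampleProduct|1.0|100|User login succeeded|3|"
    "src=192.168.1.10 dst=192.168.1.20 rt=Aug 04 2024 12:00:01 UTC outcome=success"
)

HEADER_FIELDS = [
    "version", "device_vendor", "device_product", "device_version",
    "event_class_id", "name", "severity",
]


def parse_cef(line: str) -> dict:
    """Split a CEF line into its header fields and extension key-value pairs."""
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF line")
    # Split on pipes that are not preceded by a backslash. The header has
    # seven fields, so seven splits leave the extension in one piece.
    parts = re.split(r"(?<!\\)\|", line[len("CEF:"):], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    event = dict(zip(HEADER_FIELDS, parts[:7]))
    extension = parts[7] if len(parts) > 7 else ""
    # A value runs until the next "key=" token or the end of the line, which
    # lets values such as the rt timestamp contain spaces.
    pairs = re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", extension)
    event["extension"] = dict(pairs)
    return event
```

Calling parse_cef(SAMPLE) returns a dictionary in which the header fields sit at the top level and the source and destination addresses sit under the "extension" key.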

Delimited Text Formats (CSV, TSV, and Pipe-Separated Values)

Delimited text formats are simple, text-based methods in which data fields are separated by specific characters. These formats are commonly used for exporting and importing data between applications:

  • Comma-separated values (CSV): Fields are separated by commas.

  • Tab-separated values (TSV): Fields are separated by tabs.

  • Pipe-separated values: Fields are separated by the pipe ( | ) character.

While CSV is the most common, TSV and pipe-separated values are useful alternatives when field values can themselves contain commas or tabs. Other methods of separating values include space-separated values (SSV) and grave accent-separated ( ` ) values.
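The three variants differ only in the delimiter, which the following Python sketch shows by parsing the same record with the standard csv module. The column layout and the user name jdoe are assumptions made for illustration; real log sources define their own column order.

```python
import csv
import io

# Hypothetical column layout and values for the sample login event.
COLUMNS = ["timestamp", "src_ip", "dst_ip", "user", "action", "outcome"]
ROW = ["2024-08-04T12:00:01Z", "192.168.1.10", "192.168.1.20",
       "jdoe", "login", "success"]

# The same record rendered with each delimiter.
SAMPLES = {
    ",": ",".join(ROW),    # CSV
    "\t": "\t".join(ROW),  # TSV
    "|": "|".join(ROW),    # pipe-separated values
}


def parse_delimited(line: str, delimiter: str) -> dict:
    """Parse one delimited record and label its fields with COLUMNS."""
    reader = csv.reader(io.StringIO(line), delimiter=delimiter)
    return dict(zip(COLUMNS, next(reader)))


# All three renderings yield the same dictionary.
records = [parse_delimited(line, delim) for delim, line in SAMPLES.items()]
```

The csv module also handles quoted fields, so a CSV value such as "doe, jane" is kept as one field even though it contains the delimiter.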

Comma-Separated Values

Tab-Separated Values

Pipe-Separated ( | ) Values

JSON (JavaScript Object Notation)

JSON is a lightweight data format that uses key-value pairs within a structured, human-readable text format. It's widely used in web applications and log management systems.
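A minimal Python sketch of reading such a log follows. The field names in the sample are hypothetical, since JSON itself does not prescribe a schema; the flatten helper is likewise an illustrative assumption that turns nested objects into dotted keys for easier searching.

```python
import json

# Hypothetical JSON rendering of the sample login event.
SAMPLE = (
    '{"timestamp": "2024-08-04T12:00:01Z",'
    ' "event": {"action": "login", "outcome": "success"},'
    ' "source": {"ip": "192.168.1.10"},'
    ' "destination": {"ip": "192.168.1.20"}}'
)


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested objects into dotted keys, e.g. {"source.ip": ...}."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


record = flatten(json.loads(SAMPLE))
```

After flattening, the source address is available as record["source.ip"] and the result of the login as record["event.outcome"].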

Key-Value Pairs

This format represents data as a series of key-value pairs, which are easy to parse and search. It is commonly used in configuration files and log data.
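Because there is no single standard for this format, the following Python sketch assumes one common convention: pairs separated by spaces, with double quotes around values that contain spaces. The key names are hypothetical.

```python
import shlex

# Hypothetical key-value rendering of the sample login event.
SAMPLE = (
    'time="2024-08-04 12:00:01 UTC" src=192.168.1.10 dst=192.168.1.20 '
    'action=login outcome=success'
)


def parse_kv(line: str) -> dict:
    """Parse space-separated key=value pairs, honoring double quotes."""
    pairs = {}
    # shlex.split keeps a quoted value together and strips the quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that have no "=" at all
            pairs[key] = value
    return pairs
```

Sources that use a different pair separator or quoting rule need a different tokenizer, which is why key-value logs are often parsed with a regex tailored to the source.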

LEEF (Log Event Extended Format)

LEEF is a log format developed for IBM QRadar. It is similar to CEF but with slight variations in structure and field definitions. There are two versions of LEEF. In LEEF 2.0, the delimiters between key-value pairs are configurable. For more information about LEEF, see LEEF event components.
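The difference between the two versions can be sketched in Python as follows. This assumes the commonly documented layout: five pipe-separated header fields, then tab-separated attributes in LEEF 1.0, with LEEF 2.0 adding a sixth header field that names the attribute delimiter. The sample lines and the user name jdoe are hypothetical, and only a literal single-character delimiter is handled here.

```python
HEADER_FIELDS = ["leef_version", "vendor", "product", "product_version", "event_id"]

# Hypothetical LEEF lines for the sample login event.
SAMPLE_V1 = (
    "LEEF:1.0|ExampleVendor|ExampleProduct|1.0|LoginSuccess|"
    "src=192.168.1.10\tdst=192.168.1.20\tusrName=jdoe"
)
SAMPLE_V2 = (
    "LEEF:2.0|ExampleVendor|ExampleProduct|1.0|LoginSuccess|^|"
    "src=192.168.1.10^dst=192.168.1.20^usrName=jdoe"
)


def parse_leef(line: str) -> dict:
    """Parse a LEEF 1.0 or 2.0 line into header fields and attributes."""
    if not line.startswith("LEEF:"):
        raise ValueError("not a LEEF line")
    body = line[len("LEEF:"):]
    version = body.split("|", 1)[0]
    if version.startswith("2"):
        # Five header fields, the delimiter field, then the attributes.
        parts = body.split("|", 6)
        if len(parts) != 7:
            raise ValueError("incomplete LEEF 2.0 header")
        header, delimiter, attrs = parts[:5], parts[5] or "\t", parts[6]
    else:
        # LEEF 1.0 always separates attributes with a tab.
        parts = body.split("|", 5)
        if len(parts) != 6:
            raise ValueError("incomplete LEEF 1.0 header")
        header, delimiter, attrs = parts[:5], "\t", parts[5]
    event = dict(zip(HEADER_FIELDS, header))
    event["attributes"] = dict(
        pair.split("=", 1) for pair in attrs.split(delimiter) if "=" in pair
    )
    return event
```

Both sample lines parse to the same attributes; only the version field and the delimiter handling differ.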

Syslog RFC 3164

RFC 3164 defines a traditional syslog format that includes mandatory header fields for a priority value, timestamp, and hostname followed by the rest of the message. It is less structured than RFC 5424, because everything after the header is free-form text, and it is often used for general logging purposes. For more information about syslog RFC 3164, see The BSD syslog Protocol.
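The header fields can be pulled apart with a short regex, as in the Python sketch below. The sample line is hypothetical (an sshd-style message with the destination address as the hostname). The priority value encodes facility × 8 + severity, and the RFC 3164 timestamp carries neither a year nor a time zone, so the receiver has to supply both.

```python
import re

# Hypothetical RFC 3164 line for the sample login event.
# <38> = facility 4 (security/authorization) * 8 + severity 6 (informational).
SAMPLE = (
    "<38>Aug  4 12:00:01 192.168.1.20 sshd[1234]: "
    "Accepted password for jdoe from 192.168.1.10 port 51234 ssh2"
)

PATTERN = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    # "Mmm dd hh:mm:ss"; a single-digit day is padded with a space.
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<message>.*)$"
)


def parse_rfc3164(line: str) -> dict:
    """Parse the priority, timestamp, hostname, and message of a syslog line."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("not an RFC 3164 line")
    pri = int(match.group("pri"))
    return {
        "facility": pri // 8,
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "hostname": match.group("hostname"),
        "message": match.group("message"),
    }
```

Everything after the hostname is returned as a single message string; extracting the user name or source address from it requires a further, source-specific pattern.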

Syslog RFC 5424

RFC 5424 is a more modern and structured syslog format, allowing for additional fields and structured data. It’s used for more detailed and flexible logging. For more information about syslog RFC 5424, see The Syslog Protocol.
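A simplified Python sketch of parsing this format follows. The sample line is hypothetical; its structured data element uses a made-up SD-ID, login@32473, built on the enterprise number that the RFC reserves for examples. The regex does not handle escaped characters inside structured data values, so treat it as an illustration and not a complete parser.

```python
import re
from datetime import datetime

# Hypothetical RFC 5424 line for the sample login event.
SAMPLE = (
    "<38>1 2024-08-04T12:00:01Z 192.168.1.20 sshd 1234 LOGIN "
    '[login@32473 src="192.168.1.10" outcome="success"] '
    "Login succeeded for jdoe"
)

PATTERN = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<sd>-|(?:\[.*?\])+)"      # "-" means no structured data
    r"(?: (?P<message>.*))?$"
)


def parse_rfc5424(line: str) -> dict:
    """Parse the RFC 5424 header, structured data parameters, and message."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("not an RFC 5424 line")
    event = match.groupdict()
    pri = int(event.pop("pri"))
    event["facility"], event["severity"] = pri // 8, pri % 8
    # Replace a trailing "Z" so fromisoformat accepts it on older Pythons.
    event["timestamp"] = datetime.fromisoformat(
        event["timestamp"].replace("Z", "+00:00")
    )
    # Collect name="value" parameters from all structured data elements.
    event["sd_params"] = dict(re.findall(r'(\w+)="([^"]*)"', event["sd"]))
    return event
```

Unlike the RFC 3164 timestamp, the timestamp here includes the year and a UTC offset, so it converts directly into a time zone-aware value.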

Summary of Differences

  • CEF, LEEF, and syslog (RFC 3164 and RFC 5424) formats are primarily used in security logging and SIEMs.

  • CSV, TSV, pipe-separated values, and JSON are general-purpose formats, with JSON providing more structure and flexibility.

  • Key-value pairs are simple and versatile but lack a standardized format.

Log Format Combinations

Log formats can be combined with each other and with other techniques for collecting, organizing, parsing, and pattern-matching data, such as sFlow, XML, and regular expressions (regex). Logstash is likewise not a log format; it is a data processing pipeline that ingests and transforms data from various sources and sends it to destinations like Elasticsearch or other databases.
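One combination that can be sketched briefly is a CEF payload carried inside an RFC 3164 syslog envelope, with a regex separating the two layers. The Python below is an illustrative assumption, not a Stellar Cyber parser: the sample line is hypothetical, and the extension parsing assumes values without spaces.

```python
import re

# Hypothetical combined line: RFC 3164 syslog header + CEF payload.
SAMPLE = (
    "<38>Aug  4 12:00:01 192.168.1.20 "
    "CEF:0|ExampleVendor|ExampleProduct|1.0|100|User login succeeded|3|"
    "src=192.168.1.10 dst=192.168.1.20 outcome=success"
)

# The regex peels off the syslog priority and header, then hands over
# everything from "CEF:" onward as the payload.
ENVELOPE = re.compile(r"^<(?P<pri>\d{1,3})>(?P<header>.*?)\s*(?P<payload>CEF:.*)$")

CEF_FIELDS = ["version", "device_vendor", "device_product", "device_version",
              "event_class_id", "name", "severity"]


def parse_syslog_cef(line: str) -> dict:
    """Split a syslog-wrapped CEF line into its syslog and CEF layers."""
    match = ENVELOPE.match(line)
    if match is None:
        raise ValueError("no CEF payload found")
    pri = int(match.group("pri"))
    parts = match.group("payload")[len("CEF:"):].split("|", 7)
    if len(parts) < 8:
        raise ValueError("incomplete CEF payload")
    cef = dict(zip(CEF_FIELDS, parts[:7]))
    # Simplification: assumes extension values contain no spaces.
    cef["extension"] = dict(
        pair.split("=", 1) for pair in parts[7].split() if "=" in pair
    )
    return {
        "syslog": {"facility": pri // 8, "severity": pri % 8,
                   "header": match.group("header")},
        "cef": cef,
    }
```

The result keeps the two layers apart: the "syslog" key holds the transport-level priority and header, and the "cef" key holds the event itself.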