mozlog.structured — Structured logging for test output

mozlog.structured is a library designed for logging the execution and results of test harnesses. The internal data model is a stream of JSON-compatible objects, with one object per log entry. The default output format is line-based, with one JSON object serialized per line.

mozlog.structured is not based on the stdlib logging module, although it shares several concepts with it.

One notable difference between this module and the standard logging module is the way that loggers are created. The structured logging module does not require that loggers with a specific name are singleton objects accessed through a factory function. Instead the StructuredLogger constructor may be used directly. However all loggers with the same name share the same internal state (the “Borg” pattern). In particular the list of handler functions is the same for all loggers with the same name.

Typically, you would only instantiate one logger object per program. Two convenience methods are provided to set and get the default logger in the program.

Logging is threadsafe, with access to handlers protected by a threading.Lock. However it is not process-safe. This means that applications using multiple processes, e.g. via the multiprocessing module, should arrange for all logging to happen in a single process.

Data Format

Structured loggers produce messages in a simple format designed to be compatible with the JSON data model. Each message is a single object, with the type of message indicated by the action key. It is intended that the set of action values be closed; where there are use cases for additional values they should be integrated into this module rather than extended in an ad-hoc way. The set of keys present on on all messages is:

action
The type of the message (string).
time
The timestamp of the message in ms since the epoch (int).
thread
The name of the thread emitting the message (string).
pid
The pid of the process creating the message (int).
source
Name of the logger creating the message (string).

For each action there are is a further set of specific fields describing the details of the event that caused the message to be emitted:

suite_start

Emitted when the testsuite starts running.

tests
A list of test ids. Test ids can either be strings or lists of strings (an example of the latter is reftests where the id has the form [test_url, ref_type, ref_url]) and are assumed to be unique within a given testsuite. In cases where the test list is not known upfront an empty list may be passed (list).
run_info
An optional dictionary describing the properties of the build and test environment. This contains the information provided by mozinfo, plus a boolean debug field indicating whether the build under test is a debug build.
suite_end
Emitted when the testsuite is finished and no more results will be produced.
test_start

Emitted when a test is being started.

test
A unique id for the test (string or list of strings).
test_status

Emitted for a test which has subtests to record the result of a single subtest.

test
The same unique id for the test as in the test_start message.
subtest
Name of the subtest (string).
status
Result of the test (string enum; PASS, FAIL, TIMEOUT, NOTRUN)
expected
Expected result of the test. Omitted if the expected result is the same as the actual result (string enum, same as status).
test_end

Emitted to give the result of a test with no subtests, or the status of the overall file when there are subtests.

test
The same unique id for the test as in the test_start message.
status
Either result of the test (if there are no subtests) in which case (string enum PASS, FAIL, TIMEOUT, CRASH, ASSERT, SKIP) or the status of the overall file where there are subtests (string enum OK, ERROR, TIMEOUT, CRASH, ASSERT, SKIP).
expected
The expected status, or omitted if the expected status matches the actual status (string enum, same as status).
process_output

Output from a managed subprocess.

process pid of the subprocess.

command Command used to launch the subprocess.

data Data output by the subprocess.

log

General human-readable logging message, used to debug the harnesses themselves rather than to provide input to other tools.

level
Level of the log message (string enum CRITICAL, ERROR, WARNING, INFO, DEBUG).
message
Text of the log message.

Testsuite Protocol

When used for testsuites, the following structured logging messages must be emitted:

  • One suite_start message before any test_* messages
  • One test_start message per test that is run
  • One test_status message per subtest that is run. This might be zero if the test type doesn’t have the notion of subtests.
  • One test_end message per test that is run, after the test_start and any test_status messages for that same test.
  • One suite_end message after all test_* messages have been emitted.

The above mandatory events may be interspersed with process_output and log events, as required.

Subtests

The purpose of subtests is to deal with situations where a single test produces more than one result, and the exact details of the number of results is not known ahead of time. For example consider a test harness that loads JavaScript-based tests in a browser. Each url loaded would be a single test, with corresponding test_start and test_end messages. If there can be more than one JS-defined test on a page, however, it it useful to track the results of those tests seperately. Therefore each of those tests is a subtest, and one test_status message must be generated for each subtest result.

Subtests must have a name that is unique within their parent test.

Whether or not a test has subtests changes the meaning of the status property on the test itself. When the test does not have any subtests, this property is the actual test result such as PASS or FAIL . When a test does have subtests, the test itself does not have a result as-such; it isn’t meaningful to describe it as having a PASS result, especially if the subtests did not all pass. Instead this property is used to hold information about whether the test ran without error. If no errors were detected the test must be given the status OK. Otherwise the test may get the status ERROR (for e.g. uncaught JS exceptions), TIMEOUT (if no results were reported in the allowed time) or CRASH (if the test caused the process under test to crash).

StructuredLogger Objects

mozlog.structured.structuredlog.set_default_logger(default_logger)[source]

Sets the default logger to logger.

It can then be retrieved with get_default_logger()

Parameters:default_logger – The logger to set to default.
mozlog.structured.structuredlog.get_default_logger(component=None)[source]

Gets the default logger if available, optionally tagged with component name. Will return None if not yet set

Parameters:component – The component name to tag log messages with
class mozlog.structured.structuredlog.StructuredLogger(name, component=None)[source]
add_handler(handler)[source]

Add a handler to the current logger

critical(*args, **kwargs)

Log a message with level CRITICAL

Parameters:
  • message – The string message to log
  • exc_info – Either a boolean indicating whether to include a traceback derived from sys.exc_info() or a three-item tuple in the same format as sys.exc_info() containing exception information to log.
debug(*args, **kwargs)

Log a message with level DEBUG

Parameters:
  • message – The string message to log
  • exc_info – Either a boolean indicating whether to include a traceback derived from sys.exc_info() or a three-item tuple in the same format as sys.exc_info() containing exception information to log.
error(*args, **kwargs)

Log a message with level ERROR

Parameters:
  • message – The string message to log
  • exc_info – Either a boolean indicating whether to include a traceback derived from sys.exc_info() or a three-item tuple in the same format as sys.exc_info() containing exception information to log.
handlers[source]

A list of handlers that will be called when a message is logged from this logger

info(*args, **kwargs)

Log a message with level INFO

Parameters:
  • message – The string message to log
  • exc_info – Either a boolean indicating whether to include a traceback derived from sys.exc_info() or a three-item tuple in the same format as sys.exc_info() containing exception information to log.
process_output(*args, **kwargs)[source]

Log output from a managed process.

Parameters:
  • process – A unique identifier for the process producing the output (typically the pid)
  • data – The output to log
  • command – A string representing the full command line used to start the process.
remove_handler(handler)[source]

Remove a handler from the current logger

suite_end(*args, **kwargs)[source]

Log a suite_end message

suite_start(*args, **kwargs)[source]

Log a suite_start message

Parameters:tests – List of test identifiers that will be run in the suite.
test_end(*args, **kwargs)[source]

Log a test_end message indicating that a test completed. For tests with subtests this indicates whether the overall test completed without errors. For tests without subtests this indicates the test result directly.

Parameters:
  • test – Identifier of the test that produced the result.
  • status – Status string indicating the test result
  • expected – Status string indicating the expected test result.
  • message – String containing a message associated with the result.
  • stack – a stack trace encountered during test execution.
  • extra – suite-specific data associated with the test result.
test_start(*args, **kwargs)[source]

Log a test_start message

Parameters:test – Identifier of the test that will run.
test_status(*args, **kwargs)[source]

Log a test_status message indicating a subtest result. Tests that do not have subtests are not expected to produce test_status messages.

Parameters:
  • test – Identifier of the test that produced the result.
  • subtest – Name of the subtest.
  • status – Status string indicating the subtest result
  • expected – Status string indicating the expected subtest result.
  • message – String containing a message associated with the result.
  • stack – a stack trace encountered during test execution.
  • extra – suite-specific data associated with the test result.
warning(*args, **kwargs)

Log a message with level WARNING

Parameters:
  • message – The string message to log
  • exc_info – Either a boolean indicating whether to include a traceback derived from sys.exc_info() or a three-item tuple in the same format as sys.exc_info() containing exception information to log.
class mozlog.structured.structuredlog.StructuredLogFileLike(logger, level=u'info', prefix=None)[source]

Wrapper for file-like objects to redirect writes to logger instead. Each call to write becomes a single log entry of type log.

When using this it is important that the callees i.e. the logging handlers do not themselves try to write to the wrapped file as this will cause infinite recursion.

Parameters:
  • loggerStructuredLogger to which to redirect the file write operations.
  • level – log level to use for each write.
  • prefix – String prefix to prepend to each log entry.

Handlers

A handler is a callable that is called for each log message produced and is responsible for handling the processing of that message. The typical example of this is a StreamHandler which takes a log message, invokes a formatter which converts the log to a string, and writes it to a file.

class mozlog.structured.handlers.StreamHandler(stream, formatter)[source]

Handler for writing to a file-like object

Parameters:
  • stream – File-like object to write log messages to
  • formatter – formatter to convert messages to string format
class mozlog.structured.handlers.LogLevelFilter(inner, level)[source]

Handler that filters out messages with action of log and a level lower than some specified level.

Parameters:
  • inner – Handler to use for messages that pass this filter
  • level – Minimum log level to process

Formatters

Formatters are callables that take a log message, and return either a string representation of that message, or None if that message should not appear in the output. This allows formatters to both exclude certain items and create internal buffers of the output so that, for example, a single string might be returned for a test_end message indicating the overall result of the test, including data provided in the test_status messages.

Formatter modules are written so that they can take raw input on stdin and write formatted output on stdout. This allows the formatters to be invoked as part of a command line for post-processing raw log files.

class mozlog.structured.formatters.base.BaseFormatter[source]

Base class for implementing non-trivial formatters.

Subclasses are expected to provide a method for each action type they wish to handle, each taking a single argument for the test data. For example a trivial subclass that just produces the id of each test as it starts might be:

class StartIdFormatter(BaseFormatter);
    def test_start(data):
        #For simplicity in the example pretend the id is always a string
        return data["test"]
class mozlog.structured.formatters.unittest.UnittestFormatter[source]

Formatter designed to produce output in a format like that used by the unittest module in the standard library.

class mozlog.structured.formatters.xunit.XUnitFormatter[source]

Formatter that produces XUnit-style XML output.

The tree is created in-memory so this formatter may be problematic with very large log files.

Note that the data model isn’t a perfect match. In particular XUnit assumes that each test has a unittest-style class name and function name, which isn’t the case for us. The implementation currently replaces path names with something that looks like class names, but this doesn’t work for test types that actually produce class names, or for test types that have multiple components in their test id (e.g. reftests).

class mozlog.structured.formatters.html.HTMLFormatter

Formatter that produces a simple HTML-formatted report.

class mozlog.structured.formatters.machformatter.MachFormatter(start_time=None, write_interval=False, write_times=True, terminal=None, disable_colors=False)[source]
class mozlog.structured.formatters.tbplformatter.TbplFormatter[source]

Formatter that formats logs in the legacy formatting format used by TBPL This is intended to be used to preserve backward compatibility with existing tools hand-parsing this format.

Processing Log Files

The mozlog.structured.reader module provides utilities for working with structured log files.

class mozlog.structured.reader.LogHandler[source]

Base class for objects that act as log handlers. A handler is a callable that takes a log entry as the only argument.

Subclasses are expected to provide a method for each action type they wish to handle, each taking a single argument for the test data. For example a trivial subclass that just produces the id of each test as it starts might be:

class StartIdHandler(LogHandler):
    def test_start(data):
        #For simplicity in the example pretend the id is always a string
        return data["test"]
mozlog.structured.reader.each_log(log_iter, action_map)[source]

Call a callback for each item in an iterable containing structured log entries

Parameters:
  • log_iter – Iterator returning structured log entries
  • action_map – Dictionary mapping action name to callback function. Log items with actions not in this dictionary will be skipped.
mozlog.structured.reader.handle_log(log_iter, handler)[source]

Call a handler for each item in a log, discarding the return value

mozlog.structured.reader.imap_log(log_iter, action_map)[source]

Create an iterator that will invoke a callback per action for each item in a iterable containing structured log entries

Parameters:
  • log_iter – Iterator returning structured log entries
  • action_map – Dictionary mapping action name to callback function. Log items with actions not in this dictionary will be skipped.
mozlog.structured.reader.read(log_f, raise_on_error=False)[source]

Return a generator that will return the entries in a structured log file. Note that the caller must not close the file whilst the generator is still in use.

Parameters:
  • log_f – file-like object containing the raw log entries, one per line
  • raise_on_error – boolean indicating whether ValueError should be raised for lines that cannot be decoded.

Integration with argparse

The mozlog.structured.commandline module provides integration with the argparse module to provide uniform logging-related command line arguments to programs using mozlog.structured. Each known formatter gets a command line argument of the form --log-{name}, which takes the name of a file to log to with that format, or - to indicate stdout.

mozlog.structured.commandline.add_logging_group(parser)[source]

Add logging options to an argparse ArgumentParser or optparse OptionParser.

Each formatter has a corresponding option of the form –log-{name} where {name} is the name of the formatter. The option takes a value which is either a filename or “-” to indicate stdout.

Parameters:parser – The ArgumentParser or OptionParser object that should have logging options added.
mozlog.structured.commandline.setup_handlers(logger, formatters, formatter_options)[source]

Add handlers to the given logger according to the formatters and options provided.

Parameters:
  • logger – The logger configured by this function.
  • formatters – A dict of {formatter, [streams]} to use in handlers.
  • formatter_options – a dict of {formatter: {option: value}} to to use when configuring formatters.
mozlog.structured.commandline.setup_logging(suite, args, defaults=None)[source]

Configure a structuredlogger based on command line arguments.

The created structuredlogger will also be set as the default logger, and can be retrieved with get_default_logger().

Parameters:
  • suite – The name of the testsuite being run
  • args – A dictionary of {argument_name:value} produced from parsing the command line arguments for the application
  • defaults – A dictionary of {formatter name: output stream} to apply when there is no logging supplied on the command line. If this isn’t supplied, reasonable defaults are chosen (coloured mach formatting if stdout is a terminal, or raw logs otherwise).
Return type:

StructuredLogger

Simple Examples

Log to stdout:

from mozlog.structured import structuredlog
from mozlog.structured import handlers, formatters
logger = structuredlog.StructuredLogger("my-test-suite")
logger.add_handler(handlers.StreamHandler(sys.stdout,
                                          formatters.JSONFormatter()))
logger.suite_start(["test-id-1"])
logger.test_start("test-id-1")
logger.info("This is a message with action='LOG' and level='INFO'")
logger.test_status("test-id-1", "subtest-1", "PASS")
logger.test_end("test-id-1", "OK")
logger.suite_end()

Populate an argparse.ArgumentParser with logging options, and create a logger based on the value of those options, defaulting to JSON output on stdout if nothing else is supplied:

import argparse
from mozlog.structured import commandline

parser = argparse.ArgumentParser()
# Here one would populate the parser with other options
commandline.add_logging_group(parser)

args = parser.parse_args()
logger = commandline.setup_logging("testsuite-name", args, {"raw": sys.stdout})

Count the number of tests that timed out in a testsuite:

from mozlog.structured import reader

count = 0

def handle_test_end(data):
    global count
    if data["status"] == "TIMEOUT":
        count += 1

reader.each_log(reader.read("my_test_run.log"),
                {"test_end": handle_test_end})

print count

More Complete Example

This example shows a complete toy testharness set up to used structured logging. It is avaliable as structured_example.py:

import argparse
import sys
import traceback
import types

from mozlog.structured import structuredlog, commandline

class TestAssertion(Exception):
    pass

def assert_equals(a, b):
    if a != b:
        raise TestAssertion("%r not equal to %r" % (a, b))

def expected(status):
    def inner(f):
        def test_func():
            f()
        test_func.__name__ = f.__name__
        test_func._expected = status
        return test_func
    return inner

def test_that_passes():
    assert_equals(1, int("1"))

def test_that_fails():
    assert_equals(1, int("2"))

def test_that_has_an_error():
    assert_equals(2, 1 + "1")

@expected("FAIL")
def test_expected_fail():
    assert_equals(2 + 2, 5)

class TestRunner(object):
    def __init__(self, logger):
        self.logger = logger

    def gather_tests(self):
        for item in globals().itervalues():
            if type(item) == types.FunctionType and item.__name__.startswith("test_"):
                yield item.__name__, item

    def run(self):
        tests = list(self.gather_tests())

        self.logger.suite_start(tests=[name for name, func in tests])
        self.logger.info("Running tests")
        for name, func in tests:
            self.run_test(name, func)
        self.logger.suite_end()

    def run_test(self, name, func):
        self.logger.test_start(name)
        status = None
        message = None
        expected = func._expected if hasattr(func, "_expected") else "PASS"
        try:
            func()
        except TestAssertion as e:
            status = "FAIL"
            message = e.message
        except:
            status = "ERROR"
            message = traceback.format_exc()
        else:
            status = "PASS"
        self.logger.test_end(name, status=status, expected=expected, message=message)

def get_parser():
    parser = argparse.ArgumentParser()
    return parser

def main():
    parser = get_parser()
    commandline.add_logging_group(parser)

    args = parser.parse_args()

    logger = commandline.setup_logging("structured-example", args, {"raw": sys.stdout})

    runner = TestRunner(logger)
    try:
        runner.run()
    except:
        logger.critical("Error during test run:\n%s" % traceback.format_exc())

if __name__ == "__main__":
    main()

Each global function with a name starting test_ represents a test. A passing test returns without throwing. A failing test throws a TestAssertion exception via the assert_equals() function. Throwing anything else is considered an error in the test. There is also a expected() decorator that is used to annotate tests that are expected to do something other than pass.

The main entry point to the test runner is via that main() function. This is responsible for parsing command line arguments, and initiating the test run. Although the test harness itself does not provide any command line arguments, the ArgumentParser object is populated by commandline.add_logging_group(), which provides a generic set of structured logging arguments appropriate to all tools producing structured logging.

The values of these command line arguments are used to create a mozlog.structured.StructuredLogger object populated with the specified handlers and formatters in commandline.setup_logging(). The third argument to this function is the default arguments to use. In this case the default is to output raw (i.e. JSON-formatted) logs to stdout.

The main test harness is provided by the TestRunner class. This class is responsible for scheduling all the tests and logging all the results. It is passed the logger object created from the command line arguments. The run() method starts the test run. Before the run is started it logs a suite_start message containing the id of each test that will run, and after the testrun is done it logs a suite_end message.

Individual tests are run in the run_test() method. For each test this logs a test_start message. It then runs the test and logs a test_end message containing the test name, status, expected status, and any informational message about the reason for the result. In this test harness there are no subtests, so the test_end message has the status of the test and there are no test_status messages.

Example Output

When run without providing any command line options, the raw structured log messages are sent to stdout:

$ python structured_example.py

{"source": "structured-example", "tests": ["test_that_has_an_error", "test_that_fails", "test_expected_fail", "test_that_passes"], "thread": "MainThread", "time": 1401446682787, "action": "suite_start", "pid": 18456}
{"source": "structured-example", "thread": "MainThread", "time": 1401446682787, "action": "log", "message": "Running tests", "level": "INFO", "pid": 18456}
{"source": "structured-example", "test": "test_that_has_an_error", "thread": "MainThread", "time": 1401446682787, "action": "test_start", "pid": 18456}
{"status": "ERROR", "thread": "MainThread", "pid": 18456, "source": "structured-example", "test": "test_that_has_an_error", "time": 1401446682788, "action": "test_end", "message": "Traceback (most recent call last):\n  File \"structured_example.py\", line 61, in run_test\n    func()\n  File \"structured_example.py\", line 31, in test_that_has_an_error\n    assert_equals(2, 1 + \"1\")\nTypeError: unsupported operand type(s) for +: 'int' and 'str'\n", "expected": "PASS"}
{"source": "structured-example", "test": "test_that_fails", "thread": "MainThread", "time": 1401446682788, "action": "test_start", "pid": 18456}
{"status": "FAIL", "thread": "MainThread", "pid": 18456, "source": "structured-example", "test": "test_that_fails", "time": 1401446682788, "action": "test_end", "message": "1 not equal to 2", "expected": "PASS"}
{"source": "structured-example", "test": "test_expected_fail", "thread": "MainThread", "time": 1401446682788, "action": "test_start", "pid": 18456}
{"status": "FAIL", "thread": "MainThread", "pid": 18456, "source": "structured-example", "test": "test_expected_fail", "time": 1401446682788, "action": "test_end", "message": "4 not equal to 5"}
{"source": "structured-example", "test": "test_that_passes", "thread": "MainThread", "time": 1401446682788, "action": "test_start", "pid": 18456}
{"status": "PASS", "source": "structured-example", "test": "test_that_passes", "thread": "MainThread", "time": 1401446682789, "action": "test_end", "pid": 18456}
{"action": "suite_end", "source": "structured-example", "pid": 18456, "thread": "MainThread", "time": 1401446682789}

The structured logging module provides a number of command line options:

$ python structured_example.py --help

usage: structured_example.py [-h] [--log-unittest LOG_UNITTEST]
                             [--log-raw LOG_RAW] [--log-html LOG_HTML]
                             [--log-xunit LOG_XUNIT]
                             [--log-mach LOG_MACH]

optional arguments:
  -h, --help            show this help message and exit

Output Logging:
  Options for logging output. Each option represents a possible logging
  format and takes a filename to write that format to, or '-' to write to
  stdout.

  --log-unittest LOG_UNITTEST
                        Unittest style output
  --log-raw LOG_RAW     Raw structured log messages
  --log-html LOG_HTML   HTML report
  --log-xunit LOG_XUNIT
                        xUnit compatible XML
  --log-mach LOG_MACH   Human-readable output

In order to get human-readable output on stdout and the structured log data to go to the file structured.log, we would run:

$ python structured_example.py --log-mach=- --log-raw=structured.log

0:00.00 SUITE_START: MainThread 4
0:01.00 LOG: MainThread INFO Running tests
0:01.00 TEST_START: MainThread test_that_has_an_error
0:01.00 TEST_END: MainThread Harness status ERROR, expected PASS. Subtests passed 0/0. Unexpected 1
0:01.00 TEST_START: MainThread test_that_fails
0:01.00 TEST_END: MainThread Harness status FAIL, expected PASS. Subtests passed 0/0. Unexpected 1
0:01.00 TEST_START: MainThread test_expected_fail
0:02.00 TEST_END: MainThread Harness status FAIL. Subtests passed 0/0. Unexpected 0
0:02.00 TEST_START: MainThread test_that_passes
0:02.00 TEST_END: MainThread Harness status PASS. Subtests passed 0/0. Unexpected 0
0:02.00 SUITE_END: MainThread