Adventures in

Snarfing and Crunching!

Starring:

Heka

Cloud Services

Cloud Servers

Cloud Data

Many Flavors, Many Tools

  • Log files / (r)syslog(-ng) / Logstash / Splunk / etc.
  • Stats / StatsD / Graphite / Ganglia / etc.
  • CEP / EEP / Riemann / Esper / ArcSight / etc.
  • Monitoring / Nagios / Zenoss / etc.
  • App instrumentation / New Relic / etc.
  • Server metrics / proc fs / etc.
  • etc. / etc. / etc.

One Basic Pattern

  • Get data
  • Transform and/or transport data
  • Deliver data

One Basic Pattern: More Detail

  • Access some stream of bits
  • Identify / split on record boundaries
  • Convert records to common format
  • Route records to appropriate consumers
  • Watch data as it streams
  • Generate new messages in common format
  • Convert from common format to some other format
  • Push bits
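
The steps above can be sketched in a few lines of Go. This is a minimal illustration, not Heka code: the `Message` type, the "type:payload" record layout, and the `decode` and `run` helpers are all hypothetical names invented for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Message is a hypothetical common format (not Heka's actual Message struct).
type Message struct {
	Type    string
	Payload string
}

// decode converts one raw "type:payload" record into the common format.
// The record layout is an assumption made for this sketch.
func decode(record string) Message {
	parts := strings.SplitN(record, ":", 2)
	if len(parts) != 2 {
		return Message{Type: "unknown", Payload: record}
	}
	return Message{Type: parts[0], Payload: parts[1]}
}

// run reads a stream, splits it on newlines, decodes each record, routes it
// by message type, and "delivers" an upper-cased payload to that consumer.
func run(stream string) map[string][]string {
	delivered := make(map[string][]string)
	scanner := bufio.NewScanner(strings.NewReader(stream)) // access the bits
	for scanner.Scan() {                                   // split on record boundaries
		msg := decode(scanner.Text()) // convert to common format
		// Route by type, convert to the output format, and push.
		delivered[msg.Type] = append(delivered[msg.Type], strings.ToUpper(msg.Payload))
	}
	return delivered
}

func main() {
	out := run("log:disk full\nstat:cpu=0.93\nlog:disk ok\n")
	fmt.Println("log consumer:", out["log"])
	fmt.Println("stat consumer:", out["stat"])
}
```

Each stage here is a plain function call; a real pipeline would put the stages behind separate components so they can be swapped and run concurrently.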

Heka to the Rescue!

Packs, Runners, and Channels, Oh My!

  • Protocol buffer Message structs
  • Wrapped in PipelinePack envelopes
  • Plugins run in their own goroutines
  • Packs are passed through the pipeline over channels
  • Plugin lifespan and system interaction handled by PluginRunners
  • Tight coupling of a plugin to a single goroutine helps keep concurrency sane
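
A rough sketch of the one-plugin-per-goroutine idea, using only the standard library. The `pack`, `plugin`, and `runPlugin` names are hypothetical stand-ins chosen for this example; Heka's real PipelinePack and PluginRunner types are richer and have different signatures.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// pack is a hypothetical stand-in for a PipelinePack envelope.
type pack struct {
	payload string
}

// plugin is a hypothetical interface; Heka's real plugin interfaces differ.
type plugin interface {
	process(p *pack)
}

// upcase is a toy plugin that upper-cases the payload.
type upcase struct{}

func (upcase) process(p *pack) { p.payload = strings.ToUpper(p.payload) }

// runPlugin plays the runner role: it owns the plugin's single goroutine,
// feeds it packs from in, forwards them to out, and closes out when in drains.
func runPlugin(pl plugin, in <-chan *pack, out chan<- *pack, wg *sync.WaitGroup) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer close(out)
		for p := range in {
			pl.process(p) // only this goroutine ever touches the plugin
			out <- p
		}
	}()
}

// pipeline pushes payloads through one plugin and collects the results.
func pipeline(payloads []string) []string {
	in := make(chan *pack)
	out := make(chan *pack)
	var wg sync.WaitGroup
	runPlugin(upcase{}, in, out, &wg)

	go func() {
		for _, s := range payloads {
			in <- &pack{payload: s}
		}
		close(in)
	}()

	var results []string
	for p := range out {
		results = append(results, p.payload)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(pipeline([]string{"hello", "heka"}))
}
```

Because the plugin is only ever called from its own goroutine, it needs no locks of its own; all coordination happens on the channels.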

Routing

After decoding, most packs will be handed to Heka's internal router for delivery to any appropriate filter and/or output plugins.

  • Filters and outputs specify a "message matcher"
  • Simple YACC grammar (e.g. "Type == 'MyMessageType'")
  • Supports regular expression matching
  • Very fast
  • Run in their own goroutines, fed via input channels
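
A simplified sketch of matcher-based routing. Here a matcher is just a Go predicate built by hypothetical helpers (`typeEquals`, `payloadMatches`); Heka itself compiles matcher expression strings with its own grammar and runs each matcher in its own goroutine, which this example does not attempt to reproduce.

```go
package main

import (
	"fmt"
	"regexp"
)

// message is a hypothetical, trimmed-down message for this sketch.
type message struct {
	Type    string
	Payload string
}

// matcher stands in for a compiled message matcher expression.
type matcher func(m message) bool

// typeEquals mimics an expression such as Type == 'MyMessageType'.
func typeEquals(t string) matcher {
	return func(m message) bool { return m.Type == t }
}

// payloadMatches mimics a regular expression test against the payload.
func payloadMatches(pattern string) matcher {
	re := regexp.MustCompile(pattern) // compile once, reuse per message
	return func(m message) bool { return re.MatchString(m.Payload) }
}

// consumer is a filter or output with its matcher and input channel.
type consumer struct {
	name  string
	match matcher
	in    chan message
}

// route delivers msg to every consumer whose matcher accepts it and returns
// the names of the consumers that received it. The send blocks if a
// consumer's input channel is full.
func route(msg message, consumers []*consumer) []string {
	var hit []string
	for _, c := range consumers {
		if c.match(msg) {
			c.in <- msg
			hit = append(hit, c.name)
		}
	}
	return hit
}

func main() {
	consumers := []*consumer{
		{name: "my-filter", match: typeEquals("MyMessageType"), in: make(chan message, 1)},
		{name: "error-output", match: payloadMatches(`^ERROR`), in: make(chan message, 1)},
	}
	msg := message{Type: "MyMessageType", Payload: "ERROR disk full"}
	fmt.Println("delivered to:", route(msg, consumers))
}
```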

Sandboxes: Lua

All Heka plugins can be written in Go, but filters, decoders, and encoders can instead have their logic written in Lua.

  • Lua sandboxes limit resource usage
  • Filters support dynamic loading
  • Very fast
  • Turing complete > complex config DSLs
  • Lua Parsing Expression Grammars (LPEG)

Let's See It!

Why Go?

Once we decided we wanted to write this tool, a number of factors contributed to our choice to use Go:

  • Performance
  • Lightweight deployment env
  • Concurrency primitives
  • Risk management

We looked at Rust, but it wasn't (and still isn't) quite ready.

What's Been Awesome?

  • See previous slide
  • Tooling
  • Interfaces
  • Back pressure fail conditions
  • Platform rapidly improving underneath us
  • Static linking
  • CGo
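
One way to read the back pressure point: bounded channels make a slow consumer visible to the producer. The sketch below assumes that reading; `trySend` is a hypothetical helper, not part of Heka.

```go
package main

import "fmt"

// trySend attempts a non-blocking send. A false result means the buffer is
// full, i.e. the consumer has fallen behind and the producer must decide
// whether to wait, drop, or fail.
func trySend(ch chan<- string, msg string) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false
	}
}

func main() {
	queue := make(chan string, 2) // capacity is the back pressure limit
	for _, m := range []string{"a", "b", "c"} {
		if !trySend(queue, m) {
			fmt.Println("queue full, cannot enqueue:", m)
		}
	}
	fmt.Println("queued:", len(queue))
}
```

A plain blocking send (`queue <- m`) gives the other behavior: the producer simply stalls until the consumer catches up.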

What's Been !Awesome?

  • Static linking
  • CGo
    • CGo interactions cause a lot of copying
    • CFLAGS / LDFLAGS directives require full paths
    • No static linking on Windows => test pain
    • Linking is finicky on Windows / OS X
  • Test scoping
    • Can't effectively test from outside of package
    • Can't export test code from within a package
  • Packaging

What's Been !Awesome?

Yes, generics would be nice.

Gotchas

  • Mind the copies
  • Interface > struct
  • Package bloat
  • No guard rails
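
As one possible illustration of "Mind the copies" (an interpretation, since the slide does not elaborate): Go passes structs by value, so a function that mutates its argument only changes a copy unless it receives a pointer. The `counter` type and both functions are hypothetical.

```go
package main

import "fmt"

// counter is a hypothetical struct used only for this illustration.
type counter struct {
	hits int
}

// bumpByValue receives a copy; the caller's counter is unchanged.
func bumpByValue(c counter) {
	c.hits++
}

// bumpByPointer receives a pointer; the caller's counter is updated.
func bumpByPointer(c *counter) {
	c.hits++
}

func main() {
	c := counter{}
	bumpByValue(c)
	fmt.Println("after bumpByValue:", c.hits)
	bumpByPointer(&c)
	fmt.Println("after bumpByPointer:", c.hits)
}
```

The same applies to the loop variable in a `for _, v := range slice` loop: `v` is a copy of each element, so large structs are copied on every iteration and writes to `v` are lost.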

Thanks! Questions?