Fun (or Less Agony) with Splunk Tstats
Getting to Know Tstats
Most of us have heard about how fast Splunk’s tstats command can produce fast searches, but there’s not much in the training materials to help us learn how to use it. SPL is already hard enough, so just the idea of learning tstats syntax can be daunting. After all, who wants to rewrite all of their dashboards and reports after already creating them based on raw search? The truth is, tstats is great but it has a few limitations and can’t be leveraged for every use case.
First, let’s talk about the benefits. Here are the most notable ones:
- It’s super-fast. Tstats doesn’t read or decompress raw event data, which means it skips the process of data extraction by only reading the fields captured in the tsidx files (more on that below). For data models, it will read the accelerated data and fallback to the raw data if accelerated data isn’t available (by default).
- Its syntax is familiar for SQL developers. It might be a bit of a stretch to suggest this, but tstats syntax is more like SQL than searching raw data with the standard search command (“| search” is implied with all searches that don’t have a leading search command).
- It’s streamlined. Much like SQL, the data is selected and aggregated in one query.
The most notable drawbacks are:
- Tstats is limited to indexed fields and data models. That means additional work may be required to create the fastest searches for your data.
- Acceleration isn’t great for data sources with dynamic lookups that change often. An “accelerated” result is merely pre-computed, but if that lookup result changes then your accelerated results might have stale data.
- When you use tstats searches in dashboards, creating drilldowns is more difficult. For every dashboard panel, you have to manually create a search that will utilize any clicked values and embed it within the drilldown XML tags.
- Aggregation functions don’t support eval statements, unlike the regular stats command. This limits the flexibility somewhat, but evals can usually be implemented in another way as a workaround.
The tstats command is most commonly employed for accelerated data models and calculating metrics for your event data.
Writing Tstats Searches
The syntax for tstats takes some practice to get right. If you’re used to SQL, you can think of it like replacing SELECT with “| tstats” and swapping the order of your WHERE and GROUP BY clauses. You’ll want to make sure you specify a WHERE clause with an index to keep the scope of your search as specific as possible. The following fields are indexed by default and can be searched with tstats:
- _time
- _indextime
- source
- sourcetype
- host
- punct
Additional metadata fields that can be used but aren’t part of the tsidx are:
- index
- splunk_server
Syntax (Simplified)
| tstats [stats-function](field) AS renamed-field where [field=value] by field
Example 1: Sourcetypes per Index
Raw search:
index=* OR index=_* | stats count by index, sourcetype
Tstats search:
| tstats count where index=* OR index=_* by index, sourcetype
Example 2: Indexer Data Distribution over 5 Minutes
Raw search:
index=os sourcetype=syslog | stats count by splunk_server
Tstats search:
| tstats count where index=os sourcetype=syslog earliest=-5m by splunk_server
Example 3: CIM Data Model Search – Count of Destination IPs by Source IP
Standard datamodel search:
| datamodel Network_Traffic All_Traffic search | stats dc(All_Traffic.dest_ip) by All_Traffic.src_ip | rename All_Traffic.* AS *
Tstats search:
| tstats dc(All_Traffic.dest_ip) AS dest_ip from datamodel=Network_Traffic by All_Traffic.src_ip | rename All_Traffic.* AS *
If you’re ever confused as to how to turn your data model search into a tstats version, one trick is to recreate the equivalent of your search in the Datasets (Pivot) function. Then, open the Job Inspector to find the tstats command used in the background for your pivot under “Normalized Search.”
Optional Arguments
Like most Splunk commands, there are arguments you can pass to it (see the docs page for a full list). Commonly utilized arguments (set to either true or false) are:
- allow_old_summaries – Allows Splunk to use results that were generated prior to a change of the data model. This option is only applicable to accelerated data model searches.
- summariesonly – As the name implies, this option tells Splunk whether to search summaries or summaries plus raw data. In this context, summaries are synonymous with accelerated data.
- append – This option allows you to chain tstats searches together into a unified search result, similar to the union or append commands.
- prestats – If you’re going to use append, this option is required to be set to true. It changes the output to the “prestats” format, which is used to pass the results into aggregation functions such as chart, stats, or timechart.
Timecharting
What if you need to run a tstats search, but you want to see a trend of your data over time (like timechart)? Have no fear, you can do this by adding _time to your split-by fields with the span argument, and then converting to the format used by timechart. See the following example:
| tstats count where index=* by _time span=1d, index | xyseries _time index count | makecontinuous
Here, we’re using xyseries to convert each value in the index column to its own distinct column with the value of count. For more on xyseries, check out the docs or the Splunk blog entry, Clara-fication: transpose, xyseries, untable, and More. Once that’s done, we use makecontinuous to be sure there are no gaps in the timeline. Voila! You now have the equivalent of timechart.
Chaining Tstats
If you need to take search results from multiple data models and aggregate the results, one way to do so is by using tstats with the append=true option. Whenever you use append, however, you also need prestats. In the following example, we chain two tstats searches together and using stats to aggregate the results:
| tstats prestats=t count from datamodel=Authentication where nodename=Authentication by _time span=15m, nodename | tstats prestats=t append=t count from datamodel=Network_Traffic where nodename=All_Traffic by _time span=15m, nodename | stats count by _time, nodename | xyseries _time nodename count | makecontinuous | fields - *.*
Limitations
Although tstats has come a long way, there are still limitations. My pet peeves are:
- The default dashboard drilldowns are terrible, so you generally have to customize every panel.
- You can’t aggregate _indextime like you can with _time. This would be helpful to look at indexing rates over time.
- You can’t use eval statements within your aggregations, like you can in stats (et al.).
Despite its limitations, tstats is an incredibly useful tool to have in your arsenal. Mastering it can be the difference between a slow UI where users don’t want to use your Splunk dashboards, or supporting multiple users at a time on the same dashboards with lightning-fast responsiveness. It truly elevates the ceiling for how much data you can represent at scale for many concurrent users.
Additional Resources
Searching FAST: How to Start Using tstats and Other Acceleration Techniques (David Vueve, .conf 2017)
Speed up your searches! (Satoshi Kawasaki, .conf 2017)
Fantastic post! Thanks for sharing.
Regarding the limitaitons of tstats, neither can you extract the _raw event as you can do with stats.
e.g stats values(_raw) as “Raw_Event”