Skip to content
Ric (Ryszard) Szopa edited this page Nov 13, 2013 · 1 revision

Query Log

The query log is a realtime HTTP stream of all queries coming into the vttablet server. $ curl -s localhost:6509/debug/vt/querylog | head -n 2000 > query.log

This will collect 2000 lines of queries. Each line is a tab separated list of fields that represent the following info:

  • Method
  • RemoteAddr
  • Username
  • StartTime
  • EndTime
  • TotalTimeSeconds (fractional seconds this query took to execute)
  • PlanType
  • OriginalSql
  • BindVariables (strings are normalized and scrubbed)
  • NumberOfQueries
  • RewrittenSql
  • QuerySources
  • MysqlResponseTimeSeconds (amount of time spent waiting for mysql to process the query)
  • WaitingForConnectionSeconds (amount of time spent waiting for a free connection to mysql)
  • SizeOfResponse
  • CacheHits
  • CacheMisses
  • CacheAbsent
  • CacheInvalidations

This is a superset of what tcpdump could collect. It can also be used to collect slow query info by filtering with awk.

Query Stats

This data is derived from the query plan cache. Every query is normalized and certain stats recorded for the lifetime of the process.

This can return up to 5000 entries (the size of the plan cache) and can end up generating almost 20MB of output depending on the size of the sql query strings.

curl localhost:6509/debug/schema/query_stats > qstats.json

It is JSON array of dict entries. Here is a sample:

{
    "Query": "SELECT id, video_id, status, type, owner_id, policy, origin, originator_id, time_created, time_updated, flags, subtype, status_detail, video_src, name_value_pairs, notes, user_id, composite_license_id, asset_id, composite_license_type FROM ut_content_claim WHERE status IN (:status0,:status1,:status2) AND flags & :flags_mask = :flags_value AND video_id = :video_id AND owner_id is not null",
    "Table": "ut_content_claim",
    "Plan": "PASS_SELECT",
    "QueryCount": 16142625,
    "Time": 16800047536000,
    "RowCount": 9256034,
    "ErrorCount": 0
}
  • Table: the table name extracted from the sql statement - this is the empty string for joins.
  • QueryCount: number of times this plan was executed
  • RowCount: number of rows the plan has returned cumulatively
  • Time: number of nanoseconds spent running this plan cumulatively

By computing a diff over an interval, you can extract average queries per table, average rows returned, etc.

Clone this wiki locally