The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

Distributed SQL Tips and Tricks – August 3, 2020

Welcome to this week’s tips and tricks blog where we recap some distributed SQL questions from around the Internet. We’ll also review upcoming events, new documentation, and blogs that have been published since the last post. Got questions? Make sure to ask them on our YugabyteDB Slack channel, Forum, GitHub, or Stackoverflow. Ok, let’s dive right in:

Is there a way to get “updated_at” timestamps from YCQL without additional columns?

On each INSERT/UPDATE/DELETE YugabyteDB also stores by default the timestamp when the operation occurred. This can be easily looked up by using the WriteTime function. WriteTime returns the timestamp in microseconds when the value was written:

ycqlsh:yb_demo> create table users(id bigint PRIMARY KEY, name TEXT);
ycqlsh:yb_demo> INSERT INTO users(id,name) VALUES (1, 'Billy Johnes');
ycqlsh:yb_demo> SELECT id,name,writetime(name) from users;

 id | name     	| writetime(name)
  1 | Billy Johnes | 1596097449565229

(1 rows)

How to check the per-row on-disk space usage in YugabyteDB?

Data is saved in --fs_data_dirs director(ies), which contains both the transaction logs (WAL) and SSTable files (though you can store WAL files separately using --fs_wal_dirs).

After inserting data, quite likely many of the rows are still in WAL and memtables and not yet flushed to SSTable format on disk. And the WAL files are not in compressed format. To get a better indication, we can either load a lot more data (such that the WAL portion is insignificant) or force a manual flush of the table, and then inspect the size of just the data directories.

After filling a sample table with 1M rows:

CREATE TABLE test(a bigint, b bigint, c bigint, d bigint, PRIMARY KEY((a), b);

We can see the on disk size of the directory where the SSTable files are stored is very small:

$ du -hs yb-data/tserver/data
60K 	yb-data/tserver/data

This confirms that most of the data is still in WAL and memtables.

After forcing a flush of data to disk using:

./bin/yb-admin --master_addresses flush_table <keyspace> <table>

The data directory size is as follows:

$ du -hs yb-data/tserver/data
48M 	yb-data/tserver/data

So approximate on-disk size per row size for the given schema is about 48 bytes. This includes the overheads for the metadata (indexes, bloom filters, internal timestamps, etc.).

How to check if a jsonb field contains a property name in YCQL?

Assuming we have a table with rows below:

CREATE TABLE testjsonb (id text PRIMARY KEY,fields jsonb);
INSERT INTO testjsonb(id,fields)values('3','{"property":"value3"}');
INSERT INTO testjsonb(id,fields)values('1','{"property3":"value3"}');

We can use IF NOT fields->>'property' = null like below:

ycqlsh:yb_demo> select * from testjsonb IF NOT fields->>'property' = null;

 id | fields
  3 | {"property":"value3"}

(1 rows)

How to enable query logging in YSQL?

Query logging can be enabled using the --ysql_log_statement gflag in yb-tserver.

Valid values are: none (off), ddl (only data definition queries, such as create/alter/drop), mod (all modifying/write statements, includes DDLs plus insert/update/delete/trunctate, etc), and all (all statements).

The query logs will show in postgres*.log files in the logs directory.

As an example, we’ll start a cluster with yb-ctl and write test queries:

$ ./bin/yb-ctl start --tserver_flags="ysql_log_statement=all"
$ ./bin/ysqlsh
yugabyte=# create extension if not exists pg_stat_statements;
yugabyte=# SELECT * FROM pg_stat_statements;

And then we can check in the logs directory:

~/yugabyte-data/node-1/disk-1/yb-data/tserver/logs$ ls

And check the recorded queries: (log trimmed below)

~/yugabyte-data/node-1/disk-1/yb-data/tserver/logs$ cat postgresql-2020-07-30_163500.log
I0730 16:35:46.708248 31619] Starting to init ybclient
I0730 16:35:46.708577 31619] New master addresses: []
I0730 16:35:46.709975 31619] Successfully built ybclient
2020-07-30 16:35:56.470 UTC [31612] LOG:  statement: SELECT n.nspname as "Schema",
      c.relname as "Name",
      CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' WHEN 'p' THEN 'table' WHEN 'I' THEN 'index' END as "Type",
      pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
    FROM pg_catalog.pg_class c
     	LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind IN ('r','p','v','m','S','f','')
      	AND n.nspname <> 'pg_catalog'
      	AND n.nspname <> 'information_schema'
      	AND n.nspname !~ '^pg_toast'
      AND pg_catalog.pg_table_is_visible(c.oid)
    ORDER BY 1,2;
2020-07-30 16:36:19.483 UTC [31612] LOG:  statement: create extension if not exists pg_stat_statements;
I0730 16:36:19.483259 31612] Starting thread pool { name: TransactionManager queue_limit: 150 max_workers: 50 }
2020-07-30 16:36:21.787 UTC [31612] LOG:  statement: SELECT * FROM pg_stat_statements;

New Documentation, Blogs, Tutorials, and Videos

New Blogs

New Videos

New and Updated Docs

ICYMI, to support the release of YugabyteDB 2.2, we published these new docs:

Upcoming Events

We Are Hiring!

Yugabyte is growing fast and we’d like you to help us keep the momentum going! All Yugabyte interviews are held virtually, so we can all continue doing our part with social distancing. Although we are based in Silicon Valley, we hire exceptional folks wherever they are. Check out our currently open positions:

Our team consists of domain experts from leading software companies such as Facebook, Oracle, Nutanix, Google, and LinkedIn. We have come a long way in a short time but we cannot rest on our past accomplishments. We need your ideas and skills to make us better at every function that is necessary to create the next great software company. All while having tons of fun and blazing new trails!

Get Started

Ready to start exploring YugabyteDB features? Getting up and running locally on your laptop is fast. Visit our quickstart page to get started.

High performance distributed SQL by Yugabyte

Related Posts