New ClickHouse Altinity Stable Release 19.16.10.44

Welcome to the year 2020! The New Year has arrived, and it is a good time to update your ClickHouse to the latest Altinity Stable version, which we are presenting today. As always, Altinity Stable is a version we know is good and recommend for production use. You can use a newer version in your lab environment to try the latest features.

The previous Altinity Stable version was 19.13.7.57, and a huge amount of work has been done since then: 2744 commits in 528 pull requests from 113 contributors. While the most productive contributors are from the Yandex team (thank you, guys, you’re great!), we are proud to say that several important features were implemented by Altinity or in collaboration between us and Yandex. Please see the detailed release notes below.

Major changes since 19.13.7

Notable new features

  • Tiered storage: multiple storage volumes for MergeTree tables. It’s possible to store fresh data on SSD and automatically move old data to HDD. We already discussed this functionality in our blog.
  • LIVE VIEW tables that we also described in our blog. The support in this release is not complete, but new improvements are coming quickly.  Feel free to try it out.
  • WITH FILL modifier of ORDER BY for data gap filling. It allows you to fill in missing data to produce uniform reports. For example, you can fill in missing dates for a time series so that every day is shown even if some days have no data. This feature is not documented yet, so here is an example of how missing dates can be filled:
SELECT arrayJoin([today()-10, today()]) AS d ORDER BY d ASC WITH FILL 
  • Table constraints that are checked at insert.
  • Sensitive data masking for query_log, server logs, process list with regexp-based rules.
  • Table function input() for reading incoming data in an INSERT SELECT query. A very useful feature when you need to preprocess data just before inserting (see the sketch after this list). We are going to showcase it in our planned articles around log processing use cases.
  • Cascaded materialized views. This is an important feature for many use cases, including Kafka integration. In particular, it allows you to load raw data from Kafka using one materialized view and then aggregate it using another, which was not possible in previous ClickHouse releases.
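
Here is a minimal sketch of the input() table function; the target table, the declared structure, and the transformation are hypothetical:
INSERT INTO target_table
SELECT lower(s) AS s, n * 2 AS n
FROM input('s String, n UInt32')
FORMAT CSV
The data itself is sent after the query in the declared format, for example via clickhouse-client or the HTTP interface.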
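
And here is a minimal sketch of a materialized view cascade on top of Kafka; all table, topic, and column names are hypothetical, and the Kafka settings are only an assumption for illustration:
CREATE TABLE kafka_events (ts DateTime, user_id UInt64)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092', kafka_topic_list = 'events',
         kafka_group_name = 'clickhouse', kafka_format = 'JSONEachRow';

CREATE TABLE events_raw (ts DateTime, user_id UInt64)
ENGINE = MergeTree ORDER BY ts;

CREATE TABLE events_daily (day Date, events UInt64)
ENGINE = SummingMergeTree ORDER BY day;

-- First materialized view: load raw data from Kafka
CREATE MATERIALIZED VIEW events_raw_mv TO events_raw AS
SELECT ts, user_id FROM kafka_events;

-- Second materialized view: aggregate the data produced by the first one (the cascade)
CREATE MATERIALIZED VIEW events_daily_mv TO events_daily AS
SELECT toDate(ts) AS day, count() AS events FROM events_raw GROUP BY day;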

Kafka

Altinity took over support for the Kafka engine a few months ago. Kafka functionality and stability have been improved in this release, in particular:

  • ClickHouse can now act as a Kafka producer: not just read from Kafka, but also send data back to Kafka with an INSERT statement (see the sketch after this list).
  • Atomic parsing of each message: ‘kafka_row_delimiter’ is now obsolete (ignored)
  • More reliable commit logic
  • Virtual columns ‘_partition’ and ‘_timestamp’ for Kafka engine table.
  • Parsing of most of the formats is working properly now
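
A minimal sketch of the producer path mentioned above (broker, topic, and column names are hypothetical): inserting into a Kafka engine table publishes the rows to the configured topic.
CREATE TABLE kafka_out (ts DateTime, message String)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092', kafka_topic_list = 'outgoing',
         kafka_group_name = 'clickhouse_writer', kafka_format = 'JSONEachRow';

INSERT INTO kafka_out VALUES (now(), 'hello from ClickHouse');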

Backward Incompatible Changes

  • count() supports only a single argument
  • Legacy ‘asterisk_left_columns_only’ setting has been removed (it was disabled by default).
  • Numeric values for Enums can now be used directly in ‘IN’ section of the query
  • Changed serialization format of ‘bitmap*’ aggregate function states to improve performance. Serialized states of ‘bitmap*’ from previous versions cannot be read. If you happen to use bitmap aggregate functions, please contact us before upgrading.
  • The ‘type’ column of system.query_log was changed from UInt8 to Enum8
  • ANY RIGHT/FULL/INNER JOIN is disabled by default. Set the ‘any_join_distinct_right_table_keys’ setting to enable them.
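
For example, assuming hypothetical tables t1 and t2 joined by an id column, the legacy behavior can be re-enabled per session:
SET any_join_distinct_right_table_keys = 1;
SELECT * FROM t1 ANY RIGHT JOIN t2 USING (id);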

Known issues

  • In case of topic rebalancing on the Kafka side, duplicates are possible. We are going to address it in 2020/Q1. https://github.com/ClickHouse/ClickHouse/issues/7259
  • SELECT from the mysql() table function is broken in this release. Two workarounds are possible:
    • Use MySQL table engine
    • ‘create table as mysql(…)’ and then select from this table; this is effectively a shortcut to the MySQL table engine (see the sketch after this list)
  • Some constant-like functions can misbehave when queried through Distributed tables: https://github.com/ClickHouse/ClickHouse/issues/7666
  • ‘system reload dictionary’ does not reload dictionaries with ‘update_field’: https://github.com/ClickHouse/ClickHouse/issues/7440
  • Complicated WHERE conditions involving UInt8 fields with values greater than 1 can return unexpected results (workaround: instead of ‘a AND b AND c’ use ‘a <> 0 AND b <> 0 AND c <> 0’): https://github.com/ClickHouse/ClickHouse/issues/7772
  • If you want to preserve the old ANY JOIN behavior while upgrading from a version before 19.13, you may need to install 19.13.7 first, change the ‘any_join_distinct_right_table_keys’ setting there, and only then upgrade to 19.16. However, we recommend reviewing your queries and rewriting them without this join type. In future releases the old behavior will be available again, but under a different name (SEMI JOIN).
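
Regarding the mysql() workaround above, here is a hedged sketch of the second option; the connection parameters and table names are placeholders:
CREATE TABLE mysql_orders AS mysql('mysql-host:3306', 'shop', 'orders', 'mysql_user', 'password');
SELECT count() FROM mysql_orders;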

Aside from the known issues above, we can confirm that 19.16.10.44 is well tested and stable: we have not seen any significant issues while operating it in our environment and in several user installations. We recommend upgrading to this version if you are using an older one.

Please contact us at info@altinity.com if you experience any issues with the upgrade.  Meanwhile, enjoy the new stable release!


Appendix

Curious minds can find some interesting reading in this section. We carefully analyzed all commits and created a comprehensive list of other changes in various parts of the ClickHouse server.

New Formats

  • ORC format.
  • Template/TemplateIgnoreSpaces formats. They allow parsing and generating data in custom text formats, so you can, for example, generate HTML directly from ClickHouse and turn ClickHouse into a web server.
  • CustomSeparated/CustomSeparatedIgnoreSpaces formats. They support custom escaping and delimiter rules (see the sketch after this list).
  • JSONEachRowWithProgress
  • Parse unquoted NULL literal as NULL (enabled by setting ‘format_csv_unquoted_null_literal_as_null’).
  • Initialize null fields with default values if the data type of this field is not nullable (enabled by setting ‘input_format_null_as_default’).
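
As a small, hedged sketch of the CustomSeparated output format mentioned above (the delimiter value is arbitrary, and only the ‘format_custom_field_delimiter’ setting listed in the appendix is used):
SELECT number, toString(number * 2) AS doubled
FROM numbers(3)
SETTINGS format_custom_field_delimiter = ' | '
FORMAT CustomSeparated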

New or improved functions

  • Aggregate function combinators that fill in a null or default value when there is nothing to aggregate: -OrDefault, -OrNull (see the sketch after this list)
  • Introduce uniqCombined64() to get sane results for cardinality > UINT_MAX
  • QuantileExactExclusive and Inclusive aggregate functions
  • hasToken/hasTokenCaseInsensitive (look for the token in string the same way as token_bf secondary index)
  • multiFuzzyMatchAllIndices, multiMatchAllIndices (return the Array of all matched indices in multiMatch family functions)
  • repeat function for strings
  • sigmoid and tanh functions for ML applications
  • Roaring Bitmaps:
    • Changed CRoaring serialization functions (you will not be able to read bitmaps created by earlier versions)
    • bitmapSubsetInRange
    • bitmapMin, bitmapMax
    • bitmapSubsetLimit(bitmap, range_start, limit)
    • groupBitmapAnd, groupBitmapOr, groupBitmapXor
  • geohashesInBox(longitude_min, latitude_min, longitude_max, latitude_max, precision) which creates an array of precision-long strings of geohash boxes covering the provided area.
  • Support for wildcards in paths of table functions file and hdfs. If the path contains wildcards, the table will be readonly
SELECT * FROM hdfs('hdfs://hdfs1:9000/some_dir/another_dir/*/file', 'Parquet', 'col1 String')
  • New function neighbour(value, offset[, default_value]). It allows reaching the previous/next row within a column.
  • Optimized queries with an ORDER BY expressions clause where the expressions share a prefix with the sorting key of a MergeTree table. This optimization is controlled by the ‘optimize_read_in_order’ setting.
  • New functions arraySplit and arrayReverseSplit, which can split an array by “cut off” conditions. They are useful in time sequence handling.
  • Table function values (the name is case-insensitive). It allows creating a table with some data inline.
SELECT * FROM VALUES('a UInt64, s String', (1, 'one'), (2, 'two'), (3, 'three'))
  • fullHostName (alias FQDN)
  • numbers_mt() — multithreaded version of numbers(). 
  • currentUser() (and alias user()), returning login of authorized user.
  • S3 engine and table function. Partial support in this release (no authentication), complete version is expected in 19.18.x and later
  • WITH TIES modifier for LIMIT
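
To illustrate the -OrDefault / -OrNull combinators mentioned at the top of this list: when the filter matches no rows, -OrDefault falls back to the type’s default value and -OrNull returns NULL.
SELECT
    maxOrDefault(number) AS max_or_default, -- 0 when there is nothing to aggregate
    maxOrNull(number) AS max_or_null        -- NULL when there is nothing to aggregate
FROM numbers(10)
WHERE number > 100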

New dictionaries features

  • Redis as an external dictionary source
  • <sparse_hashed> dictionary layout, which is functionally equivalent to the <hashed> layout but more memory efficient. It uses about half as much memory at the cost of slower value retrieval.
  • ‘allow_dictionaries’ user setting that works similarly to ‘allow_databases’.
  • New attributes for the HTTP dictionary source: credentials and http-headers.

Operations / Monitoring

  • system.metric_log table, which stores values of system.events and system.metrics at a specified time interval.
  • system.text_log table, which stores ClickHouse server logs in ClickHouse itself.
  • Support for detached parts removal:
ALTER TABLE <table_name> DROP DETACHED PART '<part_id>'
  • MergeTree now has an additional option ‘ttl_only_drop_parts’ (disabled by default) that avoids partial pruning of parts, so that parts are dropped completely once all rows in them have expired.
  • Added miscellaneous function getMacro(name) that returns a String with the value of the corresponding <macros> entry from the configuration file of the server where the function is executed.
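
A minimal example of getMacro, assuming the server configuration defines <macros> entries named ‘shard’ and ‘replica’ (the names are hypothetical):
SELECT getMacro('shard') AS shard, getMacro('replica') AS replica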

New metrics & events

  • system.metrics: DistributedFilesToInsert, GlobalThread, GlobalThreadActive, LocalThread, LocalThreadActive
  • system.events: Merge

New system tables:

  • system.disks
  • system.storage_policies

New columns in system tables:

  • system.clusters: errors_count, estimated_recovery_time
  • system.collations: language
  • system.parts: disk_name
  • system.parts_columns: disk_name
  • system.processes: os_thread_ids
  • system.query_log: os_thread_ids
  • system.tables: storage_policy

 

system.merge_tree_settings changed/added

Name | Default | Description
concurrent_part_removal_threshold | 100 | Activate concurrent part removal (see ‘max_part_removal_threads’) only if the number of inactive data parts is at least this.
max_part_loading_threads | auto(6) | The number of threads to load data parts at startup.
max_part_removal_threads | auto(6) | The number of threads for concurrent removal of inactive data parts. One is usually enough, but on ‘Google Compute Environment SSD Persistent Disks’ the file removal (unlink) operation is extraordinarily slow and you probably have to increase this number (up to 16 is recommended).
storage_policy | default | Name of the storage disk policy
ttl_only_drop_parts | 0 | Only drop expired parts altogether instead of partially pruning them.

system.settings changed/added

Name | Default | Description
allow_drop_detached | 0 | Allow ALTER TABLE … DROP DETACHED PART[ITION] … queries
allow_experimental_live_view | 0 | Enable LIVE VIEW. Not mature enough.
allow_introspection_functions | 0 | Allow functions for introspection of ELF and DWARF for query profiling. These functions are slow and may impose security considerations.
any_join_distinct_right_table_keys | 0 (was 1) | Enable old ANY JOIN logic with many-to-one left-to-right table keys mapping for all ANY JOINs. It leads to confusing, non-equal results for ‘t1 ANY LEFT JOIN t2’ and ‘t2 ANY RIGHT JOIN t1’. ANY RIGHT JOIN needs one-to-many keys mapping to be consistent with the LEFT one.
asterisk_left_columns_only | removed |
connection_pool_max_wait_ms | 0 | The wait time when the connection pool is full.
default_max_bytes_in_join | 100000000 | Maximum size of the right-side table if a limit is required but max_bytes_in_join is not set.
distributed_directory_monitor_max_sleep_time_ms | 30000 | Maximum sleep time for StorageDistributed DirectoryMonitors; it limits exponential growth too.
distributed_replica_error_cap | 1000 | Max number of errors per replica; prevents piling up an incredible amount of errors if a replica was offline for some time, and allows it to be reconsidered in a shorter amount of time.
distributed_replica_error_half_life | 60 | Time period over which the replica error counter is halved.
format_custom_escaping_rule | Escaped | Field escaping rule (for CustomSeparated format)
format_custom_field_delimiter |  | Delimiter between fields (for CustomSeparated format)
format_custom_result_after_delimiter |  | Suffix after result set (for CustomSeparated format)
format_custom_result_before_delimiter |  | Prefix before result set (for CustomSeparated format)
format_custom_row_after_delimiter |  | Delimiter after the field of the last column (for CustomSeparated format)
format_template_resultset |  | Path to a file which contains the format string for the result set (for Template format)
format_template_row |  | Path to a file which contains the format string for rows (for Template format)
format_template_rows_between_delimiter |  | Delimiter between rows (for Template format)
input_format_csv_unquoted_null_literal_as_null | 0 | Consider unquoted NULL literal as \N
input_format_null_as_default | 0 | For text input formats, initialize null fields with default values if the data type of the field is not nullable
input_format_tsv_empty_as_default | 0 | Treat empty fields in TSV input as default values.
input_format_values_accurate_types_of_literals | 1 | For Values format: when parsing and interpreting expressions using a template, check the actual type of the literal to avoid possible overflow and precision issues.
input_format_values_deduce_templates_of_expressions | 0 | For Values format: if a field could not be parsed by the streaming parser, run the SQL parser, deduce a template of the SQL expression, try to parse all rows using the template, and then interpret the expression for all rows.
joined_subquery_requires_alias | 0 | Force joined subqueries to have aliases for correct name qualification.
kafka_max_wait_ms | 5000 | The wait time for reading from Kafka before retry.
live_view_heartbeat_interval | 15 | The heartbeat interval in seconds to indicate that a live query is alive.
max_http_get_redirects | 0 | Max number of HTTP GET redirect hops allowed. Make sure additional security measures are in place to prevent a malicious server from redirecting your requests to unexpected services.
max_live_view_insert_blocks_before_refresh | 64 | Limit maximum number of inserted blocks after which mergeable blocks are dropped and the query is re-executed.
min_free_disk_space_for_temporary_data | 0 | The minimum disk space to keep while writing temporary data used in external sorting and aggregation.
optimize_read_in_order | 1 | Enable ORDER BY optimization for reading data in the corresponding order in MergeTree tables.
partial_merge_join | 0 | Use partial merge join instead of hash join for LEFT and INNER JOINs.
partial_merge_join_optimizations | 0 | Enable optimizations in partial merge join
partial_merge_join_rows_in_left_blocks | 10000 | Group left-hand joining data in bigger blocks. Setting it to a bigger value increases JOIN performance and memory usage.
partial_merge_join_rows_in_right_blocks | 10000 | Split right-hand joining data into blocks of the specified size. It is a portion of data indexed by min-max values and possibly unloaded on disk.
query_profiler_cpu_time_period_ns | 1000000000 (was 0) | Highly experimental. Period for the CPU clock timer of the query profiler (in nanoseconds). Set to 0 to turn off the CPU clock query profiler. Recommended value is at least 10000000 (100 times a second) for single queries or 1000000000 (once a second) for cluster-wide profiling.
query_profiler_real_time_period_ns | 1000000000 (was 0) | Highly experimental. Period for the real clock timer of the query profiler (in nanoseconds). Set to 0 to turn off the real clock query profiler. Recommended value is at least 10000000 (100 times a second) for single queries or 1000000000 (once a second) for cluster-wide profiling.
replace_running_query_max_wait_ms | 5000 | The wait time for a running query with the same query_id to finish when the ‘replace_running_query’ setting is active.
s3_min_upload_part_size | 536870912 | The minimum size of a part to upload during multipart upload to S3.
temporary_live_view_timeout | 5 | Timeout after which a temporary live view is deleted.