ClickHouse Digest #1

Nov 6, 2017

Activity in and around ClickHouse has increased dramatically in recent weeks. Percona Live in Dublin, meetups in the Bay Area and Berlin, new features, and many new companies adopting ClickHouse have all generated a lot of news and further increased interest in ClickHouse.

In this article I would like to highlight some recent ClickHouse news. Eventually we plan to publish a monthly ClickHouse news digest, so consider this a trial run.

Berlin Meetup Videos

Yandex has published videos from the October meetup in Berlin. Despite the hurricane, a lot of people attended, and there were some really interesting presentations and good offline discussions.

  1. “Introduction to ClickHouse” by Alexey Milovidov, the “father” of ClickHouse.
  2. “Migration to ClickHouse. Practical Guide” by Alexander Zaitsev guides you through the steps and challenges of migrating to ClickHouse from another DBMS.
  3. “Creating marketing funnels or how to calculate complex metrics in ClickHouse” by Yandex analyst Mariya Mansurova demonstrates the power of ClickHouse arrays and advanced SQL features. This is my personal favorite.
  4. “LMP in ClickHouse” is an interesting application of ClickHouse to network analysis tasks.
  5. “Pattern Discovery with ClickHouse” is another very interesting example of working with ClickHouse arrays, presented in a strict mathematical form.
  6. “What’s new in ClickHouse” reviews some recently added ClickHouse features and discusses near-term plans.

New ClickHouse features

It is very encouraging to see features that the ClickHouse community has been awaiting for many months finally land in new ClickHouse releases. The latest releases add a few important ones.

Kafka Engine. Yes, ClickHouse can now be connected to Kafka natively using the Kafka table engine. Thanks to Cloudflare, who contributed to this feature.

Custom partitioning. Originally ClickHouse only allowed monthly partitions. That was a significant inconvenience for short-lived data, and it also made it impossible to partition data differently for non-time-series use cases. Finally, custom partitions are available. See more details in our follow-up post.

The date range has been extended to 2105. Previously, dates beyond 2038 could not be stored, which was an odd limitation. Not anymore.
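To make the difference concrete, here is a minimal conceptual sketch in Python. It is not how ClickHouse implements partitions. It only models the idea that a partition is the set of rows sharing the same value of a partition key. The helper `partition_rows` and the sample rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import date


def partition_rows(rows, key):
    """Group rows into partitions by the value of key(row)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[key(row)].append(row)
    return dict(partitions)


# Hypothetical sample rows: (event_date, event_type)
rows = [
    (date(2017, 10, 30), "click"),
    (date(2017, 10, 31), "view"),
    (date(2017, 11, 1), "click"),
]

# The old behaviour, modelled as a fixed monthly key (year * 100 + month).
monthly = partition_rows(rows, lambda r: r[0].year * 100 + r[0].month)

# Custom keys: one partition per day (handy for short-lived data) ...
daily = partition_rows(rows, lambda r: r[0])

# ... or a key that has nothing to do with time.
by_type = partition_rows(rows, lambda r: r[1])

print(sorted(monthly))   # partition ids for the monthly scheme
print(len(daily))        # number of daily partitions
print(sorted(by_type))   # partition ids for the non-time key
```

With a monthly key the three rows fall into two partitions, while the daily key gives each day its own partition that could be dropped independently.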
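The two years are not arbitrary. One plausible reading, which is our assumption rather than something stated in the release notes, is that they match the limits of 32-bit Unix timestamps: a signed counter of seconds since 1970 runs out in 2038, and an unsigned one in early 2106, which makes 2105 the last complete year. The short Python sketch below only checks that arithmetic; `last_instant` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_instant(max_seconds):
    """Return the latest UTC instant representable with max_seconds since the epoch."""
    return EPOCH + timedelta(seconds=max_seconds)


signed_32 = last_instant(2**31 - 1)    # largest signed 32-bit value
unsigned_32 = last_instant(2**32 - 1)  # largest unsigned 32-bit value

print(signed_32.isoformat())    # 2038-01-19T03:14:07+00:00
print(unsigned_32.isoformat())  # 2106-02-07T06:28:15+00:00
```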

ClickHouse also added basic support for geospatial queries (the pointInPolygon function), support for the CatBoost machine learning library, and other features.
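For readers unfamiliar with point-in-polygon tests, here is a minimal Python sketch of the classic ray casting approach. It illustrates the kind of question such a function answers. It is not ClickHouse's implementation, and the name `point_in_polygon` and the sample square are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray casting test: count how many polygon edges a horizontal ray
    from the point toward +x crosses; an odd count means "inside".

    polygon is a list of (x, y) vertices in order. Points lying exactly
    on the boundary are not treated specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), square))  # True
print(point_in_polygon((5, 2), square))  # False
```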

New community projects

We do not monitor community projects closely, but some cannot be missed.

chproxy is an HTTP proxy for ClickHouse. It has a lot of features, including load balancing, user quotas, response caching, request queueing and so on.

The new release of Tabix adds table transpose and 3D charts, and fixes a lot of issues.

Altinity Demo Appliance

Last but not least is the Altinity Demo Appliance, which has just been released. It evolved from a demo we showed at Percona Live and packages ClickHouse together with the ontime airline performance test dataset, Grafana, Tabix, ProxySQL (did you know that ProxySQL already supports ClickHouse?) and Logstash integration for log parsing. Everything is glued together to help newcomers start using ClickHouse faster. The Altinity Demo Appliance is available as a Docker image or an Amazon image.

Please visit the introduction page for instructions or try it online at http://demo.altinity.com.

Stay tuned.
