Altinity
ClickHouse Leading Service Provider

Blog

Posts tagged Tutorial
Handling Variable Time Series Efficiently in ClickHouse

May 23, 2019

ClickHouse offers incredible flexibility to solve almost any business problem in multiple ways. Schema design plays a major role in this. For our recent benchmarking using the Time Series Benchmark Suite (TSBS) we replicated the TimescaleDB schema in order to have fair comparisons. In that design every metric is stored in a separate column. This is the best layout for ClickHouse from a performance perspective, as it takes full advantage of column storage and type specialization.

Sometimes, however, the schema is not known in advance, or time series data from multiple device types needs to be stored in the same table. Having a separate column per metric may not be very convenient, hence a different approach is required. In this article we discuss multiple ways to design a schema for time series, and do some benchmarking to validate each approach.

Altinity ClickHouse Operator for Kubernetes

Apr 9, 2019

When I was setting up my first ClickHouse clusters 3 years ago, it was like a journey to an unknown world full of caveats. ClickHouse is very simple and easy to use, but not THAT simple. Sometimes I dreamed that setting up the cluster would be as easy as making a cup of coffee. It took us a while to find the right approach, but finally our dreams came true. Today, we are happy to introduce the ClickHouse operator for Kubernetes!

A Magical Mystery Tour of the LowCardinality Data Type

Mar 27, 2019

Many ClickHouse features, like the LowCardinality data type, seem mysterious to new users. ClickHouse often deviates from standard SQL, and many of its data types and operations do not even exist in other data warehouses. The key to understanding is that the ClickHouse engineering team values speed more than almost any other property. Mysterious SQL expressions often turn out to be 'secret weapons' to achieve unmatched speed.

In fact, the LowCardinality data type is an example of just such a feature. It has been available since Q4 2018 and was marked as production ready in Feb 2019, but it is still not documented, magically appearing only in some documentation examples. In this article we will fill the gap by explaining how LowCardinality works and when it should be used.

Using ODBC with Clickhouse

Sep 20, 2018
This article shows several ways to use ClickHouse together with other data sources, so that queries benefit from all of ClickHouse's optimization features and return results faster. This approach is also good practice when some of your infrastructure is already linked to other data sources or tools that support ODBC.

Circular Replication Cluster Topology in ClickHouse

May 10, 2018
In some cases, a distributed cluster with replication is needed, but there are not enough servers to place every replica on a separate node. It is then better to place multiple replicas on the same nodes, configured in a special way that allows queries to keep executing even if a node fails. This kind of replication configuration can be found in various distributed systems, where it is often referred to as ‘circular’ or ‘ring’ replication. In this article, we will discuss how to set up circular replication in ClickHouse.

ClickHouse for Machine Learning

Jan 18, 2018
ClickHouse is very flexible and can be used for various use cases. One of the most interesting technology areas now is machine learning, and ClickHouse fits nicely there as a very fast data source. A few months ago the ClickHouse team implemented support for ML algorithms, which makes it much easier and faster to run ML over ClickHouse data. They started with the open source Yandex CatBoost algorithm, but it can be extended with other algorithms in the future. In this article, we present a tutorial on how ClickHouse can be used to run CatBoost models.

Logstash with ClickHouse

Dec 18, 2017  
There are many cases where ClickHouse is a good or even the best solution for storing analytics data. One common example is web server log processing. In this article, we guide you through an Nginx web server example, but it is applicable to other web servers as well.

We will use Logstash with ClickHouse in order to process web logs. Logstash is a commonly used tool for parsing different kinds of logs and storing them somewhere else, for example in ClickHouse. This solution is part of the Altinity Demo Appliance.

Migration to ClickHouse

Oct 23, 2017   
ClickHouse is an excellent analytics database choice not just for startups but also for companies that have already invested a significant amount of resources into their analytics solutions but are not completely satisfied with the results. In this article we will discuss how and when companies consider a ClickHouse migration project, and what challenges they may expect. We do not disclose any names, but every example has a real-world prototype.
