Mafiree logo
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
Schedule a Call
Menu
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
  • Schedule a Call
Database
Database Database Managed Services
MySQL MySQL
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB MongoDB
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL PostgreSQL
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
SQL Server MSSQL
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Aerospike
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
Analytics DB
Analytics DB Analytics DB Services
Clickhouse Clickhouse
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB TiDB
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB ScyllaDB
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
DevOps
DevOps DevOps Services
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Networking
Networking Networking Services
Basic Basic
Advanced Advanced
MySQL MySQL
MongoDB MongoDB
PostgreSQL PostgreSQL
MSSQL MSSQL
Aerospike Aerospike
Clickhouse Clickhouse
TiDB TiDB
ScyllaDB ScyllaDB
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Basic Basic
Advanced Advanced
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
  1. Home
  2. > Blogs
  3. > Clickhouse
  4. > From Arrays to UUIDs: ClickHouse Functions That Will Surprise You!

From Arrays to UUIDs: ClickHouse Functions That Will Surprise You!

Unlock the full potential of ClickHouse with these must-know functions. From data transformation to advanced analytics, elevate your database skills today!

sukan March 23, 2026

Subscribe for email updates

Introduction

Ready to take your data analytics to the next level? Let's explore how ClickHouse's powerful functions and flexible data formats can transform the way you work with massive datasets. From the ClickHouse addDays function for date arithmetic to the ClickHouse generateUUIDv4 function for unique identifier generation, we'll cover functions that simplify complex queries and optimize storage — making data insights more accessible than ever.

If you're new to ClickHouse, learn more about ClickHouse for analytics before diving into these advanced functions.

 

Clickhouse Functions Quick-Reference: From addDays to generateUUIDv4

Function CategoryFunctions CoveredUse Case
ArrayarrayMap, groupArray, argMax/argMinData transformation
Date & TimeaddDays(), INTERVAL, toStartOfYearTime-based analytics
UUIDgenerateUUIDv4(), generateUUIDv7()Unique ID generation
Windowrow_number, runningDifferenceTrend analysis
Aggregatequantile, stddevPop, sumIfStatistical analysis

 

List of Functions 

ClickHouse provides a wealth of functions to work with, each enhancing how you interact with data. Here are some of the most practical functions to streamline your data operations:

 

  • Array Functions
    • Array map()
    • GroupArray()
    •  argMax(), argMin()
  • Window Functions
    • row_number
    • runningDifference()
  • DateType Functions
    • toStartOf [Year/Quater/Month/Day] 
    • addDays()
    • toDateTime()
  • Aggregate Functions
    • Quantile 
    • Standard Deviation
    • Aggregate Function Combinators
      • if
      • Array
  • Full Text Search
    • match()
  • UUID Functions
    • generate UUIDv4()
  • Visual Representation Function
    • bar()
  • User Defined Function (UDF)
    • Executable UDF
  • Readable Formatting Function
    • formatReadableSize
    • formatReadableDecimalSize
    • formatReadableQuantity
    • formatReadableTimeDelta

Array functions 

Array map ()

The arrayMap function in ClickHouse is a powerful tool for transforming arrays. It applies a given expression to each element of an input array, returning a new array with the results. 

Syntax

arrayMap(f, array)

f : The expression or function to apply to each element.

Array: The input array.

clickhouse_node1 :) select arrayMap(x ->(x+1),[1,2,3]) as result ; 

Query id: 7972e9a0-5eca-4489-a6ee-efde1fec3bea  

    ┌─result──┐ 
 1. │ [2,3,4] │   
    └─────────┘ 
    
 1 row in set. Elapsed: 0.003 sec. 

In this example, the arrayMap function is used to increment each element in the input array [1, 2, 3] by 1. x -> x + 1 is a lambda function, where each x represents an element in the array.

groupArray(): 

The groupArray function in ClickHouse is used to create an array of values for each group in a query. It collects the values of a specified column into an array, which can be particularly useful for aggregating data.

Syntax:

 groupArray(x) or groupArray(max_size)(x)

 x : column or value to be aggregated

max_size :  limits the size of the resulting array

 

clickhouse_node1 :) SELECT *FROM player_list

Query id: 780949e9-8a15-4320-a9bf-9caceede549d

   ┌─team───┬─player──┬─salary─┬─position───┐
1. │ Team A │ Alice   │  50000 │ Forward    │
2. │ Team A │ Bob     │  60000 │ Midfielder │
3. │ Team B │ Charlie │  70000 │ Defender   │
4. │ Team B │ David   │  55000 │ Goalkeeper │
5. │ Team A │ Eve     │  45000 │ Midfielder │
6. │ Team B │ Frank   │  65000 │ Forward    │
   └────────┴─────────┴────────┴────────────┘

6 rows in set. Elapsed: 0.033 sec. 
clickhouse_node1 :) SELECT *
FROM player_list

Query id: 780949e9-8a15-4320-a9bf-9caceede549d

   ┌─team───┬─player──┬─salary─┬─position───┐
1. │ Team A │ Alice   │  50000 │ Forward    │
2. │ Team A │ Bob     │  60000 │ Midfielder │
3. │ Team B │ Charlie │  70000 │ Defender   │
4. │ Team B │ David   │  55000 │ Goalkeeper │
5. │ Team A │ Eve     │  45000 │ Midfielder │
6. │ Team B │ Frank   │  65000 │ Forward    │
   └────────┴─────────┴────────┴────────────┘

6 rows in set. Elapsed: 0.033 sec. 

Explanation:

  1. groupArray(actions):
    • Aggregates all actions for a user on a specific day into an array.
  2. Nested Array:
    • Using (timestamp, action) pairs ensures the actions are associated with their timestamps.
  3. arraySort:
    • Sorts the array by the timestamp to ensure the actions are in chronological order.

 

argMax(), argMin(): 

The argMax function in ClickHouse is an aggregate function that returns the value of a specified expression corresponding to the maximum value of another expression. This is particularly useful for retrieving a related value based on a maximum condition.

Syntax:

argMax(value, order)

argMin(Value,order)

clickhouse_node1 :)SELECT * FROM sale

Query id: 5c408b13-76ed-47ec-ab65-c21ad9fad3ff

   ┌─product───┬─revenue─┬───────date─┐
1. │ Product A │     150 │ 2024-01-10 │
2. │ Product B │     200 │ 2024-01-15 │
3. │ Product A │     300 │ 2024-01-20 │
4. │ Product C │     250 │ 2024-01-22 │
5. │ Product B │     100 │ 2024-01-25 │
   └───────────┴─────────┴────────────┘

5 rows in set. Elapsed: 0.013 sec. 

 

argMax()

clickhouse_node1 :) SELECT
    	argMax(product, revenue) AS highest_price,
  		max(revenue)
		FROM sale

Query id: ea05eab3-3120-44a0-9e74-89689dd412c1

   ┌─highest_price ─┬─max(revenue)─┐
1. │ Product A      │          300 │
   └────────────────┴──────────────┘

1 row in set. Elapsed: 0.027 sec. 

argMin()

clickhouse_node1 :) SELECT
    argMin(product, revenue) AS lowest_price,
    min(revenue)
FROM sale

Query id: 3f85880b-e368-4695-aa15-70d2a4298dad

   ┌─lowest_price ─┬─min(revenue)─┐
1. │ Product B     │          100 │
   └───────────────┴──────────────┘

1 row in set. Elapsed: 0.008 sec. 

 

Window Functions:

 

row_number:

The rowNumber function in ClickHouse (and SQL in general) is used for several purposes, primarily related to ranking and ordering data within partitions of a result set. 

Syntax 

rowNumber() OVER (PARTITION BY partition_expression ORDER BY   

orde_expression )

clickhouse_node1 :) SELECT * FROM player_list

Query id: 4410fb11-d5c4-48f8-bb83-27f870bbb37c

   ┌─team─────┬─player────────┬───salary─┬─position─┐
1. │ Lakers   │ LeBron James  │ 45000000 │ Forward  │
2. │ Lakers   │ Anthony Davis │ 38000000 │ Center   │
3. │ Warriors │ Stephen Curry │ 48000000 │ Guard    │
4. │ Warriors │ Klay Thompson │ 38000000 │ Guard    │
5. │ Nets     │ Kevin Durant  │ 42000000 │ Forward  │
6. │ Nets     │ Kyrie Irving  │ 37000000 │ Guard    │
7. │ Heat     │ Jimmy Butler  │ 40000000 │ Forward  │
8. │ Heat     │ Bam Adebayo   │ 32000000 │ Center   │
   └──────────┴───────────────┴──────────┴──────────┘

8 rows in set. Elapsed: 0.001 sec. 

clickhouse_node1 :) SELECT
    player,
    salary,
    row_number() OVER (ORDER BY salary DESC) AS top_salaries
FROM player_list
LIMIT 5

Query id: cafa4ce3-c11b-4621-9d35-9648c82f93cf

   ┌─player────────┬───salary─┬─top_salaries─┐
1. │ Stephen Curry │ 48000000 │            1 │
2. │ LeBron James  │ 45000000 │            2 │
3. │ Kevin Durant  │ 42000000 │            3 │
4. │ Jimmy Butler  │ 40000000 │            4 │
5. │ Anthony Davis │ 38000000 │            5 │
   └───────────────┴──────────┴──────────────┘

5 rows in set. Elapsed: 0.001 sec. 

 

runningDifference(): 

The runningDifference function in ClickHouse calculates the difference between the current value and the previous value in a specified column. This function is particularly useful for analyzing trends over time, such as changes in sales, stock prices, or other numerical data.

 

Syntax:

runningDifference(value)

clickhouse_node1 :) SELECT * FROM stock_prices

Query id: 49047995-2b4f-4173-8bf8-ca47327159d1

   ┌─stock_name─┬───────date─┬─closing_price─┐
1. │ AAPL       │ 2024-11-25 │         150.5 │
2. │ AAPL       │ 2024-11-26 │           153 │
3. │ AAPL       │ 2024-11-27 │         151.5 │
4. │ AAPL       │ 2024-11-28 │           154 │
5. │ GOOG       │ 2024-11-25 │          2800 │
6. │ GOOG       │ 2024-11-26 │          2825 │
7. │ GOOG       │ 2024-11-27 │          2810 │
8. │ GOOG       │ 2024-11-28 │          2830 │
   └────────────┴────────────┴───────────────┘

8 rows in set. Elapsed: 0.002 sec. 

clickhouse_node1 :) SELECT
    stock_name,
    date,
    closing_price,
    runningDifference(closing_price) AS daily_change
FROM stock_prices
ORDER BY
    stock_name ASC,
    date ASC

Query id: d84ead08-87fe-4388-94e9-619c001ac01a

   ┌─stock_name─┬───────date─┬─closing_price─┬─daily_change─┐
1. │ AAPL       │ 2024-11-25 │         150.5 │            0 │
2. │ AAPL       │ 2024-11-26 │           153 │          2.5 │
3. │ AAPL       │ 2024-11-27 │         151.5 │         -1.5 │
4. │ AAPL       │ 2024-11-28 │           154 │          2.5 │
5. │ GOOG       │ 2024-11-25 │          2800 │         2646 │
6. │ GOOG       │ 2024-11-26 │          2825 │           25 │
7. │ GOOG       │ 2024-11-27 │          2810 │          -15 │
8. │ GOOG       │ 2024-11-28 │          2830 │           20 │
   └────────────┴────────────┴───────────────┴──────────────┘

8 rows in set. Elapsed: 0.002 sec. 

Explanation :

  1. runningDifference(closing_price):
    • Computes the difference between each row and its preceding row for the closing_price column.
    • For the first row of each stock, the difference is 0 (or undefined, depending on interpretation).
  2. Grouped by stock_name:
    • The table is ordered by stock_name and date so that differences are calculated correctly for each stock.

 

Data Type-Specific Functions:

 

toStartOfYear

The toStartOfYear() function in ClickHouse simplifies working with dates by resetting a DateTime value to the very start of its year. It transforms any given date or time into January 1st at 00:00:00 of that year, making it easier to standardize and analyze yearly data trends.

 

Syntax: 

toStartOfYear(date)

 

Let's calculate the total annual sales of a shop.

clickhouse_node1 :) SELECT
    toStartOfYear(date) AS year,
    sum(price) AS total_sale,
    bar(total_sale, 0, 100000, 100) AS graph
FROM product
GROUP BY year

Query id: 8deaff18-92c9-4c8c-a277-a3d7d630f5e7

   ┌───────year─┬─total_sale─┬─graph────────────────────────────────────┐
1. │ 2021-01-01 │      30800 │ ██████████████████████████████▊          │
2. │ 2022-01-01 │      22000 │ ██████████████████████                   │
3. │ 2023-01-01 │      39600 │ ███████████████████████████████████████▌ │
4. │ 2024-01-01 │      23900 │ ███████████████████████▉                 │
   └────────────┴────────────┴──────────────────────────────────────────┘

4 rows in set. Elapsed: 0.005 sec. 

The query summarizes the shop's annual sales over four years, showing fluctuations in revenue.

 

addDays()

The addDays() function in ClickHouse is used to add a specified number of days to a given Date or DateTime value.

 

Syntax: 

addDays(Date, n)

n - no. of days to be added 

 

clickhouse_node1 :) SELECT
    order_date,
    product_id,
    destination,
    addDays(order_date, 5) AS estimated_delivery_time
FROM orders

Query id: 69aeab17-b1b6-4847-83cc-ddf62c548d9b

    ┌──────────order_date─┬─product_id─┬─destination───┬─estimated_delivery_time─┐
 1. │ 2024-11-01 10:00:00 │ P123       │ New York      │     2024-11-06 10:00:00 │
 2. │ 2024-11-02 14:30:00 │ P456       │ Los Angeles   │     2024-11-07 14:30:00 │
 3. │ 2024-11-03 09:15:00 │ P789       │ Chicago       │     2024-11-08 09:15:00 │
 4. │ 2024-11-04 11:45:00 │ P101       │ San Francisco │     2024-11-09 11:45:00 │
 5. │ 2024-11-05 16:00:00 │ P202       │ Houston       │     2024-11-10 16:00:00 │
 6. │ 2024-11-06 13:20:00 │ P303       │ Boston        │     2024-11-11 13:20:00 │
 7. │ 2024-11-07 08:00:00 │ P404       │ Dallas        │     2024-11-12 08:00:00 │
 8. │ 2024-11-08 17:30:00 │ P505       │ Seattle       │     2024-11-13 17:30:00 │
 9. │ 2024-11-09 12:00:00 │ P606       │ Miami         │     2024-11-14 12:00:00 │
10. │ 2024-11-10 15:45:00 │ P707       │ Denver        │     2024-11-15 15:45:00 │
    └─────────────────────┴────────────┴───────────────┴─────────────────────────┘

10 rows in set. Elapsed: 0.006 sec. 

The query successfully calculates the estimated delivery time for each order by adding 5 days to the order_date. The resulting output shows that the estimated_delivery_time for each order is exactly 5 days after the order_date, which aligns with the intended functionality of the addDays() function.

 

INTERVAL Syntax: ClickHouse INTERVAL 60 DAY and Beyond

In addition to functions like addDays(), ClickHouse supports the SQL-standard INTERVAL keyword for date and time arithmetic. This is especially useful for filtering recent data or calculating date offsets directly in queries.

Syntax:

-- Add 60 days to current date
SELECT now() + INTERVAL 60 DAY AS future_date;

-- Filter orders from the last 60 days
SELECT order_id, order_date, product_id
FROM orders
WHERE order_date >= now() - INTERVAL 60 DAY;

-- Combine with other INTERVAL units
SELECT
    now() + INTERVAL 2 MONTH AS two_months_later,
    now() - INTERVAL 1 YEAR AS one_year_ago,
    now() + INTERVAL 60 DAY AS sixty_days_later;

The ClickHouse INTERVAL 60 DAY syntax uses singular unit keywords (DAY, not DAYS). Supported units include SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, and YEAR.

When to use INTERVAL vs addDays(): Use INTERVAL for static, readable date offsets in WHERE clauses and simple expressions. Use addDays() when you need to add a dynamic number of days from a column value.

Clickhouse date functions

 

timeDiff()

The timeDiff() function in ClickHouse calculates the difference between two DateTime values. This makes it highly flexible and useful for time-based analytics, such as calculating the time gap between events or analyzing the duration of processes.

Returns the difference between two date or dateTime in seconds

 

Syntax : 

timeDiff(dateTime1, dateTime2)

 

clickhouse_node1 :) SELECT
    event_id,
    start_time,
    end_time,
    timeDiff(start_time, end_time) AS time_diff_in_seconds
FROM event_logs

Query id: bd58660f-0dc7-4ba7-91ac-f3f4ce00633c

    ┌─event_id─┬──────────start_time─┬────────────end_time─┬─time_diff_in_seconds─┐
 1. │        1 │ 2024-11-28 08:00:00 │ 2024-11-28 10:30:00 │                 9000 │
 2. │        2 │ 2024-11-28 09:15:00 │ 2024-11-28 11:45:00 │                 9000 │
 3. │        3 │ 2024-11-28 14:00:00 │ 2024-11-28 18:00:00 │                14400 │
 4. │        4 │ 2024-11-28 12:00:00 │ 2024-11-28 13:45:00 │                 6300 │
 5. │        5 │ 2024-11-28 07:30:00 │ 2024-11-28 10:00:00 │                 9000 │
 6. │        6 │ 2024-11-28 20:15:00 │ 2024-11-29 00:45:00 │                16200 │
 7. │        7 │ 2024-11-28 06:00:00 │ 2024-11-28 07:30:00 │                 5400 │
 8. │        8 │ 2024-11-28 16:30:00 │ 2024-11-28 19:00:00 │                 9000 │
 9. │        9 │ 2024-11-28 21:00:00 │ 2024-11-29 01:00:00 │                14400 │
10. │       10 │ 2024-11-28 23:45:00 │ 2024-11-29 03:15:00 │                12600 │
    └──────────┴─────────────────────┴─────────────────────┴──────────────────────┘

10 rows in set. Elapsed: 0.001 sec. 

Aggregate Functions: 

Quantile : 

The quantile function is an aggregate function in ClickHouse used to calculate percentile values for a dataset. It identifies a value below which a certain percentage of observations in a group falls. For instance, the 0.5 quantile (50th percentile) is the median of the data.

 

Clickhouse_node1 :) SELECT
    server_id,
    quantile(0.5)(response_time_ms) AS median_response_time,
    quantile(0.9)(response_time_ms) AS p90_response_time,
    quantile(0.99)(response_time_ms) AS p99_response_time
FROM web_server_logs
GROUP BY server_id

Query id: 881778e5-19dd-4703-97d0-70a3662818c8

   ┌─server_id─┬─median_response_time─┬─p90_response_time─┬─p99_response_time─┐
1. │ server_1  │                  110 │               228 │             292.8 │
2. │ server_3  │                  220 │               252 │             259.2 │
3. │ server_2  │                   55 │                66 │ 69.60000000000001 │
   └───────────┴──────────────────────┴───────────────────┴───────────────────┘

3 rows in set. Elapsed: 0.645 sec. 

  

Explanation:

  • median_response_time: 50% of requests take this time or less.
  • p90_response_time: 90% of requests take this time or less (indicates performance for most users).
  • p99_response_time: 99% of requests take this time or less (indicates performance for almost all users).

Standard Deviation :

Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a dataset. A low standard deviation indicates that data points are close to the mean, whereas a high standard deviation suggests a wider spread of values.

In ClickHouse, stddevPop computes the population standard deviation, while stddevSamp computes the sample standard deviation.

stddevPop returns the square root of variance pop , and stddevSamp returns the square root of sample variance

Syntax

  • Population Standard Deviation (stddevPop):

stddevPop(column_name)

 

  • Sample Standard Deviation (stddevSamp):

stddevSamp(column_name)

Lets see an example for stddev

clickhouse_node1 :) SELECT
    server_id,
    avg(response_time_ms) AS average_latency,
    stddevPop(response_time_ms) AS stddevPop_latency --population standard deviation 
FROM server_latency
GROUP BY server_id

Query id: f64087f4-ed13-43f7-8c0d-0aa97777e8a9

   ┌─server_id─┬─average_latency─┬─stddev_latency─┐
1. │ server_1  │             114 │       20.59126 │
2. │ server_3  │             212 │       17.20465 │
3. │ server_2  │              60 │       7.071068 │
   └───────────┴─────────────────┴────────────────┘

3 rows in set. Elapsed: 0.031 sec. 

Clickhouse_node1 :) SELECT
    server_id,
    avg(response_time_ms) AS average_latency,
    stddevSamp(response_time_ms) AS stddevSamp_latency --sample standard deviation
FROM server_latency
GROUP BY server_id


Query id: 5d5ff99d-11ed-486a-be8e-727afa240b5f


   ┌─server_id─┬─average_latency─┬─stddevSamp_latency─┐
1. │ server_1  │             114 │          23.021729 │
2. │ server_3  │             212 │          19.235384 │
3. │ server_2  │              60 │           7.905694 │
   └───────────┴─────────────────┴────────────────────┘


3 rows in set. Elapsed: 0.002 sec.

Aggregating function combinators 

 

-If :

The suffix -If can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument – a condition. The aggregate function processes only the rows that trigger the condition. If the condition was not triggered even once, it returns a default value (usually zeros or empty strings).

 

Clickhouse_node1 :)  SELECT *
FROM sales

Query id: e7691b8b-6f93-43bf-b500-cf2e3c50374f

   ┌───────date─┬─sales_amount─┐
1. │ 2023-10-01 │          100 │
2. │ 2023-10-02 │          150 │
3. │ 2023-10-03 │          120 │
4. │ 2023-10-04 │          180 │
   └────────────┴──────────────┘

4 rows in set. Elapsed: 0.019 sec. 

Clickhouse_node1 :) SELECT sumIf(sales_amount, date > '2023-10-02') AS Total_sales
FROM sales

Query id: 9a008fbd-678a-43fd-9af9-6867b2a9cd1c

   ┌─Total_sales─┐
1. │         300 │
   └─────────────┘

1 row in set. Elapsed: 0.002 sec. 

 

-Array:

The -Array combinator modifies standard aggregate functions (like sum, avg, min, etc.) to accept arrays as input. Instead of calculating the aggregate over rows of data, it processes the elements within each array.

-If and -Array can be combined. However, ‘Array’ must come first, then ‘If’. Examples: uniqArrayIf(arr, cond),

 

Clickhouse_node1 :)  SELECT sumArray([100, 200, 150, 300, 400, 250]) AS total_sales

Query id: 00121da4-d23e-4afa-998b-be4a00b3ed7b

   ┌─total_sales─┐
1. │        1400 │
   └─────────────┘

1 row in set. Elapsed: 0.003 sec. 

 

Full-Text Search:

 

match(): 

The match function in ClickHouse is used to perform regular expression matching against a string. It allows you to search for patterns within text data and is useful for tasks such as validation, extraction, and filtering.

Syntax : 

match(string, pattern)

clickhouse_node1 :) SELECT *
FROM logs

Query id: 9c2d790d-60fc-4a58-8bea-fb11c91bd6b0

   ┌─id─┬─message────────────────────────────────┐
1. │  1 │ Error: file not found                  │
2. │  2 │ Warning: low disk space                │
3. │  3 │ Info: operation completed successfully │
   └────┴────────────────────────────────────────┘

3 rows in set. Elapsed: 0.036 sec. 
clickhouse_node1 :) SELECT *
FROM logs
WHERE match(message, 'Error')

Query id: 24e2adef-a964-4ea4-a77d-610bfc7e90c2

   ┌─id─┬─message───────────────┐
1. │  1 │ Error: file not found │
   └────┴───────────────────────┘

1 row in set. Elapsed: 0.023 sec. 

 

UUID Functions: generateUUIDv4 and generateUUIDv7 in ClickHouse:

 

generateUUIDv4(): 

 

The generateUUIDv4 function in ClickHouse is used to create a random UUID (Universally Unique Identifier) based on version 4 of the UUID standard. This version generates a UUID that is not based on any specific information, making it suitable for creating unique identifiers.

 

Syntax : 

generateUUIDv4()

clickhouse_node1 :) CREATE TABLE uuid.salary
(
    `id` UUID DEFAULT generateUUIDv4(),
    `team` String,
    `player` String,
    `salary` UInt32,
    `position` String
)
ENGINE = Memory 

clickhouse_node1 :) INSERT INTO salary (team, player, salary, position) VALUES
('Lakers', 'LeBron James', 45000000, 'Forward'),
('Warriors', 'Stephen Curry', 48000000, 'Guard'),
('Nets', 'Kevin Durant', 42000000, 'Forward'),
('Bucks', 'Giannis Antetokounmpo', 45000000, 'Forward'),
('Mavericks', 'Luka Doncic', 37000000, 'Guard');

clickhouse_node1 :) SELECT
    id,
    player
FROM salary

Query id: 90f9b017-1af8-4a5c-9495-0e51d0a9ed20

   ┌─id───────────────────────────────────┬─player────────────────┐
1. │ bc8e5596-e233-44ac-961b-cea3c04d2903 │ LeBron James          │
2. │ e67a1593-1dc7-4e7d-864f-7a0a344d09b6 │ Stephen Curry         │
3. │ 2bf0dbb3-544d-4923-808e-bd877ec5a407 │ Kevin Durant          │
4. │ 5c565b0b-0220-482d-90cd-017278bdac9d │ Giannis Antetokounmpo │
5. │ 0acaae19-1546-45f8-bde4-9d291ebe540e │ Luka Doncic           │
   └──────────────────────────────────────┴───────────────────────┘

5 rows in set. Elapsed: 0.004 sec. 

The ClickHouse generateUUIDv4 function produces a random version-4 UUID on each invocation. It is commonly used as a default value for primary key columns or to tag rows with unique identifiers during ingestion.

generateUUIDv7() — Time-Ordered UUIDs

Starting from ClickHouse 24.1+, generateUUIDv7() is available as an alternative that produces time-ordered UUIDs. Unlike v4, UUIDv7 values are sortable by creation time, making them ideal for distributed systems where time-based ordering matters.

-- Generate a time-ordered UUID (ClickHouse 24.1+)
SELECT generateUUIDv7() AS time_ordered_uuid;

-- Use as default column value
CREATE TABLE events (
    id UUID DEFAULT generateUUIDv7(),
    event_name String,
    created_at DateTime DEFAULT now()
) ENGINE = MergeTree()
ORDER BY id;

Need help setting up ClickHouse for production? Need help setting up ClickHouse for production?

Visual Representation Functions 

 

bar()

The bar() function in ClickHouse is used to generate a simple ASCII bar chart as a visual representation of data values in SQL query results. This function is useful for quickly seeing relative sizes and patterns in data directly in the query output, particularly when working with grouped or time-series data.

 

clickhouse_node1 :) SELECT *
FROM server_logs
LIMIT 5

Query id: f09fdcff-6aa3-4bfd-89fe-16e215a244fb

   ┌───────────timestamp─┬─server_id─┬─requests─┐
1. │ 2024-11-04 09:00:00 │ server-1  │     3200 │
2. │ 2024-11-04 09:00:00 │ server-4  │     2500 │
3. │ 2024-11-04 09:01:00 │ server-2  │     2800 │
4. │ 2024-11-04 09:01:00 │ server-5  │     2900 │
5. │ 2024-11-04 09:02:00 │ server-3  │     4500 │
   └─────────────────────┴───────────┴──────────┘
5 rows in set. Elapsed: 0.001 sec. 

clickhouse_node1 :) SELECT
    server_id,
    sum(requests) AS total_requests,
    bar(total_requests, 0, 1000000, 1000) AS load_bar
FROM server_logs
GROUP BY server_id

Query id: 010f70f2-7b61-4171-80a2-af48e6b3644b

   ┌─server_id─┬─total_requests─┬─load_bar─────────────────────────┐
1. │ server-1  │          18400 │ ██████████████████▍              │
2. │ server-5  │          25800 │ █████████████████████████▊       │
3. │ server-4  │          31900 │ ███████████████████████████████▉ │
4. │ server-3  │          24600 │ ████████████████████████▌        │
5. │ server-2  │          16600 │ ████████████████▌                │
   └───────────┴────────────────┴──────────────────────────────────┘

5 rows in set. Elapsed: 0.003 sec.  

 

User Defined Functions(UDF)

User Defined Functions (UDFs) in ClickHouse allow users to define custom functions using the SQL language (or sometimes in other programming languages) to extend ClickHouse’s capabilities beyond its built-in functions. This enables users to create reusable logic for complex calculations, transformations, or specific business requirements that are not covered by the default functions.

 

Syntax :

Create function function_name as (parameters,...) -> expression  

 

clickhouse_node1 :) CREATE FUNCTION age_group AS (age) -> 
    CASE 
        WHEN age < 18 THEN 'Child'
        WHEN age >= 18 AND age < 60 THEN 'Adult'
        ELSE 'Senior'
    END;

Query id: d7e1fbeb-bd0b-4114-85d2-8cd29f0e500c

Ok.
    
clickhouse_node1 :) SELECT
    age,
    age_group(age) AS category
FROM
(
    SELECT arrayJoin([10, 25, 65]) AS age
)

Query id: 96fea7af-5007-49b8-9980-0b6d669f3daf

   ┌─age─┬─category─┐
1. │  10 │ Child    │
2. │  25 │ Adult    │
3. │  65 │ Senior   │
   └─────┴──────────┘

3 rows in set. Elapsed: 0.039 sec. 

 

Executable User Defined Functions : 

ClickHouse can call any external executable program or script to process data.The configuration of executable user defined functions can be located in one or more xml-files. The path to the configuration is specified in the user_defined_executable_functions_config parameter.

 

Creating a user defined function for masking  sensitive data like number and email address.

Lets see how can we do this using 

 

Masking_function.xml

<functions>
    <function>
        <name>mask_sensitive_data</name>
        <type>executable</type>
        <command>masking.py</command>
        <return_type>String</return_type>
        <argument>
            <type>String</type>
        </argument>
        <format>TSV</format>
    </function>
</functions>

 

Add the following configuration in config.xml 

 

<user_defined_executable_functions_config> /masking_function.xml </user_defined_executable_functions_config>

 

Create a python file masking.py and save it in user_script folder 

 

def mask_email(email):
    """Mask an email address by replacing characters in the local part."""
    try:
        local, domain = email.split("@")
        if len(local) > 1:
            masked_local = local[0] + "*" * (len(local) - 1)
        else:
            masked_local = "*"
        return f"{masked_local}@{domain}"
    except Exception:
        return "invalid_email"

def mask_phone(phone):
    """Mask a phone number by replacing all but the last two digits."""
    try:
        if len(phone) >= 2:
            return "*" * (len(phone) - 2) + phone[-2:]
        return "invalid_phone"
    except Exception:
        return "invalid_phone"

 

Lets execute through the query

 

clickhouse_node1 :) SELECT mask_sensitive_data('maks_email24@gmail.com') AS masked_email ;

Query id: c1470bdf-1061-477b-b02b-c44b47005175

   ┌─masked_email───────────┐
1. │ m***********@gmail.com │
   └────────────────────────┘

1 row in set. Elapsed: 0.480 sec. 

 

This query demonstrates how a hypothetical function mask_sensitive_data processes an email by masking part of its characters for privacy. In the result, only the first character and the domain remain visible (m***********@gmail.com), effectively protecting sensitive user information while preserving its usability for identification or reporting.

 

Readable Formatting Functions

 

formatReadableSize

The formatReadableSize function converts a size in bytes into a human-readable string, such as KB, MB, GB, etc.

 

Syntax:

 

formatReadableSize(size_in_bytes)

 

Clickhouse_node1 :) SELECT
    database,
    name AS table_name,
    total_bytes AS size_in_bytes,
    formatReadableSize(total_bytes) AS readable_size
FROM system.tables
WHERE database = 'default'
ORDER BY total_bytes DESC

Query id: a06491e4-cc04-49c5-9ee2-8653dce53abf

   ┌─database─┬─table_name────┬─size_in_bytes─┬─readable_size─┐
1. │ default  │ orders        │           752 │ 752.00 B      │
2. │ default  │ user_activity │           611 │ 611.00 B      │
3. │ default  │ stock_prices  │           488 │ 488.00 B      │
4. │ default  │ sale          │           461 │ 461.00 B      │
5. │ default  │ logs          │           403 │ 403.00 B      │
6. │ default  │ users         │           369 │ 369.00 B      │
7. │ default  │ ck            │           337 │ 337.00 B      │
8. │ default  │ sales         │           333 │ 333.00 B      │
9. │ default  │ player_list   │             0 │ 0.00 B        │
   └──────────┴───────────────┴───────────────┴───────────────┘

9 rows in set. Elapsed: 0.002 sec. 

 

formatReadableQuantity

The formatReadableQuantity function in ClickHouse converts a numeric value into a human-readable format. It is particularly useful for displaying large numbers (e.g., counts, sizes, etc.) in a compact and easy-to-understand way like thousands, millions, billions and so on.

Clickhouse_node1 :)SELECT
    arrayJoin([100, 12345, 432546534]) AS elapsed,
    formatReadableTimeDelta(elapsed) AS time_delta

Query id: bc85c349-41d1-4745-b791-8d539f3d413c

   ┌───elapsed─┬─time_delta──────────────────────────────────────────────────────┐
1. │       100 │ 1 minute and 40 seconds                                         │
2. │     12345 │ 3 hours, 25 minutes and 45 seconds                              │
3. │ 432546534 │ 13 years, 8 months, 17 days, 7 hours, 48 minutes and 54 seconds │
   └───────────┴─────────────────────────────────────────────────────────────────┘

 

formatReadableTimeDelta

The formatReadableTimeDelta function in ClickHouse converts a time interval (in seconds) into a human-readable duration. It is used to make elapsed times or durations easier to understand by displaying them in terms of days, hours, minutes, and seconds.

 

Clickhouse_node1 :)  SELECT
    arrayJoin([120, 12375, 432346534]) AS elapsed,
    formatReadableTimeDelta(elapsed) AS time_delta

Query id: a58953dd-33b1-45d1-9cec-77c4bd41d427

   ┌───elapsed─┬─time_delta─────────────────────────────────────────────┐
1. │       120 │ 2 minutes                                              │
2. │     12375 │ 3 hours, 26 minutes and 15 seconds                     │
3. │ 432346534 │ 13 years, 8 months, 15 days, 15 minutes and 34 seconds │
   └───────────┴────────────────────────────────────────────────────────┘

3 rows in set. Elapsed: 0.003 sec. 

 

ClickHouse Function Updates: What's Changed in Recent Versions

ClickHouse evolves rapidly, and several functions covered in this guide have received notable improvements in recent releases:

  • generateUUIDv7() (v24.1+): A new UUID generation function that produces time-sortable UUIDs based on the UUIDv7 specification. Unlike generateUUIDv4(), the output is chronologically ordered, making it suitable for primary keys in time-series workloads.
  • INTERVAL Improvements (v23.8+): ClickHouse has enhanced INTERVAL support to allow compound intervals such as INTERVAL '2 HOUR 30 MINUTE', making time arithmetic even more flexible.
  • arrayFold() (v23.4+): Joins arrayMap and arrayFilter as a powerful array manipulation tool. It reduces an array to a single value using a custom accumulator function.
  • Variant & Dynamic Types (v24.1+): New data types that impact how functions handle mixed-type columns, offering more flexibility in schema design.
  • Date Functions Enhancements: Functions like addDays(), addHours(), and other date arithmetic functions now consistently support Date32 type for extended date ranges (1900-01-01 to 2299-12-31).

We recommend checking the official ClickHouse changelog for the most up-to-date function references.

Conclusion

ClickHouse's versatile functions — from the ClickHouse addDays function and INTERVAL 60 DAY syntax for date arithmetic, to generateUUIDv4() for unique ID generation — are more than tools. They're enablers of efficient, insightful, and actionable data analytics. Whether you're a beginner or an experienced data engineer, mastering these functions can unlock new dimensions of performance and insight for your projects.

Ready to level up? Explore Mafiree's ClickHouse consulting services and transform your data journey today!

 

 

FAQ

In ClickHouse, the correct syntax is INTERVAL 60 DAY (singular DAY, not DAYS). You can use it to add or subtract days from any date or DateTime value. For example: SELECT now() + INTERVAL 60 DAY AS future_date; or in a WHERE clause: SELECT * FROM orders WHERE order_date >= now() - INTERVAL 60 DAY; This works with all date/time units: SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, and YEAR.
Call generateUUIDv4() directly in a SELECT statement to get a random UUID: SELECT generateUUIDv4(); You can also use it as a default column value: CREATE TABLE my_table (id UUID DEFAULT generateUUIDv4(), name String) ENGINE = MergeTree() ORDER BY id; Each call produces a unique, random 128-bit UUID conforming to the version 4 specification.
The addDays() function adds a specified number of days to a Date or DateTime value. The syntax is addDays(date, n), where n is the number of days to add. For example: SELECT addDays(toDate('2025-01-01'), 5) returns 2025-01-06. It's useful for calculating delivery dates, subscription expiry, trial periods, and other date-based deadlines.
Both achieve similar results but differ in syntax and flexibility. addDays(date, n) is a function that accepts a column or variable as n, making it ideal for dynamic day counts from table data. INTERVAL n DAY uses SQL-standard syntax and is more readable for static offsets, such as WHERE date >= now() - INTERVAL 60 DAY. Both return the same data type. Use addDays() for column-based arithmetic and INTERVAL for fixed offsets in filters and expressions.

Author Bio

sukan

Sukan is Database Team Lead at Mafiree with over a decade of experience in database systems, architecture, and performance optimization. He specializes in MySQL, MongoDB, TiDB, and ClickHouse, developing architectural improvements that make data platforms faster, more efficient, and cost-effective. Sukan writes about practical database engineering topics, real-world performance tuning, data replication, and high-scale system design, drawing from extensive hands-on experience solving complex technical challenges.

Leave a Comment

Related Blogs

Speeding up ClickHouse Queries: Materialized Views vs. Refreshable Views Explained

Learn how materialized views and their refreshable counterparts simplify complex analytics by precomputing results, boosting query performance, and enabling near real-time decision-making.

  2285 views
What is Clickhouse

Experience the future of analytics with ClickHouse! Its powerful columnar storage and vectorized execution allow for lightning-fast data processing, making it ideal for real-time insights.

  2302 views
Can effective archiving cut your infrastructure costs by 50%?

Discover how effective data archival can slash your infrastructure costs by 50%! Learn how we helped one client streamline their database management and improve performance.

  780 views
Benchmark :: MySQL Vs ColumnStore Vs Clickhouse

Comparison of MySQL, Columnstore, Clickhouse with respect to time series queries.

  13985 views

Subscribe for email updates

Get in touch with us

Highlights

More than 6000 Servers Monitored

Happy Clients

Certified DBAs

24 x 7 x 365 Support

PCI

Database Services

MySQL MongoDB PostgreSQL SQL Server Aerospike Clickhouse TiDB MariaDB Columnstore

Quick Links

Careers Blog Contact Privacy Policy Disclaimer Policy

Contacts

Linkedin Mafiree Facebook Mafiree Twitter Mafiree

Nagercoil Office

Miru IT Park, Vallankumaranvillai,

Nagercoil, Tamilnadu - 629 002.

Bangalore Office

Unit 303, Vanguard Rise,

5th Main, Konena Agrahara,

Old Airport Road, Bangalore - 560 017.

Call: +91 6383016411

Email: sales@mafiree.com


Copyright © - All Rights Reserved - Mafiree