Managing extremely large datasets, complex queries, and varying workloads in a data warehouse can be both challenging and costly. With the Snowflake AI Data Cloud, you can adjust compute resource levels on a virtual warehouse. However, setting up a warehouse to handle your most complex queries means it will use the same compute level for simpler queries, potentially wasting resources and money.
To address this, Snowflake introduced the Query Acceleration Service, which allows users to maintain a smaller warehouse size while dynamically increasing compute resources as needed for complex queries and large scans.Â
In this blog, we’ll explain the Query Acceleration Service, when to enable it, and how to activate it for your warehouses.
What is the Snowflake Query Acceleration Service (QAS)?
Query Acceleration Service (QAS) is a Snowflake feature designed to enhance the performance of long-running queries, particularly those involving large table scans or workloads with unpredictable data volumes.
QAS identifies queries that would benefit from additional compute power and offloads portions of the query processing to shared compute resources. By running these portions in parallel, the service significantly reduces processing time, leading to faster query execution and more efficient resource use.
When to Use QAS?
QAS is designed to enhance the performance of resource-intensive queries. It’s particularly beneficial for workloads characterized by:
Unpredictable Data Volumes: Ad-hoc queries dealing with datasets that vary in size will benefit from QAS’s ability to scale resources as needed.
Queries Involving Massive Datasets with Selective Filters: By chunking off portions of a large dataset and running scans in parallel, QAS will run queries faster.
Create Table As Select (CTAS) Statements Involving Large Tables: QAS works similarly on CTAS statements by parallelizing the scan of the large table to improve performance.
How to Find Benefiting Queries
Adding QAS will increase the credit usage of the warehouse, so it’s good to know if the warehouse you’re thinking of adding it to will benefit from it or not. Snowflake provides two ways to help you find queries and warehouses that QAS can accelerate:
SYSTEM$ESTIMATE_QUERY_ACCELERATION
SYSTEM$ESTIMATE_QUERY_ACCELERATION
is a system function that accepts a query ID of a previously run query and returns whether the query is eligible for QAS. If it is eligible, it returns the estimate of what the query execution time would have been with QAS enabled for the various scale factors, which is how many times the size of the original warehouse the computer power is scaled up to. The scale factor in this function displays scales of 1, 2, 4, 8, and 10.Â
Here is an example of how to run the function:
SELECT PARSE_JSON(SYSTEM$ESTIMATE_QUERY_ACCELERATION('01b62129-0608-13d3-0074-120301991192'));
The function will return a JSON object similar to this:
{
"estimatedQueryTimes": {
"1": 183,
"10": 105,
"2": 142,
"4": 125,
"8": 115
},
"originalQueryTime": 325.66,
"queryUUID": "01b62129-0608-13d3-0074-120301991192",
"status": "eligible",
"upperLimitScaleFactor": 10
}
For this example query, the original query time was 325.66 minutes and is significantly reduced by utilizing QAS.
If it isn’t eligible, the function will return this JSON object:
{
"estimatedQueryTimes": {},
"originalQueryTime": 20.291,
"queryUUID": "cf23522b-3b91-cf14-9fe0-988a292a4bfa",
"status": "ineligible",
"upperLimitScaleFactor": 0
}
QUERY_ACCELERATION_ELIGIBLE
The second way to determine queries and warehouses to add QAS to is the SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
view. This view contains all the eligible queries and the estimated execution time with QAS enabled. Some useful queries include:
Determining which queries run in the past week would have the most benefit from QAS:
SELECT query_id, eligible_query_acceleration_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY eligible_query_acceleration_time DESC;
Viewing the warehouses with the most eligible time in the past month:
SELECT warehouse_name, SUM(eligible_query_acceleration_time) AS total_eligible_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE start_time > DATEADD('month', -1, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_eligible_time DESC;
How to Use Query Acceleration Service
Adding the Service to a Warehouse
Enabling QAS on a warehouse is a simple parameter that can be set either during warehouse creation or by altering an existing warehouse:
CREATE WAREHOUSE phData_WH WITH
ENABLE_QUERY_ACCELERATION = true;
ALTER WAREHOUSE phData_WH SET
ENABLE_QUERY_ACCELERATION = true;
Enabling QAS on a warehouse allows it to increase its compute by up to 8x its size when necessary. This also means that while QAS is needed, the warehouse will use up to 8x the credits of the normal warehouse size. You can see how this can get out of hand quickly.Â
To help control costs, users can change the max Scale Factor QAS uses. This puts a cap on how much compute resources QAS can lease.Â
For instance, if you set the Scale Factor to 4, then QAS can only lease 4X as much compute as the original warehouse size. The Scale Factor can be enabled as a parameter when creating or altering a warehouse:
ALTER WAREHOUSE my_wh SET
ENABLE_QUERY_ACCELERATION = true
QUERY_ACCELERATION_MAX_SCALE_FACTOR = 4;
Viewing the Results
When using QAS, it’s recommended to monitor the service to make sure it continues to be beneficial. Snowflake allows you to monitor query acceleration by query in the profile view:
The Query Acceleration results of all queries at once can be viewed using the QUERY_HISTORY
view:
SELECT query_id,
query_text,
warehouse_name,
start_time,
end_time,
query_acceleration_bytes_scanned,
query_acceleration_partitions_scanned,
query_acceleration_upper_limit_scale_factor
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
You can also see how QAS affects your costs with the QUERY_ACCELERATION_HISTORY
view, either by warehouse or account. Here’s an example of a query to see how many credits QAS per warehouse in the past month:
SELECT warehouse_name,
SUM(credits_used) AS total_credits_used
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE start_time >= DATE_TRUNC(month, CURRENT_DATE)
GROUP BY warehouse_name
ORDER BY total_credits_used DESC;
Query Acceleration Service Limitations
There are several reasons that queries are ineligible to use QAS:
The query contains functions that return non-deterministic results, e.g.,
RANDOM()
The query has a limit clause (or
TOP
clause) without an order by clauseThe query has a filter that isn’t selective enough
The query has a group by clause that causes the cardinality to be too high for eligibilityÂ
There are not enough partitions in the scan, causing the benefits of the service to be offset by the latency of QAS acquiring resources
How Much Does Query Acceleration Service Cost?
There is no upfront cost for enabling QAS, although it is required to have the Enterprise edition of Snowflake or higher to use the feature. The service’s cost comes from the credits used by the leased compute resources that QAS uses.Â
The SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
view enables you to see the credits used by the service itself, but it’s beneficial to see the overall credit usage of a warehouse before and after enabling QAS to really judge what the cost implications are.Â
Snowflake recommends using this query to view the cost difference between the eight weeks leading up to enabling QAS and the eight weeks afterward:
WITH credits AS (
SELECT 'QC' AS credit_type,
TO_DATE(end_time) AS credit_date,
SUM(credits_used) AS num_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE warehouse_name = 'PHDATA_WAREHOUSE'
AND credit_date BETWEEN
DATEADD(WEEK, -8, (
SELECT TO_DATE(MIN(end_time))
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE warehouse_name = 'PHDATA_WAREHOUSE'
))
AND
DATEADD(WEEK, +8, (
SELECT TO_DATE(MAX(end_time))
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE warehouse_name = 'PHDATA_WAREHOUSE'
))
GROUP BY credit_date
UNION ALL
SELECT 'WC' AS credit_type,
TO_DATE(end_time) AS credit_date,
SUM(credits_used) AS num_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE warehouse_name = 'PHDATA_WAREHOUSE'
AND credit_date BETWEEN
DATEADD(WEEK, -8, (
SELECT TO_DATE(MIN(end_time))
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE warehouse_name = 'PHDATA_WAREHOUSE'
))
AND
DATEADD(WEEK, +8, (
SELECT TO_DATE(MAX(end_time))
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE warehouse_name = 'PHDATA_WAREHOUSE'
))
GROUP BY credit_date
)
SELECT credit_date,
SUM(IFF(credit_type = 'QC', num_credits, 0)) AS qas_credits,
SUM(IFF(credit_type = 'WC', num_credits, 0)) AS compute_credits,
compute_credits + qas_credits AS total_credits,
AVG(total_credits) OVER (
PARTITION BY NULL ORDER BY credit_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
AS avg_total_credits_7days
FROM credits
GROUP BY credit_date
ORDER BY credit_date;
This will provide the QAS credits used, the normal compute credits used, their combination, and the average total credits used over a seven day period. Change 'PHDATA_WAREHOUSE'
to the name of any warehouse that you enabled QAS on to see the results in your account.Â
Closing
By using Snowflake’s Query Acceleration Service in the right ways, you can significantly optimize your data warehouse performance without burning through Snowflake credits. QAS lets you handle queries with large scans and unpredictable workloads without compromising the overall efficiency of your Snowflake environment. By right-sizing your warehouses and strategically applying QAS, you can achieve faster query results while not wasting money.
Feeling overwhelmed maintaining your complex Snowflake environments?
phData’s Elastic Operations service can help find optimizations like enabling QAS and much more!
FAQs
Can I improve query performance without QAS?
There are several ways to improve query performance manually before turning to QAS:
Analyze Query Execution Plans to Identify Bottlenecks
Monitoring queries using Snowflake’s execution plans can help determine where the query is getting bogged down and shed light on what part of the query needs to be changed to perform better.
Explore Using Clustering Keys
Adding clustering keys to extremely large tables can help improve performance.
Use Materialized Views
For queries that are frequently run, but the data underneath changes infrequently, Materialized Views can significantly speed up access.
Can I add indexes to tables instead of using QAS?
Snowflake does not support traditional indexes (with the exception of Hybrid Tables). It is already optimized using data clustering and micro-partitions.