This Week in Databend #86
March 24, 2023 · 4 min read
PsiACE
Stay up to date with the latest weekly developments on Databend!
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com.
What's On In Databend
Stay connected with the latest news about Databend.
FlightSQL Handler in Progress
Flight SQL is an innovative open database protocol that caters to modern architectures. It boasts a columnar-oriented design and provides seamless support for parallel processing of data partitions.
The benefits of supporting FlightSQL include reducing serialization and deserialization during query execution, as well as easily supporting SDKs in different languages using predefined *.proto files.
We're currently engaged in developing support for the FlightSQL Handler. If you're interested, refer to the following links:
Natural Language to SQL
By integrating with popular AI services, Databend now provides an efficient built-in solution: the AI_TO_SQL function.
With this function, instructions written in natural language can be converted into SQL query statements aligned with table schema. With just a few modifications (or possibly none at all), it can be put into production.
SELECT * FROM ai_to_sql(
'List the total amount spent by users from the USA who are older than 30 years, grouped by their names, along with the number of orders they made in 2022',
'<openai-api-key>');
*************************** 1. row ***************************
database: openai
generated_sql: SELECT name, SUM(price) AS total_spent, COUNT(order_id) AS total_orders
FROM users
JOIN orders ON users.id = orders.user_id
WHERE country = 'USA' AND age > 30 AND order_date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY name;
The function is now available on both Databend and Databend Cloud. To learn more about how it works, refer to the following links:
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
Vector Similarity Calculation in Databend
Databend has added a new function called cosine_distance. This function accepts two input vectors, from and to, which are represented as slices of f32 values.
select cosine_distance([3.0, 45.0, 7.0, 2.0, 5.0, 20.0, 13.0, 12.0], [2.0, 54.0, 13.0, 15.0, 22.0, 34.0, 50.0, 1.0]) as sim
----
0.1264193
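The result above is consistent with the usual definition of cosine distance as one minus cosine similarity. The post does not state the formula, so treat the definition as an assumption inferred from the output. Plugging in the two vectors from the query:

```latex
\[
\operatorname{cosine\_distance}(a, b)
  = 1 - \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = 1 - \frac{4009}{\sqrt{2825}\,\sqrt{7455}}
  \approx 1 - 0.8736
  = 0.1264
\]
```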
The Rust implementation efficiently performs calculations by utilizing the ArrayView type from the ndarray crate.
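use ndarray::ArrayView;
// `ErrorCode` and `Result` come from Databend's own error-handling crate.

pub fn cosine_distance(from: &[f32], to: &[f32]) -> Result<f32> {
    // Both vectors must have the same dimension.
    if from.len() != to.len() {
        return Err(ErrorCode::InvalidArgument(format!(
            "Vector length not equal: {:} != {:}",
            from.len(),
            to.len(),
        )));
    }

    let a = ArrayView::from(from);
    let b = ArrayView::from(to);
    let aa_sum = (&a * &a).sum(); // squared norm of `from`
    let bb_sum = (&b * &b).sum(); // squared norm of `to`

    // Distance is one minus cosine similarity, matching the 0.1264193 result above.
    Ok(1.0 - (&a * &b).sum() / ((aa_sum).sqrt() * (bb_sum).sqrt()))
}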
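To experiment with the calculation outside Databend, here is a minimal dependency-free sketch using only the standard library. It is not Databend's implementation: the `String` error type and the single-pass `fold` are assumptions made for illustration, and it returns one minus the cosine similarity, which is the value the SQL example reports (0.1264193).

```rust
/// Cosine distance between two equal-length vectors: 1 - cosine similarity.
/// Illustrative stand-in only; the error type is a plain String here,
/// not Databend's ErrorCode.
pub fn cosine_distance(from: &[f32], to: &[f32]) -> Result<f32, String> {
    if from.len() != to.len() {
        return Err(format!(
            "Vector length not equal: {} != {}",
            from.len(),
            to.len()
        ));
    }

    // Accumulate the dot product and both squared norms in a single pass.
    let (dot, aa_sum, bb_sum) = from
        .iter()
        .zip(to.iter())
        .fold((0.0f32, 0.0f32, 0.0f32), |(dot, aa, bb), (&x, &y)| {
            (dot + x * y, aa + x * x, bb + y * y)
        });

    // Similarity is dot / (|a| * |b|); distance is its complement.
    // Note: an all-zero vector makes the denominator zero and yields NaN.
    Ok(1.0 - dot / (aa_sum.sqrt() * bb_sum.sqrt()))
}
```

Calling it with the two vectors from the SQL example should give a value close to 0.1264, and passing the same vector twice should give a distance of roughly zero.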
Do you remember how to register scalar functions in Databend? You can check Doc | How to Write a Scalar Function and PR | #10737 to verify your answer.
Highlights
Here are some noteworthy items; perhaps you can find something that interests you.
- Learn how to monitor Databend using Prometheus and Grafana: Doc | Monitor - Prometheus & Grafana
- Metabase Databend Driver helps you connect Databend to Metabase and dashboard your data: Doc | Integrations - Metabase
- Databend now supports PIVOT, UNPIVOT, GROUP BY CUBE and GROUP BY ROLLUP query syntax. For more information, please see PR #10676 and #10601.
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Enable -Zgitoxide to Speed up Git Dependencies Download
Enabling -Zgitoxide can significantly speed up the download of our Git dependencies compared with using Git alone.
This feature integrates cargo with gitoxide, a pure Rust implementation of Git that is idiomatic, lean, fast, and safe.
Issue #10466 | CI: Enable -Zgitoxide to speed our git deps download speed
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.
New Contributors
We always welcome everyone with open arms and can't wait to see how you'll help our community grow and thrive.
- @SkyFan2002 made their first contribution in #10656. This pull request aimed to resolve inconsistent results caused by variations in column name case while executing SQL statements with EXCLUDE.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.0.22-nightly...v1.0.33-nightly
🎉 Contributors
24 contributors
Thanks a lot to the contributors for their excellent work.
🎈Connect With Us
Databend is a cutting-edge, open-source cloud-native warehouse built with Rust, designed to handle massive-scale analytics.
Join the Databend Community to try, get help, and contribute!