Return

This Week in Databend #74

December 28, 2022 · 3 min read

PsiACE

Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: .


Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com .

What's New

Check out what we've done this week to make Databend even better for you.

Features & Improvements ✨

Meta

  • remove stream when a watch client is dropped (#9334)

Planner

  • support selectivity estimation for range predicates (#9398)

Query

  • support copy on error (#9312)
  • support databend-local (#9282)
  • external storage support location part prefix (#9381)

Storage

  • rangefilter support in (#9330)
  • try to improve object storage io read (#9335)
  • support table compression (#9370)

Metrics

  • add more metrics for fuse compact and block write (#9399)

Sqllogictest

  • add no-fail-fast support (#9391)

Code Refactoring 🎉

*

  • adopt rustls entirely, removing all deps to native-tls (#9358)

Format

  • remove format_xxx settings (#9360)
  • adjust interface of FileFormatOptionsExt (#9395)

Planner

  • remove SyncTypeChecker (#9352)

Query

  • split fuse source to read data and deserialize (#9353)
  • avoid io copy in read parquet data (#9365)
  • add uncompressed buffer for parquet reader (#9379)

Storage

  • add read/write settings (#9359)

Bug Fixes 🔧

Format

  • fix align_flush with header only (#9327)

Settings

  • use logical CPU number as default value of num_cpus (#9396)

Processors

  • the data type on both sides of the union does not match (#9361)

HTTP Handler

  • false alarm (warning log) about query not exists (#9380)

Sqllogictest

  • refactor sqllogictest http client and fix expression string like (#9363)

What's On In Databend

Stay connected with the latest news about Databend.

Introducing databend-local​

Inspired by clickhouse-local, databend-local allows you to perform fast processing on local files, without the need of launching a Databend cluster.

> export CONFIG_FILE=tests/local/config/databend-local.toml
> cargo run --bin=databend-local -- --sql="SELECT * FROM tbl1" --table=tbl1=/path/to/databend/docs/public/data/books.parquet

exec local query: SELECT * FROM tbl1
+------------------------------+---------------------+------+
| title | author | date |
+------------------------------+---------------------+------+
| Transaction Processing | Jim Gray | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
| Transaction Processing | Jim Gray | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
+------------------------------+---------------------+------+
4 rows in set. Query took 0.015 seconds.

Learn More

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Compressing Short Strings​

When processing the same queries with short strings involved, Databend usually reads more data than other databases, such as Snowflake.

SELECT SearchPhrase, MIN(URL), COUNT(*) AS c FROM hits WHERE URL LIKE '%google%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;

Such queries might be more efficient if short strings (URLs, etc) are compressed.

Issue 9001: performance: compressing for short strings

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

andylokandyariesdevilBohuTANGdantengskydrmingdrmereastfisher
andylokandyariesdevilBohuTANGdantengskydrmingdrmereastfisher
everpcpcleiyskymergify[bot]PsiACERinChanNOWWWsoyeric128
everpcpcleiyskymergify[bot]PsiACERinChanNOWWWsoyeric128
sundy-liXuanwoxudong963youngsofunzhang2014zhyass
sundy-liXuanwoxudong963youngsofunzhang2014zhyass

🎈Connect With Us

Databend is a cutting-edge, open-source cloud-native warehouse built with Rust, designed to handle massive-scale analytics.

Join the Databend Community to try, get help, and contribute!