dev/changelog/52.0.0.md
This release consists of 549 commits from 121 contributors. See credits at the end of this changelog for more information.
See the upgrade guide for information on how to upgrade from previous versions.
Breaking changes:
- FileSource to be constructed with a Schema #18386 (adriangb)
- AggregateUDFImpl::supports_null_handling_clause to false #18441 (Jefffrey)
- CacheAccessor::remove to take &self rather than &mut self #18726 (alchemist51)
- pyarrow feature #18528 (timsaucer)
- statistics_cache function #19054 (nuno-faria)
- newlines_in_values from FileScanConfig to CsvSource #19313 (adriangb)

Performance related:
- EliminateNestedUnion and EliminateOneUnion optimizer rules' #18678 (alamb)
- vectorized_equal_to for PrimitiveGroupValueBuilder in multi group by aggregation #17977 (rluvaton)
- MIN/MAX aggregates #18644 (2010YOUY01)
- new_repeated when converting scalar to an array #19018 (rluvaton)
- with_hashes #19373 (alamb)
- to_hex (> 2x) #19503 (andygrove)
- starts_with and ends_with for scalar arguments #19516 (andygrove)
- contains for scalar search arg #19529 (andygrove)
- md5 #19568 (andygrove)
- HashTableLookupExpr::evaluate #19602 (UBarney)
- split_part #19570 (andygrove)
- Nullstate / accumulators #19625 (Dandandan)

Implemented enhancements:
- array_slice functionality to support ListView and LargeListView types #18432 (Weijun-H)
- SessionState::create_logical_expr_from_sql_expr #18423 (petern48)
- ansi enable parameter for execution config #18635 (comphead)
- CREATE FUNCTION #18450 (r1b)
- corr with single row and NaN #18677 (comphead)
- scan_efficiency_ratio metric for parquet reading #18577 (petern48)
- abs math function part 1 - non-ANSI mode #18205 (hsiang-c)
- <slt:ignore> marker in sqllogictest for non-deterministic expected parts #18857 (2010YOUY01)
- array_slice benchmark #18879 (dqkqd)
- url_encode, url_decode and try_url_decode #17399 (anhvdq)
- remove_optimizer_rule to SessionContext #19209 (nuno-faria)
- retract_batch #19278 (petern48)
- try_sum function #18569 (davidlghellin)
- get_field with multiple path arguments #19389 (adriangb)
- to_time function #19540 (kumarUjjawal)
- space #19610 (kazantsev-maksim)
- partition_statistics API for SortMergeJoinExec #19567 (kumarUjjawal)
- datafusion-cli #19388 (jizezhang)

Fixed bugs:
- with_param_values on LogicalPlan::EmptyRelation returns incorrect schema #18286 (dqkqd)
- LogicalPlan::Values after placeholder substitution #18740 (dqkqd)
- WorkTableExec special case in reset_plan_states #18803 (geoffreyclaude)
- rstest is a DEV dependency #19014 (crepererum)
- starts_with #19077 (willemv)
- expand_views_at_output #19019 (nuno-faria)
- bit_shift #19222 (kumarUjjawal)
- array_remove/array_remove_n/array_remove_all not using the same nullability as the input #19259 (rluvaton)
- map_from_arrays #19275 (kumarUjjawal)
- make_dt_interval #19236 (kumarUjjawal)
- date_sub #19225 (kumarUjjawal)
- bit_get #19220 (kumarUjjawal)
- next_day #19253 (kumarUjjawal)
- CooperativeExec and CoalesceBatchesExec #19400 (haohuaijin)
- reset_state for LazyMemoryExec #19362 (nuno-faria)
- SparkDateAdd and SparkDateSub #19377 (mzabaluev)

Documentation updates:
- WITHIN GROUP syntax in aggregate UDAFs #18607 (kosiew)
- FilterExec metrics to user-guide/metrics.md #19043 (2010YOUY01)
- force_filter_selections to restore pushdown_filters behavior prior to parquet 57.1.0 upgrade #19003 (alamb)
- Parquet over to PhysicalExprAdapter, remove SchemaAdapter #18998 (adriangb)
- list_files_cache_limit and list_files_cache_ttl #19108 (delamarch3)
- to_unixtime udf function to support a consistent set of argument types #19442 (kumarUjjawal)
- TypeSignatureClass::Any #19485 (Jefffrey)

Other:
- assert_or_internal_err! macro for more ergonomic internal invariant checks #18511 (2010YOUY01)
- clippy::needless_pass_by_value to datafusion-physical-expr #18557 (corasaurus-hex)
- tokio dependency and clippy #18598 (comphead)
- clippy::needless_pass_by_value for crates that don't require code changes. #18586 (2010YOUY01)
- Interval::and, Interval::not, and add Interval::or tests #18621 (pepijnve)
- assert_or_internal_err!() in datafusion/sql #18614 (2010YOUY01)
- .asf.yaml #18636 (comphead)
- NullableInterval::and and NullableInterval::or. #18625 (pepijnve)
- .asf.yaml #18652 (comphead)
- clippy::needless_pass_by_value to datafusion-catalog #18638 (Standing-Man)
- clippy::needless_pass_by_value to datafusion-datasource-avro #18641 (Standing-Man)
- clippy::needless_pass_by_value to datafusion-core #18640 (Standing-Man)
- assert_or_internal_err!() in datafusion/datasource #18697 (kumarUjjawal)
- clippy::needless_pass_by_value to datafusion-datasource #18682 (AryanBagade)
- Rust / cargo test (amd64) action #18709 (Jefffrey)
- assert_or_internal_err!() in datafusion/optimizer #18699 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/functions #18700 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/expr-common #18702 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/functions-aggregate #18716 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/functions-nested #18724 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/physical-expr-common #18735 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/physical-expr #18736 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/physical-optimizer #18732 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/physical-plan #18730 (kumarUjjawal)
- assert_or_internal_err!() in datafusion/expr #18731 (kumarUjjawal)
- PruningPredicate documentation #18742 (2010YOUY01)
- Interval constants to match NullableInterval #18654 (pepijnve)
- main branch CI test failure #18792 (2010YOUY01)
- assert_or_internal_err!() #18790 (2010YOUY01)
- GuaranteeRewriter to datafusion_expr #18821 (pepijnve)
- HashJoinExec and use CASE expressions for more precise filters #18451 (adriangb)
- bit_count #18841 (pepijnve)
- bit_count Spark function #18871 (comphead)
- case_when_with_expr #18872 (pepijnve)
- clippy::needless_pass_by_value to datafusion-physical-plan #18864 (2010YOUY01)
- bit_get() signature away from user defined #18836 (Jefffrey)
- Extensions #18887 (gabotechs)
- clippy::needless_pass_by_value globally across the workspace #18904 (2010YOUY01)
- map function alias handling in SQL planner #18914 (friendlymatthew)
- NdJsonReadOptions::schema_infer_max_records #18920 (Jefffrey)
- GROUPING SET CUBE #18798 (kosiew)
- to_local_time() signature away from user_defined #18707 (Jefffrey)
- traverse_chain macro to function #18951 (Dandandan)
- denys of needless_pass_by_value in lib.rs files #18996 (Jefffrey)
- FETCH Clause in Planner and CLI #18691 (kosiew)
- arrow, parquet to 57.1.0 #18820 (alamb)
- dev/rust_lint.sh #17863 (2010YOUY01)
- PartitionPruningStatistics #19020 (alamb)
- EmptyRelation Including produce_one_row Semantics #18842 (kosiew)
- tpchgen-cli to generate tpch data in bench.sh #19035 (alamb)
- datafusion-common crate #19080 (2010YOUY01)
- CASE exprs with constant value lookup tables #19143 (alamb)
- BatchCoalescer, new benchmarks (TPC-H Q21 SMJ up to ~4000x faster) #18875 (mbutrovich)
- AsyncFuncExec #19118 (mach-kernel)
- rust_lint.sh #19254 (2010YOUY01)
- datafusion-common crate #19247 (2010YOUY01)
- power() signature away from user defined #18968 (Jefffrey)
- clippy::allow_attributes for optimizer and macros #19310 (kumarUjjawal)
- decimal.slt #19352 (Jefffrey)
- GROUPING SETS(()) and handle empty-grouping aggregates #19252 (kosiew)
- ProjectionExpr::new easier to use with constants #19343 (alamb)
- range_and_generate_series #19428 (rluvaton)
- create_physical_expr #19299 (rgehan)
- ScalarValue::to_array_of_size #19441 (Jefffrey)
- ScalarValue code #19439 (Jefffrey)
- ascii signature away from user_defined #19513 (kumarUjjawal)
- get_data_types() for NativeType #19449 (Jefffrey)
- partition_statistics API for NestedLoopJoinExec #19468 (kumarUjjawal)
- type_coercion/functions.rs #19518 (Jefffrey)
- percentile_cont to clarify support input types #19611 (Jefffrey)
- rust_decimal, ignore RUSTSEC-2026-0001 to get clean CI #19657 (alamb)
- substring_index via single-byte fast path and direct indexing #19590 (lyne7-sc)
- Signature::coercible for isnan/iszero #19604 (kumarUjjawal)
- rust_decimal, remove ignore of RUSTSEC-2026-0001 #19666 (alamb)
- ParquetOpener::open() #19677 (2010YOUY01)

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
67 dependabot[bot]
38 Andrew Lamb
36 Jeffrey Vo
35 Kumar Ujjawal
34 Adrian Garcia Badaracco
22 Tim Saucer
19 Yongting You
13 Sergey Zhukov
11 Pepijn Van Eeckhoudt
11 kosiew
10 Daniël Heres
10 Dhanush
10 Oleks V
8 Geoffrey Claude
8 Raz Luvaton
7 Andy Grove
7 Liang-Chi Hsieh
7 Qi Zhu
6 Peter Nguyen
6 Shashidhar B M
5 Alan Tang
5 Alex Huang
5 Bruce Ritchie
5 Gene Bordegaray
5 Nuno Faria
5 Sriram Sundar
4 Blake Orth
4 Thomas Tanon
4 Yuvraj
4 theirix
3 Aryan Bagade
3 Chakkk
3 Emily Matheys
3 Huaijin
3 Khanh Duong
3 Kushagra S
3 Vedic Chawla
3 feniljain
3 harshit saini
3 jizezhang
3 shifluxxc
3 xonx
3 xudong.w
2 Carlos Hurtado
2 Chen Chongchen
2 Cora Sutton
2 Haresh Khanna
2 Lía Adriana
2 Manish Kumar
2 Martin Grigorov
2 Matthew Kim
2 Namgung Chan
2 Nimalan
2 Nithurshen
2 Rosai
2 Shubham Yadav
2 Trent Hauck
2 Vegard Stikbakke
2 Vrishabh
2 Xander
2 chakkk309
2 mag1c1an1
2 nlimpid
2 yqrz
1 Adam Curtis
1 Aly Abdelmoneim
1 Andrey Velichkevich
1 Arpit Bandejiya
1 Bharathwaj G
1 Bipul Lamsal
1 Clement de Groc
1 Congxian Qiu
1 David López
1 David Stancu
1 Devanshu
1 Dongpo Liu
1 EeshanBembi
1 Eshaan Gupta
1 Ethan Urbanski
1 Frederic Branczyk
1 Gabriel
1 Gohlub
1 Heran Lin
1 James Xu
1 Jatin Kumar singh
1 Karan Pradhan
1 Karthik Kondamudi
1 Kazantsev Maksim
1 Marco Neumann
1 Matt Butrovich
1 Max Burke
1 Michele Vigilante
1 Mikhail Zabaluev
1 Mohit rao
1 Ning Sun
1 Peter Lee
1 Quoc Anh
1 Ram
1 Randy
1 Renan GEHAN
1 Ruchir Khaitan
1 Samyak Sarnayak
1 Shiv Bhatia
1 Smith Cruise
1 Smotrov Oleksii
1 Solari Systems
1 Suhail
1 T2MIX
1 Tal Glanzman
1 Tamar
1 Tim-53
1 Tobias Schwarzinger
1 Ujjwal Kumar Tiwari
1 Willem Verstraeten
1 YuraLitvinov
1 bubulalabu
1 delamarch3
1 hsiang-c
1 r1b
1 rin
1 xavlee
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.