Back to Datafusion

Apache DataFusion 47.0.0 Changelog

dev/changelog/47.0.0.md

53.1.047.3 KB
Original Source
<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -->

Apache DataFusion 47.0.0 Changelog

This release consists of 364 commits from 94 contributors. See credits at the end of this changelog for more information.

Breaking changes:

  • chore: cleanup deprecated API since version <= 40 #15027 (qazxcdswe123)
  • fix: mark ScalarUDFImpl::invoke_batch as deprecated #15049 (Blizzara)
  • feat: support customize metadata in alias for dataframe api #15120 (chenkovsky)
  • Refactor: add FileGroup structure for Vec<PartitionedFile> #15379 (xudong963)
  • Change default EXPLAIN format in datafusion-cli to tree format #15427 (alamb)
  • Support computing statistics for FileGroup #15432 (xudong963)
  • Remove redundant statistics from FileScanConfig #14955 (Standing-Man)
  • parquet reader: move pruning predicate creation from ParquetSource to ParquetOpener #15561 (adriangb)
  • feat: Add unique id for every memory consumer #15613 (EmilyMatt)

Performance related:

  • Fix sequential metadata fetching in ListingTable causing high latency #14918 (geoffreyclaude)
  • Implement GroupsAccumulator for min/max Duration #15322 (shruti2522)
  • [Minor] Remove/reorder logical plan rules #15421 (Dandandan)
  • Improve performance of first_value by implementing special GroupsAccumulator #15266 (UBarney)
  • perf: unwrap cast for comparing ints =/!= strings #15110 (alan910127)
  • Improve performance sort TPCH q3 with Utf8Vew ( Sort-preserving mergi… #15447 (zhuqi-lucas)
  • perf: Reuse row converter during sort #15302 (2010YOUY01)
  • perf: Add TopK benchmarks as variation over the sort_tpch benchmarks #15560 (geoffreyclaude)
  • Perf: remove clone on uninitiated_partitions in SortPreservingMergeStream #15562 (rluvaton)
  • Add short circuit evaluation for AND and OR #15462 (acking-you)
  • perf: Introduce sort prefix computation for early TopK exit optimization on partially sorted input (10x speedup on top10 bench) #15563 (geoffreyclaude)
  • Improve performance of last_value by implementing special GroupsAccumulator #15542 (UBarney)
  • Enhance: simplify x=x --> x IS NOT NULL OR NULL #15589 (ding-young)

Implemented enhancements:

  • feat: Add tree / pretty explain mode #14677 (irenjj)
  • feat: Add array_max function support #14470 (erenavsarogullari)
  • feat: implement tree explain for ProjectionExec #15082 (Standing-Man)
  • feat: support ApproxDistinct with utf8view #15200 (zhuqi-lucas)
  • feat: Attach Diagnostic to more than one column errors in scalar_subquery and in_subquery #15143 (changsun20)
  • feat: topk functionality for aggregates should support utf8view and largeutf8 #15152 (zhuqi-lucas)
  • feat: Native support utf8view for regex string operators #15275 (zhuqi-lucas)
  • feat: introduce JoinSetTracer trait for tracing context propagation in spawned tasks #14547 (geoffreyclaude)
  • feat: Support serde for JsonSource PhysicalPlan #15311 (westhide)
  • feat: Support serde for FileScanConfig batch_size #15335 (westhide)
  • feat: simplify regex wildcard pattern #15299 (waynexia)
  • feat: Add union_by_name, union_by_name_distinct to DataFrame api #15489 (Omega359)
  • feat: Add config max_temp_directory_size to limit max disk usage for spilling queries #15520 (2010YOUY01)
  • feat: Add tracing regression tests #15673 (geoffreyclaude)

Fixed bugs:

  • fix: External sort failing on an edge case #15017 (2010YOUY01)
  • fix: graceful NULL and type error handling in array functions #14737 (alan910127)
  • fix: Support datatype cast for insert api same as insert into sql #15091 (zhuqi-lucas)
  • fix: unparse for subqueryalias #15068 (chenkovsky)
  • fix: date_trunc bench broken by #15049 #15169 (Blizzara)
  • fix: compound_field_access doesn't identifier qualifier. #15153 (chenkovsky)
  • fix: unparsing left/ right semi/mark join #15212 (chenkovsky)
  • fix: handle duplicate WindowFunction expressions in Substrait consumer #15211 (Blizzara)
  • fix: write hive partitions for any int/uint/float #15337 (christophermcdermott)
  • fix: core_expressions feature flag broken, move overlay into core functions #15217 (shruti2522)
  • fix: Redundant files spilled during external sort + introduce SpillManager #15355 (2010YOUY01)
  • fix: typo of DropFunction #15434 (chenkovsky)
  • fix: Unconditionally wrap UNION BY NAME input nodes w/ Projection #15242 (rkrishn7)
  • fix: the average time for clickbench query compute should use new vec to make it compute for each query #15472 (zhuqi-lucas)
  • fix: Assertion fail in external sort #15469 (2010YOUY01)
  • fix: aggregation corner case #15457 (chenkovsky)
  • fix: update group by columns for merge phase after spill #15531 (rluvaton)
  • fix: Queries similar to count-bug produce incorrect results #15281 (suibianwanwank)
  • fix: ffi aggregation #15576 (chenkovsky)
  • fix: nested window function #15033 (chenkovsky)
  • fix: dictionary encoded column to partition column casting bug #15652 (haruband)
  • fix: recursion protection for physical plan node #15600 (chenkovsky)
  • fix: add map coercion for binary ops #15551 (alexwilcoxson-rel)
  • fix: Rewrite date_trunc and from_unixtime for the SQLite unparser #15630 (peasee)
  • fix(substrait): fix regressed edge case in renaming inner struct fields #15634 (Blizzara)
  • fix: normalize window ident #15639 (chenkovsky)
  • fix: unparse join without projection #15693 (chenkovsky)

Documentation updates:

  • MINOR fix(docs): set the proper link for dev-env setup in contrib guide #14960 (clflushopt)
  • Add Upgrade Guide for DataFusion 46.0.0 #14891 (alamb)
  • Improve SessionStateBuilder::new documentation #14980 (alamb)
  • Minor: Replace Star and Fork buttons in docs with static versions #14988 (amoeba)
  • Fix documentation warnings and error if anymore occur #14952 (AmosAidoo)
  • docs: Improve docs on AggregateFunctionExpr construction #15044 (ctsk)
  • Minor: More comment to aggregation fuzzer #15048 (2010YOUY01)
  • Improve benchmark documentation #15054 (carols10cents)
  • doc: update RecordBatchReceiverStreamBuilder::spawn_blocking task behaviour #14995 (shruti2522)
  • doc: Correct benchmark command #15094 (qazxcdswe123)
  • Add insta / snapshot testing to CLI & set up AWS mock #13672 (blaginin)
  • Config: Add support default sql varchar to view types #15104 (zhuqi-lucas)
  • Support EXPLAIN ... FORMAT <indent | tree | json | graphviz > ... #15166 (alamb)
  • Update version to 46.0.1, add CHANGELOG (#15243) #15244 (xudong963)
  • docs: update documentation for Final GroupBy in accumulator.rs #15279 (qazxcdswe123)
  • minor: fix data/sqlite link #15286 (sdht0)
  • Add upgrade notes for array signatures #15237 (jkosh44)
  • Add doc for the statistics_from_parquet_meta_calc method #15330 (xudong963)
  • added explaination for Schema and DFSchema to documentation #15329 (Jiashu-Hu)
  • Documentation: Plan custom expressions #15353 (Jiashu-Hu)
  • Update concepts-readings-events.md #15440 (berkaysynnada)
  • Add support for DISTINCT + ORDER BY in ARRAY_AGG #14413 (gabotechs)
  • Update the copyright year #15453 (omkenge)
  • Docs: Formatting and Added Extra resources #15450 (2SpaceMasterRace)
  • Add documentation for Run extended tests command #15463 (alamb)
  • bench: Document how to use cross platform Samply profiler #15481 (comphead)
  • Update user guide to note decimal is not experimental anymore #15515 (Jiashu-Hu)
  • datafusion-cli: document reading partitioned parquet #15505 (marvelshan)
  • Update concepts-readings-events.md #15541 (oznur-synnada)
  • Add documentation example for AggregateExprBuilder #15504 (Shreyaskr1409)
  • Docs : Added Sql examples for window Functions : nth_val , etc #15555 (Adez017)
  • Add disk usage limit configuration to datafusion-cli #15586 (jsai28)
  • Bug fix : fix the bug in docs in 'cum_dist()' Example #15618 (Adez017)
  • Make tree the Default EXPLAIN Format and Reorder Documentation Sections #15706 (kosiew)
  • Add coerce int96 option for Parquet to support different TimeUnits, test int96_from_spark.parquet from parquet-testing #15537 (mbutrovich)
  • STRING_AGG missing functionality #14412 (gabotechs)
  • doc : update RepartitionExec display tree #15710 (getChan)
  • Update version to 47.0.0, add CHANGELOG #15731 (xudong963)

Other:

  • Improve documentation for DataSourceExec, FileScanConfig, DataSource etc #14941 (alamb)
  • Do not swap with projection when file is partitioned #14956 (blaginin)
  • Minor: Add more projection pushdown tests, clarify comments #14963 (alamb)
  • Update labeler components #14942 (alamb)
  • Deprecate Expr::Wildcard #14959 (linhr)
  • Minor: use FileScanConfig builder API in some tests #14938 (alamb)
  • Minor: improve documentation of AggregateMode #14946 (alamb)
  • chore(deps): bump thiserror from 2.0.11 to 2.0.12 #14971 (dependabot[bot])
  • chore(deps): bump pyo3 from 0.23.4 to 0.23.5 #14972 (dependabot[bot])
  • chore(deps): bump async-trait from 0.1.86 to 0.1.87 #14973 (dependabot[bot])
  • Fix verification script and extended tests due to rustup changes #14990 (alamb)
  • Split out avro, parquet, json and csv into individual crates #14951 (AdamGS)
  • Minor: Add backtrace feature in datafusion-cli #14997 (2010YOUY01)
  • chore: Update SessionStateBuilder::with_default_features does not replace existing features #14935 (irenjj)
  • Make create_ordering pub and add doc for it #14996 (xudong963)
  • Simplify Between expression to Eq #14994 (jayzhan211)
  • Count wildcard alias #14927 (jayzhan211)
  • replace TypeSignature::String with TypeSignature::Coercible #14917 (zjregee)
  • Minor: Add indentation to EnforceDistribution test plans. #15007 (wiedld)
  • Minor: add method SessionStateBuilder::new_with_default_features() #14998 (shruti2522)
  • Implement tree explain for FilterExec #15001 (alamb)
  • Unparser add AtArrow and ArrowAt conversion to BinaryOperator #14968 (cetra3)
  • Add dependency checks to verify-release-candidate script #15009 (waynexia)
  • Fix: to_char Function Now Correctly Handles DATE Values in DataFusion #14970 (kosiew)
  • Make Substrait Schema Structs always non-nullable #15011 (amoeba)
  • Adjust physical optimizer rule order, put ProjectionPushdown at last #15040 (xudong963)
  • Move UnwrapCastInComparison into Simplifier #15012 (jayzhan211)
  • chore(deps): bump aws-config from 1.5.17 to 1.5.18 #15041 (dependabot[bot])
  • chore(deps): bump bytes from 1.10.0 to 1.10.1 #15042 (dependabot[bot])
  • Minor: Deprecate ScalarValue::raw_data #15016 (qazxcdswe123)
  • Implement tree explain for DataSourceExec #15029 (alamb)
  • Refactor test suite in EnforceDistribution, to use standard test config. #15010 (wiedld)
  • Update ring to v0.17.13 #15063 (alamb)
  • Remove deprecated function OptimizerRule::try_optimize #15051 (qazxcdswe123)
  • Minor: fix CI to make the sqllogic testing result consistent #15059 (zhuqi-lucas)
  • Refactor SortPushdown using the standard top-down visitor and using EquivalenceProperties #14821 (wiedld)
  • Improve explain tree formatting for longer lines / word wrap #15031 (irenjj)
  • chore(deps): bump sqllogictest from 0.27.2 to 0.28.0 #15060 (dependabot[bot])
  • chore(deps): bump async-compression from 0.4.18 to 0.4.19 #15061 (dependabot[bot])
  • Handle columns in with_new_exprs with a Join #15055 (delamarch3)
  • Minor: Improve documentation of need_handle_count_bug #15050 (suibianwanwank)
  • Implement tree explain for HashJoinExec #15079 (irenjj)
  • Implement tree explain for PartialSortExec #15066 (irenjj)
  • Implement tree explain for SortExec #15077 (irenjj)
  • Minor: final 46.0.0 release tweaks: changelog + instructions #15073 (alamb)
  • Implement tree explain for NestedLoopJoinExec, CrossJoinExec, `So… #15081 (irenjj)
  • Implement tree explain for BoundedWindowAggExec and WindowAggExec #15084 (irenjj)
  • implement tree rendering for StreamingTableExec #15085 (Standing-Man)
  • chore(deps): bump semver from 1.0.25 to 1.0.26 #15116 (dependabot[bot])
  • chore(deps): bump clap from 4.5.30 to 4.5.31 #15115 (dependabot[bot])
  • implement tree explain for GlobalLimitExec #15100 (zjregee)
  • Minor: Cleanup useless/duplicated code in gen tools #15113 (lewiszlw)
  • Refactor EnforceDistribution test cases to demonstrate dependencies across optimizer runs. #15074 (wiedld)
  • Improve parsing extra_info in tree explain #15125 (irenjj)
  • Add tests for simplification and coercion of SessionContext::create_physical_expr #15034 (alamb)
  • Minor: Fix invalid query in test #15131 (alamb)
  • Do not display logical_plan win explain tree mode 🧹 #15132 (alamb)
  • Substrait support for propagating TableScan.filters to Substrait ReadRel.filter #14194 (jamxia155)
  • Fix wasm32 build on version 46 #15102 (XiangpengHao)
  • Fix broken serde feature #15124 (vadimpiven)
  • chore(deps): bump tempfile from 3.17.1 to 3.18.0 #15146 (dependabot[bot])
  • chore(deps): bump syn from 2.0.98 to 2.0.100 #15147 (dependabot[bot])
  • Implement tree explain for AggregateExec #15103 (zebsme)
  • Implement tree explain for RepartitionExec and WorkTableExec #15137 (Standing-Man)
  • Expand wildcard to actual expressions in prepare_select_exprs #15090 (jayzhan211)
  • fixed PushDownFilter bug [15047] #15142 (Jiashu-Hu)
  • Bump env_logger from 0.11.6 to 0.11.7 #15148 (mbrobbel)
  • Minor: fix extend sqllogical consistent with main test #15145 (zhuqi-lucas)
  • Implement tree rendering for SortPreservingMergeExec #15140 (Standing-Man)
  • Remove expand wildcard rule #15170 (jayzhan211)
  • chore: remove ScalarUDFImpl::return_type_from_exprs #15130 (Blizzara)
  • chore(deps): bump libc from 0.2.170 to 0.2.171 #15176 (dependabot[bot])
  • chore(deps): bump serde_json from 1.0.139 to 1.0.140 #15175 (dependabot[bot])
  • chore(deps): bump substrait from 0.53.2 to 0.54.0 #15043 (dependabot[bot])
  • Minor: split EXPLAIN and ANALYZE planning into different functions #15188 (alamb)
  • Implement tree explain for JsonSink #15185 (irenjj)
  • Split out datafusion-substrait and datafusion-proto CI feature checks, increase coverage #15156 (alamb)
  • Remove unused wildcard expanding methods #15180 (goldmedal)
  • #15108 issue: "Non Panic Task error" is not an internal error #15109 (Satyam018)
  • Implement tree explain for LazyMemoryExec #15187 (zebsme)
  • implement tree explain for CoalesceBatchesExec #15194 (Standing-Man)
  • Implement tree explain for CsvSink #15204 (irenjj)
  • chore(deps): bump blake3 from 1.6.0 to 1.6.1 #15198 (dependabot[bot])
  • chore(deps): bump clap from 4.5.31 to 4.5.32 #15199 (dependabot[bot])
  • chore(deps): bump serde from 1.0.218 to 1.0.219 #15197 (dependabot[bot])
  • Fix datafusion proto crate json feature #15172 (Owen-CH-Leung)
  • Add blog link to EquivalenceProperties docs #15215 (alamb)
  • Minor: split datafusion-cli testing into its own CI job #15075 (alamb)
  • Implement tree explain for InterleaveExec #15219 (zebsme)
  • Move catalog_common out of core #15193 (logan-keede)
  • chore(deps): bump tokio-util from 0.7.13 to 0.7.14 #15223 (dependabot[bot])
  • chore(deps): bump aws-config from 1.5.18 to 1.6.0 #15222 (dependabot[bot])
  • chore(deps): bump bzip2 from 0.5.1 to 0.5.2 #15221 (dependabot[bot])
  • Document guidelines for physical operator yielding #15030 (carols10cents)
  • Implement tree explain for ArrowFileSink, fix original URL #15206 (irenjj)
  • Implement tree explain for LocalLimitExec #15232 (shruti2522)
  • Use insta for DataFrame tests #15165 (blaginin)
  • Re-enable github discussion #15241 (2010YOUY01)
  • Minor: exclude datafusion-cli testing for mac #15240 (zhuqi-lucas)
  • Implement tree explain for CoalescePartitionsExec #15225 (Shreyaskr1409)
  • Enable used_underscore_binding clippy lint #15189 (Shreyaskr1409)
  • Simpler to see expressions in explain tree mode #15163 (irenjj)
  • Fix invalid schema for unions in ViewTables #15135 (Friede80)
  • Make ListingTableUrl::try_new public #15250 (linhr)
  • Fix wildcard dataframe case #15230 (jayzhan211)
  • Simplify the printing of all plans containing expr in tree mode #15249 (irenjj)
  • Support utf8view datatype for window #15257 (zhuqi-lucas)
  • chore: remove deprecated variants of UDF's invoke (invoke, invoke_no_args, invoke_batch) #15123 (Blizzara)
  • Improve feature flag CI coverage datafusion and datafusion-functions #15203 (alamb)
  • Add debug logging for default catalog overwrite in SessionState build #15251 (byte-sourcerer)
  • Implement tree explain for PlaceholderRowExec #15270 (zebsme)
  • Implement tree explain for UnionExec #15278 (zebsme)
  • Migrate dataframe tests to insta #15262 (jsai28)
  • Minor: consistently apply clippy::clone_on_ref_ptr in all crates #15284 (alamb)
  • chore(deps): bump async-trait from 0.1.87 to 0.1.88 #15294 (dependabot[bot])
  • chore(deps): bump uuid from 1.15.1 to 1.16.0 #15292 (dependabot[bot])
  • Add CatalogProvider and SchemaProvider to FFI Crate #15280 (timsaucer)
  • Refactor file schema type coercions #15268 (xudong963)
  • chore(deps): bump rust_decimal from 1.36.0 to 1.37.0 #15293 (dependabot[bot])
  • chore: Attach Diagnostic to "incompatible type in unary expression" error #15209 (onlyjackfrost)
  • Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype #15239 (zhuqi-lucas)
  • Migrate user_defined tests to insta #15255 (shruti2522)
  • Remove inline table scan analyzer rule #15201 (jayzhan211)
  • CI Red: Fix union in view table test #15300 (jayzhan211)
  • refactor: Move view and stream from datasource to catalog, deprecate View::try_new #15260 (logan-keede)
  • chore(deps): bump substrait from 0.54.0 to 0.55.0 #15305 (dependabot[bot])
  • chore(deps): bump half from 2.4.1 to 2.5.0 #15303 (dependabot[bot])
  • chore(deps): bump mimalloc from 0.1.43 to 0.1.44 #15304 (dependabot[bot])
  • Fix predicate pushdown for custom SchemaAdapters #15263 (adriangb)
  • Fix extended tests by restore datafusion-testing submodule #15318 (alamb)
  • Support Duration in min/max agg functions #15310 (svranesevic)
  • Migrate tests to insta #15288 (jsai28)
  • chore(deps): bump quote from 1.0.38 to 1.0.40 #15332 (dependabot[bot])
  • chore(deps): bump blake3 from 1.6.1 to 1.7.0 #15331 (dependabot[bot])
  • Simplify display format of AggregateFunctionExpr, add Expr::sql_name #15253 (irenjj)
  • chore(deps): bump indexmap from 2.7.1 to 2.8.0 #15333 (dependabot[bot])
  • chore(deps): bump tokio from 1.43.0 to 1.44.1 #15347 (dependabot[bot])
  • chore(deps): bump tempfile from 3.18.0 to 3.19.1 #15346 (dependabot[bot])
  • Minor: Keep debug symbols for release-nonlto build #15350 (2010YOUY01)
  • Use any instead of for_each #15289 (xudong963)
  • refactor: move CteWorkTable, default_table_source a bunch of files out of core #15316 (logan-keede)
  • Fix empty aggregation function count() in Substrait #15345 (gabotechs)
  • Improved error for expand wildcard rule #15287 (Jiashu-Hu)
  • Added tests with are writing into parquet files in memory for issue #… #15325 (pranavJibhakate)
  • Migrate physical plan tests to insta (Part-1) #15313 (Shreyaskr1409)
  • Fix array_has_all and array_has_any with empty array #15039 (LuQQiu)
  • Update datafusion-testing pin to fix extended tests #15368 (alamb)
  • chore(deps): Update sqlparser to 0.55.0 #15183 (PokIsemaine)
  • Only unnest source for EmptyRelation #15159 (blaginin)
  • chore(deps): bump rust_decimal from 1.37.0 to 1.37.1 #15378 (dependabot[bot])
  • chore(deps): bump chrono-tz from 0.10.1 to 0.10.2 #15377 (dependabot[bot])
  • remove the duplicate test for unparser #15385 (goldmedal)
  • Minor: add average time for clickbench benchmark query #15381 (zhuqi-lucas)
  • include some BinaryOperator from sqlparser #15327 (waynexia)
  • Add "end to end parquet reading test" for WASM #15362 (jsai28)
  • Migrate physical plan tests to insta (Part-2) #15364 (Shreyaskr1409)
  • Migrate physical plan tests to insta (Part-3 / Final) #15399 (Shreyaskr1409)
  • Restore lazy evaluation of fallible CASE #15390 (findepi)
  • chore(deps): bump log from 0.4.26 to 0.4.27 #15410 (dependabot[bot])
  • chore(deps): bump chrono-tz from 0.10.2 to 0.10.3 #15412 (dependabot[bot])
  • Perf: Support Utf8View datatype single column comparisons for SortPreservingMergeStream #15348 (zhuqi-lucas)
  • Enforce JOIN plan to require condition #15334 (goldmedal)
  • Fix type coercion for unsigned and signed integers (Int64 vs UInt64, etc) #15341 (Omega359)
  • simplify array_has UDF to InList expr when haystack is constant #15354 (davidhewitt)
  • Move DataSink to datasource and add session crate #15371 (jayzhan-synnada)
  • refactor: SpillManager into a separate file #15407 (Weijun-H)
  • Always use PartitionMode::Auto in planner #15339 (Dandandan)
  • Fix link to Volcano paper #15437 (JackKelly)
  • minor: Add new crates to labeler #15426 (logan-keede)
  • refactor: Use SpillManager for all spilling scenarios #15405 (2010YOUY01)
  • refactor(hash_join): Move JoinHashMap to separate mod #15419 (ctsk)
  • Migrate datasource tests to insta #15258 (shruti2522)
  • Add downcast_to_source method for DataSourceExec #15416 (xudong963)
  • refactor: use TypeSignature::Coercible for crypto functions #14826 (Chen-Yuan-Lai)
  • Minor: fix doc for FileGroupPartitioner #15448 (xudong963)
  • chore(deps): bump clap from 4.5.32 to 4.5.34 #15452 (dependabot[bot])
  • Fix roundtrip bug with empty projection in DataSourceExec #15449 (XiangpengHao)
  • Triggering extended tests through PR comment: Run extended tests #15101 (danila-b)
  • Use equals_datatype to compare type when type coercion #15366 (goldmedal)
  • Fix no effect metrics bug in ParquetSource #15460 (XiangpengHao)
  • chore(deps): bump aws-config from 1.6.0 to 1.6.1 #15470 (dependabot[bot])
  • minor: Allow to run TPCH bench for a specific query #15467 (comphead)
  • Migrate subtraits tests to insta, part1 #15444 (qstommyshu)
  • Add FileScanConfigBuilder #15352 (blaginin)
  • Update ClickBench queries to avoid to_timestamp_seconds #15475 (acking-you)
  • Remove CoalescePartitions insertion from HashJoinExec #15476 (ctsk)
  • Migrate-substrait-tests-to-insta, part2 #15480 (qstommyshu)
  • Revert #15476 to fix the datafusion-examples CI fail #15496 (goldmedal)
  • Migrate datafusion/sql tests to insta, part1 #15497 (qstommyshu)
  • Allow type coersion of zero input arrays to nullary #15487 (timsaucer)
  • Decimal type support for to_timestamp #15486 (jatin510)
  • refactor: Move Memtable to catalog #15459 (logan-keede)
  • Migrate optimizer tests to insta #15446 (qstommyshu)
  • FIX : some benchmarks are failing #15367 (getChan)
  • Add query to extended clickbench suite for "complex filter" #15500 (acking-you)
  • Extract tokio runtime creation from hot loop in benchmarks #15508 (Omega359)
  • chore(deps): bump blake3 from 1.7.0 to 1.8.0 #15502 (dependabot[bot])
  • Minor: clone and debug for FileSinkConfig #15516 (jayzhan211)
  • use state machine to refactor the get_files_with_limit method #15521 (xudong963)
  • Migrate datafusion/sql tests to insta, part2 #15499 (qstommyshu)
  • Disable sccache action to fix gh cache issue #15536 (Omega359)
  • refactor: Cleanup unused fetch field inside ExternalSorter #15525 (2010YOUY01)
  • Fix duplicate unqualified Field name (schema error) on join queries #15438 (LiaCastaneda)
  • Add utf8view benchmark for aggregate topk #15518 (zhuqi-lucas)
  • ArraySort: support structs #15527 (cht42)
  • Migrate datafusion/sql tests to insta, part3 #15533 (qstommyshu)
  • Migrate datafusion/sql tests to insta, part4 #15548 (qstommyshu)
  • Add topk information into tree explain plans #15547 (kumarlokesh)
  • Minor: add Arc for statistics in FileGroup #15564 (xudong963)
  • Test: configuration fuzzer for (external) sort queries #15501 (2010YOUY01)
  • minor: Organize fields inside SortMergeJoinStream #15557 (suibianwanwank)
  • Minor: rm session downcast #15575 (jayzhan211)
  • Migrate datafusion/sql tests to insta, part5 #15567 (qstommyshu)
  • Add SQL logic tests for compound field access in JOIN conditions #15556 (kosiew)
  • Run audit CI check on all pushes to main #15572 (alamb)
  • Introduce load-balanced split_groups_by_statistics method #15473 (xudong963)
  • chore: update clickbench #15574 (chenkovsky)
  • Improve spill performance: Disable re-validation of spilled files #15454 (zebsme)
  • chore: rm duplicated JoinOn type #15590 (jayzhan211)
  • Chore: Call arrow's methods row_count and skipped_row_count #15587 (jayzhan211)
  • Actually run wasm test in ci #15595 (XiangpengHao)
  • Migrate datafusion/sql tests to insta, part6 #15578 (qstommyshu)
  • Add test case for new casting feature from date to tz-aware timestamps #15609 (friendlymatthew)
  • Remove CoalescePartitions insertion from Joins #15570 (ctsk)
  • fix doc and broken api #15602 (logan-keede)
  • Migrate datafusion/sql tests to insta, part7 #15621 (qstommyshu)
  • ignore security_audit CI check proc-macro-error warning #15626 (Jiashu-Hu)
  • chore(deps): bump tokio from 1.44.1 to 1.44.2 #15627 (dependabot[bot])
  • Upgrade toolchain to Rust-1.86 #15625 (jsai28)
  • chore(deps): bump bigdecimal from 0.4.7 to 0.4.8 #15523 (dependabot[bot])
  • chore(deps): bump the arrow-parquet group across 1 directory with 7 updates #15593 (dependabot[bot])
  • chore: improve RepartitionExec display tree #15606 (getChan)
  • Move back schema not matching check and workaround #15580 (LiaCastaneda)
  • Minor: refine comments for statistics compution #15647 (xudong963)
  • Remove uneeded binary_op benchmarks #15632 (alamb)
  • chore(deps): bump blake3 from 1.8.0 to 1.8.1 #15650 (dependabot[bot])
  • chore(deps): bump mimalloc from 0.1.44 to 0.1.46 #15651 (dependabot[bot])
  • chore: avoid erroneuous warning for FFI table operation (only not default value) #15579 (chenkovsky)
  • Update datafusion-testing pin (to fix extended test on main) #15655 (alamb)
  • Ignore false positive only_used_in_recursion Clippy warning #15635 (DerGut)
  • chore: Rename protobuf Java package #15658 (andygrove)
  • Remove redundant Precision combination code in favor of Precision::min/max/add #15659 (alamb)
  • Introduce DynamicFilterSource and DynamicPhysicalExpr #15568 (adriangb)
  • Public some projected methods in FileScanConfig #15671 (xudong963)
  • fix decimal precision issue in simplify expression optimize rule #15588 (jayzhan211)
  • Implement Future for SpawnedTask. #15653 (ashdnazg)
  • chore(deps): bump crossbeam-channel from 0.5.14 to 0.5.15 #15674 (dependabot[bot])
  • chore(deps): bump clap from 4.5.34 to 4.5.35 #15668 (dependabot[bot])
  • [Minor] Use interleave_record_batch in TopK implementation #15677 (Dandandan)
  • Consolidate statistics merging code (try 2) #15661 (alamb)
  • Add Table Functions to FFI Crate #15581 (timsaucer)
  • Remove waits from blocking threads reading spill files. #15654 (ashdnazg)
  • chore(deps): bump sysinfo from 0.33.1 to 0.34.2 #15682 (dependabot[bot])
  • Minor: add order by arg for last value #15695 (jayzhan211)
  • Upgrade to arrow/parquet 55, and object_store to 0.12.0 and pyo3 to 0.24.0 #15466 (alamb)
  • tests: only refresh the minimum sysinfo in mem limit tests. #15702 (ashdnazg)
  • ci: fix workflow triggering extended tests from pr comments. #15704 (ashdnazg)
  • chore(deps): bump flate2 from 1.1.0 to 1.1.1 #15703 (dependabot[bot])
  • Fix internal error in sort when hitting memory limit #15692 (DerGut)
  • Update checked in Cargo.lock file to get clean CI #15725 (alamb)
  • chore(deps): bump indexmap from 2.8.0 to 2.9.0 #15732 (dependabot[bot])
  • Minor: include output partition count of RepartitionExec to tree explain #15717 (2010YOUY01)

Credits

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.

    48	dependabot[bot]
    34	Andrew Lamb
    16	xudong.w
    15	Jay Zhan
    15	Qi Zhu
    15	irenjj
    13	Chen Chongchen
    13	Yongting You
    10	Tommy shu
     7	Shruti Sharma
     6	Alan Tang
     6	Arttu
     6	Jiashu Hu
     6	Shreyas (Lua)
     6	logan-keede
     6	zeb
     5	Dmitrii Blaginin
     5	Geoffrey Claude
     5	Jax Liu
     5	YuNing Chen
     4	Bruce Ritchie
     4	Christian
     4	Eshed Schacham
     4	Xiangpeng Hao
     4	wiedld
     3	Adrian Garcia Badaracco
     3	Daniël Heres
     3	Gabriel
     3	LB7666
     3	Namgung Chan
     3	Ruihang Xia
     3	Tim Saucer
     3	jsai28
     3	kosiew
     3	suibianwanwan
     2	Bryce Mecum
     2	Carol (Nichols || Goulding)
     2	Heran Lin
     2	Jannik Steinmann
     2	Jyotir Sai
     2	Li-Lun Lin
     2	Lía Adriana
     2	Oleks V
     2	Raz Luvaton
     2	UBarney
     2	aditya singh rathore
     2	westhide
     2	zjregee
     1	@clflushopt
     1	Adam Gutglick
     1	Alex Huang
     1	Alex Wilcoxson
     1	Amos Aidoo
     1	Andy Grove
     1	Andy Yen
     1	Berkay Şahin
     1	Chang
     1	Danila Baklazhenko
     1	David Hewitt
     1	Emily Matheys
     1	Eren Avsarogullari
     1	Hari Varsha
     1	Ian Lai
     1	Jack Kelly
     1	Jagdish Parihar
     1	Joseph Koshakow
     1	Lokesh
     1	LuQQiu
     1	Matt Butrovich
     1	Matt Friede
     1	Matthew Kim
     1	Matthijs Brobbel
     1	Om Kenge
     1	Owen Leung
     1	Peter L
     1	Piotr Findeisen
     1	Rohan Krishnaswamy
     1	Satyam018
     1	Sava Vranešević
     1	Siddhartha Sahu
     1	Sile Zhou
     1	Vadim Piven
     1	Zaki
     1	christophermcdermott
     1	cht42
     1	cjw
     1	delamarch3
     1	ding-young
     1	haruband
     1	jamxia155
     1	oznur-synnada
     1	peasee
     1	pranavJibhakate
     1	张林伟

Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.