<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -->

38.0.0 (2024-05-07)

Breaking changes:

  • refactor: make dfschema wrap schemaref #9595 (haohuaijin)
  • Make FirstValue an UDAF, Change AggregateUDFImpl::accumulator signature, support ORDER BY for UDAFs #9874 (jayzhan211)
  • Remove OwnedTableReference and OwnedSchemaReference #9933 (comphead)
  • Consistent LogicalPlan subquery handling in TreeNode::apply and TreeNode::visit #9913 (peter-toth)
  • Refactor Optimizer to use owned plans and TreeNode API (10% faster planning) #9948 (alamb)
  • Stop copying plans in LogicalPlan::with_param_values #10016 (alamb)
  • Move coalesce to datafusion-functions and remove BuiltInScalarFunction #10098 (Omega359)
  • Refactor sessionconfig set fns to avoid an unnecessary enum to string conversion #10141 (psvri)
  • ScalarUDF: Remove supports_zero_argument and avoid creating null array for empty args #10193 (jayzhan211)
  • Clean-up: Remove AggregateExec::group_by() #10297 (berkaysynnada)
  • Remove ScalarFunctionDefinition::Name #10277 (lewiszlw)
  • feat: Determine ordering of file groups #9593 (suremarc)
  • Split parquet bloom filter config and enable bloom filter on read by default #10306 (lewiszlw)
  • Improve coerce API so it does not need DFSchema #10331 (alamb)
  • Minor: Do not force analyzer to copy logical plans #10367 (alamb)
  • Move Covariance (Sample) covar / covar_samp to be a User Defined Aggregate Function #10372 (jayzhan211)

Performance related:

  • perf: Use Arc<str> instead of Cow<&'a> in the analyzer #9824 (comphead)
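
As a rough illustration of why an `Arc<str>` name can be cheaper to pass around than a borrowed-or-owned `Cow`, the sketch below uses only the standard library. `RelationName` and its methods are hypothetical names invented for this example and are not DataFusion types; the point is only that cloning an `Arc<str>` bumps a reference count instead of copying the string bytes.

```rust
use std::sync::Arc;

/// Hypothetical name holder (not a DataFusion type): the string lives in a
/// reference-counted buffer, so clones share one allocation.
#[derive(Debug, Clone, PartialEq)]
pub struct RelationName {
    name: Arc<str>,
}

impl RelationName {
    /// Accepts `&str`, `String`, or an existing `Arc<str>`.
    pub fn new(name: impl Into<Arc<str>>) -> Self {
        Self { name: name.into() }
    }

    /// Borrows the underlying string.
    pub fn as_str(&self) -> &str {
        &self.name
    }

    /// Returns true when both values point at the same allocation.
    pub fn shares_buffer_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.name, &other.name)
    }
}
```

Cloning a `RelationName` yields a value that compares equal and shares the original buffer, while building a second one from a fresh `String` compares equal but owns a separate allocation.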

Implemented enhancements:

  • feat: Add display_pg_json for LogicalPlan #9789 (liurenjie1024)
  • feat: eliminate redundant sorts on monotonic expressions #9813 (suremarc)
  • feat: optimize lower and upper functions #9971 (JasonLi-cn)
  • feat: support unnest multiple arrays #10044 (jonahgao)
  • feat: DataFrame supports unnesting multiple columns #10118 (jonahgao)
  • feat: support input reordering for NestedLoopJoinExec #9676 (korowa)
  • feat: add static_name() to ExecutionPlan #10266 (waynexia)
  • feat: add optimizer config param to avoid grouping partitions prefer_existing_union #10259 (NGA-TRAN)
  • feat: unwrap casts of string and dictionary columns #10323 (erratic-pattern)
  • feat: Add CrossJoin match case to unparser #10371 (sardination)
  • feat: run expression simplifier in a loop until a fixedpoint or 3 cycles #10358 (erratic-pattern)
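
The last entry above describes running the simplifier repeatedly until the expression stops changing or a cycle limit is hit. A minimal sketch of that control flow, assuming a toy `Expr` type and a single `x + 0` rewrite rule (both hypothetical and unrelated to DataFusion's actual `Expr` or simplifier code), could look like this:

```rust
/// Toy expression type for illustration only (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Lit(i64),
    Col(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Upper bound on passes, mirroring the "3 cycles" in the entry above.
pub const MAX_CYCLES: usize = 3;

/// One top-down pass: rewrites `e + 0` and `0 + e` to `e`, otherwise
/// descends into the children. A rewritten node is not revisited within
/// the same pass, so nested cases can need more than one pass.
pub fn simplify_once(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (*l, *r) {
            (Expr::Lit(0), e) | (e, Expr::Lit(0)) => e,
            (l, r) => Expr::Add(
                Box::new(simplify_once(l)),
                Box::new(simplify_once(r)),
            ),
        },
        other => other,
    }
}

/// Repeats `simplify_once` until a pass changes nothing (a fixed point)
/// or `MAX_CYCLES` passes have run. Returns the result and the pass count.
pub fn simplify(expr: Expr) -> (Expr, usize) {
    let mut current = expr;
    let mut cycles = 0;
    while cycles < MAX_CYCLES {
        let next = simplify_once(current.clone());
        cycles += 1;
        if next == current {
            break;
        }
        current = next;
    }
    (current, cycles)
}
```

With this toy rule, `x + (0 + 0)` becomes `x + 0` on the first pass and `x` on the second, and the third pass confirms the fixed point. A more deeply nested input can hit the cycle limit before it is fully simplified, which is the trade-off the cap accepts in exchange for bounded planning time.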

Fixed bugs:

  • fix: detect non-recursive CTEs in the recursive WITH clause #9836 (jonahgao)
  • fix: improve unnest_generic_list handling of null list #9975 (jonahgao)
  • fix: reduce lock contention in RepartitionExec::execute #10009 (crepererum)
  • fix: RepartitionExec metrics #10025 (crepererum)
  • fix: Support Dict types in in_list physical plans #10031 (advancedxy)
  • fix: Specify row count in sort_batch for batch with no columns #10094 (viirya)
  • fix: another non-deterministic test in joins.slt #10122 (korowa)
  • fix: duplicate output for HashJoinExec in CollectLeft mode #9757 (korowa)
  • fix: cargo warnings of import item #10196 (waynexia)
  • fix: reduce lock contention in distributor channels #10026 (crepererum)
  • fix: no longer support the substring function #10242 (jonahgao)
  • fix: Correct null_count in describe() #10260 (Weijun-H)
  • fix: schema error when parsing order-by expressions #10234 (jonahgao)
  • fix: LogFunc simplify swaps arguments #10360 (erratic-pattern)

Documentation updates:

  • Update COPY documentation to reflect changes #9754 (alamb)
  • doc: Add datafusion-federation to Integrations #9853 (phillipleblanc)
  • Improve AggregateUDFImpl::state_fields documentation #9919 (alamb)
  • Update datafusion-cli docs, split up #10078 (alamb)
  • Fix large futures causing stack overflows #10033 (sergiimk)
  • Update documentation to replace Apache Arrow DataFusion with Apache DataFusion #10130 (andygrove)
  • Update github repo links #10167 (lewiszlw)
  • minor: fix installation section link #10179 (comphead)
  • Improve documentation on TreeNode #10035 (alamb)
  • Update .asf.yaml to publish docs to datafusion.apache.org #10190 (phillipleblanc)
  • Update links to point to datafusion.apache.org #10195 (phillipleblanc)
  • doc: fix subscribe mail link to datafusion mailing lists #10225 (jackwener)
  • Fix docs.rs build for datafusion-proto (hopefully) #10254 (alamb)
  • docs: add download page #10271 (tisonkun)
  • Clarify docs explaining the relationship between SessionState and SessionContext #10350 (alamb)
  • docs: Add DataFusion subprojects to navigation menu, other minor updates #10362 (andygrove)

Merged pull requests:

  • Prepare 37.0.0 Release #9697 (andygrove)
  • move Left, Lpad, Reverse, Right, Rpad functions to datafusion_functions #9841 (Omega359)
  • Add non-column expression equality tracking to filter exec #9819 (mustafasrepo)
  • datafusion-cli support for multiple commands in a single line #9831 (berkaysynnada)
  • Add tests for filtering, grouping, aggregation of ARRAYs #9695 (alamb)
  • Remove vestigial conbench integration #9855 (alamb)
  • feat: Add display_pg_json for LogicalPlan #9789 (liurenjie1024)
  • Update COPY documentation to reflect changes #9754 (alamb)
  • Minor: Remove the bench most likely to cause OOM in CI #9858 (gruuya)
  • Minor: make uuid an optional dependency on datafusion-functions #9771 (alamb)
  • doc: Add Spice.ai to Known Users #9852 (phillipleblanc)
  • minor: add a hint how to adjust max rows displayed #9845 (comphead)
  • Exclude .github directory from release tarball #9850 (andygrove)
  • move strpos, substr functions to datafusion_functions #9849 (Omega359)
  • doc: Add datafusion-federation to Integrations #9853 (phillipleblanc)
  • chore(deps): update cargo requirement from 0.77.0 to 0.78.1 #9844 (dependabot[bot])
  • chore(deps-dev): bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /datafusion/wasmtest/datafusion-wasm-app #9741 (dependabot[bot])
  • Implement semi/anti join output statistics estimation #9800 (korowa)
  • move Log2, Log10, Ln to datafusion-functions #9869 (tinfoil-knight)
  • Add CI compile checks for feature flags in datafusion-functions #9772 (alamb)
  • move the Translate, SubstrIndex, FindInSet functions to datafusion-functions #9864 (Omega359)
  • Support custom struct field names with new scalar function named_struct #9743 (gstvg)
  • Allow declaring partition columns in PARTITION BY clause, backwards compatible #9599 (MohamedAbdeen21)
  • Minor: Move depcheck out of datafusion crate (200 less crates to compile) #9865 (alamb)
  • Minor: delete duplicate bench test #9866 (Lordworms)
  • parquet: Add tests for pruning on Int8/Int16/Int64 columns #9778 (progval)
  • move Atan2, Atan, Acosh, Asinh, Atanh to datafusion-functions #9872 (Weijun-H)
  • minor(doc): fix dead link for catalogs example #9883 (yjshen)
  • parquet: Add tests for page pruning on unsigned integers #9888 (progval)
  • fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. #9871 (wiedld)
  • [CI] Use alias for table.struct #9894 (jayzhan211)
  • fix: detect non-recursive CTEs in the recursive WITH clause #9836 (jonahgao)
  • Minor: Add SIGMOD paper reference to architecture guide #9886 (alamb)
  • refactor: add macro for the binary math function in datafusion-functions #9889 (Weijun-H)
  • Add benchmark for substr_index #9878 (Omega359)
  • Add test for reading back file created with COPY ... OPTIONS (FORMAT..) options #9753 (alamb)
  • Add Expr->String for SimilarTo, IsNotTrue, IsNotUnknown,Negative #9902 (yyy1000)
  • refactor: make dfschema wrap schemaref #9595 (haohuaijin)
  • Add spilled_rows metric to ExternalSorter by IPCWriter #9885 (erenavsarogullari)
  • Minor: Add ParquetExec::table_parquet_options accessor #9909 (alamb)
  • Add support for Bloom filters on unsigned integer columns in Parquet tables #9770 (progval)
  • Move radians, signum, sin, sinh and sqrt functions to datafusion-functions crate #9882 (erenavsarogullari)
  • refactor: make all udf function impls public #9903 (universalmind303)
  • Minor: Improve math expr description #9911 (caicancai)
  • perf: Use Arc<str> instead of Cow<&'a> in the analyzer #9824 (comphead)
  • Use struct instead of named_struct when there are no aliases #9897 (alamb)
  • Improve planning speed using impl Into<Arc<str>> to create Arc<str> rather than &str #9916 (alamb)
  • Make FirstValue an UDAF, Change AggregateUDFImpl::accumulator signature, support ORDER BY for UDAFs #9874 (jayzhan211)
  • Add TPCH-DS planning benchmark #9907 (alamb)
  • Simplify Expr::map_children #9876 (peter-toth)
  • CrossJoin Refactor #9830 (berkaysynnada)
  • Optimization: concat function #9732 (JasonLi-cn)
  • Improve AggregateUDFImpl::state_fields documentation #9919 (alamb)
  • chore(deps): update substrait requirement from 0.28.0 to 0.29.0 #9942 (dependabot[bot])
  • test: fix intermittent failure in cte.slt #9934 (jonahgao)
  • Move cbrt, cos, cosh, degrees to datafusion-functions #9938 (erenavsarogullari)
  • Add Expr->String for Exists, Sort #9936 (kevinmingtarja)
  • Remove OwnedTableReference and OwnedSchemaReference #9933 (comphead)
  • Prune out constant expressions from output ordering. #9947 (mustafasrepo)
  • Move AggregateExpr, PhysicalExpr and PhysicalSortExpr to physical-expr-core #9926 (jayzhan211)
  • Minor: Update release README #9956 (alamb)
  • Optimize COUNT(1): Change the sentinel value's type for COUNT(*) to Int64 #9944 (gruuya)
  • Improve docs for TableProvider::supports_filters_pushdown and remove deprecated function #9923 (alamb)
  • Minor: Improve documentation for AggregateUDFImpl::accumulator and AccumulatorArgs #9920 (alamb)
  • Minor: improve TableReference docs #9952 (alamb)
  • Fix datafusion-cli publishing #9955 (alamb)
  • Simplify TreeNode recursions #9965 (peter-toth)
  • Validate partitions columns in CREATE EXTERNAL TABLE if table already exists. #9912 (MohamedAbdeen21)
  • Minor: Add additional documentation to CommonSubexprEliminate #9959 (alamb)
  • Fix tpcds planning stack overflows - Join planning refactoring #9962 (Jefffrey)
  • coercion vec[Dictionary, Utf8] to Dictionary for coalesce function #9958 (Lordworms)
  • Minor: Update library documentation with new crates #9966 (alamb)
  • Minor: Return InternalError rather than panic for NamedStructField should be rewritten in OperatorToFunction #9968 (alamb)
  • minor: update MSRV 1.73 #9977 (comphead)
  • Move First Value UDAF and builtin first / last function to aggregate-functions #9960 (jayzhan211)
  • Minor: Avoid copying all expressions in Analyzer / check_plan #9974 (alamb)
  • Minor: Improve documentation about optimizer #9967 (alamb)
  • Minor: Use Expr::apply() instead of inspect_expr_pre() #9984 (peter-toth)
  • Update documentation for COPY command #9931 (alamb)
  • Minor: fix bug in pruning predicate doc #9986 (alamb)
  • fix: improve unnest_generic_list handling of null list #9975 (jonahgao)
  • Consistent LogicalPlan subquery handling in TreeNode::apply and TreeNode::visit #9913 (peter-toth)
  • Remove unnecessary result in DFSchema::index_of_column_by_name #9990 (lewiszlw)
  • Removes Bloom filter for Int8/Int16/Uint8/Uint16 #9969 (edmondop)
  • Move LogicalPlan tree_node module #9995 (alamb)
  • Optimize performance of substr_index and add tests #9973 (kevinmingtarja)
  • move Floor, Gcd, Lcm, Pi to datafusion-functions #9976 (Omega359)
  • Minor: Improve documentation on LogicalPlan::apply* and LogicalPlan::map* #9996 (alamb)
  • move the Log, Power functions to datafusion-functions #9983 (tinfoil-knight)
  • Remove FORMAT <..> backwards compatibility options from COPY #9985 (tinfoil-knight)
  • move Trunc, Cot, Round, iszero functions to datafusion-functions #10000 (Omega359)
  • Minor: Clarify documentation on PruningStatistics::row_counts and PruningStatistics::null_counts and make test match #10004 (alamb)
  • Avoid LogicalPlan::clone() in LogicalPlan::map_children when possible #9999 (alamb)
  • Introduce TreeNode::exists() API, avoid copying expressions #10008 (peter-toth)
  • Minor: Make LogicalPlan::apply_subqueries and LogicalPlan::map_subqueries pub #9998 (alamb)
  • Move Nanvl and random functions to datafusion-functions #10017 (Omega359)
  • fix: reduce lock contention in RepartitionExec::execute #10009 (crepererum)
  • chore(deps): update rstest requirement from 0.18.0 to 0.19.0 #10021 (dependabot[bot])
  • Minor: Document LogicalPlan tree node transformations #10010 (alamb)
  • Refactor Optimizer to use owned plans and TreeNode API (10% faster planning) #9948 (alamb)
  • Further clarification of the supports_filters_pushdown documentation #9988 (cisaacson)
  • Prune columns that are all null in ParquetExec by row_counts, handle IS NOT NULL #9989 (Ted-Jiang)
  • Improve the performance of ltrim/rtrim/btrim #10006 (JasonLi-cn)
  • fix: RepartitionExec metrics #10025 (crepererum)
  • modify emit() of TopK to emit on batch_size rather than batch_size-1 #10030 (JasonLi-cn)
  • Consolidate LogicalPlan tree node walking/rewriting code into one module #10034 (alamb)
  • Introduce OptimizerRule::rewrite to rewrite in place, rewrite ExprSimplifier (20% faster planning) #9954 (alamb)
  • Fix DistinctCount for timestamps with time zone #10043 (joroKr21)
  • Improve documentation on LogicalPlan TreeNode methods #10037 (alamb)
  • chore(deps): update prost-build requirement from =0.12.3 to =0.12.4 #10045 (crepererum)
  • Fix datafusion-cli cursor isn't on the right position in windows 7 cmd #10028 (colommar)
  • Always pass DataType to PrimitiveDistinctCountAccumulator #10047 (joroKr21)
  • Stop copying plans in LogicalPlan::with_param_values #10016 (alamb)
  • fix NamedStructField should be rewritten in OperatorToFunction in subquery regression (change ApplyFunctionRewrites to use TreeNode API) #10032 (alamb)
  • Avoid copies in InlineTableScan via TreeNode API #10038 (alamb)
  • Bump sccache-action to v0.0.4 #10060 (phillipleblanc)
  • chore: add GitHub workflow to close stale PRs #10046 (andygrove)
  • feat: eliminate redundant sorts on monotonic expressions #9813 (suremarc)
  • Disable crypto_expressions feature properly for --no-default-features #10059 (phillipleblanc)
  • Return self in EmptyExec and PlaceholderRowExec with_new_children #10052 (joroKr21)
  • chore(deps): update sqllogictest requirement from 0.19.0 to 0.20.0 #10057 (dependabot[bot])
  • Rename FileSinkExec to DataSinkExec #10065 (phillipleblanc)
  • fix: Support Dict types in in_list physical plans #10031 (advancedxy)
  • Prune pages that are all null in ParquetExec by row_counts and fix NOT NULL prune #10051 (Ted-Jiang)
  • Refactor EliminateOuterJoin to implement OptimizerRule::rewrite() #10081 (peter-toth)
  • chore(deps): update substrait requirement from 0.29.0 to 0.30.0 #10084 (dependabot[bot])
  • feat: optimize lower and upper functions #9971 (JasonLi-cn)
  • Prepend sqllogictest explain result with line number #10019 (duongcongtoai)
  • Use PhysicalExtensionCodec consistently #10075 (joroKr21)
  • Minor: Do not truncate SHOW ALL in datafusion-cli #10079 (alamb)
  • Minor: get mutable ref to SessionConfig in SessionState #10050 (MichaelScofield)
  • Move ceil, exp, factorial to datafusion-functions crate #10083 (erenavsarogullari)
  • feat: support unnest multiple arrays #10044 (jonahgao)
  • cleanup(tests): Move tests from push_down_projections.rs to optimize_projections.rs #10071 (kavirajk)
  • Move conversion of FIRST/LAST Aggregate function to independent physical optimizer rule #10061 (jayzhan211)
  • Avoid copies in CountWildcardRule via TreeNode API #10066 (alamb)
  • Coerce Dictionary types for scalar functions #10077 (viirya)
  • Refactor UnwrapCastInComparison to implement OptimizerRule::rewrite() #10087 (peter-toth)
  • Improve ApproxPercentileAccumulator merge api and fix bug #10056 (Ted-Jiang)
  • Support http s3 endpoints in datafusion-cli via CREATE EXTERNAL TABLE #10080 (alamb)
  • [Bug Fix]: Deem hash repartition unnecessary when input and output has 1 partition #10095 (mustafasrepo)
  • fix: Specify row count in sort_batch for batch with no columns #10094 (viirya)
  • Move concat, concat_ws, ends_with, initcap to datafusion-functions #10089 (Omega359)
  • Update datafusion-cli docs, split up #10078 (alamb)
  • Refactor physical create_initial_plan to iteratively & concurrently construct plan from the bottom up #10023 (Jefffrey)
  • Adding TPCH benchmarks for Sort Merge Join #10092 (comphead)
  • [minor] make parquet prune tests more readable #10112 (Ted-Jiang)
  • Fix intermittent CI test failure in joins.slt #10120 (alamb)
  • Update dependabot to consider datafusion-cli #10108 (Jefffrey)
  • fix: another non-deterministic test in joins.slt #10122 (korowa)
  • Minor: only trigger dependency check on changes to Cargo.toml #10099 (alamb)
  • Refactor UnwrapCastInComparison to remove Expr clones #10115 (peter-toth)
  • Fix large futures causing stack overflows #10033 (sergiimk)
  • Avoid cloning in log::simplify and power::simplify #10086 (alamb)
  • feat: DataFrame supports unnesting multiple columns #10118 (jonahgao)
  • Minor: Refine dev/release/README.md #10129 (alamb)
  • Minor: Add default for Expr #10127 (peter-toth)
  • Update documentation to replace Apache Arrow DataFusion with Apache DataFusion #10130 (andygrove)
  • Fix AVG groups accumulator ignoring return type #10114 (gruuya)
  • Port 37.1.0 changes to main #10136 (alamb)
  • chore(deps): update substrait requirement from 0.30.0 to 0.31.0 #10140 (dependabot[bot])
  • Minor: Support more args for udaf #10146 (jayzhan211)
  • Minor: Signature check for UDAF #10147 (jayzhan211)
  • minor: avoid cloning the SetExpr during planning of SelectInto #10152 (jonahgao)
  • Add distinct aggregate tests to sqllogictest #10158 (Jefffrey)
  • Add test for LIKE newline handling #10160 (Jefffrey)
  • minor: unparser cleanup and new roundtrip test #10150 (devinjdangelo)
  • Support Duration and Union types in ScalarValue::iter_to_array #10139 (joroKr21)
  • chore(deps): update sqlparser requirement from 0.44.0 to 0.45.0 #10137 (Jefffrey)
  • fix: duplicate output for HashJoinExec in CollectLeft mode #9757 (korowa)
  • Move coalesce to datafusion-functions and remove BuiltInScalarFunction #10098 (Omega359)
  • [DOC] Add test example for backtraces #10143 (comphead)
  • Update github repo links #10167 (lewiszlw)
  • feat: support input reordering for NestedLoopJoinExec #9676 (korowa)
  • minor: fix installation section link #10179 (comphead)
  • Improve TreeNode and LogicalPlan APIs to accept owned closures, deprecate transform_down_mut() and transform_up_mut() #10126 (peter-toth)
  • Projection Expression - Input Field Inconsistencies during Projection #10088 (berkaysynnada)
  • implement short_circuits function for ScalarUDFImpl trait #10168 (Lordworms)
  • Improve documentation on TreeNode #10035 (alamb)
  • implement rewrite for ExtractEquijoinPredicate and avoid clone in filter #10165 (Lordworms)
  • Update .asf.yaml to point to new mailing list #10189 (phillipleblanc)
  • Update NOTICE.txt to be relevant to DataFusion #10185 (alamb)
  • Update .asf.yaml to publish docs to datafusion.apache.org #10190 (phillipleblanc)
  • Minor: Add Column::from(Tableref, &FieldRef), Expr::from(Column) and Expr::from(Tableref, &FieldRef) #10178 (alamb)
  • implement rewrite for FilterNullJoinKeys #10166 (Lordworms)
  • Implement rewrite for EliminateOneUnion and EliminateJoin #10184 (Lordworms)
  • Update links to point to datafusion.apache.org #10195 (phillipleblanc)
  • Minor: Introduce Expr::is_volatile(), adjust TreeNode::exists() #10191 (peter-toth)
  • Doc: Modify docs to fix old naming #10199 (comphead)
  • [MINOR] Remove ScalarFunction from datafusion.proto #10173 #10202 (dmitrybugakov)
  • Allow expr_to_sql unparsing with no quotes #10198 (phillipleblanc)
  • Minor: Avoid a clone in ArrayFunctionRewriter #10204 (alamb)
  • Move coalesce function from math to core #10201 (xxxuuu)
  • fix: cargo warnings of import item #10196 (waynexia)
  • Minor: Remove some clone in TypeCoercion #10203 (alamb)
  • doc: fix subscribe mail link to datafusion mailing lists #10225 (jackwener)
  • Minor: Prevent empty datafusion-cli commands #10219 (comphead)
  • Optimize date_bin (2x faster) #10215 (simonvandel)
  • Refactor sessionconfig set fns to avoid an unnecessary enum to string conversion #10141 (psvri)
  • fix: reduce lock contention in distributor channels #10026 (crepererum)
  • Avoid Expr copies in OptimizeProjection, 12% faster planning, encapsulate indices #10216 (alamb)
  • chore: Create a doap file #10233 (tisonkun)
  • Allow adding user defined metadata to ParquetSink #10224 (wiedld)
  • refactor EliminateDuplicatedExpr optimizer pass to avoid clone #10218 (Lordworms)
  • Support for median(distinct) aggregation function #10226 (Jefffrey)
  • Add tests that random() and uuid() produce unique values for each row #10248 (alamb)
  • ScalarUDF: Remove supports_zero_argument and avoid creating null array for empty args #10193 (jayzhan211)
  • Add Expr->String for WindowFunction #10243 (yyy1000)
  • Make function modules public, add Default impl's. #10239 (Omega359)
  • chore: Update release scripts to reflect move to TLP #10235 (andygrove)
  • Stop copying plans in EliminateLimit #10253 (kevinmingtarja)
  • Minor Clean-up in JoinSelection Tests #10249 (berkaysynnada)
  • fix: no longer support the substring function #10242 (jonahgao)
  • Fix docs.rs build for datafusion-proto (hopefully) #10254 (alamb)
  • Minor: Possibility to strip datafusion error name #10186 (comphead)
  • Docs: Add governance page to contributor guide #10238 (alamb)
  • Improve documentation on ColumnarValue #10265 (alamb)
  • Minor: Add comments for removed protobuf nodes #10252 (alamb)
  • feat: add static_name() to ExecutionPlan #10266 (waynexia)
  • Zero-copy conversion from SchemaRef to DfSchema #10298 (tustvold)
  • chore: Update Error for Unnest Rewriter #10263 (Weijun-H)
  • feat(CLI): print column headers for empty query results #10300 (jonahgao)
  • Clean-up: Remove AggregateExec::group_by() #10297 (berkaysynnada)
  • Add mailing list descriptions to documentation #10284 (alamb)
  • chore(deps): update substrait requirement from 0.31.0 to 0.32.0 #10279 (dependabot[bot])
  • refactor: Convert IPCWriter metrics from u64 to usize #10278 (erenavsarogullari)
  • Validate ScalarUDF output rows and fix nulls for array_has and get_field for Map #10148 (duongcongtoai)
  • Minor: return NULL for range and generate_series #10275 (Lordworms)
  • docs: add download page #10271 (tisonkun)
  • Minor: Add some more tests to map.slt #10301 (alamb)
  • fix: Correct null_count in describe() #10260 (Weijun-H)
  • chore: Add datatype info to error message #10307 (viirya)
  • feat: add optimizer config param to avoid grouping partitions prefer_existing_union #10259 (NGA-TRAN)
  • Remove ScalarFunctionDefinition::Name #10277 (lewiszlw)
  • Display: Support preserve_partitioning on SortExec physical plan. #10153 (kavirajk)
  • Fix build with missing use ( " return internal_err!("UDF returned a different ...") #10317 (alamb)
  • [Minor] Update link to list of committers in contributor guide #10312 (alamb)
  • Optimize EliminateFilter to avoid unnecessary copies #10288 #10302 (dmitrybugakov)
  • chore: add function to set prefer_existing_union #10322 (NGA-TRAN)
  • ExecutionPlan visitor example documentation #10286 (matthewmturner)
  • fix: schema error when parsing order-by expressions #10234 (jonahgao)
  • Stop copying LogicalPlan and Exprs in RewriteDisjunctivePredicate #10305 (rohitrastogi)
  • feat: unwrap casts of string and dictionary columns #10323 (erratic-pattern)
  • feat: Determine ordering of file groups #9593 (suremarc)
  • Stop copying LogicalPlan and Exprs in DecorrelatePredicateSubquery #10318 (alamb)
  • Minor: Add additional coalesce tests #10334 (alamb)
  • Minor: add a few more dictionary unwrap tests #10335 (alamb)
  • Check list size before concat in ScalarValue #10329 (timsaucer)
  • Split parquet bloom filter config and enable bloom filter on read by default #10306 (lewiszlw)
  • Improve coerce API so it does not need DFSchema #10331 (alamb)
  • Stop copying LogicalPlan and Exprs in PropagateEmptyRelation #10332 (dmitrybugakov)
  • Stop copying LogicalPlan and Exprs in EliminateNestedUnion #10319 (emgeee)
  • Fix clippy lints found by Clippy in Rust 1.78 #10353 (alamb)
  • Minor: Add sql level test for lead/lag on arrays #10345 (alamb)
  • fix: LogFunc simplify swaps arguments #10360 (erratic-pattern)
  • Refine documentation for Transformed::{update,map,transform}_data #10355 (alamb)
  • Clarify docs explaining the relationship between SessionState and SessionContext #10350 (alamb)
  • Optimized push down filter #10291 #10366 (dmitrybugakov)
  • Unparser: Support ORDER BY in window function definition #10370 (yyy1000)
  • docs: Add DataFusion subprojects to navigation menu, other minor updates #10362 (andygrove)
  • feat: Add CrossJoin match case to unparser #10371 (sardination)
  • Minor: Do not force analyzer to copy logical plans #10367 (alamb)
  • Minor: Move Sum aggregate function test to slt #10382 (jayzhan211)
  • chore: remove DataPtr trait since Arc::ptr_eq ignores pointer metadata #10378 (intoraw)
  • Move Covariance (Sample) covar / covar_samp to be a User Defined Aggregate Function #10372 (jayzhan211)
  • Support limit in StreamingTableExec #10309 (lewiszlw)
  • Minor: Move count test to slt #10383 (jayzhan211)
  • [MINOR]: Reduce test run time #10390 (mustafasrepo)
  • Fix coalesce, struct and named_struct expr_fn function to take multiple arguments #10321 (alamb)
  • Minor: remove old create_physical_expr to scalar_function #10387 (jayzhan211)
  • Move average unit tests to slt #10401 (lewiszlw)
  • Move array_agg unit tests to slt #10402 (lewiszlw)
  • feat: run expression simplifier in a loop until a fixedpoint or 3 cycles #10358 (erratic-pattern)
  • Add SessionContext/SessionState::create_physical_expr() to create PhysicalExpressions from Exprs #10330 (alamb)