Apache DataFusion 43.0.0 Changelog

<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -->

This release consists of 403 commits from 96 contributors. See credits at the end of this changelog for more information.

Breaking changes:

  • Remove Arc wrapping from create_udf's return_type #12489 (findepi)
  • Make make_scalar_function() result candidate for inlining, by removing the Arc #12477 (findepi)
  • Bump MSRV to 1.78 #12398 (comphead)
  • fix: DataFusion panics with "No candidates provided" #12469 (Weijun-H)
  • Implement PartialOrd for Expr and sub fields/structs without using hash values #12481 (ngli-me)
  • Add field trait method to WindowUDFImpl, remove return_type/nullable #12374 (jcsherin)
  • parquet: Make page_index/pushdown metrics consistent with row_group metrics #12545 (progval)
  • Make SessionContext::enable_url_table consume self #12573 (alamb)
  • LexRequirement as a struct, instead of a type #12583 (ngli-me)
  • Require Debug for AnalyzerRule, FunctionRewriter, and OptimizerRule #12556 (alamb)
  • Require Debug for TableProvider, TableProviderFactory and PartitionStream #12557 (alamb)
  • Require Debug for PhysicalOptimizerRule #12624 (AnthonyZhOon)
  • Rename aggregation modules, GroupColumn #12619 (alamb)
  • Update register_table functions args to take Into<TableReference> #12630 (JasonLi-cn)
  • Derive Debug for SessionStateBuilder, adding Debug requirements to fields #12632 (AnthonyZhOon)
  • Support REPLACE INTO for INSERT statements #12516 (fmeringdal)
  • Add PartitionEvaluatorArgs to WindowUDFImpl::partition_evaluator #12804 (jcsherin)
  • Convert rank / dense_rank and percent_rank builtin functions to UDWF #12718 (jatin510)
  • Bug-fix: MemoryExec sort expressions do NOT refer to the projected schema #12876 (berkaysynnada)
  • Minor: add flags for temporary ddl #12561 (hailelagi)
  • Convert BuiltInWindowFunction::{Lead, Lag} to a user defined window function #12857 (jcsherin)
  • Improve performance for physical plan creation with many columns #12950 (askalt)
  • Improve recursive unnest options API #12836 (duongcongtoai)
  • fix(substrait): disallow union with a single input #13023 (tokoko)
  • feat: support arbitrary expressions in LIMIT plan #13028 (jonahgao)
  • Remove LogicalPlan::CrossJoin as it is unused #13076 (buraksenn)
  • Minor: make Expr::volatile infallible #13206 (alamb)
  • Convert LexOrdering type to struct. #13146 (ngli-me)
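
Several of the entries above (#12556, #12557, #12624) add a `Debug` requirement to traits that downstream crates implement. The sketch below shows what such a requirement usually means for implementors. It uses only the standard library; `Rule` and `MyRule` are hypothetical stand-ins and are not DataFusion APIs. Under that assumption, adding `#[derive(Debug)]` to the implementing type is normally all that is needed.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for a trait that gained a `Debug` supertrait bound.
trait Rule: Debug {
    fn name(&self) -> &str;
}

/// Without this derive, `impl Rule for MyRule` would no longer compile.
#[derive(Debug)]
struct MyRule {
    name: String,
}

impl Rule for MyRule {
    fn name(&self) -> &str {
        &self.name
    }
}

/// Because every `Rule` is `Debug`, trait objects can be formatted with `{:?}`.
fn describe(rules: &[Box<dyn Rule>]) -> Vec<String> {
    rules
        .iter()
        .map(|r| format!("{}: {:?}", r.name(), r))
        .collect()
}

fn main() {
    let rules: Vec<Box<dyn Rule>> = vec![Box::new(MyRule {
        name: "my_rule".to_string(),
    })];
    for line in describe(&rules) {
        println!("{line}");
    }
}
```

Types that hold fields without a `Debug` implementation need a manual `impl Debug` instead of the derive.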

Implemented enhancements:

  • feat(unparser): adding alias for table scan filter in sql unparser #12453 (Lordworms)
  • feat(substrait): set ProjectRel output_mapping in producer #12495 (vbarua)
  • feat: Support applying parquet bloom filters to StringView columns #12503 (my-vegetable-has-exploded)
  • feat: Support adding a single new table factory to SessionStateBuilder #12563 (Weijun-H)
  • feat(planner): Allow setting sort order of parquet files without specifying the schema #12466 (devanbenz)
  • feat: add support for Substrait ExtendedExpression #12728 (westonpace)
  • feat(substrait): add intersect support to consumer #12830 (tokoko)
  • feat: Implement grouping function using grouping id #12704 (eejbyfeldt)
  • feat(substrait): add set operations to consumer, update substrait to 0.45.0 #12863 (tokoko)
  • feat(substrait): add wildcard handling to producer #12987 (tokoko)
  • feat: Add regexp_count function #12970 (Omega359)
  • feat: Decorrelate more predicate subqueries #12945 (eejbyfeldt)
  • feat: Run (logical) optimizers on subqueries #13066 (eejbyfeldt)
  • feat: Convert CumeDist to UDWF #13051 (jonathanc-n)
  • feat: Migrate Map Functions #13047 (jonathanc-n)
  • feat: improve type inference for WindowFrame #13059 (notfilippo)
  • feat: Move subquery check from analyzer to PullUpCorrelatedExpr (Fix TPC-DS q41) #13091 (eejbyfeldt)
  • feat: Add Date32/Date64 in aggregate fuzz testing #13041 (LeslieKid)
  • feat(substrait): support order_by in aggregate functions #13114 (bvolpato)
  • feat: Support Substrait's IntervalCompound type/literal instead of interval-month-day-nano UDT #12112 (Blizzara)
  • feat: Implement LeftMark join to fix subquery correctness issue #13134 (eejbyfeldt)
  • feat: support logical plan for EXECUTE statement #13194 (jonahgao)
  • feat(substrait): handle emit_kind when consuming Substrait plans #13127 (vbarua)
  • feat(substrait): AggregateRel grouping_expressions support #13173 (akoshchiy)

Fixed bugs:

  • fix: Panic/correctness issue in variance GroupsAccumulator #12615 (eejbyfeldt)
  • fix: coalesce schema issues #12308 (mesejo)
  • fix: Correct results for grouping sets when columns contain nulls #12571 (eejbyfeldt)
  • fix(substrait): remove optimize calls from substrait consumer #12800 (tokoko)
  • fix(substrait): consuming AggregateRel as last node #12875 (tokoko)
  • fix: Update TO_DATE, TO_TIMESTAMP scalar functions to support LargeUtf8, Utf8View #12929 (Omega359)
  • fix: Add Int32 type override for Dialects #12916 (peasee)
  • fix: use simple string match instead of regex match for contains udf #12931 (zhuliquan)
  • fix: Dialect requires derived table alias #12994 (peasee)
  • fix: join swap for projected semi/anti joins #13022 (korowa)
  • fix: Verify supported type for Unary::Plus in sql planner #13019 (eejbyfeldt)
  • fix: Do NOT preserve names (aliases) of Exprs for simplification in TableScan filters #13048 (eejbyfeldt)
  • fix: planning of prepare statement with limit clause #13088 (jonahgao)
  • fix: add missing NotExpr::evaluate_bounds #13082 (crepererum)
  • fix: Order by mentioning missing column multiple times #13158 (eejbyfeldt)
  • fix: import JoinTestType without triggering unused_qualifications lint #13170 (smarticen)
  • fix: default UDWFImpl::expressions returns all expressions #13169 (Michael-J-Ward)
  • fix: date_bin() on timestamps before 1970 #13204 (mhilton)
  • fix: array_resize null fix #13209 (jonathanc-n)
  • fix: CSV Infer Schema now properly supports escaped characters. #13214 (mnorfolk03)
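
As background for the `date_bin()` entry above (#13204): binning timestamps that fall before the Unix epoch is a common place for off-by-one-bin errors, because integer `%` in Rust truncates toward zero. The sketch below is a general illustration using only the standard library; `bin_floor` and `bin_truncating` are hypothetical helpers, and this is not claimed to be the exact code or cause addressed by that PR.

```rust
/// Floor `ts` to the start of its bin of width `stride` (same unit as `ts`),
/// with bins aligned to `origin`. Illustrative helper, not DataFusion's code.
fn bin_floor(ts: i64, stride: i64, origin: i64) -> i64 {
    assert!(stride > 0, "stride must be positive");
    let offset = ts - origin;
    // rem_euclid is always in 0..stride, so this rounds toward negative infinity.
    ts - offset.rem_euclid(stride)
}

/// Naive variant using `%`, which truncates toward zero for negative offsets.
fn bin_truncating(ts: i64, stride: i64, origin: i64) -> i64 {
    let offset = ts - origin;
    ts - offset % stride
}

fn main() {
    let day = 86_400; // seconds per day
    let ts = -43_200; // 1969-12-31T12:00:00Z, in seconds since the epoch

    // Correct bin start: 1969-12-31T00:00:00Z.
    println!("floor:      {}", bin_floor(ts, day, 0));
    // Truncation lands on 1970-01-01T00:00:00Z, which is after the input.
    println!("truncating: {}", bin_truncating(ts, day, 0));
}
```

For timestamps at or after the origin the two helpers agree; they differ only when the offset from the origin is negative.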

Documentation updates:

  • chore: Prepare 42.0.0 Release #12465 (andygrove)
  • Minor: improve ParquetOpener docs #12456 (alamb)
  • Improve doc wording around scalar authoring #12478 (findepi)
  • Minor: improve GroupsAccumulator docs #12501 (alamb)
  • Minor: improve GroupsAccumulatorAdapter docs #12502 (alamb)
  • Improve flamegraph profiling instructions #12521 (alamb)
  • docs: :memo: Add expected answers to DataFrame method examples #12564 (Eason0729)
  • parquet: Add finer metrics on operations covered by time_elapsed_opening #12585 (progval)
  • Update scalar_functions.md #12627 (Abdullahsab3)
  • Move kurtosis_pop to datafusion-functions-extra and out of core #12647 (dharanad)
  • Update introduction.md for blaze project #12577 (liyuance)
  • docs: improve the documentation for Aggregate code #12617 (alamb)
  • doc: Fix malformed hex string literal in user guide #12708 (kawadakk)
  • docs: Update DataFusion introduction to clarify that DataFusion does provide an "out of the box" query engine #12666 (andygrove)
  • Framework for generating function docs from embedded code documentation #12668 (Omega359)
  • Fix misformatted links on project index page #12750 (amoeba)
  • Add DocumentationBuilder::with_standard_argument to reduce copy/paste #12747 (alamb)
  • Minor: doc how field name is to be set for WindowUDF #12757 (jcsherin)
  • Port / Add Documentation for VarianceSample and VariancePopulation #12742 (alamb)
  • Transformed::new_transformed: Fix documentation formatting #12787 (progval)
  • Migrate documentation for all string functions from scalar_functions.md to code #12775 (Omega359)
  • Minor: add README to Catalog Folder #12797 (jonathanc-n)
  • Remove redundant aggregate/window/scalar function documentation #12745 (alamb)
  • Improve description of function migration #12743 (alamb)
  • Crypto Function Migration #12840 (jonathanc-n)
  • Minor: more doc to MemoryPool module #12849 (2010YOUY01)
  • Migrate documentation for all core functions from scalar_functions.md to code #12854 (Omega359)
  • Migrate documentation for Aggregate Functions to code #12861 (jonathanc-n)
  • Wordsmith project description #12778 (matthewmturner)
  • Migrate Regex Functions from static docs #12886 (jonathanc-n)
  • Migrate documentation for all math functions from scalar_functions.md to code #12908 (juroberttyb)
  • Combine the logic of rank, dense_rank and percent_rank udwf to reduce duplications #12893 (jatin510)
  • Migrate Array function Documentation to code #12948 (jonathanc-n)
  • Minor: fix Aggregation Docs from review #12880 (jonathanc-n)
  • Minor: expr-doc small fixes #12960 (jonathanc-n)
  • docs: Add documentation about conventional commits #12971 (andygrove)
  • Migrate datetime documentation to code #12966 (jatin510)
  • Fix CI on main (regenerate function docs) #12991 (alamb)
  • Split output batches of joins that do not respect batch size #12969 (alihan-synnada)
  • Minor: Fixed regexp_match docs #13008 (jonathanc-n)
  • Minor: Fix spelling in regexp_count docs #13014 (jonathanc-n)
  • Update version to 42.1.0, add CHANGELOG (#12986) #12989 (alamb)
  • Added expression to "with_standard_argument" #12926 (jonathanc-n)
  • Migrate documentation for regr* aggregate functions to code #12871 (alamb)
  • Minor: Add documentation for cot #13069 (alamb)
  • Documentation: Add API deprecation policy #13083 (comphead)
  • docs: Fixed generate_series docs #13097 (jonathanc-n)
  • [docs]: migrate lead/lag window function docs to new docs #13095 (buraksenn)
  • minor: Add deprecated policy to the contributor guide contents #13100 (comphead)
  • Introduce binary_as_string parquet option, upgrade to arrow/parquet 53.2.0 #12816 (goldmedal)
  • Convert ntile builtIn function to UDWF #13040 (jatin510)
  • docs: Added Special Functions Page #13102 (jonathanc-n)
  • [docs]: added alternative_syntax function for docs #13140 (jonathanc-n)
  • Minor: Delete old cume_dist and percent_rank docs #13137 (jonathanc-n)
  • docs: Add alternative syntax for extract, trim and substring. #13143 (Omega359)
  • docs: switch completely to generated docs for scalar and aggregate functions #13161 (Omega359)
  • Minor: improve testing docs, mention cargo nextest #13160 (alamb)
  • minor: Update HOWTO to help with updating new docs #13172 (jonathanc-n)
  • Add config option skip_physical_aggregate_schema_check #13176 (alamb)
  • Enable reading StringViewArray by default from Parquet (8% improvement for entire ClickBench suite) #13101 (alamb)
  • Forward port changes for 42.2.0 release (#13191) #13193 (alamb)
  • [minor] overload from_unixtime func to have optional timezone parameter #13130 (buraksenn)

Other:

  • Impl convert_to_state for GroupsAccumulatorAdapter (faster median for high cardinality aggregates) #11827 (Rachelint)
  • Upgrade sqlparser-rs to 0.51.0, support new interval logic from sqlparser-rs #12222 (samuelcolvin)
  • Do not silently ignore unsupported CREATE TABLE and CREATE VIEW syntax #12450 (alamb)
  • use FileFormat::get_ext as the default file extension filter #12417 (waruto210)
  • fix interval units parsing #12448 (samuelcolvin)
  • test(substrait): update TPCH tests #12462 (vbarua)
  • Add "Extended Clickbench" benchmark for median and approx_median for high cardinality aggregates #12438 (alamb)
  • date_trunc small update for readability #12479 (findepi)
  • cleanup array_has #12460 (samuelcolvin)
  • chore: bump chrono to 0.4.38 #12485 (my-vegetable-has-exploded)
  • Remove deprecated ScalarUDF::new #12487 (findepi)
  • Remove deprecated config setup functions #12486 (findepi)
  • Remove unnecessary shifts in gcd() #12480 (findepi)
  • Return TableProviderFilterPushDown::Exact when Parquet Pushdown Enabled #12135 (itsjunetime)
  • Update substrait requirement from 0.41 to 0.42, prost-build to 0.13.2 #12483 (dependabot[bot])
  • Faster strpos() string function for ASCII-only case #12401 (goldmedal)
  • Specialize ASCII case for substr() #12444 (2010YOUY01)
  • Improve SQLite subquery tables aliasing unparsing #12482 (sgrebnov)
  • Minor: use Option rather than Result for not found suggestion #12512 (alamb)
  • Remove deprecated datafusion_physical_expr::functions module #12505 (findepi)
  • Remove deprecated AggregateUDF::new #12508 (findepi)
  • Make required_guarantees output to be deterministic #12484 (austin362667)
  • Deprecate unused ScalarUDF::fun #12506 (findepi)
  • Remove deprecated WindowUDF::new #12507 (findepi)
  • Preserve the order of right table in NestedLoopJoinExec #12504 (alihan-synnada)
  • Improve benchmark for ltrim #12513 (Rachelint)
  • Fix: check ambiguous column reference #12467 (HuSen8891)
  • Minor: move imports to top in row_hash.rs #12530 (Rachelint)
  • tests: Fix typo in config setting name #12535 (progval)
  • Expose DataFrame select_exprs method #12520 (milenkovicm)
  • Replace some usages of Expr::to_field with Expr::qualified_name #12522 (jonahgao)
  • Bump aws-sdk-sso to 1.43.0, aws-sdk-sts to 1.43.0 and aws-sdk-ssooidc from 1.40.0 to 1.44.0 in /datafusion-cli #12409 (dependabot[bot])
  • Fix NestedLoopJoin performance regression #12531 (alihan-synnada)
  • Produce informative error message on insert plan type mismatch #12540 (findepi)
  • Fix unparse table scan with the projection pushdown #12534 (goldmedal)
  • Automate sqllogictest for String, LargeString and StringView behavior #12525 (goldmedal)
  • Fix unparsing offset #12539 (Stazer)
  • support EXTRACT on intervals and durations #12514 (nrc)
  • Support List type coercion for CASE-WHEN-THEN expression #12490 (goldmedal)
  • Sort metrics alphabetically in EXPLAIN ANALYZE output #12568 (progval)
  • Add RuntimeEnv::try_new and deprecate RuntimeEnv::new #12566 (OussamaSaoudi)
  • Reorganize the StringView tests in sqllogictests #12572 (goldmedal)
  • fix parquet infer statistics for BinaryView types #12575 (XiangpengHao)
  • Minor: add example of assert_batches_eq #12580 (alamb)
  • Use qualified aliases to simplify searching DFSchema #12546 (jonahgao)
  • return absent stats when filters are pushed down #12471 (waruto210)
  • Minor: add new() function for ParquetReadOptions #12579 (Smith-Cruise)
  • make Debug for MemoryExec prettier #12582 (samuelcolvin)
  • Add SessionStateBuilder::with_object_store method #12578 (OussamaSaoudi)
  • Fix and Improve Sort Pushdown for Nested Loop and Hash Join #12559 (berkaysynnada)
  • Add Docs and Examples and helper methods to PhysicalSortExpr #12589 (alamb)
  • Warn instead of error for unused imports #12588 (samuelcolvin)
  • Update prost-build requirement from =0.13.2 to =0.13.3 #12587 (dependabot[bot])
  • Add JOB benchmark dataset [1/N] (imdb dataset) #12497 (doupache)
  • Improve documentation and add Display impl to EquivalenceProperties #12590 (alamb)
  • physical-plan: Cast nested group values back to dictionary if necessary #12586 (brancz)
  • Support Date32 for date_trunc function #12603 (goldmedal)
  • Avoid RowConverter for multi column grouping (10% faster clickbench queries) #12269 (jayzhan211)
  • Refactor to support recursive unnest in physical plan #11577 (duongcongtoai)
  • Use original value when comparing with dictionary column in unparser #12610 (Sevenannn)
  • Fix to unparse the plan with multiple UNION statements into an SQL string #12605 (goldmedal)
  • Keep the float information in scalar_to_sql #12609 (Sevenannn)
  • Add Dictionary String (UTF8) type to String sqllogictests #12621 (goldmedal)
  • Improve SanityChecker error message #12595 (alamb)
  • Improve performance of trim for string view (10%) #12395 (Rachelint)
  • Simplify update_skip_aggregation_probe method #12332 (lewiszlw)
  • Minor: Encapsulate type check in GroupValuesColumn, avoid panic #12620 (alamb)
  • Fix sort node deserialization from proto #12626 (palaska)
  • Minor: improve documentation to StringView trim #12629 (alamb)
  • [MINOR]: Simplify Sort Operator #12639 (akurmustafa)
  • [Minor] Remove redundant member from RepartitionExec #12638 (akurmustafa)
  • implement nested identifier access #12614 (Lordworms)
  • [MINOR]: Rename get_arrayref_at_indices to take_arrays #12654 (akurmustafa)
  • [MINOR]: Use take_arrays in repartition, fix build #12657 (doupache)
  • Add binary_view to string_view coercion #12643 (doupache)
  • [Minor] Improve error message when bitwise_* operator takes wrong unsupported type #12646 (dharanad)
  • Minor: Add github link to code that was upstreamed #12660 (alamb)
  • Minor: Improve documentation on execution error handling #12651 (alamb)
  • Adds WindowUDFImpl::reverse_expr trait method + Support for IGNORE NULLS #12662 (jcsherin)
  • Fill in missing Debug fields for SessionState #12663 (AnthonyZhOon)
  • Minor: add partial assertion for skip aggregation probe #12640 (Rachelint)
  • Add more functions for string sqllogictests #12665 (goldmedal)
  • Update rstest requirement from 0.22.0 to 0.23.0 #12678 (dependabot[bot])
  • Minor: Change LiteralGuarantee try_new to new #12669 (pgwhalen)
  • Refactor PrimitiveGroupValueBuilder to use MaybeNullBufferBuilder #12623 (alamb)
  • Add value_from_statistics to AggregateUDFImpl, remove special case for min/max/count aggregate statistics #12296 (edmondop)
  • Provide field and schema metadata missing on distinct aggregations. #12691 (wiedld)
  • [MINOR]: Simplify required_input_ordering of BoundedWindowAggExec #12656 (akurmustafa)
  • handle 0 and NULL value of NTH_VALUE function #12676 (thinh2)
  • Improve documentation for AggregateUDFImpl::value_from_stats #12689 (alamb)
  • Add support for external tables with qualified names #12645 (OussamaSaoudi)
  • Fix Regex signature types #12690 (blaginin)
  • Refactor ByteGroupValueBuilder to use MaybeNullBufferBuilder #12681 (alamb)
  • Simplify match patterns in coercion rules #12711 (findepi)
  • Remove aggregate functions dependency on frontend #12715 (findepi)
  • Minor: Remove clone in transform_to_states #12707 (jayzhan211)
  • Refactor tests for union sorting properties, add tests for unions and constants #12702 (alamb)
  • Fix: support Qualified Wildcard in count aggregate function #12673 (HuSen8891)
  • Reduce code duplication in PrimitiveGroupValueBuilder with const generics #12703 (alamb)
  • Disallow duplicated qualified field names #12608 (eejbyfeldt)
  • Optimize base64/hex decoding by pre-allocating output buffers (~2x faster) #12675 (simonvandel)
  • Allow DynamicFileCatalog support to query partitioned file #12683 (goldmedal)
  • Support LIMIT Push-down logical plan optimization for Extension nodes #12685 (austin362667)
  • Fix AvroReader: Add union resolving for nested struct arrays #12686 (JonasDev1)
  • Adds macros for creating WindowUDF and WindowFunction expression #12693 (jcsherin)
  • Support unparsing plans with both Aggregation and Window functions #12705 (sgrebnov)
  • Fix strpos invocation with dictionary and null #12712 (findepi)
  • Add IMDB(JOB) Benchmark [2/N] (imdb queries) #12529 (austin362667)
  • Minor: avoid clone while calculating union equivalence properties #12722 (alamb)
  • Simplify streaming_merge function parameters #12719 (mertak-synnada)
  • Provide field and schema metadata missing on cross joins, and union with null fields. #12729 (wiedld)
  • Minor: Update string tests for strpos #12739 (alamb)
  • Apply type_union_resolution to array and values #12753 (jayzhan211)
  • fix equal_to in PrimitiveGroupValueBuilder #12758 (Rachelint)
  • Fix equal_to in ByteGroupValueBuilder #12770 (alamb)
  • Allow boolean Expr simplification even when nullable #12746 (eejbyfeldt)
  • Fix unnest conjunction with selecting wildcard expression #12760 (goldmedal)
  • Improve round scalar function unparsing for Postgres #12744 (sgrebnov)
  • Fix stack overflow calculating projected orderings #12759 (alamb)
  • Upgrade arrow/parquet to 53.1.0 / fix clippy #12724 (alamb)
  • Account for constant equivalence properties in union, tests #12562 (alamb)
  • Minor: clarify comment about empty dependencies #12786 (alamb)
  • Introduce Signature::String and return error if input of strpos is integer #12751 (jayzhan211)
  • Minor: improve docs on MovingMin/MovingMax #12790 (alamb)
  • Add union sorting equivalence end to end tests #12721 (alamb)
  • Fix bug in TopK aggregates #12766 (avantgardnerio)
  • Minor: clean up TODO comments in unnest.slt #12795 (goldmedal)
  • Refactor DependencyMap and Dependencies into structs #12761 (alamb)
  • Remove unnecessary DFSchema::check_ambiguous_name #12805 (jonahgao)
  • API from ParquetExec to ParquetExecBuilder #12799 (alamb)
  • Minor: add documentation note about NullState #12791 (alamb)
  • Chore: Move aggregate statistics optimizer test from core to optimizer crate #12783 (jayzhan211)
  • Clarify documentation on ArrowBytesMap and ArrowBytesViewMap #12789 (alamb)
  • Bump cookie and express in /datafusion/wasmtest/datafusion-wasm-app #12825 (dependabot[bot])
  • Remove unused dependencies and features #12808 (jonahgao)
  • Add Aggregation fuzzer framework #12667 (Rachelint)
  • Retry apt-get and rustup on CI #12714 (findepi)
  • Support creating tables via SQL with FixedSizeList column (e.g. a int[3]) #12810 (jandremarais)
  • Make HashJoinExec::join_schema public #12807 (progval)
  • Fix convert_to_state bug in GroupsAccumulatorAdapter #12834 (alamb)
  • Fix: approx_percentile_cont_with_weight Panic #12823 (jonathanc-n)
  • Fix clippy error on wasmtest #12844 (jonahgao)
  • Fix panic on wrong number of arguments to substr #12837 (eejbyfeldt)
  • Fix Bug in Display for ScalarValue::Struct #12856 (avantgardnerio)
  • Support DictionaryString for Regex matching operators #12768 (blaginin)
  • Minor: Small comment changes in sql folder #12838 (jonathanc-n)
  • Add DuckDB struct test and row as alias #12841 (jayzhan211)
  • Support struct coercion in type_union_resolution #12839 (jayzhan211)
  • Added check for aggregate functions in optimizer rules #12860 (jonathanc-n)
  • Optimize iszero function (3-5x faster) #12881 (simonvandel)
  • Macro for creating record batch from literal slice #12846 (timsaucer)
  • Implement special min/max accumulator for Strings and Binary (10% faster for Clickbench Q28) #12792 (alamb)
  • Make PruningPredicate's rewrite public #12850 (adriangb)
  • octet_length + string view == ❤️ #12900 (Omega359)
  • Remove Expr clones in select_to_plan #12887 (jonahgao)
  • Minor: added to docs in expr folder #12882 (jonathanc-n)
  • Print undocumented functions to console while generating docs #12874 (alamb)
  • Fix: handle NULL offset of NTH_VALUE window function #12851 (HuSen8891)
  • Optimize signum function (3-25x faster) #12890 (simonvandel)
  • re-export PartitionEvaluatorArgs from datafusion_expr::function #12878 (Michael-J-Ward)
  • Unparse Sort with pushdown limit to SQL string #12873 (goldmedal)
  • Add spilling related metrics for aggregation #12888 (2010YOUY01)
  • Move equivalence fuzz testing to fuzz test binary #12767 (alamb)
  • Remove unused math_expressions.rs #12917 (jonahgao)
  • Improve AggregationFuzzer error reporting #12832 (alamb)
  • Import Arc consistently #12899 (findepi)
  • Optimize isnan (2-5x faster) #12889 (simonvandel)
  • Minor: Move StringArrayType, StringViewArrayBuilder, etc outside of string module #12912 (Omega359)
  • Remove redundant unsafe in test #12914 (findepi)
  • Ensure that math functions fulfil the ColumnarValue contract #12922 (joroKr21)
  • Optimization: support push down limit when full join #12963 (JasonLi-cn)
  • Implement GroupColumn support for StringView / ByteView (faster grouping performance) #12809 (Rachelint)
  • Implement native support StringView for REGEXP_LIKE #12897 (tlm365)
  • Minor: Refactor benchmark imports to use util module #12885 (loloxwg)
  • Fix zero data type in expr % 1 simplification #12913 (eejbyfeldt)
  • Optimize performance of math::cot (~2x faster) #12910 (tlm365)
  • Expand wildcard expressions in distinct on #12941 (epsio-banay)
  • chores: remove redundant clone #12964 (JasonLi-cn)
  • Fix: handle NULL input in lead/lag window function #12811 (HuSen8891)
  • Fix logical vs physical schema mismatch for aliased now() #12951 (wiedld)
  • Optimize performance of math::trunc (~2.5x faster) #12909 (tlm365)
  • Minor: Add slt test for DISTINCT ON with wildcard #12968 (alamb)
  • Fix 'Too many open files' on fuzz test. #12961 (dhegberg)
  • Increase minimum supported Rust version (MSRV) to 1.79 #12962 (findepi)
  • Unparse SubqueryAlias without projections to SQL #12896 (goldmedal)
  • Fix 2 bugs related to push down partition filters #12902 (eejbyfeldt)
  • Move TableConstraint to Constraints conversion #12953 (findepi)
  • Added current_timestamp alias #12958 (jonathanc-n)
  • Improve unparsing for ORDER BY, UNION, Windows functions with Aggregation #12946 (sgrebnov)
  • Handle one-element array return value in ScalarFunctionExpr #12965 (joroKr21)
  • Add links to new_constraint_from_table_constraints doc #12995 (findepi)
  • Fix: HashJoin projection swap #12967 (my-vegetable-has-exploded)
  • refactor(substrait): refactor ReadRel consumer #12983 (tokoko)
  • Move SMJ join filtered part out of join_output stage. LeftOuter, LeftSemi #12764 (comphead)
  • Remove logical cross join in planning #12985 (Dandandan)
  • [MINOR]: Use arrow take_arrays, remove datafusion take_arrays #13013 (akurmustafa)
  • Don't preserve functional dependency when generating UNION logical plan #12979 (Sevenannn)
  • [Minor]: Add data based sort expression test #12992 (akurmustafa)
  • Removed last usages of scalar_inputs, scalar_input_types and inputs2 to use arrow unary/binary for performance #12972 (buraksenn)
  • Minor: Update release instructions to include new crates #13024 (alamb)
  • Extract CSE logic to datafusion_common #13002 (peter-toth)
  • Enhance table scan unparsing to avoid unnamed subqueries. #13006 (goldmedal)
  • Fix count on all null VALUES clause #13029 (findepi)
  • Support filter in cross join elimination #13025 (Dandandan)
  • [minor]: remove same util functions from the code base. #13026 (akurmustafa)
  • Improve AggregateFuzz testing: generate random queries #12847 (alamb)
  • Fix functions with Volatility::Volatile and parameters #13001 (agscpp)
  • refactor: Incorporate RewriteDisjunctivePredicate rule into SimplifyExpressions #13032 (eejbyfeldt)
  • Move filtered SMJ right join out of join_partial phase #13053 (comphead)
  • Remove functions and types deprecated since 37 #13056 (findepi)
  • Minor: Cleaned physical-plan Comments #13055 (jonathanc-n)
  • improve the condition checking for unparsing table_scan #13062 (goldmedal)
  • minor: simplify associated item bound of hash_array_primitive #13070 (jonahgao)
  • extended log.rs tests for unary/binary and f32/f64 casting #13034 (buraksenn)
  • Fix check_not_null_constraints null detection #13033 (findepi)
  • [Minor] Update info/list of TPC-DS queries #13075 (Dandandan)
  • Fix logical vs physical schema mismatch for UNION where some inputs are constants #12954 (wiedld)
  • Improve CSE stats #13080 (peter-toth)
  • Infer data type from schema for Values and add struct coercion to coalesce #12864 (jayzhan211)
  • [minor]: use arrow take_batch instead of get_record_batch_indices #13084 (akurmustafa)
  • chore: Added a number of physical planning join benchmarks #13085 (mnorfolk03)
  • Fix more instances of schema missing metadata #13068 (itsjunetime)
  • Bug-fix / Limit with_new_exprs() #13109 (berkaysynnada)
  • Minor: doc IMDB in benchmark README #13107 (2010YOUY01)
  • removed --prefer_hash_join option from parquet_filter command. #13106 (neyama)
  • Make CI error if a function has no documentation #12938 (alamb)
  • Allow using cargo nextest for running tests #13045 (alamb)
  • Add benchmark for memory-limited aggregation #13090 (2010YOUY01)
  • Add clickbench parquet based queries to sql_planner benchmark #13103 (Omega359)
  • Improve documentation and examples for SchemaAdapterFactory, make record_batch "hygienic" #13063 (alamb)
  • Move filtered SMJ Left Anti filtered join out of join_partial phase #13111 (comphead)
  • Improve TableScan with filters pushdown unparsing (multiple filters) #13131 (sgrebnov)
  • Raise a plan error on union if column count is not the same between plans #13117 (Omega359)
  • Add basic support for unnest unparsing #13129 (sgrebnov)
  • Improve TableScan with filters pushdown unparsing (joins) #13132 (sgrebnov)
  • Report offending plan node when In/Exist subquery misused #13155 (findepi)
  • Remove unused assert_analyzed_plan_ne test helper #13121 (findepi)
  • Fix Utf8View as Join Key #13115 (demetribu)
  • Add Support for modulus operation in substrait #13108 (LatrecheYasser)
  • unify cast_to function of ScalarValue #13122 (JasonLi-cn)
  • Add unused_qualifications rustc lint with deny lint level. #13086 (dhegberg)
  • [Optimization] Infer predicate under all JoinTypes #13081 (JasonLi-cn)
  • Support negate arithmetic expression in substrait #13112 (LatrecheYasser)
  • Fix to_char signature ordering #13126 (Omega359)
  • chore: re-export functions_window_common::ExpressionArgs #13149 (Michael-J-Ward)
  • minor: Fix build on main #13159 (eejbyfeldt)
  • minor: Update test case for issue #5771 showing it is resolved #13180 (eejbyfeldt)
  • Test LIKE with dynamic pattern #13141 (findepi)
  • Increase fuzz testing of streaming group by / low cardinality columns #12990 (alamb)
  • FFI initial implementation #12920 (timsaucer)
  • Report file location and offset when CSV schema mismatch #13185 (findepi)
  • Round robin polling between tied winners in sort preserving merge #13133 (jayzhan211)
  • Fix rendering of dictionary empty string values in SLT tests #13198 (findepi)
  • Improve push down filter of join #13184 (JasonLi-cn)
  • Minor: Reduce indirection for finding changelog #13199 (alamb)
  • Support DictionaryArray in OVER clause #13153 (adriangb)
  • Allow testing records with sibling whitespace in SLT tests and add more string tests #13197 (findepi)
  • Use single file write when an extension is present in the path. #13079 (dhegberg)
  • Deprecate ScalarUDF::invoke and invoke_no_args for invoke_batch #13179 (findepi)
  • consider volatile function in simply_expression #13128 (Lordworms)
  • Fix CI compile failure due to merge conflict #13219 (alamb)
  • Revert "Improve push down filter of join (#13184)" #13229 (eejbyfeldt)
  • Derive Clone for more ExecutionPlans #13203 (alamb)
  • feat(logical-types): add NativeType and LogicalType #12853 (notfilippo)
  • Apply projection to Statistics in FilterExec #13187 (alamb)
  • Minor: make LeftJoinData into a struct in CrossJoinExec #13227 (alamb)
  • Deprecate invoke and invoke_no_args in favor of invoke_batch #13174 (findepi)
  • Support timestamp(n) SQL type #13231 (findepi)
  • Remove elements deprecated since v 38. #13245 (findepi)

Credits

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.

    68	Andrew Lamb
    34	Piotr Findeisen
    24	Jonathan Chen
    19	Emil Ejbyfeldt
    17	Jax Liu
    12	Bruce Ritchie
    11	Jonah Gao
     9	Jay Zhan
     8	Mustafa Akur
     8	kamille
     7	Sergei Grebnov
     7	Tornike Gurgenidze
     6	JasonLi
     6	Oleks V
     6	Val Lorentz
     6	jcsherin
     5	Burak Şen
     5	Samuel Colvin
     5	Yongting You
     5	dependabot[bot]
     4	HuSen
     4	Jagdish Parihar
     4	Simon Vandel Sillesen
     4	wiedld
     3	Alihan Çelikcan
     3	Andy Grove
     3	AnthonyZhOon
     3	Austin Liu
     3	Berkay Şahin
     3	Daniel Hegberg
     3	Daniël Heres
     3	Lordworms
     3	Michael J Ward
     3	OussamaSaoudi
     3	Qianqian
     3	Tai Le Manh
     3	Victor Barua
     3	doupache
     3	ngli-me
     3	yi wang
     2	Adrian Garcia Badaracco
     2	Alex Huang
     2	Brent Gardner
     2	Dharan Aditya
     2	Dmitrii Blaginin
     2	Duong Cong Toai
     2	Filippo Rossi
     2	Georgi Krastev
     2	June
     2	Max Norfolk
     2	Peter Toth
     2	Tim Saucer
     2	Yasser Latreche
     2	peasee
     2	waruto
     1	Abdullah Sabaa Allil
     1	Agaev Guseyn
     1	Albert Skalt
     1	Andrey Koshchiy
     1	Arttu
     1	Baris Palaska
     1	Bruno Volpato
     1	Bryce Mecum
     1	Daniel Mesejo
     1	Dmitry Bugakov
     1	Eason
     1	Edmondo Porcu
     1	Eduard Karacharov
     1	Frederic Branczyk
     1	Fredrik Meringdal
     1	Haile
     1	Jan
     1	JonasDev1
     1	Justus Flerlage
     1	Leslie Su
     1	Marco Neumann
     1	Marko Milenković
     1	Martin Hilton
     1	Matthew Turner
     1	Nick Cameron
     1	Paul
     1	Smith Cruise
     1	Tomoaki Kawada
     1	WeblWabl
     1	Weston Pace
     1	Xiangpeng Hao
     1	Xwg
     1	Yuance.Li
     1	epsio-banay
     1	iamthinh
     1	juroberttyb
     1	mertak-synnada
     1	neyama
     1	smarticen
     1	zhuliquan
     1	张林伟

Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.