Back to Datafusion

29.0.0

dev/changelog/29.0.0.md

53.1.015.8 KB
Original Source
<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -->

29.0.0 (2023-08-11)

Full Changelog

Breaking changes:

  • change the input_type parameter of the create_udaf function from DataType to Vec<DataType> #7096 (jiangzhx)
  • Implement array_slice and array_element, remove array_trim #6936 (izveigor)
  • improve the ergonomics of creating field and list array accesses #7215 (izveigor)
  • Update Arrow 45.0.0 And Datum Arithmetic, change Decimal Division semantics #6832 (tustvold)

Implemented enhancements:

  • feat: support SQL array replacement and removement functions #7057 (izveigor)
  • feat: array containment operator @> and <@ #6885 (izveigor)
  • feat: add sqllogictests crate #7134 (tshauck)
  • feat: allow datafusion-cli to accept multiple statements #7138 (NiwakaDev)
  • feat: Add linear regression aggregate functions #7211 (2010YOUY01)

Fixed bugs:

  • fix: disallow interval - timestamp #7086 (jackwener)
  • fix: Projection columns_map remove name search #7099 (mustafasrepo)
  • fix: fix index bug and add test to check it #7124 (mustafasrepo)
  • fix: Fix panic in filter predicate #7126 (alamb)
  • fix: correct count(*) alias #7081 (jackwener)
  • fix: skip compression tests on --no-default-features #7172 (not-my-profile)
  • fix: typo in substrait #7224 (waynexia)

Documentation updates:

  • Add additional links to main README #7102 (alamb)
  • docs: fix broken link #7177 (SteveLauC)

Merged pull requests:

  • [Minor] Speedup to_array_of_size for Decimal128 #7055 (Dandandan)
  • Replace array_contains with SQL array functions: array_has, array_has_any, array_has_all #6990 (jayzhan211)
  • Add more Decimal256 type coercion #7047 (viirya)
  • Create dfbench, split up tpch benchmark runner into modules #7054 (alamb)
  • chore(deps): update sqlparser requirement from 0.35 to 0.36.1 #7051 (alamb)
  • use ObjectStore for dataframe writes #6987 (devinjdangelo)
  • Prepare 28.0.0 Release #7056 (andygrove)
  • refactor: with_inputs() can use original schema to avoid recompute schema. #7069 (jackwener)
  • Fix cli tests #7083 (mustafasrepo)
  • Ignore blank lines and comments at the end of query files for datafusion-cli #7076 (sarutak)
  • Support case sensitive column for with_column_renamed #7063 (comphead)
  • Add Decimal256 to ScalarValue #7048 (viirya)
  • Enrich CSV reader config: quote & escape #6927 (parkma99)
  • [Refactor] PipelineFixer physical optimizer rule removal #7059 (metesynnada)
  • fix: disallow interval - timestamp #7086 (jackwener)
  • Add Utf8->Binary type coercion for comparison #7080 (jonahgao)
  • Refactor Replace Repartition rule #7090 (mustafasrepo)
  • change the input_type parameter of the create_udaf function from DataType to Vec<DataType> #7096 (jiangzhx)
  • fix: Projection columns_map remove name search #7099 (mustafasrepo)
  • Minor: Refine doc comments for BuiltinScalarFunction::return_dimension #7045 (alamb)
  • Relax check during aggregate partial mode. #7101 (mustafasrepo)
  • refactor byte_to_string and string_to_byte #7091 (parkma99)
  • Minor: add test + docs for 2 argument trunc with columns #7042 (alamb)
  • Move inactive projects to a different section #7104 (alamb)
  • Port remaining information_schema rust tests to sqllogictests #7050 (palash25)
  • Change rust-version in Cargo.toml to comply with MSRV #7107 (sarutak)
  • create all needed folders in advance for benchmarks #7105 (smiklos)
  • Initial support for functional dependencies handling primary key and unique constraints #7040 (mustafasrepo)
  • Add ClickBench queries to DataFusion benchmark runner #7060 (alamb)
  • feat: support SQL array replacement and removement functions #7057 (izveigor)
  • [doc], [minor]. Update docstring of group by rewrite. #7111 (mustafasrepo)
  • Add additional links to main README #7102 (alamb)
  • fix: fix index bug and add test to check it #7124 (mustafasrepo)
  • fix: Fix panic in filter predicate #7126 (alamb)
  • Add MSRV check as a GA job #7123 (sarutak)
  • Minor: move AnalysisContext out of physical_expr and into its own module #7127 (alamb)
  • fix: correct count(*) alias #7081 (jackwener)
  • make_array with column of list #7137 (jayzhan211)
  • feat: array containment operator @> and <@ #6885 (izveigor)
  • [MINOR]: Make memory exec partition number =1, in test utils #7148 (mustafasrepo)
  • Substrait union/union all #7117 (nseekhao)
  • minor: Remove mac m1 compilation for size_of_scalar test #7151 (mustafasrepo)
  • chore: add config option for allowing bounded use of sort-preserving operators #7164 (wolffcm)
  • chore: edition use workspace #7140 (jackwener)
  • [bug]: Fix multi partition wrong column requirement bug #7129 (mustafasrepo)
  • Refactor memory_limit tests to make them easier to extend #7131 (alamb)
  • Minor: show output ordering in MemoryExec #7169 (alamb)
  • Move ordering equivalence, and output ordering for joins to util functions #7167 (mustafasrepo)
  • Add regr_slope() aggregate function #7135 (2010YOUY01)
  • Add expression for array_agg #7159 (willrnch)
  • fix: skip compression tests on --no-default-features #7172 (not-my-profile)
  • HashJoin order fixing #7155 (metesynnada)
  • tweak: demote heading levels in PR template #7176 (not-my-profile)
  • feat: add sqllogictests crate #7134 (tshauck)
  • docs: fix broken link #7177 (SteveLauC)
  • Add nanvl builtin function #7171 (sarutak)
  • chore(deps): update apache-avro requirement from 0.14 to 0.15 #7174 (jackwener)
  • make dataframe.task_ctx public #7183 (milenkovicm)
  • feat: allow datafusion-cli to accept multiple statements #7138 (NiwakaDev)
  • Add plan_err! error macro #7115 (comphead)
  • refactor: add ExecutionPlan::file_scan_config to avoid downcasting #7175 (not-my-profile)
  • Minor: Add documentation + diagrams for ExternalSorter #7179 (alamb)
  • Support simplifying expressions such as ~ ^(ba_r|foo)$ , where the string includes underline #7186 (tanruixiang)
  • Add MemoryReservation::{split_off, take, new_empty} #7184 (alamb)
  • Update bench.sh to only run 5 iterations #7189 (alamb)
  • Implement array_slice and array_element, remove array_trim #6936 (izveigor)
  • Unify DataFrame and SQL (Insert Into) Write Methods #7141 (devinjdangelo)
  • Minor: Further Increase stack_size to prevent roundtrip_deeply_nested test stack overflow #7208 (devinjdangelo)
  • Don't track files generated by regen.sh #7204 (sarutak)
  • Update some docs/scripts to reflect the removed/added packages. #7202 (sarutak)
  • Implement array_repeat, remove array_fill #7199 (izveigor)
  • Use tokio only if running from a multi-thread tokio context #7205 (viirya)
  • Remove Outdated NY Taxi benchmark #7210 (alamb)
  • improve the ergonomics of creating field and list array accesses #7215 (izveigor)
  • [MINOR] Document refactor on NestedLoopJoin #7217 (metesynnada)
  • Docs: Add GlareDB to list of DataFusion users #7223 (alamb)
  • fix: typo in substrait #7224 (waynexia)
  • Minor: Add constructors to GetFieldAccessExpr and add docs #7219 (alamb)
  • chore: required at least 1 approve before merge #7226 (jackwener)
  • feat: Add linear regression aggregate functions #7211 (2010YOUY01)
  • Add Expr::field, Expr::index, and Expr::slice, add docs #7218 (alamb)
  • Extend insert into support to include Json backed tables #7212 (devinjdangelo)
  • Minor: rename GetFieldAccessCharacteristic and add docs #7220 (alamb)
  • Minor: Remove unecessary clone_with_replacement #7232 (alamb)
  • Update Arrow 45.0.0 And Datum Arithmetic, change Decimal Division semantics #6832 (tustvold)
  • Support make_array null handling in nested version #7207 (jayzhan211)
  • [Minor], Bug Fix: Add empty ordering check at the source. #7230 (mustafasrepo)
  • Minor: with preserve order now receives argument #7231 (mustafasrepo)
  • Minor: Remove [[example]] table from datafusion-examples/Cargo.toml #7235 (sarutak)
  • Remove additional cast from TPCH q8 #7233 (viirya)
  • Minor: Move project_schema to datafusion_common #7237 (alamb)
  • Minor: Extract ExecutionPlanVisitor to its own module #7236 (alamb)
  • Minor: Move streams out of physical_plan module #7234 (alamb)
  • doc: Add link to contributor's guide for new functions within the src #7240 (2010YOUY01)
  • Account for memory usage in SortPreservingMerge (#5885) #7130 (alamb)
  • Deprecate batch_byte_size #7245 (alamb)
  • Minor: Move Partitioning andDistribution to physical_expr #7238 (alamb)
  • Minor: remove duplication in create_writer #7229 (alamb)
  • Support array flatten sql function #7239 (jayzhan211)
  • Minor: fix clippy for memory_limit test #7248 (yjshen)
  • Update physical_plan tests to not use SessionContext #7243 (alamb)
  • Add API to make unnest consistent with DuckDB/ClickHouse, add option for preserve_nulls, update docs #7168 (alamb)
  • chore(sqllogictests-doc): add testing set up #7258 (appletreeisyellow)
  • Avoid to use TempDir::into_path for temporary dirs expected to be deleted automatically #7252 (sarutak)
  • [MINOR]: update benefits_from_input_partitioning implementation for projection and repartition #7246 (mustafasrepo)
  • Adding order equivalence support on MemoryExec #7259 (metesynnada)
  • chore(functions): fix function names typo #7269 (appletreeisyellow)