Impala 3.0 Change Log
The changes in this log are in comparison to Impala 2.11.
New Feature
- [IMPALA-4167] - Support insert plan hints for CREATE TABLE AS SELECT
- [IMPALA-6537] - Add missing ODBC scalar functions
Improvement
- [IMPALA-1803] - Avoid hitting OOM in HdfsTableSink when inserting to Parquet
- [IMPALA-2782] - Allow Impala Shell to connect directly to impalad when configured with a proxy load balancer and Kerberos
- [IMPALA-2963] - Deprecate query option: disable_cached_reads
- [IMPALA-4132] - Consider using -fno-omit-frame-pointer in release builds
- [IMPALA-4277] - Impala should build against latest Hadoop components
- [IMPALA-4744] - Apache Impala release should include release tag or hash in version string
- [IMPALA-4953] - Prevent large statestore updates from head-of-line blocking subsequent updates to different topics
- [IMPALA-5037] - Change default Parquet array resolution according to Parquet standard.
- [IMPALA-5721] - stress test: save profiles during binary search
- [IMPALA-5801] - Clean up codegen GetType() interface
- [IMPALA-5814] - Remove flag to disable admission control
- [IMPALA-6059] - Enhance ltrim() and rtrim() functions to trim any set of characters
- [IMPALA-6075] - Add Impala daemon metric for catalog version
- [IMPALA-6077] - remove BIT_PACKED encoding for Parquet levels
- [IMPALA-6113] - Skip row groups with predicates on NULL columns
- [IMPALA-6395] - Allow the accumulated row batch size of a data sink to be tunable
- [IMPALA-6437] - increase frequency of admission control topic updates
- [IMPALA-6479] - Update DESCRIBE statement to respect column level privileges
- [IMPALA-6497] - Impala should expose when the last row is fetched
- [IMPALA-6519] - Allow atomic allocation of an unreserved buffer
- [IMPALA-6629] - Clearer and more concise logging during catalog topic updates
- [IMPALA-6641] - Support more separators between date and time in default timestamp format
- [IMPALA-6655] - Set owner information on database creation
- [IMPALA-6675] - Change default configuration to --compact_catalog_topic=true
- [IMPALA-6682] - Support hash function other than md5 in pypi download script
- [IMPALA-6779] - Impala Doc: Improve REPLICA_PREFERENCE doc
- [IMPALA-6791] - Create scripts to automate distcc server setup and toolchain updates
- [IMPALA-6805] - Show current database in Impala shell prompt
- [IMPALA-6809] - bootstrap_system.sh does an unconditional git clone to ~/Impala
- [IMPALA-6817] - Clean up Impala privilege model
- [IMPALA-6820] - Remove builtins db from catalogd
- [IMPALA-6822] - Provide a query option to not shuffle on distinct exprs
- [IMPALA-6850] - Print the actual error message to the console when Sentry fails
- [IMPALA-6878] - SentryServicePinger should not print stacktrace at every retry
Task
- [IMPALA-3271] - Remove deprecated command-line options (including Llama)
- [IMPALA-3916] - Reserve SQL:2016 keywords
- [IMPALA-5893] - Remove old kinit code for Impala 3
- [IMPALA-6733] - Impala 3.0 Doc: Release Notes
- [IMPALA-6736] - stress test --filter-query-mem-ratio doesn't work
- [IMPALA-6780] - test_recover_partitions.py has always-true asserts
- [IMPALA-6860] - Impala 3.0 Doc: Upgrade Considerations
- [IMPALA-6872] - Impala 3.0 Doc: Update Known Issues
- [IMPALA-6886] - Impala Doc: Remove Impala Cluster Sizing doc
- [IMPALA-6959] - Update HAProxy configuration sample for Impala
Sub-task
- [IMPALA-3562] - Extend "compute stats" syntax to support a list of columns
- [IMPALA-4886] - Expose per table partition/files/blocks count in web UI
- [IMPALA-5518] - Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
- [IMPALA-5528] - tcmalloc contention much higher with concurrency after KRPC patch
- [IMPALA-6116] - Bound memory usage of KRPC service queue
- [IMPALA-6219] - Use AES-GCM for spill-to-disk encryption when CLMUL instruction is present and performant
- [IMPALA-6314] - Add run time scalar subquery check for uncorrelated subqueries
- [IMPALA-6356] - Excessive synchronous logging in RpczStore::LogTrace causes severe slowdown for exchange operators spanning 2-3 minutes
- [IMPALA-6396] - Exchange node should correctly report peak memory in query profile and summary
- [IMPALA-6459] - Doc: TABLESAMPLE for COMPUTE STATS
- [IMPALA-6462] - Impala 3 Doc: Update for reserved keywords
- [IMPALA-6463] - Impala 3 Doc: Remove the unused query options from docs
- [IMPALA-6464] - Impala 2.12 & 3.0 Docs: Extend "compute stats" syntax to support a list of columns
- [IMPALA-6470] - Impala 3 Doc: Remove all user-facing Llama configuration options
- [IMPALA-6480] - Impala 3.0 Doc: Update DESCRIBE statement to respect column level privileges
- [IMPALA-6482] - Add query option for query time limit
- [IMPALA-6483] - Impala 3.0 & 2.12 Docs: Describe the query option for query time limit
- [IMPALA-6508] - Allow tests to run with Krpc
- [IMPALA-6510] - Impala 2.12 Doc: Remove the refresh_after_connect option from INVALIDATE METADATA statement
- [IMPALA-6512] - test_exchange_delays does not work with KRPC enabled
- [IMPALA-6514] - Impala 2.12 & 3.0 Docs: Allow Impala Shell to connect directly to impalad when configured with a proxy load balancer and Kerberos
- [IMPALA-6529] - Impala 3 Doc: ROUND function output type change
- [IMPALA-6538] - Fix read path when Parquet min(_value)/max(_value) statistics contain NaN
- [IMPALA-6542] - Fix inconsistent write path of Parquet min/max statistics
- [IMPALA-6546] - Impala 2.12 & 3.0 Docs: Document the new ODBC scalar functions
- [IMPALA-6554] - Check failed: consumption_->current_value() == 0 (126 vs. 0) KrpcDataStreamRecvr
- [IMPALA-6565] - Stress test with KRPC enabled shows inconsistent results for some queries
- [IMPALA-6576] - Add metric for Data Stream Service Queue memory consumption
- [IMPALA-6592] - Fix test gap - no test of handling for invalid/unsupported Parquet codec
- [IMPALA-6609] - Some COUNTER_ADD() in KrpcDataStreamRecvr may lead to use-after-free
- [IMPALA-6624] - Network error: failed to write to TLS socket: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c
- [IMPALA-6643] - Add fine-grained REFRESH privilege
- [IMPALA-6647] - Add fine-grained CREATE privilege
- [IMPALA-6649] - Add fine-grained ALTER privilege
- [IMPALA-6650] - Add fine-grained DROP privilege
- [IMPALA-6651] - Impala 2.13 & 3.0 Docs: Fine-grained privileges
- [IMPALA-6666] - Impala 2.12 & 3.0 Docs: Use AES-GCM for spill-to-disk encryption when CLMUL instruction is present and performant
- [IMPALA-6669] - Remove NeedsSeedingForBatchedReading()
- [IMPALA-6685] - Improve profile in KrpcDataStreamRecvr and KrpcDataStreamSender
- [IMPALA-6688] - Impala 2.12 & 3.0 Docs: Change default configuration to --compact_catalog_topic=true
- [IMPALA-6723] - Impala 2.12 & 3.0 Docs: Support insert plan hints for CREATE TABLE AS SELECT
- [IMPALA-6728] - The combination use_kudu_kinit=false and use_krpc=true crashes Impalad
- [IMPALA-6748] - Impala 2.12 & 3.0 Docs: Support more separators between date and time in default timestamp format
- [IMPALA-6800] - Add missing test cases in statements that require ALTER privilege
- [IMPALA-6804] - Allow SELECT and INSERT privileges on SERVER
- [IMPALA-6842] - Impala 3 Doc: Remove 'disable_admission_control' as a startup flag
- [IMPALA-6867] - Impala 2.12 & 3.0 Docs: Provide a query option to not shuffle on distinct exprs
- [IMPALA-6868] - Impala 3.0 Doc: Remove old kinit code for Impala 3
- [IMPALA-6961] - Impala Doc: Doc --enable_minidumps flag
Bug
- [IMPALA-2567] - KRPC milestone 1
- [IMPALA-3464] - Show Create Table with Unusual Delimiters Incorrect
- [IMPALA-4319] - Remove unused query options in compat-breaking version
- [IMPALA-4371] - Incorrect DCHECK-s in hdfs-parquet-table-writer
- [IMPALA-4924] - Remove DECIMAL V1 code at next compatibility breaking version
- [IMPALA-5139] - Make mvn-quiet.sh write its output to a logfile
- [IMPALA-5191] - Behavior of GROUP BY, HAVING, ORDER BY with column aliases should be more standard conforming
- [IMPALA-5269] - Comment on Final Line of Query Breaks Parsing
- [IMPALA-5293] - Turn insert clustering on by default
- [IMPALA-5903] - Inconsistent specification of result set and result set metadata
- [IMPALA-5930] - Document that SCAN_NODE_CODEGEN_THRESHOLD has had no effect since 2.7
- [IMPALA-6008] - Creating a UDF from a shared library with a .ll extension crashes Impala
- [IMPALA-6092] - Flaky test: query_test/test_udfs.py still happening
- [IMPALA-6215] - Race between lib_cache and java udf class loading
- [IMPALA-6230] - The output type of a round() function should match the input type
- [IMPALA-6275] - Successful CTAS logs warning
- [IMPALA-6303] - [DOCS] Incorrect mention of DataNodes in Impala docs
- [IMPALA-6322] - Group by expression fails when expression includes a CAST
- [IMPALA-6340] - There is no error when inserting an invalid value into a decimal column under decimal_v2
- [IMPALA-6372] - Dataload should execute Hive loads in parallel
- [IMPALA-6392] - Explain format for parquet predicate statistics should be consistent with predicates
- [IMPALA-6405] - There is no error under Decimal v2 when there is an overflow when casting String to Decimal
- [IMPALA-6429] - Decimal division returns an incorrect result
- [IMPALA-6441] - concurrent select binary search explain string pattern match is wrong
- [IMPALA-6447] - Failure in tests.stress.concurrent_select
- [IMPALA-6449] - Use CLOCK_MONOTONIC in ConditionVariable
- [IMPALA-6451] - Creating a Kudu table with CTAS fails with AuthorizationException: User 'username' does not have privileges to access: server1
- [IMPALA-6454] - CTAS into Kudu fails with mixed-case partition and/or primary key column names
- [IMPALA-6468] - Round() is inconsistent for Decimal and Double
- [IMPALA-6471] - Incorrect Impala ALTER TABLE statement documentation
- [IMPALA-6472] - Builds broken because test_exprs uses a reserved word
- [IMPALA-6473] - Error in analytic sort with same expr in 'partition by' and 'order by'
- [IMPALA-6488] - Crash in LibCache::GetCacheEntryInternal()
- [IMPALA-6489] - ASAN use-after-poison in impala::HdfsScanner::InitTupleFromTemplate
- [IMPALA-6495] - targeted-perf tests broken by column alias change
- [IMPALA-6498] - test_query_profile_thrift_timestamps causes following tests to fail
- [IMPALA-6500] - Impala crashes randomly on different queries with GROUP BY
- [IMPALA-6511] - Fix event "Open Finished" in state machine in FragmentInstanceState::UpdateState()
- [IMPALA-6516] - Avoid logging during catalog update if the catalog version didn't change
- [IMPALA-6518] - Fix the output type of a decimal union for decimal_v2
- [IMPALA-6526] - Regression: query_test.test_spilling.TestSpillingDebugActionDimensions.test_spilling
- [IMPALA-6527] - NaN values lead to incorrect filtering under certain circumstances
- [IMPALA-6571] - NullPointerException in SHOW CREATE TABLE for HBase tables
- [IMPALA-6577] - TestQueryExpiration::test_concurrent_query_expiration failing
- [IMPALA-6582] - Flaky test: TestImpalaShellInteractive::test_multiline_queries_in_history
- [IMPALA-6583] - Various tests fail with missing database or table from catalog
- [IMPALA-6584] - TestKuduOperations::test_column_storage_attributes broken on exhaustive build
- [IMPALA-6585] - test_low_mem_limit_q21 flaky under ASAN
- [IMPALA-6586] - FrontendTest.TestGetTablesTypeTable failing on some builds
- [IMPALA-6588] - test_compute_stats_tablesample failing with "Cancelled"
- [IMPALA-6589] - Fuzz test on parquet table crashes impala
- [IMPALA-6595] - Hit crash freeing buffer in NljBuilder::Close()
- [IMPALA-6599] - Log spam: ImpaladCatalog.java:525] NativeLibCacheSetNeedsRefresh(hdfs://localhost:20500/test-warehouse/test-udfs.ll) failed.
- [IMPALA-6602] - TestQueryExpiration.test_query_expiration fails on Isilon with FINISHED rather than EXCEPTION state
- [IMPALA-6615] - An insert query using a CTE does not show the expected output when executed in Impala-shell
- [IMPALA-6619] - Alter table recover partitions creates unneeded partitions when it encounters a percent sign
- [IMPALA-6670] - Executor-only impalads do not refresh their lib-cache entries
- [IMPALA-6683] - Restarting the Catalog without restarting Impalad and SS can block topic updates
- [IMPALA-6690] - Invalid syntax in pip_download.py due to a recent patch
- [IMPALA-6694] - BufferPool appears misaligned in query profile
- [IMPALA-6695] - Builds fail with pkg_resources.VersionConflict
- [IMPALA-6697] - Setuptools 39.0.0 does not work with Python 2.6
- [IMPALA-6715] - stress test is double-counting TPCDS queries
- [IMPALA-6716] - ImpalaShell should not rely on global access to parsed command line options
- [IMPALA-6717] - common_query_options are not used in binary search phase of stress test
- [IMPALA-6719] - refresh function case sensitivity
- [IMPALA-6722] - Local build failing due to missing libTestUdfs.so
- [IMPALA-6724] - Allow creating/dropping functions with the same name as built-ins
- [IMPALA-6731] - Build failed with distutils.errors.DistutilsError: Could not find suitable distribution
- [IMPALA-6739] - Exception in ALTER TABLE SET statements
- [IMPALA-6752] - import kudu fails in python on Ubuntu 16.04
- [IMPALA-6759] - stress test can't parse explain string with petabyte memory estimates (really)
- [IMPALA-6774] - SyntaxError: invalid syntax diagnostics/collect_diagnostics.py
- [IMPALA-6785] - Starting an Impalad on an already running cluster may result in inconsistent cluster subscription
- [IMPALA-6790] - sqlparse needs to be upgraded in the Python environment
- [IMPALA-6793] - Metadata doesn't recover after restarting statestore
- [IMPALA-6824] - Crash in RuntimeProfile::EventSequence::AddNewerEvents() when events_ is empty
- [IMPALA-6825] - /share/hadoop/tools/lib/ missing from HADOOP_CLASSPATH when using Hadoop 3, breaking S3 dev support
- [IMPALA-6851] - custom cluster test TestGrantRevoke::test_role_update doesn't clean up restarted impala daemons correctly
- [IMPALA-6862] - Privilege.java needs to support Sentry 1.5.1 and Sentry 2.0.0 API
- [IMPALA-6884] - test_misaligned_orc_stripes fails for local fs tests
- [IMPALA-6887] - Typo in authz-policy.ini.template
- [IMPALA-6889] - test_breakpad.py test_minidump_cleanup_thread failing
- [IMPALA-6896] - NullPointerException in DESCRIBE FORMATTED on views
- [IMPALA-6898] - Dataload from scratch loads Kudu tables twice
- [IMPALA-6899] - Dataload uses excessive HDFS commands
- [IMPALA-6927] - Crash when clicking Fragment Instances on web page
- [IMPALA-6934] - Wrong results with EXISTS subquery containing ORDER BY, LIMIT, and OFFSET
Documentation
- [IMPALA-6176] - Document that SCAN_NODE_CODEGEN_THRESHOLD has no effect
- [IMPALA-6415] - Impala 3 Doc: Document breaking change of alias and ordinal substitution