Impala 3.1 Change Log
New Feature
- [IMPALA-110] - Add support for multiple distinct operators in the same query block
- [IMPALA-589] - add sql function to return the impalad coordinator hostname
- [IMPALA-1760] - Add Impala SQL command to gracefully shut down an Impala daemon
- [IMPALA-4848] - add WIDTH_BUCKET() function
- [IMPALA-5614] - Add COMMENT ON syntax to support comments on all objects
- [IMPALA-5717] - Support for ORC format files
- [IMPALA-5842] - Write page index in Parquet files
- [IMPALA-6373] - Allow type widening primitive conversion on parquet tables
- [IMPALA-7127] - Fetch-on-demand metadata for the impalad-side catalog
Epic
- [IMPALA-7075] - Add support for object ownership for Impala
Improvement
- [IMPALA-402] - Add test for dynamic partition expr involving rand()
- [IMPALA-1624] - Allow toggling --quiet and -B settings interactively in impala-shell
- [IMPALA-2346] - Create unit test that exposes (now fixed) disk IO mgr race
- [IMPALA-3134] - Admission controller should not assume all backends have same proc mem limit
- [IMPALA-3307] - add support for IANA time zone database
- [IMPALA-3819] - Block locality metadata may be stale message is misleading
- [IMPALA-4308] - Make the minidumps archived in our Jenkins jobs usable
- [IMPALA-4835] - HDFS scans should operate with a constrained number of I/O buffers
- [IMPALA-4970] - Record identity of largest latency ExecQueryFInstances() RPC per query
- [IMPALA-5004] - Switch to sorting node for large TopN queries
- [IMPALA-5384] - Simplify coordinator locking protocol
- [IMPALA-5552] - Proxy user list should support groups
- [IMPALA-5642] - [DOCS] Impala restrictions on using Hive UDFs
- [IMPALA-5662] - Log all information relevant to admission control decision making
- [IMPALA-5690] - Upgrade Thrift version to 0.9.3
- [IMPALA-5931] - Don't synthesize block metadata in the catalog for S3/ADLS
- [IMPALA-6233] - Document the column definitions list in CREATE VIEW
- [IMPALA-6249] - Expose several build flags via web UI
- [IMPALA-6271] - Impala daemon should log a message when it's being shut down
- [IMPALA-6299] - IRBuilder codegen should use LLVMCodeGen::cpu_attrs instead of CpuInfo to determine valid instructions
- [IMPALA-6305] - Allow column definitions in ALTER VIEW
- [IMPALA-6323] - Support a constant in a window specification
- [IMPALA-6335] - Remove the unnecessary decorator "pytest.mark.execute_serially" from tests which can be run in parallel
- [IMPALA-6425] - Change Mempool memory allocation size to be <1MB to avoid allocating from CentralFreeList
- [IMPALA-6490] - Automatically reconnect shell session if remote died
- [IMPALA-6507] - Consider removing --disable_mem_pools debugging feature
- [IMPALA-6568] - The profile of all statements should contain the Query Compilation timeline
- [IMPALA-6644] - Add last heartbeat timestamp into Statestore metric
- [IMPALA-6645] - Consider enabling disk spill encryption by default
- [IMPALA-6709] - Simplify tests that copy local files to tables
- [IMPALA-6802] - Clean up tests in AuthorizationTest
- [IMPALA-6835] - Improve Kudu scanner error messages to include the table name and the plan node id
- [IMPALA-6837] - Allow setting multiple allowed networks in distcc server script
- [IMPALA-6847] - Consider adding workaround for high memory estimates in admission control
- [IMPALA-6857] - Add JVM Pause Monitor to Impala Processes
- [IMPALA-6883] - [DOCS] Clean up Impala authorization doc for 3.x
- [IMPALA-6904] - Allow configuration of stress test binary search cutoff point
- [IMPALA-6905] - Allow use of row_regex with VERIFY_IS_SUBSET and VERIFY_IS_SUPERSET
- [IMPALA-6941] - Allow loading more text scanner plugins
- [IMPALA-6942] - "Cancelled due to unreachable impalad(s)" error message is misleading
- [IMPALA-6953] - Improve encapsulation within DiskIoMgr
- [IMPALA-6957] - Include number of required threads in explain plan
- [IMPALA-6969] - Profile doesn't include the reason that a query couldn't be dequeued from admission controller
- [IMPALA-6993] - Don't print status stack trace when propagating thrift status in Coordinator::BackendState::Exec()
- [IMPALA-6999] - Upgrade to sqlparse 0.1.19 in Impala Shell
- [IMPALA-7006] - Rebase KRPC onto Kudu upstream repository
- [IMPALA-7011] - Cleanups around PlanRootSink::CloseConsumer()
- [IMPALA-7014] - Disable stacktrace symbolisation by default
- [IMPALA-7018] - OpenSSL pending errors not cleared when spill-to-disk encryption is enabled
- [IMPALA-7024] - Convert Coordinator::wait_lock_ from boost::mutex to SpinLock
- [IMPALA-7071] - Make get_fs_path() idempotent
- [IMPALA-7078] - Selective Avro scans of wide tables use more memory than necessary
- [IMPALA-7082] - Show Human Readable Size in Query Backend Page
- [IMPALA-7115] - Set a default THREAD_RESERVATION_LIMIT value
- [IMPALA-7121] - Clean up partitionIds_ member from HdfsTable
- [IMPALA-7146] - log session ID in Impala daemon log
- [IMPALA-7157] - Avoid unnecessarily pretty printing profiles per fragment instance
- [IMPALA-7171] - Add docs for Kudu insert partitioning/sorting
- [IMPALA-7180] - Pin Impala CDH dependencies
- [IMPALA-7185] - Reduce statestore frequency for custom cluster tests by default
- [IMPALA-7191] - Daemons should call srand() early in main rather than at random locations
- [IMPALA-7215] - Implement a templatized CountingBarrier
- [IMPALA-7252] - Backport rate limiting of fadvise calls into toolchain glog
- [IMPALA-7291] - [DOCS] Document recommendation to use VARCHAR or STRING instead of CHAR(N)
- [IMPALA-7295] - Remove IMPALA_MINICLUSTER_PROFILE=2
- [IMPALA-7296] - Soft limit for memory queue in scan node row batch queue
- [IMPALA-7297] - Reject reservation increase if it would exceed soft limit
- [IMPALA-7314] - Impala doc generation should fail when there is an error
- [IMPALA-7330] - Make the table metadata refresh after "LOAD" commands incremental
- [IMPALA-7340] - Clarify which fields belong in "descriptor" vs "full" catalog thrift objects
- [IMPALA-7349] - Automatically choose mem_limit based on estimate, clamped to range
- [IMPALA-7362] - Add query option to set timezone
- [IMPALA-7364] - Upgrade RapidJson to the latest version
- [IMPALA-7381] - Prevent build failure after switching to a new CDH_BUILD_NUMBER
- [IMPALA-7383] - Make METASTORE_DB configurable and default to escaped $IMPALA_HOME
- [IMPALA-7393] - Test infra should log query IDs
- [IMPALA-7394] - Don't print stack trace in ImpalaServer::ExpireSessions()
- [IMPALA-7398] - Add logged_in_user alias for effective_user
- [IMPALA-7406] - Flatbuffer wrappers use almost as much memory as underlying data
- [IMPALA-7408] - Add a flag to selectively disable fs operations used by catalogd
- [IMPALA-7410] - HDFS Datanodes unable to start
- [IMPALA-7420] - Alternative cancellation messages with "internal" cancellation
- [IMPALA-7444] - Improve debug logging of session opening and expiry
- [IMPALA-7448] - Periodically evict recently unused table from catalogd
- [IMPALA-7453] - Intern HdfsStorageDescriptors
- [IMPALA-7457] - Allow StateStore subscribers to filter keys by a prefix
- [IMPALA-7466] - Improve readability/usability of describe authorization tests
- [IMPALA-7499] - Build against CDH Kudu
- [IMPALA-7519] - Support elliptic curve ssl ciphers
- [IMPALA-7554] - Update custom cluster tests to have sentry create new log on each start
- [IMPALA-7573] - Fix GroupingAggregator::Reset's handling of output_partition_
- [IMPALA-7576] - Add a default timeout for all e2e tests
- [IMPALA-7599] - Make num retries for InconsistentMetadataFetchException configurable
- [IMPALA-7622] - Add query profile metrics for RPCs used when pulling incremental stats.
- [IMPALA-7639] - impala-shell stuck at "Starting Impala Shell without Kerberos authentication" in test_multiline_queries_in_history
- [IMPALA-7644] - Hide Parquet page index writing with feature flag
- [IMPALA-7660] - Support ECDH ciphers for debug webserver
- [IMPALA-7673] - Parse --var variable values to replace variables within the value
- [IMPALA-7680] - Impala Doc: Document how to set default query options for a resource pool
- [IMPALA-7689] - Improve size estimate for incremental stats
- [IMPALA-7691] - test_web_pages not being run
- [IMPALA-7702] - Enable pull incremental stats by default
- [IMPALA-7703] - Upgrade to Sentry 2.1.0
- [IMPALA-7708] - Switch to faster compression strategy for incremental stats
- [IMPALA-7709] - Add options to restart catalogd and statestored in start-impala-cluster.py
- [IMPALA-7713] - Add test coverage for catalogd restart when authorization is enabled
- [IMPALA-7715] - "Impala Conditional Functions" documentation errata
- [IMPALA-7735] - Expose admission control status in impala-shell
- [IMPALA-7739] - Errata in documentation of decode() method
- [IMPALA-7786] - Start Hive and HMS in debug mode in the mini cluster
Bug
- [IMPALA-2014] - rapidjson can produce invalid json metrics if locale is changed
- [IMPALA-2063] - impala 'cancelled' query status now has extra characters in it
- [IMPALA-2195] - Improper handling of comments in queries
- [IMPALA-2566] - Result of casttochar() not handled properly in SQL operations
- [IMPALA-2717] - impala-shell breaks on non-ascii characters in the resultset
- [IMPALA-2751] - quote in WITH block's comment breaks shell
- [IMPALA-3040] - test_caching_ddl failed with unexpected get_num_cache_requests
- [IMPALA-3082] - BST between 1972 and 1995
- [IMPALA-3316] - convert_legacy_hive_parquet_utc_timestamps=true makes reading parquet tables 30x slower
- [IMPALA-3652] - Fix resource transfer in subplans with limits
- [IMPALA-3813] - How to handle replication factor while creating KUDU table through impala
- [IMPALA-3833] - Fix invalid data handling in Sequence and RCFile scanners
- [IMPALA-3956] - Impala shell variable substitution should ignore comments embedded in query.
- [IMPALA-4690] - conv() needs substantially more documentation
- [IMPALA-4850] - Create table "comment" comes after "partitioned by"
- [IMPALA-4908] - NULL floats don't compare equal to other NULL floats
- [IMPALA-5202] - Debug action WAIT in PREPARE leads to hung query that cannot be cancelled.
- [IMPALA-5337] - DITA not including anchor for "concepts"
- [IMPALA-5380] - impala_authorization.html should mention other filesystems in
- [IMPALA-5563] - Timezone lookup may be ambiguous
- [IMPALA-5740] - Clarify STRING size limit
- [IMPALA-5826] - Document IMPALA-2248
- [IMPALA-5839] - nullifzero and zeroifnull documentation about return type doesn't match implementation
- [IMPALA-5918] - Improve the documentation around REFRESH and INVALIDATE METADATA
- [IMPALA-5937] - Docs are missing some query options
- [IMPALA-5950] - Q35a and Q48 test files don't match standard qualification queries
- [IMPALA-5956] - Add TPC-DS q31 and q89 to test suite
- [IMPALA-5981] - Update documentation for 'SET key=""' changes in IMPALA-5908
- [IMPALA-6020] - REFRESH statement cannot detect HDFS block movement
- [IMPALA-6036] - Queries in admission control pool queue stay in the queue even after end client had already disconnected.
- [IMPALA-6086] - Use of permanent function should require SELECT privilege on DB
- [IMPALA-6153] - Prevent Coordinator::UpdateFilter() running after query exec resources are released
- [IMPALA-6174] - Mismatched size of input seed of rand/random between docs and backend
- [IMPALA-6202] - mod and % should be equivalent
- [IMPALA-6221] - Kudu partition clause doc is wrong
- [IMPALA-6227] - TestAdmissionControllerStress can be flaky
- [IMPALA-6337] - Infinite loop in impala_shell.py
- [IMPALA-6408] - [DOCS] Description of "shuffle" hint does not mention changes in IMPALA-3930
- [IMPALA-6436] - Impala Catalog generates a core file / mini dump when the HMS is not available
- [IMPALA-6442] - Misleading file offset reporting in error messages
- [IMPALA-6532] - NullPointerException in HiveContextAwareRecordReader.initIOContext() when executing Hive query
- [IMPALA-6661] - Group by float results in one group per NaN value
- [IMPALA-6711] - Flaky test: TestImpalaShellInteractive.test_multiline_queries_in_history
- [IMPALA-6776] - Failed to assign hbase regions to servers
- [IMPALA-6781] - Stress tests should take into account non-deterministic results of queries
- [IMPALA-6789] - Failed to launch HiveServer2 in minicluster after switching to Hadoop3
- [IMPALA-6810] - query_test::test_runtime_filters.py::test_row_filters fails when run against an external cluster
- [IMPALA-6812] - Kudu scans not returning all rows
- [IMPALA-6813] - Hedged reads metrics broken when scanning non-HDFS based table
- [IMPALA-6816] - Statestore spends a lot of time in GetMinSubscriberTopicVersion()
- [IMPALA-6827] - airlines_parquet data not available in dropbox
- [IMPALA-6844] - Fix possible NULL dereference in to_date() builtin
- [IMPALA-6866] - S3 and Isilon tests fail test_exchange_delay::test_exchange_large_delay
- [IMPALA-6881] - COMPUTE STATS should check for SELECT privilege at analysis
- [IMPALA-6882] - Inline assembly instructions hoisted out of if(CpuInfo::IsSupported()) checks
- [IMPALA-6892] - CheckHashAndDecrypt doesn't report disk and host where the verification failed
- [IMPALA-6902] - query_test.test_udfs.TestUdfExecution.test_native_functions_race failed during core/thrift build
- [IMPALA-6906] - test_admission_controller.TestAdmissionController.test_memory_rejection on S3
- [IMPALA-6907] - ImpalaServer::MembershipCallback() may not remove all stale connections to disconnected Impalad nodes
- [IMPALA-6910] - Multiple tests failing on S3 build: error reading from HDFS file
- [IMPALA-6920] - Multithreaded scans are not guaranteed to get a thread token immediately
- [IMPALA-6929] - Create Kudu table syntax does not allow multi-column range partitions
- [IMPALA-6931] - TestQueryExpiration.test_query_expiration fails on ASAN with unexpected number of expired queries
- [IMPALA-6933] - test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"
- [IMPALA-6946] - Hit DCHECK in impala::RleBatchDecoder<unsigned int>::GetRepeatedValue
- [IMPALA-6947] - kudu: GetTableLocations RPC timing out with ASAN
- [IMPALA-6956] - check_num_executing fails test_query_expiration
- [IMPALA-6966] - Estimated Memory in Catalogd webpage is not sorted correctly
- [IMPALA-6968] - TestBlockVerificationGcmDisabled failure in exhaustive
- [IMPALA-6970] - DiskMgr::AllocateBuffersForRange crashes on failed DCHECK
- [IMPALA-6975] - TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded
- [IMPALA-6980] - Impala Doc: Impala can add comment with alter table
- [IMPALA-6983] - stress test binary search exits if process mem_limit is too low
- [IMPALA-6987] - Impala Doc: Clear up Impala's invalidate metadata page
- [IMPALA-6990] - TestClientSsl.test_tls_v12 failing due to Python SSL error
- [IMPALA-6995] - False-positive DCHECK when converting whitespace-only strings to timestamp
- [IMPALA-6997] - Query execution should notice UDF MemLimitExceeded errors more quickly
- [IMPALA-7000] - Wrong info about Impala dedicated executors
- [IMPALA-7008] - TestSpillingDebugActionDimensions.test_spilling test setup fails
- [IMPALA-7010] - Multiple flaky tests failing with MemLimitExceeded on S3
- [IMPALA-7012] - NullPointerException with CTAS query
- [IMPALA-7017] - TestMetadataReplicas.test_catalog_restart fails with exception
- [IMPALA-7022] - TestQueries.test_subquery: Subquery must not return more than one row
- [IMPALA-7023] - TestInsertQueries.test_insert_overwrite fails by hitting memory limit
- [IMPALA-7025] - PlannerTest.testTableSample has wrong mem-reservation
- [IMPALA-7026] - buildall.sh make shell fails due to sqlparse problem
- [IMPALA-7032] - Codegen crash when UNIONing NULL and CHAR(N)
- [IMPALA-7033] - Impala crashes on exhaustive release tests
- [IMPALA-7039] - Frontend HBase tests cannot tolerate HBase running on a different port
- [IMPALA-7043] - Failure in HBase splitting should not fail dataload
- [IMPALA-7048] - Failed test: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables
- [IMPALA-7055] - test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to table format AVRO is not supported")
- [IMPALA-7056] - Changing Text Delimiter Does Not Work
- [IMPALA-7058] - RC and Seq fuzz tests cause crash
- [IMPALA-7059] - Inconsistent privilege model between DESCRIBE and DESCRIBE DATABASE
- [IMPALA-7061] - PlannerTests should assign HBase splits as part of the test
- [IMPALA-7062] - Fix unsafe RuntimeProfile::SortChildren() API
- [IMPALA-7067] - sleep(100000) command from test_shell_commandline.py can hang around and cause test_metrics_are_zero to fail
- [IMPALA-7068] - Failed test: metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression on S3
- [IMPALA-7069] - Java UDF tests can trigger a crash in Java ClassLoader
- [IMPALA-7073] - Failed test: query_test.test_scanners.TestScannerReservation.test_scanners
- [IMPALA-7088] - Parallel data load breaks load-data.py if loading data on a real cluster
- [IMPALA-7089] - test_kudu_dml_reporting failing
- [IMPALA-7090] - EqualityDisjunctsToInRule should respect the limit on the number of children in an expr
- [IMPALA-7095] - Improve scanner thread counters in HDFS and Kudu scans
- [IMPALA-7096] - Ensure no memory limit exceeded regressions from IMPALA-4835 because of non-reserved memory
- [IMPALA-7099] - test_unsupported_text_compression fails: Expected one file per partition dir
- [IMPALA-7100] - [DOCS] extend hardware requirements to have consistent backend memory
- [IMPALA-7101] - Builds are timing out/hanging
- [IMPALA-7104] - test_bloom_wait_time failing with timeout on asan
- [IMPALA-7105] - Some fe tests fail when run standalone
- [IMPALA-7106] - Log the original and rewritten SQL when SQL rewrite fails
- [IMPALA-7108] - IllegalStateException hit during CardinalityCheckNode.<init>
- [IMPALA-7109] - TestPartitionMetadata::test_multiple_partitions_same_location uses incorrect paths
- [IMPALA-7111] - ASAN heap-use-after-free in impala::HdfsPluginTextScanner::CheckPluginEnabled
- [IMPALA-7119] - HBase tests failing with RetriesExhausted and "RuntimeException: couldn't retrieve HBase table"
- [IMPALA-7120] - GVD failed talking to oss.sonatype.org "Bad Gateway"
- [IMPALA-7124] - Error during data load: Can't recover key from keystore file
- [IMPALA-7130] - impala-shell -b / --kerberos_host_fqdn flag overrides value passed in via -i
- [IMPALA-7132] - run_clang_tidy.sh produces unrelated output
- [IMPALA-7143] - TestDescribeTableResults started failing because of OwnerType
- [IMPALA-7144] - Reenable tests disabled by IMPALA-7143
- [IMPALA-7145] - Leak of memory from OpenSSL when spill-to-disk encryption is enabled
- [IMPALA-7147] - Hit DCHECK in Parquet fuzz test
- [IMPALA-7150] - Crash in Reflection::invoke_method()
- [IMPALA-7151] - session-expiry-test failed - unable to open ThriftServer port
- [IMPALA-7158] - Incorrect init of HdfsScanNodeBase::progress_
- [IMPALA-7161] - Bootstrap's handling of JAVA_HOME needs improvement
- [IMPALA-7165] - Impala Doc: Example for Dynamic Partition Pruning need to be improved
- [IMPALA-7166] - ExecSummary should be a first class object
- [IMPALA-7169] - TestHdfsEncryption::()::test_drop_partition_encrypt fails to find file
- [IMPALA-7170] - "tests/comparison/data_generator.py populate" is broken
- [IMPALA-7174] - TestAdmissionController.test_cancellation failed with incorrect total-admitted metric
- [IMPALA-7175] - In a local FS build, test_native_functions_race thinks there are 2 impalads where there should be 1
- [IMPALA-7181] - Fix flaky test shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening
- [IMPALA-7187] - Fix test_group_impersonation when running inside Docker
- [IMPALA-7193] - Local filesystem fails with fs.defaultFS (file:/tmp) is not supported
- [IMPALA-7198] - Impala hints in docs are wrong
- [IMPALA-7199] - Need to have scripts to generate coverage
- [IMPALA-7200] - Local filesystem dataload fails due to missing FILESYSTEM_PREFIX
- [IMPALA-7209] - Disallow self referencing ALTER VIEW statements
- [IMPALA-7210] - Global debug_action should use case insensitive comparisons
- [IMPALA-7211] - Query with a decimal between predicate needlessly fails
- [IMPALA-7222] - [DOCS] authorization_proxy_user_config needs clarification
- [IMPALA-7224] - UpdateCatalogMetrics very slow when there are many tables
- [IMPALA-7225] - Refresh on single partition resets partition's row count to -1
- [IMPALA-7234] - Non-deterministic majority format for a table with equal partition instances
- [IMPALA-7236] - Erasure coding dataload broken by IMPALA-7102
- [IMPALA-7237] - Parsing bug in ParseSMaps()
- [IMPALA-7238] - test_kudu.TestCreateExternalTable sees unique database already exists
- [IMPALA-7239] - Mitigate ParseSmaps() overhead
- [IMPALA-7240] - TestSpillingDebugActionDimensions.test_spilling_regression_exhaustive hits memory limit on exhaustive tests
- [IMPALA-7241] - progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
- [IMPALA-7242] - Dcheck fails in width_bucket() function
- [IMPALA-7243] - width_bucket() returns an incorrect result
- [IMPALA-7251] - Fix QueryMaintenance calls in Aggregators
- [IMPALA-7254] - Inconsistent decimal behavior for the IN predicate
- [IMPALA-7256] - Aggregator mem usage isn't reflected in summary
- [IMPALA-7259] - impala-shell is weirdly slow with some large queries
- [IMPALA-7260] - Query with decimal binary predicate needlessly fails
- [IMPALA-7271] - KRPC: cross-port caching of GetLoggedInUser
- [IMPALA-7272] - impalad crashes during Fatigue test
- [IMPALA-7275] - Create table with insufficient privileges should not show table
- [IMPALA-7279] - test_rows_availability failing: incompatible regex
- [IMPALA-7288] - Codegen crash in FinalizeModule()
- [IMPALA-7294] - TABLESAMPLE clause allocates arrays based on total file count instead of selected partitions
- [IMPALA-7298] - Don't pass resolved IP address as hostname when creating proxy
- [IMPALA-7302] - Build fails on Centos6
- [IMPALA-7304] - Impala shouldn't write column indexes for float columns until PARQUET-1222 is resolved
- [IMPALA-7305] - membership entry for failed impalad gets stuck in statestore due to race between failure detection and update processing
- [IMPALA-7306] - Add regression test for IMPALA-7305
- [IMPALA-7311] - INSERT with specified target partition fails if any other partition is missing write permissions
- [IMPALA-7315] - tests_statestore.py fails at assert len(args.topic_deltas[topic_name].topic_entries) == 1
- [IMPALA-7316] - Fix broken build due to Hadoop JAR mismatch
- [IMPALA-7325] - SHOW CREATE VIEW on a view that references built-in functions requires access to the built-in database
- [IMPALA-7329] - Blacklist CDH Maven snapshots repository
- [IMPALA-7335] - Assertion Failure - test_corrupt_files
- [IMPALA-7347] - Assertion Failure - test_show_create_table
- [IMPALA-7348] - PlannerTest.testKuduSelectivity failing due to missing Cardinality information
- [IMPALA-7360] - Avro scanner sometimes skips blocks when skip marker is on HDFS block boundary
- [IMPALA-7361] - test_heterogeneous_proc_mem_limit: Query aborted: Not enough memory available on host (s3)
- [IMPALA-7363] - Spurious error generated by sequence file scanner with weird scan range length
- [IMPALA-7376] - Impala hits a DCHECK if a fragment instance fails to initialize the filter bank
- [IMPALA-7386] - CatalogObjectVersionQueue should not be a queue
- [IMPALA-7387] - Set appropriate MIME type for JSON web pages
- [IMPALA-7388] - JNI THROW_IF_ERROR macros use local scope variables which likely conflict
- [IMPALA-7395] - TRUNCATE <table> syntax not documented
- [IMPALA-7396] - Fix crashes when thread_creation_fault_injection is enabled
- [IMPALA-7397] - dcheck in impala::AggregationNode::Close
- [IMPALA-7400] - "SQL Statements to Remove or Adapt" is out of date
- [IMPALA-7402] - DCHECK failed min_bytes_to_write <= dirty_unpinned_pages_ in buffer-pool
- [IMPALA-7403] - AnalyticEvalNode does not manage BufferedTupleStream memory correctly
- [IMPALA-7412] - width_bucket() function overflows too easily
- [IMPALA-7415] - Flaky test: TestImpalaShellInteractive.test_multiline_queries_in_history
- [IMPALA-7418] - test_udf_errors - returns Cancelled instead of actual error
- [IMPALA-7419] - NullPointerException in SimplifyConditionalsRule
- [IMPALA-7421] - Static methods called with wrong JNI function
- [IMPALA-7422] - Crash in QueryState::PublishFilter() fragment_map_.count(params.dst_fragment_idx) == 1 (0 vs. 1)
- [IMPALA-7423] - NoSuchMethodError when starting Sentry in the minicluster
- [IMPALA-7426] - T-test is an unreliable method for comparing non-normal distributions
- [IMPALA-7428] - Flaky test: test_shell_commandline.py / test_large_sql
- [IMPALA-7433] - Reduce volume of -v=1 logs on executors
- [IMPALA-7439] - CREATE DATABASE creates a catalog entry with empty location field
- [IMPALA-7442] - test_semi_joins_exhaustive occasionally fails
- [IMPALA-7443] - Fix intermittent GVO failures due to stale Maven cache
- [IMPALA-7445] - test_resource_limits running in unsupported envs
- [IMPALA-7449] - TotalNetworkThroughput in KrpcDataStreamSender is broken
- [IMPALA-7452] - test_disable_catalog_data_op s3 failure
- [IMPALA-7459] - Query with width_bucket() crashes with Check failed: lhs > 0 && rhs > 0
- [IMPALA-7460] - update paramiko and fabric
- [IMPALA-7464] - DCHECK(!released_exec_resources_) is triggered when ExecFInstance() RPC times out
- [IMPALA-7465] - Hit memory limit in TestScanMemLimit.test_kudu_scan_mem_usage on ASAN build
- [IMPALA-7470] - SentryServicePinger logs a lot of error messages on success
- [IMPALA-7476] - test_statestore.py statestore client persists after the test is over
- [IMPALA-7487] - Failures in stress test when running against minicluster: "int() argument must be a string or a number, not 'NoneType'"
- [IMPALA-7488] - TestShellCommandLine::test_cancellation hangs occasionally
- [IMPALA-7490] - Uninitialized variable in data-load.py causes misleading error messages
- [IMPALA-7494] - Hang in TestTpcdsDecimalV2Query::test_tpcds_q69
- [IMPALA-7500] - Clarify workaround for IMPALA-635
- [IMPALA-7502] - ALTER TABLE RENAME should require ALL on the source table
- [IMPALA-7516] - Rejected queries don't get removed from the list of running queries
- [IMPALA-7517] - Hung scanner when soft memory limit exceeded
- [IMPALA-7520] - NPE in SentryProxy
- [IMPALA-7522] - milliseconds_add() can lead to overflow / DCHECK
- [IMPALA-7528] - Division by zero when computing cardinalities of many to many joins on NULL columns
- [IMPALA-7537] - REVOKE GRANT OPTION regression
- [IMPALA-7542] - find-fragment-instances in impala-gdb.py fails to find the "root threads"
- [IMPALA-7545] - Add admission control status to query log
- [IMPALA-7555] - impala-shell can hang in connect in certain cases
- [IMPALA-7559] - Parquet stat filtering ignores convert_legacy_hive_parquet_utc_timestamps
- [IMPALA-7569] - Impala Doc: Remove "safety valves" from docs
- [IMPALA-7570] - Impala Doc: Add a table of built-in functions
- [IMPALA-7575] - Fix doc for fmod, mod and %
- [IMPALA-7579] - TestObservability.test_query_profile_contains_all_events fails for S3 tests
- [IMPALA-7580] - test_local_catalog fails on S3 build
- [IMPALA-7581] - Hang in buffer-pool-test
- [IMPALA-7585] - Always set user credentials after creating a KRPC proxy
- [IMPALA-7586] - Incorrect results when querying primary = "\"" in Kudu and HBase
- [IMPALA-7588] - incorrect HS2 null handling introduced by IMPALA-7477
- [IMPALA-7591] - Use short name for setting owner.
- [IMPALA-7593] - test_automatic_invalidation failing in S3
- [IMPALA-7595] - Check failed: IsValidTime(time_) at timestamp-value.h:322
- [IMPALA-7597] - "show partitions" does not retry on InconsistentMetadataFetchException
- [IMPALA-7600] - Mem limit exceeded in test_kudu_scan_mem_usage
- [IMPALA-7606] - Time based auto invalidation not working
- [IMPALA-7616] - Refactor PrincipalPrivilege.buildPrivilegeName
- [IMPALA-7626] - Throttle requests in catalogd to avoid overloading catalogd
- [IMPALA-7628] - test_tls_ecdh failing on CentOS 6/Python 2.6
- [IMPALA-7632] - Erasure coding builds still failing because of default query options
- [IMPALA-7633] - count_user_privilege isn't 0 at the end of test_owner_privileges_without_grant
- [IMPALA-7646] - SHOW GRANT USER not working on kerberized clusters
- [IMPALA-7651] - Add Kudu support to scheduler-related query hints and options
- [IMPALA-7654] - TRUNCATE docs appear to be inaccurate
- [IMPALA-7661] - test_reconnect is flaky in asan
- [IMPALA-7662] - test_parquet reads bad_magic_number.parquet without an error
- [IMPALA-7663] - count_user_privilege isn't 0 at the end of test_owner_privileges_without_grant
- [IMPALA-7667] - sentry.db.explicit.grants.permitted does not accept empty value
- [IMPALA-7668] - close() URLClassLoaders after usage.
- [IMPALA-7669] - Concurrent invalidate with compute (or drop) stats throws NPE.
- [IMPALA-7671] - SHOW GRANT USER ON <object> is broken
- [IMPALA-7675] - The result of UpdateTableUsage() RPC is not correctly handled.
- [IMPALA-7676] - DESCRIBE on table should require VIEW_METADATA privilege
- [IMPALA-7677] - multiple count(distinct): Check failed: !hash_partitions_.empty()
- [IMPALA-7678] - Revert IMPALA-7660
- [IMPALA-7681] - Support new URI scheme for ADLS Gen2
- [IMPALA-7682] - AuthorizationPolicy is not thread-safe
- [IMPALA-7684] - Fix Admission result string printed in the query profile
- [IMPALA-7688] - Spurious error messages when updating owner privileges
- [IMPALA-7690] - TestAdmissionController.test_pool_config_change_while_queued fails on centos6
- [IMPALA-7693] - stress test binary search fails to set query names
- [IMPALA-7697] - flakiness in test_resource_limits: query completes execution faster than expected
- [IMPALA-7699] - TestSpillingNoDebugActionDimensions fails earlier than expected
- [IMPALA-7701] - Grant option always shows as NULL in SHOW GRANT ROLE/USER for any HS2 clients
- [IMPALA-7704] - ASAN tests failing in HdfsParquetTableWriter
- [IMPALA-7710] - test_owner_privileges_with_grant failed with AuthorizationException
- [IMPALA-7714] - Statestore::Subscriber::SetLastTopicVersionProcessed() crashed in AtomicInt64::Store()
- [IMPALA-7717] - Partition id does not exist exception - Catalog V2
- [IMPALA-7721] - /catalog_object web API is broken for getting a privilege
- [IMPALA-7727] - failed compute stats child query status no longer propagates to parent query
- [IMPALA-7729] - Invalidate metadata hangs when there is an upper case role name
- [IMPALA-7740] - Incorrect doc description for nvl2()
- [IMPALA-7742] - User names in Sentry are now case sensitive
- [IMPALA-7758] - chars_formats dependent tables are created using the wrong LOCATION
- [IMPALA-7760] - Privilege version inconsistency causes a hang when running invalidate metadata
- [IMPALA-7775] - StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock
- [IMPALA-7777] - Fix crash due to arithmetic overflows in Exchange Node
- [IMPALA-7792] - Disabling ORC scanner can cause query hang
- [IMPALA-7794] - Rewrite ownership authorization tests
- [IMPALA-7822] - Crash in repeat() constructing strings > 1GB
- [IMPALA-7824] - Running INVALIDATE METADATA with authorization enabled can cause a hang if Sentry is unavailable
- [IMPALA-7835] - Creating a role with the same name as the user name with object ownership enabled can cause INVALIDATE METADATA to hang
- [IMPALA-7840] - test_concurrent_schema_change is missing a possible error message
- [IMPALA-7861] - ABFS docs: SSL is now enabled by default regardless of URI scheme
- [IMPALA-7901] - Docs problem describing Kudu partitioning syntax
Task
- [IMPALA-1995] - Flaky test: PlannerTest.testHbase: splits for HBASE KEYRANGE not set up correctly.
- [IMPALA-3330] - translate deserves more documentation
- [IMPALA-5502] - "*DBC Connector for Impala" is without context
- [IMPALA-5604] - Document DISABLE_CODEGEN_ROWS_THRESHOLD
- [IMPALA-5900] - document -fe_service_threads
- [IMPALA-6859] - De-templatize RpcMgrTestBase
- [IMPALA-6982] - Publish Impala Best Practice on Hot Spot Analysis
- [IMPALA-7046] - Add targeted regression test for race in IMPALA-7033
- [IMPALA-7050] - Impala Doc: Document inc_stats_size_limit_bytes command line option for Impalad
- [IMPALA-7079] - test_scanners.TestParquet.test_multiple_blocks fails in the erasure coding job
- [IMPALA-7102] - Add a query option to enable/disable running queries on erasure coded files
- [IMPALA-7182] - Impala does not allow the use of insecure clusters with public IPs by default
- [IMPALA-7186] - Docs for kudu_read_mode
- [IMPALA-7190] - Remove unsupported format write support
- [IMPALA-7202] - Add WIDTH_BUCKET() function to the decimal fuzz test
- [IMPALA-7299] - Impala fails to work with the krb5 config 'rdns=false' in Impala 2.12.0/3.0
- [IMPALA-7317] - Instant validation job that runs on Gerrit upload (clang tidy, other simple checks)
- [IMPALA-7430] - Remove the log added to HdfsScanNode::ScannerThread
- [IMPALA-7440] - Consider removing --nlj-filter support from stress test
- [IMPALA-7607] - Add a reference to EXEC_TIME_LIMIT_S to "Setting Timeouts" page
- [IMPALA-7614] - Impala 3.1 Doc: Document the New Invalidate Options
- [IMPALA-7624] - test-with-docker sometimes hangs creating docker containers
- [IMPALA-7706] - Impala Doc: ALTER TABLE SET OWNER not on Sentry page for Impala
- [IMPALA-7806] - Impala 3.1 Doc: Check the existing known issues against 3.1 fixes
Sub-task
- [IMPALA-4784] - Remove InProcessStatestore
- [IMPALA-5216] - Admission control queuing should be asynchronous
- [IMPALA-5542] - Impala cannot scan Parquet decimal stored as int64_t/int32_t
- [IMPALA-5706] - Parallelise read I/O in sorter
- [IMPALA-6034] - Add query option that limits scanned bytes at runtime
- [IMPALA-6035] - Add query option that rejects queries based on query complexity
- [IMPALA-6481] - Impala Doc: WIDTH_BUCKET function
- [IMPALA-6587] - Crash in DiskMgr::AllocateBuffersForRange
- [IMPALA-6598] - Investigate memory requirement regression from scanner buffer pool change
- [IMPALA-6676] - Impala Doc: SHOW CREATE VIEW
- [IMPALA-6677] - Impala Doc: Doc next_day function
- [IMPALA-6678] - Better estimate of per-column compressed data size for low-NDV columns.
- [IMPALA-6679] - Don't claim ideal reservation in scanner until actually processing scan ranges
- [IMPALA-6832] - Impala 3.1 Doc: Add additional units to EXTRACT, DATE_PART, TRUNC for temporal data types
- [IMPALA-6916] - Implement COMMENT ON DATABASE
- [IMPALA-6917] - Implement COMMENT ON TABLE/VIEW
- [IMPALA-6918] - Implement COMMENT ON COLUMN
- [IMPALA-6919] - Impala 3.1 Doc: Doc the COMMENT ON statements
- [IMPALA-6949] - Make it possible to start the minicluster with HDFS erasure coding enabled
- [IMPALA-6988] - Implement ALTER TABLE/VIEW SET OWNER
- [IMPALA-6989] - Implement SHOW GRANT USER
- [IMPALA-7004] - Deflake erasure coding data loading
- [IMPALA-7016] - Implement ALTER DATABASE SET OWNER
- [IMPALA-7019] - Discard block locations and schedule as remote read with erasure coding
- [IMPALA-7074] - Update OWNER privilege on CREATE, DROP, and ALTER SET OWNER
- [IMPALA-7076] - Impala 2.13 & 3.1 Docs: ALTER TABLE / VIEW SET OWNER
- [IMPALA-7103] - Impala 3.1 Doc: Query option to enable EC
- [IMPALA-7128] - Extract interfaces for frontend interaction with catalog objects
- [IMPALA-7135] - Skeleton implementation of LocalCatalog
- [IMPALA-7137] - Support configuring impalad to use the LocalCatalog
- [IMPALA-7140] - Build out support for HDFS tables and views in LocalCatalog
- [IMPALA-7141] - Extract interfaces for partition pruning prior to fetching partitions
- [IMPALA-7163] - Implement a state machine for the QueryState class
- [IMPALA-7179] - Consider changing --allow_multiple_scratch_dirs_per_device default
- [IMPALA-7195] - Impala 3.1 Doc: Proxy user list should support groups
- [IMPALA-7201] - Support DDL in LocalCatalog using existing catalogd
- [IMPALA-7203] - Support functions in LocalCatalog
- [IMPALA-7205] - Respond to ReportExecStatus() RPC with CANCELLED whenever query execution has terminated
- [IMPALA-7206] - Doc THREAD_RESERVATION_LIMIT
- [IMPALA-7207] - Make Coordinator ExecState an atomic enum
- [IMPALA-7212] - Deprecate --use_krpc flag
- [IMPALA-7218] - Impala 3.1 Doc: Allow Column Definitions in ALTER VIEW
- [IMPALA-7231] - Classify plan nodes into pipelines in frontend
- [IMPALA-7233] - Impala 3.1 Doc: Doc the support for IANA timezone database
- [IMPALA-7244] - Impala 3.1 Doc: Remove unsupported format write support
- [IMPALA-7257] - Support querying Kudu tables in LocalCatalog
- [IMPALA-7258] - Support querying HBase tables in LocalCatalog
- [IMPALA-7276] - Support CREATE TABLE AS SELECT with LocalCatalog
- [IMPALA-7277] - Support for INSERT and LOAD DATA in LocalCatalog
- [IMPALA-7307] - Support TABLESAMPLE and stats extrapolation in LocalCatalog
- [IMPALA-7308] - Support Avro tables in LocalCatalog
- [IMPALA-7324] - Remove MarkNeedsDeepCopy() from Sorter
- [IMPALA-7333] - Remove MarkNeedsDeepCopy from Aggregation and Hash Join Nodes
- [IMPALA-7337] - Run docs build job on upload
- [IMPALA-7342] - Initial refactor to support user-level privileges
- [IMPALA-7343] - Change Sentry proxy to use the bulk API
- [IMPALA-7344] - Restrict ALTER DATABASE/TABLE SET OWNER to ALL/OWNER privilege with GRANT OPTION enabled
- [IMPALA-7345] - Add the OWNER privilege
- [IMPALA-7352] - HdfsTableSink doesn't take into account insert clustering
- [IMPALA-7354] - Validate memory estimates for standard workloads in planner tests
- [IMPALA-7377] - Update Sentry for object ownership feature
- [IMPALA-7392] - Document SCAN_BYTES_LIMIT
- [IMPALA-7424] - Improve in-memory representation of incremental stats
- [IMPALA-7425] - Add option to load incremental statistics from catalog
- [IMPALA-7432] - Impala 3.1 Doc: Add logged_in_user alias for effective_user
- [IMPALA-7434] - Impala 2.12 Doc: Kudu's kinit does not support auth_to_local rules with Heimdal kerberos
- [IMPALA-7436] - Fetch table and partition metadata from catalogd
- [IMPALA-7437] - Simple granular caching of partition metadata in impalad
- [IMPALA-7447] - Size-based eviction for LocalCatalog LRU cache
- [IMPALA-7469] - Invalidate local catalog cache based on topic updates
- [IMPALA-7481] - Impala Doc: Deprecated file-based authorization
- [IMPALA-7483] - Abort stuck impalad if JVM deadlock is detected
- [IMPALA-7498] - impalad should wait for catalogd during start up
- [IMPALA-7503] - SHOW GRANT USER not showing all privileges
- [IMPALA-7510] - Support sentry roles/privileges with LocalCatalog
- [IMPALA-7525] - Impala Doc: Document SHOW GRANT USER
- [IMPALA-7527] - Expose fetch-from-catalogd cache and latency metrics in profiles
- [IMPALA-7530] - Re-plan queries on InconsistentMetadataException
- [IMPALA-7531] - Add daemon-level metrics about fetch-from-catalog cache
- [IMPALA-7532] - Add retry/back-off to fetch-from-catalog RPCs
- [IMPALA-7543] - Enhance scan ranges to support sub-ranges
- [IMPALA-7546] - Impala 3.1 Doc: Doc the new query option TIMEZONE
- [IMPALA-7589] - Allow setting default query options for custom cluster tests
- [IMPALA-7623] - Impala Doc: Disallow set kudu.table_name in CREATE and ALTER TABLE
- [IMPALA-7631] - Add Sentry configuration to allow specific privileges to be granted explicitly
- [IMPALA-7634] - Impala 3.1 Doc: Doc the command to gracefully shut down Impala
- [IMPALA-7643] - Report the number of currently queued queries in stress test
- [IMPALA-7647] - Fix test gap - no test coverage of non-trivial HS2 result sets
- [IMPALA-7687] - Impala 3.1 Doc: Add support for multiple distinct operators in the same query block
- [IMPALA-7705] - Impala 2.13 & 3.1 Docs: ALTER DATABASE SET OWNER
- [IMPALA-7743] - Impala 3.1 Doc: Document the option to load incremental statistics from catalog
- [IMPALA-7749] - Merge aggregation node memory estimate is incorrectly influenced by limit
- [IMPALA-7765] - Impala 3.1 Doc: Document MAX_MEM_ESTIMATE_FOR_ADMISSION
- [IMPALA-7788] - Impala Doc: ADLS Gen2 Support
- [IMPALA-7789] - Impala 3.1 Doc: Admission Control Status in Impala Shell
- [IMPALA-7791] - Aggregation Node memory estimates don't account for number of fragment instances
- [IMPALA-7836] - Impala 3.1 Doc: New query option 'topn_bytes_limit' for TopN to Sort conversion
Test
- [IMPALA-6374] - test tpcds-q98.test has some incorrect data
- [IMPALA-6560] - Come up with reliable regression test for IMPALA-2376
Dependency upgrade
Documentation
- [IMPALA-5763] - Setting logbufsecs to 0 causes Impala to spin on one core
- [IMPALA-7162] - [DOCS] idle_query_timeout setting in Impala needs more clarification