STRING Data Type
      A data type used in CREATE TABLE and ALTER TABLE statements.
    
Syntax:
      In the column definition of a CREATE TABLE statement:
    
column_name STRING
      Length: Maximum of 32,767 bytes. Do not specify a length constraint when declaring
      STRING columns, as you would for VARCHAR,
      CHAR, or similar column types in relational database systems. If you
      need to manipulate string values with precise or maximum lengths, in Impala 2.0 and higher you can declare
      columns as VARCHAR(max_length) or
      CHAR(length), but for best performance use STRING
      where practical.
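As an illustrative sketch of these declarations (the table and column names here are hypothetical, not part of the original documentation), the following statement mixes an unconstrained STRING column with the length-constrained types available in Impala 2.0 and higher:
-- full_name takes no length argument; the CHAR and VARCHAR columns carry explicit lengths.
CREATE TABLE customer_names (id BIGINT, full_name STRING, country_code CHAR(2), nickname VARCHAR(32));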
    
Character sets: For full support in all Impala subsystems, restrict string values to the ASCII character set. Although some UTF-8 character data can be stored in Impala and retrieved through queries, UTF-8 strings containing non-ASCII characters are not guaranteed to work properly in combination with many SQL aspects, including but not limited to:
- String manipulation functions.
- Comparison operators.
- The ORDER BY clause.
- Values in partition key columns.
Impala does not store metadata with the table definition for national language aspects, such as collation order or the interpretation of extended ASCII variants such as the ISO-8859-1 or ISO-8859-2 encodings. If you need to sort, manipulate, or display data depending on those national language characteristics of string data, use logic on the application side.
Conversions:
- Impala does not automatically convert STRING to any numeric type. Impala does automatically convert STRING to TIMESTAMP if the value matches one of the accepted TIMESTAMP formats; see TIMESTAMP Data Type for details.
- You can use CAST() to convert STRING values to TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, or TIMESTAMP.
- You cannot directly cast a STRING value to BOOLEAN. You can use a CASE expression to evaluate string values such as 'T', 'true', and so on and return Boolean true and false values as appropriate.
- You can cast a BOOLEAN value to STRING, returning '1' for true values and '0' for false values.
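The conversion rules above can be sketched as follows. This example is illustrative only; the table raw_events, its columns, and the set of spellings treated as true or false are assumptions made for the sake of the example:
-- Hypothetical table whose values all arrived as text.
CREATE TABLE raw_events (qty STRING, event_time STRING, is_valid STRING);
INSERT INTO raw_events VALUES ('42', '2015-04-09 14:07:46', 'T'), ('7', '2015-04-10 08:30:00', 'false');
-- STRING is not converted to a numeric type implicitly, so cast before doing arithmetic.
SELECT CAST(qty AS INT) + 1 AS qty_plus_one, CAST(event_time AS TIMESTAMP) AS event_ts FROM raw_events;
-- There is no direct STRING-to-BOOLEAN cast; map the accepted spellings with CASE instead.
SELECT is_valid,
  CASE
    WHEN lower(is_valid) IN ('t', 'true') THEN true
    WHEN lower(is_valid) IN ('f', 'false') THEN false
    ELSE NULL
  END AS is_valid_bool
FROM raw_events;
-- Casting BOOLEAN to STRING yields '1' for true and '0' for false.
SELECT CAST(true AS STRING), CAST(false AS STRING);
Returning NULL for unrecognized spellings is one possible design choice; an application might instead map them to false.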
Partitioning:
      Although it might be convenient to use STRING columns for partition keys, even when those
      columns contain numbers, for performance and scalability it is much better to use numeric columns as
      partition keys whenever practical. Although the underlying HDFS directory name might be the same in either
      case, the in-memory storage for the partition key columns is more compact, and computations are faster, if
      partition key columns such as YEAR, MONTH, DAY and so on
      are declared as INT, SMALLINT, and so on.
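A minimal sketch of this recommendation, using a hypothetical table named sales_by_day: the partition key columns are declared as INT rather than STRING, even though the values appear as text in the HDFS directory names.
-- Numeric partition keys; item remains a regular STRING column.
CREATE TABLE sales_by_day (item STRING, quantity INT) PARTITIONED BY (year INT, month INT, day INT);
-- Partition key values are written as numeric literals, without quotation marks.
INSERT INTO sales_by_day PARTITION (year=2015, month=4, day=9) VALUES ('widget', 3);
SELECT item, quantity FROM sales_by_day WHERE year = 2015 AND month = 4;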
    
        Zero-length strings: For purposes of clauses such as DISTINCT and GROUP
        BY, Impala considers zero-length strings (""), NULL, and space
        to all be different values.
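To illustrate, the following sketch uses a hypothetical table named blanks. Based on the rule above, the GROUP BY query should return three separate groups: one for the zero-length string, one for the single space, and one for NULL.
CREATE TABLE blanks (s STRING);
INSERT INTO blanks VALUES (''), (' '), (NULL), (''), (NULL);
-- length(s) helps tell the zero-length string apart from the space in the output.
SELECT s, length(s) AS len, COUNT(*) AS num_rows FROM blanks GROUP BY s;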
      
Text table considerations: Values of this type are potentially larger in text tables than in tables using Parquet or other binary formats.
Avro considerations:
        The Avro specification allows string values up to 2**64 bytes in length.
        Impala queries for Avro tables use 32-bit integers to hold string lengths.
        In Impala 2.5 and higher, Impala truncates CHAR
        and VARCHAR values in Avro tables to (2**31)-1 bytes.
        If a query encounters a STRING value longer than (2**31)-1
        bytes in an Avro table, the query fails. In earlier releases,
        encountering such long values in an Avro table could cause a crash.
      
        Column statistics considerations: Because the values of this type have variable size, none of the
        column statistics fields are filled in until you run the COMPUTE STATS statement.
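As a sketch of that workflow (the table stats_demo is hypothetical), you can compare the output of SHOW COLUMN STATS before and after running COMPUTE STATS:
CREATE TABLE stats_demo (s STRING);
INSERT INTO stats_demo VALUES ('hello'), ('world'), ('wonders');
-- Before COMPUTE STATS, the statistics for the STRING column are not yet filled in.
SHOW COLUMN STATS stats_demo;
COMPUTE STATS stats_demo;
-- After COMPUTE STATS, the same statement reports the gathered column statistics.
SHOW COLUMN STATS stats_demo;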
      
Examples:
The following examples demonstrate double-quoted and single-quoted string literals, and required escaping for quotation marks within string literals:
SELECT 'I am a single-quoted string';
SELECT "I am a double-quoted string";
SELECT 'I\'m a single-quoted string with an apostrophe';
SELECT "I\'m a double-quoted string with an apostrophe";
SELECT 'I am a "short" single-quoted string containing quotes';
SELECT "I am a \"short\" double-quoted string containing quotes";
The following examples demonstrate calls to string manipulation functions to concatenate strings, convert numbers to strings, or pull out substrings:
SELECT CONCAT("Once upon a time, there were ", CAST(3 AS STRING), ' little pigs.');
SELECT SUBSTR("hello world",7,5);
      The following examples show how to perform operations on STRING columns within a table:
    
CREATE TABLE t1 (s1 STRING, s2 STRING);
INSERT INTO t1 VALUES ("hello", 'world'), (CAST(7 AS STRING), "wonders");
SELECT s1, s2, length(s1) FROM t1 WHERE s2 LIKE 'w%';
Related information:
String Literals, CHAR Data Type (Impala 2.0 or higher only), VARCHAR Data Type (Impala 2.0 or higher only), Impala String Functions, Impala Date and Time Functions