MapR 5.0 Documentation : Impala Built-In Functions

You can use built-in functions to transform data directly in SELECT statements to avoid post processing in another application. Built-in functions allow an SQL query to return result sets with formatting, calculating, and type conversions already applied.

Impala supports the following categories of functions:

The following table provides a few rules that apply to built-in functions:

Rule

Example

Call functions through the SELECT statement.
You can omit the FROM clause in most functions.
Supply literal values for any required arguments.

select abs(-1);

select concat('The sly ', 'brown fox');

select power(2,5);

When you include a FROM clause and specify a
column name as a function argument, the function
is applied to each item in the result set.

select concat('State = ',state_code) from all_states where population > 20000000;

select round(price) as dollar_value from product_directory where price between 400.0 and 500.0; 

If an argument to a built-in function is NULL, the result value NULL.

select cos(null);

select power(2,null);

select concat('a',null,'b');

Aggregate functions require a FROM clause in the query.
They calculate a return value across all items in a result set.
Aggregate functions ignore NULL values rather than returning a NULL result.

 

select count(customer_id) from customer_directory;

select max(weight), avg(weight) from census_data where age > 40;

 

Aggregation Functions

You can use the following aggregation functions:

APPX_MEDIAN()
AVG()
COUNT()
MAX()
MIN()
NDV()
STDDEV()
STDDEV_SAMP()
STDDEV_POP()
SUM()
VARIANCE()
VARIANCE_SAMP()
VARIANCE_POP() 

Type Conversion Functions

Use conversion functions in combination with other functions to explicitly pass the expected data types. Use CAST when passing a column value or literal to a function that expects a parameter with a different type.

You can use the following conversion function:

cast(expr as type)

Example:

select concat('Here are the first ',10,' results.'); -- Fails
select concat('Here are the first ',cast(10 as string),' results.'); -- Succeeds

Mathematical Functions

Function

Description

Return type

abs(double a)

Ensures all return values are positive. Returns the absolute value of the argument.

double

acos(double a)

Returns the arccosine of the argument.

double 

asin(double a)

Returns the arcsine of the argument.

double

bin(bigint a)

Returns the binary representation of an integer value.

string

ceil(double a), ceiling(double a)

Returns the smallest integer > or = to the argument.

int 

conv(bigint num, int from_base, int to_base), conv(string num, int from_base, int to_base)

Returns a string representation of an integer value in a particular base. The input value can be a string. To use the return value as a number, use CAST() to convert to the appropriate type.

 

string

cos(double a)

Returns the cosine of the argument.

double

degrees(double a)

Converts the argument value from radians to degrees.

double 

e()

Returns the mathematical constant e.

double 

exp(double a)

Returns the mathematical constant e raised to the power of the argument.

double

floor(double a)

Returns the largest integer that is less than or equal to the argument.

int 
fnv_hash(type v)

Returns a consistent 64-bit value, derived from the input argument. Use for implementing hashing logic in an application.

bigint
greatest()Returns the largest value from a list of expressions.

The return type is the same as the initial argument value. Integer values are promoted to BIGINT. Floating-point values are promoted to DOUBLE. Use CAST() when inserting into a smaller numeric column.

hex(bigint a), hex(string a)

Returns the hexadecimal representation of an integer value, or of the characters in a string.

string 

is_inf(double a)

Tests whether a value is equal to the special value "inf", signifying infinity.boolean
is_nan(double a)Tests whether a value is equal to the special value "NaN", signifying "not a number".boolean
least()Returns the smallest value from a list of expressions.

The return type is the same as the initial argument value. Integer values are promoted to BIGINT. Floating-point values are promoted to DOUBLE. Use CAST() when inserting into a smaller numeric column.

ln(double a)

Returns the natural logarithm of the argument.

double 

log(double base, double a)

Returns the logarithm of the second argument to the specified base.

double 

log10(double a)

Returns the logarithm of the argument to the base 10.

double 

log2(double a)

Returns the logarithm of the argument to the base 2.

double 
mod()This function returns the modulus of a number. MOD is equivalent to using the % arithmetic operator. It works with any size integer type, any size floating-point type, and DECIMAL with any precision and scale. Same as the input value

negative(int a), negative(double a)

 

Returns the argument with the sign reversed; returns a positive value if the argument was already negative.

If return values must be negative, use -abs(a) instead.

int or double
max_int(),
min_smallint()
Checks whether data values are in an expected range. You might be able to switch a column to a smaller type to save memory during processing.The same as the integral type being checked.

pi()

Returns the constant pi.

double

pmod(int a, int b), pmod(double a, double b) 

Purpose: Returns the positive modulus of a number.

int or double

positive(int a), positive(double a)

 

Returns the original argument; applies to negative arguments also.

If return values must be positive, use abs(). 

int or double

pow(double a, double p), power(double a, double p) 

Returns the first argument raised to the power of the second argument.

double

quotient(int numerator, int denominator)

Returns the first argument divided by the second argument and discards any fractional part. Avoids promoting arguments to DOUBLE as happens with the / SQL operator.

int 

radians(double a)

Converts argument value from degrees to radians.

double 

rand(), rand(int seed)

Returns a random value between 0 and 1. After rand() is called with a seed argument, produces a consistent random sequence based on the seed value.

double 

round(double a), round(double a, int d)

Rounds a floating-point value. By default (with a single argument), rounds to the nearest integer. Values ending in .5 are rounded up for positive numbers, down for negative numbers. The second argument is optional and specifies how many digits to leave after the decimal point. Values greater than zero produce a floating-point return value rounded to the requested number of digits to the right of the decimal point.

bigint for single argument; double when second argument is greater than zero

sign(double a)

Returns -1, 0, or 1 to indicate the sign of the argument value.

int 

sin(double a) 

Returns the sine of the argument.

double

sqrt(double a)

Returns the square root of the argument.

double 

tan(double a)

Returns the tangent of the argument.

double 

unhex(string a)

Returns a string of characters with ASCII values corresponding to pairs of hexadecimal digits in the argument.

string 

 

Date and Time Functions

TIMESTAMP is the underlying datatype for data and time data. Functions that extract a single field, such as hour() or minute(), typically return an integer value. Functions that format the date portion, such as date_add() or to_date(), typically return a string value.

Function

Description

Return Type

add_months()

Alias for the existing MONTHS_ADD() function.

timestamp

date_add(string startdate, int days)

Adds a specified number of days to a date represented as a string.

string
date_part()

A new date and time function, similar to EXTRACT(), but with the order of the arguments reversed. You can also call the EXTRACT() function using the SQL-99 syntax, EXTRACT(unit FROM timestamp). These enhancements simplify the porting process for date-related code from other systems.  

int

date_sub(string startdate, int days)

Subtracts a specified number of days from a date represented as a string.

string

datediff(string enddate, string startdate)

Returns the number of days between two dates represented as strings.

int

day(string date), dayofmonth(string date)

Returns the day field from a date represented as a string.

int

dayname(string date)

Returns the day field from a date represented as a string, converted to the string corresponding to that day name. The range of return values is 'Sunday' to 'Saturday'. Use in report-generating queries, instead of calling dayofweek() and turning that numeric return value into a string using a CASE expression.

string

dayofweek(string date)

Returns the day field from a date represented as a string, corresponding to the day of the week. The range of return values is 1 (Sunday) to 7 (Saturday).

int
extract()Returns one date or time field from a TIMESTAMP value.timestamp

from_unixtime(bigint unixtime[, string format])

Converts the number of seconds from the Unix epoch to the specified time into a string.

string

from_utc_timestamp(timestamp, string timezone)

Converts a specified UTC timestamp value into the appropriate value for a specified time zone.

timestamp

hour(string date)

Returns the hour field from a date represented as a string.

int

minute(string date)

Returns the minute field from a date represented as a string.

int

month(string date)

Returns the month field from a date represented as a string.

int

now()

Returns the current date and time (in the UTC time zone) as a timestamp value.

timestamp

second(string date)

Returns the second field from a date represented as a string.

int

to_date(string timestamp)

Returns the date field from a timestamp represented as a string. 

string

to_utc_timestamp(timestamp, string timezone)

Converts a specified timestamp value in a specified time zone into the corresponding value for the UTC time zone.

timestamp
trunc()Truncates date/time values to a particular granularity, such as year, month, day, hour, and so on. 

unix_timestamp(), unix_timestamp(string date), unix_timestamp(string date, string pattern)

Returns a timestamp representing the current date and time, or converts from a specified date and time value represented as a string.

bigint

weekofyear(string date)

Returns the corresponding week (1-53) from a date represented as a string.

int

year(string date)

Returns the year field from a date represented as a string.

int

 

String Functions

Function

Description

Return Type

ascii(string str)

Returns the numeric ASCII code of the first character of the argument.

int

concat(string a, string b...)

Returns a single string representing all the argument values joined together.

string

concat_ws(string sep, string a, string b...)

Returns a single string representing the second and following argument values joined together, delimited by a specified separator.

string

find_in_set(string str, string strList)

 

Returns the position (starting from 1) of the first occurrence of a specified string within a comma-separated string. Returns NULL if either argument is NULL, 0 if the search string is not found, or 0 if the search string contains a comma.

int
initcap(string str)

Returns the input string with the first letter capitalized.

string

instr(string str, string substr)

Returns the position (starting from 1) of the first occurrence of a substring within a longer string.

int

length(string a)

Returns the length in characters of the argument string.

int

locate(string substr, string str[, int pos])

Returns the position (starting from 1) of the first occurrence of a substring within a longer string, optionally after a particular position.

int

lower(string a), lcase(string a)

Returns the argument string converted to all-lowercase.

string

lpad(string str, int len, string pad)

 

Returns a string of a specified length, based on the first argument string. If the specified string is too short, it is padded on the left with a repeating sequence of the characters from the pad string. If the specified string is too long, it is truncated on the right.

string

ltrim(string a)

Returns the argument string with any leading spaces removed from the left side.

string

parse_url(string urlString, string partToExtract [, string keyToExtract])

Returns the portion of a URL corresponding to a specified part. The part argument can be 'PROTOCOL', 'HOST', 'PATH', 'REF', 'AUTHORITY', 'FILE', 'USERINFO', or 'QUERY'. Literal values must be uppercase. You can specify a key to retrieve only the associated value from the key-value pairs in the query string when you request the query portion of the URL. Useful for importing web logs.

 

string

regexp_extract(string subject, string pattern, int index)

Returns the specified () group from a string based on a regular expression pattern.

string

regexp_replace(string initial, string pattern, string replacement)

Returns the initial argument with the regular expression pattern replaced by the final argument string.

string

repeat(string str, int n)

Returns the argument string repeated a specified number of times.

string

reverse(string a)

Purpose: Returns the argument string with characters in reversed order.

string

rpad(string str, int len, string pad)

 

Returns a string of a specified length, based on the first argument string. If the specified string is too short, it is padded on the right with a repeating sequence of the characters from the pad string. If the specified string is too long, it is truncated on the right.

string

rtrim(string a)

Returns the argument string with any trailing spaces removed from the right side.

string

space(int n)

Returns a concatenated string of the specified number of spaces. Shorthand for repeat(' ',n).

string

substr(string a, int start [, int len]), substring(string a, int start [, int len])

Returns the portion of the string starting at a specified point, optionally with a specified maximum length. The characters in the string are indexed starting at 1.

string

translate(string input, string from, string to)

Returns the input string with a set of characters replaced by another set of characters.

string

trim(string a)

Returns the input string with leading and trailing spaces removed. The same as passing the string through both ltrim() and rtrim().

string

upper(string a), ucase(string a)

Returns the argument string converted to all-uppercase.

string

 

Conditional Functions

Use the following conditional functions for testing equality, comparison operators, and nullity:

Function

Description

Result Type

CASE a WHEN b THEN c [WHEN d THEN e]... [ELSE f] END

Compares an expression to one or more possible values, and returns a corresponding result when a match is found.

Same as the initial argument value

CASE WHEN a THEN b [WHEN c THEN d]... [ELSE e] END

Tests whether any of a sequence of expressions is true, and returns a corresponding result for the first true expression.

Same as the initial argument value

coalesce(type v1, type v2, ...)

Returns the first specified argument that is not NULL, or NULL if all arguments are NULL.

Same as the initial argument value
decode()

Compares an expression to one or more possible values and returns a corresponding result when a match is found. This function works as a shorthand for a CASE() expression and improves compatibility with SQL code containing vendor extensions.

Same as the initial argument value, except that integer values are promoted to BIGINT and floating-point values are promoted to DOUBLE; use CAST() when inserting into a smaller numeric column

if(boolean condition, type ifTrue, type ifFalseOrNull)

Tests an expression and returns a corresponding result depending on whether the result is true, false, or NULL.

Same as ifTrue argument value
isfalse(), isnotfalse(), isnottrue(), istrue(), notnullvalue(), nullvalue()These conditional functions provide enhanced compatibility when porting code that uses industry extensions. 

isnull(type a, type ifNotNull)

 

Tests if an expression is NULL, and returns the expression result value if not. If the first argument is NULL, returns the second argument. Equivalent to the nvl() function from Oracle Database orifnull() from MySQL.

Same as the first argument value
nullif(expr1,expr2)Returns NULL if the two specified arguments are equal. If the specified arguments are not equal, returns the value of expr1. The data types of the expressions must be compatible. You cannot use an expression that evaluates to NULL for expr1, so you can distinguish a return value of NULL from an argument value of NULL, which would never match expr2.Same as the initial argument value, except that integer values are promoted to BIGINT and floating-point values are promoted to DOUBLE; use CAST() when inserting into a smaller numeric column
nullifzero(numeric_expr)Returns NULL if the numeric expression evaluates to 0, otherwise returns the result of the expression.Same as the initial argument value, except that integer values are promoted to BIGINT and floating-point values are promoted to DOUBLE; use CAST() when inserting into a smaller numeric column

nvl(type a, type ifNotNull)

 

Alias for the isnull() function; added in Impala 1.1. Tests if an expression is NULL, and returns the expression result value if not. If the first argument is NULL, returns the second argument. Equivalent to the nvl() function from Oracle Database or ifnull() from MySQL.

Same as the first argument value
zeroifnull(numeric_expr)Returns 0 if the numeric expression evaluates to NULL, otherwise returns the result of the expression.Same as the initial argument value, except that integer values are promoted to BIGINT and floating-point values are promoted to DOUBLE; use CAST() when inserting into a smaller numeric column