How can I convert array to string in hive sql?

Use the concat_ws(string delimiter, array<string>) function to concatenate the array:

    select actor, concat_ws(',', collect_set(date)) as grpdate
    from actor_table
    group by actor;

If the date field is not a string, convert it to string first:

    concat_ws(',', collect_set(cast(date as string)))

Read also this answer about alternative ways if you already have an array (of int) and do not want to explode it … Read more
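For intuition, the collect_set / concat_ws pair amounts to deduplicating values per group and joining them with the delimiter. A plain-Python sketch of that semantics (the actor_rows sample data is hypothetical; note that Hive's collect_set does not guarantee element order):

```python
# Hypothetical rows: (actor, date) pairs standing in for actor_table.
actor_rows = [
    ("alice", "2020-01-01"),
    ("alice", "2020-02-01"),
    ("alice", "2020-01-01"),  # duplicate, dropped by collect_set
    ("bob",   "2020-03-01"),
]

def group_dates(rows):
    """Mimic: select actor, concat_ws(',', collect_set(date)) ... group by actor."""
    groups = {}
    for actor, date in rows:
        # collect_set: keep unique dates per actor (dict keys preserve insertion order)
        groups.setdefault(actor, {})[date] = None
    # concat_ws: join the collected strings with the delimiter
    return {actor: ",".join(dates) for actor, dates in groups.items()}

print(group_dates(actor_rows))
# {'alice': '2020-01-01,2020-02-01', 'bob': '2020-03-01'}
```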

Array Intersection in Spark SQL

Since Spark 2.4 the array_intersect function can be used directly in SQL:

    spark.sql(
      "SELECT array_intersect(array(1, 42), array(42, 3)) AS intersection"
    ).show()
    +------------+
    |intersection|
    +------------+
    |        [42]|
    +------------+

and in the Dataset API:

    import org.apache.spark.sql.functions.array_intersect

    Seq((Seq(1, 42), Seq(42, 3)))
      .toDF("a", "b")
      .select(array_intersect($"a", $"b") as "intersection")
      .show()
    +------------+
    |intersection|
    +------------+
    |        [42]|
    +------------+

Equivalent functions are also present in the … Read more
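The semantics of array_intersect are: keep the elements of the first array that also appear in the second, without duplicates. A plain-Python sketch of that behavior (not the Spark implementation):

```python
def array_intersect(a, b):
    """Mimic Spark's array_intersect: elements of a that also appear in b,
    in the order they occur in a, with duplicates removed."""
    b_set = set(b)
    seen = set()
    result = []
    for x in a:
        if x in b_set and x not in seen:
            seen.add(x)
            result.append(x)
    return result

print(array_intersect([1, 42], [42, 3]))  # [42]
```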

How to update table in Hive 0.13?

You can use row_number or a full join. This is an example using row_number:

    insert overwrite table table_1
    select customer_id, items, price, updated_date
    from
    (
      select customer_id, items, price, updated_date,
             row_number() over (partition by customer_id order by new_flag desc) rn
      from
      (
        select customer_id, items, price, updated_date, 0 as new_flag
        from table_1
        union all
        select customer_id, items, price, updated_date, … Read more
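The idea behind the row_number trick: union the existing rows (flagged 0) with the incoming rows (flagged 1, assumed from the truncated excerpt), then keep one row per customer_id, preferring the higher flag. A plain-Python sketch of that merge logic, with hypothetical sample rows:

```python
def upsert(old_rows, new_rows):
    """Mimic row_number() over (partition by customer_id order by new_flag desc):
    for each customer_id keep the new row if one exists, else the old one.
    Rows are (customer_id, items, price, updated_date) tuples."""
    # Tag rows like the union all does: old rows get new_flag = 0, new rows get 1.
    tagged = [(row, 0) for row in old_rows] + [(row, 1) for row in new_rows]
    best = {}
    for row, flag in tagged:
        customer_id = row[0]
        # Keep the row that would get rn = 1 (highest new_flag wins).
        if customer_id not in best or flag > best[customer_id][1]:
            best[customer_id] = (row, flag)
    return [row for row, _ in best.values()]

old = [("c1", "itemA", 10, "2020-01-01"), ("c2", "itemB", 20, "2020-01-01")]
new = [("c2", "itemB", 25, "2020-02-01")]
print(upsert(old, new))
```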

Find last day of a month in Hive

As of Hive 1.1.0, the last_day(string date) function is available. last_day(string date) returns the last day of the month to which the date belongs. date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. The time part of date is ignored.
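On Hive versions before 1.1.0 you would have to compute this yourself. A plain-Python sketch of the same semantics, using the standard library:

```python
import calendar
import datetime

def last_day(date_str):
    """Mimic Hive's last_day(): return 'yyyy-MM-dd' for the last day of the month.
    Accepts 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'; the time part is ignored."""
    d = datetime.datetime.strptime(date_str[:10], "%Y-%m-%d").date()
    # monthrange returns (weekday of first day, number of days in the month)
    _, num_days = calendar.monthrange(d.year, d.month)
    return d.replace(day=num_days).isoformat()

print(last_day("2015-02-10"))           # 2015-02-28
print(last_day("2016-02-10 12:30:00"))  # 2016-02-29 (leap year)
```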

Hive Explode / Lateral View multiple arrays

I found a very good solution to this problem without using any UDF; posexplode works well here:

    SELECT COOKIE, ePRODUCT_ID, eCAT_ID, eQTY
    FROM TABLE
    LATERAL VIEW posexplode(PRODUCT_ID) ePRODUCT_ID AS seqp, ePRODUCT_ID
    LATERAL VIEW posexplode(CAT_ID) eCAT_ID AS seqc, eCAT_ID
    LATERAL VIEW posexplode(QTY) eQTY AS seqq, eQTY
    WHERE seqp = seqc AND seqc = … Read more
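The trick is that posexplode emits a (position, value) pair for each array element; the WHERE clause keeps only the combinations where all positions match, effectively zipping the parallel arrays. A plain-Python sketch of that logic:

```python
def explode_parallel(product_ids, cat_ids, qtys):
    """Mimic three posexplode lateral views joined on equal positions:
    the cross product is filtered down to aligned elements."""
    rows = []
    for seqp, product_id in enumerate(product_ids):
        for seqc, cat_id in enumerate(cat_ids):
            for seqq, qty in enumerate(qtys):
                if seqp == seqc == seqq:  # the WHERE clause keeps aligned positions
                    rows.append((product_id, cat_id, qty))
    return rows

print(explode_parallel([10, 11], [7, 8], [1, 2]))
# [(10, 7, 1), (11, 8, 2)]
```

In practice zip(product_ids, cat_ids, qtys) computes the same result directly; the nested loops above mirror how the SQL cross join plus position filter arrives at it.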

Explode (transpose?) multiple columns in Spark SQL table

Spark >= 2.4

You can skip the zip udf and use the arrays_zip function:

    df.withColumn("vars", explode(arrays_zip($"varA", $"varB"))).select(
      $"userId", $"someString",
      $"vars.varA", $"vars.varB").show

Spark < 2.4

What you want is not possible without a custom UDF. In Scala you could do something like this:

    val data = sc.parallelize(Seq(
      """{"userId": 1, "someString": "example1", "varA": [0, 2, 5], "varB": [1, 2, … Read more
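Conceptually, arrays_zip pairs the two arrays element by element, and explode then produces one output row per pair. A plain-Python sketch of the combined effect (the sample row is hypothetical; the truncated varB values above are not reproduced here):

```python
def explode_arrays_zip(row):
    """Mimic explode(arrays_zip(varA, varB)): one output row per aligned pair,
    with the scalar columns repeated on every row."""
    out = []
    for a, b in zip(row["varA"], row["varB"]):
        out.append({"userId": row["userId"], "someString": row["someString"],
                    "varA": a, "varB": b})
    return out

row = {"userId": 1, "someString": "example1", "varA": [0, 2, 5], "varB": [1, 2, 9]}
for r in explode_arrays_zip(row):
    print(r)
```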