Execute Hive Query with IN clause parameters in parallel

There is no need to read the same data many times in separate queries to achieve better parallelism. Tune proper mapper and reducer parallelism for the same. First of all, enable PPD with vectorizing, use CBO and Tez: SET hive.optimize.ppd=true; SET hive.optimize.ppd.storage=true; SET hive.vectorized.execution.enabled=true; SET hive.vectorized.execution.reduce.enabled = true; SET hive.cbo.enable=true; set hive.stats.autogather=true; set hive.compute.query.using.stats=true; set … Read more

How to describe performance issue in relational database?

For Oracle Database provide this information: Describe the symptoms of the problem Describe the behavior that cause the problem. Is the behavior of the query stable or does the problem occurs only sometimes, with specific parameters or simple random. Can you reproduce this behavior in an IDE (e.g. SQL Developer)? Describe the environment Define the … Read more