query-optimization - w3toppers.com

Subquery vs. inner join in SQL Server

Joins usually run faster than subqueries, but in practice it depends on the execution plan that SQL Server generates. No matter how you write your query, SQL Server will always transform it into an execution plan. If it is “smart” enough to generate the same plan from both queries, you will get …
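To make the comparison concrete, here is a minimal sketch that runs a subquery and an equivalent inner join side by side and prints the plan for each. It uses Python's built-in SQLite as a stand-in for SQL Server, and the `customers` and `orders` tables are hypothetical, so the plans it prints say nothing about what SQL Server itself would choose.

```python
import sqlite3

# Hypothetical schema for illustration; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# Customers with at least one order, written as a subquery...
subquery_sql = """
    SELECT id, name FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY id
"""
# ...and as an inner join. DISTINCT removes the duplicates the join produces
# for customers that have more than one order.
join_sql = """
    SELECT DISTINCT c.id, c.name FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()


def plan(sql):
    """Return the human-readable steps of the engine's plan for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


print(subquery_rows == join_rows)
print(plan(subquery_sql))
print(plan(join_sql))
```

Both forms return the same rows; whether they also run equally fast is decided by the plans, which is why comparing plans on your own engine and data is more reliable than a general rule.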

Postgres query optimization (forcing an index scan)

For testing purposes you can force the use of the index by “disabling” sequential scans, ideally in your current session only: SET enable_seqscan = OFF; Do not use this on a production server. Details are in the manual. I quoted “disabling” because you cannot actually disable sequential table scans. But any other available option …

What's the fastest way to look up big tables for points within a radius in MySQL (latitude, longitude)?

First of all, if you have a lot of geospatial data, you should be using MySQL’s geospatial extensions rather than calculations like this. You can then create spatial indexes that would speed up many queries, and you don’t have to write long, drawn-out queries like the one above. Using a comparison with ST_Distance …

Why does direction of index matter in MongoDB?

MongoDB concatenates the compound key in some way and uses it as the key in a B-tree. When finding single items, the order of the nodes in the tree is irrelevant. If you are returning a range of nodes, the elements close to each other will be down the same branches of the …
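One way to see why direction matters for a compound key is to model the index as a sorted list and check which sort orders can be read straight off it. This is a deliberately simplified sketch, not MongoDB's actual B-tree: the two-field documents `(a, b)` and the index on `a` ascending, `b` descending are assumptions made for illustration.

```python
# Hypothetical documents with two fields (a, b).
docs = [(1, 1), (1, 2), (2, 1), (2, 2)]

# A compound index on (a ascending, b descending), modelled as a sorted list.
index = sorted(docs, key=lambda d: (d[0], -d[1]))

forward = index          # walking the index front to back
backward = index[::-1]   # walking the same index back to front

# Sort orders a query might ask for.
want_a_asc_b_desc = sorted(docs, key=lambda d: (d[0], -d[1]))
want_a_desc_b_asc = sorted(docs, key=lambda d: (-d[0], d[1]))
want_a_asc_b_asc = sorted(docs, key=lambda d: (d[0], d[1]))

print(forward == want_a_asc_b_desc)    # index order matches directly
print(backward == want_a_desc_b_asc)   # the exact reverse also matches
print(want_a_asc_b_asc in (forward, backward))  # mixed directions do not
```

In this model the index serves its own order and the exact reverse, but an order that flips only one of the two fields cannot be produced by a single walk and would need a separate sort step.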

Hive query performance for high cardinality field

Use ORC with bloom filters:

    CREATE TABLE employee (
      employee_id bigint,
      name STRING
    ) STORED AS ORC
    TBLPROPERTIES ("orc.bloom.filter.columns"="employee_id");

Enable PPD with vectorizing, use CBO and Tez:

    SET hive.optimize.ppd=true;
    SET hive.optimize.ppd.storage=true;
    SET hive.vectorized.execution.enabled=true;
    SET hive.vectorized.execution.reduce.enabled=true;
    SET hive.cbo.enable=true;
    SET hive.stats.autogather=true;
    SET hive.compute.query.using.stats=true;
    SET hive.stats.fetch.partition.stats=true;
    SET hive.execution.engine=tez;
    SET hive.stats.fetch.column.stats=true;
    SET hive.map.aggr=true;
    SET hive.tez.auto.reducer.parallelism=true;

Ref: …

PostgreSQL query very slow with limit 1

You can avoid this issue by adding an unneeded ORDER BY clause to the query: SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp, objectID DESC LIMIT 1;

SQL: How to properly check if a record exists

It’s better to use either of the following:

    -- Method 1.
    SELECT 1 FROM table_name WHERE unique_key = value;

    -- Method 2.
    SELECT COUNT(1) FROM table_name WHERE unique_key = value;

The first alternative should give you no result or one result; the second count should be zero or one. How old is the documentation you’re …
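The two methods can be tried out with a short sketch like the following. It assumes SQLite through Python's standard library as the engine, and the table contents and the helper names `exists_select_1` and `exists_count` are hypothetical.

```python
import sqlite3

# Hypothetical table matching the placeholder names used in the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (unique_key INTEGER PRIMARY KEY, payload TEXT);
    INSERT INTO table_name VALUES (1, 'a'), (2, 'b');
""")


def exists_select_1(value):
    """Method 1: no row comes back if the key is absent, one row if present."""
    row = conn.execute(
        "SELECT 1 FROM table_name WHERE unique_key = ?", (value,)
    ).fetchone()
    return row is not None


def exists_count(value):
    """Method 2: always one row back, holding a count of zero or one."""
    (count,) = conn.execute(
        "SELECT COUNT(1) FROM table_name WHERE unique_key = ?", (value,)
    ).fetchone()
    return count > 0


print(exists_select_1(1), exists_select_1(99))
print(exists_count(1), exists_count(99))
```

Note the difference in how the caller reads the answer: Method 1 is checked by whether a row exists at all, Method 2 by the value inside the single row it always returns.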

MySQL: improve search performance with wildcards (%%)

No, because MySQL will not be able to utilize the index when you have a leading wildcard. If you changed your LIKE to 'f%', then it would be able to use the index.
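The reason a leading wildcard defeats the index can be sketched with a sorted Python list standing in for a B-tree index. This is an illustrative model only, not MySQL's implementation, and the sample words and function names are made up.

```python
from bisect import bisect_left

# Sorted list standing in for a B-tree index on a text column.
index = sorted(["bar", "baz", "foo", "food", "fox", "goof", "quux"])


def prefix_search(prefix):
    """Like LIKE 'prefix%': jump to the first candidate, stop at the first miss."""
    start = bisect_left(index, prefix)  # binary search, no full scan
    matches = []
    for value in index[start:]:
        if not value.startswith(prefix):
            break  # sorted order guarantees no later entry can match
        matches.append(value)
    return matches


def contains_search(fragment):
    """Like LIKE '%fragment%': sort order gives no starting point, so scan all."""
    return [value for value in index if fragment in value]


print(prefix_search("f"))
print(contains_search("oo"))
```

With a known prefix, the matching entries sit next to each other in sorted order, so the search can seek to them. With a leading wildcard the matches can be anywhere, so every entry has to be inspected.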

Checking multiple columns for one value

You can use the IN predicate, like so: SELECT * FROM table_name WHERE 123 IN (col1, col2, col3, col4); “It’s the opposite version of IN.” No, it is not; it is the same as using the ORs the way you did in your question. To clarify this: The predicate IN or set membership …
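A quick way to confirm that the IN form and the chained ORs select the same rows is to run both against the same data. The sketch below assumes SQLite via Python's standard library; the table `t` and its rows are hypothetical.

```python
import sqlite3

# Hypothetical four-column table; the value 123 appears in different columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (
        id INTEGER PRIMARY KEY,
        col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 INTEGER
    );
    INSERT INTO t VALUES
        (1, 123, 0, 0, 0),
        (2, 0, 0, 0, 123),
        (3, 1, 2, 3, 4);
""")

# The value on the left, the columns as the set to search.
in_ids = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE 123 IN (col1, col2, col3, col4) ORDER BY id"
)]

# The same condition spelled out with ORs.
or_ids = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE col1 = 123 OR col2 = 123 OR col3 = 123 OR col4 = 123 "
    "ORDER BY id"
)]

print(in_ids, or_ids)
```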

60 million entries, select entries from a certain month. How to optimize database?

To get entries in a particular month of a particular year faster, you will need to index the time column: CREATE INDEX idx_time ON ENTRIES(time) USING BTREE; Additionally, use: SELECT e.* FROM ENTRIES e WHERE e.time BETWEEN '2010-04-01' AND DATE_SUB('2010-05-01', INTERVAL 1 SECOND) …because BETWEEN is inclusive, so you’d get anything dated “2010-05-01 00:00:00” …
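The inclusive upper bound of BETWEEN is easy to check with a few boundary rows. The sketch below assumes SQLite via Python's standard library with timestamps stored as text, and the four sample rows are made up. It also shows a half-open range (`>=` start, `<` next month), a commonly used alternative to subtracting one second, which is a different technique from the DATE_SUB approach quoted above.

```python
import sqlite3

# Hypothetical entries sitting right on the month boundaries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, time TEXT);
    CREATE INDEX idx_time ON entries (time);
    INSERT INTO entries VALUES
        (1, '2010-03-31 23:59:59'),
        (2, '2010-04-01 00:00:00'),
        (3, '2010-04-30 23:59:59'),
        (4, '2010-05-01 00:00:00');
""")

# BETWEEN includes both endpoints, so midnight on 1 May is returned too.
between_ids = [row[0] for row in conn.execute(
    "SELECT id FROM entries "
    "WHERE time BETWEEN '2010-04-01 00:00:00' AND '2010-05-01 00:00:00' "
    "ORDER BY id"
)]

# A half-open range keeps the start and excludes the first instant of May.
half_open_ids = [row[0] for row in conn.execute(
    "SELECT id FROM entries "
    "WHERE time >= '2010-04-01 00:00:00' AND time < '2010-05-01 00:00:00' "
    "ORDER BY id"
)]

print(between_ids, half_open_ids)
```

Either way the condition compares the bare column against constants, which is what lets an index on the time column be used for the range.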