Selecting top N rows for each group in a table

If you’re using SQL Server 2005 or newer, you can use a ranking function and a CTE to achieve this:

    ;WITH HairColors AS (
        SELECT id, name, hair, score,
               ROW_NUMBER() OVER (PARTITION BY hair ORDER BY score DESC) AS RowNum
        FROM YourTable  -- placeholder name: the excerpt omits the source table
    )
    SELECT id, name, hair, score
    FROM HairColors
    WHERE RowNum <= 3

This CTE will “partition” your …
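The ranking approach above can be tried without a SQL Server instance. The following is a minimal sketch, assuming SQLite 3.25 or newer (which also supports `ROW_NUMBER()`) through Python's `sqlite3` module. The `people` table and its sample rows are hypothetical and are not part of the original answer.

```python
import sqlite3

# Hypothetical sample data: (id, name, hair, score).
ROWS = [
    (1, "Ann", "brown", 90),
    (2, "Bob", "brown", 85),
    (3, "Cat", "brown", 70),
    (4, "Dan", "brown", 60),
    (5, "Eve", "red", 95),
    (6, "Fay", "red", 40),
]


def top_n_per_group(n):
    """Return the n highest-scoring rows for each hair color."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER, name TEXT, hair TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", ROWS)
    # Number the rows inside each hair group, best score first,
    # then keep only the first n rows of every group.
    query = """
        WITH HairColors AS (
            SELECT id, name, hair, score,
                   ROW_NUMBER() OVER (PARTITION BY hair ORDER BY score DESC) AS RowNum
            FROM people
        )
        SELECT id, name, hair, score
        FROM HairColors
        WHERE RowNum <= ?
        ORDER BY hair, score DESC
    """
    result = conn.execute(query, (n,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_n_per_group(3):
        print(row)
```

With this data the fourth brown-haired row is dropped, while the red group, which has only two rows, is returned in full.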

Row Rank in a MySQL View

Use:

    SELECT t.id,
           t.variety,
           (SELECT COUNT(*) FROM TABLE WHERE id < t.id) + 1 AS NUM
    FROM TABLE t

This is not an ideal way to do it, because the subquery that computes NUM executes once for every row returned. A better idea would be to create a NUMBERS table, with a single column containing a …
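To see what the counting subquery produces, here is a small sketch that runs the same pattern against an in-memory SQLite database instead of MySQL. The table name `varieties` and the sample rows are assumptions made for illustration only.

```python
import sqlite3


def rank_rows(rows):
    """Return (id, variety, num) where num is the 1-based rank by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE varieties (id INTEGER PRIMARY KEY, variety TEXT)")
    conn.executemany("INSERT INTO varieties VALUES (?, ?)", rows)
    # For each outer row, count how many rows have a smaller id and add 1.
    # The subquery is evaluated once per outer row, which is the cost
    # the answer warns about.
    query = """
        SELECT t.id, t.variety,
               (SELECT COUNT(*) FROM varieties WHERE id < t.id) + 1 AS num
        FROM varieties t
        ORDER BY t.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical rows with gaps in the id sequence.
    for row in rank_rows([(10, "fuji"), (25, "gala"), (40, "honeycrisp")]):
        print(row)
```

The gaps in the ids do not matter: the rank depends only on how many smaller ids exist.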

How do I find the closest values in a Pandas series to an input number?

You could use argsort(). Say input = 3:

    In [198]: input = 3

    In [199]: df.iloc[(df['num']-input).abs().argsort()[:2]]
    Out[199]:
       num
    2    4
    4    2

df_sort is the dataframe with the 2 closest values:

    In [200]: df_sort = df.iloc[(df['num']-input).abs().argsort()[:2]]

For the index:

    In [201]: df_sort.index.tolist()
    Out[201]: [2, 4]

For the values:

    In [202]: df_sort['num'].tolist()
    Out[202]: [4, 2]

Detail, for the …
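The idea behind the argsort() call is to sort positions by absolute distance from the target and keep the first few. The sketch below reproduces that idea in plain Python rather than pandas, so it is an analog of the technique and not the pandas API itself. The list `nums` is hypothetical data chosen to be consistent with the output shown in the excerpt.

```python
def closest_positions(values, target, k=2):
    """Return the positions of the k values nearest to target.

    sorted() is stable, so values at equal distance keep their input order.
    """
    order = sorted(range(len(values)), key=lambda i: abs(values[i] - target))
    return order[:k]


# Hypothetical data; the excerpt does not show the full dataframe.
nums = [1, 6, 4, 5, 2]
positions = closest_positions(nums, 3)
closest = [nums[i] for i in positions]

if __name__ == "__main__":
    print(positions)  # positions of the two nearest values
    print(closest)    # the two nearest values themselves
```

One design difference worth noting: pandas' default sort for argsort() is not guaranteed to be stable, so ties there may come back in a different order unless a stable sort kind is requested.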

A better similarity ranking algorithm for variable length strings

Simon White of Catalysoft wrote an article about a very clever algorithm that compares adjacent character pairs, which works really well for my purposes: http://www.catalysoft.com/articles/StrikeAMatch.html Simon has a Java version of the algorithm, and below is a PL/Ruby version I wrote (taken from the plain Ruby version in the related forum entry comment …
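For readers who want to experiment with the adjacent-pair idea, here is a minimal Python sketch of it. It is not the author's PL/Ruby code or Simon White's Java code; the function names are my own, and the score is computed as twice the number of shared letter pairs divided by the total number of pairs in both strings.

```python
def letter_pairs(text):
    """Return the adjacent character pairs of every word in text, uppercased."""
    pairs = []
    for word in text.upper().split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def similarity(a, b):
    """Score two strings between 0.0 (no shared pairs) and 1.0 (same pairs)."""
    pairs_a = letter_pairs(a)
    pairs_b = letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 0.0  # neither string has a word of two or more characters
    matches = 0
    for pair in pairs_a:
        if pair in pairs_b:
            matches += 1
            # Remove the matched pair so a repeated pair is not counted twice.
            pairs_b.remove(pair)
    return 2.0 * matches / total


if __name__ == "__main__":
    print(similarity("France", "French"))
    print(similarity("Healed", "Sealed"))
```

"France" and "French" share two of their five pairs each (FR and NC), so the score is 2 * 2 / 10 = 0.4. Because pairs are collected per word, word order has little effect on the score.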

Using LIMIT within GROUP BY to get N results per group?

You could use the GROUP_CONCAT aggregate function to get all years into a single column, grouped by id and ordered by rate:

    SELECT id, GROUP_CONCAT(year ORDER BY rate DESC) grouped_year
    FROM yourtable
    GROUP BY id

Result:

    -----------------------------------------------------------
    | ID  | GROUPED_YEAR                                      |
    -----------------------------------------------------------
    | p01 | 2006,2003,2008,2001,2007,2009,2002,2004,2005,2000 |
    | p02 | 2001,2004,2002,2003,2000,2006,2007                |
    -----------------------------------------------------------

And then …
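To make the grouping step concrete, the sketch below emulates the effect of `GROUP_CONCAT(year ORDER BY rate DESC)` in plain Python. It is an illustration of the idea and not MySQL code. The sample rows and rates are hypothetical, and the optional `limit` parameter is an assumption added here to show one way of keeping N entries per group; the excerpt is cut off before the author's own next step.

```python
from collections import defaultdict

# Hypothetical rows: (id, year, rate), deliberately out of order.
ROWS = [
    ("p01", 2003, 7.4),
    ("p02", 2004, 6.0),
    ("p01", 2006, 8.0),
    ("p02", 2001, 6.8),
    ("p01", 2008, 6.8),
    ("p02", 2002, 5.3),
]


def grouped_years(rows, limit=None):
    """Map each id to its years joined by commas, highest rate first.

    If limit is given, keep only the first `limit` years of each group.
    """
    by_id = defaultdict(list)
    for row_id, year, rate in rows:
        by_id[row_id].append((rate, year))

    result = {}
    for row_id, pairs in by_id.items():
        # Order the group by rate, descending, like ORDER BY rate DESC.
        pairs.sort(key=lambda pair: pair[0], reverse=True)
        years = [str(year) for _, year in pairs[:limit]]
        result[row_id] = ",".join(years)
    return result


if __name__ == "__main__":
    print(grouped_years(ROWS))
    print(grouped_years(ROWS, limit=2))
```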