To query the df by the MultiIndex values, for example where (A > 1.7) and (B < 666):
In [536]: result_df = df.loc[(df.index.get_level_values('A') > 1.7) & (df.index.get_level_values('B') < 666)]
In [537]: result_df
Out[537]:
C
A B
3.3 222 43
333 59
5.5 333 56
Hence, to get for example the ‘A’ index values, if still required:
In [538]: result_df.index.get_level_values('A')
Out[538]: Index([3.3, 3.3, 5.5], dtype=object)
The problem is, that in large data frames the performance of by index selection worse by 10% than the sorted regular rows selection. And in repetitive work, looping, the delay accumulated. See example:
In [558]: df = store.select(STORE_EXTENT_BURSTS_DF_KEY)
In [559]: len(df)
Out[559]: 12857
In [560]: df.sort(inplace=True)
In [561]: df_without_index = df.reset_index()
In [562]: %timeit df.loc[(df.index.get_level_values('END_TIME') > 358200) & (df.index.get_level_values('START_TIME') < 361680)]
1000 loops, best of 3: 562 µs per loop
In [563]: %timeit df_without_index[(df_without_index.END_TIME > 358200) & (df_without_index.START_TIME < 361680)]
1000 loops, best of 3: 507 µs per loop