Merging two tables with millions of rows in Python
This is somewhat pseudocode-ish, but I think it should be quite fast. It is a straightforward disk-based merge, with all tables on disk. The key is that you are not doing selection per se, just indexing into the table via start/stop, which is quite fast. Selecting the rows that meet a criterion in B (using A’s …
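
The start/stop idea above can be illustrated with a minimal sketch. It is not the original answer's code: it assumes a hypothetical fixed-width binary layout (an int64 key plus a float64 value per row) and uses only the standard library, so that a row range maps directly to a byte offset. The names `write_table`, `read_rows`, `chunk_bounds`, and `disk_merge` are invented for illustration.

```python
import os
import struct

# Assumed (hypothetical) row layouts: input rows are (int64 key, float64 value),
# merged rows are (int64 key, float64 a_value, float64 b_value).
RECORD = struct.Struct("<qd")
OUT = struct.Struct("<qdd")


def write_table(path, rows, record=RECORD):
    """Write rows to disk as fixed-size binary records."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(record.pack(*row))


def n_rows(path, record=RECORD):
    """Number of rows, derived from the file size."""
    return os.path.getsize(path) // record.size


def read_rows(path, start, stop, record=RECORD):
    """Read rows [start, stop) by seeking to the byte offset; no scan, no filter."""
    with open(path, "rb") as f:
        f.seek(start * record.size)
        data = f.read((stop - start) * record.size)
    return list(record.iter_unpack(data))


def chunk_bounds(total, size):
    """Yield (start, stop) pairs that cover range(total) in steps of size."""
    for start in range(0, total, size):
        yield start, min(start + size, total)


def disk_merge(path_a, path_b, path_out, chunk_size=100_000):
    """Inner-join A and B on key, holding one chunk of each in memory at a time."""
    with open(path_out, "wb") as f:
        for a_start, a_stop in chunk_bounds(n_rows(path_a), chunk_size):
            # Index the current chunk of A by key for constant-time lookups.
            a_by_key = {}
            for key, a_val in read_rows(path_a, a_start, a_stop):
                a_by_key.setdefault(key, []).append(a_val)
            # Stream B chunk by chunk, again addressed purely by start/stop.
            for b_start, b_stop in chunk_bounds(n_rows(path_b), chunk_size):
                for key, b_val in read_rows(path_b, b_start, b_stop):
                    for a_val in a_by_key.get(key, ()):
                        f.write(OUT.pack(key, a_val, b_val))
```

In this sketch B is re-read once per chunk of A, so a larger `chunk_size` means fewer passes over B at the cost of more memory. A real table store would likely supply its own start/stop reads in place of `read_rows`.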