Spark: Inconsistent performance numbers when scaling the number of cores

Theoretical limitations

I assume you are familiar with Amdahl’s law, but here is a quick reminder. The theoretical speedup is defined as follows:

$$S = \frac{1}{(1 - p) + \frac{p}{s}}$$

where:

s is the speedup of the parallel part,
p is the fraction of the program that can be parallelized.

In practice, the theoretical speedup is always limited by the part that …
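To make the reminder concrete, here is a minimal sketch that evaluates the standard form of Amdahl’s law, 1 / ((1 - p) + p / s). The function names and the choice of p = 0.9 are assumptions made for illustration, and treating s as the number of cores is an idealization (perfect scaling of the parallel part), not something measured on a real Spark job.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is sped up by a factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def amdahl_limit(p: float) -> float:
    """Upper bound on the overall speedup as s grows without bound."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.9  # hypothetical parallel fraction, chosen only for illustration
    for cores in (1, 2, 4, 8, 16, 64):
        # Assumption: the parallel part speeds up linearly with the core count.
        print(f"{cores:>3} cores -> speedup {amdahl_speedup(p, cores):.2f}")
    print(f"limit for p={p}: {amdahl_limit(p):.2f}")
```

Under these assumptions, each doubling of the core count yields a smaller gain than the previous one, because the serial fraction (1 - p) caps the result at 1 / (1 - p).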

Which is faster: multiple single INSERTs or one multiple-row INSERT?

https://dev.mysql.com/doc/refman/8.0/en/insert-optimization.html

The time required for inserting a row is determined by the following factors, where the numbers indicate approximate proportions:

Connecting: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting row: (1 × size of row)
Inserting indexes: (1 × number of indexes)
Closing: (1)

From this it should be obvious that sending one …
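The proportions quoted above can be turned into a rough back-of-the-envelope model. The sketch below is not a benchmark: the function names, the default row size and index count of 1, and the simplification that sending and parsing cost the same fixed amount per statement regardless of how many rows it carries are all assumptions made for illustration.

```python
# Approximate cost proportions as quoted from the MySQL manual.
CONNECT, SEND, PARSE, CLOSE = 3, 2, 2, 1


def row_cost(row_size: int, n_indexes: int) -> int:
    """Per-row work: 1 x size of row plus 1 x number of indexes."""
    return 1 * row_size + 1 * n_indexes


def cost_single_inserts(n_rows: int, row_size: int = 1, n_indexes: int = 1,
                        reuse_connection: bool = True) -> int:
    """Estimated cost of n_rows separate single-row INSERT statements."""
    per_statement = SEND + PARSE + row_cost(row_size, n_indexes)
    connection = CONNECT + CLOSE
    if reuse_connection:
        # Connect and close once, then pay send + parse for every statement.
        return connection + n_rows * per_statement
    # Assumption: a fresh connection is opened and closed for each statement.
    return n_rows * (connection + per_statement)


def cost_multi_row_insert(n_rows: int, row_size: int = 1, n_indexes: int = 1) -> int:
    """Estimated cost of one INSERT statement carrying n_rows rows."""
    # Assumption: send and parse are treated as fixed per statement.
    return CONNECT + SEND + PARSE + n_rows * row_cost(row_size, n_indexes) + CLOSE


if __name__ == "__main__":
    n = 100
    print("single, new connection each:", cost_single_inserts(n, reuse_connection=False))
    print("single, reused connection:  ", cost_single_inserts(n))
    print("one multi-row statement:    ", cost_multi_row_insert(n))
```

In this simplified model the per-row work is identical in every case; the difference comes entirely from how often the fixed per-statement and per-connection overheads are paid.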