scala-collections - w3toppers.com

Scala parallel collections degree of parallelism

With the newest trunk, on JVM 1.6 or newer, use: collection.parallel.ForkJoinTasks.defaultForkJoinPool.setParallelism(parlevel: Int) This may be subject to change in the future, though. A more unified approach to configuring all Scala task-parallel APIs is planned for the next releases. Note, however, that while this will determine the number of processors the query …

Scala best way of turning a Collection into a Map-by-key?

You can use c map (t => t.getP -> t) toMap, but be aware that this needs two traversals: one to build the intermediate collection of pairs and one to build the Map from it.
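
A minimal sketch of both variants, assuming a hypothetical Item class in place of the element type that exposes getP (Item and MapByKey are illustrative names, not part of the original answer). Mapping over an iterator produces the pairs on demand, so no intermediate collection of pairs is built before toMap runs:

```scala
// Hypothetical element type standing in for whatever exposes `getP`.
final case class Item(getP: String, payload: Int)

object MapByKey {
  // Eager variant from the answer: `map` builds a full collection of pairs,
  // then `toMap` walks that collection a second time.
  def twoPass(c: Seq[Item]): Map[String, Item] =
    c.map(t => t.getP -> t).toMap

  // Lazy variant: the iterator yields each pair as `toMap` consumes it,
  // so the input is traversed once and no intermediate collection is kept.
  def onePass(c: Seq[Item]): Map[String, Item] =
    c.iterator.map(t => t.getP -> t).toMap
}
```

Both variants keep the last element seen when two elements share a key.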

Scala: Parallel collection in object initializer causes a program to hang

This is an inherent problem which can happen in Scala when releasing a reference to the singleton object before the construction is complete. It happens due to a different thread trying to access the object ParCollectionInInitializerTest before it has been fully constructed. It has nothing to do with the main method; rather, it has to …

Difference between MutableList and ListBuffer

A little explanation of how they work. ListBuffer internally uses Nil and :: to build an immutable List and allows constant-time removal of the first and last elements. To do so, it keeps a pointer to the first and last elements of the list, and is actually allowed to change the head and tail of …

Convert Java Map to Scala Map

Edit: the recommended way is now to use JavaConverters and the .asScala method:

    import scala.collection.JavaConverters._
    val myScalaMap = myJavaMap.asScala.mapValues(_.asScala.toSet)

This has the advantage of not using magical implicit conversions but explicit calls to .asScala, while staying clean and concise. The original answer with JavaConversions: You can use scala.collection.JavaConversions to implicitly convert between Java and Scala: …
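
The explicit .asScala approach can be sketched as a small function. This version assumes Scala 2.13 or newer, where the converters are imported from scala.jdk.CollectionConverters and mapValues is called on a view and then materialised with toMap; the object name, method name, and element types are illustrative only:

```scala
import scala.jdk.CollectionConverters._

object JavaMapToScala {
  // Converts a Java map of Java sets into an immutable Scala map of
  // immutable Scala sets. Every conversion is an explicit `.asScala` call.
  def convert(m: java.util.Map[String, java.util.Set[String]]): Map[String, Set[String]] =
    m.asScala.view.mapValues(_.asScala.toSet).toMap
}
```

On Scala 2.12 and earlier, the import in the answer (scala.collection.JavaConverters._) plays the same role.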

Scala Sets contain the same elements, but sameElements() returns false

The Scala collections library provides specialised implementations for Sets of fewer than 5 values (see the source). The iterators for these implementations return elements in the order in which they were added, rather than the consistent, hash-based ordering used for larger Sets. Furthermore, sameElements (scaladoc) is defined on Iterables (it is implemented in IterableLike – …
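
A small sketch of the behaviour described above, using two three-element Sets built in opposite orders. The comparison goes through iterator here because the exact place where sameElements is defined differs between library versions; the object and value names are illustrative:

```scala
object SmallSets {
  // Same elements, inserted in opposite orders.
  val a: Set[Int] = Set(1, 2, 3)
  val b: Set[Int] = Set(3, 2, 1)

  // Set equality ignores iteration order.
  val equalAsSets: Boolean = a == b

  // Position-by-position comparison of the two iteration orders; for these
  // small Sets the iterators follow insertion order, so the orders differ.
  val sameIterationOrder: Boolean = a.iterator.sameElements(b.iterator)

  // Order-independent comparison: sort both sides before comparing.
  val sameWhenSorted: Boolean = a.toSeq.sorted == b.toSeq.sorted
}
```

When only membership matters, == is the comparison to reach for; sameElements is meaningful only when iteration order is part of the contract.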

Reduce, fold or scan (Left/Right)?

In general, all six fold functions apply a binary operator to each element of a collection. The result of each step is passed on to the next step (as input to one of the binary operator’s two arguments). This way we can accumulate a result. reduceLeft and reduceRight accumulate a single result. foldLeft and foldRight …
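
As an illustration, here are the six functions applied to a four-element list. Subtraction is used for reduce and fold because it is not associative, which makes the left/right difference visible; addition is used for scan so the running totals are easy to read. The object and value names are illustrative:

```scala
object Folds {
  val xs = List(1, 2, 3, 4)

  // reduce: no start value, the first (or last) element seeds the result.
  val reduceL: Int = xs.reduceLeft(_ - _)   // ((1 - 2) - 3) - 4 = -8
  val reduceR: Int = xs.reduceRight(_ - _)  // 1 - (2 - (3 - 4)) = -2

  // fold: like reduce, but seeded with an explicit start value.
  val foldL: Int = xs.foldLeft(0)(_ - _)    // (((0 - 1) - 2) - 3) - 4 = -10
  val foldR: Int = xs.foldRight(0)(_ - _)   // 1 - (2 - (3 - (4 - 0))) = -2

  // scan: like fold, but keeps every intermediate result.
  val scanL: List[Int] = xs.scanLeft(0)(_ + _)   // List(0, 1, 3, 6, 10)
  val scanR: List[Int] = xs.scanRight(0)(_ + _)  // List(10, 9, 7, 4, 0)
}
```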

Scala: Why does Seq.contains take an Any argument, instead of an argument of the sequence type?

Lots of interesting answers, but here’s my own theory: if contains did not receive an Any, then Seq could not be covariant. See, for instance, Set, which is not covariant and whose contains takes an A instead of an Any. The reason for that is left as an exercise to the reader. 😉 But here …
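
The variance constraint can be sketched with a hypothetical covariant list (MyList, MyNil, and MyCons are illustrative names, not library types). In a class declared with +A, a method contains(x: A) is rejected by the compiler because A would appear in a parameter position. Widening the parameter with a lower-bounded type B >: A, which in practice accepts any argument much like Any does, keeps the class covariant:

```scala
sealed trait MyList[+A] {
  // def contains(x: A): Boolean
  //   would not compile: covariant type A occurs in contravariant position.

  // Accepting any supertype B of A is allowed and keeps MyList covariant.
  def contains[B >: A](x: B): Boolean = this match {
    case MyNil        => false
    case MyCons(h, t) => h == x || t.contains(x)
  }
}

case object MyNil extends MyList[Nothing]
final case class MyCons[+A](head: A, tail: MyList[A]) extends MyList[A]
```

A call such as MyCons(1, MyNil).contains("one") still type-checks, since B is inferred as a common supertype of Int and String; it simply returns false.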

Converting a Java collection into a Scala collection

For future reference: with Scala 2.8, it could be done like this:

    import scala.collection.JavaConversions._

    val list = new java.util.ArrayList[String]()
    list.add("test")
    val set = list.toSet

After this, set is a scala.collection.immutable.Set[String]. Also see Ben James’ answer for a more explicit way (using JavaConverters), which seems to be recommended now.

Function returns an empty List in Spark

It happens because filesInZip is not shared between workers. foreach operates on a local copy of filesInZip, and when it finishes, this copy is simply discarded and garbage collected. If you want to keep the results, you should use a transformation (most likely a flatMap) and return the collected, aggregated values. def listFiles(stream: PortableDataStream): TraversableOnce[String] = ??? …