Apache Spark — Assign the result of UDF to multiple dataframe columns

It is not possible to create multiple top-level columns from a single UDF call, but you can create a new struct. This requires a UDF with a specified returnType:

    from pyspark.sql.functions import udf
    from pyspark.sql.types import StructType, StructField, FloatType

    schema = StructType([
        StructField("foo", FloatType(), False),
        StructField("bar", FloatType(), False)
    ])

    def udf_test(n):
        return (n / 2, …
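The struct approach can be sketched as follows. This is a minimal illustration, not the original answer's code: the function body, the helper names `halve_and_double` and `add_foo_bar`, and the input column name `value` are all hypothetical, and the sketch assumes PySpark is installed and that `df` is a DataFrame with a numeric column.

```python
def halve_and_double(n):
    """Plain Python logic: return a (foo, bar) pair for one input value.

    Hypothetical body for illustration only.
    """
    return (float(n) / 2, float(n) * 2)


def add_foo_bar(df, column="value"):
    """Add top-level `foo` and `bar` columns produced by one struct-returning UDF.

    `df` is assumed to be a PySpark DataFrame with a numeric `column`.
    PySpark is imported inside the function (a choice made for this sketch)
    so that the pure function above can be used without a Spark installation.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import FloatType, StructField, StructType

    schema = StructType([
        StructField("foo", FloatType(), False),
        StructField("bar", FloatType(), False),
    ])
    pair_udf = udf(halve_and_double, schema)

    # Store the UDF result in a struct column, then expand its fields
    # into ordinary columns and drop the intermediate struct.
    with_struct = df.withColumn("pair", pair_udf(col(column)))
    return with_struct.select("*", "pair.foo", "pair.bar").drop("pair")
```

The UDF itself still returns a single struct value; the separate columns come from selecting the struct's fields afterwards.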

How to call a user-defined MATLAB function from Java using matlabcontrol.jar

You must have any user-defined m-files on the MATLAB search path, just as if you were working normally inside MATLAB. I tested with the following example:

C:\some\path\myfunc.m

    function myfunc()
        disp('hello from MYFUNC')
    end

HelloWorld.java

    import matlabcontrol.*;

    public class HelloWorld
    {
        public static void main(String[] args)
            throws MatlabConnectionException, MatlabInvocationException
        {
            // create proxy
            MatlabProxyFactoryOptions options = …

Spark column string replace when present in other column (row)

You could simply use regexp_replace:

    df5.withColumn("sentence_without_label", regexp_replace($"sentence", lit($"label"), lit("")))

Or you can use a simple UDF, as below:

    val df5 = spark.createDataFrame(Seq(
      ("Hi I heard about Spark", "Spark"),
      ("I wish Java could use case classes", "Java"),
      ("Logistic regression models are neat", "models")
    )).toDF("sentence", "label")

    val replace = udf((data: String, rep: String) => data.replaceAll(rep, …
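For PySpark users, the same idea might look like the sketch below. It is an assumption-laden illustration, not a translation of the original answer: the helper names `remove_label` and `add_sentence_without_label` are hypothetical, the replacement string is taken to be empty (as in the regexp_replace variant), and the Spark part goes through a SQL expression so that the pattern can come from another column.

```python
import re


def remove_label(sentence, label):
    """Remove every match of `label` from `sentence`.

    `label` is treated as a regular expression, mirroring regexp_replace;
    wrap it in re.escape() if it should be matched literally.
    """
    return re.sub(label, "", sentence)


def add_sentence_without_label(df):
    """Add a `sentence_without_label` column to a PySpark DataFrame.

    `df` is assumed to have string columns `sentence` and `label`.
    PySpark is imported lazily (a choice made for this sketch) so that
    remove_label can be tried without a Spark installation.
    """
    from pyspark.sql.functions import expr

    # In a SQL expression, the pattern argument may reference another column.
    return df.withColumn(
        "sentence_without_label",
        expr("regexp_replace(sentence, label, '')"),
    )
```

Note that removing the label leaves the surrounding whitespace in place, so "Hi I heard about Spark" becomes "Hi I heard about " with a trailing space.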

Does the query plan optimizer work well with joined/filtered table-valued functions?

In this case, it’s an “inline table-valued function”. The optimiser simply expands (unnests) it, as it would a view, if that is useful. If the function is treated as a “black box” by the outer query, the quickest way to check is to compare the IO shown in SSMS with the IO in Profiler. Profiler captures “black box” IO that SSMS does not. …

PHP Optional Parameters – specify parameter value by name?

No, in PHP that is not possible as of writing. Use array arguments:

    function doSomething($arguments = array()) {
        // set defaults
        $arguments = array_merge(array(
            "argument" => "default value",
        ), $arguments);

        var_dump($arguments);
    }

Example usage:

    doSomething(); // with all defaults, or:
    doSomething(array("argument" => "other value"));

When changing an existing method:

    //function doSomething($bar, $baz) {
    function doSomething($bar, …

How to pass a constant value to Python UDF?

Everything that is passed to a UDF is interpreted as a column / column name. If you want to pass a literal, you have two options:

Pass the argument using currying:

    def comparatorUDF(n):
        return udf(lambda c: c == n, BooleanType())

    df.where(comparatorUDF("Bonsanto")(col("name")))

This can be used with an argument of any type as long as it is …
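The currying option can be broken into two steps to show where the constant ends up. This is a minimal sketch: the helper names `make_comparator`, `comparator_udf`, and `filter_by_name` are hypothetical, and it assumes PySpark is installed and that `df` has a `name` column.

```python
def make_comparator(n):
    """Return a one-argument function with the constant `n` bound in its closure."""
    return lambda c: c == n


def comparator_udf(n):
    """Wrap the closure as a Spark UDF; only the column is passed at call time.

    PySpark is imported lazily (a choice made for this sketch) so that
    make_comparator can be exercised without a Spark installation.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import BooleanType

    return udf(make_comparator(n), BooleanType())


def filter_by_name(df, name):
    """Keep the rows of `df` whose `name` column equals the constant `name`."""
    from pyspark.sql.functions import col

    return df.where(comparator_udf(name)(col("name")))
```

Because the constant is captured when the UDF is built, Spark never sees it as an argument, so it is not interpreted as a column name.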