Functional pipes in Python like %>% from R’s magrittr

Pipes were added in pandas 0.16.2 as DataFrame.pipe. Example:

    import pandas as pd
    from sklearn.datasets import load_iris

    x = load_iris()
    x = pd.DataFrame(x.data, columns=x.feature_names)

    def remove_units(df):
        # Strip the " (cm)" suffix from every column name.
        df.columns = [name.replace(" (cm)", "") for name in df.columns]
        return df

    def length_times_width(df):
        df['sepal length*width'] = df['sepal length'] * df['sepal width']
        df['petal length*width'] = df['petal length'] * df['petal width']
        # Return the frame so the result of pipe() is not None.
        return df

    x.pipe(remove_units).pipe(length_times_width)
    …
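To make the mechanics concrete, here is a minimal sketch of what pipe does: df.pipe(f, *args) is the same call as f(df, *args), which is why each function has to return a DataFrame for the chain to keep going. The toy data, column names and the add_product helper are made up for illustration and are not part of the answer above.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for this sketch.
df = pd.DataFrame({"length (cm)": [5.1, 4.9], "width (cm)": [3.5, 3.0]})

def remove_units(frame):
    # Return a renamed copy instead of mutating the input frame.
    return frame.rename(columns=lambda name: name.replace(" (cm)", ""))

def add_product(frame, left, right):
    # Extra positional arguments passed to pipe() are forwarded here.
    return frame.assign(**{f"{left}*{right}": frame[left] * frame[right]})

# Reads left to right, like %>% in magrittr.
result = df.pipe(remove_units).pipe(add_product, "length", "width")

# The equivalent nested call, which reads inside out.
nested = add_product(remove_units(df), "length", "width")
```

Returning new frames (rename, assign) rather than modifying df in place is a design choice in this sketch; it leaves the original data untouched, so the pipeline can be re-run safely.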

Writing console output to a file – file is unexpectedly empty

Guenther Schmitz’ answer is effective, but it’s worth explaining why: your Out-File -FilePath C:\Filepath is a stand-alone command that receives no input. An Out-File call with no input simply creates an empty file (0 bytes). In order for cmdlets such as Out-File to receive input from (an)other command(s) (represented as … below), you must use …

sklearn pipeline – how to apply different transformations on different columns

The way I usually do it is with a FeatureUnion, using a FunctionTransformer to pull out the relevant columns. Important notes: You have to define your functions with def, since annoyingly you can’t use lambda or partial in FunctionTransformer if you want to pickle your model. You need to initialize FunctionTransformer with validate=False. Something like …
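The truncated answer does not show its code here, so the following is only a sketch of the approach it describes, under assumptions: a tiny hypothetical DataFrame with age and income columns, and two made-up selector functions. Each branch of the FeatureUnion first pulls out one column with a FunctionTransformer and then applies its own transformation.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical data; the column names are assumptions for this sketch.
df = pd.DataFrame({"age": [20.0, 30.0, 40.0], "income": [1.0, 10.0, 100.0]})

# Named functions defined with def, as the answer recommends for pickling.
def select_age(frame):
    return frame[["age"]]

def select_income(frame):
    return frame[["income"]]

features = FeatureUnion([
    # Branch 1: standardize the age column.
    ("age", Pipeline([
        ("select", FunctionTransformer(select_age, validate=False)),
        ("scale", StandardScaler()),
    ])),
    # Branch 2: take log10 of the income column.
    ("income", Pipeline([
        ("select", FunctionTransformer(select_income, validate=False)),
        ("log", FunctionTransformer(np.log10, validate=False)),
    ])),
])

# The branch outputs are stacked side by side: one column per branch here.
transformed = features.fit_transform(df)
```

With validate=False the DataFrame reaches the selector functions unchanged, which is what makes selecting columns by name possible.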

Sklearn Pipeline: Get feature names after OneHotEncode In ColumnTransformer

You can access the feature names using the following snippet:

    clf.named_steps['preprocessor'].transformers_[1][1]\
        .named_steps['onehot'].get_feature_names(categorical_features)

Using sklearn >= 0.21, we can make it even simpler:

    clf['preprocessor'].transformers_[1][1]\
        ['onehot'].get_feature_names(categorical_features)

Reproducible example:

    import numpy as np
    import pandas as pd
    from sklearn.impute import SimpleImputer
    from sklearn.preprocessing import OneHotEncoder, StandardScaler
    from sklearn.pipeline import Pipeline
    from sklearn.compose import ColumnTransformer
    from sklearn.linear_model import LinearRegression
    …
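Since the reproducible example is cut off, here is a small stand-in sketch, not the answer's own code. The data, column names and step names other than 'preprocessor' and 'onehot' are assumptions. One caveat worth knowing: get_feature_names was removed in scikit-learn 1.2 in favour of get_feature_names_out (available since 1.0), so the sketch uses whichever one the installed version provides.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "size": [1.0, 2.0, 3.0, 4.0],
    "color": ["red", "blue", "red", "green"],
    "price": [10.0, 20.0, 30.0, 40.0],
})
numeric_features = ["size"]
categorical_features = ["color"]

preprocessor = ColumnTransformer([
    ("num", Pipeline([("scaler", StandardScaler())]), numeric_features),
    ("cat", Pipeline([("onehot", OneHotEncoder(handle_unknown="ignore"))]),
     categorical_features),
])
clf = Pipeline([("preprocessor", preprocessor),
                ("regressor", LinearRegression())])
clf.fit(df[numeric_features + categorical_features], df["price"])

# transformers_[1] is the fitted ("cat", pipeline, columns) triple.
encoder = clf.named_steps["preprocessor"].transformers_[1][1].named_steps["onehot"]

# Newer releases expose get_feature_names_out; older ones get_feature_names.
if hasattr(encoder, "get_feature_names_out"):
    names = encoder.get_feature_names_out(categorical_features)
else:
    names = encoder.get_feature_names(categorical_features)

onehot_names = list(names)
```

Note that the index 1 in transformers_[1] depends on the order in which the transformers were passed to ColumnTransformer; looking the transformer up by name via named_transformers_ is an alternative if the order may change.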

Getting model attributes from pipeline

Did you look at the documentation? http://scikit-learn.org/dev/modules/pipeline.html I feel it is pretty clear.

Update: in 0.21 you can use just square brackets:

    pipeline['pca']

or indices:

    pipeline[1]

There are two ways to get to the steps in a pipeline, either using indices or using the string names you gave:

    pipeline.named_steps['pca']
    pipeline.steps[1][1]

This will give you the …
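A short sketch tying the four access styles together, assuming a hypothetical two-step pipeline whose second step is named 'pca' and some random data. All four expressions return the same fitted estimator object, so its fitted attributes (here components_) can be read from any of them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data and step names, for illustration only.
rng = np.random.RandomState(0)
X = rng.rand(20, 3)

pipeline = Pipeline([("scale", StandardScaler()),
                     ("pca", PCA(n_components=2))])
pipeline.fit(X)

by_name = pipeline.named_steps["pca"]   # lookup by step name
by_position = pipeline.steps[1][1]      # steps is a list of (name, estimator)
by_key = pipeline["pca"]                # square brackets, sklearn >= 0.21
by_index = pipeline[1]                  # integer index, sklearn >= 0.21

# Fitted attributes live on the step itself, not on the pipeline.
components = by_name.components_
```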

Pipe complete array-objects instead of array items one at a time?

Short answer: use the unary array operator ,:

    ,$theArray | foreach{Write-Host $_}

Long answer: there is one thing you should understand about the @() operator: it always interprets its content as a statement, even if the content is just an expression. Consider this code:

    $a="A",'B','C'
    $b=@($a;)
    $c=@($b;)

I add an explicit end-of-statement mark ; here, although PowerShell allows …