You can use MultiLabelBinarizer present in scikit which is specifically used for doing this.
Code for your example:
features = [
['f1', 'f2', 'f3'],
['f2', 'f4', 'f5', 'f6'],
['f1', 'f2']
]
from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer()
new_features = mlb.fit_transform(features)
Output:
array([[1, 1, 1, 0, 0, 0],
[0, 1, 0, 1, 1, 1],
[1, 1, 0, 0, 0, 0]])
This can also be used in a pipeline, along with other feature_selection utilities.