Improper use of __new__ to generate class instances?

I don’t think using __new__() to do what you want is improper. In other words, I disagree with the accepted answer to this question which claims factory functions are always the “best way to do it”.

If you really want to avoid using it, then the only options are metaclasses or a separate factory function/method (however see Python 3.6+ Update below). Given the choices available, making the __new__() method one — since it’s static by default — is a perfectly sensible approach.

That said, below is what I think is an improved version of your code. I’ve added a couple of class methods to assist in automatically finding all the subclasses. These support the most important way in which it’s better — which is now adding subclasses doesn’t require modifying the __new__() method. This means it’s now easily extensible since it effectively supports what you could call virtual constructors.

A similar implementation could also be used to move the creation of instances out of the __new__() method into a separate (static) factory method — so in one sense the technique shown is just a relatively simple way of coding an extensible generic factory function regardless of what name it’s given.

# Works in Python 2 and 3.

import os
import re

class FileSystem(object):
    class NoAccess(Exception): pass
    class Unknown(Exception): pass

    # Regex for matching "xxx://" where x is any non-whitespace character except for ":".
    _PATH_PREFIX_PATTERN = re.compile(r'\s*([^:]+)://')

    @classmethod
    def _get_all_subclasses(cls):
        """ Recursive generator of all class' subclasses. """
        for subclass in cls.__subclasses__():
            yield subclass
            for subclass in subclass._get_all_subclasses():
                yield subclass

    @classmethod
    def _get_prefix(cls, s):
        """ Extract any file system prefix at beginning of string s and
            return a lowercase version of it or None when there isn't one.
        """
        match = cls._PATH_PREFIX_PATTERN.match(s)
        return match.group(1).lower() if match else None

    def __new__(cls, path):
        """ Create instance of appropriate subclass using path prefix. """
        path_prefix = cls._get_prefix(path)

        for subclass in cls._get_all_subclasses():
            if subclass.prefix == path_prefix:
                # Using "object" base class method avoids recursion here.
                return object.__new__(subclass)
        else:  # No subclass with matching prefix found (& no default defined)
            raise FileSystem.Unknown(
                'path "{}" has no known file system prefix'.format(path))

    def count_files(self):
        raise NotImplementedError


class Nfs(FileSystem):
    prefix = 'nfs'

    def __init__ (self, path):
        pass

    def count_files(self):
        pass


class LocalDrive(FileSystem):
    prefix = None  # Default when no file system prefix is found.

    def __init__(self, path):
        if not os.access(path, os.R_OK):
            raise FileSystem.NoAccess('Cannot read directory')
        self.path = path

    def count_files(self):
        return sum(os.path.isfile(os.path.join(self.path, filename))
                     for filename in os.listdir(self.path))


if __name__ == '__main__':

    data1 = FileSystem('nfs://192.168.1.18')
    data2 = FileSystem('c:/')  # Change as necessary for testing.

    print(type(data1).__name__)  # -> Nfs
    print(type(data2).__name__)  # -> LocalDrive

    print(data2.count_files())  # -> <some number>

Python 3.6+ Update

The code above works in both Python 2 and 3.x. However in Python 3.6 a new class method was added to object named __init_subclass__() which makes the finding of subclasses simpler by using it to automatically create a “registry” of them instead of potentially having to check every subclass recursively as the _get_all_subclasses() method is doing in the above.

I got the idea of using __init_subclass__() to do this from the Subclass registration section in the PEP 487 — Simpler customisation of class creation proposal. Since the method will be inherited by all the base class’ subclasses, registration will automatically be done for sub-subclasses, too (as opposed to only to direct subclasses) — it completely eliminates the need for a method like _get_all_subclasses().

# Requires Python 3.6+

import os
import re

class FileSystem(object):
    class NoAccess(Exception): pass
    class Unknown(Exception): pass

    # Pattern for matching "xxx://"  # x is any non-whitespace character except for ":".
    _PATH_PREFIX_PATTERN = re.compile(r'\s*([^:]+)://')
    _registry = {}  # Registered subclasses.

    @classmethod
    def __init_subclass__(cls, /, path_prefix, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._registry[path_prefix] = cls  # Add class to registry.

    @classmethod
    def _get_prefix(cls, s):
        """ Extract any file system prefix at beginning of string s and
            return a lowercase version of it or None when there isn't one.
        """
        match = cls._PATH_PREFIX_PATTERN.match(s)
        return match.group(1).lower() if match else None

    def __new__(cls, path):
        """ Create instance of appropriate subclass. """
        path_prefix = cls._get_prefix(path)
        subclass = cls._registry.get(path_prefix)
        if subclass:
            return object.__new__(subclass)
        else:  # No subclass with matching prefix found (and no default).
            raise cls.Unknown(
                f'path "{path}" has no known file system prefix')

    def count_files(self):
        raise NotImplementedError


class Nfs(FileSystem, path_prefix='nfs'):
    def __init__ (self, path):
        pass

    def count_files(self):
        pass

class Ufs(Nfs, path_prefix='ufs'):
    def __init__ (self, path):
        pass

    def count_files(self):
        pass

class LocalDrive(FileSystem, path_prefix=None):  # Default file system.
    def __init__(self, path):
        if not os.access(path, os.R_OK):
            raise self.NoAccess(f'Cannot read directory {path!r}')
        self.path = path

    def count_files(self):
        return sum(os.path.isfile(os.path.join(self.path, filename))
                     for filename in os.listdir(self.path))


if __name__ == '__main__':

    data1 = FileSystem('nfs://192.168.1.18')
    data2 = FileSystem('c:/')  # Change as necessary for testing.
    data4 = FileSystem('ufs://192.168.1.18')

    print(type(data1))  # -> <class '__main__.Nfs'>
    print(type(data2))  # -> <class '__main__.LocalDrive'>
    print(f'file count: {data2.count_files()}')  # -> file count: <some number>

    try:
        data3 = FileSystem('c:/foobar')  # A non-existent directory.
    except FileSystem.NoAccess as exc:
        print(f'{exc} - FileSystem.NoAccess exception raised as expected')
    else:
        raise RuntimeError("Non-existent path should have raised Exception!")

    try:
        data4 = FileSystem('foobar://42')  # Unregistered path prefix.
    except FileSystem.Unknown as exc:
        print(f'{exc} - FileSystem.Unknown exception raised as expected')
    else:
        raise RuntimeError("Unregistered path prefix should have raised Exception!")

Leave a Comment