Check out linkchecker: it will crawl the site (while obeying robots.txt) and generate a report.
From there, you can script up a solution for creating the directory tree.
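
As a rough sketch of that scripting step, the Python below turns a list of crawled URLs into a matching directory tree on disk. It assumes you have already pulled the URLs out of the report into a plain list; how you extract them depends on the report format you chose and isn't shown. The names `url_to_dir` and `build_tree` are made up for this example, and treating a final path segment that contains a dot as a file name is a heuristic, not something linkchecker tells you.

```python
import os
from urllib.parse import urlsplit, unquote


def url_to_dir(url):
    """Map a URL to a relative directory: host name plus the path's directories."""
    parts = urlsplit(url)
    path = unquote(parts.path)
    segments = [s for s in path.split("/") if s]
    # Heuristic (an assumption): a last segment with a dot and no trailing
    # slash is a file such as page.html, so it is not turned into a directory.
    if segments and not path.endswith("/") and "." in segments[-1]:
        segments = segments[:-1]
    # Drop "." and ".." so a crafted URL cannot climb out of the output root.
    segments = [s for s in segments if s not in (".", "..")]
    return os.path.join(parts.hostname or "", *segments)


def build_tree(urls, root):
    """Create one directory under root per distinct URL directory."""
    created = set()
    for url in urls:
        rel = url_to_dir(url)
        os.makedirs(os.path.join(root, rel), exist_ok=True)
        created.add(rel)
    return sorted(created)
```

Calling `build_tree(urls, "mirror")` would create something like `mirror/example.com/a/b` for a URL such as `http://example.com/a/b/page.html`, and it returns the relative directories it created so you can check the result against the report.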