
Overview:

Use the following scripts to extract URLs from .txt.gz files and write them to a .txt file.
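As a rough illustration of the extraction step, here is a minimal sketch. It does not reproduce the contents of the actual script: the function names `extract_urls` and `write_urls`, the URL pattern, and the UTF-8 encoding are all assumptions made for this example.

```python
import gzip
import re

# Assumed URL pattern: an http(s) scheme followed by characters that are
# not whitespace, quotes, or angle brackets. The real script may differ.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(gz_path):
    """Yield every URL-like match found in a gzip-compressed text file."""
    # "rt" opens the archive in text mode so lines arrive as str, not bytes.
    with gzip.open(gz_path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            yield from URL_RE.findall(line)


def write_urls(gz_path, out_path):
    """Write one extracted URL per line to out_path; return the URL count."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for url in extract_urls(gz_path):
            out.write(url + "\n")
            count += 1
    return count
```

Reading line by line keeps memory use low for large archives, since the file is never loaded whole.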

Depending on the types of URLs being processed, you will either need only "blogger_url_cleaner.py" (which plainly extracts the URLs from a file), or you will also need "blogger_remove_img_lines.py", which reads the resulting txt file and outputs all lines that do not contain "jpg|png|gif|jpeg".
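The image-line filter can be sketched as follows. This is an illustration under assumptions, not the script's actual code: the function name `remove_img_lines` is hypothetical, and the match is assumed to be a case-sensitive substring search using the pattern quoted above.

```python
import re

# Pattern taken from the README text; treated here as a case-sensitive
# substring match anywhere in the line (an assumption).
IMG_RE = re.compile(r"jpg|png|gif|jpeg")


def remove_img_lines(lines):
    """Return only the lines that do not contain any of the image markers."""
    return [line for line in lines if not IMG_RE.search(line)]
```

Because this sketch matches substrings rather than file extensions, a URL such as one containing "png" in its path would also be dropped, and an upper-case ".PNG" would be kept. Anchoring the pattern to the end of the URL or adding `re.IGNORECASE` would change that behaviour if needed.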