| 722838b24a | Updated to include zstd package | 2024-02-13 22:35:10 +00:00 |
| 3151fce353 | Update warc_wat_url_processor.py | 2024-02-08 00:39:09 +00:00 |
| 486a68a796 | Removed multithread compression and added force overwrite for compression files | 2024-01-28 22:50:09 +00:00 |
| ebc07a6974 | Update zstd to overwrite conflicts | 2024-01-28 11:39:00 +00:00 |
| 29d24e9826 | Reverting | 2024-01-28 11:36:17 +00:00 |
| 6d591ef0d0 | Added in error logging | 2024-01-28 11:29:53 +00:00 |
| 54747b64f6 | Update warc_wat_url_processor.py | 2024-01-28 11:23:27 +00:00 |
| b6a9c68140 | Update warc_wat_url_processor.py | 2024-01-28 09:14:12 +00:00 |
| d0fa7c84f4 | Update warc_wat_url_processor.py | 2024-01-28 09:10:17 +00:00 |
| 875082e8d9 | Update prerequisites.sh | 2024-01-27 06:31:36 +00:00 |
| ea8aa8f755 | Update commoncrawl_local_to_share_move.ps1 | 2024-01-27 06:28:44 +00:00 |
| 54d85523b4 | Update commoncrawl_transfer.ps1 | 2024-01-27 06:28:16 +00:00 |
| 0014256679 | Update commoncrawl_transfer.ps1 | 2024-01-27 06:03:49 +00:00 |
| 4f46c841b3 | Updated to include disk space checking | 2024-01-27 03:02:54 +00:00 |
| 488571b4c0 | Added in Lock file support | 2024-01-27 02:42:11 +00:00 |
| 674a7ae450 | Update to zst extension | 2024-01-27 02:26:27 +00:00 |
| 7eafadc2fe | Update to zst file extension | 2024-01-27 02:24:36 +00:00 |
| 8244b9241a | Update concurrency | 2024-01-26 08:01:39 +00:00 |
| f76dbb13fd | Update README.md | 2024-01-26 07:17:23 +00:00 |
| 5661ce44b2 | Updated loop issue | 2024-01-26 07:14:25 +00:00 |
| 2bee6db85a | Updated error processing for zstd compression errors | 2024-01-25 01:33:11 +00:00 |
| 097ec759ab | Update prerequisites.sh | 2024-01-24 06:00:42 +00:00 |
| 295c3daba4 | Update warc_wat_url_processor.py | 2024-01-23 10:45:32 +00:00 |
| 6edffba451 | Update warc_wat_url_processor.py | 2024-01-23 04:43:15 +00:00 |
| 513b32e80a | Update warc_wat_url_processor.py | 2024-01-20 12:33:28 +00:00 |
| 06e3399861 | Update warc_wat_url_processor.py | 2024-01-20 12:28:23 +00:00 |
| 3ce09f46d7 | Update prerequisites.sh | 2024-01-20 11:30:43 +00:00 |
| e98f80aec4 | Updated to use command line zstd | 2024-01-20 11:29:57 +00:00 |
| 78f6b69cdf | Updated with higher compression | 2024-01-20 11:25:59 +00:00 |
| d1cfd0178f | Update prerequisites.sh | 2024-01-20 11:14:42 +00:00 |
| 17b4ce6077 | Update warc_wat_url_processor.py | 2024-01-20 03:26:03 +00:00 |
| 99c2f07498 | Add warc_wat_url_processor.py | 2024-01-20 03:25:50 +00:00 |
| b881641c69 | Commented out URL Extractions (to be done post downloading of files) | 2024-01-12 04:02:06 +00:00 |
| cba96e96e7 | Rollback of change | 2023-12-22 01:06:30 +00:00 |
| bfc13cb6ef | Updated script to keep regenerating a list of files to download | 2023-12-21 09:35:39 +00:00 |
| 50e89b9de2 | Add in checking for new files once list is depleted | 2023-12-21 01:58:55 +00:00 |
| 0aad853966 | Update README.md | 2023-12-20 04:09:47 +00:00 |
| 5f152307f2 | Add commoncrawl_local_to_share_move.ps1 | 2023-12-20 04:09:13 +00:00 |
| fd9376cbe0 | Updated to extract Pastebin URL's | 2023-12-19 00:23:55 +00:00 |
| 1036de64a7 | Documentation Update | 2023-12-18 04:34:20 +00:00 |
| 727d2c3187 | Update commoncrawl_transfer.ps1 | 2023-12-18 04:27:30 +00:00 |
| 171d3e2d2d | Upload files to "/" | 2023-12-18 04:27:03 +00:00 |
| 65757f8cc4 | Update README.md | 2023-12-12 11:04:52 +00:00 |
| 1e25ef86fe | Upload files to "/" | 2023-12-12 10:23:58 +00:00 |
| b24287ef6f | Upload files to "/" | 2023-12-12 10:23:37 +00:00 |
| b7ce7aa4b0 | Upload files to "/" | 2023-12-12 10:22:31 +00:00 |
| 7f9480bc40 | Update urlextractor_archiveteam.sh | 2023-12-12 10:22:22 +00:00 |
| ac0f299269 | Update commoncrawl_url_processor.py | 2023-12-12 10:05:46 +00:00 |
| ba11c6af9f | Update commoncrawl_url_processor.py | 2023-12-12 10:05:23 +00:00 |
| d97491b4f0 | Update urlextractor_archiveteam.sh | 2023-12-10 05:10:10 +00:00 |