datechnoman
Joined on 2023-03-10
datechnoman pushed to main at ArchiveTeam/CommonCrawl_URL_Processor 2024-03-31 11:38:21 +00:00
8b7607c7e6 Update prerequisites.sh
datechnoman pushed to main at ArchiveTeam/CommonCrawl_URL_Processor 2024-03-31 11:38:08 +00:00
a882482f52 Delete rsyncd.conf
datechnoman pushed to main at ArchiveTeam/CommonCrawl_URL_Processor 2024-03-31 11:38:03 +00:00
1849c03cdf Delete rsyncd.passwd
datechnoman pushed to main at ArchiveTeam/CommonCrawl_URL_Processor 2024-03-31 11:36:00 +00:00
datechnoman pushed to main at ArchiveTeam/CommonCrawl_URL_Processor 2024-03-31 11:34:57 +00:00
b33c8148a4 Upload files to "test"
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-14 03:51:17 +00:00
caeec000c3 Roll back to prechange
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-13 04:34:50 +00:00
9e94e1e108 Added in removed cdxcount and cdxsummary
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-13 04:29:56 +00:00
9793f0e9db Updated to change JSON output for tophosts file
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-12 10:58:58 +00:00
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-12 04:22:09 +00:00
d36d5ca79e Update comment with correct directory
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-12 04:19:45 +00:00
27a2b56936 Update due to bug in folder location
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-12 04:08:46 +00:00
7b4651b07e Updated to delete older tophost json files
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-12 02:38:09 +00:00
3fb177fb70 Updated to move tophost json file
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-11 23:47:53 +00:00
eaa8278db2 Update urls_automated_cdx_processor.py
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-03-11 23:32:26 +00:00
7106415581 Add urls_automated_cdx_processor.py
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveTeam_Project_U... 2024-03-02 00:47:04 +00:00
3649ea306f Update archiveteam_project_url_extractor.py
datechnoman pushed to main at ArchiveTeam/Migrated_CommonCrawl_WAT_Path_... 2024-02-20 11:09:14 +00:00
9144f85820 Bug fix for files not comparing correctly
datechnoman pushed to main at ArchiveTeam/Migrated_ArchiveOrg_CDX_Stats_... 2024-02-17 12:59:53 +00:00
9cb12880cc Updated to 72 hour retention
datechnoman pushed to main at ArchiveTeam/CommonCrawl_URL_Processor 2024-02-15 11:30:02 +00:00
8ecbfc8696 Add archive_org_url_processor.py