by Internet Archive, web, 2015, 250 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl444.us.archive.org:youtube from Wed Dec 2 22:57:51 PST 2015 to Wed Dec 2 15:27:43 PST 2015.
Topic: crawldata
by Archive-It, web, 2015, 16 views
recurrence=QUARTERLY, maxDuration=604800, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=false, oneTimeSubtype=null, seedCount=55, accountId=550, accountType=SUBSCRIBER, organizationName="Innsbruck Newspaper Archive / University of Innsbruck", collectionId=2697, collectionName="DILIMAG", collectionPublic=true
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Fri Jan 9 06:21:58 PST 2015 to Thu Jan 8 22:38:29 PST 2015.
Topic: crawldata
Complete crawl of the Portuguese web performed between 10 April 2015 and 9 June 2015 mainly from .PT domain. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Incremental crawl of the Portuguese web performed between 13 August 2015 and 5 November 2015 mainly from .PT domain. The AWP18 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP17 as baseline. Thus, the files that remained unchanged from the AWP17 complete crawl were not archived (duplicated) on the AWP18 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
by Internet Archive, web, 2015, 50 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl819.us.archive.org:youtube from Sun Dec 27 00:41:25 PST 2015 to Sat Dec 26 17:27:37 PST 2015.
Topic: crawldata
by Archive-It, web, 2015, 12 views
Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 incremental crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Complete crawl of the Portuguese web performed between 10 April 2015 and 9 June 2015 mainly from .PT domain. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
by العيد في ظل الخلافة, movies, 2015, 127 views
العيد في ظل الخلافة ("Eid under the Caliphate")
Topic: العيد في ظل الخلافة
jobId=172834, recurrence=BIMONTHLY, maxDuration=259200, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=false, oneTimeSubtype=null, seedCount=1, accountId=805, accountType=SUBSCRIBER, organizationName="Memphis University School ", collectionId=6269, collectionName="Memphis University School Owls Tube Video Collection", collectionPublic=true
Incremental crawl of the Portuguese web performed between 13 August 2015 and 5 November 2015 mainly from .PT domain. The AWP18 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP17 as baseline. Thus, the files that remained unchanged from the AWP17 complete crawl were not archived (duplicated) on the AWP18 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
by Archive-It, web, 2015, 16 views
jobId=182622, recurrence=DAILY, maxDuration=82800, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=false, oneTimeSubtype=null, seedCount=1, accountId=946, accountType=SUBSCRIBER, organizationName="Vassar College", collectionId=5543, collectionName="Vassar Homepage", collectionPublic=true
by Archive-It, web, 2015, 11 views
recurrence=WEEKLY, maxDuration=259200, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=false, oneTimeSubtype=null, seedCount=1, accountId=305, accountType=SUBSCRIBER, organizationName="Alaska State Library", collectionId=1084, collectionName="Government in Alaska", collectionPublic=true
jobId=187218, recurrence=NONE, maxDuration=86400, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=true, oneTimeSubtype=MISSING_URLS_PATCH_CRAWL, seedCount=49, accountId=413, accountType=SUBSCRIBER, organizationName="Michigan State University", collectionId=2356, collectionName="MSU Colleges, Schools, Research Centers & Institutes Collection", collectionPublic=true
jobId=169703, recurrence=NONE, maxDuration=86400, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=false, oneTimeSubtype=CRAWL_SELECTED_SEEDS, seedCount=12, accountId=336, accountType=SUBSCRIBER, organizationName="University of Hawaii", collectionId=3322, collectionName="University of Hawaii at Manoa Schedule of Classes", collectionPublic=true
Complete crawl of the Portuguese web performed between 10 April 2015 and 9 June 2015 mainly from .PT domain. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
by Internet Archive, web, 2015, 33 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl443.us.archive.org:youtube from Mon Oct 19 19:19:53 PDT 2015 to Mon Oct 19 13:55:25 PDT 2015.
Topic: crawldata
jobId=187231, recurrence=NONE, maxDuration=86400, maxDocumentCount=null, isTestCrawl=false, isPatchCrawl=true, oneTimeSubtype=MISSING_URLS_PATCH_CRAWL, seedCount=7, accountId=950, accountType=SUBSCRIBER, organizationName="UC Davis", collectionId=5778, collectionName="University of California, Davis Web Archives", collectionPublic=true
by Sean Hannity, audio, 2015, 9 views
Topics: Sean Hannity, The Sean Hannity Show
Complete crawl of the Portuguese web performed between 10 April 2015 and 9 June 2015 mainly from .PT domain. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
by إن أردتم أن تصونوا دمائكم فإما الإسلام وإما الجزية, movies, 2015, 169 views
إن أردتم أن تصونوا دمائكم فإما الإسلام وإما الجزية ("If you wish to safeguard your blood, then either Islam or the jizya")
Topic: إن أردتم أن تصونوا دمائكم فإما الإسلام وإما الجزية
by هاجت الأشواق, movies, 2015, 202 views
هاجت الأشواق ("Longings Stirred")
Topic: هاجت الأشواق
by forqan.wa.aato.zakat.original.quality, movies, 2015, 14 views
forqan.wa.aato.zakat.original.quality
Topic: forqan.wa.aato.zakat.original.quality
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Wed Jan 7 01:32:03 PST 2015 to Tue Jan 6 17:48:48 PST 2015.
Topic: crawldata
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Tue Jan 6 04:04:03 PST 2015 to Mon Jan 5 23:09:12 PST 2015.
Topic: crawldata
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Tue Jan 6 03:13:13 PST 2015 to Tue Jan 6 00:24:00 PST 2015.
Topic: crawldata
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Tue Jan 6 13:18:11 PST 2015 to Tue Jan 6 05:37:35 PST 2015.
Topic: crawldata
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Thu Jan 8 23:50:29 PST 2015 to Thu Jan 8 16:14:29 PST 2015.
Topic: crawldata
Internet Archive crawldata from the NARA 113th Congressional Crawl, captured by wbgrp-crawl013.us.archive.org:congress113th from Fri Jan 9 10:58:18 PST 2015 to Fri Jan 9 03:19:18 PST 2015.
Topic: crawldata
by Internet Archive, web, 2015, 58 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl445.us.archive.org:youtube from Fri Oct 30 23:43:01 PDT 2015 to Fri Oct 30 17:12:49 PDT 2015.
Topic: crawldata
by Internet Archive, web, 2015, 85 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl443.us.archive.org:youtube from Fri Nov 6 17:35:41 PST 2015 to Fri Nov 6 09:50:10 PST 2015.
Topic: crawldata
by Internet Archive, web, 2015, 41 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl445.us.archive.org:youtube from Wed Nov 4 11:57:13 PST 2015 to Wed Nov 4 05:17:58 PST 2015.
Topic: crawldata
by Internet Archive, web, 2015, 32 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl440.us.archive.org:youtube from Mon Nov 23 22:08:31 PST 2015 to Mon Nov 23 15:33:59 PST 2015.
Topic: crawldata
by Internet Archive, web, 2015, 46 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl440.us.archive.org:youtube from Wed Nov 25 11:33:58 PST 2015 to Wed Nov 25 04:53:13 PST 2015.
Topic: crawldata
by Internet Archive, web, 2015, 56 views
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl440.us.archive.org:youtube from Thu Nov 26 21:08:40 PST 2015 to Thu Nov 26 14:15:48 PST 2015.
Topic: crawldata