View Full Version : Size of Wikipedia
Trzone
September 28th, 2008, 10:10 PM
I know that Wikipedia has these so called "dumps" but does anyone know the actual size of all the information wikipedia has? Just curious.
jespdj
September 28th, 2008, 10:15 PM
Where else would you find the answer to that than on... Wikipedia?! ;)
http://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
schauerlich
September 28th, 2008, 10:15 PM
wget -R http://wikipedia.org
(Just kidding, that will take hours and get you nothing useful. :) )
Trzone
September 28th, 2008, 10:28 PM
theo@Ubuntu:~$ wget -R http://wikipedia.org
wget: missing URL
Usage: wget [OPTION]... [URL]...
Try `wget --help' for more options.
I seriously want to know, and well, Wikipedia's own page only gives the size in terms of books, not actual computer data! hehe
LaRoza
September 28th, 2008, 10:30 PM
theo@Ubuntu:~$ wget -R http://wikipedia.org
wget: missing URL
Usage: wget [OPTION]... [URL]...
Try `wget --help' for more options.
I seriously want to know, and well, Wikipedia's own page only gives the size in terms of books, not actual computer data! hehe
The data would be stored in a database; I think they use MySQL.
http://stats.wikimedia.org/EN/Sitemap.htm
schauerlich
September 28th, 2008, 10:37 PM
Don't put the "http://" in there.
The data would be stored in a database; I think they use MySQL.
http://stats.wikimedia.org/EN/Sitemap.htm
Also it's -r not -R
t0p
September 28th, 2008, 10:47 PM
Don't put the "http://" in there.
The data would be stored in a database; I think they use MySQL.
http://stats.wikimedia.org/EN/Sitemap.htm
No, the "http://" does need to be in there. But you should use the "-r" flag not the "-R" flag. Ergo~
wget -r http://wikipedia.org
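For anyone actually tempted to try this, a slightly politer version might look like the following sketch (the depth limit and rate limit are my own suggestions, not anything from the thread; all flags are standard wget options):

```shell
# Recursive download with some restraint: -r (lowercase) recurses,
# while -R (uppercase) is the reject-list option and expects a
# pattern argument, which is why the bare "wget -R URL" above fails.
# -l 2 caps the recursion depth, --wait pauses between requests,
# and --limit-rate throttles bandwidth.
wget -r -l 2 --wait=1 --limit-rate=200k http://wikipedia.org
```

Even throttled, mirroring a site the size of Wikipedia this way is impractical; the dumps mentioned elsewhere in the thread are the sane route.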
LaRoza
September 28th, 2008, 10:48 PM
No, the "http://" does need to be in there. But you should use the "-r" flag not the "-R" flag. Ergo~
wget -r http://wikipedia.org
It was a suggestion, but I see now.
I should have RTM'd first.
snova
September 28th, 2008, 10:50 PM
A static HTML dump is about 14.3 GB, compressed with 7zip. Does that answer your question?
Trzone
September 28th, 2008, 11:16 PM
I think I want to rephrase my question; it was way too vague. What is the size of the combined databases that Wikipedia owns? :)
init1
September 29th, 2008, 01:05 AM
wget -R http://wikipedia.org
(Just kidding, that will take hours and get you nothing useful. :) )
Heh yeah I tried that once :D
Trzone
September 29th, 2008, 11:14 AM
That command doesn't seem to be working :P But um, I think the statistics are behind by at least two years due to the absolutely massive scale that is Wikipedia.
snova
September 30th, 2008, 01:34 AM
I think I want to rephrase my question; it was way too vague. What is the size of the combined databases that Wikipedia owns? :)
You can download that too, as a dump of the database tables. You can always find out by trying to start the download; but then, that's textual SQL and possibly not a good indication of the size of the binary MySQL tables.
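If you did download a dump and wanted to answer the original question for yourself, totalling the on-disk size is a one-liner's worth of work. A minimal sketch (the directory path in the comment is made up for illustration):

```python
import os

def total_dump_size(dump_dir):
    """Sum the on-disk size, in bytes, of every file under dump_dir."""
    total = 0
    for root, _dirs, files in os.walk(dump_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Hypothetical usage, e.g. after fetching dumps into a local directory:
# size = total_dump_size("/data/wikipedia-dumps")
# print(f"{size / 2**30:.1f} GiB")
```

Note this measures whatever form the files are in on your disk (compressed archives or extracted SQL), which, as said above, may differ considerably from the size of the live binary MySQL tables.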
Powered by vBulletin® Version 4.2.2 Copyright © 2024 vBulletin Solutions, Inc. All rights reserved.