tea.mathoverflow.net - Discussion Feed (MO public database dumps) 2018-11-04T12:58:47-08:00 http://mathoverflow.tqft.net/ Lussumo Vanilla & Feed Publisher

darijgrinberg comments on "MO public database dumps" (22074) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=22074#Comment_22074 2013-05-09T18:17:41-07:00 2018-11-04T12:58:47-08:00 darijgrinberg http://mathoverflow.tqft.net/account/478/ About half a year has passed since the last one...

Anton Geraschenko comments on "MO public database dumps" (19672) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=19672#Comment_19672 2012-08-13T16:36:29-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ @darijgrinberg: I will definitely generate a new public dump before the 2.0 update, and will maintain redundant copies of the full dump.

darijgrinberg comments on "MO public database dumps" (19650) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=19650#Comment_19650 2012-08-13T01:30:56-07:00 2018-11-04T12:58:47-08:00 darijgrinberg http://mathoverflow.tqft.net/account/478/ How about a dump before the 2.0 update?

darijgrinberg comments on "MO public database dumps" (18889) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=18889#Comment_18889 2012-04-12T14:40:13-07:00 2018-11-04T12:58:47-08:00 darijgrinberg http://mathoverflow.tqft.net/account/478/ Thanks for making it work again!

Anton Geraschenko comments on "MO public database dumps" (18885) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=18885#Comment_18885 2012-04-12T08:31:28-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Permissions issue resolved. The most recent dumps are now available.

Anton Geraschenko comments on "MO public database dumps" (18879) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=18879#Comment_18879 2012-04-09T11:43:39-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Sorry, my fault. Unfortunately, I have to do some things manually each time. For a while, it became overwhelming because the database export feature broke (it's working again now that MO is on a new server), so I had to contact SE by email to get the dumps. I still got the snapshots, but didn't clean them up for public consumption. I'll get the last few dumps up within the hour. [ok, maybe a bit more than an hour; my up-bandwidth isn't so great and the dumps are >100MB each]

Edit: hmmm ... I'm running into some permissions issues. I'm checking with Scott to see if he changed something.

darijgrinberg comments on "MO public database dumps" (18876) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=18876#Comment_18876 2012-04-08T15:59:40-07:00 2018-11-04T12:58:47-08:00 darijgrinberg http://mathoverflow.tqft.net/account/478/ So what happened to the dumping? I thought there was some script doing that automatically, or does Anton have to tweak things every time?

Harald Hanche-Olsen comments on "MO public database dumps" (18545) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=18545#Comment_18545 2012-02-21T07:42:29-08:00 2018-11-04T12:58:47-08:00 Harald Hanche-Olsen http://mathoverflow.tqft.net/account/18/ @Anton: Too busy to make notes of new dumps here? These alerts are much appreciated, if only for avoiding this thread getting buried too deep. It's the place to go when looking for fresh dumps, after all. Oh, and the latest dump is dated 2012-01-03. Is the next one overdue?

Anton Geraschenko comments on "MO public database dumps" (15349) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=15349#Comment_15349 2011-08-01T23:06:05-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ There's a new dump posted.

Anton Geraschenko comments on "MO public database dumps" (14889) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=14889#Comment_14889 2011-07-07T11:34:33-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ There's a new dump posted.

Anton Geraschenko comments on "MO public database dumps" (14675) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=14675#Comment_14675 2011-06-04T11:31:31-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ There's a new dump posted.

Anton Geraschenko comments on "MO public database dumps" (14437) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=14437#Comment_14437 2011-05-10T01:57:44-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Sorry for not posting a dump last month, and for being late this month. My excuse is that I've been writing my dissertation.

There's a new dump posted.

Anton Geraschenko comments on "MO public database dumps" (13631) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=13631#Comment_13631 2011-03-13T12:58:27-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ There's a new dump posted.

Anton Geraschenko comments on "MO public database dumps" (13079) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=13079#Comment_13079 2011-02-04T08:42:45-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ There's a new dump posted.

Anton Geraschenko comments on "MO public database dumps" (12308) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=12308#Comment_12308 2011-01-02T00:07:03-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ There's a new dump for the new year.

There are a couple of changes (mostly discussed in the few posts above this one). The biggest one is that post histories are now included! See the readme for details about what all the fields mean.

Anton Geraschenko comments on "MO public database dumps" (11806) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=11806#Comment_11806 2010-12-13T01:04:59-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Okay, I've updated the public dump script to include the RevisionGUID for each revision (that's right, the dumps will contain revision histories starting 2011) and GravatarHash for each user.

LastEditorDisplayName isn't in the non-public table unless the user has been deleted (and so LastEditorUserId doesn't exist). Looking at the SO public dumps (at least the Stack Apps dump, since that's the smallest one to deal with), it looks like LastEditorDisplayName is always given, but is always empty!

Scott Morrison comments on "MO public database dumps" (11799) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=11799#Comment_11799 2010-12-12T17:56:39-08:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Re: my post above on Dec 4th.

I see you've removed the reference to LastEditorDisplayName in the README.txt. Can't we instead include all of LastActivityDate, LastActivityUserId and LastActivityDisplayName in the public dumps? Surely this is public data too!

Scott Morrison comments on "MO public database dumps" (11798) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=11798#Comment_11798 2010-12-12T17:53:50-08:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Two requests for the public dumps:

  1. Include the GUID for each revision, in posthistory.xml. This is public information, contained in the URL for "view source" on the list of revisions.
  2. Include the Gravatar hash: take the user's email address or last login IP address, as a string, and compute the MD5 hash.
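
The second request can be sketched in a few lines of Python. This is only an illustration, not the dump script itself: the function name `gravatar_hash` is hypothetical, and trimming and lowercasing the input follows Gravatar's usual convention for email addresses, which the thread does not confirm the dump script applies.

```python
import hashlib


def gravatar_hash(identifier: str) -> str:
    """Return the hex MD5 digest of an email address or IP address string."""
    # Assumption: normalize the way Gravatar does for emails
    # (strip surrounding whitespace, lowercase) before hashing.
    normalized = identifier.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Placeholder address, not a real user.
    print(gravatar_hash("someone@example.com"))
```

The digest is always 32 hexadecimal characters, so it can be stored as a fixed-width field next to each user record.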
Scott Morrison comments on "MO public database dumps" (11257) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=11257#Comment_11257 2010-12-04T12:24:58-08:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ The README.txt for the dumps says that posts.xml should contain a field "LastEditorDisplayName". It doesn't seem to.

Anton Geraschenko comments on "MO public database dumps" (11178) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=11178#Comment_11178 2010-12-01T15:03:38-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ I've just posted the December dump.

Anton Geraschenko comments on "MO public database dumps" (10153) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=10153#Comment_10153 2010-11-03T16:45:07-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ I've just posted the November dump.

Anton Geraschenko comments on "MO public database dumps" (9593) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=9593#Comment_9593 2010-10-14T22:29:55-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ No reason to censor; it just takes a bit of work to include. I've been meaning to do it for a while, but I haven't gotten around to it yet. When I originally wrote the code to generate public dumps, I modeled it after the SO public dumps, which don't include post histories for some reason.

There are two tables in the database. One contains the posts in their current state in html (for fast serving). The other contains an entry for every edit, retag, or other action that can be taken on a post; notably, it contains the markdown source. It shouldn't be too bad to include another file, posthistories.xml, in the public dump. It would make the dump perhaps 50% larger.
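
For readers who want to work with such a history file, here is a minimal Python sketch of grouping revisions by post. Everything about the layout is an assumption: the `<row .../>` structure is borrowed from the style of the SO public dumps, and the attribute names `PostId`, `CreationDate`, and `Text`, the sample data, and the function name `revisions_by_post` are hypothetical. Check the readme shipped with a dump for the real field names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample; a real file would be read from disk instead.
SAMPLE = """<posthistory>
  <row Id="1" PostId="10" CreationDate="2010-10-01T12:00:00" Text="first draft" />
  <row Id="2" PostId="10" CreationDate="2010-10-02T09:30:00" Text="second draft" />
  <row Id="3" PostId="11" CreationDate="2010-10-03T08:00:00" Text="another post" />
</posthistory>"""


def revisions_by_post(xml_text):
    """Map each post id to its revisions as (date, markdown) pairs, oldest first."""
    history = defaultdict(list)
    for row in ET.fromstring(xml_text).iter("row"):
        history[int(row.get("PostId"))].append(
            (row.get("CreationDate"), row.get("Text"))
        )
    for revisions in history.values():
        # ISO-style timestamps sort chronologically as plain strings.
        revisions.sort()
    return dict(history)


if __name__ == "__main__":
    for post_id, revisions in revisions_by_post(SAMPLE).items():
        print(post_id, len(revisions), "revision(s)")
```

The last pair in each list is the most recent markdown source for that post, which is the piece the html-only posts table does not carry.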

Scott Morrison comments on "MO public database dumps" (9592) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=9592#Comment_9592 2010-10-14T20:40:52-07:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Anton didn't announce it, but he put up a dump for October, too.

A question about the database dumps -- they don't include the edit history of posts. I assume, @Anton, that you have these in the full dump? Was there some reason to censor these?

Bill Dubuque comments on "MO public database dumps" (7985) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=7985#Comment_7985 2010-08-03T19:10:45-07:00 2018-11-04T12:58:47-08:00 Bill Dubuque http://mathoverflow.tqft.net/account/301/ Thanks much Anton.

Anton Geraschenko comments on "MO public database dumps" (7984) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=7984#Comment_7984 2010-08-03T19:04:57-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ I've just posted a fresh database dump. By popular demand (somebody asked for it), I've included the hour each vote was cast. Before it only included the day. This should be enough to do some rough analysis of voting habits. If you want aggregate statistics finer than that, you'll have to ask me to look at the full dump.
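
As a rough illustration of the kind of analysis hour-level timestamps allow, the Python sketch below counts votes per hour of day. The file layout, the `CreationDate` attribute name and its timestamp format, the sample rows, and the function name `votes_per_hour` are all assumptions made for the example; the dump's readme is the place to confirm the actual fields.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime

# Hypothetical sample rows, truncated to the hour as described above.
SAMPLE_VOTES = """<votes>
  <row Id="1" PostId="10" CreationDate="2010-08-01T14:00:00" />
  <row Id="2" PostId="10" CreationDate="2010-08-01T14:00:00" />
  <row Id="3" PostId="11" CreationDate="2010-08-02T03:00:00" />
</votes>"""


def votes_per_hour(xml_text):
    """Count votes by the hour of day (0-23) at which they were cast."""
    counts = Counter()
    for row in ET.fromstring(xml_text).iter("row"):
        stamp = datetime.strptime(row.get("CreationDate"), "%Y-%m-%dT%H:%M:%S")
        counts[stamp.hour] += 1
    return counts


if __name__ == "__main__":
    for hour, count in sorted(votes_per_hour(SAMPLE_VOTES).items()):
        print(f"{hour:02d}:00  {count}")
```

Anything finer than the hour is not recoverable from the public file, which matches the limit described in the comment.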

Kevin Buzzard comments on "MO public database dumps" (6732) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=6732#Comment_6732 2010-07-06T23:44:08-07:00 2018-11-04T12:58:47-08:00 Kevin Buzzard http://mathoverflow.tqft.net/account/65/ Quick and hopefully easy question, I thought that a recent question on MO was a duplicate and I couldn't find it using MO search, so I turned to the dumps and I found what I was looking for using ...
EDIT: I have solved this myself. The trick I found is to go to http://mathoverflow.net/questions/5703 and you get redirected to the right place.
Andrey Rekalo comments on "MO public database dumps" (6681) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=6681#Comment_6681 2010-07-03T13:52:03-07:00 2018-11-04T12:58:47-08:00 Andrey Rekalo http://mathoverflow.tqft.net/account/267/ Thank you. This is very useful.

Anton Geraschenko comments on "MO public database dumps" (6632) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=6632#Comment_6632 2010-07-01T16:54:59-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Early 4th of July present: a fresh database dump.

Scott Morrison comments on "MO public database dumps" (5697) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=5697#Comment_5697 2010-06-01T08:55:39-07:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Oh dear, is it June already? :-)

Anton Geraschenko comments on "MO public database dumps" (5695) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=5695#Comment_5695 2010-06-01T08:32:32-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Happy June. I just posted a fresh database dump.

Scott Morrison comments on "MO public database dumps" (5231) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=5231#Comment_5231 2010-04-29T15:37:02-07:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Fixed. Copy and paste error, I presume.

Harald Hanche-Olsen comments on "MO public database dumps" (5228) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=5228#Comment_5228 2010-04-29T15:26:58-07:00 2018-11-04T12:58:47-08:00 Harald Hanche-Olsen http://mathoverflow.tqft.net/account/18/ Why does the zip file have the same filename as the previous dump?

Anton Geraschenko comments on "MO public database dumps" (5216) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=5216#Comment_5216 2010-04-29T12:18:24-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ I just posted a fresh database dump.

Scott Morrison comments on "MO public database dumps" (4441) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=4441#Comment_4441 2010-04-05T10:32:29-07:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Okay, I agree that the benefits outweigh the slight, arguable, violation of the "only public data" rule.

Anton Geraschenko comments on "MO public database dumps" (4326) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=4326#Comment_4326 2010-04-02T14:11:08-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Yes, it does mean that there is some information in the dump which is not available through the website: votes to close have user ids even if they are not "effective". Non-effective votes to close occur in two ways: either not enough votes to close have accumulated on a question (e.g. there are four such votes right now), or the vote to close has expired (votes to close only have a lifespan of four days).

When you vote to close (or reopen), you're volunteering to associate your name to that vote (after all, your name is displayed once the question is actually closed/reopened), so I don't feel like anybody can reasonably object that they meant to keep their identity private when voting to close. The main reason I think it's good to not display who has voted to close a question which is not yet closed is that people vote differently when they know who is "on their side". Not displaying who has voted to close is a way of getting people to take personal responsibility for their vote to close. The dump contains so little information that undermines this purpose that I don't think it's a problem. I'm happy to change it back if somebody has a good reason to do so.

Scott Morrison comments on "MO public database dumps" (4325) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=4325#Comment_4325 2010-04-02T11:13:24-07:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Just a sec: doesn't this make "votes to close" publicly identifiable before the question actually gets closed?

Anton Geraschenko comments on "MO public database dumps" (4321) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=4321#Comment_4321 2010-04-01T22:09:54-07:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ The April database dump is now available, fresh off the server. I've corrected/changed a couple of things in the script I use to produce the public dump. Specifically, vote tallies are now included in the users.xml file (in the last dump, I claimed in the readme that they were included, but they weren't actually). Also the votes.xml file now includes the UserId when somebody votes to close or reopen a question.

Anton Geraschenko comments on "MO public database dumps" (3654) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3654#Comment_3654 2010-03-04T16:32:01-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ You can now access the dump (or any part of it) at dumps.mathoverflow.net.

Harry Gindi comments on "MO public database dumps" (3643) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3643#Comment_3643 2010-03-04T00:11:21-08:00 2018-11-04T12:58:47-08:00 Harry Gindi http://mathoverflow.tqft.net/account/55/ That was Anton, before I suggested putting it on a free hosting site.

Scott Morrison comments on "MO public database dumps" (3642) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3642#Comment_3642 2010-03-04T00:05:52-08:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ Also, someone has already put up a torrent, for example at http://thepiratebay.org/torrent/5408662/.

Scott Morrison comments on "MO public database dumps" (3641) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3641#Comment_3641 2010-03-03T23:51:35-08:00 2018-11-04T12:58:47-08:00 Scott Morrison http://mathoverflow.tqft.net/account/3/ It's now available at http://dumps.tqft.net/.

Soon after Anton reads this, you can access dumps at http://dumps.mathoverflow.net/. (Anton, do the usual! I'll give you shell access as well.)

Anton Geraschenko comments on "MO public database dumps" (3638) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3638#Comment_3638 2010-03-03T23:20:04-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ You're right. I'll just do that. Edit: done.

Harry Gindi comments on "MO public database dumps" (3636) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3636#Comment_3636 2010-03-03T23:10:44-08:00 2018-11-04T12:58:47-08:00 Harry Gindi http://mathoverflow.tqft.net/account/55/ =(. The database dump is 9.6 MB. Why don't you rapidshare/megaupload/etc it?

Anton Geraschenko comments on "MO public database dumps" (3634) http://mathoverflow.tqft.net/discussion/266/mo-public-database-dumps/?Focus=3634#Comment_3634 2010-03-03T23:02:55-08:00 2018-11-04T12:58:47-08:00 Anton Geraschenko http://mathoverflow.tqft.net/account/2/ Here's the first public dump of the database. It's from earlier today.

http://dumps.mathoverflow.net/, or
http://ifile.it/soyqa09/MOdump20100303.zip
