  1.  

    I've been enjoying looking at the MO usage statistics posted on meta. They give loads of interesting info but, like all such statistics, they raise more questions than they answer. The scientific response is to design experiments to test hypothetical answers to these questions. For example, one such experiment was proposed (half seriously) by David Speyer in the competitiveness thread. This makes me wonder... Is it too early to experiment with MO? What would be reasonable experiments for MO?

  2.  

    I think more and more experiments should be conducted.

    • CommentAuthorMariano
    • CommentTimeApr 9th 2010
     

    "More and more experiments"? I honestly have no idea about what you have in mind!

    • CommentAuthorRegenbogen
    • CommentTimeApr 9th 2010 edited
     

    Oh of course there are lots. I refrained from giving examples because I didn't want to appear cheeky.

    Now since you want one, here is an example. I still believe reputation hunting is the cause of a lot of evils. Reputation could be made less prominent: for a couple of weeks, one could hide it everywhere except the user page and see what the public reaction is.

    Realistically, I do not expect this to be implemented, given how fond people are of reputation, as shown in the thread mentioned by the OP.

  3.  

    Let's clarify the meaning of the word "experiment". The only indication fgdorais gave of what an experiment might be is the one David Speyer suggested. The one suggested by Regenbogen is one that I would have to do. Either one of these should be worked out more carefully before somebody actually tries it. Remember that the purpose of an experiment is to find something out. Putting a frog in the microwave is not an experiment, it's just animal cruelty.

    If we're going to think about running any sort of experiment, we have to be able to answer the following questions:

    1. What is this experiment supposed to determine? What do we want to know?
    2. How are we going to read the outcome of the experiment to determine the answer to (1)?

    The experiment suggested by David Speyer was aimed at measuring how much the reputation of the poster determines how much people vote on the post. I see two problems with the experimental procedure. First, I think the name (i.e. real life reputation) makes more of a difference than points, and the experiment doesn't separate these factors. Second, you'd need a pretty dedicated experimenter because there's a lot of variation in how much the posts of any given person get voted on; it would take a lot of data to say anything statistically significant.

    As for the experiment Regenbogen suggested, I have no idea how to answer question (2), and I'm not even sure I can answer question (1).

    I should also point out that the best way to get answers to most questions is to simply look at the large amount of data we've already collected. We should only undertake the task of performing an experiment if there's something we really want to know the answer to that we don't have any other way of getting data about.

  4.  

    Ah! Perhaps there is something in the database dump; one could perhaps determine right away if and how voting patterns correspond to reputation. I am, however, incapable of operating databases, so such an attempt would be beyond me.
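
    For anyone who can, here is a minimal sketch of how one might check this against the public dump, assuming the usual Stack Exchange dump layout (users.xml and posts.xml as flat lists of <row .../> elements; the file and attribute names below are the standard ones, but check them against the actual dump):

        # Sketch: does a post's score track its author's reputation?
        # Assumes the standard public-dump layout: users.xml and posts.xml,
        # each a flat list of <row .../> elements.
        import math
        import xml.etree.ElementTree as ET

        # Map user id -> current reputation.
        reputation = {row.get("Id"): int(row.get("Reputation", "1"))
                      for row in ET.parse("users.xml").getroot()}

        # Collect (author reputation, post score) pairs.
        pairs = []
        for row in ET.parse("posts.xml").getroot():
            owner = row.get("OwnerUserId")
            if owner in reputation:
                pairs.append((reputation[owner], int(row.get("Score", "0"))))

        # Pearson correlation, computed by hand to avoid dependencies.
        n = len(pairs)
        mean_rep = sum(r for r, _ in pairs) / float(n)
        mean_score = sum(s for _, s in pairs) / float(n)
        cov = sum((r - mean_rep) * (s - mean_score) for r, s in pairs)
        var_rep = sum((r - mean_rep) ** 2 for r, _ in pairs)
        var_score = sum((s - mean_score) ** 2 for _, s in pairs)
        print("%d posts, correlation %.3f" % (n, cov / math.sqrt(var_rep * var_score)))

    One caveat: current reputation is partly a consequence of the votes being counted, so a positive correlation here would not by itself show that reputation attracts votes.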

  5.  

    Thanks for clarifying, Anton; you are right on the money about the second part of my question. I think giving "David's experiment" as an example pointed my question in the wrong direction, so I'll rephrase...

    1. Is it too early to ask questions about MO usage? MO is still evolving, though at a much slower pace than a few months ago (some graphs here). It seems like a good idea to try to understand the raw data better, but it might still be too early to ask sensible questions.

    2. Some of these questions might not be testable with reasonable experiments, but some might. What would be interesting experiments that could be carried out? I do mean sensible and well-designed experiments. (We are mathematicians; this is well within our collective skill set...) There are plenty of simple experiments that could be done with existing data, or by collecting new data through polls and surveys, for example.

    • CommentAuthorgrp
    • CommentTimeApr 9th 2010
     
    Two points.

    1. If someone who appears to be closely associated with MathOverflow starts doing experiments, and the community sees it, some of the community will feel like guinea pigs (unwilling test subjects) and will not participate, no matter how innocuous the experiment. (I assume the experimenter does not do anything like offer free beer or other enticements to participate.) This could skew the results. Also, if some popup window appears while browsing the site, that will turn off many users. I recommend against an MO-sponsored experiment that involves any significant form of community interaction.

    2. If someone who is interested in a valid statistical question posts it on Math Overflow, they can also post a link to a site which explains the experiment, solicits test subjects, and provides whatever else is needed. If they also post that this is NOT MO-sponsored, but that they are doing it apart from Math Overflow and are willing to share the results, etc., that should get a favourable, if self-selected, test group. I recommend that anyone who is enthusiastic about polling the community do it this way.

    Why shouldn't MO do it itself? MO is growing, and MO is about answering specific questions. It is not an experimental laboratory or an agar culture. Harvesting collected data and doing data mining is OK as long as it is done in accordance with whatever privacy policy MO has, but I feel uncomfortable letting just anyone do the data mining.

    (As an aside, this is one loss of privacy that I perceive as being a registered user. Someone will record my activity and use the data for purposes of which I do not approve. This is part of an answer to a question someone asked some time ago, about why I chose not to register. Never mind that there may be a similar loss with unregistered users as well.)

    Gerhard "Ask Me About System Design" Paseman, 2010.04.09
    • CommentAuthorgrp
    • CommentTimeApr 9th 2010
     
    It just occurred to me that there is another type of experiment: changing the web site to improve community usage. As long as that is the intent of the experiment (and anything messed up can be undone with no harm), I encourage gradual experimentation by the site administrators.

    Gerhard "Open wider, you can get another toe in your mouth" Paseman, 2010.04.09
  6.  

    Here are some examples of things I find of interest...

    • Response time. How are the time to first answer, time to first twice-upvoted answer, and time to accepted answer distributed? (Accepted answer can be done with existing data, as in the sketch after this list; I don't know how I would do the other two.)

    • User retention. How many users visit MO for a second time? How many stay active for a week, a month? (For the second, last_access_date - creation_date should be accurate enough; I'm not sure how to tackle the first.)

    • How is the question/answer ratio distributed among users? (Doesn't seem to be accessible through the dump data, but the relevant population size is small.)

    • User perception survey. (That would help answer some open questions from the competition thread. Would take some time and expertise to design.)
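
    For the accepted-answer part of the first item, a minimal sketch under the same dump-layout assumptions as above (questions carry PostTypeId="1" and, when resolved, an AcceptedAnswerId; every post has a CreationDate):

        # Sketch: distribution of time from question to accepted answer,
        # using only posts.xml from the public dump.
        import xml.etree.ElementTree as ET
        from datetime import datetime

        def when(row):
            # Timestamps look like "2010-04-09T12:34:56.789"; drop the
            # fractional seconds for robustness.
            return datetime.strptime(row.get("CreationDate")[:19],
                                     "%Y-%m-%dT%H:%M:%S")

        posts = {row.get("Id"): row for row in ET.parse("posts.xml").getroot()}

        delays = []  # hours from question to its accepted answer
        for row in posts.values():
            accepted = row.get("AcceptedAnswerId")
            if row.get("PostTypeId") == "1" and accepted in posts:
                delta = when(posts[accepted]) - when(row)
                delays.append(delta.total_seconds() / 3600.0)

        delays.sort()
        for q in (0.25, 0.5, 0.75, 0.9):
            print("%d%% accepted within %.1f hours"
                  % (q * 100, delays[int(q * len(delays))]))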

  7.  
    Other ideas requiring a moving average:

    What is the average number of upvotes per question and per answer over time? (Sketched below.)

    What are the percentages of questions and answers by arxiv subject tag over time? [Note that the ca.classical-analysis tag presents some interpretation issues here.]
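
    A minimal sketch of the first of these, under the same dump-layout assumptions as above. One caveat: the dump's Score field is the current net vote count, not the votes a post had at any earlier time, so bucketing by month posted is only a stand-in for a true moving average (a real one would need the vote dates in votes.xml):

        # Sketch: average score per question and per answer, by month posted.
        import xml.etree.ElementTree as ET
        from collections import defaultdict

        # month -> [question score sum, question count,
        #           answer score sum, answer count]
        buckets = defaultdict(lambda: [0, 0, 0, 0])

        for row in ET.parse("posts.xml").getroot():
            month = row.get("CreationDate", "")[:7]  # e.g. "2010-04"
            score = int(row.get("Score", "0"))
            b = buckets[month]
            if row.get("PostTypeId") == "1":
                b[0] += score; b[1] += 1
            elif row.get("PostTypeId") == "2":
                b[2] += score; b[3] += 1

        for month in sorted(buckets):
            qs, qn, ans, an = buckets[month]
            print("%s questions %.2f answers %.2f"
                  % (month, qs / float(qn or 1), ans / float(an or 1)))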
  8.  

    Harvesting collected data and doing data mining is OK as long as it is done in accordance with whatever privacy policy MO has, but I feel uncomfortable letting just anyone do the data mining.

    (As an aside, this is one loss of privacy that I perceive as being a registered user. Someone will record my activity and use the data for purposes of which I do not approve. This is part of an answer to a question someone asked some time ago, about why I chose not to register. Never mind that there may be a similar loss with unregistered users as well.)

    The public dump only contains information you would in principle be able to gather by browsing the site (almost ... it would be pretty hard to extract the dates of each vote by browsing the site, and those dates (but not exact times) are in the public dump). I have no intention of showing the full database dump to anybody unless there is a very good reason for doing so. Not even MO moderators have access to the full database dump. If somebody can convince me to do so (on a case by case basis), I'm willing to extract and make public some aggregate statistics from the full database dump which are impossible to extract from the public dump.

    In any case, there isn't really any difference in privacy between registered and unregistered users. The activities of unregistered users are also tracked and included in the database dump (including the public dump). Information about how users browse the site (as opposed to how they vote or post) is not included, even in the full dump. The only way I can get at those data is with Google Analytics. Again, this doesn't distinguish between registered and unregistered users.

    It just occurred to me that there is another type of experiment: changing the web site to improve community usage. As long as that is the intent of the experiment (and anything messed up can be undone with no harm), I encourage gradual experimentation by the site administrators.

    Yes, that's pretty much the only sort of actual experiment I would consider performing. Almost anything else I want to know, I can get from the data dump (i.e. I can get by observing).

  9.  
    Oh, yeah, and someone wondered if rep was power-law distributed a while back. It would be interesting to determine this.

    More generally, it would be interesting to analyze the weighted directed multigraph whose vertices are users and whose edges run from voter to votee, weighted +1 or -1 according to up or down votes. What is the degree distribution? etc. ...all that stuff that the physicists call "network theory" these days and that we recognize as applied (random) graph theory.
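
    A minimal sketch of the degree-distribution computation. Note that the public dump does not say who cast each vote, so this assumes a hypothetical votes_edges.csv of voter,votee,weight rows that only someone with the full database could produce:

        # Sketch: degree distribution of the voting multigraph.
        # votes_edges.csv is hypothetical: one "voter,votee,weight" row per
        # vote, with weight +1 or -1; it is NOT in the public dump.
        import csv
        from collections import Counter

        votes_cast = Counter()      # out-degree: votes cast by each user
        votes_received = Counter()  # in-degree: votes received by each user
        net_received = Counter()    # signed in-weight: net votes received

        with open("votes_edges.csv") as f:
            for voter, votee, weight in csv.reader(f):
                votes_cast[voter] += 1
                votes_received[votee] += 1
                net_received[votee] += int(weight)

        # How many users have each in-degree?  On a log-log plot, a power
        # law shows up as a roughly straight line; the same check applies
        # to the reputation values in users.xml.
        histogram = Counter(votes_received.values())
        for degree in sorted(histogram):
            print(degree, histogram[degree])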
  10.  

    I think the votes-on-questions graph and the votes-on-answers graph would be very different. In particular, the sinks in the latter graph would be very different from the sinks in the former graph.

  11.  

    Response time. How are the time to first answer, time to first twice-upvoted answer, and time to accepted answer distributed? (Accepted answer can be done with existing data; I don't know how I would do the other two.)

    The public dump contains exact times at which answers were posted, but only dates of votes (to prevent some kind of vote time correlation approach to guessing who cast what votes). So time to first answer is easy to extract, but time to first twice upvoted answer would have to be done by me.
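
    For instance, a minimal sketch of the time-to-first-answer extraction, under the usual assumption that answers carry a ParentId pointing at their question:

        # Sketch: time from each question to its first answer, from posts.xml.
        import xml.etree.ElementTree as ET
        from datetime import datetime

        def when(row):
            return datetime.strptime(row.get("CreationDate")[:19],
                                     "%Y-%m-%dT%H:%M:%S")

        asked = {}         # question id -> time asked
        first_answer = {}  # question id -> time of earliest answer
        for row in ET.parse("posts.xml").getroot():
            if row.get("PostTypeId") == "1":
                asked[row.get("Id")] = when(row)
            elif row.get("PostTypeId") == "2":
                q, t = row.get("ParentId"), when(row)
                if q not in first_answer or t < first_answer[q]:
                    first_answer[q] = t

        hours = sorted((first_answer[q] - asked[q]).total_seconds() / 3600.0
                       for q in first_answer if q in asked)
        print("median time to first answer: %.1f hours" % hours[len(hours) // 2])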

    User retention. How many users visit MO for a second time? How many stay active for a week, a month? (For the second, last_access_date - creation_date should be accurate enough; I'm not sure how to tackle the first.)

    According to Google Analytics, 28% of visits in the last month were the visitor's first visit. I'm not sure exactly how to interpret that. Users who visit the site multiple times contribute more visits, so the percentage of visitors who only visited the site once in the last month must be much higher than 28%.

    How is the question/answer ratio distributed among users? (Doesn't seem to be accessible through the dump data, but the relevant population size is small.)

    This information is absolutely in the public data dump. Questions are posts with PostTypeId="1" and answers are posts with PostTypeId="2". You (François) are user 2000, so I can extract the number of questions and answers you've posted with the commands

    grep 'OwnerUserId="2000"' posts.xml | grep -c 'PostTypeId="1"'
    grep 'OwnerUserId="2000"' posts.xml | grep -c 'PostTypeId="2"'
    

    User perception survey. (That would help answer some open questions from the competition thread. Would take some time and expertise to design.)

    Yes, I think this would take a lot of expertise to design and conduct, and I don't think it's worth it. I'm happy to go with the (possibly extremely skewed) feeling I can gather from that thread and other discussions here on meta.

  12.  

    Thanks for the tips, Anton!

    By the way, there is a meta.stackexchange request for better analytic tools (for moderators).

    • CommentAuthorMariano
    • CommentTimeApr 10th 2010
     

    («Putting a frog in the microwave is not an experiment, it's just animal cruelty.» that's going into my collection of quotes...)

    First, I think the name (i.e. real life reputation) makes more of a difference than points, and the experiment doesn't separate these factors.

    Indeed. I think no one looks at Terry Tao's reputation, and if Richard Stanley's reputation were reset to 1 each time he came to the site, I'd read his posts with the same attention...

    Maybe we should change the font on the user names? :P

  13.  

    After Anton's tips, I've been looking at the user retention data. I've plotted the data in multiple ways looking for a pattern to test; the most revealing has been graphing the use ratio (last_use_date - creation_date)/(collection_date - creation_date) against the age (collection_date - creation_date) of each user. This graph shows dominant clustering around ratios 0 and 1, with scattered points in between (i.e. most users either quit immediately or use the site regularly). The idea is to test the data against some model behavior, but nothing of the sort comes to mind. (This is not at all my area, so my knowledge base is limited.) Any thoughts on a good model for this?
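
    For concreteness, a minimal sketch of the computation described above, assuming the standard users.xml attributes (CreationDate, LastAccessDate) and taking the dump date (an assumption; use the actual date of your dump) as collection_date:

        # Sketch: use ratio vs. account age, from users.xml.
        import xml.etree.ElementTree as ET
        from datetime import datetime

        # Assumption: the date the public dump was taken.
        collection_date = datetime(2010, 4, 9)

        def day(row, attr):
            return datetime.strptime(row.get(attr)[:10], "%Y-%m-%d")

        points = []
        for row in ET.parse("users.xml").getroot():
            created = day(row, "CreationDate")
            last = day(row, "LastAccessDate")
            age = (collection_date - created).days
            if age > 0:
                points.append((age, (last - created).days / float(age)))

        # age in days vs. use ratio; redirect to a file and plot externally.
        for age, ratio in sorted(points):
            print(age, ratio)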

    PS: How do you post images on meta?

  14.  

    PS: How do you post images on meta?

    Use Markdown formatting and write something like

    ![alternate text](http://address.of/image.gif)
    

    See the syntax documentation for more incantations.