Reaching into the worlds of economics and statistics, I'd like to share a way to measure the health of online communities:
This all started sometime around 2001 -- I originally heard of the Gini coefficient freshman year in college during one of those massive lecture courses of Economics 101. From Wikipedia:
The Gini coefficient is a measure of inequality of income distribution or inequality of wealth distribution. It is defined as a ratio with values between 0 and 1: 0 corresponds to perfect equality (e.g. everyone has the same income) and 1 corresponds to perfect inequality (e.g. one person has all the income, while everyone else has zero income).
It bounced back into my brain one day a while back as I overheard someone lamenting the 90-9-1 rule of online participation: that 90% of your users will be "lurkers," those who read but don't contribute, 9% will contribute only occasionally, and 1% of your entire user base will make up the bulk of the total participation in your community.
Some people like to use the 90-9-1 rule to boo-hoo any attempt at building an online community, some like to do a little math and say "hey, 1% of my total user base is still a big number if they really do become outspoken evangelists" -- but everyone is always looking for a way to break the rule and encourage widespread participation.
But how do we create a metric that allows us to track the ROI of our efforts to increase participation? We can build our own Gini-like metric ....
WARNING: this is a long one but if you stick with me, I bet you're going to start thinking about measuring online communities in a different way.
In most communities, I encourage point systems driven by participation -- leave a comment, get a point; write a blog post, get a point. Sometimes certain activities are worth more points (be careful when doing this), and the community itself always has an effect on the total score: for instance, write a defamatory comment, get negative points from other users, and your total score drops. Another choice we often have to make is whether to make the score visible to the community. A visible score almost always encourages competition between users, which is perfect in some communities and can lead to negative behaviors in others. Digg, for instance, used a visible participation score, and it led to the top users wielding too much influence over the entire community -- which fostered a drop in the quality of the content.
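For the curious, here's a rough sketch of how a tally like that could be automated. The action names and point values are placeholders I made up for illustration, not a recommendation for any particular community.

```python
from collections import defaultdict

# Hypothetical point values: most actions earn a point, and a negative
# rating from another user takes one away.
POINTS = {
    "comment": 1,
    "blog_post": 1,
    "downvote_received": -1,
}

def tally_points(events):
    """Sum points per user from a list of (user, action) pairs."""
    scores = defaultdict(int)
    for user, action in events:
        scores[user] += POINTS[action]
    return dict(scores)

# Made-up activity log.
events = [
    ("alice", "comment"),
    ("alice", "blog_post"),
    ("bob", "comment"),
    ("bob", "downvote_received"),
]
scores = tally_points(events)
print(scores)  # {'alice': 2, 'bob': 0}
```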
Regardless of how visible we make the score, we, as the community organizers, can use it in all manner of ways. In this example, we can use the score to compare the participation of users across the entire community to determine the distribution of participation and build a dynamic metric we can track over time -- just like economists use the Gini coefficient to measure income distribution.
In statistics, what we're looking for is called statistical dispersion -- how far data elements fall from each other or from a mean value. In our case, in a perfectly distributed community every member would have the same number of participation points: the total community points divided by the number of members.
The perfectly distributed community would look like:
User 1: 500 points, User 2: 500 points, User 3: 500 points and so on... Everyone is participating equally.
But we know that's not how it looks in real communities. We're much more likely to see:
User 1: 0 points, User 2: 0 points, User 3: 5 points, User 4: 500 points... Participation is very unequally dispersed.
And we also know that as participation grows less equal, we see new entrants into the community drop off more quickly and even older members fade away. As good community managers, we look out for this type of activity, but it would be extremely beneficial to have a dashboard of quantitative data to back up our qualitative assumptions.
To solve this, in short, I start by running a calculation across all users to find the average deviation, also known as the mean absolute deviation: the average distance of each user's score from the community mean (the ideal score). Once I know this, I divide the average deviation by the mean and multiply by 100%, which gives us the deviation as a percentage of the mean. This works like the coefficient of variation, except that the textbook version uses the standard deviation in place of the average deviation. Understand? Good, cause I just confused myself.
Ok, I'll show my work!
Let's start with a community:
| community 1 | points |
| User 1 | 50 |
| User 2 | 4 |
| User 3 | 6 |
| User 4 | 18 |
Total points: 78. Mean (or perfect score): 19.5 points.
The average deviation of the group is 15.25: the scores sit 30.5, 15.5, 13.5, and 1.5 points from the mean, and (30.5 + 15.5 + 13.5 + 1.5) / 4 = 15.25. On average, each score is 15.25 points away from the mean.
Dividing the average deviation by the mean, 15.25/19.5 x 100% = 78.21% -- which means the average deviation is 78.21% of the mean -- or, the participation in this community is largely unequal.
Unequal compared to what? I'm glad you asked!
Let's look at another community:
| community 2 | points |
| User 1 | 8 |
| User 2 | 7 |
| User 3 | 9 |
| User 4 | 10 |
Total points: 34. Mean (or perfect score): 8.5 points.
The average deviation of the group is 1: the scores sit 0.5, 1.5, 0.5, and 1.5 points from the mean, and (0.5 + 1.5 + 0.5 + 1.5) / 4 = 1. On average, each score is 1 point away from the mean.
Dividing the average deviation by the mean, 1/8.5 x 100% = 11.76% -- which means the average deviation is 11.76% of the mean -- or the participation in this community is more equally distributed than community 1.
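If you'd rather let a script do the arithmetic, here's a minimal sketch of the calculation for the two example communities. The function name is my own invention, and treating a community with zero total points as perfectly equal is an assumption I'm making to avoid dividing by zero.

```python
from statistics import mean

def participation_inequality(points):
    """Average absolute deviation from the mean, as a percentage of the mean."""
    avg = mean(points)
    if avg == 0:
        return 0.0  # assumption: nobody has points, so call it equal
    avg_deviation = mean(abs(p - avg) for p in points)
    return avg_deviation / avg * 100

community_1 = [50, 4, 6, 18]
community_2 = [8, 7, 9, 10]

for name, points in [("community 1", community_1), ("community 2", community_2)]:
    print(f"{name}: {participation_inequality(points):.2f}%")
# community 1: 78.21%
# community 2: 11.76%
```

A perfectly equal community, such as three users with 500 points each, comes out at 0%.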
How can we use this?
Each period, we can track the change in our coefficient to see whether participation in the community has grown more or less equally distributed, and by how much. We shouldn't use this metric by itself, of course. It's also necessary to see the overall growth of participation -- by total number of points -- which we can also segment by the user types or buying segments we've already constructed.
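Here's a rough sketch of what that period-over-period tracking might look like. The snapshots are made-up numbers (only the first one reuses community 1 from above), and the function names are hypothetical.

```python
from statistics import mean

def participation_inequality(points):
    """Average absolute deviation from the mean, as a percentage of the mean."""
    avg = mean(points)
    if avg == 0:
        return 0.0  # assumption: no points at all counts as equal
    return mean(abs(p - avg) for p in points) / avg * 100

# Made-up snapshots: each member's point total at the end of a period.
snapshots = {
    "month 1": [50, 4, 6, 18],
    "month 2": [55, 10, 12, 25, 3],
    "month 3": [58, 20, 22, 30, 15],
}

def track(snapshots):
    """Return (period, total points, inequality %, change vs. prior period)."""
    rows = []
    previous = None
    for period, points in snapshots.items():
        score = participation_inequality(points)
        change = None if previous is None else score - previous
        rows.append((period, sum(points), score, change))
        previous = score
    return rows

for period, total, score, change in track(snapshots):
    delta = "n/a" if change is None else f"{change:+.2f}"
    print(f"{period}: total={total} inequality={score:.2f}% change={delta}")
```

Reporting total points next to the percentage keeps the two readings together, so a falling inequality score that comes from everyone going quiet doesn't get mistaken for good news.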
Imagine now as you deploy an online community, you can track the distribution of participation from the very start -- and as you see more users register on the site and as you attempt to push more of them to contribute more often -- you now have a metric ready at your side to measure the effectiveness of each new campaign.
* Just an FYI -- I'm no statistician, and I built this model on my own. One reason I'm putting it out there is genuinely to share information, but I'd also love to kick-start a conversation about measuring the equality of participation. Give it a thought.
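For anyone who wants to line this home-grown metric up against the real thing, here's a sketch of the Gini coefficient itself, using one common textbook form: the mean absolute difference between every pair of members, divided by twice the mean. This is not the calculation used in the post above, just a point of comparison, and the function name is my own.

```python
def gini(points):
    """Gini coefficient via the mean absolute difference between all pairs."""
    n = len(points)
    total = sum(points)
    if n == 0 or total == 0:
        return 0.0  # assumption: an empty or zero-point community counts as equal
    pairwise = sum(abs(a - b) for a in points for b in points)
    # sum of |a - b| over all pairs / (2 * n^2 * mean) == pairwise / (2 * n * total)
    return pairwise / (2 * n * total)

print(f"community 1: {gini([50, 4, 6, 18]):.3f}")
print(f"community 2: {gini([8, 7, 9, 10]):.3f}")
# With only four members, one user holding every point gives 0.75, not 1:
# this form tops out at (n - 1) / n for a group of size n.
print(f"one user has it all: {gini([0, 0, 0, 100]):.3f}")
```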




Finding ways to compare participation in social networks is never easy. As you rightly pointed out, many people count their comments and then call it a day. I think you've taken a statistical device and used it nicely to figure out how to compare, apples-to-apples, across multiple sites.
In practice, how do we get this to work? Do you envision many sites releasing their data? Lots of us say we are transparent but are we really?
Posted by: Darren Herman | November 08, 2007 at 01:46 PM
Hi there..
Great input to the overall debate on how to measure Social Networks and I simply had to forward it to members of our professional services team for inspiration. Have a good one..
N.B.
I previously added some input on the matter as well:
http://visualrevenue.com/blog/2007/08/online-social-network-participation.html
Cheers mate
Dennis R. Mortensen, COO at IndexTools
http://visualrevenue.com/blog
Posted by: Dennis R. Mortensen | October 28, 2007 at 09:02 AM
Hi Bud,
I just wanted you to know that I've been weaving my blogs into each other and even created two one-minute poems to highlight Inconnue's book and these things have proven very fruitful. Best of all is that Ninaalvarez.net has started to become a much more active forum for poetry and discussion and I have a good number of readers who are interacting and celebrating poetry. I just held a small contest "Send me a poem/I'll send you a book" where I had readers write in with their favorite poem and why they loved it and in return I send them a free copy of '4x1'. It added so much life to the site and was incredibly gratifying. I wrote a blog about Web 2.0 and my excitement about it and linked back to your blog.
Anyway, thanks for the inspiration.
Posted by: Nina Alvarez | August 22, 2007 at 12:14 PM
Wow, memories of my class in Psych Stats are flowing back to me! I found you via your post on Bokardo and saved you to my RSS feed, so it pays to put out the word :-)
I am thinking that this level of detailed analysis is the kind that you will see as an organization/site starts to mature/grow or at Fortune 500 type of companies. Not to say that is not useful but it is devoid of passion.
For startups and smaller companies who are in touch with their communities it will be more like your posted comment. It reminds me of the Community Next conference I was at where the founders of Threadless demoed their awesomeness metric. To them it was about keeping their passion alive and well and authentic in their community. In the end that is what is most important, and I have noticed it is a key ingredient for successful online communities. However, without some type of metrics all you have is your gut. This is critical for a successful entrepreneur, but it does not fly as well in corporate.
IMHO, I think this is one of the reasons big companies have a hard time launching communities since it is soooo hard to tap into that startup passion in a corporate environment.
I think your post is excellent and I would be happy to continue dialog if you care to continue this as a series as it is an area I am very interested in.
Posted by: Robert Franklin | August 21, 2007 at 09:29 PM
I think for the sake of time, it's all pretty much automatic -- and again, you let the community decide if a comment or post was uniquely valuable by providing rating systems for just about everything.
But as we keep track of other participation metrics, say 'top contributing users' for a month or week, that gives us a list of users we could follow up with one-on-one to discuss the health and value of the community. I think we let the community bubble up opportunities for a greater depth of interaction. Think of an off-line event, like a conference, perhaps you invite your top users for a free dinner or networking reception to get offline feedback or encourage greater evangelism.
Posted by: Bud Caddell | August 20, 2007 at 02:03 PM
Wow. My head is spinning. I need to think about that metric for a little while before it will sink in.
In the meantime, I'm interested in learning more about how you manage/assign/track individual user points within a community. I'm assuming you can build some code that tallies points automatically and is triggered by comment submission, etc., but isn't some of that subjective, and in that case, how much should be done manually?
Posted by: Jill | August 20, 2007 at 01:57 PM