SoylentNews is people

posted by NCommander on Friday June 06 2014, @08:00PM
from the seeing-how-big-our-userbase-is dept.
So, right now, I'm sitting with mrcoolbp and martyb in meatspace working out the finer points of incorporation and the future needs of SoylentNews. One thing that has come up is that we really don't have a great idea of what our actual usage numbers are. Slashcode keeps decent internal statistics that give us rough numbers, but they're only really valid for logged-in users (who bypass the varnish cache), and we're not 100% sure they're accurate anyway. According to slash, we're averaging approximately 50-60k page views per day (I've included the statistics email below), but that doesn't tell us what AC usage looks like. According to varnish, we average roughly 400-500k connections per day, but that number is inflated since we're not using keep-alive or HTTP pipelining as of yet.

Furthermore, since we don't log IP addresses in access.log, and IPs run through Slash are turned into IPIDs, it's hard to get an idea of where our userbase is (the general feeling is that the vast majority of us are based in the United States, but even that is mostly inferred from our peak hours of traffic being between 4 and 10 PM EST). We've wanted to get a better idea of what our traffic and userbase are, so we're asking permission from the community to install piwik and embed its javascript tag in the footer of each page, which will give us a wealth of solid information to work from.

Our plan is to set up piwik on a separate server and have it available at stats.soylentnews.org, which can easily be blocked via a hosts file entry. Furthermore, piwik honors the Do-Not-Track header for all web browsers except IE10, allowing easy opt-out. I can understand that a lot of users have concerns about any tracking, but we're trying to be upfront and honest about this, so no one gets hugely surprised. While we might post general information that piwik generates (e.g., usage by country, user agents, etc.), we will purge IP addresses out of the piwik database as soon as we're able, to limit the amount of personal information we're keeping about any user. While we're running piwik, we'll have a persistent notification in the "Site News" slashbox that collection is ongoing, which will link to this post.
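
For anyone curious what the Do-Not-Track opt-out amounts to, here is a minimal sketch of the check a tracker could make before recording a visit. It is not piwik's actual code: the function name tracking_allowed and the plain dict of request headers are assumptions made for illustration, and the IE10 exception mentioned above is not modelled. The only fact it relies on is that a browser with Do-Not-Track enabled sends the request header "DNT: 1".

```python
def tracking_allowed(headers):
    """Return False when the request carries a Do-Not-Track opt-out (DNT: 1).

    `headers` is assumed to be a plain dict of HTTP request headers;
    this helper is hypothetical and is not part of piwik.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") != "1"


if __name__ == "__main__":
    # A browser with Do-Not-Track switched on sends "DNT: 1".
    print(tracking_allowed({"DNT": "1"}))
    # No DNT header at all means the user has not opted out this way.
    print(tracking_allowed({"User-Agent": "ExampleBrowser/1.0"}))
```

The hosts file route works independently of this: mapping stats.soylentnews.org to an unroutable address such as 0.0.0.0 in your local hosts file means the browser never reaches the stats server at all.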

I'd like to get this set up over the weekend and start collecting information by Sunday at the latest, then run collection for a few weeks. After that, we'll remove the tracking code, publish the results, and purge the piwik database of all personal information. We'll likely re-enable stat tracking periodically to get an idea of how we're doing, with a similar notification post going up beforehand to give people the chance to opt out before collection starts. Obviously, if the community feels dead-set against this, we'll abandon this plan and simply work with what little information we have available.

SoylentNews Stats for 2014-06-05

                   UIDs      IPIDs      Pages
        total:        -          -      57452 (1341.1 MB)
 static total:        -          -       3822
gstatic total:        -          -       5972
  grand total:      892       4549      59666 (1561.6 MB)
 secure total:        -          -          0
sbscrbr total:        -          -          0

        posts:      153        219
     comments:      437       1546      19402 (330.4 MB)
        index:      726       2319       9107
     articles:      683       2860       9889 (373.1 MB)
       search:       11         92        209 (5.7 MB)
     journals:       43         98        229 (6.2 MB)
        users:      109        161        593 (15.9 MB)
          rss:       46        362       2214 (220.6 MB)
        other:      217        700      18023 (173.3 MB)


     formkeys:      487 rows total
     comments:      573 posted yesterday
  submissions:       16 submissions
 sub/comments:     31.2% of the submissions came from comment posters from this day



    not found:     4769 pages sent with status 404 (not found)

   total hits: 140856136





------------------------
                            Yesterday   | 2 days ago | 3 days ago
    Avg Hits Per Article:          706.4|       690.1|       629.9
Avg Comments Per Article:           30.4|        32.1|        18.4



Pages From RSS By Section
------------------------------------------------
Section		         Pages     UIDS    IPIDS
           Main Page      2508       87      539



For Main Page
                  Pages      IPs   Bandwidth    Users
        total:    57452     4353   1341.1 MB      885
        index:     9107     2319    436.5 MB      726
     comments:    19402     1546    330.4 MB      437
     articles:     9889     2860    373.1 MB      683
       search:      209       92      5.7 MB       11
          rss:     2214      362    220.6 MB       46
        other:    18023      700    173.3 MB      885


-----------------------

Top stories viewed by article.pl:
   883 14/06/05/0025257 n1         First-Person Shooter Engine in
   789 14/06/04/2126226 n1         Apple CEO Says Users Buy an An
   708 14/06/05/0132243 n1         Seattle Approves $15 Minimum W
   617 14/06/05/0121251 n1         Tesla S Road Trip Report
   578 14/06/04/2131208 n1         Intel Wants Your Next PC to Ha
   468 14/06/05/1256249 Woods      Dwarf Fortress Update Coming N
   453 14/06/04/1343246 janrinok   ISPs Urged to Quarantine Infec
   332 14/06/05/1418207 martyb     Computer Programs Are People,
   328 14/06/05/133219  Woods      FBI Offers $10,000 Reward For
   261 14/06/04/1329216 janrinok   Underwater Sound Examined for
   259 14/06/05/1419254 janrinok   How to Spend $750 for One Minu
   252 14/06/05/133201  LaminatorX Apple to Allow Virtual Currenc
   230 14/06/04/1310212 janrinok   Pixar Releasing its 3D Renderi
   225 14/06/04/1337244 janrinok   Vincent van Gogh's Severed Ear
   217 14/06/05/1315234 LaminatorX High Brain Integration and Cre
   194 14/06/04/1212207 martyb     Google Trying Out End-to-End E
   178 14/06/04/111243  LaminatorX Domestic Terror Task Force is
   155 14/06/04/1315250 janrinok   Ambulance Drones Might Appear
   141 14/06/03/211257  n1         What's Lost as Handwriting Fad
   139 14/06/04/1059208 LaminatorX Learning to Eat Vegetables in
   125 14/06/03/2048227 n1         Battlestar Galactica Reboot
   122 14/06/04/0527240 LaminatorX Windows Start Menu Won't Retur
   118 14/06/05/149215  janrinok   British Recording Industry Thi




-----------------------

Top referers:
84  http://www.netvibes.com
67  http://feedly.com
61  http://www.google.co.uk
42  http://google.com
38  http://barrapunto.com
30  https://www.google.com
29  http://7rmath4ro2of2a42.onion
29  http://maps.google.com
27  http://www.newsblur.com
22  http://www.inoreader.com
19  http://www.protopage.com
15  http://t.co
14  http://li694-22.members.linode.com
14  http://sylnt.us
14  http://theoldreader.com
10  http://www.google.com
9  http://www.jaruzel.com
7  http://hager.pipedot.org
7  http://pi.local
6  http://www.igoogleportal.com
 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Insightful) by janrinok on Friday June 06 2014, @08:21PM

    by janrinok (52) Subscriber Badge on Friday June 06 2014, @08:21PM (#52378) Journal

    If you want to reach the majority of the community, I suggest that you pose the question when most of the community visit the site. The weekends are notorious for low numbers of submissions and page hits. And where I am it is already evening on Friday.

    However, I have no objections to the use of this method of data collection.

  • (Score: 2) by NCommander on Friday June 06 2014, @08:41PM

    by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Friday June 06 2014, @08:41PM (#52392) Homepage Journal

    And that is why it went out at 4PM on Friday, which is the start of peak hours during the week :-)

    --
    Still always moving
    • (Score: 2) by janrinok on Friday June 06 2014, @08:46PM

      by janrinok (52) Subscriber Badge on Friday June 06 2014, @08:46PM (#52395) Journal

      I'll bet it's not peak from outside the USA though - if you are trying to find out where people are, you could try putting it out to cover a time period when everyone gets to see it. The Europeans have already gone home for the weekend, while the USA is still winding down.

      • (Score: 3, Informative) by captain normal on Friday June 06 2014, @09:57PM

        by captain normal (2205) on Friday June 06 2014, @09:57PM (#52419)

        Plus it's already Sat. eve in New Zealand, Australia, Hong Kong, Japan...etc.

        --
        When life isn't going right, go left.
    • (Score: 1) by Ethanol-fueled on Saturday June 07 2014, @12:38AM

      by Ethanol-fueled (2792) on Saturday June 07 2014, @12:38AM (#52466) Homepage

      Peak drinking hours during the week.

      So don't be surprised if some people start posting obscenities and everybody else slobbers all over you and the other admins' nuts telling you all how you're the coolest people they've ever met.

    • (Score: 2) by lhsi on Monday June 09 2014, @08:50AM

      by lhsi (711) on Monday June 09 2014, @08:50AM (#53164) Journal

      Peak hours in the USA maybe...

      Do I need to allow something in Ghostery to let this work? It is crossed out by default. I'm going to miss the little "0" on the icon, it was a rare occurrence when browsing the net...

      • (Score: 2) by NCommander on Monday June 09 2014, @09:37AM

        by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Monday June 09 2014, @09:37AM (#53172) Homepage Journal

        For you to show up on the stats, you need to allow javascript or allow the tracking image to load (this is something I didn't realize when I posted the article; it WILL get non-JS users), and allow stats.soylentnews.org to work. I wouldn't worry about it too much.

        Out of curiosity, does it still say zero? The tracking code has been enabled since late Saturday.

        --
        Still always moving
        • (Score: 2) by lhsi on Monday June 09 2014, @10:05AM

          by lhsi (711) on Monday June 09 2014, @10:05AM (#53174) Journal

          It does not say 0 anymore - it says 1 (With "Piwik Analytics" crossed out). I think I have enabled it now (it still says 1 but it is no longer crossed out).

        • (Score: 2) by lhsi on Monday June 09 2014, @10:38AM

          by lhsi (711) on Monday June 09 2014, @10:38AM (#53176) Journal

          Have you made a change to the Piwik thing? Ghostery has gone back to "0" but I have a little shield in Chrome that says "This page includes a script from unauthenticated sources" with an option to "Load unsafe script".

          I get this if I click on "Learn more": https://support.google.com/chrome/answer/1342714?hl=en-GB [google.com]

          I am on the https SN, I don't know if that makes a difference.

          • (Score: 2) by NCommander on Monday June 09 2014, @02:03PM

            by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Monday June 09 2014, @02:03PM (#53221) Homepage Journal

            We had a problem where https access would get caught up due to a misconfiguration on nitrogen (the error manifested itself as "Kerberos Authentication" boxes, since it tried to access a resource on the staff slash). I forced piwik access to http only until we can get it resolved (we need a second IP, which linode just granted us). It will be restored later today; I've got new SSL certificates which will be installed in a few hours.

            --
            Still always moving