SoylentNews is people

posted by NCommander on Friday June 06 2014, @08:00PM
from the seeing-how-big-our-userbase-is dept.
So, right now, I'm sitting with mrcoolbp and martyb in meatspace working out the finer points of incorporation, and the future needs of SoylentNews. One thing that has come up is that we really don't have a great idea of what our actual usage numbers are. Slashcode has decent internal statistics which give us some rough numbers, but they're only really valid for logged-in users (who bypass the varnish cache), and we're not 100% sure they're accurate anyway. According to slash, we're averaging approximately 50-60k page views per day (I've included the statistics email below), but that doesn't help us in knowing what AC usage looks like. According to varnish, we average roughly 400-500k connections per day, but that number is inflated since we're not using keep-alive or HTTP pipelining as of yet.

Furthermore, since we don't log IP addresses in access.log, and IPs run through Slash are turned into IPIDs, it's hard to get an idea of where our userbase is (the general feeling is the vast majority of us are based in the United States, but even then, that's more because our peak hours of traffic are between 4 and 10 PM EST). We've wanted to get a better idea of what our traffic and userbase are, so we're asking permission from the community to install Piwik, and embed its javascript tag in the footer of each page, which will give us a broad range of solid information to work from.

Our plan is to set up Piwik on a separate server, and have it available at stats.soylentnews.org, which can easily be blocked via a hosts file entry. Furthermore, Piwik honors the Do-Not-Track header for all web browsers except IE10, allowing easy opt-out. I can understand that a lot of users have concerns about any tracking, but we're trying to be upfront and honest about this, so no one gets hugely surprised. While we might post general information (e.g., usage from countries, user agents, etc.) that Piwik generates, we will purge IP addresses out of the Piwik database as soon as we're able, to limit the amount of personal information we're keeping about any user. While we're running Piwik, we'll have a persistent notification in the "Site News" slashbox that collection is ongoing, which will link to this post.
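For anyone curious what honoring the Do-Not-Track opt-out mentioned above amounts to, here is a minimal sketch in Python. It is an illustration only, not Piwik's actual implementation: the function names and the TRACKING_TAG placeholder are hypothetical, and the only thing taken from the header convention itself is that a browser opts out by sending "DNT: 1".

```python
# Hypothetical placeholder; the real javascript tag would come from the
# analytics install and is not reproduced here.
TRACKING_TAG = "<!-- analytics tag would go here -->"


def tracking_allowed(headers):
    """Return False when the request carries a Do-Not-Track opt-out (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") != "1"


def footer_snippet(headers):
    """Return the tag to embed in the page footer, or nothing for opted-out users."""
    if not tracking_allowed(headers):
        return ""
    return TRACKING_TAG
```

A request with no DNT header, or with any value other than "1", is treated as not opted out in this sketch; a stricter policy could be swapped in by changing the comparison in tracking_allowed.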

I'd like to get this set up over the weekend, and start collecting information by Sunday at the latest, then run collection for a few weeks. After that, we'll remove the tracking code, publish the results, and purge the Piwik database of all personal information. We'll likely periodically re-enable stat tracking to get an idea of how we're doing, with a similar notification post going up before we do so to give people the chance to opt out before collection. Obviously, if the community feels dead-set against this, we'll abandon this plan, and simply work with what little information we have available.

SoylentNews Stats for 2014-06-05

                   UIDs      IPIDs      Pages
        total:        -          -      57452 (1341.1 MB)
 static total:        -          -       3822
gstatic total:        -          -       5972
  grand total:      892       4549      59666 (1561.6 MB)
 secure total:        -          -          0
sbscrbr total:        -          -          0

        posts:      153        219
     comments:      437       1546      19402 (330.4 MB)
        index:      726       2319       9107
     articles:      683       2860       9889 (373.1 MB)
       search:       11         92        209 (5.7 MB)
     journals:       43         98        229 (6.2 MB)
        users:      109        161        593 (15.9 MB)
          rss:       46        362       2214 (220.6 MB)
        other:      217        700      18023 (173.3 MB)


     formkeys:      487 rows total
     comments:      573 posted yesterday
  submissions:       16 submissions
 sub/comments:     31.2% of the submissions came from comment posters from this day



    not found:     4769 pages sent with status 404 (not found)

   total hits: 140856136





------------------------
                            Yesterday   | 2 days ago | 3 days ago
    Avg Hits Per Article:          706.4|       690.1|       629.9
Avg Comments Per Article:           30.4|        32.1|        18.4



Pages From RSS By Section
------------------------------------------------
Section		         Pages     UIDS    IPIDS
           Main Page      2508       87      539



For Main Page
                  Pages      IPs   Bandwidth    Users
        total:    57452     4353   1341.1 MB      885
        index:     9107     2319    436.5 MB      726
     comments:    19402     1546    330.4 MB      437
     articles:     9889     2860    373.1 MB      683
       search:      209       92      5.7 MB       11
          rss:     2214      362    220.6 MB       46
        other:    18023      700    173.3 MB      885


-----------------------

Top stories viewed by article.pl:
   883 14/06/05/0025257 n1         First-Person Shooter Engine in
   789 14/06/04/2126226 n1         Apple CEO Says Users Buy an An
   708 14/06/05/0132243 n1         Seattle Approves $15 Minimum W
   617 14/06/05/0121251 n1         Tesla S Road Trip Report
   578 14/06/04/2131208 n1         Intel Wants Your Next PC to Ha
   468 14/06/05/1256249 Woods      Dwarf Fortress Update Coming N
   453 14/06/04/1343246 janrinok   ISPs Urged to Quarantine Infec
   332 14/06/05/1418207 martyb     Computer Programs Are People,
   328 14/06/05/133219  Woods      FBI Offers $10,000 Reward For
   261 14/06/04/1329216 janrinok   Underwater Sound Examined for
   259 14/06/05/1419254 janrinok   How to Spend $750 for One Minu
   252 14/06/05/133201  LaminatorX Apple to Allow Virtual Currenc
   230 14/06/04/1310212 janrinok   Pixar Releasing its 3D Renderi
   225 14/06/04/1337244 janrinok   Vincent van Gogh's Severed Ear
   217 14/06/05/1315234 LaminatorX High Brain Integration and Cre
   194 14/06/04/1212207 martyb     Google Trying Out End-to-End E
   178 14/06/04/111243  LaminatorX Domestic Terror Task Force is
   155 14/06/04/1315250 janrinok   Ambulance Drones Might Appear
   141 14/06/03/211257  n1         What's Lost as Handwriting Fad
   139 14/06/04/1059208 LaminatorX Learning to Eat Vegetables in
   125 14/06/03/2048227 n1         Battlestar Galactica Reboot
   122 14/06/04/0527240 LaminatorX Windows Start Menu Won't Retur
   118 14/06/05/149215  janrinok   British Recording Industry Thi




-----------------------

Top referers:
84  http://www.netvibes.com
67  http://feedly.com
61  http://www.google.co.uk
42  http://google.com
38  http://barrapunto.com
30  https://www.google.com
29  http://7rmath4ro2of2a42.onion
29  http://maps.google.com
27  http://www.newsblur.com
22  http://www.inoreader.com
19  http://www.protopage.com
15  http://t.co
14  http://li694-22.members.linode.com
14  http://sylnt.us
14  http://theoldreader.com
10  http://www.google.com
9  http://www.jaruzel.com
7  http://hager.pipedot.org
7  http://pi.local
6  http://www.igoogleportal.com
 
This discussion has been archived. No new comments can be posted.
  • (Score: 2) by NCommander on Monday June 09 2014, @02:06PM

    by NCommander (2) <michael@casadevall.pro> on Monday June 09 2014, @02:06PM (#53223)

    I wouldn't worry about it too much. DNT is unfortunately all-or-nothing (NoScript actually forces it on all the time unless the plugin is physically uninstalled), so we're aware that it's somewhat of an issue. We're working on rough numbers, not a perfect/accurate count :-)

    --
    Still always moving