I'm definitely 100% against collecting information without telling anyone. For a variety of reasons, many of which have been mentioned already, and just a few of which are:
- Ren'Py is not a program, it is a tool to make programs, which means you're actually intending to make Ren'Py game developers complicit in your data-collection. I don't want to have to carefully read the Data Protection Act every time I distribute a Ren'Py game, and I seriously doubt you're well enough informed about data protection laws in every jurisdiction worldwide to be able to confidently say it would be perfectly legal for everyone.
- However harmless you think the data is, you're still holding people's data. As the old adage goes, take only what you need, and keep only what you absolutely definitely need. And in the case of Ren'Py, it doesn't need any of that information to perform its primary function, which is "running VNs".
- Precedent != justification. Starforce (or something like it, I forget which exactly) disabled my ability to back up my own data with my own CD writer ten years ago, just because I had the temerity to pay a publisher money for a videogame - that doesn't mean it was remotely justified then, and it doesn't mean that you'd be justified in doing the same thing today. It's an extreme example, sure, but the point is that "other people do it" isn't a good enough reason to think it's OK for you to do it.
- It will decrease the user's faith in the program, and probably in other programs made with the same engine. People, when taken as a whole, are not rational or objective enough to say "hey, look, it's only my OGL version and platform, it's no problem" - the reaction of a significant proportion of people who notice will be "Ren'Py is spyware", regardless of whether you've mentioned it in the readme or license. Worse, it'll also probably be "Developer X's game is spyware", which gives them a bad name for no good reason - they may not even be aware Ren'Py is doing this, you know that nobody reads the documentation unless they absolutely have to. Such a move is likely to put people off downloading Ren'Py-made games and possibly tarnish the name of innocent game-makers.
- Just because you aren't using it for evil doesn't mean other people won't. Ren'Py is open-source and pretty easy for a skilled developer to modify in a lot of ways, and if it did come to pass that people get used to the idea that Ren'Py games connect to the Internet and do mysterious data-exchange stuff, then this also makes people more willing to ignore Ren'Py games doing such things, leaving the possibility open for malicious people to suck up web-browsing history and personal info and random files stored on the user's hard disk and passwords accidentally typed into the Ren'Py window and whatever while the user is playing their cunningly-disguised Ren'Py-game-which-is-actually-spyware. Not very likely, I know, but if you make people used to the idea of saying "yes, ignore this traffic" on their firewall, that also opens the door for people to exploit it.
As it goes, you probably already know this, but since you brought it up - the Unity logging isn't so notable or similar as it might at first seem, because the user has to voluntarily send most of that data already just to be able to view web pages; the HTTP request your browser makes already sends a lot of information about the computer it's running on. The other reason it gets away with it is most likely simply because the user is already connecting to the Internet with the "Firefox.exe" application (or whatever) in order to get to the Unity web player in the first place, so their firewall quite probably doesn't even mention the new connection to the Unity servers from the same app... even a technically-aware user will have no idea that the transfer goes on unless they specifically look for it. That's simply not the case for a game which doesn't have to access the network at all for its normal mode of use.
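To make the point about HTTP requests concrete, here's a rough sketch of what a browser hands over with every page load. The request text below is a made-up sample I wrote for illustration (not captured traffic, and not anything Unity-specific), and `parse_headers` is just a hypothetical helper name:

```python
# Illustrative only: a hand-written sample of the kind of request a browser sends.
# The host and header values are assumptions, not real captured traffic.
SAMPLE_REQUEST = (
    "GET /player HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0) "
    "Gecko/20100101 Firefox/10.0\r\n"
    "Accept-Language: en-GB,en;q=0.5\r\n"
    "\r\n"
)


def parse_headers(raw):
    """Split a raw HTTP request into a dict of header name -> value."""
    lines = raw.split("\r\n")
    headers = {}
    for line in lines[1:]:  # skip the request line ("GET /player HTTP/1.1")
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


headers = parse_headers(SAMPLE_REQUEST)
for name, value in headers.items():
    print(name + ": " + value)
```

In this sample the OS family, the 32/64-bit situation, the browser version, and the user's preferred language are all visible before any plugin has done anything at all.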
What do people think about asking the user to send statistics on quit?
I'm thinking the checkbox could be checked by default, but the user would always have time to uncheck it before he or she quits. After it's changed, the setting would be remembered. (Perhaps even across multiple games.)
On one hand, asking the user explicitly on game-exit is a much better approach than just doing it without telling them.
On the other hand, I would strongly advise the following:
- Actually ask the user a direct question which is obviously different from the regular "do you want to quit yes/no?" question, so it's actually clear to the user what's going on and they can't accidentally click "yes, send all my information to some random guy" button when they really meant "yes, of course I want to quit, that's why I clicked the 'quit' button".
- (One way you can do the above is by popping a new obviously-different window up after the user has already hit 'Yes', and never doing this on their first use or two of the software, so they have a chance to get used to what the 'normal' quit flow is like.)
- Remember that a significant proportion of the user base doesn't have English as a first language, and even some English-speakers ignore bits of text that they don't think they absolutely have to read. That doesn't mean these people are going to be happy to find out that Ren'Py is sending their details into the aether.
- Give the developer an option to disable it. Make sure that this is prominent in the options.rpy, and preferably defaults to off. The developer should not be drawn into your plans unwittingly, and you can bet that a significant number of potential Ren'Py developers aren't reading this thread and may well remain unaware of this whole idea.
- You probably have sufficient grounding in stats already, but of course: realise that just by giving the user an option to not send their details, you're already biasing your population and ensuring that you don't necessarily get a representative sample of users. You can bet that there's a very significant demographic skew on the kind of user who's likely to uncheck that checkbox, say 'no', or whatever... and it's quite possibly the kind of demographic that also has a strong correlation to particular ranges of hardware. Is it still so much better than a voluntary survey of users?
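For what it's worth, the rules I'm suggesting above are simple enough to sketch in a few lines of plain Python. To be clear, none of these names exist in Ren'Py; `StatsPolicy`, `UserState`, `should_ask` and `may_send` are all hypothetical, and the "two runs" threshold is just my "first use or two" suggestion turned into a number:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatsPolicy:
    # Hypothetical developer-facing switch (think of a flag in options.rpy).
    # Defaults to off, so a developer is never opted in unwittingly.
    developer_enabled: bool = False
    # Don't ask until the user has seen the normal quit flow this many times.
    runs_before_asking: int = 2


@dataclass
class UserState:
    completed_runs: int = 0
    # None = never asked; True/False = the user's remembered answer.
    answer: Optional[bool] = None


def should_ask(policy, user):
    """Show the separate 'send statistics?' question only when appropriate."""
    if not policy.developer_enabled:
        return False
    if user.answer is not None:  # already answered, don't nag
        return False
    return user.completed_runs >= policy.runs_before_asking


def may_send(policy, user):
    """Send only with both the developer's and the user's explicit yes."""
    return policy.developer_enabled and user.answer is True
```

The point of splitting `should_ask` from `may_send` is that no answer is never treated as a yes: nothing goes out unless the developer switched it on and the user said yes to a direct question.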
Since the user is opting to send the data, we could even send slightly more sensitive data than I'd be comfortable sending otherwise - like the game name and version, the play time, and perhaps some measurement of user activity, like the number of times he or she clicked.
... Why on Earth would you want to know how many times they clicked?! If you can't think of a really useful reason to have that data, then don't collect it. I can't, myself.
EDIT: I could even see making some sort of API available to game-makers. Something like this could be useful for tracking how many times each ending was reached, to help tune the difficulty of a game.
This sounds even more potentially-problematic, to me - how are you going to make the information available to the game-maker? Does it go to their server (most game-makers probably don't have or couldn't configure a server with software to receive the data)? Does it go to your server and you allow them access? How do you ensure that the data isn't viewed by anyone other than the people the various players have authorised to see it? Insert dramatic ellipses here.
I understand that it's useful to have some idea of the hardware that Ren'Py's end-users are running on, but it's also useful to have a clean nose and a good reputation, and not piss off players or get developers into trouble. I'm not a tinfoil hat, and I don't mind reasonable data being sent off to software vendors a lot of the time, but bear in mind that we're living in an age where Internet identity theft is front-page newspaper news and people call up their techy friends afraid that their own AV software is malicious internet-delivered spyware. I work in the same room as a guy who refuses to block ad-bots on Skype because he doesn't want any record on any remote server to connect him to them, even in such a negative way as a block! (Also, data collection requires central servers, and everyone knows centralisation is basically communism! :P)