Aim Data Classification Ideas #14
Replies: 3 comments 4 replies
-
Data Collection
There are other options if you are willing to crowdsource video annotation. AWS SageMaker has a tool called Ground Truth that we could use for data annotation. The problem is that there are non-trivial costs to annotating this way: we would need to host the video somewhere, and AWS charges a fee for each image annotated through the tool, even if we do the annotation ourselves. More open-source data annotation tools may be applicable here, but they would require more in-depth server infrastructure in order to function.
How Can this be Done?
I bring up GANs because, while we try to catch cheaters, cheat developers will engage in an adversarial process to frustrate our progress. With this in mind, I suggest that, in tandem with the cheat discriminator, a cheat generator acting in the user-input space be developed in parallel. Ideally, this method would only ever produce a cheat that the discriminator could already detect, so it would not be a legitimate threat to a gaming ecosystem.
Further on architecture selection: the real user actions (i.e., keyboard presses or joystick motion) relative to the displayed video are what we are attempting to classify as human or not. We would very likely need both a representation of the video and extracted features from the user input for this system to be usable. Conceptually, this is a comparison between the user's input to the game and the image data they received. Such a mapping would likely be specific to a given game, or at least to a certain genre of game.
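To make the "both video representation and input features" idea concrete, here is a minimal numpy sketch of how the two streams might be summarized and concatenated into the vector a discriminator would consume. All feature choices (speed/acceleration statistics, frame-difference statistics) are hypothetical placeholders, not a proposed final design.

```python
import numpy as np

def input_features(mouse_deltas):
    """Summarize a window of (dx, dy) mouse deltas.

    Speed/acceleration statistics are placeholder features; a real
    system would likely need richer descriptors.
    """
    speeds = np.linalg.norm(mouse_deltas, axis=1)
    accel = np.diff(speeds)
    return np.array([
        speeds.mean(), speeds.std(), speeds.max(),
        accel.mean() if accel.size else 0.0,
        np.abs(accel).max() if accel.size else 0.0,
    ])

def video_features(frames):
    """Crude video representation: per-pixel change between frames.

    Stands in for a learned embedding (e.g. a CNN encoder).
    """
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    return np.array([diffs.mean(), diffs.std()])

def joint_features(frames, mouse_deltas):
    # The discriminator would consume this concatenated vector.
    return np.concatenate([video_features(frames),
                           input_features(mouse_deltas)])

rng = np.random.default_rng(0)
frames = rng.integers(0, 255, size=(30, 64, 64))  # 30 grayscale frames
deltas = rng.normal(0, 3, size=(30, 2))           # per-frame mouse motion
feats = joint_features(frames, deltas)
print(feats.shape)  # (7,)
```

A generator in the user-input space would then be trained to produce `deltas` sequences whose joint features the discriminator cannot distinguish from human play.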
-
The first step, as always, is collecting the data and working out how to properly label it, and answering the following questions:
Once we answer these questions/concerns, we can properly discuss the machine learning algorithm. From a quick check, it will probably be a CNN-LSTM (as is done for activity recognition) or an ensemble method (for example, reduce the images to a lower dimension with an autoencoder and train a time-series model on the result).
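To illustrate the "reduce the images, then model the sequence" pipeline: the sketch below uses PCA via SVD as a simple stand-in for a trained autoencoder, turning a stack of flattened frames into a low-dimensional time series. A real pipeline would train the encoder and feed the codes into a CNN-LSTM or other sequence model; the frame data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
frames = rng.normal(size=(100, 64 * 64))  # 100 flattened 64x64 frames

# "Encode": project each frame onto the top-k principal components
# (PCA as a stand-in for an autoencoder's bottleneck).
k = 8
mean = frames.mean(axis=0)
_, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
codes = (frames - mean) @ vt[:k].T  # (100, 8) low-dim time series

# `codes` is the sequence a time-series model would then consume.
print(codes.shape)
```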
-
If this project isn't dead, we could use a technique similar to the one in this article: generate one of these images for every ~5-10 s of video and feed it into an image-recognition model. Obviously, the hard part is getting enough data, but a proof of concept wouldn't need more than an hour or so of labeled data, especially if we use a pretrained model through something like fastai.
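One way to build such an image is to rasterize each window of aim movement into a 2D histogram of visited positions, which a pretrained vision model could then classify. This is a hypothetical sketch with synthetic positions, not the specific technique from the article.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic aim trajectory: ~10 s of cursor positions at 60 Hz.
positions = np.cumsum(rng.normal(0, 2, size=(600, 2)), axis=0)

def window_to_image(pos, bins=32):
    """Rasterize one time window of positions into a 2D histogram image."""
    img, _, _ = np.histogram2d(pos[:, 0], pos[:, 1], bins=bins)
    return img / img.max()  # normalize to [0, 1] for a vision model

img = window_to_image(positions)
print(img.shape)  # (32, 32)
```

Each labeled gameplay session would yield one such image per window, so even an hour of footage gives a few hundred training examples for a proof of concept.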
-
Current Status
Right now, we haven't started any work on this aspect of the project.
What we Need
We need to start researching methods of classifying aim data as either cheating or human.
I've used TensorFlow/Keras in Python before for some light machine learning / neural-net classification, and I think it could be helpful here.
Do we need to pursue a deep learning route? Is simple machine learning appropriate?
How should we store and tag the aiming data to feed our program?
What variables do we need from the aiming data?
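On the "is simple machine learning appropriate?" question, a cheap baseline worth trying before any deep learning is logistic regression on hand-crafted aim features. The sketch below uses plain numpy with entirely synthetic data (cheats simulated as faster and more uniform than human aim, which is an assumption, not a measured property of real cheats).

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy features per sample: [mean aim speed, speed variance].
human = np.column_stack([rng.normal(5, 1, 200), rng.normal(4, 1, 200)])
cheat = np.column_stack([rng.normal(9, 1, 200), rng.normal(1, 0.5, 200)])
X = np.vstack([human, cheat])
y = np.array([0] * 200 + [1] * 200)

# Plain gradient-descent logistic regression.
w, b = np.zeros(2), 0.0
for _ in range(2000):
    z = np.clip(X @ w + b, -30, 30)   # avoid exp overflow
    p = 1 / (1 + np.exp(-z))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

pred = 1 / (1 + np.exp(-np.clip(X @ w + b, -30, 30))) > 0.5
print((pred == y).mean())  # training accuracy
```

If a baseline like this already separates labeled cheat and human sessions well, the deep-learning route may be unnecessary for a first version; if it fails, that failure tells us which features are missing.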
Post any ideas, research, or thoughts here!