[Evening Read] SECURITYNET: Assessing Machine Learning Vulnerabilities on Public Models

Mohamed Nabeel
2 min read · Oct 5, 2024


Public models democratize access to AI. But can we trust them blindly?

The researchers collect 910 publicly available image-classification models and assemble them into a benchmark called SecurityNet.

They evaluate these models under three prominent ML attacks:

  • Model stealing
  • Backdoor detection
  • Membership inference

SecurityNet statistics (source: paper)

What makes a model less stealable?

  • Classifiers trained on diverse data with many classes are harder to steal than those with only a few classes and limited data.
  • The higher the target model's accuracy, the more difficult it is to steal. In this sense, model accuracy contributes to the model's robustness against stealing.
  • Using out-of-distribution data as the query set makes stealing less effective.
  • Complex surrogate architectures do not necessarily help in stealing the target model (a minimal stealing-loop sketch follows this list).
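
To make the stealing setting concrete, here is a minimal sketch of a surrogate-training loop in PyTorch: the attacker queries a black-box target model for soft labels on unlabeled data and distills them into a surrogate. The query set, architecture, and hyperparameters below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal model-stealing sketch (assumed setup, not the paper's exact attack):
# query a black-box target for soft labels and distill them into a surrogate.
import torch
import torch.nn.functional as F

def steal(target_model, surrogate, query_loader, epochs=10, lr=1e-3):
    """Train `surrogate` to imitate `target_model` on an unlabeled query set."""
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    target_model.eval()
    for _ in range(epochs):
        for x, _ in query_loader:                 # true labels are never used
            with torch.no_grad():
                soft_labels = F.softmax(target_model(x), dim=1)  # black-box query
            opt.zero_grad()
            loss = F.kl_div(F.log_softmax(surrogate(x), dim=1),
                            soft_labels, reduction="batchmean")
            loss.backward()
            opt.step()
    return surrogate
```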

What makes a model susceptible to membership inference attacks?

  • The more the target model overfits its training data, the higher the chance this attack succeeds.
  • The more data, and the more varied the data, used for training, the harder it is to launch membership inference attacks (a simple loss-threshold sketch follows this list).
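
The overfitting link shows up already in the simplest membership-inference attack: thresholding the target model's per-sample loss. The interface and threshold below are assumptions for illustration; the attacks evaluated in the paper are more elaborate.

```python
# Minimal loss-threshold membership-inference sketch (illustrative assumptions).
# On an overfit model, training (member) samples cluster at low loss.
import torch
import torch.nn.functional as F

@torch.no_grad()
def membership_scores(model, xs, ys):
    """Lower cross-entropy loss => more likely the sample was in the training set."""
    model.eval()
    losses = F.cross_entropy(model(xs), ys, reduction="none")
    return -losses  # higher score = predicted "member"

def predict_members(model, xs, ys, threshold):
    # A single threshold on the loss already separates members from
    # non-members when the target model is heavily overfit.
    return membership_scores(model, xs, ys) > threshold
```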

Can we identify backdoored models?

  • Neural Cleanse was less effective at identifying backdoored models.
  • STRIP and NEO (input-filtering methods) were more effective at identifying backdoored models (see the sketch after this list).
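
As a rough illustration of input filtering, here is a STRIP-style sketch: blend the suspect input with clean images and check whether the prediction entropy stays abnormally low, a sign that a trigger is dominating the prediction. The blend weight and threshold are assumed values, not the ones used in SecurityNet.

```python
# STRIP-style input-filtering sketch (assumed parameters, not the paper's values):
# superimpose the suspect input on clean images and measure prediction entropy.
import torch
import torch.nn.functional as F

@torch.no_grad()
def strip_entropy(model, x, clean_samples, alpha=0.5):
    """Average prediction entropy of `x` blended with each clean sample."""
    model.eval()
    entropies = []
    for c in clean_samples:
        blended = alpha * x + (1 - alpha) * c          # superimpose perturbation
        p = F.softmax(model(blended.unsqueeze(0)), dim=1)
        entropies.append(-(p * p.clamp_min(1e-12).log()).sum().item())
    return sum(entropies) / len(entropies)

def looks_backdoored(model, x, clean_samples, threshold=0.2):
    # Trigger-carrying inputs keep the same (target) label under blending,
    # so their average entropy is abnormally low.
    return strip_entropy(model, x, clean_samples) < threshold
```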

A large and varied dataset helps develop a model with a nuanced and robust understanding of the underlying patterns. Such a model is difficult to trick with outlier data points.

It is not only about teaching the model what to do, but also about giving it the broadest possible experience by exposing it to as much data as possible.

References:

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models (the paper summarized in this post).
