[Evening Read] SECURITYNET: Assessing Machine Learning Vulnerabilities on Public Models

Mohamed Nabeel
2 min read · Oct 5, 2024


Public models democratize access to AI. But can we trust them blindly?

The researchers collect 910 publicly available image-classification models and assemble them into a benchmark called SecurityNet.

They evaluate these models under three prominent ML attacks:

  • Model stealing
  • Backdoor detection
  • Membership inference

SecurityNet statistics (source: paper)

What makes a model less stealable?

  • Classifiers trained on diverse data with many classes are harder to steal than those with only a few classes and limited data.
  • The higher the target model's accuracy, the more difficult it is to steal. In this sense, model accuracy contributes to the model's robustness against stealing.
  • Using out-of-distribution data as the query set makes stealing less effective.
  • Complex surrogate architectures do not necessarily help in stealing the target model (a minimal stealing-loop sketch follows this list).
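
To make the stealing setting concrete, here is a minimal sketch of a surrogate-training loop in PyTorch: the attacker queries a black-box target model for soft labels on unlabeled data and distills them into a surrogate. The query set, architecture, and hyperparameters below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal model-stealing sketch (assumed setup, not the paper's exact attack):
# query a black-box target for soft labels and distill them into a surrogate.
import torch
import torch.nn.functional as F

def steal(target_model, surrogate, query_loader, epochs=10, lr=1e-3):
    """Train `surrogate` to imitate `target_model` on an unlabeled query set."""
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    target_model.eval()
    for _ in range(epochs):
        for x, _ in query_loader:                 # true labels are never used
            with torch.no_grad():
                soft_labels = F.softmax(target_model(x), dim=1)  # black-box query
            opt.zero_grad()
            loss = F.kl_div(F.log_softmax(surrogate(x), dim=1),
                            soft_labels, reduction="batchmean")
            loss.backward()
            opt.step()
    return surrogate
```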

What makes a model susceptible to membership inference attacks?

  • The more the target model overfits its training data, the higher the chance this attack succeeds.
  • The more data, and the more varied the data, used for training, the harder it is to launch membership inference attacks (a simple loss-threshold sketch follows this list).
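
The overfitting link shows up already in the simplest membership-inference attack: thresholding the target model's per-sample loss. The interface and threshold below are assumptions for illustration; the attacks evaluated in the paper are more elaborate.

```python
# Minimal loss-threshold membership-inference sketch (illustrative assumptions).
# On an overfit model, training (member) samples cluster at low loss.
import torch
import torch.nn.functional as F

@torch.no_grad()
def membership_scores(model, xs, ys):
    """Lower cross-entropy loss => more likely the sample was in the training set."""
    model.eval()
    losses = F.cross_entropy(model(xs), ys, reduction="none")
    return -losses  # higher score = predicted "member"

def predict_members(model, xs, ys, threshold):
    # A single threshold on the loss already separates members from
    # non-members when the target model is heavily overfit.
    return membership_scores(model, xs, ys) > threshold
```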

Can we identify backdoored models?

  • Neural Cleanse was less effective at identifying backdoored models.
  • STRIP and NEO (input-filtering methods) were more effective at identifying backdoored models (see the sketch after this list).
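
As a rough illustration of input filtering, here is a STRIP-style sketch: blend the suspect input with clean images and check whether the prediction entropy stays abnormally low, a sign that a trigger is dominating the prediction. The blend weight and threshold are assumed values, not the ones used in SecurityNet.

```python
# STRIP-style input-filtering sketch (assumed parameters, not the paper's values):
# superimpose the suspect input on clean images and measure prediction entropy.
import torch
import torch.nn.functional as F

@torch.no_grad()
def strip_entropy(model, x, clean_samples, alpha=0.5):
    """Average prediction entropy of `x` blended with each clean sample."""
    model.eval()
    entropies = []
    for c in clean_samples:
        blended = alpha * x + (1 - alpha) * c          # superimpose perturbation
        p = F.softmax(model(blended.unsqueeze(0)), dim=1)
        entropies.append(-(p * p.clamp_min(1e-12).log()).sum().item())
    return sum(entropies) / len(entropies)

def looks_backdoored(model, x, clean_samples, threshold=0.2):
    # Trigger-carrying inputs keep the same (target) label under blending,
    # so their average entropy is abnormally low.
    return strip_entropy(model, x, clean_samples) < threshold
```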

A large and varied dataset helps develop a model with a nuanced and robust understanding of the underlying patterns. Such a model is difficult to trick with outlier data points.

It is not only about teaching the model what to do, but also about giving it the broadest possible experience by exposing it to as much data as possible.

References:

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models (the paper summarized in this post).
