I am currently working on a code base that is over 5 GB in size. This is mostly due to a poorly thought-out folder structure: code files, images, and Excel files are all jumbled together. I think a clear distinction should be made between source code and data.
I would define data as anything that is added to the project during its life. So, if you have an upload option, anything that is uploaded would be described as data. The site should still function without (or with very little) data.
Images can fit into both groups. Any icons or images attached to the functionality of the project should be classed as source code. However, anything that is uploaded should be classed as data.
The database also falls into both groups. Anything stored inside a database table should normally be classed as data. Stored procedures, functions, and views are all source code and would benefit from version control.
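To make the rules above concrete, here is a minimal sketch in Python of how a file could be sorted into "code" or "data" by its path. The folder names, the extension lists, and the classify function are all hypothetical and only for illustration; a real project would need its own conventions, and treating every unrecognised file as data is an assumption on my part.

```python
from pathlib import PurePosixPath

# Hypothetical conventions, chosen for illustration only.
UPLOAD_DIRS = {"uploads", "user_content"}
CODE_SUFFIXES = {".py", ".cs", ".js", ".css", ".html", ".sql"}
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico"}


def classify(path: str) -> str:
    """Return 'code' or 'data' for a project-relative path."""
    p = PurePosixPath(path)
    # Anything under an upload folder was added during the project's life.
    if any(part in UPLOAD_DIRS for part in p.parts):
        return "data"
    suffix = p.suffix.lower()
    # Source files, SQL scripts (views, procedures), and images that ship
    # with the application are treated as source code.
    if suffix in CODE_SUFFIXES or suffix in IMAGE_SUFFIXES:
        return "code"
    # Assumption: everything else (spreadsheets, exports, dumps) is data.
    return "data"
```

With these assumed conventions, an icon such as static/icons/save.png and a view script such as db/views/active_users.sql come out as code, while uploads/2020/photo.png and reports/sales.xlsx come out as data. Note that the same image extension lands in either group depending on where the file lives, which is the point made above about images.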
Source Control != Backup
Source control is not an excuse to skip backups. Don’t commit files to source control just so you know you can restore them if you need to. In general, files are in source control so that you can see how they changed over time as the code base changed. Files in your backup are a snapshot of what the application was at a point in time and will include ALL the data.
One last point before I end: If you are hosting on a cloud computing platform like Azure, it gives you an easy way to distinguish between data and code.
Anything in your web app is code.
Anything in blob storage is data.
Anything having to do with SQL is data and code.
Each project is unique and there will always be exceptions to these suggestions, but I think this is a good goal to have. What do you think?