Coding for Data Science Tips 2 – Standardize csv reading between Windows and Mac

When you start learning data science, it is extremely useful to ask other data scientists and data enthusiasts for feedback on the quality of your code and on the process you use to analyse data. To make this easier, we often send notebooks and projects back and forth. But far too often the code reads its data from a hard-coded path that exists only on the author's computer.
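Here is a made-up example of code tied to one machine. The user name, folders and file name are hypothetical, and pandas' read_csv is only assumed as the loading function, so it appears in a comment rather than being called:

```python
from pathlib import PureWindowsPath

# Hypothetical absolute path: it only exists on one person's Windows machine.
hardcoded_path = r"C:\Users\alice\Documents\my_project\data\my_data.csv"

# With pandas (an assumption), the load would be pd.read_csv(hardcoded_path),
# which fails for anyone whose folders are laid out differently.
is_machine_specific = PureWindowsPath(hardcoded_path).is_absolute()
print(is_machine_specific)
```

An absolute path like this starts from a drive or user folder, so it breaks as soon as the project moves to another computer.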

Don’t do this. It ties your code to a file path that exists only on your computer and makes life a lot more… boring for those who want to help you.

Instead, try something like this. Inside your project folder, keep a subfolder for your data and name it something straightforward like… data. The function that loads a .csv or .xlsx file resolves relative paths from the working directory, which is normally the folder containing the notebook, so a short path such as data/my_data.csv is all the code needs.
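A minimal sketch of that layout follows. The file name my_data.csv and its contents are made up, a temporary folder stands in for the project folder so the example runs on its own, and the standard csv module is used in place of whatever loading function you prefer:

```python
import csv
import os
import tempfile
from pathlib import Path

# A temporary folder stands in for the project folder (hypothetical layout).
project_dir = Path(tempfile.mkdtemp())
(project_dir / "data").mkdir()
(project_dir / "data" / "my_data.csv").write_text("name,value\na,1\nb,2\n")

# A notebook normally starts in its own folder; here we move there by hand.
os.chdir(project_dir)

# Relative path: resolved from the working directory, not from a drive or home folder.
with open("data/my_data.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(len(rows), rows[0]["name"])
```

If you use pandas, the same relative path should work in pd.read_csv("data/my_data.csv").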

Much simpler, isn’t it? With this, all you need to do to ask another data enthusiast for feedback is send a copy of the folder containing the notebook, and the code should run as is.

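You can take one extra step to avoid path surprises across environments (Mac, Linux and Windows). Add an r before the path to make it a raw string, for example r"data\my_data.csv", so that Python reads backslashes literally instead of treating combinations such as \n or \t as escape characters. Note that the r only protects the backslashes; a relative path written with forward slashes is what works on all three systems.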
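To see what the r prefix changes, here is a small sketch. The file name new_data.csv is hypothetical and was chosen only because it puts \n inside a Windows-style path:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical file name chosen so that "\n" appears in the Windows-style path.
plain = "data\new_data.csv"   # "\n" is read as a newline character
raw = r"data\new_data.csv"    # backslash and "n" are kept as typed

has_newline_plain = "\n" in plain
has_newline_raw = "\n" in raw
print(has_newline_plain, has_newline_raw)

# Forward slashes avoid the issue entirely and are accepted on every system.
portable = PurePosixPath("data/new_data.csv")
same_file = PureWindowsPath(raw).as_posix() == str(portable)
print(same_file)
```

The plain string silently turns into a broken path, while the raw string and the forward-slash version both point at the same file inside the data folder.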

As usual, the code for all tips is stored at https://github.com/insilicobiologyblog/DScodingTips so you can check it.

Any tips & tricks you might have for coding in Data Science for all levels of data scientists? Share them with me! =)