File Manager

How to handle multiple files with a folder structure without crying (Matlab).

Handling multiple files is a very common situation in research, where you need to save data for different subjects, conditions and equipment. For example, in my open-source biomechanics dataset, we captured multiple wearable sensors for different conditions and ambulation modes.

The traditional solution for the typical grad student is to develop a complicated naming convention. This involves creating really long file names that describe the characteristics of the data collection. In the best case, the grad student will use underscores, since life is easier without whitespace in file names. Then she can use strcat or char concatenation to build the names and read a particular file from a Matlab script.

% Build a file name from its parts with plain char concatenation
subject='AB01'; date='01_02_20'; sensor='emg';
aFileName=[subject '_' date '_' sensor '.mat'];
disp(aFileName);
% Shows: AB01_01_02_20_emg.mat

With that solution, you have to store all your data in a single folder and then access it programmatically with fopen or load. This scales poorly: folders with a very large number of files tend to be slower to list and browse, and they are hard to navigate by hand.
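As a rough sketch of what this flat-folder approach looks like in practice, the loop below rebuilds every file name from its parts. The subject IDs, sensor names and the dataFolder path are hypothetical placeholders, and the snippet only builds names; it does not load any data.

```matlab
% Hypothetical flat folder and naming scheme (illustrative assumptions only)
dataFolder = fullfile(tempdir, 'flat_data');
subjects = {'AB01', 'AB02'};
sensors = {'emg', 'imu'};
sessionDate = '01_02_20';

fileNames = cell(numel(subjects), numel(sensors));
for s = 1:numel(subjects)
    for k = 1:numel(sensors)
        % Every script that touches the data has to repeat this naming logic
        name = [subjects{s} '_' sessionDate '_' sensors{k} '.mat'];
        fileNames{s, k} = fullfile(dataFolder, name);
    end
end
disp(fileNames(:));
```

Each new descriptor (condition, trial, ambulation mode) adds another loop and another piece to the name, which is where this approach starts to hurt.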

In contrast, the ideal file organization would categorize the data in folders, with a path structure that is meaningful for the organization. However, if you split the data into folders, it can become a nightmare when you want to read the data, since you will need to access multiple folders from your Matlab script.
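To make the pain concrete, here is a sketch of reading from nested folders by hand, using built-in functions only. It creates a tiny throwaway tree in a temporary folder (the folder and file names are made up for illustration) and then walks it with dir, one loop per folder level.

```matlab
% Build a small throwaway tree: <root>/<Subject>/<Week>/datatable.csv
root = tempname;
subjects = {'Subject1', 'Subject2'};
weeks = {'Week1', 'Week2'};
for s = 1:numel(subjects)
    for w = 1:numel(weeks)
        folder = fullfile(root, subjects{s}, weeks{w});
        mkdir(folder);
        fclose(fopen(fullfile(folder, 'datatable.csv'), 'w'));  % empty placeholder file
    end
end

% Manual traversal: one loop per level, filtering out '.' and '..' each time
found = {};
subjectDirs = dir(root);
subjectDirs = subjectDirs([subjectDirs.isdir] & ~ismember({subjectDirs.name}, {'.', '..'}));
for s = 1:numel(subjectDirs)
    weekRoot = fullfile(root, subjectDirs(s).name);
    weekDirs = dir(weekRoot);
    weekDirs = weekDirs([weekDirs.isdir] & ~ismember({weekDirs.name}, {'.', '..'}));
    for w = 1:numel(weekDirs)
        csvFiles = dir(fullfile(weekRoot, weekDirs(w).name, '*.csv'));
        for k = 1:numel(csvFiles)
            found{end+1} = fullfile(weekRoot, weekDirs(w).name, csvFiles(k).name); %#ok<SAGROW>
        end
    end
end
disp(found(:));
rmdir(root, 's');  % remove the throwaway tree
```

That is a lot of bookkeeping for a two-level structure, and it has to be rewritten whenever the folder layout changes.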

Don’t worry… this is where FileManager comes to the rescue!

Introducing FileManager

I created a Matlab class that does all the file handling for you. Now you can write beautiful scripts, removing all the manual char concatenation and handling the nested folder structure.

  1. First, download FileManager from the MATLAB File Exchange and add it to your path with addpath.
  2. Create a FileManager object in your script. For example, assume that you are storing the data in C:\my_groundbreaking_experiment\ and running a longitudinal study: you collect data for different subjects once a week for 5 weeks. For each session, you save several files: a csv table with your observations, plus some pictures and additional notes about the experiment.
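For step 1, a minimal sketch of adding the class to your path could look like this. The download location fileManagerFolder is a hypothetical placeholder; point it at wherever you actually unzipped the class.

```matlab
% Hypothetical download location (an assumption for this sketch)
fileManagerFolder = fullfile(userpath, 'FileManager');

% Only add the folder if it exists, to avoid an addpath warning
if exist(fileManagerFolder, 'dir')
    addpath(fileManagerFolder);
end

% which returns an empty char when the name is not found on the path
if isempty(which('FileManager'))
    disp('FileManager is not on the path yet; check the folder above.');
else
    disp('FileManager is ready to use.');
end
```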

To create a FileManager for this situation:

f=FileManager('C:\my_groundbreaking_experiment\','PathStructure',{'Subject','Week','File'});

Now you can use the different methods of your FileManager f to find files or generate arbitrary paths following this structure. Let’s say you need all the files for Subject1.

subject1Files=f.fileList('Subject','Subject1')
% This produces a cell array with all the files found for Subject1
% e.g. {'C:\my_groundbreaking_experiment\Subject1\Week1\datatable.csv'}
%      {'C:\my_groundbreaking_experiment\Subject1\Week1\somepicture.jpg'}
%      {'C:\my_groundbreaking_experiment\Subject1\Week2\datatable.csv'}
%      ... and so on 
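Since the result is a plain cell array of paths, you can post-process it with ordinary Matlab functions. The sketch below uses a hard-coded list shaped like the output above (so it runs without FileManager) and recovers the Week level from each csv path.

```matlab
% Hypothetical list shaped like the fileList output shown above
files = {'C:\my_groundbreaking_experiment\Subject1\Week1\datatable.csv', ...
         'C:\my_groundbreaking_experiment\Subject1\Week1\somepicture.jpg', ...
         'C:\my_groundbreaking_experiment\Subject1\Week2\datatable.csv'};

% Keep only the csv files
isCsv = endsWith(files, '.csv');
csvFiles = files(isCsv);

% Recover the Week level by splitting each path on either separator
weeks = cell(size(csvFiles));
for i = 1:numel(csvFiles)
    parts = regexp(csvFiles{i}, '[\\/]', 'split');
    weeks{i} = parts{end-1};  % second-to-last element is the Week folder
end
disp(weeks);
```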

Now, let’s say you need the Week1 csv files for all the subjects. You can pass multiple key-value pairs to fileList to refine your search. Note the use of * as a wildcard to find all the files with that extension.

week1CSVFiles=f.fileList('Week','Week1','File','*.csv')
% This produces a cell array with all the csv files found for Week1
% e.g. {'C:\my_groundbreaking_experiment\Subject1\Week1\datatable.csv'}
%      {'C:\my_groundbreaking_experiment\Subject2\Week1\datatable.csv'}
%      ... and so on 
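A typical next step is to read every file in such a list and stack the results. The sketch below is one way to do that with readtable; it writes two small stand-in csv files to a temporary folder so that it runs on its own, and the Trial and Score columns are invented for illustration.

```matlab
% Stand-in for week1CSVFiles: two small csv files in a temporary folder
root = tempname;
subjects = {'Subject1', 'Subject2'};
csvList = cell(size(subjects));
for s = 1:numel(subjects)
    folder = fullfile(root, subjects{s}, 'Week1');
    mkdir(folder);
    csvList{s} = fullfile(folder, 'datatable.csv');
    demo = table([1; 2], [0.5; 0.7] * s, 'VariableNames', {'Trial', 'Score'});
    writetable(demo, csvList{s});
end

% Read every file in the list, tag rows with their source, then stack
tables = cell(size(csvList));
for i = 1:numel(csvList)
    t = readtable(csvList{i});
    t.SourceFile = repmat(csvList(i), height(t), 1);  % keep track of the origin
    tables{i} = t;
end
allData = vertcat(tables{:});
disp(allData);
rmdir(root, 's');  % remove the stand-in files
```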

More functionality

There are more methods in FileManager, and I suggest you check the documentation (help FileManager) for all the details. I will end this post with another useful method. When you want to create new file paths, e.g. for writing analysis results, try f.genList. Here is an example: you computed some results for each subject and you want to save them in C:\my_groundbreaking_experiment\<Subject>\Results\resultstable.mat. Note that 'Results' is passed as the value of the Week level, since it sits at that position in the path structure.

subjectOutputFiles=f.genList('Subject',{'Subject1','Subject2'},'Week','Results','File','resultstable.mat')
% This produces a cell array with a generated list of all the combinations of name-value pairs.
% e.g. {'C:\my_groundbreaking_experiment\Subject1\Results\resultstable.mat'}
%      {'C:\my_groundbreaking_experiment\Subject2\Results\resultstable.mat'}

Now, inside your for loop over subjects, you can save the results for the i-th subject to subjectOutputFiles{i}.
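Such a loop might look like the sketch below. It uses a stand-in list rooted in a temporary folder so that it runs without FileManager, the results struct is a placeholder, and it assumes the Results folders may not exist yet, so it creates them before saving.

```matlab
% Stand-in for subjectOutputFiles, rooted in a temporary folder for this sketch
root = tempname;
subjects = {'Subject1', 'Subject2'};
outFiles = cellfun(@(s) fullfile(root, s, 'Results', 'resultstable.mat'), ...
                   subjects, 'UniformOutput', false);

for i = 1:numel(subjects)
    % Placeholder result; replace with your actual analysis output
    results = struct('subject', subjects{i}, 'meanScore', i * 0.1);

    % Assumption: the target folder may not exist yet, so create it if needed
    outFolder = fileparts(outFiles{i});
    if ~exist(outFolder, 'dir')
        mkdir(outFolder);
    end
    save(outFiles{i}, 'results');
end
```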

I hope that this class helps to clean up your code. If you like it, please rate it and star it on GitHub! Feel free to reach out with comments or improvement suggestions.